IBM SPSS Modeler V15.0 lets you build predictive models for business problems quickly and intuitively, without any programming. In this demonstration we are going to show how you can use the “Auto Classifier” node.
The Auto Classifier node can be used for nominal or binary targets. It builds and compares several models in a single run. You can select which algorithms to include (decision trees, neural networks, KNN, …) and even tweak the properties of each one, so that several variations of the same algorithm are tried. This makes it really easy to evaluate all the algorithms at once, and the node keeps the best models for scoring or further analysis. In the end you can pick a single model for scoring or combine them all in an ensemble!
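In SPSS Modeler all of this happens inside the node's dialog, so no code is needed. For readers who want to see the idea behind it, here is a minimal Python sketch of "train several candidates, score each on held-out records, keep the best few". It is not how the Auto Classifier node is implemented: the three toy models, the function names and the synthetic data are all assumptions made up for illustration.

```python
import random

def accuracy(model, records):
    """Fraction of (features, label) records the model predicts correctly."""
    return sum(model(x) == y for x, y in records) / len(records)

def majority_model(train):
    """Baseline: always predict the most frequent training label."""
    labels = [y for _, y in train]
    top = max(sorted(set(labels)), key=labels.count)
    return lambda x: top

def nearest_neighbour_model(train):
    """1-nearest neighbour using squared Euclidean distance."""
    def predict(x):
        _, label = min(
            train,
            key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], x)),
        )
        return label
    return predict

def stump_model(train):
    """Depth-1 decision tree: one feature, one threshold, picked on training accuracy."""
    labels = sorted({y for _, y in train})
    best_score, best_rule = -1.0, None
    for f in range(len(train[0][0])):
        for t in sorted({x[f] for x, _ in train}):
            for low in labels:
                for high in labels:
                    def rule(x, f=f, t=t, low=low, high=high):
                        return low if x[f] <= t else high
                    score = accuracy(rule, train)
                    if score > best_score:
                        best_score, best_rule = score, rule
    return best_rule

def compare_models(builders, train, test, keep=2):
    """Train every candidate, score it on held-out data, return the top `keep`."""
    results = []
    for name, build in builders.items():
        model = build(train)
        results.append((accuracy(model, test), name, model))
    results.sort(key=lambda r: r[0], reverse=True)
    return results[:keep]

def make_toy_data(n, seed=0):
    """Synthetic records (not the census data): the label depends on the first feature only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = (rng.uniform(0, 10), rng.uniform(0, 10))
        data.append((x, "yes" if x[0] > 5 else "no"))
    return data

# Hypothetical candidate list, playing the role of the algorithms ticked in the node.
CANDIDATES = {
    "majority": majority_model,
    "1-nn": nearest_neighbour_model,
    "stump": stump_model,
}

data = make_toy_data(200)
train, test = data[:150], data[150:]
for score, name, _ in compare_models(CANDIDATES, train, test):
    print(f"{name}: held-out accuracy {score:.2f}")
```

The `keep` argument stands in for the number of models the node retains, and the returned list is what you would pick from when choosing a single model or an ensemble.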
First, a brief description of the data. The data comes from the 1994 US Census database and is available from the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml/datasets/Adult. The goal is to predict whether a person earns more than $50K a year. The data set has 14 variables, both categorical and numeric.
The first step is to import the data. The data is in CSV format, so we can use the “Var. File” node to import it. All you have to do is set the path to the source file and we are ready to import the data.
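If you want to peek at the same file outside SPSS Modeler, a rough Python equivalent of the import step could look like the sketch below. It assumes the raw UCI file has no header row, separates values with a comma followed by a space, and marks missing values with "?". The function name `read_adult` and the two sample rows are made up for illustration, and the file name in the comment is an assumption too.

```python
import csv
import io

# Column names as listed in the UCI description of the Adult data set
# (14 variables plus the income target); the raw file carries no header row.
ADULT_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]
NUMERIC_COLUMNS = {
    "age", "fnlwgt", "education-num",
    "capital-gain", "capital-loss", "hours-per-week",
}

def read_adult(lines):
    """Parse comma-separated Adult records into a list of dicts."""
    records = []
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != len(ADULT_COLUMNS):
            continue  # skip blank or malformed lines
        record = {}
        for name, value in zip(ADULT_COLUMNS, row):
            value = value.strip()
            if value == "?":
                record[name] = None  # "?" is treated as a missing value
            elif name in NUMERIC_COLUMNS:
                record[name] = int(value)
            else:
                record[name] = value
        records.append(record)
    return records

# Two illustrative rows in the file's layout (the second one is invented).
SAMPLE = """\
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
52, ?, 120000, HS-grad, 9, Married-civ-spouse, ?, Husband, White, Male, 0, 0, 45, United-States, >50K
"""

records = read_adult(io.StringIO(SAMPLE))
print(len(records), "records,", len(records[0]), "fields each")

# With the real download you would pass an open file instead, for example:
# with open("adult.data", newline="") as f:   # assumed local file name
#     records = read_adult(f)
```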
Then we can use the “Data Audit” node to inspect the data. This is one of the most useful nodes in SPSS Modeler. It displays a graph and summary statistics for every variable and flags missing values and outliers in the data. I am going to write more about this node in another tutorial.
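To give a feel for what such an audit reports, here is a small Python sketch that counts missing values per field and computes basic statistics. It is only an approximation of the idea, not the Data Audit node itself: it draws no graphs and does no outlier detection, and the `audit` function and the four toy records are assumptions made up for this example.

```python
from collections import Counter
from statistics import mean

def audit(records):
    """Summarise each field: missing count, plus numeric or categorical statistics."""
    summary = {}
    for field in records[0]:
        values = [r[field] for r in records]
        present = [v for v in values if v is not None and v != ""]
        info = {"missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            info.update(type="numeric", min=min(present),
                        max=max(present), mean=mean(present))
        else:
            counts = Counter(present)
            top = counts.most_common(1)[0][0] if counts else None
            info.update(type="categorical", distinct=len(counts), most_common=top)
        summary[field] = info
    return summary

# Toy records for illustration only; None marks a missing value.
toy_records = [
    {"age": 39, "workclass": "State-gov"},
    {"age": 50, "workclass": None},
    {"age": None, "workclass": "Private"},
    {"age": 28, "workclass": "Private"},
]

for field, info in audit(toy_records).items():
    print(field, info)
```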