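The implementation assumes class labels are 0-indexed, not 1-indexed. With numClasses = 3 the valid labels are 0, 1, and 2, so a label of 3 falls outside that range and triggers the ArrayIndexOutOfBoundsException; numClasses = 4 only works because it makes room for label 3. You should set numClasses = 3 and change your labels to 0, 1, 2.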
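
One way to do the relabeling, as a rough sketch: subtract 1 from each label after loading the file, then train with numClasses = 3. The names IrisLabels, toZeroBased, and train are made up for illustration; the other parameters are copied from your code, and I am assuming the file contains only the labels 1, 2, and 3.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.model.DecisionTreeModel
import org.apache.spark.mllib.util.MLUtils

// Hypothetical helper object, not part of MLlib.
object IrisLabels {
  // Shift a 1-based label (1, 2, 3) to the 0-based form (0, 1, 2).
  def toZeroBased(p: LabeledPoint): LabeledPoint =
    LabeledPoint(p.label - 1, p.features)

  // Load the LIBSVM file, relabel, and train with numClasses = 3.
  def train(sc: SparkContext, path: String): DecisionTreeModel = {
    val data = MLUtils.loadLibSVMFile(sc, path).map(toZeroBased).cache()
    val numClasses = 3                            // labels are now 0, 1, 2
    val categoricalFeaturesInfo = Map[Int, Int]() // all features continuous
    DecisionTree.trainClassifier(
      data, numClasses, categoricalFeaturesInfo, "gini", 5, 100)
  }
}
```

Keep in mind that predictions from the resulting model are also 0-based, so add 1 back if you need to report them in the original labeling.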

On Thu, Dec 11, 2014 at 3:40 AM, Ge, Yao (Y.) <[email protected]> wrote:
> I am testing decision tree using iris.scale data set
> (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#iris)
>
> In the data set there are three class labels 1, 2, and 3. However in the
> following code, I have to make numClasses = 4. I will get an
> ArrayIndexOutOfBound Exception if I make the numClasses = 3. Why?
>
> var conf = new SparkConf().setAppName("DecisionTree")
> var sc = new SparkContext(conf)
>
> val data = MLUtils.loadLibSVMFile(sc,"data/iris.scale.txt");
> val numClasses = 4;
> val categoricalFeaturesInfo = Map[Int,Int]();
> val impurity = "gini";
> val maxDepth = 5;
> val maxBins = 100;
>
> val model = DecisionTree.trainClassifier(data, numClasses,
>   categoricalFeaturesInfo, impurity, maxDepth, maxBins);
>
> val labelAndPreds = data.map{ point =>
>   val prediction = model.predict(point.features);
>   (point.label, prediction)
> }
>
> val trainErr = labelAndPreds.filter(r => r._1 != r._2).count.toDouble /
>   data.count;
> println("Training Error = " + trainErr);
> println("Learned classification tree model:\n" + model);
>
> -Yao
