MLlib's DecisionTree expects class labels in {0, 1, ..., numClasses - 1},
i.e. 0-indexed, not 1-indexed. The iris.scale file uses labels 1, 2, 3, so
with numClasses = 3 the label 3 overflows the internal per-class array
(hence the ArrayIndexOutOfBoundsException); numClasses = 4 only "works" by
allocating an unused class 0. The correct fix is to set numClasses = 3 and
shift your labels to 0, 1, 2.
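A minimal sketch of that fix (variable names follow your snippet; assumes the
Spark 1.x MLlib API you are already using): remap each label at load time,
then train with numClasses = 3.

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.util.MLUtils

// iris.scale labels are 1, 2, 3 -- shift down by one so they become 0, 1, 2.
val data = MLUtils.loadLibSVMFile(sc, "data/iris.scale.txt")
  .map(p => LabeledPoint(p.label - 1, p.features))

val numClasses = 3 // now matches the remapped label range {0, 1, 2}
val model = DecisionTree.trainClassifier(data, numClasses,
  Map[Int, Int](), "gini", 5, 100)
```

If you later compare predictions against the original file, remember the
model's predictions are also 0-based, so add 1 back when reporting.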

On Thu, Dec 11, 2014 at 3:40 AM, Ge, Yao (Y.) <y...@ford.com> wrote:
> I am testing decision tree using iris.scale data set
> (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#iris)
>
> In the data set there are three class labels: 1, 2, and 3. However, in the
> following code I have to set numClasses = 4. I get an
> ArrayIndexOutOfBoundsException if I set numClasses = 3. Why?
>
>
>     var conf = new SparkConf().setAppName("DecisionTree")
>     var sc = new SparkContext(conf)
>
>     val data = MLUtils.loadLibSVMFile(sc, "data/iris.scale.txt")
>     val numClasses = 4
>     val categoricalFeaturesInfo = Map[Int, Int]()
>     val impurity = "gini"
>     val maxDepth = 5
>     val maxBins = 100
>
>     val model = DecisionTree.trainClassifier(data, numClasses,
>       categoricalFeaturesInfo, impurity, maxDepth, maxBins)
>
>     val labelAndPreds = data.map { point =>
>       val prediction = model.predict(point.features)
>       (point.label, prediction)
>     }
>
>     val trainErr = labelAndPreds.filter(r => r._1 != r._2).count.toDouble /
>       data.count
>     println("Training Error = " + trainErr)
>     println("Learned classification tree model:\n" + model)
>
> -Yao
