Hi, I am trying to use Apache Spark's decision tree classifier, following the classification example at https://spark.apache.org/docs/1.5.1/ml-decision-tree.html. I found the dataset at https://github.com/apache/spark/blob/master/data/mllib/sample_libsvm_data.txt and I am having trouble understanding its format. Is the first column the label? Why do the other values have an index and a colon in front of them, and what do the indices represent? Lastly, how do I print out a prediction for new test data?
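To make the format question concrete, here is a minimal plain-Python sketch of how I am currently reading one line of the file. It assumes each line is `label index:value index:value ...`, that the indices are feature positions, and that any feature not listed is zero. The function name `parse_libsvm_line` and the example line are made up for illustration; they are not part of Spark or of the dataset.

```python
def parse_libsvm_line(line):
    """Split one 'label index:value ...' line into (label, {index: value}).

    Assumption: the first token is the label and every later token is
    'index:value'; features that are not listed are treated as zero.
    """
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features


# Hypothetical line, not copied from sample_libsvm_data.txt.
example = "1 3:0.5 10:2.0 692:255"
label, features = parse_libsvm_line(example)
print(label, features)
```

If this reading is right, the file is just a sparse encoding of the feature vectors, but I would appreciate confirmation.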
Thanks,
Serena Sian Yuan
--
Sian Ees Super.