Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-09 Thread Terry Hole
Sean, thank you! I finally got this to work, although it is a bit ugly: I set the DataFrame's metadata manually.
    import org.apache.spark.ml.attribute._
    import org.apache.spark.sql.types._
    val df = training.toDF()
    val schema = df.schema
    val rowRDD = df.rdd
    def enrich(m : Metadata) : Metadata
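The snippet above is cut off in the archive, so the exact workaround is not recoverable. As a rough sketch of the same idea, and not necessarily the code used in the thread, the label column can be re-aliased with nominal-attribute metadata so that the tree classifier can read the number of classes. The helper name `withNominalLabel` and its parameters are hypothetical; the sketch assumes labels are already encoded as 0.0, 1.0, ..., numClasses - 1.

```scala
import org.apache.spark.ml.attribute.NominalAttribute
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper (not from the thread): returns a copy of `df` in which
// `labelCol` carries nominal-attribute metadata with `numClasses` values.
// Assumes label values are already the indices 0.0 .. numClasses - 1.
def withNominalLabel(df: DataFrame, labelCol: String, numClasses: Int): DataFrame = {
  val meta = NominalAttribute.defaultAttr
    .withName(labelCol)
    .withNumValues(numClasses)
    .toMetadata()
  // Keep every other column unchanged; the label column moves to the end.
  val others = df.columns.filter(_ != labelCol).map(col)
  df.select((others :+ col(labelCol).as(labelCol, meta)): _*)
}
```

With the thread's `training` data this might be called as `withNominalLabel(training.toDF(), "label", 2)` before fitting the pipeline.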

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-07 Thread Terry Hole
Xiangrui, do you have any idea how to make this work? Thanks - Terry
On Sunday, September 6, 2015 at 17:41, Terry Hole wrote:
> Sean,
>
> Do you know how to tell the decision tree that the "label" is binary, or how to set
> some attributes on the DataFrame to carry the number of classes?
>
> Thanks!
> - Terry
>

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-06 Thread Terry Hole
Hi Owen, the DataFrame "training" comes from an RDD of a case class, RDD[LabeledDocument], where the case class is defined like this: case class LabeledDocument(id: Long, text: String, *label: Double*) So there is already a default "label" column of type Double. I already tried to set the

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-06 Thread Sean Owen
I think somewhere along the line you've not specified your label column -- it's defaulting to "label" and it does not recognize it, or at least not as a binary or nominal attribute.
On Sun, Sep 6, 2015 at 5:47 AM, Terry Hole wrote:
> Hi, Experts,
>
> I followed the guide

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-06 Thread Sean Owen
(Sean) The error suggests that the type is not a binary or nominal attribute though. I think that's the missing step. A double-valued column need not be one of these attribute types.
On Sun, Sep 6, 2015 at 10:14 AM, Terry Hole wrote:
> Hi, Owen,
>
> The dataframe
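One common way to supply the missing attribute type is to run the label through `StringIndexer`, whose output column carries nominal metadata. The following is a minimal spark-shell style sketch, not the pipeline from the thread: the column names `indexedLabel` and `features` are assumptions, and the stages that would build `features` from the text are omitted.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.DecisionTreeClassifier
import org.apache.spark.ml.feature.StringIndexer

// Index the raw Double label; the indexed column is marked as nominal,
// which lets the tree classifier infer the number of classes.
val labelIndexer = new StringIndexer()
  .setInputCol("label")
  .setOutputCol("indexedLabel") // assumed column name

// Point the classifier at the indexed label rather than the raw one.
val dt = new DecisionTreeClassifier()
  .setLabelCol("indexedLabel")
  .setFeaturesCol("features") // assumed to be produced by earlier stages

val pipeline = new Pipeline().setStages(Array(labelIndexer, dt))
```

Note that `StringIndexer` assigns indices by label frequency, so the indexed values may not match the original 0.0/1.0 encoding.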

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-06 Thread Terry Hole
Sean, do you know how to tell the decision tree that the "label" is binary, or how to set some attributes on the DataFrame to carry the number of classes? Thanks! - Terry
On Sun, Sep 6, 2015 at 5:23 PM, Sean Owen wrote:
> (Sean)
> The error suggests that the type is not a binary or nominal