Sean,
Thank you!
Finally, I got this to work, although it is a bit ugly: manually setting the
metadata of the dataframe.
import org.apache.spark.ml.attribute._
import org.apache.spark.sql.types._

val df = training.toDF()
val schema = df.schema
val rowRDD = df.rdd
// Merge nominal-attribute metadata (2 classes) into the field's existing metadata
def enrich(m: Metadata): Metadata =
  NominalAttribute.defaultAttr.withName("label").withNumValues(2).toMetadata(m)
val newSchema = StructType(schema.map { f =>
  if (f.name == "label") f.copy(metadata = enrich(f.metadata)) else f
})
val newDF = sqlContext.createDataFrame(rowRDD, newSchema)
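(For the archives: a shorter alternative sketch, assuming the same `df` built from `training` above with the `id`/`text`/`label` columns of LabeledDocument, is to attach the metadata through `Column.as(alias, metadata)` instead of rebuilding the schema.)

```scala
import org.apache.spark.ml.attribute.NominalAttribute

// Build nominal metadata declaring 2 classes for the "label" column
val meta = NominalAttribute.defaultAttr
  .withName("label").withNumValues(2).toMetadata()
// Re-select the columns, re-aliasing "label" so it carries the metadata
val dfWithMeta = df.select(df("id"), df("text"), df("label").as("label", meta))
```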
Xiangrui,
Do you have any idea how to make this work?
Thanks
- Terry
On Sunday, September 6, 2015 at 17:41, Terry Hole wrote:
> Sean
>
> Do you know how to tell the decision tree that the "label" is binary, or set
> some attributes on the dataframe to carry the number of classes?
>
> Thanks!
> - Terry
>
Hi, Owen,
The dataframe "training" is from an RDD of a case class, RDD[LabeledDocument],
where the case class is defined as:
case class LabeledDocument(id: Long, text: String, label: Double)
So there is already a default "label" column with "double" type.
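(A quick way to see the problem, assuming the `training` dataframe from above: the schema's "label" field carries no ML attribute metadata, so nothing records the number of classes.)

```scala
val df = training.toDF()
// Empty metadata here: nothing marks "label" as binary/nominal,
// so the decision tree cannot determine the number of classes
println(df.schema("label").metadata)
```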
I already tried to set the
I think somewhere along the line you've not specified your label
column -- it's defaulting to "label" and it does not recognize it, or
at least not as a binary or nominal attribute.
On Sun, Sep 6, 2015 at 5:47 AM, Terry Hole wrote:
> Hi, Experts,
>
> I followed the guide
(Sean)
The error suggests that the type is not a binary or nominal attribute,
though. I think that's the missing step. A double-valued column need
not be one of these attribute types.
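(To illustrate the distinction Sean is drawing, a minimal sketch using the `org.apache.spark.ml.attribute` API: a double column is numeric by default, and only a binary or nominal attribute carries class information.)

```scala
import org.apache.spark.ml.attribute.{BinaryAttribute, NominalAttribute, NumericAttribute}

// A plain double column defaults to numeric (continuous) -- no class count
val numeric = NumericAttribute.defaultAttr.withName("label")
// Binary: exactly 2 classes
val binary = BinaryAttribute.defaultAttr.withName("label")
// Nominal: an explicit number of classes (3 here, as an example)
val nominal = NominalAttribute.defaultAttr.withName("label").withNumValues(3)
```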
On Sun, Sep 6, 2015 at 10:14 AM, Terry Hole wrote:
> Hi, Owen,
>
> The dataframe
Sean
Do you know how to tell the decision tree that the "label" is binary, or set
some attributes on the dataframe to carry the number of classes?
Thanks!
- Terry
On Sun, Sep 6, 2015 at 5:23 PM, Sean Owen wrote:
> (Sean)
> The error suggests that the type is not a binary or nominal