Re: spark 1.5, ML Pipeline Decision Tree Dataframe Problem

2015-09-18 Thread Yasemin Kaya
Thanks, I tried that but could not get it to work.
unlabeledTest is a JavaPairRDD whose values are dense vectors. I added
import org.apache.spark.sql.SQLContext.implicits$, but there is no toDF()
method available; I am using Java, not Scala.

2015-09-18 20:02 GMT+03:00 Feynman Liang :

> What is the type of unlabeledTest?
>
> SQL should be using the VectorUDT we've defined for Vectors, so you should
> be able to just "import sqlContext.implicits._" and then call "rdd.toDF()"
> on your RDD to convert it into a DataFrame.


-- 
hiç ender hiç


spark 1.5, ML Pipeline Decision Tree Dataframe Problem

2015-09-18 Thread Yasemin Kaya
Hi,

I am using the *Spark 1.5 ML Pipeline Decision Tree* to get the tree's
probabilities, so I have to convert my data to a DataFrame. Training the
model works without problems, but applying the model to my data fails at the
conversion to a DataFrame. My data is a *JavaPairRDD*, and I create the
DataFrame like this:

DataFrame production = sqlContext.createDataFrame(
unlabeledTest.values(), Vector.class);

*The error is:*
Exception in thread "main" java.lang.ClassCastException:
org.apache.spark.mllib.linalg.VectorUDT cannot be cast to
org.apache.spark.sql.types.StructType

I know that with LabeledPoint there would be no problem. But my data has no
labels; I want to predict the labels, which is why I am applying the model
to it.

Is there a way to handle this?
Thanks.


Best,
yasemin
-- 
hiç ender hiç


Re: spark 1.5, ML Pipeline Decision Tree Dataframe Problem

2015-09-18 Thread Feynman Liang
What is the type of unlabeledTest?

SQL should be using the VectorUDT we've defined for Vectors, so you should
be able to just "import sqlContext.implicits._" and then call "rdd.toDF()"
on your RDD to convert it into a DataFrame.
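For Java, where the Scala implicits are not available, one possible route is to
wrap each vector in a Row and pass an explicit schema whose only column has
type VectorUDT. The sketch below is untested and rests on assumptions: the
class and method names (UnlabeledVectors, toFeaturesFrame) are hypothetical,
the key type of the JavaPairRDD is not given in this thread and is left
generic, and the column name "features" is assumed to match the features
column the pipeline was trained with.

```java
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.VectorUDT;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Hypothetical helper class; not part of Spark.
public final class UnlabeledVectors {

    private UnlabeledVectors() {}

    // K stands in for the key type, which the thread does not state.
    public static <K> DataFrame toFeaturesFrame(
            SQLContext sqlContext, JavaPairRDD<K, Vector> unlabeledTest) {

        // Wrap each vector in a single-column Row.
        JavaRDD<Row> rows = unlabeledTest.values().map(v -> RowFactory.create(v));

        // Describe that column explicitly: a StructType with one VectorUDT
        // field. "features" is an assumed column name.
        StructType schema = new StructType(new StructField[] {
            new StructField("features", new VectorUDT(), false, Metadata.empty())
        });

        // Use the (JavaRDD<Row>, StructType) overload rather than the
        // (JavaRDD<?>, Class<?>) one used in the original question.
        return sqlContext.createDataFrame(rows, schema);
    }
}
```

The resulting DataFrame could then be passed to the fitted model's
transform() method. The createDataFrame(rdd, Vector.class) call in the
question uses the overload that expects a JavaBean class, which is likely why
it reports that VectorUDT cannot be cast to StructType.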
