Hi,
I have an existing random forest model created in R. I want to use it to
predict values on Spark. Is this possible? If yes, how?
Thanks & Regards,
Neha
May 30, 2016 at 4:52 PM
> Subject: Re: Can we use existing R model in Spark
> To: Sun Rui <sunrise_...@163.com>
> Cc: Neha Mehta <nehamehta...@gmail.com>, user <user@spark.apache.org>
>
>
> Try to invoke an R script from Spark using the RDD pipe method, get the wor…
> …res.html#vectorslicer .
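A minimal sketch of the pipe approach Yuhao describes. The script name `score.R`, the input path, and the line format are all hypothetical: the assumption is that an R script is present on every worker, reads one comma-separated feature row per stdin line, loads the saved R model, and prints one prediction per line.

```scala
// Sketch only — requires a running SparkContext (sc) and an R script
// "score.R" deployed on every worker node (names are hypothetical).
val features: org.apache.spark.rdd.RDD[String] =
  sc.textFile("hdfs:///path/to/features.csv") // hypothetical path

// pipe() streams each partition's lines through the external process's
// stdin and collects its stdout lines as a new RDD[String]
val predictions: org.apache.spark.rdd.RDD[String] =
  features.pipe("Rscript score.R")

predictions.take(5).foreach(println)
```

The model itself never leaves R here; Spark only distributes the scoring work, so the R runtime and the saved model file must be available on each executor.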
>
> Regards,
> Yuhao
>
> 2016-06-01 21:18 GMT+08:00 Neha Mehta <nehamehta...@gmail.com>:
>
>> Hi,
>>
>> I am performing Regression using Random Forest. In my input vector, I
>> want the algorithm to ignore certain columns/features while training the
>> classifier and also while predicting. These are basically Id columns. I
>> checked the documentation and could not find any information on this:
>> …packages/randomForest/randomForest.pdf
>>
>> mtry = 3
>> ntree = 500
>> importance = TRUE
>> maxnodes = NULL
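If the training is done in Spark ML rather than R's randomForest, the usual way to ignore Id columns is simply not to include them when building the feature vector. A sketch, assuming a hypothetical DataFrame `df` with columns `id`, `f1`, `f2`, and `label`:

```scala
import org.apache.spark.ml.feature.VectorAssembler

// Column names (id, f1, f2, label) are hypothetical examples.
// The "id" column is simply left out of inputCols, so the model
// never sees it during training or prediction.
val assembler = new VectorAssembler()
  .setInputCols(Array("f1", "f2")) // note: "id" is excluded
  .setOutputCol("features")

val assembled = assembler.transform(df)
```

VectorSlicer (linked above) does the same kind of selection when the features already sit inside a single vector column instead of separate columns.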
On May 31, 2016 7:03 AM, "Sun Rui" <sunrise_...@163.com> wrote:
I mean: train the random forest model (not using R) and use it for
prediction, both using Spark ML.
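Sun Rui's suggestion — training and predicting entirely in Spark ML — could look roughly like this. The mapping from the R settings quoted above is approximate, and `train`/`test` are hypothetical DataFrames with `features` and `label` columns:

```scala
import org.apache.spark.ml.regression.RandomForestRegressor

// Rough Spark ML analogue of the R randomForest settings above;
// the parameter mapping is approximate, not exact.
val rf = new RandomForestRegressor()
  .setNumTrees(500)                // ~ ntree = 500
  .setFeatureSubsetStrategy("sqrt") // ~ mtry (features tried per split;
                                    //   newer Spark versions also accept
                                    //   an integer string such as "3")
  .setLabelCol("label")
  .setFeaturesCol("features")

val model = rf.fit(train)          // train: DataFrame(features, label)
val predictions = model.transform(test)
```

For the `importance = TRUE` setting, the fitted model exposes `model.featureImportances` after training.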
On May 30, 2016, at 2
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:112)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:114)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:116)
...
Thanks for the help.
Regards,
Neha Mehta
> …"2"))
>
> val rdd = sc.parallelize(arr)
>
> // key each record by its first field
> val prdd = rdd.map(a => (a._1, a))
> // total number of records per key
> val totals = prdd.groupByKey.map(a => (a._1, a._2.size))
>
> // occurrences of each (key, value) pair, re-keyed by the first field
> val n1 = rdd.map(a => ((a._1, a._2), 1))
> val n2 = n1.reduceByKey(_ + _).map(a => (a._1._1, (a._1._2, a._2)))
> var …