Re: use CrossValidatorModel for prediction

2016-10-02 Thread Pengcheng Luo
> On Oct 2, 2016, at 1:04 AM, Pengcheng <pch...@gmail.com> wrote: > Dear Spark Users, I was wondering. I have a trained cross-validator model (model: CrossValidatorModel). I want to predict a score for features: RDD[Features]

use CrossValidatorModel for prediction

2016-10-01 Thread Pengcheng
Dear Spark Users, I was wondering. I have a trained cross-validator model (model: CrossValidatorModel). I want to predict a score for features: RDD[Features]. Right now I have to convert the features to a DataFrame and then perform predictions as follows: val sqlContext = new
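A minimal sketch of the DataFrame-based prediction path this message describes, assuming Spark 2.x with a SparkSession and a fitted pipeline whose features column is named "features"; the helper name predictScores and the column names are illustrative, not from the original message:

import org.apache.spark.ml.linalg.Vector
import org.apache.spark.ml.tuning.CrossValidatorModel
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

def predictScores(model: CrossValidatorModel,
                  features: RDD[Vector],
                  spark: SparkSession): RDD[Double] = {
  import spark.implicits._
  // Spark ML models only transform DataFrames, so wrap the RDD in a
  // single-column DataFrame named the way the fitted pipeline expects.
  val df = features.map(Tuple1.apply).toDF("features")
  // transform() appends a "prediction" column; pull it back out as an RDD.
  model.transform(df).select("prediction").rdd.map(_.getDouble(0))
}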

SparkML RandomForest

2016-08-10 Thread Pengcheng
.setInputCol("category") .setOutputCol("label") val pipeline = new Pipeline().setStages(Array(indexer, rf)) val model: PipelineModel = pipeline.fit(trainingData) thanks, pengcheng
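Since the snippet above is cut off at the start, here is a self-contained reconstruction of what such a pipeline typically looks like; the StringIndexer/RandomForestClassifier stage definitions and the "features" column name are assumptions filled in for illustration, as only the setInputCol/setOutputCol calls, the Pipeline, and pipeline.fit appear in the original message:

import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.classification.RandomForestClassifier
import org.apache.spark.ml.feature.StringIndexer
import org.apache.spark.sql.DataFrame

def trainModel(trainingData: DataFrame): PipelineModel = {
  // Map the string "category" column to a numeric "label" column.
  val indexer = new StringIndexer()
    .setInputCol("category")
    .setOutputCol("label")
  // Random forest reading the indexed label and an assembled "features" column.
  val rf = new RandomForestClassifier()
    .setLabelCol("label")
    .setFeaturesCol("features")
  val pipeline = new Pipeline().setStages(Array(indexer, rf))
  pipeline.fit(trainingData)
}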

Re: spark application was submitted twice unexpectedly

2015-04-19 Thread Pengcheng Liu
Looking into the work folder of the problematic application, it seems that the application keeps creating executors, and the error log of the worker is as below: Exception in thread main java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs at

spark application was submitted twice unexpectedly

2015-04-18 Thread Pengcheng Liu
at the same time). One of the two applications will run to the end, but the other always stays in the running state and never exits or releases resources. Does anyone meet the same issue? The Spark version I am using is Spark 1.1.1. Best Regards, Pengcheng

How to limit the number of concurrent tasks per node?

2015-01-06 Thread Pengcheng YIN
1. But this would take effect globally. My pipeline involves lots of other operations on which I do not want to set a limit. Is there any better solution to fulfill this purpose? Thanks! Pengcheng
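Since the snippet is truncated, one hedged reading is that the global option mentioned is something like spark.task.cpus or the executor core count. A common per-operation alternative is to bound the number of partitions for just the expensive stage, because a stage never runs more concurrent tasks than it has partitions. The helper below is an illustrative sketch of that idea, not the list's recommendation; note it caps cluster-wide (not strictly per-node) concurrency, and the names are made up:

import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

def runWithLimitedParallelism[T, U: ClassTag](
    input: RDD[T],
    maxConcurrentTasks: Int)(heavyStep: T => U): RDD[U] = {
  // coalesce() shrinks the partition count without a full shuffle, so the
  // following map stage runs at most maxConcurrentTasks tasks at once,
  // while the rest of the pipeline keeps its original parallelism.
  input.coalesce(maxConcurrentTasks).map(heavyStep)
}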

does calling cache()/persist() on a RDD trigger its immediate evaluation?

2015-01-03 Thread Pengcheng YIN
with the persisted newRdd. My concern is that, if the RDD is not evaluated and persisted when persist() is called, I need to change the position of the persist()/unpersist() calls to make this more efficient. Thanks, Pengcheng
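For reference, persist() is lazy: it only records the storage level, and nothing is materialized until an action runs on the RDD. A minimal sketch of the usual pattern, where the count() used to force materialization before dropping the old copy is an illustrative choice rather than something from the original message:

import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

def swapCachedRdd[T](oldRdd: RDD[T], newRdd: RDD[T]): RDD[T] = {
  newRdd.persist(StorageLevel.MEMORY_AND_DISK) // lazy: just marks the RDD for caching
  newRdd.count()                                // an action actually materializes the cache
  oldRdd.unpersist()                            // now safe to release the old copy
  newRdd
}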

merge elements in a Spark RDD under custom condition

2014-12-01 Thread Pengcheng YIN
For example, suppose RDD[Seq[Int]] = [[1,2,3], [2,4,5], [1,2], [7,8,9]]; the result should be [[1,2,3,4,5], [7,8,9]]. Since the RDD[Seq[Int]] is very large, I cannot do it in the driver program. Is it possible to get it done using distributed groupBy/map/reduce, etc.? Thanks in advance, Pengcheng
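The merge condition in the example (sequences that share any element end up in the same group) is equivalent to finding connected components over the elements, so one fully distributed approach is to go through GraphX. The sketch below is an assumption-laden illustration of that idea, not the poster's code; it assumes the sequences are non-empty and the elements fit in Int:

import org.apache.spark.graphx.{Graph, VertexId}
import org.apache.spark.rdd.RDD

def mergeOverlapping(seqs: RDD[Seq[Int]]): RDD[Seq[Int]] = {
  // Turn each sequence into a small star of edges linking every element
  // to the sequence's first element.
  val edges: RDD[(VertexId, VertexId)] =
    seqs.flatMap(s => s.map(x => (s.head.toLong, x.toLong)))
  val graph = Graph.fromEdgeTuples(edges, defaultValue = 0)
  // Elements that are transitively linked share a component id.
  graph.connectedComponents().vertices           // RDD[(element, componentId)]
    .map { case (element, component) => (component, element.toInt) }
    .groupByKey()
    .map { case (_, members) => members.toSeq.distinct.sorted }
}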