Loading previously serialized object to Spark

2015-03-06 Thread Ulanov, Alexander
Hi, I've implemented a class MyClass in MLlib that performs an operation on LabeledPoint. MyClass extends Serializable, so I can map this operation over data of type RDD[LabeledPoint], e.g. data.map(lp => MyClass.operate(lp)). I write an instance of this class to a file with ObjectOutputStream.writeObject. Then I stop
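For readers following along, here is a minimal sketch of the plain Java serialization round trip the message describes, outside of Spark. The class `Scaler` and the helper object `RoundTrip` are hypothetical stand-ins (the actual MyClass is not shown in the thread), and an in-memory byte array is used instead of a file to keep the sketch self-contained.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for MyClass; the real class is not shown in the thread.
class Scaler(val factor: Double) extends Serializable {
  def operate(x: Double): Double = x * factor
}

// Hypothetical helpers wrapping ObjectOutputStream / ObjectInputStream.
object RoundTrip {
  // Serialize an object into a byte array with ObjectOutputStream.writeObject.
  def toBytes(obj: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    buffer.toByteArray
  }

  // Read the object back; the cast to T is not checked at compile time.
  def fromBytes[T](bytes: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

Usage would look like `RoundTrip.fromBytes[Scaler](RoundTrip.toBytes(new Scaler(2.0)))`. This only shows the mechanism in a single JVM; it does not reproduce whatever failure the thread goes on to discuss.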

Re: Loading previously serialized object to Spark

2015-03-08 Thread Akhil Das
Can you paste the complete code?

Thanks
Best Regards

On Sat, Mar 7, 2015 at 2:25 AM, Ulanov, Alexander wrote:
> Hi,
>
> I've implemented class MyClass in MLlib that does some operation on
> LabeledPoint. MyClass extends serializable, so I can map this operation on
> data of RDD[LabeledPoints],

RE: Loading previously serialized object to Spark

2015-03-09 Thread Ulanov, Alexander
ClosureCleaner.scala:158)
at org.apache.spark.SparkContext.clean(SparkContext.scala:1453)
at org.apache.spark.rdd.RDD.map(RDD.scala:273)

From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Sunday, March 08, 2015 3:17 AM
To: Ulanov, Alexander
Cc: dev
Subject: Re: Loading previously serialize

Re: Loading previously serialized object to Spark

2015-03-09 Thread Xiangrui Meng
ternal Spark serializer:
> val serializer = SparkEnv.get.closureSerializer.newInstance
>
> -Original Message-
> From: Ulanov, Alexander
> Sent: Monday, March 09, 2015 10:37 AM
> To: Akhil Das
> Cc: dev
> Subject: RE: Loading previously serialized object to Spa

RE: Loading previously serialized object to Spark

2015-03-09 Thread Ulanov, Alexander
: Ulanov, Alexander
Cc: Akhil Das; dev
Subject: Re: Loading previously serialized object to Spark

Could you try `sc.objectFile` instead?

sc.parallelize(Seq(model), 1).saveAsObjectFile("path")
val sameModel = sc.objectFile[NaiveBayesModel]("path").first()

-Xiangrui

On Mon,
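The `sc.objectFile` suggestion quoted above can be sketched as a small standalone program. This is only an illustration under stated assumptions: `ToyModel` is a hypothetical stand-in for the model (the thread uses NaiveBayesModel), the app name, the `local[1]` master, and the temporary path are chosen for the example, and Spark must be on the classpath.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for a model; case classes are Serializable by default.
case class ToyModel(bias: Double, weights: Vector[Double])

object ObjectFileRoundTrip {
  def main(args: Array[String]): Unit = {
    // Local single-threaded context, assumed here only for illustration.
    val conf = new SparkConf().setAppName("object-file-round-trip").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      // saveAsObjectFile expects a path that does not exist yet,
      // so point at a fresh child of a temporary directory.
      val path = Files.createTempDirectory("model").resolve("toy").toString
      val model = ToyModel(0.5, Vector(1.0, 2.0))

      // Wrap the single object in a one-partition RDD and save it.
      sc.parallelize(Seq(model), 1).saveAsObjectFile(path)

      // Load it back as an RDD and take the only element.
      val sameModel = sc.objectFile[ToyModel](path).first()
      assert(sameModel == model)
    } finally {
      sc.stop()
    }
  }
}
```

Because both the save and the load go through Spark, the object is written and read by the same serialization path, which appears to be the point of the suggestion.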

Re: Loading previously serialized object to Spark

2015-03-09 Thread Xiangrui Meng
: Ulanov, Alexander
> Cc: Akhil Das; dev
> Subject: Re: Loading previously serialized object to Spark
>
> Could you try `sc.objectFile` instead?
>
> sc.parallelize(Seq(model), 1).saveAsObjectFile("path")
> val sameModel = sc.objectFile[NaiveBayesModel]("path"

RE: Loading previously serialized object to Spark

2015-03-09 Thread Ulanov, Alexander
: Akhil Das; dev
Subject: Re: Loading previously serialized object to Spark

Well, it is the standard "hacky" way for model save/load in MLlib. We have SPARK-4587 and SPARK-5991 to provide save/load for all MLlib models, in an exchangeable format.

-Xiangrui

On Mon, Mar 9, 2015 at 12:25