Re: How to save mllib model to hdfs and reload it

2014-08-13 Thread Jaideep Dhok
Hi, I have faced a similar issue when trying to run a map function with predict. In my case I had some non-serializable fields in my calling class. After making those fields transient, the error went away. On Wed, Aug 13, 2014 at 6:39 PM, lancezhange lancezha...@gmail.com wrote: let's say you

Callbacks on freeing up of RDDs

2014-06-30 Thread Jaideep Dhok
Hi all, I am trying to create a custom RDD class for the result sets of queries supported in InMobi Grill (http://inmobi.github.io/grill/). Each result set has a schema (similar to Hive's TableSchema) and a path in HDFS containing the result set data. An easy way of doing this would be to create a

Re: TaskNotSerializable when invoking KMeans.run

2014-06-30 Thread Jaideep Dhok
Hi Daniel, I also faced the same issue when using the Naive Bayes classifier in MLlib. I was able to solve it by making all fields in the calling object either transient or serializable. Spark will print which class's object it was not able to serialize in the error message. That can give you a
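
The transient-field fix described above can be illustrated outside Spark with plain Java serialization, which is what Spark uses for closures by default. This is only a minimal sketch: the class names (TransientFieldDemo, Connection, BrokenCaller, FixedCaller) and the trySerialize helper are hypothetical and do not come from the thread or from Spark. It shows that a serializable object holding a non-serializable field fails, that the exception message names the offending class, and that marking the field transient lets serialization succeed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldDemo {

    // Hypothetical helper type that does not implement Serializable.
    static class Connection {}

    // Serializable "calling class" with a non-serializable field: writing it fails.
    static class BrokenCaller implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Connection conn = new Connection();
    }

    // Same class, but the field is transient, so serialization skips it.
    static class FixedCaller implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Connection conn = new Connection();
    }

    // Attempts Java serialization and reports the outcome as a string.
    static String trySerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return "ok";
        } catch (NotSerializableException e) {
            // The message carries the name of the class that could not be serialized.
            return "not serializable: " + e.getMessage();
        } catch (IOException e) {
            return "io error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize(new BrokenCaller()));
        System.out.println(trySerialize(new FixedCaller()));
    }
}
```

Note that a transient field is null after deserialization, so on the receiving side it has to be re-created before use (for example, lazily on first access).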