How to Serialize and Reconstruct JavaRDD later?

2015-09-02 Thread Raja Reddy
Hi All, *Context:* I am exploring topic modelling with LDA in Spark MLlib. However, I need my model to improve incrementally as more batches of documents come in. As of now I see no way of doing something like this, which gensim does:

Re: How to Serialize and Reconstruct JavaRDD later?

2015-09-02 Thread Hemant Bhanawat
You want to persist state between the processing of two RDDs. So, I believe what you need is to serialize your model, not the JavaRDD. If you can serialize your model, you can persist it in HDFS or some other datastore, to be used when processing the next RDDs. If you are using Spark Streaming, doing
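
A minimal sketch of the "serialize the model, not the RDD" idea, using plain Java serialization. It rests on several assumptions: the `TopicModelState` class and its fields are hypothetical stand-ins for whatever state the model needs to carry between batches, `ModelCheckpoint` is an illustrative name, and a local temporary file stands in for HDFS or another datastore.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

public class ModelCheckpoint {

    /** Hypothetical container for the state carried from one batch to the next. */
    public static class TopicModelState implements Serializable {
        private static final long serialVersionUID = 1L;

        public final int batchesSeen;
        public final double[][] topicWordCounts; // rows: topics, columns: vocabulary terms

        public TopicModelState(int batchesSeen, double[][] topicWordCounts) {
            this.batchesSeen = batchesSeen;
            this.topicWordCounts = topicWordCounts;
        }
    }

    /** Writes the model state to the given path (a local file here, assumed in place of HDFS). */
    public static void save(TopicModelState state, Path path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path))) {
            out.writeObject(state);
        }
    }

    /** Reads back a model state previously written by save(). */
    public static TopicModelState load(Path path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
            return (TopicModelState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("lda-state", ".bin");

        // After processing one batch: persist the state.
        save(new TopicModelState(1, new double[][] {{1.0, 2.0}, {3.0, 4.0}}), path);

        // Before processing the next batch: restore it.
        TopicModelState restored = load(path);
        System.out.println("batches seen: " + restored.batchesSeen);

        Files.delete(path);
    }
}
```

The JavaRDD itself is never written out here. Only the comparatively small model state is stored, and each new batch would start from the restored state.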