http://stackoverflow.com/questions/37723308/spark-ml-word2vec-serialization-issues

I recently refactored our Word2Vec code to use the DataFrame-based ml
models, but I am having problems serializing and loading the model
locally.

I am able to successfully:

1. Fit the dataframe and create the model.
2. Retrieve synonyms.

When I try to serialize the model locally, the vectors are not serialized,
so the resulting file is far too small: approximately 2K for 10GB of data.

        // Plain Java serialization of the fitted model to a local file.
        // try-with-resources flushes and closes the stream even on failure.
        try (ObjectOutputStream so =
                 new ObjectOutputStream(new FileOutputStream("/tmp/word2vec"))) {
            so.writeObject(word2VecModel);
        }
        logger.info("Word2Vec model saved");

Loading the model and then calling findSynonyms() results in the
exception below:

        java.lang.NullPointerException
            at org.apache.spark.ml.feature.Word2VecModel.transform(Word2Vec.scala:224)
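
One possible explanation for both symptoms (the tiny file and the
NullPointerException) is that the model keeps its word vectors in a field
marked transient, which Java serialization skips. The sketch below is not
Spark code: ToyModel, its fields, and TransientDemo are hypothetical
stand-ins used only to show the mechanism with the plain JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {

    // Hypothetical stand-in for a model whose large payload is transient.
    static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;

        final String uid;
        transient float[] vectors; // skipped by Java serialization

        ToyModel(String uid, float[] vectors) {
            this.uid = uid;
            this.vectors = vectors;
        }

        // Throws NullPointerException when vectors was not restored.
        int dimension() {
            return vectors.length;
        }
    }

    // Serialize the model to an in-memory byte array.
    static byte[] serialize(ToyModel model) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(model);
        }
        return bytes.toByteArray();
    }

    // Read the model back from the byte array.
    static ToyModel deserialize(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ToyModel) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // About 4 MB of vector data in memory.
        ToyModel model = new ToyModel("w2v_demo", new float[1_000_000]);

        byte[] data = serialize(model);
        // Far smaller than the in-memory payload, since vectors is skipped.
        System.out.println("serialized bytes: " + data.length);

        ToyModel restored = deserialize(data);
        // The transient field comes back as null.
        System.out.println("vectors restored: " + (restored.vectors != null));
    }
}
```

If this is what is happening here, ObjectOutputStream is unlikely to be the
right tool. Spark ML models that support persistence expose their own
writer and reader (for example model.write().save(path) and a static
load(path) on the model class), so it may be worth checking whether the
Spark version in use provides these for Word2VecModel.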

Is there a way to save the model locally?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-ML-Word2Vec-Serialization-Issues-tp27125.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
