Asher Krim created SPARK-19247:
----------------------------------

             Summary: improve ml word2vec save/load
                 Key: SPARK-19247
                 URL: https://issues.apache.org/jira/browse/SPARK-19247
             Project: Spark
          Issue Type: Bug
            Reporter: Asher Krim


ml Word2Vec models can be large (~4 GB is not uncommon). The current save 
implementation writes the model as a single large datum, which can cause 
RPC issues and make the save fail.

Loading has the same problem: reading this single large datum back is also 
problematic. This was already solved for mllib Word2Vec in 
https://issues.apache.org/jira/browse/SPARK-11994, but the change was never 
ported to the ml Word2Vec implementation.
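
To illustrate the general idea of avoiding a single large datum, here is a 
minimal standalone sketch in plain Python (no Spark). It estimates the model 
size, derives a chunk count so that each chunk stays under a size limit, and 
splits the word vectors into per-word rows. The 64 MB limit, the 15-byte 
average word size, and all function names are assumptions made for 
illustration; this is not necessarily the exact approach taken in SPARK-11994.

```python
import math
from typing import Dict, List, Tuple

FLOAT_BYTES = 4                      # assumes vectors are 32-bit floats
AVG_WORD_BYTES = 15                  # assumed average word size, estimate only
MAX_CHUNK_BYTES = 64 * 1024 * 1024   # assumed per-chunk size limit (hypothetical)

Row = Tuple[str, List[float]]


def approx_model_bytes(num_words: int, vector_size: int) -> int:
    """Rough size estimate: one vector plus one word per vocabulary entry."""
    return (FLOAT_BYTES * vector_size + AVG_WORD_BYTES) * num_words


def num_partitions(num_words: int, vector_size: int,
                   max_chunk_bytes: int = MAX_CHUNK_BYTES) -> int:
    """Number of chunks needed so each chunk is roughly under the limit."""
    return approx_model_bytes(num_words, vector_size) // max_chunk_bytes + 1


def split_model(model: Dict[str, List[float]], n_parts: int) -> List[List[Row]]:
    """Split the model into at most n_parts lists of (word, vector) rows."""
    rows = list(model.items())
    per_part = max(1, math.ceil(len(rows) / n_parts))
    return [rows[i:i + per_part] for i in range(0, len(rows), per_part)]


def merge_parts(parts: List[List[Row]]) -> Dict[str, List[float]]:
    """Loading side: rebuild the word -> vector map from the chunks."""
    merged: Dict[str, List[float]] = {}
    for part in parts:
        merged.update(part)
    return merged
```

Under these assumptions, a model with 10 million words and 100-dimensional 
vectors is estimated at about 4.15 GB and would be split into 62 chunks 
instead of being written as one datum.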



