[ 
https://issues.apache.org/jira/browse/SPARK-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14229943#comment-14229943
 ] 

Peter Rudenko commented on SPARK-4101:
--------------------------------------

But I want to be able to extend it further (implement GloVe, doc2vec, RNN, 
etc.), so it would be great to reuse some code from the Word2Vec class. However, 
everything there is private (learnVocab, createBinaryTree, createExpTable), and 
the model itself is private to the mllib package.

> [MLLIB] Improve API in Word2Vec model
> -------------------------------------
>
>                 Key: SPARK-4101
>                 URL: https://issues.apache.org/jira/browse/SPARK-4101
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.1.0
>            Reporter: Peter Rudenko
>            Priority: Minor
>
> 1) It would be nice to be able to retrieve the underlying model map, in order 
> to work with it afterwards (make an RDD, persist/load, train online, etc.). 
> (Done by [SPARK-4582|https://issues.apache.org/jira/browse/SPARK-4582].)
> 2) Be able to extend Word2VecModel to add custom functionality (for example, 
> an analogyWords(w1: String, w2: String, target: String, num: Int) method that 
> returns the num words that relate to target as w1 relates to w2).
> 3) Make the cosineSimilarity method public so that it can be reused.
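
To make requests 2) and 3) concrete, here is a minimal sketch of what a public cosineSimilarity and an analogyWords helper might look like. It is not the MLlib implementation: it runs on a plain Map[String, Array[Float]] (assumed here to be the kind of word-to-vector map that request 1 asks to expose) rather than on Word2VecModel, and the names AnalogySketch and Vectors are hypothetical. It also assumes one reading of "relates to target as w1 to w2", namely a query vector of target + w1 - w2.

```scala
// Hypothetical standalone sketch; not part of Spark MLlib.
object AnalogySketch {
  // Assumed shape of the underlying model map: word -> embedding vector.
  type Vectors = Map[String, Array[Float]]

  // Cosine similarity of two vectors of equal length.
  // Returns 0.0 if either vector has zero norm.
  def cosineSimilarity(a: Array[Float], b: Array[Float]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    var i = 0
    while (i < a.length) {
      dot += a(i) * b(i)
      normA += a(i) * a(i)
      normB += b(i) * b(i)
      i += 1
    }
    val denom = math.sqrt(normA * normB)
    if (denom == 0.0) 0.0 else dot / denom
  }

  // Returns up to `num` words x such that x relates to `target` as `w1`
  // relates to `w2`, under the assumed reading x ~ target + w1 - w2.
  // Throws NoSuchElementException if any input word is not in `vectors`.
  def analogyWords(vectors: Vectors, w1: String, w2: String,
                   target: String, num: Int): Seq[(String, Double)] = {
    val v1 = vectors(w1)
    val v2 = vectors(w2)
    val vt = vectors(target)
    val query = Array.tabulate(vt.length)(i => vt(i) + v1(i) - v2(i))
    val exclude = Set(w1, w2, target) // do not return the input words
    vectors.iterator
      .filter { case (word, _) => !exclude(word) }
      .map { case (word, vec) => (word, cosineSimilarity(query, vec)) }
      .toSeq
      .sortBy { case (_, sim) => -sim } // most similar first
      .take(num)
  }
}
```

With a toy map containing vectors for "king", "man", "woman" and "queen", AnalogySketch.analogyWords(vecs, "king", "man", "woman", 1) would rank whichever remaining word lies closest to woman + king - man. A linear scan over the whole map is used only for brevity; a real model would likely need something faster for large vocabularies.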



