[jira] [Commented] (SPARK-7045) Word2Vec: avoid intermediate representation when creating model
[ https://issues.apache.org/jira/browse/SPARK-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517621#comment-14517621 ]

Apache Spark commented on SPARK-7045:

User 'MechCoder' has created a pull request for this issue:
https://github.com/apache/spark/pull/5748

> Word2Vec: avoid intermediate representation when creating model
> ---
>
> Key: SPARK-7045
> URL: https://issues.apache.org/jira/browse/SPARK-7045
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Affects Versions: 1.4.0
> Reporter: Joseph K. Bradley
> Priority: Minor
>
> Word2VecModel now stores the word vectors as a single, flat array; Word2Vec
> does as well. However, when Word2Vec creates the model, it builds an
> intermediate representation. We should skip that intermediate representation.
> However, it will be nice to create a public constructor for Word2VecModel
> which takes that intermediate representation (a Map from String words to
> their Vectors), since it's a user-friendly representation.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
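For readers following the issue description, here is a rough sketch of what a Map-based convenience constructor over a flat array might look like. It is only an illustration under stated assumptions: `FlatWordVectors`, `buildIndex`, `flatten`, and `vector` are hypothetical names, the vectors are assumed to be plain `Array[Float]`, and none of this is Spark's actual `Word2VecModel` code or the content of the pull request.

```scala
// Hypothetical stand-in for a model that stores all word vectors in one flat array.
// This is NOT Spark's Word2VecModel; names and types are assumptions for illustration.
class FlatWordVectors(val wordIndex: Map[String, Int], val wordVectors: Array[Float]) {

  // Convenience constructor taking the user-friendly Map representation.
  // It delegates to the primary constructor via companion-object helpers.
  def this(model: Map[String, Array[Float]]) =
    this(FlatWordVectors.buildIndex(model), FlatWordVectors.flatten(model))

  // Length of each word vector, derived from the flat array.
  val vectorSize: Int =
    if (wordIndex.isEmpty) 0 else wordVectors.length / wordIndex.size

  // Copy one word's vector back out of the flat array.
  def vector(word: String): Array[Float] = {
    val start = wordIndex(word) * vectorSize
    wordVectors.slice(start, start + vectorSize)
  }
}

object FlatWordVectors {
  // Fix a deterministic word order so that both helpers agree on positions.
  private def orderedWords(model: Map[String, Array[Float]]): Seq[String] =
    model.keys.toSeq.sorted

  // Map each word to its position in the flat array.
  def buildIndex(model: Map[String, Array[Float]]): Map[String, Int] =
    orderedWords(model).zipWithIndex.toMap

  // Concatenate all vectors, in word order, into a single array.
  def flatten(model: Map[String, Array[Float]]): Array[Float] = {
    require(model.values.map(_.length).toSet.size <= 1,
      "all vectors must have the same length")
    orderedWords(model).flatMap(w => model(w).toSeq).toArray
  }
}
```

With this shape, the training code could build the flat array directly and call the primary constructor, while users who already hold a Map could call the convenience constructor.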
[jira] [Commented] (SPARK-7045) Word2Vec: avoid intermediate representation when creating model
[ https://issues.apache.org/jira/browse/SPARK-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507320#comment-14507320 ]

Joseph K. Bradley commented on SPARK-7045:

Oh, ok, I bet tmp() needs to be static.
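Scala has no `static` keyword; the closest equivalent is a method on the companion object. A minimal sketch of that suggestion, applied to the reproduction posted in this thread, might look like the following. The class name `A` and the use of `toInt` are illustrative choices, not code from the pull request.

```scala
// Hypothetical stand-in for the class in the reproduction.
class A(val f: Int) {
  // Auxiliary constructor. It can call a companion-object helper because
  // that helper does not need an instance of A to exist yet.
  // A.tmp returns an Int, so this call resolves to the primary constructor.
  def this(c: Float) = this(A.tmp(c))
}

object A {
  // "Static" helper: defined on the companion object, not on the instance.
  // toInt truncates the fractional part.
  def tmp(c: Float): Int = c.toInt
}
```

Here `new A(3.7f)` goes through the auxiliary constructor and `new A(5)` goes straight to the primary one.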
[jira] [Commented] (SPARK-7045) Word2Vec: avoid intermediate representation when creating model
[ https://issues.apache.org/jira/browse/SPARK-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507199#comment-14507199 ]

Manoj Kumar commented on SPARK-7045:

I did try the method you suggested, but it does not work. Simple code to reproduce:

{code}
class a(f: Int) {
  def tmp(c: Float) = c.asInstanceOf[Int]
  def this(c: Float) = this(c)
}
{code}

This fails with the error "constructor's definition must precede calling constructor's definition."
[jira] [Commented] (SPARK-7045) Word2Vec: avoid intermediate representation when creating model
[ https://issues.apache.org/jira/browse/SPARK-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506405#comment-14506405 ]

Joseph K. Bradley commented on SPARK-7045:

Ping [~MechCoder]