[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648180#comment-14648180 ]

Joseph K. Bradley commented on SPARK-5692:

This was not, but thanks for the reminder; it'd be nice to add. I'll make and link a JIRA for it.

Model import/export for Word2Vec
  Key: SPARK-5692
  URL: https://issues.apache.org/jira/browse/SPARK-5692
  Project: Spark
  Issue Type: Sub-task
  Components: MLlib
  Reporter: Xiangrui Meng
  Assignee: Manoj Kumar
  Fix For: 1.4.0

Support save and load for Word2VecModel. We may want to discuss whether we want to be compatible with the original Word2Vec model storage format.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647432#comment-14647432 ]

Robin East commented on SPARK-5692:

Hi, the description includes the sentence 'We may want to discuss whether we want to be compatible with the original Word2Vec model storage format.' Was this ever discussed? I can't see anything in the comment stream for this JIRA. Is there any interest in adding functionality to import Word2Vec models from the original binary format (e.g. the 300 million word Google News model)?
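For readers unfamiliar with the "original binary format" raised in the comment above, here is a rough Python sketch of a reader for it. It is not part of Spark and not something proposed in this thread. It assumes the layout written by the reference word2vec C tool as commonly described: an ASCII header line `<vocab_size> <vector_size>`, then for each word the word bytes, a single space, `vector_size` 32-bit floats, and a trailing newline. The little-endian byte order and the function name `read_word2vec_binary` are assumptions made for illustration.

```python
import io
import struct
from typing import BinaryIO, Dict, List


def read_word2vec_binary(stream: BinaryIO) -> Dict[str, List[float]]:
    """Parse word vectors from a stream in the assumed word2vec binary layout.

    Hypothetical helper for illustration only; not a Spark API.
    """
    # Header line: "<vocab_size> <vector_size>\n" as text.
    header = stream.readline().decode("utf-8").split()
    vocab_size, vector_size = int(header[0]), int(header[1])

    fmt = "<%df" % vector_size  # assumption: little-endian float32 values
    nbytes = struct.calcsize(fmt)

    vectors: Dict[str, List[float]] = {}
    for _ in range(vocab_size):
        # Read the word byte by byte up to the separating space, skipping
        # the newline that terminates the previous record.
        word = bytearray()
        while True:
            ch = stream.read(1)
            if not ch:
                raise EOFError("unexpected end of data while reading a word")
            if ch == b" ":
                break
            if ch != b"\n":
                word.extend(ch)

        # Read exactly vector_size floats for this word.
        raw = stream.read(nbytes)
        if len(raw) != nbytes:
            raise EOFError("unexpected end of data while reading a vector")
        vectors[word.decode("utf-8", errors="replace")] = list(struct.unpack(fmt, raw))
    return vectors


if __name__ == "__main__":
    # Build a tiny two-word file in memory and read it back.
    sample = (
        b"2 2\n"
        + b"hello " + struct.pack("<2f", 0.5, -1.0) + b"\n"
        + b"world " + struct.pack("<2f", 2.0, 0.25) + b"\n"
    )
    print(read_word2vec_binary(io.BytesIO(sample)))
```

Reading a fixed number of bytes per vector, rather than scanning for a delimiter, matters here because the packed float bytes can themselves contain space or newline byte values.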
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388296#comment-14388296 ]

Apache Spark commented on SPARK-5692:

User 'MechCoder' has created a pull request for this issue: https://github.com/apache/spark/pull/5291
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377266#comment-14377266 ]

Xiangrui Meng commented on SPARK-5692:

[~anupamme] You should get familiar with Scala and Spark development first before working on specific JIRAs. I've assigned this ticket to [~MechCoder]. There are many open JIRAs for MLlib. Once you are familiar with Scala/Spark, feel free to ping me on a JIRA that you are interested in.

Assignee: ANUPAM MEDIRATTA
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357069#comment-14357069 ]

Manoj Kumar commented on SPARK-5692:

okay, great

Assignee: ANUPAM MEDIRATTA
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358573#comment-14358573 ]

ANUPAM MEDIRATTA commented on SPARK-5692:

I tried working on it. I am new to Spark and Scala. I am not able to run tests in Scala IDE, and I am not able to compile the code base in Eclipse so that I can run tests (to verify my code). Are there any instructions on how to compile this codebase in Eclipse (Scala IDE)?

Assignee: ANUPAM MEDIRATTA
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358583#comment-14358583 ]

Manoj Kumar commented on SPARK-5692:

I'm not sure about Eclipse, but I work just in Sublime Text and build using the instructions given here: https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools

Assignee: ANUPAM MEDIRATTA
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357055#comment-14357055 ]

ANUPAM MEDIRATTA commented on SPARK-5692:

Manoj Kumar, not yet, but I plan to work on it over the weekend. Is that okay?

Assignee: ANUPAM MEDIRATTA
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356959#comment-14356959 ]

Manoj Kumar commented on SPARK-5692:

[~anupamme] Hi, are you still working on this? Thanks.

Assignee: ANUPAM MEDIRATTA
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348177#comment-14348177 ]

ANUPAM MEDIRATTA commented on SPARK-5692:

Hey Xiangrui, please assign the ticket to me.
[jira] [Commented] (SPARK-5692) Model import/export for Word2Vec
[ https://issues.apache.org/jira/browse/SPARK-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348212#comment-14348212 ]

Xiangrui Meng commented on SPARK-5692:

Done. The Parquet data file should have two columns: `(word: String, vector: Array[Float])`. Please follow the instructions at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark and https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide to submit a PR. Thanks!

Assignee: ANUPAM MEDIRATTA
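To make the two-column layout from the comment above concrete, here is a small Python sketch of how a model could be flattened into `(word, vector)` rows and rebuilt from them. It only illustrates the row shape; it does not write Parquet and is not the code from the pull request. The names `WordVectorRow`, `model_to_rows`, and `rows_to_model` are hypothetical, and the model is assumed to be held in memory as a mapping from word to vector. Python floats are 64-bit, whereas `Array[Float]` holds 32-bit values.

```python
from typing import Dict, Iterable, List, NamedTuple


class WordVectorRow(NamedTuple):
    """One row of the assumed two-column layout: (word, vector)."""
    word: str
    vector: List[float]


def model_to_rows(model: Dict[str, List[float]]) -> List[WordVectorRow]:
    """Flatten a word -> vector mapping into rows, sorted by word."""
    return [WordVectorRow(word, list(vec)) for word, vec in sorted(model.items())]


def rows_to_model(rows: Iterable[WordVectorRow]) -> Dict[str, List[float]]:
    """Rebuild the word -> vector mapping from rows."""
    return {row.word: list(row.vector) for row in rows}


if __name__ == "__main__":
    model = {"spark": [0.5, 0.25], "mllib": [1.0, 2.0]}
    rows = model_to_rows(model)
    for row in rows:
        print(row.word, row.vector)
    # The round trip should reproduce the original mapping.
    print(rows_to_model(rows) == model)
```

Keeping one row per word means the saved data can be inspected or filtered with ordinary column queries, at the cost of repeating the vector length in every row.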