[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield "OutOfMemoryError: Requested array size exceeds VM limit"

2016-01-28 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122775#comment-15122775 ] Joseph Tang commented on SPARK-4846: Hi Tung, As far as I can remember, the data is serialized by

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield "OutOfMemoryError: Requested array size exceeds VM limit"

2016-01-28 Thread Tung Dang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121503#comment-15121503 ] Tung Dang commented on SPARK-4846: [~josephkb]: I have changed the mode to yarn-cluster, however it

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield "OutOfMemoryError: Requested array size exceeds VM limit"

2015-12-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045713#comment-15045713 ] Joseph K. Bradley commented on SPARK-4846: This sounds like a limitation of using yarn-client

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield "OutOfMemoryError: Requested array size exceeds VM limit"

2015-12-01 Thread Tung Dang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035427#comment-15035427 ] Tung Dang commented on SPARK-4846: I have a question regarding this issue: as far as I understand,

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-28 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294884#comment-14294884 ] Xiangrui Meng commented on SPARK-4846: We should throw a RuntimeException before

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295011#comment-14295011 ] Apache Spark commented on SPARK-4846: User 'jinntrance' has created a pull request

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-28 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295020#comment-14295020 ] Joseph Tang commented on SPARK-4846: OK. I've sent a new PR as below. When the

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292926#comment-14292926 ] Joseph Tang commented on SPARK-4846: I've added some code at

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292853#comment-14292853 ] Joseph Tang commented on SPARK-4846: Sorry about the procrastination. I'm still

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292855#comment-14292855 ] Joseph Tang commented on SPARK-4846: Sorry about the procrastination. I'm still

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292886#comment-14292886 ] Joseph Tang commented on SPARK-4846: Hi Xiangrui, here is a problem. PR #3693 that

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292718#comment-14292718 ] Xiangrui Meng commented on SPARK-4846: [~josephtang] Are you working on this issue?

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2014-12-30 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261621#comment-14261621 ] Xiangrui Meng commented on SPARK-4846: We merged `setMinCount()` in PR #3693. For

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2014-12-23 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256852#comment-14256852 ] Joseph Tang commented on SPARK-4846: It sounds accomplishable. I'll try this and make

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2014-12-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248802#comment-14248802 ] Joseph K. Bradley commented on SPARK-4846: Changing vectorSize sounds too

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2014-12-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246665#comment-14246665 ] Sean Owen commented on SPARK-4846: I think you're just running out of memory on your

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2014-12-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246732#comment-14246732 ] Sean Owen commented on SPARK-4846: But being lazy doesn't really change whether it is

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2014-12-15 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247487#comment-14247487 ] Joseph K. Bradley commented on SPARK-4846: I agree with [~srowen] that the

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2014-12-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246337#comment-14246337 ] Apache Spark commented on SPARK-4846: User 'jinntrance' has created a pull request