[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887730#comment-15887730 ]

Apache Spark commented on SPARK-16440:
--

User 'AnthonyTruchet' has created a pull request for this issue:
https://github.com/apache/spark/pull/14299

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Anthony Truchet
> Fix For: 1.6.3, 2.0.1
>
> Original Estimate: 4h
> Remaining Estimate: 4h
>
> Three broadcast variables created at the beginning of {{Word2Vec.fit()}} are
> never deleted nor unpersisted. This seems to cause excessive memory
> consumption on the driver for a job running hundreds of successive training runs.
> They are
> {code}
> val expTable = sc.broadcast(createExpTable())
> val bcVocab = sc.broadcast(vocab)
> val bcVocabHash = sc.broadcast(vocabHash)
> {code}

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387342#comment-15387342 ]

Anthony Truchet commented on SPARK-16440:
-

Regarding the try-finally: we run many training jobs within the same Spark context, and some have vocabularies so large that they fail (yes, we do try to filter out the ones that are too big, but "too big" is difficult to define). So we do care about cleaning up resources when an error occurs, in order to allow thousands of successive training runs, some of which are expected to fail.
As for code readability, we can try to refactor the function to reduce the nesting, or find a "nice" Scala solution. I'll propose a patch and I'll welcome any feedback on it.

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Anthony Truchet
> Fix For: 1.6.3, 2.0.1
>
> Original Estimate: 4h
> Remaining Estimate: 4h
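To make the try-finally idea above concrete, here is a minimal sketch of one way to keep the nesting shallow. It is not the patch proposed in this thread. It assumes only Spark's public `SparkContext.broadcast` and `Broadcast.destroy()` calls; the object `BroadcastTryFinallySketch`, the helpers `withBroadcasts` and `lookupSum`, and the local master setting are hypothetical names chosen for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast
import scala.util.Try
import scala.util.control.NonFatal

object BroadcastTryFinallySketch {
  // Hypothetical helper: runs `body`, then destroys every given broadcast,
  // whether `body` returned normally or threw.
  def withBroadcasts[A](broadcasts: Broadcast[_]*)(body: => A): A = {
    try {
      body
    } finally {
      broadcasts.foreach { b =>
        // Ignore cleanup failures so they do not mask an exception from `body`.
        try b.destroy()
        catch { case NonFatal(_) => () }
      }
    }
  }

  // Hypothetical job: sums weights(i) over the indices on the executors.
  // An out-of-range index makes the tasks, and therefore the job, fail.
  def lookupSum(sc: SparkContext, weights: Broadcast[Array[Double]], indices: Seq[Int]): Double =
    sc.parallelize(indices).map(i => weights.value(i)).sum()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("broadcast-try-finally-sketch")
    val sc = new SparkContext(conf)
    try {
      val bcWeights = sc.broadcast(Array(0.5, 1.5))
      // Index 5 is out of range, so this job is expected to fail.
      val attempt = Try(withBroadcasts(bcWeights) { lookupSum(sc, bcWeights, Seq(0, 1, 5)) })
      println(s"job failed: ${attempt.isFailure}")
      // The broadcast was destroyed anyway, so reading it on the driver now fails too.
      println(s"broadcast unusable after failure: ${Try(bcWeights.value).isFailure}")
    } finally {
      sc.stop()
    }
  }
}
```

With a helper of this shape, the training body stays at one level of nesting regardless of how many broadcasts are created. One limitation of the sketch: a broadcast that is created before a later `sc.broadcast` call throws would not be cleaned up, because the helper only sees broadcasts that already exist.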
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384608#comment-15384608 ]

Apache Spark commented on SPARK-16440:
--

User 'AnthonyTruchet' has created a pull request for this issue:
https://github.com/apache/spark/pull/14268

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Sean Owen
> Fix For: 1.6.3, 2.0.0
>
> Original Estimate: 4h
> Remaining Estimate: 4h
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384131#comment-15384131 ]

Sean Owen commented on SPARK-16440:
---

Yeah, it seems good to destroy even in case of errors, but in practice an error here means lots of things are wrong. Using try-finally to destroy every RDD/variable would make the code a mess. I think in many cases where it plausibly won't matter, we don't bother. If there's a decent argument that errors here are common for some reason, OK, but I'm not sure that's true.

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Sean Owen
> Fix For: 1.6.3, 2.0.0
>
> Original Estimate: 4h
> Remaining Estimate: 4h
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384118#comment-15384118 ]

Anthony Truchet commented on SPARK-16440:
-

I will, and I will also put this in a try-finally to ensure proper deletion even in case of errors.

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Sean Owen
> Fix For: 1.6.3, 2.0.0
>
> Original Estimate: 4h
> Remaining Estimate: 4h
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383971#comment-15383971 ]

Sean Owen commented on SPARK-16440:
---

Oh, that may be better still indeed. Feel free to submit a follow-up associated with this same JIRA.

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Sean Owen
> Fix For: 1.6.3, 2.0.0
>
> Original Estimate: 4h
> Remaining Estimate: 4h
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383967#comment-15383967 ]

Anthony Truchet commented on SPARK-16440:
-

Thanks for such a quick fix, [~srowen]. I was offline for the past week, which is why I couldn't submit the patch quickly enough.
I would have {{destroy}}ed the variables instead of {{unpersist}}ing them, though, as the issue was memory consumption on the driver side. What am I missing that made you choose the latter over the former?

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Assignee: Sean Owen
> Fix For: 1.6.3, 2.0.0
>
> Original Estimate: 4h
> Remaining Estimate: 4h
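For readers following the {{unpersist}} versus {{destroy}} question, the sketch below illustrates the difference as described in Spark's `Broadcast` API documentation. It is not code from Word2Vec or from any pull request on this issue; the object `BroadcastCleanupSketch`, the helper `sumWithBroadcast`, and the local master setting are hypothetical names used only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastCleanupSketch {
  // Hypothetical helper: sums table(i) over the indices, reading the table
  // on the executors through a broadcast variable.
  def sumWithBroadcast(sc: SparkContext, table: Array[Double], indices: Seq[Int]): Double = {
    val bcTable = sc.broadcast(table)
    val total = sc.parallelize(indices).map(i => bcTable.value(i)).sum()

    // bcTable.unpersist() would only delete the cached copies on the executors;
    // the broadcast stays usable and is re-sent to executors if it is used again.
    //
    // destroy() removes all data and metadata related to the broadcast,
    // and the variable cannot be used afterwards.
    bcTable.destroy()
    total
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("broadcast-cleanup-sketch")
    val sc = new SparkContext(conf)
    try {
      println(sumWithBroadcast(sc, Array(1.0, 2.0, 3.0), Seq(0, 2)))
    } finally {
      sc.stop()
    }
  }
}
```

Because the broadcast never escapes `sumWithBroadcast`, destroying it at the end of the method is safe here. When a broadcast may still be read later, for example by a lazily evaluated RDD returned to the caller, `unpersist()` is the more cautious choice.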
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372781#comment-15372781 ]

Apache Spark commented on SPARK-16440:
--

User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/14153

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Original Estimate: 4h
> Remaining Estimate: 4h
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367616#comment-15367616 ]

Sean Owen commented on SPARK-16440:
---

Yeah, it would be fine to unpersist these at the end of the method. I suppose I'm surprised that the Broadcast vars don't unpersist themselves when they go out of scope and are garbage collected via finalize?

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Original Estimate: 4h
> Remaining Estimate: 4h
[jira] [Commented] (SPARK-16440) Undeleted broadcast variables in Word2Vec causing OoM for long runs
[ https://issues.apache.org/jira/browse/SPARK-16440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367615#comment-15367615 ]

Anthony Truchet commented on SPARK-16440:
-

Hello Spark developers,
I'm preparing a patch for this issue. This will be my first contribution to Spark. I'll strive to follow the contribution guidelines, but please do not hesitate to tell me how to do it better if required :-)

> Undeleted broadcast variables in Word2Vec causing OoM for long runs
>
> Key: SPARK-16440
> URL: https://issues.apache.org/jira/browse/SPARK-16440
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Anthony Truchet
> Original Estimate: 4h
> Remaining Estimate: 4h