Repository: spark
Updated Branches:
  refs/heads/branch-2.0 74ad486dc -> 5a71a0501

[SPARK-16440][MLLIB] Undeleted broadcast variables in Word2Vec causing OoM for long runs

## What changes were proposed in this pull request?

Unpersist broadcasted vars in Word2Vec.fit for more timely / reliable resource cleanup

## How was this patch tested?

Jenkins tests

Author: Sean Owen <so...@cloudera.com>

Closes #14153 from srowen/SPARK-16440.

(cherry picked from commit 51ade51a9fd64fc2fe651c505a286e6f29f59d40)
Signed-off-by: Sean Owen <so...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5a71a050
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5a71a050
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5a71a050

Branch: refs/heads/branch-2.0
Commit: 5a71a05015ac7aabfb6c4aa8753abc87ead20718
Parents: 74ad486
Author: Sean Owen <so...@cloudera.com>
Authored: Wed Jul 13 11:39:32 2016 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Wed Jul 13 11:39:39 2016 +0100

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala | 3 +++
 1 file changed, 3 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/5a71a050/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
index f2211df..6b9c8ee 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
@@ -434,6 +434,9 @@ class Word2Vec extends Serializable with Logging {
       bcSyn1Global.unpersist(false)
     }
     newSentences.unpersist()
+    expTable.unpersist()
+    bcVocab.unpersist()
+    bcVocabHash.unpersist()
 
     val wordArray = vocab.map(_.word)
     new Word2VecModel(wordArray.zipWithIndex.toMap, syn0Global)
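
The three added unpersist() calls follow a general pattern: a broadcast variable created inside a method is released once the last job that reads it has finished. The following is a minimal sketch of that pattern, not the Word2Vec code itself. The object name BroadcastCleanupSketch, the method lookupIds, the vocabIndex map, and the local[2] master are all hypothetical and chosen only for illustration; the try/finally wrapper is an addition of this sketch and is not part of the patch.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastCleanupSketch {

  // Maps each word to an id using a broadcast lookup table, then releases
  // the broadcast. vocabIndex is a hypothetical stand-in for the kind of
  // data held by bcVocab / bcVocabHash in the patch.
  def lookupIds(sc: SparkContext, words: Seq[String]): Array[Int] = {
    val vocabIndex = Map("spark" -> 0, "word" -> 1, "vector" -> 2)
    val bcVocabIndex = sc.broadcast(vocabIndex)
    try {
      sc.parallelize(words)
        .map(w => bcVocabIndex.value.getOrElse(w, -1)) // -1 marks an unknown word
        .collect()
    } finally {
      // Ask executors to drop their cached copies of the broadcast value.
      // Running this in finally (an assumption of this sketch) also covers
      // the case where the job above throws.
      bcVocabIndex.unpersist()
    }
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs without a cluster.
    val conf = new SparkConf()
      .setAppName("broadcast-cleanup-sketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      println(lookupIds(sc, Seq("spark", "vector", "unknown")).mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Note that unpersist() only removes the cached copies on executors; the value is sent again if the broadcast is used in a later job. Broadcast.destroy() releases everything and makes the variable unusable afterwards, which may be the better fit when the variable is certain not to be read again.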