Repository: spark
Updated Branches:
  refs/heads/branch-1.6 fb0933681 -> 4381e2121

[SPARK-16440][MLLIB] Undeleted broadcast variables in Word2Vec causing OoM for long runs

## What changes were proposed in this pull request?

Unpersist broadcasted vars in Word2Vec.fit for more timely / reliable resource cleanup

## How was this patch tested?

Jenkins tests

Author: Sean Owen <so...@cloudera.com>

Closes #14153 from srowen/SPARK-16440.

(cherry picked from commit 51ade51a9fd64fc2fe651c505a286e6f29f59d40)
Signed-off-by: Sean Owen <so...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/4381e212
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/4381e212
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/4381e212

Branch: refs/heads/branch-1.6
Commit: 4381e212140102b4bce756146c09e866c7b2d85c
Parents: fb09336
Author: Sean Owen <so...@cloudera.com>
Authored: Wed Jul 13 11:39:32 2016 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Wed Jul 13 11:39:49 2016 +0100

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala | 3 +++
 1 file changed, 3 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/4381e212/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
index 30a1849..c2ed896 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
@@ -416,6 +416,9 @@ class Word2Vec extends Serializable with Logging {
       }
     }
     newSentences.unpersist()
+    expTable.unpersist()
+    bcVocab.unpersist()
+    bcVocabHash.unpersist()
 
     val wordArray = vocab.map(_.word)
     new Word2VecModel(wordArray.zipWithIndex.toMap, syn0Global)
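
For readers unfamiliar with the pattern this patch applies, the following is a minimal sketch, not the Word2Vec code itself. It assumes Spark is on the classpath and runs in local mode; the object name BroadcastCleanupSketch, the method lookupIndices, and the sample vocabulary are hypothetical and exist only for illustration. It broadcasts a small lookup table, uses it in one job, and then calls unpersist() on the broadcast so the cached copies on executors can be released once the job is done, rather than waiting for garbage collection on the driver.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical illustration of the broadcast-then-unpersist pattern.
object BroadcastCleanupSketch {

  // Maps each word to its vocabulary index (-1 if unknown) using a broadcast table.
  def lookupIndices(sc: SparkContext, words: Seq[String]): Array[Int] = {
    // Illustrative vocabulary; not taken from Word2Vec.
    val vocab = Map("spark" -> 0, "word2vec" -> 1)
    val bcVocab = sc.broadcast(vocab)
    try {
      sc.parallelize(words)
        .map(w => bcVocab.value.getOrElse(w, -1))
        .collect()
    } finally {
      // Drop the executor-side copies now that the job has finished.
      // The broadcast stays usable afterwards; it would be re-sent on demand.
      bcVocab.unpersist()
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("broadcast-cleanup-sketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      println(lookupIndices(sc, Seq("spark", "word2vec", "unknown")).mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

Placing unpersist() in a finally block is a choice made for this sketch; the patch above simply calls unpersist() after the training loop completes.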