Repository: spark
Updated Branches:
  refs/heads/branch-2.0 74ad486dc -> 5a71a0501


[SPARK-16440][MLLIB] Undeleted broadcast variables in Word2Vec causing OoM for 
long runs

## What changes were proposed in this pull request?

Unpersist the broadcast variables (expTable, bcVocab, bcVocabHash) in 
Word2Vec.fit for more timely and reliable resource cleanup.
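
As a rough illustration of the cleanup pattern described above (not the code from this patch), the sketch below broadcasts a small vocabulary map, uses it in one job, and then calls `unpersist()` so that executors can drop their cached copies. The object name `BroadcastCleanupSketch`, the helper `countKnownWords`, and the sample data are hypothetical. The `try`/`finally` wrapper is a design choice of this sketch; the patch itself simply calls `unpersist()` after training completes.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; none of these names come from Word2Vec.scala.
object BroadcastCleanupSketch {

  // Counts how many words across all sentences appear in the vocabulary.
  def countKnownWords(
      sc: SparkContext,
      sentences: Seq[Seq[String]],
      vocab: Map[String, Int]): Long = {
    // Ship the vocabulary to executors once instead of once per task.
    val bcVocab = sc.broadcast(vocab)
    try {
      sc.parallelize(sentences)
        .map(words => words.count(w => bcVocab.value.contains(w)).toLong)
        .fold(0L)(_ + _)
    } finally {
      // Release the executor-side copies as soon as the job is done.
      // unpersist() is non-blocking by default; the driver keeps the value,
      // so the broadcast would be re-sent if it were used again.
      bcVocab.unpersist()
    }
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("broadcast-cleanup-sketch")
      .setMaster("local[2]") // local mode, assumed for demonstration only
    val sc = new SparkContext(conf)
    try {
      val sentences = Seq(Seq("a", "b"), Seq("b", "c"))
      val vocab = Map("a" -> 0, "b" -> 1)
      println(countKnownWords(sc, sentences, vocab)) // prints 3
    } finally {
      sc.stop()
    }
  }
}
```

In a long-running application that trains many models, leaving such broadcasts to be reclaimed only by garbage collection on the driver can let executor memory accumulate, which is the situation the issue title describes.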

## How was this patch tested?

Jenkins tests

Author: Sean Owen <so...@cloudera.com>

Closes #14153 from srowen/SPARK-16440.

(cherry picked from commit 51ade51a9fd64fc2fe651c505a286e6f29f59d40)
Signed-off-by: Sean Owen <so...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5a71a050
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5a71a050
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5a71a050

Branch: refs/heads/branch-2.0
Commit: 5a71a05015ac7aabfb6c4aa8753abc87ead20718
Parents: 74ad486
Author: Sean Owen <so...@cloudera.com>
Authored: Wed Jul 13 11:39:32 2016 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Wed Jul 13 11:39:39 2016 +0100

----------------------------------------------------------------------
 .../src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala  | 3 +++
 1 file changed, 3 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/5a71a050/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
index f2211df..6b9c8ee 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
@@ -434,6 +434,9 @@ class Word2Vec extends Serializable with Logging {
       bcSyn1Global.unpersist(false)
     }
     newSentences.unpersist()
+    expTable.unpersist()
+    bcVocab.unpersist()
+    bcVocabHash.unpersist()
 
     val wordArray = vocab.map(_.word)
     new Word2VecModel(wordArray.zipWithIndex.toMap, syn0Global)

