Repository: spark
Updated Branches:
  refs/heads/master 4a01bfc2a -> b79bf1df6


[SPARK-9337] [MLLIB] Add a unit test for Word2Vec to verify the empty vocabulary check

jira: https://issues.apache.org/jira/browse/SPARK-9337

Word2Vec should throw an exception when the vocabulary is empty.
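
To illustrate why the test input in this commit ends up with an empty vocabulary, here is a minimal sketch on plain Scala collections. It is not Spark's implementation: `EmptyVocabSketch` and `buildVocab` are hypothetical names, and the sketch assumes that a word is kept only when its count is at least `minCount` and that the emptiness check is a `require`-style precondition.

```scala
import scala.util.{Failure, Try}

// Sketch only: imitates a min-count vocabulary filter without Spark.
object EmptyVocabSketch {

  // Hypothetical helper: count each word and keep those seen at least minCount times.
  def buildVocab(sentences: Seq[Seq[String]], minCount: Int): Map[String, Int] =
    sentences.flatten
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }
      .filter { case (_, count) => count >= minCount }

  def main(args: Array[String]): Unit = {
    val sentence = "a b c"
    val doc = Seq(sentence, sentence).map(_.split(" ").toSeq)

    // Every word occurs exactly twice, so a threshold of 10 removes all of them.
    val vocab = buildVocab(doc, minCount = 10)
    println(s"vocabulary size: ${vocab.size}")

    // A failed `require` raises IllegalArgumentException, the type the test intercepts.
    Try(require(vocab.nonEmpty, "vocabulary is empty")) match {
      case Failure(e: IllegalArgumentException) => println(s"rejected: ${e.getMessage}")
      case other => println(s"unexpected result: $other")
    }
  }
}
```

With `minCount = 2` the same input would keep all three words, which is why the test raises the threshold to 10.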

Author: Yuhao Yang <hhb...@gmail.com>

Closes #7660 from hhbyyh/ut4Word2vec and squashes the following commits:

17a18cb [Yuhao Yang] add ut for word2vec


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b79bf1df
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b79bf1df
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b79bf1df

Branch: refs/heads/master
Commit: b79bf1df6238c087c3ec524344f1fc179719c5de
Parents: 4a01bfc
Author: Yuhao Yang <hhb...@gmail.com>
Authored: Sun Jul 26 14:02:20 2015 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Sun Jul 26 14:02:20 2015 +0100

----------------------------------------------------------------------
 .../org/apache/spark/mllib/feature/Word2VecSuite.scala    | 10 ++++++++++
 1 file changed, 10 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/b79bf1df/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala
----------------------------------------------------------------------
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala b/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala
index 4cc8d11..a864eec 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala
@@ -45,6 +45,16 @@ class Word2VecSuite extends SparkFunSuite with MLlibTestSparkContext {
     assert(newModel.getVectors.mapValues(_.toSeq) === word2VecMap.mapValues(_.toSeq))
   }
 
+  test("Word2Vec throws exception when vocabulary is empty") {
+    intercept[IllegalArgumentException] {
+      val sentence = "a b c"
+      val localDoc = Seq(sentence, sentence)
+      val doc = sc.parallelize(localDoc)
+        .map(line => line.split(" ").toSeq)
+      new Word2Vec().setMinCount(10).fit(doc)
+    }
+  }
+
   test("Word2VecModel") {
     val num = 2
     val word2VecMap = Map(

