GitHub user LowikC commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19372#discussion_r141607418

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala ---
    @@ -368,9 +368,9 @@ class Word2Vec extends Serializable with Logging {
                 var wc = wordCount
                 if (wordCount - lastWordCount > 10000) {
                   lwc = wordCount
    -              alpha =
    -                learningRate *
    -                  (1 - numPartitions * wordCount.toDouble / (numIterations * trainWordsCount + 1))
    +              alpha = learningRate *
    +                (1 - numPartitions * wordCount.toDouble + (k - 1) * trainWordsCount /
    --- End diff --

    You need parentheses around `numPartitions * wordCount.toDouble + (k - 1) * trainWordsCount`. Without them, `/` binds tighter than `+`, so only `(k - 1) * trainWordsCount` is divided by the denominator:
    `alpha = learningRate * (1 - (numPartitions * wordCount.toDouble + (k - 1) * trainWordsCount) / (numIterations * trainWordsCount + 1))`
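
To illustrate the precedence issue, here is a minimal standalone sketch, not the actual Spark code. The object `AlphaDecay`, its two method names, the parameter types, and the sample values are assumptions made for illustration. The variable names follow the diff, and `k` is assumed to be the 1-based index of the current iteration.

```scala
// Hypothetical standalone sketch; names and types are assumptions, not Spark's API.
object AlphaDecay {

  // Expression as written in the diff: only (k - 1) * trainWordsCount is divided.
  // With integral operands that division is also an integer division.
  def withoutParens(learningRate: Double, numPartitions: Int, wordCount: Long,
      k: Int, trainWordsCount: Long, numIterations: Int): Double =
    learningRate *
      (1 - numPartitions * wordCount.toDouble + (k - 1) * trainWordsCount /
        (numIterations * trainWordsCount + 1))

  // Expression suggested in the review: the whole progress term is divided.
  def withParens(learningRate: Double, numPartitions: Int, wordCount: Long,
      k: Int, trainWordsCount: Long, numIterations: Int): Double =
    learningRate *
      (1 - (numPartitions * wordCount.toDouble + (k - 1) * trainWordsCount) /
        (numIterations * trainWordsCount + 1))

  def main(args: Array[String]): Unit = {
    // Made-up sample values: second of five iterations, 20000 words seen so far.
    val bad = withoutParens(0.025, 4, 20000L, 2, 1000000L, 5)
    val good = withParens(0.025, 4, 20000L, 2, 1000000L, 5)
    println(s"without parentheses: $bad")
    println(s"with parentheses:    $good")
  }
}
```

With these sample values the unparenthesized form goes negative, because `numPartitions * wordCount.toDouble` is subtracted without being scaled by the denominator. The parenthesized form stays between 0 and `learningRate`.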