Github user PhoenixDai commented on the pull request:
https://github.com/apache/spark/pull/11812#issuecomment-212969679
Yes, it's reproducible as mentioned in the third comment at
https://issues.apache.org/jira/browse/SPARK-13289
I thought this PR would solve the issue.
Github user PhoenixDai commented on the pull request:
https://github.com/apache/spark/pull/11812#issuecomment-212915342
My observation of the current word2vec implementation is that the
distances between synonyms get larger and larger with more iterations
and finally to
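A minimal sketch of one way to observe this drift, assuming a toy local setup (the corpus path, probe word, seed, and iteration counts below are illustrative, not part of the original report):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.feature.Word2Vec

// Assumed setup: a small tokenized corpus; the path is a placeholder.
val sc = new SparkContext("local[*]", "synonym-drift-check")
val corpus = sc.textFile("/path/to/corpus.txt").map(_.split(" ").toSeq).cache()

// Retrain with increasing iteration counts and watch the cosine similarity
// of the top synonyms; falling similarity means growing synonym distance.
for (iters <- Seq(1, 5, 10, 15)) {
  val model = new Word2Vec().setNumIterations(iters).setSeed(42L).fit(corpus)
  model.findSynonyms("day", 5).foreach { case (word, sim) =>
    println(s"iterations=$iters  $word  similarity=$sim")
  }
}
```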
Github user PhoenixDai commented on the pull request:
https://github.com/apache/spark/pull/11812#issuecomment-204120379
How about keeping the learning-rate-related code unchanged?
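For reference, a hedged sketch of the linear learning-rate decay scheme that word2vec-style trainers conventionally use (variable names are illustrative, not the exact code in Word2Vec.scala): the rate shrinks linearly with training progress and is floored so it never reaches zero.

```scala
// Illustrative only; assumes the conventional word2vec schedule rather
// than the exact implementation under discussion in this PR.
def decayedAlpha(initialAlpha: Double,
                 wordsProcessed: Long,
                 totalWords: Long,
                 numIterations: Int): Double = {
  // Fraction of the total training work completed so far.
  val progress = wordsProcessed.toDouble / (numIterations.toLong * totalWords + 1)
  // Linear decay, floored at 0.01% of the initial rate.
  math.max(initialAlpha * (1.0 - progress), initialAlpha * 0.0001)
}
```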
Github user PhoenixDai commented on the pull request:
https://github.com/apache/spark/pull/11812#issuecomment-202892564
Is this caused by changes made to word2vec.scala after this PR was
opened? Maybe a later change introduced a conflict with this PR. (This is just my
naive guess. I
Github user PhoenixDai commented on the pull request:
https://github.com/apache/spark/pull/11812#issuecomment-202224483
I tested this commit on the "One Billion Words Language Modeling" dataset
with 72 partitions and 15 iterations. It works well.
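For anyone reproducing that run, a sketch of the MLlib calls it implies (the corpus path and tokenization are assumptions; only the partition and iteration counts come from the comment above):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.feature.Word2Vec

val sc = new SparkContext("local[*]", "word2vec-billion-words")
// Placeholder path; the benchmark corpus ships as plain-text shards.
val corpus = sc.textFile("/path/to/one-billion-words/*").map(_.split(" ").toSeq)

val model = new Word2Vec()
  .setNumPartitions(72)   // as in the test described above
  .setNumIterations(15)   // as in the test described above
  .fit(corpus)
```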