word2vec cosineSimilarity

Arthur Chan Thu, 15 Oct 2015 09:58:43 -0700

Hi,

I am trying the Word2Vec example from
http://spark.apache.org/docs/latest/mllib-feature-extraction.html#example


Here are my test results:

scala> for((synonym, cosineSimilarity) <- synonyms) {
     |   println(s"$synonym $cosineSimilarity")
     | }
taiwan 2.0518918365726297
japan 1.8960962308732054
korea 1.8789320149319788
thailand 1.7549218525671182
mongolia 1.7375501108635814


The cosineSimilarity values I got are all greater than 1. Shouldn't
cosineSimilarity be between 0 and 1?

How can I get similarity values in the range 0 to 1?
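
For comparison, the textbook cosine similarity divides the dot product by
the norms of both vectors, which keeps the value between -1 and 1. Below is
a minimal sketch in plain Scala of that definition. The `cosine` helper and
the sample vectors are only illustrations (assumptions, not part of the
Spark MLlib API), and this is not necessarily how the example computes its
scores.

```scala
// Hypothetical helper for illustration only; not part of the Spark MLlib API.
def cosine(a: Array[Double], b: Array[Double]): Double = {
  require(a.length == b.length, "vectors must have the same length")
  // Dot product of the two vectors.
  val dot = a.zip(b).map { case (x, y) => x * y }.sum
  // Euclidean (L2) norm of each vector.
  val normA = math.sqrt(a.map(x => x * x).sum)
  val normB = math.sqrt(b.map(x => x * x).sum)
  // Dividing by both norms bounds the result to [-1, 1]
  // (the value is undefined if either vector is all zeros).
  dot / (normA * normB)
}

// Made-up sample vectors, not real word vectors.
val u = Array(3.0, 4.0)
val v = Array(4.0, 3.0)
println(cosine(u, v)) // 24 / (5 * 5) = 0.96
```

A score above 1 cannot come from this formula, so the numbers printed above
are presumably not divided by both norms.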

Regards
