Xiangrui Meng created SPARK-5419:
------------------------------------

             Summary: Fix the logic in Vectors.sqdist
                 Key: SPARK-5419
                 URL: https://issues.apache.org/jira/browse/SPARK-5419
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
            Reporter: Xiangrui Meng
            Assignee: Liang-Chi Hsieh


The current implementation of sqdist converts sparse vectors to dense 
when they are close to dense. This is inefficient because the conversion 
allocates temporary arrays. We should simply implement sqdist without 
allocating new memory.
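
One way to compute the squared distance without temporary arrays is a single merge-style pass over the two sorted index arrays. The sketch below is illustrative only and is not the patch attached to this issue. It assumes each sparse vector is given as a sorted index array plus a parallel value array, and the names SqdistSketch, idx1, val1, idx2 and val2 are hypothetical.

```scala
object SqdistSketch {
  /** Squared Euclidean distance between two sparse vectors, each given as
    * (sorted indices, values). Walks both index arrays in one pass and
    * allocates no temporary arrays. Hypothetical helper, not MLlib API. */
  def sqdist(idx1: Array[Int], val1: Array[Double],
             idx2: Array[Int], val2: Array[Double]): Double = {
    var dist = 0.0
    var i = 0
    var j = 0
    while (i < idx1.length || j < idx2.length) {
      var diff = 0.0
      if (j >= idx2.length || (i < idx1.length && idx1(i) < idx2(j))) {
        // Entry present only in the first vector; the other side is 0.
        diff = val1(i)
        i += 1
      } else if (i >= idx1.length || idx2(j) < idx1(i)) {
        // Entry present only in the second vector.
        diff = val2(j)
        j += 1
      } else {
        // Same index in both vectors.
        diff = val1(i) - val2(j)
        i += 1
        j += 1
      }
      dist += diff * diff
    }
    dist
  }
}
```

The loop touches each stored entry once, so the cost is proportional to the number of non-zeros in the two vectors rather than to their full size.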

The current implementation also contains a bug in deciding whether to convert a 
sparse vector to dense:

{code}
v1.indices.length / v1.size < 0.5
{code}

which should be removed along with the changes described above.
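
The issue does not spell out what is wrong with the check. One likely reading, assuming indices.length and size are both Int (as they are in MLlib's SparseVector), is that the division truncates to an integer before the comparison. The sketch below contrasts the two forms; DensityCheck and its method names are hypothetical.

```scala
object DensityCheck {
  // Literal form of the check quoted above: Int / Int truncates, so the
  // result is 0 whenever nnz < size, and 0 < 0.5 is always true.
  def isSparseTruncating(nnz: Int, size: Int): Boolean =
    nnz / size < 0.5

  // Floating-point ratio, shown only for comparison.
  def isSparseRatio(nnz: Int, size: Int): Boolean =
    nnz.toDouble / size < 0.5
}
```

With 9 non-zeros out of 10 entries, the truncating form still reports the vector as sparse, while the floating-point form does not.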



