[ https://issues.apache.org/jira/browse/MAHOUT-268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805562#action_12805562 ]
Jake Mannix commented on MAHOUT-268:
------------------------------------

Oy, this is wrong in all three places it is implemented (in different ways, :\ ) - even in the "non-optimized" impl in AbstractVector:

{code}
  @Override
  public double getDistanceSquared(Vector v) {
    double d = 0;
    Iterator<Element> it = iterateNonZero();
    Element e;
    while(it.hasNext() && (e = it.next()) != null) {
      double diff = e.get() - v.getQuick(e.index());
      d += (diff * diff);
    }
    return d;
  }
{code}

Iterating over the nonzero entries of this vector doesn't make sure to iterate over the nonzero entries of the other one as well!

> Vector.getDistanceSquared() is incorrect for both SparseVector varieties
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-268
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-268
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.2
>         Environment: all
>            Reporter: NOT_A_USER
>            Assignee: Jake Mannix
>             Fix For: 0.3
>
>
> I'm pretty sure that getDistanceSquared() should just return as if an optimized implementation of:
> {code}
>   public double getDistanceSquared(Vector v) { return this.minus(v).getLengthSquared(); }
> {code}
> In which case if some vector elements are negative, both SequentialAccessSparseVector (my fault!) and RandomAccessSparseVector return the wrong thing. Very easy to write a failing unit test for this one.
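
To make the failure mode in the comment concrete, here is a minimal, self-contained sketch. It does not use Mahout's Vector classes: a plain Map<Integer, Double> stands in for a sparse vector (absent keys mean zero), and the class and method names (SparseDistanceSketch, distanceSquaredOneSided, distanceSquared) are hypothetical, chosen only for illustration. The first method mirrors the one-sided iteration pattern quoted from AbstractVector; the second shows one possible correction, which adds the entries that are nonzero only in the other vector. It is not necessarily the fix that was committed for this issue.

```java
import java.util.HashMap;
import java.util.Map;

public class SparseDistanceSketch {

    // Mirrors the pattern in the quoted code: only the nonzero entries of a
    // are visited, so indices that are nonzero only in b are never counted.
    public static double distanceSquaredOneSided(Map<Integer, Double> a,
                                                 Map<Integer, Double> b) {
        double d = 0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            double diff = e.getValue() - b.getOrDefault(e.getKey(), 0.0);
            d += diff * diff;
        }
        return d;
    }

    // One possible correction (an assumption, not the committed patch):
    // after the pass over a, add the squares of entries present only in b.
    public static double distanceSquared(Map<Integer, Double> a,
                                         Map<Integer, Double> b) {
        double d = distanceSquaredOneSided(a, b);
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                d += e.getValue() * e.getValue();
            }
        }
        return d;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<>();
        a.put(0, 1.0);

        Map<Integer, Double> b = new HashMap<>();
        b.put(0, 1.0);
        b.put(3, -2.0); // nonzero only in b

        System.out.println(distanceSquaredOneSided(a, b)); // prints 0.0
        System.out.println(distanceSquared(a, b));         // prints 4.0
    }
}
```

In this sketch the one-sided version reports a distance of zero between two different vectors, which is the kind of case a failing unit test for this issue could assert against this.minus(v).getLengthSquared().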