[ 
https://issues.apache.org/jira/browse/MAHOUT-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782049#action_12782049
 ] 

Jake Mannix commented on MAHOUT-207:
------------------------------------

It looks like the work done on MAHOUT-159 did not use addition as the combiner 
for the per-Element terms in hashCode(), so the result depended on iteration 
order.  The unit tests also didn't check what happens when a sparse vector has 
explicit zero values set on it, which should not affect either hashCode() or 
equals() (the latter was fine, the former was not!).
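
To illustrate why the combiner matters, here is a minimal sketch (not the 
MAHOUT-159 code itself; the class and method names are hypothetical and plain 
ints stand in for the per-element terms) contrasting a chained 
multiply-and-add combiner with a purely additive one:

```java
// Hypothetical illustration; not taken from the Mahout source.
final class CombinerSketch {

    // Chained combiner: each step multiplies the running result,
    // so the position of a term in the iteration changes its weight.
    static int chained(int[] terms) {
        int result = 1;
        for (int t : terms) {
            result = 31 * result + t;
        }
        return result;
    }

    // Additive combiner: int addition is commutative and associative
    // (even with overflow), so iteration order cannot matter.
    static int additive(int[] terms) {
        int result = 1;
        for (int t : terms) {
            result += t;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = {3, 5};
        int[] b = {5, 3};
        System.out.println(chained(a) + " vs " + chained(b));   // 1059 vs 1119
        System.out.println(additive(a) + " vs " + additive(b)); // 9 vs 9
    }
}
```

The same two terms give different chained results depending on order, while 
the additive result is identical either way.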

> AbstractVector.hashCode() should not care about the order of iteration over 
> elements
> ------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-207
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-207
>             Project: Mahout
>          Issue Type: Bug
>          Components: Matrix
>    Affects Versions: 0.2
>         Environment: all
>            Reporter: Jake Mannix
>            Assignee: Grant Ingersoll
>             Fix For: 0.3
>
>         Attachments: MAHOUT-207.patch
>
>
> As was discussed in MAHOUT-165, hashCode can be implemented simply like this:
> {code} 
> public int hashCode() {
>   final int prime = 31;
>   int result = prime + ((name == null) ? 0 : name.hashCode());
>   result = prime * result + size();
>   Iterator<Element> iter = iterateNonZero();
>   while (iter.hasNext()) {
>     Element ele = iter.next();
>     long v = Double.doubleToLongBits(ele.get());
>     result += (ele.index() * (int)(v^(v>>32)));
>   }
>   return result;
> }
> {code}
> which obviates the need to sort the elements in the case of a random access 
> hash-based implementation.  Also, (ele.index() * (int)(v^(v>>32)) ) == 0 when 
> v = Double.doubleToLongBits(0d), which avoids the wrong hashCode() for sparse 
> vectors which have zero elements returned from the iterateNonZero() iterator.
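
As a rough, self-contained check of the two properties claimed above, the 
sketch below mirrors the proposed computation outside of Mahout.  It is an 
illustration only: the class name is hypothetical, a Map<Integer, Double> 
stands in for the iterateNonZero() iterator, and the vector name is assumed 
to be null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for AbstractVector; not the Mahout class.
final class SparseHashSketch {

    // Same arithmetic as the proposed hashCode(), with name == null.
    static int hash(int size, Map<Integer, Double> entries) {
        final int prime = 31;
        int result = prime;              // prime + 0 for a null name
        result = prime * result + size;
        for (Map.Entry<Integer, Double> e : entries.entrySet()) {
            long v = Double.doubleToLongBits(e.getValue());
            // Additive term; it is 0 whenever the value is 0.0.
            result += e.getKey() * (int) (v ^ (v >> 32));
        }
        return result;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the two maps
        // are iterated in different orders.
        Map<Integer, Double> a = new LinkedHashMap<>();
        a.put(1, 2.5);
        a.put(7, -1.0);
        a.put(3, 4.0);

        Map<Integer, Double> b = new LinkedHashMap<>();
        b.put(3, 4.0);
        b.put(5, 0.0);                   // explicitly stored zero
        b.put(1, 2.5);
        b.put(7, -1.0);

        System.out.println(hash(10, a) == hash(10, b)); // true
    }
}
```

Both maps describe the same vector, so the hashes agree even though the 
iteration order differs and one map carries an explicit 0.0 entry.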

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
