[ https://issues.apache.org/jira/browse/MAHOUT-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781618#action_12781618 ]
Jake Mannix commented on MAHOUT-207:
------------------------------------

Hmm, in fact, SparseVector and AbstractVector are not currently consistent vis-a-vis hashCode() / equals() on trunk:

{code}
  double[] gold = { 0, 1.1, 2.2, 3.3, 0 };

  public void testEqualsHashcodeCompatibility() {
    SparseVector sparse = new SparseVector(gold.length);
    SparseVector pretendSparse = new SparseVector(gold.length);
    for (int i = 0; i < gold.length; i++) {
      // sparse only stores the non-zero entries
      if (gold[i] != 0d) {
        sparse.set(i, gold[i]);
      }
      // pretendSparse has every entry set explicitly, zeroes included
      pretendSparse.set(i, gold[i]);
    }
    assertEquals("vectors with zeroes", sparse, pretendSparse);
    assertEquals("hashcodes with zeroes", sparse.hashCode(), pretendSparse.hashCode());
  }
{code}

This test fails. Submitting a patch to fix it shortly.

> AbstractVector.hashCode() should not care about the order of iteration over elements
> ------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-207
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-207
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Matrix
>    Affects Versions: 0.2
>         Environment: all
>            Reporter: Jake Mannix
>            Priority: Minor
>             Fix For: 0.3
>
>
> As was discussed in MAHOUT-165, hashCode can be implemented simply like this:
> {code}
>   public int hashCode() {
>     final int prime = 31;
>     int result = prime + ((name == null) ? 0 : name.hashCode());
>     result = prime * result + size();
>     Iterator<Element> iter = iterateNonZero();
>     while (iter.hasNext()) {
>       Element ele = iter.next();
>       long v = Double.doubleToLongBits(ele.get());
>       result += (ele.index() * (int)(v^(v>>32)));
>     }
>     return result;
>   }
> {code}
> which obviates the need to sort the elements in the case of a random access
> hash-based implementation. Also, (ele.index() * (int)(v^(v>>32)) ) == 0 when
> v = Double.doubleToLongBits(0d), which avoids a wrong hashCode() for sparse
> vectors whose iterateNonZero() iterator returns elements that are zero.
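
For anyone who wants to check the two properties claimed for the hashCode() quoted above without a Mahout checkout, here is a minimal standalone sketch. It is not the Mahout code: the class name, the map standing in for a sparse vector, and the omission of the name field are all assumptions made for illustration. It only shows that summing the per-element terms is independent of iteration order, and that explicitly stored zeroes contribute nothing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a sparse vector: index -> value, iterated in insertion order.
final class OrderIndependentHashSketch {

  // Same per-element term as in the issue description: index * (folded bits of the value).
  static int elementTerm(int index, double value) {
    long v = Double.doubleToLongBits(value);
    return index * (int) (v ^ (v >> 32));
  }

  // Seed from size only (the name is assumed null here), then add one term per stored entry.
  // int addition wraps on overflow but stays commutative, so the entry order cannot matter.
  static int hash(int size, Map<Integer, Double> entries) {
    final int prime = 31;
    int result = prime;              // prime + 0, as for a null name
    result = prime * result + size;
    for (Map.Entry<Integer, Double> e : entries.entrySet()) {
      result += elementTerm(e.getKey(), e.getValue());
    }
    return result;
  }

  public static void main(String[] args) {
    // Only the non-zero entries, in index order.
    Map<Integer, Double> sparse = new LinkedHashMap<>();
    sparse.put(1, 1.1);
    sparse.put(2, 2.2);
    sparse.put(3, 3.3);

    // Same vector, but shuffled and with the zeroes stored explicitly.
    Map<Integer, Double> pretendSparse = new LinkedHashMap<>();
    pretendSparse.put(3, 3.3);
    pretendSparse.put(0, 0.0);
    pretendSparse.put(1, 1.1);
    pretendSparse.put(4, 0.0);
    pretendSparse.put(2, 2.2);

    System.out.println("hashes match: " + (hash(5, sparse) == hash(5, pretendSparse)));
  }
}
```

Note that this only exercises the hashCode() side. The equals() side of the failing test above is a separate question.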