[ 
https://issues.apache.org/jira/browse/MAHOUT-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741124#action_12741124
 ] 

Ted Dunning commented on MAHOUT-159:
------------------------------------

If you iterate over only the non-zero elements, then you should also include 
the vector's dimension in the hash code so that vectors of different sizes 
get different hashes.  In addition, the indexes of the non-zero elements 
should be included so that vectors with the same values in different 
positions get different hashes.  I think that we want to be fairly sure that

{noformat}hash([1,0,2,0,0,0]) != hash([1,0,2]){noformat}

and

{noformat}hash([1,0,2]) != hash([1,2,0]){noformat}
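
To make the two inequalities above concrete, here is a rough sketch of one way to do it. It is not the attached patch and does not use the real Mahout classes: VectorHashSketch, elementHash, hashDense and hashSparse are names made up for illustration, and the sparse vector is stood in for by a plain Map. The hash starts from the dimension and adds a scrambled (index, value) term for each non-zero element. Because the terms are summed, iteration order does not matter, so a dense and a sparse representation of the same vector agree.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical helper, not part of Mahout.
final class VectorHashSketch {

    // Combines one index with its value, then scrambles the bits so that
    // swapping values between positions is unlikely to cancel out in the sum.
    static int elementHash(int index, double value) {
        int h = 31 * index + Double.hashCode(value);
        h ^= h >>> 16;
        h *= 0x85ebca6b; // odd constant borrowed from common integer mixers
        h ^= h >>> 13;
        return h;
    }

    // Dense case: seed with the dimension, skip zeros.
    static int hashDense(double[] values) {
        int result = values.length;
        for (int i = 0; i < values.length; i++) {
            if (values[i] != 0.0) {
                result += elementHash(i, values[i]);
            }
        }
        return result;
    }

    // Sparse case: same seed and same per-element term, any iteration order.
    static int hashSparse(int size, Map<Integer, Double> nonZeros) {
        int result = size;
        for (Map.Entry<Integer, Double> e : nonZeros.entrySet()) {
            if (e.getValue() != 0.0) {
                result += elementHash(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] padded = {1, 0, 2, 0, 0, 0};
        double[] shortVec = {1, 0, 2};
        double[] moved = {1, 2, 0};
        Map<Integer, Double> sparse = new HashMap<>();
        sparse.put(0, 1.0);
        sparse.put(2, 2.0);

        System.out.println(hashDense(padded) != hashDense(shortVec));    // true
        System.out.println(hashDense(shortVec) != hashDense(moved));     // true
        System.out.println(hashDense(shortVec) == hashSparse(3, sparse)); // true
    }
}
```

Skipping explicit zeros in both methods is what keeps the hash consistent with an equals() that compares dense and sparse vectors element by element. Collisions remain possible in general, as with any 32-bit hash.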


> SparseVector and DenseVector hashCode does not conform to the Java standard
> ---------------------------------------------------------------------------
>
>                 Key: MAHOUT-159
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-159
>             Project: Mahout
>          Issue Type: Bug
>          Components: Matrix
>    Affects Versions: 0.2
>            Reporter: Mark Desnoyer
>            Assignee: Grant Ingersoll
>            Priority: Critical
>         Attachments: MAHOUT-159.patch, MAHOUT-159.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The hash codes for SparseVector and DenseVector will not be equal even though 
> equals() may return true. Also, the equals logic is inconsistent because 
> DenseVector takes into account the name parameter but SparseVector does not.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
