[ https://issues.apache.org/jira/browse/MAHOUT-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761955#action_12761955 ]
Jake Mannix commented on MAHOUT-181:
------------------------------------

*bump*

Has anyone looked at this? The patch fixes the bug in all of the affected measures except TanimotoDistanceMeasure. I left that one alone because I assumed whoever contributed it knows better what it is really meant to do, but if nobody else wants to take it, I can update the patch to fix it as well, given the correct definition in Wikipedia.

> DistanceMeasure is broken: iteration is done over nonZeroElements of
> v1.plus(v2), not v1.minus(v2)
> --------------------------------------------------------------------
>
>                 Key: MAHOUT-181
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-181
>             Project: Mahout
>          Issue Type: Bug
>          Components: Matrix
>    Affects Versions: 0.2
>         Environment: all
>            Reporter: Jake Mannix
>             Fix For: 0.2
>
>         Attachments: MAHOUT-181.patch
>
>
> SquaredEuclideanDistanceMeasure iterates over the nonzero elements of
> v1.plus(v2). That sum has the right set of nonzero elements only if
> v1.get(i) != -v2.get(i) for all i indexing nonzero elements. For example,
> in the simple case of
> SquaredEuclideanDistanceMeasure.distance(v, v.assign(new NegateFunction()))
> the result on current trunk is zero, instead of 4*v.lengthSquared().
> Attached is a patch with a unit test which checks that
> DistanceMeasure.distance always returns nonnegative results and in particular
> also does not return , as well as a fix for ManhattanDistanceMeasure,
> SquaredEuclideanDistanceMeasure, and EuclideanDistanceMeasure.
> Unfortunately, the attached unit test reveals that
> TanimotoDistanceMeasure is more broken than I can fix at present. It doesn't
> appear to use the referenced formula from Wikipedia properly, and in fact
> it sometimes returns negative results. This means that with this patch
> applied, TestTanimotoDistanceMeasure fails (and rightfully so).
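For anyone who wants to see the failure mode without checking out trunk, here is a minimal, self-contained sketch of the problem described in the issue. It does not use the Mahout Vector API: the sparse vector is modeled as a plain Map<Integer, Double>, and the class and method names (SparseDistanceSketch, buggySquaredDistance, fixedSquaredDistance) are made up for illustration. It is only meant to show why iterating over the nonzero elements of the sum loses terms that the difference keeps, not to reproduce the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a sparse vector is modeled as index -> nonzero value.
// None of these names come from Mahout; they are assumptions for this sketch.
public class SparseDistanceSketch {

    // Returns a + sign * b, dropping entries that cancel to exactly zero,
    // which is how a sparse sum or difference loses "nonzero elements".
    public static Map<Integer, Double> combine(Map<Integer, Double> a,
                                               Map<Integer, Double> b,
                                               double sign) {
        Map<Integer, Double> out = new HashMap<>(a);
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            out.merge(e.getKey(), sign * e.getValue(), Double::sum);
        }
        out.values().removeIf(x -> x == 0.0);
        return out;
    }

    public static Map<Integer, Double> negate(Map<Integer, Double> v) {
        return combine(new HashMap<>(), v, -1.0);
    }

    public static double lengthSquared(Map<Integer, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) {
            sum += x * x;
        }
        return sum;
    }

    // Mirrors the reported bug: indices are taken from the nonzero
    // elements of a + b, so any index where a_i == -b_i is skipped.
    public static double buggySquaredDistance(Map<Integer, Double> a,
                                              Map<Integer, Double> b) {
        double sum = 0.0;
        for (int i : combine(a, b, 1.0).keySet()) {
            double d = a.getOrDefault(i, 0.0) - b.getOrDefault(i, 0.0);
            sum += d * d;
        }
        return sum;
    }

    // Iterates over the nonzero elements of a - b instead.
    public static double fixedSquaredDistance(Map<Integer, Double> a,
                                              Map<Integer, Double> b) {
        return lengthSquared(combine(a, b, -1.0));
    }

    public static void main(String[] args) {
        Map<Integer, Double> v = new HashMap<>();
        v.put(0, 1.0);
        v.put(3, 2.0);
        Map<Integer, Double> neg = negate(v);

        System.out.println("buggy = " + buggySquaredDistance(v, neg));
        System.out.println("fixed = " + fixedSquaredDistance(v, neg));
        System.out.println("4*|v|^2 = " + 4 * lengthSquared(v));
    }
}
```

With v = {0: 1.0, 3: 2.0}, the sum v + (-v) has no nonzero elements, so the buggy variant returns 0.0, while the variant that iterates over the difference returns 20.0, which matches 4*lengthSquared(v).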