IMHO, dislike should not be modeled by a zero rating. It might also cause problems with the iterateNonZero() method in our vectors, which skips zero entries.
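To make the zero-vector case from the thread quoted below concrete, here is a minimal sketch of a cosine distance that handles it explicitly. It is plain Java on double[] and is not Mahout's actual implementation; the class name CosineSketch and the chosen convention (two all-zero vectors give 0, exactly one all-zero vector gives 1) are assumptions made for illustration only.

```java
// Hypothetical sketch, NOT Mahout's distance measure code.
final class CosineSketch {
    private CosineSketch() {}

    /**
     * Cosine distance, 1 - cos(a, b).
     * Assumed convention for the undefined cases:
     *   both vectors all zeros    -> 0.0 (a vector is at distance 0 from itself)
     *   exactly one all-zero side -> 1.0
     */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0;
        double normA = 0.0; // squared length of a
        double normB = 0.0; // squared length of b
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = Math.sqrt(normA) * Math.sqrt(normB);
        if (denominator == 0.0) {
            // At least one vector is all zeros, so the angle is undefined.
            return (normA == 0.0 && normB == 0.0) ? 0.0 : 1.0;
        }
        // Clamp to [-1, 1] so rounding error cannot yield a negative distance.
        double cosine = Math.max(-1.0, Math.min(1.0, dot / denominator));
        return 1.0 - cosine;
    }

    public static void main(String[] args) {
        double[] zero = {0.0, 0.0, 0.0};
        double[] x = {1.0, 2.0, 3.0};
        System.out.println("zero vs zero: " + distance(zero, zero));
        System.out.println("zero vs x:    " + distance(zero, x));
        System.out.println("x vs x:       " + distance(x, x));
    }
}
```

If the "undefined" reading is preferred instead, the zero-denominator branch could return Double.NaN, but callers such as the nearest-neighbor searchers would then need to handle NaN themselves.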

On 04.04.2013 22:40, Andrew Musselman wrote:
> I think it should return an "undefined" symbol. There is no angle between
> two zero vectors.
>
> In a practical sense, taking two zero vectors to be equivalent in the
> context of user-item vectors, say, is dodgy in my opinion. That is akin to
> saying "If we both hate everything on this restaurant's menu we are the
> same person."
>
>
> On Thu, Apr 4, 2013 at 11:56 AM, Dan Filimon <[email protected]> wrote:
>
>> Suneel is right. :)
>>
>> Let me explain how this came up:
>> - When clustering, and assigning a point to a cluster, the centroid needs
>> to be updated.
>> - To update the centroid in the nearest neighbor searcher classes, the
>> centroid must first be removed.
>> - To remove the centroid, we get the closest vector (search for it, and it
>> should be itself) and then remove it from the data structures.
>> => However, when the centroid is 0, the nearest vector (which should be
>> itself) has a huge distance (1 rather than 0) and this trips a check.
>>
>>
>> On Thu, Apr 4, 2013 at 9:46 PM, Sean Owen <[email protected]> wrote:
>>
>>> It sounds pretty undefined, but I would tend to define the distance as
>>> 0 in this case of course. And that means defining the cosine as 1.
>>> Which class in particular? There are a few implementations of this
>>> distance measure.
>>>
>>> On Thu, Apr 4, 2013 at 7:42 PM, Dan Filimon <[email protected]> wrote:
>>>> In the case where both vectors are all zeros, the angle between them is 0,
>>>> so the cosine is therefore 1 and the so the distance returned should be 0
>>>> (unless I misunderstood what the distance does).
>>>>
>>>> In Mahout, when calling distance() however, if both the denominator and
>>>> dotProduct are 0 (which is true when both vectors are 0), the returned
>>>> value is 1.
>>>>
>>>> This looks like a bug to me and I would open a JIRA issue and fix it but I
>>>> want to make sure there's nothing I could possibly be missing.
>>>>
>>>> Thoughts?
>>>
>>
>
