In recommender systems, it's dangerous to interpret "no interaction" as
dislike. Think of all the movies you have never watched: do you really
dislike them all? :)


On 04.04.2013 23:03, Andrew Musselman wrote:
> I agree; I mis-spoke before if I said "dislike".  Zero to me means
> literally nothing.  No interaction.  Which could be either "don't like",
> "don't like today", "dislike", etc.  Which adds to the meaninglessness of
> it.
> 
> 
> On Thu, Apr 4, 2013 at 2:00 PM, Sebastian Schelter
> <ssc.o...@googlemail.com> wrote:
> 
>> I think that in our recommender code, 0 should mean no rating or no
>> interaction observed. I think modeling dislike with 0 creates a lot of
>> unnecessary problems.
>>
>> On 04.04.2013 22:56, Andrew Musselman wrote:
>>> I see the arguments for having it defined, just raising the point that
>>> it's a very strange spot to be in.
>>>
>>> If all users are zero except for one person who likes the lentil soup,
>>> then the other users are equally different from that person.
>>>
>>> The problem for me is the discontinuity Sean mentions, where at zero you
>>> go off a cliff and have no sense of distance.
>>>
>>> But for convenience and "behaving nicely" I'm fine with distance between
>>> zero vectors being zero.
>>>
>>>
>>> On Thu, Apr 4, 2013 at 1:50 PM, Dan Filimon <dangeorge.fili...@gmail.com>
>>> wrote:
>>>
>>>> While I agree that it's fairly meaningless mathematically, this ensures
>>>> that the distance between two identical vectors is always 0.
>>>> Think of yourself using this class through the DistanceMeasure interface.
>>>> The implicit expectation [1] here is that d(x, y) = 0 iff x = y.
>>>>
>>>> [1] http://en.wikipedia.org/wiki/Metric_(mathematics)
>>>>
>>>>
>>>> On Thu, Apr 4, 2013 at 11:40 PM, Andrew Musselman <
>>>> andrew.mussel...@gmail.com> wrote:
>>>>
>>>>> I think it should return an "undefined" symbol.  There is no angle
>>>>> between two zero vectors.
>>>>>
>>>>> In a practical sense, taking two zero vectors to be equivalent in the
>>>>> context of user-item vectors, say, is dodgy in my opinion.  That is
>>>>> akin to saying "If we both hate everything on this restaurant's menu
>>>>> we are the same person."
>>>>>
>>>>>
>>>>> On Thu, Apr 4, 2013 at 11:56 AM, Dan Filimon <
>>>>> dangeorge.fili...@gmail.com> wrote:
>>>>>
>>>>>> Suneel is right. :)
>>>>>>
>>>>>> Let me explain how this came up:
>>>>>> - When clustering, and assigning a point to a cluster, the centroid
>>>>>>   needs to be updated.
>>>>>> - To update the centroid in the nearest neighbor searcher classes, the
>>>>>>   centroid must first be removed.
>>>>>> - To remove the centroid, we get the closest vector (search for it, and
>>>>>>   it should be itself) and then remove it from the data structures.
>>>>>> => However, when the centroid is 0, the nearest vector (which should be
>>>>>>    itself) has a huge distance (1 rather than 0) and this trips a check.
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 4, 2013 at 9:46 PM, Sean Owen <sro...@gmail.com> wrote:
>>>>>>
>>>>>>> It sounds pretty undefined, but I would tend to define the distance as
>>>>>>> 0 in this case of course. And that means defining the cosine as 1.
>>>>>>> Which class in particular? There are a few implementations of this
>>>>>>> distance measure.
>>>>>>>
>>>>>>> On Thu, Apr 4, 2013 at 7:42 PM, Dan Filimon <
>>>>>>> dangeorge.fili...@gmail.com> wrote:
>>>>>>>> In the case where both vectors are all zeros, the angle between them
>>>>>>>> is 0, so the cosine is therefore 1 and so the distance returned
>>>>>>>> should be 0 (unless I misunderstood what the distance does).
>>>>>>>>
>>>>>>>> In Mahout, when calling distance() however, if both the denominator
>>>>>>>> and dotProduct are 0 (which is true when both vectors are 0), the
>>>>>>>> returned value is 1.
>>>>>>>>
>>>>>>>> This looks like a bug to me and I would open a JIRA issue and fix it,
>>>>>>>> but I want to make sure there's nothing I could possibly be missing.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
> 
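
For reference, the convention proposed in the thread (distance 0 between two
all-zero vectors, so that d(x, x) = 0 holds) could be sketched roughly as
below. This is only an illustration and not Mahout's actual code: the class
name CosineDistanceSketch, the plain double[] arguments, and the choice to
return 1 when exactly one vector is all zeros are assumptions made for this
example.

```java
/** Hypothetical sketch; not the Mahout implementation. */
public final class CosineDistanceSketch {

    private CosineDistanceSketch() {}

    /** Returns 1 - cos(a, b), with an explicit convention for zero vectors. */
    public static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dotProduct = 0.0;
        double squaredNormA = 0.0;
        double squaredNormB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dotProduct += a[i] * b[i];
            squaredNormA += a[i] * a[i];
            squaredNormB += b[i] * b[i];
        }
        double denominator = Math.sqrt(squaredNormA) * Math.sqrt(squaredNormB);
        if (denominator == 0.0) {
            // The angle is undefined when either vector has zero length.
            // Assumed convention: two zero vectors are identical, so the
            // distance is 0; a zero vector against a nonzero one gets 1.
            return (squaredNormA == 0.0 && squaredNormB == 0.0) ? 0.0 : 1.0;
        }
        // Clamp to [-1, 1] to absorb floating-point rounding.
        double cosine = Math.max(-1.0, Math.min(1.0, dotProduct / denominator));
        return 1.0 - cosine;
    }
}
```

With this convention, looking up an all-zero centroid finds itself at
distance 0, which is what the nearest-neighbor removal step described above
expects. It does not settle the modeling question of whether two users with
no interactions should count as similar.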
