Hi Sebastian,
But in order not to select items that are not similar to at least one
of the items the user interacted with, you have to compute the
similarity with all of the user's items (which is the main task in
estimating the preference for an item in the item-based method). So it
seems to me that AllSimilarItemsCandidateItemsStrategy does not bring
much advantage over AllUnknownItemsCandidateItemsStrategy.
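
To make the two candidate sets concrete, here is a minimal sketch in plain
Java. It does not call the Mahout API: the class CandidateSketch and its
methods are made up for illustration. It reuses the ratings and similarities
from Juan's example quoted below and assumes that a pair missing from the
similarity table counts as NaN. Item 6 is a hypothetical extra catalogue item
with no similarity entries at all.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class CandidateSketch {

    // Symmetric lookup; a pair that is not in the table is treated as NaN (assumption).
    static double sim(Map<String, Double> table, long a, long b) {
        Double s = table.get(Math.min(a, b) + "," + Math.max(a, b));
        return s == null ? Double.NaN : s;
    }

    // Every catalogue item the user has not rated.
    static SortedSet<Long> allUnknown(Set<Long> catalogue, Set<Long> rated) {
        SortedSet<Long> out = new TreeSet<>(catalogue);
        out.removeAll(rated);
        return out;
    }

    // Unrated items with a non-NaN similarity to at least one rated item.
    static SortedSet<Long> allSimilar(Set<Long> catalogue, Set<Long> rated,
                                      Map<String, Double> table) {
        SortedSet<Long> out = new TreeSet<>();
        for (long candidate : allUnknown(catalogue, rated)) {
            for (long r : rated) {
                if (!Double.isNaN(sim(table, candidate, r))) {
                    out.add(candidate);
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Similarities from the example, keyed as "smallerId,largerId".
        Map<String, Double> table = new HashMap<>();
        table.put("1,2", 0.1);
        table.put("1,3", 0.2);
        table.put("1,4", 0.3);
        table.put("2,3", 0.5);
        table.put("3,4", 0.5);
        table.put("1,5", 0.2);
        table.put("2,5", 1.0);

        // Items 1 to 5 from the example plus the hypothetical item 6.
        Set<Long> catalogue = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L));
        // Items rated by user 1 in the example data model.
        Set<Long> ratedByUser1 = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));

        System.out.println(allUnknown(catalogue, ratedByUser1));        // [5, 6]
        System.out.println(allSimilar(catalogue, ratedByUser1, table)); // [5]
    }
}
```

With this data the two sets differ only by item 6, which has no defined
similarity to anything user 1 rated. In this sketch, building the smaller
set already needs a similarity lookup per (candidate, rated item) pair,
which is the cost I am pointing at above.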

On Wed, Mar 5, 2014 at 6:46 PM, Sebastian Schelter <s...@apache.org> wrote:
>> So both strategies seem to be effectively the same, I don't know what
>> the implementers had in mind when designing
>> AllSimilarItemsCandidateItemsStrategy.
>
> It can take a long time to estimate preferences for all items a user doesn't
> know. Especially if you have a lot of items. Traditional item-based
> recommenders will not recommend any item that is not similar to at least one
> of the items the user interacted with, so AllSimilarItemsStrategy already
> selects the maximum set of items that could be potentially recommended to
> the user.
>
> --sebastian
>
>
>
>
> On 03/05/2014 05:38 PM, Tevfik Aytekin wrote:
>>
>> If the similarities between item 5 and two of the items user 1 preferred
>> are not NaN, then it will return item 5; that is what I'm saying. If the
>> similarities were all NaN, then it would not return it.
>>
>> But surely, you might wonder if all similarities between an item and
>> user's items are NaN, then
>> AllUnknownItemsCandidateItemsStrategy probably will not return it.
>>
>
>> On Wed, Mar 5, 2014 at 6:06 PM, Juan José Ramos <jjar...@gmail.com> wrote:
>>>
>>> @Tevfik, running this recommender:
>>>
>>> GenericItemBasedRecommender itemRecommender = new
>>> GenericItemBasedRecommender(dataModel, itemSimilarity, new
>>> AllSimilarItemsCandidateItemsStrategy(itemSimilarity), new
>>> AllSimilarItemsCandidateItemsStrategy(itemSimilarity));
>>>
>>>
>>> With this dataModel:
>>> 1,1,1.0
>>> 1,2,2.0
>>> 1,3,1.0
>>> 1,4,2.0
>>> 2,1,1.0
>>> 2,2,4.0
>>>
>>>
>>> And these similarities
>>> 1,2,0.1
>>> 1,3,0.2
>>> 1,4,0.3
>>> 2,3,0.5
>>> 3,4,0.5
>>> 5,1,0.2
>>> 5,2,1.0
>>>
>>> Returns item 5 for user 1. So item 5 has not been preferred by user 1,
>>> and the similarities between item 5 and two of the items user 1
>>> preferred are not NaN, but AllSimilarItemsCandidateItemsStrategy is
>>> returning that item. So, I'm truly sorry to insist on this, but I still
>>> really do not get the difference.
>>>
>>>
>>> On Wed, Mar 5, 2014 at 2:53 PM, Tevfik Aytekin
>>> <tevfik.ayte...@gmail.com> wrote:
>>>
>>>> Juan,
>>>> You got me wrong,
>>>>
>>>> AllSimilarItemsCandidateItemsStrategy
>>>>
>>>> returns all items that have not been rated by the user and the
>>>> similarity metric returns a non-NaN similarity value with at
>>>> least one of the items preferred by the user.
>>>>
>>>> So, it does not simply return all items that have not been rated by
>>>> the user. For example, if there is an item X which has not been rated
>>>> by the user and the similarity value between X and every one of the
>>>> items rated (preferred) by the user is NaN, then X will not be
>>>> returned by AllSimilarItemsCandidateItemsStrategy, but it will be
>>>> returned by AllUnknownItemsCandidateItemsStrategy.
>>>>
>>>>
>>>>
>>>> On Wed, Mar 5, 2014 at 4:42 PM, Juan José Ramos <jjar...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi Tevfik,
>>>>>
>>>>> Thanks for the response. I think what you say contradicts what
>>>>> Sebastian pointed out before. Also, if
>>>>> AllSimilarItemsCandidateItemsStrategy returns all items that have not
>>>>> been rated by the user, what would
>>>>> AllUnknownItemsCandidateItemsStrategy return?
>>>>>
>>>>>
>>>>> On Wed, Mar 5, 2014 at 1:40 PM, Tevfik Aytekin
>>>>> <tevfik.ayte...@gmail.com> wrote:
>>>>>
>>>>>> Sorry there was a typo in the previous paragraph.
>>>>>>
>>>>>> If I remember correctly, AllSimilarItemsCandidateItemsStrategy
>>>>>>
>>>>>> returns all items that have not been rated by the user and the
>>>>>> similarity metric returns a non-NaN similarity value with at
>>>>>> least one of the items preferred by the user.
>>>>>>
>>>>>> On Wed, Mar 5, 2014 at 3:38 PM, Tevfik Aytekin
>>>>>> <tevfik.ayte...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Juan,
>>>>>>>
>>>>>>> If I remember correctly, AllSimilarItemsCandidateItemsStrategy
>>>>>>>
>>>>>>> returns all items that have not been rated by the user and the
>>>>>>> similarity metric returns a non-NaN similarity value that is with at
>>>>>>> least one of the items preferred by the user.
>>>>>>>
>>>>>>> Tevfik
>>>>>>>
>>>>>>> On Wed, Mar 5, 2014 at 2:30 PM, Sebastian Schelter <s...@apache.org>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On 03/05/2014 01:23 PM, Juan José Ramos wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks for the reply, Sebastian.
>>>>>>>>>
>>>>>>>>> I am not sure if that should be implemented in the abstract base
>>>>>>>>> class though, because for instance
>>>>>>>>> PreferredItemsNeighborhoodCandidateItemsStrategy, by definition,
>>>>>>>>> returns the items not rated by the user and rated by somebody else.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Good point. So we seem to need special implementations.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Back to my last post, I have been playing around with
>>>>>>>>> AllSimilarItemsCandidateItemsStrategy and
>>>>>>>>> AllUnknownItemsCandidateItemsStrategy, and although they both do
>>>>>>>>> what I wanted (recommend items not previously rated by any user), I
>>>>>>>>> honestly can't tell the difference between the two strategies. In my
>>>>>>>>> tests the output was always the same. If the eventual output of the
>>>>>>>>> recommender will not include items already rated by the user, as
>>>>>>>>> pointed out here (
>>>>>>>>> http://mail-archives.apache.org/mod_mbox/mahout-user/201403.mbox/%3CCABHkCkuv35dbwF%2B9sK88FR3hg7MAcdv0MP10v-5QWEvwmNdY%2BA%40mail.gmail.com%3E
>>>>>>>>> ), AllSimilarItemsCandidateItemsStrategy should be equivalent to
>>>>>>>>> AllUnknownItemsCandidateItemsStrategy, shouldn't it?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> AllSimilarItems returns all items that are similar to any item that
>>>>>>>> the user already knows. AllUnknownItems simply returns all items that
>>>>>>>> the user has not interacted with yet.
>>>>>>>>
>>>>>>>> These are two different things, although they might overlap in some
>>>>>>>> scenarios.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Sebastian
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> On Wed, Mar 5, 2014 at 10:23 AM, Sebastian Schelter <s...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Juan,
>>>>>>>>>>
>>>>>>>>>> that is a good catch. CandidateItemsStrategy is the right place to
>>>>>>>>>> implement this. Maybe we should simply extend its interface to add a
>>>>>>>>>> parameter that says whether to keep or remove the current user's
>>>>>>>>>> items?
>>>>>>>>>>
>>>>>>>>>> We could even do this in the abstract base class then.
>>>>>>>>>>
>>>>>>>>>> --sebastian
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 03/05/2014 10:42 AM, Juan José Ramos wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In case somebody runs into the same situation, the key seems to be
>>>>>>>>>>> in the CandidateItemsStrategy being passed to the constructor of
>>>>>>>>>>> GenericItemBasedRecommender. Looking into the code, if no
>>>>>>>>>>> CandidateItemsStrategy is specified in the constructor,
>>>>>>>>>>> PreferredItemsNeighborhoodCandidateItemsStrategy is used and, as the
>>>>>>>>>>> documentation says, the doGetCandidateItems method "returns all
>>>>>>>>>>> items that have not been rated by the user and that were preferred
>>>>>>>>>>> by another user that has preferred at least one item that the
>>>>>>>>>>> current user has preferred too".
>>>>>>>>>>>
>>>>>>>>>>> So, a different CandidateItemsStrategy needs to be passed. For this
>>>>>>>>>>> problem, it seems to me that AllSimilarItemsCandidateItemsStrategy
>>>>>>>>>>> and AllUnknownItemsCandidateItemsStrategy are good candidates. Does
>>>>>>>>>>> anybody know where to find some documentation about the different
>>>>>>>>>>> CandidateItemsStrategy implementations? Based on the names, I would
>>>>>>>>>>> say that:
>>>>>>>>>>> 1) AllSimilarItemsCandidateItemsStrategy returns all similar items
>>>>>>>>>>> regardless of whether they have already been rated by someone or
>>>>>>>>>>> not.
>>>>>>>>>>> 2) AllUnknownItemsCandidateItemsStrategy returns all similar items
>>>>>>>>>>> that have not been rated by anyone yet.
>>>>>>>>>>>
>>>>>>>>>>> Does anybody know if it works like that?
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 4, 2014 at 9:16 AM, Juan José Ramos <jjar...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> First thing is that I know this requirement would not make sense
>>>>>>>>>>>> in a CF Recommender. In my case, I am trying to use Mahout to
>>>>>>>>>>>> create something closer to a Content-Based Recommender.
>>>>>>>>>>>>
>>>>>>>>>>>> In particular, I am pre-computing a similarity matrix between all
>>>>>>>>>>>> the documents (items) of my catalogue and using that matrix as the
>>>>>>>>>>>> ItemSimilarity for my Item-Based Recommender.
>>>>>>>>>>>>
>>>>>>>>>>>> So, when a user rates a document, how could I make the recommender
>>>>>>>>>>>> output documents similar to the ones the user has already rated,
>>>>>>>>>>>> even if no other user in the system has rated them yet? Is that
>>>>>>>>>>>> even possible in the first place?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks a lot.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>
