@Tevfik, running this recommender:

GenericItemBasedRecommender itemRecommender =
    new GenericItemBasedRecommender(dataModel, itemSimilarity,
        new AllSimilarItemsCandidateItemsStrategy(itemSimilarity),
        new AllSimilarItemsCandidateItemsStrategy(itemSimilarity));


With this dataModel (userID,itemID,preference):
1,1,1.0
1,2,2.0
1,3,1.0
1,4,2.0
2,1,1.0
2,2,4.0


And these similarities (itemID1,itemID2,similarity):
1,2,0.1
1,3,0.2
1,4,0.3
2,3,0.5
3,4,0.5
5,1,0.2
5,2,1.0

This returns item 5 for user 1. Item 5 has not been preferred by user 1, and
the similarities between item 5 and two of the items user 1 preferred (items 1
and 2) are not NaN, yet AllSimilarItemsCandidateItemsStrategy is returning that
item. So, I'm truly sorry to insist on this, but I still really do not get the
difference.
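
To make the comparison concrete, here is a minimal plain-Java sketch of the two
behaviours as Sebastian described them, run on the data above. It does not use
Mahout at all: the class name, the method names and the choice to treat "all
items" as "the items that appear in the data model" are my own assumptions for
illustration, not the library's actual code.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class CandidateStrategySketch {

    // Item IDs that appear in the preference data above (item 5 has no ratings).
    static final Set<Long> ITEMS_IN_MODEL =
        new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));

    // Item pairs with a known (non-NaN) similarity, taken from the list above.
    // The similarity values themselves are not needed to pick candidates.
    static final long[][] SIMILAR_PAIRS = {
        {1, 2}, {1, 3}, {1, 4}, {2, 3}, {3, 4}, {5, 1}, {5, 2}
    };

    // Assumed "all unknown items" behaviour: every item in the data model
    // that the user has not rated.
    static Set<Long> allUnknownItems(Set<Long> itemsInModel, Set<Long> preferred) {
        Set<Long> candidates = new TreeSet<>(itemsInModel);
        candidates.removeAll(preferred);
        return candidates;
    }

    // Assumed "all similar items" behaviour: every item with a known similarity
    // to at least one rated item, minus the items the user has already rated.
    static Set<Long> allSimilarItems(long[][] similarPairs, Set<Long> preferred) {
        Set<Long> candidates = new TreeSet<>();
        for (long[] pair : similarPairs) {
            if (preferred.contains(pair[0])) candidates.add(pair[1]);
            if (preferred.contains(pair[1])) candidates.add(pair[0]);
        }
        candidates.removeAll(preferred);
        return candidates;
    }

    public static void main(String[] args) {
        Set<Long> user1 = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        Set<Long> user2 = new TreeSet<>(Arrays.asList(1L, 2L));

        System.out.println("user 1 unknown: " + allUnknownItems(ITEMS_IN_MODEL, user1));
        System.out.println("user 1 similar: " + allSimilarItems(SIMILAR_PAIRS, user1));
        System.out.println("user 2 unknown: " + allUnknownItems(ITEMS_IN_MODEL, user2));
        System.out.println("user 2 similar: " + allSimilarItems(SIMILAR_PAIRS, user2));
    }
}
```

Under these assumptions the two sets differ only for items such as item 5,
which nobody has rated but which has a similarity entry with a rated item.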


On Wed, Mar 5, 2014 at 2:53 PM, Tevfik Aytekin <tevfik.ayte...@gmail.com> wrote:

> Juan,
> You got me wrong,
>
> AllSimilarItemsCandidateItemsStrategy
>
> returns all items that have not been rated by the user and the
> similarity metric returns a non-NaN similarity value with at
> least one of the items preferred by the user.
>
> So, it does not simply return all items that have not been rated by
> the user. For example, if there is an item X which has not been rated
> by the user and if the similarity value between X and at least one of
> the items rated (preferred) by the user is not NaN, then X will not
> be returned by AllSimilarItemsCandidateItemsStrategy, but it will be
> returned by AllUnknownItemsCandidateItemsStrategy.
>
>
>
> On Wed, Mar 5, 2014 at 4:42 PM, Juan José Ramos <jjar...@gmail.com> wrote:
> > Hi Tevfik,
> >
> > Thanks for the response. I think what you say contradicts what Sebastian
> > pointed out before. Also, if AllSimilarItemsCandidateItemsStrategy
> returns
> > all items that have not been rated by the user, what would
> > AllUnknownItemsCandidateItemsStrategy return?
> >
> >
> > On Wed, Mar 5, 2014 at 1:40 PM, Tevfik Aytekin <tevfik.ayte...@gmail.com
> >wrote:
> >
> >> Sorry there was a typo in the previous paragraph.
> >>
> >> If I remember correctly, AllSimilarItemsCandidateItemsStrategy
> >>
> >> returns all items that have not been rated by the user and the
> >> similarity metric returns a non-NaN similarity value with at
> >> least one of the items preferred by the user.
> >>
> >> On Wed, Mar 5, 2014 at 3:38 PM, Tevfik Aytekin <
> tevfik.ayte...@gmail.com>
> >> wrote:
> >> > Hi Juan,
> >> >
> >> > If I remember correctly, AllSimilarItemsCandidateItemsStrategy
> >> >
> >> > returns all items that have not been rated by the user and the
> >> > similarity metric returns a non-NaN similarity value that is with at
> >> > least one of the items preferred by the user.
> >> >
> >> > Tevfik
> >> >
> >> > On Wed, Mar 5, 2014 at 2:30 PM, Sebastian Schelter <s...@apache.org>
> >> wrote:
> >> >> On 03/05/2014 01:23 PM, Juan José Ramos wrote:
> >> >>>
> >> >>> Thanks for the reply, Sebastian.
> >> >>>
> >> >>> I am not sure if that should be implemented in the Abstract base
> class
> >> >>> though because for
> >> >>> instance PreferredItemsNeighborhoodCandidateItemsStrategy, by
> >> definition,
> >> >>> it returns the item not rated by the user and rated by somebody
> else.
> >> >>
> >> >>
> >> >> Good point. So we seem to need special implementations.
> >> >>
> >> >>
> >> >>>
> >> >>> Back to my last post, I have been playing around with
> >> >>> AllSimilarItemsCandidateItemsStrategy
> >> >>> and AllUnknownItemsCandidateItemsStrategy, and although they both do
> >> what
> >> >>> I
> >> >>> wanted (recommend items not previously rated by any user), I
> honestly
> >> >>> can't
> >> >>> tell the difference between the two strategies. In my tests the
> output
> >> was
> >> >>> always the same. If the eventual output of the recommender will not
> >> >>> include
> >> >>> items already rated by the user as pointed out here (
> >> >>>
> >> >>>
> >>
> http://mail-archives.apache.org/mod_mbox/mahout-user/201403.mbox/%3CCABHkCkuv35dbwF%2B9sK88FR3hg7MAcdv0MP10v-5QWEvwmNdY%2BA%40mail.gmail.com%3E
> >> ),
> >> >>> AllSimilarItemsCandidateItemsStrategy should be equivalent to
> >> >>> AllUnknownItemsCandidateItemsStrategy, shouldn't it?
> >> >>
> >> >>
> >> >> AllSimilarItems returns all items that are similar to any item that
> the
> >> user
> >> >> already knows. AllUnknownItems simply returns all items that the user
> >> has
> >> >> not interacted with yet.
> >> >>
> >> >> These are two different things, although they might overlap in some
> >> >> scenarios.
> >> >>
> >> >> Best,
> >> >> Sebastian
> >> >>
> >> >>
> >> >>
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> On Wed, Mar 5, 2014 at 10:23 AM, Sebastian Schelter <s...@apache.org
> >
> >> >>> wrote:
> >> >>>>
> >> >>>>
> >> >>>> Hi Juan,
> >> >>>>
> >> >>>> that is a good catch. CandidateItemsStrategy is the right place to
> >> >>>
> >> >>> implement this. Maybe we should simply extend its interface to add a
> >> >>> parameter that says whether to keep or remove the current user's
> items?
> >> >>>>
> >> >>>>
> >> >>>> We could even do this in the abstract base class then.
> >> >>>>
> >> >>>> --sebastian
> >> >>>>
> >> >>>>
> >> >>>> On 03/05/2014 10:42 AM, Juan José Ramos wrote:
> >> >>>>>
> >> >>>>>
> >> >>>>> In case somebody runs into the same situation, the key seems to
> be in
> >> >>>>> the
> >> >>>>> CandidateItemsStrategy being passed to the constructor
> >> >>>>> of GenericItemBasedRecommender. Looking into the code, if no
> >> >>>>> CandidateItemsStrategy is specified in the
> >> >>>>> constructor, PreferredItemsNeighborhoodCandidateItemsStrategy is
> used
> >> >>>>> and
> >> >>>>> as the documentation says, the doGetCandidateItems method:
> "returns
> >> all
> >> >>>>> items that have not been rated by the user and that were
> preferred by
> >> >>>>> another user that has preferred at least one item that the current
> >> user
> >> >>>
> >> >>> has
> >> >>>>>
> >> >>>>> preferred too".
> >> >>>>>
> >> >>>>> So, a different CandidateItemsStrategy needs to be passed. For this
> >> >>>
> >> >>> problem,
> >> >>>>>
> >> >>>>> it seems to me that AllSimilarItemsCandidateItemsStrategy,
> >> >>>>> AllUnknownItemsCandidateItemsStrategy are good candidates. Does
> >> anybody
> >> >>>>> know where to find some documentation about the different
> >> >>>>> CandidateItemsStrategy implementations? Based on the name I would say that:
> >> >>>>> 1) AllSimilarItemsCandidateItemsStrategy returns all similar items
> >> >>>>> regardless of whether they have been already rated by someone or
> not.
> >> >>>>> 2) AllUnknownItemsCandidateItemsStrategy returns all similar items
> >> that
> >> >>>>> have not been rated by anyone yet.
> >> >>>>>
> >> >>>>> Does anybody know if it works like that?
> >> >>>>> Thanks.
> >> >>>>>
> >> >>>>>
> >> >>>>> On Tue, Mar 4, 2014 at 9:16 AM, Juan José Ramos <
> jjar...@gmail.com>
> >> >>>
> >> >>> wrote:
> >> >>>>>
> >> >>>>>
> >> >>>>>> First thing is that I know this requirement would not make sense
> in
> >> a CF
> >> >>>>>> Recommender. In my case, I am trying to use Mahout to create
> >> something
> >> >>>>>> closer to a Content-Based Recommender.
> >> >>>>>>
> >> >>>>>> In particular, I am pre-computing a similarity matrix between all
> >> the
> >> >>>>>> documents (items) of my catalogue and using that matrix as the
> >> >>>>>> ItemSimilarity for my Item-Based Recommender.
> >> >>>>>>
> >> >>>>>> So, when a user rates a document, how could I make the
> recommender
> >> >>>
> >> >>> output
> >> >>>>>>
> >> >>>>>> similar documents to the ones the user has already rated even
> if no
> >> >>>
> >> >>> other
> >> >>>>>>
> >> >>>>>> user in the system has rated them yet? Is that even possible in
> the
> >> >>>
> >> >>> first
> >> >>>>>>
> >> >>>>>> place?
> >> >>>>>>
> >> >>>>>> Thanks a lot.
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
>
