Ideally, you would file a bug and see whether it still happens with trunk.
I think the problem comes from the fact that we only use a certain number
of preferences from the user for the final recommendation phase. Therefore,
we can end up recommending an item whose preference we ignored.
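
As a rough aid for picking a --maxPrefsPerUser value, here is a minimal
Python sketch (not part of Mahout) that counts how many preferences each
user has and reports the largest count. It assumes the input is
comma-separated text with the user ID in the first field, as in
userID,itemID,preference triples; the sample rows are made up for
illustration.

```python
from collections import Counter
import csv
import io


def max_prefs_per_user(rows):
    """Return the largest number of preference rows any single user has."""
    counts = Counter()
    for row in csv.reader(rows):
        if not row:  # skip blank lines
            continue
        counts[row[0]] += 1  # assumption: first field is the user ID
    return max(counts.values(), default=0)


# Tiny in-memory example; for a real run, pass an open file object instead.
sample = io.StringIO("1,101,5.0\n1,102,3.0\n2,101,4.0\n1,103,1.0\n")
print(max_prefs_per_user(sample))  # prints 3 (user 1 has three rows)
```

Setting the option to at least this value should mean no preferences are
dropped for any user, assuming the cap is indeed the cause.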

Best,
Sebastian



2013/7/31 Rafal Lukawiecki <ra...@projectbotticelli.com>

> Dear Sebastian,
>
> It looks like setting --maxPrefsPerUser 10000 has resolved the issue in
> our case. It seems that the most preferences any user had was just about
> 5000, so I doubled it just in case, but when I operationalise this model, I
> will make sure to calculate the actual max number of preferences and set the
> parameter accordingly. I will double-check the result set to make sure the
> issue is really gone, as I have only checked the few cases where we have
> spotted a recommendation of a previously preferred item.
>
> Would you like me to file a bug, and would you like me to test it on 0.8
> or another version? I am using 0.7.
>
> Thanks for your kind support.
> Rafal
> --
> Rafal Lukawiecki
> Strategic Consultant and Director
> Project Botticelli Ltd
>
> On 31 Jul 2013, at 06:22, Sebastian Schelter <ssc.o...@googlemail.com>
>  wrote:
>
> Hi Rafal,
>
> can you try to set the option --maxPrefsPerUser to the maximum number of
> interactions per user and see if you still get the error?
>
> Best,
> Sebastian
>
> On 30.07.2013 19:29, Rafal Lukawiecki wrote:
> > Thank you Sebastian. The data set is not that large, as we are running
> > tests on a subset. It has about 24k users and 40k items, and the preference
> > file has 65k preferences as triples. This was using Similarity Cooccurrence.
> >
> > I can see if I could anonymise the data set to share if that would be
> > helpful.
> >
> > Thanks for your kind help.
> >
> > Rafal
> > --
> > Rafal Lukawiecki
> > Pardon my brevity, sent from a telephone.
> >
> > On 30 Jul 2013, at 18:18, "Sebastian Schelter" <s...@apache.org> wrote:
> >
> >> Hi Rafal,
> >>
> >> can you issue a ticket for this problem at
> >> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests that
> >> check whether this happens and currently they work fine. I can only imagine
> >> that the problem occurs in larger datasets where we sample the data in some
> >> places. Can you describe a scenario/dataset where this happens?
> >>
> >> Best,
> >> Sebastian
> >>
> >> 2013/7/30 Rafal Lukawiecki <ra...@projectbotticelli.com>
> >>
> >>> I'm new here, just registered. Many thanks to everyone for working on an
> >>> amazing piece of software, thank you for building Mahout and for your
> >>> support. My apologies if this is not the right place to ask the question. I
> >>> have searched for the issue, and I can see this problem has been reported
> >>> here:
> >>>
> >>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
> >>>
> >>> Unfortunately, the trail leads to the newsgroups, and I have not found a
> >>> way, yet, to get an answer from them, without asking you.
> >>>
> >>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7, and I
> >>> am finding that it is recommending items that the user has already
> >>> expressed a preference for in their input file. I understand that this
> >>> should not be happening, and I am not sure if there is a known fix or if I
> >>> should be looking for a workaround (such as using the entire input as the
> >>> filterFile).
> >>>
> >>> I will double-check that there is no error on my side, but so far it does
> >>> not seem that way.
> >>>
> >>> Many thanks and my regards from Ireland,
> >>> Rafal Lukawiecki
> >>>
> >>> --
> >>>
> >>> Rafal Lukawiecki
> >>>
> >>> Strategic Consultant and Director
> >>>
> >>> Project Botticelli Ltd
> >>>
> >>>
>
>
>
>
