Should I have set that parameter to a value much, much larger than the maximum number of preferences any user has actually expressed?
I'm working on an anonymised data set. If it works as an error test case, I'd be happy to share it for your re-test. I am still hoping it is my error, not Mahout's.

Rafal
--
Rafal Lukawiecki
Pardon brevity, mobile device.

On 1 Aug 2013, at 17:19, "Sebastian Schelter" <s...@apache.org> wrote:

> Ok, please file a bug report detailing what you've tested and what results
> you got.
>
> Just to clarify, setting maxPrefsPerUser to a high number still does not
> help? That surprises me.
>
>
> 2013/8/1 Rafal Lukawiecki <ra...@projectbotticelli.com>
>
>> Hi Sebastian,
>>
>> I've rechecked the results, and I'm afraid that the issue has not gone
>> away, contrary to my enthusiastic response yesterday. Using 0.8 I have
>> retested with and without the --maxPrefsPerUser 9000 parameter (no user
>> has more than 5000 prefs). I have also supplied the prefs file, without
>> the preference value, that is as: user,item (one per line) as a
>> --filterFile, with and without --maxPrefsPerUser, and I am afraid we are
>> still seeing recommendations for items the user has expressed a prior
>> preference for.
>>
>> I suppose I need to file a bug report.
>>
>> Rafal
>> --
>> Rafal Lukawiecki
>> Pardon my brevity, sent from a telephone.
>>
>> On 31 Jul 2013, at 22:35, "Rafal Lukawiecki" <ra...@projectbotticelli.com>
>> wrote:
>>
>>> Dear Sebastian,
>>>
>>> It looks like setting --maxPrefsPerUser 10000 has resolved the issue in
>>> our case. It seems that the most preferences a user had was just about
>>> 5000, so I doubled it just in case, but when I operationalise this model,
>>> I will make sure to calculate the actual max number of preferences and
>>> set the parameter accordingly. I will double-check the result set to make
>>> sure the issue is really gone, as I have only checked the few cases where
>>> we had spotted a recommendation of a previously preferred item.
>>>
>>> Would you like me to file a bug, and would you like me to test it on 0.8
>>> or another version? I am using 0.7.
>>>
>>> Thanks for your kind support.
>>> Rafal
>>> --
>>> Rafal Lukawiecki
>>> Strategic Consultant and Director
>>> Project Botticelli Ltd
>>>
>>> On 31 Jul 2013, at 06:22, Sebastian Schelter <ssc.o...@googlemail.com>
>>> wrote:
>>>
>>> Hi Rafal,
>>>
>>> can you try to set the option --maxPrefsPerUser to the maximum number of
>>> interactions per user and see if you still get the error?
>>>
>>> Best,
>>> Sebastian
>>>
>>> On 30.07.2013 19:29, Rafal Lukawiecki wrote:
>>>> Thank you Sebastian. The data set is not that large, as we are running
>>>> tests on a subset. It is about 24k users, 40k items, the preference file
>>>> has 65k preferences as triples. This was using Similarity Cooccurrence.
>>>>
>>>> I can see if I could anonymise the data set to share if that would be
>>>> helpful.
>>>>
>>>> Thanks for your kind help.
>>>>
>>>> Rafal
>>>> --
>>>> Rafal Lukawiecki
>>>> Pardon my brevity, sent from a telephone.
>>>>
>>>> On 30 Jul 2013, at 18:18, "Sebastian Schelter" <s...@apache.org> wrote:
>>>>
>>>>> Hi Rafal,
>>>>>
>>>>> can you issue a ticket for this problem at
>>>>> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests that
>>>>> check whether this happens and currently they work fine. I can only
>>>>> imagine that the problem occurs in larger datasets where we sample the
>>>>> data in some places. Can you describe a scenario/dataset where this
>>>>> happens?
>>>>>
>>>>> Best,
>>>>> Sebastian
>>>>>
>>>>> 2013/7/30 Rafal Lukawiecki <ra...@projectbotticelli.com>
>>>>>
>>>>>> I'm new here, just registered. Many thanks to everyone for working on
>>>>>> an amazing piece of software, thank you for building Mahout and for
>>>>>> your support.
>>>>>> My apologies if this is not the right place to ask the question. I
>>>>>> have searched for the issue, and I can see this problem has been
>>>>>> reported here:
>>>>>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
>>>>>>
>>>>>> Unfortunately, the trail leads to the newsgroups, and I have not yet
>>>>>> found a way to get an answer from them without asking you.
>>>>>>
>>>>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7, and
>>>>>> I am finding that it is recommending items that the user has already
>>>>>> expressed a preference for in their input file. I understand that this
>>>>>> should not be happening, and I am not sure if there is a known fix or
>>>>>> if I should be looking for a workaround (such as using the entire
>>>>>> input as the filterFile).
>>>>>>
>>>>>> I will double-check that there is no error on my side, but so far it
>>>>>> does not seem that way.
>>>>>>
>>>>>> Many thanks and my regards from Ireland,
>>>>>> Rafal Lukawiecki
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Rafal Lukawiecki
>>>>>>
>>>>>> Strategic Consultant and Director
>>>>>>
>>>>>> Project Botticelli Ltd
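
The checks discussed in this thread (finding the real maximum number of preferences per user, building a user,item filter file from the preference triples, and spotting recommendations for items a user has already rated) can be sketched outside Mahout in a few lines of Python. This is only a rough sketch under stated assumptions: the function names are hypothetical, the input is assumed to be the user,item,value CSV described above, and the recommender output is assumed to have one line per user in the form userID<TAB>[itemID:score,itemID:score,...]. Adjust the parsing if your output looks different.

```python
import csv
from collections import defaultdict


def load_prefs(lines):
    """Read 'user,item[,value]' lines into {user: set(items)}."""
    prefs = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) >= 2:  # skip blank or malformed lines
            prefs[row[0].strip()].add(row[1].strip())
    return prefs


def max_prefs_per_user(prefs):
    """Largest number of distinct items any single user has rated."""
    return max((len(items) for items in prefs.values()), default=0)


def filter_lines(prefs):
    """Yield 'user,item' lines (no preference value) for a filter file."""
    for user in sorted(prefs):
        for item in sorted(prefs[user]):
            yield f"{user},{item}"


def parse_recommendation_line(line):
    """Parse 'user<TAB>[item:score,...]' (ASSUMED output layout)."""
    user, _, rest = line.rstrip("\n").partition("\t")
    body = rest.strip().strip("[]")
    items = [pair.split(":", 1)[0] for pair in body.split(",") if pair]
    return user, items


def find_repeats(prefs, rec_lines):
    """Return (user, item) pairs that were recommended despite a prior preference."""
    repeats = []
    for line in rec_lines:
        if not line.strip():
            continue
        user, items = parse_recommendation_line(line)
        seen = prefs.get(user, set())
        repeats.extend((user, item) for item in items if item in seen)
    return repeats


# Usage sketch (file names are hypothetical):
#   with open("prefs.csv", newline="") as f:
#       prefs = load_prefs(f)
#   print(max_prefs_per_user(prefs))
#   with open("recommendations.txt") as f:
#       print(find_repeats(prefs, f))
```

An empty list from find_repeats would suggest the output contains no already-preferred items. A non-empty list gives concrete user/item pairs that could be attached to a bug report.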