Dear Sebastian,

It looks like setting --maxPrefsPerUser 10000 has resolved the issue in our case. It seems that the most preferences any one user had was just about 5000, so I doubled it just in case, but when I operationalise this model, I will make sure to calculate the actual maximum number of preferences and set the parameter accordingly. I will double-check the result set to make sure the issue is really gone, as I have only checked the few cases where we had spotted a recommendation of a previously preferred item.
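
As a rough illustration of the two checks described above (finding the actual maximum number of preferences per user, and spotting recommendations of previously preferred items), here is a minimal Python sketch. It assumes the preference file holds comma-separated `userID,itemID,preference` triples and that each RecommenderJob output line looks like `userID<TAB>[itemID:score,...]`. Both formats, and all function names, are assumptions made for illustration rather than anything confirmed in this thread.

```python
import csv
from collections import defaultdict


def load_prefs(lines):
    """Parse 'userID,itemID,preference' lines into {user: set of items}.

    The comma-separated triple layout is an assumption about the input file.
    """
    prefs = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        prefs[row[0]].add(row[1])
    return prefs


def max_prefs_per_user(prefs):
    """Largest number of distinct preferred items held by any single user."""
    return max((len(items) for items in prefs.values()), default=0)


def already_preferred(prefs, rec_lines):
    """Return {user: [items]} for items recommended despite an existing preference.

    Assumes each output line looks like 'userID<TAB>[item:score,item:score]'.
    """
    repeats = {}
    for line in rec_lines:
        line = line.strip()
        if not line:
            continue
        user, _, rest = line.partition("\t")
        items = [pair.split(":")[0] for pair in rest.strip("[]").split(",") if pair]
        clash = [item for item in items if item in prefs.get(user, set())]
        if clash:
            repeats[user] = clash
    return repeats
```

The value returned by `max_prefs_per_user` could then be passed as `--maxPrefsPerUser`, and an empty result from `already_preferred` would suggest that no previously preferred items remain in the recommendations.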

Would you like me to file a bug, and would you like me to test it on 0.8 or another version? I am using 0.7.

Thanks for your kind support.

Rafal
--
Rafal Lukawiecki
Strategic Consultant and Director
Project Botticelli Ltd

On 31 Jul 2013, at 06:22, Sebastian Schelter <ssc.o...@googlemail.com> wrote:

Hi Rafal,

can you try to set the option --maxPrefsPerUser to the maximum number of interactions per user and see if you still get the error?

Best,
Sebastian

On 30.07.2013 19:29, Rafal Lukawiecki wrote:
> Thank you Sebastian. The data set is not that large, as we are running tests
> on a subset. It is about 24k users, 40k items, the preference file has 65k
> preferences as triples. This was using Similarity Cooccurrence.
>
> I can see if I could anonymise the data set to share if that would be helpful.
>
> Thanks for your kind help.
>
> Rafal
> --
> Rafal Lukawiecki
> Pardon my brevity, sent from a telephone.
>
> On 30 Jul 2013, at 18:18, "Sebastian Schelter" <s...@apache.org> wrote:
>
>> Hi Rafal,
>>
>> can you issue a ticket for this problem at
>> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests that
>> check whether this happens and currently they work fine. I can only imagine
>> that the problem occurs in larger datasets where we sample the data in some
>> places. Can you describe a scenario/dataset where this happens?
>>
>> Best,
>> Sebastian
>>
>> 2013/7/30 Rafal Lukawiecki <ra...@projectbotticelli.com>
>>
>>> I'm new here, just registered. Many thanks to everyone for working on an
>>> amazing piece of software, thank you for building Mahout and for your
>>> support.
>>> My apologies if this is not the right place to ask the question; I
>>> have searched for the issue, and I can see this problem has been reported
>>> here:
>>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
>>>
>>> Unfortunately, the trail leads to the newsgroups, and I have not found a
>>> way, yet, to get an answer from them, without asking you.
>>>
>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7, and I
>>> am finding that it is recommending items that the user has already
>>> expressed a preference for in their input file. I understand that this
>>> should not be happening, and I am not sure if there is a known fix or if I
>>> should be looking for a workaround (such as using the entire input as the
>>> filterFile).
>>>
>>> I will double-check that there is no error on my side, but so far it does
>>> not seem that way.
>>>
>>> Many thanks and my regards from Ireland,
>>> Rafal Lukawiecki
>>>
>>> --
>>> Rafal Lukawiecki
>>> Strategic Consultant and Director
>>> Project Botticelli Ltd