Dear Sebastian,

It looks like setting --maxPrefsPerUser 10000 has resolved the issue in our 
case. The most preferences any user had was about 5000, so I doubled it just 
in case, but when I operationalise this model, I will make sure to calculate 
the actual maximum number of preferences per user and set the parameter 
accordingly. I will double-check the result set to make sure the issue is 
really gone, as I have only checked the few cases where we had spotted a 
recommendation of a previously preferred item.
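
A check of the whole result set could be scripted roughly as follows. This is 
only a sketch under assumptions: the preference file is taken to hold 
comma-separated userID,itemID,preference triples, and the job output is taken 
to hold one line per user in the form userID<TAB>[itemID:score,itemID:score]. 
The function names are illustrative and not part of Mahout.

```python
from collections import defaultdict


def load_preferences(lines):
    """Map each user ID to the set of item IDs already preferred.

    Assumes one 'userID,itemID,preference' triple per line.
    """
    seen = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, item = line.split(",")[:2]
        seen[user].add(item)
    return seen


def find_repeated_recommendations(pref_lines, rec_lines):
    """Return sorted (user, item) pairs where a recommended item was already preferred.

    Assumes each output line looks like 'userID<TAB>[itemID:score,itemID:score]'.
    """
    seen = load_preferences(pref_lines)
    repeats = []
    for line in rec_lines:
        line = line.strip()
        if not line:
            continue
        user, items = line.split("\t", 1)
        for entry in items.strip("[]").split(","):
            if not entry:
                continue
            item = entry.split(":", 1)[0]
            # .get avoids creating empty entries for users absent from the input
            if item in seen.get(user, ()):
                repeats.append((user, item))
    return sorted(repeats)
```

Both functions accept any iterable of lines, so open file handles for the 
preference file and the job output can be passed in directly. An empty list 
returned would suggest the issue is gone for that run.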

Would you like me to file a bug, and would you like me to test it on 0.8 or 
another version? I am using 0.7.

Thanks for your kind support.
Rafal
--
Rafal Lukawiecki
Strategic Consultant and Director 
Project Botticelli Ltd

On 31 Jul 2013, at 06:22, Sebastian Schelter <ssc.o...@googlemail.com> wrote:

Hi Rafal,

can you try to set the option --maxPrefsPerUser to the maximum number of
interactions per user and see if you still get the error?
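
One way to find that maximum is to count the lines per user in the preference 
file. The sketch below assumes comma-separated userID,itemID,preference 
triples, one per line; the function name is hypothetical.

```python
from collections import Counter


def max_prefs_per_user(lines):
    """Return the largest number of preference lines recorded for any single user.

    Assumes one 'userID,itemID,preference' triple per line.
    Returns 0 when there are no preference lines at all.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user = line.split(",", 1)[0]
        counts[user] += 1
    return max(counts.values(), default=0)
```

This counts lines rather than distinct items, so duplicate triples for the 
same user and item would push the result higher, which errs on the safe side 
for this option.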

Best,
Sebastian

On 30.07.2013 19:29, Rafal Lukawiecki wrote:
> Thank you Sebastian. The data set is not that large, as we are running tests 
> on a subset. It is about 24k users, 40k items, the preference file has 65k 
> preferences as triples. This was using Similarity Cooccurrence.
> 
> I can see if I could anonymise the data set to share if that would be helpful.
> 
> Thanks for your kind help. 
> 
> Rafal
> --
> Rafal Lukawiecki
> Pardon my brevity, sent from a telephone.
> 
> On 30 Jul 2013, at 18:18, "Sebastian Schelter" <s...@apache.org> wrote:
> 
>> Hi Rafal,
>> 
>> can you issue a ticket for this problem at
>> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests that
>> check whether this happens and currently they work fine. I can only imagine
>> that the problem occurs in larger datasets where we sample the data in some
>> places. Can you describe a scenario/dataset where this happens?
>> 
>> Best,
>> Sebastian
>> 
>> 2013/7/30 Rafal Lukawiecki <ra...@projectbotticelli.com>
>> 
>>> I'm new here, just registered. Many thanks to everyone for working on an
>>> amazing piece of software; thank you for building Mahout and for your
>>> support. My apologies if this is not the right place to ask the question. I
>>> have searched for the issue, and I can see this problem has been reported
>>> here:
>>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
>>> 
>>> Unfortunately, the trail leads to the newsgroups, and I have not found a
>>> way, yet, to get an answer from them, without asking you.
>>> 
>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7, and I
>>> am finding that it is recommending items that the user has already
>>> expressed a preference for in their input file. I understand that this
>>> should not be happening, and I am not sure if there is a known fix or if I
>>> should be looking for a workaround (such as using the entire input as the
>>> filterFile).
>>> 
>>> I will double-check that there is no error on my side, but so far it does
>>> not seem that way.
>>> 
>>> Many thanks and my regards from Ireland,
>>> Rafal Lukawiecki
>>> 
>>> --
>>> Rafal Lukawiecki
>>> Strategic Consultant and Director
>>> Project Botticelli Ltd


