Should I have set that parameter to a value much, much larger than the maximum
number of preferences actually expressed by a user?

I'm working on an anonymised data set. If it works as an error test case, I'd 
be happy to share it for your re-test. I am still hoping it is my error, not 
Mahout's.
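
For reference, the actual maximum number of preferences per user can be counted directly from the input file before choosing a value for --maxPrefsPerUser. This is a minimal sketch, assuming the input is the comma-separated user,item[,preference] format described in this thread; the function name and the sample rows are hypothetical and not part of Mahout.

```python
import csv
import io
from collections import Counter


def max_prefs_per_user(lines):
    """Return (user_id, count) for the user with the most preference rows.

    `lines` is any iterable of text lines in user,item[,preference] form
    (assumed format). Returns (None, 0) when there are no rows.
    """
    counts = Counter()
    for row in csv.reader(lines):
        if row:  # skip blank lines
            counts[row[0].strip()] += 1
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


# Hypothetical sample data: user 1 has two preferences, user 2 has one.
sample = io.StringIO("1,101,5.0\n1,102,3.0\n2,101,4.0\n")
top_user, top_count = max_prefs_per_user(sample)
print(top_user, top_count)
```

With a real file, pass an open file handle instead of the in-memory sample, then compare the reported count with the value given to --maxPrefsPerUser.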

Rafal
--
Rafal Lukawiecki
Pardon brevity, mobile device.

On 1 Aug 2013, at 17:19, "Sebastian Schelter" <s...@apache.org> wrote:

> Ok, please file a bug report detailing what you've tested and what results
> you got.
> 
> Just to clarify, setting maxPrefsPerUser to a high number still does not
> help? That surprises me.
> 
> 
> 2013/8/1 Rafal Lukawiecki <ra...@projectbotticelli.com>
> 
>> Hi Sebastian,
>> 
>> I've rechecked the results and, I'm afraid, the issue has not gone
>> away, contrary to my enthusiastic response yesterday. Using 0.8, I have
>> retested with and without the --maxPrefsPerUser 9000 parameter (no user has
>> more than 5000 prefs). I have also supplied the prefs file, without the
>> preference value, that is as: user,item (one per line) as a --filterFile,
>> with and without --maxPrefsPerUser, and I am afraid we are still seeing
>> recommendations for items the user has expressed a prior preference for.
>> 
>> I suppose I need to file a bug report.
>> 
>> Rafal
>> --
>> Rafal Lukawiecki
>> Pardon my brevity, sent from a telephone.
>> 
>> On 31 Jul 2013, at 22:35, "Rafal Lukawiecki" <ra...@projectbotticelli.com>
>> wrote:
>> 
>>> Dear Sebastian,
>>> 
>>> It looks like setting --maxPrefsPerUser 10000 has resolved the issue in
>>> our case. It seems that the most preferences any user had was just about
>>> 5000, so I doubled it just in case, but when I operationalise this model, I
>>> will make sure to calculate the actual max number of preferences and set
>>> the parameter accordingly. I will double-check the result set to make sure
>>> the issue is really gone, as I have only checked the few cases where we
>>> had spotted a recommendation of a previously preferred item.
>>> 
>>> Would you like me to file a bug, and would you like me to test it on 0.8
>>> or another version? I am using 0.7.
>>> 
>>> Thanks for your kind support.
>>> Rafal
>>> --
>>> Rafal Lukawiecki
>>> Strategic Consultant and Director
>>> Project Botticelli Ltd
>>> 
>>> On 31 Jul 2013, at 06:22, Sebastian Schelter <ssc.o...@googlemail.com>
>>> wrote:
>>> 
>>> Hi Rafal,
>>> 
>>> can you try to set the option --maxPrefsPerUser to the maximum number of
>>> interactions per user and see if you still get the error?
>>> 
>>> Best,
>>> Sebastian
>>> 
>>> On 30.07.2013 19:29, Rafal Lukawiecki wrote:
>>>> Thank you, Sebastian. The data set is not that large, as we are running
>>>> tests on a subset. It has about 24k users and 40k items, and the preference
>>>> file has 65k preferences as triples. This was using Similarity Cooccurrence.
>>>> 
>>>> I can see if I could anonymise the data set to share if that would be
>>>> helpful.
>>>> 
>>>> Thanks for your kind help.
>>>> 
>>>> Rafal
>>>> --
>>>> Rafal Lukawiecki
>>>> Pardon my brevity, sent from a telephone.
>>>> 
>>>> On 30 Jul 2013, at 18:18, "Sebastian Schelter" <s...@apache.org> wrote:
>>>> 
>>>>> Hi Rafal,
>>>>> 
>>>>> can you issue a ticket for this problem at
>>>>> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests that
>>>>> check whether this happens and currently they work fine. I can only
>>>>> imagine that the problem occurs in larger datasets where we sample the
>>>>> data in some places. Can you describe a scenario/dataset where this
>>>>> happens?
>>>>> 
>>>>> Best,
>>>>> Sebastian
>>>>> 
>>>>> 2013/7/30 Rafal Lukawiecki <ra...@projectbotticelli.com>
>>>>> 
>>>>>> I'm new here, just registered. Many thanks to everyone for working on
>>>>>> an amazing piece of software, thank you for building Mahout and for
>>>>>> your support. My apologies if this is not the right place to ask the
>>>>>> question. I have searched for the issue, and I can see this problem has
>>>>>> been reported here:
>>>>>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
>>>>>> 
>>>>>> Unfortunately, the trail leads to the newsgroups, and I have not
>>>>>> found a way, yet, to get an answer from them, without asking you.
>>>>>> 
>>>>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7,
>>>>>> and I am finding that it is recommending items that the user has
>>>>>> already expressed a preference for in their input file. I understand
>>>>>> that this should not be happening, and I am not sure if there is a
>>>>>> known fix or if I should be looking for a workaround (such as using the
>>>>>> entire input as the filterFile).
>>>>>> 
>>>>>> I will double-check that there is no error on my side, but so far it
>>>>>> does not seem that way.
>>>>>> 
>>>>>> Many thanks and my regards from Ireland,
>>>>>> Rafal Lukawiecki
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Rafal Lukawiecki
>>>>>> 
>>>>>> Strategic Consultant and Director
>>>>>> 
>>>>>> Project Botticelli Ltd
>> 

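The symptom reported in this thread (recommendations for items a user has already expressed a preference for) can be checked mechanically against the job output. This is a minimal sketch, not part of Mahout. It assumes the input preferences are user,item[,preference] lines as described above, and that each output line looks like userID, a tab, then a bracketed list of itemID:score pairs with numeric item IDs. That output layout is an assumption and should be verified against the actual files. The function names and sample data are hypothetical.

```python
import csv
import io
import re


def load_prefs(lines):
    """Map each user ID to the set of item IDs found in the input preferences."""
    seen = {}
    for row in csv.reader(lines):
        if len(row) >= 2:  # ignore blank or malformed rows
            seen.setdefault(row[0].strip(), set()).add(row[1].strip())
    return seen


def find_repeats(rec_lines, seen):
    """Yield (user, item) pairs where a recommended item was already preferred.

    Assumed output line layout: "<user>\t[<item>:<score>,<item>:<score>,...]".
    """
    for line in rec_lines:
        line = line.strip()
        if not line:
            continue
        user, _, rest = line.partition("\t")
        already = seen.get(user, set())
        for item in re.findall(r"(\d+):", rest):  # digits directly before a colon
            if item in already:
                yield user, item


# Hypothetical sample: user 1 already rated item 101, which is recommended again.
prefs = io.StringIO("1,101,5.0\n1,102,3.0\n2,101,4.0\n")
recs = io.StringIO("1\t[101:4.5,103:3.9]\n2\t[102:4.1]\n")
repeats = list(find_repeats(recs, load_prefs(prefs)))
print(repeats)
```

An empty result over the full output would suggest the filtering works for that run; any pair it reports is a concrete case to attach to a bug report.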