It is true that a process based on user-user similarity alone won't be
able to recommend item 4 in this example. This is a drawback of the
algorithm and not something that can be worked around. You could try
not to choose this item for the test set, but then the evaluation does
not quite reflect reality.
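[The situation described above can be sketched in a few lines. This is a toy illustration only, not Mahout code; the data, the `similarity` and `predict` helpers, and the cosine weighting are all made up for the example. It shows why a user-based estimator returns NaN for an item that no similar user has rated:]

```python
import math

# Toy ratings: user -> {item: rating}. Item 4 is rated only by user "d",
# who shares no co-rated items with "a", so a user-based recommender
# built from "a"'s neighborhood can never surface item 4.
ratings = {
    "a": {1: 5.0, 2: 4.0, 3: 4.5},
    "b": {1: 4.5, 2: 4.0, 3: 5.0},
    "c": {1: 5.0, 2: 3.5},
    "d": {4: 2.0},
}

def similarity(u, v):
    """Cosine similarity over co-rated items (0.0 if there are none)."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    num = sum(ratings[u][i] * ratings[v][i] for i in common)
    du = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    dv = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return num / (du * dv)

def predict(user, item, k=2):
    """Estimate `user`'s rating of `item` from the k most similar
    neighbors who rated it; NaN when no such neighbor exists."""
    neighbors = sorted(
        (v for v in ratings if v != user and item in ratings[v]),
        key=lambda v: similarity(user, v), reverse=True)[:k]
    weighted = [(similarity(user, v), ratings[v][item]) for v in neighbors]
    weighted = [(w, r) for w, r in weighted if w > 0]
    if not weighted:
        return float("nan")   # no similar user rated this item
    return sum(w * r for w, r in weighted) / sum(w for w, _ in weighted)

print(predict("a", 3))   # ~5.0: neighbor "b" rated item 3
print(predict("a", 4))   # nan: only "d" rated item 4, similarity 0
```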
Thank you for your reply.
I think the evaluation process involves randomly choosing the evaluation
proportion. The problem is that I always get the best result when I set
the neighborhood size to 2, which seems unreasonable to me, since there
should be many test cases that the recommender system couldn't predict
at all.
It may be true that the results are best with a neighborhood size of
2. Why is that surprising? Very similar people, by nature, rate
similar things, which makes the things you held out of a user's test
set likely to be found in the recommendations.
The mapping you suggest is not that sensible.
Thank you for the quick response.
I agree that a neighborhood size of 2 will make the predictions more
sensible. But my concern is that a neighborhood size of 2 can only
predict a very small proportion of the preferences for each user.
Looking back at the previous example, how could it ever predict item 4?
You can't predict item 4 in that case. That shows the weakness of
neighborhood approaches for sparse data. That's pretty much the story
-- it's all working correctly. Maybe you should not use this approach.
On Wed, May 8, 2013 at 4:00 PM, Zhongduo Lin zhong...@gmail.com wrote:
Thank you for your reply. So in the case that item 4 is in the test set,
will Mahout just not take it into consideration, or will it generate a
preference instead? And is there any way to evaluate the mapping
algorithm in Mahout?
Best Regards,
Jimmy
On 13-05-08 11:09 AM, Sean Owen wrote:
It may be selected as a test item. Other algorithms can predict the
'4'. The test process is random so as not to favor any one algorithm.
I think you are just arguing that the algorithm you are using isn't
good for your data -- so just don't use it. Is that not the answer?
I don't know what you mean.
Sorry for the confusion. I am comparing different algorithms, including
both user-based and item-based. So I think it would be useful to know
how Mahout deals with such a situation in order to make a fairer
comparison, because for now the user-based approaches give me better
results.
AFAIK, the recommender would predict a NaN, which will be ignored by the
evaluator.
However, I am not sure if there is any way to know how many of these
were actually produced in the evaluation step, that is, something like
the count of predictions with a NaN value.
Cheers,
Alex
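[One way to get that count is to do the bookkeeping yourself. The sketch below is illustrative only, not Mahout code; the `ratings` data, the `predict` stand-in, and the `evaluate` helper are invented for the example. It computes the average absolute difference while also counting the held-out preferences that came back NaN:]

```python
import math

# Toy hold-out evaluation that reports, alongside the average absolute
# difference, how many held-out preferences could not be predicted.
ratings = {
    "a": {1: 4.0, 2: 3.0},
    "b": {1: 5.0, 2: 4.0},
    "c": {3: 2.0},
}

def predict(user, item, train):
    """Mean of the other users' training ratings for `item`; NaN when
    nobody else rated it (a stand-in for the real estimator)."""
    vals = [p[item] for v, p in train.items() if v != user and item in p]
    return sum(vals) / len(vals) if vals else float("nan")

def evaluate(test_pairs):
    """Hold out the given (user, item) pairs, then score the rest."""
    train = {u: {i: r for i, r in prefs.items() if (u, i) not in test_pairs}
             for u, prefs in ratings.items()}
    errors, unable = [], 0
    for u, i in test_pairs:
        est = predict(u, i, train)
        if math.isnan(est):
            unable += 1            # the evaluator would silently skip this
        else:
            errors.append(abs(est - ratings[u][i]))
    mae = sum(errors) / len(errors) if errors else float("nan")
    return mae, unable

mae, unable = evaluate([("a", 1), ("c", 3)])
print(mae, unable)   # item 3 has no other rater, so one NaN is counted
```

A high `unable` count relative to the test-set size is exactly the warning sign discussed below: the averaged score then covers only the easy cases.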
This accounts for why a neighborhood size of 2 always gives me the best
result. Thank you!
Best Regards,
Jimmy
Zhongduo Lin (Jimmy)
MASc candidate in ECE department
University of Toronto
On 2013-05-08 2:40 PM, Alejandro Bellogin Kouki wrote:
Ah, yes, that's right. If you have a lot of these values, the test is
really not valid. It may look 'better', but it isn't, for just this
reason. You want to make sure the result doesn't have many of these, or
else you should discard it. Look for log lines like "Unable to
recommend in X cases".
I see. Thank you for the information! Any idea about evaluating the
method of mapping inferred preferences to a smaller scale with Mahout?
Best Regards,
Jimmy
Zhongduo Lin (Jimmy)
MASc candidate in ECE department
University of Toronto
On 2013-05-08 3:32 PM, Sean Owen wrote:
Hi All,
I am using Mahout to build a user-based recommender system (RS). The
evaluation method I am using is
AverageAbsoluteDifferenceRecommenderEvaluator, which, according to
Mahout in Action, randomly sets aside some existing preferences and
calculates the average absolute difference between the estimated and
actual preference values.
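[A minimal sketch of that evaluation scheme, to make the setup concrete. This is not the actual Mahout implementation; the `ratings` data and the `split` and `estimate` helpers are invented for illustration. Each preference is independently held out with probability `evaluation_proportion`, re-estimated from the remaining data, and the absolute differences are averaged:]

```python
import random

random.seed(7)   # fixed seed so the random split is reproducible here

ratings = {
    "u1": {1: 4.0, 2: 3.5, 3: 5.0},
    "u2": {1: 4.5, 2: 3.0, 3: 4.5},
    "u3": {1: 4.0, 2: 4.0},
}

def split(evaluation_proportion=0.3):
    """Randomly set aside a proportion of the preferences as a test set."""
    train, test = {u: {} for u in ratings}, []
    for u, prefs in ratings.items():
        for i, r in prefs.items():
            if random.random() < evaluation_proportion:
                test.append((u, i, r))     # held out for evaluation
            else:
                train[u][i] = r
    return train, test

def estimate(user, item, train):
    """Mean of other users' training ratings for the item (a stand-in
    for the real user-based estimate)."""
    vals = [p[item] for v, p in train.items() if v != user and item in p]
    return sum(vals) / len(vals) if vals else None

train, test = split()
diffs = [abs(estimate(u, i, train) - r)
         for u, i, r in test if estimate(u, i, train) is not None]
score = sum(diffs) / len(diffs) if diffs else None
print(score)   # lower is better; None if nothing could be estimated
```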