This accounts for why a neighborhood size of 2 always gives me the best result. Thank you!

Best Regards,
Jimmy

Zhongduo Lin (Jimmy)
MASc candidate in ECE department
University of Toronto

On 2013-05-08 2:40 PM, Alejandro Bellogin Kouki wrote:
AFAIK, the recommender would predict a NaN, which will be ignored by the
evaluator.

However, I am not sure whether there is any way to find out how many of
these were actually produced in the evaluation step, that is, something
like a count of predictions with a NaN value.
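A minimal sketch of the idea in plain Python (hypothetical `predict` function and toy data, not Mahout's evaluator API): the evaluator could tally the NaNs at the same time as it skips them.

```python
import math

def evaluate_rmse(predict, test_pairs):
    """RMSE over the predictable pairs; also count the NaN predictions."""
    sq_err, evaluated, nan_count = 0.0, 0, 0
    for (user, item, actual) in test_pairs:
        estimate = predict(user, item)
        if math.isnan(estimate):
            nan_count += 1   # recommender could not predict this pair
            continue         # the evaluator ignores it, as in Mahout
        sq_err += (estimate - actual) ** 2
        evaluated += 1
    rmse = math.sqrt(sq_err / evaluated) if evaluated else float('nan')
    return rmse, nan_count

# hypothetical predictor that fails on unseen (user, item) pairs
known = {(1, 'a'): 4.0, (1, 'b'): 3.0}
predict = lambda u, i: known.get((u, i), float('nan'))
print(evaluate_rmse(predict, [(1, 'a', 5.0), (1, 'b', 3.0), (1, 'c', 2.0)]))
```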

Cheers,
Alex

Zhongduo Lin wrote:
Thank you for the quick response.

I agree that a neighborhood size of 2 will make the predictions more
sensible. But my concern is that a neighborhood size of 2 can only
predict a very small proportion of the preferences for each user. Let's
take a look at the previous example: how can it predict item 4 if
item 4 happens to be chosen for the test set? I think this is quite
common in my case, as well as for Amazon or eBay, since the ratings are
very sparse. So I just don't see how the evaluation can still run.

User 1                rated item 1, 2, 3, 4
neighbour1 of user 1  rated item 1, 2
neighbour2 of user 1  rated item 1, 3
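For what it's worth, the failure mode can be sketched in a few lines of plain Python (hypothetical ratings, a simple mean-of-neighbours prediction, not Mahout's actual implementation):

```python
# illustrative ratings per user; the values themselves are made up
ratings = {
    'user1':      {1: 5.0, 2: 3.0, 3: 4.0, 4: 2.0},
    'neighbour1': {1: 4.0, 2: 3.5},
    'neighbour2': {1: 5.0, 3: 4.5},
}

def predict(user, item, neighbours):
    """Mean of the neighbours' ratings for the item; NaN if none rated it."""
    vals = [ratings[n][item] for n in neighbours if item in ratings[n]]
    return sum(vals) / len(vals) if vals else float('nan')

neighbours = ['neighbour1', 'neighbour2']
print(predict('user1', 2, neighbours))  # 3.5 -- only neighbour1 rated item 2
print(predict('user1', 4, neighbours))  # nan -- no neighbour rated item 4
```

If item 4 lands in the held-out test set, the prediction is NaN and the pair simply drops out of the evaluation.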


I wouldn't expect root-mean-square error to perform differently from the
absolute difference, since in that case most of the predictions are
close to 1, resulting in a near-zero error whether I use absolute
difference or RMSE. How can I check that "RMSE is worse relative to the
variance of the data set" using Mahout? Unfortunately I got an error
using the precision and recall evaluation method; I guess that's because
the data are too sparse.

Best Regards,
Jimmy


On 13-05-08 10:05 AM, Sean Owen wrote:
It may be true that the results are best with a neighborhood size of
2. Why is that surprising? Very similar people, by nature, rate
similar things, which makes the things you held out of a user's test
set likely to be found in the recommendations.

The mapping you suggest is not that sensible, yes, since almost
everything maps to 1. Not surprisingly, most of your predictions are
near 1. That's "better" in an absolute sense, but RMSE is worse
relative to the variance of the data set. Either this is not a good
mapping, or RMSE is not a very good metric here. So, don't do one of
those two things.

Try mean average precision for a metric that is not directly related
to the prediction values.
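Mean average precision scores the ranked recommendation list directly; a self-contained sketch in plain Python (toy data, not Mahout's evaluator):

```python
def average_precision(recommended, held_out):
    """AP of a ranked recommendation list against a user's held-out items."""
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(recommended, start=1):
        if item in held_out:
            hits += 1
            precision_sum += hits / rank   # precision at this cut-off
    return precision_sum / len(held_out) if held_out else 0.0

def mean_average_precision(per_user):
    """Mean of per-user APs; per_user is [(ranked_list, held_out_set), ...]."""
    return sum(average_precision(r, h) for r, h in per_user) / len(per_user)

# toy check: a perfect ranking scores 1.0, a buried hit lowers the score
print(mean_average_precision([
    (['a', 'b', 'c'], {'a', 'b'}),   # AP = (1/1 + 2/2) / 2 = 1.0
    (['x', 'y', 'z'], {'z'}),        # AP = (1/3) / 1
]))
```

Because only the ranking matters, a degenerate compression of the preference scale cannot inflate this metric the way it inflates RMSE.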

On Wed, May 8, 2013 at 2:45 PM, Zhongduo Lin <zhong...@gmail.com> wrote:
Thank you for your reply.

I think the evaluation process involves randomly choosing the evaluation
proportion. The problem is that I always get the best result when I set
the neighborhood size to 2, which seems unreasonable to me, since there
should be many test cases that the recommender couldn't predict at all.
So why did I still get a valid result? How does Mahout handle this case?

Sorry I didn't make myself clear on the second question. Here is the
problem: I have a set of inferred preferences ranging from 0 to 1000,
but I want to map them to 1 - 5. There can be many ways to do the
mapping. To take a simple example, suppose the mapping rule is the
following:
         if (inferred_preference < 995) preference = 1;
         else preference = inferred_preference - 995;

You can see that this is a really bad mapping algorithm, but if we feed
the generated preferences to Mahout, it is going to give me a really
nice result, because most of the preferences are 1. So is there any
other metric to evaluate this?
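The degeneracy is easy to see numerically; a sketch applying the mapping rule above to made-up inferred preferences (plain Python, illustrative data only):

```python
import math, random

random.seed(0)
# hypothetical inferred preferences, uniform over [0, 1000]
inferred = [random.uniform(0, 1000) for _ in range(10000)]

# the mapping rule from the example above
mapped = [1.0 if x < 995 else x - 995 for x in inferred]

share_ones = sum(1 for p in mapped if p == 1.0) / len(mapped)
print(f"fraction mapped to 1: {share_ones:.3f}")

# a recommender that predicted the constant 1 everywhere would already
# score a tiny RMSE against these mapped preferences:
rmse = math.sqrt(sum((p - 1.0) ** 2 for p in mapped) / len(mapped))
print(f"RMSE of a constant-1 predictor: {rmse:.3f}")
```

Roughly 99.5% of the values collapse to 1, so near-zero RMSE says almost nothing about the recommender.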


Any help will be highly appreciated.

Best Regards,
Jimmy


Zhongduo Lin (Jimmy)
MASc candidate in ECE department
University of Toronto


On 2013-05-08 4:44 AM, Sean Owen wrote:
It is true that a process based on user-user similarity only won't be
able to recommend item 4 in this example. This is a drawback of the
algorithm and not something that can be worked around. You could try
not to choose this item in the test set, but then that does not quite
reflect reality in the test.

If you just mean that compressing the range of pref values improves
RMSE in absolute terms, yes it does of course. But not in relative
terms. There is nothing inherently better or worse about a small range
in this example.
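One way to make "relative" concrete is to divide RMSE by the standard deviation of the actual ratings, giving a scale-free score (a common normalization, shown here in plain Python; it is not something Mahout reports directly):

```python
import math

def rmse(pred, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def std(xs):
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

# the same relative errors on a wide scale and a 200x-compressed scale
wide_actual   = [100.0, 500.0, 900.0]
wide_pred     = [150.0, 450.0, 950.0]
narrow_actual = [x / 200.0 for x in wide_actual]
narrow_pred   = [x / 200.0 for x in wide_pred]

for name, p, a in [("wide", wide_pred, wide_actual),
                   ("narrow", narrow_pred, narrow_actual)]:
    print(name, "RMSE:", round(rmse(p, a), 4),
          "RMSE/std:", round(rmse(p, a) / std(a), 4))
```

The absolute RMSE shrinks by the compression factor, but the RMSE-to-spread ratio is identical, which is the sense in which a small range is no better.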

RMSE is a fine eval metric, but you can also consider mean average
precision.

Sean


