Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-17 Thread Pat Ferrel
Time splits are fine but may contain anomalies that bias the data. If you are going to compare two recommenders based on time splits, make sure the data is exactly the same for each recommender. One time split we did to create a 90/10 training-to-test set had a split date of 12/24! Some form of
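
[Not from the thread, but a minimal sketch of a timestamp-based split along these lines. The file names, the userID,itemID,rating,timestamp column order, and the cutoff value are illustrative assumptions.]

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;

    public class TimeSplit {
      public static void main(String[] args) throws IOException {
        // Illustrative cutoff (epoch seconds); ratings before it go to training,
        // the rest to test, so training data is strictly older than test data.
        long cutoff = 1356307200L;
        try (BufferedReader in = new BufferedReader(new FileReader("ratings.csv"));
             PrintWriter train = new PrintWriter(new FileWriter("train.csv"));
             PrintWriter test = new PrintWriter(new FileWriter("test.csv"))) {
          String line;
          while ((line = in.readLine()) != null) {
            // Expected line format: userID,itemID,rating,timestamp
            String[] fields = line.split(",");
            long ts = Long.parseLong(fields[3].trim());
            (ts < cutoff ? train : test).println(line);
          }
        }
      }
    }

Checking where the cutoff date lands (for example, on a holiday spike like the 12/24 split above) helps catch the kind of anomaly described here.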

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-17 Thread Sean Owen
I agree with that explanation. Is it "why" it's unsupervised? Well, I think of recommendation in the context of things like dimension reduction, which are just structure-finding exercises. Often the input has no positive or negative label (a click); everything is 'positive'. If you're predicting an

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-17 Thread Osman Başkaya
Correction: - Are you saying that this job is unsupervised since no user can rate all of the movies? For this reason, we won't be sure that our predicted top-N list contains no relevant item, because it is possible that our top-N recommendation list has relevant movie(s) which haven't been rated by t

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-17 Thread Osman Başkaya
I am sorry to extend the unsupervised/supervised discussion, which is not the main question here, but I need to ask. Sean, I don't understand your last answer. Let's assume our rating scale is from 1 to 5. We can say that those movies which a particular user rates as 5 are relevant for him/her. 5 is

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Sean Owen
The very question at hand is how to label the data as "relevant" and "not relevant" results. The question exists because this is not given, which is why I would not call this a supervised problem. That may just be semantics, but the point I wanted to make is that the reasons for choosing a random train

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Ted Dunning
Sean, I think it is still a supervised learning problem in that there is a labeled training data set and an unlabeled test data set. Learning a ranking doesn't change the basic dichotomy between supervised and unsupervised. It just changes the desired figure of merit. Sent from my iPhone O

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Ted Dunning
There are a variety of common time-based effects which make time splits best in many practical cases. Having the training data all be from the past emulates this better than random splits. For one thing, you can have the same user under different names in training and test. For another thing

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Ahmet Yılmaz
Thanks for the replies.

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Sean Owen
I understand the idea, but this boils down to the current implementation, plus going back and throwing out some additional training data that is lower-rated -- it's neither in test nor training. Anything's possible, but I do not imagine this is a helpful practice in general. On Sat, Feb 16, 2013 a

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Tevfik Aytekin
I'm suggesting the second one. In that way the test user's ratings in the training set will be composed of both low- and high-rated items, which prevents the problem pointed out by Ahmet. On Sat, Feb 16, 2013 at 11:19 PM, Sean Owen wrote: > If you're suggesting that you hold out only high-rated items,

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Sean Owen
If you're suggesting that you hold out only high-rated items, and then sample them, then that's what is done already in the code, except without the sampling. The sampling doesn't buy anything that I can see. If you're suggesting holding out a random subset and then throwing away the held-out item

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Tevfik Aytekin
What I mean is you can choose ratings randomly and try to recommend the ones above the threshold. On Sat, Feb 16, 2013 at 10:32 PM, Sean Owen wrote: > Sure, if you were predicting ratings for one movie given a set of ratings > for that movie and the ratings for many other movies. That isn't what
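
[A rough sketch of the protocol described here, with illustrative names; this is not Mahout code. Hold out a random sample of one user's ratings, and within the held-out set count only items rated at or above a threshold as relevant for scoring.]

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;
    import java.util.Set;

    final class RandomHoldOut {
      // Hold out a random fraction of the user's ratings; within the held-out set,
      // only items rated at or above the threshold count as "relevant".
      static Set<Long> relevantHeldOutItems(Map<Long, Float> ratingsByItem,
                                            double holdOutFraction,
                                            float relevanceThreshold,
                                            Random rng) {
        List<Long> items = new ArrayList<>(ratingsByItem.keySet());
        Collections.shuffle(items, rng);   // random hold-out, not "top-rated only"
        int heldOut = (int) Math.round(items.size() * holdOutFraction);
        Set<Long> relevant = new HashSet<>();
        for (Long item : items.subList(0, heldOut)) {
          if (ratingsByItem.get(item) >= relevanceThreshold) {
            relevant.add(item);            // high-rated held-out item counts as relevant
          }
        }
        return relevant;
      }
    }

With this scheme the remaining (training) ratings keep both low- and high-rated items for the test user, which is the point being made above.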

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Sean Owen
Sure, if you were predicting ratings for one movie given a set of ratings for that movie and the ratings for many other movies. That isn't what the recommender problem is. Here, the problem is to list N movies most likely to be top-rated. The precision-recall test is, in turn, a test of top N resul
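
[For concreteness, a small illustrative helper, again not Mahout code, for the precision-recall test of a top-N result list, given the set of items judged relevant for one user.]

    import java.util.List;
    import java.util.Set;

    final class TopNMetrics {
      // Fraction of the recommended top-N list that is relevant.
      static double precisionAtN(List<Long> topN, Set<Long> relevant) {
        if (topN.isEmpty()) {
          return 0.0;
        }
        long hits = topN.stream().filter(relevant::contains).count();
        return (double) hits / topN.size();
      }

      // Fraction of the relevant items that appear in the top-N list.
      static double recallAtN(List<Long> topN, Set<Long> relevant) {
        if (relevant.isEmpty()) {
          return 0.0;
        }
        long hits = topN.stream().filter(relevant::contains).count();
        return (double) hits / relevant.size();
      }
    }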

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Tevfik Aytekin
No, rating prediction is clearly a supervised ML problem. On Sat, Feb 16, 2013 at 10:15 PM, Sean Owen wrote: > This is a good answer for evaluation of supervised ML, but, this is > unsupervised. Choosing randomly is choosing the 'right answers' randomly, > and that's plainly problematic. > > > On

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Sean Owen
This is a good answer for evaluation of supervised ML, but this is unsupervised. Choosing randomly is choosing the 'right answers' randomly, and that's plainly problematic. On Sat, Feb 16, 2013 at 8:53 PM, Tevfik Aytekin wrote: > I think, it is better to choose ratings of the test user in a ran

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Tevfik Aytekin
similar to B than C, which is not true.

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Sean Owen
alize the ratings then A will be more similar to B than C, which is not true.

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Ahmet Yılmaz
C, which is not true.

Re: Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Sean Owen
No, this is not a problem. Yes, it builds a model for each user, which takes a long time. It's accurate, but time-consuming. It's meant for small data. You could rewrite your own test to hold out data for all test users at once. That's what I did when I rewrote a lot of this just because it was mor

Problems with Mahout's RecommenderIRStatsEvaluator

2013-02-16 Thread Ahmet Yılmaz
Hi, I have looked at the internals of Mahout's RecommenderIRStatsEvaluator code. I think that there are two important problems here. According to my understanding, the experimental protocol used in this code is something like this: it takes away a certain percentage of users as test users. For
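
[For reference, a minimal sketch of how this evaluator is typically invoked, assuming the 0.7-era org.apache.mahout.cf.taste API; the user-based recommender, file name, and parameter values are illustrative choices, not part of the thread.]

    import java.io.File;

    import org.apache.mahout.cf.taste.common.TasteException;
    import org.apache.mahout.cf.taste.eval.IRStatistics;
    import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
    import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
    import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
    import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
    import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
    import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
    import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
    import org.apache.mahout.cf.taste.recommender.Recommender;
    import org.apache.mahout.cf.taste.similarity.UserSimilarity;

    public class IRStatsExample {
      public static void main(String[] args) throws Exception {
        DataModel model = new FileDataModel(new File("ratings.csv"));

        // Rebuilt for each evaluation run; an illustrative user-based recommender.
        RecommenderBuilder builder = new RecommenderBuilder() {
          @Override
          public Recommender buildRecommender(DataModel dataModel) throws TasteException {
            UserSimilarity similarity = new PearsonCorrelationSimilarity(dataModel);
            UserNeighborhood neighborhood = new NearestNUserNeighborhood(20, similarity, dataModel);
            return new GenericUserBasedRecommender(dataModel, neighborhood, similarity);
          }
        };

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
        IRStatistics stats = evaluator.evaluate(
            builder,
            null,                                                  // DataModelBuilder (use default)
            model,
            null,                                                  // IDRescorer (none)
            10,                                                    // "at": size of the top-N list
            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,   // per-user relevance threshold
            1.0);                                                  // fraction of users to evaluate

        System.out.println("precision@10 = " + stats.getPrecision());
        System.out.println("recall@10    = " + stats.getRecall());
      }
    }

The relevanceThreshold argument decides which of a test user's held-out ratings count as "relevant", and evaluationPercentage controls how many users are evaluated, which is what the discussion above is about.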