[ https://issues.apache.org/jira/browse/MAHOUT-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169280#comment-13169280 ]

Sean Owen commented on MAHOUT-906:
----------------------------------

OK. I think we're speaking about the estimation test, not the IR tests. In the 
IR test there is not really a notion of training and test data; there are the 
relevant items and non-relevant items. The 'relevant' items are the ones held 
out. You could hold out the latest prefs, I guess, though I wonder if this 
compromises the meaning of the result. It is not necessarily "bad", for 
example, if the recommender doesn't consider those latest prefs the top recs. 
That is not what any implementation is trying to do.

Sorting isn't needed, but it is probably the easiest way to split the data into 
training and test sets. I don't know whether it will be much slower than the 
alternatives, and if it isn't, it's fine for eval purposes. TopN is an existing 
class. It will be faster at picking out the "most recent" prefs for you, but I 
don't know of an easy way to reuse it to also give you the rest of the older 
prefs efficiently. So I suppose I'd start with a sort, which is probably 10 
lines of code, and see if it's fast enough.
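
For illustration only, here is roughly what that sort could look like, assuming 
the Taste DataModel API (getPreferencesFromUser and getPreferenceTime; the 
latter may return null if the model stores no timestamps). The class and method 
names below are made up for the sketch:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    import org.apache.mahout.cf.taste.common.TasteException;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.model.Preference;
    import org.apache.mahout.cf.taste.model.PreferenceArray;

    final class NewestPrefsSketch {

      // Pairs a preference with its timestamp so we can sort without
      // calling back into the DataModel inside the comparator.
      private static final class TimedPref {
        final Preference pref;
        final long time;
        TimedPref(Preference pref, long time) {
          this.pref = pref;
          this.time = time;
        }
      }

      // Returns the newest 'testSize' prefs of one user; everything else
      // would go into the training set.
      static List<Preference> newestPrefs(DataModel model, long userID, int testSize)
          throws TasteException {
        PreferenceArray prefs = model.getPreferencesFromUser(userID);
        List<TimedPref> timed = new ArrayList<TimedPref>(prefs.length());
        for (Preference pref : prefs) {
          Long t = model.getPreferenceTime(userID, pref.getItemID());
          // Prefs without a timestamp are treated as oldest
          timed.add(new TimedPref(pref, t == null ? Long.MIN_VALUE : t));
        }
        // Newest first
        Collections.sort(timed, new Comparator<TimedPref>() {
          @Override
          public int compare(TimedPref a, TimedPref b) {
            return a.time > b.time ? -1 : (a.time < b.time ? 1 : 0);
          }
        });
        int n = Math.min(testSize, timed.size());
        List<Preference> newest = new ArrayList<Preference>(n);
        for (int i = 0; i < n; i++) {
          newest.add(timed.get(i).pref);
        }
        return newest;
      }
    }

If the per-user sort ever shows up as a bottleneck, a bounded heap in the spirit 
of TopN would pick the newest prefs in O(n log k) instead, at the cost of a 
second pass to collect the older prefs for training.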

I do not see a need for any new evaluator, no. The point here is to factor out 
the test/training split logic only, and with that pluggable, you should be able 
to create test/training splits based on time. No?
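
To make that concrete, the pluggable piece could be as small as one interface 
that both evaluators call in place of their hard-coded selection logic. 
Everything below is a hypothetical sketch, not an existing Mahout API:

    import org.apache.mahout.cf.taste.common.TasteException;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.model.PreferenceArray;

    // Hypothetical: the evaluators would delegate the train/test partitioning
    // to an implementation of this, keeping their metric computations as-is.
    interface TrainTestSplitter {

      // Selects, from one user's preferences, the prefs to hold out for
      // testing; the remainder forms that user's training data. A time-based
      // implementation would return the newest prefs; the current behaviour
      // corresponds to a random (or top-AT) subset.
      PreferenceArray selectTestPrefs(DataModel model, long userID, PreferenceArray prefs)
          throws TasteException;
    }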
                
> Allow collaborative filtering evaluators to use custom logic in splitting 
> data set
> ----------------------------------------------------------------------------------
>
>                 Key: MAHOUT-906
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-906
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5
>            Reporter: Anatoliy Kats
>            Priority: Minor
>              Labels: features
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> I want to start a discussion about factoring out the logic used in splitting 
> the data set into training and testing.  Here is how things stand: there are 
> two independent evaluator classes.  AbstractDifferenceRecommenderEvaluator 
> splits all the preferences randomly into a training and a testing set.  
> GenericRecommenderIRStatsEvaluator takes one user at a time, removes their 
> top AT preferences, and counts how many of them the system recommends back.
> I have two use cases that both deal with temporal dynamics.  In one case, 
> there may be expired items that can be used for building a training model, 
> but not a test model.  In the other, I may want to simulate the behavior of a 
> real system by building a preference matrix on days 1 through k and testing 
> on the ratings the user generated on day k+1.  In this case, it is not items 
> but preferences (user, item, rating triplets) that may belong only to the 
> training set.  Before we discuss an appropriate design, are there any other 
> use cases we need to keep in mind?
