[ https://issues.apache.org/jira/browse/MAHOUT-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177422#comment-13177422 ]
Hudson commented on MAHOUT-906:
-------------------------------
Integrated in Mahout-Quality #1280 (See [https://builds.apache.org/job/Mahout-Quality/1280/])
MAHOUT-906 add hook for different relevant item ID logic
srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1225649
Files :
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/eval/RelevantItemsDataSplitter.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/GenericRecommenderIRStatsEvaluator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/impl/eval/GenericRelevantItemsDataSplitter.java

> Allow collaborative filtering evaluators to use custom logic in splitting data set
> ----------------------------------------------------------------------------------
>
> Key: MAHOUT-906
> URL: https://issues.apache.org/jira/browse/MAHOUT-906
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Affects Versions: 0.5
> Reporter: Anatoliy Kats
> Assignee: Sean Owen
> Priority: Minor
> Labels: features
> Fix For: 0.6
>
> Attachments: MAHOUT-906.patch, MAHOUT-906.patch, MAHOUT-906.patch,
> MAHOUT-906.patch, MAHOUT-906.patch
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> I want to start a discussion about factoring out the logic used in splitting
> the data set into training and testing. Here is how things stand: there are
> two independent evaluator classes. AbstractDifferenceRecommenderEvaluator
> splits all the preferences randomly into a training set and a test set.
> GenericRecommenderIRStatsEvaluator takes one user at a time, removes their
> top AT preferences, and counts how many of them the system recommends back.
> I have two use cases that both deal with temporal dynamics. In one case,
> there may be expired items that can be used for building the training model
> but must not appear in the test set. In the other, I may want to simulate the
> behavior of a real system by building a preference matrix on days 1 to k and
> testing on the ratings the user generated on day k+1. In this case, it's not
> items but preferences (user, item, rating triplets) that may belong only to
> the training set. Before we discuss appropriate design, are there any other
> use cases we need to keep in mind?
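
The two temporal use cases described above can be illustrated with a minimal stand-alone sketch. It does not use the Mahout API: TemporalSplitSketch, TimedPreference, Split, splitByDay and dropExpired are hypothetical names invented for this illustration, and the assumption that every preference carries an integer day stamp is not something the issue specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone types for illustration only; these are NOT Mahout classes.
final class TemporalSplitSketch {

  /** One (user, item, rating) triplet plus the day on which it was recorded (assumed field). */
  static final class TimedPreference {
    final long userID;
    final long itemID;
    final float rating;
    final int day;

    TimedPreference(long userID, long itemID, float rating, int day) {
      this.userID = userID;
      this.itemID = itemID;
      this.rating = rating;
      this.day = day;
    }
  }

  /** Holds the two halves of a split. */
  static final class Split {
    final List<TimedPreference> training = new ArrayList<>();
    final List<TimedPreference> test = new ArrayList<>();
  }

  /**
   * Use case 2: preferences recorded on day k or earlier go to the training set,
   * preferences recorded on day k+1 go to the test set, and later ones are ignored.
   */
  static Split splitByDay(List<TimedPreference> prefs, int k) {
    Split split = new Split();
    for (TimedPreference p : prefs) {
      if (p.day <= k) {
        split.training.add(p);
      } else if (p.day == k + 1) {
        split.test.add(p);
      }
    }
    return split;
  }

  /**
   * Use case 1: expired items may stay in the training set but are
   * filtered out of the test set. Returns a new list; the input is not modified.
   */
  static List<TimedPreference> dropExpired(List<TimedPreference> test, Set<Long> expiredItemIDs) {
    List<TimedPreference> kept = new ArrayList<>();
    for (TimedPreference p : test) {
      if (!expiredItemIDs.contains(p.itemID)) {
        kept.add(p);
      }
    }
    return kept;
  }
}
```

Under these assumptions, calling splitByDay(prefs, k) and then dropExpired(split.test, expiredItemIDs) would cover both cases. In the sketch the split operates on individual preferences rather than on item IDs, which is the distinction the second use case draws.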