[ https://issues.apache.org/jira/browse/MAHOUT-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168299#comment-13168299 ]
Sean Owen commented on MAHOUT-925:
----------------------------------
Yes, you could create a different kind of test that doesn't hold out any data
in order to find this reach figure, but I don't think it's worth a whole
separate test class just for that. The entire test framework is only valid
insofar as you run it on enough data, with enough left to train on, that the
result reflects how the full system works. So I think running on the training
data only is as valid as anything else.
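(Illustration only, nothing from the patch: over the full data, reach is just
the fraction of users for whom the recommender returns at least one item.
Against the Taste Recommender interface that could look like the sketch below;
the class and method names are made up.)

{code:java}
// Hypothetical sketch, not the patch's code: estimate "reach" as the
// fraction of users for whom the recommender can return at least one item.
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;

public final class ReachEstimate {

  private ReachEstimate() { }

  public static double reachAt(Recommender recommender, int howMany) throws TasteException {
    DataModel model = recommender.getDataModel();
    int usersWithRecs = 0;
    int totalUsers = 0;
    LongPrimitiveIterator userIDs = model.getUserIDs();
    while (userIDs.hasNext()) {
      long userID = userIDs.nextLong();
      totalUsers++;
      // An empty result means the recommender could not recommend anything
      // for this user, so the user does not count toward reach.
      List<RecommendedItem> recs = recommender.recommend(userID, howMany);
      if (!recs.isEmpty()) {
        usersWithRecs++;
      }
    }
    return totalUsers == 0 ? 0.0 : (double) usersWithRecs / totalUsers;
  }
}
{code}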
Regarding the "2@" prefs heuristic: it's not really a question of the
recommender deciding *not* to recommend. It will *always* recommend as much as
possible, up to the number you ask for. But if the test is based on very
little data to begin with, the result is not very meaningful. If I am
computing precision@5 and the user has only 4 prefs, what can I do? I can't
even call all 4 items "relevant", since that would leave no training data. And
even if I did, there would be no way to achieve 100% precision, as there are
only 4 relevant items. I (arbitrarily) picked 2@ as the minimum -- 10 here, if
@=5 -- since in that case you can select 5 of the 10 as relevant and still
have as many available for training.
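(Again just a sketch with illustrative names, not the evaluator's actual code;
the guard amounts to:)

{code:java}
public final class MinPrefsGuard {
  // Hypothetical: a user is testable at precision@N only with at least 2*N
  // prefs, so N items can be held out as "relevant" while at least as many
  // remain for training.
  public static boolean enoughPrefsForTest(int numPrefs, int at) {
    return numPrefs >= 2 * at;
  }
}
{code}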
You would not want to drop a user's result just because only 3 items were
recommended in a test @5. That's a perfectly valid result to include (given
the condition in the preceding paragraph). You can still determine how many of
those 3 are relevant, and how many of the relevant items are among those 3.
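(A made-up numeric example of that case:)

{code:java}
public final class PrecisionRecallExample {
  public static void main(String[] args) {
    // Made-up numbers: test @5, 5 items held out as relevant, but the
    // recommender could only produce 3 items, of which 2 are relevant.
    int relevantHeldOut = 5;
    int numRecommended = 3;
    int relevantRecommended = 2;
    double precision = (double) relevantRecommended / numRecommended;  // 2/3
    double recall = (double) relevantRecommended / relevantHeldOut;    // 2/5
    System.out.println("precision@5 = " + precision + ", recall@5 = " + recall);
  }
}
{code}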
Precision and recall are not the same in general. If the number of items
deemed relevant equals "@", then precision will equal recall, yes. And that is
usually true for data with ratings, the way this class works: it will simply
choose some "@" of the items as relevant, since there is no basis to call one
more relevant than another. Choosing that many is also somewhat arbitrary; it
can't be 0, and it can't be all items (or there would be no training data from
the user under test), so that looked like a nice round number.
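(To spell out the equality above -- a restatement, not anything in the patch --
let R be the recommended list and T the set of items deemed relevant:)

{noformat}
\mathrm{precision@}N = \frac{|R \cap T|}{|R|}, \qquad
\mathrm{recall@}N = \frac{|R \cap T|}{|T|}
{noformat}

When |R| = |T| = N, both fractions have the same numerator and the same
denominator N, so the two measures coincide.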
> Evaluate the reach of recommender algorithms
> --------------------------------------------
>
> Key: MAHOUT-925
> URL: https://issues.apache.org/jira/browse/MAHOUT-925
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Affects Versions: 0.5
> Reporter: Anatoliy Kats
> Assignee: Sean Owen
> Priority: Minor
> Attachments: MAHOUT-925.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> The evaluation of a CF algorithm should include reach, the proportion of
> users for whom a recommendation could be made. An algorithm usually has a
> cutoff on the recommender's confidence, and if the confidence is not high
> enough, no recommendation is made. Either the number of requested
> recommendations or this cutoff could be varied as part of the evaluation.
> The proposed patch adds this.
> My build with this patch breaks
> testMapper(org.apache.mahout.classifier.df.mapreduce.partial.Step1MapperTest):
> org.apache.mahout.classifier.df.node.Leaf.<init>(I)V. The test seems
> unrelated to the patch, so I am assuming it is broken at trunk head as well.
> Unfortunately, I am under a deadline and do not have time to write tests for
> the patch.