I'm trying to evaluate a few different recommenders based on boolean
preferences. The "in Action" book suggests using a precision/recall
metric, but I'm not sure I understand what that does, and in particular
how it divides my data into test/train sets.
What I think I'd like to do is:
1. Divide the data by user: use the data from 80% of the users as the
training set, and test using the remaining 20% (say).
2. Build a similarity model from the training data
3. For each test user, divide their data in half: a "training" set and
an evaluation set. Then use that user's training half as input to the
recommender, and see whether it recommends the items in the evaluation
half or not.
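For what it's worth, here is a minimal sketch of the scheme I describe
above in plain Python. Everything here (the dict-of-sets data layout, the
Jaccard user similarity, the recommend() helper, the evaluate() signature)
is my own illustrative assumption, not any particular library's API:

```python
import random

def evaluate(prefs, test_frac=0.2, at_n=5, seed=42):
    """Hold out test_frac of users; for each held-out user, hide half
    their items and check whether the recommender retrieves them.
    prefs: dict mapping user -> set of liked items (boolean prefs).
    Returns (mean precision, mean recall) over the test users."""
    rng = random.Random(seed)
    users = sorted(prefs)
    rng.shuffle(users)

    # Step 1: split by user into train/test populations.
    n_test = max(1, int(len(users) * test_frac))
    test_users, train_users = users[:n_test], users[n_test:]

    # Step 2: the "model" is just the training users' item sets;
    # user-user similarity is Jaccard overlap of those sets.
    train = {u: prefs[u] for u in train_users}

    def recommend(seen, n):
        # Score unseen items by similarity-weighted votes from
        # training users who share items with this user.
        scores = {}
        for items in train.values():
            overlap = len(seen & items)
            if not overlap:
                continue
            sim = overlap / len(seen | items)
            for item in items - seen:
                scores[item] = scores.get(item, 0.0) + sim
        return sorted(scores, key=scores.get, reverse=True)[:n]

    # Step 3: halve each test user's items; recommend from one half,
    # score hits against the hidden half.
    precisions, recalls = [], []
    for u in test_users:
        items = sorted(prefs[u])
        rng.shuffle(items)
        half = len(items) // 2
        seen, hidden = set(items[:half]), set(items[half:])
        if not seen or not hidden:
            continue  # too little data for this user to be scored
        recs = recommend(seen, at_n)
        hits = len(set(recs) & hidden)
        precisions.append(hits / len(recs) if recs else 0.0)
        recalls.append(hits / len(hidden))
    return (sum(precisions) / len(precisions),
            sum(recalls) / len(recalls))
```

Precision here is "what fraction of the recommendations were actually in
the hidden half", and recall is "what fraction of the hidden half got
recommended" — which is the comparison I'm asking whether the built-in
evaluator performs.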
Is this what the precision/recall test is actually doing?
--
Michael Sokolov
Senior Architect
Safari Books Online