> In a sense, evaluating the quality of predictions is slightly the
> wrong question to ask. After all a recommender's primary job is to
> make ordered recommendations, only. It does not necessarily need to
> predict preferences to do this, though most do.
I see. Maybe this is part of my problem, apart from the novelty issue. I want to evaluate the quality of the predictions rather than how well they are ordered. To illustrate: I recommend an unordered list of the top 5 items, so it does not matter if items 1 and 4 are interchanged, but it may matter if items 4 and 7 are interchanged. Thus, it is not sufficient for my evaluation to ask whether the predictions are in the correct order; I rather need to evaluate the quality of the entire set of 5 recommendations (a small sketch of what I mean is at the end of this post). I had hoped that PR could be more appropriate than MAE for measuring this 'quality of predictions' (rather than 'quality of ordering'), but if I understand you correctly, neither PR nor MAE measures the quality of predictions (directly).

> I don't have a good reference for you but I think there's really one
> way forward to evaluation: you need to collect data about how often
> your recommended items were viewed / clicked, and how they were rated.
> That is you'd really have to deploy the recommender and evaluate it
> going forward. I just can't imagine any other solution since it is
> necessarily based on information you don't have yet.

Yes, I will give this a go.

Thanks for your comments,
Mirko
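
P.S. To make concrete what I mean by 'quality of the set of 5' versus 'quality of the ordering', here is a minimal sketch contrasting MAE on predicted ratings with precision@5 computed as a plain set overlap. All item ids, ratings, and relevance judgements in it are made-up placeholders, not data from my system.

```python
# Hypothetical example contrasting two views of evaluating a top-5 recommender:
# (1) MAE on predicted ratings and (2) precision@5, treating the top 5 as an
# unordered set. All numbers and item ids below are invented for illustration.

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def precision_at_k(recommended, relevant, k=5):
    """Fraction of the top-k recommendations that are relevant.
    Order within the top k does not matter: it is a set comparison."""
    top_k = set(recommended[:k])
    return len(top_k & set(relevant)) / k

# Made-up predicted vs. actual ratings for the five recommended items.
predicted_ratings = [4.8, 4.5, 4.4, 4.1, 3.9]
actual_ratings    = [5.0, 3.0, 4.5, 4.0, 2.0]

# Made-up item ids: the system's ranked recommendations and the items the
# user actually found relevant (e.g. clicked or rated highly).
ranked_recommendations = ["a", "b", "c", "d", "e", "f", "g"]
relevant_items = ["a", "c", "d", "g"]

print("MAE on predicted ratings:", mae(predicted_ratings, actual_ratings))
print("Precision@5 (order-insensitive):",
      precision_at_k(ranked_recommendations, relevant_items, k=5))

# Swapping items 1 and 4 inside the top 5 leaves precision@5 unchanged,
# while swapping items 4 and 7 (across the cutoff) can change it.
```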
