Evaluating boolean preference data sets

Marko Ciric Thu, 21 Jul 2011 05:50:04 -0700

Hi guys,

I wonder if Mahout should have a "precision and recall" evaluator that
builds the set of relevant items without relying on a relevance
threshold. This would suit data sets with boolean preferences. In
addition, the relevant items could be removed from the training data set
at random (removing the first few preferred items every time wouldn't be
a great idea).
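
To make the random removal concrete, here is a minimal sketch in plain Java. It does not use any Mahout API: the class BooleanHoldout, its Split type and the split method are hypothetical names made up for illustration, and a user's boolean preferences are assumed to be just a set of item IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of Mahout: splits one user's boolean
// preferences into held-out "relevant" items and remaining training items.
final class BooleanHoldout {

    static final class Split {
        final Set<Long> relevant;  // items held out and treated as relevant
        final Set<Long> training;  // items left in the training data

        Split(Set<Long> relevant, Set<Long> training) {
            this.relevant = relevant;
            this.training = training;
        }
    }

    // Picks up to "at" items uniformly at random instead of always taking
    // the first ones. No relevance threshold is involved.
    static Split split(Set<Long> preferredItems, int at, Random random) {
        List<Long> shuffled = new ArrayList<>(preferredItems);
        // Sort first so that a seeded Random gives a reproducible split.
        Collections.sort(shuffled);
        Collections.shuffle(shuffled, random);
        int n = Math.min(at, shuffled.size());
        Set<Long> relevant = new HashSet<>(shuffled.subList(0, n));
        Set<Long> training = new HashSet<>(shuffled.subList(n, shuffled.size()));
        return new Split(relevant, training);
    }

    public static void main(String[] args) {
        Set<Long> items = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L));
        Split s = split(items, 2, new Random());
        System.out.println("relevant: " + s.relevant);
        System.out.println("training: " + s.training);
    }
}
```

A real evaluator would probably also skip users with too few preferences, so that something is left to train on; that is left out here.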


On the other hand, setting the relevance threshold of
RecommenderIRStatsEvaluator to 1.0 removes exactly "at" items. Since the
recommender also returns that number of items, precision and recall
would have the same value. Is this OK, or is it a bug, given that
  precision = intersection / num_recommended_items (where
num_recommended_items is almost always "at")
  recall = intersection / num_relevant_items (also "at", as mentioned
above, when relevanceThreshold is 1.0)?
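
A small worked example of the two formulas above, again in plain Java rather than Mahout code. The class PrecisionRecallAt and the item IDs are made up for illustration. It only shows the arithmetic: when both denominators equal "at", the two values coincide, and they differ once the recommender returns fewer than "at" items.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the formulas above, not Mahout's evaluator.
final class PrecisionRecallAt {

    // Number of recommended items that are also in the relevant set.
    static int intersectionSize(List<Long> recommended, Set<Long> relevant) {
        int hits = 0;
        for (Long item : recommended) {
            if (relevant.contains(item)) {
                hits++;
            }
        }
        return hits;
    }

    // precision = intersection / num_recommended_items
    static double precision(List<Long> recommended, Set<Long> relevant) {
        if (recommended.isEmpty()) {
            return 0.0;
        }
        return (double) intersectionSize(recommended, relevant) / recommended.size();
    }

    // recall = intersection / num_relevant_items
    static double recall(List<Long> recommended, Set<Long> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) intersectionSize(recommended, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // at = 3: three held-out relevant items.
        Set<Long> relevant = new HashSet<>(Arrays.asList(10L, 20L, 30L));

        // Recommender returns exactly 3 items: both denominators are 3.
        List<Long> full = Arrays.asList(10L, 40L, 50L);
        System.out.println(precision(full, relevant) + " " + recall(full, relevant));

        // Recommender returns only 2 items: the denominators differ.
        List<Long> fewer = Arrays.asList(10L, 40L);
        System.out.println(precision(fewer, relevant) + " " + recall(fewer, relevant));
    }
}
```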


--
Marko Ćirić
[email protected]
