[
https://issues.apache.org/jira/browse/MAHOUT-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12968651#action_12968651
]
Lance Norskog edited comment on MAHOUT-559 at 12/7/10 5:09 AM:
---------------------------------------------------------------
This first version finds the common items in the two recommendations, and
performs various statistical tests comparing the order of these common items.
It does not evaluate the items unique to one recommendation.
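As a rough illustration of the "common items" step, here is a minimal sketch
in Java. It is not taken from the patch: the class name CommonItems, the
method retainCommon, and the use of Long item IDs are all assumptions made for
the example. It keeps only the items that occur in both lists, preserving the
order of the first list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the attached patch.
class CommonItems {
  /** Returns the items of 'ordered' that also occur in 'other', in the order of 'ordered'. */
  static List<Long> retainCommon(List<Long> ordered, List<Long> other) {
    Set<Long> shared = new HashSet<>(other);
    List<Long> result = new ArrayList<>();
    for (Long item : ordered) {
      if (shared.contains(item)) {
        result.add(item);
      }
    }
    return result;
  }
}
```

Calling it once in each direction gives two sequences with the same items,
which is the input the order-based tests need.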
It includes a few statistical tests on the difference in order of the two
common sequences:
* A sliding-window Hamming distance
** An item found at a neighboring position in the other sequence still counts
as a match
* A bubble-sort of one recommendation sequence against the other sequence
** How many swaps must a sorting algorithm use to make one order match the
other?
*** Bubble sort (like some other sorts) does nothing on an already-ordered
list and progressively more work as the list becomes more disordered.
* Statistical Rank
** The classic ranking score from the stats world. You're on your own.
* A variation on the Wilcoxon ranking score
** Based on this: [A normal-scores alternative to the Wilcoxon
test|http://comp9.psych.cornell.edu/Darlington/normscor.htm]
** (Regular Wilcoxon requires a separate lookup table. This was too weird to
implement.)
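To make the sliding-window Hamming distance listed above more concrete, here
is one possible reading of it, sketched in Java. The class name, the method
signature, and the exact window rule (position i matches if the same item
appears within 'window' positions of i in the other sequence) are assumptions
for illustration; the patch may define the window differently.

```java
// Hypothetical sketch, not the code from the attached patch.
class SlidingWindowHamming {
  /**
   * Counts the positions i in 'a' whose item does not appear in 'b'
   * anywhere in the range [i - window, i + window].
   * With window == 0 this is the plain Hamming distance.
   */
  static int distance(long[] a, long[] b, int window) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("sequences must have the same length");
    }
    int mismatches = 0;
    for (int i = 0; i < a.length; i++) {
      int lo = Math.max(0, i - window);
      int hi = Math.min(b.length - 1, i + window);
      boolean found = false;
      for (int j = lo; j <= hi && !found; j++) {
        found = a[i] == b[j];
      }
      if (!found) {
        mismatches++;
      }
    }
    return mismatches;
  }
}
```

Under this reading, two sequences that differ only by swaps of adjacent items
score 0 with a window of 1, while a plain Hamming distance would count every
swapped position.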
Each score ranges from 0 upward. A score of 0 means the two orderings are the
same; larger numbers represent larger differences.
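The bubble-sort measure can be sketched the same way. The Java below is only
an illustration under assumptions: the class and method names are made up, and
it expects both sequences to hold the same distinct items. It replaces each
item of the first sequence with its position in the second, then bubble-sorts
those positions and counts the swaps. The count is 0 for identical orderings
and grows with disorder, matching the score range described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the code from the attached patch.
class BubbleSortDistance {
  /** Number of adjacent swaps bubble sort needs to reorder 'a' into the order of 'b'. */
  static int swaps(long[] a, long[] b) {
    Map<Long, Integer> positionInB = new HashMap<>();
    for (int i = 0; i < b.length; i++) {
      positionInB.put(b[i], i);
    }
    // Translate each item of 'a' into its position in 'b'.
    int[] ranks = new int[a.length];
    for (int i = 0; i < a.length; i++) {
      Integer pos = positionInB.get(a[i]);
      if (pos == null) {
        throw new IllegalArgumentException("item missing from second sequence: " + a[i]);
      }
      ranks[i] = pos;
    }
    // Plain bubble sort with early exit, counting every swap.
    int swaps = 0;
    boolean swapped = true;
    for (int pass = 0; swapped; pass++) {
      swapped = false;
      for (int i = 0; i + 1 < ranks.length - pass; i++) {
        if (ranks[i] > ranks[i + 1]) {
          int tmp = ranks[i];
          ranks[i] = ranks[i + 1];
          ranks[i + 1] = tmp;
          swaps++;
          swapped = true;
        }
      }
    }
    return swaps;
  }
}
```

The swap count equals the number of pairs of items that the two sequences
rank in opposite order, so a fully reversed list of n items gives n(n-1)/2.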
CompareRecommenders is not a unit test, but rather a sample main() program for
your own testing. You may wish to disable the CSV output option: set the csvOut
field to null.
> Compare Recommender output by order of recommendations.
> -------------------------------------------------------
>
> Key: MAHOUT-559
> URL: https://issues.apache.org/jira/browse/MAHOUT-559
> Project: Mahout
> Issue Type: New Feature
> Components: Collaborative Filtering
> Reporter: Lance Norskog
> Attachments: OrderBasedRecommenderEvaluator.patch
>
>
> The existing RecommenderEvaluator
> (AverageAbsoluteDifferenceRecommenderEvaluator.java) has a very limited API.
> It evaluates a Recommender's performance on a training vs. test scenario. It
> does not allow comparing the outputs of different recommenders against the
> same data model. Also, I could not figure out its comparison criteria.
> OrderBasedRecommenderEvaluator compares the output of two recommenders. It
> only checks the order of the items in the recommendations, ignoring the
> returned preference values.