Hi,

I'm trying to evaluate a recommendation model and found that Spark and
Rival <http://dl.acm.org/citation.cfm?id=2645712> give different NDCG
results; Rival's value matches the definition Kaggle uses
<https://www.kaggle.com/wiki/NormalizedDiscountedCumulativeGain>:
https://gist.github.com/jongwook/5d4e78290eaef22cb69abbf68b52e597

Am I using RankingMetrics incorrectly, or is Spark's implementation wrong?
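
For context, this is roughly the usage pattern in question (a toy sketch
with made-up data, not my actual code; assumes a spark-shell session where
`sc` is available). Note that RankingMetrics only takes arrays of item IDs,
with no place to pass per-item relevance values:

    import org.apache.spark.mllib.evaluation.RankingMetrics

    // Toy data: (recommended item IDs in ranked order, relevant item IDs).
    // There is nowhere to supply graded relevance values.
    val predictionAndLabels = sc.parallelize(Seq(
      (Array(1, 2, 3, 4, 5), Array(2, 4, 6))
    ))
    val metrics = new RankingMetrics(predictionAndLabels)
    println(metrics.ndcgAt(5))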

To my knowledge, NDCG should depend on the relevance (or preference)
values, but Spark's implementation
<https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L129-L156>
does not seem to: the gain term is a constant 1.0 where it should be
2^(relevance) - 1, presumably assuming that every relevance is 1.0. I also
tried tweaking it, but the way it computes the ideal DCG also seems wrong.
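
For reference, this is the graded DCG I have in mind, following the
Kaggle/Rival definition; a minimal sketch, not Spark's code, and the names
are just illustrative:

    // DCG@k with graded relevances: gain is 2^rel - 1, discounted by
    // log2(rank + 1). `relevances` are the relevance values of the
    // recommended items, in ranked order.
    def dcgAt(relevances: Seq[Double], k: Int): Double =
      relevances.take(k).zipWithIndex.map { case (rel, i) =>
        (math.pow(2.0, rel) - 1.0) / (math.log(i + 2) / math.log(2))
      }.sum

    // NDCG@k: normalize by the ideal DCG. Simplified here to the same
    // relevances sorted in descending order; a full evaluation would take
    // the ideal ordering over the ground-truth relevances.
    def ndcgAt(relevances: Seq[Double], k: Int): Double = {
      val ideal = dcgAt(relevances.sortBy(r => -r), k)
      if (ideal == 0.0) 0.0 else dcgAt(relevances, k) / ideal
    }

With every relevance equal to 1.0 this reduces to the binary form, which is
presumably what the current implementation assumes.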

Any feedback from MLlib developers would be appreciated. I made a
modified/extended version of RankingMetrics that produces numbers identical
to Kaggle's and Rival's results, and I'm wondering whether it would be
appropriate to contribute back to MLlib.

Jong Wook
