Thank you, Sean. Maybe I am just confused by the wording. When I read that it returns "the average precision at the first k ranking positions", I somehow expect AP@k to be computed per query, with the final output being MAP@k, not the mean of precision at the k-th position.
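For concreteness, here is a minimal standalone sketch of the two quantities side by side. This is not the Spark API; the object and method names (`RankingSketch`, `precisionAtK`, `apAtK`) and the sample data are just illustrative, and `apAtK` follows the ml_metrics implementation linked in the original message below:

    object RankingSketch {
      // precision@k: fraction of the top-k predictions that are relevant.
      // This is what the Spark code below computes (averaged over queries).
      def precisionAtK(pred: Seq[Int], lab: Set[Int], k: Int): Double =
        pred.take(k).count(lab.contains).toDouble / k

      // AP@k: precision@i summed over the positions i <= k that hold a hit,
      // normalized by min(|lab|, k) -- as in the ml_metrics `apk` function.
      def apAtK(pred: Seq[Int], lab: Set[Int], k: Int): Double = {
        var hits = 0
        var score = 0.0
        for ((p, i) <- pred.take(k).zipWithIndex if lab.contains(p)) {
          hits += 1
          score += hits.toDouble / (i + 1)
        }
        if (lab.isEmpty) 0.0 else score / math.min(lab.size, k)
      }

      def main(args: Array[String]): Unit = {
        val pred = Seq(1, 6, 2, 7, 8)
        val lab  = Set(1, 2, 3)
        println(precisionAtK(pred, lab, 3)) // 0.667: 2 hits in the top 3
        println(apAtK(pred, lab, 3))        // (1/1 + 2/3) / 3 = 0.556
      }
    }

The two clearly disagree on the same input, which is why the docstring's "average precision at the first k positions" threw me off.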
I guess I just need more sleep.

On 12/06/2016 02:45 AM, Sean Owen wrote:
> I read it again and that looks like it implements mean precision@k as
> I would expect. What is the issue?
>
> On Tue, Dec 6, 2016, 07:30 Maciej Szymkiewicz <mszymkiew...@gmail.com
> <mailto:mszymkiew...@gmail.com>> wrote:
>
>     Hi,
>
>     Could I ask for a fresh pair of eyes on this piece of code:
>
>     https://github.com/apache/spark/blob/f830bb9170f6b853565d9dd30ca7418b93a54fe3/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L59-L80
>
>     @Since("1.2.0")
>     def precisionAt(k: Int): Double = {
>       require(k > 0, "ranking position k should be positive")
>       predictionAndLabels.map { case (pred, lab) =>
>         val labSet = lab.toSet
>
>         if (labSet.nonEmpty) {
>           val n = math.min(pred.length, k)
>           var i = 0
>           var cnt = 0
>           while (i < n) {
>             if (labSet.contains(pred(i))) {
>               cnt += 1
>             }
>             i += 1
>           }
>           cnt.toDouble / k
>         } else {
>           logWarning("Empty ground truth set, check input data")
>           0.0
>         }
>       }.mean()
>     }
>
>     Am I the only one who thinks this doesn't do what it claims? Just
>     for reference:
>
>     * https://web.archive.org/web/20120415101144/http://sas.uwaterloo.ca/stats_navigation/techreports/04WorkingPapers/2004-09.pdf
>     * https://github.com/benhamner/Metrics/blob/master/Python/ml_metrics/average_precision.py
>
>     --
>     Best,
>     Maciej

-- 
Maciej Szymkiewicz