[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880324#comment-15880324 ]
Nick Pentreath commented on SPARK-14409:
----------------------------------------

[~roberto.mirizzi] If using the current {{ALS.transform}} output as input to the {{RankingEvaluator}}, as envisaged here, the model will predict a score for each {{user-item}} pair in the evaluation set. For each user, the ground truth is exactly this distinct set of items. By definition the top-k items ranked by predicted score will be in the ground truth set, since {{ALS}} is only scoring {{user-item}} pairs *that already exist in the evaluation set*. So how is it possible *not* to get a perfect score, since all top-k recommended items will be "relevant"? Unless you are also cutting off the ground truth set at {{k}} - in which case that does not sound like a correct computation to me.

By contrast, if {{ALS.transform}} output a set of top-k items for each user, where the items are scored from *the entire set of possible candidate items*, then computing the ranking metric of that top-k set against the actual ground truth for each user is correct.

> Investigate adding a RankingEvaluator to ML
> -------------------------------------------
>
>                 Key: SPARK-14409
>                 URL: https://issues.apache.org/jira/browse/SPARK-14409
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Nick Pentreath
>            Priority: Minor
>
> {{mllib.evaluation}} contains a {{RankingMetrics}} class, while there is no
> {{RankingEvaluator}} in {{ml.evaluation}}. Such an evaluator can be useful
> for recommendation evaluation (and can be useful in other settings
> potentially).
> Should be thought about in conjunction with adding the "recommendAll" methods
> in SPARK-13857, so that top-k ranking metrics can be used in cross-validators.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
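The degenerate case described in the comment can be sketched in a few lines. This is plain Python with made-up item IDs and scores, not the Spark API: when the candidate set equals the ground-truth set, precision@k is perfect by construction, whereas ranking over the full catalogue makes the metric informative.

```python
# Hypothetical illustration (plain Python, not Spark) of the degenerate case:
# precision@k is trivially 1.0 whenever the scored candidates are exactly the
# ground-truth items, as happens when ALS only scores existing eval pairs.

def precision_at_k(ranked_items, ground_truth, k):
    """Fraction of the top-k ranked items that appear in the ground truth."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in ground_truth)
    return hits / k

# One user's evaluation set: the only user-item pairs being scored.
eval_scores = {"a": 0.9, "b": 0.1, "c": 0.5}   # item -> predicted score
ground_truth = set(eval_scores)                 # ground truth IS this set

ranked = sorted(eval_scores, key=eval_scores.get, reverse=True)
print(precision_at_k(ranked, ground_truth, k=2))      # always 1.0

# By contrast, scoring the entire candidate catalogue lets non-relevant
# items ("d", "e") outrank relevant ones, so the metric can drop below 1.
all_item_scores = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.95, "e": 0.8}
ranked_all = sorted(all_item_scores, key=all_item_scores.get, reverse=True)
print(precision_at_k(ranked_all, ground_truth, k=2))  # 0.5 in this example
```

With these invented scores the top-2 over the full catalogue is ("d", "a"), of which only "a" is relevant, giving 0.5; restricted to the evaluation set the top-2 is ("a", "c"), both relevant by definition, giving 1.0 regardless of the scores.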