[ 
https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902649#comment-15902649
 ] 

Nick Pentreath commented on SPARK-14409:
----------------------------------------

[~josephkb] in reference to your [PR 
comment|https://github.com/apache/spark/pull/17090#issuecomment-284827573]:

Really the input schema for evaluation is fairly simple: for each query 
(/user), a set of ground truth ids and a (sorted) list of predicted ids. The 
exact format (arrays, as in the {{mllib}} version, or the "exploded" version 
proposed in this JIRA) is not relevant in itself. Rather, the format selected 
is dictated by the {{Pipeline}} API: specifically, a model's prediction output 
schema from {{transform}} must be compatible with the evaluator's input schema 
for {{evaluate}}.
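
To illustrate why the two formats carry the same information, here is a rough plain-Python sketch (no Spark involved). It assumes an "exploded" layout with one row per (query, item) pair holding a label and a prediction score; the field order, the sample data and the helper names {{to_arrays}} and {{precision_at_k}} are hypothetical and are not the schema or API from this JIRA. The precision definition used here (hits in the top k, divided by k) is likewise just one simple choice.

```python
from collections import defaultdict

# Hypothetical "exploded" rows: (query_id, item_id, label, prediction).
# label > 0 marks a ground-truth relevant item; prediction is the model score.
rows = [
    ("u1", "a", 1.0, 0.9),
    ("u1", "b", 0.0, 0.8),
    ("u1", "c", 1.0, 0.3),
    ("u2", "a", 0.0, 0.7),
    ("u2", "d", 1.0, 0.6),
]


def to_arrays(rows):
    """Group exploded rows into (ranked predicted ids, ground-truth id set) per query."""
    by_query = defaultdict(list)
    for query, item, label, score in rows:
        by_query[query].append((item, label, score))
    result = {}
    for query, items in by_query.items():
        # Rank items by descending prediction score.
        ranked = [item for item, _, _ in sorted(items, key=lambda t: -t[2])]
        # Ground truth is every item with a positive label.
        truth = {item for item, label, _ in items if label > 0}
        result[query] = (ranked, truth)
    return result


def precision_at_k(ranked, truth, k):
    """Fraction of the top-k ranked ids that appear in the ground-truth set."""
    hits = sum(1 for item in ranked[:k] if item in truth)
    return hits / k


arrays = to_arrays(rows)
mean_p_at_2 = sum(precision_at_k(r, t, 2) for r, t in arrays.values()) / len(arrays)
print(arrays)
print(mean_p_at_2)
```

The per-row layout is what a {{transform}} call would naturally emit (one prediction per input row), and the grouping step shows that the array form can always be derived from it inside the evaluator.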

The schema proposed above is, I believe, the only one that is compatible both 
with "linear model" style things, such as {{LogisticRegression}} for ad CTR 
prediction and learning-to-rank settings, and with recommendation tasks.

> Investigate adding a RankingEvaluator to ML
> -------------------------------------------
>
>                 Key: SPARK-14409
>                 URL: https://issues.apache.org/jira/browse/SPARK-14409
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Nick Pentreath
>            Priority: Minor
>
> {{mllib.evaluation}} contains a {{RankingMetrics}} class, while there is no 
> {{RankingEvaluator}} in {{ml.evaluation}}. Such an evaluator can be useful 
> for recommendation evaluation (and can be useful in other settings 
> potentially).
> Should be thought about in conjunction with adding the "recommendAll" methods 
> in SPARK-13857, so that top-k ranking metrics can be used in cross-validators.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

