[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882163#comment-15882163 ]
Nick Pentreath commented on SPARK-14409:
----------------------------------------

[~roberto.mirizzi] the {{goodThreshold}} param seems pretty reasonable in this context to exclude irrelevant items. I think it can be a good {{expertParam}} addition.

OK, I think that a first pass at this should just aim to replicate what we have exposed in {{mllib}} and wrap {{RankingMetrics}}. Initially we can look at: (a) supporting numeric columns and doing the windowing & {{collect_list}} approach to feed into {{RankingMetrics}}; (b) supporting Array columns and feeding them directly into {{RankingMetrics}}; or (c) supporting both.

[~yongtang] already did a PR here: https://github.com/apache/spark/pull/12461. It is fairly complete and also includes MRR.

[~yongtang] are you able to work on reviving that PR? If so, [~roberto.mirizzi] [~danilo.ascione] are you able to help review that PR?

> Investigate adding a RankingEvaluator to ML
> -------------------------------------------
>
>                 Key: SPARK-14409
>                 URL: https://issues.apache.org/jira/browse/SPARK-14409
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Nick Pentreath
>            Priority: Minor
>
> {{mllib.evaluation}} contains a {{RankingMetrics}} class, while there is no
> {{RankingEvaluator}} in {{ml.evaluation}}. Such an evaluator can be useful
> for recommendation evaluation (and can be useful in other settings
> potentially).
> Should be thought about in conjunction with adding the "recommendAll" methods
> in SPARK-13857, so that top-k ranking metrics can be used in cross-validators.