[ https://issues.apache.org/jira/browse/SPARK-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shuo Xiang updated SPARK-3568:
------------------------------
    Description: 
Include common metrics for ranking algorithms (http://www-nlp.stanford.edu/IR-book/), including:
- Mean Average Precision (MAP)
- Precision@n: top-n precision
- Discounted cumulative gain (DCG) and NDCG

This implementation creates a new class *RankingMetrics* under *org.apache.spark.mllib.evaluation*, which accepts input (prediction and label pairs) as *RDD[(Array[Double], Array[Double])]*. The following methods will be implemented:

{code:title=RankingMetrics.scala|borderStyle=solid}
class RankingMetrics(predictionAndLabels: RDD[(Array[Double], Array[Double])]) {

  /* Returns the precision@k for each query */
  lazy val precAtK: RDD[Array[Double]]

  /* Returns the average precision for each query */
  lazy val avePrec: RDD[Double]

  /* Returns the mean average precision (MAP) of all the queries */
  lazy val meanAvePrec: Double

  /* Returns the normalized discounted cumulative gain for each query */
  lazy val ndcg: RDD[Double]

  /* Returns the mean NDCG of all the queries */
  lazy val meanNdcg: Double
}
{code}

  was:
Include widely-used metrics for ranking algorithms, including:
- Mean Average Precision
- Precision@n: top-n precision
- Discounted cumulative gain (DCG) and NDCG

This implementation creates a new class *RankingMetrics* under *org.apache.spark.mllib.evaluation*, which accepts input (prediction and label pairs) as *RDD[(Array[Double], Array[Double])]*.
The following methods will be implemented:

{code:title=RankingMetrics.scala|borderStyle=solid}
class RankingMetrics(predictionAndLabels: RDD[(Array[Double], Array[Double])]) {

  /* Returns the precision@k for each query */
  lazy val precAtK: RDD[Array[Double]]

  /* Returns the average precision for each query */
  lazy val avePrec: RDD[Double]

  /* Returns the mean average precision (MAP) of all the queries */
  lazy val meanAvePrec: Double

  /* Returns the normalized discounted cumulative gain for each query */
  lazy val ndcg: RDD[Double]

  /* Returns the mean NDCG of all the queries */
  lazy val meanNdcg: Double
}
{code}


> Add metrics for ranking algorithms
> ----------------------------------
>
>                 Key: SPARK-3568
>                 URL: https://issues.apache.org/jira/browse/SPARK-3568
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, MLlib
>            Reporter: Shuo Xiang
>            Assignee: Shuo Xiang
>
> Include common metrics for ranking algorithms (http://www-nlp.stanford.edu/IR-book/), including:
> - Mean Average Precision
> - Precision@n: top-n precision
> - Discounted cumulative gain (DCG) and NDCG
> This implementation creates a new class *RankingMetrics* under *org.apache.spark.mllib.evaluation*, which accepts input (prediction and label pairs) as *RDD[(Array[Double], Array[Double])]*.
> The following methods will be implemented:
> {code:title=RankingMetrics.scala|borderStyle=solid}
> class RankingMetrics(predictionAndLabels: RDD[(Array[Double], Array[Double])]) {
>
>   /* Returns the precision@k for each query */
>   lazy val precAtK: RDD[Array[Double]]
>
>   /* Returns the average precision for each query */
>   lazy val avePrec: RDD[Double]
>
>   /* Returns the mean average precision (MAP) of all the queries */
>   lazy val meanAvePrec: Double
>
>   /* Returns the normalized discounted cumulative gain for each query */
>   lazy val ndcg: RDD[Double]
>
>   /* Returns the mean NDCG of all the queries */
>   lazy val meanNdcg: Double
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
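The per-query computations behind the proposed API can be sketched in plain Scala, without Spark. This is only an illustration, not the actual implementation: the object and method names (`RankingMetricsSketch`, `averagePrecision`, `ndcg`) are hypothetical, and it assumes binary relevance, i.e. a predicted item counts as relevant iff it appears in the label array. The RDD-based class would map these per-query functions over `predictionAndLabels` and average the results for `meanAvePrec` and `meanNdcg`.

```scala
// Illustrative sketch only (not Spark code): per-query average precision
// and NDCG under a binary-relevance assumption.
object RankingMetricsSketch {

  // Average precision: mean over the relevant items of precision@k at
  // each rank k where the predicted item is relevant.
  def averagePrecision(pred: Array[Double], labels: Set[Double]): Double = {
    var hits = 0
    var sum = 0.0
    for ((p, i) <- pred.zipWithIndex if labels.contains(p)) {
      hits += 1
      sum += hits.toDouble / (i + 1) // precision at rank i+1
    }
    if (labels.isEmpty) 0.0 else sum / labels.size
  }

  // NDCG: DCG of the predicted ranking divided by the ideal DCG
  // (all relevant items ranked first), with log2 discounting.
  def ndcg(pred: Array[Double], labels: Set[Double]): Double = {
    def gain(relevant: Boolean, rank: Int): Double =
      if (relevant) 1.0 / (math.log(rank + 2) / math.log(2)) else 0.0
    val dcg = pred.zipWithIndex.map { case (p, i) => gain(labels.contains(p), i) }.sum
    val idcg = (0 until math.min(labels.size, pred.length)).map(gain(true, _)).sum
    if (idcg == 0.0) 0.0 else dcg / idcg
  }

  def main(args: Array[String]): Unit = {
    // Query with predicted ranking (1.0, 2.0, 3.0, 4.0) and relevant set {1.0, 3.0}.
    val pred = Array(1.0, 2.0, 3.0, 4.0)
    val labels = Set(1.0, 3.0)
    println(f"AP   = ${averagePrecision(pred, labels)}%.4f")
    println(f"NDCG = ${ndcg(pred, labels)}%.4f")
  }
}
```

In the example, the relevant items sit at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2 ≈ 0.8333, and NDCG compares the achieved DCG against the ideal ordering where both relevant items come first.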