[ 
https://issues.apache.org/jira/browse/SPARK-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-3568:
------------------------------
    Description: 
Include common metrics for ranking algorithms 
(see http://www-nlp.stanford.edu/IR-book/):
 - Mean Average Precision (MAP)
 - Precision@n: top-n precision
 - Discounted cumulative gain (DCG) and NDCG 

This implementation adds a new class, *RankingMetrics*, under 
*org.apache.spark.mllib.evaluation*. It accepts input (prediction and 
label pairs) as *RDD[(Array[T], Array[T])]*. The following methods will be 
implemented:

{code:title=RankingMetrics.scala|borderStyle=solid}
class RankingMetrics[T](predictionAndLabels: RDD[(Array[T], Array[T])]) {
  /* Returns the precision@k for each query */
  lazy val precAtK: RDD[Array[Double]]

  /**
   * @param k the position to compute the truncated precision
   * @return the average precision at the first k ranking positions
   */
  def precision(k: Int): Double

  /* Returns the average precision for each query */
  lazy val avePrec: RDD[Double]

  /* Returns the mean average precision (MAP) of all the queries */
  lazy val meanAvePrec: Double

  /* Returns the normalized discounted cumulative gain for each query */
  lazy val ndcgAtK: RDD[Array[Double]]

  /**
   * @param k the position to compute the truncated ndcg
   * @return the average ndcg at the first k ranking positions
   */
  def ndcg(k: Int): Double
}
{code}
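The math behind the proposed methods can be sketched in plain Scala (no Spark dependency) for a single query. The object and method names below are illustrative only, not part of the proposed API, and the NDCG variant assumes binary relevance with a log-base-2 discount:

```scala
// Hypothetical single-query sketch of the metrics the proposed
// RankingMetrics class would compute over an RDD of queries.
object RankingMetricsSketch {
  private def log2(x: Double): Double = math.log(x) / math.log(2)

  // Precision@k: fraction of the top-k predicted documents that are relevant.
  // Divides by k even when fewer than k predictions are available.
  def precisionAt(pred: Array[String], labels: Array[String], k: Int): Double = {
    val relevant = labels.toSet
    pred.take(k).count(relevant.contains) / k.toDouble
  }

  // Average precision: mean of precision@i over the positions i at which a
  // relevant document appears, normalized by the number of relevant documents.
  def averagePrecision(pred: Array[String], labels: Array[String]): Double = {
    val relevant = labels.toSet
    var hits = 0
    var sum = 0.0
    for (i <- pred.indices if relevant.contains(pred(i))) {
      hits += 1
      sum += hits.toDouble / (i + 1)
    }
    if (relevant.isEmpty) 0.0 else sum / relevant.size
  }

  // NDCG@k with binary relevance: DCG of the predicted ranking divided by the
  // ideal DCG (all relevant documents placed at the top positions).
  def ndcgAt(pred: Array[String], labels: Array[String], k: Int): Double = {
    val relevant = labels.toSet
    val dcg = (0 until math.min(k, pred.length))
      .filter(i => relevant.contains(pred(i)))
      .map(i => 1.0 / log2(i + 2))
      .sum
    val idcg = (0 until math.min(k, relevant.size)).map(i => 1.0 / log2(i + 2)).sum
    if (idcg == 0.0) 0.0 else dcg / idcg
  }
}
```

The proposed *meanAvePrec* would then be the mean of this per-query average precision over all (prediction, label) pairs in the input RDD, and *ndcg(k)* the mean of the per-query NDCG@k.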


  was:
Include common metrics for ranking algorithms 
(http://www-nlp.stanford.edu/IR-book/), including:
 - Mean Average Precision
 - Precision@n: top-n precision
 - Discounted cumulative gain (DCG) and NDCG 

This implementation attempts to create a new class called *RankingMetrics* 
under *org.apache.spark.mllib.evaluation*, which accepts input (prediction and 
label pairs) as *RDD[(Array[Double], Array[Double])]*. The following methods 
will be implemented:

{code:title=RankingMetrics.scala|borderStyle=solid}
class RankingMetrics(predictionAndLabels: RDD[(Array[Double], Array[Double])]) {
  /* Returns the precision@k for each query */
  lazy val precAtK: RDD[Array[Double]]

  /* Returns the average precision for each query */
  lazy val avePrec: RDD[Double]

  /* Returns the mean average precision (MAP) of all the queries */
  lazy val meanAvePrec: Double

  /* Returns the normalized discounted cumulative gain for each query */
  lazy val ndcg: RDD[Double]

  /* Returns the mean NDCG of all the queries */
  lazy val meanNdcg: Double
}
{code}



> Add metrics for ranking algorithms
> ----------------------------------
>
>                 Key: SPARK-3568
>                 URL: https://issues.apache.org/jira/browse/SPARK-3568
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Shuo Xiang
>            Assignee: Shuo Xiang
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
