[GitHub] spark pull request #17086: [SPARK-24101][ML][MLLIB] ML Evaluators should use...

srowen Tue, 06 Nov 2018 09:21:02 -0800

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17086#discussion_r231215393
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/MulticlassMetrics.scala ---
    @@ -27,10 +27,17 @@ import org.apache.spark.sql.DataFrame
     /**
      * Evaluator for multiclass classification.
      *
    - * @param predictionAndLabels an RDD of (prediction, label) pairs.
    + * @param predAndLabelsWithOptWeight an RDD of (prediction, label, weight) or
    + *                         (prediction, label) pairs.
      */
     @Since("1.1.0")
    -class MulticlassMetrics @Since("1.1.0") (predictionAndLabels: RDD[(Double, Double)]) {
    +class MulticlassMetrics @Since("3.0.0") (predAndLabelsWithOptWeight: RDD[_]) {
    --- End diff --
    
    Darn, OK. Hm, so this doesn't actually cause a source or binary change? OK,
    that could be fine. I guess MiMa didn't complain. I guess you can now do weird
    things like pass `RDD[String]` here and it'll fail quickly. I'm a little uneasy
    about it but it's probably acceptable. Any other opinions?
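
    To make the concern concrete, here is a minimal sketch of how a wildcard-typed
    collection can accept both tuple shapes and reject anything else at runtime.
    It is an illustration, not the code in this PR: a plain `Seq[_]` stands in for
    `RDD[_]` so it runs without Spark, and the `WeightedPairs` object, the
    `normalize` helper, and the default weight of 1.0 are all assumptions.

```scala
// Hypothetical stand-in: a plain Seq[_] plays the role of RDD[_],
// so this sketch runs without a Spark dependency.
object WeightedPairs {
  // Normalize every element to (prediction, label, weight).
  // Unweighted pairs get an assumed default weight of 1.0.
  def normalize(data: Seq[_]): Seq[(Double, Double, Double)] = data.map {
    case (prediction: Double, label: Double, weight: Double) =>
      (prediction, label, weight)
    case (prediction: Double, label: Double) =>
      (prediction, label, 1.0)
    case other =>
      // Anything else (e.g. a String) compiles fine but is rejected here.
      throw new IllegalArgumentException(
        s"Expected (prediction, label) or (prediction, label, weight), got: $other")
  }
}
```

    With this shape, `WeightedPairs.normalize(Seq("oops"))` compiles and only
    fails when the elements are inspected. On a real RDD the transformation is
    lazy, so the failure would presumably surface when the metrics are first
    computed rather than at construction time.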
    
    I am not sure what to do about the DataFrame issue though. I suspect most
    people will want to call with a DataFrame now.
