Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21097#discussion_r184760717
  
    --- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/GBTClassifierSuite.scala
 ---
    @@ -365,6 +365,20 @@ class GBTClassifierSuite extends MLTest with 
DefaultReadWriteTest {
         assert(mostImportantFeature !== mostIF)
       }
     
    +  test("model evaluateEachIteration") {
    +    for (lossType <- Seq("logistic")) {
    +      val gbt = new GBTClassifier()
    +        .setMaxDepth(2)
    +        .setMaxIter(2)
    +        .setLossType(lossType)
    +      val model = gbt.fit(trainData.toDF)
    +      val eval1 = model.evaluateEachIteration(validationData.toDF)
    +      val eval2 = 
GradientBoostedTrees.evaluateEachIteration(validationData,
    --- End diff --
    
    This is testing the spark.ml implementation against itself. I was about to
recommend using the old spark.mllib implementation as a reference; however,
the old implementation is not tested at all. Would you be able to test against
a standard implementation in R or scikit-learn (following the patterns used
elsewhere in MLlib)?
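    As a hedged sketch of what the reviewer suggests (not part of the PR):
one could compute per-iteration losses with scikit-learn offline and paste
the resulting values into the Scala test as expected results, following the
MLlib pattern. The dataset, seeds, and hyperparameters below are illustrative
assumptions, not the suite's actual fixtures.

    ```python
    # Illustrative reference computation with scikit-learn's
    # GradientBoostingClassifier (default loss is log-loss, matching
    # GBTClassifier's "logistic" loss type in spirit).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import log_loss

    # Hypothetical dataset standing in for the suite's trainData/validationData.
    X, y = make_classification(n_samples=100, random_state=42)

    gbt = GradientBoostingClassifier(max_depth=2, n_estimators=2,
                                     random_state=42)
    gbt.fit(X, y)

    # staged_predict_proba yields class probabilities after each boosting
    # iteration; the per-iteration log loss plays the role of
    # evaluateEachIteration's output.
    stage_losses = [log_loss(y, proba)
                    for proba in gbt.staged_predict_proba(X)]
    print(stage_losses)  # values to hard-code in the Scala test
    ```

    Note the loss definitions may differ by a constant factor between
scikit-learn and Spark, so the comparison would need to account for that.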


---
