Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4906#discussion_r26056478
  
    --- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/tree/GradientBoostedTrees.scala ---
    @@ -69,6 +74,42 @@ class GradientBoostedTrees(private val boostingStrategy: 
BoostingStrategy)
           case _ =>
             throw new IllegalArgumentException(s"$algo is not supported by the 
gradient boosting.")
         }
    +    baseLearners = fitGradientBoostingModel.trees
    +    baseLearnerWeights = fitGradientBoostingModel.treeWeights
    +    fitGradientBoostingModel
    +  }
    +
    +  /**
    +   * Method to compute error or loss for every iteration of gradient 
boosting.
    +   * @param data: RDD of [[org.apache.spark.mllib.regression.LabeledPoint]]
    +   * @param loss: evaluation metric that defaults to boostingStrategy.loss
    +   * @return an array with index i having the losses or errors for the 
ensemble
    +   *         containing trees 1 to i + 1
    +   */
    +  def evaluateEachIteration(
    --- End diff --
    
    This method should be implemented in the model, not in the estimator.  
There's no need to make a duplicate of the model in the estimator class.  (We 
try to keep estimator classes stateless except for parameter values so that 
they remain lightweight types.)
    
    This change will require a bit of refactoring, so I'll hold off on more 
comments until then.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to