Github user erikerlandson commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13440#discussion_r218252156

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/Gini.scala ---
    @@ -71,6 +71,23 @@ object Gini extends Impurity {
       @Since("1.1.0")
       def instance: this.type = this
    +  /**
    +   * :: DeveloperApi ::
    +   * p-values for test-statistic measures, unsupported for [[Gini]]
    +   */
    +  @Since("2.2.0")
    +  @DeveloperApi
    +  def calculate(calcL: ImpurityCalculator, calcR: ImpurityCalculator): Double =
    --- End diff --

    I suspect that the generalization is closer to my newer signature `val pval = imp.calculate(leftImpurityCalculator, rightImpurityCalculator)` where you have all the context from the left and right nodes. The existing gain-based calculation should fit into this framework, just doing its current weighted average of purity gain.
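
    As a rough illustration of that last point, here is a minimal sketch of how a gain-based measure could be written against a two-argument `calculate`. It is not Spark's code: `ClassCounts` is a hypothetical stand-in for `ImpurityCalculator`, and `SplitMeasure` and `GiniGain` are names invented for this example. It assumes the parent node's statistics can be recovered by summing the two children's class counts.

```scala
// Hypothetical stand-in for Spark's ImpurityCalculator:
// per-class label counts (possibly weighted) for one tree node.
final case class ClassCounts(counts: Vector[Double]) {
  def total: Double = counts.sum

  // Parent statistics are the element-wise sum of the two children.
  def merge(other: ClassCounts): ClassCounts =
    ClassCounts(counts.zip(other.counts).map { case (a, b) => a + b })

  // Gini impurity: 1 - sum_k p_k^2, taken to be 0 for an empty node.
  def gini: Double = {
    val n = total
    if (n == 0.0) 0.0 else 1.0 - counts.map(c => (c / n) * (c / n)).sum
  }
}

// Hypothetical generalized interface: score a candidate split
// using the statistics of both child nodes.
trait SplitMeasure {
  def calculate(calcL: ClassCounts, calcR: ClassCounts): Double
}

// Gain-based measure expressed in that interface: parent impurity minus
// the weighted average of the children's impurities.
object GiniGain extends SplitMeasure {
  def calculate(calcL: ClassCounts, calcR: ClassCounts): Double = {
    val parent = calcL.merge(calcR)
    val n = parent.total
    if (n == 0.0) 0.0
    else parent.gini - (calcL.total / n) * calcL.gini - (calcR.total / n) * calcR.gini
  }
}
```

    A test-statistic measure would implement the same `calculate` but return a p-value computed from the two sets of counts, so callers would need to know whether larger or smaller values indicate a better split.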