Github user thunterdb commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19156#discussion_r137603986

    --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
    @@ -109,31 +108,47 @@ object Summarizer extends Logging {
       }

       @Since("2.3.0")
    -  def mean(col: Column): Column = getSingleMetric(col, "mean")
    +  def mean(col: Column, weightCol: Column = lit(1.0)): Column = {
    --- End diff --

    I am not a fan of default parameters; they tend to cause issues with binary compatibility. Unless you have a good reason to keep the default, you should have two different functions:
    ```scala
    def mean(col: Column): Column = mean(col, lit(1.0))
    def mean(col: Column, weightCol: Column): Column = ...
    ```
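
To make the concern concrete: with a default argument, the Scala compiler emits only the two-parameter method plus a synthetic `mean$default$2` accessor, so a one-argument `mean(Column)` no longer exists in the bytecode. Callers compiled against the old signature can then fail with `NoSuchMethodError`, and Java callers cannot use the default at all. Below is a minimal sketch of the overload approach. It assumes a stand-in `Col` case class instead of Spark's `Column` so that it runs without Spark on the classpath; `SummarizerSketch`, `unitWeight`, and the `weighted_mean(...)` string are hypothetical names for illustration, not part of the Spark API.

```scala
// Stand-in for org.apache.spark.sql.Column (assumption: used only so the
// sketch compiles and runs without a Spark dependency).
final case class Col(expr: String)

object SummarizerSketch {
  // Hypothetical helper playing the role of lit(1.0).
  private def unitWeight: Col = Col("1.0")

  // The one-argument signature stays in the compiled class, so existing
  // callers (including Java callers) keep linking against it.
  def mean(col: Col): Col = mean(col, unitWeight)

  // The weighted variant is a separate overload rather than a default argument.
  // The returned expression is a placeholder, not a real aggregation.
  def mean(col: Col, weightCol: Col): Col =
    Col(s"weighted_mean(${col.expr}, ${weightCol.expr})")
}

object Demo extends App {
  // Both call shapes resolve to real methods and agree on the unit weight.
  val unweighted = SummarizerSketch.mean(Col("features"))
  val weighted   = SummarizerSketch.mean(Col("features"), Col("1.0"))
  assert(unweighted == weighted)
}
```

The trade-off is a little more boilerplate per metric in exchange for a stable set of method signatures across releases.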