Github user thunterdb commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19156#discussion_r137603986
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala ---
    @@ -109,31 +108,47 @@ object Summarizer extends Logging {
       }
     
       @Since("2.3.0")
    -  def mean(col: Column): Column = getSingleMetric(col, "mean")
    +  def mean(col: Column, weightCol: Column = lit(1.0)): Column = {
    --- End diff --
    
    I am not a fan of default parameters; they tend to cause issues with binary 
compatibility. Unless you have a good reason, you should have two different 
functions:
    
    ```scala
    def mean(col: Column): Column = mean(col, lit(1.0))
    def mean(col: Column, weightCol: Column): Column = ...
    ```
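    
    To make the concern concrete: scalac compiles a default argument into a 
synthetic `mean$default$2` method and emits only the two-argument `mean`, so 
code already compiled against `mean(col)` no longer finds that method and 
fails with a `NoSuchMethodError`. Java callers cannot use the default at all. 
Here is a minimal, self-contained sketch of the overload approach. `Column`, 
`lit` and `SummarizerSketch` are hypothetical stand-ins so it runs without 
Spark, and the body of the weighted variant is only a placeholder, not the 
real aggregation:
    
    ```scala
    // Hypothetical stand-in for org.apache.spark.sql.Column (assumption for this sketch).
    final case class Column(expr: String)

    object SummarizerSketch {
      // Hypothetical stand-in for org.apache.spark.sql.functions.lit.
      def lit(value: Double): Column = Column(value.toString)

      // The original one-argument signature stays in the bytecode,
      // so callers compiled against it keep linking.
      def mean(col: Column): Column = mean(col, lit(1.0))

      // The weighted variant is a separate overload instead of a default parameter.
      // Placeholder body: it only records its inputs.
      def mean(col: Column, weightCol: Column): Column =
        Column(s"mean(${col.expr}, ${weightCol.expr})")
    }
    ```
    
    With this layout `SummarizerSketch.mean(Column("features"))` and 
`SummarizerSketch.mean(Column("features"), SummarizerSketch.lit(1.0))` return 
the same value, and both signatures exist as real methods.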


---
