Xinrong Meng created SPARK-39048:
------------------------------------

             Summary: Refactor GroupBy._reduce_for_stat_function on accepted data types
                 Key: SPARK-39048
                 URL: https://issues.apache.org/jira/browse/SPARK-39048
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 3.4.0
            Reporter: Xinrong Meng


`GroupBy._reduce_for_stat_function` is a common helper function used by 
multiple statistical functions of GroupBy objects.

It defines parameters `only_numeric` and `bool_as_numeric` to control accepted 
Spark types.

To be consistent with the pandas API, we may also have to introduce another 
flag, for example `str_as_numeric` for `sum`.

Instead of introducing a separate parameter for each Spark type, the proposed 
PR introduces a single parameter, `accepted_spark_types`, that specifies which 
types of Spark columns may be aggregated.
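
A minimal sketch of the idea, not the actual PySpark implementation: the type 
classes below are stand-ins that mirror the names in `pyspark.sql.types` so the 
example runs without Spark, and `select_agg_columns` and the sample schema are 
hypothetical names used only for illustration. It shows how one tuple of 
accepted types can express what the boolean flags express today.

```python
from typing import List, Tuple, Type


# Stand-ins mirroring class names in pyspark.sql.types (assumption: the real
# helper would check the actual Spark data types of the columns instead).
class DataType: ...
class NumericType(DataType): ...
class LongType(NumericType): ...
class DoubleType(NumericType): ...
class BooleanType(DataType): ...
class StringType(DataType): ...


def select_agg_columns(
    schema: List[Tuple[str, DataType]],
    accepted_spark_types: Tuple[Type[DataType], ...],
) -> List[str]:
    """Hypothetical helper: keep columns whose type is one of the accepted types."""
    return [name for name, dtype in schema if isinstance(dtype, accepted_spark_types)]


schema = [
    ("a", LongType()),
    ("b", DoubleType()),
    ("c", BooleanType()),
    ("d", StringType()),
]

# Roughly equivalent to only_numeric=True, bool_as_numeric=False.
numeric_only = select_agg_columns(schema, (NumericType,))

# Roughly equivalent to only_numeric=True, bool_as_numeric=True.
with_bool = select_agg_columns(schema, (NumericType, BooleanType))

# What a separate `str_as_numeric` flag would otherwise be needed for.
with_str = select_agg_columns(schema, (NumericType, StringType))
```

With this shape, each statistical function passes the tuple of types it 
accepts, and supporting a new type does not require a new keyword argument.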


