Xinrong Meng created SPARK-39048:
------------------------------------

             Summary: Refactor GroupBy._reduce_for_stat_function on accepted data types
                 Key: SPARK-39048
                 URL: https://issues.apache.org/jira/browse/SPARK-39048
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 3.4.0
            Reporter: Xinrong Meng

`GroupBy._reduce_for_stat_function` is a common helper function used by multiple statistical functions of GroupBy objects. It defines the parameters `only_numeric` and `bool_as_numeric` to control which Spark types are accepted. To be consistent with the pandas API, we may also have to introduce `str_as_numeric`, for example for `sum`.

Instead of introducing a separate parameter for each Spark type, this issue proposes introducing a single parameter, `accepted_spark_types`, that specifies the accepted types of the Spark columns to be aggregated.
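
The idea can be illustrated with a minimal sketch. This is not the actual PySpark implementation: the helper `select_agg_columns`, the `schema` list, and the stand-in type classes below are all hypothetical, defined here only so the example runs without a Spark installation. The class names mirror those in `pyspark.sql.types`, but they are plain Python placeholders. The sketch only shows how one `accepted_spark_types` argument could replace per-type flags such as `only_numeric`, `bool_as_numeric`, and `str_as_numeric`.

```python
from typing import Iterable, List, Optional, Tuple, Type


# Stand-in classes (assumption): placeholders named after Spark data types,
# so the sketch does not depend on pyspark being installed.
class DataType:
    pass


class NumericType(DataType):
    pass


class LongType(NumericType):
    pass


class DoubleType(NumericType):
    pass


class BooleanType(DataType):
    pass


class StringType(DataType):
    pass


def select_agg_columns(
    schema: List[Tuple[str, DataType]],
    accepted_spark_types: Optional[Iterable[Type[DataType]]] = None,
) -> List[str]:
    """Hypothetical helper: return the names of the columns to aggregate.

    With accepted_spark_types=None every column is kept; otherwise only
    columns whose type is an instance of one of the accepted classes.
    """
    if accepted_spark_types is None:
        return [name for name, _ in schema]
    accepted = tuple(accepted_spark_types)
    return [name for name, dtype in schema if isinstance(dtype, accepted)]


# Hypothetical column layout of a grouped frame.
schema = [
    ("a", LongType()),
    ("b", DoubleType()),
    ("c", BooleanType()),
    ("d", StringType()),
]

# A numeric-only statistic passes just the numeric base class.
numeric_cols = select_agg_columns(schema, (NumericType,))

# A statistic that also accepts booleans and strings extends the tuple
# instead of needing additional boolean flags.
wide_cols = select_agg_columns(schema, (NumericType, BooleanType, StringType))
```

With this shape, supporting a further type would mean adding one entry to the tuple at the call site rather than adding another keyword parameter to the helper.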