This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new f6e4a466705 [SPARK-46063][PYTHON][CONNECT] Improve error messages related to argument types in cube, rollup, groupBy, and pivot
f6e4a466705 is described below

commit f6e4a4667057e226a06b4d1b063a62b698ffb25f
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Thu Nov 23 15:33:15 2023 +0800

[SPARK-46063][PYTHON][CONNECT] Improve error messages related to argument types in cube, rollup, groupBy, and pivot

### What changes were proposed in this pull request?

This PR improves error messages related to argument types in `cube`, `rollup`, `groupBy`, and `pivot`.

```bash
./bin/pyspark --remote local
```

```python
>>> help(spark.range(1).cube)
Help on method cube in module pyspark.sql.connect.dataframe:

cube(*cols: 'ColumnOrName') -> 'GroupedData' method of pyspark.sql.connect.dataframe.DataFrame instance
    Create a multi-dimensional cube for the current :class:`DataFrame` using
    the specified columns, allowing aggregations to be performed on them.
    ...
```

**Before:**

```python
>>> spark.range(1).cube(1.2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../python/pyspark/sql/connect/dataframe.py", line 544, in cube
    raise PySparkTypeError(
pyspark.errors.exceptions.base.PySparkTypeError: [NOT_COLUMN_OR_STR] Argument `cube` should be a Column or str, got float.
```

**After:**

```python
>>> spark.range(1).cube(1.2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../python/pyspark/sql/connect/dataframe.py", line 544, in cube
    raise PySparkTypeError(
pyspark.errors.exceptions.base.PySparkTypeError: [NOT_COLUMN_OR_STR] Argument `cols` should be a Column or str, got float.
```

### Why are the changes needed?

For better error messages to end users.

### Does this PR introduce _any_ user-facing change?

Yes, it fixes the user-facing error message.

### How was this patch tested?

Manually tested.

### Was this patch authored or co-authored using generative AI tooling?

No.
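The pattern the patch corrects can be sketched in plain Python. This is a hypothetical stand-in, not the actual PySpark implementation: `PySparkTypeErrorSketch` and `validate_cols` are illustrative names, and the real code also accepts `Column` objects. The point of the fix is that the error should name the parameter (`cols`), not the method that raised it (`cube`, `rollup`, or `groupBy`):

```python
# Hypothetical sketch of the validation pattern fixed by this commit.
# Before the patch, `arg_name` was set to the method name ("cube", etc.);
# after, it is the actual parameter name ("cols").

class PySparkTypeErrorSketch(TypeError):
    """Stand-in for pyspark.errors.PySparkTypeError (illustrative only)."""

    def __init__(self, error_class: str, message_parameters: dict) -> None:
        self.error_class = error_class
        msg = (
            f"[{error_class}] Argument `{message_parameters['arg_name']}` "
            f"should be a Column or str, got {message_parameters['arg_type']}."
        )
        super().__init__(msg)


def validate_cols(*cols):
    """Reject anything that is not a str (real code also allows Column)."""
    for c in cols:
        if not isinstance(c, str):
            raise PySparkTypeErrorSketch(
                error_class="NOT_COLUMN_OR_STR",
                # The fix: report the parameter name `cols`, not the method name.
                message_parameters={"arg_name": "cols", "arg_type": type(c).__name__},
            )
```

With this shape, `validate_cols(1.2)` produces an error message that points the user at the `cols` parameter regardless of which grouping method was called.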
Closes #43968 from HyukjinKwon/SPARK-46063.

Authored-by: Hyukjin Kwon <gurwls...@apache.org>
Signed-off-by: Ruifeng Zheng <ruife...@apache.org>
---
 python/pyspark/sql/connect/dataframe.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py
index c713bb85c1e..c7b51205363 100644
--- a/python/pyspark/sql/connect/dataframe.py
+++ b/python/pyspark/sql/connect/dataframe.py
@@ -495,7 +495,7 @@ class DataFrame:
             else:
                 raise PySparkTypeError(
                     error_class="NOT_COLUMN_OR_STR",
-                    message_parameters={"arg_name": "groupBy", "arg_type": type(c).__name__},
+                    message_parameters={"arg_name": "cols", "arg_type": type(c).__name__},
                 )
         return GroupedData(df=self, group_type="groupby", grouping_cols=_cols)
@@ -520,7 +520,7 @@ class DataFrame:
             else:
                 raise PySparkTypeError(
                     error_class="NOT_COLUMN_OR_STR",
-                    message_parameters={"arg_name": "rollup", "arg_type": type(c).__name__},
+                    message_parameters={"arg_name": "cols", "arg_type": type(c).__name__},
                 )
         return GroupedData(df=self, group_type="rollup", grouping_cols=_cols)
@@ -543,7 +543,7 @@ class DataFrame:
             else:
                 raise PySparkTypeError(
                     error_class="NOT_COLUMN_OR_STR",
-                    message_parameters={"arg_name": "cube", "arg_type": type(c).__name__},
+                    message_parameters={"arg_name": "cols", "arg_type": type(c).__name__},
                 )
         return GroupedData(df=self, group_type="cube", grouping_cols=_cols)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org