[spark] branch master updated: [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 9aabc527ec27 [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase
9aabc527ec27 is described below

commit 9aabc527ec27da30cac2901d8f2eaf865e450295
Author: Takuya Ueshin
AuthorDate: Tue Oct 24 08:09:59 2023 +0900

    [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase

    ### What changes were proposed in this pull request?

    Fix user-facing APIs related to Python UDTF to use camelCase.

    ### Why are the changes needed?

    To keep the naming convention for user-facing APIs.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Updated the related tests.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43470 from ueshin/issues/SPARK-45620/field_names.

    Lead-authored-by: Takuya Ueshin
    Co-authored-by: Hyukjin Kwon
    Co-authored-by: Takuya UESHIN
    Signed-off-by: Hyukjin Kwon
---
 python/docs/source/user_guide/sql/python_udtf.rst  | 22 +++---
 python/pyspark/sql/functions.py                    | 12 ++--
 python/pyspark/sql/tests/test_udtf.py              | 84 +++---
 python/pyspark/sql/udtf.py                         | 24 +++
 python/pyspark/sql/worker/analyze_udtf.py          | 12 ++--
 .../python/UserDefinedPythonFunction.scala         |  8 +--
 .../apache/spark/sql/IntegratedUDFTestUtils.scala  | 22 +++---
 7 files changed, 92 insertions(+), 92 deletions(-)

diff --git a/python/docs/source/user_guide/sql/python_udtf.rst b/python/docs/source/user_guide/sql/python_udtf.rst
index fb42644dc702..0e0c6e28578b 100644
--- a/python/docs/source/user_guide/sql/python_udtf.rst
+++ b/python/docs/source/user_guide/sql/python_udtf.rst
@@ -77,29 +77,29 @@ To implement a Python UDTF, you first need to define a class implementing the me
     the particular UDTF call under consideration. Each parameter is an instance of the
     `AnalyzeArgument` class, which contains fields including the provided argument's data
     type and value (in the case of literal scalar arguments only). For table arguments, the
-    `is_table` field is set to true and the `data_type` field is a StructType representing
+    `isTable` field is set to true and the `dataType` field is a StructType representing
     the table's column types:

-        data_type: DataType
+        dataType: DataType
         value: Optional[Any]
-        is_table: bool
+        isTable: bool

     This method returns an instance of the `AnalyzeResult` class which includes the result
     table's schema as a StructType. If the UDTF accepts an input table argument, then the
     `AnalyzeResult` can also include a requested way to partition the rows of the input
-    table across several UDTF calls. If `with_single_partition` is set to True, the query
+    table across several UDTF calls. If `withSinglePartition` is set to True, the query
     planner will arrange a repartitioning operation from the previous execution stage such
     that all rows of the input table are consumed by the `eval` method from exactly one
-    instance of the UDTF class. On the other hand, if the `partition_by` list is non-empty,
+    instance of the UDTF class. On the other hand, if the `partitionBy` list is non-empty,
     the query planner will arrange a repartitioning such that all rows with each unique
     combination of values of the partitioning columns are consumed by a separate unique
-    instance of the UDTF class. If `order_by` is non-empty, this specifies the requested
+    instance of the UDTF class. If `orderBy` is non-empty, this specifies the requested
     ordering of rows within each partition.

         schema: StructType
-        with_single_partition: bool = False
-        partition_by: Sequence[PartitioningColumn] = field(default_factory=tuple)
-        order_by: Sequence[OrderingColumn] = field(default_factory=tuple)
+        withSinglePartition: bool = False
+        partitionBy: Sequence[PartitioningColumn] = field(default_factory=tuple)
+        orderBy: Sequence[OrderingColumn] = field(default_factory=tuple)

     Examples
@@ -116,7 +116,7 @@ To implement a Python UDTF, you first need to define a class implementing the me
     >>> def analyze(self, *args) -> AnalyzeResult:
     ...     assert len(args) == 1, "This function accepts one argument only"
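To see the renamed fields in action, here is a minimal, hypothetical sketch of a UDTF whose `analyze` method returns an `AnalyzeResult` using the new camelCase names; the class, table, and column names below are invented for illustration and do not appear in the commit:

```python
# Hypothetical sketch of the renamed AnalyzeResult fields; the class and
# column names are invented, not part of this commit.
from pyspark.sql.functions import udtf
from pyspark.sql.types import IntegerType, StringType, StructType
from pyspark.sql.udtf import (
    AnalyzeArgument,
    AnalyzeResult,
    OrderingColumn,
    PartitioningColumn,
)


@udtf
class CountPerKey:
    @staticmethod
    def analyze(table: AnalyzeArgument) -> AnalyzeResult:
        # `isTable` replaces the old `is_table` field.
        assert table.isTable, "This function accepts a TABLE argument only"
        return AnalyzeResult(
            schema=StructType().add("key", StringType()).add("count", IntegerType()),
            # camelCase now, instead of partition_by / order_by.
            partitionBy=[PartitioningColumn("key")],
            orderBy=[OrderingColumn("key")],
        )

    def __init__(self):
        self._key = None
        self._count = 0

    def eval(self, row):
        # With partitionBy above, all rows sharing one "key" value are
        # consumed by the same UDTF instance.
        self._key = row["key"]
        self._count += 1

    def terminate(self):
        yield self._key, self._count
```

After `spark.udtf.register("count_per_key", CountPerKey)`, such a UDTF would be invoked with a table argument, e.g. `SELECT * FROM count_per_key(TABLE(t))`; the query planner handles the repartitioning requested by `analyze`.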
[spark] branch master updated: [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e3ba9cf0403 [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase
e3ba9cf0403 is described below

commit e3ba9cf0403ade734f87621472088687e533b2cd
Author: Takuya UESHIN
AuthorDate: Mon Oct 23 10:35:30 2023 +0900

    [SPARK-45620][PYTHON] Fix user-facing APIs related to Python UDTF to use camelCase

    ### What changes were proposed in this pull request?

    Fix user-facing APIs related to Python UDTF to use camelCase.

    ### Why are the changes needed?

    To keep the naming convention for user-facing APIs.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Updated the related tests.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43470 from ueshin/issues/SPARK-45620/field_names.

    Authored-by: Takuya UESHIN
    Signed-off-by: Hyukjin Kwon
---
 python/docs/source/user_guide/sql/python_udtf.rst  | 22 +++---
 python/pyspark/sql/functions.py                    | 12 ++--
 python/pyspark/sql/tests/test_udtf.py              | 84 +++
 python/pyspark/sql/udtf.py                         | 24 +++
 python/pyspark/sql/worker/analyze_udtf.py          | 12 ++--
 5 files changed, 77 insertions(+), 77 deletions(-)

diff --git a/python/docs/source/user_guide/sql/python_udtf.rst b/python/docs/source/user_guide/sql/python_udtf.rst
index fb42644dc70..0e0c6e28578 100644
--- a/python/docs/source/user_guide/sql/python_udtf.rst
+++ b/python/docs/source/user_guide/sql/python_udtf.rst
@@ -77,29 +77,29 @@ To implement a Python UDTF, you first need to define a class implementing the me
     the particular UDTF call under consideration. Each parameter is an instance of the
     `AnalyzeArgument` class, which contains fields including the provided argument's data
     type and value (in the case of literal scalar arguments only). For table arguments, the
-    `is_table` field is set to true and the `data_type` field is a StructType representing
+    `isTable` field is set to true and the `dataType` field is a StructType representing
    the table's column types:

-        data_type: DataType
+        dataType: DataType
         value: Optional[Any]
-        is_table: bool
+        isTable: bool

     This method returns an instance of the `AnalyzeResult` class which includes the result
     table's schema as a StructType. If the UDTF accepts an input table argument, then the
     `AnalyzeResult` can also include a requested way to partition the rows of the input
-    table across several UDTF calls. If `with_single_partition` is set to True, the query
+    table across several UDTF calls. If `withSinglePartition` is set to True, the query
     planner will arrange a repartitioning operation from the previous execution stage such
     that all rows of the input table are consumed by the `eval` method from exactly one
-    instance of the UDTF class. On the other hand, if the `partition_by` list is non-empty,
+    instance of the UDTF class. On the other hand, if the `partitionBy` list is non-empty,
     the query planner will arrange a repartitioning such that all rows with each unique
     combination of values of the partitioning columns are consumed by a separate unique
-    instance of the UDTF class. If `order_by` is non-empty, this specifies the requested
+    instance of the UDTF class. If `orderBy` is non-empty, this specifies the requested
     ordering of rows within each partition.

         schema: StructType
-        with_single_partition: bool = False
-        partition_by: Sequence[PartitioningColumn] = field(default_factory=tuple)
-        order_by: Sequence[OrderingColumn] = field(default_factory=tuple)
+        withSinglePartition: bool = False
+        partitionBy: Sequence[PartitioningColumn] = field(default_factory=tuple)
+        orderBy: Sequence[OrderingColumn] = field(default_factory=tuple)

     Examples
@@ -116,7 +116,7 @@ To implement a Python UDTF, you first need to define a class implementing the me
     >>> def analyze(self, *args) -> AnalyzeResult:
     ...     assert len(args) == 1, "This function accepts one argument only"
-    ...     assert args[0].data_type == StringType(), "Only string arguments are supported"
+    ...     assert args[0].dataType == StringType(), "Only string arguments are supported"
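The scalar-argument path changes the same way. Below is a minimal, hypothetical sketch of the pattern the patched docs show: an `analyze` method validating its single argument through the renamed `dataType` field (the UDTF name and result schema are invented for illustration):

```python
# Hypothetical sketch of the scalar-argument analyze pattern from the
# patched docs; the UDTF name and schema are invented, not from the commit.
from pyspark.sql.functions import udtf
from pyspark.sql.types import StringType, StructType
from pyspark.sql.udtf import AnalyzeArgument, AnalyzeResult


@udtf
class SplitWords:
    @staticmethod
    def analyze(*args: AnalyzeArgument) -> AnalyzeResult:
        assert len(args) == 1, "This function accepts one argument only"
        # `dataType` replaces the old snake_case `data_type`; `value` is only
        # populated when the argument is a literal scalar.
        assert args[0].dataType == StringType(), "Only string arguments are supported"
        return AnalyzeResult(schema=StructType().add("word", StringType()))

    def eval(self, text: str):
        # Emit one output row per whitespace-separated word.
        for word in text.split():
            yield (word,)
```

After `spark.udtf.register("split_words", SplitWords)`, a call such as `SELECT * FROM split_words('hello world')` would pass the analyzer check and return one row per word.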