This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 9da1e4ca7bb [SPARK-46290][PYTHON] Change saveMode to a boolean flag for DataSourceWriter
9da1e4ca7bb is described below

commit 9da1e4ca7bb89a8b5730d9e496c378c8357e003a
Author: allisonwang-db <allison.w...@databricks.com>
AuthorDate: Thu Dec 7 09:42:04 2023 +0900

    [SPARK-46290][PYTHON] Change saveMode to a boolean flag for DataSourceWriter

    ### What changes were proposed in this pull request?

    This PR updates the `writer` method in the Python data source API from
    ```
    def writer(self, schema: StructType, saveMode: str)
    ```
    to
    ```
    def writer(self, schema: StructType, overwrite: bool)
    ```
    The motivation here is that `saveMode` offers four modes: append, overwrite, error, and ignore, but practically speaking, only append and overwrite are meaningful. Also, DSv2 only supports the append and overwrite mode. Python data sources should be consistent.

    ### Why are the changes needed?

    To make the API simpler.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Existing tests

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #44216 from allisonwang-db/spark-46290-overwrite.

    Authored-by: allisonwang-db <allison.w...@databricks.com>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/sql/datasource.py | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/python/pyspark/sql/datasource.py b/python/pyspark/sql/datasource.py
index 4713ca5366a..e20d44039a6 100644
--- a/python/pyspark/sql/datasource.py
+++ b/python/pyspark/sql/datasource.py
@@ -130,7 +130,7 @@ class DataSource(ABC):
             message_parameters={"feature": "reader"},
         )
 
-    def writer(self, schema: StructType, saveMode: str) -> "DataSourceWriter":
+    def writer(self, schema: StructType, overwrite: bool) -> "DataSourceWriter":
         """
         Returns a ``DataSourceWriter`` instance for writing data.
 
@@ -140,9 +140,8 @@ class DataSource(ABC):
         ----------
         schema : StructType
             The schema of the data to be written.
-        saveMode : str
-            A string identifies the save mode. It can be one of the following:
-            `append`, `overwrite`, `error`, `ignore`.
+        overwrite : bool
+            A flag indicating whether to overwrite existing data when writing to the data source.
 
         Returns
         -------