This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new f1f856d5463 [SPARK-45526][PYTHON][DOCS] Improve the example of DataFrameReader/Writer.options to take a dictionary
f1f856d5463 is described below

commit f1f856d546360d34ca1f7ee1ddc163381586b180
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Fri Oct 13 14:23:09 2023 +0900

    [SPARK-45526][PYTHON][DOCS] Improve the example of DataFrameReader/Writer.options to take a dictionary

    ### What changes were proposed in this pull request?

    This PR proposes to add an example of DataFrameReader/Writer.options taking a dictionary.

    ### Why are the changes needed?

    So that users know how to set options from a dictionary in PySpark.

    ### Does this PR introduce _any_ user-facing change?

    Yes, it adds an example of setting the options with a dictionary.

    ### How was this patch tested?

    Existing doctests in this PR's CI.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43357

    Closes #43358 from HyukjinKwon/SPARK-45528.

    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/sql/readwriter.py           | 14 ++++++++++++--
 python/pyspark/sql/streaming/readwriter.py | 10 ++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/python/pyspark/sql/readwriter.py b/python/pyspark/sql/readwriter.py
index ea429a75e15..81977c9e8cc 100644
--- a/python/pyspark/sql/readwriter.py
+++ b/python/pyspark/sql/readwriter.py
@@ -220,7 +220,12 @@ class DataFrameReader(OptionUtils):
 
         Examples
         --------
-        >>> spark.read.option("key", "value")
+        >>> spark.read.options(key="value")
+        <...readwriter.DataFrameReader object ...>
+
+        Specify options in a dictionary.
+
+        >>> spark.read.options(**{"k1": "v1", "k2": "v2"})
         <...readwriter.DataFrameReader object ...>
 
         Specify the option 'nullValue' and 'header' with reading a CSV file.
@@ -1172,7 +1177,12 @@ class DataFrameWriter(OptionUtils):
 
         Examples
         --------
-        >>> spark.range(1).write.option("key", "value")
+        >>> spark.range(1).write.options(key="value")
+        <...readwriter.DataFrameWriter object ...>
+
+        Specify options in a dictionary.
+
+        >>> spark.range(1).write.options(**{"k1": "v1", "k2": "v2"})
         <...readwriter.DataFrameWriter object ...>
 
         Specify the option 'nullValue' and 'header' with writing a CSV file.
diff --git a/python/pyspark/sql/streaming/readwriter.py b/python/pyspark/sql/streaming/readwriter.py
index 2026651ce12..b0f01c06b2e 100644
--- a/python/pyspark/sql/streaming/readwriter.py
+++ b/python/pyspark/sql/streaming/readwriter.py
@@ -224,6 +224,11 @@ class DataStreamReader(OptionUtils):
         >>> spark.readStream.options(x="1", y=2)
         <...streaming.readwriter.DataStreamReader object ...>
 
+        Specify options in a dictionary.
+
+        >>> spark.readStream.options(**{"k1": "v1", "k2": "v2"})
+        <...streaming.readwriter.DataStreamReader object ...>
+
         The example below specifies 'rowsPerSecond' and 'numPartitions' options to
         Rate source in order to generate 10 rows with 10 partitions every second.
@@ -943,6 +948,11 @@ class DataStreamWriter:
         >>> df.writeStream.option("x", 1)
         <...streaming.readwriter.DataStreamWriter object ...>
 
+        Specify options in a dictionary.
+
+        >>> df.writeStream.options(**{"k1": "v1", "k2": "v2"})
+        <...streaming.readwriter.DataStreamWriter object ...>
+
         The example below specifies 'numRows' and 'truncate' options to Console source in order
         to print 3 rows for every batch without truncating the results.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
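A note for readers of the patch: the `options(**{"k1": "v1", ...})` form works purely through Python keyword-argument unpacking, so a dictionary of options can be built programmatically and passed in one call. The toy class below is NOT PySpark's actual implementation, just a minimal sketch of the `**` mechanics the new doctest examples rely on (`FakeReader` and its `_options` attribute are made up for illustration):

```python
class FakeReader:
    """Toy stand-in for a builder like DataFrameReader (illustrative only)."""

    def __init__(self):
        self._options = {}

    def options(self, **options):
        # Each keyword argument becomes a stored option, so
        # options(**{"k1": "v1"}) is equivalent to options(k1="v1").
        self._options.update({k: str(v) for k, v in options.items()})
        return self  # return self so calls chain, as in the real API

# Options from a dict and from plain keywords land in the same place.
conf = {"k2": "v2", "k3": 3}
reader = FakeReader().options(k1="v1").options(**conf)
print(reader._options)  # → {'k1': 'v1', 'k2': 'v2', 'k3': '3'}
```

Note that non-string values (like `3` above) are stringified here; the real PySpark readers likewise accept non-string option values, as the existing `options(x="1", y=2)` doctest shows.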