This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new b701d6e8951d [SPARK-45258][PYTHON][DOCS] Refine docstring of `sum`
b701d6e8951d is described below

commit b701d6e8951dd1a506e6a6bd0a5c3c7c23b8ddf0
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Tue Nov 7 09:51:51 2023 -0800

    [SPARK-45258][PYTHON][DOCS] Refine docstring of `sum`
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to improve the docstring of `sum`.
    
    ### Why are the changes needed?
    
    To make the PySpark documentation clearer and more usable for end users.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, it fixes the user-facing documentation.
    
    ### How was this patch tested?
    
    Manually tested.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #43684 from HyukjinKwon/SPARK-45258.
    
    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/pyspark/sql/functions.py | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 869506a35586..a32f04164f31 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -1197,13 +1197,27 @@ def sum(col: "ColumnOrName") -> Column:
 
     Examples
     --------
+    Example 1: Calculating the sum of values in a column
+
+    >>> from pyspark.sql import functions as sf
     >>> df = spark.range(10)
-    >>> df.select(sum(df["id"])).show()
+    >>> df.select(sf.sum(df["id"])).show()
     +-------+
     |sum(id)|
     +-------+
     |     45|
     +-------+
+
+    Example 2: Using a plus expression together to calculate the sum
+
+    >>> from pyspark.sql import functions as sf
+    >>> df = spark.createDataFrame([(1, 2), (3, 4)], ["A", "B"])
+    >>> df.select(sf.sum(sf.col("A") + sf.col("B"))).show()
+    +------------+
+    |sum((A + B))|
+    +------------+
+    |          10|
+    +------------+
     """
     return _invoke_function_over_columns("sum", col)
 

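As a quick sanity check on the expected outputs in the docstring above, the two totals can be reproduced without a Spark session. This is only a plain-Python sketch of the arithmetic, not PySpark code: the helper `column_sum` is a hypothetical name introduced here for illustration, and it assumes integer columns with no null values.

```python
# Plain-Python stand-in for the two docstring examples; no Spark session needed.
# `column_sum` is a hypothetical helper and is NOT part of PySpark.
from typing import Iterable


def column_sum(values: Iterable[int]) -> int:
    """Add up the values of one column (assumes non-null integers)."""
    total = 0
    for value in values:
        total += value
    return total


# Example 1: spark.range(10) produces the ids 0 through 9.
example_1 = column_sum(range(10))

# Example 2: rows of (A, B); A + B is evaluated per row, then the results are summed.
rows = [(1, 2), (3, 4)]
example_2 = column_sum(a + b for a, b in rows)

print(example_1, example_2)
```

The per-row addition in the second case mirrors `sf.sum(sf.col("A") + sf.col("B"))`: the expression inside `sum` is computed for each row first, and the aggregate is taken over those row-level results.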

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
