This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new ad18393d2f1e [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`
ad18393d2f1e is described below

commit ad18393d2f1e411e7898cea14af550360dfc8670
Author: panbingkun <pbk1...@gmail.com>
AuthorDate: Wed Sep 20 15:27:46 2023 +0800

    [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`

    ### What changes were proposed in this pull request?
    The pr aims to refine docstring of `rand/randn`.

    ### Why are the changes needed?
    - We need to add a call without seed in the example, and then skip it in the `doctest`.
    - To improve PySpark documentation.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    - Manually test.
    - Pass GA.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43003 from panbingkun/SPARK-45226.

    Authored-by: panbingkun <pbk1...@gmail.com>
    Signed-off-by: Ruifeng Zheng <ruife...@apache.org>
---
 python/pyspark/sql/functions.py | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 5474873df7b2..83049124bdb2 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -5243,16 +5243,28 @@ def rand(seed: Optional[int] = None) -> Column:
     Parameters
     ----------
     seed : int (default: None)
-        seed value for random generator.
+        Seed value for the random generator.
 
     Returns
     -------
     :class:`~pyspark.sql.Column`
-        random values.
+        A column of random values.
 
     Examples
     --------
+    Example 1: Generate a random column without a seed
+
     >>> from pyspark.sql import functions as sf
+    >>> spark.range(0, 2, 1, 1).withColumn('rand', sf.rand()).show() # doctest: +SKIP
+    +---+-------------------+
+    | id|               rand|
+    +---+-------------------+
+    |  0|0.14879325244215424|
+    |  1| 0.4640631044275454|
+    +---+-------------------+
+
+    Example 2: Generate a random column with a specific seed
+
     >>> spark.range(0, 2, 1, 1).withColumn('rand', sf.rand(seed=42) * 3).show()
     +---+------------------+
     | id|              rand|
@@ -5269,8 +5281,8 @@ def rand(seed: Optional[int] = None) -> Column:
 
 @_try_remote_functions
 def randn(seed: Optional[int] = None) -> Column:
-    """Generates a column with independent and identically distributed (i.i.d.) samples from
-    the standard normal distribution.
+    """Generates a random column with independent and identically distributed (i.i.d.) samples
+    from the standard normal distribution.
 
     .. versionadded:: 1.4.0
 
@@ -5284,16 +5296,28 @@ def randn(seed: Optional[int] = None) -> Column:
     Parameters
     ----------
     seed : int (default: None)
-        seed value for random generator.
+        Seed value for the random generator.
 
     Returns
     -------
     :class:`~pyspark.sql.Column`
-        random values.
+        A column of random values.
 
     Examples
     --------
+    Example 1: Generate a random column without a seed
+
     >>> from pyspark.sql import functions as sf
+    >>> spark.range(0, 2, 1, 1).withColumn('randn', sf.randn()).show() # doctest: +SKIP
+    +---+--------------------+
+    | id|               randn|
+    +---+--------------------+
+    |  0|-0.45011372342934214|
+    |  1|  0.6567304165329736|
+    +---+--------------------+
+
+    Example 2: Generate a random column with a specific seed
+
     >>> spark.range(0, 2, 1, 1).withColumn('randn', sf.randn(seed=42)).show()
     +---+------------------+
     | id|             randn|
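
The commit marks the unseeded examples with `# doctest: +SKIP` because their output changes on every run, while the seeded examples can be checked verbatim. The sketch below illustrates that distinction with Python's standard `random` module only. It is an analogy, not PySpark's generator: the helper names `uniform_samples` and `normal_samples` are hypothetical, and the values they produce will not match the tables in the diff.

```python
import random


def uniform_samples(n, seed=None):
    """Return n uniform samples from [0.0, 1.0) (hypothetical stand-in for rand)."""
    rng = random.Random(seed)  # seed=None takes entropy from the OS or clock
    return [rng.random() for _ in range(n)]


def normal_samples(n, seed=None):
    """Return n standard normal samples (hypothetical stand-in for randn)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# With a fixed seed the sequence repeats, so expected output can be pinned.
assert uniform_samples(2, seed=42) == uniform_samples(2, seed=42)
assert normal_samples(2, seed=42) == normal_samples(2, seed=42)

# Without a seed the values differ between runs, which is why a doctest
# showing such output has to be skipped rather than compared.
print(uniform_samples(2))
print(normal_samples(2))
```

Each helper builds its own `random.Random` instance so that seeding one call does not affect any other, which keeps the seeded results independent of call order.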