GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18906#discussion_r163759284
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2105,6 +2105,14 @@ def udf(f=None, returnType=StringType()):
         >>> import random
         >>> random_udf = udf(lambda: int(random.random() * 100), IntegerType()).asNondeterministic()
     
    +    .. note:: The user-defined functions are considered to be able to return null values by default.
    +        If your function is not nullable, call `asNonNullable` on the user defined function.
    +        E.g.:
    +
    +    >>> from pyspark.sql.types import StringType
    +    >>> import getpass
    +    >>> getuser_udf = udf(lambda: getpass.getuser(), StringType()).asNonNullable()
    --- End diff --
    
    The default return type is string. I think we can omit `StringType`:
    
    ```python
    getuser_udf = udf(lambda: getpass.getuser(), "string").asNonNullable()
    ```

