Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18906#discussion_r163831513

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2105,6 +2105,14 @@ def udf(f=None, returnType=StringType()):
         >>> import random
         >>> random_udf = udf(lambda: int(random.random() * 100), IntegerType()).asNondeterministic()

    +    .. note:: The user-defined functions are considered to be able to return null values by default.
    +        If your function is not nullable, call `asNonNullable` on the user defined function.
    +        E.g.:
    +
    +    >>> from pyspark.sql.types import StringType
    +    >>> import getpass
    +    >>> getuser_udf = udf(lambda: getpass.getuser(), StringType()).asNonNullable()
    --- End diff --

    Oh, simply because it's shorter :-). That's fine if you prefer this way. I don't feel strongly.