hejsgpuom62c created SPARK-27052:
------------------------------------

             Summary: Using PySpark udf in transform yields NULL values
                 Key: SPARK-27052
                 URL: https://issues.apache.org/jira/browse/SPARK-27052
             Project: Spark
          Issue Type: Bug
          Components: PySpark, SQL
    Affects Versions: 2.4.0
            Reporter: hejsgpuom62c

Steps to reproduce:

{code:java}
from typing import Optional
from pyspark.sql.functions import expr

def f(x: Optional[int]) -> Optional[int]:
    return x + 1 if x is not None else None

spark.udf.register('f', f, "integer")

df = (spark
    .createDataFrame([(1, [1, 2, 3])], ("id", "xs"))
    .withColumn("xsinc", expr("transform(xs, x -> f(x))")))

df.show()
# +---+---------+-----+
# | id|       xs|xsinc|
# +---+---------+-----+
# |  1|[1, 2, 3]| [,,]|
# +---+---------+-----+
{code}

Source: https://stackoverflow.com/a/53762650
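
For comparison, here is a minimal sketch of what the reproduction presumably should have produced. It applies the same function `f` to the same array in plain Python, without Spark. The names `xs` and `expected_xsinc` are illustrative and not part of the original report, and the assumption is that `transform(xs, x -> f(x))` is meant to behave like an element-wise map.

```python
from typing import List, Optional


def f(x: Optional[int]) -> Optional[int]:
    # Same function as in the report: increment, passing None through.
    return x + 1 if x is not None else None


# Same array as the "xs" column in the reproduction (illustrative name).
xs: List[Optional[int]] = [1, 2, 3]

# Element-wise application, standing in for transform(xs, x -> f(x)).
expected_xsinc = [f(x) for x in xs]

print(expected_xsinc)
```

Under that assumption, the `xsinc` column would hold the incremented values rather than the three NULLs shown in the `df.show()` output.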