Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20360

@hankim maybe like:
```
import pyspark.sql.functions as f
import uuid
df = spark.createDataFrame([(1,2), (3,4)])
f_udf = f.udf(lambda: str(uuid.uuid4()))
df2 = df.distinct().cache()
df3 = df2.withColumn("a", f_udf()).show()
```