[jira] [Commented] (SPARK-27052) Using PySpark udf in transform yields NULL values

Herman van Hovell (JIRA) Fri, 22 Mar 2019 06:36:21 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-27052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799017#comment-16799017
 ]


Herman van Hovell commented on SPARK-27052:
-------------------------------------------

This is not supported at the moment. It will probably be non-trivial to
implement, since we need to figure out a performant way to invoke Python here.
In this particular case we can probably rewrite the higher-order function into
a chain of map operations, one of which will be executed by Python. Anyway,
let's discuss this first before starting to code this up.
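
Until something like that exists, one possible workaround is to keep the
Python call outside the lambda and let a single UDF process the whole array.
The following is only a minimal sketch under that assumption, not a proposed
fix: `f_all` and `add_xsinc` are hypothetical names introduced for
illustration, and `f` is the function from the report.

```python
from typing import List, Optional


def f(x: Optional[int]) -> Optional[int]:
    # Element-wise function from the report: increment, passing NULLs through.
    return x + 1 if x is not None else None


def f_all(xs: Optional[List[Optional[int]]]) -> Optional[List[Optional[int]]]:
    # Hypothetical helper: apply f to every element in Python, so the UDF is
    # invoked once per row instead of inside the transform lambda.
    return [f(x) for x in xs] if xs is not None else None


def add_xsinc(spark):
    # Hypothetical helper taking an existing SparkSession. pyspark is imported
    # lazily so the pure-Python functions above can be used without Spark.
    from pyspark.sql.functions import expr

    # Register the array-level UDF with a DDL-formatted return type.
    spark.udf.register("f_all", f_all, "array<int>")
    return (spark
        .createDataFrame([(1, [1, 2, 3])], ("id", "xs"))
        .withColumn("xsinc", expr("f_all(xs)")))
```

Calling `add_xsinc(spark).show()` would exercise the UDF on the data from the
report. The trade-off is that each array is serialized to Python as a whole,
so this avoids the unsupported lambda call but not the usual Python UDF cost.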

> Using PySpark udf in transform yields NULL values
> -------------------------------------------------
>
>                 Key: SPARK-27052
>                 URL: https://issues.apache.org/jira/browse/SPARK-27052
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 2.4.0
>            Reporter: hejsgpuom62c
>            Priority: Major
>
> Steps to reproduce
> {code:python}
> from typing import Optional
> from pyspark.sql.functions import expr
> def f(x: Optional[int]) -> Optional[int]:
>     return x + 1 if x is not None else None
> spark.udf.register('f', f, "integer")
> df = (spark
>     .createDataFrame([(1, [1, 2, 3])], ("id", "xs"))
>     .withColumn("xsinc", expr("transform(xs, x -> f(x))")))
> df.show()
> # +---+---------+-----+
> # | id|       xs|xsinc|
> # +---+---------+-----+
> # |  1|[1, 2, 3]| [,,]|
> # +---+---------+-----+
> {code}
>
> Source https://stackoverflow.com/a/53762650



