[ 
https://issues.apache.org/jira/browse/SPARK-41661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinrong Meng updated SPARK-41661:
---------------------------------
    Description: 
See design doc 
[here|https://docs.google.com/document/d/e/2PACX-1vRXF8nTdjwH0LbYyp3b6Zt6STEKWsvfKSO7_s4foOB-3zJ2h4_06JF147hUPlADJxZ_X22RFxgZ-fRS/pub].

User-defined Functions in Python consist of (pickled) Python UDFs and 
(Arrow-optimized) Pandas UDFs. They enable users to run arbitrary Python code 
on top of the Apache Spark™ engine. Users only have to state "what to do"; 
PySpark, as a sandbox, encapsulates "how to do it".

Spark Connect Python Client (SCPC), as a client and server interface for 
PySpark, will eventually replace the legacy API of PySpark in OSS. Supporting 
PySpark UDFs is essential for Spark Connect to reach parity with the PySpark 
legacy API.

  was:
See design doc 
[here|https://docs.google.com/document/d/e/2PACX-1vRXF8nTdjwH0LbYyp3b6Zt6STEKWsvfKSO7_s4foOB-3zJ2h4_06JF147hUPlADJxZ_X22RFxgZ-fRS/pub].

User-defined Functions in Python consist of (pickled) Python UDFs and 
(Arrow-optimized) Pandas UDFs. They enable users to run arbitrary Python code 
on top of the Apache Spark™ engine. Users only have to state "what to do"; 
PySpark, as a sandbox, encapsulates "how to do it".

Spark Connect Python Client (SCPC), as a client and server interface for 
PySpark, will eventually (probably Spark 4.0) replace the legacy API of PySpark 
in both OSS. Supporting PySpark UDFs is essential for Spark Connect to reach 
parity with the PySpark legacy API.


> Support for User-defined Functions in Python
> --------------------------------------------
>
>                 Key: SPARK-41661
>                 URL: https://issues.apache.org/jira/browse/SPARK-41661
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Connect
>    Affects Versions: 3.4.0
>            Reporter: Martin Grund
>            Assignee: Xinrong Meng
>            Priority: Major
>
> See design doc 
> [here|https://docs.google.com/document/d/e/2PACX-1vRXF8nTdjwH0LbYyp3b6Zt6STEKWsvfKSO7_s4foOB-3zJ2h4_06JF147hUPlADJxZ_X22RFxgZ-fRS/pub].
>
> User-defined Functions in Python consist of (pickled) Python UDFs and 
> (Arrow-optimized) Pandas UDFs. They enable users to run arbitrary Python code 
> on top of the Apache Spark™ engine. Users only have to state "what to do"; 
> PySpark, as a sandbox, encapsulates "how to do it".
>
> Spark Connect Python Client (SCPC), as a client and server interface for 
> PySpark, will eventually replace the legacy API of PySpark in OSS. Supporting 
> PySpark UDFs is essential for Spark Connect to reach parity with the PySpark 
> legacy API.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
