[ https://issues.apache.org/jira/browse/SPARK-27463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842335#comment-16842335 ]
Bryan Cutler commented on SPARK-27463:
--------------------------------------

[~d80tb7] I think you could remove the SPIP label from this and begin work. It will require some tweaks to the Python worker and will add a new API, but it does not involve major changes and additions like other SPIPs do. If others feel differently, though, we could continue with the SPIP process.

> SPIP: Support Dataframe Cogroup via Pandas UDFs
> ------------------------------------------------
>
>                 Key: SPARK-27463
>                 URL: https://issues.apache.org/jira/browse/SPARK-27463
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 3.0.0
>            Reporter: Chris Martin
>            Priority: Major
>              Labels: SPIP
>
> Recent work on Pandas UDFs in Spark has allowed for improved
> interoperability between Pandas and Spark. This proposal aims to extend this
> by introducing a new Pandas UDF type which would allow for a cogroup
> operation to be applied to two PySpark DataFrames.
> Full details are in the Google document linked below.
>