Chaoqin Li created SPARK-47273:
----------------------------------

             Summary: Implement python stream writer interface
                 Key: SPARK-47273
                 URL: https://issues.apache.org/jira/browse/SPARK-47273
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SS
    Affects Versions: 4.0.0
            Reporter: Chaoqin Li


To support developing Spark streaming sinks in Python, we need to implement 
the Python stream writer interface.

Reuse PythonPartitionWriter to implement serialization and execution of the 
write callback on the executors.

Implement a Python worker process that runs the Python streaming data sink 
committer and communicates with the JVM on the Spark driver through a socket. 
Each Python streaming data sink instance gets its own long-lived Python worker 
process. Inside that process, the Python write committer receives commit or 
abort calls and sends the result back through the socket.
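
A rough sketch of the commit/abort loop described above, to make the intended 
request/reply flow concrete. Everything here is an assumption for illustration 
only: the newline-delimited JSON messages, the names ExampleCommitter, serve 
and call, and the use of a thread plus socket.socketpair() in place of a 
separate Python worker process and the driver JVM. It is not the actual Spark 
wire protocol or API.

```python
import json
import socket
import threading


class ExampleCommitter:
    """Hypothetical sink committer; it only records what it was asked to do."""

    def __init__(self):
        self.log = []

    def commit(self, batch_id, messages):
        self.log.append(("commit", batch_id, messages))

    def abort(self, batch_id, messages):
        self.log.append(("abort", batch_id, messages))


def serve(conn, committer):
    # Long-lived worker loop: one JSON request per line, one JSON reply per line.
    # The loop ends when the peer closes its end of the socket.
    handlers = {"commit": committer.commit, "abort": committer.abort}
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            request = json.loads(line)
            try:
                handlers[request["op"]](request["batch_id"], request["messages"])
                reply = {"status": "ok"}
            except Exception as exc:
                # Report the failure to the caller instead of killing the worker.
                reply = {"status": "error", "reason": repr(exc)}
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


def call(reader, writer, op, batch_id, messages):
    # Stand-in for the driver side: send one request, then block on the reply.
    request = {"op": op, "batch_id": batch_id, "messages": messages}
    writer.write(json.dumps(request) + "\n")
    writer.flush()
    return json.loads(reader.readline())


# A connected socket pair stands in for the driver-to-worker connection.
driver_sock, worker_sock = socket.socketpair()
committer = ExampleCommitter()
worker = threading.Thread(target=serve, args=(worker_sock, committer), daemon=True)
worker.start()

with driver_sock, driver_sock.makefile("r", encoding="utf-8") as reader, \
        driver_sock.makefile("w", encoding="utf-8") as writer:
    commit_reply = call(reader, writer, "commit", 0, ["part-0", "part-1"])
    abort_reply = call(reader, writer, "abort", 1, ["part-0"])
    bad_reply = call(reader, writer, "unknown", 2, [])

# Closing the driver side ends the worker loop.
worker.join(timeout=5)
```

In this sketch the worker survives a failed or unknown call and reports an 
error status instead, which is one way to keep a single process alive for the 
whole lifetime of a sink instance.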



