Chaoqin Li created SPARK-47273:
----------------------------------

             Summary: Implement python stream writer interface
                 Key: SPARK-47273
                 URL: https://issues.apache.org/jira/browse/SPARK-47273
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SS
    Affects Versions: 4.0.0
            Reporter: Chaoqin Li

To support developing Spark streaming sinks in Python, we need to do two things. First, reuse PythonPartitionWriter to implement the serialization and execution of the write callback on the executors. Second, implement a Python worker process that runs the Python streaming data sink committer and communicates with the JVM through a socket in the Spark driver.

For each Python streaming data sink instance, one long-lived Python worker process will be created. Inside that process, the Python write committer receives commit or abort function calls and sends the result back through the socket.
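
The committer loop described above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the proposed Spark implementation: the function ids, the 4-byte big-endian integer framing, the status codes, and the names `serve_committer` and `RecordingSink` are all hypothetical, and the issue does not specify the actual wire protocol. In-memory byte streams stand in for the socket so the sketch runs on its own.

```python
import io
import struct

# Hypothetical function ids; the real protocol is not defined in this issue.
COMMIT_FUNC_ID = 0
ABORT_FUNC_ID = 1
STOP_FUNC_ID = -1


def read_int(stream):
    """Read one 4-byte big-endian signed integer from the stream."""
    data = stream.read(4)
    if len(data) < 4:
        raise EOFError("stream closed before a full integer was read")
    return struct.unpack("!i", data)[0]


def write_int(value, stream):
    """Write one 4-byte big-endian signed integer to the stream."""
    stream.write(struct.pack("!i", value))


class RecordingSink:
    """Hypothetical sink committer that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def commit(self, batch_id):
        self.calls.append(("commit", batch_id))

    def abort(self, batch_id):
        self.calls.append(("abort", batch_id))


def serve_committer(sink, infile, outfile):
    """Long-lived loop: read a request, call commit or abort, reply with a status.

    Replies 0 on success and 1 if the call raised (assumed status codes).
    """
    while True:
        func_id = read_int(infile)
        if func_id == STOP_FUNC_ID:
            break
        batch_id = read_int(infile)
        try:
            if func_id == COMMIT_FUNC_ID:
                sink.commit(batch_id)
            elif func_id == ABORT_FUNC_ID:
                sink.abort(batch_id)
            else:
                raise ValueError(f"unknown function id {func_id}")
            write_int(0, outfile)
        except Exception:
            write_int(1, outfile)
        outfile.flush()


# Simulate the JVM side: commit batch 0, abort batch 1, then stop the worker.
requests = io.BytesIO(
    struct.pack("!iiiii", COMMIT_FUNC_ID, 0, ABORT_FUNC_ID, 1, STOP_FUNC_ID)
)
replies = io.BytesIO()
sink = RecordingSink()
serve_committer(sink, requests, replies)
```

With a real socket, the two streams could be obtained from `socket.makefile("rb")` and `socket.makefile("wb")`. A real committer would presumably also need to receive the per-partition commit messages, which this sketch leaves out.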