HeartSaVioR edited a comment on issue #25618: [SPARK-28908][SS]Implement Kafka EOS sink for Structured Streaming
URL: https://github.com/apache/spark/pull/25618#issuecomment-526593592

Well, someone could call it 2PC since the behavior is similar, but 2PC generally assumes a coordinator and participants. In the second phase, the coordinator "asks" the participants to commit or abort; it does not itself directly commit or abort what the participants did in the first phase. Based on that, the driver should request tasks to commit their outputs, but Spark doesn't provide such a flow. So this is a pretty simplified version of 2PC, and also a pretty limited one.

I think the point is whether we feel OK having exactly-once with some restrictions that end users need to be aware of. Could you please initiate a discussion on this on the Spark dev mailing list? It would be good to hear others' voices.
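For reference, the coordinator/participant flow that classic 2PC assumes can be sketched roughly as below. This is only an illustrative toy, not Spark or Kafka code: `Participant`, `Coordinator`, `Vote`, and `can_prepare` are hypothetical names made up for this sketch. The point it tries to show is that each participant applies the commit or abort itself, and only after the coordinator asks for it in the second phase.

```python
from dataclasses import dataclass
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


@dataclass
class Participant:
    """Hypothetical participant: stages its work, then waits for a decision."""
    name: str
    can_prepare: bool = True  # assumption: simulates a prepare failure when False
    state: str = "init"

    def prepare(self) -> Vote:
        # Phase 1: stage the output and vote; nothing is made visible yet.
        if self.can_prepare:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self) -> None:
        # Phase 2: runs only because the coordinator asked for it.
        if self.state != "prepared":
            raise RuntimeError(f"{self.name} cannot commit from state {self.state}")
        self.state = "committed"

    def abort(self) -> None:
        # Phase 2: discard whatever was staged in phase 1.
        self.state = "aborted"


class Coordinator:
    """Hypothetical coordinator: collects votes, then asks participants to act."""

    def __init__(self, participants):
        self.participants = list(participants)

    def run(self) -> bool:
        # Phase 1: ask every participant to prepare and collect the votes.
        votes = [p.prepare() for p in self.participants]
        decision = all(v is Vote.YES for v in votes)
        # Phase 2: ask each participant to commit or abort its own work.
        for p in self.participants:
            if decision:
                p.commit()
            else:
                p.abort()
        return decision
```

A single `NO` vote in phase 1 makes the coordinator ask every participant to abort, which is the all-or-nothing property the term 2PC usually implies.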