Hi Flink devs, I’d like to officially start a discussion for FLIP-319: Integrating with Kafka’s proper support for 2PC participation (KIP-939) [1].
This is the “sister” joint FLIP for KIP-939 [2] [3]. It has been a long-standing issue that Flink’s Kafka connector doesn’t work fully correctly under exactly-once mode due to lack of distributed transaction support in the Kafka transaction protocol. This has led to subpar hacks in the connector such as Java reflections to workaround the protocol's limitations (which causes a bunch of problems on its own, e.g. long recovery times for the connector), while still having corner case scenarios that can lead to data loss. This joint effort with the Kafka community attempts to address this so that the Flink Kafka connector can finally work against public Kafka APIs, which should result in a much more robust integration between the two systems, and for Flink developers, easier maintainability of the code. Obviously, actually implementing this FLIP relies on the joint KIP being implemented and released first. Nevertheless, I'd like to start the discussion for the design as early as possible so we can benefit from the new Kafka changes as soon as it is available. Looking forward to feedback and comments on the proposal! Thanks, Gordon [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255071710 [2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-939%3A+Support+Participation+in+2PC [3] https://lists.apache.org/thread/wbs9sqs3z1tdm7ptw5j4o9osmx9s41nf