Hello Community,
I have two questions about implementing a custom Flink sink with EXACTLY_ONCE
semantics.
1. I have an SDK that publishes messages over HTTP (backed by Oracle
Streaming Service, which is very similar to Kafka). This will be the sink of
my Flink application. Is it possible to use this SDK as a sink with
EXACTLY_ONCE semantics, given that HTTP is stateless? If so, what would need
to be added to the SDK to support EXACTLY_ONCE?
2. If the answer to question 1 is yes, I will need to implement a custom sink.
Which option should I use?
* Option 1:
TwoPhaseCommitSinkFunction<https://nightlies.apache.org/flink/flink-docs-release-1.4/api/java/org/apache/flink/streaming/api/functions/sink/TwoPhaseCommitSinkFunction.html>
* Option 2:
StatefulSink<https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/connector/sink2/StatefulSink.java>
+
TwoPhaseCommittingSink<https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/connector/sink2/TwoPhaseCommittingSink.java>
The legacy FlinkKafkaProducer seems to use Option 1, but it will be removed
from Flink in the future. The new
KafkaSink<https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#kafka-sink>
seems to use Option 2. Based on the comment in the code, it seems Option 1 is
recommended. Which one should I use? Please let me know if I am missing
anything, or if there is a better solution for my case.
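
To make question 1 more concrete, here is a rough sketch of what I imagine the
SDK and service would need: a way to stage messages under a transaction id, and
an idempotent commit for that id that is safe to retry over stateless HTTP. All
names here (FakeStreamService, TwoPhaseSketch, stage) are hypothetical; the
service is an in-memory stand-in, not the real SDK, and the sink class only
mirrors the beginTransaction / invoke / preCommit / commit / abort steps of
TwoPhaseCommitSinkFunction without extending any Flink class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory stand-in for the service behind the HTTP SDK. */
class FakeStreamService {
    private final Map<String, List<String>> staged = new HashMap<>();
    private final Set<String> committedIds = new HashSet<>();
    final List<String> visible = new ArrayList<>();

    // Stage a whole batch under a transaction id. Nothing is visible to readers yet.
    // Using put() makes a retried stage request overwrite instead of duplicating.
    void stage(String txnId, List<String> messages) {
        if (committedIds.contains(txnId)) {
            return; // late retry for an already committed transaction
        }
        staged.put(txnId, new ArrayList<>(messages));
    }

    // Idempotent commit: repeating it for the same id publishes nothing twice.
    void commit(String txnId) {
        if (!committedIds.add(txnId)) {
            return;
        }
        List<String> messages = staged.remove(txnId);
        if (messages != null) {
            visible.addAll(messages);
        }
    }

    // Drop staged messages of a transaction that will never be committed.
    void abort(String txnId) {
        staged.remove(txnId);
    }
}

/** Mirrors the steps of a two-phase-commit sink; not a real Flink sink. */
class TwoPhaseSketch {
    private final FakeStreamService service;
    private final List<String> buffer = new ArrayList<>();
    private String txnId;
    private int counter = 0;

    TwoPhaseSketch(FakeStreamService service) {
        this.service = service;
    }

    // Start a new transaction with a fresh id and an empty buffer.
    String beginTransaction() {
        buffer.clear();
        txnId = "txn-" + (counter++);
        return txnId;
    }

    // Buffer a record locally; no HTTP call is made per record in this sketch.
    void invoke(String message) {
        buffer.add(message);
    }

    // Phase 1 (at the checkpoint barrier): push the buffered batch to the service.
    void preCommit() {
        service.stage(txnId, buffer);
    }

    // Phase 2 (after the checkpoint completes): may be retried after recovery.
    void commit(String id) {
        service.commit(id);
    }

    void abort(String id) {
        service.abort(id);
    }
}
```

In this sketch, the transaction id is the only state that has to survive a
failure (it would be stored in the checkpoint), so the HTTP calls themselves can
stay stateless. Whether the real service can offer a staging area and an
idempotent commit like this is exactly what I am unsure about.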
Thanks,
Fuyao