[ https://issues.apache.org/jira/browse/FLINK-21191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jark Wu closed FLINK-21191.
---------------------------
    Resolution: Fixed

Fixed in master: ec9b0c5b60290697769415eb3e1b1ed2052460ac

> Support reducing buffer for upsert-kafka sink
> ---------------------------------------------
>
>                 Key: FLINK-21191
>                 URL: https://issues.apache.org/jira/browse/FLINK-21191
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / Kafka, Table SQL / Ecosystem
>            Reporter: Jark Wu
>            Assignee: Shengkai Fang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
>
> Currently, if a job is agg -> filter -> upsert-kafka, the upsert-kafka sink
> receives a -U and a +U for every update instead of only a +U. This produces
> a lot of tombstone messages in Kafka. It is not just about the unnecessary
> data volume in Kafka: users may have processes that trigger side effects
> when a tombstone record is ingested from a Kafka topic.
> A simple solution would be to add a reducing buffer to the upsert-kafka sink
> that reduces the -U and +U before emitting them to the underlying sink. This
> should be very similar to the implementation of the upsert JDBC sink.
> We could even extract the reducing logic out of the JDBC connector so that
> it can be reused by other connectors.
> This could be something like a `BufferedUpsertSinkFunction` that holds a
> reducing buffer and flushes to the underlying SinkFunction on checkpoint or
> buffer timeout. We can put it in `flink-connector-base`, which can be shared
> by builtin connectors and custom connectors.


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
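The reducing behavior described above can be sketched roughly as follows. This is a simplified, self-contained model of the idea, not the actual `BufferedUpsertSinkFunction` or any real Flink API: the `RowKind`, `Change`, and `ReducingBuffer` names here are illustrative, and the buffer is keyed on a plain string upsert key. The point it demonstrates is that when a -U (UPDATE_BEFORE) and a +U (UPDATE_AFTER) for the same key land in the same buffer window, only the final +U is emitted, so no tombstone reaches Kafka.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReducingBufferSketch {

    // Mirrors the changelog row kinds in the issue description (-U / +U etc.);
    // illustrative, not Flink's actual org.apache.flink.types.RowKind.
    enum RowKind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    // A single changelog record: kind, upsert key, and payload.
    static final class Change {
        final RowKind kind;
        final String key;
        final String value;

        Change(RowKind kind, String key, String value) {
            this.kind = kind;
            this.key = key;
            this.value = value;
        }
    }

    // Hypothetical reducing buffer: keeps only the latest change per upsert key.
    static final class ReducingBuffer {
        private final Map<String, Change> buffer = new LinkedHashMap<>();

        void add(Change c) {
            // A later +U for a key overwrites the preceding -U for the same key,
            // so the -U (which would become a Kafka tombstone) is reduced away.
            buffer.put(c.key, c);
        }

        // In the real sink this would run on checkpoint or buffer timeout;
        // here it simply returns the reduced changes and clears the buffer.
        List<Change> flush() {
            List<Change> out = new ArrayList<>(buffer.values());
            buffer.clear();
            return out;
        }
    }

    public static void main(String[] args) {
        ReducingBuffer buf = new ReducingBuffer();
        // One update arrives as a retraction pair: -U(old) then +U(new).
        buf.add(new Change(RowKind.UPDATE_BEFORE, "k1", "old"));
        buf.add(new Change(RowKind.UPDATE_AFTER, "k1", "new"));

        List<Change> reduced = buf.flush();
        // Only the final UPDATE_AFTER survives the reduction.
        System.out.println(reduced.size() + " " + reduced.get(0).kind + " " + reduced.get(0).value);
    }
}
```

A standalone DELETE (no later +U for the same key in the window) would still survive the flush and correctly become a tombstone; the reduction only collapses intermediate states within one buffer window.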