C0urante commented on PR #14158: URL: https://github.com/apache/kafka/pull/14158#issuecomment-1668229509
I've been scratching my head over this one for a bit. On one hand, it's nice to allow heavily-filtered source connectors to record progress (and this was a suggestion I made to address part of the motivation for [KIP-910](https://cwiki.apache.org/confluence/display/KAFKA/KIP-910%3A+Update+Source+offsets+for+Source+Connectors+without+producing+records)) so that there are fewer duplicates if one is restarted. However, the current behavior when exactly-once support is disabled also has some benefits. Right now it's possible to write an SMT that does batching of many source records into a single Kafka record.

I'm also curious--what's the behavior with sink connectors when records are filtered via SMT? Does this vary depending on whether the connector's task class overrides the `SinkTask::preCommit` method?

@vamossagar12 Ultimately I agree that some work probably has to be done around this logic, and thanks for identifying the discrepancy. I'm just not certain that the decision I made to commit offsets for dropped records when working on exactly-once source connectors was the correct one, and think we should at least consider reverting that change in behavior rather than updating other, longer-existing modes to align with it.
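To make the batching concern above concrete, here is a minimal, dependency-free sketch. It is a toy model and not the Connect runtime or its API: `SourceRec`, `BatchingTransform`, and `committableOffset` are hypothetical names, and the two commit policies are simplified stand-ins for "skip offsets of dropped records" versus "commit offsets of dropped records". The only Connect behavior it relies on is that a transformation returning `null` drops the record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; none of these types are Kafka Connect classes.
public class BatchingSmtSketch {

    /** Stand-in for a source record: a payload plus the source offset it was read at. */
    static final class SourceRec {
        final long offset;
        final String value;

        SourceRec(long offset, String value) {
            this.offset = offset;
            this.value = value;
        }
    }

    /** Stand-in for a batching SMT: returns null (a drop) until batchSize records are buffered. */
    static final class BatchingTransform {
        private final int batchSize;
        private final List<String> buffer = new ArrayList<>();

        BatchingTransform(int batchSize) {
            this.batchSize = batchSize;
        }

        SourceRec apply(SourceRec rec) {
            buffer.add(rec.value);
            if (buffer.size() < batchSize) {
                return null; // to the framework this looks like a filtered record
            }
            // Emit one combined record carrying the offset of the last buffered input.
            SourceRec batched = new SourceRec(rec.offset, String.join(",", buffer));
            buffer.clear();
            return batched;
        }
    }

    /**
     * Runs the records through the transform and returns the highest offset that
     * would be eligible for commit under the given policy, or -1 if there is none.
     */
    static long committableOffset(List<SourceRec> records, int batchSize, boolean commitDropped) {
        BatchingTransform smt = new BatchingTransform(batchSize);
        long committable = -1L;
        for (SourceRec rec : records) {
            SourceRec out = smt.apply(rec);
            if (out != null || commitDropped) {
                committable = rec.offset;
            }
        }
        return committable;
    }

    public static void main(String[] args) {
        List<SourceRec> records = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            records.add(new SourceRec(i, "v" + i));
        }
        // Batch size 3: offsets 0-2 are emitted as one record, offsets 3-4 stay buffered.
        System.out.println("skip dropped:   " + committableOffset(records, 3, false));
        System.out.println("commit dropped: " + committableOffset(records, 3, true));
    }
}
```

In this model, skipping dropped records leaves the committable offset at the last emitted batch, so records still sitting in the buffer would be re-read after a restart. Committing dropped records moves the offset past records that exist only in the transform's memory, which is the kind of data loss a batching SMT would be exposed to under that policy.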