[ https://issues.apache.org/jira/browse/KAFKA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323648#comment-15323648 ]
Ewen Cheslack-Postava commented on KAFKA-3821:
----------------------------------------------

I was thinking about this a bit more, along with the fact that having to shoehorn extra data into offsets is not ideal. Maybe a better way to expose this would be to provide a separate {{data}} field or something like that, which is also key-value based (similar to source partition/source offset) and has an equally flexible data structure. We could manage the two types of data together and they'd have the same basic semantics, but this would let you decouple state/data changes that only have to happen once in a while from the offsets, which really do need to be associated with every message.

We could possibly then use a subclass like {{DataOnlySourceRecord}} which doesn't trigger any data to be written, but still applies the data changes. (I think it might be nice to introduce a parent interface for both instead of still calling it {{SourceRecord}}, but I'm not sure we could do that in a compatible way.)

[~rhauch] Thoughts? Would this be a better fit for what you're trying to accomplish? Changing the output of {{poll()}} to contain more than just records to be written to Kafka is a bit weird, but it might make sense for these use cases, and it provides framework support for having those changes get "committed" asynchronously but still at a safe point.

> Allow Kafka Connect source tasks to produce offset without writing to topics
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-3821
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3821
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>    Affects Versions: 0.9.0.1
>            Reporter: Randall Hauch
>            Assignee: Ewen Cheslack-Postava
>
> Provide a way for a {{SourceTask}} implementation to record a new offset for
> a given partition without necessarily writing a source record to a topic.
> Consider a connector task that uses the same offset when producing an unknown
> number of {{SourceRecord}} objects (e.g., it is taking a snapshot of a
> database). Once the task completes those records, the connector wants to
> update the offsets (e.g., the snapshot is complete) but has no more records
> to be written to a topic. With this change, the task could simply supply an
> updated offset.
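
To make the snapshot scenario in the description concrete, here is a rough, self-contained sketch of what an "offset-only" record could look like. Nothing in it is Kafka Connect API: {{OffsetOnlySketch}}, {{Record}}, {{Framework}}, {{offsetOnly()}}, {{entry()}} and the {{snapshotComplete}} key are all hypothetical names made up for illustration, and the only thing borrowed from Connect is the idea that partitions and offsets are key-value maps. It shows the intended behaviour under those assumptions, not how the framework would actually implement it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are Kafka Connect classes.
final class OffsetOnlySketch {

    // Stand-in for a source record: a partition, an offset, and an optional value.
    static final class Record {
        final Map<String, Object> partition;
        final Map<String, Object> offset;
        final String value; // null marks an offset-only record (an assumption of this sketch)

        Record(Map<String, Object> partition, Map<String, Object> offset, String value) {
            this.partition = partition;
            this.offset = offset;
            this.value = value;
        }

        // Carries a new offset for the partition but nothing to write to a topic.
        static Record offsetOnly(Map<String, Object> partition, Map<String, Object> offset) {
            return new Record(partition, offset, null);
        }

        boolean isOffsetOnly() {
            return value == null;
        }
    }

    // Stand-in for the framework loop that consumes whatever poll() returned.
    static final class Framework {
        final List<String> written = new ArrayList<>();
        final Map<Map<String, Object>, Map<String, Object>> offsets = new HashMap<>();

        void process(List<Record> batch) {
            for (Record r : batch) {
                if (!r.isOffsetOnly()) {
                    written.add(r.value); // only real records reach the "topic"
                }
                offsets.put(r.partition, r.offset); // every record updates the offset
            }
        }
    }

    // Small helper to build a one-entry key-value map.
    static Map<String, Object> entry(String key, Object value) {
        return Collections.<String, Object>singletonMap(key, value);
    }

    public static void main(String[] args) {
        Map<String, Object> partition = entry("db", "inventory");
        Map<String, Object> inSnapshot = entry("snapshotComplete", Boolean.FALSE);
        Map<String, Object> snapshotDone = entry("snapshotComplete", Boolean.TRUE);

        Framework framework = new Framework();
        framework.process(Arrays.asList(
                new Record(partition, inSnapshot, "row-1"),
                new Record(partition, inSnapshot, "row-2"),
                Record.offsetOnly(partition, snapshotDone)));

        System.out.println("written=" + framework.written);
        System.out.println("offset=" + framework.offsets.get(partition));
    }
}
```

The two snapshot rows share one offset and are written as usual; the final offset-only record moves the stored offset to "snapshot complete" without producing anything. Using a null value as the marker is just the simplest thing for a sketch; a dedicated subclass, as suggested in the comment above, would be another way to express the same distinction.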