I’m creating a custom Kafka Connect source connector, and I’m running into a 
situation for which Kafka Connect doesn’t seem to provide a solution out of the 
box. I thought I’d first post to the users list in case I’m just missing a 
feature that’s already there.

My connector’s SourceTask implementation is reading a relational database 
transaction log. That log contains schema changes and row changes, and the row 
changes include a reference to the table and the row values. Thus, as the task 
processes the log, it has to use any schema changes in the log to adjust how it 
converts subsequent row changes into Kafka source records. Should the task stop 
and be restarted elsewhere, it can continue reading the transaction log where 
it left off only if that new task instance can recover the schema state 
accumulated by an earlier task.
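
To make the requirement concrete, here is a minimal sketch of the kind of state I mean. Everything in it is hypothetical: SchemaHistory, append, and recover are names I made up for illustration and are not part of the Kafka Connect API, and the in-memory list only stands in for whatever durable store (possibly a Kafka topic) would actually hold the history. It assumes schema changes are appended in increasing offset order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not part of the Kafka Connect API. */
class SchemaHistory {

    /** One schema change, tagged with the log offset at which it took effect. */
    private static final class Change {
        final long offset;
        final String table;
        final List<String> columns;

        Change(long offset, String table, List<String> columns) {
            this.offset = offset;
            this.table = table;
            this.columns = List.copyOf(columns);
        }
    }

    // Stand-in for a durable, ordered store of schema changes.
    private final List<Change> changes = new ArrayList<>();

    /** Appends a schema change; offsets are assumed to be non-decreasing. */
    void append(long offset, String table, List<String> columns) {
        changes.add(new Change(offset, table, columns));
    }

    /**
     * Rebuilds the table-to-columns mapping as of the last processed offset
     * by replaying every change recorded at or before that offset.
     */
    Map<String, List<String>> recover(long lastProcessedOffset) {
        Map<String, List<String>> state = new HashMap<>();
        for (Change c : changes) {
            if (c.offset > lastProcessedOffset) {
                break; // later changes have not been seen by the task yet
            }
            state.put(c.table, c.columns);
        }
        return state;
    }
}
```

A restarted task would call recover with the offset it resumes from and then use the returned mapping to interpret the row changes that follow.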

I can certainly build a custom solution to store this state somewhere, but it 
seems like other connectors might benefit if Kafka Connect provided something 
out of the box. This accumulated state, together with the history of the 
source offsets at which it changes, also seems like a perfect fit for storing 
in a Kafka topic.

Does Kafka Connect already have a mechanism for tasks to store and recover 
arbitrary state? If not, then is there interest in adding this capability to 
Kafka Connect? (If there is interest, then perhaps the dev list is a better 
venue.)

Best regards,

Randall Hauch
