Hello,
We have a use case where we use a KafkaConsumer in the SourceTask of a
source connector to poll messages from AWS MSK.

If we produce data to the source topic immediately after the connector
enters the RUNNING state, we sometimes notice that the KafkaConsumer
misses some of the records written to the topic. (Note that
SourceTask#start subscribes to the topic, and SourceTask#poll makes the
actual KafkaConsumer.poll() call.)

My hypothesis is that the KafkaConsumer takes some time to find its
starting offset for the topic and, because we have auto.offset.reset set
to latest, records produced before that offset is resolved are skipped.
However, I am unsure what observability I can use to confirm this (I
currently have the log level set to ERROR). Can it happen that the
connector is in the RUNNING state while its poll method, which basically
calls KafkaConsumer.poll(), is still awaiting offset assignment? Is there
a more efficient way to verify this?
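
One thing I am considering, as a sketch only: raising the log level just
for the consumer internals in the Connect worker's logging configuration,
since I believe the group-join, partition-assignment and offset-reset
messages are logged below ERROR and are therefore hidden at the moment.
This assumes the worker uses a log4j 1.x style properties file (for
example connect-log4j.properties); the actual file name and syntax depend
on the Connect version and on how the worker is deployed.

```properties
# Assumption: log4j 1.x properties syntax; adapt if the worker uses log4j2.
# Raises only the Kafka consumer internals; all other loggers stay as configured.
# INFO may be enough to see assignment and offset-reset messages; DEBUG is more verbose.
log4j.logger.org.apache.kafka.clients.consumer.internals=INFO
```

Comparing the timestamps of those messages with the time the first
records were produced should show whether the records arrived before the
consumer had reset its position to latest.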
