Hi,

 

I had a look at the Connect Source Worker code and have two questions:
When a Source Task commits offsets, does it perform compaction / optimisation 
before sending off? E.g.  I read from 1 source partition, and I read 1000 
messages. Will the offset flush send 1000 messages to the offset storage, or 
just 1 (the last one)?
I don’t really understand why WorkerSourceTask is trying to flush outstanding 
messages before committing the offsets? (cf 
https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L328
 ).  
I would believe that committing the offsets would just commit the offsets for 
the messages we know for sure have been flushed at the moment the commit is 
requested. That would remove one massive timeout from happening if the source 
task pulls a lot of message and the producer is overwhelmed / can’t complete 
the message flush in the 5 seconds of timeout.  
 

Thanks a lot for the responses. I may open JIRAs based on the answers of the 
questions, if that would help bring some performance improvements. 

 

Stephane

Reply via email to