emilnkrastev opened a new pull request #11818:
URL: https://github.com/apache/kafka/pull/11818


   This PR addresses the issue described in 
[KAFKA-12558](https://issues.apache.org/jira/browse/KAFKA-12558).
   
   Additionally, the PR makes the max outstanding syncs in 
MirrorSourceTask configurable, because the limit is currently 
[hardcoded](https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorSourceTask.java#L53).
   Many offset sync messages are lost during a burst of messages in the 
source cluster, or when MirrorMaker has a lot to catch up on (first run, or 
after being inactive for a while). In such a scenario, partitions without 
regular activity take a long time to get their offsets synced in the 
destination cluster, even at the maximum parallelism of 1 task per partition.
   
   The PR tries to mitigate the issue by providing a way to change the maximum 
number of concurrent offset syncs, so that fewer offset syncs are lost.
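
   To illustrate why a fixed cap loses syncs during a burst, here is a minimal 
sketch, assuming the outstanding syncs are bounded by a semaphore and that a 
sync is dropped when no permit is available. The class `BoundedOffsetSyncs` 
and its methods are hypothetical names for illustration only; this is not the 
actual MirrorSourceTask code.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; not the MirrorSourceTask implementation. */
public class BoundedOffsetSyncs {
    // One permit per offset sync that may be in flight at the same time.
    private final Semaphore outstanding;
    // Number of syncs skipped because the limit was already reached.
    private final AtomicLong dropped = new AtomicLong();

    public BoundedOffsetSyncs(int maxOutstandingSyncs) {
        this.outstanding = new Semaphore(maxOutstandingSyncs);
    }

    /** Returns true if the sync was admitted, false if it was dropped. */
    public boolean trySend() {
        if (!outstanding.tryAcquire()) {
            dropped.incrementAndGet();
            return false;
        }
        return true;
    }

    /** To be called when an admitted sync completes, freeing its permit. */
    public void onCompleted() {
        outstanding.release();
    }

    public long dropped() {
        return dropped.get();
    }

    public static void main(String[] args) {
        // Assumed scenario: 25 syncs arrive before any of them completes.
        BoundedOffsetSyncs syncs = new BoundedOffsetSyncs(10);
        int admitted = 0;
        for (int i = 0; i < 25; i++) {
            if (syncs.trySend()) {
                admitted++;
            }
        }
        // Prints "admitted=10 dropped=15"
        System.out.println("admitted=" + admitted + " dropped=" + syncs.dropped());
    }
}
```

   With a limit of 10, any sync beyond the tenth in-flight one is skipped until 
an earlier one completes, so a larger configurable limit should reduce the 
number of skipped syncs in a burst.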
   
   Here are my steps to reproduce the offset syncs issue caused by the max 
outstanding syncs being limited to 10:
   1. Source topic with 12 partitions, 1400 messages and minimal activity. 
Messages are produced on a daily basis.
   2. Run the MirrorMaker2 process within the destination cluster network with 
the offset syncs topic location set to target and 5 tasks.
   3. 372 offset sync messages arrive in the destination cluster's offset syncs 
topic.
   4. 9 out of 12 partitions are not synced correctly in the destination cluster.
   5. It takes hours or more for new messages to arrive in the source Kafka 
cluster and trigger a sync of the correct offsets.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org