[ 
https://issues.apache.org/jira/browse/KAFKA-14666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703339#comment-17703339
 ] 

Greg Harris edited comment on KAFKA-14666 at 3/21/23 7:08 PM:
--------------------------------------------------------------

I proposed a tactical fix for this in 
[https://github.com/apache/kafka/pull/13429], which provides variable-accuracy 
translation between the most recent restart of MM2 and the end of the 
replicated topic, similar to the behavior before KAFKA-12468.

This uses strategy (2) from above, but is limited to syncs read during a 
single task lifetime, which gives monotonic translation without re-reading 
the checkpoint topic.

A separate improvement could allow translation of offsets from before the 
latest restart of MM2, or increase the accuracy of the translation with new 
configurations.
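
To illustrate the idea of translating from syncs held in memory for one task 
lifetime, here is a minimal Python sketch. It is not the code in the PR: the 
class and method names are hypothetical, it keeps every sync rather than a 
variable-accuracy subset, and the "one past the sync" rule for offsets that 
fall between two syncs is an assumption chosen to stay conservative.

```python
import bisect
from typing import List, Optional


class InMemorySyncStore:
    """Hypothetical store of (upstream, downstream) syncs seen by one task."""

    def __init__(self) -> None:
        self._upstream: List[int] = []
        self._downstream: List[int] = []

    def record_sync(self, upstream: int, downstream: int) -> None:
        # Assumption: syncs arrive in increasing upstream order.
        # Anything out of order is ignored so the lists stay sorted.
        if self._upstream and upstream <= self._upstream[-1]:
            return
        self._upstream.append(upstream)
        self._downstream.append(downstream)

    def translate(self, group_offset: int) -> Optional[int]:
        # Newest sync at or before the consumer group's upstream offset.
        i = bisect.bisect_right(self._upstream, group_offset) - 1
        if i < 0:
            # Older than any sync read in this task lifetime: not translatable.
            return None
        if self._upstream[i] == group_offset:
            return self._downstream[i]
        # Between syncs the exact position is unknown, so move only one
        # record past the sync (assumed conservative lower bound).
        return self._downstream[i] + 1
```

With syncs (100, 40), (200, 140) and (300, 240) recorded, a group at upstream 
offset 250 translates to 141, while a group at offset 50 gets no translation 
because it predates every sync the task has read.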



> MM2 should translate consumer group offsets behind replication flow
> -------------------------------------------------------------------
>
>                 Key: KAFKA-14666
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14666
>             Project: Kafka
>          Issue Type: Improvement
>          Components: mirrormaker
>    Affects Versions: 3.5.0
>            Reporter: Greg Harris
>            Assignee: Greg Harris
>            Priority: Major
>
> MirrorMaker2 includes an offset translation feature which can translate the 
> offsets for an upstream consumer group to a corresponding downstream consumer 
> group. It does this by keeping a topic of offset-syncs to correlate upstream 
> and downstream offsets, and translates any source offsets which are ahead of 
> the replication flow.
> However, if a replication flow is closer to the end of a topic than the 
> consumer group, then the offset translation feature will refuse to translate 
> the offset for correctness reasons. This is because the MirrorCheckpointTask 
> only keeps the latest offset correlation between source and target, so it 
> does not have sufficient information to translate older offsets.
> The workarounds for this issue are to:
> 1. Pause the replication flow occasionally to allow the source to get ahead 
> of MM2
> 2. Increase offset.lag.max to delay offset syncs, widening the window in 
> which translation can happen. With the fix for KAFKA-12468, this will also 
> increase the lag of applications that are ahead of the replication flow, so 
> this is a tradeoff.
> Instead, the MirrorCheckpointTask should provide correct and best-effort 
> translation for consumer groups behind the replication flow by keeping 
> additional state, or re-reading the offset-syncs topic. This should be a 
> substantial improvement for use-cases where applications have a higher 
> latency to commit than the replication flow, or where applications are 
> reading from the earliest offset.



