geric created KAFKA-19607:
-----------------------------
Summary: MirrorMaker2 Offset Replication Issue
Key: KAFKA-19607
URL: https://issues.apache.org/jira/browse/KAFKA-19607
Project: Kafka
Issue Type: Bug
Components: mirrormaker
Affects Versions: 4.0.0
Reporter: geric
I am using *Apache Kafka 4.0* with *MirrorMaker 2* to replicate from the primary cluster
({*}clusterA{*}) to the secondary cluster ({*}clusterB{*}).
The secondary cluster will not have any producers or consumers until a disaster
recovery event occurs, at which point all producers and consumers will switch
to it.
*Setup:*
* Dedicated standalone MirrorMaker 2 node
* {{IdentityReplicationPolicy}} (no topic renaming)
* No clients connected to secondary cluster under normal operation
*MirrorMaker 2 config:*
{{# Cluster aliases
clusters = clusterA, clusterB
# Bootstrap servers
clusterA.bootstrap.servers = serverA-kafka-1:9092
clusterB.bootstrap.servers = serverB-kafka-1:9092
# Replication policy
replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy
# Offset/Checkpoint sync
emit.checkpoints.enabled=true
emit.checkpoints.interval.seconds=5
sync.group.offsets.enabled=true
sync.group.offsets.interval.seconds=5
offset.lag.max=10
refresh.topics.interval.seconds=5}}
----
h3. Test results:
# *Produce 300 messages while MirrorMaker is running*
*Expected:* Topic offset synced within a minute
*Result:* ✅ Passed
# *Consume 100 messages while MirrorMaker is running, then terminate the
consumer*
*Expected:* Consumer offset synced
*Result:* ❌ Failed — offset is not synced to clusterB
# *Restart MirrorMaker after test #2*
*Expected:* Consumer offset synced
*Result:* ✅ Passed
# *Repeat test #2: consume 100 messages while MirrorMaker is running, then
terminate the consumer*
*Expected:* Consumer offset synced
*Result:* ❌ Failed — offset is not synced to clusterB
# *Restart MirrorMaker after test #4*
*Expected:* Consumer offset synced
*Result:* ❌ Failed — offset is not synced to clusterB
# *Consume messages but keep the consumer running*
*Expected:* Offset synced
*Result:* ✅ Passed
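One way to make the pass/fail checks above repeatable is to compare the group's committed offsets on both clusters after each step. The following is a minimal Python sketch and was not part of the test procedure itself. It assumes the per-partition committed offsets have already been collected from each cluster (for example with {{kafka-consumer-groups.sh --describe}}), and that source and target offsets are directly comparable, which may not hold in other setups. The function name, the topic name, and the numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple

TopicPartition = Tuple[str, int]


def unsynced_partitions(
    source: Dict[TopicPartition, int],
    target: Dict[TopicPartition, Optional[int]],
    tolerance: int = 0,
) -> Dict[TopicPartition, str]:
    """Return partitions whose committed offset on the target is missing or
    more than `tolerance` behind the source.

    Assumes offsets are directly comparable on both clusters
    (an assumption of this sketch, not a MirrorMaker 2 guarantee).
    """
    problems = {}
    for tp, src_offset in sorted(source.items()):
        tgt_offset = target.get(tp)
        if tgt_offset is None:
            problems[tp] = f"no committed offset on target (source={src_offset})"
        elif src_offset - tgt_offset > tolerance:
            problems[tp] = f"target={tgt_offset} is behind source={src_offset}"
    return problems


# Illustrative values only (hypothetical topic, not measured data):
# the group committed offset 100 on the source, nothing on the target.
source_offsets = {("test-topic", 0): 100}
target_offsets = {}
report = unsynced_partitions(source_offsets, target_offsets)
for tp, reason in report.items():
    print(tp, reason)
```

The {{tolerance}} argument leaves room for a translated offset that trails the source by a few records; whether any such gap is acceptable depends on the setup.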
----
h3. Problem:
Consumer offsets appear to sync only in these cases:
# When MirrorMaker is restarted and the consumer offset does *not* already
exist in the secondary cluster (initial sync), or
# When the consumer is still connected at the time of sync, *or* when the
consumer has reached the end of the topic (i.e., consumed all available
messages).
However, if the consumer exits immediately after consuming some messages (but
{*}before reaching the end of the topic{*}), the committed offset is *never
synced* to the target cluster.
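The observations above can be restated as a small predicate and checked against tests #2 to #6. This is only a Python sketch of the hypothesis, not MirrorMaker 2's actual logic. The function and parameter names are hypothetical, and the flag values for each test are read from the test descriptions (test #3 is taken to be an initial sync, test #5 is not, and test #6 is assumed not to have reached the end of the topic).

```python
def expected_to_sync(mm2_restarted: bool, offset_exists_on_target: bool,
                     consumer_connected: bool, at_log_end: bool) -> bool:
    """Hypothesised rule from the observations; not MirrorMaker 2's real logic."""
    initial_sync = mm2_restarted and not offset_exists_on_target
    return initial_sync or consumer_connected or at_log_end


# (test number, flags as read from the test descriptions, observed "synced?")
observations = [
    (2, dict(mm2_restarted=False, offset_exists_on_target=False,
             consumer_connected=False, at_log_end=False), False),
    (3, dict(mm2_restarted=True, offset_exists_on_target=False,
             consumer_connected=False, at_log_end=False), True),
    (4, dict(mm2_restarted=False, offset_exists_on_target=True,
             consumer_connected=False, at_log_end=False), False),
    (5, dict(mm2_restarted=True, offset_exists_on_target=True,
             consumer_connected=False, at_log_end=False), False),
    # Test #6: at_log_end is an assumption; the rule holds either way
    # because the consumer stays connected.
    (6, dict(mm2_restarted=False, offset_exists_on_target=True,
             consumer_connected=True, at_log_end=False), True),
]

for number, flags, observed in observations:
    assert expected_to_sync(**flags) == observed, f"test #{number} contradicts the rule"
print("hypothesis is consistent with tests #2 to #6")
```

Being consistent with five observations does not confirm the rule. It only shows that none of the recorded results contradicts it.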
----
h3. Additional Context / Related Issues
This problem seems related to an open discussion on the Apache Kafka mailing
list:
*MirrorCheckpointConnector does not replicate final batch of offsets*
[https://lists.apache.org/thread/dxn9jyotl00f7ov541299cd8tlcl1z00]