Hi all,

Recently, we've been experimenting with using MM2 to mirror topics that
were populated by transactional producers. We've noticed that MM2
replicates records but not transaction markers, causing certain offsets to
appear in the source topic but not in the destination topic. The same
behavior can be seen when using Filter SMTs, or when replicating topics
which have undergone compaction: both leave the target topic with fewer,
more densely packed offsets than the source topic.

This has the following negative effects with offset translation:
P1. When starting replication on an existing topic with existing consumer
groups, offsets are translated beyond the end of the topic, leading to
"negative lag" for the downstream consumer group
P2. When in a "negative lag" situation, and a consumer fail-over from
source to target is triggered, downstream consumption will stall until the
downstream offsets exceed the "negative lag" offsets.
P3. When failing over from source to target, certain records may have been
ahead of the upstream consumer group and behind the downstream consumer
group, leading to records not being delivered at least once.
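
To make P1 concrete, here is a minimal Python sketch of how "negative lag"
can arise. It is a toy model, not MM2 code: `translate_naive` is a
hypothetical name, and it assumes the pre-fix translation simply adds the
upstream distance from the offset sync to the downstream sync offset. The
topic layout (10 transactions of one record plus one commit marker) is made
up for illustration.

```python
from typing import Tuple

# An offset sync pairs an upstream offset with its downstream equivalent.
OffsetSync = Tuple[int, int]  # (upstream_offset, downstream_offset)


def translate_naive(sync: OffsetSync, upstream_group_offset: int) -> int:
    """Assumed pre-fix behavior: carry the upstream distance from the sync
    over to the downstream topic unchanged."""
    sync_up, sync_down = sync
    return sync_down + (upstream_group_offset - sync_up)


# Source topic: 10 transactions, each one record followed by one commit
# marker, so the source occupies offsets 0..19 (end offset 20).
source_end_offset = 20
# Markers are not mirrored, so the target only holds the 10 records.
target_end_offset = 10

sync = (0, 0)                              # only sync available at start
upstream_group_offset = source_end_offset  # upstream group is caught up

translated = translate_naive(sync, upstream_group_offset)
downstream_lag = target_end_offset - translated  # negative: past the end
```

In this model the translated offset lands at 20 while the target topic ends
at 10, so the downstream group shows a lag of -10 and, per P2, would not
consume anything until the target topic grows past offset 20.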

We merged a solution to the above by making a change to the translation logic
in https://issues.apache.org/jira/browse/KAFKA-12468 , and settled on a
strategy to make offset translation more conservative, effectively making
it such that the MirrorCheckpointTask only emits offsets at or immediately
after the latest offset sync. This has the effect that offsets are more
correct than previously, but that did not come without costs:

P4. More offset syncs must be emitted to the offset syncs topic to enforce
the `offset.lag.max` config property, once per `offset.lag.max` records
(regression in the original PR, addressed by
https://issues.apache.org/jira/browse/KAFKA-14797)
P5. More recent offset syncs narrow the window in which translation can
take place, leading to some translated offsets becoming excessively stale.
This limitation is captured in
https://issues.apache.org/jira/browse/KAFKA-14666 .
P6. Even with the above fixes, offset translation won't be able to
translate ahead of the latest offset sync, and offsets may not converge
exactly to the end of the topic.
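
The conservative strategy and the limitation in P6 can be sketched the same
way. Again this is a toy Python model under stated assumptions, not the
MirrorCheckpointTask implementation: `translate_conservative` and
`syncs_for_plain_topic` are hypothetical names, the emitter assumes exactly
one sync every `offset_lag_max` records on a gap-free topic, and returning
`None` for offsets behind the sync is a simplification.

```python
from typing import List, Optional, Tuple

OffsetSync = Tuple[int, int]  # (upstream_offset, downstream_offset)


def translate_conservative(sync: OffsetSync,
                           upstream_group_offset: int) -> Optional[int]:
    """Only emit an offset at, or immediately after, the given sync."""
    sync_up, sync_down = sync
    if upstream_group_offset < sync_up:
        return None            # too old to translate with this sync
    if upstream_group_offset == sync_up:
        return sync_down       # exactly at the sync
    return sync_down + 1       # anywhere past the sync: one record after it


def syncs_for_plain_topic(record_count: int,
                          offset_lag_max: int) -> List[OffsetSync]:
    """Hypothetical emitter: one sync per offset_lag_max records on a topic
    without gaps, where upstream and downstream offsets are equal."""
    return [(o, o) for o in range(0, record_count, offset_lag_max)]


syncs = syncs_for_plain_topic(record_count=250, offset_lag_max=100)
latest_sync = syncs[-1]  # most recent sync the checkpoint task can use

# Upstream group is fully caught up at the end of a 250-record topic.
translated = translate_conservative(latest_sync, upstream_group_offset=250)
downstream_end_offset = 250
downstream_lag = downstream_end_offset - translated
```

Here the translated offset never runs past the end of the target topic, but
it also stops just after the latest sync, so the downstream group reports a
positive lag even though the upstream group is caught up. In this model that
residual lag stays below `offset_lag_max`.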

Fixing KAFKA-14797 appears possible without a KIP, but it is unclear
whether KAFKA-14666 requires a KIP to resolve.

To summarize:
* Released versions of Kafka have reasonable behavior for normal topics,
and correctness problems for compacted, filtered, and transactional topics.
* KAFKA-12468 fixes correctness for compacted, filtered, and transactional
topics, and regresses availability for all topics.
* KAFKA-14797 makes availability better for normal topics, but still worse
than in released versions.
* KAFKA-14666 makes availability better for all topics, but still worse
than in released versions.

Questions:
Q1. Does KAFKA-14666 require a KIP to resolve?
Q2. Is the increased likelihood of KAFKA-14666 caused by KAFKA-14797 a
regression in behavior?
Q3. Is the KAFKA-12468 correctness fix worth the general availability loss
(P6) that is bounded by offset.lag.max?
Q4. Is some or all of the above eligible for release in a patch release, or
should these fixes be contained to just a minor release?
Q5. Can we make a tactical fix for KAFKA-14666 to enable users to
work around the issue?
Q6. Do you have any alternative solutions for KAFKA-14666 that we should
consider?

I want to understand if we need to revert the correctness fix already
merged, or if we can address correctness now and availability later.

Thanks,
Greg
