Hi all,

Recently, we've been experimenting with using MM2 to mirror topics populated by transactional producers. We've noticed that MM2 replicates records but not transaction markers, causing certain offsets to appear in the source topic but not in the destination topic. The same behavior can be seen when using Filter SMTs, or when replicating topics which have undergone compaction, both of which cause the same mismatch of offsets in the target topic.
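To make the offset mismatch concrete, here is a hypothetical illustration (not MM2's actual code): a transaction commit marker is a control record that occupies an offset in the source partition, but MM2 copies only the data records, so the destination partition ends up with fewer offsets than the source.

```python
# Hypothetical illustration: the source partition contains a transaction
# marker (a control record that occupies an offset), which MM2 does not
# replicate, so downstream offsets fall behind upstream offsets.
source = [
    (0, "record-a"),
    (1, "record-b"),
    (2, "<txn commit marker>"),  # control record, occupies offset 2
    (3, "record-c"),
]

# MM2 copies only data records, which receive fresh downstream offsets.
destination = [
    (i, value)
    for i, value in enumerate(v for _, v in source if not v.startswith("<"))
]
print(destination)  # [(0, 'record-a'), (1, 'record-b'), (2, 'record-c')]
# Source offset 3 holds the same record as destination offset 2, so a
# naive 1:1 offset translation would point consumers past their true
# downstream position.
```

Filter SMTs and compaction have the same effect: some source offsets simply have no corresponding record downstream.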
This has the following negative effects on offset translation:

P1. When starting replication on an existing topic with existing consumer groups, offsets are translated beyond the end of the topic, leading to "negative lag" for the downstream consumer group.

P2. When in a "negative lag" situation and a consumer fail-over from source to target is triggered, downstream consumption will stall until the downstream offsets exceed the "negative lag" offsets.

P3. When failing over from source to target, certain records may have been ahead of the upstream consumer group but behind the downstream consumer group, so those records are skipped and at-least-once delivery is violated.

We merged a solution to the above in https://issues.apache.org/jira/browse/KAFKA-12468 by changing the translation logic, settling on a strategy that makes offset translation more conservative: the MirrorCheckpointTask only emits offsets at or immediately after the latest offset sync. Offsets are now more correct than previously, but this did not come without costs:

P4. More offset syncs must be emitted to the offset syncs topic to enforce the `offset.lag.max` config property, once per `offset.lag.max` records (a regression in the original PR, addressed by https://issues.apache.org/jira/browse/KAFKA-14797).

P5. More recent offset syncs narrow the window in which translation can take place, leading to some translated offsets becoming excessively stale. This limitation is captured in https://issues.apache.org/jira/browse/KAFKA-14666.

P6. Even with the above fixes, offset translation cannot translate ahead of the latest offset sync, so offsets may not converge exactly to the end of the topic.

Fixing KAFKA-14797 appears possible without a KIP, but it is unclear whether KAFKA-14666 requires a KIP to resolve.
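For clarity, a minimal sketch of the conservative translation strategy described above (my own simplification, not the actual MirrorCheckpointTask code): given a list of offset syncs, an upstream offset translates exactly at a sync point, and otherwise advances at most one position past the downstream offset of the latest sync at or before it, which is why translation never gets ahead of the latest sync (P6).

```python
# Hypothetical sketch of conservative offset translation. An offset sync
# is an (upstream_offset, downstream_offset) pair written by MM2.
def translate_conservative(offset_syncs, upstream_offset):
    # Consider only syncs at or before the consumer's upstream offset.
    candidates = [s for s in offset_syncs if s[0] <= upstream_offset]
    if not candidates:
        return None  # no usable sync yet: offset cannot be translated
    up, down = max(candidates)  # latest sync at or before the offset
    # Exactly at the sync point: translation is exact.
    # Past it: advance only one position; never extrapolate further,
    # since markers/filtering may have removed records in between.
    return down if upstream_offset == up else down + 1

syncs = [(0, 0), (100, 90)]  # 10 source offsets had no downstream record
print(translate_conservative(syncs, 100))  # 90 (exact at the sync)
print(translate_conservative(syncs, 150))  # 91 (conservative, not 140)
```

The cost shown in the second call is the availability loss under discussion: a consumer at upstream offset 150 is translated to 91 rather than somewhere near the downstream log end.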
To summarize:

* Released versions of Kafka have reasonable behavior for normal topics, but correctness problems for compacted, filtered, and transactional topics.
* KAFKA-12468 fixes correctness for compacted, filtered, and transactional topics, but regresses availability for all topics.
* KAFKA-14797 improves availability for normal topics, but it remains worse than in released versions.
* KAFKA-14666 improves availability for all topics, but it remains worse than in released versions.

Questions:

Q1. Does KAFKA-14666 require a KIP to resolve?
Q2. Is the increased likelihood of KAFKA-14666 caused by KAFKA-14797 a regression in behavior?
Q3. Is the KAFKA-12468 correctness fix worth the general availability loss (P6), which is bounded by offset.lag.max?
Q4. Is some or all of the above eligible for a patch release, or should these fixes be restricted to a minor release?
Q5. Can we make a tactical fix for KAFKA-14666 to let users work around the issue?
Q6. Do you have any alternative solutions for KAFKA-14666 that we should consider?

I want to understand whether we need to revert the already-merged correctness fix, or whether we can address correctness now and availability later.

Thanks,
Greg