Stanislav Kozlovski created KAFKA-7968:
------------------------------------------
Summary: Delete leader epoch cache files with old message format
versions
Key: KAFKA-7968
URL: https://issues.apache.org/jira/browse/KAFKA-7968
Project: Kafka
Issue Type: Bug
Affects Versions: 2.0.1
Reporter: Stanislav Kozlovski
Assignee: Stanislav Kozlovski
[KAFKA-7897 (Invalid use of epoch cache with old message format
versions)|https://issues.apache.org/jira/browse/KAFKA-7897] fixed a critical
bug where replica followers would incorrectly use their leader epoch cache for
truncating their logs upon becoming a follower. [The root of the
issue|https://issues.apache.org/jira/browse/KAFKA-7897?focusedCommentId=16761049&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16761049]
was that a regression in KAFKA-7415 caused the leader epoch cache to be
populated upon becoming a follower, even if the message format was too old to
support leader epochs.
KAFKA-7897 fixed that problem by not updating the leader epoch cache if the
message format does not support it. The fix was backported as far as 1.1, but
due to significant branch divergence, the patches for 2.0 and below were
simplified. As the commit message says:
Note this is a simplified fix than what was merged to trunk in #6232 since the
branches have diverged significantly. Rather than removing the epoch cache
file, we guard usage of the cache with the record version.
As a result, the same bug is hit at a different time. When the message
format is upgraded to a version that supports the leader epoch cache, brokers
start to make use of it. Because of the earlier bug, the sparsely populated
epoch cache file is still present on disk, which leads to the same large
truncations we saw in KAFKA-7897.
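To make the failure mode concrete, here is a simplified model of epoch-based truncation. It is only a sketch: the class and method names are hypothetical, the offsets are made up, and real brokers handle more cases than this. It assumes the leader answers an epoch lookup with the start offset of the next higher epoch it knows about, or with its log end offset if there is none.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model; not Kafka's actual classes.
final class SparseEpochCacheDemo {

    // Leader side: the end offset of an epoch is the start offset of the
    // next higher epoch, or the log end offset if no higher epoch exists.
    static long endOffsetFor(TreeMap<Integer, Long> leaderEpochs,
                             int requestedEpoch,
                             long leaderLogEndOffset) {
        Map.Entry<Integer, Long> next = leaderEpochs.higherEntry(requestedEpoch);
        return next == null ? leaderLogEndOffset : next.getValue();
    }

    // Follower side: truncate to the leader's answer for the latest epoch
    // found in the follower's own cache, never beyond its own log end.
    static long followerTruncationOffset(int followerLatestCachedEpoch,
                                         long followerLogEndOffset,
                                         TreeMap<Integer, Long> leaderEpochs,
                                         long leaderLogEndOffset) {
        long leaderAnswer =
            endOffsetFor(leaderEpochs, followerLatestCachedEpoch, leaderLogEndOffset);
        return Math.min(followerLogEndOffset, leaderAnswer);
    }
}
```

In this model, a follower whose stale cache ends at epoch 3 while the leader knows epochs 3, 4 and 10 (starting at offsets 100, 500 and 9000) would truncate its log to offset 500, even if its log is otherwise in sync up to offset 10000.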
The key difference is that the patches for 2.1 and trunk *deleted* the
non-empty leader epoch cache files if the log message format did not support
leader epochs.
We should update the earlier versions to do the same thing. That way, users
that have upgraded to 2.0.1 but are still using old message formats/protocol
will have their epochs cleaned up on the first roll that upgrades the
`inter.broker.protocol.version`.
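A minimal sketch of the proposed cleanup follows. It is not the actual patch. It assumes that leader epochs require record format (magic) version 2 or higher and that the cache lives in a file named `leader-epoch-checkpoint` inside the partition directory. The class, method and constant names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; names are illustrative, not Kafka's API.
final class EpochCacheCleanup {

    // Assumption: leader epochs are supported from magic value 2 onwards.
    static final int MIN_MAGIC_WITH_LEADER_EPOCH = 2;

    // Assumption: name of the epoch cache file in a partition directory.
    static final String CHECKPOINT_FILE = "leader-epoch-checkpoint";

    // Deletes the epoch cache file when the configured message format is
    // too old to support leader epochs. Returns true if a file was removed.
    static boolean deleteIfUnsupported(Path partitionDir, int magic) {
        if (magic >= MIN_MAGIC_WITH_LEADER_EPOCH) {
            return false; // format supports epochs, keep the cache
        }
        try {
            return Files.deleteIfExists(partitionDir.resolve(CHECKPOINT_FILE));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Running such a check when a log is loaded would mean that a later message format upgrade starts from an absent cache rather than from a sparsely populated one.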