GitHub user benstopford opened a pull request:
https://github.com/apache/kafka/pull/2808
KIP-101: Alter Replication Protocol to use Leader Epoch rather than High
Watermark for Truncation
This PR adds partition-level leader epochs to messages in Kafka as a
mechanism for fixing some known issues in the replication protocol. Full
details can be found here:
[KIP-101
Reference](https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Epoch+rather+than+High+Watermark+for+Truncation)
*The key elements are*:
- Epochs are stamped on messages as they enter the leader.
- Epochs are tracked in both leader and follower in a new checkpoint file.
- A new API allows followers to retrieve the leader's latest offset for a
particular epoch.
- The logic for truncating the log when a replica becomes a follower has
been moved from Partition into the ReplicaFetcherThread.
- When partitions are added to the ReplicaFetcherThread they are added in
an initialising state. Initialising partitions request leader epochs and then
truncate their logs appropriately.
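As a rough illustration of the elements listed above (not code from this PR), the sketch below models a log that stamps an epoch on each appended message, answers "what is your end offset for epoch N?", and lets a follower truncate to that answer. All names here (`LeaderEpochSketch`, `append`, `endOffsetFor`, `truncateTo`) are hypothetical, and the checkpoint file and the new request type are left out entirely.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: names and structure are assumptions, not Kafka's classes.
class LeaderEpochSketch {
    // epoch -> offset of the first message stamped with that epoch
    private final TreeMap<Integer, Long> epochStartOffsets = new TreeMap<>();
    private long logEndOffset = 0L;

    // Stamp the given leader epoch on one appended message.
    void append(int leaderEpoch) {
        epochStartOffsets.putIfAbsent(leaderEpoch, logEndOffset);
        logEndOffset++;
    }

    // End offset for an epoch: the start offset of the next larger epoch,
    // or the log end offset if no larger epoch exists.
    long endOffsetFor(int requestedEpoch) {
        Map.Entry<Integer, Long> next = epochStartOffsets.higherEntry(requestedEpoch);
        return next == null ? logEndOffset : next.getValue();
    }

    // Latest epoch seen by this log, or -1 if the log is empty.
    int latestEpoch() {
        return epochStartOffsets.isEmpty() ? -1 : epochStartOffsets.lastKey();
    }

    // Follower side: drop messages at or beyond the offset, plus any epoch
    // entries that started there.
    void truncateTo(long offset) {
        if (offset >= logEndOffset) {
            return;
        }
        logEndOffset = offset;
        epochStartOffsets.values().removeIf(start -> start >= offset);
    }

    long logEndOffset() {
        return logEndOffset;
    }

    public static void main(String[] args) {
        LeaderEpochSketch leader = new LeaderEpochSketch();
        for (int i = 0; i < 3; i++) leader.append(0); // offsets 0..2 in epoch 0
        for (int i = 0; i < 2; i++) leader.append(1); // offsets 3..4 in epoch 1

        // A former leader that wrote two extra, unreplicated messages in epoch 0.
        LeaderEpochSketch follower = new LeaderEpochSketch();
        for (int i = 0; i < 5; i++) follower.append(0);

        long leaderEnd = leader.endOffsetFor(follower.latestEpoch());
        follower.truncateTo(leaderEnd);
        System.out.println(follower.logEndOffset()); // prints 3
    }
}
```

In this example the follower discards offsets 3 and 4, which diverge from the leader's log, even though a high-watermark-based truncation point could have been different. That is the motivation for using the epoch rather than the high watermark.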
This test provides a good overview of the workflow:
`EpochDrivenReplicationProtocolAcceptanceTest.shouldFollowLeaderEpochBasicWorkflow()`
The corrupted log use case is covered by the test:
`EpochDrivenReplicationProtocolAcceptanceTest.offsetsShouldNotGoBackwards()`
Remaining work: The test
`EpochDrivenReplicationProtocolAcceptanceTest.shouldSurviveFastLeaderChange()`
does not yet correctly reproduce the underlying issue. It will be altered later
so that it properly covers this use case.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/confluentinc/kafka kip-101-v2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/2808.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2808
----
commit a96a8bbee2435bd46cd19746f61b73eeb2f94088
Author: Ben Stopford <[email protected]>
Date: 2017-03-27T16:16:16Z
All work to date squashed (18 commits)
KIP-101: Push after merge.
KIP-101: Fixes for checkstyle breaks
KIP-101: Remove TestSuite class
KIP-101: Comments
KIP-101: Comments
KIP-101: Altered logic in ReplicaFetcherThread:
- On NoLeaderForPartition continue to poll for epochs indefinitely
- Add synchronisation around log truncation to ensure we cannot truncate the
log of a leader (light testing, more to follow, noted in TODO)
KIP-101: Rename Epoch -> PartitionLeaderEpoch
KIP-101: First commit based on feedback from Jun/Jason
KIP-101: Second commit based on feedback from Jun/Jason
KIP-101: Third commit based on feedback from Jun/Jason
KIP-101: Fourth commit based on feedback from Jun/Jason
- removed retainMatchingOffset parameter from clearOldest as not used
KIP-101: tidy only
KIP-101: Return Log End Offset If Undefined Epoch Requested (this covers
the case of a bootstrapping broker)
KIP-101: Altered log truncation to always be inclusive, so we always
delete epochs inclusive of the passed offset, whether clearing earliest or
latest entries.
KIP-101: Add optimisation back in for previous commit.
KIP-101: If epochOffset.endOffset() is UNSUPPORTED_EPOCH_OFFSET, which can
happen during the transition phase, we should fall back to HW.
Improved fuglyness too.
KIP-101: Small tidy
KIP-101: Refactored threading model in Abstract/ReplicaFetcherThread.
Functionally identical but now the logic sits largely in the abstract class.
KIP-101: Moved OffsetsForLeaderEpoch.getResponseFor() into ReplicaManager
KIP-101: (1) Altered ReplicaFetcherThread to poll continuously on errors.
(2) Only send epoch requests if version >= 11
KIP-101: As segments are recovered, truncate the epoch cache with the
appropriate segment
KIP-101: Fix bug in DummyFetcherThread which was defaulting to requiring
initialisation. Caused AbstractFetcherThread test to hang.
KIP-101: Fix bug in ReplicaManager imports
KIP-101: Fix bug in ReplicaManager imports by making all imports explicit.
Also remove OffsetCheckpointFile which appears to still be in the remote
repository. This was causing a compilation issue.
KIP-101: Remove override of OffsetsTopicPartitionsProp (to 5) in
PlaintextConsumerTest as it causes a test in BaseConsumerTest to fail. Will fix
this issue in separate PR
KIP-101: Rename only (OffsetsForLeaderEpochRequest)
KIP-101: Fix merge error
KIP-101: Fix couple more merge errors
KIP-101: Re-enable test_zk_security_upgrade on Ismael's request
KIP-101: Commenting out EndToEndClusterIdTest as it fails on jenkins,
although passes consistently locally, including from a fresh checkout. Puzzling.
KIP-101: Large refactor to alter the data structure used in the
Request/Response classes. These are now Maps keyed by TopicPartition. Pushed
this change through other code and cleaned up tests as appropriate.
KIP-101: Addressed Jun's second round of feedback.
KIP-101: Addressed first part of Jun's third round of feedback, this
relates largely to test code.
KIP-101: Don't assign epochs if magic byte indicates previous version.
KIP-101: Altered the logic for clearEarliest so that it keeps the previous
epoch and updates its offset to the one used to clear.
KIP-101: Added test and removed some of the ; that aren't used
commit ab7abdbe9cb25ccb7f8cf045190a3cf812631aae
Author: Ben Stopford <[email protected]>
Date: 2017-03-27T16:16:16Z
All work to date squashed (18 commits)
KIP-101: Push after merge.
KIP-101: Fixes for checkstyle breaks
KIP-101: Remove TestSuite class
KIP-101: Comments
KIP-101: Comments
KIP-101: Altered logic in ReplicaFetcherThread:
- On NoLeaderForPartition continue to poll for epochs indefinitely
- Add synchronisation around log truncation to ensure we cannot truncate the
log of a leader (light testing, more to follow, noted in TODO)
KIP-101: Rename Epoch -> PartitionLeaderEpoch
KIP-101: First commit based on feedback from Jun/Jason
KIP-101: Second commit based on feedback from Jun/Jason
KIP-101: Third commit based on feedback from Jun/Jason
KIP-101: Fourth commit based on feedback from Jun/Jason
- removed retainMatchingOffset parameter from clearOldest as not used
KIP-101: tidy only
KIP-101: Return Log End Offset If Undefined Epoch Requested (this covers
the case of a bootstrapping broker)
KIP-101: Altered log truncation to always be inclusive, so we always
delete epochs inclusive of the passed offset, whether clearing earliest or
latest entries.
KIP-101: Add optimisation back in for previous commit.
KIP-101: If epochOffset.endOffset() is UNSUPPORTED_EPOCH_OFFSET, which can
happen during the transition phase, we should fall back to HW.
Improved fuglyness too.
KIP-101: Small tidy
KIP-101: Refactored threading model in Abstract/ReplicaFetcherThread.
Functionally identical but now the logic sits largely in the abstract class.
KIP-101: Moved OffsetsForLeaderEpoch.getResponseFor() into ReplicaManager
KIP-101: (1) Altered ReplicaFetcherThread to poll continuously on errors.
(2) Only send epoch requests if version >= 11
KIP-101: As segments are recovered, truncate the epoch cache with the
appropriate segment
KIP-101: Fix bug in DummyFetcherThread which was defaulting to requiring
initialisation. Caused AbstractFetcherThread test to hang.
KIP-101: Fix bug in ReplicaManager imports
KIP-101: Fix bug in ReplicaManager imports by making all imports explicit.
Also remove OffsetCheckpointFile which appears to still be in the remote
repository. This was causing a compilation issue.
KIP-101: Remove override of OffsetsTopicPartitionsProp (to 5) in
PlaintextConsumerTest as it causes a test in BaseConsumerTest to fail. Will fix
this issue in separate PR
KIP-101: Rename only (OffsetsForLeaderEpochRequest)
KIP-101: Fix merge error
KIP-101: Fix couple more merge errors
KIP-101: Re-enable test_zk_security_upgrade on Ismael's request
KIP-101: Commenting out EndToEndClusterIdTest as it fails on jenkins,
although passes consistently locally, including from a fresh checkout. Puzzling.
KIP-101: Large refactor to alter the data structure used in the
Request/Response classes. These are now Maps keyed by TopicPartition. Pushed
this change through other code and cleaned up tests as appropriate.
KIP-101: Addressed Jun's second round of feedback.
KIP-101: Addressed first part of Jun's third round of feedback, this
relates largely to test code.
KIP-101: Don't assign epochs if magic byte indicates previous version.
KIP-101: Altered the logic for clearEarliest so that it keeps the previous
epoch and updates its offset to the one used to clear.
KIP-101: Added test and removed some of the ; that aren't used
commit 2f4f171444fcc4246ba7224b99fc626ec0980b93
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T11:56:18Z
KIP-101: fix test break
commit a281b581f0da4bdd424bc198bdff01723dd2f8e5
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T12:26:25Z
KIP-101: just testing...
commit de672b31169f56a336bb0c046745448a90d2e290
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T12:27:27Z
KIP-101: just testing...
commit 1a674bbe104dd7f30045b24c8d245edc83896cc5
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T13:00:46Z
KIP-101: small changes based on Jun's review
commit 275a130bd8be320de6b584e4a6cf3e196cc1eb37
Author: Jun Rao <[email protected]>
Date: 2017-04-04T21:51:02Z
recover leader epoch during log recovery; other minor cleanups
commit fedbd6150246d0a7adb47aaabe0a7d92a3a9a9fc
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T21:40:31Z
KIP-101: Ensure log directory is created before we create the
LeaderEpochCache (addresses a couple of Jun's feedback points)
commit adb3b98de10b712be8137fc95e2a63e3f97e8444
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T21:41:55Z
KIP-101: Remove todo
commit 85f5b9364e97cd56194fa09b9ddaa906c1ea91ff
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T21:44:11Z
KIP-101: Move clearLatest call so it doesn't overlap with existing clear()
in truncation phase (in response to Jun's comment)
commit e52f0178e613c64f2f0a07762c4a30760f5658d5
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T21:50:23Z
KIP-101: Tidy only
commit 2d7f55dd7ee074e304154f2d84590268ab40e237
Author: Ben Stopford <[email protected]>
Date: 2017-04-04T22:44:21Z
KIP-101: Comments only
----
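Several of the commit notes above describe how the epoch cache is trimmed (`clearLatest`, `clearEarliest`) and how the follower falls back to the high watermark when the leader answers with `UNSUPPORTED_EPOCH_OFFSET`. The sketch below is one possible reading of those notes, not the code in this PR: the class `EpochCacheTrimSketch`, its `Entry` type, the `truncationOffset` helper, and the sentinel value `-1` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: an interpretation of the commit notes, not Kafka's code.
class EpochCacheTrimSketch {
    // Assumed sentinel value; the notes only give the constant's name.
    static final long UNSUPPORTED_EPOCH_OFFSET = -1L;

    static final class Entry {
        final int epoch;
        final long startOffset;

        Entry(int epoch, long startOffset) {
            this.epoch = epoch;
            this.startOffset = startOffset;
        }
    }

    // Entries are kept in increasing epoch and start-offset order.
    private final List<Entry> entries = new ArrayList<>();

    void assign(int epoch, long startOffset) {
        entries.add(new Entry(epoch, startOffset));
    }

    List<Entry> entries() {
        return entries;
    }

    // The log was truncated from the end: drop every epoch that starts at or
    // after the offset (inclusive of the passed offset).
    void clearLatest(long offset) {
        entries.removeIf(e -> e.startOffset >= offset);
    }

    // The log start moved forward: drop every epoch that starts at or before the
    // offset, then keep the last dropped epoch with its start moved to the offset.
    void clearEarliest(long offset) {
        Entry lastRemoved = null;
        while (!entries.isEmpty() && entries.get(0).startOffset <= offset) {
            lastRemoved = entries.remove(0);
        }
        if (lastRemoved != null) {
            entries.add(0, new Entry(lastRemoved.epoch, offset));
        }
    }

    // Fall back to the high watermark when the leader cannot answer epoch requests.
    static long truncationOffset(long leaderEndOffset, long highWatermark) {
        return leaderEndOffset == UNSUPPORTED_EPOCH_OFFSET ? highWatermark : leaderEndOffset;
    }
}
```

With entries (0, 0), (1, 10), (2, 20), calling `clearEarliest(15)` leaves (1, 15), (2, 20): epoch 1 still covers the messages from offset 15 onward, so dropping it outright would lose that information.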