gharris1727 opened a new pull request, #14156: URL: https://github.com/apache/kafka/pull/14156
The new OffsetSyncStore historical translation cache clears more syncs than necessary when the gap between syncs is variable. This is a problem with the replacement promotion logic, which only covered the base case of promoting a sync from one index to the immediately following index. As a result, if a sync fails to satisfy an invariant for the following index, it is discarded immediately, even if it would satisfy the invariants at a different index. In particular, invariant B, which enforces a maximum distance between two syncs, gets more permissive as the index in the array increases, so a sync which does not satisfy invariant B at index 1 may satisfy it at index j > 1.

Instead of the greedy discarding algorithm, the replacement promotion logic should keep a separate index into the potential replacements from the original array, and delay discarding a sync until it can be determined that the sync is not valid at any place in the array. In particular, syncs are certainly not worth keeping if they are duplicates of other syncs, or if they fail invariant C. Invariant C becomes more strict as the index in the array increases, so a sync which does not satisfy invariant C at index 1 will certainly never satisfy it at index j > 1.

To verify the changes, this PR adds new tests which use Random to generate gaps between syncs, and generalizes the test for maintaining unique syncs to an arbitrary stream of gaps. The different tests cover:

1. Constant spacing
2. Uniform random spacing between 0 and N
3. Uniform random spacing between M and N with M > 0
4. Bimodal spacing at maxOffsetLag and 2x maxOffsetLag

The new algorithm is an extension of the existing one, so all of the consistent-spacing tests have the exact same behavior.
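The monotonicity argument above can be illustrated with a minimal sketch. This is not the OffsetSyncStore code: the class name `InvariantSketch`, its methods, and the power-of-two bounds are assumptions chosen only to show one bound loosening (B) and the other tightening (C) as the index grows. Here `gap` stands for the upstream-offset distance between the sync at index j-1 and a candidate considered for index j.

```java
public class InvariantSketch {
    // ASSUMPTION (illustrative bound): invariant B permits a gap of at most
    // 2^(j-1) between the sync at index j-1 and the candidate at index j.
    // The permitted gap grows with j, so B gets more permissive.
    static boolean satisfiesB(long gap, int j) {
        return gap <= (1L << (j - 1));
    }

    // ASSUMPTION (illustrative bound): invariant C requires a gap of at least
    // 2^max(j-3, 0). The required gap never shrinks as j grows, so a
    // candidate that fails C at index 1 fails it at every later index.
    static boolean satisfiesC(long gap, int j) {
        return gap >= (1L << Math.max(j - 3, 0));
    }

    // Smallest index in [1, size) at which the gap satisfies B, or -1 if none.
    static int firstIndexSatisfyingB(long gap, int size) {
        for (int j = 1; j < size; j++) {
            if (satisfiesB(gap, j)) {
                return j;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long gap = 5;
        // A greedy algorithm would discard this candidate after the check at index 1.
        System.out.println("B at index 1: " + satisfiesB(gap, 1));
        // Delaying the decision finds a later index where B holds.
        System.out.println("first index satisfying B: " + firstIndexSatisfyingB(gap, 64));
        // A duplicate (gap 0) fails C at index 1, so it can be dropped immediately.
        System.out.println("duplicate passes C at index 1: " + satisfiesC(0, 1));
    }
}
```

Under these assumed bounds, a candidate with a gap of 5 fails B at index 1 but passes at a later index, which is the case the greedy promotion discards too early. A gap of 0 fails C everywhere, so dropping it immediately loses nothing.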
### Committer Checklist (excluded from commit message)

- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)