gharris1727 opened a new pull request, #14156:
URL: https://github.com/apache/kafka/pull/14156

   The new OffsetSyncStore historical translation cache clears more syncs than 
necessary when the gap between syncs is variable. This is a problem with the 
replacement promotion logic, which only covered the base case of promoting a 
sync from one index to the immediately following index.
   
   This has the effect that if a sync fails to satisfy an invariant for the 
following index, then it is discarded immediately, even if the value would 
satisfy the invariants at a different index. In particular, invariant B, which 
enforces a maximum distance between two syncs, gets more permissive as the index 
in the array increases, so a sync which does not satisfy invariant B at index 1 
may satisfy it at index j > 1.
   
   Instead of the greedy discarding algorithm, the replacement promotion logic 
should keep a separate index into the potential replacements from the original 
array, and delay discarding a sync until it can be determined that the sync is 
not valid at any place in the array. In particular, syncs are certainly not 
worth keeping if they are duplicates of other syncs, or if they fail invariant 
C. Invariant C becomes more strict as the index in the array increases, so a 
sync which does not satisfy invariant C at index 1 will certainly never satisfy 
it at index j > 1.
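
   The reasoning above can be sketched as a three-way decision. This is a 
simplified illustration, not the code in this PR: the class name, the 
`Decision` enum, and the power-of-two bounds in `maxGap` and `minGap` are 
assumptions chosen only so that the B-like bound loosens and the C-like bound 
tightens as the index grows. The gap to the neighbouring sync is also treated 
as a fixed number, which the real store does not guarantee.

```java
public class PromotionSketch {
    enum Decision { KEEP, RETRY_AT_LATER_INDEX, DISCARD }

    // Assumed shape of invariant B: the largest gap allowed at an index.
    // It grows with the index, so the check becomes more permissive.
    static long maxGap(int index) {
        return 1L << index;
    }

    // Assumed shape of invariant C: the smallest gap required at an index.
    // It never shrinks as the index grows, so the check becomes stricter.
    static long minGap(int index) {
        return 1L << Math.max(index - 2, 0);
    }

    static Decision decide(long gap, int index) {
        // A duplicate (gap of zero) or a C failure can never be repaired
        // at a later index, so the sync can be dropped right away.
        if (gap <= 0 || gap < minGap(index)) {
            return Decision.DISCARD;
        }
        // A B failure may still be satisfied at a later index, so the
        // sync is held back instead of being discarded greedily.
        if (gap > maxGap(index)) {
            return Decision.RETRY_AT_LATER_INDEX;
        }
        return Decision.KEEP;
    }

    public static void main(String[] args) {
        for (int index = 1; index <= 4; index++) {
            System.out.println("gap=3 index=" + index + " -> " + decide(3, index));
        }
    }
}
```

   Under these assumed bounds, a gap of 3 is too large for index 1 but fits 
at index 2, which is the case the greedy algorithm would have thrown away.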
   
   In order to verify the changes, this PR adds new tests which use Random to 
generate gaps between syncs, and generalizes the test for maintaining unique 
syncs to an arbitrary stream of gaps. The different tests cover:
   
   1. Constant spacing
   2. Uniform random spacing between 0 and N
   3. Uniform random spacing between M and N with M > 0
   4. Bimodal spacing at maxOffsetLag and 2x maxOffsetLag
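
   The four spacing patterns could be generated roughly as follows. This is a 
hypothetical sketch rather than the test code in this PR: `GapStreams` and its 
method names are made up for illustration, and whether the upper bound N is 
inclusive is an assumption (here it is exclusive, following 
`Random.longs(long, long, long)`).

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.LongStream;

public class GapStreams {
    // 1. Constant spacing: every gap is the same.
    static LongStream constant(long gap, int count) {
        return LongStream.generate(() -> gap).limit(count);
    }

    // 2 and 3. Uniform random spacing in [min, max); pass min = 0 for case 2.
    static LongStream uniform(Random random, long min, long max, int count) {
        return random.longs(count, min, max);
    }

    // 4. Bimodal spacing: each gap is either maxOffsetLag or 2x maxOffsetLag.
    static LongStream bimodal(Random random, long maxOffsetLag, int count) {
        return LongStream
            .generate(() -> random.nextBoolean() ? maxOffsetLag : 2 * maxOffsetLag)
            .limit(count);
    }

    // Turn a stream of gaps into the upstream offsets at which syncs occur.
    static long[] toOffsets(LongStream gaps) {
        long[] values = gaps.toArray();
        long offset = 0;
        for (int i = 0; i < values.length; i++) {
            offset += values[i];
            values[i] = offset;
        }
        return values;
    }

    public static void main(String[] args) {
        Random random = new Random(42L); // fixed seed keeps a run reproducible
        System.out.println(Arrays.toString(toOffsets(constant(10, 5))));
        System.out.println(Arrays.toString(toOffsets(uniform(random, 0, 100, 5))));
        System.out.println(Arrays.toString(toOffsets(uniform(random, 50, 100, 5))));
        System.out.println(Arrays.toString(toOffsets(bimodal(random, 100, 5))));
    }
}
```

   Seeding `Random` with a fixed value is one way to keep a randomized test 
deterministic, so that a failing sequence of gaps can be replayed.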
   
   The new algorithm is an extension of the existing one, so all of the 
consistent-spacing tests have the exact same behavior.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


