codelipenghui opened a new pull request, #24855:
URL: https://github.com/apache/pulsar/pull/24855

   ## Summary
   
   Fixes a critical race condition where ledger trimming could delete ledgers 
while cursors still pointed to them, causing:
   - Negative backlog values from `getNumberOfEntriesInBacklog(isPrecise=false)`
   - Cursors pointing to non-existent ledgers after topic reload
   - `messagesConsumedCounter > entriesAddedCounter` inconsistency
   
   ## Root Cause
   
   The issue occurs when:
   1. Messages are acknowledged → cursor position advances in memory immediately
   2. Cursor state persists to BookKeeper asynchronously (can be slow)
   3. Ledger trimming runs during the persistence delay, using the in-memory 
cursor position
   4. Ledgers get deleted before cursor state is durably saved
   5. On topic reload, cursor reverts to old persistent position pointing to 
deleted ledgers
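
The five steps above can be reproduced in a toy model. The sketch below is not Pulsar code: `TrimRaceSketch`, its fields, and its methods are hypothetical stand-ins, and positions are reduced to bare ledger IDs. It only illustrates why trimming against the in-memory position can leave a reloaded cursor pointing at a deleted ledger.

```java
import java.util.TreeSet;

/** Toy model of the race; none of these names are Pulsar APIs. */
public class TrimRaceSketch {
    // Ledger IDs that still exist in this toy managed ledger.
    final TreeSet<Long> ledgers = new TreeSet<>();
    long inMemoryMarkDelete;  // ledger the cursor has advanced to in memory
    long persistedMarkDelete; // ledger recorded in the last durable save

    TrimRaceSketch() {
        for (long id = 1; id <= 3; id++) {
            ledgers.add(id);
        }
        inMemoryMarkDelete = 1;
        persistedMarkDelete = 1;
    }

    // Step 1: the ack moves the in-memory position immediately.
    // Step 2 (the durable save) is deliberately left pending.
    void ackUpTo(long ledgerId) {
        inMemoryMarkDelete = ledgerId;
    }

    // Steps 3-4: delete every ledger strictly before the given position.
    void trimBefore(long ledgerId) {
        ledgers.headSet(ledgerId).clear();
    }

    // Step 5: after a reload the cursor falls back to the persisted position.
    boolean cursorValidAfterReload() {
        return ledgers.contains(persistedMarkDelete);
    }

    // Runs the whole scenario, trimming by either position.
    static boolean run(boolean usePersistedPosition) {
        TrimRaceSketch ml = new TrimRaceSketch();
        ml.ackUpTo(3);
        ml.trimBefore(usePersistedPosition ? ml.persistedMarkDelete : ml.inMemoryMarkDelete);
        return ml.cursorValidAfterReload();
    }

    public static void main(String[] args) {
        System.out.println("trim by in-memory position, cursor valid after reload: " + run(false));
        System.out.println("trim by persisted position, cursor valid after reload: " + run(true));
    }
}
```

In this model, trimming by the in-memory position removes ledgers 1 and 2 while the persisted position is still ledger 1, so the reloaded cursor is dangling. Trimming by the persisted position removes nothing until the save completes.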
   
   ## The Fix
   
   Changed `maybeUpdateCursorBeforeTrimmingConsumedLedger()` in 
`ManagedLedgerImpl.java:2704-2705` to use the persistent cursor position 
instead of the in-memory position:
   
   ```java
   // Before:
   Position lastAckedPosition = cursor.getMarkDeletedPosition();
   
   // After:
   Position lastAckedPosition = cursor.getPersistentMarkDeletedPosition() != null
           ? cursor.getPersistentMarkDeletedPosition()
           : cursor.getMarkDeletedPosition();
   ```
   
   This ensures ledgers are only deleted after cursor advancement has been 
durably persisted to BookKeeper.
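
The selection rule in the "After" snippet can be isolated as a small helper. This is only a sketch: `TrimPositionSketch` and `positionForTrimming` are hypothetical names, and the two suppliers stand in for `cursor.getPersistentMarkDeletedPosition()` and `cursor.getMarkDeletedPosition()`. Unlike the snippet above, this variant reads the persistent position once into a local before the null check; that is a stylistic choice of the sketch, not something the PR does.

```java
import java.util.function.Supplier;

/** Hypothetical helper showing the "persistent position, else in-memory" rule. */
public class TrimPositionSketch {

    // Prefer the durably saved position; fall back to the in-memory one
    // only when no persistent position is available (null).
    static <P> P positionForTrimming(Supplier<P> persistent, Supplier<P> inMemory) {
        P persisted = persistent.get(); // read once, then test and reuse
        return persisted != null ? persisted : inMemory.get();
    }

    public static void main(String[] args) {
        // Positions are written as "ledgerId:entryId" strings for illustration.
        String lagging = TrimPositionSketch.<String>positionForTrimming(() -> "3:10", () -> "5:0");
        String fallback = TrimPositionSketch.<String>positionForTrimming(() -> null, () -> "5:0");
        System.out.println("persistent position available: " + lagging);
        System.out.println("persistent position missing:   " + fallback);
    }
}
```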
   
   ## Test Coverage
   
   Added `testCursorPointsToDeletedLedgerAfterTrim()` in 
`ManagedLedgerTest.java` which:
   - Simulates BookKeeper persistence delay (30 seconds)
   - Acknowledges messages asynchronously during the delay
   - Triggers ledger trimming
   - Verifies ledgers are NOT trimmed when persistent position hasn't advanced 
yet
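
The shape of such a test can be sketched without any Pulsar or BookKeeper dependency. The example below is an assumption-laden toy, not the test added in this PR: `DelayedPersistSketch` and its members are hypothetical, and a manually completed `CompletableFuture` stands in for the 30-second persistence delay so the scenario runs instantly.

```java
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;

/** Toy version of the test scenario; none of these names are Pulsar APIs. */
public class DelayedPersistSketch {
    final TreeSet<Long> ledgers = new TreeSet<>();
    volatile long inMemory = 1;  // advances as soon as an ack arrives
    volatile long persisted = 1; // advances only when the save completes

    DelayedPersistSketch() {
        ledgers.add(1L);
        ledgers.add(2L);
        ledgers.add(3L);
    }

    // The ack updates memory now; the persisted position follows the gate.
    CompletableFuture<Void> ackAsync(long ledgerId, CompletableFuture<Void> persistGate) {
        inMemory = ledgerId;
        return persistGate.thenRun(() -> persisted = ledgerId);
    }

    // Trimming respects the persisted position, as in the fix.
    void trim() {
        ledgers.headSet(persisted).clear();
    }

    // Returns {ledger count while the save is pending, ledger count after it completes}.
    static int[] scenario() {
        DelayedPersistSketch ml = new DelayedPersistSketch();
        CompletableFuture<Void> gate = new CompletableFuture<>();
        CompletableFuture<Void> saved = ml.ackAsync(3, gate);

        ml.trim(); // save still pending: nothing may be deleted
        int whilePending = ml.ledgers.size();

        gate.complete(null); // the delayed save finally lands
        saved.join();
        ml.trim(); // consumed ledgers can now be removed
        int afterPersist = ml.ledgers.size();

        return new int[] {whilePending, afterPersist};
    }

    public static void main(String[] args) {
        int[] sizes = scenario();
        System.out.println("ledgers while save pending: " + sizes[0]);
        System.out.println("ledgers after save completes: " + sizes[1]);
    }
}
```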
   
   ## Verification
   
   Without the fix, the test fails because:
   - The first ledger gets trimmed even though the persistent cursor position still points to it
   - This reproduces the race condition seen in production
   
   With the fix, the test passes because:
   - Trimming respects the persistent cursor position
   - Ledgers are only deleted after cursor state is durably saved
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)

