void-ptr974 opened a new issue, #25858:
URL: https://github.com/apache/pulsar/issues/25858

   ### Describe the bug
   
   `ManagedLedger.terminate()` can race with ledger rollover, allowing a 
managed ledger that has already entered the `Terminated` state to be moved back 
to `LedgerOpened` by a delayed rollover callback.
   
   I originally filed a fix in #25795. Because that PR has not received any 
activity yet, I am opening this issue so the bug can be tracked independently.
   
   ### Problem
   
   `ManagedLedger.terminate()` seals the managed ledger at the current 
BookKeeper committed boundary. After termination, no new entries should be 
accepted and the managed ledger should not become writable again.
   
   The key invariant should be:
   
   > Any add operation that is acknowledged successfully to the caller must 
have a position less than or equal to the final `terminatedPosition`.
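   
   As a rough illustration of this invariant, the check can be written as a 
position comparison. This is only a sketch: `PositionSketch` and `ackAllowed` 
are hypothetical names, not Pulsar types, and it assumes positions are ordered 
by ledger id first and entry id second.
   
   ```java
   // Hypothetical stand-in for a managed ledger position; not Pulsar's own type.
   final class PositionSketch implements Comparable<PositionSketch> {
       final long ledgerId;
       final long entryId;

       PositionSketch(long ledgerId, long entryId) {
           this.ledgerId = ledgerId;
           this.entryId = entryId;
       }

       // Assumption: order by ledger id first, then by entry id within the ledger.
       @Override
       public int compareTo(PositionSketch other) {
           int byLedger = Long.compare(ledgerId, other.ledgerId);
           return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
       }

       // The invariant: an acknowledged add must not lie beyond terminatedPosition.
       static boolean ackAllowed(PositionSketch added, PositionSketch terminatedPosition) {
           return added.compareTo(terminatedPosition) <= 0;
       }
   }
   ```
   
   With a `terminatedPosition` of (7, 41), an add at (7, 41) or (6, 99) may be 
acknowledged, while an add at (7, 42) or (8, 0) must fail.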
   
   However, there is a race between `terminate()` and ledger rollover:
   
   1. An add fills the current ledger and triggers rollover.
   2. The managed ledger moves into `ClosingLedger` / `CreatingLedger`.
   3. `terminate()` runs before the rollover create/switch callback finishes 
and marks the managed ledger as `Terminated`.
   4. The delayed `createComplete()` or `updateLedgersIdsComplete()` callback 
resumes the old rollover flow.
   5. The callback can set the state back to `LedgerOpened`, making a 
terminated managed ledger writable again.
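   
   The sequence above can be modelled with a small, simplified state machine. 
This is a sketch under stated assumptions, not the actual `ManagedLedgerImpl` 
code: the class and method names are hypothetical, and the rollover states are 
collapsed into a single `CreatingLedger` step. It contrasts a callback that 
sets the state unconditionally with one that only completes the transition if 
the rollover still owns the state.
   
   ```java
   import java.util.concurrent.atomic.AtomicReference;

   // Hypothetical, simplified model of the state transitions; not Pulsar code.
   final class TerminationGuardSketch {
       enum State { LedgerOpened, CreatingLedger, Terminated }

       private final AtomicReference<State> state =
               new AtomicReference<>(State.LedgerOpened);

       // Steps 1-2: a full ledger starts rollover.
       boolean beginRollover() {
           return state.compareAndSet(State.LedgerOpened, State.CreatingLedger);
       }

       // Step 3: terminate() takes ownership of the state.
       void terminate() {
           state.set(State.Terminated);
       }

       // Steps 4-5, racy variant: the late callback reopens unconditionally.
       void onLedgerCreatedUnguarded() {
           state.set(State.LedgerOpened);
       }

       // Guarded variant: reopen only if the rollover is still in progress.
       // Returns false when terminate() won the race, so the caller can
       // discard the newly created ledger instead of switching to it.
       boolean onLedgerCreatedGuarded() {
           return state.compareAndSet(State.CreatingLedger, State.LedgerOpened);
       }

       State currentState() {
           return state.get();
       }
   }
   ```
   
   Calling `beginRollover()`, `terminate()`, and then 
`onLedgerCreatedUnguarded()` leaves the sketch in `LedgerOpened`, which is the 
bug. With `onLedgerCreatedGuarded()` the compare-and-set fails and the state 
stays `Terminated`.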
   
   ### Expected behavior
   
   After `terminate()` takes ownership of the managed ledger state:
   
   - `Terminated` should remain the final write state.
   - Queued adds that were not sent to BookKeeper should fail with 
`ManagedLedgerTerminatedException`.
   - In-flight adds already sent to BookKeeper should succeed only if they are 
covered by the final LAC (LastAddConfirmed) / `terminatedPosition`.
   - Late ledger create or ledger switch callbacks should not reopen the 
managed ledger.
   - Termination should not create or switch to another writable ledger for 
pending writes.
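   
   One way to picture the expected handling of pending adds is the sketch 
below. It is illustrative only: `TerminateDrainSketch`, `PendingAdd`, and 
`TerminatedException` are hypothetical stand-ins (the latter for 
`ManagedLedgerTerminatedException`), and it assumes all in-flight adds belong 
to the single ledger being closed, so entry ids can be compared directly with 
the final LAC.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical model of resolving pending adds at termination; not Pulsar code.
   final class TerminateDrainSketch {
       // Stand-in for ManagedLedgerTerminatedException.
       static final class TerminatedException extends RuntimeException {
           TerminatedException(String message) {
               super(message);
           }
       }

       // Marker for an add that was queued but never sent to BookKeeper.
       static final long NOT_SENT = -1L;

       static final class PendingAdd {
           final long entryId;
           boolean completed;
           Exception failure;

           PendingAdd(long entryId) {
               this.entryId = entryId;
           }

           void succeed() {
               completed = true;
           }

           void fail(Exception e) {
               completed = true;
               failure = e;
           }
       }

       // Resolve every pending add exactly once so no callback is left hanging.
       static void resolveOnTerminate(List<PendingAdd> pending, long finalLac) {
           List<PendingAdd> snapshot = new ArrayList<>(pending);
           pending.clear();
           for (PendingAdd add : snapshot) {
               if (add.entryId != NOT_SENT && add.entryId <= finalLac) {
                   add.succeed(); // covered by the final LAC
               } else {
                   add.fail(new TerminatedException("managed ledger terminated"));
               }
           }
       }
   }
   ```
   
   In this model, an in-flight add beyond the final LAC and a queued add that 
was never sent both fail with the terminated exception, and nothing is 
rerouted to a new ledger.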
   
   ### Actual behavior
   
   A delayed rollover callback can continue the normal ledger-switch path after 
termination and move the managed ledger back to `LedgerOpened`. This can break 
termination semantics, cause pending writes to be handled as normal rollover 
writes, or leave add callbacks hanging when the BookKeeper close drains writes 
that were not included in the final LAC.
   
   ### Verification / reproducer
   
   The scenario is covered by tests added in #25795:
   
   - `terminateDuringLedgerSwitchKeepsTerminatedState`
   - `terminatePositionIncludesAddAlreadyAckedByBookKeeper`
   - `terminateFailsInflightAddDrainedByLedgerClose`
   - `ledgerSwitchCompletionDoesNotReopenTerminatedLedger`
   
   Local verification from the PR:
   
   ```bash
   ./gradlew :managed-ledger:test --tests org.apache.bookkeeper.mledger.impl.ManagedLedgerTerminationTest
   ./gradlew :managed-ledger:checkstyleMain :managed-ledger:checkstyleTest
   ```
   
   ### Affected area
   
   Managed ledger termination and ledger rollover state transitions.
   
   ### Related PR
   
   #25795

