3pacccccc opened a new pull request, #24955: URL: https://github.com/apache/pulsar/pull/24955
Fixes https://github.com/apache/pulsar/issues/24906 ### Motivation the test log is: https://gist.github.com/3pacccccc/7050810fa187f17046d826e7971c174a From the error logs, we can see the exact failure: ``` java.util.ConcurrentModificationException: null at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1013) ~[?:?] at java.base/java.util.ArrayList$Itr.next(ArrayList.java:967) ~[?:?] at org.apache.bookkeeper.client.PulsarMockLedgerHandle.getLength(PulsarMockLedgerHandle.java:235) ~[testmocks-4.2.0-SNAPSHOT.jar:4.17.2] at org.apache.bookkeeper.mledger.impl.cache.RangeEntryCacheImpl.getEstimatedEntrySize(RangeEntryCacheImpl.java:503) ~[managed-ledger-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT] at org.apache.bookkeeper.mledger.impl.cache.RangeEntryCacheImpl.asyncReadEntriesByPosition(RangeEntryCacheImpl.java:310) ~[managed-ledger-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT] at org.apache.bookkeeper.mledger.impl.cache.RangeEntryCacheImpl.asyncReadEntry0(RangeEntryCacheImpl.java:286) ~[managed-ledger-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT] at org.apache.bookkeeper.mledger.impl.cache.RangeEntryCacheImpl.asyncReadEntry(RangeEntryCacheImpl.java:268) ~[managed-ledger-4.2.0-SNAPSHOT.jar:4.2.0-SNAPSHOT] at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncReadEntry(ManagedLedgerImpl.java:2438) ~[managed-ledger-4.2.0-SNAPSHOT.jar:?] at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.internalReadFromLedger(ManagedLedgerImpl.java:2408) ~[managed-ledger-4.2.0-SNAPSHOT.jar:?] at org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.asyncReadEntries(ManagedLedgerImpl.java:2089) ~[managed-ledger-4.2.0-SNAPSHOT.jar:?] at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntriesWithSkip(ManagedCursorImpl.java:918) ~[managed-ledger-4.2.0-SNAPSHOT.jar:?] at org.apache.bookkeeper.mledger.impl.ManagedCursorImpl.asyncReadEntriesWithSkipOrWait(ManagedCursorImpl.java:1071) ~[managed-ledger-4.2.0-SNAPSHOT.jar:?] at org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers.readMoreEntries(PersistentDispatcherMultipleConsumers.java:430) ~[classes/:?] at org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers.lambda$readMoreEntriesAsync$4(PersistentDispatcherMultipleConsumers.java:324) ~[classes/:?] ``` The root cause analysis shows: 1. **Thread safety issue**: The `entries` list in `PulsarMockLedgerHandle` was implemented as an `ArrayList`, which is not thread-safe 2. **Concurrent access**: Multiple threads were concurrently reading from and potentially modifying the entries list during test execution 3. **Cache invalidation**: When the `ConcurrentModificationException` occurred, it caused all caches to be removed, making all subsequent read operations fail[1] 4. **Complete read failure**: With the cache cleared and the test's interceptor configured to fail all BookKeeper reads (to enforce cache-only behavior), all subsequent read operations failed with `NonRecoverableLedgerException: Should not read from BK since cache should be used`[2] [1]: https://github.com/apache/pulsar/blob/9737d038485e0c15f251dc334d6963fd0207953e/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/cache/RangeEntryCacheImpl.java#L274 [2]: https://github.com/apache/pulsar/blob/b6cfecce5f3a1eecbf6f5df81cb835fbbfe35980/pulsar-broker/src/test/java/org/apache/pulsar/client/api/KeySharedSubscriptionBrokerCacheTest.java#L198 ### Modifications Replace `ArrayList` with `CopyOnWriteArrayList` for the `entries` field in `PulsarMockLedgerHandle` ### Verifying this change - [x] Make sure that the change passes the CI checks. - [ ] Dependencies (add or upgrade a dependency) - [ ] The public API - [ ] The schema - [ ] The default values of configurations - [ ] The threading model - [ ] The binary protocol - [ ] The REST endpoints - [ ] The admin CLI options - [ ] The metrics - [ ] Anything that affects deployment ### Documentation <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. --> - [ ] `doc` <!-- Your PR contains doc changes. --> - [ ] `doc-required` <!-- Your PR changes impact docs and you will update later --> - [x] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-complete` <!-- Docs have been already added --> ### Matching PR in forked repository PR in forked repository: https://github.com/3pacccccc/pulsar/pull/30 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
