If we have a managed ledger, ml and we write 2 entries to it, if both
entries fail, both will end up calling ManagedLedgerImpl#ledgerClosed
with the ledger the write failed on as a parameter.

However, depending on timing, the second call to ledgerClosed could
end up adding a new ledger to the ledger list, even though the current
ledger is _not_ failing (as the failing ledger was replaced by the
first call).

This was the cause of a flake in
ManagedLedgerErrorsTest#recoverLongTimeAfterMultipleWriteErrors as
reported in (#240). However, it's not possible to get a deterministic
test for this as the timings need to be very precise. The failing
addComplete needs to run before first error handling completes, but
the runnable with ledgerClosed for the second failure needs to run
after the first error handling completes, but before the write resends
from the first error handling complete.


[ Full content available at: 
https://github.com/apache/incubator-pulsar/pull/2573 ]
This message was relayed via gitbox.apache.org for [email protected]

Reply via email to