FluffySkyCat opened a new issue, #1290:
URL: https://github.com/apache/curator/issues/1290
# DistributedDoubleBarrier - Barrier bypass due to spurious wakeups or SyncConnected events
Present since Curator 4.1.0 (CURATOR-495).
DistributedDoubleBarrier.enter() can return true before the required number
of participants have arrived. A single participant can pass through the barrier
with no other participants. We discovered this by tracing test flakiness in an
internal software test suite.
The bug has two parts:
1. internalEnter() uses do { ... } while (false), so the loop body executes
exactly once. After wait() returns, the method exits without re-checking the
participant count, even though Object.wait() is subject to spurious wakeups.
2. The watcher fires on any ZooKeeper event, including SyncConnected session
events. Since CURATOR-495 (4.1.0), the notification is async via runSafe(), so
hasBeenNotified can be set to true by a session event during wait(). Since the
actual member count isn't checked, this results in a barrier participant
proceeding erroneously.
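
For illustration, the guarded-wait pattern that both parts point to could look like the sketch below. This is plain Java, not Curator code and not a proposed patch: the class GuardedBarrierSketch, its fields, and onUnrelatedEvent() are hypothetical stand-ins, and an in-memory counter replaces the ZooKeeper child count. The point is only that the count is re-checked after every wakeup, so neither a spurious wakeup nor an unrelated notification can release a waiter.

```java
import java.util.concurrent.TimeUnit;

// Plain-Java sketch, NOT Curator code. Every name here is hypothetical.
class GuardedBarrierSketch {
    private final int memberQty;
    private int arrived; // stands in for the number of participant nodes

    GuardedBarrierSketch(int memberQty) {
        this.memberQty = memberQty;
    }

    // Stands in for a session event such as SyncConnected:
    // it wakes waiters but does not change the participant count.
    synchronized void onUnrelatedEvent() {
        notifyAll();
    }

    // Registers the caller, then blocks until memberQty participants have
    // arrived or the timeout elapses. Returns false on timeout.
    synchronized boolean enter(long maxWait, TimeUnit unit) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(maxWait);
        arrived++;
        notifyAll(); // let other waiters re-check the count

        // The count is the only exit condition, and it is re-evaluated after
        // every wakeup, whatever caused that wakeup.
        while (arrived < memberQty) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                arrived--; // withdraw on timeout (a simplification of this sketch)
                return false;
            }
            TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // One participant, two required, plus an unrelated notification.
        GuardedBarrierSketch barrier = new GuardedBarrierSketch(2);
        Thread noise = new Thread(barrier::onUnrelatedEvent);
        noise.start();
        System.out.println(barrier.enter(200, TimeUnit.MILLISECONDS));
        noise.join();
    }
}
```

In this sketch a lone participant times out and gets false even when onUnrelatedEvent() fires during the wait, because the loop condition depends on the count rather than on a "has been notified" flag.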
The untimed enter() is not affected: it always takes an unconditional
wait() path that doesn't rely on hasBeenNotified.