GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/1173
[FLINK-2616] [test-stability] Fixes
ZooKeeperLeaderElectionTest.testMultipleLeaders by introducing second retrieval
service
I think this time I've figured out why the
`ZooKeeperLeaderElectionTest.testMultipleLeaders` test case sometimes failed.
Apparently, Curator's `NodeCache` does not receive all node changes. If, for
example, the node's data has been changed twice, the `NodeCache` may see only
the most recent state. This led to problems in the test case, because
the `LeaderRetrievalListener` did not see the first changed leader address.
The `ZooKeeperLeaderRetrievalService` only notifies the
`LeaderRetrievalListener` about a new leader if the address read from the
ZooKeeper node differs from the last read information. If the node cache
misses the first changed leader address and only sees the overwritten
(corrected) address, then the service won't notify the listener, because from
its point of view nothing has changed. Therefore, the test failed because it
waited for a leader address change that was never reported.
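The effect can be reproduced without ZooKeeper. The following is only a minimal sketch of the notify-on-change logic described above; `ChangeOnlyNotifier`, `observe` and the address strings are hypothetical names for illustration, not Flink or Curator code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class CoalescedUpdateDemo {

    /** Hypothetical stand-in for a retrieval service that notifies only on a changed address. */
    static class ChangeOnlyNotifier {
        private String lastAddress; // last address read from the node, null before the first read
        final List<String> notifications = new ArrayList<>();

        /** Called with every node state that the cache actually observes. */
        void onNodeRead(String address) {
            if (!Objects.equals(address, lastAddress)) {
                lastAddress = address;
                notifications.add(address);
            }
        }
    }

    /** Feeds the observed node states to a fresh notifier and returns what it reported. */
    static List<String> observe(String... observedStates) {
        ChangeOnlyNotifier notifier = new ChangeOnlyNotifier();
        for (String state : observedStates) {
            notifier.onNodeRead(state);
        }
        return notifier.notifications;
    }

    public static void main(String[] args) {
        // Every write is observed: original, faulty, corrected address.
        System.out.println(observe("leader-A", "faulty", "leader-A")); // [leader-A, faulty, leader-A]
        // The faulty write is missed: the cache only sees the corrected address,
        // which equals the last one read, so no second notification is sent.
        System.out.println(observe("leader-A", "leader-A")); // [leader-A]
    }
}
```

In the second call the listener never learns that the address changed, which corresponds to the situation in which the test kept waiting.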
I resolved the test failure by using a second `LeaderRetrievalService`,
which is started only after the faulty leader information has been written to
ZooKeeper. That way we can be sure that whatever leader information it sees,
the faulty or the corrected data, it sees for the first time and therefore
reports to its listener.
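Why a freshly started service always reports something can be sketched as follows. This is an assumption-laden model, not the actual test code: `Node`, `RetrievalService` and `lateServiceNotifications` are hypothetical names, and the node is reduced to a single field holding its latest value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class LateRetrievalServiceDemo {

    /** Hypothetical stand-in for the leader node: it only ever holds the latest value. */
    static class Node {
        String data;
    }

    /** Hypothetical retrieval service: notifies only if the value read differs from the last one. */
    static class RetrievalService {
        private final Node node;
        private String lastAddress; // null until the first read
        final List<String> notifications = new ArrayList<>();

        RetrievalService(Node node) {
            this.node = node;
        }

        /** Reads the node's current value, as a cache refresh would. */
        void read() {
            if (!Objects.equals(node.data, lastAddress)) {
                lastAddress = node.data;
                notifications.add(node.data);
            }
        }
    }

    /** Returns what a service created after the faulty write reports to its listener. */
    static List<String> lateServiceNotifications(boolean correctedBeforeFirstRead) {
        Node node = new Node();
        node.data = "leader-A"; // the elected leader publishes its address
        node.data = "faulty";   // the test overwrites it with a faulty address

        RetrievalService late = new RetrievalService(node); // created only now
        if (!correctedBeforeFirstRead) {
            late.read();        // sees the faulty address first
        }
        node.data = "leader-A"; // the leader corrects the node
        late.read();            // sees the corrected address
        return late.notifications;
    }

    public static void main(String[] args) {
        System.out.println(lateServiceNotifications(false)); // [faulty, leader-A]
        System.out.println(lateServiceNotifications(true));  // [leader-A]
    }
}
```

In both interleavings the late service reports at least one address, because its first read is compared against an empty state.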
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink fixZooKeeperLeaderElectionTest2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1173.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1173
----
commit 573c3fac5f36df38f794b3a44f0573ff61c63ce4
Author: Till Rohrmann <[email protected]>
Date: 2015-09-23T12:34:38Z
[FLINK-2616] [test-stability] Fixes
ZooKeeperLeaderElectionTest.testMultipleLeaders by introducing a second
retrieval service to retrieve the leader address after the faulty address has
been written.
----