Donny Nadolny created ZOOKEEPER-2204:
----------------------------------------

             Summary: 
LearnerSnapshotThrottlerTest.testHighContentionWithTimeout fails occasionally
                 Key: ZOOKEEPER-2204
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2204
             Project: ZooKeeper
          Issue Type: Test
    Affects Versions: 3.5.0
            Reporter: Donny Nadolny
            Assignee: Donny Nadolny
            Priority: Minor


The {{LearnerSnapshotThrottler}} will only allow 2 concurrent snapshots to be 
taken, and if there are already 2 snapshots in progress it will wait up to 
200ms for one to complete. This isn't enough time for 
{{testHighContentionWithTimeout}} to consistently pass - on a cold JVM running 
just the one test I was able to get it to fail 3 times in around 50 runs. This 
200ms timeout will be hit if there is a delay between a thread calling 
{{LearnerSnapshot snap = throttler.beginSnapshot(false);}} and 
{{throttler.endSnapshot();}}.
This also erroneously fails on the build server, see 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2747/testReport/org.apache.zookeeper.server.quorum/LearnerSnapshotThrottlerTest/testHighContentionWithTimeout/
 for an example.

I have bumped the timeout up to 5 seconds (which should be more than enough for 
warmup / gc pauses), as well as added logging to the {{catch (Exception e)}} 
block to assist in debugging any future issues.

An alternate approach would be to separate out results gathered from the 
threads, because although we only record true/false there are really three 
outcomes:
1. The {{snapshotNumber}} was <= 2, meaning the individual call operated 
correctly
2. The {{snapshotNumber}} was > 2, meaning the test should definitely fail
3. We were unable to snapshot in the time given, so we can't determine if we 
should fail or pass (although if we have "enough" successes from #1 with no 
failures from #2 maybe we would pass the test anyway).
Bumping up the timeout is easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to