kezhuw commented on code in PR #1989:
URL: https://github.com/apache/zookeeper/pull/1989#discussion_r1130309880


##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/watch/WatcherCleanerTest.java:
##########
@@ -139,8 +141,10 @@ public void testMaxInProcessingDeadWatchers() {
         assertTrue(listener.wait(5000));
     }
 
-    @Test
-    public void testDeadWatcherMetrics() {
+    // There used to be a race condition surrounding this test which was reproducible by running the test multiple
+    // times. This test is kept as repeated to flag if the race condition reappears.
+    @RepeatedTest(5)

Review Comment:
   I think it would be better to keep `@Test`. If this test is still flaky, it 
will eventually fail anyway. `@RepeatedTest` just wastes resources.



##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/watch/WatcherCleanerTest.java:
##########
@@ -158,7 +162,9 @@ public void testDeadWatcherMetrics() {
         Map<String, Object> values = MetricsUtils.currentServerMetrics();
         assertThat("Adding dead watcher should be stalled twice", (Long) values.get("add_dead_watcher_stall_time"), greaterThan(0L));
         assertEquals(3L, values.get("dead_watchers_queued"), "Total dead watchers added to the queue should be 3");
-        assertEquals(3L, values.get("dead_watchers_cleared"), "Total dead watchers cleared should be 3");
+        // This metric is updated _after_ the dead watcher listener is invoked, so it is not always immediately visible,
+        // hence the wait.
+        waitForMetricValue("dead_watchers_cleared", 3L, 5_000);

Review Comment:
   It seems that both `dead_watchers_cleared` and 
`cnt_dead_watchers_cleaner_latency` suffer from the same issue. I think we 
should treat them the same way.
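
   To illustrate treating both metrics the same way, here is a minimal sketch 
that polls until every expected metric has its value or a timeout elapses. The 
class name `MetricsWaitSketch`, the method `waitForMetrics`, and the expected 
values used in `main` are hypothetical and not part of this PR. In the real 
test the snapshot supplier would presumably be 
`MetricsUtils::currentServerMetrics`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: wait for several asynchronously updated metrics at once.
final class MetricsWaitSketch {

    /**
     * Polls the metrics snapshot until every expected entry matches.
     * Returns true once all entries match, false if the timeout elapses first.
     */
    static boolean waitForMetrics(Supplier<Map<String, Object>> snapshot,
                                  Map<String, Object> expected,
                                  long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            Map<String, Object> values = snapshot.get();
            boolean allMatch = true;
            for (Map.Entry<String, Object> entry : expected.entrySet()) {
                if (!entry.getValue().equals(values.get(entry.getKey()))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(10); // short pause between polls
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the server metrics; updated by another thread after a delay.
        Map<String, Object> metrics = new ConcurrentHashMap<>();
        Thread updater = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            metrics.put("dead_watchers_cleared", 3L);
            metrics.put("cnt_dead_watchers_cleaner_latency", 3L);
        });
        updater.start();

        // Illustrative expected values only.
        Map<String, Object> expected = new HashMap<>();
        expected.put("dead_watchers_cleared", 3L);
        expected.put("cnt_dead_watchers_cleaner_latency", 3L);

        System.out.println("all metrics visible: " + waitForMetrics(() -> metrics, expected, 5_000));
        updater.join();
    }
}
```

   Waiting on both entries in one loop keeps the two assertions symmetric, so 
neither metric is read before the cleaner thread has published it.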



##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/watch/WatcherCleanerTest.java:
##########
@@ -171,4 +177,15 @@ public void testDeadWatcherMetrics() {
         assertEquals(20D, ((Long) values.get("p99_dead_watchers_cleaner_latency")).doubleValue(), 20);
     }
 
+    /**
+     * Waits in a loop for the given metric to have the required value. If the given timeout is reached, the test fails.
+     */
+    private static void waitForMetricValue(String metricName, Object expected, long timeoutMs) throws InterruptedException {

Review Comment:
   I saw `LearnerMetricsTest.waitForMetric`. Maybe we can move such a helper 
method to `ZKTestCase` and reuse it.
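
   As a rough sketch of what a shared helper could look like (assuming it takes 
the metrics snapshot as a `Supplier`, which is my assumption and not 
necessarily the signature of `LearnerMetricsTest.waitForMetric`), something 
along these lines might live in a common base class. `SharedTestBaseSketch` is 
a hypothetical stand-in for that base class.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical stand-in for a shared test base class.
abstract class SharedTestBaseSketch {

    /**
     * Waits in a loop for the given metric to have the required value.
     * Throws AssertionError (failing the test) if the timeout is reached.
     */
    static void waitForMetricValue(Supplier<Map<String, Object>> metrics,
                                   String metricName,
                                   Object expected,
                                   long timeoutMs) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        Object actual = metrics.get().get(metricName);
        while (!Objects.equals(expected, actual)) {
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError("Metric " + metricName + " expected " + expected
                        + " but was " + actual + " after " + timeoutMs + " ms");
            }
            Thread.sleep(10); // short pause between polls
            actual = metrics.get().get(metricName);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the server metrics map.
        Map<String, Object> metrics = new ConcurrentHashMap<>();
        metrics.put("dead_watchers_cleared", 3L);
        waitForMetricValue(() -> metrics, "dead_watchers_cleared", 3L, 5_000);
        System.out.println("metric reached the expected value");
    }
}
```

   Passing the snapshot as a supplier keeps the helper independent of any 
particular metrics source, so different tests could reuse it unchanged.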


