[ https://issues.apache.org/jira/browse/CASSANDRA-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947038#comment-15947038 ]
Alex Petrov edited comment on CASSANDRA-13216 at 3/29/17 12:42 PM:
-------------------------------------------------------------------

Found the problem. I didn't initially anticipate that this test is time-dependent. The initial fix is still applicable.

It's quite easy to reproduce by adding a {{sleep}} of as little as 100 milliseconds around [here|https://github.com/apache/cassandra/blob/732d1af866b91e5ba63e7e2a467d99d4cb90e11f/test/unit/org/apache/cassandra/net/MessagingServiceTest.java#L112]. YMMV on the exact sleep duration.

However, I do not think there's any way we can reliably fetch latency numbers, since the dropwizard metrics reservoirs (used within [timers|https://github.com/dropwizard/metrics/blob/15dde825de1843927898a7ad3c3bb11b2913a931/metrics-core/src/main/java/com/codahale/metrics/Timer.java#L64]) track real time, and the snapshots we take (however precise) won't ever be perfect.

I've mocked the clock:

||[3.11|https://github.com/ifesdjeen/cassandra/tree/13216-followup-3.11]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-3.11-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-3.11-dtest/]|
||[trunk|https://github.com/ifesdjeen/cassandra/tree/13216-followup-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-followup-trunk-dtest/]|

The 3.0 branch is not susceptible to this problem, since we use a time-independent [Meter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/metrics/DroppedMessageMetrics.java#L31] instead of a timer there.

Let's wait 24 hours; I've put the utest on retry.


was (Author: ifesdjeen):
Found the problem. I didn't initially anticipate that this test is time-dependent. The initial fix is still applicable.

It's quite easy to reproduce by adding a {{sleep}} of as little as 100 milliseconds around [here|https://github.com/apache/cassandra/blob/732d1af866b91e5ba63e7e2a467d99d4cb90e11f/test/unit/org/apache/cassandra/net/MessagingServiceTest.java#L112]. YMMV on the exact sleep duration.

However, I do not think there's any way we can reliably fetch latency numbers, since the dropwizard metrics reservoirs (used within [timers|https://github.com/dropwizard/metrics/blob/15dde825de1843927898a7ad3c3bb11b2913a931/metrics-core/src/main/java/com/codahale/metrics/Timer.java#L64]) track real time, and the snapshots we take (however precise) won't ever be perfect.

I've mocked the clock:

|[3.11|https://github.com/ifesdjeen/cassandra/tree/13216-3.11-followup]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-3.11-followup-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-3.11-followup-dtest/]|
|[trunk|https://github.com/ifesdjeen/cassandra/tree/13216-followup-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-trunk-followup-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13216-trunk-followup-dtest/]|

The 3.0 branch is not susceptible to this problem, since we use a time-independent [Meter|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/metrics/DroppedMessageMetrics.java#L31] instead of a timer there.

Let's wait 24 hours; I've put the utest on retry.

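
For illustration only, here is a rough sketch of why mocking the clock makes a latency assertion deterministic. It is not the code from the linked branches and it does not use the Dropwizard classes: {{NanoClock}}, {{ManualClock}} and {{LatencyRecorder}} are hypothetical names invented for this example, and the recorder is a toy stand-in for a timer. The point is only that once the time source is injected, the test decides how much time "passes", so an extra {{sleep}} or a slow CI host cannot shift the reported mean.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Entry point for the sketch: records two latencies against a clock the test controls. */
final class MockClockSketch {
    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        LatencyRecorder recorder = new LatencyRecorder(clock);

        long first = recorder.start();
        clock.advance(2730, TimeUnit.MILLISECONDS); // simulated time, no real waiting
        recorder.stop(first);

        long second = recorder.start();
        clock.advance(2732, TimeUnit.MILLISECONDS);
        recorder.stop(second);

        // Always the same value, regardless of how long the test really ran.
        System.out.println("Mean latency: " + recorder.meanMillis() + " ms");
    }
}

/** Hypothetical time source; stands in for whatever clock the metrics code reads. */
interface NanoClock {
    long nanoTime();
}

/** Test clock that only moves when the test explicitly advances it. */
final class ManualClock implements NanoClock {
    private long nanos;

    @Override
    public long nanoTime() {
        return nanos;
    }

    void advance(long amount, TimeUnit unit) {
        nanos += unit.toNanos(amount);
    }
}

/** Toy latency recorder (NOT the Dropwizard Timer): stores durations and reports their mean. */
final class LatencyRecorder {
    private final NanoClock clock;
    private final List<Long> durationsNanos = new ArrayList<>();

    LatencyRecorder(NanoClock clock) {
        this.clock = clock;
    }

    /** Returns the start timestamp to pass back to stop(). */
    long start() {
        return clock.nanoTime();
    }

    void stop(long startedAtNanos) {
        durationsNanos.add(clock.nanoTime() - startedAtNanos);
    }

    /** Mean of all recorded durations in whole milliseconds, or 0 if nothing was recorded. */
    long meanMillis() {
        if (durationsNanos.isEmpty()) {
            return 0;
        }
        long sum = 0;
        for (long d : durationsNanos) {
            sum += d;
        }
        return TimeUnit.NANOSECONDS.toMillis(sum / durationsNanos.size());
    }
}
```

With a real-time clock the two recorded durations would vary from run to run, which is the kind of off-by-a-millisecond mismatch shown in the failure quoted below; with the manual clock the mean is fixed by the test itself.
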
> testall failure in org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13216
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13216
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Testing
>            Reporter: Sean McCarthy
>            Assignee: Alex Petrov
>              Labels: test-failure, testall
>             Fix For: 3.0.13, 3.11.0, 4.0
>
>         Attachments: TEST-org.apache.cassandra.net.MessagingServiceTest.log, TEST-org.apache.cassandra.net.MessagingServiceTest.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.11_testall/81/testReport/org.apache.cassandra.net/MessagingServiceTest/testDroppedMessages
> {code}
> Error Message
> expected:<... dropped latency: 27[30 ms and Mean cross-node dropped latency: 2731] ms> but was:<... dropped latency: 27[28 ms and Mean cross-node dropped latency: 2730] ms>
> {code}{code}
> Stacktrace
> junit.framework.AssertionFailedError: expected:<... dropped latency: 27[30 ms and Mean cross-node dropped latency: 2731] ms> but was:<... dropped latency: 27[28 ms and Mean cross-node dropped latency: 2730] ms>
> 	at org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages(MessagingServiceTest.java:83)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)