[ 
https://issues.apache.org/jira/browse/CASSANDRA-18065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759095#comment-17759095
 ] 

Brandon Williams commented on CASSANDRA-18065:
----------------------------------------------

The problem is that when createSamplingBeginRunnable sees an existing cancelled 
job, it returns early without doing any sampling. No future is created, so the 
cancelled task is never acted upon again, which eventually results in the test 
failure. The cancelTask javadoc says "the corresponding task will be stopped 
once its final sampling completes", so it would seem the correct thing to do is 
to perform the sampling regardless of the cancellation instead of returning 
early, which I've done here:

||Branch||CI||
|[5.0|https://github.com/driftx/cassandra/tree/CASSANDRA-18065-5.0]|[repeat 2k|https://app.circleci.com/pipelines/github/driftx/cassandra/1245/workflows/507eb697-6af3-492e-8a14-01c9e43fe7b5/jobs/48883]|
|[trunk|https://github.com/driftx/cassandra/tree/CASSANDRA-18065-trunk]|[repeat 2k|https://app.circleci.com/pipelines/github/driftx/cassandra/1246/workflows/9cd27ec0-3af2-47e5-ac5f-61bf59fd86f4/jobs/48884]|
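
To illustrate the mechanism described above, here is a minimal sketch in Java. It is a simplified, synchronous model and not Cassandra's code: the class SamplingCancelSketch, its fields, and its methods are all hypothetical, and the real implementation works with scheduled futures rather than direct calls. The only assumption carried over from the comment is that a cancelled job is cleaned up when a sampling round completes, so skipping the round leaves the job registered.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the behaviour described above.
// The class, fields and methods are illustrative and are NOT Cassandra's code.
final class SamplingCancelSketch
{
    private final Set<String> scheduled = ConcurrentHashMap.newKeySet();
    private final Set<String> cancelled = ConcurrentHashMap.newKeySet();
    private final boolean returnEarlyOnCancel;

    SamplingCancelSketch(boolean returnEarlyOnCancel)
    {
        this.returnEarlyOnCancel = returnEarlyOnCancel;
    }

    void schedule(String jobId)
    {
        scheduled.add(jobId);
    }

    // Cancelling only marks the job; removal is left to the end of a sampling round.
    void cancel(String jobId)
    {
        if (scheduled.contains(jobId))
            cancelled.add(jobId);
    }

    // Models the start of one sampling round for the job.
    void beginRound(String jobId)
    {
        if (returnEarlyOnCancel && cancelled.contains(jobId))
            return; // early return: the completion step never runs, so the job lingers

        // ... the sampling itself would happen here ...
        finishRound(jobId);
    }

    // Models the completion step, the only place a cancelled job is removed.
    private void finishRound(String jobId)
    {
        if (cancelled.remove(jobId))
            scheduled.remove(jobId);
    }

    // Returns a snapshot of the jobs that are still registered.
    Set<String> scheduledJobs()
    {
        return new HashSet<>(scheduled);
    }

    public static void main(String[] args)
    {
        for (boolean returnEarly : new boolean[]{ true, false })
        {
            SamplingCancelSketch sketch = new SamplingCancelSketch(returnEarly);
            sketch.schedule("*.*");
            sketch.cancel("*.*");
            sketch.beginRound("*.*");
            System.out.println("returnEarly=" + returnEarly + " -> " + sketch.scheduledJobs());
        }
    }
}
```

With the early return enabled, the cancelled job stays in the scheduled set, which mirrors the leftover entry seen in the test failure. With the early return disabled, the final round runs and the job is removed.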


> Flaky test 
> org.apache.cassandra.tools.TopPartitionsTest#testStartAndStopScheduledSampling
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18065
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18065
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tool/nodetool
>            Reporter: Andres de la Peña
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>
> The test 
> {{org.apache.cassandra.tools.TopPartitionsTest#testStartAndStopScheduledSampling}}
>  fails intermittently on trunk with CircleCI:
> * https://app.circleci.com/pipelines/github/adelapena/cassandra/2508/workflows/92f054d7-9386-498f-9ba4-330181cd4782/jobs/24692
> * https://app.circleci.com/pipelines/github/adelapena/cassandra/2511/workflows/7aba8baa-0a6d-404a-b08b-c6a8078caca3/jobs/24706/tests
> The failure looks like:
> {code}
> junit.framework.AssertionFailedError: Scheduled sampled tasks should be removed expected:<[]> but was:<[*.*]>
>       at org.apache.cassandra.tools.TopPartitionsTest.testStartAndStopScheduledSampling(TopPartitionsTest.java:116)
> {code}
> I haven't seen this failure on Butler/Jenkins yet, but it can be reproduced 
> with the CircleCI multiplexer:
> {code}
> .circleci/generate.sh -m \
>   -e REPEATED_UTESTS_COUNT=2000 \
>   -e REPEATED_UTESTS=org.apache.cassandra.tools.TopPartitionsTest
> {code}
> It seems to fail 11 times out of 2000 runs.


