[ 
https://issues.apache.org/jira/browse/CASSANDRA-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Lightfoot updated CASSANDRA-21189:
--------------------------------------
    Description: 
There's a race condition between cluster closing and startup between test 
scenarios due to lack of thread lifecycle handling. The spawned thread should 
be joined before the test finishes to prevent the 'in-use port' errors.

Affects
 * bootstrapProgressTest
 * decommissionProgressTest
 * replacementProgressTest

Adopt the same pattern as GossipTest with try-finally thread joining.
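
For illustration only, here is a minimal sketch of the try-finally joining idea in plain Java. It is not the GossipTest code or the actual patch: the class name, method name, thread name, sleep and timeouts are all hypothetical, and the sleep stands in for the operation the test runs on its spawned thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a test scenario that spawns a background thread.
public class ThreadJoinSketch {

    // Returns true if the spawned thread has fully terminated before the scenario returns.
    static boolean runScenario() throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean finished = new AtomicBoolean(false);

        // Assumption: the sleep simulates the long-running operation under test.
        Thread background = new Thread(() -> {
            started.countDown();
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                finished.set(true);
            }
        }, "background-operation");

        background.start();
        try {
            // Test body: wait for the operation to begin, then run assertions here.
            started.await(5, TimeUnit.SECONDS);
        } finally {
            // Always join, even if the test body throws, so no thread outlives the scenario.
            background.join(TimeUnit.SECONDS.toMillis(30));
            if (background.isAlive()) {
                // Fall back to interrupting a thread that did not stop in time.
                background.interrupt();
                background.join();
            }
        }
        return finished.get() && !background.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("thread finished before return: " + runScenario());
    }
}
```

Joining in the finally block means the thread is gone before the next scenario starts, whether or not the test body passed.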

Per additional research, a constrained request_timeout causing a Paxos commit 
SERVER_ERROR was the initial trigger of the errors. The timeout is 1s in the 
tests, which can be troublesome when CI is slow.

  was:
There's a race condition between cluster closing and startup between test 
scenarios due to lack of thread lifecycle handling. The spawned thread should 
be joined before the test finishes to prevent the 'in-use port' errors.

Affects
 * bootstrapProgressTest
 * decommissionProgressTest
 * replacementProgressTest

Adopt the same pattern as GossipTest with try-finally thread joining.


> Fix flaky DTest: InProgressSequenceCoordinationTest
> ---------------------------------------------------
>
>                 Key: CASSANDRA-21189
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21189
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Test/dtest/java
>            Reporter: Sam Lightfoot
>            Assignee: Sam Lightfoot
>            Priority: Normal
>             Fix For: 5.1
>
>
> There's a race condition between cluster closing and startup between test 
> scenarios due to lack of thread lifecycle handling. The spawned thread should 
> be joined before the test finishes to prevent the 'in-use port' errors.
> Affects
>  * bootstrapProgressTest
>  * decommissionProgressTest
>  * replacementProgressTest
> Adopt the same pattern as GossipTest with try-finally thread joining.
> Per additional research, a constrained request_timeout causing a Paxos commit 
> SERVER_ERROR was the initial trigger of the errors. The timeout is 1s in the 
> tests, which can be troublesome when CI is slow.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
