[ https://issues.apache.org/jira/browse/IGNITE-27921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vyacheslav Koptilin updated IGNITE-27921:
-----------------------------------------
Description:
When the test fails to start all required nodes (for example, because of a
timeout), it may leave stale nodes behind:
{noformat}
java.lang.AssertionError: Race operations took too long.
java.lang.AssertionError: Race operations took too long.
at org.apache.ignite.internal.testframework.IgniteTestUtils.createAssertionError(IgniteTestUtils.java:936)
at org.apache.ignite.internal.testframework.IgniteTestUtils.runRace(IgniteTestUtils.java:927)
at org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.startNodesInParallel(ItDisasterRecoveryReconfigurationTest.java:2010)
at org.apache.ignite.internal.disaster.ItDisasterRecoveryReconfigurationTest.setUp(ItDisasterRecoveryReconfigurationTest.java:188)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: java.lang.InterruptedException
at org.apache.ignite.internal.testframework.IgniteTestUtils.runRace(IgniteTestUtils.java:919)
{noformat}
The root cause is that `runRace` (which is used here to start nodes)
{noformat}
private void startNodesInParallel(int... nodeIndexes) {
    runRace(20_000, IntStream.of(nodeIndexes)
            .<RunnableX>mapToObj(i -> () -> cluster.startNode(i))
            .toArray(RunnableX[]::new));
}
{noformat}
doesn't wait for all of its internal threads to finish before failing. As a
result, a node may finish starting after the failure has already been thrown,
and the shutdown procedure might not see every node it needs to stop.
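
One possible direction, shown here only as a minimal sketch and not as the actual `IgniteTestUtils` code: a race helper that, on timeout or failure, interrupts its workers and waits for them to exit before rethrowing. The names `RaceSketch`, `ThrowingTask` (a stand-in for `RunnableX`), `runRaceAndJoin`, and the `graceMillis` parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper for illustration; not the real IgniteTestUtils. */
final class RaceSketch {
    /** Hypothetical stand-in for RunnableX: a task that is allowed to throw. */
    @FunctionalInterface
    interface ThrowingTask {
        void run() throws Throwable;
    }

    /**
     * Runs all tasks in parallel. On timeout or failure, interrupts the workers
     * and waits up to graceMillis for them to exit before throwing.
     */
    static void runRaceAndJoin(long timeoutMillis, long graceMillis, ThrowingTask... tasks) {
        if (tasks.length == 0) {
            return;
        }

        ExecutorService pool = Executors.newFixedThreadPool(tasks.length);
        List<Future<?>> futures = new ArrayList<>();

        for (ThrowingTask task : tasks) {
            futures.add(pool.submit(() -> {
                try {
                    task.run();
                } catch (Throwable t) {
                    throw new IllegalStateException(t);
                }
                return null;
            }));
        }

        // Accept no further tasks; the submitted ones keep running.
        pool.shutdown();

        AssertionError failure = null;
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);

        try {
            for (Future<?> future : futures) {
                future.get(Math.max(0, deadline - System.nanoTime()), TimeUnit.NANOSECONDS);
            }
        } catch (TimeoutException e) {
            failure = new AssertionError("Race operations took too long.", e);
        } catch (ExecutionException e) {
            failure = new AssertionError("Race operation failed.", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            failure = new AssertionError("Interrupted while waiting for race operations.", e);
        }

        if (failure == null) {
            return;
        }

        // The step this issue is about: do not fail until the workers are gone.
        pool.shutdownNow();
        boolean interrupted = Thread.interrupted();
        try {
            if (!pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                failure.addSuppressed(new IllegalStateException("Some race threads are still running."));
            }
        } catch (InterruptedException e) {
            interrupted = true;
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        throw failure;
    }
}
```

With a helper of this shape, `startNodesInParallel` would fail only after every worker has either finished or been given the grace period to stop, so the teardown has a better chance of seeing all started nodes. Interruption is cooperative, so this assumes that `cluster.startNode(i)` reacts to interrupts; if it does not, a node can still outlive the grace period, which the sketch reports as a suppressed exception.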
> ItDisasterRecoveryReconfigurationTest may leave stale nodes
> -----------------------------------------------------------
>
> Key: IGNITE-27921
> URL: https://issues.apache.org/jira/browse/IGNITE-27921
> Project: Ignite
> Issue Type: Bug
> Reporter: Vyacheslav Koptilin
> Assignee: Vyacheslav Koptilin
> Priority: Major
> Labels: MakeTeamcityGreenAgain, ignite-3