Andrey Khitrin created IGNITE-24342:
---------------------------------------
Summary: [Flaky] Cannot reliably start a 3-node cluster on a single
Windows machine
Key: IGNITE-24342
URL: https://issues.apache.org/jira/browse/IGNITE-24342
Project: Ignite
Issue Type: Bug
Affects Versions: 3.0
Environment: A single Windows 10 machine with 32 GB of RAM
Reporter: Andrey Khitrin
Attachments: logs.tgz
This issue does not reproduce on every run, but it occurs often enough to
observe.
How to reproduce:
# Start 3 Apache Ignite nodes with a static `nodeFinder` on a single machine
(configs are attached):
{code:java}
nodeFinder {
netClusterNodes=[
"127.0.0.1:3344",
"127.0.0.1:3345",
"127.0.0.1:3346"
]
type=STATIC
}
{code}
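When reproducing, it can help to see which of the three endpoints actually accepts connections before and after startup. The following is a minimal sketch, not part of Ignite: the class `NodeFinderProbe` and its methods are hypothetical names, and it assumes that each address in `netClusterNodes` is the TCP endpoint a node listens on.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

/** Hypothetical diagnostic helper (not an Ignite API): probes the netClusterNodes endpoints. */
class NodeFinderProbe {
    /** Splits a "host:port" string at the last colon. */
    static InetSocketAddress parseEndpoint(String endpoint) {
        int idx = endpoint.lastIndexOf(':');
        if (idx <= 0 || idx == endpoint.length() - 1) {
            throw new IllegalArgumentException("Expected host:port, got: " + endpoint);
        }
        String host = endpoint.substring(0, idx);
        int port = Integer.parseInt(endpoint.substring(idx + 1));
        return new InetSocketAddress(host, port);
    }

    /** Returns true if a TCP connection to the address can be opened within the timeout. */
    static boolean isListening(InetSocketAddress address, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(address, timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused and connect timeouts both end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Same addresses as in the nodeFinder config above.
        List<String> endpoints = List.of("127.0.0.1:3344", "127.0.0.1:3345", "127.0.0.1:3346");
        for (String endpoint : endpoints) {
            boolean up = isListening(parseEndpoint(endpoint), 1000);
            System.out.println(endpoint + " -> " + (up ? "listening" : "not reachable"));
        }
    }
}
```

A port that is already listening before any node is started would point to a leftover process from an earlier run, which is one thing worth ruling out on a shared machine.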
Expected result: all nodes are up.
Actual result: 2 of 3 nodes terminated with thread dumps, and the cluster could
not be initialized.
Key exceptions in logs:
# "IllegalStateException: cannot send more responses than requests" (see
attachment)
# Various RAFT-related and timeout errors:
{code:java}
2025-01-28 06:03:05:471 -0600 [ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Response-Processor-8][AbstractClientService] Fail to connect TablesAmountCapacityMultiNodeTest_cluster_0, exception: java.util.concurrent.TimeoutException.
2025-01-28 06:03:05:815 -0600 [INFO][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Request-Processor-24][NodeImpl] Node <cmg_group/TablesAmountCapacityMultiNodeTest_cluster_1> ignore PreVoteRequest from TablesAmountCapacityMultiNodeTest_cluster_0, term=2, currTerm=1, because the leader TablesAmountCapacityMultiNodeTest_cluster_1's lease is still valid.
2025-01-28 06:03:05:815 -0600 [ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Response-Processor-8][ReplicatorGroupImpl] Fail to check replicator connection to peer=TablesAmountCapacityMultiNodeTest_cluster_0, replicatorType=Follower.
2025-01-28 06:03:05:836 -0600 [ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%JRaft-Response-Processor-8][NodeImpl] Fail to add a replicator, peer=TablesAmountCapacityMultiNodeTest_cluster_0.
{code}
# Thread dumps in logs for 2 of 3 nodes (see attachment)
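To triage the attached logs, the entries can be split into fields and filtered by level or logger. The sketch below is only an illustration: `LogLineParser` and `LogEntry` are hypothetical names, the pattern is inferred from the four entries quoted above (timestamp with UTC offset, level, `%node%thread`, logger, message), and it assumes each entry has been joined onto a single line.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits a log entry of the quoted shape into its fields. */
class LogLineParser {
    /** Parsed fields of one log entry. */
    static final class LogEntry {
        final String timestamp, level, node, thread, logger, message;

        LogEntry(String timestamp, String level, String node, String thread, String logger, String message) {
            this.timestamp = timestamp;
            this.level = level;
            this.node = node;
            this.thread = thread;
            this.logger = logger;
            this.message = message;
        }
    }

    // Layout inferred from the quoted entries; other log layouts will not match.
    private static final Pattern LINE = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}:\\d{3} [+-]\\d{4})\\s+" // timestamp and offset
            + "\\[(\\w+)\\]"                 // level
            + "\\[%([^%]+)%([^\\]]+)\\]"     // %node%thread
            + "\\[([^\\]]+)\\]\\s*"          // logger
            + "(.*)$");                      // message

    /** Returns the parsed entry, or an empty Optional if the line has a different shape. */
    static Optional<LogEntry> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new LogEntry(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5), m.group(6)));
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "2025-01-28 06:03:05:836 -0600 [ERROR][%TablesAmountCapacityMultiNodeTest_cluster_1%"
                        + "JRaft-Response-Processor-8][NodeImpl] Fail to add a replicator, "
                        + "peer=TablesAmountCapacityMultiNodeTest_cluster_0.");
        for (String line : lines) {
            // Print only ERROR entries, with the reporting node and logger up front.
            parse(line)
                    .filter(e -> e.level.equals("ERROR"))
                    .ifPresent(e -> System.out.println(e.node + " [" + e.logger + "] " + e.message));
        }
    }
}
```

Grouping the ERROR entries by `node` and `logger` makes it easier to see which node first lost contact with its peers.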