[ 
https://issues.apache.org/jira/browse/IGNITE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814853#comment-17814853
 ] 

Mirza Aliev edited comment on IGNITE-21391 at 2/6/24 3:05 PM:
--------------------------------------------------------------

What has been debugged so far:

we can simplify the test scenario: start 3 raft node, restart two of them and 
wait for the topology on restarted nodes see all three nodes.

We see that sometimes assertion for that is failed after 10 seconds;
{noformat}
        assertTrue(waitForTopology(cluster, followerPeer1, 3, 10_000));
{noformat}
Logs show that while we wait for that assertion on the node A, node A do not 
receive {{[MembershipProtocol]}} events about any node, so topology on that 
node is not equal to the full topology.

After increasing of the timeout for the assertion from 10 to 60 seconds, we see 
that missing events from{{{}[MembershipProtocol]{}}}
{noformat}
[2024-02-06T15:05:59,401][INFO ][main][ItNodeTest] start follower int_tsp_5003
[2024-02-06T15:05:59,421][INFO ][main][ConnectionManager] Server started 
[address=/0:0:0:0:0:0:0:0:5003]
[2024-02-06T15:05:59,423][INFO ][main][Cluster] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][doStart] Starting, 
config: ClusterConfig[metadata=null, metadataTimeout=1000, 
metadataCodec=io.scalecube.cluster.metadata.JdkMetadataCodec@4eb9f2af, 
memberId='061c47a7-5b81-403f-bb1d-c94311653766', memberAlias='int_tsp_5003', 
externalHost='null', externalPort=null, transportConfig=TransportConfig[port=0, 
clientSecured=false, connectTimeout=1000, 
messageCodec=io.scalecube.cluster.transport.api.JdkMessageCodec@4a481728, 
maxFrameLength=2097152, 
transportFactory=org.apache.ignite.internal.network.scalecube.ScaleCubeClusterServiceFactory$2$$Lambda$676/0x0000000800458c40@3e252e42,
 
addressMapper=java.util.function.Function$$Lambda$99/0x0000000800122040@77fa9381],
 failureDetectorConfig=FailureDetectorConfig[pingInterval=500, pingTimeout=200, 
pingReqMembers=1], gossipConfig=GossipConfig[gossipFanout=3, gossipInterval=10, 
gossipRepeatMult=2, gossipSegmentationThreshold=1000], 
membershipConfig=MembershipConfig[seedMembers=[], syncInterval=1000, 
syncTimeout=3000, suspicionMult=1, namespace='default', 
removedMembersHistorySize=42]]
[2024-02-06T15:05:59,423][INFO ][main][Cluster] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][doStart] Started
[2024-02-06T15:05:59,423][INFO ][main][ScaleCubeTopologyService] Node joined 
[node=ClusterNodeImpl [id=061c47a7-5b81-403f-bb1d-c94311653766, 
name=int_tsp_5003, address=127.0.0.1:5003, nodeMetadata=null]]
[2024-02-06T15:05:59,423][INFO ][main][ScaleCubeTopologyService] Topology 
snapshot [nodes=[int_tsp_5003]]
[2024-02-06T15:05:59,492][INFO ][main][FSMCallerImpl] Starts FSMCaller 
successfully.
[2024-02-06T15:05:59,492][INFO ][2024-02-06T15:05:59,492][WARN 
][main][LocalSnapshotStorage] No data for snapshot reader 
build/work/ItNodeTest/testSetPeer2_613743809901666/int_tsp_5003/snapshot.
[main][NodeImpl] Init node int_tsp_5003 with empty conf.
[2024-02-06T15:05:59,492][INFO ][main][NodeImpl] Node <unitest/int_tsp_5003> 
init, term=0, lastLogId=LogId [index=0, term=0], conf=, oldConf=.
[2024-02-06T15:05:59,492][INFO ][main][RaftGroupService] Start the 
RaftGroupService successfully <unitest/int_tsp_5003>
[2024-02-06T15:05:59,492][INFO ][main][ItNodeTest] start follower int_tsp_5005
[2024-02-06T15:05:59,503][INFO ][main][ConnectionManager] Server started 
[address=/0:0:0:0:0:0:0:0:5005]
[2024-02-06T15:05:59,504][INFO ][main][Cluster] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][doStart] Starting, 
config: ClusterConfig[metadata=null, metadataTimeout=1000, 
metadataCodec=io.scalecube.cluster.metadata.JdkMetadataCodec@4eb9f2af, 
memberId='d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78', memberAlias='int_tsp_5005', 
externalHost='null', externalPort=null, transportConfig=TransportConfig[port=0, 
clientSecured=false, connectTimeout=1000, 
messageCodec=io.scalecube.cluster.transport.api.JdkMessageCodec@4a481728, 
maxFrameLength=2097152, 
transportFactory=org.apache.ignite.internal.network.scalecube.ScaleCubeClusterServiceFactory$2$$Lambda$676/0x0000000800458c40@61d812ff,
 
addressMapper=java.util.function.Function$$Lambda$99/0x0000000800122040@77fa9381],
 failureDetectorConfig=FailureDetectorConfig[pingInterval=500, pingTimeout=200, 
pingReqMembers=1], gossipConfig=GossipConfig[gossipFanout=3, gossipInterval=10, 
gossipRepeatMult=2, gossipSegmentationThreshold=1000], 
membershipConfig=MembershipConfig[seedMembers=[], syncInterval=1000, 
syncTimeout=3000, suspicionMult=1, namespace='default', 
removedMembersHistorySize=42]]
[2024-02-06T15:05:59,504][INFO ][main][Cluster] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][doStart] Started
[2024-02-06T15:05:59,504][INFO ][main][ScaleCubeTopologyService] Node joined 
[node=ClusterNodeImpl [id=d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78, 
name=int_tsp_5005, address=127.0.0.1:5005, nodeMetadata=null]]
[2024-02-06T15:05:59,504][INFO ][main][ScaleCubeTopologyService] Topology 
snapshot [nodes=[int_tsp_5005]]
[2024-02-06T15:05:59,573][INFO ][main][FSMCallerImpl] Starts FSMCaller 
successfully.
[2024-02-06T15:05:59,573][WARN ][main][LocalSnapshotStorage] No data for 
snapshot reader 
build/work/ItNodeTest/testSetPeer2_613743809901666/int_tsp_5005/snapshot.
[2024-02-06T15:05:59,573][INFO ][main][NodeImpl] Init node int_tsp_5005 with 
empty conf.
[2024-02-06T15:05:59,573][INFO ][main][NodeImpl] Node <unitest/int_tsp_5005> 
init, term=0, lastLogId=LogId [index=0, term=0], conf=, oldConf=.
[2024-02-06T15:05:59,573][INFO ][main][RaftGroupService] Start the 
RaftGroupService successfully <unitest/int_tsp_5005>
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5005-950][MembershipProtocol] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:05:59.593Z]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=6c68eb78-ac57-4dba-876a-0f46f907ee99, 
name=int_tsp_5004, address=127.0.0.1:5004, nodeMetadata=null]]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004]]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5004-947][MembershipProtocol] 
[default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:05:59.593Z]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78, 
name=int_tsp_5005, address=127.0.0.1:5005, nodeMetadata=null]]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004]]
[2024-02-06T15:06:11,598][INFO ][sc-cluster-5003-949][MembershipProtocol] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.598Z]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=6c68eb78-ac57-4dba-876a-0f46f907ee99, 
name=int_tsp_5004, address=127.0.0.1:5004, nodeMetadata=null]]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5004, int_tsp_5003]]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][MembershipProtocol] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.599Z]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78, 
name=int_tsp_5005, address=127.0.0.1:5005, nodeMetadata=null]]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
[2024-02-06T15:06:11,600][INFO ][sc-cluster-5004-947][MembershipProtocol] 
[default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.600Z]
[2024-02-06T15:06:11,600][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=061c47a7-5b81-403f-bb1d-c94311653766, 
name=int_tsp_5003, address=127.0.0.1:5003, nodeMetadata=null]]
[2024-02-06T15:06:11,600][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
[2024-02-06T15:06:11,603][INFO ][sc-cluster-5005-950][MembershipProtocol] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.603Z]
[2024-02-06T15:06:11,603][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=061c47a7-5b81-403f-bb1d-c94311653766, 
name=int_tsp_5003, address=127.0.0.1:5003, nodeMetadata=null]]
[2024-02-06T15:06:11,603][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
{noformat}
and finally we see
{noformat}
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
{noformat}
on the node A.

>From that log and some other failed runs we can see that this awaiting of the 
>{{[MembershipProtocol]}} messages on the node A last approx. 10-12 seconds.

The root cause may be connected with the incorrect scalecube timeouts, need to 
investigate further.


was (Author: maliev):
What has been debugged so far:

we can simplify the test scenario: start 3 raft node, restart two of them and 
wait for the topology on restarted nodes see all three nodes.

We see that sometimes assertion for that is failed after 10 seconds;
{noformat}
        assertTrue(waitForTopology(cluster, followerPeer1, 3, 10_000));
{noformat}
Logs show that while we wait for that assertion on the node A, node A do not 
receive {{[MembershipProtocol]}} events about any node, so topology on that 
node is not equal to the full topology.

After increasing of the timeout for the assertion from 10 to 60 seconds, we see 
that missing events from{{{}[MembershipProtocol]{}}}
{noformat}
[2024-02-06T15:05:59,401][INFO ][main][ItNodeTest] start follower int_tsp_5003
[2024-02-06T15:05:59,421][INFO ][main][ConnectionManager] Server started 
[address=/0:0:0:0:0:0:0:0:5003]
[2024-02-06T15:05:59,423][INFO ][main][Cluster] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][doStart] Starting, 
config: ClusterConfig[metadata=null, metadataTimeout=1000, 
metadataCodec=io.scalecube.cluster.metadata.JdkMetadataCodec@4eb9f2af, 
memberId='061c47a7-5b81-403f-bb1d-c94311653766', memberAlias='int_tsp_5003', 
externalHost='null', externalPort=null, transportConfig=TransportConfig[port=0, 
clientSecured=false, connectTimeout=1000, 
messageCodec=io.scalecube.cluster.transport.api.JdkMessageCodec@4a481728, 
maxFrameLength=2097152, 
transportFactory=org.apache.ignite.internal.network.scalecube.ScaleCubeClusterServiceFactory$2$$Lambda$676/0x0000000800458c40@3e252e42,
 
addressMapper=java.util.function.Function$$Lambda$99/0x0000000800122040@77fa9381],
 failureDetectorConfig=FailureDetectorConfig[pingInterval=500, pingTimeout=200, 
pingReqMembers=1], gossipConfig=GossipConfig[gossipFanout=3, gossipInterval=10, 
gossipRepeatMult=2, gossipSegmentationThreshold=1000], 
membershipConfig=MembershipConfig[seedMembers=[], syncInterval=1000, 
syncTimeout=3000, suspicionMult=1, namespace='default', 
removedMembersHistorySize=42]]
[2024-02-06T15:05:59,423][INFO ][main][Cluster] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][doStart] Started
[2024-02-06T15:05:59,423][INFO ][main][ScaleCubeTopologyService] Node joined 
[node=ClusterNodeImpl [id=061c47a7-5b81-403f-bb1d-c94311653766, 
name=int_tsp_5003, address=127.0.0.1:5003, nodeMetadata=null]]
[2024-02-06T15:05:59,423][INFO ][main][ScaleCubeTopologyService] Topology 
snapshot [nodes=[int_tsp_5003]]
[2024-02-06T15:05:59,492][INFO ][main][FSMCallerImpl] Starts FSMCaller 
successfully.
[2024-02-06T15:05:59,492][INFO ][2024-02-06T15:05:59,492][WARN 
][main][LocalSnapshotStorage] No data for snapshot reader 
build/work/ItNodeTest/testSetPeer2_613743809901666/int_tsp_5003/snapshot.
[main][NodeImpl] Init node int_tsp_5003 with empty conf.
[2024-02-06T15:05:59,492][INFO ][main][NodeImpl] Node <unitest/int_tsp_5003> 
init, term=0, lastLogId=LogId [index=0, term=0], conf=, oldConf=.
[2024-02-06T15:05:59,492][INFO ][main][RaftGroupService] Start the 
RaftGroupService successfully <unitest/int_tsp_5003>
[2024-02-06T15:05:59,492][INFO ][main][ItNodeTest] start follower int_tsp_5005
[2024-02-06T15:05:59,503][INFO ][main][ConnectionManager] Server started 
[address=/0:0:0:0:0:0:0:0:5005]
[2024-02-06T15:05:59,504][INFO ][main][Cluster] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][doStart] Starting, 
config: ClusterConfig[metadata=null, metadataTimeout=1000, 
metadataCodec=io.scalecube.cluster.metadata.JdkMetadataCodec@4eb9f2af, 
memberId='d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78', memberAlias='int_tsp_5005', 
externalHost='null', externalPort=null, transportConfig=TransportConfig[port=0, 
clientSecured=false, connectTimeout=1000, 
messageCodec=io.scalecube.cluster.transport.api.JdkMessageCodec@4a481728, 
maxFrameLength=2097152, 
transportFactory=org.apache.ignite.internal.network.scalecube.ScaleCubeClusterServiceFactory$2$$Lambda$676/0x0000000800458c40@61d812ff,
 
addressMapper=java.util.function.Function$$Lambda$99/0x0000000800122040@77fa9381],
 failureDetectorConfig=FailureDetectorConfig[pingInterval=500, pingTimeout=200, 
pingReqMembers=1], gossipConfig=GossipConfig[gossipFanout=3, gossipInterval=10, 
gossipRepeatMult=2, gossipSegmentationThreshold=1000], 
membershipConfig=MembershipConfig[seedMembers=[], syncInterval=1000, 
syncTimeout=3000, suspicionMult=1, namespace='default', 
removedMembersHistorySize=42]]
[2024-02-06T15:05:59,504][INFO ][main][Cluster] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][doStart] Started
[2024-02-06T15:05:59,504][INFO ][main][ScaleCubeTopologyService] Node joined 
[node=ClusterNodeImpl [id=d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78, 
name=int_tsp_5005, address=127.0.0.1:5005, nodeMetadata=null]]
[2024-02-06T15:05:59,504][INFO ][main][ScaleCubeTopologyService] Topology 
snapshot [nodes=[int_tsp_5005]]
[2024-02-06T15:05:59,573][INFO ][main][FSMCallerImpl] Starts FSMCaller 
successfully.
[2024-02-06T15:05:59,573][WARN ][main][LocalSnapshotStorage] No data for 
snapshot reader 
build/work/ItNodeTest/testSetPeer2_613743809901666/int_tsp_5005/snapshot.
[2024-02-06T15:05:59,573][INFO ][main][NodeImpl] Init node int_tsp_5005 with 
empty conf.
[2024-02-06T15:05:59,573][INFO ][main][NodeImpl] Node <unitest/int_tsp_5005> 
init, term=0, lastLogId=LogId [index=0, term=0], conf=, oldConf=.
[2024-02-06T15:05:59,573][INFO ][main][RaftGroupService] Start the 
RaftGroupService successfully <unitest/int_tsp_5005>
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5005-950][MembershipProtocol] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:05:59.593Z]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=6c68eb78-ac57-4dba-876a-0f46f907ee99, 
name=int_tsp_5004, address=127.0.0.1:5004, nodeMetadata=null]]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004]]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5004-947][MembershipProtocol] 
[default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:05:59.593Z]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78, 
name=int_tsp_5005, address=127.0.0.1:5005, nodeMetadata=null]]
[2024-02-06T15:05:59,593][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004]]
[2024-02-06T15:06:11,598][INFO ][sc-cluster-5003-949][MembershipProtocol] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.598Z]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=6c68eb78-ac57-4dba-876a-0f46f907ee99, 
name=int_tsp_5004, address=127.0.0.1:5004, nodeMetadata=null]]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5004, int_tsp_5003]]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][MembershipProtocol] 
[default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.599Z]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=d73fbde0-3e84-45fa-a27c-4ca5c7bdbe78, 
name=int_tsp_5005, address=127.0.0.1:5005, nodeMetadata=null]]
[2024-02-06T15:06:11,599][INFO ][sc-cluster-5003-949][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
[2024-02-06T15:06:11,600][INFO ][sc-cluster-5004-947][MembershipProtocol] 
[default:int_tsp_5004:6c68eb78ac574dba@127.0.0.1:5004][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.600Z]
[2024-02-06T15:06:11,600][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=061c47a7-5b81-403f-bb1d-c94311653766, 
name=int_tsp_5003, address=127.0.0.1:5003, nodeMetadata=null]]
[2024-02-06T15:06:11,600][INFO ][sc-cluster-5004-947][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
[2024-02-06T15:06:11,603][INFO ][sc-cluster-5005-950][MembershipProtocol] 
[default:int_tsp_5005:573fbde03e8445fa@127.0.0.1:5005][publishEvent] 
MembershipEvent[type=ADDED, 
member=default:int_tsp_5003:61c47a75b81403f@127.0.0.1:5003, oldMetadata=null, 
newMetadata=7e16449-5, timestamp=2024-02-06T11:06:11.603Z]
[2024-02-06T15:06:11,603][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Node joined [node=ClusterNodeImpl [id=061c47a7-5b81-403f-bb1d-c94311653766, 
name=int_tsp_5003, address=127.0.0.1:5003, nodeMetadata=null]]
[2024-02-06T15:06:11,603][INFO ][sc-cluster-5005-950][ScaleCubeTopologyService] 
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
{noformat}
and finally we see
{noformat}
Topology snapshot [nodes=[int_tsp_5005, int_tsp_5004, int_tsp_5003]]
{noformat}
on the node A.

 

>From that log and some other failed runs we ca see that this awaiting of the 
>{{[MembershipProtocol]}} messages on the node A last approx. 10-12 seconds. 

The root cause may be connected with the incorrect scalecube timeouts, need to 
investigate further. 

> ItNodeTest#testAppendEntriesWhenFollowerIsInErrorState is flaky
> ---------------------------------------------------------------
>
>                 Key: IGNITE-21391
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21391
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
>   at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>   at 
> app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>   at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
>   at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
>   at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
>   at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:180)
>   at 
> app//org.apache.ignite.raft.jraft.core.TestCluster.ensureSame(TestCluster.java:558)
>   at 
> app//org.apache.ignite.raft.jraft.core.TestCluster.ensureSame(TestCluster.java:530)
>   at 
> app//org.apache.ignite.raft.jraft.core.ItNodeTest.testAppendEntriesWhenFollowerIsInErrorState(ItNodeTest.java:2568){code}
> ItNodeTest.java:2568 is 
> {code:java}
> assertTrue(cluster.start(findById(peers, oldLeaderAddr)));{code}
> Didn't manage to reproduce the issue locally, however there are several 
> failures in TC with same reason.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to