Hi,

Thanks for the recommendations. In this case, neither the server nor the client showed memory issues (heap and available memory in the container both looked fine). The GC pauses were very short too.
The configured timeouts are the defaults:

clientFailureDetectionTimeout = 30000
failureDetectionTimeout = 10000

The latch was not notified for more than 24 hours, even though the timeout is 30 seconds. How can this explain the node hanging for a day? That's why I was thinking about a message that got lost. Do you think that using different values for those parameters would avoid this scenario?

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
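
P.S. To make the scenario concrete, here is a minimal sketch of the waiting pattern. It uses the JDK's java.util.concurrent.CountDownLatch as a stand-in for the real latch, and the class and method names (BoundedLatchWait, awaitWithTimeout) are hypothetical, not our actual code. The point is only that an unbounded await() blocks forever if the countDown notification never arrives, whereas a bounded await returns false and lets the caller log the problem and react.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; the JDK latch stands in for the distributed one.
class BoundedLatchWait {

    // Waits at most timeoutMs for the latch to reach zero.
    // Returns true if it was released, false on timeout or interruption.
    static boolean awaitWithTimeout(CountDownLatch latch, long timeoutMs) {
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulates a lost notification: nobody ever calls countDown().
        CountDownLatch lost = new CountDownLatch(1);
        if (!awaitWithTimeout(lost, 200)) {
            System.out.println("Latch not released within timeout; notification may have been lost");
        }

        // Normal case: the notification arrives before the wait starts.
        CountDownLatch delivered = new CountDownLatch(1);
        delivered.countDown();
        System.out.println("released: " + awaitWithTimeout(delivered, 200));
    }
}
```

A bounded wait like this would not explain why the notification was missed, but it would at least keep the node from hanging silently for a day.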
