Hi,
Are there any benchmarks showing what inter-node latency is acceptable
for an Ignite cluster to remain stable?

We currently run a single cluster spanning AZs within the same region. The
inter-AZ latency published by the cloud provider is ~0.4-1 ms.
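
To compare the published figure with what a given box actually sees,
something like the following could be run from each node. This is only a
sketch: the class and method names are made up for illustration, and port
47100 is an assumption (Ignite's default communication port); any
reachable TCP port on the peer would do. A TCP connect takes roughly one
round trip, so the median connect time is a rough proxy for the latency.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;

// Hypothetical helper, not part of Ignite.
class ConnectLatencyProbe {
    /** Median of the samples, converted from nanoseconds to milliseconds. */
    static double medianMillis(long[] nanos) {
        long[] sorted = nanos.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        double medianNanos = (sorted.length % 2 == 1)
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
        return medianNanos / 1_000_000.0;
    }

    /** Times n TCP connects to host:port; each connect costs about one round trip. */
    static long[] sample(String host, int port, int n, int timeoutMs) throws IOException {
        // Resolve the address once so DNS lookup time is not counted in the samples.
        InetSocketAddress addr = new InetSocketAddress(host, port);
        long[] out = new long[n];
        for (int i = 0; i < n; i++) {
            try (Socket s = new Socket()) {
                long start = System.nanoTime();
                s.connect(addr, timeoutMs);
                out[i] = System.nanoTime() - start;
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        // Assumed port: 47100 is the default for Ignite's TcpCommunicationSpi.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 47100;
        long[] samples = sample(host, port, 50, 1000);
        System.out.printf("median connect time to %s:%d = %.3f ms%n",
            host, port, medianMillis(samples));
    }
}
```

Running it against a peer in the same AZ and one in a different AZ would
show whether the boxes with problems really sit above ~0.8 ms.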

What we have observed is that on boxes where the inter-AZ latency is higher,
i.e. > 0.8 ms, server engine memory starts growing exponentially. We
controlled that by setting the message queue limit (messageQueueLimit) and
slowClientQueueLimit to 1024 and 1023 respectively, which brought memory
back in check.
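
For reference, the two limits mentioned above can be set programmatically
roughly as follows. This is a minimal sketch, assuming the default
TcpCommunicationSpi is in use; the class and method names
(CommunicationLimits, build) are made up for illustration, and only the
Ignite setters are real API.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

// Hypothetical wrapper class; requires ignite-core on the classpath.
class CommunicationLimits {
    static IgniteConfiguration build() {
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();

        // Cap on queued outbound messages per connection.
        commSpi.setMessageQueueLimit(1024);

        // Client nodes whose outbound queue grows past this are dropped.
        commSpi.setSlowClientQueueLimit(1023);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCommunicationSpi(commSpi);
        return cfg;
    }
}
```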

However, we are now seeing client nodes fail with "Client node outbound
message queue size exceeded slowClientQueueLimit, the client will be
dropped (consider changing 'slowClientQueueLimit' configuration property)".

As a result, these client nodes disconnect and reconnect continuously, and
no processing gets through.

Are there any Ignite benchmarks or documents which state that, for a
stable Ignite cluster, the latency between nodes cannot exceed x ms?

If this is instead an issue in our application, I would like to
understand how to troubleshoot or work around it.

Thanks
Victor
