[
https://issues.apache.org/jira/browse/IGNITE-25324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955235#comment-17955235
]
Roman Puchkovskiy commented on IGNITE-25324:
--------------------------------------------
The suspicion timeout is calculated as multiplier * pingInterval *
ceil(log2(clusterSize + 1)). The default multiplier is 5 and pingInterval is
1000 ms, so:
* For 2-3 nodes the suspicion timeout is 10 seconds
* For 4-7 nodes it's 15 seconds
* For 8-15 nodes it's 20 seconds
In Apache Ignite 2, the default failure detection timeout is 30 seconds (for
any cluster size). Given that a typical cluster size is around 7-10 nodes, it
makes sense to raise pingInterval to 2 seconds, so that the timeout becomes
20 seconds for 2-3 nodes, 30 seconds for 4-7 nodes, and 40 seconds for 8-15
nodes.
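The arithmetic above can be checked with a small sketch. It only restates the
formula from this comment; the function name suspicion_timeout_ms and its
parameter names are hypothetical and are not part of the ScaleCube or Ignite
API.

```python
import math


def suspicion_timeout_ms(cluster_size, ping_interval_ms=1000, multiplier=5):
    """Suspicion timeout per the formula quoted in this comment.

    Hypothetical helper for illustration only: it is not a ScaleCube or
    Ignite function, it just evaluates
    multiplier * pingInterval * ceil(log2(clusterSize + 1)).
    """
    return multiplier * ping_interval_ms * math.ceil(math.log2(cluster_size + 1))


if __name__ == "__main__":
    # Compare the current pingInterval (1000 ms) with the proposed one (2000 ms)
    # at the boundaries of each cluster-size range mentioned above.
    for size in (2, 3, 4, 7, 8, 15):
        current = suspicion_timeout_ms(size)
        proposed = suspicion_timeout_ms(size, ping_interval_ms=2000)
        print(f"{size:>2} nodes: current {current} ms, proposed {proposed} ms")
```

Because the timeout grows with ceil(log2(clusterSize + 1)), it only changes
when the cluster size crosses a power of two, which is why the values come in
ranges (2-3, 4-7, 8-15).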
> Increase scalecube failure detection timeouts
> ---------------------------------------------
>
> Key: IGNITE-25324
> URL: https://issues.apache.org/jira/browse/IGNITE-25324
> Project: Ignite
> Issue Type: Improvement
> Reporter: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
>
> Default timeouts seem to be too short as clusters sometimes fall apart under
> load.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)