[ https://issues.apache.org/jira/browse/IGNITE-20165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753233#comment-17753233 ]
Mirza Aliev edited comment on IGNITE-20165 at 8/11/23 1:59 PM:
---------------------------------------------------------------

||Pool name||Description||Number of Threads||
|JRaft-Common-Executor|A pool for processing short-lived asynchronous tasks. Should never be blocked.|Utils.cpus() (core == max)|
|JRaft-Node-Scheduler|A scheduled executor for running delayed or repeating tasks.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|JRaft-Request-Processor|A default pool for handling RAFT requests. Should never be blocked.|Utils.cpus() * 6 (core == max)|
|JRaft-Response-Processor|A default pool for handling RAFT responses. Should never be blocked.|80 (core == max/3)|
|JRaft-AppendEntries-Processor|A pool of single-thread executors. Used only if replication pipelining is enabled. Handles append entries requests and responses (used by the replication flow). Threads are started on demand. Each replication pair (leader-follower) uses a dedicated single-thread executor from the pool, so all messages between replication peer pairs are processed sequentially.|SystemPropertyUtil.getInt("jraft.append.entries.threads.send", Math.max(16, Ints.findNextPositivePowerOfTwo(cpus() * 2)));|
|NodeImpl-Disruptor|A striped disruptor for batching FSM (finite state machine) user tasks.|DEFAULT_STRIPES = Utils.cpus() * 2|
|ReadOnlyService-Disruptor|A striped disruptor for batching read requests before issuing a read index request.|DEFAULT_STRIPES = Utils.cpus() * 2|
|LogManager-Disruptor|A striped disruptor for delivering log entries to storage.|DEFAULT_STRIPES = Utils.cpus() * 2|
|FSMCaller-Disruptor|A striped disruptor for FSM callbacks.|DEFAULT_STRIPES = Utils.cpus() * 2|
|SnapshotTimer|A timer for periodic snapshot creation.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|ElectionTimer|A timer to handle election timeout on followers.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|VoteTimer|A timer to handle vote timeout when a leader was not confirmed by a majority.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|StepDownTimer|A timer to process the leader step-down condition.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|


was (Author: maliev):
||Pool name||Description||Number of Threads||
|JRaft-Common-Executor|A pool for processing short-lived asynchronous tasks. Should never be blocked.|Utils.cpus() (core == max)|
|JRaft-Node-Scheduler|A scheduled executor for running delayed or repeating tasks.|Math.min(Utils.cpus() * 3, 20)|
|JRaft-Request-Processor|A default pool for handling RAFT requests. Should never be blocked.|Utils.cpus() * 6 (core == max)|
|JRaft-Response-Processor|A default pool for handling RAFT responses. Should never be blocked.|80 (core == max/3)|
|JRaft-AppendEntries-Processor|A pool of single-thread executors. Used only if replication pipelining is enabled. Handles append entries requests and responses (used by the replication flow). Threads are started on demand. Each replication pair (leader-follower) uses a dedicated single-thread executor from the pool, so all messages between replication peer pairs are processed sequentially.|SystemPropertyUtil.getInt("jraft.append.entries.threads.send", Math.max(16, Ints.findNextPositivePowerOfTwo(cpus() * 2)));|
|NodeImpl-Disruptor|A striped disruptor for batching FSM (finite state machine) user tasks.|DEFAULT_STRIPES = Utils.cpus() * 2|
|ReadOnlyService-Disruptor|A striped disruptor for batching read requests before issuing a read index request.|DEFAULT_STRIPES = Utils.cpus() * 2|
|LogManager-Disruptor|A striped disruptor for delivering log entries to storage.|DEFAULT_STRIPES = Utils.cpus() * 2|
|FSMCaller-Disruptor|A striped disruptor for FSM callbacks.|DEFAULT_STRIPES = Utils.cpus() * 2|
|SnapshotTimer|A timer for periodic snapshot creation.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|ElectionTimer|A timer to handle election timeout on followers.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|VoteTimer|A timer to handle vote timeout when a leader was not confirmed by a majority.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|StepDownTimer|A timer to process the leader step-down condition.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|


> Revisit the configuration of thread pools used by JRaft
> -------------------------------------------------------
>
>                 Key: IGNITE-20165
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20165
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Aleksandr Polovtcev
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>              Labels: ignite-3
>
> JRaft uses a bunch of thread pools to execute its operations. Most of these
> thread pools use the number of CPUs to determine the number of threads they
> can use. For example, as described in IGNITE-20080, having 64 cores led to
> JRaft allocating around 600 threads. Even though these thread pools are
> shared between all Raft nodes, this approach is clearly sub-optimal, because
> it should take into account both the number of nodes and the number of
> processors. It may also be beneficial to review how many thread pools are
> used and why they are needed, and to reduce their number if possible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
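
To see how the defaults listed in the table scale with the core count, the formulas can be evaluated directly. The following is only a minimal sketch: the class `JraftPoolSizes` and its method names are hypothetical, `nextPowerOfTwo` is a local stand-in written on the assumption that `Ints.findNextPositivePowerOfTwo` returns the smallest power of two not below its argument, and the `jraft.append.entries.threads.send` override is ignored. The values are configured sizes (core sizes or stripe counts), not a measurement of live threads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: evaluates the default sizing formulas from the table. */
final class JraftPoolSizes {
    /** Stand-in for Ints.findNextPositivePowerOfTwo (assumed: smallest power of two >= value). */
    static int nextPowerOfTwo(int value) {
        return value <= 1 ? 1 : Integer.highestOneBit(value - 1) << 1;
    }

    /** JRaft-Common-Executor: Utils.cpus(). */
    static int commonExecutor(int cpus) { return cpus; }

    /** Core size shared by JRaft-Node-Scheduler and the four timers. */
    static int schedulerOrTimerCore(int cpus) { return Math.min(cpus * 3, 20); }

    /** JRaft-Request-Processor: Utils.cpus() * 6. */
    static int requestProcessor(int cpus) { return cpus * 6; }

    /** JRaft-Response-Processor: fixed core size of 80. */
    static int responseProcessorCore() { return 80; }

    /** JRaft-AppendEntries-Processor default, without the system property override. */
    static int appendEntries(int cpus) { return Math.max(16, nextPowerOfTwo(cpus * 2)); }

    /** DEFAULT_STRIPES used by each of the four striped disruptors. */
    static int disruptorStripes(int cpus) { return cpus * 2; }

    /** Collects the per-pool values in table order. */
    static Map<String, Integer> sizes(int cpus) {
        Map<String, Integer> result = new LinkedHashMap<>();
        result.put("JRaft-Common-Executor", commonExecutor(cpus));
        result.put("JRaft-Node-Scheduler (core)", schedulerOrTimerCore(cpus));
        result.put("JRaft-Request-Processor", requestProcessor(cpus));
        result.put("JRaft-Response-Processor (core)", responseProcessorCore());
        result.put("JRaft-AppendEntries-Processor", appendEntries(cpus));
        result.put("Each *-Disruptor (stripes)", disruptorStripes(cpus));
        result.put("Each *Timer (core)", schedulerOrTimerCore(cpus));
        return result;
    }

    public static void main(String[] args) {
        // Assumption: Utils.cpus() reports the same value as availableProcessors().
        int cpus = args.length > 0
                ? Integer.parseInt(args[0])
                : Runtime.getRuntime().availableProcessors();
        sizes(cpus).forEach((pool, size) -> System.out.println(pool + " = " + size));
    }
}
```

Passing a core count as the first argument (for example `64`) prints the per-pool values for that machine size. With no argument, the local processor count is used.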