[ https://issues.apache.org/jira/browse/ARTEMIS-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Francesco Nigro updated ARTEMIS-3289:
-------------------------------------
    Description:
As mentioned in https://issues.apache.org/jira/browse/ARTEMIS-2877, one of the major factors limiting the scalability of a shared-nothing replication setup is the thread wake-up cost of the {{JournalImpl}}'s {{appendExecutor}} I/O threads.

See the flamegraph below for a busy replica while appending replicated journal records:

!image-2021-05-11-09-32-15-538.png|width=966,height=313!

The violet bars represent the CPU time spent waking up the Journal appender thread(s): although https://issues.apache.org/jira/browse/ARTEMIS-2877 allows the backup to batch append tasks as much as possible, the signaling cost still seems too high compared with the rest of the replica packet processing.

The append executor is an ordered executor built on top of the I/O thread pool, see {{ActiveMQServerImpl}}:
{code:java}
if (serviceRegistry.getIOExecutorService() != null) {
   this.ioExecutorFactory = new OrderedExecutorFactory(serviceRegistry.getIOExecutorService());
} else {
   ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
      @Override
      public ThreadFactory run() {
         return new ActiveMQThreadFactory("ActiveMQ-IO-server-" + this.toString(), false, ClientSessionFactoryImpl.class.getClassLoader());
      }
   });
   this.ioExecutorPool = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>(), tFactory);
   this.ioExecutorFactory = new OrderedExecutorFactory(ioExecutorPool);
}
{code}
Since this pool uses a {{SynchronousQueue}} to submit/take new wakeup tasks, it is worth investigating whether a different thread pool, executor or "sleeping" strategy could reduce this cost under heavy load and dramatically improve response time with/without replication.
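To make the hand-off cost described above concrete, here is a minimal sketch, not the Artemis code. It builds a pool with the same {{ThreadPoolExecutor}} parameters as the snippet (without the custom thread factory) and puts a simplified ordered executor on top of it. {{WakeupSketch}}, {{SimpleOrderedExecutor}} and {{Result}} are hypothetical names introduced only for illustration. The sketch counts how often the ordered executor goes from idle to busy, because each such transition submits a drain task to the pool, and with a {{SynchronousQueue}} that submission must either be handed to an idle worker blocked on the queue or start a new thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class WakeupSketch {

    /** Hypothetical, simplified ordered executor; NOT the Artemis implementation. */
    static final class SimpleOrderedExecutor implements Executor {
        private final Queue<Runnable> tasks = new ArrayDeque<>();
        private final Executor delegate;
        private final AtomicInteger handOffs = new AtomicInteger();
        private boolean draining; // guarded by the lock on "tasks"

        SimpleOrderedExecutor(Executor delegate) {
            this.delegate = delegate;
        }

        @Override
        public void execute(Runnable task) {
            boolean schedule;
            synchronized (tasks) {
                tasks.add(task);
                schedule = !draining;
                draining = true;
            }
            if (schedule) {
                // Idle -> busy transition: a pool thread has to be signalled here.
                handOffs.incrementAndGet();
                delegate.execute(this::drain);
            }
        }

        private void drain() {
            for (;;) {
                Runnable next;
                synchronized (tasks) {
                    next = tasks.poll();
                    if (next == null) {
                        draining = false; // go idle; the next execute() pays a hand-off again
                        return;
                    }
                }
                next.run();
            }
        }

        int handOffs() {
            return handOffs.get();
        }
    }

    /** Outcome of one run: number of pool hand-offs and whether task order was kept. */
    static final class Result {
        final int handOffs;
        final boolean inOrder;

        Result(int handOffs, boolean inOrder) {
            this.handOffs = handOffs;
            this.inOrder = inOrder;
        }
    }

    static Result run(int taskCount) {
        // Same pool parameters as the ActiveMQServerImpl snippet, default thread factory.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
        try {
            SimpleOrderedExecutor ordered = new SimpleOrderedExecutor(pool);
            List<Integer> seen = Collections.synchronizedList(new ArrayList<Integer>());
            CountDownLatch done = new CountDownLatch(taskCount);
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                ordered.execute(() -> {
                    seen.add(id);
                    done.countDown();
                });
            }
            done.await();
            boolean inOrder = true;
            for (int i = 0; i < taskCount; i++) {
                inOrder &= seen.get(i) == i;
            }
            return new Result(ordered.handOffs(), inOrder);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Result r = run(100_000);
        System.out.println("tasks=100000 handOffs=" + r.handOffs + " inOrder=" + r.inOrder);
    }
}
```

The number of hand-offs varies from run to run: a producer that stays ahead of the consumer keeps the executor busy and pays few hand-offs, while a producer that submits at roughly the consumer's pace finds the executor idle again and again and pays one hand-off per small batch. The sketch only counts these transitions; it does not measure their CPU cost.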
  was:
As mentioned in https://issues.apache.org/jira/browse/ARTEMIS-2877, one of the major factors limiting the scalability of a shared-nothing replication setup is the thread wake-up cost of the {{JournalImpl}}'s {{appendExecutor}} I/O threads.

See the flamegraph below for a busy replica while appending replicated journal records:

!image-2021-05-11-09-32-15-538.png|width=966,height=313!

The violet bars represent the CPU time spent waking up the Journal appender thread(s): although https://issues.apache.org/jira/browse/ARTEMIS-2877 allows the backup to batch append tasks as much as possible, the signaling cost still seems too high compared with the rest of the replica packet processing.

The append executor is an ordered executor built on top of the I/O thread pool, see {{ActiveMQServerImpl}}:
{code:java}
if (serviceRegistry.getIOExecutorService() != null) {
   this.ioExecutorFactory = new OrderedExecutorFactory(serviceRegistry.getIOExecutorService());
} else {
   ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
      @Override
      public ThreadFactory run() {
         return new ActiveMQThreadFactory("ActiveMQ-IO-server-" + this.toString(), false, ClientSessionFactoryImpl.class.getClassLoader());
      }
   });
   this.ioExecutorPool = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>(), tFactory);
   this.ioExecutorFactory = new OrderedExecutorFactory(ioExecutorPool);
}
{code}
Since this pool uses a {{SynchronousQueue}} to submit/take new wakeup tasks, using a different thread pool, executor or "sleeping" strategy could reduce this cost under heavy load and dramatically improve response time with/without replication.
> Reduce journal appender executor Thread wakeup cost
> ---------------------------------------------------
>
>                 Key: ARTEMIS-3289
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3289
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>         Attachments: image-2021-05-11-09-32-15-538.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h