[ https://issues.apache.org/jira/browse/ARTEMIS-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Domenico Francesco Bruscino closed ARTEMIS-2877.
------------------------------------------------
    Resolution: Done

> Fix journal replication scalability
> ------------------------------------
>
>                 Key: ARTEMIS-2877
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2877
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.7.0, 2.8.1, 2.9.0, 2.10.0, 2.10.1, 2.11.0, 2.12.0, 2.13.0, 2.14.0
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>             Fix For: 2.15.0
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Journal scalability with a replicated pair has degraded due to:
> * a semantic change on journal sync that was causing the Netty event loop on the backup to await any journal operation to hit the disk - see https://issues.apache.org/jira/browse/ARTEMIS-2837
> * a semantic change on NettyConnection::write from within the Netty event loop, which now writes and flushes buffers immediately, whereas it previously deferred them by offering the write back to the event loop - see https://issues.apache.org/jira/browse/ARTEMIS-2205 (in particular https://github.com/apache/activemq-artemis/commit/a40a459f8c536a10a0dccae6e522ec38f09dd544#diff-3477fe0d8138d589ef33feeea2ecd28eL377-L392)
> The former issue has been solved by reverting the changes and reimplementing them without introducing any semantic change.
> The latter needs some more explanation to be understood:
> # ReplicationEndpoint is responsible for handling packets from the live broker
> # Netty delivers incoming packets to ReplicationEndpoint in batches
> # after each processed packet coming from the live (which would likely end up appending something to the journal), a replication packet response needs to be sent back from the backup to the live: in the original behavior (< 2.7.0) the responses were delayed and flushed to the connection only at the end of a processed batch of packets, causing the journal to append records in bursts and amortizing the full cost of awaking the I/O thread responsible for appending data to the journal.
> To emulate the original "bursty" behavior, while making the batching more explicit (and tunable too), this can be solved by:
> # using Netty's ChannelInboundHandler::channelReadComplete event to flush each batch of packet responses as before
> # [OPTIONAL] implementing a new append executor on the journal to further reduce the cost of awaking the appending thread
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
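The flush-on-channelReadComplete idea above can be illustrated with a minimal, Netty-free sketch. This is not the Artemis implementation: `StubChannel` and `ReplicationResponseBatcher` are hypothetical names standing in for a Netty `Channel` and `ChannelInboundHandler`, assuming (as Netty guarantees) that `channelReadComplete` fires once after each batch of `channelRead` calls. Responses are only queued per packet and flushed in a single burst per batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Netty Channel: write() only queues,
// flush() pushes the whole pending batch out in one operation.
final class StubChannel {
    final List<String> pending = new ArrayList<>();
    final List<List<String>> flushedBatches = new ArrayList<>();

    void write(String response) {
        pending.add(response); // queued, nothing hits the wire yet
    }

    void flush() {
        if (!pending.isEmpty()) {
            // one flush (one syscall in real Netty) for the whole batch
            flushedBatches.add(new ArrayList<>(pending));
            pending.clear();
        }
    }
}

// Hypothetical handler mirroring ChannelInboundHandler semantics:
// channelRead per packet, channelReadComplete once per delivered batch.
final class ReplicationResponseBatcher {
    private final StubChannel channel;

    ReplicationResponseBatcher(StubChannel channel) {
        this.channel = channel;
    }

    void channelRead(String packet) {
        // in the real broker the journal append would happen here;
        // the response is only queued, not flushed per packet
        channel.write("response-to-" + packet);
    }

    void channelReadComplete() {
        // flush all responses of the batch in one burst,
        // emulating the pre-2.7.0 behavior
        channel.flush();
    }
}

public class Demo {
    public static void main(String[] args) {
        StubChannel ch = new StubChannel();
        ReplicationResponseBatcher handler = new ReplicationResponseBatcher(ch);

        // the event loop delivers a batch of 3 packets, then read-complete
        handler.channelRead("p1");
        handler.channelRead("p2");
        handler.channelRead("p3");
        handler.channelReadComplete();

        System.out.println(ch.flushedBatches.size());        // 1
        System.out.println(ch.flushedBatches.get(0).size()); // 3
    }
}
```

The point of the pattern is that the per-batch flush, not the per-packet flush, is what lets the backup's journal see appends in bursts.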