[ 
https://issues.apache.org/jira/browse/ARTEMIS-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Domenico Francesco Bruscino closed ARTEMIS-2877.
------------------------------------------------
    Resolution: Done

> Fix journal replication scalability 
> ------------------------------------
>
>                 Key: ARTEMIS-2877
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2877
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.7.0, 2.8.1, 2.9.0, 2.10.0, 2.10.1, 2.11.0, 2.12.0, 
> 2.13.0, 2.14.0
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>             Fix For: 2.15.0
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Journal scalability with a replicated pair has degraded due to:
> * a semantic change on journal sync that caused the Netty event loop on 
> the backup to wait for any journal operation to hit the disk - see 
> https://issues.apache.org/jira/browse/ARTEMIS-2837
> * a semantic change on NettyConnection::write from within the Netty event 
> loop, which now writes and flushes buffers immediately, whereas it 
> previously deferred the write by re-submitting it to the event loop  -  see: 
> https://issues.apache.org/jira/browse/ARTEMIS-2205 (in particular 
> https://github.com/apache/activemq-artemis/commit/a40a459f8c536a10a0dccae6e522ec38f09dd544#diff-3477fe0d8138d589ef33feeea2ecd28eL377-L392)
> The former issue has been solved by reverting the changes and reimplementing 
> them without introducing any semantic change.
> The latter needs some more explanation to be understood:
> # ReplicationEndpoint is responsible for handling packets from the live broker
> # Netty delivers incoming packets to ReplicationEndpoint in batches
> # after each processed packet coming from the live (which typically ends up 
> appending something to the journal), a replication packet response needs to 
> be sent back from the backup to the live: in the original behavior (< 2.7.0) 
> the responses were not flushed to the connection until the end of a 
> processed batch of packets, causing the journal to append records in bursts 
> and amortizing the full cost of awaking the I/O thread responsible for 
> appending data to the journal. 
> To emulate the original "bursty" behavior, while making the batching more 
> explicit (and tunable too), the fix is:
> # using Netty's ChannelInboundHandler::channelReadComplete event to flush 
> each batch of packet responses, as before
> # [OPTIONAL] implementing a new append executor on the journal to further 
> reduce the cost of awaking the appending thread
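The flush-on-channelReadComplete idea described above can be modeled in a few lines without Netty itself: responses produced while processing a batch of packets are only buffered, and a single flush happens when the batch ends (the role Netty's ChannelInboundHandler::channelReadComplete plays). This is an illustrative stand-alone sketch, not Artemis code; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of per-batch response flushing, assuming one
// "read complete" callback per batch of incoming packets (as Netty
// delivers them). Not taken from the Artemis codebase.
class BatchingResponder {
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> flushes = new ArrayList<>();

    // Called once per replicated packet (Netty analogue: channelRead).
    // The response is queued rather than written immediately.
    void onPacket(String packet) {
        pending.add("ack:" + packet);
    }

    // Called once per batch (Netty analogue: channelReadComplete).
    // A single flush per batch amortizes the cost of waking the
    // I/O thread that writes responses back to the live broker.
    void onReadComplete() {
        if (!pending.isEmpty()) {
            flushes.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    int flushCount() {
        return flushes.size();
    }
}
```

Processing three packets followed by one onReadComplete() produces a single flush carrying all three responses, which is the "bursty" behavior the issue wants to restore.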



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
