+1

--
Matteo Merli <matteo.me...@gmail.com>

On Mon, Mar 20, 2023 at 6:05 AM Hang Chen <chenh...@apache.org> wrote:
> This is the vote for BP-62.
>
> ### Motivation
> The bookie server processes add-entry requests through the following pipeline:
> - Get one request from the Netty socket channel
> - Choose one thread to process the write request
> - Write the entry into the target ledger entry logger's write cache (memory table)
> - Put the entry into the journal's pending queue
> - The journal thread takes the entry from the pending queue and writes it into the PageCache/journal disk
> - Write the callback response to the Netty buffer and flush it to the bookie client side
>
> For every add-entry request, the bookie server goes through the above steps one by one, which introduces a lot of thread context switches.
>
> We can batch the add requests per Netty socket channel and write a batch of entries into the ledger entry logger and journal disk.
>
> ### Modifications
> The PR changes the add-request processing pipeline into the following steps:
> - Get a batch of add-entry requests from the socket channel until the socket channel is empty or the batch reaches the max capacity (default is 1_000)
> - Choose one thread to process the batch of add-entry requests
> - Write the entries into the target ledger entry logger's write cache one by one
> - Put the batch of entries into the journal's pending queue
> - The journal thread drains a batch of entries from the pending queue and writes them into the PageCache/journal disk
> - Write the callback response to the Netty buffer and flush it to the bookie client side
>
> With this change, we can save a lot of thread context switches.
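
The batching steps in the Modifications list can be illustrated with a minimal sketch. This is not BookKeeper code: every name here (`drain_batch`, `process_batch`, `run_pipeline`, `MAX_BATCH_SIZE`) is hypothetical, a `deque` stands in for the Netty socket channel, and plain Python containers stand in for the write cache and the journal's pending queue. Only the "until empty or max capacity (default 1_000)" rule comes from the proposal.

```python
from collections import deque

# Default max batch capacity quoted in the proposal; the constant name is hypothetical.
MAX_BATCH_SIZE = 1_000


def drain_batch(channel, max_batch_size=MAX_BATCH_SIZE):
    """Pop requests until the channel is empty or the batch is full."""
    batch = []
    while channel and len(batch) < max_batch_size:
        batch.append(channel.popleft())
    return batch


def process_batch(batch, write_cache, journal_queue):
    """Handle one batch: per-entry cache writes, one journal enqueue."""
    # Entries still go into the write cache one by one.
    for entry in batch:
        write_cache.append(entry)
    # The whole batch is handed to the journal queue in a single step,
    # which is where the saved hand-offs would come from.
    if batch:
        journal_queue.append(batch)


def run_pipeline(requests, max_batch_size=MAX_BATCH_SIZE):
    """Drain a stand-in channel batch by batch and return the resulting state."""
    channel = deque(requests)  # stand-in for the Netty socket channel
    write_cache = []           # stand-in for the entry logger's write cache
    journal_queue = deque()    # stand-in for the journal's pending queue
    while channel:
        batch = drain_batch(channel, max_batch_size)
        process_batch(batch, write_cache, journal_queue)
    return write_cache, journal_queue
```

With 2,500 pending requests, this sketch hands the journal queue three batches (1,000, 1,000, and 500 entries) instead of 2,500 individual enqueues.
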
>
> ### Performance
> I started one bookie on my laptop and used the BookKeeper benchmark to test the performance:
> ```shell
> bin/benchmark writes -ensemble 1 -quorum 1 -ackQuorum 1 -ledgers 50 -throttle 20000
> ```
>
> **Before this change**
>
> | run | ops/sec | p95 latency | p99 latency |
> | --- | --- | --- | --- |
> | 1 | 147507 | 114.93 | 122.42 |
> | 2 | 154571 | 111.46 | 115.86 |
> | 3 | 141459 | 117.23 | 124.18 |
> | 4 | 142037 | 121.75 | 128.54 |
> | 5 | 143682 | 121.05 | 127.97 |
>
> **After this change**
>
> | run | ops/sec | p95 latency | p99 latency |
> | --- | --- | --- | --- |
> | 1 | 157328 | 118.30 | 121.79 |
> | 2 | 165774 | 112.86 | 115.69 |
> | 3 | 144790 | 128.94 | 133.24 |
> | 4 | 151984 | 121.88 | 125.32 |
> | 5 | 154574 | 121.57 | 124.57 |
>
> Averaged over the five runs, throughput rises from about 145,851 to 154,890 ops/sec, an improvement of roughly 6.2%.
>
> Please leave +1/-1 in this thread to join the vote, and feel free to raise any concerns.
>
> Thanks,
> Hang
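
For anyone who wants to check the throughput numbers, here is a small sketch that averages the ops/sec columns from the two tables above and computes the relative change. The variable names are illustrative only; the figures are copied from the tables.

```python
# ops/sec values copied from the "Before" and "After" tables in the proposal.
before_ops = [147507, 154571, 141459, 142037, 143682]
after_ops = [157328, 165774, 144790, 151984, 154574]

# Arithmetic mean of each set of five runs.
mean_before = sum(before_ops) / len(before_ops)
mean_after = sum(after_ops) / len(after_ops)

# Relative change in mean throughput, as a percentage of the baseline.
improvement_pct = (mean_after - mean_before) / mean_before * 100

print(f"mean before: {mean_before:.1f} ops/sec")
print(f"mean after:  {mean_after:.1f} ops/sec")
print(f"improvement: {improvement_pct:.1f}%")
```

On these figures the means come out to 145851.2 and 154890.0 ops/sec, a gain of about 6.2%. Five runs on a laptop is a small sample, so the run-to-run spread is worth keeping in mind.
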