Hi Hiroki,

On Thu, May 12, 2022 at 8:57 AM 成川 弘樹 <hnaru...@yahoo-corp.jp> wrote:
>
> Thank you for your fix.
>
> I confirmed that after applying this patch, my intended performance
> improvement by 4c41c69e is still kept in our environment.
Is that equivalent to a formal Tested-by: Hiroki Narukawa
<hnaru...@yahoo-corp.jp> tag?

> On 2022/05/11 0:10, Kevin Wolf wrote:
> > Commit 4c41c69e changed the way the coroutine pool is sized because for
> > virtio-blk devices with a large queue size and heavy I/O, it was just
> > too small and caused coroutines to be deleted and reallocated soon
> > afterwards. The change made the size dynamic based on the number of
> > queues and the queue size of virtio-blk devices.
> >
> > There are two important numbers here: Slightly simplified, when a
> > coroutine terminates, it is generally stored in the global release pool
> > up to a certain pool size, and if the pool is full, it is freed.
> > Conversely, when allocating a new coroutine, the coroutines in the
> > release pool are reused if the pool already has reached a certain
> > minimum size (the batch size), otherwise we allocate new coroutines.
> >
> > The problem after commit 4c41c69e is that it not only increases the
> > maximum pool size (which is the intended effect), but also the batch
> > size for reusing coroutines (which is a bug). It means that in cases
> > with many devices and/or a large queue size (which defaults to the
> > number of vcpus for virtio-blk-pci), many thousand coroutines could be
> > sitting in the release pool without being reused.
> >
> > This is not only a waste of memory and allocations, but it actually
> > makes the QEMU process likely to hit the vm.max_map_count limit on Linux
> > because each coroutine requires two mappings (its stack and the guard
> > page for the stack), causing it to abort() in qemu_alloc_stack() because
> > when the limit is hit, mprotect() starts to fail with ENOMEM.
> >
> > In order to fix the problem, change the batch size back to 64 to avoid
> > uselessly accumulating coroutines in the release pool, but keep the
> > dynamic maximum pool size so that coroutines aren't freed too early
> > in heavy I/O scenarios.
> >
> > Note that this fix doesn't strictly make it impossible to hit the limit,
> > but this would only happen if most of the coroutines are actually in use
> > at the same time, not just sitting in a pool. This is the same behaviour
> > as we already had before commit 4c41c69e. Fully preventing this would
> > require allowing qemu_coroutine_create() to return an error, but it
> > doesn't seem to be a scenario that people hit in practice.
> >
> > Cc: qemu-sta...@nongnu.org
> > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2079938
> > Fixes: 4c41c69e05fe28c0f95f8abd2ebf407e95a4f04b
> > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > ---