On Thu, 18 Feb 2021, Jens Axboe wrote:
> On 2/18/21 4:16 PM, Andrew Morton wrote:
> > On Thu, 18 Feb 2021 14:36:31 -0700 Jens Axboe <ax...@kernel.dk> wrote:
> >
> >> Currently we cap the batch count at max(32, 2*nr_online_cpus), which these
> >> days is kind of silly as systems have gotten much bigger than in 2009 when
> >> this heuristic was introduced.
> >>
> >> Bump it to capping it at 256 instead. This has a noticeable improvement
> >> for certain io_uring workloads, as io_uring tracks per-task inflight count
> >> using percpu counters.

I want to quibble with the word "capping" here, it's misleading -
but I'm sorry I cannot think of the right word.

The macro is max() not min(): you're making an improvement for
certain io_uring workloads on machines with 1 to 15 cpus, right?
Does "bigger than in 2009" apply to those?

Though, io_uring could as well use percpu_counter_add_batch() instead?

(Yeah, this has nothing to do with me really, but I was looking at
percpu_counter_compare() just now, for tmpfs reasons, so took more
interest. Not objecting to a change, but the wording leaves me
wondering if the patch does what you think - or, not for the first
time, I'm confused.)

Hugh

> >>
> >
> > It will also make percpu_counter_read() and
> > percpu_counter_read_positive() more inaccurate than at present. Any
> > effects from this will take a while to discover.
>
> It will, but the value of 32 is very low, especially when you are potentially
> doing millions of these per second. So I do think it should track the times
> a bit.
>
> > But yes, worth trying - I'll add it to the post-rc1 pile.
>
> Thanks!
>
> --
> Jens Axboe