On Fri, Jun 26, 2020 at 02:23:03PM -0700, Tim Chen wrote:
> Enlarge the pagevec size to 31 to reduce LRU lock contention for
> large systems.
>
> The LRU lock contention is reduced from 8.9% of total CPU cycles
> to 2.2% of CPU cycles. And the pmbench throughput increases
> from 88.8 Mpages/sec to 95.1 Mpages/sec.

The downside here is that pagevecs are often stored on the stack (e.g. truncate_inode_pages_range()) as well as being used for the LRU list. On a 64-bit system, this increases the stack usage from 128 to 256 bytes per pagevec.

I wonder if we could do something where we transform the ones on the stack to DECLARE_STACK_PAGEVEC(pvec), and similarly DECLARE_LRU_PAGEVEC the ones used for the LRUs. There's plenty of space in the header to add an unsigned char sz, delete PAGEVEC_SIZE and make it a variable-length struct.

Or maybe our stacks are now big enough that we just don't care. What do you think?
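
For illustration only, the variable-length idea might look roughly like the userspace sketch below. It is not kernel code and is untested against any tree. The sz field, the two size constants, PAGEVEC_BYTES() and the generic DECLARE_PAGEVEC() helper are assumptions made up for this sketch; only the names DECLARE_STACK_PAGEVEC and DECLARE_LRU_PAGEVEC come from the suggestion above. The union is used just to reserve storage behind the flexible array member, which works with GCC and Clang in practice.

```c
#include <assert.h>
#include <stddef.h>

struct page;	/* opaque here; the pagevec only stores pointers */

/* Hypothetical variable-length pagevec: 'sz' takes the place of PAGEVEC_SIZE. */
struct pagevec {
	unsigned char nr;		/* slots currently in use */
	unsigned char sz;		/* capacity (assumed new field) */
	_Bool percpu_pvec_drained;
	struct page *pages[];		/* flexible array member */
};

/* Bytes needed for a pagevec with n page slots. */
#define PAGEVEC_BYTES(n) \
	(sizeof(struct pagevec) + (n) * sizeof(struct page *))

/* Assumed capacities: small on the stack, large for the LRU batches. */
#define STACK_PAGEVEC_SIZE	15
#define LRU_PAGEVEC_SIZE	31

/* Reserve storage for n slots and declare 'name' as a pointer to it. */
#define DECLARE_PAGEVEC(name, n)					\
	union {								\
		struct pagevec pvec;					\
		unsigned char bytes[PAGEVEC_BYTES(n)];			\
	} name##_storage = { .pvec = { .nr = 0, .sz = (n) } };		\
	struct pagevec *name = &name##_storage.pvec

#define DECLARE_STACK_PAGEVEC(name) DECLARE_PAGEVEC(name, STACK_PAGEVEC_SIZE)
#define DECLARE_LRU_PAGEVEC(name)   DECLARE_PAGEVEC(name, LRU_PAGEVEC_SIZE)

/* Add a page and return the number of free slots left. */
static inline unsigned int pagevec_add(struct pagevec *pvec, struct page *page)
{
	assert(pvec->nr < pvec->sz);	/* caller must not overfill */
	pvec->pages[pvec->nr++] = page;
	return pvec->sz - pvec->nr;
}
```

With 8-byte pointers, PAGEVEC_BYTES(15) is 8 + 15 * 8 = 128 and PAGEVEC_BYTES(31) is 8 + 31 * 8 = 256, which matches the stack figures above. Callers such as truncate_inode_pages_range() would keep the 128-byte footprint, and only the LRU users would pay for the larger batch. In the kernel the LRU pagevecs are per-CPU rather than automatic variables, so a real DECLARE_LRU_PAGEVEC would need a different storage declaration than this sketch uses.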