On Tue, Mar 09, 2021 at 03:57:06PM +0800, kernel test robot wrote:
> FYI, we noticed a -7.6% regression of
> fxmark.hdd_ext4_no_jnl_DRBM_9_bufferedio.works/sec due to commit:
>
> commit: cbd59c48ae2bcadc4a7599c29cf32fd3f9b78251 ("mm/filemap: use head pages
> in generic_file_buffered_read")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: fxmark
> on test machine: 288 threads Intel(R) Xeon Phi(TM) CPU 7295 @ 1.50GHz with
> 80G memory

Can you send me one of those to test on? ;-)

>          %stddev     %change         %stddev
>              \          |                \
>       0.05 ±  5%     -10.1%       0.05 ±  3%
> fxmark.hdd_ext4_no_jnl_DRBM_18_bufferedio.softirq_util
>    4168491            -7.6%    3849925
> fxmark.hdd_ext4_no_jnl_DRBM_9_bufferedio.works/sec
>     300.00            +2.1%     306.16        fxmark.time.system_time
>      87.53            -6.7%      81.69        fxmark.time.user_time
>     784.83 ±  5%     +23.6%     970.33 ±  7%
> perf-sched.wait_and_delay.count.preempt_schedule_common.__cond_resched.copy_page_to_iter.generic_file_buffered_read.new_sync_read

23% more delay while preempted copying to user?  That seems bad, but I
don't see anything in this commit that would cause that.

>       7.59            -7.6        0.00
> perf-profile.calltrace.cycles-pp.find_get_pages_contig.filemap_get_pages.generic_file_buffered_read.new_sync_read.vfs_read

That makes sense; we don't call find_get_pages_contig() any more, instead
we call ...

>       0.00           +11.9       11.90
> perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.generic_file_buffered_read.new_sync_read.vfs_read

filemap_get_read_batch() ... which is more expensive ;-(

 		if (PageReadahead(head))
 			break;
+		if (!PageHead(head))
+			continue;
 		xas.xa_index = head->index + thp_nr_pages(head) - 1;
 		xas.xa_offset = (xas.xa_index >> xas.xa_shift) & XA_CHUNK_MASK;

might be worth a try, but I have a medical appointment to get to.
I'll test it out later.
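
For anyone following along, here is a rough userspace model of what the two added lines are meant to do, as I read them: for an ordinary (non-head) page the cursor update is a no-op in effect, so it can be skipped, and only a compound head needs the cursor moved to its last subpage. Everything here is illustrative and not kernel code: struct model_page, struct model_cursor and skip_to_last_subpage() are made-up names, and XA_CHUNK_SHIFT is assumed to be 6 (64 slots per node), which is the usual value outside CONFIG_BASE_SMALL builds.

```c
#include <stddef.h>

/* Assumption: 64 slots per XArray node, as on most configurations. */
#define XA_CHUNK_SHIFT 6
#define XA_CHUNK_SIZE  (1UL << XA_CHUNK_SHIFT)
#define XA_CHUNK_MASK  (XA_CHUNK_SIZE - 1)

/* Hypothetical stand-in for the few struct page fields used here. */
struct model_page {
	unsigned long index;     /* page cache index of this page */
	unsigned long nr_pages;  /* 1 for an order-0 page, e.g. 512 for a THP */
	int is_head;             /* models PageHead() */
};

/* Hypothetical stand-in for the xa_state fields touched by the patch. */
struct model_cursor {
	unsigned long xa_index;
	unsigned int xa_shift;
	unsigned int xa_offset;
};

/*
 * Models the tail of the loop body: return 0 when the page is not a
 * compound head (the "continue" case, cursor left alone), otherwise
 * move the cursor to the last subpage and return 1.
 */
int skip_to_last_subpage(struct model_cursor *xas,
			 const struct model_page *head)
{
	if (!head->is_head)
		return 0;

	xas->xa_index = head->index + head->nr_pages - 1;
	xas->xa_offset = (xas->xa_index >> xas->xa_shift) & XA_CHUNK_MASK;
	return 1;
}
```

With a 512-page THP at index 512 and xa_shift 0, the cursor ends up at index 1023, slot 63 of its node. An order-0 page at index 7 leaves the cursor where it was.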