On 7/22/2020 2:17 PM, Xing Zhengjun wrote:


On 7/15/2020 7:04 PM, Ritesh Harjani wrote:
Hello Xing,

On 4/7/20 1:30 PM, kernel test robot wrote:
Greetings,

FYI, we noticed a -60.5% regression of stress-ng.fiemap.ops_per_sec due to commit:


commit: d3b6f23f71670007817a5d59f3fbafab2b794e8c ("ext4: move ext4_fiemap to use iomap framework")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with the following parameters:

    nr_threads: 10%
    disk: 1HDD
    testtime: 1s
    class: os
    cpufreq_governor: performance
    ucode: 0x500002c
    fs: ext4
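
The usual lkp-tests flow to reproduce such a report is roughly the
following (a sketch; it assumes the job.yaml generated for this report,
see the lkp-tests README for details):

    git clone https://github.com/intel/lkp-tests.git
    cd lkp-tests
    bin/lkp install job.yaml    # install the job's dependencies
    bin/lkp run job.yaml        # run the stress-ng job described above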

I started looking into this issue, but in my unit testing I didn't
find any perf issue with the fiemap ioctl call. I haven't yet explored
how stress-ng takes its fiemap performance numbers; it could be doing
something differently. In my testing I just made sure to create a
file with a large number of extents and used the xfs_io -c "fiemap -v"
command to check how much time it takes to read all the entries on the
1st and subsequent iterations.
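
For an ops/sec style number with xfs_io, something like the below
sketch could be used (the file name 'bigfile' matches the setup steps
below; the iteration count is arbitrary, and stress-ng itself
presumably drives the FIEMAP ioctl directly from multiple workers):

    # time N back-to-back fiemap calls and report operations per second
    N=100
    start=$(date +%s.%N)
    for i in $(seq 1 $N); do
        xfs_io -c "fiemap" bigfile > /dev/null
    done
    end=$(date +%s.%N)
    echo "fiemap ops/sec: $(echo "$N / ($end - $start)" | bc -l)"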


The setup comprised a qemu machine on x86_64 running the latest mainline kernel.

1. Created a 10G file using fallocate (this allocates unwritten
extents for the file).

2. Then I punched a hole at every alternate block of the file. This step
took a long time, and after waiting sufficiently long, I had to cancel it.
for i in $(seq 1 2 xxxxx); do echo $i; fallocate -p -o $(($i*4096)) -l 4096 bigfile; done

3. Then issued the fiemap call via xfs_io and took the time measurement:
time xfs_io -c "fiemap -v" bigfile > /dev/null


Perf numbers on the latest default kernel build for the above command.

1st iteration
==============
real    0m31.684s
user    0m1.593s
sys     0m24.174s

2nd and subsequent iterations
============================
real    0m3.379s
user    0m1.300s
sys     0m2.080s


4. Then I reverted all the iomap_fiemap patches and re-tested. This
exercises the older ext4_fiemap implementation:


1st iteration
==============
real    0m31.591s
user    0m1.400s
sys     0m24.243s


2nd and subsequent iterations (had to cancel since it was taking more than 15m)
============================
^C^C
real    15m49.884s
user    0m0.032s
sys     15m49.722s

I guess the reason the 2nd iteration with the older implementation takes
so much time is that the previous implementation never cached extent
entries in the extent_status tree. Also, in the 1st iteration the page
cache may get filled with a lot of buffer_head entries, so maybe page
reclaim is taking more time.

With the latest implementation using iomap_fiemap(), the query for
extent blocks is done via ext4_map_blocks(), which by default also
caches the extent entries in the extent_status tree. Hence during the
2nd iteration we read the entries directly from the extent_status tree
and do not do any disk I/O.
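
One rough way to separate the two effects (a sketch; it assumes the
filesystem sits on /dev/sda1 and that the kernel exposes the extent
status shrinker stats in procfs):

    time xfs_io -c "fiemap -v" bigfile > /dev/null  # 1st run populates the caches
    cat /proc/fs/ext4/sda1/es_shrinker_info         # extent_status tree stats, if available
    sync; echo 3 > /proc/sys/vm/drop_caches         # drop page cache and slab caches (as root)
    time xfs_io -c "fiemap -v" bigfile > /dev/null  # re-time with cold caches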

-ritesh

I re-tested it on v5.9-rc1 and the regression still exists. Have you tried the stress-ng test cases?


Could you try the stress-ng (https://kernel.ubuntu.com/~cking/stress-ng/) test cases? The tarballs can be downloaded from https://kernel.ubuntu.com/~cking/tarballs/stress-ng/. For this case you can try the command "stress-ng --timeout 1 --times --verify --metrics-brief --sequential 9 --class os --minimize --exclude spawn,exec,swap".
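
You can also run just the fiemap stressor to isolate this particular
regression (a sketch; worker count and runtime here are arbitrary):

    # run 8 fiemap workers for 60 seconds and print a per-stressor summary
    stress-ng --fiemap 8 --timeout 60s --metrics-brief
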
I re-tested it on v5.8-rc6 and the regression still exists.

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/debug-setup/nr_threads/disk/testtime/fs/class/cpufreq_governor/ucode:
  lkp-csl-2sp5/stress-ng/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/test/10%/1HDD/1s/ext4/os/performance/0x5002f01

commit:
   b2c5764262edded1b1cfff5a6ca82c3d61bb4a4a
   d3b6f23f71670007817a5d59f3fbafab2b794e8c
   v5.8-rc6

b2c5764262edded1 d3b6f23f71670007817a5d59f3f                    v5.8-rc6
---------------- --------------------------- ---------------------------
          %stddev     %change         %stddev     %change         %stddev
              \          |                \          |                \
     20419 ±  3%      -4.9%      19423 ±  4%     +27.1%      25959            stress-ng.af-alg.ops
     19655 ±  3%      -5.7%      18537 ±  4%     +27.8%      25111            stress-ng.af-alg.ops_per_sec
     64.67 ±  5%     -17.0%      53.67 ± 38%     +22.2%      79.00 ±  9%      stress-ng.chdir.ops
     55.34 ±  3%     -13.3%      47.99 ± 38%     +26.4%      69.96 ± 10%      stress-ng.chdir.ops_per_sec
     64652 ±  7%     -14.1%      55545 ± 11%     -13.6%      55842 ±  6%      stress-ng.chown.ops
     64683 ±  7%     -14.1%      55565 ± 11%     -13.6%      55858 ±  6%      stress-ng.chown.ops_per_sec
      2805 ±  2%      +0.6%       2820 ±  2%    +130.0%       6452            stress-ng.clone.ops
      2802 ±  2%      +0.6%       2818 ±  2%    +129.9%       6443            stress-ng.clone.ops_per_sec
     34.67            +1.9%      35.33 ±  3%      -9.6%      31.33 ±  3%      stress-ng.copy-file.ops
     22297 ± 23%     +26.7%      28258 ±  2%     +38.1%      30783 ± 14%      stress-ng.dir.ops_per_sec
      4743 ±  7%     -12.6%       4147 ±  4%     -19.9%       3801 ± 14%      stress-ng.dirdeep.ops_per_sec
     24499 ±  8%      -9.3%      22223 ±  3%     -19.8%      19647 ±  2%      stress-ng.enosys.ops
     24494 ±  8%      -9.3%      22207 ±  3%     -19.8%      19637 ±  2%      stress-ng.enosys.ops_per_sec
     76618           -63.8%      27756 ±  5%     -55.5%      34092 ±  2%      stress-ng.fiemap.ops
     76042           -63.8%      27492 ±  5%     -55.4%      33899 ±  2%      stress-ng.fiemap.ops_per_sec
   3816791            -6.2%    3581880 ±  2%     -10.7%    3409584 ±  3%      stress-ng.file-ioctl.ops
   3817421            -6.2%    3582248 ±  2%     -10.7%    3409544 ±  3%      stress-ng.file-ioctl.ops_per_sec
     27332 ± 17%     +30.2%      35574           +37.2%      37506 ±  6%      stress-ng.filename.ops
     26222 ± 16%     +28.6%      33722           +35.6%      35567 ±  6%      stress-ng.filename.ops_per_sec
     23431 ±  2%      +0.5%      23559           +22.5%      28711 ±  4%      stress-ng.fork.ops
     23433 ±  2%      +0.5%      23558           +22.5%      28709 ±  4%      stress-ng.fork.ops_per_sec
     48074 ± 15%      -0.6%      47786 ±  9%     -42.5%      27623            stress-ng.get.ops
     48088 ± 15%      -0.6%      47793 ±  9%     -42.6%      27624            stress-ng.get.ops_per_sec
    111379            -1.4%     109815 ±  2%      -7.8%     102662 ±  4%      stress-ng.getdent.ops
    111398            -1.4%     109818 ±  2%      -7.9%     102651 ±  4%      stress-ng.getdent.ops_per_sec
    226.67 ±  2%      -1.0%     224.33           -15.1%     192.33 ±  6%      stress-ng.inotify.ops
    226.39 ±  2%      -1.3%     223.53           -15.3%     191.82 ±  6%      stress-ng.inotify.ops_per_sec
     18.00            +9.3%      19.67 ±  2%      +3.7%      18.67 ±  2%      stress-ng.iomix.ops
      4846 ± 10%      +2.9%       4984 ±  7%     -36.8%       3062 ±  7%      stress-ng.kill.ops
      4846 ± 10%      +2.8%       4982 ±  7%     -36.9%       3058 ±  7%      stress-ng.kill.ops_per_sec
    851324 ±  3%      -9.9%     766988 ±  2%      +2.3%     870703            stress-ng.lease.ops
    850238 ±  3%      -9.9%     766255 ±  2%      +2.3%     870047            stress-ng.lease.ops_per_sec
    134492 ± 14%      -7.0%     125085 ± 16%     -19.9%     107736 ±  3%      stress-ng.mlock.ops_per_sec
      1.38            +2.9%       1.42 ±  5%     -20.2%       1.10 ± 12%      stress-ng.mmapfork.ops_per_sec
   1117721 ±  5%      -9.2%    1015321 ± 20%     -36.2%     713598            stress-ng.mq.ops
   1117186 ±  5%      -9.2%    1014593 ± 20%     -36.2%     712387            stress-ng.mq.ops_per_sec
   4584592 ±  2%      -2.7%    4460470 ±  7%     -12.0%    4036627 ±  8%      stress-ng.msg.ops_per_sec
      8179 ± 10%      +7.3%       8772 ±  9%     +24.0%      10144            stress-ng.opcode.ops
      8177 ± 10%      +7.2%       8770 ±  9%     +24.0%      10141            stress-ng.opcode.ops_per_sec
      5218 ±  3%      +3.6%       5408 ±  2%     +27.3%       6642            stress-ng.reboot.ops
      5217 ±  3%      +3.6%       5407 ±  2%     +27.3%       6640            stress-ng.reboot.ops_per_sec
      1724 ± 21%     +27.9%       2205 ± 18%     +54.3%       2660 ± 19%      stress-ng.resources.ops
   1217728           -31.4%     835808 ± 14%     -18.7%     989423 ± 10%      stress-ng.revio.ops
   1217899           -31.4%     835846 ± 14%     -18.8%     989313 ± 10%      stress-ng.revio.ops_per_sec
    752664 ±  9%      -7.6%     695116 ± 10%      -0.6%     748117 ±  6%      stress-ng.seal.ops
    752957 ±  9%      -7.7%     695278 ± 10%      -0.6%     748068 ±  6%      stress-ng.seal.ops_per_sec
    628.00 ± 83%     -20.1%     502.00 ±  8%    +176.2%       1734            stress-ng.seccomp.ops
    628.22 ± 83%     -20.1%     502.08 ±  8%    +176.1%       1734            stress-ng.seccomp.ops_per_sec
   1361741 ±  7%      +4.0%    1416628 ±  8%     +17.2%    1596549 ±  9%      stress-ng.seek.ops
   1361799 ±  7%      +4.0%    1416578 ±  8%     +17.2%    1596376 ±  9%      stress-ng.seek.ops_per_sec
   1537496            +1.2%    1556468            -9.9%    1385679            stress-ng.sigrt.ops
   1532297            +1.3%    1551993            -9.9%    1381175            stress-ng.sigrt.ops_per_sec
   1324589 ± 12%     -26.6%     972626 ±  9%     -14.8%    1128923 ±  4%      stress-ng.sigsuspend.ops
   1323957 ± 12%     -26.6%     972084 ±  9%     -14.8%    1128174 ±  4%      stress-ng.sigsuspend.ops_per_sec
    806.67 ± 13%     +11.2%     896.67 ± 10%     +31.3%       1059 ±  4%      stress-ng.sockabuse.ops
    806.73 ± 13%     +11.1%     896.44 ± 10%     +31.3%       1059 ±  4%      stress-ng.sockabuse.ops_per_sec
   6224635 ±  2%      +0.8%    6274521 ±  2%     -12.8%    5427883 ±  2%      stress-ng.sockpair.ops
   5383397            -1.9%    5279526 ±  2%     -14.9%    4579084 ±  4%      stress-ng.sockpair.ops_per_sec
     22409 ±  6%      -2.8%      21784 ±  5%     -41.2%      13172 ±  9%      stress-ng.sysbadaddr.ops
     22414 ±  6%      -2.8%      21784 ±  5%     -41.3%      13167 ±  9%      stress-ng.sysbadaddr.ops_per_sec
    525.33 ± 13%      +2.6%     539.00 ± 15%     -42.1%     304.33 ± 25%      stress-ng.sysfs.ops
    524.98 ± 13%      +2.6%     538.58 ± 15%     -42.1%     303.79 ± 25%      stress-ng.sysfs.ops_per_sec
      3077 ± 12%     +19.9%       3689 ±  4%     +23.0%       3786 ±  4%      stress-ng.sysinfo.ops
      3009 ± 11%     +22.6%       3688 ±  4%     +25.7%       3783 ±  4%      stress-ng.sysinfo.ops_per_sec
   3001932            +1.2%    3038017           +15.7%    3474274            stress-ng.udp.ops
   3002618            +1.2%    3037939           +15.7%    3473606            stress-ng.udp.ops_per_sec
    184634 ± 12%      -6.9%     171823           +24.5%     229788 ±  4%      stress-ng.utime.ops
    184678 ± 12%      -7.0%     171836           +24.4%     229784 ±  4%      stress-ng.utime.ops_per_sec
      1835 ±  5%      +2.2%       1875 ±  4%      +8.2%       1986 ±  3%      stress-ng.vforkmany.ops
      1829 ±  5%      +1.9%       1863 ±  4%      +8.1%       1978 ±  3%      stress-ng.vforkmany.ops_per_sec
     14829            -1.0%      14682 ±  2%     -20.1%      11848 ±  3%      stress-ng.vm-segv.ops
     14830            -1.0%      14682 ±  2%     -29.7%      10433 ± 13%      stress-ng.vm-segv.ops_per_sec
   3527835 ±  3%      +1.9%    3596613 ±  5%    +495.6%   21010377            stress-ng.vm-splice.ops
   3528845 ±  3%      +1.9%    3597219 ±  5%    +495.4%   21011045            stress-ng.vm-splice.ops_per_sec

--
Zhengjun Xing
