Re: [LKP] [lkp-robot] [fs/locks] 52306e882f: stress-ng.lockofd.ops_per_sec -11% regression

2017-12-05  Lu, Aaron
On Tue, 2017-12-05 at 06:01 -0500, Jeff Layton wrote:
> On Tue, 2017-12-05 at 13:57 +0800, Aaron Lu wrote:
> > On Wed, Nov 08, 2017 at 03:22:33PM +0800, Aaron Lu wrote:
> > > On Thu, Sep 28, 2017 at 04:02:23PM +0800, kernel test robot wrote:
> > > > 
> > > > Greetings,
> > > > 
> > > > FYI, we noticed a -11% regression of stress-ng.lockofd.ops_per_sec due 
> > > > to commit:
> > > > 
> > > > 
> > > > commit: 52306e882f77d3fd73f91435c41373d634acc5d2 ("fs/locks: Use 
> > > > allocation rather than the stack in fcntl_getlk()")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > It's been a while; I wonder what you think of this regression.
> > > 
> > > The test stresses byte-range locks AFAICS, and since the commit uses
> > > dynamic allocation instead of the stack for the 'struct file_lock',
> > > it seems natural that performance regressed for this test.
> > > 
> > > Now the question is: do we care about the performance regression here?
> > 
> > I'd appreciate it if you could share your opinion on this, thanks.
> > 
> > Regards,
> > Aaron
> >  
> 
> Sorry I missed your earlier mail about this. My feeling is to not worry

No worries :)

> about it. struct file_lock is rather large, so putting it on the stack
> was always a bit dangerous, and F_GETLK is a rather uncommon operation
> anyway.
> 
> That said, if there are real-world workloads that have regressed because
> of this patch, I'm definitely open to backing it out.
> 
> Does anyone else have opinions on the matter?

Your comments make sense to me, thanks for the reply.

Re: [LKP] [lkp-robot] [fs/locks] 52306e882f: stress-ng.lockofd.ops_per_sec -11% regression

2017-12-05  Jeff Layton
On Tue, 2017-12-05 at 13:57 +0800, Aaron Lu wrote:
> On Wed, Nov 08, 2017 at 03:22:33PM +0800, Aaron Lu wrote:
> > On Thu, Sep 28, 2017 at 04:02:23PM +0800, kernel test robot wrote:
> > > 
> > > Greetings,
> > > 
> > > FYI, we noticed a -11% regression of stress-ng.lockofd.ops_per_sec due to 
> > > commit:
> > > 
> > > 
> > > commit: 52306e882f77d3fd73f91435c41373d634acc5d2 ("fs/locks: Use 
> > > allocation rather than the stack in fcntl_getlk()")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > It's been a while; I wonder what you think of this regression.
> > 
> > The test stresses byte-range locks AFAICS, and since the commit uses
> > dynamic allocation instead of the stack for the 'struct file_lock',
> > it seems natural that performance regressed for this test.
> > 
> > Now the question is: do we care about the performance regression here?
> 
> I'd appreciate it if you could share your opinion on this, thanks.
> 
> Regards,
> Aaron
>  

Sorry I missed your earlier mail about this. My feeling is to not worry
about it. struct file_lock is rather large, so putting it on the stack
was always a bit dangerous, and F_GETLK is a rather uncommon operation
anyway.
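
For reference, the shape of the change was roughly the following (a sketch,
not the verbatim diff; locks_alloc_lock() and locks_free_lock() are the
existing fs/locks.c helpers the commit switched to):

    int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
    {
            struct file_lock *fl;   /* was: struct file_lock fl; on the stack */
            int error;

            fl = locks_alloc_lock();        /* per-call heap allocation */
            if (fl == NULL)
                    return -ENOMEM;

            /* ... existing conflict lookup and copy-out logic, unchanged ... */
            error = 0;      /* stand-in for the unchanged body's result */

            locks_free_lock(fl);    /* paired free on the way out */
            return error;
    }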

That said, if there are real-world workloads that have regressed because
of this patch, I'm definitely open to backing it out.

Does anyone else have opinions on the matter?

 
> > Feel free to let me know if you need any other data.
> > 
> > Thanks for your time.
> > 
> > > in testcase: stress-ng
> > > on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz 
> > > with 128G memory
> > > with the following parameters:
> > > 
> > >   testtime: 1s
> > >   class: filesystem
> > >   cpufreq_governor: performance
> > > 
> > > Details are as below:
> > > 
> > > To reproduce:
> > > 
> > > git clone https://github.com/intel/lkp-tests.git
> > > cd lkp-tests
> > > bin/lkp install job.yaml  # job file is attached in this email
> > > bin/lkp run job.yaml
> > > 
> > > testcase/path_params/tbox_group/run: stress-ng/1s-filesystem-performance/lkp-bdw-ep6
> > > 
> > >          v4.13-rc1    change  52306e882f77d3fd73f91435c4
> > >          (%stddev)            (%stddev)
> > >   ----------------  --------  ----------------
> > >          1.219e+08      -11%          1.09e+08  stress-ng.lockofd.ops_per_sec
> > >          1.229e+08      -10%         1.103e+08  stress-ng.locka.ops_per_sec
> > >          1.233e+08      -10%         1.105e+08  stress-ng.locka.ops
> > >          1.223e+08      -11%         1.093e+08  stress-ng.lockofd.ops
> > >            1061237       10%           1168476  stress-ng.eventfd.ops
> > >            1061205       10%           1168414  stress-ng.eventfd.ops_per_sec
> > >            2913174        9%           3163165  stress-ng.time.voluntary_context_switches
> > >              89.90       -4%             86.58  stress-ng.time.user_time
> > >              26510       -6%             24822  stress-ng.io.ops
> > >              26489       -6%             24798  stress-ng.io.ops_per_sec
> > >       885499 ± 14%       18%           1042236  perf-stat.cpu-migrations
> > >          2.537e+08       10%         2.783e+08  perf-stat.node-store-misses
> > >      1067830 ±  4%        8%     1154877 ±  3%  perf-stat.page-faults
> > >      5384755 ±  4%        7%           5747689  perf-stat.context-switches
> > >              32.28        7%       34.42 ±  3%  perf-stat.node-store-miss-rate%
> > >        12245 ±110%    -7e+03        5367 ± 29%  latency_stats.avg.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
> > >       311261 ±173%    -3e+05       11702 ±100%  latency_stats.avg.tty_release_struct.tty_release.__fput.fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
> > >         1472 ± 60%     4e+03        5144 ± 97%  latency_stats.max.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
> > >          225 ± 39%     3e+03        3698 ±132%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_do_close.[nfsv4].__nfs4_close.[nfsv4].nfs4_close_sync.[nfsv4].nfs4_close_context.[nfsv4].__put_nfs_open_context.nfs_file_clear_open_context.nfs_file_release.__fput.fput.task_work_run
> > >          228 ± 34%     3e+03        3103 ±159%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_run_open_task.[nfsv4].nfs4_do_open.[nfsv4].nfs4_atomic_open.[nfsv4].nfs4_file_open.[nfsv4].do_dentry_open.vfs_open.path_openat.do_filp_open.do_sys_open.SyS_open
> > >          270 ± 24%     3e+03        3110 ±162%

Re: [LKP] [lkp-robot] [fs/locks] 52306e882f: stress-ng.lockofd.ops_per_sec -11% regression

2017-12-04  Aaron Lu
On Wed, Nov 08, 2017 at 03:22:33PM +0800, Aaron Lu wrote:
> On Thu, Sep 28, 2017 at 04:02:23PM +0800, kernel test robot wrote:
> > 
> > Greetings,
> > 
> > FYI, we noticed a -11% regression of stress-ng.lockofd.ops_per_sec due to 
> > commit:
> > 
> > 
> > commit: 52306e882f77d3fd73f91435c41373d634acc5d2 ("fs/locks: Use allocation 
> > rather than the stack in fcntl_getlk()")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> It's been a while; I wonder what you think of this regression.
> 
> The test stresses byte-range locks AFAICS, and since the commit uses
> dynamic allocation instead of the stack for the 'struct file_lock',
> it seems natural that performance regressed for this test.
> 
> Now the question is: do we care about the performance regression here?

I'd appreciate it if you could share your opinion on this, thanks.

Regards,
Aaron
 
> Feel free to let me know if you need any other data.
> 
> Thanks for your time.
> 
> > in testcase: stress-ng
> > on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 
> > 128G memory
> > with the following parameters:
> > 
> > testtime: 1s
> > class: filesystem
> > cpufreq_governor: performance
> > 
> > Details are as below:
> > 
> > To reproduce:
> > 
> > git clone https://github.com/intel/lkp-tests.git
> > cd lkp-tests
> > bin/lkp install job.yaml  # job file is attached in this email
> > bin/lkp run job.yaml
> > 
> > testcase/path_params/tbox_group/run: stress-ng/1s-filesystem-performance/lkp-bdw-ep6
> > 
> >          v4.13-rc1    change  52306e882f77d3fd73f91435c4
> >          (%stddev)            (%stddev)
> >   ----------------  --------  ----------------
> >          1.219e+08      -11%          1.09e+08  stress-ng.lockofd.ops_per_sec
> >          1.229e+08      -10%         1.103e+08  stress-ng.locka.ops_per_sec
> >          1.233e+08      -10%         1.105e+08  stress-ng.locka.ops
> >          1.223e+08      -11%         1.093e+08  stress-ng.lockofd.ops
> >            1061237       10%           1168476  stress-ng.eventfd.ops
> >            1061205       10%           1168414  stress-ng.eventfd.ops_per_sec
> >            2913174        9%           3163165  stress-ng.time.voluntary_context_switches
> >              89.90       -4%             86.58  stress-ng.time.user_time
> >              26510       -6%             24822  stress-ng.io.ops
> >              26489       -6%             24798  stress-ng.io.ops_per_sec
> >       885499 ± 14%       18%           1042236  perf-stat.cpu-migrations
> >          2.537e+08       10%         2.783e+08  perf-stat.node-store-misses
> >      1067830 ±  4%        8%     1154877 ±  3%  perf-stat.page-faults
> >      5384755 ±  4%        7%           5747689  perf-stat.context-switches
> >              32.28        7%       34.42 ±  3%  perf-stat.node-store-miss-rate%
> >        12245 ±110%    -7e+03        5367 ± 29%  latency_stats.avg.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
> >       311261 ±173%    -3e+05       11702 ±100%  latency_stats.avg.tty_release_struct.tty_release.__fput.fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
> >         1472 ± 60%     4e+03        5144 ± 97%  latency_stats.max.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
> >          225 ± 39%     3e+03        3698 ±132%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_do_close.[nfsv4].__nfs4_close.[nfsv4].nfs4_close_sync.[nfsv4].nfs4_close_context.[nfsv4].__put_nfs_open_context.nfs_file_clear_open_context.nfs_file_release.__fput.fput.task_work_run
> >          228 ± 34%     3e+03        3103 ±159%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_run_open_task.[nfsv4].nfs4_do_open.[nfsv4].nfs4_atomic_open.[nfsv4].nfs4_file_open.[nfsv4].do_dentry_open.vfs_open.path_openat.do_filp_open.do_sys_open.SyS_open
> >          270 ± 24%     3e+03        3110 ±162%  latency_stats.max.io_schedule.wait_on_page_bit_common.__filemap_fdatawait_range.filemap_write_and_wait_range.nfs_file_fsync.vfs_fsync_range.vfs_fsync.nfs4_file_flush.[nfsv4].filp_close.do_dup2.SyS_dup2.entry_SYSCALL_64_fastpath
> >        12245 ±110%    -7e+03        5367 ± 29%  latency_stats.max.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
> >       927506 ±173%    -9e+05       11702 ±100%  latency_stats.max.tty_release_struct.tty_release.__fput.fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
> >         7892 ± 54%     3e+04       33793 ±131%

Re: [LKP] [lkp-robot] [fs/locks] 52306e882f: stress-ng.lockofd.ops_per_sec -11% regression

2017-11-07  Aaron Lu
On Thu, Sep 28, 2017 at 04:02:23PM +0800, kernel test robot wrote:
> 
> Greetings,
> 
> FYI, we noticed a -11% regression of stress-ng.lockofd.ops_per_sec due to 
> commit:
> 
> 
> commit: 52306e882f77d3fd73f91435c41373d634acc5d2 ("fs/locks: Use allocation 
> rather than the stack in fcntl_getlk()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

It's been a while; I wonder what you think of this regression.

The test stresses byte-range locks AFAICS, and since the commit uses
dynamic allocation instead of the stack for the 'struct file_lock',
it seems natural that performance regressed for this test.
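
To make that concrete, here is a minimal userspace loop in the spirit of
the lockofd stressor (a simplified illustration, not stress-ng's actual
code; the file path and iteration count are arbitrary). Each F_OFD_GETLK
call goes through fcntl_getlk() in the kernel, which after the commit
allocates and frees a struct file_lock per call:

    /* build with: gcc -O2 -o ofd-getlk-loop ofd-getlk-loop.c */
    #define _GNU_SOURCE                     /* for F_OFD_GETLK */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            struct flock fl;
            int fd = open("/tmp/ofd-getlk-test", O_CREAT | O_RDWR, 0600);

            if (fd < 0) {
                    perror("open");
                    return 1;
            }
            for (long i = 0; i < 10000000; i++) {
                    fl.l_type = F_WRLCK;    /* query: would a write lock conflict? */
                    fl.l_whence = SEEK_SET;
                    fl.l_start = 0;
                    fl.l_len = 1;
                    fl.l_pid = 0;           /* must be 0 for OFD lock commands */
                    if (fcntl(fd, F_OFD_GETLK, &fl) < 0) {
                            perror("fcntl(F_OFD_GETLK)");
                            break;
                    }
            }
            close(fd);
            unlink("/tmp/ofd-getlk-test");
            return 0;
    }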

Now the question is: do we care about the performance regression here?
Feel free to let me know if you need any other data.

Thanks for your time.

Regards,
Aaron

> in testcase: stress-ng
> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 
> 128G memory
> with the following parameters:
> 
>   testtime: 1s
>   class: filesystem
>   cpufreq_governor: performance
> 
> Details are as below:
> 
> To reproduce:
> 
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp install job.yaml  # job file is attached in this email
> bin/lkp run job.yaml
> 
> testcase/path_params/tbox_group/run: stress-ng/1s-filesystem-performance/lkp-bdw-ep6
> 
>          v4.13-rc1    change  52306e882f77d3fd73f91435c4
>          (%stddev)            (%stddev)
>   ----------------  --------  ----------------
>          1.219e+08      -11%          1.09e+08  stress-ng.lockofd.ops_per_sec
>          1.229e+08      -10%         1.103e+08  stress-ng.locka.ops_per_sec
>          1.233e+08      -10%         1.105e+08  stress-ng.locka.ops
>          1.223e+08      -11%         1.093e+08  stress-ng.lockofd.ops
>            1061237       10%           1168476  stress-ng.eventfd.ops
>            1061205       10%           1168414  stress-ng.eventfd.ops_per_sec
>            2913174        9%           3163165  stress-ng.time.voluntary_context_switches
>              89.90       -4%             86.58  stress-ng.time.user_time
>              26510       -6%             24822  stress-ng.io.ops
>              26489       -6%             24798  stress-ng.io.ops_per_sec
>       885499 ± 14%       18%           1042236  perf-stat.cpu-migrations
>          2.537e+08       10%         2.783e+08  perf-stat.node-store-misses
>      1067830 ±  4%        8%     1154877 ±  3%  perf-stat.page-faults
>      5384755 ±  4%        7%           5747689  perf-stat.context-switches
>              32.28        7%       34.42 ±  3%  perf-stat.node-store-miss-rate%
>        12245 ±110%    -7e+03        5367 ± 29%  latency_stats.avg.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
>       311261 ±173%    -3e+05       11702 ±100%  latency_stats.avg.tty_release_struct.tty_release.__fput.fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>         1472 ± 60%     4e+03        5144 ± 97%  latency_stats.max.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
>          225 ± 39%     3e+03        3698 ±132%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_do_close.[nfsv4].__nfs4_close.[nfsv4].nfs4_close_sync.[nfsv4].nfs4_close_context.[nfsv4].__put_nfs_open_context.nfs_file_clear_open_context.nfs_file_release.__fput.fput.task_work_run
>          228 ± 34%     3e+03        3103 ±159%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_run_open_task.[nfsv4].nfs4_do_open.[nfsv4].nfs4_atomic_open.[nfsv4].nfs4_file_open.[nfsv4].do_dentry_open.vfs_open.path_openat.do_filp_open.do_sys_open.SyS_open
>          270 ± 24%     3e+03        3110 ±162%  latency_stats.max.io_schedule.wait_on_page_bit_common.__filemap_fdatawait_range.filemap_write_and_wait_range.nfs_file_fsync.vfs_fsync_range.vfs_fsync.nfs4_file_flush.[nfsv4].filp_close.do_dup2.SyS_dup2.entry_SYSCALL_64_fastpath
>        12245 ±110%    -7e+03        5367 ± 29%  latency_stats.max.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
>       927506 ±173%    -9e+05       11702 ±100%  latency_stats.max.tty_release_struct.tty_release.__fput.fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>         7892 ± 54%     3e+04       33793 ±131%  latency_stats.sum.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
>        12030 ±109%     2e+04       33536 ±136%  latency_stats.sum.autofs4_wait.autofs4_mount_wait.autofs4_d_manage.follow_managed.lookup_fast.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
>        14311 ± 15%     7e+03       21729 ±116%
