RE: [LKP] [mm] 9bc8039e71: will-it-scale.per_thread_ops -64.1% regression

2018-12-27 Thread Wang, Kemi
Hi, Waiman
   Did you post that patch? Let's see if it helps.

-----Original Message-----
From: LKP [mailto:lkp-boun...@lists.01.org] On Behalf Of Waiman Long
Sent: Tuesday, November 6, 2018 6:40 AM
To: Linus Torvalds; vba...@suse.cz; Davidlohr Bueso
Cc: yang@linux.alibaba.com; Linux Kernel Mailing List; Matthew Wilcox;
mho...@kernel.org; Colin King; Andrew Morton; lduf...@linux.vnet.ibm.com;
l...@01.org; kirill.shute...@linux.intel.com
Subject: Re: [LKP] [mm] 9bc8039e71: will-it-scale.per_thread_ops -64.1% regression

On 11/05/2018 05:14 PM, Linus Torvalds wrote:
> On Mon, Nov 5, 2018 at 12:12 PM Vlastimil Babka  wrote:
>> I didn't spot an obvious mistake in the patch itself, so it looks
>> like some bad interaction between scheduler and the mmap downgrade?
> I'm thinking it's RWSEM_SPIN_ON_OWNER that ends up being confused by
> the downgrade.
>
> It looks like the benchmark used to be basically CPU-bound, at about
> 800% CPU, and now it's somewhere in the 200% CPU region:
>
>   [plot: will-it-scale.time.percent_of_cpu_this_job_got, y axis from 100
>   to 800; one series of samples stays close to 800, the other ('O') sits
>   at roughly 200 or just below]
>
> which sounds like the downgrade really messes with the "spin waiting
> for lock" logic.
>
> I'm thinking it's the "wake up waiter" logic that has some bad
> interaction with spinning, and breaks that whole optimization.
>
> Adding Waiman and Davidlohr to the participants, because they seem to
> be the obvious experts in this area.
>
> Linus

Optimistic spinning on rwsem is done only by writers spinning on a
writer-owned rwsem. If a write-lock is downgraded to a read-lock, all
the spinning waiters will quit. That may explain the drop in CPU
utilization. I do have an old patch that enables a certain amount of
reader spinning, which may help the situation. I can rebase that and send
it out for review if people are interested.

Cheers,
Longman
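
The spinning rule described in the mail above can be modeled in a few lines. This is only a toy model in Python, not kernel code: the names Owner, should_keep_spinning and downgrade_write are made up for illustration, and the model covers nothing beyond the single rule stated above (a waiting writer spins only while another writer owns the lock).

```python
from enum import Enum


class Owner(Enum):
    """Who currently holds the toy rwsem (hypothetical model, not a kernel type)."""
    NONE = "none"
    WRITER = "writer"
    READER = "reader"


def should_keep_spinning(owner: Owner) -> bool:
    # Rule from the mail: a waiting writer spins only while the lock is
    # owned by another writer.
    return owner is Owner.WRITER


def downgrade_write(owner: Owner) -> Owner:
    # Downgrading turns a write-held lock into a read-held one.
    if owner is not Owner.WRITER:
        raise ValueError("only a write-held lock can be downgraded")
    return Owner.READER


owner = Owner.WRITER
print(should_keep_spinning(owner))  # True: the waiter keeps spinning
owner = downgrade_write(owner)
print(should_keep_spinning(owner))  # False: the waiter stops spinning
```

In this model every spinner gives up as soon as the owner downgrades, which is consistent with the drop in CPU utilization discussed in the thread.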




RE: [LKP] [lkp-robot] [brd] 316ba5736c: aim7.jobs-per-min -11.2% regression

2018-07-26 Thread Wang, Kemi
Hi, SeongJae
Is there any update, or any info you need from my side?

-----Original Message-----
From: SeongJae Park [mailto:sj38.p...@gmail.com]
Sent: Wednesday, July 11, 2018 12:53 AM
To: Wang, Kemi
Cc: Ye, Xiaolong; ax...@kernel.dk; ax...@fb.com; l...@01.org;
linux-kernel@vger.kernel.org
Subject: Re: [LKP] [lkp-robot] [brd] 316ba5736c: aim7.jobs-per-min -11.2% regression

Oops, I only just found this mail.  I will look into this, though it
will take some time because I will be out of the office this week.


Thanks,
SeongJae Park
On Tue, Jul 10, 2018 at 1:30 AM kemi  wrote:
>
> Hi, SeongJae
>   Do you have any input on this regression? Thanks.
>
> On 2018-06-04 13:52, kernel test robot wrote:
> >
> > Greeting,
> >
> > FYI, we noticed a -11.2% regression of aim7.jobs-per-min due to commit:
> >
> >
> > commit: 316ba5736c9caa5dbcd84085989862d2df57431d ("brd: Mark as non-rotational")
> > https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-4.18/block
> >
> > in testcase: aim7
> > on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
> > with 384G memory with following parameters:
> >
> >   disk: 1BRD_48G
> >   fs: btrfs
> >   test: disk_rw
> >   load: 1500
> >   cpufreq_governor: performance
> >
> > test-description: AIM7 is a traditional UNIX system-level benchmark suite
> > which is used to test and measure the performance of a multiuser system.
> > test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
> >
> >
> >
> > Details are as below:
> > -->
> >
> > =========================================================================================
> > compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
> >   gcc-7/performance/1BRD_48G/btrfs/x86_64-rhel-7.2/1500/debian-x86_64-2016-08-31.cgz/lkp-ivb-ep01/disk_rw/aim7
> >
> > commit:
> >   522a777566 ("block: consolidate struct request timestamp fields")
> >   316ba5736c ("brd: Mark as non-rotational")
> >
> > 522a777566f56696 316ba5736c9caa5dbcd8408598
> > ---------------- --------------------------
> >        %stddev     %change         %stddev
> >            \          |                \
> >      28321          -11.2%      25147        aim7.jobs-per-min
> >     318.19          +12.6%     358.23        aim7.time.elapsed_time
> >     318.19          +12.6%     358.23        aim7.time.elapsed_time.max
> >    1437526 ±  2%    +14.6%    1646849 ±  2%  aim7.time.involuntary_context_switches
> >      11986          +14.2%      13691        aim7.time.system_time
> >      73.06 ±  2%     -3.6%      70.43        aim7.time.user_time
> >    2449470 ±  2%    -25.0%    1837521 ±  4%  aim7.time.voluntary_context_switches
> >      20.25 ± 58%  +1681.5%     360.75 ±109%  numa-meminfo.node1.Mlocked
> >     456062          -16.3%     381859        softirqs.SCHED
> >       9015 ±  7%    -21.3%       7098 ± 22%  meminfo.CmaFree
> >      47.50 ± 58%  +1355.8%     691.50 ± 92%  meminfo.Mlocked
> >       5.24 ±  3%      -1.2       3.99 ±  2%  mpstat.cpu.idle%
> >       0.61 ±  2%      -0.1       0.52 ±  2%  mpstat.cpu.usr%
> >      16627          +12.8%      18762 ±  4%  slabinfo.Acpi-State.active_objs
> >      16627          +12.9%      18775 ±  4%  slabinfo.Acpi-State.num_objs
> >      57.00 ±  2%    +17.5%      67.00        vmstat.procs.r
> >      20936          -24.8%      15752 ±  2%  vmstat.system.cs
> >      45474           -1.7%      44681        vmstat.system.in
> >       6.50 ± 59%  +1157.7%      81.75 ± 75%  numa-vmstat.node0.nr_mlock
> >     242870 ±  3%    +13.2%     274913 ±  7%  numa-vmstat.node0.nr_written
> >       2278 ±  7%    -22.6%       1763 ± 21%  numa-vmstat.node1.nr_free_cma
> >       4.75 ± 58%  +1789.5%      89.75 ±109%  numa-vmstat.node1.nr_mlock
> >   88018135 ±  3%    -48.9%   44980457 ±  7%  cpuidle.C1.time
> >    1398288 ±  3%    -51.1%     683493 ±  9%  cpuidle.C1.usage
> >    3499814 ±  2%    -38.5%    2153158 ±  5%  cpuidle.C1E.time
> >      52722 ±  4%    -45.6%      28692 ±  6%  cpuidle.C1E.usage
> >    9865857 ±  3%    -40.1%    5905155 ±  5%  cpuidle.C3.time
> >      69656 ±  2%    -42.6%      39990 ±  5%  cpuidle.C3.usage
> >     590856 ±  2%    -12.3%     517910        cpuidle.C6.usage
> >      46160 ±  7%    -53.7%      21372 ± 11%  cpuidle.POLL
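
For readers checking the report, the %change column can be reproduced from the two mean values on each row. The sketch below assumes that %change is simply the relative difference between the two means (the report itself does not spell out the formula); pct_change is a helper name invented here, and the three rows are copied from the table above.

```python
def pct_change(base: float, head: float) -> float:
    """Relative change of head versus base, in percent."""
    return (head - base) / base * 100.0


# (base mean, head mean) pairs copied from the report above.
rows = {
    "aim7.jobs-per-min": (28321, 25147),
    "aim7.time.elapsed_time": (318.19, 358.23),
    "vmstat.system.cs": (20936, 15752),
}

for metric, (base, head) in rows.items():
    print(f"{pct_change(base, head):+7.1f}%  {metric}")
```

Rounded to one decimal, these three rows come out as -11.2%, +12.6% and -24.8%, matching the table. The mpstat rows carry no percent sign and appear to be absolute differences instead.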

RE: [PATCH v11 00/26] Speculative page faults

2018-05-28 Thread Wang, Kemi
A full run would take one or two weeks, depending on our available resources.
Could you pick a few tests, e.g. those that showed a performance regression?

-----Original Message-----
From: owner-linux...@kvack.org [mailto:owner-linux...@kvack.org] On Behalf Of 
Laurent Dufour
Sent: Monday, May 28, 2018 4:55 PM
To: Song, HaiyanX <haiyanx.s...@intel.com>
Cc: a...@linux-foundation.org; mho...@kernel.org; pet...@infradead.org; 
kir...@shutemov.name; a...@linux.intel.com; d...@stgolabs.net; j...@suse.cz; 
Matthew Wilcox <wi...@infradead.org>; khand...@linux.vnet.ibm.com; 
aneesh.ku...@linux.vnet.ibm.com; b...@kernel.crashing.org; m...@ellerman.id.au; 
pau...@samba.org; Thomas Gleixner <t...@linutronix.de>; Ingo Molnar 
<mi...@redhat.com>; h...@zytor.com; Will Deacon <will.dea...@arm.com>; Sergey 
Senozhatsky <sergey.senozhat...@gmail.com>; sergey.senozhatsky.w...@gmail.com; 
Andrea Arcangeli <aarca...@redhat.com>; Alexei Starovoitov 
<alexei.starovoi...@gmail.com>; Wang, Kemi <kemi.w...@intel.com>; Daniel Jordan 
<daniel.m.jor...@oracle.com>; David Rientjes <rient...@google.com>; Jerome 
Glisse <jgli...@redhat.com>; Ganesh Mahendran <opensource.gan...@gmail.com>; 
Minchan Kim <minc...@kernel.org>; Punit Agrawal <punitagra...@gmail.com>; 
vinayak menon <vinayakm.l...@gmail.com>; Yang Shi <yang@linux.alibaba.com>; 
linux-kernel@vger.kernel.org; linux...@kvack.org; ha...@linux.vnet.ibm.com; 
npig...@gmail.com; bsinghar...@gmail.com; paul...@linux.vnet.ibm.com; Tim Chen 
<tim.c.c...@linux.intel.com>; linuxppc-...@lists.ozlabs.org; x...@kernel.org
Subject: Re: [PATCH v11 00/26] Speculative page faults

On 28/05/2018 10:22, Haiyan Song wrote:
> Hi Laurent,
> 
> Yes, these tests were done on the V9 patch series.

Do you plan to give this V11 a run ?

> 
> 
> Best regards,
> Haiyan Song
> 
> On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote:
>> On 28/05/2018 07:23, Song, HaiyanX wrote:
>>>
>>> Some regressions and improvements were found by LKP-tools (Linux kernel
>>> performance) on the V9 patch series, tested on an Intel 4s Skylake platform.
>>
>> Hi,
>>
>> Thanks for reporting these benchmark results, but you mentioned the
>> "V9 patch series" while responding to the v11 header series...
>> Were these tests done on v9 or v11?
>>
>> Cheers,
>> Laurent.
>>
>>>
>>> The regression results are sorted by the metric will-it-scale.per_thread_ops.
>>> Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 patch series)
>>> Commit id:
>>> base commit: d55f34411b1b126429a823d06c3124c16283231f
>>> head commit: 0355322b3577eeab7669066df42c550a56801110
>>> Benchmark suite: will-it-scale
>>> Download link:
>>> https://github.com/antonblanchard/will-it-scale/tree/master/tests
>>> Metrics:
>>> will-it-scale.per_process_ops=processes/nr_cpu
>>> will-it-scale.per_thread_ops=threads/nr_cpu
>>> test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
>>> THP: enable / disable
>>> nr_task: 100%
>>>
>>> 1. Regressions:
>>> a) THP enabled:
>>> testcase                          base   change      head  metric
>>> page_fault3/ enable THP          10092   -17.5%      8323  will-it-scale.per_thread_ops
>>> page_fault2/ enable THP           8300   -17.2%      6869  will-it-scale.per_thread_ops
>>> brk1/ enable THP                957.67    -7.6%       885  will-it-scale.per_thread_ops
>>> page_fault3/ enable THP         172821    -5.3%    163692  will-it-scale.per_process_ops
>>> signal1/ enable THP               9125    -3.2%      8834  will-it-scale.per_process_ops
>>>
>>> b) THP disabled:
>>> testcase                          base   change      head  metric
>>> page_fault3/ disable THP         10107   -19.1%      8180  will-it-scale.per_thread_ops
>>> page_fault2/ disable THP          8432   -17.8%      6931  will-it-scale.per_thread_ops
>>> context_switch1/ disable THP    215389    -6.8%    200776  will-it-scale.per_thread_ops
>>> brk1/ disable THP               939.67    -6.6%    877.33  will-it-scale.per_thread_ops
>>> page_fault3/ disable THP        173145    -4.7%    165064  will-it-scale.per_process_ops
>>> signal1/ disable THP              9162    -3.9%      8802  will-it-scale.per_process_ops
>>>
>>> 2. Improvements:
>>> a) THP enabled:
>>> testcase                          base   change      head  metric
>>> malloc1/ enable THP              66.33  +469.8%    383.67  will-it-scale.per_thread_ops
>>> writeseek3/ enable THP            2531    +4.5%      2646  will-it-scale.per_thread_ops
>>> si
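
The metric definitions quoted above (per_thread_ops = threads/nr_cpu, per_process_ops = processes/nr_cpu) can be read as a machine-wide operation count divided by the CPU count. That reading is an assumption, since the report does not say what "threads" and "processes" stand for; per_task_ops and implied_total are names invented for this sketch, and nr_cpu=192 is taken from the test box line.

```python
NR_CPU = 192  # lkp-skl-4sp1, from the "test box" line of the report


def per_task_ops(total_ops: float, nr_cpu: int = NR_CPU) -> float:
    """Per-CPU figure as the report defines it: total operations / nr_cpu."""
    if nr_cpu <= 0:
        raise ValueError("nr_cpu must be positive")
    return total_ops / nr_cpu


def implied_total(per_task: float, nr_cpu: int = NR_CPU) -> float:
    """Machine-wide total implied by a reported per-task figure."""
    return per_task * nr_cpu


# page_fault3 with THP enabled, base kernel: 10092 per_thread_ops.
print(implied_total(10092))
print(per_task_ops(implied_total(10092)))
```

Under this reading, the base figure of 10092 for page_fault3 corresponds to about 1.94 million operations across the whole machine.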

RE: [PATCH 1/2] mm: NUMA stats code cleanup and enhancement

2017-11-30 Thread Wang, Kemi
Of course, we should do that AFAP. Thanks for your comments :)

-----Original Message-----
From: owner-linux...@kvack.org [mailto:owner-linux...@kvack.org] On Behalf Of 
Michal Hocko
Sent: Thursday, November 30, 2017 5:45 PM
To: Wang, Kemi <kemi.w...@intel.com>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>; Andrew Morton 
<a...@linux-foundation.org>; Vlastimil Babka <vba...@suse.cz>; Mel Gorman 
<mgor...@techsingularity.net>; Johannes Weiner <han...@cmpxchg.org>; 
Christopher Lameter <c...@linux.com>; YASUAKI ISHIMATSU 
<yasu.isim...@gmail.com>; Andrey Ryabinin <aryabi...@virtuozzo.com>; Nikolay 
Borisov <nbori...@suse.com>; Pavel Tatashin <pasha.tatas...@oracle.com>; David 
Rientjes <rient...@google.com>; Sebastian Andrzej Siewior 
<bige...@linutronix.de>; Dave <dave.han...@linux.intel.com>; Kleen, Andi 
<andi.kl...@intel.com>; Chen, Tim C <tim.c.c...@intel.com>; Jesper Dangaard 
Brouer <bro...@redhat.com>; Huang, Ying <ying.hu...@intel.com>; Lu, Aaron 
<aaron...@intel.com>; Li, Aubrey <aubrey...@intel.com>; Linux MM 
<linux...@kvack.org>; Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm: NUMA stats code cleanup and enhancement

On Thu 30-11-17 17:32:08, kemi wrote:
[...]
> Your patch saves more code than mine because the node stats framework 
> is reused for numa stats. But it has a performance regression because 
> of the limitation of threshold size (125 at most, see 
> calculate_normal_threshold() in vmstat.c) in inc_node_state().

But this "regression" would be visible only on those workloads which really 
need to squeeze every single cycle out of the allocation hot path, and those are 
supposed to disable the accounting altogether. Or is this visible on a wider 
variety of workloads?

Do not get me wrong. If we want to make per-node stats more optimal, then by 
all means let's do that. But having 3 sets of counters is just way too much.

--
Michal Hocko
SUSE Labs
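
For context on the "125 at most" limit mentioned in the quoted text, here is a rough Python transcription of how calculate_normal_threshold() in mm/vmstat.c derives the per-cpu counter threshold. The formula is reproduced from memory of kernels of that era and should be checked against the actual tree; the parameters online_cpus, managed_pages and page_shift are stand-ins for what the kernel reads from the zone and the system.

```python
def fls(x: int) -> int:
    """Find last set bit, 1-based, with fls(0) == 0 (the kernel helper's convention)."""
    return x.bit_length()


def calculate_normal_threshold(online_cpus: int, managed_pages: int,
                               page_shift: int = 12) -> int:
    # Zone size expressed in 128 MiB units (2**27 bytes).
    mem = managed_pages >> (27 - page_shift)
    threshold = 2 * fls(online_cpus) * (1 + fls(mem))
    # The cap referred to in the mail: never more than 125.
    return min(125, threshold)


# One CPU and a 32 MiB zone (8192 pages of 4 KiB).
print(calculate_normal_threshold(1, 8192))
# 192 CPUs and 768 GiB of 4 KiB pages, like the lkp-skl-4sp1 test box.
print(calculate_normal_threshold(192, 768 * 2**30 // 4096))
```

On a large machine the uncapped value would be 2 * 8 * 14 = 224, so the cap of 125 is what bounds how many increments a CPU can batch before touching the shared counter.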




RE: [PATCH 1/3] mm, sysctl: make VM stats configurable

2017-09-15 Thread Wang, Kemi
-----Original Message-----
From: Michal Hocko [mailto:mho...@kernel.org] 
Sent: Friday, September 15, 2017 7:50 PM
To: Wang, Kemi <kemi.w...@intel.com>
Cc: Luis R . Rodriguez <mcg...@kernel.org>; Kees Cook <keesc...@chromium.org>; 
Andrew Morton <a...@linux-foundation.org>; Jonathan Corbet <cor...@lwn.net>; 
Mel Gorman <mgor...@techsingularity.net>; Johannes Weiner <han...@cmpxchg.org>; 
Christopher Lameter <c...@linux.com>; Sebastian Andrzej Siewior 
<bige...@linutronix.de>; Vlastimil Babka <vba...@suse.cz>; Hillf Danton 
<hillf...@alibaba-inc.com>; Dave <dave.han...@linux.intel.com>; Chen, Tim C 
<tim.c.c...@intel.com>; Kleen, Andi <andi.kl...@intel.com>; Jesper Dangaard 
Brouer <bro...@redhat.com>; Huang, Ying <ying.hu...@intel.com>; Lu, Aaron 
<aaron...@intel.com>; Proc sysctl <linux-fsde...@vger.kernel.org>; Linux MM 
<linux...@kvack.org>; Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] mm, sysctl: make VM stats configurable

On Fri 15-09-17 17:23:24, Kemi Wang wrote:
> This patch adds a tunable interface that makes VM stats configurable, as
> suggested by Dave Hansen and Ying Huang.
> 
> When performance becomes a bottleneck and you can tolerate some possible
> tool breakage and some decreased counter precision (e.g. numa counter), you
> can do:
>   echo [C|c]oarse > /proc/sys/vm/vmstat_mode
> 
> When performance is not a bottleneck and you want all tooling to work, you
> can do:
>   echo [S|s]trict > /proc/sys/vm/vmstat_mode
> 
> We recommend letting the system detect the VM stats mode automatically;
> this is also the system default configuration. You can do:
>   echo [A|a]uto > /proc/sys/vm/vmstat_mode
> 
> The next patch handles NUMA statistics differently depending on the VM
> stats mode.

I would just merge this with the second patch so that it is clear how
those modes are implemented. I am also wondering why we cannot have a
much simpler interface and implementation to enable/disable numa stats
(btw. sysctl_vm_numa_stats would be more descriptive IMHO).

The motivation is to propose a general tunable interface for VM stats.
This would be more scalable, since we would not have to add an individual
interface for each type of counter that can be configured.
In the second patch, NUMA stats, as an example, benefit from that.

If you still prefer your approach, I don't mind merging them together.
-- 
Michal Hocko
SUSE Labs
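
The three accepted spellings in the patch description ([C|c]oarse, [S|s]trict, [A|a]uto) amount to a small parsing rule. The sketch below only illustrates that rule in Python; it is not the patch's code, parse_vmstat_mode is a hypothetical name, and /proc/sys/vm/vmstat_mode is the interface proposed in this series, which may not exist on any given kernel.

```python
_MODES = ("auto", "strict", "coarse")


def parse_vmstat_mode(text: str) -> str:
    """Map a string written to the proposed sysctl to a canonical mode name.

    Only the all-lowercase form and the form with a capital first letter
    are accepted, mirroring the [X|x] notation in the patch description.
    """
    word = text.strip()
    for mode in _MODES:
        if word in (mode, mode.capitalize()):
            return mode
    raise ValueError(f"unsupported vmstat_mode: {text!r}")


print(parse_vmstat_mode("Coarse\n"))
print(parse_vmstat_mode("strict"))
```

Anything else, such as "COARSE" or an empty string, is rejected with ValueError in this sketch; how the real patch reports bad input is not stated in the mail.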

