Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-05 Thread Yuanhan Liu
On Tue, Nov 05, 2013 at 11:10:43AM +0800, Yuanhan Liu wrote:
> On Mon, Nov 04, 2013 at 05:44:00PM -0800, Tim Chen wrote:
> > On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> > > On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > > > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > > > > 
> > > > > > * Yuanhan Liu  wrote:
> > > > > > 
> > > > > > > > Btw., another _really_ interesting comparison would be against 
> > > > > > > > the latest rwsem patches. Mind doing such a comparison?
> > > > > > > 
> > > > > > > Sure. Where can I get it? Are they on some git tree?
> > > > > > 
> > > > > > I've Cc:-ed Tim Chen who might be able to point you to the latest 
> > > > > > version.
> > > > > > 
> > > > > > The last on-lkml submission was in this thread:
> > > > > > 
> > > > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > > > > 
> > > > > 
> > > > > Thanks.
> > > > > 
> > > > > I queued bunches of tests about one hour ago, and already got some
> > > > > results (if necessary, I can add more data tomorrow when those tests
> > > > > are finished):
> > > > 
> > > > What kind of system are you using to run these workloads on?
> > > 
> > > I queued jobs on 5 testboxes:
> > >   - brickland1: 120 core Ivybridge server
> > >   - lkp-ib03:   48 core Ivybridge server
> > >   - lkp-sb03:   32 core Sandybridge server
> > >   - lkp-nex04:  64 core NHM server
> > >   - lkp-a04:    Atom server
> > > > 
> > > > > 
> > > > > 
> > > > >    v3.12-rc7  fe001e3de090e179f95d
> > > > > ------------  --------------------
> > > > >        -9.3%  brickland1/micro/aim7/shared
> > > > >        +4.3%  lkp-ib03/micro/aim7/fork_test
> > > > >        +2.2%  lkp-ib03/micro/aim7/shared
> > > > >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > > > > 
> > > > 
> > > > Sorry if I'm missing something, but could you elaborate more on what
> > > > these percentages represent?
> > > 
> > >    v3.12-rc7  fe001e3de090e179f95d
> > > ------------  --------------------
> > >        -9.3%  brickland1/micro/aim7/shared
> > > 
> > > 
> > >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > > 
> > > The comparison base is v3.12-rc7, and we got a 9.3% performance regression
> > > at commit fe001e3de090e179f95d, which is the head of the rwsem performance
> > > optimizations patch set.
> > 
> > Yuanhan, thanks for the data.  This I assume is with the entire rwsem
> > v8 patchset.
> 
> Yes, it is; 9 patches in total.
> 
> > Any idea of the run variation on the workload?
> 
> Your concern is right. The variation is quite big on the
> brickland1/micro/aim7/shared testcase.
> 
>    * - v3.12-rc7
>    O - fe001e3de090e179f95d
> 
>  brickland1/micro/aim7/shared: aim7.2000.jobs-per-min
> 
>    [ASCII plot of the per-run results omitted]
> 

Tim,

Please ignore this "regression"; it disappears when I run that testcase
6 times for both v3.12-rc7 and fe001e3de090e179f95d.

I guess 2000 users is a bit small for a 120 core IVB server. I may try to
increase the user count and test again to see how it behaves with your
patches applied.

Sorry for the inconvenience.

--yliu

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-04 Thread Yuanhan Liu
On Mon, Nov 04, 2013 at 05:44:00PM -0800, Tim Chen wrote:
> On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> > On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > > > 
> > > > > * Yuanhan Liu  wrote:
> > > > > 
> > > > > > > Btw., another _really_ interesting comparison would be against 
> > > > > > > the latest rwsem patches. Mind doing such a comparison?
> > > > > > 
> > > > > > Sure. Where can I get it? Are they on some git tree?
> > > > > 
> > > > > I've Cc:-ed Tim Chen who might be able to point you to the latest 
> > > > > version.
> > > > > 
> > > > > The last on-lkml submission was in this thread:
> > > > > 
> > > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > > > 
> > > > 
> > > > Thanks.
> > > > 
> > > > I queued bunches of tests about one hour ago, and already got some
> > > > results (if necessary, I can add more data tomorrow when those tests are
> > > > finished):
> > > 
> > > What kind of system are you using to run these workloads on?
> > 
> > I queued jobs on 5 testboxes:
> >   - brickland1: 120 core Ivybridge server
> >   - lkp-ib03:   48 core Ivybridge server
> >   - lkp-sb03:   32 core Sandybridge server
> >   - lkp-nex04:  64 core NHM server
> >   - lkp-a04:    Atom server
> > > 
> > > > 
> > > > 
> > > >    v3.12-rc7  fe001e3de090e179f95d
> > > > ------------  --------------------
> > > >        -9.3%  brickland1/micro/aim7/shared
> > > >        +4.3%  lkp-ib03/micro/aim7/fork_test
> > > >        +2.2%  lkp-ib03/micro/aim7/shared
> > > >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > > > 
> > > 
> > > Sorry if I'm missing something, but could you elaborate more on what
> > > these percentages represent?
> > 
> >    v3.12-rc7  fe001e3de090e179f95d
> > ------------  --------------------
> >        -9.3%  brickland1/micro/aim7/shared
> > 
> > 
> >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > 
> > The comparison base is v3.12-rc7, and we got a 9.3% performance regression
> > at commit fe001e3de090e179f95d, which is the head of the rwsem performance
> > optimizations patch set.
> 
> Yuanhan, thanks for the data.  This I assume is with the entire rwsem
> v8 patchset.

Yes, it is; 9 patches in total.

> Any idea of the run variation on the workload?

Your concern is right. The variation is quite big on the
brickland1/micro/aim7/shared testcase.

   * - v3.12-rc7
   O - fe001e3de090e179f95d

 brickland1/micro/aim7/shared: aim7.2000.jobs-per-min

   [ASCII plot of the per-run results omitted]


--yliu
> > 
> > "brickland1/micro/aim7/shared" tells the testbox(brickland1) and testcase:
> > shared workfile of aim7.
> > 
> > The last line tell what field we are comparing, and it's
> > "aim7.2000.jobs-per-min" in this case. 2000 means 2000 users in aim7.
> > 
> > > Are they anon vma rwsem + optimistic
> > > spinning patches vs anon vma rwlock?
> > 
> > I tested "[PATCH v8 0/9] rwsem performance optimizations" only.
> > 
> > > 
> > > Also, I see you're running aim7; you might be interested in some of the
> > > results I found when trying out Ingo's rwlock conversion patch on a
> > > largish 80 core system: https://lkml.org/lkml/2013/9/29/280
> > 
> > Besides aim7, I also tested dbench, hackbench, netperf and pigz. And as you
> > can imagine and see from the data, aim7 benefits most from the anon_vma
> > optimization stuff due to high contention on the anon_vma lock.
> > 

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-04 Thread Tim Chen
On Mon, 2013-11-04 at 17:44 -0800, Tim Chen wrote:
> On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> > On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > > > 
> > > > > * Yuanhan Liu  wrote:
> > > > > 
> > > > > > > Btw., another _really_ interesting comparison would be against 
> > > > > > > the latest rwsem patches. Mind doing such a comparison?
> > > > > > 
> > > > > > Sure. Where can I get it? Are they on some git tree?
> > > > > 
> > > > > I've Cc:-ed Tim Chen who might be able to point you to the latest 
> > > > > version.
> > > > > 
> > > > > The last on-lkml submission was in this thread:
> > > > > 
> > > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > > > 
> > > > 
> > > > Thanks.
> > > > 
> > > > I queued bunches of tests about one hour ago, and already got some
> > > > results (if necessary, I can add more data tomorrow when those tests are
> > > > finished):
> > > 
> > > What kind of system are you using to run these workloads on?
> > 
> > I queued jobs on 5 testboxes:
> >   - brickland1: 120 core Ivybridge server
> >   - lkp-ib03:   48 core Ivybridge server
> >   - lkp-sb03:   32 core Sandybridge server
> >   - lkp-nex04:  64 core NHM server
> >   - lkp-a04:    Atom server
> > > 
> > > > 
> > > > 
> > > >    v3.12-rc7  fe001e3de090e179f95d
> > > > ------------  --------------------
> > > >        -9.3%  brickland1/micro/aim7/shared
> > > >        +4.3%  lkp-ib03/micro/aim7/fork_test
> > > >        +2.2%  lkp-ib03/micro/aim7/shared
> > > >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > > > 
> > > 
> > > Sorry if I'm missing something, but could you elaborate more on what
> > > these percentages represent?
> > 
> >    v3.12-rc7  fe001e3de090e179f95d
> > ------------  --------------------
> >        -9.3%  brickland1/micro/aim7/shared
> > 
> > 
> >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > 
> > The comparison base is v3.12-rc7, and we got a 9.3% performance regression
> > at commit fe001e3de090e179f95d, which is the head of the rwsem performance
> > optimizations patch set.
> 
> Yuanhan, thanks for the data.  This I assume is with the entire rwsem
> v8 patchset. Any idea of the run variation on the workload?

Yuanhan,

I haven't got a chance to make multiple runs to check the standard
deviation.  From the few runs I did, I got a 5.1% increase in
performance for the aim7 shared workload with the complete rwsem patchset
on a machine similar to the one you are using.  The patches are applied
to 3.12-rc7 and compared to the vanilla kernel.

Thanks.

Tim

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-04 Thread Tim Chen
On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > > 
> > > > * Yuanhan Liu  wrote:
> > > > 
> > > > > > Btw., another _really_ interesting comparison would be against 
> > > > > > the latest rwsem patches. Mind doing such a comparison?
> > > > > 
> > > > > Sure. Where can I get it? Are they on some git tree?
> > > > 
> > > > I've Cc:-ed Tim Chen who might be able to point you to the latest 
> > > > version.
> > > > 
> > > > The last on-lkml submission was in this thread:
> > > > 
> > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > > 
> > > 
> > > Thanks.
> > > 
> > > I queued bunches of tests about one hour ago, and already got some
> > > results (if necessary, I can add more data tomorrow when those tests are
> > > finished):
> > 
> > What kind of system are you using to run these workloads on?
> 
> I queued jobs on 5 testboxes:
>   - brickland1: 120 core Ivybridge server
>   - lkp-ib03:   48 core Ivybridge server
>   - lkp-sb03:   32 core Sandybridge server
>   - lkp-nex04:  64 core NHM server
>   - lkp-a04:    Atom server
> > 
> > > 
> > > 
> > >    v3.12-rc7  fe001e3de090e179f95d
> > > ------------  --------------------
> > >        -9.3%  brickland1/micro/aim7/shared
> > >        +4.3%  lkp-ib03/micro/aim7/fork_test
> > >        +2.2%  lkp-ib03/micro/aim7/shared
> > >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > > 
> > 
> > Sorry if I'm missing something, but could you elaborate more on what
> > these percentages represent?
> 
>    v3.12-rc7  fe001e3de090e179f95d
> ------------  --------------------
>        -9.3%  brickland1/micro/aim7/shared
> 
> 
>        -2.6%  TOTAL aim7.2000.jobs-per-min
> 
> The comparison base is v3.12-rc7, and we got a 9.3% performance regression
> at commit fe001e3de090e179f95d, which is the head of the rwsem performance
> optimizations patch set.

Yuanhan, thanks for the data.  This I assume is with the entire rwsem
v8 patchset. Any idea of the run variation on the workload?

Tim

> 
> "brickland1/micro/aim7/shared" tells the testbox(brickland1) and testcase:
> shared workfile of aim7.
> 
> The last line tell what field we are comparing, and it's
> "aim7.2000.jobs-per-min" in this case. 2000 means 2000 users in aim7.
> 
> > Are they anon vma rwsem + optimistic
> > spinning patches vs anon vma rwlock?
> 
> I tested "[PATCH v8 0/9] rwsem performance optimizations" only.
> 
> > 
> > Also, I see you're running aim7; you might be interested in some of the
> > results I found when trying out Ingo's rwlock conversion patch on a
> > largish 80 core system: https://lkml.org/lkml/2013/9/29/280
> 
> Besides aim7, I also tested dbench, hackbench, netperf and pigz. And as you
> can imagine and see from the data, aim7 benefits most from the anon_vma
> optimization stuff due to high contention on the anon_vma lock.
> 
> Thanks.
> 
>   --yliu
> 


Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-03 Thread Yuanhan Liu
On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > 
> > > * Yuanhan Liu  wrote:
> > > 
> > > > > Btw., another _really_ interesting comparison would be against 
> > > > > the latest rwsem patches. Mind doing such a comparison?
> > > > 
> > > > Sure. Where can I get it? Are they on some git tree?
> > > 
> > > I've Cc:-ed Tim Chen who might be able to point you to the latest 
> > > version.
> > > 
> > > The last on-lkml submission was in this thread:
> > > 
> > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > 
> > 
> > Thanks.
> > 
> > I queued bunches of tests about one hour ago, and already got some
> > results (if necessary, I can add more data tomorrow when those tests are
> > finished):
> 
> What kind of system are you using to run these workloads on?

I queued jobs on 5 testboxes:
  - brickland1: 120 core Ivybridge server
  - lkp-ib03:   48 core Ivybridge server
  - lkp-sb03:   32 core Sandybridge server
  - lkp-nex04:  64 core NHM server
  - lkp-a04:    Atom server
> 
> > 
> > 
> >    v3.12-rc7  fe001e3de090e179f95d
> > ------------  --------------------
> >        -9.3%  brickland1/micro/aim7/shared
> >        +4.3%  lkp-ib03/micro/aim7/fork_test
> >        +2.2%  lkp-ib03/micro/aim7/shared
> >        -2.6%  TOTAL aim7.2000.jobs-per-min
> > 
> 
> Sorry if I'm missing something, but could you elaborate more on what
> these percentages represent?

   v3.12-rc7  fe001e3de090e179f95d
------------  --------------------
       -9.3%  brickland1/micro/aim7/shared


       -2.6%  TOTAL aim7.2000.jobs-per-min

The comparison base is v3.12-rc7, and we got a 9.3% performance regression
at commit fe001e3de090e179f95d, which is the head of the rwsem performance
optimizations patch set.

"brickland1/micro/aim7/shared" tells the testbox(brickland1) and testcase:
shared workfile of aim7.

The last line tell what field we are comparing, and it's
"aim7.2000.jobs-per-min" in this case. 2000 means 2000 users in aim7.

> Are they anon vma rwsem + optimistic
> spinning patches vs anon vma rwlock?

I tested "[PATCH v8 0/9] rwsem performance optimizations" only.

> 
> Also, I see you're running aim7; you might be interested in some of the
> results I found when trying out Ingo's rwlock conversion patch on a
> largish 80 core system: https://lkml.org/lkml/2013/9/29/280

Besides aim7, I also tested dbench, hackbench, netperf and pigz. And as you
can imagine and see from the data, aim7 benefits most from the anon_vma
optimization stuff due to high contention on the anon_vma lock.

Thanks.

--yliu

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Davidlohr Bueso
On Fri, 2013-11-01 at 11:55 -0700, Linus Torvalds wrote:
> On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse  wrote:
> >
> > Should copy Andrea on this. I talked with him during KS, and there are
> > no current in-tree users who are doing such sleeping; however there
> > are prospective users for networking (RDMA) or GPU stuff who want to
> > use this to let hardware directly copy data into user mappings.
> 
> Tough.
> 
> I spoke up the first time this came up and I'll say the same thing
> again: we're not screwing over the VM subsystem because some crazy
> user might want to do crazy and stupid things that nobody sane cares
> about.
> 
> The whole "somebody might want to .." argument is just irrelevant.

Ok, I was under the impression that this was something already in the
kernel and hence "too late to go back". Based on the results I'm
definitely in favor of the whole rwlock conversion.

Thanks,
Davidlohr

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Davidlohr Bueso
On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > 
> > * Yuanhan Liu  wrote:
> > 
> > > > Btw., another _really_ interesting comparison would be against 
> > > > the latest rwsem patches. Mind doing such a comparison?
> > > 
> > > Sure. Where can I get it? Are they on some git tree?
> > 
> > I've Cc:-ed Tim Chen who might be able to point you to the latest 
> > version.
> > 
> > The last on-lkml submission was in this thread:
> > 
> >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > 
> 
> Thanks.
> 
> I queued bunches of tests about one hour ago, and already got some
> results (if necessary, I can add more data tomorrow when those tests are
> finished):

What kind of system are you using to run these workloads on?

> 
> 
>    v3.12-rc7  fe001e3de090e179f95d
> ------------  --------------------
>        -9.3%  brickland1/micro/aim7/shared
>        +4.3%  lkp-ib03/micro/aim7/fork_test
>        +2.2%  lkp-ib03/micro/aim7/shared
>        -2.6%  TOTAL aim7.2000.jobs-per-min
> 

Sorry if I'm missing something, but could you elaborate more on what
these percentages represent? Are they anon vma rwsem + optimistic
spinning patches vs anon vma rwlock?

Also, I see you're running aim7; you might be interested in some of the
results I found when trying out Ingo's rwlock conversion patch on a
largish 80 core system: https://lkml.org/lkml/2013/9/29/280

Thanks,
Davidlohr

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread KOSAKI Motohiro
(11/1/13 3:54 AM), Yuanhan Liu wrote:
> Patch 1 turns locking the anon_vma's root into locking the anon_vma itself,
> making it a per anon_vma lock, which would reduce contention.
> 
> At the same time, the lock range becomes quite small, basically just
> a call of anon_vma_interval_tree_insert(). Patch 2 turns the rwsem into an
> rwlock_t. It's a patch from Ingo; I just made some changes to let it apply
> on top of patch 1.
> 
> Patch 3 is from Peter. It was a diff; I edited it into a patch ;)
> 
> Here are the detailed stat changes with these patches applied. The test base
> is v3.12-rc7, and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f is
> patch 2.
> 
> NOTE: both commits are compared to base v3.12-rc7.

I'd suggest CCing linux-mm when posting mm patches.
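
[Aside: a rough sketch of the kind of conversion the quoted cover letter
describes, for readers skimming the archive. This is an illustration only,
not the actual patches; the struct layout and helper names are simplified
assumptions.]

/* Sketch only: assumed kernel context, loosely modeled on mm/rmap.c. */
#include <linux/rwsem.h>
#include <linux/spinlock.h>

struct anon_vma {
	struct anon_vma *root;		/* root of the anon_vma group */
	struct rw_semaphore rwsem;	/* v3.12: all lockers go through root */
	rwlock_t rwlock;		/* sketch of a per anon_vma rwlock_t */
	/* ... interval tree root, refcount, etc. ... */
};

/* v3.12 behaviour: every anon_vma serializes on its root's sleeping rwsem. */
static inline void anon_vma_lock_write_before(struct anon_vma *anon_vma)
{
	down_write(&anon_vma->root->rwsem);
}

/*
 * Direction of patches 1+2 (sketch): lock only this anon_vma, and use a
 * non-sleeping rwlock_t, since the write-side critical section shrinks to
 * roughly one anon_vma_interval_tree_insert() call.
 */
static inline void anon_vma_lock_write_after(struct anon_vma *anon_vma)
{
	write_lock(&anon_vma->rwlock);
}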

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Linus Torvalds
On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse  wrote:
>
> Should copy Andrea on this. I talked with him during KS, and there are
> no current in-tree users who are doing such sleeping; however there
> are prospective users for networking (RDMA) or GPU stuff who want to
> use this to let hardware directly copy data into user mappings.

Tough.

I spoke up the first time this came up and I'll say the same thing
again: we're not screwing over the VM subsystem because some crazy
user might want to do crazy and stupid things that nobody sane cares
about.

The whole "somebody might want to .." argument is just irrelevant.
Some people want to sleep in interrupt handlers too, or while holding
random spinlocks. Too bad. They don't get to, because doing that
results in problems for the rest of the system.

Our job in the kernel is to do the best job technically that we can.
And sometimes that very much involves saying "No, you can't do that".

We have limitations in the kernel. The stack is of limited size. You
can't allocate arbitrarily sized memory. You must follow some very
strict rules.

If people can't handle that, then they can go cry to mommy, and go
back to writing user mode code. In the kernel, you have to live with
certain constraints that make the kernel better.

   Linus
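
[Aside: the constraint being argued about here is the usual one for these
lock types: an rw_semaphore may be held across sleeping operations, while an
rwlock_t spins and must not be. A minimal illustrative sketch in kernel-style
C; the function names are made up and this is not code from the patches.]

/* Sketch only: assumed kernel context. */
#include <linux/rwsem.h>
#include <linux/spinlock.h>

static DECLARE_RWSEM(demo_rwsem);
static DEFINE_RWLOCK(demo_rwlock);

/*
 * An rwsem is a sleeping lock: callers may block while holding it, e.g.
 * waiting for hardware to acknowledge a teardown (the RDMA/GPU case
 * mentioned elsewhere in this thread).
 */
static void teardown_may_sleep(void)
{
	down_read(&demo_rwsem);
	/* ... may call something that sleeps, e.g. wait_for_completion() ... */
	up_read(&demo_rwsem);
}

/*
 * An rwlock_t spins: sleeping while holding it is a bug, so converting the
 * anon_vma lock to an rwlock_t rules out sleepable users under that lock.
 */
static void teardown_must_not_sleep(void)
{
	read_lock(&demo_rwlock);
	/* ... must stay atomic: no blocking allocations, no waiting ... */
	read_unlock(&demo_rwlock);
}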

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Michel Lespinasse
On Fri, Nov 1, 2013 at 11:09 AM, Linus Torvalds
 wrote:
> On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso  wrote:
>>
>> Andrea's last input from this kind of conversion is that it cannot be
>> done (at least yet): https://lkml.org/lkml/2013/9/30/53
>
> No, none of the invalidate_page users really need to sleep. If doing
> this makes some people not do stupid sh*t, then that's all good. So at
> least _that_ worry was a false alarm. We definitely don't want to
> support crap in the VM, and sleeping during teardown is crap.

Should copy Andrea on this. I talked with him during KS, and there are
no current in-tree users who are doing such sleeping; however there
are prospective users for networking (RDMA) or GPU stuff who want to
use this to let hardware directly copy data into user mappings. I'm
not too aware of the details, but my understanding is that we then
need to send the NIC and/or GPU some commands to tear down the
mapping, and that command is currently acknowledged with an interrupt,
which is where the sleepability requirement comes from. Andrea was
thinking about cooking up some scheme to dynamically change between
sleepable and non-sleepable locks at runtime depending on when such
drivers are used; this seems quite complicated to me but I haven't
heard of alternative plans for RDMA usage either.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Linus Torvalds
On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso  wrote:
>
> Andrea's last input from this kind of conversion is that it cannot be
> done (at least yet): https://lkml.org/lkml/2013/9/30/53

No, none of the invalidate_page users really need to sleep. If doing
this makes some people not do stupid sh*t, then that's all good. So at
least _that_ worry was a false alarm. We definitely don't want to
support crap in the VM, and sleeping during teardown is crap.

Linus

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Davidlohr Bueso
On Fri, 2013-11-01 at 09:01 +0100, Ingo Molnar wrote:
> * Yuanhan Liu  wrote:
> 
> > Patch 1 turns locking the anon_vma's root into locking the anon_vma itself,
> > making it a per anon_vma lock, which would reduce contention.
> > 
> > At the same time, the lock range becomes quite small, basically just
> > a call of anon_vma_interval_tree_insert(). Patch 2 turns the rwsem into an
> > rwlock_t. It's a patch from Ingo; I just made some changes to let it apply
> > on top of patch 1.

Andrea's last input from this kind of conversion is that it cannot be
done (at least yet): https://lkml.org/lkml/2013/9/30/53

Thanks,
Davidlohr

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Yuanhan Liu
On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> 
> * Yuanhan Liu  wrote:
> 
> > > Btw., another _really_ interesting comparison would be against 
> > > the latest rwsem patches. Mind doing such a comparison?
> > 
> > Sure. Where can I get it? Are they on some git tree?
> 
> I've Cc:-ed Tim Chen who might be able to point you to the latest 
> version.
> 
> The last on-lkml submission was in this thread:
> 
>   Subject: [PATCH v8 0/9] rwsem performance optimizations
> 

Thanks.

I queued bunches of tests about one hour ago, and already got some
results (if necessary, I can add more data tomorrow when those tests are
finished):


   v3.12-rc7  fe001e3de090e179f95d
------------  --------------------
       -9.3%  brickland1/micro/aim7/shared
       +4.3%  lkp-ib03/micro/aim7/fork_test
       +2.2%  lkp-ib03/micro/aim7/shared
       -2.6%  TOTAL aim7.2000.jobs-per-min

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
     204056.67   -23.5%      156082.33  brickland1/micro/aim7/shared
      79248.00  +144.3%      193617.25  lkp-ib03/micro/aim7/fork_test
     298355.33   -25.2%      223084.67  lkp-ib03/micro/aim7/shared
     581660.00    -1.5%      572784.25  TOTAL time.involuntary_context_switches

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
      22487.33    -4.7%       21429.33  brickland1/micro/aim7/dbase
      61412.67   -29.1%       43511.00  brickland1/micro/aim7/shared
     531142.00   -27.7%      383818.75  lkp-ib03/micro/aim7/fork_test
      20158.33   -50.9%        9899.67  lkp-ib03/micro/aim7/shared
     635200.33   -27.8%      458658.75  TOTAL vmstat.system.in

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
       6408.67    -4.5%        6117.33  brickland1/micro/aim7/dbase
      87856.00   -39.5%       53170.67  brickland1/micro/aim7/shared
    1043620.00   -28.0%      751214.75  lkp-ib03/micro/aim7/fork_test
      47152.33   -38.0%       29245.33  lkp-ib03/micro/aim7/shared
    1185037.00   -29.1%      839748.08  TOTAL vmstat.system.cs

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
      13295.00   -10.0%       11960.00  brickland1/micro/aim7/dbase
    1901175.00   -35.5%     1226787.33  brickland1/micro/aim7/shared
      13951.00    -6.5%       13051.00  lkp-ib03/micro/aim7/dbase
  239773251.17   -30.9%   165727820.75  lkp-ib03/micro/aim7/fork_test
    1014933.67   -31.1%      699259.67  lkp-ib03/micro/aim7/shared
  242716605.83   -30.9%   167678878.75  TOTAL time.voluntary_context_switches

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
          9.56    -1.0%           9.46  brickland1/micro/aim7/dbase
         11.01   -10.1%           9.90  brickland1/micro/aim7/shared
         36.23   +15.3%          41.77  lkp-ib03/micro/aim7/fork_test
         10.51   -11.9%           9.26  lkp-ib03/micro/aim7/shared
         67.31    +4.6%          70.39  TOTAL iostat.cpu.system

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
         36.39    -3.6%          35.09  brickland1/micro/aim7/dbase
         34.97    -8.1%          32.13  brickland1/micro/aim7/shared
         20.34    +6.7%          21.70  lkp-ib03/micro/aim7/shared
         91.70    -3.0%          88.92  TOTAL boottime.dhcp

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
         60.00    +6.7%          64.00  brickland1/micro/aim7/shared
         60.83    -9.2%          55.25  lkp-ib03/micro/aim7/fork_test
        120.83    -1.3%         119.25  TOTAL vmstat.cpu.id

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
        345.50    -1.1%         341.73  brickland1/micro/aim7/dbase
       3788.80   +11.5%        4223.15  lkp-ib03/micro/aim7/fork_test
        108.29    -7.1%         100.62  lkp-ib03/micro/aim7/shared
       4242.59   +10.0%        4665.50  TOTAL time.system_time

     v3.12-rc7           fe001e3de090e179f95d
--------------  -----------------------------
       7481.33    -0.4%        7454.00

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Ingo Molnar

* Yuanhan Liu  wrote:

> > Btw., another _really_ interesting comparison would be against 
> > the latest rwsem patches. Mind doing such a comparison?
> 
> Sure. Where can I get it? Are they on some git tree?

I've Cc:-ed Tim Chen who might be able to point you to the latest 
version.

The last on-lkml submission was in this thread:

  Subject: [PATCH v8 0/9] rwsem performance optimizations

Thanks,

Ingo

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Yuanhan Liu
On Fri, Nov 01, 2013 at 09:01:36AM +0100, Ingo Molnar wrote:
> 
> * Yuanhan Liu  wrote:
> 
> > Patch 1 turns locking the anon_vma's root into locking the anon_vma itself,
> > making it a per anon_vma lock, which would reduce contention.
> > 
> > At the same time, the lock range becomes quite small, basically just
> > a call of anon_vma_interval_tree_insert(). Patch 2 turns the rwsem into an
> > rwlock_t. It's a patch from Ingo; I just made some changes to let it apply
> > on top of patch 1.
> > 
> > Patch 3 is from Peter. It was a diff; I edited it into a patch ;)
> > 
> > Here are the detailed stat changes with these patches applied. The test base
> > is v3.12-rc7, and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f is
> > patch 2.
> > 
> > NOTE: both commits are compared to base v3.12-rc7.
> > 
> >   1c00bef768d4341afa7d  e3e37183ee805f33e88f
> >   --------------------  --------------------
> >       +35.0%    +89.9%  brickland1/micro/aim7/fork_test
> >       +28.4%    +49.3%  lkp-ib03/micro/aim7/fork_test
> >        +2.0%     +2.7%  lkp-ib03/micro/aim7/shared
> >        -0.4%     +0.0%  lkp-sb03/micro/aim7/dbase
> >       +16.4%    +59.0%  lkp-sb03/micro/aim7/fork_test
> >        +0.1%     +0.3%  lkp-sb03/micro/aim7/shared
> >        +2.2%     +5.0%  TOTAL aim7.2000.jobs-per-min
> 
> Impressive!
> 
> >   1c00bef768d4341afa7d  e3e37183ee805f33e88f
> >   --------------------  --------------------
> >     -25.9%   1008.55     -47.3%    717.39  brickland1/micro/aim7/fork_test
> >      -1.4%    641.19      -3.4%    628.45  brickland1/micro/hackbench/1600%-process-pipe
> >      -1.0%    122.84      +1.1%    125.36  brickland1/micro/netperf/120s-200%-UDP_RR
> >      +0.0%    121.29      +0.2%    121.57  lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
> >     -22.1%    351.41     -26.3%    332.54  lkp-ib03/micro/aim7/fork_test
> >      -1.9%     31.33      -2.6%     31.11  lkp-ib03/micro/aim7/shared
> >      -0.4%    630.36      +0.4%    635.05  lkp-ib03/micro/hackbench/1600%-process-socket
> >      -0.0%    612.62      +1.8%    623.80  lkp-ib03/micro/hackbench/1600%-threads-socket
> >     -14.1%    340.30     -37.1%    249.26  lkp-sb03/micro/aim7/fork_test
> >      -0.1%     41.31      -0.3%     41.22  lkp-sb03/micro/aim7/shared
> >      -0.0%    614.26      +0.6%    617.81  lkp-sb03/micro/hackbench/1600%-process-socket
> >     -10.4%   4515.47     -18.2%   4123.55  TOTAL time.elapsed_time
> 
> Here you scared me for a second with those negative percentages! :-)

Aha.. 

> 
> >   1c00bef768d4341afa7d  e3e37183ee805f33e88f
> >   --------------------  --------------------
> >     +26.7%    323386.33     -75.7%     61980.00  brickland1/micro/aim7/fork_test
> >     -22.9%     67734.00     -64.1%     31531.33  brickland1/micro/aim7/shared
> >      +0.4%      3303.67      -0.8%      3264.33  brickland1/micro/dbench/100%
> >      +0.7%   1871483.67      -0.4%   1850846.00  brickland1/micro/netperf/120s-200%-TCP_MAERTS
> >      -1.0%    109553.00      +0.4%    111038.67  brickland1/micro/pigz/100%
> >      -0.7%     13600.67      +0.1%     13718.67  lkp-a04/micro/netperf/120s-200%-TCP_CRR
> >      -4.6%    995898.00     -85.2%    154621.40  lkp-ib03/micro/aim7/fork_test
> >     -31.8%     32178.00     -50.3%     23442.67  lkp-ib03/micro/aim7/shared
> >      +1.1%   7466432.67      -0.7%   7334831.67  lkp-ib03/micro/hackbench/1600%-threads-pipe
> >      +2.5%   1044936.33      -1.3%   1006084.00  lkp-ib03/micro/hackbench/1600%-threads-socket
> >      -1.3%   5635979.00      +0.2%   5721011.67  lkp-ib03/micro/netperf/120s-200%-TCP_RR
> >     -24.3%     42853.33     -56.8%     24484.33  lkp-nex04/micro/aim7/shared
> >     -23.3%    754297.67     -83.2%    165479.00  lkp-sb03/micro/aim7/fork_test
> >      -7.4%     21586.00     -24.1%     17698.33  lkp-sb03/micro/aim7/shared
> >      +1.1%   3838724.00      +0.3%   3808206.67  lkp-sb03/micro/hackbench/1600%-process-pipe
> >      +0.8%   5143255.00      -1.1%   5046716.67  lkp-sb03/micro/hackbench/1600%-threads-pipe
> >      +2.8%    537048.67      -0.8%    518351.67  lkp-sb03/micro/hackbench/1600%-threads-socket
> >      +4.0%     50446.67      -5.3%     45960.00  lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS
> >     -42.0%     52693.00     -26.4%     66849.67  lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
> 

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Ingo Molnar

* Yuanhan Liu  wrote:

> Patch 1 turns locking the anon_vma's root to locking itself to let it be
> a per anon_vma lock, which would reduce contentions.
> 
> In the same time, lock range becomes quite small then, which is bascially
> a call of anon_vma_interval_tree_insert(). Patch 2 turn rwsem to rwlock_t.
> It's a patch made from Ingo, I just made some change to let it apply based on
> patch 1.
> 
> Patch 3 is from Peter. It was a diff, I edited it to be a patch ;)
> 
> Here is the detailed changed stats with this patch applied. The test base is 
> v3.12-rc7,
> and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f is patch 2.
> 
> NOTE: both commits are compared to base v3.12-rc7.
> 
>   1c00bef768d4341afa7d  e3e37183ee805f33e88f
>   --------------------  --------------------
>       +35.0%    +89.9%  brickland1/micro/aim7/fork_test
>       +28.4%    +49.3%  lkp-ib03/micro/aim7/fork_test
>        +2.0%     +2.7%  lkp-ib03/micro/aim7/shared
>        -0.4%     +0.0%  lkp-sb03/micro/aim7/dbase
>       +16.4%    +59.0%  lkp-sb03/micro/aim7/fork_test
>        +0.1%     +0.3%  lkp-sb03/micro/aim7/shared
>        +2.2%     +5.0%  TOTAL aim7.2000.jobs-per-min

Impressive!

>   1c00bef768d4341afa7d  e3e37183ee805f33e88f
>   --------------------  --------------------
>     -25.9%   1008.55     -47.3%    717.39  brickland1/micro/aim7/fork_test
>      -1.4%    641.19      -3.4%    628.45  brickland1/micro/hackbench/1600%-process-pipe
>      -1.0%    122.84      +1.1%    125.36  brickland1/micro/netperf/120s-200%-UDP_RR
>      +0.0%    121.29      +0.2%    121.57  lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
>     -22.1%    351.41     -26.3%    332.54  lkp-ib03/micro/aim7/fork_test
>      -1.9%     31.33      -2.6%     31.11  lkp-ib03/micro/aim7/shared
>      -0.4%    630.36      +0.4%    635.05  lkp-ib03/micro/hackbench/1600%-process-socket
>      -0.0%    612.62      +1.8%    623.80  lkp-ib03/micro/hackbench/1600%-threads-socket
>     -14.1%    340.30     -37.1%    249.26  lkp-sb03/micro/aim7/fork_test
>      -0.1%     41.31      -0.3%     41.22  lkp-sb03/micro/aim7/shared
>      -0.0%    614.26      +0.6%    617.81  lkp-sb03/micro/hackbench/1600%-process-socket
>     -10.4%   4515.47     -18.2%   4123.55  TOTAL time.elapsed_time

Here you scared me for a second with those negative percentages! :-)

>   1c00bef768d4341afa7d  e3e37183ee805f33e88f
>   --------------------  --------------------
>     +26.7%    323386.33     -75.7%     61980.00  brickland1/micro/aim7/fork_test
>     -22.9%     67734.00     -64.1%     31531.33  brickland1/micro/aim7/shared
>      +0.4%      3303.67      -0.8%      3264.33  brickland1/micro/dbench/100%
>      +0.7%   1871483.67      -0.4%   1850846.00  brickland1/micro/netperf/120s-200%-TCP_MAERTS
>      -1.0%    109553.00      +0.4%    111038.67  brickland1/micro/pigz/100%
>      -0.7%     13600.67      +0.1%     13718.67  lkp-a04/micro/netperf/120s-200%-TCP_CRR
>      -4.6%    995898.00     -85.2%    154621.40  lkp-ib03/micro/aim7/fork_test
>     -31.8%     32178.00     -50.3%     23442.67  lkp-ib03/micro/aim7/shared
>      +1.1%   7466432.67      -0.7%   7334831.67  lkp-ib03/micro/hackbench/1600%-threads-pipe
>      +2.5%   1044936.33      -1.3%   1006084.00  lkp-ib03/micro/hackbench/1600%-threads-socket
>      -1.3%   5635979.00      +0.2%   5721011.67  lkp-ib03/micro/netperf/120s-200%-TCP_RR
>     -24.3%     42853.33     -56.8%     24484.33  lkp-nex04/micro/aim7/shared
>     -23.3%    754297.67     -83.2%    165479.00  lkp-sb03/micro/aim7/fork_test
>      -7.4%     21586.00     -24.1%     17698.33  lkp-sb03/micro/aim7/shared
>      +1.1%   3838724.00      +0.3%   3808206.67  lkp-sb03/micro/hackbench/1600%-process-pipe
>      +0.8%   5143255.00      -1.1%   5046716.67  lkp-sb03/micro/hackbench/1600%-threads-pipe
>      +2.8%    537048.67      -0.8%    518351.67  lkp-sb03/micro/hackbench/1600%-threads-socket
>      +4.0%     50446.67      -5.3%     45960.00  lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS
>     -42.0%     52693.00     -26.4%     66849.67  lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
>      -0.6%  28005389.67      -7.7%  26006116.73  TOTAL vmstat.system.cs

Looks like a win all across, with a few below-1% regressions which
might be statistical outliers - it's hard to tell without a stddev
column ...
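
[Aside: a minimal sketch of how such a stddev column could be produced from
repeated runs, in plain C (not part of the LKP tooling); the sample numbers
below are made up.]

#include <math.h>
#include <stdio.h>

/* Sample standard deviation of n benchmark results. */
static double stddev(const double *x, int n)
{
	double mean = 0.0, ss = 0.0;
	int i;

	for (i = 0; i < n; i++)
		mean += x[i];
	mean /= n;

	for (i = 0; i < n; i++)
		ss += (x[i] - mean) * (x[i] - mean);

	return sqrt(ss / (n - 1));
}

int main(void)
{
	/* Hypothetical jobs-per-min results from 6 repeated runs. */
	double runs[] = { 265000, 271000, 309000, 298000, 312000, 286000 };
	int n = sizeof(runs) / sizeof(runs[0]);

	printf("stddev = %.1f\n", stddev(runs, n));
	return 0;
}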

>   1c00bef768d4341afa7d  e3e37183ee805f33e88f  
>   

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Ingo Molnar

* Yuanhan Liu yuanhan@linux.intel.com wrote:

 Patch 1 turns locking the anon_vma's root to locking itself to let it be
 a per anon_vma lock, which would reduce contentions.
 
 In the same time, lock range becomes quite small then, which is bascially
 a call of anon_vma_interval_tree_insert(). Patch 2 turn rwsem to rwlock_t.
 It's a patch made from Ingo, I just made some change to let it apply based on
 patch 1.
 
 Patch 3 is from Peter. It was a diff, I edited it to be a patch ;)
 
 Here is the detailed changed stats with this patch applied. The test base is 
 v3.12-rc7,
 and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f is patch 2.
 
 NOTE: both commits are compared to base v3.12-rc7.
 
   1c00bef768d4341afa7d  e3e37183ee805f33e88f  
       
+35.0%+89.9%   
 brickland1/micro/aim7/fork_test
+28.4%+49.3%   
 lkp-ib03/micro/aim7/fork_test
 +2.0% +2.7%   
 lkp-ib03/micro/aim7/shared
 -0.4% +0.0%   
 lkp-sb03/micro/aim7/dbase
+16.4%+59.0%   
 lkp-sb03/micro/aim7/fork_test
 +0.1% +0.3%   
 lkp-sb03/micro/aim7/shared
 +2.2% +5.0%   TOTAL 
 aim7.2000.jobs-per-min

Impressive!

   1c00bef768d4341afa7d  e3e37183ee805f33e88f  
       
-25.9%  1008.55   -47.3%   717.39  
 brickland1/micro/aim7/fork_test
 -1.4%   641.19-3.4%   628.45  
 brickland1/micro/hackbench/1600%-process-pipe
 -1.0%   122.84+1.1%   125.36  
 brickland1/micro/netperf/120s-200%-UDP_RR
 +0.0%   121.29+0.2%   121.57  
 lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
-22.1%   351.41   -26.3%   332.54  
 lkp-ib03/micro/aim7/fork_test
 -1.9%31.33-2.6%31.11  
 lkp-ib03/micro/aim7/shared
 -0.4%   630.36+0.4%   635.05  
 lkp-ib03/micro/hackbench/1600%-process-socket
 -0.0%   612.62+1.8%   623.80  
 lkp-ib03/micro/hackbench/1600%-threads-socket
-14.1%   340.30   -37.1%   249.26  
 lkp-sb03/micro/aim7/fork_test
 -0.1%41.31-0.3%41.22  
 lkp-sb03/micro/aim7/shared
 -0.0%   614.26+0.6%   617.81  
 lkp-sb03/micro/hackbench/1600%-process-socket
-10.4%  4515.47   -18.2%  4123.55  TOTAL time.elapsed_time

Here you scared me for a second with those negative percentages! :-)

   1c00bef768d4341afa7d  e3e37183ee805f33e88f  
       
+26.7%323386.33   -75.7% 61980.00  
 brickland1/micro/aim7/fork_test
-22.9% 67734.00   -64.1% 31531.33  
 brickland1/micro/aim7/shared
 +0.4%  3303.67-0.8%  3264.33  
 brickland1/micro/dbench/100%
 +0.7%   1871483.67-0.4%   1850846.00  
 brickland1/micro/netperf/120s-200%-TCP_MAERTS
 -1.0%109553.00+0.4%111038.67  
 brickland1/micro/pigz/100%
 -0.7% 13600.67+0.1% 13718.67  
 lkp-a04/micro/netperf/120s-200%-TCP_CRR
 -4.6%995898.00   -85.2%154621.40  
 lkp-ib03/micro/aim7/fork_test
-31.8% 32178.00   -50.3% 23442.67  
 lkp-ib03/micro/aim7/shared
 +1.1%   7466432.67-0.7%   7334831.67  
 lkp-ib03/micro/hackbench/1600%-threads-pipe
 +2.5%   1044936.33-1.3%   1006084.00  
 lkp-ib03/micro/hackbench/1600%-threads-socket
 -1.3%   5635979.00+0.2%   5721011.67  
 lkp-ib03/micro/netperf/120s-200%-TCP_RR
-24.3% 42853.33   -56.8% 24484.33  
 lkp-nex04/micro/aim7/shared
-23.3%754297.67   -83.2%165479.00  
 lkp-sb03/micro/aim7/fork_test
 -7.4% 21586.00   -24.1% 17698.33  
 lkp-sb03/micro/aim7/shared
 +1.1%   3838724.00+0.3%   3808206.67  
 lkp-sb03/micro/hackbench/1600%-process-pipe
 +0.8%   5143255.00-1.1%   5046716.67  
 lkp-sb03/micro/hackbench/1600%-threads-pipe
 +2.8%537048.67-0.8%518351.67  
 lkp-sb03/micro/hackbench/1600%-threads-socket
 +4.0% 50446.67-5.3% 45960.00  
 lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS
-42.0% 52693.00   -26.4% 66849.67  
 lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
 -0.6%  28005389.67-7.7%  26006116.73  TOTAL vmstat.system.cs

looks like a win all across, with a few below 1% regressions what 
might be statistical outliners - it's hard to tell without a stddev 
column ...

     1c00bef768d4341afa7d  e3e37183ee805f33e88f
     --------------------  --------------------
       -4.7%

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Yuanhan Liu
On Fri, Nov 01, 2013 at 09:01:36AM +0100, Ingo Molnar wrote:
 
 * Yuanhan Liu yuanhan@linux.intel.com wrote:
 
  Patch 1 turns locking the anon_vma's root into locking the anon_vma
  itself, making it a per-anon_vma lock, which should reduce contention.
  
  At the same time, the lock range becomes quite small, basically a single
  call to anon_vma_interval_tree_insert(). Patch 2 turns the rwsem into an
  rwlock_t. It's a patch from Ingo; I just made some changes so it applies
  on top of patch 1.
  
  Patch 3 is from Peter. It was a diff, I edited it to be a patch ;)
  
  Here are the detailed change stats with these patches applied. The test
  base is v3.12-rc7; 1c00bef768d4341afa7d is patch 1 and e3e37183ee805f33e88f
  is patch 2.
  
  NOTE: both commits are compared to base v3.12-rc7.
  
      1c00bef768d4341afa7d  e3e37183ee805f33e88f
      --------------------  --------------------
            +35.0%                +89.9%          brickland1/micro/aim7/fork_test
            +28.4%                +49.3%          lkp-ib03/micro/aim7/fork_test
             +2.0%                 +2.7%          lkp-ib03/micro/aim7/shared
             -0.4%                 +0.0%          lkp-sb03/micro/aim7/dbase
            +16.4%                +59.0%          lkp-sb03/micro/aim7/fork_test
             +0.1%                 +0.3%          lkp-sb03/micro/aim7/shared
             +2.2%                 +5.0%          TOTAL aim7.2000.jobs-per-min
 
 Impressive!
 
      1c00bef768d4341afa7d  e3e37183ee805f33e88f
      --------------------  --------------------
       -25.9%     1008.55    -47.3%      717.39   brickland1/micro/aim7/fork_test
        -1.4%      641.19     -3.4%      628.45   brickland1/micro/hackbench/1600%-process-pipe
        -1.0%      122.84     +1.1%      125.36   brickland1/micro/netperf/120s-200%-UDP_RR
        +0.0%      121.29     +0.2%      121.57   lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
       -22.1%      351.41    -26.3%      332.54   lkp-ib03/micro/aim7/fork_test
        -1.9%       31.33     -2.6%       31.11   lkp-ib03/micro/aim7/shared
        -0.4%      630.36     +0.4%      635.05   lkp-ib03/micro/hackbench/1600%-process-socket
        -0.0%      612.62     +1.8%      623.80   lkp-ib03/micro/hackbench/1600%-threads-socket
       -14.1%      340.30    -37.1%      249.26   lkp-sb03/micro/aim7/fork_test
        -0.1%       41.31     -0.3%       41.22   lkp-sb03/micro/aim7/shared
        -0.0%      614.26     +0.6%      617.81   lkp-sb03/micro/hackbench/1600%-process-socket
       -10.4%     4515.47    -18.2%     4123.55   TOTAL time.elapsed_time
 
 Here you scared me for a second with those negative percentages! :-)

Aha.. 

 
      1c00bef768d4341afa7d  e3e37183ee805f33e88f
      --------------------  --------------------
       +26.7%   323386.33    -75.7%    61980.00   brickland1/micro/aim7/fork_test
       -22.9%    67734.00    -64.1%    31531.33   brickland1/micro/aim7/shared
        +0.4%     3303.67     -0.8%     3264.33   brickland1/micro/dbench/100%
        +0.7%  1871483.67     -0.4%  1850846.00   brickland1/micro/netperf/120s-200%-TCP_MAERTS
        -1.0%   109553.00     +0.4%   111038.67   brickland1/micro/pigz/100%
        -0.7%    13600.67     +0.1%    13718.67   lkp-a04/micro/netperf/120s-200%-TCP_CRR
        -4.6%   995898.00    -85.2%   154621.40   lkp-ib03/micro/aim7/fork_test
       -31.8%    32178.00    -50.3%    23442.67   lkp-ib03/micro/aim7/shared
        +1.1%  7466432.67     -0.7%  7334831.67   lkp-ib03/micro/hackbench/1600%-threads-pipe
        +2.5%  1044936.33     -1.3%  1006084.00   lkp-ib03/micro/hackbench/1600%-threads-socket
        -1.3%  5635979.00     +0.2%  5721011.67   lkp-ib03/micro/netperf/120s-200%-TCP_RR
       -24.3%    42853.33    -56.8%    24484.33   lkp-nex04/micro/aim7/shared
       -23.3%   754297.67    -83.2%   165479.00   lkp-sb03/micro/aim7/fork_test
        -7.4%    21586.00    -24.1%    17698.33   lkp-sb03/micro/aim7/shared
        +1.1%  3838724.00     +0.3%  3808206.67   lkp-sb03/micro/hackbench/1600%-process-pipe
        +0.8%  5143255.00     -1.1%  5046716.67   lkp-sb03/micro/hackbench/1600%-threads-pipe
        +2.8%   537048.67     -0.8%   518351.67   lkp-sb03/micro/hackbench/1600%-threads-socket
        +4.0%    50446.67     -5.3%    45960.00   lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS
       -42.0%    52693.00    -26.4%    66849.67   lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
        -0.6% 28005389.67     -7.7% 26006116.73   TOTAL vmstat.system.cs
 
 Looks like a win all across, with a few below-1% regressions that
 might be statistical outliers 

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Ingo Molnar

* Yuanhan Liu yuanhan@linux.intel.com wrote:

  Btw., another _really_ interesting comparison would be against 
  the latest rwsem patches. Mind doing such a comparison?
 
 Sure. Where can I get it? Are they on some git tree?

I've Cc:-ed Tim Chen who might be able to point you to the latest 
version.

The last on-lkml submission was in this thread:

  Subject: [PATCH v8 0/9] rwsem performance optimizations

Thanks,

Ingo


Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Yuanhan Liu
On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
 
 * Yuanhan Liu yuanhan@linux.intel.com wrote:
 
   Btw., another _really_ interesting comparison would be against 
   the latest rwsem patches. Mind doing such a comparison?
  
  Sure. Where can I get it? Are they on some git tree?
 
 I've Cc:-ed Tim Chen who might be able to point you to the latest 
 version.
 
 The last on-lkml submission was in this thread:
 
   Subject: [PATCH v8 0/9] rwsem performance optimizations
 

Thanks.

I queued a bunch of tests about one hour ago, and already have some
results (if necessary, I can add more data tomorrow when those tests are
finished):


      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
                          -9.3%            brickland1/micro/aim7/shared
                          +4.3%            lkp-ib03/micro/aim7/fork_test
                          +2.2%            lkp-ib03/micro/aim7/shared
                          -2.6%            TOTAL aim7.2000.jobs-per-min

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
      204056.67   -23.5%      156082.33    brickland1/micro/aim7/shared
       79248.00  +144.3%      193617.25    lkp-ib03/micro/aim7/fork_test
      298355.33   -25.2%      223084.67    lkp-ib03/micro/aim7/shared
      581660.00    -1.5%      572784.25    TOTAL time.involuntary_context_switches

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
       22487.33    -4.7%       21429.33    brickland1/micro/aim7/dbase
       61412.67   -29.1%       43511.00    brickland1/micro/aim7/shared
      531142.00   -27.7%      383818.75    lkp-ib03/micro/aim7/fork_test
       20158.33   -50.9%        9899.67    lkp-ib03/micro/aim7/shared
      635200.33   -27.8%      458658.75    TOTAL vmstat.system.in

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
        6408.67    -4.5%        6117.33    brickland1/micro/aim7/dbase
       87856.00   -39.5%       53170.67    brickland1/micro/aim7/shared
     1043620.00   -28.0%      751214.75    lkp-ib03/micro/aim7/fork_test
       47152.33   -38.0%       29245.33    lkp-ib03/micro/aim7/shared
     1185037.00   -29.1%      839748.08    TOTAL vmstat.system.cs

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
       13295.00   -10.0%       11960.00    brickland1/micro/aim7/dbase
     1901175.00   -35.5%     1226787.33    brickland1/micro/aim7/shared
       13951.00    -6.5%       13051.00    lkp-ib03/micro/aim7/dbase
   239773251.17   -30.9%   165727820.75    lkp-ib03/micro/aim7/fork_test
     1014933.67   -31.1%      699259.67    lkp-ib03/micro/aim7/shared
   242716605.83   -30.9%   167678878.75    TOTAL time.voluntary_context_switches

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
           9.56    -1.0%           9.46    brickland1/micro/aim7/dbase
          11.01   -10.1%           9.90    brickland1/micro/aim7/shared
          36.23   +15.3%          41.77    lkp-ib03/micro/aim7/fork_test
          10.51   -11.9%           9.26    lkp-ib03/micro/aim7/shared
          67.31    +4.6%          70.39    TOTAL iostat.cpu.system

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
          36.39    -3.6%          35.09    brickland1/micro/aim7/dbase
          34.97    -8.1%          32.13    brickland1/micro/aim7/shared
          20.34    +6.7%          21.70    lkp-ib03/micro/aim7/shared
          91.70    -3.0%          88.92    TOTAL boottime.dhcp

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
          60.00    +6.7%          64.00    brickland1/micro/aim7/shared
          60.83    -9.2%          55.25    lkp-ib03/micro/aim7/fork_test
         120.83    -1.3%         119.25    TOTAL vmstat.cpu.id

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
         345.50    -1.1%         341.73    brickland1/micro/aim7/dbase
        3788.80   +11.5%        4223.15    lkp-ib03/micro/aim7/fork_test
         108.29    -7.1%         100.62    lkp-ib03/micro/aim7/shared
        4242.59   +10.0%        4665.50    TOTAL time.system_time

      v3.12-rc7      fe001e3de090e179f95d
     ----------      --------------------
        7481.33    -0.4%        7454.00

Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Davidlohr Bueso
On Fri, 2013-11-01 at 09:01 +0100, Ingo Molnar wrote:
 * Yuanhan Liu yuanhan@linux.intel.com wrote:
 
  Patch 1 turns locking the anon_vma's root into locking the anon_vma
  itself, making it a per-anon_vma lock, which should reduce contention.
  
  At the same time, the lock range becomes quite small, basically a single
  call to anon_vma_interval_tree_insert(). Patch 2 turns the rwsem into an
  rwlock_t. It's a patch from Ingo; I just made some changes so it applies
  on top of patch 1.

Andrea's last input on this kind of conversion is that it cannot be
done (at least not yet): https://lkml.org/lkml/2013/9/30/53

Thanks,
Davidlohr



Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Linus Torvalds
On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso davidl...@hp.com wrote:

 Andrea's last input on this kind of conversion is that it cannot be
 done (at least not yet): https://lkml.org/lkml/2013/9/30/53

No, none of the invalidate_page users really need to sleep. If doing
this makes some people not do stupid sh*t, then that's all good. So at
least _that_ worry was a false alarm. We definitely don't want to
support crap in the VM, and sleeping during teardown is crap.

Linus


Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Michel Lespinasse
On Fri, Nov 1, 2013 at 11:09 AM, Linus Torvalds
torva...@linux-foundation.org wrote:
 On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso davidl...@hp.com wrote:

 Andrea's last input on this kind of conversion is that it cannot be
 done (at least not yet): https://lkml.org/lkml/2013/9/30/53

 No, none of the invalidate_page users really need to sleep. If doing
 this makes some people not do stupid sh*t, then that's all good. So at
 least _that_ worry was a false alarm. We definitely don't want to
 support crap in the VM, and sleeping during teardown is crap.

Should copy Andrea on this. I talked with him during KS, and there are
no current in-tree users who are doing such sleeping; however there
are prospective users for networking (RDMA) or GPU stuff who want to
use this to let hardware directly copy data into user mappings. I'm
not too aware of the details, but my understanding is that we then
need to send the NIC and/or GPU some commands to tear down the
mapping, and that command is currently acknowledged with an interrupt,
which is where the sleepability requirement comes from. Andrea was
thinking about cooking up some scheme to dynamically change between
sleepable and non-sleepable locks at runtime depending on when such
drivers are used; this seems quite complicated to me but I haven't
heard of alternative plans for RDMA usage either.
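
To make the constraint concrete, here is a minimal sketch of the kind of
->invalidate_page() user being discussed, using the 3.12-era
mmu_notifier_ops callback signature; the driver-side helper
example_post_teardown_cmd() is purely hypothetical:

    #include <linux/mmu_notifier.h>

    /* hypothetical driver callback: with the anon_vma lock turned into a
     * spinning rwlock_t, this path must stay non-sleeping, so it can only
     * post the device teardown command and return; waiting for the
     * device's completion interrupt has to happen somewhere else. */
    static void example_invalidate_page(struct mmu_notifier *mn,
                                        struct mm_struct *mm,
                                        unsigned long address)
    {
            example_post_teardown_cmd(mn, address);  /* hypothetical helper */
    }

    static const struct mmu_notifier_ops example_mn_ops = {
            .invalidate_page = example_invalidate_page,
    };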

-- 
Michel Walken Lespinasse
A program is never fully debugged until the last user dies.


Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Linus Torvalds
On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse wal...@google.com wrote:

 Should copy Andrea on this. I talked with him during KS, and there are
 no current in-tree users who are doing such sleeping; however there
 are prospective users for networking (RDMA) or GPU stuff who want to
 use this to let hardware directly copy data into user mappings.

Tough.

I spoke up the first time this came up and I'll say the same thing
again: we're not screwing over the VM subsystem because some crazy
user might want to do crazy and stupid things that nobody sane cares
about.

The whole "somebody might want to .." argument is just irrelevant.
Some people want to sleep in interrupt handlers too, or while holding
random spinlocks. Too bad. They don't get to, because doing that
results in problems for the rest of the system.

Our job in the kernel is to do the best job technically that we can.
And sometimes that very much involves saying "No, you can't do that."

We have limitations in the kernel. The stack is of limited size. You
can't allocate arbitrarily sized memory. You must follow some very
strict rules.

If people can't handle that, then they can go cry to mommy, and go
back to writing user mode code. In the kernel, you have to live with
certain constraints that makes the kernel better.

   Linus


Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread KOSAKI Motohiro
(11/1/13 3:54 AM), Yuanhan Liu wrote:
 Patch 1 turns locking the anon_vma's root into locking the anon_vma
 itself, making it a per-anon_vma lock, which should reduce contention.
 
 At the same time, the lock range becomes quite small, basically a single
 call to anon_vma_interval_tree_insert(). Patch 2 turns the rwsem into an
 rwlock_t. It's a patch from Ingo; I just made some changes so it applies
 on top of patch 1.
 
 Patch 3 is from Peter. It was a diff, I edited it to be a patch ;)
 
 Here are the detailed change stats with these patches applied. The test
 base is v3.12-rc7; 1c00bef768d4341afa7d is patch 1 and e3e37183ee805f33e88f
 is patch 2.
 
 NOTE: both commits are compared to base v3.12-rc7.

I'd suggest CCing linux-mm when posting mm patches.



Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Davidlohr Bueso
On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
 On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
  
  * Yuanhan Liu yuanhan@linux.intel.com wrote:
  
Btw., another _really_ interesting comparison would be against 
the latest rwsem patches. Mind doing such a comparison?
   
   Sure. Where can I get it? Are they on some git tree?
  
  I've Cc:-ed Tim Chen who might be able to point you to the latest 
  version.
  
  The last on-lkml submission was in this thread:
  
Subject: [PATCH v8 0/9] rwsem performance optimizations
  
 
 Thanks.
 
 I queued a bunch of tests about one hour ago, and already have some
 results (if necessary, I can add more data tomorrow when those tests are
 finished):

What kind of system are you using to run these workloads on?

 
 
       v3.12-rc7      fe001e3de090e179f95d
      ----------      --------------------
                           -9.3%            brickland1/micro/aim7/shared
                           +4.3%            lkp-ib03/micro/aim7/fork_test
                           +2.2%            lkp-ib03/micro/aim7/shared
                           -2.6%            TOTAL aim7.2000.jobs-per-min
 

Sorry if I'm missing something, but could you elaborate more on what
these percentages represent? Are they anon vma rwsem + optimistic
spinning patches vs anon vma rwlock?

Also, I see you're running aim7; you might be interested in some of the
results I found when trying out Ingo's rwlock conversion patch on a
largish 80 core system: https://lkml.org/lkml/2013/9/29/280

Thanks,
Davidlohr



Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t

2013-11-01 Thread Davidlohr Bueso
On Fri, 2013-11-01 at 11:55 -0700, Linus Torvalds wrote:
 On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse wal...@google.com wrote:
 
  Should copy Andrea on this. I talked with him during KS, and there are
  no current in-tree users who are doing such sleeping; however there
  are prospective users for networking (RDMA) or GPU stuff who want to
  use this to let hardware directly copy data into user mappings.
 
 Tough.
 
 I spoke up the first time this came up and I'll say the same thing
 again: we're not screwing over the VM subsystem because some crazy
 user might want to do crazy and stupid things that nobody sane cares
 about.
 
 The whole "somebody might want to .." argument is just irrelevant.

Ok, I was under the impression that this was something already in the
kernel and hence too late to go back. Based on the results I'm
definitely in favor of the whole rwlock conversion.

Thanks,
Davidlohr
