Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Tue, Nov 05, 2013 at 11:10:43AM +0800, Yuanhan Liu wrote:
> On Mon, Nov 04, 2013 at 05:44:00PM -0800, Tim Chen wrote:
> > On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> > > On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > > > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > > > >
> > > > > > * Yuanhan Liu wrote:
> > > > > >
> > > > > > > > Btw., another _really_ interesting comparison would be
> > > > > > > > against the latest rwsem patches. Mind doing such a
> > > > > > > > comparison?
> > > > > > >
> > > > > > > Sure. Where can I get it? Are they on some git tree?
> > > > > >
> > > > > > I've Cc:-ed Tim Chen who might be able to point you to the
> > > > > > latest version.
> > > > > >
> > > > > > The last on-lkml submission was in this thread:
> > > > > >
> > > > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > > >
> > > > > Thanks.
> > > > >
> > > > > I queued a bunch of tests about one hour ago, and already got
> > > > > some results (if necessary, I can add more data tomorrow when
> > > > > those tests are finished):
> > > >
> > > > What kind of system are you using to run these workloads on?
> > >
> > > I queued jobs on 5 testboxes:
> > >   - brickland1: 120 core Ivybridge server
> > >   - lkp-ib03:   48 core Ivybridge server
> > >   - lkp-sb03:   32 core Sandybridge server
> > >   - lkp-nex04:  64 core NHM server
> > >   - lkp-a04:    Atom server
> > >
> > > > >          v3.12-rc7  fe001e3de090e179f95d
> > > > >              -9.3%  brickland1/micro/aim7/shared
> > > > >              +4.3%  lkp-ib03/micro/aim7/fork_test
> > > > >              +2.2%  lkp-ib03/micro/aim7/shared
> > > > >              -2.6%  TOTAL aim7.2000.jobs-per-min
> > > >
> > > > Sorry if I'm missing something, but could you elaborate more on
> > > > what these percentages represent?
> > > > >          v3.12-rc7  fe001e3de090e179f95d
> > > > >              -9.3%  brickland1/micro/aim7/shared
> > > > >              -2.6%  TOTAL aim7.2000.jobs-per-min
> > >
> > > The comparison base is v3.12-rc7, and we got a 9.3% performance
> > > regression at commit fe001e3de090e179f95d, which is the head of
> > > the rwsem performance optimizations patch set.
> >
> > Yuanhan, thanks for the data. This I assume is with the entire rwsem
> > v8 patchset.
>
> Yes, it is; 9 patches in total.
>
> > Any idea of the run variation on the workload?
>
> Your concern is right. The variation is quite big on the
> brickland1/micro/aim7/shared testcase.
>
> * - v3.12-rc7
> O - fe001e3de090e179f95d
>
> [ASCII plot: brickland1/micro/aim7/shared, aim7.2000.jobs-per-min,
>  y-axis roughly 25 to 32; the * samples (v3.12-rc7) scatter between
>  ~26 and ~31, the O samples (fe001e3de090e179f95d) cluster near 26]

Tim,

Please ignore this "regression"; it disappears when I run that testcase
6 times both for v3.12-rc7 and fe001e3de090e179f95d. I guess 2000 users
is a bit small for a 120 core IVB server. I may try to increase the
user count and test again to see how it behaves with your patches
applied.

Sorry for the inconvenience.

	--yliu
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Mon, Nov 04, 2013 at 05:44:00PM -0800, Tim Chen wrote:
> On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> > On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > > >
> > > > > * Yuanhan Liu wrote:
> > > > >
> > > > > > > Btw., another _really_ interesting comparison would be
> > > > > > > against the latest rwsem patches. Mind doing such a
> > > > > > > comparison?
> > > > > >
> > > > > > Sure. Where can I get it? Are they on some git tree?
> > > > >
> > > > > I've Cc:-ed Tim Chen who might be able to point you to the
> > > > > latest version.
> > > > >
> > > > > The last on-lkml submission was in this thread:
> > > > >
> > > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > >
> > > > Thanks.
> > > >
> > > > I queued a bunch of tests about one hour ago, and already got
> > > > some results (if necessary, I can add more data tomorrow when
> > > > those tests are finished):
> > >
> > > What kind of system are you using to run these workloads on?
> >
> > I queued jobs on 5 testboxes:
> >   - brickland1: 120 core Ivybridge server
> >   - lkp-ib03:   48 core Ivybridge server
> >   - lkp-sb03:   32 core Sandybridge server
> >   - lkp-nex04:  64 core NHM server
> >   - lkp-a04:    Atom server
> >
> > > >          v3.12-rc7  fe001e3de090e179f95d
> > > >              -9.3%  brickland1/micro/aim7/shared
> > > >              +4.3%  lkp-ib03/micro/aim7/fork_test
> > > >              +2.2%  lkp-ib03/micro/aim7/shared
> > > >              -2.6%  TOTAL aim7.2000.jobs-per-min
> > >
> > > Sorry if I'm missing something, but could you elaborate more on
> > > what these percentages represent?
> > > >          v3.12-rc7  fe001e3de090e179f95d
> > > >              -9.3%  brickland1/micro/aim7/shared
> > > >              -2.6%  TOTAL aim7.2000.jobs-per-min
> >
> > The comparison base is v3.12-rc7, and we got a 9.3% performance
> > regression at commit fe001e3de090e179f95d, which is the head of the
> > rwsem performance optimizations patch set.
>
> Yuanhan, thanks for the data. This I assume is with the entire rwsem
> v8 patchset.

Yes, it is; 9 patches in total.

> Any idea of the run variation on the workload?

Your concern is right. The variation is quite big on the
brickland1/micro/aim7/shared testcase.

* - v3.12-rc7
O - fe001e3de090e179f95d

[ASCII plot: brickland1/micro/aim7/shared, aim7.2000.jobs-per-min,
 y-axis roughly 25 to 32; the * samples (v3.12-rc7) scatter between
 ~26 and ~31, the O samples (fe001e3de090e179f95d) cluster near 26]

	--yliu

> > "brickland1/micro/aim7/shared" tells the testbox (brickland1) and
> > testcase: shared workfile of aim7.
> >
> > The last line tells what field we are comparing, and it's
> > "aim7.2000.jobs-per-min" in this case. 2000 means 2000 users in
> > aim7.
> >
> > > Are they anon vma rwsem + optimistic spinning patches vs anon vma
> > > rwlock?
> >
> > I tested "[PATCH v8 0/9] rwsem performance optimizations" only.
> >
> > > Also, I see you're running aim7, you might be interested in some
> > > of the results I found when trying out Ingo's rwlock conversion
> > > patch on a largish 80 core system:
> > > https://lkml.org/lkml/2013/9/29/280
> >
> > Besides aim7, I also tested dbench, hackbench, netperf, pigz. And as
> > you can imagine and see from the data, aim7 benefits most from the
> > anon_vma optimization stuff due to high contention of the anon_vma
> > lock.
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Mon, 2013-11-04 at 17:44 -0800, Tim Chen wrote:
> On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> > On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > > >
> > > > > * Yuanhan Liu wrote:
> > > > >
> > > > > > > Btw., another _really_ interesting comparison would be
> > > > > > > against the latest rwsem patches. Mind doing such a
> > > > > > > comparison?
> > > > > >
> > > > > > Sure. Where can I get it? Are they on some git tree?
> > > > >
> > > > > I've Cc:-ed Tim Chen who might be able to point you to the
> > > > > latest version.
> > > > >
> > > > > The last on-lkml submission was in this thread:
> > > > >
> > > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > > >
> > > > Thanks.
> > > >
> > > > I queued a bunch of tests about one hour ago, and already got
> > > > some results (if necessary, I can add more data tomorrow when
> > > > those tests are finished):
> > >
> > > What kind of system are you using to run these workloads on?
> >
> > I queued jobs on 5 testboxes:
> >   - brickland1: 120 core Ivybridge server
> >   - lkp-ib03:   48 core Ivybridge server
> >   - lkp-sb03:   32 core Sandybridge server
> >   - lkp-nex04:  64 core NHM server
> >   - lkp-a04:    Atom server
> >
> > > >          v3.12-rc7  fe001e3de090e179f95d
> > > >              -9.3%  brickland1/micro/aim7/shared
> > > >              +4.3%  lkp-ib03/micro/aim7/fork_test
> > > >              +2.2%  lkp-ib03/micro/aim7/shared
> > > >              -2.6%  TOTAL aim7.2000.jobs-per-min
> > >
> > > Sorry if I'm missing something, but could you elaborate more on
> > > what these percentages represent?
> > > >          v3.12-rc7  fe001e3de090e179f95d
> > > >              -9.3%  brickland1/micro/aim7/shared
> > > >              -2.6%  TOTAL aim7.2000.jobs-per-min
> >
> > The comparison base is v3.12-rc7, and we got a 9.3% performance
> > regression at commit fe001e3de090e179f95d, which is the head of the
> > rwsem performance optimizations patch set.
>
> Yuanhan, thanks for the data. This I assume is with the entire rwsem
> v8 patchset. Any idea of the run variation on the workload?

Yuanhan,

I haven't got a chance to make multiple runs to check the standard
deviation. From the few runs I did, I got a 5.1% increase in
performance for the aim7 shared workload for the complete rwsem
patchset on a similar machine to the one you are using. The patches are
applied to 3.12-rc7 and compared to the vanilla kernel.

Thanks.

Tim
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Mon, 2013-11-04 at 11:59 +0800, Yuanhan Liu wrote:
> On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> > On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > > >
> > > > * Yuanhan Liu wrote:
> > > >
> > > > > > Btw., another _really_ interesting comparison would be
> > > > > > against the latest rwsem patches. Mind doing such a
> > > > > > comparison?
> > > > >
> > > > > Sure. Where can I get it? Are they on some git tree?
> > > >
> > > > I've Cc:-ed Tim Chen who might be able to point you to the
> > > > latest version.
> > > >
> > > > The last on-lkml submission was in this thread:
> > > >
> > > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> > >
> > > Thanks.
> > >
> > > I queued a bunch of tests about one hour ago, and already got some
> > > results (if necessary, I can add more data tomorrow when those
> > > tests are finished):
> >
> > What kind of system are you using to run these workloads on?
>
> I queued jobs on 5 testboxes:
>   - brickland1: 120 core Ivybridge server
>   - lkp-ib03:   48 core Ivybridge server
>   - lkp-sb03:   32 core Sandybridge server
>   - lkp-nex04:  64 core NHM server
>   - lkp-a04:    Atom server
>
> > >          v3.12-rc7  fe001e3de090e179f95d
> > >              -9.3%  brickland1/micro/aim7/shared
> > >              +4.3%  lkp-ib03/micro/aim7/fork_test
> > >              +2.2%  lkp-ib03/micro/aim7/shared
> > >              -2.6%  TOTAL aim7.2000.jobs-per-min
> >
> > Sorry if I'm missing something, but could you elaborate more on what
> > these percentages represent?
>
> The comparison base is v3.12-rc7, and we got a 9.3% performance
> regression at commit fe001e3de090e179f95d, which is the head of the
> rwsem performance optimizations patch set.

Yuanhan, thanks for the data. This I assume is with the entire rwsem
v8 patchset.
Any idea of the run variation on the workload?

Tim

> "brickland1/micro/aim7/shared" tells the testbox (brickland1) and
> testcase: shared workfile of aim7.
>
> The last line tells what field we are comparing, and it's
> "aim7.2000.jobs-per-min" in this case. 2000 means 2000 users in aim7.
>
> > Are they anon vma rwsem + optimistic spinning patches vs anon vma
> > rwlock?
>
> I tested "[PATCH v8 0/9] rwsem performance optimizations" only.
>
> > Also, I see you're running aim7, you might be interested in some of
> > the results I found when trying out Ingo's rwlock conversion patch
> > on a largish 80 core system: https://lkml.org/lkml/2013/9/29/280
>
> Besides aim7, I also tested dbench, hackbench, netperf, pigz. And as
> you can imagine and see from the data, aim7 benefits most from the
> anon_vma optimization stuff due to high contention of the anon_vma
> lock.
>
> Thanks.
>
> 	--yliu
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 01, 2013 at 08:15:13PM -0700, Davidlohr Bueso wrote:
> On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> > On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> > >
> > > * Yuanhan Liu wrote:
> > >
> > > > > Btw., another _really_ interesting comparison would be against
> > > > > the latest rwsem patches. Mind doing such a comparison?
> > > >
> > > > Sure. Where can I get it? Are they on some git tree?
> > >
> > > I've Cc:-ed Tim Chen who might be able to point you to the latest
> > > version.
> > >
> > > The last on-lkml submission was in this thread:
> > >
> > >   Subject: [PATCH v8 0/9] rwsem performance optimizations
> >
> > Thanks.
> >
> > I queued a bunch of tests about one hour ago, and already got some
> > results (if necessary, I can add more data tomorrow when those tests
> > are finished):
>
> What kind of system are you using to run these workloads on?

I queued jobs on 5 testboxes:
  - brickland1: 120 core Ivybridge server
  - lkp-ib03:   48 core Ivybridge server
  - lkp-sb03:   32 core Sandybridge server
  - lkp-nex04:  64 core NHM server
  - lkp-a04:    Atom server

> >          v3.12-rc7  fe001e3de090e179f95d
> >              -9.3%  brickland1/micro/aim7/shared
> >              +4.3%  lkp-ib03/micro/aim7/fork_test
> >              +2.2%  lkp-ib03/micro/aim7/shared
> >              -2.6%  TOTAL aim7.2000.jobs-per-min
>
> Sorry if I'm missing something, but could you elaborate more on what
> these percentages represent?

The comparison base is v3.12-rc7, and we got a 9.3% performance
regression at commit fe001e3de090e179f95d, which is the head of the
rwsem performance optimizations patch set.

"brickland1/micro/aim7/shared" tells the testbox (brickland1) and
testcase: shared workfile of aim7.

The last line tells what field we are comparing, and it's
"aim7.2000.jobs-per-min" in this case. 2000 means 2000 users in aim7.
> Are they anon vma rwsem + optimistic spinning patches vs anon vma
> rwlock?

I tested "[PATCH v8 0/9] rwsem performance optimizations" only.

> Also, I see you're running aim7, you might be interested in some of
> the results I found when trying out Ingo's rwlock conversion patch on
> a largish 80 core system: https://lkml.org/lkml/2013/9/29/280

Besides aim7, I also tested dbench, hackbench, netperf, pigz. And as
you can imagine and see from the data, aim7 benefits most from the
anon_vma optimization stuff due to high contention of the anon_vma
lock.

Thanks.

	--yliu
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, 2013-11-01 at 11:55 -0700, Linus Torvalds wrote:
> On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse wrote:
> >
> > Should copy Andrea on this. I talked with him during KS, and there
> > are no current in-tree users who are doing such sleeping; however
> > there are prospective users for networking (RDMA) or GPU stuff who
> > want to use this to let hardware directly copy data into user
> > mappings.
>
> Tough.
>
> I spoke up the first time this came up and I'll say the same thing
> again: we're not screwing over the VM subsystem because some crazy
> user might want to do crazy and stupid things that nobody sane cares
> about.
>
> The whole "somebody might want to .." argument is just irrelevant.

Ok, I was under the impression that this was something already in the
kernel and hence "too late to go back". Based on the results I'm
definitely in favor of the whole rwlock conversion.

Thanks,
Davidlohr
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
> >
> > * Yuanhan Liu wrote:
> >
> > > > Btw., another _really_ interesting comparison would be against
> > > > the latest rwsem patches. Mind doing such a comparison?
> > >
> > > Sure. Where can I get it? Are they on some git tree?
> >
> > I've Cc:-ed Tim Chen who might be able to point you to the latest
> > version.
> >
> > The last on-lkml submission was in this thread:
> >
> >   Subject: [PATCH v8 0/9] rwsem performance optimizations
>
> Thanks.
>
> I queued a bunch of tests about one hour ago, and already got some
> results (if necessary, I can add more data tomorrow when those tests
> are finished):

What kind of system are you using to run these workloads on?

>          v3.12-rc7  fe001e3de090e179f95d
>              -9.3%  brickland1/micro/aim7/shared
>              +4.3%  lkp-ib03/micro/aim7/fork_test
>              +2.2%  lkp-ib03/micro/aim7/shared
>              -2.6%  TOTAL aim7.2000.jobs-per-min

Sorry if I'm missing something, but could you elaborate more on what
these percentages represent? Are they anon vma rwsem + optimistic
spinning patches vs anon vma rwlock?

Also, I see you're running aim7, you might be interested in some of the
results I found when trying out Ingo's rwlock conversion patch on a
largish 80 core system: https://lkml.org/lkml/2013/9/29/280

Thanks,
Davidlohr
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
(11/1/13 3:54 AM), Yuanhan Liu wrote:
> Patch 1 turns locking the anon_vma's root into locking the anon_vma itself,
> making it a per anon_vma lock, which would reduce contention.
>
> At the same time, the lock range then becomes quite small: basically a call
> of anon_vma_interval_tree_insert(). Patch 2 turns the rwsem into an rwlock_t.
> It's a patch made from Ingo, I just made some changes to let it apply on top
> of patch 1.
>
> Patch 3 is from Peter. It was a diff, I edited it to be a patch ;)
>
> Here are the detailed changed stats with this patch applied. The test base is
> v3.12-rc7, and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f is patch 2.
>
> NOTE: both commits are compared to base v3.12-rc7.

I'd suggest you CC linux-mm when posting mm patches.
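The locking change patch 1 describes can be illustrated with a userspace analogy (a hypothetical Python sketch, not the kernel code): with root-only locking, every anon_vma hanging off the same root serializes on one lock, while per-anon_vma locking only serializes updates to the same object:

```python
import threading

# Hypothetical sketch of the granularity change in patch 1.
# Before: each anon_vma takes its *root's* lock, so siblings contend.
# After: each anon_vma carries its own lock, so unrelated anon_vmas
# can be updated concurrently.

class AnonVma:
    def __init__(self, root=None):
        self.root = root if root is not None else self
        self.lock = threading.Lock()   # per-object lock (the patch 1 idea)
        self.chain = []                # stand-in for the interval tree

def insert_root_locked(av, vma):
    # Old scheme: serialize every insert in the tree on the root's lock.
    with av.root.lock:
        av.chain.append(vma)

def insert_self_locked(av, vma):
    # New scheme: hold only this anon_vma's lock around the insert,
    # analogous to wrapping just anon_vma_interval_tree_insert().
    with av.lock:
        av.chain.append(vma)

root = AnonVma()
a, b = AnonVma(root), AnonVma(root)
insert_self_locked(a, "vma1")   # these two inserts no longer share a lock
insert_self_locked(b, "vma2")
print(a.chain, b.chain)
```

The sketch only shows the lock-scope difference; the real patch additionally changes the lock type (rwsem to rwlock_t), which is a separate trade-off between sleeping and spinning waiters.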
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse wrote:
>
> Should copy Andrea on this. I talked with him during KS, and there are
> no current in-tree users who are doing such sleeping; however there
> are prospective users for networking (RDMA) or GPU stuff who want to
> use this to let hardware directly copy data into user mappings.

Tough.

I spoke up the first time this came up and I'll say the same thing again: we're not screwing over the VM subsystem because some crazy user might want to do crazy and stupid things that nobody sane cares about.

The whole "somebody might want to .." argument is just irrelevant. Some people want to sleep in interrupt handlers too, or while holding random spinlocks. Too bad. They don't get to, because doing that results in problems for the rest of the system.

Our job in the kernel is to do the best job technically that we can. And sometimes that very much involves saying "No, you can't do that". We have limitations in the kernel. The stack is of limited size. You can't allocate arbitrarily sized memory. You must follow some very strict rules.

If people can't handle that, then they can go cry to mommy, and go back to writing user mode code. In the kernel, you have to live with certain constraints that make the kernel better.

Linus
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 1, 2013 at 11:09 AM, Linus Torvalds wrote:
> On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso wrote:
>>
>> Andrea's last input from this kind of conversion is that it cannot be
>> done (at least yet): https://lkml.org/lkml/2013/9/30/53
>
> No, none of the invalidate_page users really need to sleep. If doing
> this makes some people not do stupid sh*t, then that's all good. So at
> least _that_ worry was a false alarm. We definitely don't want to
> support crap in the VM, and sleeping during teardown is crap.

Should copy Andrea on this. I talked with him during KS, and there are no current in-tree users who are doing such sleeping; however there are prospective users for networking (RDMA) or GPU stuff who want to use this to let hardware directly copy data into user mappings.

I'm not too aware of the details, but my understanding is that we then need to send the NIC and/or GPU some commands to tear down the mapping, and that command is currently acknowledged with an interrupt, which is where the sleepability requirement comes from.

Andrea was thinking about cooking up some scheme to dynamically change between sleepable and non-sleepable locks at runtime depending on when such drivers are used; this seems quite complicated to me, but I haven't heard of alternative plans for RDMA usage either.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso wrote:
>
> Andrea's last input from this kind of conversion is that it cannot be
> done (at least yet): https://lkml.org/lkml/2013/9/30/53

No, none of the invalidate_page users really need to sleep. If doing this makes some people not do stupid sh*t, then that's all good. So at least _that_ worry was a false alarm. We definitely don't want to support crap in the VM, and sleeping during teardown is crap.

Linus
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, 2013-11-01 at 09:01 +0100, Ingo Molnar wrote:
> * Yuanhan Liu wrote:
>
> > Patch 1 turns locking the anon_vma's root to locking itself to let it be
> > a per anon_vma lock, which would reduce contentions.
> >
> > In the same time, lock range becomes quite small then, which is basically
> > a call of anon_vma_interval_tree_insert(). Patch 2 turns rwsem to rwlock_t.
> > It's a patch made from Ingo, I just made some change to let it apply based
> > on patch 1.

Andrea's last input from this kind of conversion is that it cannot be done (at least yet): https://lkml.org/lkml/2013/9/30/53

Thanks,
Davidlohr
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
>
> * Yuanhan Liu wrote:
>
> > > Btw., another _really_ interesting comparison would be against
> > > the latest rwsem patches. Mind doing such a comparison?
> >
> > Sure. Where can I get it? Are they on some git tree?
>
> I've Cc:-ed Tim Chen who might be able to point you to the latest
> version.
>
> The last on-lkml submission was in this thread:
>
>   Subject: [PATCH v8 0/9] rwsem performance optimizations

Thanks.

I queued a bunch of tests about one hour ago, and already got some results (if necessary, I can add more data tomorrow when those tests are finished):

       v3.12-rc7  fe001e3de090e179f95d
           -9.3%  brickland1/micro/aim7/shared
           +4.3%  lkp-ib03/micro/aim7/fork_test
           +2.2%  lkp-ib03/micro/aim7/shared
           -2.6%  TOTAL aim7.2000.jobs-per-min

       v3.12-rc7           fe001e3de090e179f95d
       204056.67   -23.5%     156082.33  brickland1/micro/aim7/shared
        79248.00  +144.3%     193617.25  lkp-ib03/micro/aim7/fork_test
       298355.33   -25.2%     223084.67  lkp-ib03/micro/aim7/shared
       581660.00    -1.5%     572784.25  TOTAL time.involuntary_context_switches

       v3.12-rc7           fe001e3de090e179f95d
        22487.33    -4.7%      21429.33  brickland1/micro/aim7/dbase
        61412.67   -29.1%      43511.00  brickland1/micro/aim7/shared
       531142.00   -27.7%     383818.75  lkp-ib03/micro/aim7/fork_test
        20158.33   -50.9%       9899.67  lkp-ib03/micro/aim7/shared
       635200.33   -27.8%     458658.75  TOTAL vmstat.system.in

       v3.12-rc7           fe001e3de090e179f95d
         6408.67    -4.5%       6117.33  brickland1/micro/aim7/dbase
        87856.00   -39.5%      53170.67  brickland1/micro/aim7/shared
      1043620.00   -28.0%     751214.75  lkp-ib03/micro/aim7/fork_test
        47152.33   -38.0%      29245.33  lkp-ib03/micro/aim7/shared
      1185037.00   -29.1%     839748.08  TOTAL vmstat.system.cs

       v3.12-rc7           fe001e3de090e179f95d
        13295.00   -10.0%      11960.00  brickland1/micro/aim7/dbase
      1901175.00   -35.5%    1226787.33  brickland1/micro/aim7/shared
        13951.00    -6.5%      13051.00  lkp-ib03/micro/aim7/dbase
    239773251.17   -30.9%  165727820.75  lkp-ib03/micro/aim7/fork_test
      1014933.67   -31.1%     699259.67  lkp-ib03/micro/aim7/shared
    242716605.83   -30.9%  167678878.75  TOTAL time.voluntary_context_switches

       v3.12-rc7           fe001e3de090e179f95d
            9.56    -1.0%          9.46  brickland1/micro/aim7/dbase
           11.01   -10.1%          9.90  brickland1/micro/aim7/shared
           36.23   +15.3%         41.77  lkp-ib03/micro/aim7/fork_test
           10.51   -11.9%          9.26  lkp-ib03/micro/aim7/shared
           67.31    +4.6%         70.39  TOTAL iostat.cpu.system

       v3.12-rc7           fe001e3de090e179f95d
           36.39    -3.6%         35.09  brickland1/micro/aim7/dbase
           34.97    -8.1%         32.13  brickland1/micro/aim7/shared
           20.34    +6.7%         21.70  lkp-ib03/micro/aim7/shared
           91.70    -3.0%         88.92  TOTAL boottime.dhcp

       v3.12-rc7           fe001e3de090e179f95d
           60.00    +6.7%         64.00  brickland1/micro/aim7/shared
           60.83    -9.2%         55.25  lkp-ib03/micro/aim7/fork_test
          120.83    -1.3%        119.25  TOTAL vmstat.cpu.id

       v3.12-rc7           fe001e3de090e179f95d
          345.50    -1.1%        341.73  brickland1/micro/aim7/dbase
         3788.80   +11.5%       4223.15  lkp-ib03/micro/aim7/fork_test
          108.29    -7.1%        100.62  lkp-ib03/micro/aim7/shared
         4242.59   +10.0%       4665.50  TOTAL time.system_time

       v3.12-rc7           fe001e3de090e179f95d
         7481.33    -0.4%       7454.00
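The percentage columns in these tables are plain relative deltas of the patched kernel (fe001e3de090e179f95d) against the v3.12-rc7 base; a small sketch (the helper name `pct_change` is my own) reproducing three of the context-switch rows quoted above:

```python
def pct_change(base, patched):
    """Relative delta of the patched result vs. the base, in percent."""
    return (patched - base) / base * 100.0

# Rows quoted above: (base v3.12-rc7, patched fe001e3de090e179f95d)
rows = {
    "brickland1/micro/aim7/shared":  (204056.67, 156082.33),
    "lkp-ib03/micro/aim7/fork_test": (79248.00, 193617.25),
    "lkp-ib03/micro/aim7/shared":    (298355.33, 223084.67),
}
for name, (base, patched) in rows.items():
    print(f"{pct_change(base, patched):+7.1f}%  {name}")
```

Running this reproduces the -23.5%, +144.3% and -25.2% figures from the time.involuntary_context_switches table, so the sign convention is simply "negative means lower than the base".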
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
* Yuanhan Liu wrote:

> > Btw., another _really_ interesting comparison would be against
> > the latest rwsem patches. Mind doing such a comparison?
>
> Sure. Where can I get it? Are they on some git tree?

I've Cc:-ed Tim Chen who might be able to point you to the latest version.

The last on-lkml submission was in this thread:

  Subject: [PATCH v8 0/9] rwsem performance optimizations

Thanks,

	Ingo
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 01, 2013 at 09:01:36AM +0100, Ingo Molnar wrote:
>
> * Yuanhan Liu wrote:
>
> > Patch 1 turns locking the anon_vma's root to locking itself to let it be
> > a per anon_vma lock, which would reduce contentions.
> >
> > In the same time, lock range becomes quite small then, which is basically
> > a call of anon_vma_interval_tree_insert(). Patch 2 turns rwsem to rwlock_t.
> > It's a patch made from Ingo, I just made some change to let it apply based
> > on patch 1.
> >
> > Patch 3 is from Peter. It was a diff, I edited it to be a patch ;)
> >
> > Here is the detailed changed stats with this patch applied. The test base
> > is v3.12-rc7, and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f
> > is patch 2.
> >
> > NOTE: both commits are compared to base v3.12-rc7.
> >
> >     1c00bef768d4341afa7d  e3e37183ee805f33e88f
> >       +35.0%    +89.9%    brickland1/micro/aim7/fork_test
> >       +28.4%    +49.3%    lkp-ib03/micro/aim7/fork_test
> >        +2.0%     +2.7%    lkp-ib03/micro/aim7/shared
> >        -0.4%     +0.0%    lkp-sb03/micro/aim7/dbase
> >       +16.4%    +59.0%    lkp-sb03/micro/aim7/fork_test
> >        +0.1%     +0.3%    lkp-sb03/micro/aim7/shared
> >        +2.2%     +5.0%    TOTAL aim7.2000.jobs-per-min
>
> Impressive!
>
> >     1c00bef768d4341afa7d  e3e37183ee805f33e88f
> >      -25.9%  1008.55    -47.3%   717.39   brickland1/micro/aim7/fork_test
> >       -1.4%   641.19     -3.4%   628.45   brickland1/micro/hackbench/1600%-process-pipe
> >       -1.0%   122.84     +1.1%   125.36   brickland1/micro/netperf/120s-200%-UDP_RR
> >       +0.0%   121.29     +0.2%   121.57   lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
> >      -22.1%   351.41    -26.3%   332.54   lkp-ib03/micro/aim7/fork_test
> >       -1.9%    31.33     -2.6%    31.11   lkp-ib03/micro/aim7/shared
> >       -0.4%   630.36     +0.4%   635.05   lkp-ib03/micro/hackbench/1600%-process-socket
> >       -0.0%   612.62     +1.8%   623.80   lkp-ib03/micro/hackbench/1600%-threads-socket
> >      -14.1%   340.30    -37.1%   249.26   lkp-sb03/micro/aim7/fork_test
> >       -0.1%    41.31     -0.3%    41.22   lkp-sb03/micro/aim7/shared
> >       -0.0%   614.26     +0.6%   617.81   lkp-sb03/micro/hackbench/1600%-process-socket
> >      -10.4%  4515.47    -18.2%  4123.55   TOTAL time.elapsed_time
>
> Here you scared me for a second with those negative percentages! :-)

Aha..

> >     1c00bef768d4341afa7d  e3e37183ee805f33e88f
> >      +26.7%    323386.33    -75.7%     61980.00   brickland1/micro/aim7/fork_test
> >      -22.9%     67734.00    -64.1%     31531.33   brickland1/micro/aim7/shared
> >       +0.4%      3303.67     -0.8%      3264.33   brickland1/micro/dbench/100%
> >       +0.7%   1871483.67     -0.4%   1850846.00   brickland1/micro/netperf/120s-200%-TCP_MAERTS
> >       -1.0%    109553.00     +0.4%    111038.67   brickland1/micro/pigz/100%
> >       -0.7%     13600.67     +0.1%     13718.67   lkp-a04/micro/netperf/120s-200%-TCP_CRR
> >       -4.6%    995898.00    -85.2%    154621.40   lkp-ib03/micro/aim7/fork_test
> >      -31.8%     32178.00    -50.3%     23442.67   lkp-ib03/micro/aim7/shared
> >       +1.1%   7466432.67     -0.7%   7334831.67   lkp-ib03/micro/hackbench/1600%-threads-pipe
> >       +2.5%   1044936.33     -1.3%   1006084.00   lkp-ib03/micro/hackbench/1600%-threads-socket
> >       -1.3%   5635979.00     +0.2%   5721011.67   lkp-ib03/micro/netperf/120s-200%-TCP_RR
> >      -24.3%     42853.33    -56.8%     24484.33   lkp-nex04/micro/aim7/shared
> >      -23.3%    754297.67    -83.2%    165479.00   lkp-sb03/micro/aim7/fork_test
> >       -7.4%     21586.00    -24.1%     17698.33   lkp-sb03/micro/aim7/shared
> >       +1.1%   3838724.00     +0.3%   3808206.67   lkp-sb03/micro/hackbench/1600%-process-pipe
> >       +0.8%   5143255.00     -1.1%   5046716.67   lkp-sb03/micro/hackbench/1600%-threads-pipe
> >       +2.8%    537048.67     -0.8%    518351.67   lkp-sb03/micro/hackbench/1600%-threads-socket
> >       +4.0%     50446.67     -5.3%     45960.00   lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS
> >      -42.0%     52693.00    -26.4%     66849.67   lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
>
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
* Yuanhan Liu wrote:

> Patch 1 turns locking the anon_vma's root to locking itself to let it be
> a per anon_vma lock, which would reduce contentions.
>
> In the same time, lock range becomes quite small then, which is basically
> a call of anon_vma_interval_tree_insert(). Patch 2 turns rwsem to rwlock_t.
> It's a patch made from Ingo, I just made some change to let it apply based
> on patch 1.
>
> Patch 3 is from Peter. It was a diff, I edited it to be a patch ;)
>
> Here is the detailed changed stats with this patch applied. The test base
> is v3.12-rc7, and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f
> is patch 2.
>
> NOTE: both commits are compared to base v3.12-rc7.
>
>     1c00bef768d4341afa7d  e3e37183ee805f33e88f
>       +35.0%    +89.9%    brickland1/micro/aim7/fork_test
>       +28.4%    +49.3%    lkp-ib03/micro/aim7/fork_test
>        +2.0%     +2.7%    lkp-ib03/micro/aim7/shared
>        -0.4%     +0.0%    lkp-sb03/micro/aim7/dbase
>       +16.4%    +59.0%    lkp-sb03/micro/aim7/fork_test
>        +0.1%     +0.3%    lkp-sb03/micro/aim7/shared
>        +2.2%     +5.0%    TOTAL aim7.2000.jobs-per-min

Impressive!

>     1c00bef768d4341afa7d  e3e37183ee805f33e88f
>      -25.9%  1008.55    -47.3%   717.39   brickland1/micro/aim7/fork_test
>       -1.4%   641.19     -3.4%   628.45   brickland1/micro/hackbench/1600%-process-pipe
>       -1.0%   122.84     +1.1%   125.36   brickland1/micro/netperf/120s-200%-UDP_RR
>       +0.0%   121.29     +0.2%   121.57   lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
>      -22.1%   351.41    -26.3%   332.54   lkp-ib03/micro/aim7/fork_test
>       -1.9%    31.33     -2.6%    31.11   lkp-ib03/micro/aim7/shared
>       -0.4%   630.36     +0.4%   635.05   lkp-ib03/micro/hackbench/1600%-process-socket
>       -0.0%   612.62     +1.8%   623.80   lkp-ib03/micro/hackbench/1600%-threads-socket
>      -14.1%   340.30    -37.1%   249.26   lkp-sb03/micro/aim7/fork_test
>       -0.1%    41.31     -0.3%    41.22   lkp-sb03/micro/aim7/shared
>       -0.0%   614.26     +0.6%   617.81   lkp-sb03/micro/hackbench/1600%-process-socket
>      -10.4%  4515.47    -18.2%  4123.55   TOTAL time.elapsed_time

Here you scared me for a second with those negative percentages! :-)

>     1c00bef768d4341afa7d  e3e37183ee805f33e88f
>      +26.7%    323386.33    -75.7%     61980.00   brickland1/micro/aim7/fork_test
>      -22.9%     67734.00    -64.1%     31531.33   brickland1/micro/aim7/shared
>       +0.4%      3303.67     -0.8%      3264.33   brickland1/micro/dbench/100%
>       +0.7%   1871483.67     -0.4%   1850846.00   brickland1/micro/netperf/120s-200%-TCP_MAERTS
>       -1.0%    109553.00     +0.4%    111038.67   brickland1/micro/pigz/100%
>       -0.7%     13600.67     +0.1%     13718.67   lkp-a04/micro/netperf/120s-200%-TCP_CRR
>       -4.6%    995898.00    -85.2%    154621.40   lkp-ib03/micro/aim7/fork_test
>      -31.8%     32178.00    -50.3%     23442.67   lkp-ib03/micro/aim7/shared
>       +1.1%   7466432.67     -0.7%   7334831.67   lkp-ib03/micro/hackbench/1600%-threads-pipe
>       +2.5%   1044936.33     -1.3%   1006084.00   lkp-ib03/micro/hackbench/1600%-threads-socket
>       -1.3%   5635979.00     +0.2%   5721011.67   lkp-ib03/micro/netperf/120s-200%-TCP_RR
>      -24.3%     42853.33    -56.8%     24484.33   lkp-nex04/micro/aim7/shared
>      -23.3%    754297.67    -83.2%    165479.00   lkp-sb03/micro/aim7/fork_test
>       -7.4%     21586.00    -24.1%     17698.33   lkp-sb03/micro/aim7/shared
>       +1.1%   3838724.00     +0.3%   3808206.67   lkp-sb03/micro/hackbench/1600%-process-pipe
>       +0.8%   5143255.00     -1.1%   5046716.67   lkp-sb03/micro/hackbench/1600%-threads-pipe
>       +2.8%    537048.67     -0.8%    518351.67   lkp-sb03/micro/hackbench/1600%-threads-socket
>       +4.0%     50446.67     -5.3%     45960.00   lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS
>      -42.0%     52693.00    -26.4%     66849.67   lkp-sb03/micro/netperf/120s-200%-TCP_STREAM
>       -0.6%  28005389.67     -7.7%  26006116.73   TOTAL vmstat.system.cs

Looks like a win all across, with a few below-1% regressions which might be statistical outliers - it's hard to tell without a stddev column ...

>     1c00bef768d4341afa7d  e3e37183ee805f33e88f
>
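The stddev column Ingo asks for is easy to derive once per-run samples are kept: mean plus relative standard deviation tells whether a sub-1% delta is within run-to-run noise. A hedged sketch (sample values invented for illustration, not taken from the report):

```python
import statistics

def summarize(samples):
    """Mean and relative standard deviation (%) of repeated runs."""
    mean = statistics.mean(samples)
    rsd = statistics.stdev(samples) / mean * 100.0
    return mean, rsd

# Hypothetical jobs-per-min samples for one testcase on two kernels.
base    = [31200, 30100, 28900, 26300]
patched = [29800, 31050, 27400, 28600]

for name, s in (("base", base), ("patched", patched)):
    mean, rsd = summarize(s)
    print(f"{name}: mean={mean:.0f} rsd={rsd:.1f}%")
# If the delta between the two means is well below the relative stddev,
# the apparent "regression" is indistinguishable from noise.
```

This is the same reasoning applied later in the thread, where the brickland1 aim7/shared "regression" disappears once the testcase is repeated six times per kernel.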
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
* Yuanhan Liu yuanhan@linux.intel.com wrote: Patch 1 turns locking the anon_vma's root to locking itself to let it be a per anon_vma lock, which would reduce contentions. In the same time, lock range becomes quite small then, which is bascially a call of anon_vma_interval_tree_insert(). Patch 2 turn rwsem to rwlock_t. It's a patch made from Ingo, I just made some change to let it apply based on patch 1. Patch 3 is from Peter. It was a diff, I edited it to be a patch ;) Here is the detailed changed stats with this patch applied. The test base is v3.12-rc7, and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f is patch 2. NOTE: both commits are compared to base v3.12-rc7. 1c00bef768d4341afa7d e3e37183ee805f33e88f +35.0%+89.9% brickland1/micro/aim7/fork_test +28.4%+49.3% lkp-ib03/micro/aim7/fork_test +2.0% +2.7% lkp-ib03/micro/aim7/shared -0.4% +0.0% lkp-sb03/micro/aim7/dbase +16.4%+59.0% lkp-sb03/micro/aim7/fork_test +0.1% +0.3% lkp-sb03/micro/aim7/shared +2.2% +5.0% TOTAL aim7.2000.jobs-per-min Impressive! 1c00bef768d4341afa7d e3e37183ee805f33e88f -25.9% 1008.55 -47.3% 717.39 brickland1/micro/aim7/fork_test -1.4% 641.19-3.4% 628.45 brickland1/micro/hackbench/1600%-process-pipe -1.0% 122.84+1.1% 125.36 brickland1/micro/netperf/120s-200%-UDP_RR +0.0% 121.29+0.2% 121.57 lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE -22.1% 351.41 -26.3% 332.54 lkp-ib03/micro/aim7/fork_test -1.9%31.33-2.6%31.11 lkp-ib03/micro/aim7/shared -0.4% 630.36+0.4% 635.05 lkp-ib03/micro/hackbench/1600%-process-socket -0.0% 612.62+1.8% 623.80 lkp-ib03/micro/hackbench/1600%-threads-socket -14.1% 340.30 -37.1% 249.26 lkp-sb03/micro/aim7/fork_test -0.1%41.31-0.3%41.22 lkp-sb03/micro/aim7/shared -0.0% 614.26+0.6% 617.81 lkp-sb03/micro/hackbench/1600%-process-socket -10.4% 4515.47 -18.2% 4123.55 TOTAL time.elapsed_time Here you scared me for a second with those negative percentages! 
:-) 1c00bef768d4341afa7d e3e37183ee805f33e88f +26.7%323386.33 -75.7% 61980.00 brickland1/micro/aim7/fork_test -22.9% 67734.00 -64.1% 31531.33 brickland1/micro/aim7/shared +0.4% 3303.67-0.8% 3264.33 brickland1/micro/dbench/100% +0.7% 1871483.67-0.4% 1850846.00 brickland1/micro/netperf/120s-200%-TCP_MAERTS -1.0%109553.00+0.4%111038.67 brickland1/micro/pigz/100% -0.7% 13600.67+0.1% 13718.67 lkp-a04/micro/netperf/120s-200%-TCP_CRR -4.6%995898.00 -85.2%154621.40 lkp-ib03/micro/aim7/fork_test -31.8% 32178.00 -50.3% 23442.67 lkp-ib03/micro/aim7/shared +1.1% 7466432.67-0.7% 7334831.67 lkp-ib03/micro/hackbench/1600%-threads-pipe +2.5% 1044936.33-1.3% 1006084.00 lkp-ib03/micro/hackbench/1600%-threads-socket -1.3% 5635979.00+0.2% 5721011.67 lkp-ib03/micro/netperf/120s-200%-TCP_RR -24.3% 42853.33 -56.8% 24484.33 lkp-nex04/micro/aim7/shared -23.3%754297.67 -83.2%165479.00 lkp-sb03/micro/aim7/fork_test -7.4% 21586.00 -24.1% 17698.33 lkp-sb03/micro/aim7/shared +1.1% 3838724.00+0.3% 3808206.67 lkp-sb03/micro/hackbench/1600%-process-pipe +0.8% 5143255.00-1.1% 5046716.67 lkp-sb03/micro/hackbench/1600%-threads-pipe +2.8%537048.67-0.8%518351.67 lkp-sb03/micro/hackbench/1600%-threads-socket +4.0% 50446.67-5.3% 45960.00 lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS -42.0% 52693.00 -26.4% 66849.67 lkp-sb03/micro/netperf/120s-200%-TCP_STREAM -0.6% 28005389.67-7.7% 26006116.73 TOTAL vmstat.system.cs looks like a win all across, with a few below 1% regressions what might be statistical outliners - it's hard to tell without a stddev column ... 1c00bef768d4341afa7d e3e37183ee805f33e88f -4.7%
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 01, 2013 at 09:01:36AM +0100, Ingo Molnar wrote: * Yuanhan Liu yuanhan@linux.intel.com wrote: Patch 1 turns locking the anon_vma's root to locking itself to let it be a per anon_vma lock, which would reduce contentions. In the same time, lock range becomes quite small then, which is bascially a call of anon_vma_interval_tree_insert(). Patch 2 turn rwsem to rwlock_t. It's a patch made from Ingo, I just made some change to let it apply based on patch 1. Patch 3 is from Peter. It was a diff, I edited it to be a patch ;) Here is the detailed changed stats with this patch applied. The test base is v3.12-rc7, and 1c00bef768d4341afa7d is patch 1, e3e37183ee805f33e88f is patch 2. NOTE: both commits are compared to base v3.12-rc7. 1c00bef768d4341afa7d e3e37183ee805f33e88f +35.0%+89.9% brickland1/micro/aim7/fork_test +28.4%+49.3% lkp-ib03/micro/aim7/fork_test +2.0% +2.7% lkp-ib03/micro/aim7/shared -0.4% +0.0% lkp-sb03/micro/aim7/dbase +16.4%+59.0% lkp-sb03/micro/aim7/fork_test +0.1% +0.3% lkp-sb03/micro/aim7/shared +2.2% +5.0% TOTAL aim7.2000.jobs-per-min Impressive! 1c00bef768d4341afa7d e3e37183ee805f33e88f -25.9% 1008.55 -47.3% 717.39 brickland1/micro/aim7/fork_test -1.4% 641.19-3.4% 628.45 brickland1/micro/hackbench/1600%-process-pipe -1.0% 122.84+1.1% 125.36 brickland1/micro/netperf/120s-200%-UDP_RR +0.0% 121.29+0.2% 121.57 lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE -22.1% 351.41 -26.3% 332.54 lkp-ib03/micro/aim7/fork_test -1.9%31.33-2.6%31.11 lkp-ib03/micro/aim7/shared -0.4% 630.36+0.4% 635.05 lkp-ib03/micro/hackbench/1600%-process-socket -0.0% 612.62+1.8% 623.80 lkp-ib03/micro/hackbench/1600%-threads-socket -14.1% 340.30 -37.1% 249.26 lkp-sb03/micro/aim7/fork_test -0.1%41.31-0.3%41.22 lkp-sb03/micro/aim7/shared -0.0% 614.26+0.6% 617.81 lkp-sb03/micro/hackbench/1600%-process-socket -10.4% 4515.47 -18.2% 4123.55 TOTAL time.elapsed_time Here you scared me for a second with those negative percentages! :-) Aha.. 
1c00bef768d4341afa7d e3e37183ee805f33e88f +26.7%323386.33 -75.7% 61980.00 brickland1/micro/aim7/fork_test -22.9% 67734.00 -64.1% 31531.33 brickland1/micro/aim7/shared +0.4% 3303.67-0.8% 3264.33 brickland1/micro/dbench/100% +0.7% 1871483.67-0.4% 1850846.00 brickland1/micro/netperf/120s-200%-TCP_MAERTS -1.0%109553.00+0.4%111038.67 brickland1/micro/pigz/100% -0.7% 13600.67+0.1% 13718.67 lkp-a04/micro/netperf/120s-200%-TCP_CRR -4.6%995898.00 -85.2%154621.40 lkp-ib03/micro/aim7/fork_test -31.8% 32178.00 -50.3% 23442.67 lkp-ib03/micro/aim7/shared +1.1% 7466432.67-0.7% 7334831.67 lkp-ib03/micro/hackbench/1600%-threads-pipe +2.5% 1044936.33-1.3% 1006084.00 lkp-ib03/micro/hackbench/1600%-threads-socket -1.3% 5635979.00+0.2% 5721011.67 lkp-ib03/micro/netperf/120s-200%-TCP_RR -24.3% 42853.33 -56.8% 24484.33 lkp-nex04/micro/aim7/shared -23.3%754297.67 -83.2%165479.00 lkp-sb03/micro/aim7/fork_test -7.4% 21586.00 -24.1% 17698.33 lkp-sb03/micro/aim7/shared +1.1% 3838724.00+0.3% 3808206.67 lkp-sb03/micro/hackbench/1600%-process-pipe +0.8% 5143255.00-1.1% 5046716.67 lkp-sb03/micro/hackbench/1600%-threads-pipe +2.8%537048.67-0.8%518351.67 lkp-sb03/micro/hackbench/1600%-threads-socket +4.0% 50446.67-5.3% 45960.00 lkp-sb03/micro/netperf/120s-200%-TCP_MAERTS -42.0% 52693.00 -26.4% 66849.67 lkp-sb03/micro/netperf/120s-200%-TCP_STREAM -0.6% 28005389.67-7.7% 26006116.73 TOTAL vmstat.system.cs looks like a win all across, with a few below 1% regressions what might be statistical outliners
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
* Yuanhan Liu yuanhan@linux.intel.com wrote: Btw., another _really_ interesting comparison would be against the latest rwsem patches. Mind doing such a comparison? Sure. Where can I get it? Are they on some git tree? I've Cc:-ed Tim Chen who might be able to point you to the latest version. The last on-lkml submission was in this thread: Subject: [PATCH v8 0/9] rwsem performance optimizations Thanks, Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote: * Yuanhan Liu yuanhan@linux.intel.com wrote: Btw., another _really_ interesting comparison would be against the latest rwsem patches. Mind doing such a comparison? Sure. Where can I get it? Are they on some git tree? I've Cc:-ed Tim Chen who might be able to point you to the latest version. The last on-lkml submission was in this thread: Subject: [PATCH v8 0/9] rwsem performance optimizations Thanks. I queued bunchs of tests about one hour ago, and already got some results(If necessary, I can add more data tomorrow when those tests are finished): v3.12-rc7 fe001e3de090e179f95d -9.3% brickland1/micro/aim7/shared +4.3% lkp-ib03/micro/aim7/fork_test +2.2% lkp-ib03/micro/aim7/shared -2.6% TOTAL aim7.2000.jobs-per-min v3.12-rc7 fe001e3de090e179f95d 204056.67 -23.5%156082.33 brickland1/micro/aim7/shared 79248.00 +144.3%193617.25 lkp-ib03/micro/aim7/fork_test 298355.33 -25.2%223084.67 lkp-ib03/micro/aim7/shared 581660.00-1.5%572784.25 TOTAL time.involuntary_context_switches v3.12-rc7 fe001e3de090e179f95d 22487.33-4.7% 21429.33 brickland1/micro/aim7/dbase 61412.67 -29.1% 43511.00 brickland1/micro/aim7/shared 531142.00 -27.7%383818.75 lkp-ib03/micro/aim7/fork_test 20158.33 -50.9% 9899.67 lkp-ib03/micro/aim7/shared 635200.33 -27.8%458658.75 TOTAL vmstat.system.in v3.12-rc7 fe001e3de090e179f95d 6408.67-4.5% 6117.33 brickland1/micro/aim7/dbase 87856.00 -39.5% 53170.67 brickland1/micro/aim7/shared 1043620.00 -28.0%751214.75 lkp-ib03/micro/aim7/fork_test 47152.33 -38.0% 29245.33 lkp-ib03/micro/aim7/shared 1185037.00 -29.1%839748.08 TOTAL vmstat.system.cs v3.12-rc7 fe001e3de090e179f95d 13295.00 -10.0% 11960.00 brickland1/micro/aim7/dbase 1901175.00 -35.5% 1226787.33 brickland1/micro/aim7/shared 13951.00-6.5% 13051.00 lkp-ib03/micro/aim7/dbase 239773251.17 -30.9% 165727820.75 lkp-ib03/micro/aim7/fork_test 1014933.67 -31.1%699259.67 lkp-ib03/micro/aim7/shared 242716605.83 -30.9% 167678878.75 TOTAL 
time.voluntary_context_switches v3.12-rc7 fe001e3de090e179f95d 9.56-1.0% 9.46 brickland1/micro/aim7/dbase 11.01 -10.1% 9.90 brickland1/micro/aim7/shared 36.23 +15.3%41.77 lkp-ib03/micro/aim7/fork_test 10.51 -11.9% 9.26 lkp-ib03/micro/aim7/shared 67.31+4.6%70.39 TOTAL iostat.cpu.system v3.12-rc7 fe001e3de090e179f95d 36.39-3.6%35.09 brickland1/micro/aim7/dbase 34.97-8.1%32.13 brickland1/micro/aim7/shared 20.34+6.7%21.70 lkp-ib03/micro/aim7/shared 91.70-3.0%88.92 TOTAL boottime.dhcp v3.12-rc7 fe001e3de090e179f95d 60.00+6.7%64.00 brickland1/micro/aim7/shared 60.83-9.2%55.25 lkp-ib03/micro/aim7/fork_test 120.83-1.3% 119.25 TOTAL vmstat.cpu.id v3.12-rc7 fe001e3de090e179f95d 345.50-1.1% 341.73 brickland1/micro/aim7/dbase 3788.80 +11.5% 4223.15 lkp-ib03/micro/aim7/fork_test 108.29-7.1% 100.62 lkp-ib03/micro/aim7/shared 4242.59 +10.0% 4665.50 TOTAL time.system_time v3.12-rc7 fe001e3de090e179f95d 7481.33-0.4% 7454.00
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, 2013-11-01 at 09:01 +0100, Ingo Molnar wrote: * Yuanhan Liu yuanhan@linux.intel.com wrote: Patch 1 turns locking the anon_vma's root to locking itself to let it be a per anon_vma lock, which would reduce contentions. In the same time, lock range becomes quite small then, which is bascially a call of anon_vma_interval_tree_insert(). Patch 2 turn rwsem to rwlock_t. It's a patch made from Ingo, I just made some change to let it apply based on patch 1. Andrea's last input from this kind of conversion is that it cannot be done (at least yet): https://lkml.org/lkml/2013/9/30/53 Thanks, Davidlohr -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso davidl...@hp.com wrote: Andrea's last input from this kind of conversion is that it cannot be done (at least yet): https://lkml.org/lkml/2013/9/30/53 No, none of the invalidate_page users really need to sleep. If doing this makes some people not do stupid sh*t, then that's all good. So at least _that_ worry was a false alarm. We definitely don't want to support crap in the VM, and sleeping during teardown is crap. Linus -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 1, 2013 at 11:09 AM, Linus Torvalds torva...@linux-foundation.org wrote: On Fri, Nov 1, 2013 at 10:49 AM, Davidlohr Bueso davidl...@hp.com wrote: Andrea's last input from this kind of conversion is that it cannot be done (at least yet): https://lkml.org/lkml/2013/9/30/53 No, none of the invalidate_page users really need to sleep. If doing this makes some people not do stupid sh*t, then that's all good. So at least _that_ worry was a false alarm. We definitely don't want to support crap in the VM, and sleeping during teardown is crap. Should copy Andrea on this. I talked with him during KS, and there are no current in-tree users who are doing such sleeping; however there are prospective users for networking (RDMA) or GPU stuff who want to use this to let hardware directly copy data into user mappings. I'm not too aware of the details, but my understanding is that we then need to send the NIC and/or GPU some commands to tear down the mapping, and that command is currently acknowledged with an interrupt, which is where the lseepability requirement comes from. Andrea was thinking about cooking up some scheme to dynamically change between sleepable and non-sleepable locks at runtime depending on when such drivers are used; this seems quite complicated to me but I haven't heard of alternative plans for RDMA usage either. -- Michel Walken Lespinasse A program is never fully debugged until the last user dies. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse wal...@google.com wrote: Should copy Andrea on this. I talked with him during KS, and there are no current in-tree users who are doing such sleeping; however there are prospective users for networking (RDMA) or GPU stuff who want to use this to let hardware directly copy data into user mappings. Tough. I spoke up the first time this came up and I'll say the same thing again: we're not screwing over the VM subsystem because some crazy user might want to do crazy and stupid things that nobody sane cares about. The whole somebody might want to .. argument is just irrelevant. Some people want to sleep in interrupt handlers too, or while holding random spinlocks. Too bad. They don't get to, because doing that results in problems for the rest of the system. Our job in the kernel is to do the best job technically that we can. And sometimes that very much involves saying No, you can't do that. We have limitations in the kernel. The stack is of limited size. You can't allocate arbitrarily sized memory. You must follow some very strict rules. If people can't handle that, then they can go cry to mommy, and go back to writing user mode code. In the kernel, you have to live with certain constraints that makes the kernel better. Linus -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
(11/1/13 3:54 AM), Yuanhan Liu wrote:
> Patch 1 turns locking the anon_vma's root into locking the anon_vma
> itself, making it a per-anon_vma lock, which reduces contention. At
> the same time, the lock range becomes quite small: basically a single
> call to anon_vma_interval_tree_insert().
>
> Patch 2 turns the rwsem into an rwlock_t. It's a patch from Ingo; I
> just made some changes to let it apply on top of patch 1.
>
> Patch 3 is from Peter. It was a diff; I edited it to be a patch ;)
>
> Here are the detailed stats with this patch set applied. The test
> base is v3.12-rc7; 1c00bef768d4341afa7d is patch 1, and
> e3e37183ee805f33e88f is patch 2. NOTE: both commits are compared
> against the base v3.12-rc7.

I'd suggest CCing linux-mm when posting mm patches.
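For readers following along, the shape of the change described above is
roughly the following before/after sketch of the v3.12-era write-lock
helper from include/linux/rmap.h. The "after" half is my assumption
about what the per-anon_vma rwlock_t conversion would look like (the
`rwlock` field name is hypothetical), not the actual patch:

```c
/* Before (v3.12): every anon_vma in a tree takes the root's lock, so
 * unrelated anon_vmas sharing a root contend with each other, and the
 * lock is a sleeping rw_semaphore. */
static inline void anon_vma_lock_write(struct anon_vma *anon_vma)
{
	down_write(&anon_vma->root->rwsem);	/* one lock per tree */
}

/* After patches 1+2 (sketch, hypothetical field name): lock the
 * anon_vma itself, and use a non-sleeping rwlock_t instead of the
 * rwsem.  The write-side critical section is then basically just the
 * anon_vma_interval_tree_insert() call. */
static inline void anon_vma_lock_write(struct anon_vma *anon_vma)
{
	write_lock(&anon_vma->rwlock);		/* one lock per anon_vma */
}
```

The trade-off being debated in this thread follows from the second
half: an rwlock_t cannot be held across anything that sleeps, which is
why the mmu-notifier sleeping question matters.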
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, 2013-11-01 at 18:16 +0800, Yuanhan Liu wrote:
> On Fri, Nov 01, 2013 at 09:21:46AM +0100, Ingo Molnar wrote:
>> * Yuanhan Liu yuanhan@linux.intel.com wrote:
>>>> Btw., another _really_ interesting comparison would be against
>>>> the latest rwsem patches. Mind doing such a comparison?
>>>
>>> Sure. Where can I get it? Are they on some git tree?
>>
>> I've Cc:-ed Tim Chen who might be able to point you to the latest
>> version.
>>
>> The last on-lkml submission was in this thread:
>>
>>   Subject: [PATCH v8 0/9] rwsem performance optimizations
>
> Thanks.
>
> I queued a bunch of tests about one hour ago, and already have some
> results (if necessary, I can add more data tomorrow when those tests
> are finished):

What kind of system are you using to run these workloads on?

>     v3.12-rc7  fe001e3de090e179f95d
>         -9.3%  brickland1/micro/aim7/shared
>         +4.3%  lkp-ib03/micro/aim7/fork_test
>         +2.2%  lkp-ib03/micro/aim7/shared
>         -2.6%  TOTAL  aim7.2000.jobs-per-min

Sorry if I'm missing something, but could you elaborate on what these
percentages represent? Are they anon_vma rwsem + optimistic spinning
patches vs. anon_vma rwlock?

Also, I see you're running aim7; you might be interested in some of
the results I found when trying out Ingo's rwlock conversion patch on
a largish 80-core system: https://lkml.org/lkml/2013/9/29/280

Thanks,
Davidlohr
Re: [PATCH 0/4] per anon_vma lock and turn anon_vma rwsem lock to rwlock_t
On Fri, 2013-11-01 at 11:55 -0700, Linus Torvalds wrote:
> On Fri, Nov 1, 2013 at 11:47 AM, Michel Lespinasse wal...@google.com wrote:
>> Should copy Andrea on this.
>>
>> I talked with him during KS, and there are no current in-tree users
>> who are doing such sleeping; however there are prospective users for
>> networking (RDMA) or GPU stuff who want to use this to let hardware
>> directly copy data into user mappings.
>
> Tough. I spoke up the first time this came up and I'll say the same
> thing again: we're not screwing over the VM subsystem because some
> crazy user might want to do crazy and stupid things that nobody sane
> cares about.
>
> The whole "somebody might want to .." argument is just irrelevant.

Ok, I was under the impression that this was something already in the
kernel and hence too late to go back on. Based on the results, I'm
definitely in favor of the whole rwlock conversion.

Thanks,
Davidlohr