On Wed, Nov 21, 2012 at 7:10 AM, Ingo Molnar wrote:
>
> Because scalability slowdowns are often non-linear.
Only if you hold locks or have other non-cpu-private activity.
Which the vsyscall code really shouldn't have.
That said, it might be worth removing the "prefetchw(&mm->mmap_sem)"
from the
* Linus Torvalds wrote:
> On Wed, Nov 21, 2012 at 7:10 AM, Ingo Molnar wrote:
> >
> > Because scalability slowdowns are often non-linear.
>
> Only if you hold locks or have other non-cpu-private activity.
>
> Which the vsyscall code really shouldn't have.
Yeah, the faults accessing any sort
On Wed, 21 Nov 2012, Ingo Molnar wrote:
> Btw., what I did was to simply look at David's profile on the
> regressing system and I compared it to the profile I got on a
> pretty similar (but unfortunately not identical and not
> regressing) system. I saw 3 differences:
>
> - the numa emulation
On Wed, Nov 21, 2012 at 08:37:12PM +0100, Andrea Arcangeli wrote:
> Hi,
>
> On Wed, Nov 21, 2012 at 10:38:59AM +, Mel Gorman wrote:
> > HACKBENCH PIPES
> >              3.7.0              3.7.0              3.7.0              3.7.0              3.7.0
> >
Hi,
On Wed, Nov 21, 2012 at 10:38:59AM +, Mel Gorman wrote:
> HACKBENCH PIPES
>              3.7.0                3.7.0                   3.7.0       3.7.0       3.7.0
>    rc6-stats-v4r12  rc6-schednuma-v16r2  rc6-autonuma-v28fastr3  rc6-mo
* Linus Torvalds wrote:
> [...] And not look at vsyscalls or anything, but look at what
> schednuma does wrong!
I have started 4 independent lines of inquiry to figure out
what's wrong on David's system, and all four are in the category
of 'what does our tree do to cause a regression':
-
On 11/21/2012 12:02 PM, Linus Torvalds wrote:
> The same is true of all your arguments about Mel's numbers wrt THP
> etc. Your arguments are misleading - either intentionally, or because
> you yourself didn't think things through. For schednuma, it's not
> enough to be on par with mainline with THP off - t
* Ingo Molnar wrote:
> So because I did not have an old-glibc system like David's, I
> did not know the actual page fault rate. If it is high enough
> then nonlinear effects might cause such effects.
>
> This is an entirely valid line of inquiry IMO.
Btw., when comparing against 'mainline' I
* Ingo Molnar wrote:
> This is an entirely valid line of inquiry IMO.
Btw., what I did was to simply look at David's profile on the
regressing system and I compared it to the profile I got on a
pretty similar (but unfortunately not identical and not
regressing) system. I saw 3 differences:
* Linus Torvalds wrote:
> On Mon, Nov 19, 2012 at 11:06 PM, Ingo Molnar wrote:
> >
> > Oh, finally a clue: you seem to have vsyscall emulation
> > overhead!
>
> Ingo, stop it already!
>
> This is *exactly* the kind of "blame everybody else than
> yourself" behavior that I was talking about e
On Mon, Nov 19, 2012 at 11:06 PM, Ingo Molnar wrote:
>
> Oh, finally a clue: you seem to have vsyscall emulation
> overhead!
Ingo, stop it already!
This is *exactly* the kind of "blame everybody else than yourself"
behavior that I was talking about earlier.
There have been an absolute *shitload
On Mon, Nov 19, 2012 at 11:37:01PM -0800, David Rientjes wrote:
> On Tue, 20 Nov 2012, Ingo Molnar wrote:
>
> > No doubt numa/core should not regress with THP off or on and
> > I'll fix that.
> >
> > As a background, here's how SPECjbb gets slower on mainline
> > (v3.7-rc6) if you boot Mel's ke
On Mon, Nov 19, 2012 at 07:41:16PM -0500, Rik van Riel wrote:
> On 11/19/2012 06:00 PM, Mel Gorman wrote:
> >On Mon, Nov 19, 2012 at 11:36:04PM +0100, Ingo Molnar wrote:
> >>
> >>* Mel Gorman wrote:
> >>
> >>>Ok.
> >>>
> >>>In response to one of your later questions, I found that I had
> >>>in fac
On Tue, Nov 20, 2012 at 11:40:53AM +0100, Ingo Molnar wrote:
>
> btw., mind sending me a fuller/longer profile than the one
> you've sent before? In particular does your system have any
> vsyscall emulation page fault overhead?
>
I can't, the results for specjbb got trashed after I moved to 3.
On Tue, Nov 20, 2012 at 10:20:10AM +, Mel Gorman wrote:
> I've added two extra configuration files to run specjbb single and multi
> JVMs with THP enabled. It takes about 1.5 to 2 hours to complete a single
1.5 to 2 hours if running to the full set of warehouses required for a
compliant run. C
btw., mind sending me a fuller/longer profile than the one
you've sent before? In particular does your system have any
vsyscall emulation page fault overhead?
If yes, does the patch below change anything for you?
Thanks,
Ingo
>
Subject: x86/vsyscall: Add Kconfig optio
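The overhead Ingo is asking about comes from the legacy x86-64 vsyscall page: in emulate mode, calls into it take a page fault that the kernel handles in software. A quick way to see whether a process even maps that page is to look for the `[vsyscall]` entry in `/proc/<pid>/maps`. This is a minimal sketch, not from the thread; the `sample` maps excerpt and helper name are hypothetical:

```python
# Sketch: detect the legacy x86-64 vsyscall mapping in a maps dump.
# The page lives at a fixed address; when the kernel emulates it,
# every call into it costs a page fault ("vsyscall emulation overhead").

VSYSCALL_ADDR = 0xffffffffff600000  # fixed legacy address on x86-64

def has_vsyscall_page(maps_text):
    """Scan /proc/<pid>/maps content for the [vsyscall] mapping."""
    return any(line.rstrip().endswith("[vsyscall]")
               for line in maps_text.splitlines())

# Hypothetical excerpt of /proc/self/maps, for illustration only:
sample = (
    "7f1c000-7f1d000 r-xp 00000000 08:01 1234 /lib/libc.so\n"
    "ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]\n"
)
print(has_vsyscall_page(sample))  # -> True
```

On a live system one would read `/proc/self/maps` directly; old glibc builds (like the one suspected on David's machine) are the ones still calling into this page.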
> > Ingo, stop doing this kind of crap.
> >
> > Let's make it clear: if the NUMA patches continue to regress
> > performance for reasonable loads (and that very much includes
> > "no THP") then they won't be merged.
> >
> > You seem to be in total denial. Every time Mel sends out
> > results t
* David Rientjes wrote:
> On Tue, 20 Nov 2012, Ingo Molnar wrote:
>
> > > This happened to be an Opteron (but not 83xx series), 2.4Ghz.
> >
> > Ok - roughly which family/model from /proc/cpuinfo?
>
> It's close enough, it's 23xx.
Ok - which family/model number in /proc/cpuinfo?
I'm asking
On Tue, 20 Nov 2012, Ingo Molnar wrote:
> > This happened to be an Opteron (but not 83xx series), 2.4Ghz.
>
> Ok - roughly which family/model from /proc/cpuinfo?
>
It's close enough, it's 23xx.
> > It's perf top -U, the benchmark itself was unchanged so I
> > didn't think it was interesting
On Tue, 20 Nov 2012, Ingo Molnar wrote:
> > I confirm that numa/core regresses significantly more without
> > thp than the 6.3% regression I reported with thp in terms of
> > throughput on the same system. numa/core at 01aa90068b12
> > ("sched: Use the best-buddy 'ideal cpu' in balancing
> >
* David Rientjes wrote:
> I confirm that numa/core regresses significantly more without
> thp than the 6.3% regression I reported with thp in terms of
> throughput on the same system. numa/core at 01aa90068b12
> ("sched: Use the best-buddy 'ideal cpu' in balancing
> decisions") had 99389.49
On Mon, Nov 19, 2012 at 11:44 PM, Ingo Molnar wrote:
>
> * David Rientjes wrote:
>
>> On Tue, 20 Nov 2012, Ingo Molnar wrote:
>>
>> > > > numa/core at ec05a2311c35 ("Merge branch 'sched/urgent' into
>> > > > sched/core") had an average throughput of 136918.34
>> > > > SPECjbb2005 bops, which is a
* David Rientjes wrote:
> This is in comparison to my earlier perftop results which were with thp
> enabled. Keep in mind that this system has a NUMA configuration of
>
> $ cat /sys/devices/system/node/node*/distance
> 10 20 20 30
> 20 10 20 20
> 20 20 10 20
> 30 20 20 10
You could check wh
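The distance matrix David pastes is the kernel's SLIT-style view of the topology: 10 means local, larger values mean proportionally more remote. A small sketch (helper names are my own, not from the thread) shows how to parse that sysfs dump and pick out the most remote node pairs — here nodes 0 and 3, which is why placement decisions matter more on this box than on a flat 4-node system:

```python
# Sketch: parse the output of
#   cat /sys/devices/system/node/node*/distance
# Values are relative latencies: 10 = local node, larger = more remote.
# The dump below mirrors the 4-node topology quoted in the mail.

def parse_distances(text):
    """Turn the sysfs dump into a list of integer rows."""
    return [[int(tok) for tok in line.split()]
            for line in text.strip().splitlines()]

def farthest_pairs(matrix):
    """Return (distance, [(i, j), ...]) for the most remote node pairs."""
    worst = max(max(row) for row in matrix)
    pairs = [(i, j) for i, row in enumerate(matrix)
             for j, d in enumerate(row) if d == worst and i < j]
    return worst, pairs

dump = """\
10 20 20 30
20 10 20 20
20 20 10 20
30 20 20 10
"""

matrix = parse_distances(dump)
print(farthest_pairs(matrix))  # -> (30, [(0, 3)])
```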
* David Rientjes wrote:
> On Tue, 20 Nov 2012, Ingo Molnar wrote:
>
> > > > numa/core at ec05a2311c35 ("Merge branch 'sched/urgent' into
> > > > sched/core") had an average throughput of 136918.34
> > > > SPECjbb2005 bops, which is a 6.3% regression.
> > >
> > > perftop during the run on num
On Tue, 20 Nov 2012, Ingo Molnar wrote:
> No doubt numa/core should not regress with THP off or on and
> I'll fix that.
>
> As a background, here's how SPECjbb gets slower on mainline
> (v3.7-rc6) if you boot Mel's kernel config and turn THP forcibly
> off:
>
> (avg: 502395 ops/sec)
> (avg
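"Turn THP forcibly off" refers to the standard sysfs knob, `/sys/kernel/mm/transparent_hugepage/enabled`, where the kernel reports the active mode in brackets (e.g. `always madvise [never]`). A small sketch of reading that setting — the helper names are illustrative, and the fallback assumes a kernel built without THP simply lacks the file:

```python
# Sketch: read the active THP mode from sysfs. The kernel prints all
# modes and brackets the active one, e.g. "always madvise [never]".
# Writing "never" to the same file is what turning THP off means.
import re

THP_ENABLED = "/sys/kernel/mm/transparent_hugepage/enabled"

def active_thp_mode(line):
    """Extract the bracketed (active) mode from the sysfs line."""
    m = re.search(r"\[(\w+)\]", line)
    return m.group(1) if m else None

def read_thp_mode(path=THP_ENABLED):
    """Return the live setting, or None on kernels without THP."""
    try:
        with open(path) as f:
            return active_thp_mode(f.read())
    except FileNotFoundError:
        return None

print(active_thp_mode("always madvise [never]"))  # -> never
```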
* Linus Torvalds wrote:
> On Mon, Nov 19, 2012 at 12:36 PM, Ingo Molnar wrote:
> >
> > Hugepages is a must for most forms of NUMA/HPC. This alone
> > questions the relevance of most of your prior numa/core testing
> > results. I now have to strongly dispute your other conclusions
> > as well.
>
On Tue, 20 Nov 2012, Ingo Molnar wrote:
> > > numa/core at ec05a2311c35 ("Merge branch 'sched/urgent' into
> > > sched/core") had an average throughput of 136918.34
> > > SPECjbb2005 bops, which is a 6.3% regression.
> >
> > perftop during the run on numa/core at 01aa90068b12 ("sched:
> > Use
* David Rientjes wrote:
> > numa/core at ec05a2311c35 ("Merge branch 'sched/urgent' into
> > sched/core") had an average throughput of 136918.34
> > SPECjbb2005 bops, which is a 6.3% regression.
>
> perftop during the run on numa/core at 01aa90068b12 ("sched:
> Use the best-buddy 'ideal cpu'
On Mon, 19 Nov 2012, David Rientjes wrote:
> I confirm that SPECjbb2005 1.07 -Xmx4g regresses in terms of throughput on
> my 16-way, 4 node system with 32GB of memory using 16 warehouses and 240
> measurement seconds. I averaged the throughput for five runs on each
> kernel.
>
> Java(TM) SE R
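The numbers being argued over are averages of several runs compared against a baseline. As a sketch of the arithmetic only — the helper names are mine, and the implied baseline below is derived from the two figures quoted in the thread (136918.34 bops being a 6.3% regression), not measured:

```python
# Sketch: average per-run SPECjbb throughput and express the change
# against a baseline as a regression percentage.

def mean(values):
    return sum(values) / len(values)

def regression_pct(baseline, candidate):
    """Percent lost relative to baseline (positive = regression)."""
    return (baseline - candidate) / baseline * 100.0

# If numa/core averaged 136918.34 bops and that is a 6.3% regression,
# the implied mainline baseline is:
baseline = 136918.34 / (1 - 0.063)
print(round(baseline))                                # -> 146124
print(round(regression_pct(baseline, 136918.34), 1))  # -> 6.3
```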
On Mon, Nov 19, 2012 at 12:36 PM, Ingo Molnar wrote:
>
> Hugepages is a must for most forms of NUMA/HPC. This alone
> questions the relevance of most of your prior numa/core testing
> results. I now have to strongly dispute your other conclusions
> as well.
Ingo, stop doing this kind of crap.
Le
On Mon, 19 Nov 2012, Mel Gorman wrote:
> I was not able to run a full sets of tests today as I was distracted so
> all I have is a multi JVM comparison. I'll keep it shorter than average
>
> 3.7.0 3.7.0
> rc5-stats-v4r2 rc5-schednuma-v1
On 11/19/2012 06:00 PM, Mel Gorman wrote:
> On Mon, Nov 19, 2012 at 11:36:04PM +0100, Ingo Molnar wrote:
> > * Mel Gorman wrote:
> > > Ok.
> > >
> > > In response to one of your later questions, I found that I had
> > > in fact disabled THP without properly reporting it. [...]
> > Hugepages is a must for most forms of NUM
On Mon, Nov 19, 2012 at 11:36:04PM +0100, Ingo Molnar wrote:
>
> * Mel Gorman wrote:
>
> > Ok.
> >
> > In response to one of your later questions, I found that I had
> > in fact disabled THP without properly reporting it. [...]
>
> Hugepages is a must for most forms of NUMA/HPC.
Requiring hu
* Mel Gorman wrote:
> Ok.
>
> In response to one of your later questions, I found that I had
> in fact disabled THP without properly reporting it. [...]
Hugepages is a must for most forms of NUMA/HPC. This alone
questions the relevance of most of your prior numa/core testing
results. I now
On Mon, Nov 19, 2012 at 09:07:07PM +0100, Ingo Molnar wrote:
>
> * Mel Gorman wrote:
>
> > > [ SPECjbb transactions/sec ]     |
> > > [ higher is better          ]    |
> > >                                  |
> > > SPECjbb single-1x32   524k  507k |  638
On Mon, Nov 19, 2012 at 08:13:39PM +0100, Ingo Molnar wrote:
>
> * Mel Gorman wrote:
>
> > On Mon, Nov 19, 2012 at 03:14:17AM +0100, Ingo Molnar wrote:
> > > I'm pleased to announce the latest version of the numa/core tree.
> > >
> > > Here are some quick, preliminary performance numbers on a 4
* Mel Gorman wrote:
> > [ SPECjbb transactions/sec ]     |
> > [ higher is better          ]    |
> >                                  |
> > SPECjbb single-1x32   524k  507k |  638k  +21.7%
> > --
* Mel Gorman wrote:
> On Mon, Nov 19, 2012 at 03:14:17AM +0100, Ingo Molnar wrote:
> > I'm pleased to announce the latest version of the numa/core tree.
> >
> > Here are some quick, preliminary performance numbers on a 4-node,
> > 32-way, 64 GB RAM system:
> >
> > CONFIG_NUMA_BALANCING=y
> >
On Mon, Nov 19, 2012 at 03:14:17AM +0100, Ingo Molnar wrote:
> I'm pleased to announce the latest version of the numa/core tree.
>
> Here are some quick, preliminary performance numbers on a 4-node,
> 32-way, 64 GB RAM system:
>
> CONFIG_NUMA_BALANCING=y
>