RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-07 Thread Chen, Kenneth W
Ingo Molnar wrote on Tuesday, April 05, 2005 11:46 PM > ok, the delay of 16 secs is a lot better. Could you send me the full > detection log, how stable is the curve? Full log attached. [uuencoded attachment: boot.log]

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-05 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > > tested on x86, the calibration results look ok there. > > Calibration result on ia64 (1.5 GHz, 9 MB), somewhat smaller in this > version compare to earlier estimate of 10.4ms. The optimal setting > found by a db workload is around 16 ms. with

RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-05 Thread Chen, Kenneth W
Ingo Molnar wrote on Monday, April 04, 2005 8:05 PM > > latest patch attached. Changes: > > - stabilized calibration even more, by using cache flushing >instructions to generate a predictable working set. The cache >flushing itself is not timed, it is used to create quiescent >cache s

RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-05 Thread Chen, Kenneth W
Ingo Molnar wrote on Sunday, April 03, 2005 11:24 PM > great! How long does the benchmark take (hours?), and is there any way > to speed up the benchmarking (without hurting accuracy), so that > multiple migration-cost settings could be tried? Would it be possible to > try a few other values via t

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-04 Thread Ingo Molnar
latest patch attached. Changes: - stabilized calibration even more, by using cache flushing instructions to generate a predictable working set. The cache flushing itself is not timed, it is used to create quiescent cache state. I only guessed the ia64 version - e.g. i didnt know

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-04 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > Perhaps, I'm not getting the latest patch? It skipped measuring > because migration cost array is non-zero (initialized to -1LL): yeah ... some mixup here. I've attached the latest. > Also, the cost calculation in measure_one() looks fishy to me

RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-04 Thread Chen, Kenneth W
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > The cache size information on ia64 is already available at the finger > tip. Here is a patch that I whipped up to set max_cache_size for ia64. Ingo Molnar wrote on Monday, April 04, 2005 4:38 AM > thanks - i've added this to my tree. > > i've attached

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-04 Thread Paul Jackson
Ingo wrote: > i've attached the latest snapshot. I ran your latest snapshot on 64 CPU (well, 62 - one node wasn't working) system. I made one change - chop the matrix lines at 8 terms. It's a hack - don't know if it's a good idea. But the long lines were hard to read (and would only get worse o

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-04 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > Ingo Molnar wrote on Saturday, April 02, 2005 11:04 PM > > the default on ia64 (32MB) was way too large and caused the search to > > start from 64MB. That can take a _long_ time. > > > > i've attached a new patch with your changes included, and a cou

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-04 Thread Paul Jackson
Ingo wrote: > the problem i mentioned earlier is that there is no other use Eh ... whatever. The present seems straight forward enough, with a simple sched domain tree and your auto-tune migration cost calculation bolted directly on top of that. I'd better leave the futures to those more experie

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-04 Thread Paul Jackson
Ingo wrote: > agreed - i've changed it to domain_distance() in my tree. Good - cool - thanks. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373, 1.925.600.0401

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Would be a good idea to rename 'cpu_distance()' to something more > specific, like 'cpu_dist_ndx()', and reserve the generic name > 'cpu_distance()' for later use to return a scaled integer distance, > rather like 'node_distance()' does now. [...] a

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Nick wrote: > > In a sense, the information *is* already there - in node_distance. > > What I think should be done is probably to use node_distance when > > calculating costs, ... > > Hmmm ... perhaps I'm confused, but this sure sounds like the alterna

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Nick wrote: > In a sense, the information *is* already there - in node_distance. > What I think should be done is probably to use node_distance when > calculating costs, ... Hmmm ... perhaps I'm confused, but this sure sounds like the alternative implementation of cpu_distance using node_distance

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > > a numa scheduler domain at the top level and cache_hot_time will be > > set to 0 in that case on smp box. Though this will be a moot point > > with recent patch from Suresh Siddha for removing the extra bogus > > scheduler domains. > > http://mar

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > Ingo Molnar wrote on Sunday, April 03, 2005 7:30 AM > > how close are these numbers to the real worst-case migration costs on > > that box? > > I booted your latest patch on a 4-way SMP box (1.5 GHz, 9MB ia64). This > is what it produces. I think t

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Nick Piggin
On Sun, 2005-04-03 at 20:55 -0700, Paul Jackson wrote: > But if we knew the CPU hierarchy in more detail, and if we had some > other use for that detail (we don't that I know), then I take it from > your comment that we should be reluctant to push those details into the > sched domains. Put them

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote: > There's no other place to push them One could make a place, if the need arose. > but trying and benchmarking it is necessary to tell for sure. Hard to argue with that ... ;).

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Ingo, if I understood correctly, suggested pushing any necessary > detail of the CPU hierarchy into the scheduler domains, so that his > latest work tuning migration costs could pick it up from there. > > It makes good sense for the migration cost es

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Andy wrote: > Not that I really know what I'm talking about here, but this sounds > highly parallelizable. I doubt it. If we are testing the cost of a migration between CPUs alpha and beta, and at the same time testing between CPUs gamma and delta, then often there will be some hardware that is

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Andy Lutomirski
Paul Jackson wrote: Ok - that flies, or at least walks. It took 53 seconds to compute this cost matrix. Not that I really know what I'm talking about here, but this sounds highly parallelizable. It seems like you could do N/2 measurements at a time, so this should be O(N) to compute the matrix
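A minimal sketch of the pairing idea Andy describes, assuming a round-robin ("circle method") schedule: N CPUs are split into N/2 disjoint pairs per round, and N-1 rounds cover every unordered pair exactly once. The function name `pairing_rounds` is hypothetical and is not taken from the patch. Paul Jackson's reply in this thread doubts the approach, apparently because pairs measured at the same time may contend for the same hardware, so this shows only the scheduling, not that the resulting timings would be valid.

```python
def pairing_rounds(n_cpus):
    """Return a list of rounds; each round is a list of disjoint CPU pairs.

    Uses the round-robin circle method: keep slot 0 fixed and rotate the
    remaining slots by one position per round. Every unordered pair of
    CPUs appears in exactly one round.
    """
    cpus = list(range(n_cpus))
    if n_cpus % 2:
        cpus.append(None)  # odd CPU count: one CPU sits out each round

    n = len(cpus)
    rounds = []
    for _ in range(n - 1):
        # Pair slot i with slot n-1-i; drop the pair holding the idle slot.
        pairs = [(cpus[i], cpus[n - 1 - i]) for i in range(n // 2)]
        rounds.append([tuple(sorted(p)) for p in pairs if None not in p])
        # Rotate every slot except the first.
        cpus = [cpus[0]] + [cpus[-1]] + cpus[1:-1]
    return rounds


rounds = pairing_rounds(8)
print(len(rounds), "rounds,", len(rounds[0]), "pairs per round")
```

With 8 CPUs this gives 7 rounds of 4 pairs each, 28 pairs in total, which is where the O(N) round count comes from if each round's measurements really could run concurrently.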

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Paul wrote: > I should push in the direction of improving the > SN2 sched domain hierarchy. Nick wrote: > I'd just be a bit careful about this. Good point - thanks. I will - be careful. I have no delusions that I know what would be an "improvement" to the scheduler - if anything. Ingo, if I un

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Nick Piggin
Paul Jackson wrote: Ingo wrote: if you create a sched-domains hierarchy (based on the SLIT tables, or in whatever other way) that matches the CPU hierarchy then you'll automatically get the proper distances detected. Yes - agreed. I should push in the direction of improving the SN2 sched domain

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-03 Thread Kevin Puetz
Linus Torvalds wrote: > > > On Fri, 1 Apr 2005, Chen, Kenneth W wrote: >> >> Paul, you definitely want to check this out on your large numa box. I >> booted a kernel with this patch on a 32-way numa box and it took a long >> time to produce the cost matrix. > > Is there anything fundamen

RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Chen, Kenneth W
Ingo Molnar wrote on Sunday, April 03, 2005 7:30 AM > how close are these numbers to the real worst-case migration costs on > that box? I booted your latest patch on a 4-way SMP box (1.5 GHz, 9MB ia64). This is what it produces. I think the estimate is excellent. [00]: -10.4(0) 10.4(0) 1

RE: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Chen, Kenneth W
Ingo Molnar wrote on Saturday, April 02, 2005 11:04 PM > the default on ia64 (32MB) was way too large and caused the search to > start from 64MB. That can take a _long_ time. > > i've attached a new patch with your changes included, and a couple of > new things added: > > - removed the 32MB max_ca

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote: > how close are these numbers to the real worst-case migration costs on > that box? What are the cache sizes and what is their hierarchies? > ... > is there any workload that shows the same scheduling related performance > regressions, other than Ken's $1m+ benchmark kit? I'll have

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote: > if you create a sched-domains hierarchy (based on the SLIT tables, or in > whatever other way) that matches the CPU hierarchy then you'll > automatically get the proper distances detected. Yes - agreed. I should push in the direction of improving the SN2 sched domain hierarchy.

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ingo wrote: > if_ there is a significant hierarchy between CPUs it > should be represented via a matching sched-domains hierarchy, Agreed. I'll see how the sched domains hierarchy looks on a bigger SN2 systems. If the CPU hierarchy is not reflected in the sched-domain hierarchy any better there,

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > > 3) I was noticing that my test system was only showing a couple of > distinct values for cpu_distance, even though it has 4 distinct > distances for values of node_distance. So I coded up a variant of > cpu_distance that converts the

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Three more observations. > > 1) The slowest measure_one() calls are, not surprisingly, those for the > largest sizes. At least on my test system of the moment, the plot > of cost versus size has one major maximum (a one hump camel, not two).

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Ok - that flies, or at least walks. It took 53 seconds to compute > this cost matrix. 53 seconds is too much - i'm working on reducing it. > Here's what it prints, on a small 8 CPU ia64 SN2 Altix, with > the migration_debug prints formatted separate

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Three more observations. 1) The slowest measure_one() calls are, not surprisingly, those for the largest sizes. At least on my test system of the moment, the plot of cost versus size has one major maximum (a one hump camel, not two). Seems like if we computed from smallest size

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Ok - that flies, or at least walks. It took 53 seconds to compute this cost matrix. Here's what it prints, on a small 8 CPU ia64 SN2 Altix, with the migration_debug prints formatted separately from the primary table, for ease of reading: Total of 8 processors activated (15548.60 BogoMIPS). -

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
Earlier, Paul wrote: > Note the first 3 chars of the panic message "4.5". This looks like it > might be the [00]-[01] entry of Ingo's table, flushed out when the > newlines of the panic came through. For the record, the above speculation is probably wrong. More likely, the first six characters "

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-03 Thread Paul Jackson
> the default on ia64 (32MB) was way too large Agreed. It took about 9 minutes to search the first pair of cpus (cpu 0 to cpu 1) from a size of 67107840 down to a size of 62906, based on some prints I added since my last message. > it seems the screen blanking timer hit Ah - yes. That makes s

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-02 Thread Nick Piggin
David Lang wrote: On Sat, 2 Apr 2005, Andreas Dilger wrote: given that this would let you get the same storage with about 1200 fewer drives (with corresponding savings in raid controllers, fiberchannel controllers and rack frames) it would be interesting to know how close it would be (for a lot o

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-02 Thread David Lang
On Sat, 2 Apr 2005, Andreas Dilger wrote: given that this would let you get the same storage with about 1200 fewer drives (with corresponding savings in raid controllers, fiberchannel controllers and rack frames) it would be interesting to know how close it would be (for a lot of people the savings

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-02 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Just so as no else wastes time repeating the little bit I've done so > far, and so I don't waste time figuring out what is already known, > here's what I have so far, trying out Ingo's "sched: auto-tune > migration costs" on ia64 SN2: > > To get it

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-02 Thread Andreas Dilger
On Apr 02, 2005 22:36 -0800, David Lang wrote: > On Fri, 1 Apr 2005, Chen, Kenneth W wrote: > >To run this "industry db benchmark", assuming you have a 32-way numa box, > >I recommend buying the following: > > > >512 GB memory > >1500 73 GB 15k-rpm fiber channel disks > >50 hardware raid controlle

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-02 Thread David Lang
On Fri, 1 Apr 2005, Chen, Kenneth W wrote: To run this "industry db benchmark", assuming you have a 32-way numa box, I recommend buying the following: 512 GB memory 1500 73 GB 15k-rpm fiber channel disks 50 hardware raid controllers, make sure you get the top of the line model (the one has 1GB me

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-02 Thread Paul Jackson
Just so as no else wastes time repeating the little bit I've done so far, and so I don't waste time figuring out what is already known, here's what I have so far, trying out Ingo's "sched: auto-tune migration costs" on ia64 SN2: To get it to compile against 2.6.12-rc1-mm4, I did thus: 1. Ma

Re: [patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-02 Thread Paul Jackson
Ingo wrote: > in theory the code should work fine on ia64 as well, Nice. I'll try it on our SN2 Altix IA64 as well. Though I am being delayed a day or two in this by irrelevant problems.

[patch] sched: auto-tune migration costs [was: Re: Industry db benchmark result on recent 2.6 kernels]

2005-04-02 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > Ingo Molnar wrote on Thursday, March 31, 2005 8:52 PM > > the current defaults for cache_hot_time are 10 msec for NUMA domains, > > and 2.5 msec for SMP domains. Clearly too low for CPUs with 9MB cache. > > Are you increasing cache_hot_time in your e

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Paul Jackson
Kenneth wrote: > I recommend buying the following: ah so ... I think I'll skip running the industry db benchmark for now, if that's all the same. What sort of feedback are you looking for from my running this patch?

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Nick Piggin
Linus Torvalds wrote: On Fri, 1 Apr 2005, Chen, Kenneth W wrote: Paul, you definitely want to check this out on your large numa box. I booted a kernel with this patch on a 32-way numa box and it took a long time to produce the cost matrix. Is there anything fundamentally wrong with the notio

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Nick Piggin
Chen, Kenneth W wrote: Ingo Molnar wrote on Thursday, March 31, 2005 8:52 PM the current defaults for cache_hot_time are 10 msec for NUMA domains, and 2.5 msec for SMP domains. Clearly too low for CPUs with 9MB cache. Are you increasing cache_hot_time in your experiment? If that solves most of the

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Chen, Kenneth W
Paul Jackson wrote on Friday, April 01, 2005 5:45 PM > Kenneth wrote: > > Paul, you definitely want to check this out on your large numa box. > > Interesting - thanks. I can get a kernel patched and booted on a big > box easily enough. I don't know how to run an "industry db benchmark", > and ben

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Paul Jackson
Kenneth wrote: > Paul, you definitely want to check this out on your large numa box. Interesting - thanks. I can get a kernel patched and booted on a big box easily enough. I don't know how to run an "industry db benchmark", and benchmarks aren't my forte. Should I rope in one of our guys who i

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Chen, Kenneth W
Ingo Molnar wrote on Thursday, March 31, 2005 8:52 PM > the current defaults for cache_hot_time are 10 msec for NUMA domains, > and 2.5 msec for SMP domains. Clearly too low for CPUs with 9MB cache. > Are you increasing cache_hot_time in your experiment? If that solves > most of the problem that wo

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Chen, Kenneth W
Linus Torvalds wrote on Tuesday, March 29, 2005 4:00 PM > Also, it would be absolutely wonderful to see a finer granularity (which > would likely also answer the stability question of the numbers). If you > can do this with the daily snapshots, that would be great. If it's not > easily automatable,

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Linus Torvalds
On Fri, 1 Apr 2005, Chen, Kenneth W wrote: > > Paul, you definitely want to check this out on your large numa box. I booted > a kernel with this patch on a 32-way numa box and it took a long time > to produce the cost matrix. Is there anything fundamentally wrong with the notion of just i

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Chen, Kenneth W
Ingo Molnar wrote on Thursday, March 31, 2005 10:46 PM > before we get into complexities, i'd like to see whether it solves Ken's > performance problem. The attached patch (against BK-curr, but should > apply to vanilla 2.6.12-rc1 too) adds the autodetection feature. (For > ia64 i've hacked in a ca

RE: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Manfred Spraul
On Mon, 28 Mar 2005, Chen, Kenneth W wrote: With that said, here goes our first data point along with some historical data we have collected so far. 2.6.11 -13% 2.6.9 - 6% 2.6.8 -23% 2.6.2 - 1% baseline(rhel3) Is it possible to generate an instruction level op

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Paul Jackson
Ingo wrote: > but i'd too go for the simpler 'pseudo-distance' function, because it's > so much easier to iterate through it. But it's not intuitive. Maybe it > should be called 'connection ID': a unique ID for each unique type of > path between CPUs. Well said. Thanks.
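The 'connection ID' idea can be sketched roughly as follows, assuming some arch-specific function returns the same ID for every CPU pair joined by the same type of path. Both `representatives_by_connection_id` and `toy_connection_id` are hypothetical names, and the two-CPUs-per-node topology is invented for the example. The point is that the expensive migration-cost measurement would then run once per distinct ID rather than once per CPU pair.

```python
from collections import defaultdict


def representatives_by_connection_id(n_cpus, connection_id):
    """Group CPU pairs by connection ID and pick one pair per ID.

    Enumerating pairs is still O(cpus^2), but it is cheap bookkeeping;
    only the returned representatives would need a real measurement.
    """
    groups = defaultdict(list)
    for a in range(n_cpus):
        for b in range(a + 1, n_cpus):
            groups[connection_id(a, b)].append((a, b))
    return {cid: pairs[0] for cid, pairs in groups.items()}


def toy_connection_id(a, b, cpus_per_node=2):
    """Hypothetical topology: ID 0 = same node, ID 1 = different nodes."""
    return 0 if a // cpus_per_node == b // cpus_per_node else 1


reps = representatives_by_connection_id(8, toy_connection_id)
print(reps)
```

In this toy topology, 8 CPUs have 28 pairs but only two connection IDs, so two measurements would stand in for the full matrix.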

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > > It has to be made sure that H1+H2+H3 != H4+H5+H6, > > Yeah - if you start trying to think about the general case here, the > combinations tend to explode on one. well, while i dont think we need that much complexity, the most generic case is a rep

Re: Industry db benchmark result on recent 2.6 kernels

2005-04-01 Thread Paul Jackson
> It has to be made sure that H1+H2+H3 != H4+H5+H6, Yeah - if you start trying to think about the general case here, the combinations tend to explode on one. I'm thinking we get off easy, because: 1) Specific arch's can apply specific short cuts. My intuition was that any specific arch

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Paul Jackson
> Couple of observations: yeah - plausible enough.

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Nick wrote: > > Ingo had a cool patch to estimate dirty => dirty cacheline transfer latency > > ... Unfortunately ... and it is an O(cpus^2) operation. > > Yes - a cool patch. > > If we had an arch-specific bit of code, that for any two cpus, could >

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Ingo Molnar
* Paul Jackson <[EMAIL PROTECTED]> wrote: > Nick wrote: > > Ingo had a cool patch to estimate dirty => dirty cacheline transfer latency > > ... Unfortunately ... and it is an O(cpus^2) operation. > > Yes - a cool patch. before we get into complexities, i'd like to see whether it solves Ken's p

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Nick Piggin
On Thu, 2005-03-31 at 22:05 -0800, Paul Jackson wrote: > > Then us poor slobs with big honkin numa iron could code up a real > pseudo_distance() routine, to avoid the actual pain of doing real work > for cpus^2 iterations for large cpu counts. > > Our big boxes have regular geometries with much s

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Paul Jackson
Nick wrote: > Ingo had a cool patch to estimate dirty => dirty cacheline transfer latency > ... Unfortunately ... and it is an O(cpus^2) operation. Yes - a cool patch. If we had an arch-specific bit of code, that for any two cpus, could give a 'pseudo-distance' between them, where the only real r

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Chen, Kenneth W
Ingo Molnar wrote on Thursday, March 31, 2005 8:52 PM > the current scheduler queue in -mm has some experimental bits as well > which will reduce the amount of balancing. But we cannot just merge them > an bloc right now, there's been too much back and forth in recent > kernels. The safe-to-merge-f

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > The low point in 2.6.11 could very well be the change in the > scheduler. It does too many load balancing in the wake up path and > possibly made a lot of unwise decision. For example, in > try_to_wake_up(), it will try SD_WAKE_AFFINE for task th

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Nick Piggin
Chen, Kenneth W wrote: Linus Torvalds wrote on Thursday, March 31, 2005 12:09 PM Btw, I realize that you can't give good oprofiles for the user-mode components, but a kernel profile with even just single "time spent in user mode" datapoint would be good, since a kernel scheduling problem might just

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Chen, Kenneth W
Linus Torvalds wrote on Thursday, March 31, 2005 12:09 PM > Btw, I realize that you can't give good oprofiles for the user-mode > components, but a kernel profile with even just single "time spent in user > mode" datapoint would be good, since a kernel scheduling problem might > just make caches wo

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Linus Torvalds
On Thu, 31 Mar 2005, Linus Torvalds wrote: > > Can you post oprofile data for a run? Btw, I realize that you can't give good oprofiles for the user-mode components, but a kernel profile with even just single "time spent in user mode" datapoint would be good, since a kernel scheduling problem mi

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Linus Torvalds
On Thu, 31 Mar 2005, Chen, Kenneth W wrote: > > No, there are no idle time on the system. If system become I/O bound, we > would do everything we can to remove that bottleneck, i.e., throw a couple > hundred GB of memory to the system, or add a couple hundred disk drives, > etc. Believe it or n

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Chen, Kenneth W
Ingo Molnar wrote on Thursday, March 31, 2005 6:15 AM > is there any idle time on the system, in steady state (it's a sign of > under-balancing)? Idle balancing (and wakeup balancing) is one of the > things that got tuned back and forth a lot. Also, do you know what the > total number of context-swi

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-31 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > > If it is doing a lot of mapping/unmapping (or fork/exit), then that > > might explain why 2.6.11 is worse. > > > > Fortunately there are more patches to improve this on the way. > > Once benchmark reaches steady state, there is no mapping/unmappin

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-29 Thread Nick Piggin
Chen, Kenneth W wrote: Nick Piggin wrote on Tuesday, March 29, 2005 5:32 PM If it is doing a lot of mapping/unmapping (or fork/exit), then that might explain why 2.6.11 is worse. Fortunately there are more patches to improve this on the way. Once benchmark reaches steady state, there is no mapping

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-29 Thread Chen, Kenneth W
Nick Piggin wrote on Tuesday, March 29, 2005 5:32 PM > If it is doing a lot of mapping/unmapping (or fork/exit), then that > might explain why 2.6.11 is worse. > > Fortunately there are more patches to improve this on the way. Once benchmark reaches steady state, there is no mapping/unmapping goin

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-29 Thread Nick Piggin
Linus Torvalds wrote: On Tue, 29 Mar 2005, Chen, Kenneth W wrote: Linus Torvalds wrote on Tuesday, March 29, 2005 4:00 PM The fact that it seems to fluctuate pretty wildly makes me wonder how stable the numbers are. I can't resist myself from bragging. The high point in the fluctuation might be bec

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-29 Thread Linus Torvalds
On Tue, 29 Mar 2005, Chen, Kenneth W wrote: > > Linus Torvalds wrote on Tuesday, March 29, 2005 4:00 PM > > The fact that it seems to fluctuate pretty wildly makes me wonder > > how stable the numbers are. > > I can't resist myself from bragging. The high point in the fluctuation > might be beca

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-29 Thread Chen, Kenneth W
Linus Torvalds wrote on Tuesday, March 29, 2005 4:00 PM > The fact that it seems to fluctuate pretty wildly makes me wonder > how stable the numbers are. I can't resist myself from bragging. The high point in the fluctuation might be because someone is working hard trying to make 2.6 kernel run fa

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-29 Thread Chen, Kenneth W
On Mon, 28 Mar 2005, Chen, Kenneth W wrote: > With that said, here goes our first data point along with some historical data > we have collected so far. > > 2.6.11 -13% > 2.6.9 - 6% > 2.6.8 -23% > 2.6.2 - 1% > baseline (rhel3) Linus Torvalds wrote on Tuesday, Ma

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-29 Thread Linus Torvalds
On Mon, 28 Mar 2005, Chen, Kenneth W wrote: > > With that said, here goes our first data point along with some historical data > we have collected so far. > > 2.6.11 -13% > 2.6.9 - 6% > 2.6.8 -23% > 2.6.2 - 1% > baseline (rhel3) How repeatable are the number

RE: Industry db benchmark result on recent 2.6 kernels

2005-03-28 Thread Chen, Kenneth W
On Mon, 2005-03-28 at 11:33 -0800, Chen, Kenneth W wrote: > We will be taking db benchmark measurements more frequently from now on with > latest kernel from kernel.org (and make these measurements on a fixed > interval). > By doing this, I hope to achieve two things: one is to track base kernel >

Re: Industry db benchmark result on recent 2.6 kernels

2005-03-28 Thread Dave Hansen
On Mon, 2005-03-28 at 11:33 -0800, Chen, Kenneth W wrote: > We will be taking db benchmark measurements more frequently from now on with > latest kernel from kernel.org (and make these measurements on a fixed > interval). > By doing this, I hope to achieve two things: one is to track base kernel >