Re: [9fans] interesting timing tests

2010-06-22 Thread erik quanstrom
> Do you have a way to turn off one of the sockets on "c" (2 x E5540) and get
> the numbers with HT (8 processors) and without HT (4 processors)? It would
> also be interesting to see "c" with HT turned off.

here's the progression

    4    4.41u 1.83s 4.06r    0. %ilock
    8    4.47u 2 …

Re: [9fans] interesting timing tests

2010-06-21 Thread Lawrence E. Bakst
Do you have a way to turn off one of the sockets on "c" (2 x E5540) and
get the numbers with HT (8 processors) and without HT (4 processors)?
It would also be interesting to see "c" with HT turned off.

Certainly it seems to me that idlehands needs to be fixed; your bit
array "active.schedwait" …
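[A purely illustrative sketch of the kind of fix being hinted at here:
record which cpus have halted in the scheduler in a bitmask, and when a
process becomes runnable kick one of them with an interrupt instead of
leaving idle cpus spinning.  schedwait, mach_id(), send_wakeup_ipi() and
halt() are assumed names, not Plan 9 interfaces, and a real version must
close the race between the final run-queue check and the halt.]

#include <stdatomic.h>

static atomic_ulong schedwait;              /* bit i set: cpu i has halted in the scheduler */

extern unsigned mach_id(void);              /* assumed: number of the current cpu */
extern void send_wakeup_ipi(unsigned cpu);  /* assumed: interrupt a halted cpu */
extern void halt(void);                     /* assumed: sleep until the next interrupt */

void
idlehands(void)
{
    unsigned long me = 1ul << mach_id();

    atomic_fetch_or(&schedwait, me);        /* advertise that this cpu is going to sleep */
    /* a real version must recheck the run queue here, or a wakeup that
     * arrives between the check and the halt can be lost */
    halt();
    atomic_fetch_and(&schedwait, ~me);
}

void
ready_wakeup(void)                          /* call after making a proc runnable */
{
    unsigned long w = atomic_load(&schedwait);

    if(w != 0)
        send_wakeup_ipi(__builtin_ctzl(w)); /* kick the lowest-numbered sleeper */
}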

Re: [9fans] interesting timing tests

2010-06-21 Thread erik quanstrom
> Is there a way to check this?
>
> Is there a way to completely shut off N processors and
> measure benchmark speed slow down as function of processor?

there hasn't been any performance impact measured.  however, the
extreme system time still seems weird.  richard miller suggested that
kprof might …

Re: [9fans] interesting timing tests

2010-06-21 Thread Bakul Shah
On Mon, 21 Jun 2010 17:21:36 EDT erik quanstrom wrote:
> > > note the extreme system time on the 16 processor machine
> >
> > Could this be due to memory contention caused by spinlocks?
> > While locks are spinning they eat up memory bandwidth which
> > slows down everyone's memory accesses (inc…

Re: [9fans] interesting timing tests

2010-06-21 Thread erik quanstrom
> > note the extreme system time on the 16 processor machine
>
> Could this be due to memory contention caused by spinlocks?
> While locks are spinning they eat up memory bandwidth which
> slows down everyone's memory accesses (including the one who
> is trying to finish its work while holding the …

Re: [9fans] interesting timing tests

2010-06-21 Thread Bakul Shah
On Fri, 18 Jun 2010 19:26:25 EDT erik quanstrom wrote:
> note the extreme system time on the 16 processor machine

Could this be due to memory contention caused by spinlocks?  While
locks are spinning they eat up memory bandwidth which slows down
everyone's memory accesses (including the one who …
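[The usual mitigation for the bandwidth problem described here is to
spin on a plain read and only attempt the bus-locking operation once
the lock looks free.  A minimal sketch in portable C11 with illustrative
names — this is not Plan 9's lock.c:]

#include <stdatomic.h>

typedef struct { atomic_int held; } Spinlock;   /* e.g. Spinlock l = { 0 }; */

static inline void
cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("pause");
#endif
}

void
spin_lock(Spinlock *l)
{
    for(;;){
        /* read-only spin: each waiter hits its own cached copy */
        while(atomic_load_explicit(&l->held, memory_order_relaxed))
            cpu_relax();
        /* one bus-locking exchange only when the lock looks free */
        if(!atomic_exchange_explicit(&l->held, 1, memory_order_acquire))
            return;
    }
}

void
spin_unlock(Spinlock *l)
{
    atomic_store_explicit(&l->held, 0, memory_order_release);
}

[While the lock is held, every waiter spins on its own cache line, so
the holder and everyone else's memory traffic are not fighting a stream
of locked read-modify-writes.]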

Re: [9fans] interesting timing tests

2010-06-21 Thread Venkatesh Srinivas
On Mon, Jun 21, 2010 at 10:40 AM, erik quanstrom wrote:
> void
> lock(ulong *l)
> {
>     ulong old;
>     ushort next, owner;
>
>     old = _xadd(l, 1);
>     for(;;){
>         next = old;
>         owner = old>>16;
>         old = *l;
>         if(next == …

Re: [9fans] interesting timing tests

2010-06-21 Thread erik quanstrom
On Mon Jun 21 10:51:30 EDT 2010, quans...@quanstro.net wrote:
> void
> lock(ulong *l)

somehow lost was an observation that since lock is only testing that
next == owner, and that both are based on the current state of *l, i
don't see how this is robust in the face of more than one mach spinning.

Re: [9fans] interesting timing tests

2010-06-21 Thread erik quanstrom
void
lock(ulong *l)
{
    ulong old;
    ushort next, owner;

    old = _xadd(l, 1);
    for(;;){
        next = old;
        owner = old>>16;
        old = *l;
        if(next == owner)
            break;
    }
}

void
unlock(ulong *l …
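[For contrast, a minimal sketch of the conventional ticket-lock shape
this code appears to be aiming for, in plain C11 rather than Plan 9 C;
all names are illustrative, not kernel source.  The key difference is
that each waiter keeps the ticket handed out by the fetch-and-add and
re-reads only the owner half, so two spinning machs can never both
decide it is their turn — the robustness problem raised in the
follow-up above, where next is recomputed from *l on every pass.]

#include <stdatomic.h>

typedef struct {
    atomic_ushort next;     /* next ticket to hand out */
    atomic_ushort owner;    /* ticket currently being served */
} Ticketlock;               /* e.g. Ticketlock l = { 0, 0 }; */

void
ticket_lock(Ticketlock *l)
{
    /* take a ticket exactly once; it never changes while we wait */
    unsigned short my = atomic_fetch_add_explicit(&l->next, 1,
        memory_order_relaxed);

    /* spin reading only the owner field until it reaches our ticket */
    while(atomic_load_explicit(&l->owner, memory_order_acquire) != my)
        ;   /* a pause/backoff hint would go here */
}

void
ticket_unlock(Ticketlock *l)
{
    /* advance "now serving"; exactly one waiter sees its ticket come up */
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}

[Because only equality is tested, 16-bit wraparound is harmless as long
as fewer than 65536 machs ever wait on the same lock at once.]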

Re: [9fans] interesting timing tests

2010-06-20 Thread Venkatesh Srinivas
knowing which locks would rock.  i imagine the easiest way to find out
would be to modify lock() to bump a per-lock ctr on failure-to-acquire.
on i386, lock add would be the easiest way to do that, i think.  add an
'inited' field to the spinlock and a list linkage as well, to allow for
easy examination …
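[A rough sketch of the instrumentation described here, again in
portable C11 with made-up names (Statlock, nspin, locklist) rather than
the real Plan 9 Lock: every failed acquire attempt bumps a per-lock
counter, and the first time a lock is contended it links itself onto a
global list so a debug dump can walk all locks and report the hot ones.]

#include <stdatomic.h>
#include <stddef.h>

typedef struct Statlock Statlock;
struct Statlock {
    atomic_int  held;
    atomic_long nspin;      /* bumped on every failed acquire attempt */
    atomic_int  inited;     /* nonzero once linked onto the global list */
    Statlock    *next;      /* list linkage, for easy examination */
    const char  *name;
};

static Statlock *_Atomic locklist;  /* head of the list of contended locks */

static void
register_lock(Statlock *l)
{
    int zero = 0;
    if(!atomic_compare_exchange_strong(&l->inited, &zero, 1))
        return;             /* already on the list */
    l->next = atomic_load(&locklist);
    while(!atomic_compare_exchange_weak(&locklist, &l->next, l))
        ;                   /* retry with the updated head left in l->next */
}

void
statlock_lock(Statlock *l)
{
    while(atomic_exchange_explicit(&l->held, 1, memory_order_acquire)){
        atomic_fetch_add_explicit(&l->nspin, 1, memory_order_relaxed);
        register_lock(l);
        while(atomic_load_explicit(&l->held, memory_order_relaxed))
            ;               /* spin quietly until it looks free again */
    }
}

void
statlock_unlock(Statlock *l)
{
    atomic_store_explicit(&l->held, 0, memory_order_release);
}

void
statlock_dump(void (*print)(const char*, long))
{
    for(Statlock *l = atomic_load(&locklist); l != NULL; l = l->next)
        print(l->name, atomic_load(&l->nspin));
}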

Re: [9fans] interesting timing tests

2010-06-20 Thread erik quanstrom
oops.  botched fix of harmless warning.  corrected source attached.

just for a giggle, i ran this test on a few handy machines to get a
feel for relative speed of a single core.  since this test is small
enough to fit in the tiniest cache, i would think that memory speed or
any other external factor …
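[The corrected source is an attachment and does not survive in this
archive.  Purely as an illustration of the kind of test being described
— a tight, cache-resident loop timed on a single core — something like
the following; none of it is the attached code.]

#include <stdio.h>
#include <time.h>

int
main(void)
{
    enum { N = 200*1000*1000 };
    volatile unsigned x = 0;
    clock_t t0 = clock();

    /* tiny working set: the loop and its state stay in L1 */
    for(unsigned i = 0; i < N; i++)
        x += i ^ (x >> 3);

    printf("x=%u  %.2fs cpu\n", x, (double)(clock() - t0) / CLOCKS_PER_SEC);
    return 0;
}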

Re: [9fans] interesting timing tests

2010-06-20 Thread erik quanstrom
> > yet for one machine conf.nmach == 4 and for the
> > other conf.nmach == 16; neither is calling halt.
>
> Hypothesis: with four processors there's enough work to keep all
> the cpus busy.  With sixteen processors you're getting i/o bound
> (where's the filesystem coming from?) so some of the cp…

Re: [9fans] interesting timing tests

2010-06-20 Thread Richard Miller
> yet for one machine conf.nmach == 4 and for the
> other conf.nmach == 16; neither is calling halt.

Hypothesis: with four processors there's enough work to keep all the
cpus busy.  With sixteen processors you're getting i/o bound (where's
the filesystem coming from?) so some of the cpus are idlin…

Re: [9fans] interesting timing tests

2010-06-20 Thread erik quanstrom
> Spin locks would have been high on my list of suspects.

mine, too.  the 64 bit question is, which spin locks.

> > i'm less sure that runproc is really using 62% of the cpu
>
> Not impossible, given this:
>
> Proc*
> runproc(void)
> {
>     ...
>         /* waste time or hal…

Re: [9fans] interesting timing tests

2010-06-20 Thread Richard Miller
> in any event, i was suspecting that ilock
> would be a big loser as nproc goes up, and it does appear to
> be.

Spin locks would have been high on my list of suspects.

> i'm less sure that runproc is really using 62% of the cpu

Not impossible, given this:

Proc*
runproc(void)
{ …
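[For readers without the kernel source at hand, a sketch of the pattern
that "waste time or halt" comment points at, in generic C with assumed
names (dequeue_runnable(), idlehands_enabled, cpu_halt()) rather than
the actual Plan 9 runproc: a cpu that finds nothing to run either halts
until the next interrupt or keeps rescanning the run queue, and it is
that busy rescan which gets charged as system time on a mostly idle
many-processor machine.]

struct proc;
struct proc *dequeue_runnable(void);    /* assumed: NULL when nothing is runnable */
extern int idlehands_enabled;           /* assumed policy switch */

static inline void
cpu_halt(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("sti; hlt" ::: "memory");  /* kernel context: sleep until an interrupt */
#endif
}

struct proc *
pick_next(void)
{
    struct proc *p;

    for(;;){
        p = dequeue_runnable();
        if(p != 0)
            return p;
        if(idlehands_enabled)
            cpu_halt();     /* idle cpu sleeps; costs only wakeup latency */
        /* otherwise fall through and rescan immediately: this spinning
         * is what shows up as system time when most cpus are idle */
    }
}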

Re: [9fans] interesting timing tests

2010-06-19 Thread erik quanstrom
On Sat Jun 19 09:44:25 EDT 2010, 9f...@hamnavoe.com wrote:
> > note the extreme system time on the 16 processor machine
>
> kprof(3)

i'm not sure i completely trust kprof these days.  there seems to be a
lot of sampling error.  the last time i tried to use it to get timing
on esp, encryption didn…

Re: [9fans] interesting timing tests

2010-06-19 Thread Richard Miller
> note the extreme system time on the 16 processor machine

kprof(3)

[9fans] interesting timing tests

2010-06-18 Thread erik quanstrom
note the extreme system time on the 16 processor machine

a    2  * Intel(R) Xeon(R) CPU 5120  @ 1.86GHz
b    4  * Intel(R) Xeon(R) CPU E5630 @ 2.53GHz
c    16 * Intel(R) Xeon(R) CPU E5540 @ 2.53GHz

# libsec
a; objtype=arm time mk >/dev/null
0.44u 0.63s 0.94r…