Hi,

> To discover possible locking limitations to scalability, I have collected 
> locking statistics on a 2-way, 4-way, and 8-way performing as networked
> database servers.  I patched the 2-, 4-, and 8-way kernels with Kravetz's multiqueue 
> patch in the hope that mitigating runqueue_lock contention might better 
> reveal other lock contention.

...

>       24.38%  23.93%    15us(   218us)   4.3us(   111us)     744475     566289     178186      0  runqueue_lock
>       23.15%  38.78%    28us(   218us)   6.2us(   108us)     376292     230381     145911      0    schedule+0xe0

Tridge and I tried out the postgresql benchmark you used here, and this
contention is due to a bug in postgres. From a quick strace, we found the
threads making a load of select(0, NULL, NULL, NULL, {0,0}) calls; basically
all the threads are pounding on schedule().

Our guess is that the app has some form of userspace synchronisation
(semaphores/spinlocks); the sketch below shows the sort of pattern we
suspect. I'd argue that the app needs to be fixed, not the kernel, or
else a more valid test case should be put forward. :)
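To make that concrete, here is a minimal sketch of the pattern the strace
points at. This is not the actual postgres s_lock() code; the function names
are made up and the test-and-set uses a GCC __sync builtin for brevity:

/* Minimal sketch, not the real s_lock(): a userspace test-and-set
 * spinlock that "backs off" via select() with a zero timeout. */
#include <sys/select.h>
#include <sys/time.h>

typedef volatile int slock_t;

static void spin_acquire(slock_t *lock)
{
	/* __sync_lock_test_and_set() returns the old value; non-zero
	 * means the lock is already held by someone else. */
	while (__sync_lock_test_and_set(lock, 1)) {
		/* {0,0} timeout: select() returns immediately, but each
		 * iteration still enters the kernel and runs through
		 * schedule(), which is what shows up in the stats above. */
		struct timeval tv = { 0, 0 };
		select(0, NULL, NULL, NULL, &tv);
	}
}

static void spin_release(slock_t *lock)
{
	__sync_lock_release(lock);
}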

PS: I just looked at the postgresql source and the spinlocks (s_lock() etc.)
are in a tight loop doing select(0, NULL, NULL, NULL, {0,0}). In Samba
we have userspace spinlocks, but they cover small amounts of code and
offer an advantage over IPC semaphores. When you have to synchronise
large sections of code, IPC semaphores are reasonably fast on Linux and
would be a better fit; a rough sketch of that approach is below.
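A minimal sketch of the SysV IPC semaphore approach, used as a mutex around
a large critical section (the key is a made-up example value and error
handling is trimmed):

/* Minimal sketch: SysV IPC semaphore as a mutex for a big critical section. */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <stdio.h>

/* Linux requires the caller to define this for semctl(SETVAL). */
union semun { int val; struct semid_ds *buf; unsigned short *array; };

int main(void)
{
	/* one semaphore, created if needed, initialised to 1 (unlocked) */
	int semid = semget(0x5ca1ab1e, 1, IPC_CREAT | 0600);
	union semun arg = { .val = 1 };
	struct sembuf lock   = { 0, -1, SEM_UNDO };	/* P: wait + decrement */
	struct sembuf unlock = { 0, +1, SEM_UNDO };	/* V: increment */

	if (semid < 0 || semctl(semid, 0, SETVAL, arg) < 0) {
		perror("sem setup");
		return 1;
	}

	semop(semid, &lock, 1);		/* sleeps in the kernel if contended */
	/* ... large critical section ... */
	semop(semid, &unlock, 1);	/* wakes one waiter, if any */

	return 0;
}

The point is that a contended semop() waiter sleeps until the holder releases
the semaphore, instead of burning CPU bouncing in and out of schedule() the
way the select()-based spin does.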

Cheers,
Anton