Re: [PATCH] Re: Negative scalability by removal of

2000-11-20 Thread lamont
there's already the Linux Scalability Project's wake_one() patch for 2.2.9 (which applies fine to 2.2.18preX): http://www.citi.umich.edu/projects/linux-scalability/patches/p_accept-2.2.9.diff

Re: [PATCH] Re: Negative scalability by removal of

2000-11-07 Thread Alan Cox
> Anyway, version 2 below uses LIFO for the accept() wakeups. This
> appears to be a 5%-10% win for Apache. The browsing loop for
> exclusive tasks will now pull in cachelines 0 and 2, rather
> than the previous 0 and 1.

That makes it much worse for the newest cpus which use 64byte lines

Re: [PATCH] Re: Negative scalability by removal of

2000-11-07 Thread Andrew Morton
Linus Torvalds wrote:
>
> On Tue, 7 Nov 2000, Andrew Morton wrote:
>
> > Alan Cox wrote:
> > >
> > > > Even 2.2.x can be fixed to do the wake-one for accept(), if required.
> > >
> > > Do we really want to retrofit wake_one to 2.2. I know Im not terribly keen to
> > > try and backport all the

Re: [PATCH] Re: Negative scalability by removal of

2000-11-07 Thread dean gaudet
haha, ok! :) (well i'm sure you know the history, but for others -- that code entered apache not specifically for linux... but specifically for handling the many early-to-mid 90s unixes that just plain broke on multiple accept :) -dean On Mon, 6 Nov 2000, David S. Miller wrote: >Date:

Re: [PATCH] Re: Negative scalability by removal of

2000-11-06 Thread David S. Miller
   Date: Mon, 6 Nov 2000 21:23:57 -0800 (PST)
   From: dean gaudet <[EMAIL PROTECTED]>

   apache is about correctness first, and performance second.

Which is why we say it is "incorrect" for apache to try and work around kernel performance problems. :-)))

Later,
David S. Miller
[EMAIL

Re: [PATCH] Re: Negative scalability by removal of

2000-11-06 Thread dean gaudet
On Mon, 6 Nov 2000, Linus Torvalds wrote:
> This is why I'd love to _not_ see silly work-arounds in apache

hey, maybe it's time for me to repeat something that i'm often quoted as saying: apache is about correctness first, and performance second. i don't think that's silly personally.

Re: [PATCH] Re: Negative scalability by removal of

2000-11-06 Thread Linus Torvalds
On Tue, 7 Nov 2000, Andrew Morton wrote:
> Alan Cox wrote:
> >
> > > Even 2.2.x can be fixed to do the wake-one for accept(), if required.
> >
> > Do we really want to retrofit wake_one to 2.2. I know Im not terribly keen to
> > try and backport all the mechanism. I think for 2.2 using the

Re: [PATCH] Re: Negative scalability by removal of

2000-11-06 Thread Alan Cox
> It's a 16-liner! I'll cheerfully admit that this patch
> may be completely broken, but hey, it's free. I suggest
> that _something_ has to be done for 2.2 now, because
> Apache has switched to unserialised accept().

Interesting

> The fact that the throughput is 3-4 time worse for 2, 3, 4

Re: [PATCH] Re: Negative scalability by removal of

2000-11-06 Thread Andrea Arcangeli
On Tue, Nov 07, 2000 at 01:27:07AM +1100, Andrew Morton wrote:
> context. For networking, where it is called from softirq context
> it is O(N).

Yes, the heuristic that runs from irqs has an O(N) worst case in wakeup. But the current implementation is silly because the best case could run

Re: [PATCH] Re: Negative scalability by removal of

2000-11-06 Thread Andrew Morton
Alan Cox wrote:
>
> > Even 2.2.x can be fixed to do the wake-one for accept(), if required.
>
> Do we really want to retrofit wake_one to 2.2. I know Im not terribly keen to
> try and backport all the mechanism. I think for 2.2 using the semaphore is a
> good approach. Its a hack to fix an old

Re: [PATCH] Re: Negative scalability by removal of

2000-11-06 Thread Andrew Morton
Andrea Arcangeli wrote:
>
> On Sat, Nov 04, 2000 at 09:22:58AM -0800, Linus Torvalds wrote:
> > We don't need to backport of the full exclusive wait queues: we could do
> > the equivalent of the semaphore inside the kernel around just accept(). It
> > wouldn't be a generic thing, but it would

Re: [PATCH] Re: Negative scalability by removal of

2000-11-05 Thread Alan Cox
> oh, someone reminded me of the other reason sysvsems suck: a cgi can grab
> the semaphore and hold it, causing a DoS. of course folks could, and
> should use suexec/cgiwrap to avoid this.

The same cgi can killall -STOP httpd

Re: [PATCH] Re: Negative scalability by removal of

2000-11-05 Thread dean gaudet
the numbers didn't look that bad for the small numbers of concurrent clients on 2.2... a few % slower without the serialisation. compared to orders of magnitude slower with large numbers of concurrent client. oh, someone reminded me of the other reason sysvsems suck: a cgi can grab the

Re: [PATCH] Re: Negative scalability by removal of

2000-11-05 Thread Andrea Arcangeli
On Sat, Nov 04, 2000 at 09:22:58AM -0800, Linus Torvalds wrote:
> We don't need to backport of the full exclusive wait queues: we could do
> the equivalent of the semaphore inside the kernel around just accept(). It
> wouldn't be a generic thing, but it would fix the specific case of
> accept().

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange

2000-11-04 Thread dean gaudet
On Sat, 4 Nov 2000, Alan Cox wrote:
> > sysv semaphores have a very unfortunate negative feature -- if the admin
> > kill -9's the server (impatient admins do this all the time) then you end
> > up leaving a semaphore lying around. sysvsem don't have the usual unix
>
> Umm they have SEM_UNDO.

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9)

2000-11-04 Thread Dave Wagner
Linus Torvalds wrote:
>
> No.
>
> Please use unserialized accept() _always_, because we can fix that.
>
> Even 2.2.x can be fixed to do the wake-one for accept(), if required.
> It's not going to be any worse than the current apache config, and
> basically the less games apache plays, the better

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange

2000-11-04 Thread Alan Cox
> sysv semaphores have a very unfortunate negative feature -- if the admin
> kill -9's the server (impatient admins do this all the time) then you end
> up leaving a semaphore lying around. sysvsem don't have the usual unix

Umm they have SEM_UNDO. Its a case of deeper magic
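For readers who have not used SEM_UNDO, here is a minimal sketch of a SysV semaphore used as an accept() mutex. It is not Apache's code: the `accept_mutex_*` names are hypothetical, and the `union semun` definition assumes Linux/glibc, where the caller has to declare it. SEM_UNDO makes the kernel reverse a process's outstanding adjustment when that process exits, so a holder that is killed does not leave the lock taken. The semaphore set itself still exists until someone removes it with IPC_RMID.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: Linux/glibc, where the caller must define union semun. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore, initialised to 1 (unlocked). */
int accept_mutex_create(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    union semun arg;
    arg.val = 1;
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* Adjust semaphore 0 by delta; SEM_UNDO asks the kernel to undo the
 * adjustment if this process terminates without reversing it. */
static int accept_mutex_op(int semid, short delta)
{
    assert(delta == 1 || delta == -1);
    struct sembuf op;
    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = SEM_UNDO;
    return semop(semid, &op, 1);
}

int accept_mutex_lock(int semid)    { return accept_mutex_op(semid, -1); }
int accept_mutex_unlock(int semid)  { return accept_mutex_op(semid, 1); }

/* Remove the set; without this it outlives every process that used it. */
int accept_mutex_destroy(int semid) { return semctl(semid, 0, IPC_RMID); }
```

A server would call accept_mutex_lock() before accept() and accept_mutex_unlock() right after it, and the parent would call accept_mutex_destroy() on clean shutdown.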

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange

2000-11-04 Thread Alan Cox
> > Instead, if apache had just done the thing it wanted to do in the first
> > place, the wake-one accept() semantics would have happened a hell of a
> > lot earlier.
>
> counter-example: freebsd had wake-one semantics a few years before linux.

And Im sure apache authors can use the utsname()

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9)

2000-11-04 Thread dean gaudet
On Sat, 4 Nov 2000, Andrew Morton wrote:
> Dean,
>
> neither flock() nor fcntl() serialisation are effective
> on linux 2.2 or linux 2.4.

i have to admit the last time i timed any of the methods on linux was in 2.0.x days. thanks for the updated data!

> For kernel 2.2 I recommend that Apache

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9)

2000-11-04 Thread dean gaudet
On Fri, 3 Nov 2000, Linus Torvalds wrote:
> Please use unserialized accept() _always_, because we can fix that.

i can unserialise the single socket case, but the multiple socket case is not so simple. the executive summary is that when you've got multiple sockets you have to use select().
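To make the multiple-socket case concrete, here is a rough sketch of the select()-then-accept() pattern. It is an illustration only, not Apache's implementation: `make_listener` and `accept_any` are hypothetical names, and the listeners are bound to loopback purely to keep the example self-contained. The listeners are non-blocking because every process sleeping in select() wakes when a connection arrives, and the ones that lose the race must get EAGAIN from accept() rather than block on one socket while another has work waiting.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking TCP listener on 127.0.0.1:port
 * (port 0 lets the kernel choose a free port). */
int make_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 ||
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block in select() until some listener is readable, then try accept()
 * on each ready one. Returns a connected fd, or -1 on a hard error.
 * Assumes every descriptor is below FD_SETSIZE. */
int accept_any(const int *listeners, int n)
{
    assert(listeners != NULL && n > 0);
    for (;;) {
        fd_set rfds;
        int maxfd = -1;
        FD_ZERO(&rfds);
        for (int i = 0; i < n; i++) {
            FD_SET(listeners[i], &rfds);
            if (listeners[i] > maxfd)
                maxfd = listeners[i];
        }
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (int i = 0; i < n; i++) {
            if (!FD_ISSET(listeners[i], &rfds))
                continue;
            int conn = accept(listeners[i], NULL, NULL);
            if (conn >= 0)
                return conn;
            /* Another process took the connection, or it went away:
             * try the next ready listener, then go back to select(). */
            if (errno != EAGAIN && errno != EWOULDBLOCK &&
                errno != EINTR && errno != ECONNABORTED)
                return -1;
        }
    }
}
```

With N children all looping in accept_any(), each new connection wakes every child out of select(), which is why this path is usually still wrapped in some form of serialisation.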

Re: [PATCH] Re: Negative scalability by removal of

2000-11-04 Thread Linus Torvalds
On Sat, 4 Nov 2000, Alan Cox wrote:
>
> > Even 2.2.x can be fixed to do the wake-one for accept(), if required.
>
> Do we really want to retrofit wake_one to 2.2. I know Im not terribly keen to
> try and backport all the mechanism. I think for 2.2 using the semaphore is a
> good approach.

Re: [PATCH] Re: Negative scalability by removal of

2000-11-04 Thread Alan Cox
> Even 2.2.x can be fixed to do the wake-one for accept(), if required.

Do we really want to retrofit wake_one to 2.2. I know Im not terribly keen to try and backport all the mechanism. I think for 2.2 using the semaphore is a good approach. Its a hack to fix an old OS kernel. For 2.4 its not

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9)

2000-11-03 Thread Linus Torvalds
In article <[EMAIL PROTECTED]>, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
>neither flock() nor fcntl() serialisation are effective
>on linux 2.2 or linux 2.4. This is because the file
>locking code still wakes up _all_ waiters. In my testing
>with fcntl serialisation I have seen a single
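For context, fcntl() serialisation means that every child takes a blocking whole-file write lock before calling accept(). The sketch below shows the general shape under stated assumptions: the function names and the lock-file path are made up for illustration, and this is not the Apache source. Every idle child sleeps in F_SETLKW on the same lock, so a lock implementation that wakes all waiters on each unlock produces the thundering herd described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) the shared lock file. */
int accept_lock_open(const char *path)
{
    assert(path != NULL);
    return open(path, O_CREAT | O_RDWR, 0600);
}

/* Apply F_WRLCK or F_UNLCK to the whole file, retrying on EINTR. */
static int accept_lock_op(int lock_fd, short type)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);   /* l_start = 0, l_len = 0: whole file */
    fl.l_type = type;
    fl.l_whence = SEEK_SET;

    int rc;
    do {
        rc = fcntl(lock_fd, F_SETLKW, &fl);   /* blocks until granted */
    } while (rc < 0 && errno == EINTR);
    return rc;
}

int accept_lock(int lock_fd)   { return accept_lock_op(lock_fd, F_WRLCK); }
int accept_unlock(int lock_fd) { return accept_lock_op(lock_fd, F_UNLCK); }

/* Only the lock holder sits in accept(); the other children wait on the lock. */
int serialised_accept(int lock_fd, int listen_fd)
{
    if (accept_lock(lock_fd) < 0)
        return -1;
    int conn = accept(listen_fd, NULL, NULL);
    int saved_errno = errno;
    accept_unlock(lock_fd);
    errno = saved_errno;
    return conn;
}
```

Each child would open the lock file once at startup, e.g. `int lock_fd = accept_lock_open("/tmp/accept.lock");`, and then loop on serialised_accept().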

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:Strange performance behavior of 2.4.0-test9)

2000-11-03 Thread Andrew Morton
dean gaudet wrote:
>
> On Tue, 31 Oct 2000, Andrew Morton wrote:
>
> > Dean, it looks like the same problem will occur with flock()-based
> > serialisation. Does Apache/Linux ever use that option?
>
> from apache/src/include/ap_config.h in the linux section there's
> this:
>
> /* flock is

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9)

2000-11-03 Thread Andrew Morton
[EMAIL PROTECTED] wrote:
>
> Andrew Morton writes:
> > This patch is a moderate rewrite of __wake_up_common. I'd be
> > interested in seeing how much difference it makes to the
> > performance of Apache when the file-locking serialisation is
> > disabled.
> > - It implements

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:

2000-10-30 Thread Andrea Arcangeli
On Mon, Oct 30, 2000 at 02:36:39PM -0200, Rik van Riel wrote:
> For stuff like ___wait_on_page(), OTOH, you really want FIFO
> wakeup to avoid starvation (yes, I know we're currently doing

Sure agreed. In my _whole_ previous email I was only talking about accept. Semaphores file locking etc..

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was:

2000-10-29 Thread Andi Kleen
On Sun, Oct 29, 2000 at 11:45:49AM -0800, dean gaudet wrote:
> On Sat, 28 Oct 2000, Alan Cox wrote:
>
> > > The big question is: why is Apache using file locking so
> > > much? Is this normal behaviour for Apache?
> >
> > Apache uses file locking to serialize accept on hosts where accept

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9)

2000-10-28 Thread Andrew Morton
Andrew Morton wrote:
>
> I think it's more expedient at this time to convert
> acquire_fl_sem/release_fl_sem into lock_kernel/unlock_kernel
> (so we _can_ sleep) and to fix the above alleged deadlock
> via the creation of __posix_unblock_lock()

I agree with me. Could you please test the

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9)

2000-10-28 Thread Jeff Garzik
Andrew Morton wrote:
> --- linux-2.4.0-test10-pre5/fs/locks.c Tue Oct 24 21:34:13 2000
> +++ linux-akpm/fs/locks.c Sun Oct 29 02:31:10 2000
> @@ -125,10 +125,9 @@
>  #include <asm/semaphore.h>
>  #include <asm/uaccess.h>
>
> -DECLARE_MUTEX(file_lock_sem);
> -
> -#define acquire_fl_sem() down(&file_lock_sem)
> -#define

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9)

2000-10-28 Thread Andi Kleen
On Sun, Oct 29, 2000 at 02:46:14AM +1100, Andrew Morton wrote:
> [EMAIL PROTECTED] wrote:
> >
> > Change the following two macros:
> > acquire_fl_sem()->lock_kernel()
> > release_fl_sem()->unlock_kernel()
> > then
> > 5192 Req/s @8cpu is got. It is same as test8 within

Re: [PATCH] Re: Negative scalability by removal of lock_kernel()?(Was: Strange performance behavior of 2.4.0-test9)

2000-10-28 Thread Andrew Morton
[EMAIL PROTECTED] wrote:
>
> Change the following two macros:
> acquire_fl_sem()->lock_kernel()
> release_fl_sem()->unlock_kernel()
> then
> 5192 Req/s @8cpu is got. It is same as test8 within fluctuation.

hmm.. BKL increases scalability. News at 11.

The big question is: why
