[PATCH v2 1/5] refcount, kref: add dec-and-test wrappers for rw_semaphores

2020-05-04 Thread Emanuele Giuseppe Esposito
Similar to the existing functions that take a mutex or spinlock if and only if a reference count is decremented to zero, these new functions take an rwsem for writing just before the refcount reaches 0 (and call a user-provided function in the case of kref_put_rwsem). These will be used for

[PATCH] rw_semaphores, exported symbol non-versioning

2001-04-27 Thread D . W . Howells
This patch (made against linux-2.4.4-pre8) turns off module export versioning on the rwsem symbols called from inline assembly. David diff -uNr linux-2.4.4-pre8/lib/rwsem.c linux-rwsem/lib/rwsem.c --- linux-2.4.4-pre8/lib/rwsem.c Fri Apr 27 20:10:11 2001 +++ linux-rwsem/lib/rwsem.c

Re: [PATCH] rw_semaphores, optimisations try #4

2001-04-26 Thread David Howells
Andrea Arcangeli <[EMAIL PROTECTED]> wrote: > It seems more similar to my code btw (you finally killed the useless > chmxchg ;). CMPXCHG ought to make things better by avoiding the XADD(+1)/XADD(-1) loop, however, I tried various combinations and XADD beats CMPXCHG significantly. Here's a quote
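For readers unfamiliar with the two primitives being compared, here is a minimal C11 sketch of each style of atomic increment. It is not the rwsem code from the thread: the function names are hypothetical, and the instruction each call compiles to depends on the compiler and target (on x86 they typically become LOCK XADD and LOCK CMPXCHG respectively).

```c
#include <assert.h>
#include <stdatomic.h>

/* XADD style: one unconditional atomic read-modify-write. */
static int inc_fetch_add(atomic_int *count)
{
    assert(count);
    return atomic_fetch_add(count, 1) + 1;  /* returns the new value */
}

/* CMPXCHG style: read, compute, then retry until no one raced us. */
static int inc_cas_loop(atomic_int *count)
{
    assert(count);
    int old = atomic_load(count);
    while (!atomic_compare_exchange_weak(count, &old, old + 1))
        ;  /* a failed exchange reloads 'old' with the current value */
    return old + 1;  /* returns the new value */
}
```

The fetch-add form always completes in one atomic operation, while the compare-and-swap form may loop under contention; which is faster in practice is exactly what the benchmarks in this thread were arguing about.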

Re: [PATCH] rw_semaphores, optimisations try #4

2001-04-25 Thread Andrea Arcangeli
On Wed, Apr 25, 2001 at 09:06:38PM +0100, D . W . Howells wrote: > This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves > performance on the i386 XADD optimised implementation: It seems more similar to my code btw (you finally killed the useless chmxchg ;). I only had a

[PATCH] rw_semaphores, optimisations try #4

2001-04-25 Thread D . W . Howells
This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves performance on the i386 XADD optimised implementation: A patch against -pre6 can be obtained too: ftp://infradead.org/pub/people/dwh/rwsem-pre6-opt4.diff Here's some benchmarks (take with a pinch of salt of

Re: [PATCH] rw_semaphores, optimisations try #3

2001-04-24 Thread David Howells
Linus Torvalds <[EMAIL PROTECTED]> wrote: > - nobody will look up the list because we do have the spinlock at this > point, so a destroyed list doesn't actually _matter_ to anybody I suppose that it'll be okay, provided I take care not to access a block for a task I've just woken up. > -

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Linus Torvalds
On Tue, 24 Apr 2001, Andrea Arcangeli wrote: > > > > Again it's not a performance issue, the "+a" (sem) is a correctness issue > > > because the slow path will clobber it. > > > > There must be a performance issue too, otherwise our read up/down fastpaths > > are the same. Which clearly

Re: [PATCH] rw_semaphores, optimisations try #3

2001-04-24 Thread Linus Torvalds
On Tue, 24 Apr 2001, David Howells wrote: > > Yes but the "struct rwsem_waiter" batch would have to be entirely deleted from > the list before any of them are woken, otherwise the waking processes may > destroy their "rwsem_waiter" blocks before they are dequeued (this destruction > is not

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 02:07:47PM +0100, David Howells wrote: > It was my implementation that triggered it (I haven't tried it with yours), > but the bug occurred because the SUBL happened to make the change outside of > the spinlocked region in the slowpath at the same time as the wakeup

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> so you reproduced a deadlock with my patch applied, or you are saying > you discovered that case with one of your testcases? It was my implementation that triggered it (I haven't tried it with yours), but the bug occurred because the SUBL happened to make the change outside of the spinlocked

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 02:19:28PM +0200, Andrea Arcangeli wrote: > I'm starting the benchmarks of the C version and I will post a number update > and a new patch in a few minutes. (sorry for the below wrap around, just grow your terminal to read it straight) aa

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
There is a bug in both the C version and asm version of my rwsem and it is the slow path where I forgot to drop the _irq part from the spinlock calls ;) Silly bug. (I inherited it also in the asm fast path version because I started hacking the same C slow path) I caught it now because it locks

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 11:33:13AM +0100, David Howells wrote: > *grin* Fun ain't it... Try it on a dual athlon or P4 and the answer may come > out differently. compile with -mathlon and the compiler then should generate (%%eax) if that's faster even if the sem is a constant, that's a compiler

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 11:25:23AM +0100, David Howells wrote: > > I'd love to hear this sequence. Certainly regression testing never generated > > this sequence yet but yes that doesn't mean anything. Note that your slow > > path is very different than mine. > > One of my testcases fell over on

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> I see what you meant here and no, I'm not lucky, I thought about that. gcc x > 2.95.* seems smart enough to produce (%%eax) that you hardcoded when the > sem is not a constant (I'm not clobbering another register, if it does it's > stupid and I consider this a compiler mistake). It is a

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> I'd love to hear this sequence. Certainly regression testing never generated > this sequence yet but yes that doesn't mean anything. Note that your slow > path is very different than mine. One of my testcases fell over on it... > I don't feel the need of any xchg to enforce additional

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 09:56:11AM +0100, David Howells wrote: > | +: "+m" (sem->count), "+a" (sem) ^^ I think you were commenting on the +m not +a ok > > From what I've been

Re: [PATCH] rw_semaphores, optimisations try #3

2001-04-24 Thread David Howells
Linus Torvalds <[EMAIL PROTECTED]> wrote: > Note that the generic list structure already has support for "batching". > It only does it for multiple adds right now (see the "list_splice" > merging code), but there is nothing to stop people from doing it for > multiple deletions too. The code is

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> Ok I finished now my asm optimized rwsemaphores and I improved a little my > spinlock based one but without touching the icache usage. And I can break it. There's a very good reason that I changed __up_write() to use CMPXCHG instead of SUBL. I found a sequence of operations that locked up on

rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-23 Thread Andrea Arcangeli
On Mon, Apr 23, 2001 at 11:34:35PM +0200, Andrea Arcangeli wrote: > On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote: > > This patch (made against linux-2.4.4-pre6) makes a number of changes to the > > rwsem implementation: > > > > (1) Everything in try #2 > > > > plus > > > >

Re: [PATCH] rw_semaphores, optimisations try #3

2001-04-23 Thread Linus Torvalds
On Mon, 23 Apr 2001, D.W.Howells wrote: > > Linus, you suggested that the generic list handling stuff would be faster (2 > unconditional stores) than mine (1 unconditional store and 1 conditional > store and branch to jump round it). You are both right and wrong. The generic > code does two

Re: [PATCH] rw_semaphores, optimisations try #3

2001-04-23 Thread Andrea Arcangeli
On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote: > This patch (made against linux-2.4.4-pre6) makes a number of changes to the > rwsem implementation: > > (1) Everything in try #2 > > plus > > (2) Changes proposed by Linus for the generic semaphore code. > > (3) Ideas from

[PATCH] rw_semaphores, optimisations try #3

2001-04-23 Thread D . W . Howells
This patch (made against linux-2.4.4-pre6) makes a number of changes to the rwsem implementation: (1) Everything in try #2 plus (2) Changes proposed by Linus for the generic semaphore code. (3) Ideas from Andrea and how he implemented his semaphores. Linus, you suggested that the generic

Re: [PATCH] rw_semaphores, optimisations

2001-04-22 Thread Andrea Arcangeli
On Mon, Apr 23, 2001 at 03:04:41AM +0200, Andrea Arcangeli wrote: > that is supposed to be a performance optimization, I do the same too in my code. ah no I see what you mean, yes you are hurt by that. I'm waiting for your try #3 against pre6, by that time I hope to be able to make a run of the

Re: [PATCH] rw_semaphores, optimisations

2001-04-22 Thread Andrea Arcangeli
On Sun, Apr 22, 2001 at 11:52:29PM +0100, D . W . Howells wrote: > Hello Andrea, > > Interesting benchmarks... did you compile the test programs with "make > SCHED=yes" by any chance? Also what other software are you running? No I never tried the SCHED=yes. However in my modification of the

Re: [PATCH] rw_semaphores, optimisations

2001-04-22 Thread D . W . Howells
Hello Andrea, Interesting benchmarks... did you compile the test programs with "make SCHED=yes" by any chance? Also what other software are you running? The reason I ask is that running a full blown KDE setup running in the background, I get the following numbers on the rwsem-ro test (XADD

Re: [PATCH] rw_semaphores, optimisations

2001-04-22 Thread Andrea Arcangeli
On Sun, Apr 22, 2001 at 09:07:03PM +0200, Andrea Arcangeli wrote: > On Sun, Apr 22, 2001 at 01:27:20AM +0100, D . W . Howells wrote: btw, I noticed I answered your previous email but for my benchmarks I really used your latest #try2 posted today at 13 (not last night at 1am), just to avoid

Re: [PATCH] rw_semaphores, optimisations

2001-04-22 Thread Andrea Arcangeli
On Sun, Apr 22, 2001 at 01:27:20AM +0100, D . W . Howells wrote: > This patch (made against linux-2.4.4-pre6) makes a number of changes to the > rwsem implementation: > > (1) Fixes a subtle contention bug between up_write and the down_* functions. > > (2) Optimises the i386 fastpath

Re: [PATCH] generic rw_semaphores, compile warnings patch

2001-04-20 Thread Ivan Kokshaysky
On Fri, Apr 20, 2001 at 08:50:38AM +0100, David Howells wrote: > There's also a missing "struct rw_semaphore;" declaration in linux/rwsem.h. It > needs to go in the gap below "#include <linux/wait.h>". Otherwise the > declarations for the contention handling functions will give warnings about > the struct

Re: [PATCH] generic rw_semaphores, compile warnings patch

2001-04-20 Thread David S. Miller
David Howells writes: > There's also a missing "struct rw_semaphore;" declaration in linux/rwsem.h. It > needs to go in the gap below "#include <linux/wait.h>". Otherwise the > declarations for the contention handling functions will give warnings about > the struct being declared in the parameter list.
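The warning discussed here comes from C scoping rules: a struct tag first seen inside a prototype's parameter list is scoped to that prototype alone. A small sketch of the fix follows, with the caveat that `sem_is_idle` and the one-field structure are hypothetical stand-ins, not the contents of linux/rwsem.h.

```c
#include <assert.h>

/* Forward declaration: gives the struct tag file scope before first use.
 * Without this line, gcc warns that the struct is declared inside the
 * parameter list of the prototype below. */
struct rw_semaphore;

/* Hypothetical prototype standing in for the contention-handling functions. */
int sem_is_idle(const struct rw_semaphore *sem);

/* Illustrative definition only; the real structure has more fields. */
struct rw_semaphore {
    int count;
};

int sem_is_idle(const struct rw_semaphore *sem)
{
    assert(sem);  /* caller must pass a valid semaphore */
    return sem->count == 0;
}
```

Because the prototype only uses a pointer to the struct, the incomplete type is enough; the full definition can appear later or in another header.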

Re: [PATCH] generic rw_semaphores, compile warnings patch

2001-04-20 Thread David Howells
David S. Miller <[EMAIL PROTECTED]> wrote: > D.W.Howells writes: > > This patch (made against linux-2.4.4-pre4) gets rid of some warnings obtained > > when using the generic rwsem implementation. > > Have a look at pre5, this is already fixed. Not entirely so... There's also a missing

Re: [PATCH] generic rw_semaphores, compile warnings patch

2001-04-19 Thread David S. Miller
D.W.Howells writes: > This patch (made against linux-2.4.4-pre4) gets rid of some warnings obtained > when using the generic rwsem implementation. Have a look at pre5, this is already fixed. Later, David S. Miller [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe

[PATCH] generic rw_semaphores, compile warnings patch

2001-04-19 Thread D . W . Howells
This patch (made against linux-2.4.4-pre4) gets rid of some warnings obtained when using the generic rwsem implementation. David diff -uNr linux-2.4.4-pre4/include/linux/rwsem.h linux/include/linux/rwsem.h --- linux-2.4.4-pre4/include/linux/rwsem.h Thu Apr 19 22:07:49 2001 +++

Re: rw_semaphores

2001-04-16 Thread yodaiken
On Mon, Apr 16, 2001 at 10:05:57AM -0700, Linus Torvalds wrote: > > > On Mon, 16 Apr 2001 [EMAIL PROTECTED] wrote: > > > > I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a > > major failure and I can't. To me: the result of an attempt by the 32,768th locker > >

Re: rw_semaphores

2001-04-16 Thread Andrew Morton
[EMAIL PROTECTED] wrote: > > On Tue, Apr 10, 2001 at 08:47:34AM +0100, David Howells wrote: > > > > Since you're willing to use CMPXCHG in your suggested implementation, would it > > make it make life easier if you were willing to use XADD too? > > > > Plus, are you really willing to limit the

Re: rw_semaphores

2001-04-16 Thread Linus Torvalds
On Mon, 16 Apr 2001 [EMAIL PROTECTED] wrote: > > I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a > major failure and I can't. To me: the result of an attempt by the 32,768th locker > should be a kernel panic. Is there a reasonable scenario where this is wrong?

Re: rw_semaphores

2001-04-16 Thread Alan Cox
> I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a > major failure and I can't. To me: the result of an attempt by the 32,768th locker > should be a kernel panic. Is there a reasonable scenario where this is wrong? 32000 threads all trying to lock the same piece

Re: rw_semaphores

2001-04-16 Thread yodaiken
On Tue, Apr 10, 2001 at 08:47:34AM +0100, David Howells wrote: > > Since you're willing to use CMPXCHG in your suggested implementation, would it > make it make life easier if you were willing to use XADD too? > > Plus, are you really willing to limit the number of readers or writers to be >

Re: [PATCH] 4th try: i386 rw_semaphores fix

2001-04-12 Thread D . W . Howells
the fastpath, then > the code at http://www.uow.edu.au/~andrewm/linux/rw_semaphore.tar.gz > is bog-simple and works. Sorry to pre-empt you, but have you seen my "advanced" patch? I sent it to the list in an email with the subject: [PATCH] i386 rw_semaphores, general abstr

Re: [PATCH] 4th try: i386 rw_semaphores fix

2001-04-12 Thread Andrew Morton
David Howells wrote: > > Here's the RW semaphore patch attempt #4. This fixes the bugs that Andrew > Morton's test cases showed up. > It still doesn't compile with gcc-2.91.66 because of the "#define rwsemdebug(FMT, ...)" thing. What can we do about this? I cooked up a few more tests,

[PATCH] i386 rw_semaphores, general abstraction patch

2001-04-12 Thread David Howells
This patch (made against linux-2.4.4-pre2) takes Anton Blanchard's suggestions and abstracts out the rwsem implementation somewhat. This makes the following general files: include/linux/rwsem.h - general RW semaphore wrapper include/linux/rwsem-spinlock.h - rwsem

Re: [PATCH] 2nd try: i386 rw_semaphores fix

2001-04-12 Thread Jamie Lokier
Linus Torvalds wrote: > > > On Wed, 11 Apr 2001, Bernd Schmidt wrote: > See? Do you see why a "memory" clobber is _not_ comparable to a "ax" > clobber? And why that non-comparability makes a memory clobber equivalent > to a read-modify-write cycle? I had to think about this, so I'll explain it

Re: [PATCH] 3rd try: i386 rw_semaphores fix

2001-04-11 Thread Anton Blanchard
Hi, > Here's the RW semaphore patch #3. This time with more asm constraints added. Personally I care about sparc and ppc64 and as such would like to see the slow paths end up in lib/rwsem.c protected by #ifndef __HAVE_ARCH_RWSEM or something like that. If we couldn't get rwsems to work on x86,

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread Alan Cox
> > Be careful there. CMOV is an optional instruction. gcc is arguably wrong > > in using cmov in '686' mode. Building libs with cmov makes sense though > > especially for the PIV with its ridiculously long pipeline > > It is just a matter how you define "686 mode", otherwise the very concept >

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread Alan Cox
> Yes, the big 686 optimization is CMOV, and that one is > ultra-pervasive. Be careful there. CMOV is an optional instruction. gcc is arguably wrong in using cmov in '686' mode. Building libs with cmov makes sense though especially for the PIV with its ridiculously long pipeline > - To

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread H. Peter Anvin
Alan Cox wrote: > > > Yes, the big 686 optimization is CMOV, and that one is > > ultra-pervasive. > > Be careful there. CMOV is an optional instruction. gcc is arguably wrong > in using cmov in '686' mode. Building libs with cmov makes sense though > especially for the PIV with its ridiculously

[PATCH] 4th try: i386 rw_semaphores fix

2001-04-11 Thread David Howells
Here's the RW semaphore patch attempt #4. This fixes the bugs that Andrew Morton's test cases showed up. It simplifies the __wake_up_ctx_common() function and adds an iterative clause to the end of rwsem_wake(). David diff -uNr linux-2.4.3/arch/i386/config.in linux-rwsem/arch/i386/config.in

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread David Howells
> You need sterner testing stuff :) I hit the BUG at the end of rwsem_wake() > in about a second running rwsem-4. Removed the BUG and everything stops > in D state. > > Grab rwsem-4 from > > http://www.uow.edu.au/~andrewm/linux/rwsem.tar.gz > > It's very simple. But running fully

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread Linus Torvalds
On Wed, 11 Apr 2001, David Howells wrote: > > > These numbers are infinity :) > > I know, but I think Linus may be happy with the resolution for the moment. It > can be extended later by siphoning off excess quantities of waiters into a > separate counter (as is done now) and by making the

Re: [PATCH] 2nd try: i386 rw_semaphores fix

2001-04-11 Thread Linus Torvalds
On Wed, 11 Apr 2001, Bernd Schmidt wrote: > > > > The example in there compiles out-of-the box and is much easier to > > experiment on than the whole kernel :-) > > That example seems to fail because a "memory" clobber only tells the compiler > that memory is written, not that it is read. The

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread H. Peter Anvin
Followup to: <[EMAIL PROTECTED]> By author: Alan Cox <[EMAIL PROTECTED]> In newsgroup: linux.dev.kernel > > > > Yes, and with CMPXCHG handler in the kernel it wouldn't be needed > > (the other 686 optimizations like memcpy also work on 386) > > They would still be needed. The 686 built

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread H. Peter Anvin
Followup to: <[EMAIL PROTECTED]> By author: Linus Torvalds <[EMAIL PROTECTED]> In newsgroup: linux.dev.kernel > > Note that the "fixup" approach is not necessarily very painful at all, > from a performance standpoint (either on 386 or on newer CPU's). It's not > really that hard to just

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread David Howells
Andrew Morton wrote: > I think that's a very good approach. Sure, it's suboptimal when there > are three or more waiters (and they're the right type and order). But > that never happens. Nice design idea. Cheers. > These numbers are infinity :) I know, but I think Linus may be happy with

Re: [PATCH] i386 rw_semaphores fix

2001-04-11 Thread Andrew Morton
David Howells wrote: > > Here's a patch that fixes RW semaphores on the i386 architecture. It is very > simple in the way it works. > > The lock counter is dealt with as two semi-independent words: the LSW is the > number of active (granted) locks, and the MSW, if negated, is the number of >

[PATCH] 3rd try: i386 rw_semaphores fix

2001-04-11 Thread David Howells
Here's the RW semaphore patch #3. This time with more asm constraints added. David diff -uNr linux-2.4.3/arch/i386/config.in linux-rwsem/arch/i386/config.in --- linux-2.4.3/arch/i386/config.in Thu Apr 5 14:44:04 2001 +++ linux-rwsem/arch/i386/config.in Wed Apr 11 08:38:04 2001 @@

Re: [PATCH] 2nd try: i386 rw_semaphores fix

2001-04-11 Thread Bernd Schmidt
On Wed, 11 Apr 2001, Andreas Franck wrote: > Hello David, > > > I've been discussing it with some other kernel and GCC people, and they > > think > > that only "memory" is required. > > Hmm.. I just looked at my GCC problem report from December, perhaps you're > interested, too: > >

Re: [PATCH] 2nd try: i386 rw_semaphores fix

2001-04-11 Thread Andreas Franck
Hello David, > I've been discussing it with some other kernel and GCC people, and they > think > that only "memory" is required. Hmm.. I just looked at my GCC problem report from December, perhaps you're interested, too: http://gcc.gnu.org/ml/gcc-bugs/2000-12/msg00554.html The example in

Re: [PATCH] 2nd try: i386 rw_semaphores fix

2001-04-11 Thread David Howells
I've been discussing it with some other kernel and GCC people, and they think that only "memory" is required. > What are the reasons against mentioning sem->count directly as a "=m" > reference? This makes the whole thing less fragile and no more dependent > on the memory layout of the
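Editorial sketch, not taken from any of the patches: the two options debated in this thread can be written side by side in GCC extended asm. The struct and function names (`demo_sem`, `demo_touch_count`, `demo_run`) are hypothetical stand-ins, and the asm body is left empty on purpose so that only the constraints matter. This follows the syntax as documented for current GCC; the compilers discussed in 2001 may have treated these constraints differently.

```c
/* Hypothetical stand-in for the semaphore structure discussed above. */
struct demo_sem {
    int count;
};

/*
 * Empty asm statement: no instructions are emitted, only constraints.
 *   "+m" (sem->count)  names the counter as a memory operand that the
 *                      asm both reads and writes.
 *   "memory"           additionally declares that the asm may touch
 *                      memory not listed among its operands.
 */
static inline void demo_touch_count(struct demo_sem *sem)
{
    __asm__ __volatile__("" : "+m" (sem->count) : : "memory");
}

int demo_run(void)
{
    struct demo_sem sem = { 0 };

    sem.count = 41;             /* this store must be visible to the asm */
    demo_touch_count(&sem);     /* compiler may not cache count across this */
    sem.count += 1;             /* count is re-read after the asm */
    return sem.count;
}
```

Naming `sem->count` as an explicit operand ties the dependency to that field itself instead of relying on the blanket clobber, which is the direction the question quoted above points toward.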

Re: [PATCH] 2nd try: i386 rw_semaphores fix

2001-04-11 Thread Andreas Franck
Hello David and people, > I've just consulted with one of the gcc people we have here, and he says > that > the '"memory"' constraint should do the trick. > > Do I take it that that is actually insufficient? I don't remember exactly, it's been a while, but I think it was not sufficient when I

Re: [PATCH] 2nd try: i386 rw_semaphores fix

2001-04-11 Thread David Howells
> I'd like you to look over it. It seems newer GCC's (snapshots and the > upcoming 3.0) will be more strict when modifying some values through > assembler-passed pointers - in this case, the passed semaphore structure got > freed too early, causing massive stack corruption on early bootup. > >
