Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 02:07:47PM +0100, David Howells wrote: > It was my implementation that triggered it (I haven't tried it with yours), > but the bug occurred because the SUBL happened to make the change outside of > the spinlocked region in the slowpath at the same time as the wakeup

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> so you reproduced a deadlock with my patch applied, or you are saying > you discovered that case with one of your testcases? It was my implementation that triggered it (I haven't tried it with yours), but the bug occurred because the SUBL happened to make the change outside of the spinlocked

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 02:19:28PM +0200, Andrea Arcangeli wrote: > I'm starting the benchmarks of the C version and I will post a number update > and a new patch in a few minutes. (sorry for the below wrap around, just grow your terminal to read it straight) aa

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
There is a bug in both the C version and asm version of my rwsem and it is the slow path where I forgot to drop the _irq part from the spinlock calls ;) Silly bug. (I inherited it also in the asm fast path version because I started hacking the same C slow path) I caught it now because it locks

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 11:33:13AM +0100, David Howells wrote: > *grin* Fun ain't it... Try it on a dual athlon or P4 and the answer may come > out differently. compile with -mathlon and the compiler then should generate (%%eax) if that's faster even if the sem is a constant, that's a compiler

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 11:25:23AM +0100, David Howells wrote: > > I'd love to hear this sequence. Certainly regression testing never generated > > this sequence yet but yes that doesn't mean anything. Note that your slow > > path is very different than mine. > > One of my testcases fell over on

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> I see what you meant here and no, I'm not lucky, I thought about that. gcc x > 2.95.* seems smart enough to produce (%%eax) that you hardcoded when the > sem is not a constant (I'm not clobbering another register, if it does it's > stupid and I consider this a compiler mistake). It is a

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> I'd love to hear this sequence. Certainly regression testing never generated > this sequence yet but yes that doesn't mean anything. Note that your slow > path is very different than mine. One of my testcases fell over on it... > I don't feel the need of any xchg to enforce additional

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread Andrea Arcangeli
On Tue, Apr 24, 2001 at 09:56:11AM +0100, David Howells wrote: > | +: "+m" (sem->count), "+a" (sem) ^^ I think you were commenting on the +m not +a ok > > From what I've been

Re: rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-24 Thread David Howells
> Ok I finished now my asm optimized rwsemaphores and I improved a little my > spinlock based one but without touching the icache usage. And I can break it. There's a very good reason that I changed __up_write() to use CMPXCHG instead of SUBL. I found a sequence of operations that locked up on

rwsem benchmark [was Re: [PATCH] rw_semaphores, optimisations try #3]

2001-04-23 Thread Andrea Arcangeli
On Mon, Apr 23, 2001 at 11:34:35PM +0200, Andrea Arcangeli wrote: > On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote: > > This patch (made against linux-2.4.4-pre6) makes a number of changes to the > > rwsem implementation: > > > > (1) Everything in try #2 > > > > plus > > > >
