Re: [PATCH] rw_semaphores, optimisations try #4

2001-04-26 Thread David Howells
Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
> It seems more similar to my code btw (you finally killed the useless
> chmxchg ;).
CMPXCHG ought to make things better by avoiding the XADD(+1)/XADD(-1) loop, however, I tried various combinations and XADD beats CMPXCHG significantly. Here's a quote
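For readers unfamiliar with the two instructions being compared, here is a minimal sketch of the difference in portable C11 atomics. It is not the code from the patch, which uses i386 inline assembly and sleeps on contention. The try-lock framing, the function names, the counter layout and the WRITER_BIAS value are all assumptions made for illustration. On x86, atomic_fetch_add typically compiles to LOCK XADD and atomic_compare_exchange_weak to LOCK CMPXCHG.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: count >= 0 means "that many readers, no writer";
 * a writer adds a large negative bias so the count goes negative. */
#define WRITER_BIAS (-0x10000)

/* XADD style: add 1 unconditionally, then undo it if a writer was present.
 * The contended path costs two atomic operations: XADD(+1) and XADD(-1). */
static bool read_trylock_xadd(atomic_int *count)
{
    int old = atomic_fetch_add(count, 1);
    if (old >= 0)
        return true;              /* no writer: the increment stands */
    atomic_fetch_sub(count, 1);   /* writer present: back the increment out */
    return false;
}

/* CMPXCHG style: publish the increment only if no writer is present,
 * so nothing needs undoing, but the exchange may have to be retried. */
static bool read_trylock_cmpxchg(atomic_int *count)
{
    int old = atomic_load(count);
    while (old >= 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(count, &old, old + 1))
            return true;
    }
    return false;                 /* writer present: count left untouched */
}
```

The CMPXCHG variant never has to back out an increment, which is the saving described above. The message reports that XADD was nevertheless faster in the author's measurements, and this sketch makes no claim either way about performance.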

Re: [PATCH] rw_semaphores, optimisations try #4

2001-04-25 Thread Andrea Arcangeli
On Wed, Apr 25, 2001 at 09:06:38PM +0100, D . W . Howells wrote:
> This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves
> performance on the i386 XADD optimised implementation:
It seems more similar to my code btw (you finally killed the useless chmxchg ;). I only had a

[PATCH] rw_semaphores, optimisations try #4

2001-04-25 Thread D . W . Howells
This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves performance on the i386 XADD optimised implementation: A patch against -pre6 can be obtained too: ftp://infradead.org/pub/people/dwh/rwsem-pre6-opt4.diff Here's some benchmarks (take with a pinch of salt of
