Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
> It seems more similar to my code btw (you finally killed the useless
> chmxchg ;).
CMPXCHG ought to make things better by avoiding the XADD(+1)/XADD(-1) loop,
however, I tried various combinations and XADD beats CMPXCHG significantly.
Here's a quote
On Wed, Apr 25, 2001 at 09:06:38PM +0100, D . W . Howells wrote:
> This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves
> performance on the i386 XADD optimised implementation:
It seems more similar to my code btw (you finally killed the useless
chmxchg ;).
I only had a sho
This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves
performance on the i386 XADD optimised implementation:
A patch against -pre6 can be obtained too:
ftp://infradead.org/pub/people/dwh/rwsem-pre6-opt4.diff
Here's some benchmarks (take with a pinch of salt of cours
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> - nobody will look up the list because we do have the spinlock at this
> point, so a destroyed list doesn't actually _matter_ to anybody
I suppose that it'll be okay, provided I take care not to access a block for a
task I've just woken up.
> - list_
On Tue, 24 Apr 2001, David Howells wrote:
>
> Yes but the "struct rwsem_waiter" batch would have to be entirely deleted from
> the list before any of them are woken, otherwise the waking processes may
> destroy their "rwsem_waiter" blocks before they are dequeued (this destruction
> is not guard
On Tue, Apr 24, 2001 at 02:07:47PM +0100, David Howells wrote:
> It was my implementation that triggered it (I haven't tried it with yours),
> but the bug occurred because the SUBL happened to make the change outside of
> the spinlocked region in the slowpath at the same time as the wakeup routine
> so you reproduced a deadlock with my patch applied, or you are saying
> you discovered that case with one of you testcases?
It was my implementation that triggered it (I haven't tried it with yours),
but the bug occurred because the SUBL happened to make the change outside of
the spinlocked reg
On Tue, Apr 24, 2001 at 02:19:28PM +0200, Andrea Arcangeli wrote:
> I'm starting the benchmarks of the C version and I will post a number update
> and a new patch in a few minutes.
(sorry for the below wrap around, just grow your terminal to read it stright)
aa RW
There is a bug in both the C version and asm version of my rwsem
and it is the slow path where I forgotten to drop the _irq part
from the spinlock calls ;) Silly bug. (I inherit it also in the
asm fast path version because I started hacking the same C slow path)
I catched it now because it locks
On Tue, Apr 24, 2001 at 11:33:13AM +0100, David Howells wrote:
> *grin* Fun ain't it... Try it on a dual athlon or P4 and the answer may come
> out differently.
compile with -mathlon and the compiler then should generate (%%eax) if that's
faster even if the sem is a constant, that's a compiler is
On Tue, Apr 24, 2001 at 11:25:23AM +0100, David Howells wrote:
> > I'd love to hear this sequence. Certainly regression testing never generated
> > this sequence yet but yes that doesn't mean anything. Note that your slow
> > path is very different than mine.
>
> One of my testcases fell over on
> I see what you meant here and no, I'm not lucky, I thought about that. gcc x
> 2.95.* seems smart enough to produce (%%eax) that you hardcoded when the
> sem is not a constant (I'm not clobbering another register, if it does it's
> stupid and I consider this a compiler mistake).
It is a compil
> I'd love to hear this sequence. Certainly regression testing never generated
> this sequence yet but yes that doesn't mean anything. Note that your slow
> path is very different than mine.
One of my testcases fell over on it...
> I don't feel the need of any xchg to enforce additional serializ
On Tue, Apr 24, 2001 at 09:56:11AM +0100, David Howells wrote:
> | +: "+m" (sem->count), "+a" (sem)
^^ I think you were comenting on
the +m not +a ok
>
> >From what I've been told,
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Note that the generic list structure already has support for "batching".
> It only does it for multiple adds right now (see the "list_splice"
> merging code), but there is nothing to stop people from doing it for
> multiple deletions too. The code is som
> Ok I finished now my asm optimized rwsemaphores and I improved a little my
> spinlock based one but without touching the icache usage.
And I can break it. There's a very good reason the I changed __up_write() to
use CMPXCHG instead of SUBL. I found a sequence of operations that locked up
on thi
On Mon, Apr 23, 2001 at 11:34:35PM +0200, Andrea Arcangeli wrote:
> On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote:
> > This patch (made against linux-2.4.4-pre6) makes a number of changes to the
> > rwsem implementation:
> >
> > (1) Everything in try #2
> >
> > plus
> >
> >
On Mon, 23 Apr 2001, D.W.Howells wrote:
>
> Linus, you suggested that the generic list handling stuff would be faster (2
> unconditional stores) than mine (1 unconditional store and 1 conditional
> store and branch to jump round it). You are both right and wrong. The generic
> code does two stor
On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote:
> This patch (made against linux-2.4.4-pre6) makes a number of changes to the
> rwsem implementation:
>
> (1) Everything in try #2
>
> plus
>
> (2) Changes proposed by Linus for the generic semaphore code.
>
> (3) Ideas from A
This patch (made against linux-2.4.4-pre6) makes a number of changes to the
rwsem implementation:
(1) Everything in try #2
plus
(2) Changes proposed by Linus for the generic semaphore code.
(3) Ideas from Andrea and how he implemented his semaphores.
Linus, you suggested that the generic l
On Mon, Apr 23, 2001 at 03:04:41AM +0200, Andrea Arcangeli wrote:
> that is supposed to be a performance optimization, I do the same too in my code.
ah no I see what you mean, yes you are hurted by that. I'm waiting your #try 3
against pre6, by that time I hope to be able to make a run of the ben
On Sun, Apr 22, 2001 at 11:52:29PM +0100, D . W . Howells wrote:
> Hello Andrea,
>
> Interesting benchmarks... did you compile the test programs with "make
> SCHED=yes" by any chance? Also what other software are you running?
No I never tried the SCHED=yes. However in my modification of the rws
Hello Andrea,
Interesting benchmarks... did you compile the test programs with "make
SCHED=yes" by any chance? Also what other software are you running?
The reason I ask is that running a full blown KDE setup running in the
background, I get the following numbers on the rwsem-ro test (XADD opt
On Sun, Apr 22, 2001 at 09:07:03PM +0200, Andrea Arcangeli wrote:
> On Sun, Apr 22, 2001 at 01:27:20AM +0100, D . W . Howells wrote:
btw, I noticed I answered your previous email but for my benchmarks I really
used your latest #try2 posted today at 13 (not last night a 1am), just to avoid
mistake
On Sun, Apr 22, 2001 at 01:27:20AM +0100, D . W . Howells wrote:
> This patch (made against linux-2.4.4-pre6) makes a number of changes to the
> rwsem implementation:
>
> (1) Fixes a subtle contention bug between up_write and the down_* functions.
>
> (2) Optimises the i386 fastpath implementa
25 matches
Mail list logo