Similar to the existing functions that take a mutex or spinlock if and
only if a reference count is decremented to zero, these new functions
take an rwsem for writing just before the refcount reaches 0 (and
call a user-provided function in the case of kref_put_rwsem).
These will be used for
This patch (made against linux-2.4.4-pre8) turns off module export versioning
on the rwsem symbols called from inline assembly.
David
diff -uNr linux-2.4.4-pre8/lib/rwsem.c linux-rwsem/lib/rwsem.c
--- linux-2.4.4-pre8/lib/rwsem.c	Fri Apr 27 20:10:11 2001
+++ linux-rwsem/lib/rwsem.c
Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
> It seems more similar to my code btw (you finally killed the useless
> cmpxchg ;).
CMPXCHG ought to make things better by avoiding the XADD(+1)/XADD(-1) loop,
however, I tried various combinations and XADD beats CMPXCHG significantly.
Here's a quote
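The two fastpath shapes being compared can be sketched with portable C11 atomics standing in for the XADD and CMPXCHG instructions (the counter layout follows the XADD implementation: readers add 1, a negative value means contention):

```c
#include <assert.h>
#include <stdatomic.h>

/* XADD-style down_read: one unconditional atomic add, then test the
 * old value. Always exactly one locked operation. */
static int down_read_xadd(atomic_long *count)
{
	long old = atomic_fetch_add(count, 1);
	return old >= 0;	/* 1 = got the lock, 0 = slow path */
}

/* CMPXCHG-style down_read: load, compute, and retry if another CPU
 * changed the counter in between -- this is the loop XADD avoids. */
static int down_read_cmpxchg(atomic_long *count)
{
	long old = atomic_load(count);
	do {
		if (old < 0)
			return 0;	/* contended: slow path */
	} while (!atomic_compare_exchange_weak(count, &old, old + 1));
	return 1;
}
```

Under contention the CMPXCHG version can spin in the retry loop while the XADD version always completes in one locked operation, which is consistent with the measurement quoted above.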
On Wed, Apr 25, 2001 at 09:06:38PM +0100, D . W . Howells wrote:
> This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves
> performance on the i386 XADD optimised implementation:
It seems more similar to my code btw (you finally killed the useless
cmpxchg ;).
I only had a
This patch (made against linux-2.4.4-pre6 + rwsem-opt3) somewhat improves
performance on the i386 XADD optimised implementation:
A patch against -pre6 can be obtained too:
ftp://infradead.org/pub/people/dwh/rwsem-pre6-opt4.diff
Here's some benchmarks (take with a pinch of salt of
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> - nobody will look up the list because we do have the spinlock at this
> point, so a destroyed list doesn't actually _matter_ to anybody
I suppose that it'll be okay, provided I take care not to access a block for a
task I've just woken up.
> -
On Tue, 24 Apr 2001, Andrea Arcangeli wrote:
>
> > > Again it's not a performance issue, the "+a" (sem) is a correctness issue
> > > because the slow path will clobber it.
> >
> > There must be a performance issue too, otherwise our read up/down fastpaths
> > are the same. Which clearly
On Tue, 24 Apr 2001, David Howells wrote:
>
> Yes but the "struct rwsem_waiter" batch would have to be entirely deleted from
> the list before any of them are woken, otherwise the waking processes may
> destroy their "rwsem_waiter" blocks before they are dequeued (this destruction
> is not
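The lifetime constraint being argued here can be shown in a small sketch: each sleeper's waiter record lives on its own stack, so the waker must fully unlink a record from the shared list *before* marking it woken, because once it is marked the sleeper may return and the record becomes invalid. Names are illustrative:

```c
#include <assert.h>
#include <stddef.h>

struct waiter {
	struct waiter *next;
	int woken;
};

struct sem_wait_list {
	struct waiter *head;
};

/* Wake every queued waiter; returns how many were woken. */
static int wake_all(struct sem_wait_list *q)
{
	int n = 0;
	while (q->head) {
		struct waiter *w = q->head;
		q->head = w->next;	/* fully dequeue first... */
		w->next = NULL;
		w->woken = 1;		/* ...only then let the sleeper go */
		n++;
	}
	return n;
}
```

Reversing the last two steps (mark woken, then unlink) is exactly the use-after-free window the thread is discussing.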
On Tue, Apr 24, 2001 at 02:07:47PM +0100, David Howells wrote:
> It was my implementation that triggered it (I haven't tried it with yours),
> but the bug occurred because the SUBL happened to make the change outside of
> the spinlocked region in the slowpath at the same time as the wakeup
> so you reproduced a deadlock with my patch applied, or you are saying
> you discovered that case with one of your testcases?
It was my implementation that triggered it (I haven't tried it with yours),
but the bug occurred because the SUBL happened to make the change outside of
the spinlocked
On Tue, Apr 24, 2001 at 02:19:28PM +0200, Andrea Arcangeli wrote:
> I'm starting the benchmarks of the C version and I will post a number update
> and a new patch in a few minutes.
(sorry for the below wrap around, just grow your terminal to read it straight)
aa
There is a bug in both the C version and the asm version of my rwsem,
and it is in the slow path, where I forgot to drop the _irq part
from the spinlock calls ;) Silly bug. (I inherited it in the
asm fast path version too because I started hacking from the same C slow path.)
I caught it now because it locks
On Tue, Apr 24, 2001 at 11:33:13AM +0100, David Howells wrote:
> *grin* Fun ain't it... Try it on a dual athlon or P4 and the answer may come
> out differently.
compile with -mathlon and the compiler then should generate (%%eax) if that's
faster even if the sem is a constant, that's a compiler
On Tue, Apr 24, 2001 at 11:25:23AM +0100, David Howells wrote:
> > I'd love to hear this sequence. Certainly regression testing never generated
> > this sequence yet but yes that doesn't mean anything. Note that your slow
> > path is very different than mine.
>
> One of my testcases fell over on
> I see what you meant here and no, I'm not lucky, I thought about that. gcc x
> 2.95.* seems smart enough to produce (%%eax) that you hardcoded when the
> sem is not a constant (I'm not clobbering another register, if it does it's
> stupid and I consider this a compiler mistake).
It is a
> I'd love to hear this sequence. Certainly regression testing never generated
> this sequence yet but yes that doesn't mean anything. Note that your slow
> path is very different than mine.
One of my testcases fell over on it...
> I don't feel the need of any xchg to enforce additional
On Tue, Apr 24, 2001 at 09:56:11AM +0100, David Howells wrote:
> | +: "+m" (sem->count), "+a" (sem)
^^ I think you were commenting on
the +m not +a ok
>
> >From what I've been
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> Note that the generic list structure already has support for "batching".
> It only does it for multiple adds right now (see the "list_splice"
> merging code), but there is nothing to stop people from doing it for
> multiple deletions too. The code is
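The batching point generalises cleanly: with the circular doubly-linked list the kernel uses, splicing a batch in (list_splice) or cutting a whole run of nodes out is a constant number of pointer stores regardless of batch size. A minimal userspace copy of the structure, with an illustrative `list_del_range` for batch deletion:

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Batch delete: unlink the whole run first..last in O(1), two stores. */
static void list_del_range(struct list_head *first, struct list_head *last)
{
	first->prev->next = last->next;
	last->next->prev = first->prev;
}

static int list_len(struct list_head *h)
{
	int n = 0;
	for (struct list_head *p = h->next; p != h; p = p->next)
		n++;
	return n;
}
```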
> Ok I finished now my asm optimized rwsemaphores and I improved a little my
> spinlock based one but without touching the icache usage.
And I can break it. There's a very good reason that I changed __up_write() to
use CMPXCHG instead of SUBL. I found a sequence of operations that locked up
on
On Mon, Apr 23, 2001 at 11:34:35PM +0200, Andrea Arcangeli wrote:
> On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote:
> > This patch (made against linux-2.4.4-pre6) makes a number of changes to the
> > rwsem implementation:
> >
> > (1) Everything in try #2
> >
> > plus
> >
> >
On Mon, 23 Apr 2001, D.W.Howells wrote:
>
> Linus, you suggested that the generic list handling stuff would be faster (2
> unconditional stores) than mine (1 unconditional store and 1 conditional
> store and branch to jump round it). You are both right and wrong. The generic
> code does two
On Mon, Apr 23, 2001 at 09:35:34PM +0100, D . W . Howells wrote:
> This patch (made against linux-2.4.4-pre6) makes a number of changes to the
> rwsem implementation:
>
> (1) Everything in try #2
>
> plus
>
> (2) Changes proposed by Linus for the generic semaphore code.
>
> (3) Ideas from
This patch (made against linux-2.4.4-pre6) makes a number of changes to the
rwsem implementation:
(1) Everything in try #2
plus
(2) Changes proposed by Linus for the generic semaphore code.
(3) Ideas from Andrea and how he implemented his semaphores.
Linus, you suggested that the generic
On Mon, Apr 23, 2001 at 03:04:41AM +0200, Andrea Arcangeli wrote:
> that is supposed to be a performance optimization, I do the same too in my code.
ah no, I see what you mean: yes, you are hurt by that. I'm waiting for your #try 3
against pre6, and by that time I hope to be able to make a run of the
On Sun, Apr 22, 2001 at 11:52:29PM +0100, D . W . Howells wrote:
> Hello Andrea,
>
> Interesting benchmarks... did you compile the test programs with "make
> SCHED=yes" by any chance? Also what other software are you running?
No I never tried the SCHED=yes. However in my modification of the
Hello Andrea,
Interesting benchmarks... did you compile the test programs with "make
SCHED=yes" by any chance? Also what other software are you running?
The reason I ask is that running a full blown KDE setup running in the
background, I get the following numbers on the rwsem-ro test (XADD
On Sun, Apr 22, 2001 at 09:07:03PM +0200, Andrea Arcangeli wrote:
> On Sun, Apr 22, 2001 at 01:27:20AM +0100, D . W . Howells wrote:
btw, I noticed I answered your previous email but for my benchmarks I really
used your latest #try2 posted today at 13 (not last night a 1am), just to avoid
On Sun, Apr 22, 2001 at 01:27:20AM +0100, D . W . Howells wrote:
> This patch (made against linux-2.4.4-pre6) makes a number of changes to the
> rwsem implementation:
>
> (1) Fixes a subtle contention bug between up_write and the down_* functions.
>
> (2) Optimises the i386 fastpath
On Fri, Apr 20, 2001 at 08:50:38AM +0100, David Howells wrote:
> There's also a missing "struct rw_semaphore;" declaration in linux/rwsem.h. It
> needs to go in the gap below "#include <linux/wait.h>". Otherwise the
> declarations for the contention handling functions will give warnings about
> the struct
David Howells writes:
> There's also a missing "struct rw_semaphore;" declaration in linux/rwsem.h. It
> needs to go in the gap below "#include <linux/wait.h>". Otherwise the
> declarations for the contention handling functions will give warnings about
> the struct being declared in the parameter list.
David S. Miller <[EMAIL PROTECTED]> wrote:
> D.W.Howells writes:
> > This patch (made against linux-2.4.4-pre4) gets rid of some warnings obtained
> > when using the generic rwsem implementation.
>
> Have a look at pre5, this is already fixed.
Not entirely so...
There's also a missing
D.W.Howells writes:
> This patch (made against linux-2.4.4-pre4) gets rid of some warnings obtained
> when using the generic rwsem implementation.
Have a look at pre5, this is already fixed.
Later,
David S. Miller
[EMAIL PROTECTED]
This patch (made against linux-2.4.4-pre4) gets rid of some warnings obtained
when using the generic rwsem implementation.
David
diff -uNr linux-2.4.4-pre4/include/linux/rwsem.h linux/include/linux/rwsem.h
--- linux-2.4.4-pre4/include/linux/rwsem.h Thu Apr 19 22:07:49 2001
+++
On Mon, Apr 16, 2001 at 10:05:57AM -0700, Linus Torvalds wrote:
>
>
> On Mon, 16 Apr 2001 [EMAIL PROTECTED] wrote:
> >
> > I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a
> > major failure and I can't. To me: the result of an attempt by the 32,768th locker
> >
[EMAIL PROTECTED] wrote:
>
> On Tue, Apr 10, 2001 at 08:47:34AM +0100, David Howells wrote:
> >
> > Since you're willing to use CMPXCHG in your suggested implementation, would it
> > make it make life easier if you were willing to use XADD too?
> >
> > Plus, are you really willing to limit the
On Mon, 16 Apr 2001 [EMAIL PROTECTED] wrote:
>
> I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a
> major failure and I can't. To me: the result of an attempt by the 32,768th locker
> should be a kernel panic. Is there a reasonable scenario where this is wrong?
> I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a
> major failure and I can't. To me: the result of an attempt by the 32,768th locker
> should be a kernel panic. Is there a reasonable scenario where this is wrong?
32000 threads all trying to lock the same piece
On Tue, Apr 10, 2001 at 08:47:34AM +0100, David Howells wrote:
>
> Since you're willing to use CMPXCHG in your suggested implementation, would it
> make it make life easier if you were willing to use XADD too?
>
> Plus, are you really willing to limit the number of readers or writers to be
>
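The ~32767 figure follows from the counter layout: the low 16 bits carry the active-locker count and are interpreted as signed, with "negative" meaning contended, so the 32768th reader's increment flips the sign bit and the fastpath misreads the lock. A sketch (note the assert on the overflow case relies on gcc/clang's documented modulo behaviour for the int16_t conversion):

```c
#include <assert.h>
#include <stdint.h>

/* Read the active-locker field of the packed counter as the fastpath
 * effectively does: low 16 bits, signed. */
static int active_lockers(uint32_t count)
{
	return (int16_t)(count & 0xffffu);
}
```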
the fastpath, then
> the code at http://www.uow.edu.au/~andrewm/linux/rw_semaphore.tar.gz
> is bog-simple and works.
Sorry to pre-empt you, but have you seen my "advanced" patch? I sent it to
the list in an email with the subject:
[PATCH] i386 rw_semaphores, general abstr
David Howells wrote:
>
> Here's the RW semaphore patch attempt #4. This fixes the bugs that Andrew
> Morton's test cases showed up.
>
It still doesn't compile with gcc-2.91.66 because of the
"#define rwsemdebug(FMT, ...)" thing. What can we do
about this?
I cooked up a few more tests,
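The compile failure comes from `#define rwsemdebug(FMT, ...)` using C99 `__VA_ARGS__`, which gcc 2.91.66 predates. The GNU named-varargs form was already understood by egcs-era compilers, so one possible way out is a sketch like the following (not necessarily what the patch ended up doing; `dbgbuf` is an illustrative sink):

```c
#include <assert.h>
#include <stdio.h>

static char dbgbuf[64];

#ifdef __GNUC__
/* GNU named varargs: ", ##args" also swallows the comma when the
 * argument list is empty. */
#define rwsemdebug(FMT, args...) \
	snprintf(dbgbuf, sizeof(dbgbuf), FMT , ##args)
#else
/* C99 form, for compilers that have it. */
#define rwsemdebug(FMT, ...) \
	snprintf(dbgbuf, sizeof(dbgbuf), FMT, __VA_ARGS__)
#endif
```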
This patch (made against linux-2.4.4-pre2) takes Anton Blanchard's suggestions
and abstracts out the rwsem implementation somewhat. This makes the following
general files:
include/linux/rwsem.h - general RW semaphore wrapper
include/linux/rwsem-spinlock.h - rwsem
Linus Torvalds wrote:
>
>
> On Wed, 11 Apr 2001, Bernd Schmidt wrote:
> See? Do you see why a "memory" clobber is _not_ comparable to a "ax"
> clobber? And why that non-comparability makes a memory clobber equivalent
> to a read-modify-write cycle?
I had to think about this, so I'll explain it
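The practical consequence of Linus's point is the classic compiler barrier: an empty asm with only a "memory" clobber forces the compiler to treat all memory as both read and written, so cached values must be stored before it and reloaded after it. A minimal gcc/clang sketch:

```c
#include <assert.h>

/* "memory" clobber == the asm may read and write any memory, so the
 * compiler may not keep memory values cached in registers across it. */
#define barrier() __asm__ __volatile__("" : : : "memory")

static int flag;

static int read_flag_twice(void)
{
	int a = flag;
	barrier();	/* compiler must reload flag here, not reuse 'a' */
	int b = flag;
	return a + b;
}
```

Without the barrier the compiler would be entitled to fold the two loads into one, which is exactly why a plain "ax" clobber is not comparable.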
Hi,
> Here's the RW semaphore patch #3. This time with more asm constraints added.
Personally I care about sparc and ppc64 and as such would like to see the
slow paths end up in lib/rwsem.c protected by #ifndef __HAVE_ARCH_RWSEM
or something like that. If we couldn't get rwsems to work on x86,
> > Be careful there. CMOV is an optional instruction. gcc is arguably wrong
> > in using cmov in '686' mode. Building libs with cmov makes sense though
> > especially for the PIV with its ridiculously long pipeline
>
> It is just a matter of how you define "686 mode", otherwise the very concept
>
> Yes, the big 686 optimization is CMOV, and that one is
> ultra-pervasive.
Be careful there. CMOV is an optional instruction. gcc is arguably wrong
in using cmov in '686' mode. Building libs with cmov makes sense though
especially for the PIV with its ridiculously long pipeline
Alan Cox wrote:
>
> > Yes, the big 686 optimization is CMOV, and that one is
> > ultra-pervasive.
>
> Be careful there. CMOV is an optional instruction. gcc is arguably wrong
> in using cmov in '686' mode. Building libs with cmov makes sense though
> especially for the PIV with its ridiculously
Here's the RW semaphore patch attempt #4. This fixes the bugs that Andrew
Morton's test cases showed up.
It simplifies the __wake_up_ctx_common() function and adds an iterative clause
to the end of rwsem_wake().
David
diff -uNr linux-2.4.3/arch/i386/config.in linux-rwsem/arch/i386/config.in
> You need sterner testing stuff :) I hit the BUG at the end of rwsem_wake()
> in about a second running rwsem-4. Removed the BUG and everything stops
> in D state.
>
> Grab rwsem-4 from
>
> http://www.uow.edu.au/~andrewm/linux/rwsem.tar.gz
>
> It's very simple. But running fully
On Wed, 11 Apr 2001, David Howells wrote:
>
> > These numbers are infinity :)
>
> I know, but I think Linus may be happy with the resolution for the moment. It
> can be extended later by siphoning off excess quantities of waiters into a
> separate counter (as is done now) and by making the
On Wed, 11 Apr 2001, Bernd Schmidt wrote:
> >
> > The example in there compiles out-of-the box and is much easier to
> > experiment on than the whole kernel :-)
>
> That example seems to fail because a "memory" clobber only tells the compiler
> that memory is written, not that it is read.
The
Followup to: <[EMAIL PROTECTED]>
By author: Alan Cox <[EMAIL PROTECTED]>
In newsgroup: linux.dev.kernel
> >
> > Yes, and with CMPXCHG handler in the kernel it wouldn't be needed
> > (the other 686 optimizations like memcpy also work on 386)
>
> They would still be needed. The 686 built
Followup to: <[EMAIL PROTECTED]>
By author: Linus Torvalds <[EMAIL PROTECTED]>
In newsgroup: linux.dev.kernel
>
> Note that the "fixup" approach is not necessarily very painful at all,
> from a performance standpoint (either on 386 or on newer CPU's). It's not
> really that hard to just
Andrew Morton wrote:
> I think that's a very good approach. Sure, it's suboptimal when there
> are three or more waiters (and they're the right type and order). But
> that never happens. Nice design idea.
Cheers.
> These numbers are infinity :)
I know, but I think Linus may be happy with
David Howells wrote:
>
> Here's a patch that fixes RW semaphores on the i386 architecture. It is very
> simple in the way it works.
>
> The lock counter is dealt with as two semi-independent words: the LSW is the
> number of active (granted) locks, and the MSW, if negated, is the number of
>
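The counter layout described in the patch (LSW = number of active granted locks, MSW negated = number of waiters) can be sketched with plain bit manipulation; `pack` and the extractors are illustrative helpers, not the patch's code:

```c
#include <assert.h>
#include <stdint.h>

#define ACTIVE_MASK 0x0000ffffu

/* Build a counter word from an active count and a waiter count:
 * the high half stores the *negated* waiter count. */
static uint32_t pack(uint16_t active, uint16_t waiting)
{
	return (uint32_t)active | ((uint32_t)(uint16_t)-waiting << 16);
}

static uint16_t active_count(uint32_t c)
{
	return (uint16_t)(c & ACTIVE_MASK);
}

static uint16_t waiting_count(uint32_t c)
{
	return (uint16_t)-(uint16_t)(c >> 16);
}
```

Storing the waiter half negated is what lets a single XADD on the whole word both grant a lock and make the aggregate value go negative when waiters exist.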
Here's the RW semaphore patch #3. This time with more asm constraints added.
David
diff -uNr linux-2.4.3/arch/i386/config.in linux-rwsem/arch/i386/config.in
--- linux-2.4.3/arch/i386/config.in Thu Apr 5 14:44:04 2001
+++ linux-rwsem/arch/i386/config.in Wed Apr 11 08:38:04 2001
@@
On Wed, 11 Apr 2001, Andreas Franck wrote:
> Hello David,
>
> > I've been discussing it with some other kernel and GCC people, and they
> > think that only "memory" is required.
>
> Hmm... I just looked at my GCC problem report from December, perhaps you're
> interested, too:
>
> http://gcc.gnu.org/ml/gcc-bugs/2000-12/msg00554.html
I've been discussing it with some other kernel and GCC people, and they think
that only "memory" is required.
> What are the reasons against mentioning sem->count directly as a "=m"
> reference? This makes the whole thing less fragile and no more dependent
> on the memory layout of the
Hello David and people,
> I've just consulted with one of the gcc people we have here, and he says
> that
> the '"memory"' constraint should do the trick.
>
> Do I take it that that is actually insufficient?
I don't remember exactly, it's been a while, but I think it was not
sufficient when I
> I'd like you to look over it. It seems newer GCC's (snapshots and the
> upcoming 3.0) will be more strict when modifying some values through
> assembler-passed pointers - in this case, the passed semaphore structure got
> freed too early, causing massive stack corruption on early bootup.
>
>