On Fri, Sep 22, 2000 at 11:08:21AM -0700, Linus Torvalds wrote:
> The same is true in the filesystems - many of them want to change block or
> inode allocation bitmap bits, but they have to hold a lock anyway (usually
> the kernel lock in addition to the superblock lock.
The patch addresses the
On Fri, 22 Sep 2000, Andrea Arcangeli wrote:
>
> If you don't like the name smp_mb__before/after_clear_bit (not that I like it
> too much too) suggestions are welcome. Maybe we could use a single
> smp_mb__across_bitops() instead?
I suspect that we should just split up the bitops.
We've done
> On Fri, 22 Sep 2000, Andrea Arcangeli wrote:
> >
> > This patch fixes the spinlock problems in read_lock/write_lock and also some
> > alpha SMP race where clear_bit isn't enforcing a memory barrier
>
> Why would clear_bit() be a memory barrier?
Because historically it sort of was. This came
On Fri, 22 Sep 2000, Andrea Arcangeli wrote:
>
> This patch fixes the spinlock problems in read_lock/write_lock and also some
> alpha SMP race where clear_bit isn't enforcing a memory barrier
Why would clear_bit() be a memory barrier?
Anything that expects clear_bit() to be a memory barrier
This patch fixes the spinlock problems in read_lock/write_lock and also some
alpha SMP race where clear_bit isn't enforcing a memory barrier, plus some
improvement in some place where we can check the waitqueue is not full before
entering wake_up. There's some minor alpha compile fix too.
On the
On Fri, 22 Sep 2000, Andrea Arcangeli wrote:
This patch fixes the spinlock problems in read_lock/write_lock and also some
alpha SMP race where clear_bit isn't enforcing a memory barrier
Why would clear_bit() be a memory barrier?
Anything that expects clear_bit() to be a memory barrier
This patch fixes the spinlock problems in read_lock/write_lock and also some
alpha SMP race where clear_bit isn't enforcing a memory barrier, plus some
improvement in some place where we can check the waitqueue is not full before
entering wake_up. There's some minor alpha compile fix too.
On the
On Fri, Sep 22, 2000 at 11:08:21AM -0700, Linus Torvalds wrote:
The same is true in the filesystems - many of them want to change block or
inode allocation bitmap bits, but they have to hold a lock anyway (usually
the kernel lock in addition to the superblock lock.
The patch addresses the
On Fri, 22 Sep 2000, Andrea Arcangeli wrote:
This patch fixes the spinlock problems in read_lock/write_lock and also some
alpha SMP race where clear_bit isn't enforcing a memory barrier
Why would clear_bit() be a memory barrier?
Because historically it sort of was. This came up with
On Fri, 22 Sep 2000, Andrea Arcangeli wrote:
If you don't like the name smp_mb__before/after_clear_bit (not that I like it
too much too) suggestions are welcome. Maybe we could use a single
smp_mb__across_bitops() instead?
I suspect that we should just split up the bitops.
We've done this
On Tue, Sep 19, 2000 at 04:16:07PM -0400, John Wehle wrote:
> Umm ... "miscompilation"? As in the compiler produced the wrong code
> based on the input provided?
That's not a gcc bug (gcc is doing the right thing). It's the kernel that
should use the "memory" clobber in the spinlock
> I see. So Jamie was right and we reproduced a case of miscompilation.
Umm ... "miscompilation"? As in the compiler produced the wrong code
based on the input provided?
int * p;
...
a = *p;
movl p,%eax
movl (%eax),%edx
The assembly code appears to load the address
On Tue, Sep 19, 2000 at 07:22:48PM +0200, Jamie Lokier wrote:
> That instruction loads the _value_ of p. I.e. reads the memory from
> location 0x80495a4 into %eax. The source instruction was:
>
>movl p,%eax
>
> The instructions that you're thinking of, that load fixed addresses,
>
On Tue, Sep 19, 2000 at 10:23:05AM -0700, Richard Henderson wrote:
> On Tue, Sep 19, 2000 at 04:32:16PM +0200, Andrea Arcangeli wrote:
> > Wrong: it's really loading the _address_.
> [...]
> > 80483f5: a1 a4 95 04 08 mov0x80495a4,%eax
> >^^^
On Tue, Sep 19, 2000 at 04:32:16PM +0200, Andrea Arcangeli wrote:
> Wrong: it's really loading the _address_.
[...]
> 80483f5: a1 a4 95 04 08 mov0x80495a4,%eax
>^^^ ^
No, that's an absolute memory load. If we were loading
On Tue, Sep 19, 2000 at 04:01:26PM +0100, David Howells wrote:
> I can't remember exactly what it was now, but I think it was either something
> to do with spinlocks or bitops. I'll re-investigate tonight and see if I can
> come back with some benchmarks/code-snippets tomorrow.
Yes you should
I've been writing a kernel module and I've noticed a measurable performance
drop between the same code compiled against linux-2.4.0-test7 and against
test8. I disassembled the code to try and work out what was going on and I saw
the following happen:
* [test8]
One of the atomic memory
On Mon, Sep 18, 2000 at 09:37:43PM -0400, John Wehle wrote:
> It's perhaps not optimal, however I'm not sure that it's wrong. In
It's not "wrong" in the sense that something breaks but it's definitely
suboptimal. There's no reason to reload a value that can't change because it's
embedded into
Andrea Arcangeli wrote:
> int * p;
> [...]
> spin_lock();
> a = *p;
> spin_unlock();
>
> spin_lock();
> b = *p;
> spin_unlock();
> [With "memory" clobber"] the [second] reload of the address of `p'
> isn't necessary and gcc is wrong in
Andrea Arcangeli wrote:
int * p;
[...]
spin_lock(lock);
a = *p;
spin_unlock(lock);
spin_lock(lock);
b = *p;
spin_unlock(lock);
[With "memory" clobber"] the [second] reload of the address of `p'
isn't necessary and gcc is wrong in
I've been writing a kernel module and I've noticed a measurable performance
drop between the same code compiled against linux-2.4.0-test7 and against
test8. I disassembled the code to try and work out what was going on and I saw
the following happen:
* [test8]
One of the atomic memory
On Tue, Sep 19, 2000 at 04:01:26PM +0100, David Howells wrote:
I can't remember exactly what it was now, but I think it was either something
to do with spinlocks or bitops. I'll re-investigate tonight and see if I can
come back with some benchmarks/code-snippets tomorrow.
Yes you should tell
On Mon, Sep 18, 2000 at 09:37:43PM -0400, John Wehle wrote:
It's perhaps not optimal, however I'm not sure that it's wrong. In
It's not "wrong" in the sense that something breaks but it's definitely
suboptimal. There's no reason to reload a value that can't change because it's
embedded into
On Tue, Sep 19, 2000 at 04:32:16PM +0200, Andrea Arcangeli wrote:
Wrong: it's really loading the _address_.
[...]
80483f5: a1 a4 95 04 08 mov0x80495a4,%eax
^^^ ^
No, that's an absolute memory load. If we were loading
the
On Tue, Sep 19, 2000 at 10:23:05AM -0700, Richard Henderson wrote:
On Tue, Sep 19, 2000 at 04:32:16PM +0200, Andrea Arcangeli wrote:
Wrong: it's really loading the _address_.
[...]
80483f5: a1 a4 95 04 08 mov0x80495a4,%eax
^^^
I see. So Jamie was right and we reproduced a case of miscompilation.
Umm ... "miscompilation"? As in the compiler produced the wrong code
based on the input provided?
int * p;
...
a = *p;
movl p,%eax
movl (%eax),%edx
The assembly code appears to load the address
> The reload of the address of `p' isn't necessary and gcc is wrong in
> generating it. p is a constant embedded into the .text section ...
It's perhaps not optimal, however I'm not sure that it's wrong. In
any case if you can supply a small standalone test case (i.e. preprocessed
source code)
On Mon, Sep 18, 2000 at 07:53:04PM -0400, John Wehle wrote:
> What version of gcc? Recently some work was done to improve the handling of
> constant memory.
I'm using 2.95.2 19991024.
Take this small testcase:
#include
int * p;
spinlock_t lock = SPIN_LOCK_UNLOCKED;
extern void dummy(int,
> I read the asm produced by some of some of my testcases. The current spinlock
> implementation seems to do exactly the _right_ thing in practice and nothing
> more. "memory" was instead causing reloads of constant addresses into registers
> and stuff that shouldn't be necessary (I was infact
On Mon, Sep 18, 2000 at 03:39:50PM -0700, Linus Torvalds wrote:
> Have you looked at the code it generates? Quite sad, really.
I read the asm produced by some of some of my testcases. The current spinlock
implementation seems to do exactly the _right_ thing in practice and nothing
more.
On Tue, 19 Sep 2000, Andrea Arcangeli wrote:
>
> I think we can remove all the __dummy stuff and put the "memory" in such asm
> statements.
>
> Comments?
Have you looked at the code it generates? Quite sad, really.
Linus
-
To unsubscribe from this list: send the line
On Fri, Sep 08, 2000 at 01:41:05PM +0200, Jamie Lokier wrote:
> Casting via __dummy is there so that the "m" (or "=m") memory constraint
> will make that operand refer to the actual object in memory, and not a
> copy (in a different area of memory).
Are you really sure gcc could pass a copy even
On Fri, Sep 08, 2000 at 01:41:05PM +0200, Jamie Lokier wrote:
Casting via __dummy is there so that the "m" (or "=m") memory constraint
will make that operand refer to the actual object in memory, and not a
copy (in a different area of memory).
Are you really sure gcc could pass a copy even
On Mon, Sep 18, 2000 at 03:39:50PM -0700, Linus Torvalds wrote:
Have you looked at the code it generates? Quite sad, really.
I read the asm produced by some of some of my testcases. The current spinlock
implementation seems to do exactly the _right_ thing in practice and nothing
more.
I read the asm produced by some of some of my testcases. The current spinlock
implementation seems to do exactly the _right_ thing in practice and nothing
more. "memory" was instead causing reloads of constant addresses into registers
and stuff that shouldn't be necessary (I was infact
On Tue, 19 Sep 2000, Andrea Arcangeli wrote:
I think we can remove all the __dummy stuff and put the "memory" in such asm
statements.
Comments?
Have you looked at the code it generates? Quite sad, really.
Linus
-
To unsubscribe from this list: send the line
On Mon, Sep 18, 2000 at 07:53:04PM -0400, John Wehle wrote:
What version of gcc? Recently some work was done to improve the handling of
constant memory.
I'm using 2.95.2 19991024.
Take this small testcase:
#include linux/spinlock.h
int * p;
spinlock_t lock = SPIN_LOCK_UNLOCKED;
extern
The reload of the address of `p' isn't necessary and gcc is wrong in
generating it. p is a constant embedded into the .text section ...
It's perhaps not optimal, however I'm not sure that it's wrong. In
any case if you can supply a small standalone test case (i.e. preprocessed
source code)
Andrea Arcangeli wrote:
> >BTW Look also into asm-i386/bitops.h and dummy cast to some crap there.
> >Are you impressed? 8)
>
> Yep 8). If we add "memory" such stuff could be removed I think. As far I
> can see the object of such stuff is to cause gcc to say `I'm too lazy to
> see exactly what
Andrea Arcangeli wrote:
BTW Look also into asm-i386/bitops.h and dummy cast to some crap there.
Are you impressed? 8)
Yep 8). If we add "memory" such stuff could be removed I think. As far I
can see the object of such stuff is to cause gcc to say `I'm too lazy to
see exactly what memory
40 matches
Mail list logo