Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool

Yeah.  Compiler errors are more annoying though I dare say ;-)


Actually, compile-time errors are fine,


Yes, they don't cause data corruption or anything like that,
but I still don't think the 390 people want to ship a kernel
that doesn't build -- and it seems they still need to support
GCC versions before 4.0.  Up until the day they can finally
drop 3.x, they'll have to...


and easy to work around.


...use the simple workaround, annoying as it may be, at least
it works.  That's all I'm saying.


*Much*
more annoying is when gcc actively generates subtly bad code.


Quite so.


We've had
use-after-free issues due to incorrect gcc liveness calculations etc, 
and
inline asm has beeen one of the more common causes - exactly because 
the

kernel is one of the few users (along with glibc) that uses it at all.


The good news is that things are getting better since more and
more of the RTL transformations are ripped out.


Now *those* are hard to find - the code works most of the time, but the
compiler has inserted a really subtle race condition into the code
(deallocated a local stack entry before last use). We had that with our
semaphore code at some point.


Yeah, magnifying glass + lots of time kind of bug.  Lovely :-/


Segher

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Linus Torvalds


On Sun, 12 Aug 2007, Segher Boessenkool wrote:
> 
> Yeah.  Compiler errors are more annoying though I dare say ;-)

Actually, compile-time errors are fine, and easy to work around. *Much* 
more annoying is when gcc actively generates subtly bad code. We've had 
use-after-free issues due to incorrect gcc liveness calculations etc, and 
inline asm has beeen one of the more common causes - exactly because the 
kernel is one of the few users (along with glibc) that uses it at all.

Now *those* are hard to find - the code works most of the time, but the 
compiler has inserted a really subtle race condition into the code 
(deallocated a local stack entry before last use). We had that with our 
semaphore code at some point.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Linus Torvalds


On Sun, 12 Aug 2007, Segher Boessenkool wrote:
> 
> It works _most of the time_.

It used to have problems. Gcc has had problems in various areas.

> Ask Martin.  Oh you don't even have to,
> he told you two mails ago.  My last mail simply pointed out that this
> isn't a GCC bug, but merely documented behaviour.

It was *not* documented behaviour, and there were gcc people who actively 
*encouraged* use of "+m", and actually fixed gcc.

The documentation is buggy.

Live with it.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool

"+m" works. We use it. It's better than the alternatives. Pointing to
stale documentation doesn't change anything.


Well, perhaps on i386. I've seen some older versions of the s390 gcc 
die
with an ICE because I have used "+m" in some kernel inline assembly. 
I'm

happy to hear that this issue is fixed in recent gcc. Now I'll have to
find out if this is already true with gcc 3.x.


It was fixed (that is, "+m" is translated into a separate read
and write by GCC itself) in GCC-4.0.0, I just learnt.

The duplication "=m" and "m" with the same constraint is rather 
annoying.


Yeah.  Compiler errors are more annoying though I dare say ;-)


Segher

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool

Yes, though I would use "=m" on the output list and "m" on the input
list. The reason is that I've seen gcc fall on its face with an ICE 
on
s390 due to "+m". The explanation I've got from our compiler people 
was

quite esoteric, as far as I remember gcc splits "+m" to an input
operand
and an output operand. Now it can happen that the compiler chooses 
two

different registers to access the same memory location. "+m" requires
that the two memory references are identical which causes the ICE if
they are not.


The problem is very nicely described here, last paragraph:


It's not a problem anymore in (very) recent GCC, although
that of course won't help you in the kernel (yet).


So you are saying that gcc 3.x still has this problem ?


Yes.  A warning ("read-write constraint does not allow a register")
was added for GCC-3.4, but the fix/workaround is more recent
(4.2 I believe, perhaps it was backported to the 4.1 branch).


I do not know if the current compilers still do this. Has
anyone else seen this happen ?


In recent GCC, it's actually documented:

	 The ordinary output operands must be write-only; GCC will assume 
that

the values in these operands before the instruction are dead and need
not be generated.  Extended asm supports input-output or read-write
	operands.  Use the constraint character `+' to indicate such an 
operand

and list it with the output operands.  You should only use read-write
	operands when the constraints for the operand (or the operand in 
which

only some of the bits are to be changed) allow a register.

Note that last line.


I see, thanks for the info.


My pleasure.


Segher

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool

Note that last line.


Segher, how about you just accept that Linux uses gcc as per reality, 
and

that sometimes the reality is different from your expectations?

"+m" works.


It works _most of the time_.  Ask Martin.  Oh you don't even have to,
he told you two mails ago.  My last mail simply pointed out that this
isn't a GCC bug, but merely documented behaviour.  Well okay, it was
documented only after people found problems with it; it is an edge
case and pretty hard to trigger (impossible to trigger on RISC
architectures, and not very likely on x86 either, for different
reasons).

Since "realistic" people keep using "+m" anyhow, current GCC has added
a workaround that splits "+m" into an "=m" and an "m" constraint.  It
likely will be accepted as a permanent feature.  However, that 
workaround

doesn't yet exist on some of the compilers Linux still allows building
with (but the problem does).


We use it. It's better than the alternatives.


How is this better than simply writing the slightly more verbose
asm("..." : "=m"(XX) : "m"(XX)) ?  It's a few more characters.
asm() is hard to write (correctly) anyway.  I don't see the problem.


Pointing to stale documentation doesn't change anything.


It's not "stale" documentation, it's freshly built from GCC mainline.
It's also not "stale" in the sense that it isn't updated; in fact,
the email I pointed to is from a thread where a change in this exact
paragraph in the documentation was suggested (a couple of weeks ago).


The same is true of accesses through a volatile pointer. The kernel has
done that for a *loong* time (all MMIO IO is done that way),


(All MMIO on certain architectures).  This however is a completely
different case; there _is_ no underlying C object declared anywhere,
so no way can GCC prove that that object isn't volatile, so it _has_
to assume it is.

Also, in case of e.g. readb(), if the compiler can prove the resulting
value isn't used it doesn't have to perform the access at all -- the
documentation (again, not "stale" documentation) points this out quite
literally.


and it's
totally pointless to say that "volatile" isn't guaranteed to do what it
does.


I actually asked a couple of other GCC developers last night, and
they all agreed with me.  If you look at historic GCC problem reports
(I pointed at a few earlier in these threads) you would see that the
situation is far from clear.


It works, and quite frankly, if it didn't work, it would be a gcc
BUG.


How can it be a bug, if the semantics you think are guaranteed are
guaranteed in your (and others') minds only?

Of course, GCC likes to cater to its users, and tries to make things
work the way the programmer probably intended.  However it a) cannot
*actually* read your mind, it just looks that way; b) it _also_ has
to implement the _correct_ semantics of volatile; c) since this isn't
part of any C version (no, not GNU99 or similar either), sometimes
GCC developers do not think about what some people expect the compiler
should do with such code, but just implement the requirements they
*know* exist.  And that's when "BUGs" happen.

If you want GCC to do what you want it to do instead of what it does,
why not file an enhancement request?  .

To save you the trouble, I just did: .  Let's
see what comes from it.

And again, this is not a "C standard" issue. It's a "sane 
implementation"

issue.


Maybe what you consider sane isn't as clear-cut as you think?  I'm
not alone with this opinion, as the mailing lists archives readily
show.


Linux expects the compiler to be sane.


And the compiler expects the code it is asked to translate to
be reasonably sane.  Nothing new here.  The compiler is more
forgiving than the Linux code, IMHO; that's why I suggested
using the obviously correct asm implementation of the atomic
get/set, rather than using the not-so-obviously-correct and
error-prone volatile thing.  The semantics of the asm() are
exactly the semantics you expect from an atomic get/set, while
that of the volatile implementation are *not*; and on top of
that it is very well-tested well-understood technology.


If it isn't, that's not our problem.


What, if the compiler doesn't work as you expect, like a silent
miscompilation, or even only the compiler ICEs, that is not a
problem for users and/or developers of Linux?  Interesting
opinion.  I'd rather stay clear of known pitfalls.


gcc *is* sane,


Yes, it is.


and I don't see why you constantly act as if it wasn't.


I guess I didn't express myself clearly enough then.  Hopefully
some other people _did_ understand what I meant.  In very harsh
words: "not all (proposed) kernel code is sane".


Segher

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Linus Torvalds


On Sun, 12 Aug 2007, Martin Schwidefsky wrote:
> 
> The duplication "=m" and "m" with the same constraint is rather 
> annoying.

It's not only annoying, it causes gcc to generate bad code too. At least 
certain versions of gcc will generate the address *twice*, even if there 
is obviously only one address used.

If you have problems with "+m", you are often actually better off using 
just "m" and then adding a memory clobber. But that has other code 
generation downsides.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Martin Schwidefsky
On Sat, 2007-08-11 at 23:09 -0700, Linus Torvalds wrote: 
> Segher, how about you just accept that Linux uses gcc as per reality, and 
> that sometimes the reality is different from your expectations?
> 
> "+m" works. We use it. It's better than the alternatives. Pointing to 
> stale documentation doesn't change anything.

Well, perhaps on i386. I've seen some older versions of the s390 gcc die
with an ICE because I have used "+m" in some kernel inline assembly. I'm
happy to hear that this issue is fixed in recent gcc. Now I'll have to
find out if this is already true with gcc 3.x. The duplication "=m" and
"m" with the same constraint is rather annoying.

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Martin Schwidefsky
On Sun, 2007-08-12 at 07:53 +0200, Segher Boessenkool wrote:
> > Yes, though I would use "=m" on the output list and "m" on the input
> > list. The reason is that I've seen gcc fall on its face with an ICE on
> > s390 due to "+m". The explanation I've got from our compiler people was
> > quite esoteric, as far as I remember gcc splits "+m" to an input 
> > operand
> > and an output operand. Now it can happen that the compiler chooses two
> > different registers to access the same memory location. "+m" requires
> > that the two memory references are identical which causes the ICE if
> > they are not.
> 
> The problem is very nicely described here, last paragraph:
> 
> 
> It's not a problem anymore in (very) recent GCC, although
> that of course won't help you in the kernel (yet).

So you are saying that gcc 3.x still has this problem ?

> > I do not know if the current compilers still do this. Has
> > anyone else seen this happen ?
> 
> In recent GCC, it's actually documented:
> 
>The ordinary output operands must be write-only; GCC will assume that
>   the values in these operands before the instruction are dead and need
>   not be generated.  Extended asm supports input-output or read-write
>   operands.  Use the constraint character `+' to indicate such an operand
>   and list it with the output operands.  You should only use read-write
>   operands when the constraints for the operand (or the operand in which
>   only some of the bits are to be changed) allow a register.
> 
> Note that last line.

I see, thanks for the info. 

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-11 Thread Linus Torvalds


On Sun, 12 Aug 2007, Segher Boessenkool wrote:
>
> Note that last line.

Segher, how about you just accept that Linux uses gcc as per reality, and 
that sometimes the reality is different from your expectations?

"+m" works. We use it. It's better than the alternatives. Pointing to 
stale documentation doesn't change anything.

The same is true of accesses through a volatile pointer. The kernel has 
done that for a *loong* time (all MMIO IO is done that way), and it's 
totally pointless to say that "volatile" isn't guaranteed to do what it 
does. It works, and quite frankly, if it didn't work, it would be a gcc 
BUG.

And again, this is not a "C standard" issue. It's a "sane implementation" 
issue. Linux expects the compiler to be sane. If it isn't, that's not our 
problem. gcc *is* sane, and I don't see why you constantly act as if it 
wasn't.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-11 Thread Segher Boessenkool

You'd have to use "+m".


Yes, though I would use "=m" on the output list and "m" on the input
list. The reason is that I've seen gcc fall on its face with an ICE on
s390 due to "+m". The explanation I've got from our compiler people was
quite esoteric, as far as I remember gcc splits "+m" to an input 
operand

and an output operand. Now it can happen that the compiler chooses two
different registers to access the same memory location. "+m" requires
that the two memory references are identical which causes the ICE if
they are not.


The problem is very nicely described here, last paragraph:


It's not a problem anymore in (very) recent GCC, although
that of course won't help you in the kernel (yet).


I do not know if the current compilers still do this. Has
anyone else seen this happen ?


In recent GCC, it's actually documented:

 The ordinary output operands must be write-only; GCC will assume that
the values in these operands before the instruction are dead and need
not be generated.  Extended asm supports input-output or read-write
operands.  Use the constraint character `+' to indicate such an operand
and list it with the output operands.  You should only use read-write
operands when the constraints for the operand (or the operand in which
only some of the bits are to be changed) allow a register.

Note that last line.


Segher

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Martin Schwidefsky
On Thu, 2007-08-09 at 10:55 -0700, Linus Torvalds wrote:
> > You can use this forget() macro to make the compiler reread a variable:
> > 
> > #define forget(var) asm volatile ("" : "=m"(var))
> 
> No. That will also make the compiler "forget" any previous writes to it, 
> so it changes behaviour.
> 
> You'd have to use "+m".

Yes, though I would use "=m" on the output list and "m" on the input
list. The reason is that I've seen gcc fall on its face with an ICE on
s390 due to "+m". The explanation I've got from our compiler people was
quite esoteric, as far as I remember gcc splits "+m" to an input operand
and an output operand. Now it can happen that the compiler chooses two
different registers to access the same memory location. "+m" requires
that the two memory references are identical which causes the ICE if
they are not. I do not know if the current compilers still do this. Has
anyone else seen this happen ?

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Linus Torvalds


On Thu, 9 Aug 2007, Chuck Ebbert wrote:
> 
> You can use this forget() macro to make the compiler reread a variable:
> 
> #define forget(var) asm volatile ("" : "=m"(var))

No. That will also make the compiler "forget" any previous writes to it, 
so it changes behaviour.

You'd have to use "+m".

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Martin Schwidefsky
On Thu, 2007-08-09 at 13:36 -0400, Chuck Ebbert wrote:
> > Fair enough.  Casting to (volatile int *) will give us the behavior
> > people expect when using atomic_t without needing to use inefficient
> > barriers.
> > 
> 
> You can use this forget() macro to make the compiler reread a
> variable:
> 
> #define forget(var) asm volatile ("" : "=m"(var))

You need to specify a "m"(var) input constraint as well. Without it the
compiler might remove the initialization of var. E.g.

void fn(void)
{
int var = 0;
forget(var);
/* now var can have any value. */
}

-- 
blue skies,
  Martin.

"Reality continues to ruin my life." - Calvin.


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Chuck Ebbert
On 08/09/2007 03:31 AM, Chris Snook wrote:
> 
> Fair enough.  Casting to (volatile int *) will give us the behavior
> people expect when using atomic_t without needing to use inefficient
> barriers.
> 

You can use this forget() macro to make the compiler reread a variable:

#define forget(var) asm volatile ("" : "=m"(var))
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Linus Torvalds


On Thu, 9 Aug 2007, Jerry Jiang wrote:
> 
> and still not to said "Why the *volatile-accesses-in-code* is
> acceptable"

I don't think volatile is necessarily wonderful in code _either_. So I 
think the "atomic_read()" issue would be even better off if we just made 
sure everybody behaves well and has the right barriers.

So the difference between "volatile in code" and "volatile on data 
structures" is that the latter is *always* wrong (and wrong for very 
fundamental reasons). 

But "volatile in code" can be perfectly fine. If you hide the volatile in 
an accessor function, and just specify that the rules for that function is 
that it always loads from memory but doesn't imply any memory barriers, 
than that is fine. And I think those kinds of semantics may well be 
perfectly sane for "atomic_read()".

So I think a volatile access inside "atomic_read()" at least has 
well-defined semantics. Are they the best possible semantics? I dunno. I 
can imagine that it's perfectly fine, but on the other hand, it would be 
interesting to see any code for which it matters - it's entirely possible 
that the code itself is the problem, and should instead have a 
"cpu_relax()" or similar that implies a barrier.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Chris Snook

Herbert Xu wrote:

On Thu, Aug 09, 2007 at 03:47:57AM -0400, Chris Snook wrote:
If they're not doing anything, sure.  Plenty of loops actually do some sort 
of real work while waiting for their halt condition, possibly even work 
which is necessary for their halt condition to occur, and you definitely 
don't want to be doing cpu_relax() in this case.  On register-rich 
architectures you can do quite a lot of work without needing to reuse the 
register containing the result of the atomic_read().  Those are precisely 
the architectures where barrier() hurts the most.


I have a problem with this argument.  The same loop could be
using a non-atomic as long as the updaters are serialised.  Would
you suggest that we turn such non-atomics into volatiles too?


No.  I'm simply saying that when people call atomic_read, they expect a read to 
occur.  When this doesn't happen, people get confused.  Merely referencing a 
variable doesn't carry the same connotation.


Anyway, I'm taking Linus's advice and putting the cast in atomic_read(), rather 
than the counter declaration itself.  Everything else uses __asm__ __volatile__, 
or calls atomic_read() with interrupts disabled.  This ensures that 
atomic_read() works as expected across all architectures, without the cruft the 
compiler generates when you declare the variable itself volatile.



Any loop that's waiting for an external halt condition either
has to schedule away (which is a barrier) or you'd be busy
waiting in which case you should use cpu_relax.


Not necessarily.  Some code uses atomic_t for a sort of lightweight semaphore. 
If your loop is actually doing real work, perhaps in a softirq handler 
negotiating shared resources with a hard irq handler, you're not busy-waiting.



Do you have an example where this isn't the case?


a) No, and that's sort of the point.  We shouldn't have to audit the whole 
kernel to find cases where a misleadingly-named function is doing what its users 
are not expecting.  If we can make it always do the right thing without any 
substantial drawbacks, we should.


b) Loops are just one case, which came to mind because of the IPVS bug recently 
discussed.  I recall seeing some scheduler code recently which does an 
atomic_read() twice on the same variable, with a barrier in between.  It's in a 
very hot path, so if we can remove that barrier, we save a bunch of register 
loads.  When you're context switching every microsecond in SCHED_RR, that really 
matters.


-- Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Herbert Xu
On Thu, Aug 09, 2007 at 03:47:57AM -0400, Chris Snook wrote:
>
> If they're not doing anything, sure.  Plenty of loops actually do some sort 
> of real work while waiting for their halt condition, possibly even work 
> which is necessary for their halt condition to occur, and you definitely 
> don't want to be doing cpu_relax() in this case.  On register-rich 
> architectures you can do quite a lot of work without needing to reuse the 
> register containing the result of the atomic_read().  Those are precisely 
> the architectures where barrier() hurts the most.

I have a problem with this argument.  The same loop could be
using a non-atomic as long as the updaters are serialised.  Would
you suggest that we turn such non-atomics into volatiles too?

Any loop that's waiting for an external halt condition either
has to schedule away (which is a barrier) or you'd be busy
waiting in which case you should use cpu_relax.

Do you have an example where this isn't the case?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Jerry Jiang
On Thu, 09 Aug 2007 11:10:16 +0200
Bodo Eggert <[EMAIL PROTECTED]> wrote:

> > 
> > Why the *volatile-accesses-in-code* is acceptable, does C standard make it
> > clear?
> 
> http://lwn.net/Articles/233482/

I have read this article before, but What Linus said only focusing on
the conclusion-- The semantics of it are so unclear as
to be totally useless.

and still not to said "Why the *volatile-accesses-in-code* is
acceptable"

-- Jerry


> -- 
> Fun things to slip into your budget
> Heisenberg Compensator upgrade kit
> 
> Friß, Spammer: [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Bodo Eggert
Jerry Jiang <[EMAIL PROTECTED]> wrote:
> On Wed, 8 Aug 2007 21:18:25 -0700 (PDT)
>> On Wed, 8 Aug 2007, Chris Snook wrote:

>> > Some architectures currently do not declare the contents of an atomic_t to
>> > be
>> > volatile.  This causes confusion since atomic_read() might not actually
>> > read anything if an optimizing compiler re-uses a value stored in a
>> > register, which can break code that loops until something external changes
>> > the value of an atomic_t.
>> 
>> I'd be *much* happier with "atomic_read()" doing the "volatile" instead.
>> 
>> The fact is, volatile on data structures is a bug. It's a wart in the C
>> language. It shouldn't be used.
> 
> Why? It's a wart! Is it due to unclear C standard on volatile related point?
> 
> Why the *volatile-accesses-in-code* is acceptable, does C standard make it
> clear?

http://lwn.net/Articles/233482/
-- 
Fun things to slip into your budget
Heisenberg Compensator upgrade kit

Friß, Spammer: [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Heiko Carstens
On Thu, Aug 09, 2007 at 03:31:10AM -0400, Chris Snook wrote:
>  Linus Torvalds wrote:
> > I'd be *much* happier with "atomic_read()" doing the "volatile" instead.
> > The fact is, volatile on data structures is a bug. It's a wart in the C 
> > language. It shouldn't be used. Volatile accesses in *code* can be ok, and 
> > if we have "atomic_read()" expand to a "*(volatile int *)&(x)->value", then 
> > I'd be ok with that.
> > But marking data structures volatile just makes the compiler screw up 
> > totally, and makes code for initialization sequences etc much worse.
> > Linus
> 
>  Fair enough.  Casting to (volatile int *) will give us the behavior people 
>  expect when using atomic_t without needing to use inefficient barriers.
> 
>  While we have the hood up, should we convert all the atomic_t's to 
>  non-volatile and put volatile casts in all the atomic_reads?  I don't know 
>  enough about the various arches to say with confidence that those changes 
>  alone will preserve existing behavior.  We might need some arch-specific 
>  tweaking of the atomic operations.

If you write that patch could you include the atomic64 variants as well,
please? Besides that just post the patch to linux-arch and maintainers
should speak up.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Chris Snook

Herbert Xu wrote:

Chris Snook <[EMAIL PROTECTED]> wrote:

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads


Such loops should always use something like cpu_relax() which comes
with a barrier.


If they're not doing anything, sure.  Plenty of loops actually do some sort of 
real work while waiting for their halt condition, possibly even work which is 
necessary for their halt condition to occur, and you definitely don't want to be 
doing cpu_relax() in this case.  On register-rich architectures you can do quite 
a lot of work without needing to reuse the register containing the result of the 
atomic_read().  Those are precisely the architectures where barrier() hurts the 
most.



of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally


Do you have an example of such a loop where performance is hurt by this?


Not handy.  Perhaps more interesting are cases where we access the same atomic_t 
twice in a hot path.  If we can remove some of those barriers, those hot paths 
will get faster.


Performance was only part of the motivation.  The IPVS bug was an example of how 
atomic_t is assumed (not always correctly) to work, and other recent discussions 
on this list have made it clear that most people assume atomic_read() actually 
reads something every time, and don't even think to consult the documentation 
until they find out the hard way that it does not.  I'm not saying we should 
encourage lazy programming, but in this case the assumption is reasonable 
because that's how people actually use atomic_t, and making this behavior 
uniform across all architectures makes it more convenient to do things the right 
way, which we should encourage.



The IPVS code that led to this patch was simply broken and has been
fixed to use cpu_relax().


I agree, busy-waiting should be done with cpu_relax(), if at all.  I'm more 
concerned about cases that are not busy-waiting, but could still get compiled 
with the same optimization.


-- Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-09 Thread Chris Snook

Linus Torvalds wrote:


On Wed, 8 Aug 2007, Chris Snook wrote:

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.


I'd be *much* happier with "atomic_read()" doing the "volatile" instead.

The fact is, volatile on data structures is a bug. It's a wart in the C 
language. It shouldn't be used. 

Volatile accesses in *code* can be ok, and if we have "atomic_read()" 
expand to a "*(volatile int *)&(x)->value", then I'd be ok with that.


But marking data structures volatile just makes the compiler screw up 
totally, and makes code for initialization sequences etc much worse.


Linus


Fair enough.  Casting to (volatile int *) will give us the behavior people 
expect when using atomic_t without needing to use inefficient barriers.


While we have the hood up, should we convert all the atomic_t's to non-volatile 
and put volatile casts in all the atomic_reads?  I don't know enough about the 
various arches to say with confidence that those changes alone will preserve 
existing behavior.  We might need some arch-specific tweaking of the atomic 
operations.


-- Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Jerry Jiang
On Wed, 8 Aug 2007 21:18:25 -0700 (PDT)
Linus Torvalds <[EMAIL PROTECTED]> wrote:

> 
> 
> On Wed, 8 Aug 2007, Chris Snook wrote:
> > 
> > Some architectures currently do not declare the contents of an atomic_t to 
> > be
> > volatile.  This causes confusion since atomic_read() might not actually read
> > anything if an optimizing compiler re-uses a value stored in a register, 
> > which
> > can break code that loops until something external changes the value of an
> > atomic_t.
> 
> I'd be *much* happier with "atomic_read()" doing the "volatile" instead.
> 
> The fact is, volatile on data structures is a bug. It's a wart in the C 
> language. It shouldn't be used. 

Why? It's a wart! Is it due to unclear C standard on volatile related point?

Why the *volatile-accesses-in-code* is acceptable, does C standard make it 
clear?

-- Jerry

> 
> Volatile accesses in *code* can be ok, and if we have "atomic_read()" 
> expand to a "*(volatile int *)&(x)->value", then I'd be ok with that.
> 
> But marking data structures volatile just makes the compiler screw up 
> totally, and makes code for initialization sequences etc much worse.
> 
>   Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Linus Torvalds


On Wed, 8 Aug 2007, Chris Snook wrote:
> 
> Some architectures currently do not declare the contents of an atomic_t to be
> volatile.  This causes confusion since atomic_read() might not actually read
> anything if an optimizing compiler re-uses a value stored in a register, which
> can break code that loops until something external changes the value of an
> atomic_t.

I'd be *much* happier with "atomic_read()" doing the "volatile" instead.

The fact is, volatile on data structures is a bug. It's a wart in the C 
language. It shouldn't be used. 

Volatile accesses in *code* can be ok, and if we have "atomic_read()" 
expand to a "*(volatile int *)&(x)->value", then I'd be ok with that.

But marking data structures volatile just makes the compiler screw up 
totally, and makes code for initialization sequences etc much worse.

Linus
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Paul E. McKenney
On Wed, Aug 08, 2007 at 06:48:24PM -0700, David Miller wrote:
> From: Herbert Xu <[EMAIL PROTECTED]>
> Date: Thu, 09 Aug 2007 09:03:27 +0800
> 
> > Such loops should always use something like cpu_relax() which comes
> > with a barrier.
> 
> This is an excellent point.
> 
> And it needs to be weighed with the error prone'ness Andrew mentioned.
> There probably is a middle ground somewhere.

OK...  I'll bite.  ACCESS_ONCE(), see http://lkml.org/lkml/2007/7/11/664.

This would allow ACCESS_ONCE(atomic_read(&x)) to be used where refetching
would be problematic, but allow the compiler free rein in cases where
refetching is OK.

The ACCESS_ONCE() primitive of course has its limitations as well, but
you did ask for a middle ground.  ;-)

Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread David Miller
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Thu, 09 Aug 2007 09:03:27 +0800

> Such loops should always use something like cpu_relax() which comes
> with a barrier.

This is an excellent point.

And it needs to be weighed with the error prone'ness Andrew mentioned.
There probably is a middle ground somewhere.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Herbert Xu
Chris Snook <[EMAIL PROTECTED]> wrote:
> 
> Some architectures currently do not declare the contents of an atomic_t to be
> volatile.  This causes confusion since atomic_read() might not actually read
> anything if an optimizing compiler re-uses a value stored in a register, which
> can break code that loops until something external changes the value of an
> atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads

Such loops should always use something like cpu_relax() which comes
with a barrier.

> of all registers used in the loop, thus hurting performance instead of helping
> it, particularly on architectures where it's unnecessary.  Since we generally

Do you have an example of such a loop where performance is hurt by this?

The IPVS code that led to this patch was simply broken and has been
fixed to use cpu_relax().

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Jesper Juhl
On 09/08/2007, Chris Snook <[EMAIL PROTECTED]> wrote:
> Jesper Juhl wrote:
> > On 09/08/2007, Chris Snook <[EMAIL PROTECTED]> wrote:
> >> From: Chris Snook <[EMAIL PROTECTED]>
> >>
> >> Some architectures currently do not declare the contents of an atomic_t to 
> >> be
> >> volatile.  This causes confusion since atomic_read() might not actually 
> >> read
> >> anything if an optimizing compiler re-uses a value stored in a register, 
> >> which
> >> can break code that loops until something external changes the value of an
> >> atomic_t.  Avoiding such bugs requires using barrier(), which causes 
> >> re-loads
> >> of all registers used in the loop, thus hurting performance instead of 
> >> helping
> >> it, particularly on architectures where it's unnecessary.  Since we 
> >> generally
> >> want to re-read the contents of an atomic variable on every access anyway,
> >> let's standardize the behavior across all architectures and avoid the
> >> performance and correctness problems of requiring the use of barrier() in
> >> loops that expect atomic_t variables to change externally.  This is 
> >> relevant
> >> even on non-smp architectures, since drivers may use atomic operations in
> >> interrupt handlers.
> >>
> >> Signed-off-by: Chris Snook <[EMAIL PROTECTED]>
> >>
> >
> > Hmm, I thought we were trying to move away from volatile since it is
> > very weakly defined and towards explicit barriers and locks...
> > Points to --> Documentation/volatile-considered-harmful.txt
>
> This is a special case.

I had a feeling it might be.

> Usually, the use of volatile is just lazy.  In
> this case, it's probably necessary on at least some architectures, so we
> can't remove it everywhere unless we want to rewrite atomic.h completely
> in inline assembler for several architectures, and painstakingly verify
> all that work.

Sounds quite unpleasant and actively harmful - keeping stuff in plain
readable C makes it easier to handle by most people.

> I would hope it's obvious that having consistent
> behavior on all architectures, or even at all compiler optimization
> levels within an architecture, can be agreed to be good.

Agreed, consistency is good.

>  Additionally,
> atomic_t variables are a rare exception, in that we pretty much always
> want to work with the latest value in RAM on every access.  Making this
> atomic will allow us to remove a bunch of barriers which do nothing but
> slow things down on most architectures.
>
I believe you meant to say "volatile" and not "atomic" above.  But
yes, if volatile is sufficiently defined to ensure it does what we
want in this case and using it means we can actually speed up the
kernel, then it does indeed sound reasonable. Especially since it's
confined to the implementation of atomic_t which most people shouldn't
be looking at anyway (but simply use) and when using an atomic_t it's
pretty reasonable to expect that it doesn't need any additional
locking or barriers since it's supposed to be *atomic*.

> I agree that the use of atomic_t in .c files is generally bad, but in
> certain special cases, hiding it inside defined data types may be worth
> the slight impurity, just as we sometimes tolerate lines longer than 80
> columns when "fixing" it makes things unreadable.
>
Can't argue with that.

It seems you've thought it through.  I just wanted to be sure that
this approach was actually the one that makes the most sense, and
you've convinced me of that :-)

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Chris Snook

Lennert Buytenhek wrote:

On Wed, Aug 08, 2007 at 07:07:33PM -0400, Chris Snook wrote:


From: Chris Snook <[EMAIL PROTECTED]>

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally
want to re-read the contents of an atomic variable on every access anyway,
let's standardize the behavior across all architectures and avoid the
performance and correctness problems of requiring the use of barrier() in
loops that expect atomic_t variables to change externally.  This is relevant
even on non-smp architectures, since drivers may use atomic operations in
interrupt handlers.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>


Documentation/atomic_ops.txt would need updating:

[...]

One very important aspect of these two routines is that they DO NOT
require any explicit memory barriers.  They need only perform the
atomic_t counter update in an SMP safe manner.


Thanks, I was looking for that.  I'll re-send shortly with my comment 
moved there.  People are already using atomic_t in a manner that implies 
the use of memory barriers and interrupt-safety, which is what the patch 
enforces.


-- Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Chris Snook

Jesper Juhl wrote:

On 09/08/2007, Chris Snook <[EMAIL PROTECTED]> wrote:

From: Chris Snook <[EMAIL PROTECTED]>

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally
want to re-read the contents of an atomic variable on every access anyway,
let's standardize the behavior across all architectures and avoid the
performance and correctness problems of requiring the use of barrier() in
loops that expect atomic_t variables to change externally.  This is relevant
even on non-smp architectures, since drivers may use atomic operations in
interrupt handlers.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>



Hmm, I thought we were trying to move away from volatile since it is
very weakly defined and towards explicit barriers and locks...
Points to --> Documentation/volatile-considered-harmful.txt


This is a special case.  Usually, the use of volatile is just lazy.  In 
this case, it's probably necessary on at least some architectures, so we 
can't remove it everywhere unless we want to rewrite atomic.h completely 
in inline assembler for several architectures, and painstakingly verify 
all that work.  I would hope it's obvious that having consistent 
behavior on all architectures, or even at all compiler optimization 
levels within an architecture, can be agreed to be good.  Additionally, 
atomic_t variables are a rare exception, in that we pretty much always 
want to work with the latest value in RAM on every access.  Making this 
atomic will allow us to remove a bunch of barriers which do nothing but 
slow things down on most architectures.


I agree that the use of atomic_t in .c files is generally bad, but in 
certain special cases, hiding it inside defined data types may be worth 
the slight impurity, just as we sometimes tolerate lines longer than 80 
columns when "fixing" it makes things unreadable.


-- Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Lennert Buytenhek
On Wed, Aug 08, 2007 at 07:07:33PM -0400, Chris Snook wrote:

> From: Chris Snook <[EMAIL PROTECTED]>
> 
> Some architectures currently do not declare the contents of an atomic_t to be
> volatile.  This causes confusion since atomic_read() might not actually read
> anything if an optimizing compiler re-uses a value stored in a register, which
> can break code that loops until something external changes the value of an
> atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
> of all registers used in the loop, thus hurting performance instead of helping
> it, particularly on architectures where it's unnecessary.  Since we generally
> want to re-read the contents of an atomic variable on every access anyway,
> let's standardize the behavior across all architectures and avoid the
> performance and correctness problems of requiring the use of barrier() in
> loops that expect atomic_t variables to change externally.  This is relevant
> even on non-smp architectures, since drivers may use atomic operations in
> interrupt handlers.
> 
> Signed-off-by: Chris Snook <[EMAIL PROTECTED]>

Documentation/atomic_ops.txt would need updating:

[...]

One very important aspect of these two routines is that they DO NOT
require any explicit memory barriers.  They need only perform the
atomic_t counter update in an SMP safe manner.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Jesper Juhl
On 09/08/2007, Chris Snook <[EMAIL PROTECTED]> wrote:
> From: Chris Snook <[EMAIL PROTECTED]>
>
> Some architectures currently do not declare the contents of an atomic_t to be
> volatile.  This causes confusion since atomic_read() might not actually read
> anything if an optimizing compiler re-uses a value stored in a register, which
> can break code that loops until something external changes the value of an
> atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
> of all registers used in the loop, thus hurting performance instead of helping
> it, particularly on architectures where it's unnecessary.  Since we generally
> want to re-read the contents of an atomic variable on every access anyway,
> let's standardize the behavior across all architectures and avoid the
> performance and correctness problems of requiring the use of barrier() in
> loops that expect atomic_t variables to change externally.  This is relevant
> even on non-smp architectures, since drivers may use atomic operations in
> interrupt handlers.
>
> Signed-off-by: Chris Snook <[EMAIL PROTECTED]>
>

Hmm, I thought we were trying to move away from volatile since it is
very weakly defined and towards explicit barriers and locks...
Points to --> Documentation/volatile-considered-harmful.txt


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] make atomic_t volatile on all architectures

2007-08-08 Thread Chris Snook
From: Chris Snook <[EMAIL PROTECTED]>

Some architectures currently do not declare the contents of an atomic_t to be
volatile.  This causes confusion since atomic_read() might not actually read
anything if an optimizing compiler re-uses a value stored in a register, which
can break code that loops until something external changes the value of an
atomic_t.  Avoiding such bugs requires using barrier(), which causes re-loads
of all registers used in the loop, thus hurting performance instead of helping
it, particularly on architectures where it's unnecessary.  Since we generally
want to re-read the contents of an atomic variable on every access anyway,
let's standardize the behavior across all architectures and avoid the
performance and correctness problems of requiring the use of barrier() in
loops that expect atomic_t variables to change externally.  This is relevant
even on non-smp architectures, since drivers may use atomic operations in
interrupt handlers.

Signed-off-by: Chris Snook <[EMAIL PROTECTED]>


diff -urp linux-2.6.23-rc2-orig/include/asm-blackfin/atomic.h 
linux-2.6.23-rc2/include/asm-blackfin/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-blackfin/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-blackfin/atomic.h  2007-08-08 
18:18:43.0 -0400
@@ -13,8 +13,13 @@
  * Tony Kou ([EMAIL PROTECTED])   Lineo Inc.   2001
  */
 
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
 typedef struct {
-   int counter;
+   volatile int counter;
 } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
diff -urp linux-2.6.23-rc2-orig/include/asm-frv/atomic.h 
linux-2.6.23-rc2/include/asm-frv/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-frv/atomic.h  2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-frv/atomic.h   2007-08-08 18:21:55.0 
-0400
@@ -35,8 +35,13 @@
 #define smp_mb__before_atomic_inc()barrier()
 #define smp_mb__after_atomic_inc() barrier()
 
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
 typedef struct {
-   int counter;
+   volatile int counter;
 } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
diff -urp linux-2.6.23-rc2-orig/include/asm-h8300/atomic.h 
linux-2.6.23-rc2/include/asm-h8300/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-h8300/atomic.h2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-h8300/atomic.h 2007-08-08 18:24:02.0 
-0400
@@ -6,7 +6,12 @@
  * resource counting etc..
  */
 
-typedef struct { int counter; } atomic_t;
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
+typedef struct { volatile int counter; } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
 #define atomic_read(v) ((v)->counter)
diff -urp linux-2.6.23-rc2-orig/include/asm-i386/atomic.h 
linux-2.6.23-rc2/include/asm-i386/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-i386/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-i386/atomic.h  2007-08-08 18:26:09.0 
-0400
@@ -14,8 +14,12 @@
  * Make sure gcc doesn't try to be clever and move things around
  * on us. We need to use _exactly_ the address the user gave us,
  * not some alias that contains the same information.
+ *
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
  */
-typedef struct { int counter; } atomic_t;
+typedef struct { volatile int counter; } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
 
diff -urp linux-2.6.23-rc2-orig/include/asm-m68k/atomic.h 
linux-2.6.23-rc2/include/asm-m68k/atomic.h
--- linux-2.6.23-rc2-orig/include/asm-m68k/atomic.h 2007-07-08 
19:32:17.0 -0400
+++ linux-2.6.23-rc2/include/asm-m68k/atomic.h  2007-08-08 18:28:58.0 
-0400
@@ -13,7 +13,12 @@
  * We do not have SMP m68k systems, so we don't have to deal with that.
  */
 
-typedef struct { int counter; } atomic_t;
+/*
+ * Make atomic_t volatile to remove the need for barriers in loops that
+ * wait for an outside event.  We generally want to re-load the atomic_t
+ * variable each time anyway, but don't need to re-load everything else.
+ */
+typedef struct { volatile int counter; } atomic_t;
 #define ATOMIC_INIT(i) { (i) }
 
 #define atomic_read(v) ((v)->counter)
diff -urp linux-2.6.23-rc2-orig/include/asm-m68knommu/atomic.h 
linux-2.6.23-rc2/include/asm-m68knommu/atomic.h
--- linux-2.6.23-r