[PATCH 0/24] make atomic_read() behave consistently across all architectures
As recent discussions[1], and bugs[2] have shown, there is a great deal of confusion about the expected behavior of atomic_read(), compounded by the fact that it is not the same on all architectures. Since users expect calls to atomic_read() to actually perform a read, it is not desirable to allow the compiler to optimize this away. Requiring the use of barrier() in this case is inefficient, since we only want to re-load the atomic_t variable, not everything else in scope. This patchset makes the behavior of atomic_read uniform by removing the volatile keyword from all atomic_t and atomic64_t definitions that currently have it, and instead explicitly casts the variable as volatile in atomic_read(). This leaves little room for creative optimization by the compiler, and is in keeping with the principles behind "volatile considered harmful". Busy-waiters should still use cpu_relax(), but fast paths may be able to reduce their use of barrier() between some atomic_read() calls. -- Chris 1) http://lkml.org/lkml/2007/7/1/52 2) http://lkml.org/lkml/2007/8/8/122 - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thursday 09 August 2007, Chris Snook wrote: > This patchset makes the behavior of atomic_read uniform by removing the > volatile keyword from all atomic_t and atomic64_t definitions that currently > have it, and instead explicitly casts the variable as volatile in > atomic_read(). This leaves little room for creative optimization by the > compiler, and is in keeping with the principles behind "volatile considered > harmful". > Just an idea: since all architectures already include asm-generic/atomic.h, why not move the definitions of atomic_t and atomic64_t, as well as anything that does not involve architecture specific inline assembly into the generic header? Arnd <>< - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Arnd Bergmann wrote: On Thursday 09 August 2007, Chris Snook wrote: This patchset makes the behavior of atomic_read uniform by removing the volatile keyword from all atomic_t and atomic64_t definitions that currently have it, and instead explicitly casts the variable as volatile in atomic_read(). This leaves little room for creative optimization by the compiler, and is in keeping with the principles behind "volatile considered harmful". Just an idea: since all architectures already include asm-generic/atomic.h, why not move the definitions of atomic_t and atomic64_t, as well as anything that does not involve architecture specific inline assembly into the generic header? Arnd <>< a) chicken and egg: asm-generic/atomic.h depends on definitions in asm/atomic.h If you can find a way to reshuffle the code and make it simpler, I personally am all for it. I'm skeptical that you'll get much to show for the effort. b) The definitions aren't precisely identical between all architectures, so it would be a mess of special cases, which gets us right back to where we are now. -- Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thursday 09 August 2007, Chris Snook wrote: > a) chicken and egg: asm-generic/atomic.h depends on definitions in > asm/atomic.h Ok, I see. > If you can find a way to reshuffle the code and make it simpler, I personally > am > all for it. I'm skeptical that you'll get much to show for the effort. I guess it could be done using more macros or new headers, but I don't see a way that would actually improve the situation. > b) The definitions aren't precisely identical between all architectures, so > it > would be a mess of special cases, which gets us right back to where we are > now. Why are they not identical? Anything beyond the 32/64 bit difference should be the same afaics, or it might cause more bugs. Arnd <>< - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, 9 Aug 2007, Chris Snook wrote: > This patchset makes the behavior of atomic_read uniform by removing the > volatile keyword from all atomic_t and atomic64_t definitions that currently > have it, and instead explicitly casts the variable as volatile in > atomic_read(). This leaves little room for creative optimization by the > compiler, and is in keeping with the principles behind "volatile considered > harmful". volatile is generally harmful even in atomic_read(). Barriers control visibility and AFAICT things are fine. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Christoph Lameter wrote: On Thu, 9 Aug 2007, Chris Snook wrote: This patchset makes the behavior of atomic_read uniform by removing the volatile keyword from all atomic_t and atomic64_t definitions that currently have it, and instead explicitly casts the variable as volatile in atomic_read(). This leaves little room for creative optimization by the compiler, and is in keeping with the principles behind "volatile considered harmful". volatile is generally harmful even in atomic_read(). Barriers control visibility and AFAICT things are fine. But barriers force a flush of *everything* in scope, which we generally don't want. On the other hand, we pretty much always want to flush atomic_* operations. One way or another, we should be restricting the volatile behavior to the thing that needs it. On most architectures, this patch set just moves that from the declaration, where it is considered harmful, to the use, where it is considered an occasional necessary evil. See the resubmitted patchset, which also puts a cast in the atomic[64]_set operations. -- Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Tue, 14 Aug 2007, Chris Snook wrote: > But barriers force a flush of *everything* in scope, which we generally don't > want. On the other hand, we pretty much always want to flush atomic_* > operations. One way or another, we should be restricting the volatile > behavior to the thing that needs it. On most architectures, this patch set > just moves that from the declaration, where it is considered harmful, to the > use, where it is considered an occasional necessary evil. Then we would need atomic_read() and atomic_read_volatile() atomic_read_volatile() would imply an object sized memory barrier before and after? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Tue, 14 Aug 2007, Christoph Lameter wrote: > On Thu, 9 Aug 2007, Chris Snook wrote: > > > This patchset makes the behavior of atomic_read uniform by removing the > > volatile keyword from all atomic_t and atomic64_t definitions that currently > > have it, and instead explicitly casts the variable as volatile in > > atomic_read(). This leaves little room for creative optimization by the > > compiler, and is in keeping with the principles behind "volatile considered > > harmful". > > volatile is generally harmful even in atomic_read(). Barriers control > visibility and AFAICT things are fine. Frankly, I don't see the need for this series myself either. Personal opinion (others may differ), but I consider "volatile" to be a sad / unfortunate wart in C (numerous threads on this list and on the gcc lists/bugzilla over the years stand testimony to this) and if we _can_ steer clear of it, then why not -- why use this ill-defined primitive whose implementation has often differed over compiler versions and platforms? Granted, barrier() _is_ heavy-handed in that it makes the optimizer forget _everything_, but then somebody did post a forget() macro on this thread itself ... [ BTW, why do we want the compiler to not optimize atomic_read()'s in the first place? Atomic ops guarantee atomicity, which has nothing to do with "volatility" -- users that expect "volatility" from atomic ops are the ones who must be fixed instead, IMHO. ] - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Satyam Sharma wrote: On Tue, 14 Aug 2007, Christoph Lameter wrote: On Thu, 9 Aug 2007, Chris Snook wrote: This patchset makes the behavior of atomic_read uniform by removing the volatile keyword from all atomic_t and atomic64_t definitions that currently have it, and instead explicitly casts the variable as volatile in atomic_read(). This leaves little room for creative optimization by the compiler, and is in keeping with the principles behind "volatile considered harmful". volatile is generally harmful even in atomic_read(). Barriers control visibility and AFAICT things are fine. Frankly, I don't see the need for this series myself either. Personal opinion (others may differ), but I consider "volatile" to be a sad / unfortunate wart in C (numerous threads on this list and on the gcc lists/bugzilla over the years stand testimony to this) and if we _can_ steer clear of it, then why not -- why use this ill-defined primitive whose implementation has often differed over compiler versions and platforms? Granted, barrier() _is_ heavy-handed in that it makes the optimizer forget _everything_, but then somebody did post a forget() macro on this thread itself ... [ BTW, why do we want the compiler to not optimize atomic_read()'s in the first place? Atomic ops guarantee atomicity, which has nothing to do with "volatility" -- users that expect "volatility" from atomic ops are the ones who must be fixed instead, IMHO. ] Because atomic operations are generally used for synchronization, which requires volatile behavior. Most such codepaths currently use an inefficient barrier(). Some forget to and we get bugs, because people assume that atomic_read() actually reads something, and atomic_write() actually writes something. Worse, these are architecture-specific, even compiler version-specific bugs that are often difficult to track down. -- Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Tue, 14 Aug 2007, Chris Snook wrote: > Because atomic operations are generally used for synchronization, which > requires volatile behavior. Most such codepaths currently use an inefficient > barrier(). Some forget to and we get bugs, because people assume that > atomic_read() actually reads something, and atomic_write() actually writes > something. Worse, these are architecture-specific, even compiler > version-specific bugs that are often difficult to track down. Looks like we need to have lock and unlock semantics? atomic_read() which has no barrier or volatile implications. atomic_read_for_lock Acquire semantics? atomic_read_for_unlock Release semantics? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 04:38:54AM +0530, Satyam Sharma wrote: > > > On Tue, 14 Aug 2007, Christoph Lameter wrote: > > > On Thu, 9 Aug 2007, Chris Snook wrote: > > > > > This patchset makes the behavior of atomic_read uniform by removing the > > > volatile keyword from all atomic_t and atomic64_t definitions that > > > currently > > > have it, and instead explicitly casts the variable as volatile in > > > atomic_read(). This leaves little room for creative optimization by the > > > compiler, and is in keeping with the principles behind "volatile > > > considered > > > harmful". > > > > volatile is generally harmful even in atomic_read(). Barriers control > > visibility and AFAICT things are fine. > > Frankly, I don't see the need for this series myself either. Personal > opinion (others may differ), but I consider "volatile" to be a sad / > unfortunate wart in C (numerous threads on this list and on the gcc > lists/bugzilla over the years stand testimony to this) and if we _can_ > steer clear of it, then why not -- why use this ill-defined primitive > whose implementation has often differed over compiler versions and > platforms? Granted, barrier() _is_ heavy-handed in that it makes the > optimizer forget _everything_, but then somebody did post a forget() > macro on this thread itself ... > > [ BTW, why do we want the compiler to not optimize atomic_read()'s in > the first place? Atomic ops guarantee atomicity, which has nothing > to do with "volatility" -- users that expect "volatility" from > atomic ops are the ones who must be fixed instead, IMHO. ] Interactions between mainline code and interrupt/NMI handlers on the same CPU (for example, when both are using per-CPU variables. See examples previously posted in this thread, or look at the rcu_read_lock() and rcu_read_unlock() implementations in http://lkml.org/lkml/2007/8/7/280. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Chris Snook <[EMAIL PROTECTED]> wrote: > > Because atomic operations are generally used for synchronization, which > requires > volatile behavior. Most such codepaths currently use an inefficient > barrier(). > Some forget to and we get bugs, because people assume that atomic_read() > actually reads something, and atomic_write() actually writes something. > Worse, > these are architecture-specific, even compiler version-specific bugs that are > often difficult to track down. I'm yet to see a single example from the current tree where this patch series is the correct solution. So far the only example has been a buggy piece of code which has since been fixed with a cpu_relax. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote: > Chris Snook <[EMAIL PROTECTED]> wrote: > > > > Because atomic operations are generally used for synchronization, which > > requires > > volatile behavior. Most such codepaths currently use an inefficient > > barrier(). > > Some forget to and we get bugs, because people assume that atomic_read() > > actually reads something, and atomic_write() actually writes something. > > Worse, > > these are architecture-specific, even compiler version-specific bugs that > > are > > often difficult to track down. > > I'm yet to see a single example from the current tree where > this patch series is the correct solution. So far the only > example has been a buggy piece of code which has since been > fixed with a cpu_relax. Btw.: we still have include/asm-i386/mach-es7000/mach_wakecpu.h: while (!atomic_read(deassert)); include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert)); Looks like they need to be fixed as well. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Satyam Sharma wrote: > [ BTW, why do we want the compiler to not optimize atomic_read()'s in > the first place? Atomic ops guarantee atomicity, which has nothing > to do with "volatility" -- users that expect "volatility" from > atomic ops are the ones who must be fixed instead, IMHO. ] LDD3 says on page 125: "The following operations are defined for the type [atomic_t] and are guaranteed to be atomic with respect to all processors of an SMP computer." Doesn't "atomic WRT all processors" require volatility? -- Stefan Richter -=-=-=== =--- - http://arcgraph.de/sr/ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 12:35:31PM +0200, Stefan Richter wrote: > > LDD3 says on page 125: "The following operations are defined for the > type [atomic_t] and are guaranteed to be atomic with respect to all > processors of an SMP computer." > > Doesn't "atomic WRT all processors" require volatility? Not at all. We also require this to be atomic without any hint of volatility. extern int foo; foo = 4; Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Stefan Richter wrote: > Satyam Sharma wrote: > > [ BTW, why do we want the compiler to not optimize atomic_read()'s in > > the first place? Atomic ops guarantee atomicity, which has nothing > > to do with "volatility" -- users that expect "volatility" from > > atomic ops are the ones who must be fixed instead, IMHO. ] > > LDD3 says on page 125: "The following operations are defined for the > type [atomic_t] and are guaranteed to be atomic with respect to all > processors of an SMP computer." > > Doesn't "atomic WRT all processors" require volatility? No, it definitely doesn't. Why should it? "Atomic w.r.t. all processors" is just your normal, simple "atomicity" for SMP systems (ensure that that object is modified / set / replaced in main memory atomically) and has nothing to do with "volatile" behaviour. "Volatile behaviour" itself isn't consistently defined (at least definitely not consistently implemented in various gcc versions across platforms), but it is /expected/ to mean something like: "ensure that every such access actually goes all the way to memory, and is not re-ordered w.r.t. to other accesses, as far as the compiler can take care of these". The last "as far as compiler can take care" disclaimer comes about due to CPUs doing their own re-ordering nowadays. For example (say on i386): (A) $ cat tp1.c int a; void func(void) { a = 10; a = 20; } $ gcc -Os -S tp1.c $ cat tp1.s ... movl$20, a ... (B) $ cat tp2.c volatile int a; void func(void) { a = 10; a = 20; } $ gcc -Os -S tp2.c $ cat tp2.s ... movl$10, a movl$20, a ... (C) $ cat tp3.c int a; void func(void) { *(volatile int *)&a = 10; *(volatile int *)&a = 20; } $ gcc -Os -S tp3.c $ cat tp3.s ... movl$10, a movl$20, a ... In (A) the compiler optimized "a = 10;" away, but the actual store of the final value "20" to "a" was still "atomic". (B) and (C) also exhibit "volatile" behaviour apart from the "atomicity". But as others replied, it seems some callers out there depend upon atomic ops exhibiting "volatile" behaviour as well, so that answers my initial question, actually. I haven't looked at the code Paul pointed me at, but I wonder if that "forget(x)" macro would help those cases. I'd wish to avoid the "volatile" primitive, personally. Satyam - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Satyam Sharma wrote: > On Wed, 15 Aug 2007, Stefan Richter wrote: >> Doesn't "atomic WRT all processors" require volatility? > > No, it definitely doesn't. Why should it? > > "Atomic w.r.t. all processors" is just your normal, simple "atomicity" > for SMP systems (ensure that that object is modified / set / replaced > in main memory atomically) and has nothing to do with "volatile" > behaviour. > > "Volatile behaviour" itself isn't consistently defined (at least > definitely not consistently implemented in various gcc versions across > platforms), but it is /expected/ to mean something like: "ensure that > every such access actually goes all the way to memory, and is not > re-ordered w.r.t. to other accesses, as far as the compiler can take > care of these". The last "as far as compiler can take care" disclaimer > comes about due to CPUs doing their own re-ordering nowadays. > > For example (say on i386): [...] > In (A) the compiler optimized "a = 10;" away, but the actual store > of the final value "20" to "a" was still "atomic". (B) and (C) also > exhibit "volatile" behaviour apart from the "atomicity". > > But as others replied, it seems some callers out there depend upon > atomic ops exhibiting "volatile" behaviour as well, so that answers > my initial question, actually. I haven't looked at the code Paul > pointed me at, but I wonder if that "forget(x)" macro would help > those cases. I'd wish to avoid the "volatile" primitive, personally. So, looking at load instead of store, understand I correctly that in your opinion int b; b = atomic_read(&a); if (b) do_something_time_consuming(); b = atomic_read(&a); if (b) do_something_more(); should be changed to explicitly forget(&a) after do_something_time_consuming? If so, how about the following: static inline void A(atomic_t *a) { int b = atomic_read(a); if (b) do_something_time_consuming(); } static inline void B(atomic_t *a) { int b = atomic_read(a); if (b) do_something_more(); } static void C(atomic_t *a) { A(a); B(b); } Would this need forget(a) after A(a)? (Is the latter actually answered in C99 or is it compiler-dependent?) -- Stefan Richter -=-=-=== =--- - http://arcgraph.de/sr/ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
I wrote: > static inline void A(atomic_t *a) > { > int b = atomic_read(a); > if (b) > do_something_time_consuming(); > } > > static inline void B(atomic_t *a) > { > int b = atomic_read(a); > if (b) > do_something_more(); > } > > static void C(atomic_t *a) > { > A(a); > B(b); /* ^ typo */ B(a); > } > > Would this need forget(a) after A(a)? > > (Is the latter actually answered in C99 or is it compiler-dependent?) -- Stefan Richter -=-=-=== =--- - http://arcgraph.de/sr/ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On 8/15/2007 10:18 AM, Heiko Carstens wrote: > On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote: >> Chris Snook <[EMAIL PROTECTED]> wrote: >> > >> > Because atomic operations are generally used for synchronization, which >> > requires >> > volatile behavior. Most such codepaths currently use an inefficient >> > barrier(). >> > Some forget to and we get bugs, because people assume that atomic_read() >> > actually reads something, and atomic_write() actually writes something. >> > Worse, >> > these are architecture-specific, even compiler version-specific bugs that >> > are >> > often difficult to track down. >> >> I'm yet to see a single example from the current tree where >> this patch series is the correct solution. So far the only >> example has been a buggy piece of code which has since been >> fixed with a cpu_relax. > > Btw.: we still have > > include/asm-i386/mach-es7000/mach_wakecpu.h: while (!atomic_read(deassert)); > include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert)); > > Looks like they need to be fixed as well. I don't know if this here is affected: /* drivers/ieee1394/ieee1394_core.h */ static inline unsigned int get_hpsb_generation(struct hpsb_host *host) { return atomic_read(&host->generation); } /* drivers/ieee1394/nodemgr.c */ static int nodemgr_host_thread(void *__hi) { [...] for (;;) { [... sleep until bus reset event ...] /* Pause for 1/4 second in 1/16 second intervals, * to make sure things settle down. */ g = get_hpsb_generation(host); for (i = 0; i < 4 ; i++) { if (msleep_interruptible(63) || kthread_should_stop()) goto exit; /* Now get the generation in which the node ID's we collect * are valid. During the bus scan we will use this generation * for the read transactions, so that if another reset occurs * during the scan the transactions will fail instead of * returning bogus data. */ generation = get_hpsb_generation(host); /* If we get a reset before we are done waiting, then * start the waiting over again */ if (generation != g) g = generation, i = 0; } [... scan bus, using generation ...] } exit: [...] } -- Stefan Richter -=-=-=== =--- - http://arcgraph.de/sr/ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Stefan Richter wrote: > Satyam Sharma wrote: > > On Wed, 15 Aug 2007, Stefan Richter wrote: > >> Doesn't "atomic WRT all processors" require volatility? > > > > No, it definitely doesn't. Why should it? > > > > "Atomic w.r.t. all processors" is just your normal, simple "atomicity" > > for SMP systems (ensure that that object is modified / set / replaced > > in main memory atomically) and has nothing to do with "volatile" > > behaviour. > > > > "Volatile behaviour" itself isn't consistently defined (at least > > definitely not consistently implemented in various gcc versions across > > platforms), but it is /expected/ to mean something like: "ensure that > > every such access actually goes all the way to memory, and is not > > re-ordered w.r.t. to other accesses, as far as the compiler can take > > care of these". The last "as far as compiler can take care" disclaimer > > comes about due to CPUs doing their own re-ordering nowadays. > > > > For example (say on i386): > > [...] > > > In (A) the compiler optimized "a = 10;" away, but the actual store > > of the final value "20" to "a" was still "atomic". (B) and (C) also > > exhibit "volatile" behaviour apart from the "atomicity". > > > > But as others replied, it seems some callers out there depend upon > > atomic ops exhibiting "volatile" behaviour as well, so that answers > > my initial question, actually. I haven't looked at the code Paul > > pointed me at, but I wonder if that "forget(x)" macro would help > > those cases. I'd wish to avoid the "volatile" primitive, personally. > > So, looking at load instead of store, understand I correctly that in > your opinion > > int b; > > b = atomic_read(&a); > if (b) > do_something_time_consuming(); > > b = atomic_read(&a); > if (b) > do_something_more(); > > should be changed to explicitly forget(&a) after > do_something_time_consuming? No, I'd actually prefer something like what Christoph Lameter suggested, i.e. users (such as above) who want "volatile"-like behaviour from atomic ops can use alternative functions. How about something like: #define atomic_read_volatile(v) \ ({ \ forget(&(v)->counter); \ ((v)->counter); \ }) Or possibly, implement these "volatile" atomic ops variants in inline asm like the patch that Sebastian Siewior has submitted on another thread just a while back. Of course, if we find there are more callers in the kernel who want the volatility behaviour than those who don't care, we can re-define the existing ops to such variants, and re-name the existing definitions to somethine else, say "atomic_read_nonvolatile" for all I care. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Hi Stefan, On Wed, 15 Aug 2007, Stefan Richter wrote: > On 8/15/2007 10:18 AM, Heiko Carstens wrote: > > On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote: > >> Chris Snook <[EMAIL PROTECTED]> wrote: > >> > > >> > Because atomic operations are generally used for synchronization, which > >> > requires > >> > volatile behavior. Most such codepaths currently use an inefficient > >> > barrier(). > >> > Some forget to and we get bugs, because people assume that > >> > atomic_read() > >> > actually reads something, and atomic_write() actually writes something. > >> > Worse, > >> > these are architecture-specific, even compiler version-specific bugs > >> > that are > >> > often difficult to track down. > >> > >> I'm yet to see a single example from the current tree where > >> this patch series is the correct solution. So far the only > >> example has been a buggy piece of code which has since been > >> fixed with a cpu_relax. > > > > Btw.: we still have > > > > include/asm-i386/mach-es7000/mach_wakecpu.h: while > > (!atomic_read(deassert)); > > include/asm-i386/mach-default/mach_wakecpu.h: while > > (!atomic_read(deassert)); > > > > Looks like they need to be fixed as well. > > > I don't know if this here is affected: Yes, I think it is. You're clearly expecting the read to actually happen when you call get_hpsb_generation(). It's clearly not a busy-loop, so cpu_relax() sounds pointless / wrong solution for this case, so I'm now somewhat beginning to appreciate the motivation behind this series :-) But as I said, there are ways to achieve the same goals of this series without using "volatile". I think I'll submit a RFC/patch or two on this myself (will also fix the code pieces listed here). > /* drivers/ieee1394/ieee1394_core.h */ > static inline unsigned int get_hpsb_generation(struct hpsb_host *host) > { > return atomic_read(&host->generation); > } > > /* drivers/ieee1394/nodemgr.c */ > static int nodemgr_host_thread(void *__hi) > { > [...] > > for (;;) { > [... sleep until bus reset event ...] > > /* Pause for 1/4 second in 1/16 second intervals, >* to make sure things settle down. */ > g = get_hpsb_generation(host); > for (i = 0; i < 4 ; i++) { > if (msleep_interruptible(63) || > kthread_should_stop()) > goto exit; Totally unrelated, but this looks weird. IMHO you actually wanted: msleep_interruptible(63); if (kthread_should_stop()) goto exit; here, didn't you? Otherwise the thread will exit even when kthread_should_stop() != TRUE (just because it received a signal), and it is not good for a kthread to exit on its own if it uses kthread_should_stop() or if some other piece of kernel code could eventually call kthread_stop(tsk) on it. Ok, probably the thread will never receive a signal in the first place because it's spawned off kthreadd which ignores all signals beforehand, but still ... [PATCH] ieee1394: Fix kthread stopping in nodemgr_host_thread The nodemgr host thread can exit on its own even when kthread_should_stop is not true, on receiving a signal (might never happen in practice, as it ignores signals). But considering kthread_stop() must not be mixed with kthreads that can exit on their own, I think changing the code like this is clearer. This change means the thread can cut its sleep short when receive a signal but looking at the code around, that sounds okay (and again, it might never actually recieve a signal in practice). Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]> --- drivers/ieee1394/nodemgr.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/drivers/ieee1394/nodemgr.c b/drivers/ieee1394/nodemgr.c index 2ffd534..981a7da 100644 --- a/drivers/ieee1394/nodemgr.c +++ b/drivers/ieee1394/nodemgr.c @@ -1721,7 +1721,8 @@ static int nodemgr_host_thread(void *__hi) * to make sure things settle down. */ g = get_hpsb_generation(host); for (i = 0; i < 4 ; i++) { - if (msleep_interruptible(63) || kthread_should_stop()) + msleep_interruptible(63); + if (kthread_should_stop()) goto exit; /* Now get the generation in which the node ID's we collect - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote: > On Wed, 15 Aug 2007, Stefan Richter wrote: > > Satyam Sharma wrote: > > > On Wed, 15 Aug 2007, Stefan Richter wrote: > > >> Doesn't "atomic WRT all processors" require volatility? > > > > > > No, it definitely doesn't. Why should it? > > > > > > "Atomic w.r.t. all processors" is just your normal, simple "atomicity" > > > for SMP systems (ensure that that object is modified / set / replaced > > > in main memory atomically) and has nothing to do with "volatile" > > > behaviour. > > > > > > "Volatile behaviour" itself isn't consistently defined (at least > > > definitely not consistently implemented in various gcc versions across > > > platforms), but it is /expected/ to mean something like: "ensure that > > > every such access actually goes all the way to memory, and is not > > > re-ordered w.r.t. to other accesses, as far as the compiler can take > > > care of these". The last "as far as compiler can take care" disclaimer > > > comes about due to CPUs doing their own re-ordering nowadays. > > > > > > For example (say on i386): > > > > [...] > > > > > In (A) the compiler optimized "a = 10;" away, but the actual store > > > of the final value "20" to "a" was still "atomic". (B) and (C) also > > > exhibit "volatile" behaviour apart from the "atomicity". > > > > > > But as others replied, it seems some callers out there depend upon > > > atomic ops exhibiting "volatile" behaviour as well, so that answers > > > my initial question, actually. I haven't looked at the code Paul > > > pointed me at, but I wonder if that "forget(x)" macro would help > > > those cases. I'd wish to avoid the "volatile" primitive, personally. > > > > So, looking at load instead of store, understand I correctly that in > > your opinion > > > > int b; > > > > b = atomic_read(&a); > > if (b) > > do_something_time_consuming(); > > > > b = atomic_read(&a); > > if (b) > > do_something_more(); > > > > should be changed to explicitly forget(&a) after > > do_something_time_consuming? > > No, I'd actually prefer something like what Christoph Lameter suggested, > i.e. users (such as above) who want "volatile"-like behaviour from atomic > ops can use alternative functions. How about something like: > > #define atomic_read_volatile(v) \ > ({ \ > forget(&(v)->counter); \ > ((v)->counter); \ > }) Wouldn't the above "forget" the value, throw it away, then forget that it forgot it, giving non-volatile semantics? > Or possibly, implement these "volatile" atomic ops variants in inline asm > like the patch that Sebastian Siewior has submitted on another thread just > a while back. Given that you are advocating a change (please keep in mind that atomic_read() and atomic_set() had volatile semantics on almost all platforms), care to give some example where these historical volatile semantics are causing a problem? > Of course, if we find there are more callers in the kernel who want the > volatility behaviour than those who don't care, we can re-define the > existing ops to such variants, and re-name the existing definitions to > somethine else, say "atomic_read_nonvolatile" for all I care. Do we really need another set of APIs? Can you give even one example where the pre-existing volatile semantics are causing enough of a problem to justify adding yet more atomic_*() APIs? Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote: > > > I don't know if this here is affected: > > Yes, I think it is. You're clearly expecting the read to actually happen > when you call get_hpsb_generation(). It's clearly not a busy-loop, so > cpu_relax() sounds pointless / wrong solution for this case, so I'm now > somewhat beginning to appreciate the motivation behind this series :-) Nope, we're calling schedule which is a rather heavy-weight barrier. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote: > > Do we really need another set of APIs? Can you give even one example > where the pre-existing volatile semantics are causing enough of a problem > to justify adding yet more atomic_*() APIs? Let's turn this around. Can you give a single example where the volatile semantics is needed in a legitimate way? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote: > On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote: > > > > Do we really need another set of APIs? Can you give even one example > > where the pre-existing volatile semantics are causing enough of a problem > > to justify adding yet more atomic_*() APIs? > > Let's turn this around. Can you give a single example where > the volatile semantics is needed in a legitimate way? Sorry, but you are the one advocating for the change. Nice try, though! ;-) Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Herbert Xu wrote: > On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote: >>> I don't know if this here is affected: [...something like] b = atomic_read(a); for (i = 0; i < 4; i++) { msleep_interruptible(63); c = atomic_read(a); if (c != b) { b = c; i = 0; } } > Nope, we're calling schedule which is a rather heavy-weight > barrier. How does the compiler know that msleep() has got barrier()s? -- Stefan Richter -=-=-=== =--- - http://arcgraph.de/sr/ - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Herbert Xu wrote: Chris Snook <[EMAIL PROTECTED]> wrote: Because atomic operations are generally used for synchronization, which requires volatile behavior. Most such codepaths currently use an inefficient barrier(). Some forget to and we get bugs, because people assume that atomic_read() actually reads something, and atomic_write() actually writes something. Worse, these are architecture-specific, even compiler version-specific bugs that are often difficult to track down. I'm yet to see a single example from the current tree where this patch series is the correct solution. So far the only example has been a buggy piece of code which has since been fixed with a cpu_relax. Part of the motivation here is to fix heisenbugs. If I knew where they were, I'd be posting patches for them. Unlike most bugs, where we want to expose them as obviously as possible, these can be extremely difficult to track down, and are often due to people assuming that the atomic_* operations have the same semantics they've historically had. Remember that until recently, all SMP architectures except s390 (which very few kernel developers outside of IBM, Red Hat, and SuSE do much work on) had volatile declarations for atomic_t. Removing the volatile declarations from i386 and x86_64 may have created heisenbugs that won't manifest themselves until GCC 6.0 comes out and people start compiling kernels with -O5. We should have consistent semantics for atomic_* operations. The other motivation is to reduce the need for the barriers used to prevent/fix such problems which clobber all your registers, and instead force atomic_* operations to behave in the way they're actually used. After the (resubmitted) patchset is merged, we'll be able to remove a whole bunch of barriers, shrinking our source and our binaries, and improving performance. -- Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote: > Herbert Xu wrote: > > On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote: > >>> I don't know if this here is affected: > > [...something like] > b = atomic_read(a); > for (i = 0; i < 4; i++) { > msleep_interruptible(63); > c = atomic_read(a); > if (c != b) { > b = c; > i = 0; > } > } > > > Nope, we're calling schedule which is a rather heavy-weight > > barrier. > > How does the compiler know that msleep() has got barrier()s? Because msleep_interruptible() is in a separate compilation unit, the compiler has to assume that it might modify any arbitrary global. In many cases, the compiler also has to assume that msleep_interruptible() might call back into a function in the current compilation unit, thus possibly modifying global static variables. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote: > > Herbert Xu wrote: > > > On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote: > > >>> I don't know if this here is affected: > > > > [...something like] > > b = atomic_read(a); > > for (i = 0; i < 4; i++) { > > msleep_interruptible(63); > > c = atomic_read(a); > > if (c != b) { > > b = c; > > i = 0; > > } > > } > > > > > Nope, we're calling schedule which is a rather heavy-weight > > > barrier. > > > > How does the compiler know that msleep() has got barrier()s? > > Because msleep_interruptible() is in a separate compilation unit, > the compiler has to assume that it might modify any arbitrary global. > In many cases, the compiler also has to assume that msleep_interruptible() > might call back into a function in the current compilation unit, thus > possibly modifying global static variables. Yup, I've just verified this with a testcase. So a call to any function outside of the current compilation unit acts as a compiler barrier. Cool. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote: > > On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote: > > > > > > Do we really need another set of APIs? Can you give even one example > > > where the pre-existing volatile semantics are causing enough of a problem > > > to justify adding yet more atomic_*() APIs? > > > > Let's turn this around. Can you give a single example where > > the volatile semantics is needed in a legitimate way? > > Sorry, but you are the one advocating for the change. Not for i386 and x86_64 -- those have atomic ops without any "volatile" semantics (currently as per existing definitions). - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote: > > [...] > > Not for i386 and x86_64 -- those have atomic ops without any "volatile" > > semantics (currently as per existing definitions). > > I claim unit volumes with arm, and the majority of the architectures, but > I cannot deny the popularity of i386 and x86_64 with many developers. ;-) Hmm, does arm really need that "volatile int counter;"? Hopefully RMK will take a patch removing that "volatile" ... ;-) > However, I am not aware of code in the kernel that would benefit > from the compiler coalescing multiple atomic_set() and atomic_read() > invocations, thus I don't see the downside to volatility in this case. > Are there some performance-critical code fragments that I am missing? I don't know, and yes, code with multiple atomic_set's and atomic_read's getting optimized or coalesced does sound strange to start with. Anyway, I'm not against "volatile semantics" per se. As replied elsewhere, I do appreciate the motivation behind this series (to _avoid_ gotchas, not to fix existing ones). Just that I'd like to avoid using "volatile", for aforementioned reasons, especially given that there are perfectly reasonable alternatives to achieve the same desired behaviour. Satyam - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Hi Paul, On Wed, 15 Aug 2007, Paul E. McKenney wrote: > On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote: > > [...] > > No, I'd actually prefer something like what Christoph Lameter suggested, > > i.e. users (such as above) who want "volatile"-like behaviour from atomic > > ops can use alternative functions. How about something like: > > > > #define atomic_read_volatile(v) \ > > ({ \ > > forget(&(v)->counter); \ > > ((v)->counter); \ > > }) > > Wouldn't the above "forget" the value, throw it away, then forget > that it forgot it, giving non-volatile semantics? Nope, I don't think so. I wrote the following trivial testcases: [ See especially tp4.c and tp4.s (last example). ] == $ cat tp1.c # Using volatile access casts #define atomic_read(a) (*(volatile int *)&a) int a; void func(void) { a = 0; while (atomic_read(a)) ; } == $ gcc -Os -S tp1.c; cat tp1.s func: pushl %ebp movl%esp, %ebp movl$0, a .L2: movla, %eax testl %eax, %eax jne .L2 popl%ebp ret == $ cat tp2.c # Using nothing; gcc will optimize the whole loop away #define forget(x) #define atomic_read(a) \ ({ \ forget(&(a)); \ (a);\ }) int a; void func(void) { a = 0; while (atomic_read(a)) ; } == $ gcc -Os -S tp2.c; cat tp2.s func: pushl %ebp movl%esp, %ebp popl%ebp movl$0, a ret == $ cat tp3.c # Using a full memory clobber barrier #define forget(x) asm volatile ("":::"memory") #define atomic_read(a) \ ({ \ forget(&(a)); \ (a);\ }) int a; void func(void) { a = 0; while (atomic_read(a)) ; } == $ gcc -Os -S tp3.c; cat tp3.s func: pushl %ebp movl%esp, %ebp movl$0, a .L2: cmpl$0, a jne .L2 popl%ebp ret == $ cat tp4.c # Using a forget(var) macro #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a)) #define atomic_read(a) \ ({ \ forget(a); \ (a);\ }) int a; void func(void) { a = 0; while (atomic_read(a)) ; } == $ gcc -Os -S tp4.c; cat tp4.s func: pushl %ebp movl%esp, %ebp movl$0, a .L2: cmpl$0, a jne .L2 popl%ebp ret == Possibly these were too trivial to expose any potential problems that you may have been referring to, so would be helpful if you could write a more concrete example / sample code. > > Or possibly, implement these "volatile" atomic ops variants in inline asm > > like the patch that Sebastian Siewior has submitted on another thread just > > a while back. > > Given that you are advocating a change (please keep in mind that > atomic_read() and atomic_set() had volatile semantics on almost all > platforms), care to give some example where these historical volatile > semantics are causing a problem? > [...] > Can you give even one example > where the pre-existing volatile semantics are causing enough of a problem > to justify adding yet more atomic_*() APIs? Will take this to the other sub-thread ... > > Of course, if we find there are more callers in the kernel who want the > > volatility behaviour than those who don't care, we can re-define the > > existing ops to such variants, and re-name the existing definitions to > > somethine else, say "atomic_read_nonvolatile" for all I care. > > Do we really need another set of APIs? Well, if there's one set of users who do care about volatile behaviour, and another set that doesn't, it only sounds correct to provide both those APIs, instead of forcing one behaviour on all users. Thanks, Satyam - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote: > On Wed, 15 Aug 2007, Paul E. McKenney wrote: > > On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote: > > > On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote: > > > > > > > > Do we really need another set of APIs? Can you give even one example > > > > where the pre-existing volatile semantics are causing enough of a > > > > problem > > > > to justify adding yet more atomic_*() APIs? > > > > > > Let's turn this around. Can you give a single example where > > > the volatile semantics is needed in a legitimate way? > > > > Sorry, but you are the one advocating for the change. > > Not for i386 and x86_64 -- those have atomic ops without any "volatile" > semantics (currently as per existing definitions). I claim unit volumes with arm, and the majority of the architectures, but I cannot deny the popularity of i386 and x86_64 with many developers. ;-) However, I am not aware of code in the kernel that would benefit from the compiler coalescing multiple atomic_set() and atomic_read() invocations, thus I don't see the downside to volatility in this case. Are there some performance-critical code fragments that I am missing? Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Herbert Xu <[EMAIL PROTECTED]> wrote: > Let's turn this around. Can you give a single example where > the volatile semantics is needed in a legitimate way? Accessing H/W registers? But apart from that... David - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote: > Herbert Xu <[EMAIL PROTECTED]> wrote: > > > Let's turn this around. Can you give a single example where > > the volatile semantics is needed in a legitimate way? > > Accessing H/W registers? But apart from that... Communicating between process context and interrupt/NMI handlers using per-CPU variables. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
"Volatile behaviour" itself isn't consistently defined (at least definitely not consistently implemented in various gcc versions across platforms), It should be consistent across platforms; if not, file a bug please. but it is /expected/ to mean something like: "ensure that every such access actually goes all the way to memory, and is not re-ordered w.r.t. to other accesses, as far as the compiler can take care of these". The last "as far as compiler can take care" disclaimer comes about due to CPUs doing their own re-ordering nowadays. You can *expect* whatever you want, but this isn't in line with reality at all. volatile _does not_ make accesses go all the way to memory. volatile _does not_ prevent reordering wrt other accesses. What volatile does are a) never optimise away a read (or write) to the object, since the data can change in ways the compiler cannot see; and b) never move stores to the object across a sequence point. This does not mean other accesses cannot be reordered wrt the volatile access. If the abstract machine would do an access to a volatile- qualified object, the generated machine code will do that access too. But, for example, it can still be optimised away by the compiler, if it can prove it is allowed to. If you want stuff to go all the way to memory, you need some architecture-specific flush sequence; to make a store globally visible before another store, you need mb(); before some following read, you need mb(); to prevent reordering you need a barrier. Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote: > >>How does the compiler know that msleep() has got barrier()s? > > > >Because msleep_interruptible() is in a separate compilation unit, > >the compiler has to assume that it might modify any arbitrary global. > > No; compilation units have nothing to do with it, GCC can optimise > across compilation unit boundaries just fine, if you tell it to > compile more than one compilation unit at once. Last I checked, the Linux kernel build system did compile each .c file as a separate compilation unit. > What you probably mean is that the compiler has to assume any code > it cannot currently see can do anything (insofar as allowed by the > relevant standards etc.) Indeed. > >In many cases, the compiler also has to assume that > >msleep_interruptible() > >might call back into a function in the current compilation unit, thus > >possibly modifying global static variables. > > It most often is smart enough to see what compilation-unit-local > variables might be modified that way, though :-) Yep. For example, if it knows the current value of a given such local variable, and if all code paths that would change some other variable cannot be reached given that current value of the first variable. At least given that gcc doesn't know about multiple threads of execution! Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 11:25:05PM +0530, Satyam Sharma wrote: > Hi Paul, > On Wed, 15 Aug 2007, Paul E. McKenney wrote: > > > On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote: > > > [...] > > > No, I'd actually prefer something like what Christoph Lameter suggested, > > > i.e. users (such as above) who want "volatile"-like behaviour from atomic > > > ops can use alternative functions. How about something like: > > > > > > #define atomic_read_volatile(v) \ > > > ({ \ > > > forget(&(v)->counter); \ > > > ((v)->counter); \ > > > }) > > > > Wouldn't the above "forget" the value, throw it away, then forget > > that it forgot it, giving non-volatile semantics? > > Nope, I don't think so. I wrote the following trivial testcases: > [ See especially tp4.c and tp4.s (last example). ] Right. I should have said "wouldn't the compiler be within its rights to forget the value, throw it away, then forget that it forgot it". The value coming out of the #define above is an unadorned ((v)->counter), which has no volatile semantics. > == > $ cat tp1.c # Using volatile access casts > > #define atomic_read(a)(*(volatile int *)&a) > > int a; > > void func(void) > { > a = 0; > while (atomic_read(a)) > ; > } > == > $ gcc -Os -S tp1.c; cat tp1.s > > func: > pushl %ebp > movl%esp, %ebp > movl$0, a > .L2: > movla, %eax > testl %eax, %eax > jne .L2 > popl%ebp > ret > == > $ cat tp2.c # Using nothing; gcc will optimize the whole loop away > > #define forget(x) > > #define atomic_read(a)\ > ({ \ > forget(&(a)); \ > (a);\ > }) > > int a; > > void func(void) > { > a = 0; > while (atomic_read(a)) > ; > } > == > $ gcc -Os -S tp2.c; cat tp2.s > > func: > pushl %ebp > movl%esp, %ebp > popl%ebp > movl$0, a > ret > == > $ cat tp3.c # Using a full memory clobber barrier > > #define forget(x) asm volatile ("":::"memory") > > #define atomic_read(a)\ > ({ \ > forget(&(a)); \ > (a);\ > }) > > int a; > > void func(void) > { > a = 0; > while (atomic_read(a)) > ; > } > == > $ gcc -Os -S tp3.c; cat tp3.s > > func: > pushl %ebp > movl%esp, %ebp > movl$0, a > .L2: > cmpl$0, a > jne .L2 > popl%ebp > ret > == > $ cat tp4.c # Using a forget(var) macro > > #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a)) > > #define atomic_read(a)\ > ({ \ > forget(a); \ > (a);\ > }) > > int a; > > void func(void) > { > a = 0; > while (atomic_read(a)) > ; > } > == > $ gcc -Os -S tp4.c; cat tp4.s > > func: > pushl %ebp > movl%esp, %ebp > movl$0, a > .L2: > cmpl$0, a > jne .L2 > popl%ebp > ret > == > > Possibly these were too trivial to expose any potential problems that you > may have been referring to, so would be helpful if you could write a more > concrete example / sample code. The trick is to have a sufficiently complicated expression to force the compiler to run out of registers. If the value is non-volatile, it will refetch it (and expect it not to have changed, possibly being disappointed by an interrupt handler running on that same CPU). > > > Or possibly, implement these "volatile" atomic ops variants in inline asm > > > like the patch that Sebastian Siewior has submitted on another thread just > > > a while back. > > > > Given that you are advocating a change (please keep in mind that > > atomic_read() and atomic_set() had volatile semantics on almost all > > platforms), care to give some example where these historical volatile > > semantics are causing a problem? > > [...] > > Can you give even one example > > where the pre-existing volatile semantics are causing enough of a problem > > to justify adding ye
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Segher Boessenkool wrote: > > "Volatile behaviour" itself isn't consistently defined (at least > > definitely not consistently implemented in various gcc versions across > > platforms), > > It should be consistent across platforms; if not, file a bug please. > > > but it is /expected/ to mean something like: "ensure that > > every such access actually goes all the way to memory, and is not > > re-ordered w.r.t. to other accesses, as far as the compiler can take ^ (volatile) (or, alternatively, "other accesses to the same volatile object" ...) > > care of these". The last "as far as compiler can take care" disclaimer > > comes about due to CPUs doing their own re-ordering nowadays. > > You can *expect* whatever you want, but this isn't in line with > reality at all. > > volatile _does not_ prevent reordering wrt other accesses. > [...] > What volatile does are a) never optimise away a read (or write) > to the object, since the data can change in ways the compiler > cannot see; and b) never move stores to the object across a > sequence point. This does not mean other accesses cannot be > reordered wrt the volatile access. > > If the abstract machine would do an access to a volatile- > qualified object, the generated machine code will do that > access too. But, for example, it can still be optimised > away by the compiler, if it can prove it is allowed to. As (now) indicated above, I had meant multiple volatile accesses to the same object, obviously. BTW: #define atomic_read(a) (*(volatile int *)&(a)) #define atomic_set(a,i) (*(volatile int *)&(a) = (i)) int a; void func(void) { int b; b = atomic_read(a); atomic_set(a, 20); b = atomic_read(a); } gives: func: pushl %ebp movla, %eax movl%esp, %ebp movl$20, a movla, %eax popl%ebp ret so the first atomic_read() wasn't optimized away. > volatile _does not_ make accesses go all the way to memory. > [...] > If you want stuff to go all the way to memory, you need some > architecture-specific flush sequence; to make a store globally > visible before another store, you need mb(); before some following > read, you need mb(); to prevent reordering you need a barrier. Sure, which explains the "as far as the compiler can take care" bit. Poor phrase / choice of words, probably. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
[ The Cc: list scares me. Should probably trim it. ] On Wed, 15 Aug 2007, Paul E. McKenney wrote: > On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote: > > >>How does the compiler know that msleep() has got barrier()s? > > > > > >Because msleep_interruptible() is in a separate compilation unit, > > >the compiler has to assume that it might modify any arbitrary global. > > > > No; compilation units have nothing to do with it, GCC can optimise > > across compilation unit boundaries just fine, if you tell it to > > compile more than one compilation unit at once. > > Last I checked, the Linux kernel build system did compile each .c file > as a separate compilation unit. > > > What you probably mean is that the compiler has to assume any code > > it cannot currently see can do anything (insofar as allowed by the > > relevant standards etc.) I think this was just terminology confusion here again. Isn't "any code that it cannot currently see" the same as "another compilation unit", and wouldn't the "compilation unit" itself expand if we ask gcc to compile more than one unit at once? Or is there some more specific "definition" for "compilation unit" (in gcc lingo, possibly?) - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 01:24:42AM +0530, Satyam Sharma wrote: > [ The Cc: list scares me. Should probably trim it. ] Trim away! ;-) > On Wed, 15 Aug 2007, Paul E. McKenney wrote: > > > On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote: > > > >>How does the compiler know that msleep() has got barrier()s? > > > > > > > >Because msleep_interruptible() is in a separate compilation unit, > > > >the compiler has to assume that it might modify any arbitrary global. > > > > > > No; compilation units have nothing to do with it, GCC can optimise > > > across compilation unit boundaries just fine, if you tell it to > > > compile more than one compilation unit at once. > > > > Last I checked, the Linux kernel build system did compile each .c file > > as a separate compilation unit. > > > > > What you probably mean is that the compiler has to assume any code > > > it cannot currently see can do anything (insofar as allowed by the > > > relevant standards etc.) > > I think this was just terminology confusion here again. Isn't "any code > that it cannot currently see" the same as "another compilation unit", > and wouldn't the "compilation unit" itself expand if we ask gcc to > compile more than one unit at once? Or is there some more specific > "definition" for "compilation unit" (in gcc lingo, possibly?) This is indeed my understanding -- "compilation unit" is whatever the compiler looks at in one go. I have heard the word "module" used for the minimal compilation unit covering a single .c file and everything that it #includes, but there might be a better name for this. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Stefan Richter wrote: > LDD3 says on page 125: "The following operations are defined for the > type [atomic_t] and are guaranteed to be atomic with respect to all > processors of an SMP computer." > > Doesn't "atomic WRT all processors" require volatility? Atomic operations only require exclusive access to the cacheline while the value is modified. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
How does the compiler know that msleep() has got barrier()s? Because msleep_interruptible() is in a separate compilation unit, the compiler has to assume that it might modify any arbitrary global. No; compilation units have nothing to do with it, GCC can optimise across compilation unit boundaries just fine, if you tell it to compile more than one compilation unit at once. What you probably mean is that the compiler has to assume any code it cannot currently see can do anything (insofar as allowed by the relevant standards etc.) In many cases, the compiler also has to assume that msleep_interruptible() might call back into a function in the current compilation unit, thus possibly modifying global static variables. It most often is smart enough to see what compilation-unit-local variables might be modified that way, though :-) Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
I think this was just terminology confusion here again. Isn't "any code that it cannot currently see" the same as "another compilation unit", and wouldn't the "compilation unit" itself expand if we ask gcc to compile more than one unit at once? Or is there some more specific "definition" for "compilation unit" (in gcc lingo, possibly?) This is indeed my understanding -- "compilation unit" is whatever the compiler looks at in one go. I have heard the word "module" used for the minimal compilation unit covering a single .c file and everything that it #includes, but there might be a better name for this. Yes, that's what's called "compilation unit" :-) [/me double checks] Erm, the C standard actually calls it "translation unit". To be exact, to avoid any more confusion: 5.1.1.1/1: A C program need not all be translated at the same time. The text of the program is kept in units called source files, (or preprocessing files) in this International Standard. A source file together with all the headers and source files included via the preprocessing directive #include is known as a preprocessing translation unit. After preprocessing, a preprocessing translation unit is called a translation unit. Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
What volatile does are a) never optimise away a read (or write) to the object, since the data can change in ways the compiler cannot see; and b) never move stores to the object across a sequence point. This does not mean other accesses cannot be reordered wrt the volatile access. If the abstract machine would do an access to a volatile- qualified object, the generated machine code will do that access too. But, for example, it can still be optimised away by the compiler, if it can prove it is allowed to. As (now) indicated above, I had meant multiple volatile accesses to the same object, obviously. Yes, accesses to volatile objects are never reordered with respect to each other. BTW: #define atomic_read(a) (*(volatile int *)&(a)) #define atomic_set(a,i) (*(volatile int *)&(a) = (i)) int a; void func(void) { int b; b = atomic_read(a); atomic_set(a, 20); b = atomic_read(a); } gives: func: pushl %ebp movla, %eax movl%esp, %ebp movl$20, a movla, %eax popl%ebp ret so the first atomic_read() wasn't optimized away. Of course. It is executed by the abstract machine, so it will be executed by the actual machine. On the other hand, try b = 0; if (b) b = atomic_read(a); or similar. Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
What you probably mean is that the compiler has to assume any code it cannot currently see can do anything (insofar as allowed by the relevant standards etc.) I think this was just terminology confusion here again. Isn't "any code that it cannot currently see" the same as "another compilation unit", It is not; try gcc -combine or the upcoming link-time optimisation stuff, for example. and wouldn't the "compilation unit" itself expand if we ask gcc to compile more than one unit at once? Or is there some more specific "definition" for "compilation unit" (in gcc lingo, possibly?) "compilation unit" is a C standard term. It typically boils down to "single .c file". Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Of course, if we find there are more callers in the kernel who want the volatility behaviour than those who don't care, we can re-define the existing ops to such variants, and re-name the existing definitions to somethine else, say "atomic_read_nonvolatile" for all I care. Do we really need another set of APIs? Well, if there's one set of users who do care about volatile behaviour, and another set that doesn't, it only sounds correct to provide both those APIs, instead of forcing one behaviour on all users. But since there currently is only one such API, and there are users expecting the stronger behaviour, the only sane thing to do is let the API provide that behaviour. You can always add a new API with weaker behaviour later, and move users that are okay with it over to that new API. Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
No; compilation units have nothing to do with it, GCC can optimise across compilation unit boundaries just fine, if you tell it to compile more than one compilation unit at once. Last I checked, the Linux kernel build system did compile each .c file as a separate compilation unit. I have some patches to use -combine -fwhole-program for Linux. Highly experimental, you need a patched bleeding edge toolchain. If there's interest I'll clean it up and put it online. David Woodhouse had some similar patches about a year ago. In many cases, the compiler also has to assume that msleep_interruptible() might call back into a function in the current compilation unit, thus possibly modifying global static variables. It most often is smart enough to see what compilation-unit-local variables might be modified that way, though :-) Yep. For example, if it knows the current value of a given such local variable, and if all code paths that would change some other variable cannot be reached given that current value of the first variable. Or the most common thing: if neither the address of the translation- unit local variable nor the address of any function writing to that variable can "escape" from that translation unit, nothing outside the translation unit can write to the variable. At least given that gcc doesn't know about multiple threads of execution! Heh, only about the threads it creates itself (not relevant to the kernel, for sure :-) ) Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Possibly these were too trivial to expose any potential problems that you may have been referring to, so would be helpful if you could write a more concrete example / sample code. The trick is to have a sufficiently complicated expression to force the compiler to run out of registers. You can use -ffixed-XXX to keep the testcase simple. Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 10:52:53PM +0200, Segher Boessenkool wrote: > >>I think this was just terminology confusion here again. Isn't "any > >>code > >>that it cannot currently see" the same as "another compilation unit", > >>and wouldn't the "compilation unit" itself expand if we ask gcc to > >>compile more than one unit at once? Or is there some more specific > >>"definition" for "compilation unit" (in gcc lingo, possibly?) > > > >This is indeed my understanding -- "compilation unit" is whatever the > >compiler looks at in one go. I have heard the word "module" used for > >the minimal compilation unit covering a single .c file and everything > >that it #includes, but there might be a better name for this. > > Yes, that's what's called "compilation unit" :-) > > [/me double checks] > > Erm, the C standard actually calls it "translation unit". > > To be exact, to avoid any more confusion: > > 5.1.1.1/1: > A C program need not all be translated at the same time. The > text of the program is kept in units called source files, (or > preprocessing files) in this International Standard. A source > file together with all the headers and source files included > via the preprocessing directive #include is known as a > preprocessing translation unit. After preprocessing, a > preprocessing translation unit is called a translation unit. I am OK with "translation" and "compilation" being near-synonyms. ;-) Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 11:05:35PM +0200, Segher Boessenkool wrote: > >>No; compilation units have nothing to do with it, GCC can optimise > >>across compilation unit boundaries just fine, if you tell it to > >>compile more than one compilation unit at once. > > > >Last I checked, the Linux kernel build system did compile each .c file > >as a separate compilation unit. > > I have some patches to use -combine -fwhole-program for Linux. > Highly experimental, you need a patched bleeding edge toolchain. > If there's interest I'll clean it up and put it online. > > David Woodhouse had some similar patches about a year ago. Sounds exciting... ;-) > >>>In many cases, the compiler also has to assume that > >>>msleep_interruptible() > >>>might call back into a function in the current compilation unit, thus > >>>possibly modifying global static variables. > >> > >>It most often is smart enough to see what compilation-unit-local > >>variables might be modified that way, though :-) > > > >Yep. For example, if it knows the current value of a given such local > >variable, and if all code paths that would change some other variable > >cannot be reached given that current value of the first variable. > > Or the most common thing: if neither the address of the translation- > unit local variable nor the address of any function writing to that > variable can "escape" from that translation unit, nothing outside > the translation unit can write to the variable. But there is usually at least one externally callable function in a .c file. > >At least given that gcc doesn't know about multiple threads of > >execution! > > Heh, only about the threads it creates itself (not relevant to > the kernel, for sure :-) ) ;-) Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Satyam Sharma writes: > > Doesn't "atomic WRT all processors" require volatility? > > No, it definitely doesn't. Why should it? > > "Atomic w.r.t. all processors" is just your normal, simple "atomicity" > for SMP systems (ensure that that object is modified / set / replaced > in main memory atomically) and has nothing to do with "volatile" > behaviour. Atomic variables are "volatile" in the sense that they are liable to be changed at any time by mechanisms that are outside the knowledge of the C compiler, namely, other CPUs, or this CPU executing an interrupt routine. In the kernel we use atomic variables in precisely those situations where a variable is potentially accessed concurrently by multiple CPUs, and where each CPU needs to see updates done by other CPUs in a timely fashion. That is what they are for. Therefore the compiler must not cache values of atomic variables in registers; each atomic_read must result in a load and each atomic_set must result in a store. Anything else will just lead to subtle bugs. I have no strong opinion about whether or not the best way to achieve this is through the use of the "volatile" C keyword. Segher's idea of using asm instead seems like a good one to me. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 11:45:20AM -0700, Paul E. McKenney wrote: > On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote: > > Herbert Xu <[EMAIL PROTECTED]> wrote: > > > > > Let's turn this around. Can you give a single example where > > > the volatile semantics is needed in a legitimate way? > > > > Accessing H/W registers? But apart from that... > > Communicating between process context and interrupt/NMI handlers using > per-CPU variables. Remeber we're talking about atomic_read/atomic_set. Please cite the actual file/function name you have in mind. Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 07:40:21AM +0800, Herbert Xu wrote: > On Wed, Aug 15, 2007 at 12:13:12PM -0400, Chris Snook wrote: > > > > Part of the motivation here is to fix heisenbugs. If I knew where they > > By the same token we should probably disable optimisations > altogether since that too can create heisenbugs. Precisely the point -- use of volatile (whether in casts or on asms) in these cases are intended to disable those optimizations likely to result in heisenbugs. But they are also intended to leave other valuable optimizations in force. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 07:41:46AM +0800, Herbert Xu wrote: > On Wed, Aug 15, 2007 at 11:45:20AM -0700, Paul E. McKenney wrote: > > On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote: > > > Herbert Xu <[EMAIL PROTECTED]> wrote: > > > > > > > Let's turn this around. Can you give a single example where > > > > the volatile semantics is needed in a legitimate way? > > > > > > Accessing H/W registers? But apart from that... > > > > Communicating between process context and interrupt/NMI handlers using > > per-CPU variables. > > Remeber we're talking about atomic_read/atomic_set. Please > cite the actual file/function name you have in mind. Yep, we are indeed talking about atomic_read()/atomic_set(). We have been through this issue already in this thread. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 12:13:12PM -0400, Chris Snook wrote: > > Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote: > > > > Communicating between process context and interrupt/NMI handlers using > > > per-CPU variables. > > > > Remeber we're talking about atomic_read/atomic_set. Please > > cite the actual file/function name you have in mind. > > Yep, we are indeed talking about atomic_read()/atomic_set(). > > We have been through this issue already in this thread. Sorry, but I must've missed it. Could you cite the file or function for my benefit? Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote: > On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote: > > > > > > Communicating between process context and interrupt/NMI handlers using > > > > per-CPU variables. > > > > > > Remeber we're talking about atomic_read/atomic_set. Please > > > cite the actual file/function name you have in mind. > > > > Yep, we are indeed talking about atomic_read()/atomic_set(). > > > > We have been through this issue already in this thread. > > Sorry, but I must've missed it. Could you cite the file or > function for my benefit? I might summarize the thread if there is interest, but I am not able to do so right this minute. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, 16 Aug 2007, Paul Mackerras wrote: > In the kernel we use atomic variables in precisely those situations > where a variable is potentially accessed concurrently by multiple > CPUs, and where each CPU needs to see updates done by other CPUs in a > timely fashion. That is what they are for. Therefore the compiler > must not cache values of atomic variables in registers; each > atomic_read must result in a load and each atomic_set must result in a > store. Anything else will just lead to subtle bugs. This may have been the intend. However, today the visibility is controlled using barriers. And we have barriers that we use with atomic operations. Having volatile be the default just lead to confusion. Atomic read should just read with no extras. Extras can be added by using variants like atomic_read_volatile or so. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 05:23:10PM -0700, Paul E. McKenney wrote: > On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote: > > On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote: > > > > > > > > Communicating between process context and interrupt/NMI handlers using > > > > > per-CPU variables. > > > > > > > > Remeber we're talking about atomic_read/atomic_set. Please > > > > cite the actual file/function name you have in mind. > > > > > > Yep, we are indeed talking about atomic_read()/atomic_set(). > > > > > > We have been through this issue already in this thread. > > > > Sorry, but I must've missed it. Could you cite the file or > > function for my benefit? > > I might summarize the thread if there is interest, but I am not able to > do so right this minute. Thanks. But I don't need a summary of the thread, I'm asking for an extant code snippet in our kernel that benefits from the volatile change and is not part of a busy-wait. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Christoph Lameter writes: > On Thu, 16 Aug 2007, Paul Mackerras wrote: > > > In the kernel we use atomic variables in precisely those situations > > where a variable is potentially accessed concurrently by multiple > > CPUs, and where each CPU needs to see updates done by other CPUs in a > > timely fashion. That is what they are for. Therefore the compiler > > must not cache values of atomic variables in registers; each > > atomic_read must result in a load and each atomic_set must result in a > > store. Anything else will just lead to subtle bugs. > > This may have been the intend. However, today the visibility is controlled > using barriers. And we have barriers that we use with atomic operations. Those barriers are for when we need ordering between atomic variables and other memory locations. An atomic variable by itself doesn't and shouldn't need any barriers for other CPUs to be able to see what's happening to it. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, 16 Aug 2007, Paul Mackerras wrote: > Those barriers are for when we need ordering between atomic variables > and other memory locations. An atomic variable by itself doesn't and > shouldn't need any barriers for other CPUs to be able to see what's > happening to it. It does not need any barriers. As soon as one cpu acquires the cacheline for write it will be invalidated in the caches of the others. So the other cpu will have to refetch. No need for volatile. The issue here may be that the compiler has fetched the atomic variable earlier and put it into a register. However, that prefetching is limited because it cannot cross functions calls etc. The only problem could be loops where the compiler does not refetch the variable since it assumes that it does not change and there are no function calls in the body of the loop. But AFAIK these loops need cpu_relax and other measures anyways to avoid bad effects from busy waiting. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > Seems to me that we face greater chance of confusion without the > volatile than with, particularly as compiler optimizations become > more aggressive. Yes, we could simply disable optimization, but > optimization can be quite helpful. A volatile default would disable optimizations for atomic_read. atomic_read without volatile would allow for full optimization by the compiler. Seems that this is what one wants in many cases. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
[ Sorry for empty subject line in previous mail. I intended to make a patch so cleared it to change it, but ultimately neither made a patch nor restored subject line. Done that now. ] On Thu, 16 Aug 2007, Herbert Xu wrote: > On Thu, Aug 16, 2007 at 06:06:00AM +0530, Satyam Sharma wrote: > > > > that are: > > > > while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { > > mdelay(1); > > msecs--; > > } > > > > where mdelay() becomes __const_udelay() which happens to be in another > > translation unit (arch/i386/lib/delay.c) and hence saves this callsite > > from being a bug :-) > > The udelay itself certainly should have some form of cpu_relax in it. Yes, a form of barrier() must be present in mdelay() or udelay() itself as you say, having it in __const_udelay() is *not* enough (superflous actually, considering it is already a separate translation unit and invisible to the compiler). However, there are no compiler barriers on the macro-definition-path between mdelay(1) and __const_udelay(), so the only thing that saves us from being a bug here is indeed the different-translation-unit concept. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote: > On Wed, Aug 15, 2007 at 05:23:10PM -0700, Paul E. McKenney wrote: > > On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote: > > > On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote: > > > > > > > > > > Communicating between process context and interrupt/NMI handlers > > > > > > using > > > > > > per-CPU variables. > > > > > > > > > > Remeber we're talking about atomic_read/atomic_set. Please > > > > > cite the actual file/function name you have in mind. > > > > > > > > Yep, we are indeed talking about atomic_read()/atomic_set(). > > > > > > > > We have been through this issue already in this thread. > > > > > > Sorry, but I must've missed it. Could you cite the file or > > > function for my benefit? > > > > I might summarize the thread if there is interest, but I am not able to > > do so right this minute. > > Thanks. But I don't need a summary of the thread, I'm asking > for an extant code snippet in our kernel that benefits from > the volatile change and is not part of a busy-wait. Sorry, can't help you there. I really do believe that the information you need (as opposed to the specific item you are asking for) really has been put forth in this thread. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote: > > > The udelay itself certainly should have some form of cpu_relax in it. > > Yes, a form of barrier() must be present in mdelay() or udelay() itself > as you say, having it in __const_udelay() is *not* enough (superflous > actually, considering it is already a separate translation unit and > invisible to the compiler). As long as __const_udelay does something which has the same effect as barrier it is enough even if it's in the same unit. As a matter of fact it does on i386 where __delay either uses rep_nop or asm/volatile. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 05:49:50PM -0700, Paul E. McKenney wrote: > On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote: > > > Thanks. But I don't need a summary of the thread, I'm asking > > for an extant code snippet in our kernel that benefits from > > the volatile change and is not part of a busy-wait. > > Sorry, can't help you there. I really do believe that the information > you need (as opposed to the specific item you are asking for) really > has been put forth in this thread. That only leads me to believe that such a code snippet simply does not exist. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 05:42:07PM -0700, Christoph Lameter wrote: > On Wed, 15 Aug 2007, Paul E. McKenney wrote: > > > Seems to me that we face greater chance of confusion without the > > volatile than with, particularly as compiler optimizations become > > more aggressive. Yes, we could simply disable optimization, but > > optimization can be quite helpful. > > A volatile default would disable optimizations for atomic_read. > atomic_read without volatile would allow for full optimization by the > compiler. Seems that this is what one wants in many cases. The volatile cast should not disable all that many optimizations, for example, it is much less hurtful than barrier(). Furthermore, the main optimizations disabled (pulling atomic_read() and atomic_set() out of loops) really do need to be disabled. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 05:26:34PM -0700, Christoph Lameter wrote: > On Thu, 16 Aug 2007, Paul Mackerras wrote: > > > In the kernel we use atomic variables in precisely those situations > > where a variable is potentially accessed concurrently by multiple > > CPUs, and where each CPU needs to see updates done by other CPUs in a > > timely fashion. That is what they are for. Therefore the compiler > > must not cache values of atomic variables in registers; each > > atomic_read must result in a load and each atomic_set must result in a > > store. Anything else will just lead to subtle bugs. > > This may have been the intend. However, today the visibility is controlled > using barriers. And we have barriers that we use with atomic operations. > Having volatile be the default just lead to confusion. Atomic read should > just read with no extras. Extras can be added by using variants like > atomic_read_volatile or so. Seems to me that we face greater chance of confusion without the volatile than with, particularly as compiler optimizations become more aggressive. Yes, we could simply disable optimization, but optimization can be quite helpful. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Segher Boessenkool wrote: > [...] > > BTW: > > > > #define atomic_read(a) (*(volatile int *)&(a)) > > #define atomic_set(a,i) (*(volatile int *)&(a) = (i)) > > > > int a; > > > > void func(void) > > { > > int b; > > > > b = atomic_read(a); > > atomic_set(a, 20); > > b = atomic_read(a); > > } > > > > gives: > > > > func: > > pushl %ebp > > movla, %eax > > movl%esp, %ebp > > movl$20, a > > movla, %eax > > popl%ebp > > ret > > > > so the first atomic_read() wasn't optimized away. > > Of course. It is executed by the abstract machine, so > it will be executed by the actual machine. On the other > hand, try > > b = 0; > if (b) > b = atomic_read(a); > > or similar. Yup, obviously. Volatile accesses (or any access to volatile objects), or even "__volatile__ asms" (which gcc normally promises never to elid) can always be optimized for cases such as these where the compiler can trivially determine that the code in question is not reachable. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
No; compilation units have nothing to do with it, GCC can optimise across compilation unit boundaries just fine, if you tell it to compile more than one compilation unit at once. Last I checked, the Linux kernel build system did compile each .c file as a separate compilation unit. I have some patches to use -combine -fwhole-program for Linux. Highly experimental, you need a patched bleeding edge toolchain. If there's interest I'll clean it up and put it online. David Woodhouse had some similar patches about a year ago. Sounds exciting... ;-) Yeah, the breakage is *quite* spectacular :-) In many cases, the compiler also has to assume that msleep_interruptible() might call back into a function in the current compilation unit, thus possibly modifying global static variables. It most often is smart enough to see what compilation-unit-local variables might be modified that way, though :-) Yep. For example, if it knows the current value of a given such local variable, and if all code paths that would change some other variable cannot be reached given that current value of the first variable. Or the most common thing: if neither the address of the translation- unit local variable nor the address of any function writing to that variable can "escape" from that translation unit, nothing outside the translation unit can write to the variable. But there is usually at least one externally callable function in a .c file. Of course, but often none of those will (indirectly) write a certain static variable. Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 08:53:16AM +0800, Herbert Xu wrote: > On Wed, Aug 15, 2007 at 05:49:50PM -0700, Paul E. McKenney wrote: > > On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote: > > > > > Thanks. But I don't need a summary of the thread, I'm asking > > > for an extant code snippet in our kernel that benefits from > > > the volatile change and is not part of a busy-wait. > > > > Sorry, can't help you there. I really do believe that the information > > you need (as opposed to the specific item you are asking for) really > > has been put forth in this thread. > > That only leads me to believe that such a code snippet simply > does not exist. Whatever... Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Hi Herbert, On Thu, 16 Aug 2007, Herbert Xu wrote: > On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote: > > > > > The udelay itself certainly should have some form of cpu_relax in it. > > > > Yes, a form of barrier() must be present in mdelay() or udelay() itself > > as you say, having it in __const_udelay() is *not* enough (superflous > > actually, considering it is already a separate translation unit and > > invisible to the compiler). > > As long as __const_udelay does something which has the same > effect as barrier it is enough even if it's in the same unit. Only if __const_udelay() is inlined. But as I said, __const_udelay() -- although marked "inline" -- will never be inlined anywhere in the kernel in reality. It's an exported symbol, and never inlined from modules. Even from built-in targets, the definition of __const_udelay is invisible when gcc is compiling the compilation units of those callsites. The compiler has no idea that that function has barriers or not, so we're saved here _only_ by the lucky fact that __const_udelay() is in a different compilation unit. > As a matter of fact it does on i386 where __delay either uses > rep_nop or asm/volatile. __delay() can be either delay_tsc() or delay_loop() on i386. delay_tsc() uses the rep_nop() there for it's own little busy loop, actually. But for a call site that inlines __const_udelay() -- if it were ever moved to a .h file and marked inline -- the call to __delay() will _still_ be across compilation units. So, again for this case, it does not matter if the callee function has compiler barriers or not (it would've been a different story if we were discussing real/CPU barriers, I think), what saves us here is just the fact that a call is made to a function from a different compilation unit, which is invisible to the compiler when compiling the callsite, and hence acting as the compiler barrier. Regarding delay_loop(), it uses "volatile" for the "asm" which has quite different semantics from the C language "volatile" type-qualifier keyword and does not imply any compiler barrier at all. Satyam - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > The volatile cast should not disable all that many optimizations, > for example, it is much less hurtful than barrier(). Furthermore, > the main optimizations disabled (pulling atomic_read() and atomic_set() > out of loops) really do need to be disabled. In many cases you do not need a barrier. Having volatile there *will* impact optimization because the compiler cannot use a register that may contain the value that was fetched earlier. And the compiler cannot choose freely when to fetch the value. The order of memory accesses are fixed if you use volatile. If the variable is not volatile then the compiler can arrange memory accesses any way they fit and thus generate better code. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 05:59:41PM -0700, Christoph Lameter wrote: > On Wed, 15 Aug 2007, Paul E. McKenney wrote: > > > The volatile cast should not disable all that many optimizations, > > for example, it is much less hurtful than barrier(). Furthermore, > > the main optimizations disabled (pulling atomic_read() and atomic_set() > > out of loops) really do need to be disabled. > > In many cases you do not need a barrier. Having volatile there *will* > impact optimization because the compiler cannot use a register that may > contain the value that was fetched earlier. And the compiler cannot choose > freely when to fetch the value. The order of memory accesses are fixed if > you use volatile. If the variable is not volatile then the compiler can > arrange memory accesses any way they fit and thus generate better code. Understood. My point is not that the impact is precisely zero, but rather that the impact on optimization is much less hurtful than the problems that could arise otherwise, particularly as compilers become more aggressive in their optimizations. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Christoph Lameter writes: > A volatile default would disable optimizations for atomic_read. > atomic_read without volatile would allow for full optimization by the > compiler. Seems that this is what one wants in many cases. Name one such case. An atomic_read should do a load from memory. If the programmer puts an atomic_read() in the code then the compiler should emit a load for it, not re-use a value returned by a previous atomic_read. I do not believe it would ever be useful for the compiler to collapse two atomic_read statements into a single load. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Paul E. McKenney wrote: > Understood. My point is not that the impact is precisely zero, but > rather that the impact on optimization is much less hurtful than the > problems that could arise otherwise, particularly as compilers become > more aggressive in their optimizations. The problems arise because barriers are not used as required. Volatile has wishy washy semantics and somehow marries memory barriers with data access. It is clearer to separate the two. Conceptual cleanness usually translates into better code. If one really wants the volatile then lets make it explicit and use atomic_read_volatile() - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler optimisations an acceptable price to pay for some peace of mind. Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Precisely the point -- use of volatile (whether in casts or on asms) in these cases are intended to disable those optimizations likely to result in heisenbugs. The only thing volatile on an asm does is create a side effect on the asm statement; in effect, it tells the compiler "do not remove this asm even if you don't need any of its outputs". It's not disabling optimisation likely to result in bugs, heisen- or otherwise; _not_ putting the volatile on an asm that needs it simply _is_ a bug :-) Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 11:51:42AM +1000, Paul Mackerras wrote: > > Name one such case. See sk_stream_mem_schedule in net/core/stream.c: /* Under limit. */ if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) { if (*sk->sk_prot->memory_pressure) *sk->sk_prot->memory_pressure = 0; return 1; } /* Over hard limit. */ if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) { sk->sk_prot->enter_memory_pressure(); goto suppress_allocation; } We don't need to reload sk->sk_prot->memory_allocated here. Now where is your example again? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Christoph Lameter wrote: > On Wed, 15 Aug 2007, Paul E. McKenney wrote: > > > Understood. My point is not that the impact is precisely zero, but > > rather that the impact on optimization is much less hurtful than the > > problems that could arise otherwise, particularly as compilers become > > more aggressive in their optimizations. > > The problems arise because barriers are not used as required. Volatile > has wishy washy semantics and somehow marries memory barriers with data > access. It is clearer to separate the two. Conceptual cleanness usually > translates into better code. If one really wants the volatile then lets > make it explicit and use > > atomic_read_volatile() Completely agreed, again. To summarize again (had done so about ~100 mails earlier in this thread too :-) ... atomic_{read,set}_volatile() -- guarantees volatility also along with atomicity (the two _are_ different concepts after all, irrespective of whether callsites normally want one with the other or not) atomic_{read,set}_nonvolatile() -- only guarantees atomicity, compiler free to elid / coalesce / optimize such accesses, can keep the object in question cached in a local register, leads to smaller text, etc. As to which one should be the default atomic_read() is a question of whether majority of callsites (more weightage to important / hot codepaths, lesser to obscure callsites) want a particular behaviour. Do we have a consensus here? (hoping against hope, probably :-) [ This thread has gotten completely out of hand ... for my mail client alpine as well, it now seems. Reminds of that 1000+ GPLv3 fest :-) ] - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Herbert Xu writes: > See sk_stream_mem_schedule in net/core/stream.c: > > /* Under limit. */ > if (atomic_read(sk->sk_prot->memory_allocated) < > sk->sk_prot->sysctl_mem[0]) { > if (*sk->sk_prot->memory_pressure) > *sk->sk_prot->memory_pressure = 0; > return 1; > } > > /* Over hard limit. */ > if (atomic_read(sk->sk_prot->memory_allocated) > > sk->sk_prot->sysctl_mem[2]) { > sk->sk_prot->enter_memory_pressure(); > goto suppress_allocation; > } > > We don't need to reload sk->sk_prot->memory_allocated here. Are you sure? How do you know some other CPU hasn't changed the value in between? Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
A volatile default would disable optimizations for atomic_read. atomic_read without volatile would allow for full optimization by the compiler. Seems that this is what one wants in many cases. Name one such case. An atomic_read should do a load from memory. If the programmer puts an atomic_read() in the code then the compiler should emit a load for it, not re-use a value returned by a previous atomic_read. I do not believe it would ever be useful for the compiler to collapse two atomic_read statements into a single load. An atomic_read() implemented as a "normal" C variable read would allow that read to be combined with another "normal" read from that variable. This could perhaps be marginally useful, although I'd bet you cannot see it unless counting cycles on a simulator or counting bits in the binary size. With an asm() implementation, the compiler can not do this; with a "volatile" implementation (either volatile variable or volatile-cast), this invokes undefined behaviour (in both C and GCC). Segher - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 07:45:44AM +0530, Satyam Sharma wrote: > > Completely agreed, again. To summarize again (had done so about ~100 mails > earlier in this thread too :-) ... > > atomic_{read,set}_volatile() -- guarantees volatility also along with > atomicity (the two _are_ different concepts after all, irrespective of > whether callsites normally want one with the other or not) > > atomic_{read,set}_nonvolatile() -- only guarantees atomicity, compiler > free to elid / coalesce / optimize such accesses, can keep the object > in question cached in a local register, leads to smaller text, etc. > > As to which one should be the default atomic_read() is a question of > whether majority of callsites (more weightage to important / hot > codepaths, lesser to obscure callsites) want a particular behaviour. > > Do we have a consensus here? (hoping against hope, probably :-) I can certainly agree with this. But I have to say that I still don't know of a single place where one would actually use the volatile variant. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 12:05:56PM +1000, Paul Mackerras wrote: > Herbert Xu writes: > > > See sk_stream_mem_schedule in net/core/stream.c: > > > > /* Under limit. */ > > if (atomic_read(sk->sk_prot->memory_allocated) < > > sk->sk_prot->sysctl_mem[0]) { > > if (*sk->sk_prot->memory_pressure) > > *sk->sk_prot->memory_pressure = 0; > > return 1; > > } > > > > /* Over hard limit. */ > > if (atomic_read(sk->sk_prot->memory_allocated) > > > sk->sk_prot->sysctl_mem[2]) { > > sk->sk_prot->enter_memory_pressure(); > > goto suppress_allocation; > > } > > > > We don't need to reload sk->sk_prot->memory_allocated here. > > Are you sure? How do you know some other CPU hasn't changed the value > in between? Yes I'm sure, because we don't care if others have increased the reservation. Note that even if we did we'd be using barriers so volatile won't do us any good here. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, 16 Aug 2007, Paul Mackerras wrote: > > We don't need to reload sk->sk_prot->memory_allocated here. > > Are you sure? How do you know some other CPU hasn't changed the value > in between? The cpu knows because the cacheline was not invalidated. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Herbert Xu wrote: But I have to say that I still don't know of a single place where one would actually use the volatile variant. Given that many of the existing users do currently have "volatile", are you comfortable simply removing that behaviour from them? Are you sure that you will not introduce any issues? Forcing a re-read is only a performance penalty. Removing it can cause behavioural changes. I would be more comfortable making the default match the majority of the current implementations (ie: volatile semantics). Then, if someone cares about performance they can explicitly validate the call path and convert it over to the non-volatile version. Correctness before speed... Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, 16 Aug 2007, Paul Mackerras wrote: > Herbert Xu writes: > > > See sk_stream_mem_schedule in net/core/stream.c: > > > > /* Under limit. */ > > if (atomic_read(sk->sk_prot->memory_allocated) < > > sk->sk_prot->sysctl_mem[0]) { > > if (*sk->sk_prot->memory_pressure) > > *sk->sk_prot->memory_pressure = 0; > > return 1; > > } > > > > /* Over hard limit. */ > > if (atomic_read(sk->sk_prot->memory_allocated) > > > sk->sk_prot->sysctl_mem[2]) { > > sk->sk_prot->enter_memory_pressure(); > > goto suppress_allocation; > > } > > > > We don't need to reload sk->sk_prot->memory_allocated here. > > Are you sure? How do you know some other CPU hasn't changed the value > in between? I can't speak for this particular case, but there could be similar code examples elsewhere, where we do the atomic ops on an atomic_t object inside a higher-level locking scheme that would take care of the kind of problem you're referring to here. It would be useful for such or similar code if the compiler kept the value of that atomic object in a register. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 03:23:28AM +0200, Segher Boessenkool wrote: > No; compilation units have nothing to do with it, GCC can optimise > across compilation unit boundaries just fine, if you tell it to > compile more than one compilation unit at once. > >>> > >>>Last I checked, the Linux kernel build system did compile each .c > >>>file > >>>as a separate compilation unit. > >> > >>I have some patches to use -combine -fwhole-program for Linux. > >>Highly experimental, you need a patched bleeding edge toolchain. > >>If there's interest I'll clean it up and put it online. > >> > >>David Woodhouse had some similar patches about a year ago. > > > >Sounds exciting... ;-) > > Yeah, the breakage is *quite* spectacular :-) I bet!!! ;-) > >In many cases, the compiler also has to assume that > >msleep_interruptible() > >might call back into a function in the current compilation unit, > >thus > >possibly modifying global static variables. > > It most often is smart enough to see what compilation-unit-local > variables might be modified that way, though :-) > >>> > >>>Yep. For example, if it knows the current value of a given such > >>>local > >>>variable, and if all code paths that would change some other variable > >>>cannot be reached given that current value of the first variable. > >> > >>Or the most common thing: if neither the address of the translation- > >>unit local variable nor the address of any function writing to that > >>variable can "escape" from that translation unit, nothing outside > >>the translation unit can write to the variable. > > > >But there is usually at least one externally callable function in > >a .c file. > > Of course, but often none of those will (indirectly) write a certain > static variable. But there has to be some path to the static functions, assuming that they are not dead code. Yes, there can be cases where the compiler knows enough about the state of the variables to rule out some of code paths to them, but I have no idea how often this happens in kernel code. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Segher Boessenkool wrote: Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler optimisations an acceptable price to pay for some peace of mind. So why is this a good tradeoff? I also think that just adding things to APIs in the hope it might fix up some bugs isn't really a good road to go down. Where do you stop? On the actual proposal to make atomic_operators volatile: I think the better approach in the long term, for both maintainability of the code and education of coders, is to make the use of barriers _more_ explicit rather than sprinkling these "just in case" ones around. You may get rid of a few atomic_read heisenbugs (in noise when compared to all bugs), but if the coder was using a regular atomic load, or a test_bit (which is also atomic), etc. then they're going to have problems. It would be better for Linux if everyone was to have better awareness of barriers than to hide some of the cases where they're required. A pretty large number of bugs I see in lock free code in the VM is due to memory ordering problems. It's hard to find those bugs, or even be aware when you're writing buggy code if you don't have some feel for barriers. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, 16 Aug 2007, Herbert Xu wrote: > > Do we have a consensus here? (hoping against hope, probably :-) > > I can certainly agree with this. I agree too. > But I have to say that I still don't know of a single place > where one would actually use the volatile variant. I suspect that what you say is true after we have looked at all callers. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, 15 Aug 2007, Christoph Lameter wrote: > On Thu, 16 Aug 2007, Paul Mackerras wrote: > > > > We don't need to reload sk->sk_prot->memory_allocated here. > > > > Are you sure? How do you know some other CPU hasn't changed the value > > in between? > > The cpu knows because the cacheline was not invalidated. Crap my statement above is wrong. We do not care that the value was changed otherwise we would have put a barrier in there. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 03:30:44AM +0200, Segher Boessenkool wrote: > >>>Part of the motivation here is to fix heisenbugs. If I knew where > >>>they > >> > >>By the same token we should probably disable optimisations > >>altogether since that too can create heisenbugs. > > > >Precisely the point -- use of volatile (whether in casts or on asms) > >in these cases are intended to disable those optimizations likely to > >result in heisenbugs. > > The only thing volatile on an asm does is create a side effect > on the asm statement; in effect, it tells the compiler "do not > remove this asm even if you don't need any of its outputs". > > It's not disabling optimisation likely to result in bugs, > heisen- or otherwise; _not_ putting the volatile on an asm > that needs it simply _is_ a bug :-) Yep. And the reason it is a bug is that it fails to disable the relevant compiler optimizations. So I suspect that we might actually be saying the same thing here. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Wed, Aug 15, 2007 at 06:41:40PM -0700, Christoph Lameter wrote: > On Wed, 15 Aug 2007, Paul E. McKenney wrote: > > > Understood. My point is not that the impact is precisely zero, but > > rather that the impact on optimization is much less hurtful than the > > problems that could arise otherwise, particularly as compilers become > > more aggressive in their optimizations. > > The problems arise because barriers are not used as required. Volatile > has wishy washy semantics and somehow marries memory barriers with data > access. It is clearer to separate the two. Conceptual cleanness usually > translates into better code. If one really wants the volatile then lets > make it explicit and use > > atomic_read_volatile() There are indeed architectures where you can cause gcc to emit memory barriers in response to volatile. I am assuming that we are -not- making gcc do this. Given this, then volatiles and memory barrier instructions are orthogonal -- one controls the compiler, the other controls the CPU. Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 10:11:05AM +0800, Herbert Xu wrote: > On Thu, Aug 16, 2007 at 12:05:56PM +1000, Paul Mackerras wrote: > > Herbert Xu writes: > > > > > See sk_stream_mem_schedule in net/core/stream.c: > > > > > > /* Under limit. */ > > > if (atomic_read(sk->sk_prot->memory_allocated) < > > > sk->sk_prot->sysctl_mem[0]) { > > > if (*sk->sk_prot->memory_pressure) > > > *sk->sk_prot->memory_pressure = 0; > > > return 1; > > > } > > > > > > /* Over hard limit. */ > > > if (atomic_read(sk->sk_prot->memory_allocated) > > > > sk->sk_prot->sysctl_mem[2]) { > > > sk->sk_prot->enter_memory_pressure(); > > > goto suppress_allocation; > > > } > > > > > > We don't need to reload sk->sk_prot->memory_allocated here. > > > > Are you sure? How do you know some other CPU hasn't changed the value > > in between? > > Yes I'm sure, because we don't care if others have increased > the reservation. > > Note that even if we did we'd be using barriers so volatile > won't do us any good here. If the load-coalescing is important to performance, why not load into a local variable? Thanx, Paul - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, 16 Aug 2007, Satyam Sharma wrote: > On Thu, 16 Aug 2007, Paul Mackerras wrote: > > Herbert Xu writes: > > > > > See sk_stream_mem_schedule in net/core/stream.c: > > > > > > /* Under limit. */ > > > if (atomic_read(sk->sk_prot->memory_allocated) < > > > sk->sk_prot->sysctl_mem[0]) { > > > if (*sk->sk_prot->memory_pressure) > > > *sk->sk_prot->memory_pressure = 0; > > > return 1; > > > } > > > > > > /* Over hard limit. */ > > > if (atomic_read(sk->sk_prot->memory_allocated) > > > > sk->sk_prot->sysctl_mem[2]) { > > > sk->sk_prot->enter_memory_pressure(); > > > goto suppress_allocation; > > > } > > > > > > We don't need to reload sk->sk_prot->memory_allocated here. > > > > Are you sure? How do you know some other CPU hasn't changed the value > > in between? > > I can't speak for this particular case, but there could be similar code > examples elsewhere, where we do the atomic ops on an atomic_t object > inside a higher-level locking scheme that would take care of the kind of > problem you're referring to here. It would be useful for such or similar > code if the compiler kept the value of that atomic object in a register. We might not be using atomic_t (and ops) if we already have a higher-level locking scheme, actually. So as Herbert mentioned, such cases might just not care. [ Too much of this thread, too little sleep, sorry! ] Anyway, the problem, of course, is that this conversion to a stronger / safer-by-default behaviour doesn't happen with zero cost to performance. Converting atomic ops to "volatile" behaviour did add ~2K to kernel text for archs such as i386 (possibly to important codepaths) that didn't have those semantics already so it would be constructive to actually look at those differences and see if there were really any heisenbugs that got rectified. Or if there were legitimate optimizations that got wrongly disabled. Onus lies on those proposing the modifications, I'd say ;-) - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Satyam Sharma writes: > I can't speak for this particular case, but there could be similar code > examples elsewhere, where we do the atomic ops on an atomic_t object > inside a higher-level locking scheme that would take care of the kind of > problem you're referring to here. It would be useful for such or similar > code if the compiler kept the value of that atomic object in a register. If there is a higher-level locking scheme then there is no point to using atomic_t variables. Atomic_t is specifically for the situation where multiple CPUs are updating a variable without locking. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Herbert Xu writes: > > Are you sure? How do you know some other CPU hasn't changed the value > > in between? > > Yes I'm sure, because we don't care if others have increased > the reservation. But others can also reduce the reservation. Also, the code sets and clears *sk->sk_prot->memory_pressure nonatomically with respect to the reads of sk->sk_prot->memory_allocated, so in fact the code doesn't guarantee any particular relationship between the two. That code looks like a beautiful example of buggy, racy code where someone has sprinkled magic fix-the-races dust (otherwise known as atomic_t) around in a vain attempt to fix the races. That's assuming that all that stuff actually performs any useful purpose, of course, and that there isn't some lock held by the callers. In the latter case it is pointless using atomic_t. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
Christoph Lameter writes: > > But I have to say that I still don't know of a single place > > where one would actually use the volatile variant. > > I suspect that what you say is true after we have looked at all callers. It seems that there could be a lot of places where atomic_t is used in a non-atomic fashion, and that those uses are either buggy, or there is some lock held at the time which guarantees that other CPUs aren't changing the value. In both cases there is no point in using atomic_t; we might as well just use an ordinary int. In particular, atomic_read seems to lend itself to buggy uses. People seem to do things like: atomic_add(&v, something); if (atomic_read(&v) > something_else) ... and expect that there is some relationship between the value that the atomic_add stored and the value that the atomic_read will return, which there isn't. People seem to think that using atomic_t magically gets rid of races. It doesn't. I'd go so far as to say that anywhere where you want a non-"volatile" atomic_read, either your code is buggy, or else an int would work just as well. Paul. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
On Thu, Aug 16, 2007 at 01:23:06PM +1000, Paul Mackerras wrote: > > In particular, atomic_read seems to lend itself to buggy uses. People > seem to do things like: > > atomic_add(&v, something); > if (atomic_read(&v) > something_else) ... If you're referring to the code in sk_stream_mem_schedule then it's working as intended. The atomicity guarantees that the atomic_add/atomic_sub won't be seen in parts by other readers. We certainly do not need to see other atomic_add/atomic_sub operations immediately. If you're referring to another code snippet please cite. > I'd go so far as to say that anywhere where you want a non-"volatile" > atomic_read, either your code is buggy, or else an int would work just > as well. An int won't work here because += and -= do not have the atomicity guarantees that atomic_add/atomic_sub do. In particular, this may cause an atomic_read on another CPU to give a bogus reading. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html