On 27/12/14 16:02, paul_kon...@dell.com wrote:
>
>> On Dec 26, 2014, at 6:19 PM, Andrew Haley <a...@redhat.com> wrote:
>>
>> On 26/12/14 22:49, Matt Godbolt wrote:
>>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley <a...@redhat.com> wrote:
>>>> On 26/12/14 20:32, Matt Godbolt wrote:
>>>>> Is there a reason why (in principle) the volatile increment can't be
>>>>> made into a single add? Clang and ICC both emit the same code for the
>>>>> volatile and non-volatile case.
>>>>
>>>> Yes. Volatiles use the "as if" rule, where every memory access is as
>>>> written. A volatile increment is defined as a load, an increment, and
>>>> a store.
>>>
>>> That makes sense to me from a logical point of view. My
>>> understanding though is the volatile keyword was mainly used when
>>> working with memory-mapped devices, where memory loads and stores
>>> could not be elided. A single-instruction load-modify-write like
>>> "increment [addr]" adheres to these constraints even though it is a
>>> single instruction. I realise my understanding could be wrong here!
>>> If not though, both clang and icc are taking a short-cut that may
>>> put them into a non-compliant state.
>>
>> It's hard to be certain. The language used by the standard is very
>> unhelpful: it requires all accesses to be as written, but does not
>> define exactly what constitutes an access.
>
> I would look at this sort of thing with the mindset of a network
> protocol designer. If the externally visible actions are correct,
> the implementation is correct. Details not visible at the external
> reference interface are irrelevant.
>
> In the case of volatile variables, the external interface in
> question is the one at the point where that address is implemented --
> a memory cell, or memory mapped I/O device on a bus. So the
> required behavior is that load and store operations (read and write
> transactions at that interface) occur as written.

I believe this is incorrect. On most architectures, making accesses
reach memory in program order would require volatile memory references
to emit memory barriers, and the C committee decided not to require
that.

> If a processor has add instructions that support memory references
> (as in x86 and vax, but not mips), such an instruction will perform
> a read cycle followed by a write cycle. So as seen at the critical
> interface, the behavior is the same as if you were to do an explicit
> load, register add, store sequence. Therefore the use of a single
> add-to-memory is a valid implementation.

I agree.

Andrew.
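
For reference, a minimal C sketch of the construct under discussion.
The names counter, bump_volatile and bump_explicit are hypothetical and
do not appear in the thread; counter only stands in for a memory-mapped
location.

```c
/* Hypothetical stand-in for a memory-mapped location (not from the thread). */
volatile int counter;

/* Volatile increment: in the abstract machine, one read of counter
 * followed by one write of counter. */
void bump_volatile(void)
{
    counter++;
}

/* The same sequence spelled out as load, register add, store. */
void bump_explicit(void)
{
    int tmp = counter;  /* one volatile read */
    tmp = tmp + 1;      /* add on a non-volatile temporary */
    counter = tmp;      /* one volatile write */
}
```

Both functions perform one read and one write of counter per call. The
open question in the thread is whether a compiler may implement either
of them as a single add-to-memory instruction on targets such as x86,
which is what Clang and ICC are reported above to do.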