On Tue, 2018-02-27 at 12:56 +0000, Ruslan Nikolaev via gcc wrote:
> But, of course, it is kind of annoying that double-width types (and that also
> includes potentially 64-bit on some 32-bit processors, e.g. i586 also has
> cmpxchg8b and no official way to read atomically otherwise) need special
> handling and compiler extensions which basically means that in a number of
> cases I cannot write portable code, I need to put a bunch of
> architecture-dependent ifdefs, for say, 64 bit atomics even.
The extension I outlined gives you a portable way to use wide CAS,
provided that the particular implementation supports this extension.
Whether the standard covers this extension is a different matter. You
can certainly propose it to the C and/or C++ committees, but that gets
easier if you can show existing practice.

> (And all this mess to accommodate almost non-existent case when someone wants
> to use atomic_load on read-only memory for wide types, in which no good
> solution exists anyway.)

You keep repeating that claim. You also keep ignoring the point about
what kind of performance programs can expect atomic loads to have, if
they are declared lock-free.

Also note that the use case is *not* about wider-than-machine-word
accesses on read-only memory, but whether a portable use of atomics
(which doesn't have to consider machine word size) can use atomic loads
on read-only memory.

> Particularly, imagine when someones writes some lock-free code for different
> types (in templates, macros, etc). It basically uses same C11 atomic
> primitives but for various integer sizes. Now I need special handling for
> larger types because whatever libatomic provides does not guarantee
> lock-freedom (i.e., useless) which otherwise I do not need.

If you use C11 atomics, you're bound to the usage intended by the
standard. Which means that if you need lock freedom, you must check or
ensure that the implementation provides it to you. If you make an
implicit assumption there beyond what the standard promises you, you are
not writing portable C11 concurrent code -- you are writing
architecture/platform-specific code.

Now, imagine someone writes atomic code using C11, and expects atomic
loads to actually perform like loads and not like RMWs -- which is what
most concurrent code does, in particular lots of nonblocking code ...
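To make the "check or ensure" step above concrete, here is a minimal
sketch in C11. The helper names ullong_atomics_are_lock_free and
load_counter are hypothetical; ATOMIC_LLONG_LOCK_FREE, atomic_init,
atomic_load and atomic_is_lock_free are the <stdatomic.h> facilities.
It only illustrates asking the implementation instead of assuming; on
some targets the run-time query may need libatomic at link time.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: true if atomic operations on unsigned long long
   are lock-free for this implementation. */
static bool ullong_atomics_are_lock_free(void)
{
#if ATOMIC_LLONG_LOCK_FREE == 2
    /* 2: the type is always lock-free, known at compile time. */
    return true;
#elif ATOMIC_LLONG_LOCK_FREE == 0
    /* 0: the type is never lock-free. */
    return false;
#else
    /* 1: sometimes lock-free, so ask at run time for an object. */
    atomic_ullong probe;
    atomic_init(&probe, 0);
    return atomic_is_lock_free(&probe);
#endif
}

/* Plain atomic load; code that needs lock freedom checks the helper
   above first instead of assuming it. */
static unsigned long long load_counter(atomic_ullong *counter)
{
    return atomic_load(counter);
}
```

Code that cannot tolerate a blocking fallback would call the helper
once at startup (or test the macro in an #if) and pick a different
algorithm when it reports false.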
> True that wider types may not be available across all architectures, but I
> would prefer to have generic and standard-conformant code at least for those
> that have them.

It is conforming to the standard.

> > The standard doesn't specify read-only memory, so it also doesn't forbid
> > the concept. The implementation takes it into account though, and thus
> > it's defined in that context.
> But my point is that a programmer cannot rely on this feature anyway unless
> she/he wants to write code which compiles only with gcc.

She/he wants to rely on implementation-specific behavior, so that's not
a problem.

> It is unspecified by the standard and implementations that use
> read-modify-write for atomic_load are perfectly valid. The whole point to
> have this standard in the first place is to allow code be compiled by
> different compilers, otherwise people can just rely on gcc-specific
> extensions.

And they can, because GCC's behavior conforms to the standard. It
doesn't have the implementation-specific properties you prefer, but
that's not about the standard but about your personal preferences.

> > The topic we're currently discussing does not significantly affect when
> > we can remove __sync builtins, IMO.
>
> They are the only builtins that directly expose double-width operations.
> Short of using assembly fall-backs, they are the only option right now.

We can still have an extension such as the one I outlined.

> > They do care about whether atomic operations are natively supported on
> > that particular type -- and that should include a load.
> I think, the whole point to have atomic operations is ability to provide
> lock-free operations whenever possible. Even though standard does not
> guarantee it, that is almost the only sane use case. Otherwise, there is no
> point -- you can always use locks. If they do not care about lock-freedom,
> they should just use locks.

The standards actually just promise you obstruction freedom.
Forward progress guarantees are a part of the intention behind the
lock-free class, but not all of it. There's address-freedom too, and an
implicit assumption about what rough class of performance a particular
operation is in.

The majority of synchronization code will care much more about
performance than about the operation being actually lock-free or not
(things like signal handlers or the C++ unsequenced execution policy are
the exceptions). Case in point: Lots of concurrent code built out of
lock-free atomics is actually not lock-free but has blocking parts -- so
it couldn't be used in things like signal handlers anyway.

> > Nobody is proposing to mark things as lock-free if they aren't. Thus, I
> > don't see any change to what's usable in signal handlers.
> It is not obvious to anyone that atomic_load will block.

What's your point? Iff atomic_load will never block, it will be marked
lock-free. Programs can check for this, so it will be obvious to them
whether a certain operation might block.

> It will *not* for single-width types. So, again we see differences for
> single- and double-width types.

So? Lock-freedom is a per-type property.
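As a rough illustration of "per-type property", the sketch below prints
what an implementation reports for a few types. The helper names
lock_free_word and report_lock_freedom are hypothetical; the
ATOMIC_*_LOCK_FREE macros are the ones <stdatomic.h> defines, with 0, 1
and 2 meaning never, sometimes and always lock-free.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical helper: map a *_LOCK_FREE macro value to a word. */
static const char *lock_free_word(int value)
{
    switch (value) {
    case 0:  return "never";
    case 1:  return "sometimes";
    case 2:  return "always";
    default: return "unknown";
    }
}

/* Print the lock-freedom class reported for each type. The answers
   may differ from one type to the next on the same target. */
static void report_lock_freedom(void)
{
    printf("int:       %s\n", lock_free_word(ATOMIC_INT_LOCK_FREE));
    printf("long long: %s\n", lock_free_word(ATOMIC_LLONG_LOCK_FREE));
    printf("pointer:   %s\n", lock_free_word(ATOMIC_POINTER_LOCK_FREE));
}
```

There is no such macro for double-width types; for those a program
would have to call atomic_is_lock_free on an object of the type, which
with GCC may end up as a libatomic call.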