On Tue, 2018-02-27 at 12:56 +0000, Ruslan Nikolaev via gcc wrote:
> But, of course, it is kind of annoying that double-width types (and that also 
> includes potentially 64-bit on some 32-bit processors, e.g. i586 also has 
> cmpxchg8b and no official way to read atomically otherwise) need special 
> handling and compiler extensions which basically means that in a number of 
> cases I cannot write portable code, I need to put a bunch of 
> architecture-dependent ifdefs, for say, 64 bit atomics even.

The extension I outlined gives you a portable way to use wide CAS,
provided that the particular implementation supports this extension.

Whether the standard covers this extension is a different matter.  You
can certainly propose it to the C and/or C++ committees, but that gets
easier if you can show existing practice.

> (And all this mess to accommodate almost non-existent case when someone wants 
> to use atomic_load on read-only memory for wide types, in which no good 
> solution exists anyway.)

You keep repeating that claim.  You also keep ignoring the point about
what kind of performance programs can expect atomic loads to have, if
they are declared lock-free.

Also note that the use case is *not* about wider-than-machine-word
accesses on read-only memory, but whether a portable use of atomics
(which doesn't have to consider machine word size) can use atomic loads
on read-only memory. 

> Particularly, imagine when someones writes some lock-free code for different 
> types (in templates, macros, etc). It basically uses same C11 atomic 
> primitives but for various integer sizes. Now I need special handling for 
> larger types because whatever libatomic provides does not guarantee 
> lock-freedom (i.e., useless) which otherwise I do not need.

If you use C11 atomics, you're bound to the usage intended by the
standard.  Which means that if you need lock freedom, you must check or
ensure that the implementation provides it to you.  If you make an
implicit assumption there beyond what the standard promises you, you are
not writing portable C11 concurrent code -- you are writing
architecture/platform-specific code.
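
A minimal sketch of what "check or ensure" can look like at compile time,
using only the standard ATOMIC_LLONG_LOCK_FREE macro.  The names
SEQ_COUNTER_LOCK_FREE, seq_counter, seq_next and
seq_usable_in_signal_handler are made up for this illustration:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical configuration macro: 1 only if atomic (unsigned) long long
 * is *always* lock-free on this implementation (macro value 2). */
#if ATOMIC_LLONG_LOCK_FREE == 2
#  define SEQ_COUNTER_LOCK_FREE 1
#else
#  define SEQ_COUNTER_LOCK_FREE 0
#endif

/* Hypothetical shared counter; static storage, so it starts at 0. */
static _Atomic unsigned long long seq_counter;

/* Returns the next sequence number.  The operation is always atomic, but
 * it is only guaranteed not to block when SEQ_COUNTER_LOCK_FREE is 1. */
static unsigned long long seq_next(void)
{
    return atomic_fetch_add_explicit(&seq_counter, 1, memory_order_relaxed);
}

/* Callers that need lock freedom (e.g. signal handlers) ask first. */
static bool seq_usable_in_signal_handler(void)
{
    return SEQ_COUNTER_LOCK_FREE;
}
```

Code that cannot tolerate blocking would branch on
seq_usable_in_signal_handler() (or refuse to build) instead of assuming
that every atomic type is lock-free.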

Now, imagine someone writes atomic code using C11, and expects atomic
loads to actually perform like loads and not like RMWs -- which is what
most concurrent code does, in particular lots of nonblocking code ...
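
To make the load-versus-RMW distinction concrete, here is a rough sketch
that contrasts a plain atomic load with a load emulated through
compare-and-swap.  The function names load_plain and load_via_cas are
invented for this example, and the CAS variant is only an illustration of
the general technique, not a description of what any particular
implementation does:

```c
#include <stdatomic.h>
#include <stdint.h>

/* A real load: only reads the object. */
static uint64_t load_plain(_Atomic uint64_t *p)
{
    return atomic_load_explicit(p, memory_order_acquire);
}

/* A load emulated by CAS.  We guess that the value is 0:
 *  - if the guess is right, the CAS stores 0 over 0 and 'expected' stays 0;
 *  - if the guess is wrong, the CAS fails and copies the current value
 *    into 'expected'.
 * Either way 'expected' ends up holding the current value, but the
 * operation is a read-modify-write: it needs write access to the memory,
 * so it is not suitable for objects in read-only memory. */
static uint64_t load_via_cas(_Atomic uint64_t *p)
{
    uint64_t expected = 0;
    atomic_compare_exchange_strong_explicit(p, &expected, 0,
                                            memory_order_acquire,
                                            memory_order_acquire);
    return expected;
}
```

Both functions return the same value, but only the first one behaves like
a load as far as memory permissions and performance expectations go.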

> True that wider types may not be available across all architectures, but I 
> would prefer to have generic and standard-conformant code at least for those 
> that have them.

It is conforming to the standard.

> > The standard doesn't specify read-only memory, so it also doesn't forbid
> > the concept.  The implementation takes it into account though, and thus
> > it's defined in that context.
> But my point is that a programmer cannot rely on this feature anyway unless 
> she/he wants to write code which compiles only with gcc.

She/he wants to rely on implementation-specific behavior, so that's not
a problem.

> It is unspecified by the standard and implementations that use 
> read-modify-write for atomic_load are perfectly valid. The whole point to 
> have this standard in the first place is to allow code be compiled by 
> different compilers, otherwise people can just rely on gcc-specific 
> extensions.

And they can, because GCC's behavior conforms to the standard.  It
doesn't have the implementation-specific properties you prefer, but
that's not about the standard but about your personal preferences.

> > The topic we're currently discussing does not significantly affect when
> > we can remove __sync builtins, IMO.
> 
> They are the only builtins that directly expose double-width operations. 
> Short of using assembly fall-backs, they are the only option right now.

We can still have an extension such as the one I outlined.

> > They do care about whether atomic operations are natively supported on
> > that particular type -- and that should include a load.
> I think, the whole point to have atomic operations is ability to provide 
> lock-free operations whenever possible. Even though standard does not 
> guarantee it, that is almost the only sane use case. Otherwise, there is no 
> point -- you can always use locks. If they do not care about lock-freedom, 
> they should just use locks.

The standards actually just promise you obstruction freedom.  Forward
progress guarantees are a part of the intention behind the lock-free
class, but not all of it.  There's address-freedom too, and an implicit
assumption about what rough class of performance a particular operation
is in.

The majority of synchronization code will care much more about
performance than about the operation being actually lock-free or not
(things like signal handlers or the C++ unsequenced execution policy are
the exceptions).  Case in point: Lots of concurrent code built out of
lock-free atomics is actually not lock-free but has blocking parts -- so
it couldn't be used in things like signal handlers anyway.

> 
> > Nobody is proposing to mark things as lock-free if they aren't.  Thus, I
> > don't see any change to what's usable in signal handlers.
> It is not obvious to anyone that atomic_load will block.

What's your point?  Iff atomic_load will never block, it will be marked
lock-free.  Programs can check for this, so it will be obvious to them
whether a certain operation might block.

> It will *not* for single-width types. So, again we see differences for 
> single- and double-width types.

So? Lock-freedom is a per-type property.
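
As a small sketch of checking this per type at run time with the standard
atomic_is_lock_free (the helper names int_lock_free, llong_lock_free,
ptr_lock_free and report_lock_freedom are hypothetical):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Each helper queries one atomic type; the answers may differ by type. */
static bool int_lock_free(void)
{
    _Atomic int x = 0;
    return atomic_is_lock_free(&x);
}

static bool llong_lock_free(void)
{
    _Atomic long long x = 0;
    return atomic_is_lock_free(&x);
}

static bool ptr_lock_free(void)
{
    void *_Atomic p = 0;
    return atomic_is_lock_free(&p);
}

/* Prints one line per type, so a program (or its user) can see which
 * atomic operations might block on this implementation. */
static void report_lock_freedom(FILE *out)
{
    fprintf(out, "int:       %s\n", int_lock_free() ? "lock-free" : "may block");
    fprintf(out, "long long: %s\n", llong_lock_free() ? "lock-free" : "may block");
    fprintf(out, "pointer:   %s\n", ptr_lock_free() ? "lock-free" : "may block");
}
```

A wider type would get its own query in the same way; nothing about the
answer for int says anything about the answer for a double-width type.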

