[Catching up on some oooold mail.]

On 2025-01-22 02:02, Jₑₙₛ Gustedt wrote:
I hope you agree that the C23 and gcc
attributes coincide if no pointers are involved.

Has that been decided yet? I see from N3494 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3494.htm> that in February you proposed wording saying they don't coincide, but that change doesn't seem to appear in the C2y September draft N3685 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3685.pdf>.


WG14 didn't want to annotate any of the existing interfaces in the
C library for C23, that's all

If that's a continuing desire, it'd be helpful for the standard to say why the desire is present, at least in a footnote or rationale. This could contain text like what you said later in your email: strcmp ("a", "b") need not return the same integer if called twice, abs (n) has undefined behavior if n == INT_MIN, and implementations can declare standard functions to be [[unsequenced]] or [[reproducible]] even if the standard does not require such declarations. This sort of explanation would help explain the intent of [[unsequenced]] and [[reproducible]].
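To make the first two points concrete, here is a minimal sketch. The helper names sign_of, checked_abs and demo are hypothetical and appear nowhere in the standard or in this thread. The sketch relies only on what the standard does guarantee: the sign of strcmp's result, and abs being defined whenever the negated value is representable.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reduce a strcmp result to -1, 0, or 1.
   Only the sign of the result is specified, so portable code
   compares signs rather than exact values.  */
static int sign_of(int r) { return (r > 0) - (r < 0); }

/* Hypothetical helper: store |n| in *result and return true, or
   return false when -n is not representable, which is the case in
   which abs (n) has undefined behavior.  */
static bool checked_abs(int n, int *result)
{
    if (n < -INT_MAX)
        return false;
    *result = abs(n);
    return true;
}

/* Checks that hold regardless of which negative integer
   strcmp ("a", "b") happens to return.  */
static void demo(void)
{
    int r = 0;
    assert(sign_of(strcmp("a", "b")) == -1);
    assert(checked_abs(-7, &r) && r == 7);
}
```

A caller that caches the exact integer returned by strcmp, rather than its sign, is depending on something the standard does not promise.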


By "extension", I assume you mean that [[unsequenced]] is intended
to be looser than __attribute__((const)). That is, every const
function is unsequenced, but the reverse is not true. This is what
Bruno said in the above quote.

Yes that's my idea.

Then I'm more confused than ever, unfortunately.

Here's another illustration of my confusion. 6.7.13.8.3 EXAMPLE 2
says that given this definition:

    typedef struct toto toto;
    toto const *toto_zero(void) [[unsequenced]];

"a single call can be executed during thread startup and the return
value p and the value of the object *p of type toto const can be
cached." This means a compiler can optimize code like this:

     toto const x = *toto_zero();
     change_state();
     toto y = *toto_zero();

as if it were this:

     toto const x = *toto_zero();
     change_state();
     toto y = x;

That is, EXAMPLE 2 means an unsequenced function not only can examine
storage addressed by pointers passed to it: it also guarantees that
when it returns a pointer, storage addressed by that returned pointer
always has contents derivable from the function's arguments
(including the storage addressed by those arguments). In particular
toto_zero's [[unsequenced]] guarantees that change_state does not
modify the storage addressed by the pointer returned by toto_zero.

But this means that, contrary to stated intent,
__attribute__((const)) does not imply [[unsequenced]]. For example:

     toto x;
     toto const *toto1(void) { return &x; }
     void change_state() { memset(&x, 1, sizeof x); }

toto1 is a const function, but it's not unsequenced because toto1
returns a pointer to storage that change_state modifies.

In other words, [[unsequenced]] allows some compiler optimizations
that __attribute__((const)) does not - contrary to stated intent.
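
For concreteness, here is a self-contained version of that example. It is only a sketch under stated assumptions: the body of struct toto (a single int member v), the ATTRIBUTE_CONST macro, and the helper value_changes are all made up for illustration. It shows that the value seen through the pointer returned by toto1 differs before and after change_state, so reusing a cached copy of the pointed-to object across change_state would change the program's behavior.

```c
#include <assert.h>
#include <string.h>

/* Assumption: fall back to no attribute on non-GNU compilers.  */
#if defined __GNUC__
# define ATTRIBUTE_CONST __attribute__((const))
#else
# define ATTRIBUTE_CONST
#endif

/* Assumption: a concrete layout for toto, which the standard's
   example leaves incomplete.  */
typedef struct toto { int v; } toto;

static toto x;

/* Returns the same address on every call and reads no memory
   itself, which is why it is treated here as a const function.  */
ATTRIBUTE_CONST static toto const *toto1(void) { return &x; }

static void change_state(void) { memset(&x, 1, sizeof x); }

/* Hypothetical helper: return 1 if the object addressed by toto1's
   result has a different value after change_state, else 0.  */
static int value_changes(void)
{
    memset(&x, 0, sizeof x);      /* start from a known state */
    toto const before = *toto1();
    change_state();
    toto const after = *toto1();  /* same pointer, new contents */
    assert(before.v == 0);
    return before.v != after.v;
}
```

The pointer itself is identical across the two calls; only the storage behind it changes, and that is exactly the part EXAMPLE 2 says can be cached.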

Ok, yes, that is a difference then that we should note.

I tried to write something down to note that, but failed. After this paragraph is the text that I wrote. It fails because I do not see how the text of the note follows from the wording in the standard. (For example, I don't see how the standard allows a compiler to cache the storage addressed by the pointer that toto_zero returns. No doubt there's a connection there, but I'm just not seeing it, and that connection should be explained in the note. Also, it should be explained why it's important to allow such caching - it surely has something to do with a desire to treat pointers-to-objects as arguments and returned values that are the pointed-to objects' values, not the pointers, but I'm not seeing the ins and outs of that.)

----- start of failed note -----

How about adding something like the following text to the standard's rationale:

When C23 was written, [[unsequenced]] was intended to be weaker than GNU C's __attribute__((const)), in the sense that every function that could be marked with __attribute__ ((const)) could also be marked with [[unsequenced]]. The converse was not intended to be true: an [[unsequenced]] function might not be __attribute__((const)) because [[unsequenced]] functions are allowed to inspect the contents of objects addressed by their arguments whereas __attribute__((const)) functions are not.

Unfortunately, after C23 was published, it was discovered that C23 did not formalize this intent. For example:

  int x;
  const int *cf() { return &x; }
  void change_state() { x = 1; }

Although cf can be marked with __attribute__((const)) it is not unsequenced because it returns a pointer to storage that change_state modifies.

This problem does not occur with [[reproducible]] and GNU C's __attribute__((pure)): [[reproducible]] is strictly weaker than __attribute__((pure)), as was intended.

----- end of failed note -----


Could you perhaps be a bit more constructive and indicate what we
could add to 6.7.13.8.1 p3 or to the individual clauses of the two
attributes?

It's a bit of a chicken-and-egg problem. I don't fully understand that part of the standard despite reading it many times. It's cleverly written but almost entirely unmotivated, and the commentary in N2956 doesn't help much. Exposition needs to be written by someone who both understands the standard and knows how to explain it.

One possibility is to start with the commentary in Gnulib's m4/gnulib-common.m4, which has the most detailed exposition of [[reproducible]] and [[unsequenced]] that I know of. And then fix that commentary (since it surely has mistakes, and it doesn't cover issues like multithreading or access to volatile storage) so that it is clear to nonexpert readers. Here is the commentary for [[reproducible]]:

   It is OK for a compiler to move a call, or omit a duplicate call
   and reuse a cached value returned either directly or indirectly
   via a pointer argument, if other observable state is the same;
   however, these pointer arguments cannot alias.
   This attribute is safe for a function that is effectless and idempotent;
   see ISO C 23 § 6.7.13.8 for a definition of these terms.
   (This attribute is looser than _GL_ATTRIBUTE_UNSEQUENCED because
   the function need not be stateless or independent.  It is looser
   than _GL_ATTRIBUTE_PURE because the function need not return
   exactly once, and it can change state addressed by its pointer arguments
   that do not alias.)
   See also <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2956.htm> and
   <https://stackoverflow.com/questions/76847905/>.
   ATTENTION! Efforts are underway to change the meaning of this attribute.
   See <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3494.htm>.  */
   /* Applies to: functions, pointer to functions, function types.  */

and here is the commentary for [[unsequenced]]:

   It is OK for a compiler to move a call, or omit a duplicate call
   and reuse a cached return value, if the state addressed by its arguments is the same.
   This attribute is safe for a function that is effectless, idempotent,
   stateless, and independent; see ISO C 23 § 6.7.13.8 for a definition of
   these terms.
   (This attribute is stricter than _GL_ATTRIBUTE_REPRODUCIBLE because
   the function must be stateless and independent.  It differs from
   _GL_ATTRIBUTE_CONST because the function need not return exactly
   once and can depend on state accessed via its pointer arguments
   that do not alias, or on other state that happens to have the
   same value for all calls to the function.)
   See also <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2956.htm> and
   <https://stackoverflow.com/questions/76847905/>.
   ATTENTION! Efforts are underway to change the meaning of this attribute.
   See <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3494.htm>.  */
   /* Applies to: functions, pointer to functions, function types.  */

You can get a full copy of the file containing this commentary (which also explains const and pure) by doing:

   git clone https://https.git.savannah.gnu.org/git/gnulib.git

and then look for 'reproducible' and 'unsequenced' in the file gnulib/m4/gnulib-common.m4.




and the proposed change doesn't help.
Without a clearer explanation of why these attributes are present I
fear we'll continue to have misunderstandings and glitches.

That's what we are trying to do here, no?

Let's not underestimate the difficulty of making these concepts
clear.

Oh, no, indeed, I don't think they are. The text that now is in C23
is probably one of the most delicate texts I ever wrote in my whole
career.

Oh, this means you weren't around when the sequence-point language
got added to the C standard in the 1980s.

no, and I don't know why we would be discussing ancient history here.

My comments this week are in the hope that things have gotten better
in the standardization process.

I wouldn't know about that obviously, but I appreciate that you give
us the benefit of the doubt ;-)

All I can tell you is that our attempt to standardize a scarcely
documented but commonly used gcc feature seems to have failed,
partially. That's a pity.

Now n3424 is an attempt to make things better, I am all ears.

Jₑₙₛ



Some other problems I noticed in N3220:

* § 6.7.13.8.1 ¶ 4 says "the based-on relation between pointer parameters and lvalues". The based-on relation is between pointer expressions and objects, not between parameters and lvalues. I think I know the intent here, but the wording should be made clearer. Similar wording problems occur in ¶ 6, 7.

* § 6.7.13.8.1 ¶ 4 says "An object definition of an object X in a function f escapes if an access to X happens while no call to f is active." However, the term "active" is not defined. If f calls g and g accesses X, is f "active" during that access? If f calls g which calls f, and the inner f returns and g accesses an object X defined by the inner f, is this allowed because the outer f is still "active"? That sort of thing.

* § 6.7.13.8.1 ¶ 4 says "operations that allow to change this state", which is ungrammatical. Perhaps "operations that are allowed to change this state" was intended? But who or what is doing the allowing here? Perhaps change the wording to just "operations that change this state"?

* § 6.7.13.8.1 ¶ 6's definition of "independent" means that in the following code:

  int f (int *p, int q) { return *p + q; }
  struct s { int *p; int q; };
  int g (struct s s) { return *s.p + s.q; }

f is reproducible but g is not, even though they are near equivalents; GCC 15 x86-64 generates identical machine code for them. Presumably this was intended; there should be something saying so, though, and why. For example, does C23 allow GCC to support g being declared [[reproducible]]?

* § 6.7.13.8.1 ¶ 7's definition of "effectless" says a function call evaluation "is effectless if any store operation that is sequenced during the call is the modification of an object that synchronizes with the call". But ¶ 4 says that an object X synchronizes with a call merely if every access to X occurs either before the call, or during the call, or after the call. Hence in this program:

   int counter;
   void clear_counter() { counter = 0; }

clear_counter is "effectless". This meaning of "effectless" is so counterintuitive that I wonder whether I am misinterpreting the definition: it means clear_counter is reproducible (since it is obviously idempotent).

* § 6.7.13.8.1 ¶ 12 says "If possible, it is recommended that implementations diagnose if an attribute of this clause is applied to a function definition that does not have the corresponding property." Does this recommendation extend to function "g" above with GCC, where g is equivalent to a reproducible function at the machine level? Is C23 intended to recommend that GCC diagnose a declaration of g with [[reproducible]]?

* § 6.7.13.8.1 ¶ 12 says "It is recommended that applications that assert the independent or effectless properties for functions qualify pointer parameters with restrict." This recommendation surely goes too far. N3220 itself does not follow this recommendation in any of its examples (§ 6.7.13.8.2 ¶ 3, § 6.7.13.8.3 ¶ 8). Although I suppose it can make sense to use 'restrict' on some functions that take pointer parameters and are marked [[unsequenced]] or [[reproducible]], this applies only in some circumstances, and if any recommendation is given these circumstances should be spelled out and examples given.
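
Returning to the ¶ 6 item above, here is a minimal sketch that assumes nothing beyond the definitions of f and g given there. The helper same_result is hypothetical. It checks that f and g compute the same result from the same inputs. No attribute is applied to either function, since whether g may be declared [[reproducible]] is precisely the open question.

```c
#include <assert.h>

/* f and g as defined in the ¶ 6 item: f takes the pointer and the
   addend as separate parameters, g takes them wrapped in a struct.  */
int f(int *p, int q) { return *p + q; }
struct s { int *p; int q; };
int g(struct s s) { return *s.p + s.q; }

/* Hypothetical helper: return nonzero if f and g agree when the
   pointer addresses an int holding v and the addend is q.  */
int same_result(int v, int q)
{
    struct s s = { &v, q };
    int from_f = f(&v, q);
    int from_g = g(s);
    assert(from_f == v + q);
    return from_f == from_g;
}
```

The two functions differ only in how the pointer reaches them, and that difference is what the definition of "independent" turns on.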

There are some other problems but I've run out of time for now, and I expect readers of this email have run out of patience.
