[Catching up on some oooold mail.]
On 2025-01-22 02:02, Jₑₙₛ Gustedt wrote:
I hope you agree that the C23 and gcc
attributes coincide if no pointers are involved.
Has that been decided yet? I see from N3494
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3494.htm> that in
February you proposed wording saying they don't coincide, but that
change doesn't seem to appear in the C2y September draft N3685
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3685.pdf>.
WG14 didn't want to annotate any of the existing interfaces in the
C library for C23, that's all.
If that's a continuing desire, it'd be helpful for the standard to say
why the desire is present, at least in a footnote or rationale. This
could contain text like what you said later in your email: strcmp ("a",
"b") need not return the same integer if called twice, abs (n) has
undefined behavior if n == INT_MIN, and implementations can declare
standard functions to be [[unsequenced]] or [[reproducible]] even if the
standard does not require such declarations. This sort of explanation
would help explain the intent of [[unsequenced]] and [[reproducible]].
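To make the first two points concrete, here is a minimal sketch of how portable code might defend against them. The helper names cmp_sign and checked_abs are hypothetical, invented for this illustration; they are not part of any standard or library.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: reduce strcmp's result to -1, 0, or 1.
   Only the sign of strcmp's result is specified, so callers that
   compare the raw integers from two calls are not portable.  */
int
cmp_sign (char const *a, char const *b)
{
  int r = strcmp (a, b);
  return (r > 0) - (r < 0);
}

/* Hypothetical helper: store |n| in *result and return true, or
   return false when |n| is not representable as an int
   (n == INT_MIN on two's complement), which is the case where
   abs (n) has undefined behavior.  */
bool
checked_abs (int n, int *result)
{
  if (n < -INT_MAX)
    return false;
  *result = n < 0 ? -n : n;
  return true;
}
```

A footnote or rationale entry could use something along these lines to show why the standard does not simply declare strcmp or abs with one of the new attributes.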
By "extension", I assume you mean that [[unsequenced]] is intended
to be looser than __attribute__((const)). That is, every const
function is unsequenced, but the reverse is not true. This is what
Bruno said in the above quote.
Yes that's my idea.
Then I'm more confused than ever, unfortunately.
Here's another illustration of my confusion. 6.7.13.8.3 EXAMPLE 2
says that given this definition:
typedef struct toto toto;
toto const *toto_zero(void) [[unsequenced]];
"a single call can be executed during thread startup and the return
value p and the value of the object *p of type toto const can be
cached." This means a compiler can optimize code like this:
toto const x = *toto_zero();
change_state();
toto y = *toto_zero();
as if it were this:
toto const x = *toto_zero();
change_state();
toto y = x;
That is, EXAMPLE 2 means an unsequenced function not only can examine
storage addressed by pointers passed to it: it also guarantees that
when it returns a pointer, storage addressed by that returned pointer
always has contents derivable from the function's arguments
(including the storage addressed by those arguments). In particular
toto_zero's [[unsequenced]] guarantees that change_state does not
modify the storage addressed by the pointer returned by toto_zero.
But this means that, contrary to stated intent,
__attribute__((const)) does not imply [[unsequenced]]. For example:
toto x;
toto const *toto1(void) { return &x; }
void change_state() { memset(&x, 1, sizeof x); }
toto1 is a const function, but it's not unsequenced because toto1
returns a pointer to storage that change_state modifies.
In other words, [[unsequenced]] allows some compiler optimizations
that __attribute__((const)) does not - contrary to stated intent.
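Here is a compilable sketch of the toto1 example, in case it helps to see the effect. The layout of struct toto and the helper contents_changed are assumptions made only for this illustration; the GNU attribute is guarded so that the code still compiles elsewhere.

```c
#include <string.h>

/* Assumed layout; the original example leaves struct toto opaque.  */
typedef struct toto { int a, b; } toto;

toto x;   /* static storage, so initially all zero */

/* toto1 always returns the same address, which is why it can be
   marked const in GNU C.  */
#ifdef __GNUC__
__attribute__ ((const))
#endif
toto const *toto1 (void) { return &x; }

void change_state (void) { memset (&x, 1, sizeof x); }

/* Hypothetical helper: return 1 if the object seen through toto1's
   result differs before and after the first change_state call.  */
int
contents_changed (void)
{
  toto before = *toto1 ();
  change_state ();
  toto after = *toto1 ();
  return memcmp (&before, &after, sizeof before) != 0;
}
```

The pointer that toto1 returns never changes, but the object behind it does, so a compiler that cached *toto1 () across change_state (as EXAMPLE 2 appears to permit for an unsequenced function) would produce a different result here.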
Ok, yes, that is a difference then that we should note.
I tried to write something down to note that, but failed. After this
paragraph is the text that I wrote. It fails because I do not see how
the text of the note follows from the wording in the standard. (For
example, I don't see how the standard allows a compiler to cache the
storage addressed by the pointer that toto_zero returns. No doubt
there's a connection there, but I'm just not seeing it, and that
connection should be explained in the note. Also, it should be explained
why it's important to allow such caching - it surely has something to do
with a desire to treat pointers-to-objects as arguments and returned
values that are the pointed-to objects' values, not the pointers, but
I'm not seeing the ins and outs of that.)
----- start of failed note -----
How about adding something like the following text to the standard's
rationale:
When C23 was written, [[unsequenced]] was intended to be weaker than GNU
C's __attribute__((const)), in the sense that every function that could
be marked with __attribute__ ((const)) could also be marked with
[[unsequenced]]. The converse was not intended to be true: an
[[unsequenced]] function might not be __attribute__((const)) because
[[unsequenced]] functions are allowed to inspect the contents of objects
addressed by their arguments whereas __attribute__((const)) functions
are not.
Unfortunately, after C23 was published, it was discovered that C23 did
not formalize this intent. For example:
int x;
const int *cf() { return &x; }
void change_state() { x = 1; }
Although cf can be marked with __attribute__((const)) it is not
unsequenced because it returns a pointer to storage that change_state
modifies.
This problem does not occur with [[reproducible]] and GNU C's
__attribute__((pure)): [[reproducible]] is strictly weaker than
__attribute__((pure)), as was intended.
----- end of failed note -----
Could you perhaps be a bit more constructive and indicate what we
could add to 6.7.13.8.1 p3 or to the individual clauses of the two
attributes?
It's a bit of chicken-and-egg problem. I don't fully understand that
part of the standard despite reading it many times. It's cleverly
written but almost entirely unmotivated and the commentary in N2956
doesn't help much. Exposition needs to be written by someone who both
understands the standard and knows how to explain it.
One possibility is to start with the commentary in Gnulib's
m4/gnulib-common.m4, which has the most detailed exposition of
[[reproducible]] and [[unsequenced]] that I know of. And then fix that
commentary (since it surely has mistakes, and it doesn't cover issues
like multithreading or access to volatile storage) so that it is clear
to nonexpert readers. Here is the commentary for [[reproducible]]:
It is OK for a compiler to move a call, or omit a duplicate call
and reuse a cached value returned either directly or indirectly
via a pointer argument, if other observable state is the same;
however, these pointer arguments cannot alias.
This attribute is safe for a function that is effectless and idempotent;
see ISO C 23 § 6.7.13.8 for a definition of these terms.
(This attribute is looser than _GL_ATTRIBUTE_UNSEQUENCED because
the function need not be stateless or independent. It is looser
than _GL_ATTRIBUTE_PURE because the function need not return
exactly once, and it can change state addressed by its pointer arguments
that do not alias.)
See also
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2956.htm> and
<https://stackoverflow.com/questions/76847905/>.
ATTENTION! Efforts are underway to change the meaning of this attribute.
See <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3494.htm>. */
/* Applies to: functions, pointer to functions, function types. */
and here is the commentary for [[unsequenced]]:
It is OK for a compiler to move a call, or omit a duplicate call
and reuse a cached return value, if the state addressed by its arguments is the same.
This attribute is safe for a function that is effectless, idempotent,
stateless, and independent; see ISO C 23 § 6.7.13.8 for a definition of
these terms.
(This attribute is stricter than _GL_ATTRIBUTE_REPRODUCIBLE because
the function must be stateless and independent. It differs from
_GL_ATTRIBUTE_CONST because the function need not return exactly
once and can depend on state accessed via its pointer arguments
that do not alias, or on other state that happens to have the
same value for all calls to the function.)
See also
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2956.htm> and
<https://stackoverflow.com/questions/76847905/>.
ATTENTION! Efforts are underway to change the meaning of this attribute.
See <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3494.htm>. */
/* Applies to: functions, pointer to functions, function types. */
You can get a full copy of the file containing this commentary (which
also explains const and pure) by doing:
git clone https://https.git.savannah.gnu.org/git/gnulib.git
and then look for 'reproducible' and 'unsequenced' in the file
gnulib/m4/gnulib-common.m4.
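As a rough illustration of what I take the [[reproducible]] commentary to be describing, here is a function whose only store goes through its pointer argument and which is idempotent. The names clamp_byte and same_after_second_call are hypothetical, and whether C23's formal definitions classify clamp_byte this way is exactly the sort of question I am unsure about.

```c
/* Hypothetical example: clamp *v into [0, 255] in place.  The only
   store is through the pointer argument, and a second call on the
   same object leaves it unchanged.  */
void
clamp_byte (int *v)
{
  if (*v < 0)
    *v = 0;
  else if (*v > 255)
    *v = 255;
}

/* Hypothetical check: return 1 if calling clamp_byte twice on START
   gives the same value as calling it once.  */
int
same_after_second_call (int start)
{
  int once = start;
  clamp_byte (&once);
  int twice = once;
  clamp_byte (&twice);
  return once == twice;
}
```

If the commentary is right, a compiler would be allowed to drop the second clamp_byte call in same_after_second_call; an exposition aimed at nonexperts could start from small cases like this one.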
and the proposed change doesn't help.
Without a clearer explanation of why these attributes are present I
fear we'll continue to have misunderstandings and glitches.
That's what we are trying to do here, no?
Let's not underestimate the difficulty of making these concepts
clear.
Oh, no, indeed, I don't think they are. The text that now is in C23
is probably one of the most delicate texts I ever wrote in my whole
career.
Oh, this means you weren't around when the sequence-point language
got added to the C standard in the 1980s.
No, and I don't know why we would be discussing ancient history here.
My comments this week are in the hope that things have gotten better
in the standardization process.
I wouldn't know about that obviously, but I appreciate that you give
us the benefit of the doubt ;-)
All I can tell you is that our attempt to standardize a scarcely
documented but commonly used gcc feature seems to have failed,
partially. That's a pity.
Now n3424 is an attempt to make things better; I am all ears.
Jₑₙₛ
Some other problems I noticed in N3220:
* § 6.7.13.8.1 ¶ 4 says "the based-on relation between
pointer parameters and lvalues". The based-on relation is between
pointer expressions and objects, not between parameters and lvalues. I
think I know the intent here, but the wording should be made clearer.
Similar wording problems occur in ¶ 6, 7.
* § 6.7.13.8.1 ¶ 4 says "An object definition of an object X in a
function f escapes if an access to X happens while no call to f is
active." However, the term "active" is not defined. If f calls g and g
accesses X, is f "active" during that access? If f calls g which calls
f, and the inner f returns and g accesses an object X defined by the
inner f, is this allowed because the outer f is still "active"? That
sort of thing.
* § 6.7.13.8.1 ¶ 4 says "operations that allow to change this state",
which is ungrammatical. Perhaps "operations that are allowed to change
this state" was intended? But who or what is doing the allowing here?
Perhaps change the wording to just "operations that change this state"?
* § 6.7.13.8.1 ¶ 6's definition of "independent" means that in the
following code:
int f (int *p, int q) { return *p + q; }
struct s { int *p; int q; };
int g (struct s s) { return *s.p + s.q; }
f is reproducible but g is not, even though they are near equivalents;
GCC 15 x86-64 generates identical machine code for them. Presumably this
was intended; there should be something saying so, though, and why. For
example, does C23 allow GCC to support g being declared [[reproducible]]?
* § 6.7.13.8.1 ¶ 7's definition of "effectless" says a function call
evaluation "is effectless if any store operation that is sequenced
during the call is the modification of an object that synchronizes with
the call". But ¶ 4 says that an object X synchronizes with a call merely
if every access to X occurs either before the call, or during the call,
or after the call. Hence in this program:
int counter;
void clear_counter() { counter = 0; }
clear_counter is "effectless". This meaning of "effectless" is so
counterintuitive that I wonder whether I am misinterpreting the
definition: it means clear_counter is reproducible (since it is
obviously idempotent).
* § 6.7.13.8.1 ¶ 12 says "If possible, it is recommended that
implementations diagnose if an attribute of this clause is applied
to a function definition that does not have the corresponding property."
Does this recommendation extend to function "g" above with GCC, where g
is equivalent to a reproducible function at the machine level? Is C23
intended to recommend that GCC diagnose a declaration of g with
[[reproducible]]?
* § 6.7.13.8.1 ¶ 12 says "It is recommended that applications that
assert the independent or effectless properties for functions qualify
pointer parameters with restrict." This recommendation surely goes too
far. N3220 itself does not follow this recommendation in any of its
examples (§ 6.7.13.8.2 ¶ 3, § 6.7.13.8.3 ¶ 8). Although I suppose it can
make sense to use 'restrict' on some functions that take pointer
parameters and are marked [[unsequenced]] or [[reproducible]], this
applies only in some circumstances, and if any recommendation is given
these circumstances should be spelled out and examples given.
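For anyone who wants to experiment with the two code fragments in the items above, here they are in one compilable sketch. The helpers f_and_g_agree and clear_is_visible are hypothetical additions for illustration; no attributes are applied, since which attributes are permitted is the open question.

```c
int f (int *p, int q) { return *p + q; }
struct s { int *p; int q; };
int g (struct s s) { return *s.p + s.q; }

int counter;
void clear_counter (void) { counter = 0; }

/* Hypothetical check: return 1 if f and g compute the same value
   when given the same pointer and integer.  */
int
f_and_g_agree (int *p, int q)
{
  struct s arg = { p, q };
  return f (p, q) == g (arg);
}

/* Hypothetical check: return 1 if clear_counter's store is visible
   to the caller, i.e., the call changed a nonzero counter.  */
int
clear_is_visible (void)
{
  counter = 7;
  clear_counter ();
  return counter == 0;
}
```

The first helper shows that f and g are interchangeable from the caller's point of view; the second shows that clear_counter, although idempotent, has an effect that any caller can observe, which is why calling it "effectless" seems so odd to me.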
There are some other problems but I've run out of time for now, and I
expect readers of this email have run out of patience.