On Wed, Jul 23, 2025 at 5:47 AM Martin Uecker <ma.uec...@gmail.com> wrote:
This is a personal stance of mine, not a Clang community response.

> IMHO there are enough reasons to reject delayed parsing
> as bad design for C. We should work towards proper
> language features that cleanly fit into the language,
> instead of pushing the boundaries with macros.

I'm not certain there are enough reasons to reject delayed parsing as a bad design for C. Whether it's a good "design for C" is entirely subjective. The technical issues you've raised about it all seem to be things that can be caught statically at compile time, and Apple has already shipped this feature to users who have managed to use it successfully, which suggests it may be acceptable in practice. However, I can definitely see why it would be better to avoid those technical issues in the first place by not going down that route at all.

> But this requires true collaboration, which can not
> exist when one side is not able to compromise. What
> happens next time there is a disagreement? Will clang
> again try to force its decision on the rest of us?

True collaboration goes two ways, and a stream of acerbic, unhelpful accusations like this destroys a lot of people's interest in wanting to help find a solution here. The word "toxic" has come up around this topic within the Clang community, and I don't blame people for walking away (if I weren't lead maintainer, I'd have done so myself, despite my personal interest in seeing C become a safer language). If you're interested in working together across communities, maybe don't continue to post these kinds of unconstructive comments?

That said, John McCall pointed out some usage patterns Apple has with their existing feature:

* 655 simple references to variables or struct members: __counted_by(len)
* 73 dereferences of variables or struct members: __counted_by(*lenp)
* 80 integer literals: __counted_by(8)
* 60 macro references: __counted_by(NUM_EIGHT) [1]
* 9 simple sizeof expressions: __counted_by(sizeof(eight_bytes_t))
* 28 others my script couldn't categorize:
  * 7 more complicated integer constant expressions: __counted_by(num_bytes_for_bits(NUM_FIFTY_SEVEN)) [2]
  * 16 arithmetically-adjusted references to a single variable or struct member: __counted_by(2 * len + 8)
  * 1 nested struct member: __counted_by(header.len)
  * 4 combinations of struct members: __counted_by(len + cnt) [3]
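To make those categories concrete, here's a rough sketch of what the common forms look like in a declaration. None of this is taken from Apple's code: the names, types, and the no-op fallback macro are all invented for illustration, and it only assumes a __counted_by spelling along the lines of Apple's bounds-safety extension or the kernel's macro, not any particular implementation's exact rules.

```
/* Hypothetical declarations illustrating the common __counted_by forms
 * from the list above; all names and types are invented. */

/* No-op fallback so the sketch is self-contained outside a bounds-safety
 * build; real users would get __counted_by from <ptrcheck.h> or a
 * project-provided macro. */
#ifndef __counted_by
#define __counted_by(expr)
#endif

#define NUM_EIGHT 8
typedef struct { unsigned char b[8]; } eight_bytes_t;

struct example {
  int len;
  char *payload __counted_by(len);               /* simple member reference */
  char *fixed __counted_by(8);                   /* integer literal */
  char *tag __counted_by(NUM_EIGHT);             /* macro reference */
  char *hdr __counted_by(sizeof(eight_bytes_t)); /* simple sizeof expression */
};

/* Dereference form: the count lives behind a pointer parameter. */
void fill_buffer(int *out_len, char *buf __counted_by(*out_len));
```

The last declaration is the __counted_by(*lenp)-style case from the list; the remaining 28 "others" involve arithmetic adjustments, nested members, or combinations of members, and don't fit these simple shapes.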
Do the Linux kernel folks think this looks somewhat like what their usage patterns will be as well? If so, I'd like to put my personal stake in the ground: we don't need any new language features to solve this problem; we can use the existing facilities and downscope the initial feature set until a better solution comes along for forward references. Use two attributes: counted_by (whose argument names an already in-scope identifier that holds the count) and counts (whose argument names an already in-scope identifier for what it counts). e.g.,

```
struct S {
  char *start_buffer;
  int start_len __counts(start_buffer);
  int end_len;
  char *end_buffer __counted_by(end_len);
};

void func(char *buffer1, int N __counts(buffer1),
          int M, char *buffer2 __counted_by(M));
```

It's kind of gross to need two attributes to do the same notional thing, but it does solve the vast majority of the usages seen in the wild, if you're willing to accept some awkwardness around things like:

```
struct S {
  char *buffer;
  int *len __counts(buffer); // Note that len is a pointer
};
```

because the semantics of `counts` would need to include dereferencing through to the `int` for it to be a valid count. We'd be sacrificing the ability to handle the "others my script couldn't categorize" cases, but that's 28 out of the 905 total cases, and maybe that's acceptable?

~Aaron