Re: n3294 - The restrict function attribute as a replacement of the restrict qualifier

2024-07-26 Thread Martin Uecker via Gcc
On Saturday, 27.07.2024 at 00:26 +0200, Alejandro Colomar wrote:
> On Sat, Jul 27, 2024 at 12:03:20AM GMT, Martin Uecker wrote:
> > > Maybe if GNU C compilers (GCC and Clang) add it first as an extension,
> > > adding diagnostics, it would help.
> > 
> > Both GCC and Clang already have such diagnostics and/or run-time checks:
> > 
> > https://godbolt.org/z/MPnxqb9h7
> 
> Hi Martin,
> 
> I guess that's prior art enough to make this UB in ISO C.  Is there any
> paper for this already?  Does any of your paper cover that?  Should I
> prepare one?
> 

What do you mean by "this"?  Adding UB would likely see a lot
of opposition, even where this could enable run-time checks.  

N2906 would make 

int foo(char f[4]);
int foo(char f[5]);

a constraint violation (although having those types be incompatible
could also cause UB indirectly, this would not be its main effect).

So I think bringing a new version of this paper forward would be
a possible next step, addressing the issues raised in the past.

Martin



Re: n3294 - The restrict function attribute as a replacement of the restrict qualifier

2024-07-26 Thread Martin Uecker via Gcc
On Friday, 26.07.2024 at 23:49 +0200, Alejandro Colomar via Gcc wrote:
> On Fri, Jul 26, 2024 at 09:22:42PM GMT, Joseph Myers wrote:
> > On Fri, 26 Jul 2024, Alejandro Colomar via Gcc wrote:
> > 
> > > > See reflector message SC22WG14.18575, 17 Nov 2020 (the former convenor 
> > > > replying when I asked about just that paper).
> > > 
> > > Where can I find reflector messages?
> > 
> > https://www.open-std.org/jtc1/sc22/wg14/18575
> 
> Thanks!
> 
> > 
> > > And another one to propose that [n] means the same as [static n] except
> > > for the nonnull property of static.
> > 
> > I'm not convinced that introducing extra undefined behavior for things 
> > that have been valid since C89 (which would be the effect of such a change 
> > for any code that passes a smaller array) is a good idea - the general 
> > mood is to *reduce* undefined behavior.
> 
> While [n] has always _officially_ meant the same as [], it has never
> made any sense to write code like that.  Unofficially, it has always
> meant the obvious thing.
> 
> Maybe if GNU C compilers (GCC and Clang) add it first as an extension,
> adding diagnostics, it would help.

Both GCC and Clang already have such diagnostics and/or run-time checks:

https://godbolt.org/z/MPnxqb9h7

Martin
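
For readers following the [n] versus [static n] discussion, here is a minimal sketch of the two spellings.  The function names sum4 and sum4_static are made up for illustration.  In ISO C both parameters are adjusted to a pointer; only the second form makes a promise the compiler may rely on.

```c
#include <stddef.h>

/* Plain [4]: the parameter is adjusted to `const int *`.  In ISO C the 4
   is documentation only, although compilers may still warn about callers
   that visibly pass a smaller array. */
int sum4(const int a[4])
{
    int s = 0;
    for (size_t i = 0; i < 4; i++)
        s += a[i];
    return s;
}

/* [static 4]: the caller promises a non-null pointer to at least 4
   elements.  Passing fewer is undefined behavior. */
int sum4_static(const int a[static 4])
{
    int s = 0;
    for (size_t i = 0; i < 4; i++)
        s += a[i];
    return s;
}
```

Both functions behave identically for a conforming caller; the difference is only in what a caller is allowed to pass.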


> 
> Does anyone know of any existing code that uses [n] for meaning anything
> other than "n elements are available to the function"?
> 
> Functions that specify [n] most likely (definitely?) already mean that
> n elements are accessed, and thus passing something different than n
> elements results in UB one way or another.  Having the compiler enforce
> that via diagnostics and UB is probably an improvement.
> 
> Cheers,
> Alex
> 
> > 
> > -- 
> > Joseph S. Myers
> > josmy...@redhat.com
> > 
> 



[gcc r15-2026] c, objc: Add -Wunterminated-string-initialization

2024-07-14 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:44c9403ed1833ae71a59e84f9e37af3182be0df5

commit r15-2026-g44c9403ed1833ae71a59e84f9e37af3182be0df5
Author: Alejandro Colomar 
Date:   Sat Jun 29 15:10:43 2024 +0200

c, objc: Add -Wunterminated-string-initialization

Warn about the following:

char  s[3] = "foo";

Initializing a char array with a string literal of the same length as
the size of the array is usually a mistake.  Rarely is the case where
one wants to create a non-terminated character sequence from a string
literal.

In some cases, for writing faster code, one may want to use arrays
instead of pointers, since that removes the need for storing an array of
pointers apart from the strings themselves.

char  *log_levels[]   = { "info", "warning", "err" };
vs.
char  log_levels[][7] = { "info", "warning", "err" };

This forces the programmer to specify a size, which might change if a
new entry is later added.  Having no way to enforce null termination is
very dangerous, however, so it is useful to have a warning for this, so
that the compiler can make sure that the programmer didn't make any
mistakes.  This warning catches the bug above, so that the programmer
will be able to fix it and write:

char  log_levels[][8] = { "info", "warning", "err" };

This warning already existed as part of -Wc++-compat, but this patch
allows enabling it separately.  It is also included in -Wextra, since
it may not always be desired (when unterminated character sequences are
wanted), but it's likely to be desired in most cases.
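
As an illustration of the situation described above (a sketch, not part of the patch), the following shows which of the initializations end up without a terminating null byte.  The helper is_terminated and the array names are hypothetical.

```c
#include <string.h>

/* Exactly as large as "foo" without its terminator: valid C, but the
   array does not hold a string.  This is what the new warning flags. */
const char tag[3] = "foo";

/* "warning" has 7 characters, so [7] silently drops its terminator. */
const char bad_levels[][7] = { "info", "warning", "err" };

/* 8 = strlen("warning") + 1, so every entry is null-terminated. */
const char log_levels[][8] = { "info", "warning", "err" };

/* Returns nonzero if a null byte occurs within the first `size` bytes. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

Only the entry "warning" in bad_levels is affected; the shorter entries still fit with their terminator, which is what makes the bug easy to miss.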

Since Wc++-compat now includes this warning, the test has to be modified
to expect the text of the new warning too, in .

Link: https://lists.gnu.org/archive/html/groff/2022-11/msg00059.html
Link: https://lists.gnu.org/archive/html/groff/2022-11/msg00063.html
Link: https://inbox.sourceware.org/gcc/36da94eb-1cac-5ae8-7fea-ec66160cf...@gmail.com/T/

PR c/115185

gcc/c-family/ChangeLog:

* c.opt: Add -Wunterminated-string-initialization.

gcc/c/ChangeLog:

* c-typeck.cc (digest_init): Separate warnings about character
arrays being initialized as unterminated character sequences
with string literals, from -Wc++-compat, into a new warning,
-Wunterminated-string-initialization.

gcc/ChangeLog:

* doc/invoke.texi: Document the new
-Wunterminated-string-initialization.

gcc/testsuite/ChangeLog:

* gcc.dg/Wcxx-compat-14.c: Adapt the test to match the new text
of the warning, which doesn't say anything about C++ anymore.
* gcc.dg/Wunterminated-string-initialization.c: New test.

Acked-by: Doug McIlroy 
Acked-by: Mike Stump 
Reviewed-by: Sandra Loosemore 
Reviewed-by: Martin Uecker 
Signed-off-by: Alejandro Colomar 
Reviewed-by: Marek Polacek 

Diff:
---
 gcc/c-family/c.opt   |  4 
 gcc/c/c-typeck.cc|  6 +++---
 gcc/doc/invoke.texi  | 20 +++-
 gcc/testsuite/gcc.dg/Wcxx-compat-14.c|  2 +-
 .../gcc.dg/Wunterminated-string-initialization.c |  6 ++
 5 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 5c1006ff321f..a52682d835ce 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1464,6 +1464,10 @@ Wunsuffixed-float-constants
 C ObjC Var(warn_unsuffixed_float_constants) Warning
 Warn about unsuffixed float constants.
 
+Wunterminated-string-initialization
+C ObjC Var(warn_unterminated_string_initialization) Warning LangEnabledBy(C ObjC,Wextra || Wc++-compat)
+Warn about character arrays initialized as unterminated character sequences with a string literal.
+
 Wunused
 C ObjC C++ ObjC++ LangEnabledBy(C ObjC C++ ObjC++,Wall)
 ; documented in common.opt
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 574114d541fd..7e0f01ed22b9 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8878,11 +8878,11 @@ digest_init (location_t init_loc, tree type, tree init, tree origtype,
pedwarn_init (init_loc, 0,
  ("initializer-string for array of %qT "
   "is too long"), typ1);
- else if (warn_cxx_compat
+ else if (warn_unterminated_string_initialization
   && compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
-   warning_at (init_loc, OPT_Wc___compat,
+   warning_at (init_loc, OPT_Wunterminated_string_initialization,
("initializer-string for array of %qT "
-"is too long for C++"), typ1);
+"is too long"), typ1);
  if 

Re: IFNDR on UB? [was: Straw poll on shifts with out of range operands]

2024-07-13 Thread Martin Uecker via Gcc
On Monday, 01.07.2024 at 15:19 +0200, Matthias Kretz wrote:
> On Sunday, 30 June 2024 08:33:35 GMT+2 Martin Uecker wrote:
> > On Sunday, 30.06.2024 at 05:03 +0200, Matthias Kretz wrote:
> > > On Saturday, 29 June 2024 16:20:55 GMT+2 Martin Uecker wrote:
> > > > On Saturday, 29.06.2024 at 08:50 -0500, Matthias Kretz via Gcc wrote:
> > > > > 
...
> > 
> > > > But I am not sure how this is relevant here as this affects only
> > > > observable behavior and the only case where GCC does not seem to
> > > > already conform to this is volatile.
> > > 
> > > Now you lost me.
> > 
> > Consider the following example:
> > 
> > int f(int x)
> > {
> >  int r = 0;
> >  if (x < 10)
> >r = 1;
> >  if (x < 10)
> >__builtin_unreachable();
> >  return r;
> > }
> > 
> > But removing the store to 'r' here as GCC does:
> > 
> > https://godbolt.org/z/h7qqrGsbz
> > 
> > can simply be justified by the "as if" principle as
> > any other optimization, it does not need to rely on a weird
> > interpretation that the UB from __builtin_unreachable() travels
> > back in time.
> 
> I don't know of anybody saying that "time-travel optimization" refers to 
> anything different than what you're showing here. 

The C++ standard allows also removing or changing previous
observable behavior.

"However, if any such execution contains an undefined operation,
this International Standard places no requirement on the implementation
executing that program with that input (not even with regard to
operations preceding the first undefined operation)."

This was discussed extensively in the past, e.g. when discussing
the specification for memset_explicit.


> The part that people find 
> scary is when this "as if" happens at a distance, like in
> https://godbolt.org/z/jP4x1c3E6

True, although without "true time travel" (i.e. excluding
time-travel affecting previous observable behavior), this is
limited in the damage it can do (e.g. can not undo changes
committed to a log or control inputs to an external machine)
and has a much simpler interpretation that does not need to
refer to metaphysical concepts: Instead of the UB affecting
previous behavior by travelling back in time, it just has
arbitrary behavior that might undo the store.

> 
> > > [...]
> > 
> > I think it is a good idea. The compiler can optionally treat UB as
> > a translation time error. We discussed similar ideas in the past
> > in WG14. But this will only work for very specific instances of UB
> > under certain conditions.
> 
> Yes. But it's an exact match of the "time-travel" cases. I.e. whenever const-
> prop determines "unreachable" the code could be ill-formed.

I am not sure it is so simple.  What if all of this is in
dead code?  I assume one needs a rule similar to "for a function,
when it is not possible to invoke the function without reaching
an operation which has UB, then the program is ill-formed." or
"if a function call expression always invokes UB when executed,
then...". I guess the latter would apply to the precondition
violations you mentioned WG21 is considering.

> 
> 
> > > > Also IMHO this should be split up from
> > > > UBsan which has specific semantics and upstream dependencies
> > > > which are not always ideal.  (But UBSan could share the
> > > > same infrastructure)
> > > 
> > > I'm not sure what you're thinking of here. UBsan detects UB at runtime
> > > whereas my '-fharden=1' proposal is about flagging UB as ill-formed on
> > > compile-time. So UBsan is a more verbose '-fharden=2' then?
> > 
> > Yes, I was talking about the -fharden=2 case. In principle UBSan
> > with traps instead of diagnostics would do this. In practice,
> > I think we need something which is not tied to UBSan.
> 
> Yes, basically a deployable variant of UBsan?
> 

Yes, something along the lines of -fcf-protection
or -fvtable-verify.

Martin

> 
> On Sunday, 30 June 2024 08:56:41 GMT+2 Martin Uecker wrote:
> > 0) nothing
> > 1) expands to __builtin_unreachable()
> > 2) expands to __builtin_trap()
> > 3) expands to a __builtin_warning (as suggested before
> > by Martin Sebor) that causes the backend to emit an error
> > in a very late pass when the __builtin_warning has not
> > been removed during optimization.
> 
> This __builtin_warning seems to be equivalent to my __error() function, using 
> a [[gnu::warning]] attribute instead of [[gnu::error]]. Which is certainly 
> another viable build/-fharden/whateverwecallit mode.
> 
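
The modes listed above could be sketched with a macro along the following lines.  HARDEN_MODE, UB_POINT and shl_checked are hypothetical names chosen for this illustration; they are not an existing GCC option or API.  Mode 3 is omitted because __builtin_warning was only a suggestion in the thread.  Unlike the "nothing" mode in the list, this sketch falls back to returning 0 so that the example stays well-defined.

```c
#include <limits.h>

#ifndef HARDEN_MODE
#define HARDEN_MODE 0   /* hypothetical build-time knob */
#endif

#if HARDEN_MODE == 1
#define UB_POINT() __builtin_unreachable()   /* let the optimizer assume it away */
#elif HARDEN_MODE == 2
#define UB_POINT() __builtin_trap()          /* abort at run time */
#else
#define UB_POINT() ((void)0)                 /* no special treatment */
#endif

/* Left shift with an explicit check for an out-of-range shift count. */
unsigned shl_checked(unsigned x, unsigned n)
{
    if (n >= sizeof x * CHAR_BIT) {
        UB_POINT();
        return 0;   /* fallback value used by this sketch in mode 0 */
    }
    return x << n;
}
```

Building with -DHARDEN_MODE=2 turns the out-of-range case into a trap, which is the "deployable" behavior discussed above.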





Re: Apply function attributes (e.g., [[gnu::access()]]) to pointees too

2024-07-11 Thread Martin Uecker via Gcc


On Thursday, 11.07.2024 at 11:35 +0200, Alejandro Colomar via Gcc wrote:
> Hi,
> 
> I was wondering how we could extend attributes such as gnu::access() to
> apply it to pointees too.  Currently, there's no way to specify the
> access mode of a pointee.
> 
> Let's take for example strsep(3):
> 
> With current syntax, this is what we can specify:
> 
>   [[gnu::access(read_write, 1)]]
>   [[gnu::access(read_only, 2)]]
>   [[gnu::nonnull(1, 2)]]
>   [[gnu::null_terminated_string_arg(2)]]
>   char *
>   strsep(char **restrict sp, const char *delim);

The main problem from a user perspective is that
these are attributes on the function declaration
and not on the argument (type).

> 
> I was thinking that with floating numbers, one could specify the number
> of dereferences with a number after the decimal point.  It's a bit
> weird, since the floating point is interpreted as two separate integer
> numbers separated by a '.', but could work.  In this case:
> 
>   [[gnu::access(read_write, 1)]]
>   [[gnu::access(read_write, 1.1)]]
>   [[gnu::access(read_only, 2)]]
>   [[gnu::nonnull(1, 2)]]
>   [[gnu::null_terminated_string_arg(1.1)]]
>   [[gnu::null_terminated_string_arg(2)]]
>   char *
>   strsep(char **restrict sp, const char *delim);
> 
> Which would mark the pointer *sp as read_write and a string.  What do
> you think about it?

If the attributes could be applied to the type, then
one could attach them directly at an intermediate
pointer level, which would be more intuitive and
less fragile.


Martin
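
To illustrate why function-level attributes are fragile, here is a small sketch using the existing GNU access attribute.  The function count_nonzero is made up for this example.  The attribute refers to parameters by position, so reordering the parameters silently changes what is being described.

```c
#include <stddef.h>

/* Parameter 1 is read-only with its size in parameter 2;
   parameter 3 is only written, never read. */
__attribute__((access(read_only, 1, 2), access(write_only, 3)))
size_t count_nonzero(const unsigned char *buf, size_t n, size_t *first)
{
    size_t count = 0;
    size_t idx = n;             /* n means "no nonzero byte found" */

    for (size_t i = 0; i < n; i++) {
        if (buf[i] != 0) {
            if (idx == n)
                idx = i;        /* remember the first nonzero position */
            count++;
        }
    }
    *first = idx;
    return count;
}
```

If the attribute were attached to the parameter types instead, the numeric indices would not be needed and the same notation could be used at an intermediate pointer level.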






[gcc r15-1930] Fix test errors after r15-1394 for sizeof(int)==sizeof(long) [PR115545]

2024-07-09 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:5b46f196cdb62af0e611315ea411938d756a0ad1

commit r15-1930-g5b46f196cdb62af0e611315ea411938d756a0ad1
Author: Martin Uecker 
Date:   Sun Jun 23 09:10:20 2024 +0200

Fix test errors after r15-1394 for sizeof(int)==sizeof(long) [PR115545]

Some tests added to test the type of redeclarations of enumerators
in r15-1394 fail on architectures where sizeof(long) == sizeof(int).
Adapt tests to use long long and/or accept that long long is selected
as type for the enumerator.
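
A small sketch of the underlying portability issue, independent of the actual tests: a shift by more than the width of unsigned int is only defined for unsigned long where long is wider than int, while unsigned long long is at least 64 bits everywhere.  The names below are hypothetical, and the sketch assumes int is narrower than 63 bits.

```c
#include <limits.h>

enum {
    UINT_BITS   = sizeof(unsigned int) * CHAR_BIT,
    ULONG_BITS  = sizeof(unsigned long) * CHAR_BIT,
    ULLONG_BITS = sizeof(unsigned long long) * CHAR_BIT
};

/* Nonzero if shifting a value of a `width`-bit unsigned type by `shift`
   is a defined operation. */
int shift_is_defined(unsigned shift, unsigned width)
{
    return shift < width;
}

/* A value that does not fit in unsigned int.  Written with ULL so that it
   is also defined where sizeof(long) == sizeof(int). */
unsigned long long beyond_uint(void)
{
    return 1ULL << (UINT_BITS + 1);
}
```

On targets where sizeof(long) == sizeof(int), shift_is_defined(UINT_BITS + 1, ULONG_BITS) is false, which corresponds to the failures this commit addresses.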

PR testsuite/115545

gcc/testsuite/

* gcc.dg/pr115109.c: Adapt test.
* gcc.dg/c23-tag-enum-6.c: Adapt test.
* gcc.dg/c23-tag-enum-7.c: Adapt test.

Diff:
---
 gcc/testsuite/gcc.dg/c23-tag-enum-6.c |  4 ++--
 gcc/testsuite/gcc.dg/c23-tag-enum-7.c | 12 ++--
 gcc/testsuite/gcc.dg/pr115109.c   |  4 ++--
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/c23-tag-enum-6.c b/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
index 29aef7ee3fdf..d8d304d9b3df 100644
--- a/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
+++ b/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
@@ -7,10 +7,10 @@ enum E : int { a = 1, b = 2 };
 enum E : int { b = _Generic(a, enum E: 2), a = 1 };
 
 enum H { x = 1 };
-enum H { x = 2UL + UINT_MAX }; /* { dg-error "outside the range" } */
+enum H { x = 2ULL + UINT_MAX };/* { dg-error "outside the range" } */
 
 enum K : int { z = 1 };
-enum K : int { z = 2UL + UINT_MAX };   /* { dg-error "outside the range" } */
+enum K : int { z = 2ULL + UINT_MAX };  /* { dg-error "outside the range" } */
 
 enum F { A = 0, B = UINT_MAX };
 enum F { B = UINT_MAX, A };/* { dg-error "outside the range" } */
diff --git a/gcc/testsuite/gcc.dg/c23-tag-enum-7.c b/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
index d4c787c8f716..974735bf2ef4 100644
--- a/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
+++ b/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
@@ -4,23 +4,23 @@
 #include <limits.h>
 
 // enumerators are all representable in int
-enum E { a = 1UL, b = _Generic(a, int: 2) };
+enum E { a = 1ULL, b = _Generic(a, int: 2) };
 static_assert(_Generic(a, int: 1));
 static_assert(_Generic(b, int: 1));
-enum E { a = 1UL, b = _Generic(a, int: 2) };
+enum E { a = 1ULL, b = _Generic(a, int: 2) };
 static_assert(_Generic(a, int: 1));
 static_assert(_Generic(b, int: 1));
 
 // enumerators are not representable in int
-enum H { c = 1UL << (UINT_WIDTH + 1), d = 2 };
+enum H { c = 1ULL << (UINT_WIDTH + 1), d = 2 };
 static_assert(_Generic(c, enum H: 1));
 static_assert(_Generic(d, enum H: 1));
-enum H { c = 1UL << (UINT_WIDTH + 1), d = _Generic(c, enum H: 2) };
+enum H { c = 1ULL << (UINT_WIDTH + 1), d = _Generic(c, enum H: 2) };
 static_assert(_Generic(c, enum H: 1));
 static_assert(_Generic(d, enum H: 1));
 
 // there is an overflow in the first declaration
-enum K { e = UINT_MAX, f, g = _Generic(e, unsigned int: 0) + _Generic(f, unsigned long: 1) };
+enum K { e = UINT_MAX, f, g = _Generic(e, unsigned int: 0) + _Generic(f, unsigned long: 1, unsigned long long: 1) };
 static_assert(_Generic(e, enum K: 1));
 static_assert(_Generic(f, enum K: 1));
 static_assert(_Generic(g, enum K: 1));
@@ -30,7 +30,7 @@ static_assert(_Generic(f, enum K: 1));
 static_assert(_Generic(g, enum K: 1));
 
 // there is an overflow in the first declaration
-enum U { k = INT_MAX, l, m = _Generic(k, int: 0) + _Generic(l, long: 1) };
+enum U { k = INT_MAX, l, m = _Generic(k, int: 0) + _Generic(l, long: 1, long long: 1) };
 static_assert(_Generic(k, enum U: 1));
 static_assert(_Generic(l, enum U: 1));
 static_assert(_Generic(m, enum U: 1));
diff --git a/gcc/testsuite/gcc.dg/pr115109.c b/gcc/testsuite/gcc.dg/pr115109.c
index 4baee0f34453..8245ff7fadb7 100644
--- a/gcc/testsuite/gcc.dg/pr115109.c
+++ b/gcc/testsuite/gcc.dg/pr115109.c
@@ -3,6 +3,6 @@
 
 #include <limits.h>
 
-enum E { a = 1UL << (ULONG_WIDTH - 5), b = 2 };
-enum E { a = 1ULL << (ULONG_WIDTH - 5), b = _Generic(a, enum E: 2) };
+enum E { a = 1ULL << (ULLONG_WIDTH - 5), b = 2 };
+enum E { a = 1ULL << (ULLONG_WIDTH - 5), b = _Generic(a, enum E: 2) };


[gcc r15-1929] c: Fix ICE for redeclaration of structs with different alignment [PR114727]

2024-07-09 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:7825c07bbaf503c47ecedd87e3d64be003b24f2c

commit r15-1929-g7825c07bbaf503c47ecedd87e3d64be003b24f2c
Author: Martin Uecker 
Date:   Sat Jun 29 15:53:43 2024 +0200

c: Fix ICE for redeclaration of structs with different alignment [PR114727]

For redeclarations of struct in C23, if one has an alignment attribute
that makes the alignment different, we later get an ICE in verify_types.
This patch disallows such redeclarations by declaring such types to
be different.

PR c/114727

gcc/c/
* c-typeck.cc (tagged_types_tu_compatible_p): Add test.

gcc/testsuite/
* gcc.dg/pr114727.c: New test.

Diff:
---
 gcc/c/c-typeck.cc   | 3 +++
 gcc/testsuite/gcc.dg/pr114727.c | 6 ++
 2 files changed, 9 insertions(+)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index e486ac04f9cf..455dc374b481 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1603,6 +1603,9 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
  != TYPE_REVERSE_STORAGE_ORDER (t2)))
 return false;
 
+  if (TYPE_USER_ALIGN (t1) != TYPE_USER_ALIGN (t2))
+data->different_types_p = true;
+
   /* For types already being looked at in some active
  invocation of this function, assume compatibility.
  The cache is built as a linked list on the stack
diff --git a/gcc/testsuite/gcc.dg/pr114727.c b/gcc/testsuite/gcc.dg/pr114727.c
new file mode 100644
index ..12949590ce09
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114727.c
@@ -0,0 +1,6 @@
+/* { dg-do compile }
+ * { dg-options "-std=c23 -g" } */
+
+#define Y [[gnu::aligned(128)]]
+extern struct Y foo { int x; } x;
+struct foo { int x; }; /* { dg-error "redefinition" } */


[gcc r15-1928] c: Fix ICE for incorrect code in comptypes_verify [PR115696]

2024-07-09 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:592a746533a278a5fd3e7b5dff004e1846ef26a4

commit r15-1928-g592a746533a278a5fd3e7b5dff004e1846ef26a4
Author: Martin Uecker 
Date:   Sat Jun 29 15:36:18 2024 +0200

c: Fix ICE for incorrect code in comptypes_verify [PR115696]

The new verification code produces an ICE for incorrect code.  Add the
same logic as already used in comptypes to bail out under certain
conditions.

PR c/115696

gcc/c/
* c-typeck.cc (comptypes_verify): Bail out for
identical, empty, and erroneous input types.

gcc/testsuite/
* gcc.dg/pr115696.c: New test.

Diff:
---
 gcc/c/c-typeck.cc   | 4 
 gcc/testsuite/gcc.dg/pr115696.c | 7 +++
 2 files changed, 11 insertions(+)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index ffcab7df4d3b..e486ac04f9cf 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1175,6 +1175,10 @@ common_type (tree t1, tree t2)
 static bool
 comptypes_verify (tree type1, tree type2)
 {
+  if (type1 == type2 || !type1 || !type2
+  || TREE_CODE (type1) == ERROR_MARK || TREE_CODE (type2) == ERROR_MARK)
+return true;
+
   if (TYPE_CANONICAL (type1) != TYPE_CANONICAL (type2)
   && !TYPE_STRUCTURAL_EQUALITY_P (type1)
   && !TYPE_STRUCTURAL_EQUALITY_P (type2))
diff --git a/gcc/testsuite/gcc.dg/pr115696.c b/gcc/testsuite/gcc.dg/pr115696.c
new file mode 100644
index ..50b8ebc24f08
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr115696.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-Wno-implicit-int" } */
+
+a();   /* { dg-warning "no type or storage" } */
+a; /* { dg-error "redeclared" } */
+   /* { dg-warning "no type or storage" "" { target *-*-* } .-1 } */
+a();   /* { dg-warning "no type or storage" } */


Re: WG14 paper for removing restrict from nptr in strtol(3)

2024-07-08 Thread Martin Uecker via Gcc
On Monday, 08.07.2024 at 22:17 +0200, Alejandro Colomar wrote:
> Hi Martin,
> 
> On Mon, Jul 08, 2024 at 06:05:08PM GMT, Martin Uecker wrote:
> > On Monday, 08.07.2024 at 17:01 +0200, Alejandro Colomar wrote:
> > > On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote:
> > 
> > ...
> > > And then have it mean something strict, such as: The object pointed to
> > > by the pointer is not pointed to by any other pointer; period.
> > > 
> > > This definition is already what -Wrestrict seems to understand.
> > 
> > One of the main uses of restrict is scientific computing. In this
> > context such a definition of "restrict" would not work for many 
> > important use cases. But I agree that for warning purposes the
> > definition of "restrict" in ISO C is not helpful.
> 
> Do you have some examples of functions where this matters and is
> important?  I'm curious to see them.  Maybe we find some alternative.

In many numerical algorithms you want to operate on
different parts of the same array object.  E.g. for matrix
decompositions you want to take a row / column and add it 
to another. Other examples are algorithms that decompose
some input (e.g. high and low band in a wavelet transform)
and store it into the same output array, etc.

Without new notation for strided array slicing, one
fundamentally needs the flexibility of restrict that
only guarantees that actual accesses do not conflict.

But this then implies that one can not use restrict as a
contract specification on function prototypes, but has
to analyze the implementation of a function to see if
it is used correctly.  But I would not see it as a design 
problem of restrict. It was simply not the intended use 
case when originally designed. 
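
A minimal sketch of the row-update case described above, assuming a row-major matrix stored in one array.  The names axpy and add_row are hypothetical.  Both restrict-qualified pointers point into the same array object, which is fine under the ISO C definition as long as the elements actually accessed through them do not overlap.

```c
#include <stddef.h>

/* dst[i] += scale * src[i] for i in [0, n).  The restrict qualifiers
   promise only that the accessed elements do not overlap. */
void axpy(size_t n, double scale,
          const double *restrict src, double *restrict dst)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += scale * src[i];
}

/* Add scale * row `from` to row `to` of a row-major matrix with `cols`
   columns.  Both rows live in the same array object `m`.
   Calling this with from == to would violate the restrict contract. */
void add_row(size_t cols, double *m, size_t from, size_t to, double scale)
{
    axpy(cols, scale, m + from * cols, m + to * cols);
}
```

A stricter reading of restrict ("no other pointer to the same object") would forbid add_row, even though the two rows never overlap.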


> 
> > > > Has the C standard clarified the meaning of 'restrict' since that
> > > > discussion?  Without that, I wasn't planning to touch 'restrict' in
> > > > GCC's -fanalyzer.
> > > 
> > > Meh; no they didn't.  
> > 
> > There were examples added in C23 and there are now several papers
> > under discussion.
> 
> Hmm, yeah, the examples help with the formal definition.  I was thinking
> of the definition itself, which I still find quite confusing.  :-)

Indeed.

Martin

> 
> Have a lovely night!
> Alex
> 



Re: WG14 paper for removing restrict from nptr in strtol(3)

2024-07-08 Thread Martin Uecker via Gcc
On Monday, 08.07.2024 at 17:01 +0200, Alejandro Colomar wrote:
> On Mon, Jul 08, 2024 at 10:30:48AM GMT, David Malcolm wrote:

...
> And then have it mean something strict, such as: The object pointed to
> by the pointer is not pointed to by any other pointer; period.
> 
> This definition is already what -Wrestrict seems to understand.

One of the main uses of restrict is scientific computing. In this
context such a definition of "restrict" would not work for many 
important use cases. But I agree that for warning purposes the
definition of "restrict" in ISO C is not helpful.

> 
> > Later, I added a new -Wanalyzer-overlapping-buffers warning in GCC 14,
> > which simply has a hardcoded set of standard library functions that it
> > "knows" to warn about.
> 
> Hmmm, so it doesn't help at all for anything other than libc.  Ok.
> 
> > Has the C standard clarified the meaning of 'restrict' since that
> > discussion?  Without that, I wasn't planning to touch 'restrict' in
> > GCC's -fanalyzer.
> 
> Meh; no they didn't.  

There were examples added in C23 and there are now several papers
under discussion.


> I understand.  That's why I don't like innovations
> in ISO C, and prefer that implementations innovate with real stuff.


Re: WG14 paper for removing restrict from nptr in strtol(3)

2024-07-07 Thread Martin Uecker via Gcc
On Sunday, 07.07.2024 at 13:07 +0200, Alejandro Colomar via Gcc wrote:
> Hi Martin,
> 
> On Sun, Jul 07, 2024 at 09:15:23AM GMT, Martin Uecker wrote:
> > 
> > Hi Alejandro,
> > 
> > if in caller it is known that endptr has access mode "write_only"
> > then it can conclude that the content of *endptr has access mode
> > "none", couldn't it?
> 
> H.  I think you're correct.  I'll incorporate that and see how it
> affects the caller.
> 
> At first glance, I think it would result in
> 
>   nptr     access(read_only)   alias *endptr
>   endptr   access(write_only)  unique
>   errno    access(read_write)  unique
>   *endptr  access(none)        alias nptr
> 
> Which is actually having perfect information, regardless of 'restrict'
> on nptr.  :-)

Yes, but my point is that even with "restrict" a smarter
compiler could then also be smart enough not to warn even
when *endptr aliases nptr.

> 
> > You also need to discuss backwards compatibility.  Changing
> > the type of those functions can break valid programs.
> 
> I might be forgetting about other possibilities, but the only one I had
> in mind that could break API would be function pointers.  However, a
> small experiment seems to say it doesn't:

Right, the outermost qualifiers are ignored, so this is not a
compatibility problem.  So I think this is not an issue, but
it is worth pointing it out.

Martin

> 
>   $ cat strtolp.c 
>   #include <stdlib.h>
> 
>   long
>   alx_strtol(const char *nptr, char **restrict endptr, int base)
>   {
>   return strtol(nptr, endptr, base);
>   }
> 
>   typedef long (*strtolp_t)(const char *restrict nptr,
> char **restrict endptr, int base);
>   typedef long (*strtolpnr_t)(const char *nptr,
>  char **restrict endptr, int base);
> 
>   int
>   main(void)
>   {
>   [[maybe_unused]] strtolp_t    a = &strtol;
>   [[maybe_unused]] strtolpnr_t  b = &strtol;
>   [[maybe_unused]] strtolp_t    c = &alx_strtol;
>   [[maybe_unused]] strtolpnr_t  d = &alx_strtol;
>   }
> 
>   $ cc -Wall -Wextra strtolp.c 
>   $
> 
> Anyway, I'll say that it doesn't seem to break API.
> 
> >  You would
> > need to make a case that this is unlikely to affect any real
> > world program.
> 
> If you have something else in mind that could break API, please let me
> know, and I'll add it to the experiments.
> 
> Thanks!
> 
> Have a lovely day!
> Alex
> 



Re: WG14 paper for removing restrict from nptr in strtol(3)

2024-07-07 Thread Martin Uecker via Gcc


Hi Alejandro,

if in caller it is known that endptr has access mode "write_only"
then it can conclude that the content of *endptr has access mode
"none", couldn't it?

You also need to discuss backwards compatibility.  Changing
the type of those functions can break valid programs.  You would
need to make a case that this is unlikely to affect any real
world program.

Martin

On Sunday, 07.07.2024 at 03:58 +0200, Alejandro Colomar wrote:
> Hi,
> 
> I've incorporated feedback, and here's a new revision, let's call it
> v0.2, of the draft for a WG14 paper.  I've attached the man(7) source,
> and the generated PDF.
> 
> Cheers,
> Alex
> 
> 



Re: [PATCH v1] Remove 'restrict' from 'nptr' in strtol(3)-like functions

2024-07-05 Thread Martin Uecker via Gcc
On Friday, 05.07.2024 at 21:28 +0200, Alejandro Colomar wrote:

...
> 
> > > Showing that you can contrive a case where a const char* restrict and
> > > char** restrict can alias doesn't mean there's a problem with strtol.
> > 
> > I think his point is that a const char* restrict and something which
> > is stored in a char* whose address is then passed can alias and there
> > a warning would make sense in other situations.   
> 
> Indeed.
> 
> > But I am also not convinced removing restrict would be an improvement.
> > It would make more sense to have an annotation that indicates that
> > endptr is only used as output.
> 
> What is the benefit of keeping restrict there?  It doesn't provide any
> benefits, AFAICS.

Not really, I think.  I am generally not a fan of restrict.
IMHO it is misdesigned and I would like to see it replaced
with something better.  But I am also not convinced it really
helps to remove it here.

If we marked endptr as "write_only" (which it might already
be) then a future warning mechanism for -Wrestrict could
ignore the content of *endptr. 

Martin
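
For context, the following sketch shows the common calling pattern at the center of this thread: the same pointer variable is passed as nptr and, by address, as endptr.  The function sum_longs is a made-up example.  The call is valid because strtol reads the string through nptr and only writes to the pointer object that endptr refers to.  Overflow handling via errno is left out to keep the sketch short.

```c
#include <stdlib.h>

/* Sum all whitespace-separated decimal integers in the string p. */
long sum_longs(char *p)
{
    long total = 0;

    for (;;) {
        char *start = p;
        /* After the call, p points just past the parsed number,
           i.e. the new value of *endptr points into the nptr string. */
        long v = strtol(p, &p, 10);

        if (p == start)     /* no conversion was performed: stop */
            break;
        total += v;
    }
    return total;
}
```

A warning mechanism that knows endptr is write-only would have no reason to look at the old content of p when checking this call.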







Re: [PATCH v1] Remove 'restrict' from 'nptr' in strtol(3)-like functions

2024-07-05 Thread Martin Uecker via Gcc
On Friday, 05.07.2024 at 17:24 +0100, Jonathan Wakely wrote:
> On Fri, 5 Jul 2024 at 17:02, Xi Ruoyao via Gcc  wrote:
> > 
> > On Fri, 2024-07-05 at 17:53 +0200, Alejandro Colomar wrote:
> > > At least, I hope there's consensus that while current GCC doesn't warn
> > > about this, ideally it should, which means it should warn for valid uses
> > > of strtol(3), which means strtol(3) should be fixed, in all of ISO,
> > > POSIX, and glibc.
> > 
> > It **shouldn't**.  strtol will only violate restrict if it's wrongly
> > implemented, or something dumb is done like "strtol((const char*) ,
> > , 0)".
> > 
> > See my previous reply.
> 
> Right, is there a valid use of strtol where a warning would be justified?
> 
> Showing that you can contrive a case where a const char* restrict and
> char** restrict can alias doesn't mean there's a problem with strtol.

I think his point is that a const char* restrict and something which
is stored in a char* whose address is then passed can alias and there
a warning would make sense in other situations.   

But I am also not convinced removing restrict would be an improvement.
It would make more sense to have an annotation that indicates that
endptr is only used as output.

Martin  





Re: [PATCH v1] Remove 'restrict' from 'nptr' in strtol(3)-like functions

2024-07-05 Thread Martin Uecker via Gcc
On Friday, 05.07.2024 at 17:53 +0200, Alejandro Colomar wrote:
> Hi Martin,
> 
> On Fri, Jul 05, 2024 at 05:34:55PM GMT, Martin Uecker wrote:
> > > I've written functions that more closely resemble strtol(3), to show
> > > that in the end they all share the same issue regarding const-ness:
> 
> (Above I meant s/const/restrict/)
> 
> > > 
> > >   $ cat d.c 
> > >   int d(const char *restrict ca, char *restrict a)
> > >   {
> > >   return ca > a;
> > >   }
> > > 
> > >   int main(void)
> > >   {
> > >   char x = 3;
> > >   char *xp = &x;
> > >   d(xp, xp);
> > >   }
> > >   $ cc -Wall -Wextra d.c 
> > >   d.c: In function ‘main’:
> > >   d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter 
> > > aliases with argument 1 [-Wrestrict]
> > >  10 | d(xp, xp);
> > > | ^
> > > 
> > > This trivial program causes a diagnostic.  (Although I think the '>'
> > > should also cause a diagnostic!!)
> > > 
> > > Let's add a reference, to resemble strtol(3):
> > > 
> > >   $ cat d2.c 
> > >   int d2(const char *restrict ca, char *restrict *restrict ap)
> > >   {
> > >   return ca > *ap;
> > >   }
> > > 
> > >   int main(void)
> > >   {
> > >   char x = 3;
> > >   char *xp = &x;
> > >   d2(xp, &xp);
> > >   }
> > >   $ cc -Wall -Wextra d2.c 
> > >   $ 
> > > 
> > > Why does this not cause a -Wrestrict diagnostic, while d.c does?  How
> > > are these programs any different regarding pointer restrict-ness?
> > 
> > > It would require data flow analysis to produce the diagnostic while
> > the first can simply be diagnosed by comparing arguments.
> 
> Agree.  It seems like a task for -fanalyzer.
> 
>   $ cc -Wall -Wextra -fanalyzer -fuse-linker-plugin -flto d2.c
>   $
> 
> I'm unable to trigger that at all.  It's probably not implemented, I
> guess.  I've updated the bug report
>  to change the
> component to 'analyzer'.
> 
> At least, I hope there's consensus that while current GCC doesn't warn
> about this, ideally it should, which means it should warn for valid uses
> of strtol(3), which means strtol(3) should be fixed, in all of ISO,
> POSIX, and glibc.

I am not sure. 

> 
> > > > > Well, I don't know how to report that defect to WG14.  If you help me,
> > > > > I'll be pleased to do so.  Do they have a public mailing list or
> > > > > anything like that?
> > > > 
> > > > One can submit clarification or change requests:
> > > > 
> > > > https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html
> 
> P.S.:
> 
> I've sent a mail to UNE (the Spanish National Body for ISO), and
> asked them about joining WG14.  Let's see what they say.
> 
> P.S. 2:
> 
> I'm also preparing a paper; would you mind championing it if I'm not yet
> able to do it when it's ready?

Guests can present too.

> 
> P.S. 3:
> 
> Do you know of any Spanish member of WG14?  Maybe I can talk with them
> to have more information about how they work.

You could ask Miguel Ojeda.

Martin

> 



Re: [PATCH v1] Remove 'restrict' from 'nptr' in strtol(3)-like functions

2024-07-05 Thread Martin Uecker via Gcc
Am Freitag, dem 05.07.2024 um 17:23 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
> On Fri, Jul 05, 2024 at 05:02:15PM GMT, Martin Uecker wrote:
> > > But when the thing gets non-trivial, as in strtol(3), GCC misses the
> > > -Wrestrict diagnostic, as reported in
> > > .
> > > 
> > > Let's write a reproducer by altering the dumb.c program from above, with
> > > just another reference:
> > > 
> > >   int
> > >   dumb2(int *restrict a, int *restrict *restrict ap)
> > >   {
> > >   // We don't access the objects
> > >   return a == *ap;
> > >   }
> > > 
> > >   int
> > >   main(void)
> > >   {
> > >   int x = 3;
> > > >   int *xp = &x;
> > > > 
> > > >   return dumb2(&x, &xp);
> > >   }
> > > 
> > > GCC doesn't report anything bad here, even though it's basically the
> > > same as the program from above:
> > > 
> > >   $ cc -Wall -Wextra dumb2.c
> > >   $
> > 
> > strtol does have  a "char * restrict * restrict" though, so the
> > situation is different.   A "char **" and a "const char *"
> > shouldn't alias anyway. 
> 
> Pedantically, it is actually declared as 'char **restrict' (the inner
> one is not declared as restrict, even though it will be restricted,
> since there are no other unrestricted pointers).
> 
> I've written functions that more closely resemble strtol(3), to show
> that in the end they all share the same issue regarding const-ness:
> 
>   $ cat d.c 
>   int d(const char *restrict ca, char *restrict a)
>   {
>   return ca > a;
>   }
> 
>   int main(void)
>   {
>   char x = 3;
>   char *xp = &x;
>   d(xp, xp);
>   }
>   $ cc -Wall -Wextra d.c 
>   d.c: In function ‘main’:
>   d.c:10:9: warning: passing argument 2 to ‘restrict’-qualified parameter 
> aliases with argument 1 [-Wrestrict]
>  10 | d(xp, xp);
> | ^
> 
> This trivial program causes a diagnostic.  (Although I think the '>'
> should also cause a diagnostic!!)
> 
> Let's add a reference, to resemble strtol(3):
> 
>   $ cat d2.c 
>   int d2(const char *restrict ca, char *restrict *restrict ap)
>   {
>   return ca > *ap;
>   }
> 
>   int main(void)
>   {
>   char x = 3;
>   char *xp = &x;
>   d2(xp, &xp);
>   }
>   $ cc -Wall -Wextra d2.c 
>   $ 
> 
> Why does this not cause a -Wrestrict diagnostic, while d.c does?  How
> are these programs any different regarding pointer restrict-ness?

It would require data flow analysis to produce the diagnostic while
the first can simply be diagnosed by comparing arguments.
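
To make the difference concrete, here is a small sketch (the function
name points_to_same is hypothetical, and the restrict qualifiers are
deliberately left out so that the sketch makes no claim about restrict
semantics).  At the call site of d2() the two argument expressions,
xp and &xp, are different expressions with different types; the
overlap only shows up once *ap is loaded and compared with ca, which
is what a data flow analysis would have to track.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the pointer stored in *ap has the same value as ca,
   i.e. the situation the d2.c call creates, and 0 otherwise. */
int points_to_same(const char *ca, char *const *ap)
{
    assert(ap != NULL);
    return ca == *ap;
}
```

Called as points_to_same(xp, &xp) with xp pointing at some char, the
arguments look unrelated syntactically although *ap and ca are equal.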

Martin

> 
> > > Well, I don't know how to report that defect to WG14.  If you help me,
> > > I'll be pleased to do so.  Do they have a public mailing list or
> > > anything like that?
> > 
> > One can submit clarification or change requests:
> > 
> > https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html
> 
> Thanks!  Will do.  Anyway, I think this should be discussed in glibc/gcc
> in parallel, since it's clearly a missed diagnostic, and possibly a
> > dangerous use of restrict if the compiler makes any assumptions that
> > shouldn't be made.
> 
> Have a lovely day!
> Alex
> 



Re: [PATCH v1] Remove 'restrict' from 'nptr' in strtol(3)-like functions

2024-07-05 Thread Martin Uecker via Gcc
Am Freitag, dem 05.07.2024 um 16:37 +0200 schrieb Alejandro Colomar via Gcc:
> [CC += linux-man@, since we're discussing an API documented there, and
>  the manual page would also need to be updated]
> 
> Hi Xi,  Jakub,
> 
> On Fri, Jul 05, 2024 at 09:38:21PM GMT, Xi Ruoyao wrote:
> > On Fri, 2024-07-05 at 15:03 +0200, Alejandro Colomar wrote:
> > > ISO C specifies these APIs as accepting a restricted pointer in their
> > > first parameter:
> > > 
> > > $ stdc c99 strtol
> > > long int strtol(const char *restrict nptr, char **restrict endptr, int 
> > > base);
> > > $ stdc c11 strtol
> > > long int strtol(const char *restrict nptr, char **restrict endptr, int 
> > > base);
> > > 
> > > However, it should be considered a defect in ISO C.  It's common to see
> > > code that aliases it:
> > > 
> > >   char str[] = "10 20";
> > > 
> > >   p = str;
> > > >   a = strtol(p, &p, 0);  // Let's ignore error handling for
> > > >   b = strtol(p, &p, 0);  // simplicity.
> > 
> > Why this is wrong?
> > 
> > During the execution of strtol() the only expression accessing the
> > object "p" is *endptr.  When the body of strtol() refers "nptr" it
> > accesses a different object, not "p".
> 
> 
> 
> Theoretically, 'restrict' is defined in terms of accesses, not just
> references, so it's fine for strtol(3) to hold two references of p in
> restrict pointers.  That is, the following code is valid:
> 
>   int
>   dumb(int *restrict a, int *restrict also_a)
>   {
>   // We don't access the objects
>   return a == also_a;
>   }
> 
>   int
>   main(void)
>   {
>   int x = 3;
> 
>   return dumb(&x, &x);
>   }
> 
> However, in practice that's dumb.  The caller cannot know that the
> function doesn't access the object, so it must be cautious and enable
> -Wrestrict, which should be paranoid and not allow passing references
> to the same object in different arguments, just in case the function
> decides to access the objects.  Of course, GCC reports a diagnostic for
> the previous code:
> 
>   $ cc -Wall -Wextra dumb.c 
>   dumb.c: In function ‘main’:
>   dumb.c:13:21: warning: passing argument 1 to ‘restrict’-qualified 
> parameter aliases with argument 2 [-Wrestrict]
>  13 | return dumb(&x, &x);
> | ^~  ~~
> 
> ... even when there's no UB, since the object is not being accessed.
> 
> But when the thing gets non-trivial, as in strtol(3), GCC misses the
> -Wrestrict diagnostic, as reported in
> .
> 
> Let's write a reproducer by altering the dumb.c program from above, with
> just another reference:
> 
>   int
>   dumb2(int *restrict a, int *restrict *restrict ap)
>   {
>   // We don't access the objects
>   return a == *ap;
>   }
> 
>   int
>   main(void)
>   {
>   int x = 3;
>   int *xp = &x;
> 
>   return dumb2(&x, &xp);
>   }
> 
> GCC doesn't report anything bad here, even though it's basically the
> same as the program from above:
> 
>   $ cc -Wall -Wextra dumb2.c
>   $

strtol does have  a "char * restrict * restrict" though, so the
situation is different.   A "char **" and a "const char *"
shouldn't alias anyway. 
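
For reference, a self-contained sketch of the calling idiom under
discussion, where the same pointer variable is passed as nptr and its
address as endptr.  The helper name parse_pair and its error
convention are made up for this illustration; only strtol itself is a
real API.

```c
#include <assert.h>
#include <stdlib.h>

/* Parses two integers from str using the strtol(p, &p, 0) idiom.
   Returns 0 on success, -1 if either number is missing.
   Overflow checking via errno is omitted for brevity. */
int parse_pair(char *str, long *a, long *b)
{
    char *p = str;
    char *start;

    assert(str != NULL && a != NULL && b != NULL);

    start = p;
    *a = strtol(p, &p, 0);   /* nptr is a copy of p; endptr points at p */
    if (p == start)
        return -1;           /* no digits consumed */

    start = p;
    *b = strtol(p, &p, 0);
    if (p == start)
        return -1;

    return 0;
}
```

With the string "10 20" from the original mail this yields a = 10 and
b = 20.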


> 
> Again, there's no UB, but we really want to be cautious and get a
> diagnostic as callers, just in case the callee decides to access the
> object; we never know.
> 
> So, GCC should be patched to report a warning in the program above.
> That will also cause strtol(3) to start issuing warnings in use cases
> like the one I showed.
> 
> Even further, let's try something really weird: inequality comparison,
> which is only defined for pointers to the same array object:
> 
>   int
>   dumb3(int *restrict a, int *restrict *restrict ap)
>   {
>   // We don't access the objects
>   return a > *ap;
>   }
> 
>   int
>   main(void)
>   {
>   int x = 3;
>   int *xp = &x;
> 
>   return dumb3(&x, &xp);
>   }
> 
> The behavior is still defined, since the objects are not accessed, but
> the compiler should really warn, on both sides:
> 
> -  The caller is passing references to the same object in restricted
>parameters, which is a red flag.
> 
> -  The callee is comparing for inequality pointers that should, under
>normal circumstances, cause Undefined Behavior.
> 
> 
> > And if this is really wrong you should report it to WG14 before changing
> > glibc.
> 
> Well, I don't know how to report that defect to WG14.  If you help me,
> I'll be pleased to do so.  Do they have a public mailing list or
> anything like that?

One can submit clarification or change requests:

https://www.open-std.org/jtc1/sc22/wg14/www/contributing.html

Martin




Re: IFNDR on UB? [was: Straw poll on shifts with out of range operands]

2024-06-30 Thread Martin Uecker via Gcc


Actually, it is very much aligned with what I want in C.
In general I want to have pragma-based compilation modes
for memory safety:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3211.pdf

(Bjarne Stroustrup has a proposal for profiles in C++ which
goes in a similar direction, I think)

From an implementation point of view, if we annotated all
operations with UB in the front ends with a new

__builtin_undefined()

that - depending on configuration and/or mode - does:

0) nothing
1) expands to __builtin_unreachable()
2) expands to __builtin_trap()  
3) expands to a __builtin_warning (as suggested before
by Martin Sebor) that causes the backend to emit an error
in a very late pass when the __builtin_warning has not
been removed during optimization.

Then this would solve all my problems related to UB.
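
A rough user-level sketch of modes 0 to 2, assuming a hypothetical
configuration macro UB_MODE and a hypothetical macro UB_POINT() that
stands in for the proposed __builtin_undefined().  Neither exists in
GCC; __builtin_unreachable, __builtin_trap and __builtin_add_overflow
are existing GCC/Clang builtins.  Mode 3 is not sketched because
__builtin_warning is only a suggestion so far.

```c
/* UB_MODE is a hypothetical knob: 0 = nothing, 1 = unreachable,
   anything else = trap.  Defaults to trap here. */
#ifndef UB_MODE
#define UB_MODE 2
#endif

#if UB_MODE == 0
# define UB_POINT() ((void)0)
#elif UB_MODE == 1
# define UB_POINT() __builtin_unreachable()
#else
# define UB_POINT() __builtin_trap()
#endif

/* Signed addition with the overflow case routed through UB_POINT(),
   roughly what a front end could emit for 'a + b'. */
int checked_add(int a, int b)
{
    int r;

    if (__builtin_add_overflow(a, b, &r))
        UB_POINT();
    return r;
}
```

Building with -DUB_MODE=1 lets the optimizer assume the overflow
branch away, while the default stops the program at that point.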

Martin

Am Sonntag, dem 30.06.2024 um 08:33 +0200 schrieb Martin Uecker via Gcc:
> Am Sonntag, dem 30.06.2024 um 05:03 +0200 schrieb Matthias Kretz:
> > On Saturday, 29 June 2024 16:20:55 GMT+2 Martin Uecker wrote:
> > > Am Samstag, dem 29.06.2024 um 08:50 -0500 schrieb Matthias Kretz via Gcc:
> > > > I.e. once UB becomes IFNDR, the dreaded time-travel optimizations can't
> > > > happen anymore. Instead precondition checks bubble up because otherwise
> > > > the program is ill-formed.
> > > 
> > > > It is not clear to me what you mean by this.
> > 
> > It would help if you could point out what is unclear to you. I assume you 
> > know 
> > IFNDR? And I gave an example for the "bubbling up" of precondition checks 
> > directly above your quoted paragraph.
> 
> I think I understood it now:  You want to make UB be IFNDR so
> that the compiler is allowed to diagnose it at translation
> time in certain cases (although this would not generally be
> required for IFNDR).
> 
> > 
> > > Note that in C time-travel optimizations are already not allowed.
> > 
> > Then, calling __builtin_unreachable is non-conforming for C? ... at least 
> > in 
> > the English sense of "this code is impossible to reach", which implies that 
> > the condition leading up to it must be 'false', allowing time-travel 
> > optimization. Or how would C define 'unreachable'?
> 
> __builtin_unreachable is an extension, it can do whatever it wants.
> 
> But note that compilers do not seem to eliminate the control flow path
> leading to it:
> 
> 
> https://godbolt.org/z/coq9Yra1j
> 
> So even if it is defined in terms of C's UB, these implementations
> would still be conforming to C.
> 
> > 
> > > But I am not sure how this is relevant here as this affects only
> > > observable behavior and the only case where GCC does not seem to
> > > already conform to this is volatile.
> > 
> > Now you lost me.
> 
> Consider the following example:
> 
> int f(int x)
> {
>  int r = 0;
>  if (x < 10)
>r = 1;
>  if (x < 10)
>__builtin_unreachable();
>  return r;
> }
> 
> But removing the store to 'r' here as GCC does:
> 
> https://godbolt.org/z/h7qqrGsbz
> 
> can simply be justified by the "as if" principle as
> any other optimization, it does not need to rely on a weird
> interpretation that the UB from __builtin_unreachable() travels
> back in time.
> 
> > 
> > > Of course, C++ may be different but I suspect that some of the
> > > discussion is confusing compiler bugs with time-travel:
> > 
> > "some of the discussion" is referring to what?
> 
> To discussions inside WG21 that seem to believe that it
> is important that compilers can do  time-travel optimizations,
> when this is actually not the case.
> 
> > 
> > > > Again, I don't believe this would be conforming to the C++ standard. 
> > > > But I
> > > > believe it's a very interesting mode to add as a compiler flag.
> > > > 
> > > > -fharden=0 (default)
> > > > -fharden=1 (make UB ill-formed or unreachable)
> > > > -fharden=2 (make UB ill-formed or trap)
> > > > 
> > > > If there's interest I'd be willing to look into a patch to libstdc++,
> > > > building upon the above sketch as a starting point. Ultimately, if this
> > > > becomes a viable build mode, I'd like to have a replacement for the
> > > > [[gnu::error("")]] hack with a dedicated builtin.
> > > 
> > > -fharden should never turn this into unreachable.
> > 
> > Well, if the default is 'unreachable' and the next step is 'ill-formed or 
> > unreachable' it's a step up. But I'm all for a better name.
> 
> I think it is a good idea. Th

Re: IFNDR on UB? [was: Straw poll on shifts with out of range operands]

2024-06-30 Thread Martin Uecker via Gcc
Am Sonntag, dem 30.06.2024 um 05:03 +0200 schrieb Matthias Kretz:
> On Saturday, 29 June 2024 16:20:55 GMT+2 Martin Uecker wrote:
> > Am Samstag, dem 29.06.2024 um 08:50 -0500 schrieb Matthias Kretz via Gcc:
> > > I.e. once UB becomes IFNDR, the dreaded time-travel optimizations can't
> > > happen anymore. Instead precondition checks bubble up because otherwise
> > > the program is ill-formed.
> > 
> > It is not clear to me what you mean by this.
> 
> It would help if you could point out what is unclear to you. I assume you 
> know 
> IFNDR? And I gave an example for the "bubbling up" of precondition checks 
> directly above your quoted paragraph.

I think I understood it now:  You want to make UB be IFNDR so
that the compiler is allowed to diagnose it at translation
time in certain cases (although this would not generally be
required for IFNDR).

> 
> > Note that in C time-travel optimizations are already not allowed.
> 
> Then, calling __builtin_unreachable is non-conforming for C? ... at least in 
> the English sense of "this code is impossible to reach", which implies that 
> the condition leading up to it must be 'false', allowing time-travel 
> optimization. Or how would C define 'unreachable'?

__builtin_unreachable is an extension, it can do whatever it wants.

But note that compilers do not seem to eliminate the control flow path
leading to it:


https://godbolt.org/z/coq9Yra1j

So even if it is defined in terms of C's UB, these implementations
would still be conforming to C.

> 
> > But I am not sure how this is relevant here as this affects only
> > observable behavior and the only case where GCC does not seem to
> > already conform to this is volatile.
> 
> Now you lost me.

Consider the following example:

int f(int x)
{
 int r = 0;
 if (x < 10)
   r = 1;
 if (x < 10)
   __builtin_unreachable();
 return r;
}

But removing the store to 'r' here as GCC does:

https://godbolt.org/z/h7qqrGsbz

can simply be justified by the "as if" principle as
any other optimization, it does not need to rely on a weird
interpretation that the UB from __builtin_unreachable() travels
back in time.
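
Spelled out as a sketch (f_folded and agree_on are names made up
here): for every input on which f() has defined behavior, i.e.
x >= 10, the store r = 1 is never executed, so a version that simply
returns 0 is indistinguishable.  That is all the "as if" argument
needs.

```c
#include <assert.h>

/* The example from above, unchanged in meaning. */
int f(int x)
{
    int r = 0;
    if (x < 10)
        r = 1;
    if (x < 10)
        __builtin_unreachable();
    return r;
}

/* What the optimizer may reduce f() to. */
int f_folded(int x)
{
    (void)x;
    return 0;
}

/* Compares both versions; only valid for inputs where f() is defined. */
int agree_on(int x)
{
    assert(x >= 10);
    return f(x) == f_folded(x);
}
```

Calling f() with x < 10 is outside what this sketch covers, since
that reaches __builtin_unreachable().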

> 
> > Of course, C++ may be different but I suspect that some of the
> > discussion is confusing compiler bugs with time-travel:
> 
> "some of the discussion" is referring to what?

To discussions inside WG21 that seem to believe that it
is important that compilers can do  time-travel optimizations,
when this is actually not the case.

> 
> > > Again, I don't believe this would be conforming to the C++ standard. But I
> > > believe it's a very interesting mode to add as a compiler flag.
> > > 
> > > -fharden=0 (default)
> > > -fharden=1 (make UB ill-formed or unreachable)
> > > -fharden=2 (make UB ill-formed or trap)
> > > 
> > > If there's interest I'd be willing to look into a patch to libstdc++,
> > > building upon the above sketch as a starting point. Ultimately, if this
> > > becomes a viable build mode, I'd like to have a replacement for the
> > > [[gnu::error("")]] hack with a dedicated builtin.
> > 
> > -fharden should never turn this into unreachable.
> 
> Well, if the default is 'unreachable' and the next step is 'ill-formed or 
> unreachable' it's a step up. But I'm all for a better name.

I think it is a good idea. The compiler can optionally treat UB as
a translation time error. We discussed similar ideas in the past
in WG14. But this will only work for very specific instances of UB
under certain conditions.

> 
> > IMHO the FEs should insert the conditional traps when requested to
> > and the middle end could then treat it as UB and more freely
> > decide what to do.
> 
> Right I was thinking of turning my library-solution hack into a builtin (if 
> it 
> shows potential). The behavior of which then depends on a compiler flag. Then 
> both library and language UB could invoke that builtin. E.g. 'operator+(int, 
> int)' would add '__check_precondition(not __builtin_add_overflow_p(a, b, a));'
> With my proposed '-fharden=1 -O2' you'd then get a compilation error on 
> '0x7fff' + 1', but no code size increase for all other additions. With 
> '-fharden=2 -O2' the 'lea' would turn into an actual 'add' instruction with 
> subsequent 'jo' to 'ud2' (on x86).

Yes, I fully agree with this.  
> 
> > Also IMHO this should be split up from
> > UBsan which has specific semantics and upstream dependencies
> > which are not always ideal.  (But UBSan could share the
> > same infrastructure)
> 
> I'm not sure what you're thinking of here. UBsan detects UB at runtime 
> whereas 
> my '-fharden=1' proposal is about flagging UB as ill-formed on compile-time. 
> So UBsan is a more verbose '-fharden=2' then?

Yes, I was talking about the -fharden=2 case. In principle UBSan
with traps instead of diagnostics would do this. In practice,
I think we need something which is not tied to UBSan.

Martin


> 
> - Matthias
> 



Re: IFNDR on UB? [was: Straw poll on shifts with out of range operands]

2024-06-29 Thread Martin Uecker via Gcc
Am Samstag, dem 29.06.2024 um 08:50 -0500 schrieb Matthias Kretz via Gcc:


...
> I.e. once UB becomes IFNDR, the dreaded time-travel optimizations can't 
> happen 
> anymore. Instead precondition checks bubble up because otherwise the program 
> is ill-formed.

It is not clear to me what you mean by this.

Note that in C time-travel optimizations are already not allowed.
But I am not sure how this is relevant here as this affects only
observable behavior and the only case where GCC does not seem to
already conform to this is volatile.

Of course, C++ may be different but I suspect that some of the
discussion is confusing compiler bugs with time-travel:

https://developercommunity.visualstudio.com/t/Invalid-optimization-in-CC/10337428?q=muecker



> 
> Again, I don't believe this would be conforming to the C++ standard. But I 
> believe it's a very interesting mode to add as a compiler flag.
> 
> -fharden=0 (default)
> -fharden=1 (make UB ill-formed or unreachable)
> -fharden=2 (make UB ill-formed or trap)
> 
> If there's interest I'd be willing to look into a patch to libstdc++, 
> building 
> upon the above sketch as a starting point. Ultimately, if this becomes a 
> viable build mode, I'd like to have a replacement for the [[gnu::error("")]] 
> hack with a dedicated builtin.

-fharden should never turn this into unreachable.

But I agree that we should have options for different choices. 

IMHO the FEs should insert the conditional traps when requested to
and the middle end could then treat it as UB and more freely
decide what to do.  Also IMHO this should be split up from
UBsan which has specific semantics and upstream dependencies
which are not always ideal.  (But UBSan could share the
same infrastructure)

Martin





[gcc r15-1698] c: Error message for incorrect use of static in array declarations.

2024-06-27 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:da7976a015a4388b8ed843412c3c1c840451cf0f

commit r15-1698-gda7976a015a4388b8ed843412c3c1c840451cf0f
Author: Martin Uecker 
Date:   Thu Jun 27 21:47:56 2024 +0200

c: Error message for incorrect use of static in array declarations.

Add an explicit error message when c99's static is
used without a size expression in an array declarator.

gcc/c:
* c-parser.cc (c_parser_direct_declarator_inner): Add
error message.

gcc/testsuite:
* gcc.dg/c99-arraydecl-4.c: New test.

Diff:
---
 gcc/c/c-parser.cc  | 63 --
 gcc/testsuite/gcc.dg/c99-arraydecl-4.c | 14 
 2 files changed, 44 insertions(+), 33 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 6a3f96d5b61..8c4e697a4e1 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -4715,8 +4715,6 @@ c_parser_direct_declarator_inner (c_parser *parser, bool 
id_present,
   location_t brace_loc = c_parser_peek_token (parser)->location;
   struct c_declarator *declarator;
   struct c_declspecs *quals_attrs = build_null_declspecs ();
-  bool static_seen;
-  bool star_seen;
   struct c_expr dimen;
   dimen.value = NULL_TREE;
   dimen.original_code = ERROR_MARK;
@@ -4724,49 +4722,48 @@ c_parser_direct_declarator_inner (c_parser *parser, 
bool id_present,
   c_parser_consume_token (parser);
   c_parser_declspecs (parser, quals_attrs, false, false, true,
  false, false, false, false, cla_prefer_id);
-  static_seen = c_parser_next_token_is_keyword (parser, RID_STATIC);
-  if (static_seen)
-   c_parser_consume_token (parser);
-  if (static_seen && !quals_attrs->declspecs_seen_p)
-   c_parser_declspecs (parser, quals_attrs, false, false, true,
-   false, false, false, false, cla_prefer_id);
+
+  location_t static_loc = UNKNOWN_LOCATION;
+  if (c_parser_next_token_is_keyword (parser, RID_STATIC))
+   {
+ static_loc = c_parser_peek_token (parser)->location;
+ c_parser_consume_token (parser);
+ if (!quals_attrs->declspecs_seen_p)
+   c_parser_declspecs (parser, quals_attrs, false, false, true,
+   false, false, false, false, cla_prefer_id);
+   }
   if (!quals_attrs->declspecs_seen_p)
quals_attrs = NULL;
   /* If "static" is present, there must be an array dimension.
 Otherwise, there may be a dimension, "*", or no
 dimension.  */
-  if (static_seen)
+  const bool static_seen = (static_loc != UNKNOWN_LOCATION);
+  bool star_seen = false;
+  if (c_parser_next_token_is (parser, CPP_MULT)
+ && c_parser_peek_2nd_token (parser)->type == CPP_CLOSE_SQUARE)
{
- star_seen = false;
- dimen = c_parser_expr_no_commas (parser, NULL);
+ star_seen = true;
+ c_parser_consume_token (parser);
}
-  else
+  else if (!c_parser_next_token_is (parser, CPP_CLOSE_SQUARE))
+   dimen = c_parser_expr_no_commas (parser, NULL);
+
+  if (static_seen)
{
- if (c_parser_next_token_is (parser, CPP_CLOSE_SQUARE))
-   {
- dimen.value = NULL_TREE;
- star_seen = false;
-   }
- else if (c_parser_next_token_is (parser, CPP_MULT))
-   {
- if (c_parser_peek_2nd_token (parser)->type == CPP_CLOSE_SQUARE)
-   {
- dimen.value = NULL_TREE;
- star_seen = true;
- c_parser_consume_token (parser);
-   }
- else
-   {
- star_seen = false;
- dimen = c_parser_expr_no_commas (parser, NULL);
-   }
-   }
- else
+ if (star_seen)
{
+ error_at (static_loc,
+   "%<static%> may not be used with an unspecified "
+   "variable length array size");
+ /* Prevent further errors.  */
  star_seen = false;
- dimen = c_parser_expr_no_commas (parser, NULL);
+ dimen.value = error_mark_node;
}
+ else if (!dimen.value)
+   error_at (static_loc,
+ "%<static%> may not be used without an array size");
}
+
   if (c_parser_next_token_is (parser, CPP_CLOSE_SQUARE))
c_parser_consume_token (parser);
   else
diff --git a/gcc/testsuite/gcc.dg/c99-arraydecl-4.c 
b/gcc/testsuite/gcc.dg/c99-arraydecl-4.c
new file mode 100644
index 000..f8cad3b9429
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c99-arraydecl-4.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c99 -pedantic-errors" } */
+
+void fo(char buf[static]); /* { dg-error "'static' may not be used without 
an array size" } */
+void fo(char buf[static]) { }  /* { dg-error "'static' may not be used without 
an array size" } 

Re: consistent unspecified pointer comparison

2024-06-27 Thread Martin Uecker via Gcc
Am Donnerstag, dem 27.06.2024 um 12:05 -0700 schrieb Andrew Pinski via Gcc:
> On Thu, Jun 27, 2024 at 11:57 AM Jason Merrill via Gcc  
> wrote:
> > 
> > On Thu, Jun 27, 2024 at 2:38 PM Richard Biener
> >  wrote:
> > > > Am 27.06.2024 um 19:04 schrieb Jason Merrill via Gcc :
> > > > 
> > > > https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2434r1.html
> > > > proposes to require that repeated unspecified comparisons be
> > > > self-consistent, which does not match current behavior in either GCC
> > > > or Clang.  The argument is that the current allowance to be
> > > > inconsistent is user-unfriendly and does not enable significant
> > > > optimizations.  Any feedback about this?
> > > 
> > > Can you give an example of an unspecified comparison?  I think the only 
> > > way to do what the paper wants is for the implementation to make the 
> > > comparison specified (without the need to document it).  Is the 
> > > self-consistency required only within some specified scope (a single 
> > > expression?) or even across TUs (which might be compiled by different 
> > > compilers or compiler versions)?
> > > 
> > > So my feedback would be to make the comparison well-defined.
> > > 
> > > I’m still curious about which ones are unspecified now.
> > 
> > https://eel.is/c++draft/expr#eq-3.1
> > "If one pointer represents the address of a complete object, and
> > another pointer represents the address one past the last element of a
> > different complete object, the result of the comparison is
> > unspecified."
> > 
> > This is historically unspecified primarily because we don't want to
> > force a particular layout of multiple variables.
> > 
> > See the example under "consequences for implementations" in the paper.
> 
> There is instability due to floating point too;
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93681
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93806
> 
> and uninitialized variables:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93301
> (but that might be fixed via https://wg21.link/P2795R5).

For pointer comparison:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502


Martin

> 
> > 
> > Jason
> > 



Re: consistent unspecified pointer comparison

2024-06-27 Thread Martin Uecker via Gcc
Am Donnerstag, dem 27.06.2024 um 13:02 -0400 schrieb Jason Merrill via Gcc:
> https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2434r1.html
> proposes to require that repeated unspecified comparisons be
> self-consistent, which does not match current behavior in either GCC
> or Clang.  The argument is that the current allowance to be
> inconsistent is user-unfriendly and does not enable significant
> optimizations.  Any feedback about this?

Making pointer comparison self-consistent in cases where there is
no UB makes a lot of sense, because everything else is dangerous.

This is not the only thing this paper proposes. I do not think
angelic / demonic nondeterminism is good language design, and
especially think it is not a good fit for systems languages
such as C or C++. Also most programmers will not understand it.

Martin


Re: Union initialization semantics

2024-06-19 Thread Martin Uecker via Gcc
Am Mittwoch, dem 19.06.2024 um 13:59 +0100 schrieb Jonathan Wakely via Gcc:
> On Wed, 19 Jun 2024 at 11:57, Alexander Monakov  wrote:
> > 
> > Hello,
> > 
> > I vaguely remember there was a recent, maybe within last two months, 
> > discussion
> > about semantics of union initialization where sizeof(first member) is less 
> > than
> > sizeof(union). The question was whether it's okay to initialize just that 
> > first
> > member and leave garbage bits in the other, larger, members of the union, 
> > like
> > in this example:
> > 
> > union A {
> > char a;
> > long : 0;
> > };
> > 
> > void fn(void *);
> > 
> > void my(void)
> > {
> > union A a = { 0 };
> > fn(&a);
> > }
> > 
> > (except in my example there's no other named member, but I think the example
> > in that discussion was less contrived)
> > 
> > Perhaps somebody remembers where it was (I'm thinking Bugzilla) and could 
> > point
> > me to it? My attempts to search for it aren't turning anything up so far.
> 
> Somebody asked about this internally at Red Hat recently, and I
> responded with this quote from C17 6.2.6.1 p7:
> "When a value is stored in a member of an object of union type, the
> bytes of the object representation that do not correspond to that
> member but do correspond to other members take unspecified values. "
> 
> This looks related too:
> https://discourse.llvm.org/t/union-initialization-and-aliasing-clang-18-seems-to-miscompile-musl/77724/3
> They don't seem to have found the quote above though.
> 
> I think it got reported to GCC's bugzilla too, I'll see if I can find it 
> again.
> 
> > If someone knows what semantics GCC implements, that also would be welcome.
> 
> GCC seems to initialize the trailing bits, unnecessarily.

Note that C23 will require the padding bits to be initialized with zero
for default initialization {}.
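
Until that can be relied on, one way to get all bytes of a union
object zeroed regardless of how an initializer is treated is an
explicit memset.  A minimal sketch follows; the union here uses a
named long member instead of the unnamed bit-field from the example
above, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

union A {
    char a;
    long l;   /* larger member, so 'a' does not cover the whole object */
};

/* Sets every byte of the object representation to zero. */
void zero_union(union A *u)
{
    assert(u != NULL);
    memset(u, 0, sizeof *u);
}

/* Returns 1 if the first n bytes at obj are all zero, else 0. */
int all_bytes_zero(const void *obj, size_t n)
{
    const unsigned char *b = obj;

    for (size_t i = 0; i < n; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

Note that a later store to u->a may again leave the remaining bytes
with unspecified values, per the C17 wording quoted above.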

Martin





[gcc r15-1394] c23: Fix for redeclared enumerator initialized with different type [PR115109]

2024-06-18 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:c9b96a68860bfdee49d40b4a844af7c5ef69cd12

commit r15-1394-gc9b96a68860bfdee49d40b4a844af7c5ef69cd12
Author: Martin Uecker 
Date:   Sat May 18 22:00:04 2024 +0200

c23: Fix for redeclared enumerator initialized with different type 
[PR115109]

c23 specifies that the type of a redeclared enumerator is the one of the
previous declaration.  Convert initializers with different type accordingly
and emit an error when the value does not fit.

2024-06-01 Martin Uecker  

PR c/115109

gcc/c/
* c-decl.cc (build_enumerator): When redeclaring an
enumerator convert value to previous type.  For redeclared
enumerators use underlying type for computing the next value.

gcc/testsuite/
* gcc.dg/pr115109.c: New test.
* gcc.dg/c23-tag-enum-6.c: New test.
* gcc.dg/c23-tag-enum-7.c: New test.

Diff:
---
 gcc/c/c-decl.cc   | 29 ++---
 gcc/testsuite/gcc.dg/c23-tag-enum-6.c | 20 +
 gcc/testsuite/gcc.dg/c23-tag-enum-7.c | 41 +++
 gcc/testsuite/gcc.dg/pr115109.c   |  8 +++
 4 files changed, 95 insertions(+), 3 deletions(-)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 6c09eb731284..01326570e2b2 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -10277,6 +10277,7 @@ build_enumerator (location_t decl_loc, location_t loc,
  struct c_enum_contents *the_enum, tree name, tree value)
 {
   tree decl;
+  tree old_decl;
 
   /* Validate and default VALUE.  */
 
@@ -10336,6 +10337,23 @@ build_enumerator (location_t decl_loc, location_t loc,
 definition.  */
   value = convert (the_enum->enum_type, value);
 }
+  else if (flag_isoc23
+  && (old_decl = lookup_name_in_scope (name, current_scope))
+  && old_decl != error_mark_node
+  && TREE_TYPE (old_decl)
+  && TREE_TYPE (TREE_TYPE (old_decl))
+  && TREE_CODE (old_decl) == CONST_DECL)
+{
+  /* Enumeration constants in a redeclaration have the previous type.  */
+  tree previous_type = TREE_TYPE (DECL_INITIAL (old_decl));
+  if (!int_fits_type_p (value, previous_type))
+   {
+ error_at (loc, "value of redeclared enumerator outside the range "
+"of %qT", previous_type);
+ locate_old_decl (old_decl);
+   }
+  value = convert (previous_type, value);
+}
   else
 {
   /* Even though the underlying type of an enum is unspecified, the
@@ -10402,9 +10420,14 @@ build_enumerator (location_t decl_loc, location_t loc,
 false);
 }
   else
-the_enum->enum_next_value
-  = build_binary_op (EXPR_LOC_OR_LOC (value, input_location),
-PLUS_EXPR, value, integer_one_node, false);
+{
+  /* In a redeclaration the type can already be the enumeral type.  */
+  if (TREE_CODE (TREE_TYPE (value)) == ENUMERAL_TYPE)
+   value = convert (ENUM_UNDERLYING_TYPE (TREE_TYPE (value)), value);
+  the_enum->enum_next_value
+   = build_binary_op (EXPR_LOC_OR_LOC (value, input_location),
+  PLUS_EXPR, value, integer_one_node, false);
+}
   the_enum->enum_overflow = tree_int_cst_lt (the_enum->enum_next_value, value);
   if (the_enum->enum_overflow
   && !ENUM_FIXED_UNDERLYING_TYPE_P (the_enum->enum_type))
diff --git a/gcc/testsuite/gcc.dg/c23-tag-enum-6.c 
b/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
new file mode 100644
index ..29aef7ee3fdf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-tag-enum-6.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -fno-short-enums" } */
+
+#include <limits.h>
+
+enum E : int { a = 1, b = 2 };
+enum E : int { b = _Generic(a, enum E: 2), a = 1 };
+
+enum H { x = 1 };
+enum H { x = 2UL + UINT_MAX }; /* { dg-error "outside the range" } */
+
+enum K : int { z = 1 };
+enum K : int { z = 2UL + UINT_MAX };   /* { dg-error "outside the range" } */
+
+enum F { A = 0, B = UINT_MAX };
+enum F { B = UINT_MAX, A };/* { dg-error "outside the range" } */
+
+enum G : unsigned int { C = 0, D = UINT_MAX };
+enum G : unsigned int { D = UINT_MAX, C }; /* { dg-error "overflow" } */
+
diff --git a/gcc/testsuite/gcc.dg/c23-tag-enum-7.c 
b/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
new file mode 100644
index ..d4c787c8f716
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-tag-enum-7.c
@@ -0,0 +1,41 @@
+/* { dg-do compile }
+ * { dg-options "-std=c23 -fno-short-enums" } */
+
+#include <limits.h>
+
+// enumerators are all representable in int
+enum E { a = 1UL, b = _Generic(a, int: 2) };
+static_assert(_Generic(a, int: 1));
+static_assert(_Generic(b, int: 1));
+enum E { a = 1UL, b = _Generic(a, int: 2) };
+static_assert(_Generic(a, int: 1));
+static_assert(_Generic(b, int: 1));
+
+// enumerators are not representable in int
+enum H { c = 1UL << (UINT_WIDTH + 

Re: check_qualified_type

2024-06-17 Thread Martin Uecker via Gcc
On Monday, 17.06.2024 at 15:40 +0200, Jakub Jelinek wrote:
> On Mon, Jun 17, 2024 at 03:33:05PM +0200, Martin Uecker wrote:
> > > I've done that and that was because build_qualified_type uses that
> > > predicate, where qualified types created by build_qualified_type have
> > > as TYPE_CANONICAL the qualified type of the main variant of the canonical
> > > type, while in all other cases TYPE_CANONICAL is just the main variant of
> > > the type.
> > > Guess we could also just do
> > >   if (TYPE_QUALS (x) == TYPE_QUALS (t))
> > > TYPE_CANONICAL (x) = TYPE_CANONICAL (t);
> > >   else if (TYPE_CANONICAL (t) != t
> > >  || TYPE_QUALS (x) != TYPE_QUALS (TYPE_CANONICAL (t)))
> > > TYPE_CANONICAL (x)
> > >   = build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x));
> > >   else
> > > TYPE_CANONICAL (x) = x;
> > > 
> > 
> > Ok, that works. I think the final "else" is then impossible to reach
> > and can be eliminated as well, because if TYPE_CANONICAL (t) == t then 
> > TYPE_QUALS (x) == TYPE_QUALS (TYPE_CANONICAL (t)) is identical to
> > TYPE_QUALS (x) == TYPE_QUALS (t) which is the first case.
> 
> If c_update_type_canonical is only ever called for the main variants of the
> type and they always have !TYPE_QUALS (t), then yes.
> But if we rely on that, perhaps we should gcc_checking_assert that.
> So
>   gcc_checking_assert (t == TYPE_MAIN_VARIANT (t) && !TYPE_QUALS (t));
> or something similar at the start of the function.

It calls itself recursively on pointers to the type.  But to
me the third branch looks dead in any case, because the first
two cover all possibilities.

Martin

> Then we could also change the
>   for (tree x = TYPE_MAIN_VARIANT (t); x; x = TYPE_NEXT_VARIANT (x))
> to
>   for (tree x = t; x; x = TYPE_NEXT_VARIANT (x))
> and
> if (TYPE_QUALS (x) == TYPE_QUALS (t))
> ...
> to
> if (!TYPE_QUALS (x))
>   TYPE_CANONICAL (x) = TYPE_CANONICAL (t);
> else
>   build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x));
> 








Re: check_qualified_type

2024-06-17 Thread Martin Uecker via Gcc
On Monday, 17.06.2024 at 14:57 +0200, Jakub Jelinek wrote:
> On Mon, Jun 17, 2024 at 02:42:05PM +0200, Richard Biener wrote:
> > > > > I am trying to understand what check_qualified_type
> > > > > does exactly. The direct comparison of TYPE_NAMES seems incorrect
> > > > > for C and its use in c_update_type_canonical then causes
> > > > > PR114930 and PR115502.  In the latter function I think
> > > > > it is not really needed and I guess one could simply remove
> > > > > it, but I wonder if it works incorrectly in other cases 
> > > > > too?
> > > > 
> > > > TYPE_NAMES is compared because IIRC typedefs are recorded as variants
> > > > and 'const T' isn't the same as 'const int' with typedef int T.
> > > 
> > > so if it is intentional that it differentiates between 
> > > 
> > > struct foo
> > > 
> > > and
> > > 
> > > typedef struct foo bar
> > > 
> > > then I will change c_update_type_canonical to not use it,
> > > because both types should have the same TYPE_CANONICAL
> > 
> > The check is supposed to differentiate between variants and all variants
> > have the same TYPE_CANONICAL so I'm not sure why you considered using
> > this function for canonical type compute?
> 
> I've done that and that was because build_qualified_type uses that
> predicate, where qualified types created by build_qualified_type have
> as TYPE_CANONICAL the qualified type of the main variant of the canonical
> type, while in all other cases TYPE_CANONICAL is just the main variant of
> the type.
> Guess we could also just do
>   if (TYPE_QUALS (x) == TYPE_QUALS (t))
> TYPE_CANONICAL (x) = TYPE_CANONICAL (t);
>   else if (TYPE_CANONICAL (t) != t
>  || TYPE_QUALS (x) != TYPE_QUALS (TYPE_CANONICAL (t)))
> TYPE_CANONICAL (x)
>   = build_qualified_type (TYPE_CANONICAL (t), TYPE_QUALS (x));
>   else
> TYPE_CANONICAL (x) = x;
> 

Ok, that works. I think the final "else" is then impossible to reach
and can be eliminated as well, because if TYPE_CANONICAL (t) == t then 
TYPE_QUALS (x) == TYPE_QUALS (TYPE_CANONICAL (t)) is identical to
TYPE_QUALS (x) == TYPE_QUALS (t) which is the first case.

Martin
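
The argument that the final "else" cannot be reached can also be checked by brute force. The following is only a model: qualifier sets are plain small integers, the function names are invented for this sketch, and the one assumption taken from the discussion is that a type which is its own TYPE_CANONICAL necessarily has the same qualifiers as that canonical type.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the three-way branch quoted above.
     qx   stands for TYPE_QUALS (x)
     qt   stands for TYPE_QUALS (t)
     qc   stands for TYPE_QUALS (TYPE_CANONICAL (t))
     self stands for TYPE_CANONICAL (t) == t
   Returns the number of the branch that would be taken.  */
int branch_taken(unsigned qx, unsigned qt, unsigned qc, bool self)
{
    /* If t is its own canonical type, both have the same qualifiers.  */
    assert(!self || qc == qt);

    if (qx == qt)
        return 1;
    else if (!self || qx != qc)
        return 2;
    else
        return 3;
}

/* Count how often the final else is reached over all combinations of
   three qualifier bits.  */
int count_third_branch(void)
{
    int hits = 0;
    for (unsigned qx = 0; qx < 8; qx++)
        for (unsigned qt = 0; qt < 8; qt++)
            for (unsigned qc = 0; qc < 8; qc++)
                for (int self = 0; self <= 1; self++) {
                    if (self && qc != qt)
                        continue;   /* combination cannot occur */
                    if (branch_taken(qx, qt, qc, self != 0) == 3)
                        hits++;
                }
    return hits;
}
```

In this model count_third_branch() never finds a combination that reaches the third branch, which matches the reasoning in the message: reaching it would require qx != qt and, at the same time, qx == qc == qt.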





Re: -Wcast-qual consistency with initialization conversion and double pointer types

2024-06-17 Thread Martin Uecker via Gcc
On Monday, 17.06.2024 at 12:06 +, Joseph Myers wrote:
> On Sun, 16 Jun 2024, Martin Uecker via Gcc wrote:
> 
> > I think it should not warn about:
> > 
> > char *x;
> > *(char * volatile *)&x;
> > 
> > as this is regular qualifier adding and this is
> > a bug in GCC.
> > 
> > I would guess it looks at all qualifiers added at
> > all levels but should ignore the one on the first level.
> 
> This is meant to be implementing, as an extension to C, the C++ rules 
> (where converting from char** to const char** is unsafe, but converting 
> from char** to const char*const* is safe).  So the first question is what 
> C++ thinks of this conversion.
> 
Note that this is about the case where no third-level qualifier
is added. We should still warn about converting from char** to
const char **, and to volatile char ** but probably not (I think)
when converting to char*const*, const char*const*, 
volatile char*const*, and also not when converting
to char*volatile*.  So not when all intermediate casts
are const but also not when only a qualifier is added
to the second level but not to deeper levels. 

Martin
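
A small sketch of the conversions discussed here, written as ordinary C so that the difference between them is visible. All function names are invented for the example. The first two conversions only qualify the intermediate pointer and need no cast at all in C; the third qualifies the innermost char while leaving the intermediate pointer unqualified, which C accepts only with a cast.

```c
#include <assert.h>
#include <stdbool.h>

/* char ** -> char * volatile *: only the intermediate pointer gains a
   qualifier.  This is an ordinary qualification conversion and works
   as an implicit conversion.  */
char read_via_volatile_ptr(char **pp)
{
    assert(pp && *pp);
    char * volatile *q = pp;    /* no cast needed */
    return **q;
}

/* char ** -> char * const *: same situation with const.  */
char read_via_const_ptr(char **pp)
{
    assert(pp && *pp);
    char * const *q = pp;       /* no cast needed */
    return **q;
}

/* char ** -> const char **: the innermost type gains a qualifier while
   the intermediate pointer stays unqualified.  A cast is required, and
   this is the conversion that could be used to make a plain char *
   point at a const object, so a -Wcast-qual warning is expected here.
   Only the conversion is shown; nothing is accessed through the result.  */
const char **as_deep_const(char **pp)
{
    return (const char **)pp;
}

char demo_volatile(void)
{
    char c = 'x';
    char *p = &c;
    return read_via_volatile_ptr(&p);
}

char demo_const(void)
{
    char c = 'x';
    char *p = &c;
    return read_via_const_ptr(&p);
}

bool demo_deep_const_keeps_address(void)
{
    char c = 'x';
    char *p = &c;
    return (void *)as_deep_const(&p) == (void *)&p;
}
```

Compiling this with -Wcast-qual is one way to see which of the three conversions a given compiler version diagnoses; the sketch itself makes no claim about what any particular GCC release prints.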



Re: check_qualified_type

2024-06-17 Thread Martin Uecker via Gcc
On Monday, 17.06.2024 at 08:01 +0200, Richard Biener via Gcc wrote:
> On Sun, 16 Jun 2024, Martin Uecker wrote:
> 
> > 
> > 
> > I am trying to understand what check_qualified_type
> > does exactly. The direct comparison of TYPE_NAMES seems incorrect
> > for C and its use in c_update_type_canonical then causes
> > PR114930 and PR115502.  In the latter function I think
> > it is not really needed and I guess one could simply remove
> > it, but I wonder if it works incorrectly in other cases 
> > too?
> 
> TYPE_NAMES is compared because IIRC typedefs are recorded as variants
> and 'const T' isn't the same as 'const int' with typedef int T.

so if it is intentional that it differentiates between 

struct foo

and

typedef struct foo bar

then I will change c_update_type_canonical to not use it,
because both types should have the same TYPE_CANONICAL

Martin




check_qualified_type

2024-06-16 Thread Martin Uecker via Gcc



I am trying to understand what check_qualified_type
does exactly. The direct comparison of TYPE_NAMES seems incorrect
for C and its use in c_update_type_canonical then causes
PR114930 and PR115502.  In the latter function I think
it is not really needed and I guess one could simply remove
it, but I wonder if it works incorrectly in other cases 
too?


Martin




Re: -Wcast-qual consistency with initialization conversion and double pointer types

2024-06-16 Thread Martin Uecker via Gcc


I think it should not warn about:

char *x;
*(char * volatile *)&x;

as this is regular qualifier adding and this is
a bug in GCC.

I would guess it looks at all qualifiers added at
all levels but should ignore the one on the first level.

Martin


On Saturday, 15.06.2024 at 10:17 -0700, Ryan Libby via Gcc wrote:
> I'm not a C language expert and I'm looking for advice on whether a
> -Wcast-qual diagnostic in one situation and not another is intentional
> behavior.
> 
> Here's a set of examples (same as attachment).
> 
> % cat cast-qual-example.c
> #define F(name, type, qual) \
> typedef type t_##name;  \
> void name(void) {   \
> t_##name x = 0, y, z;   \
> y = *(t_##name qual *)&x;   \
> z = *(t_##name qual *){&x}; \
> }
> 
> F(fcc, char, const)
> F(fpc, char *, const)
> F(fcv, char, volatile)
> F(fpv, char *, volatile)
> 
> void fpv2(void) {
> char *x = 0, *y, *z;
> y = *(char * volatile *)&x;
> z = *(char * volatile *){&x};
> }
> 
> void eg1(void) {
> /* Adapted from -Wcast-qual doc */
> char v0 = 'v';
> char *v1 = &v0;
> char **p = &v1;
> /* p is char ** value.  */
> char * volatile *q = (char * volatile *) p;
> /* Assignment of volatile pointer to char is OK. */
> char u0 = 'u';
> char * volatile u1 = &u0;
> *q = u1;
> /* Now *q is accessed through a non-volatile-qualified pointer. */
> *p = 0;
> }
> 
> void eg2(void) {
> char v = 'v';
> char *p = &v;
> /* p is char * value.  */
> char volatile *q = (char volatile *) p;
> /* Assignment of volatile char is OK (and also plain char). */
> char volatile u = 'u';
> *q = u;
> /* Now *q is accessed through a non-volatile-qualified pointer. */
> *p = 0;
> }
> 
> % gcc13 -std=c17 -Wall -Wextra -Wcast-qual -Wno-unused -c
> cast-qual-example.c -o /dev/null
> cast-qual-example.c: In function 'fpv':
> cast-qual-example.c:5:14: warning: to be safe all intermediate
> pointers in cast from 'char **' to 'char * volatile*' must be 'const'
> qualified [-Wcast-qual]
> 5 | y = *(t_##name qual *)&x;   \
>   |  ^
> cast-qual-example.c:12:1: note: in expansion of macro 'F'
>12 | F(fpv, char *, volatile)
>   | ^
> cast-qual-example.c: In function 'fpv2':
> cast-qual-example.c:16:14: warning: to be safe all intermediate
> pointers in cast from 'char **' to 'char * volatile*' must be 'const'
> qualified [-Wcast-qual]
>16 | y = *(char * volatile *)&x;
>   |  ^
> cast-qual-example.c: In function 'eg1':
> cast-qual-example.c:26:30: warning: to be safe all intermediate
> pointers in cast from 'char **' to 'char * volatile*' must be 'const'
> qualified [-Wcast-qual]
>26 | char * volatile *q = (char * volatile *) p;
>   |  ^
> % clang -std=c17 -Wall -Wextra -Wcast-qual -Wno-unused -c
> cast-qual-example.c -o /dev/null
> %
> 
> The macro and typedef are to illustrate the point, they aren't otherwise
> needed, and fpv2 shows the same thing without them.
> 
> So, in the conversion of char ** to char * volatile *, the cast before
> the assignment of y is diagnosed, but the conversion in the
> initialization of the compound literal for the assignment of z is not.
> 
> First, is the cast construct actually different from the initialization
> construct in terms of safety?  I would think not, but maybe I am
> missing something.
> 
> I think that both assignment expressions in fpv as a whole are
> ultimately safe, considering also the immediate dereference of the
> temporary outer pointer value.
> 
> In eg1 and eg2 I modified examples from the -Wcast-qual documentation.
> eg1 is diagnosed, eg2 is not.
> 
> I think that the *p assignment in eg1 might be undefined behavior
> (6.7.3, referring to an object with volatile-qualified type (*q) through
> an lvalue without volatile-qualified type (*p)).
> 
> But then I don't get why the same wouldn't be true if we take away the
> inner pointer and repeat the exercise with plain char (eg1 vs eg2).
> 
> So, what's going on here?  Is the gcc behavior intentional?  Is it
> consistent?  And is there a recommended way to construct a temporary
> volatile pointer to an object (which may itself be a pointer) without
> tripping -Wcast-qual, without just casting away type information (as in,
> without intermediate casts through void *, uintptr_t, etc), and
> preferably also without undefined behavior?
> 
> I have checked that the behavior is the same with current sources and
> -std=c23 (gcc (GCC) 15.0.0 20240614 (experimental)).
> 
> P.s. I have seen gcc bug 84166 that advises that the -Wcast-qual warning
> from the cast is intentional in that case.  I think this case is
> different because in that case the qualifiers are on the innermost type.
> 
> Thank you,
> 
> Ryan Libby



[gcc r15-934] C23: allow aliasing for types derived from structs with variable size

2024-05-30 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:d2cfe8a73b3c4195a25cde28e1641ef36ebb08c1

commit r15-934-gd2cfe8a73b3c4195a25cde28e1641ef36ebb08c1
Author: Martin Uecker 
Date:   Fri May 24 12:35:27 2024 +0200

C23: allow aliasing for types derived from structs with variable size

Previously, we set the aliasing set of structures with variable size

struct foo { int x[n]; char b; };

to zero. The reason is that such types can be compatible with different
structure types which are incompatible with each other.

struct foo { int x[2]; char b; };
struct foo { int x[3]; char b; };

But it is not enough to set the aliasing set to zero, because derived
types would then still end up in different equivalence classes even
though they might be compatible.  Instead those types should be set
to structural equivalency.  We also add checking assertions that
ensure that TYPE_CANONICAL is set correctly for all tagged types.

gcc/c/
* c-decl.cc (finish_struct): Do not set TYPE_CANONICAL for
structure or unions with variable size.
* c-objc-common.cc (c_get_alias_set): Do not set alias set to zero.
* c-typeck.cc (comptypes_verify): New function.
(comptypes,comptypes_same_p,comptypes_check_enum_int): Add 
assertion.
(comptypes_equiv_p): Add assertion that ensures that compatible
types have the same equivalence class.
(tagged_types_tu_compatible_p): Remove now unneeded special case.

gcc/testsuite/
* gcc.dg/gnu23-tag-alias-8.c: New test.

Diff:
---
 gcc/c/c-decl.cc  |  2 +-
 gcc/c/c-objc-common.cc   |  5 -
 gcc/c/c-typeck.cc| 37 +---
 gcc/testsuite/gcc.dg/gnu23-tag-alias-8.c | 24 +
 4 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 6e6606c9570..9f7d55c0b10 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9749,7 +9749,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   C_TYPE_BEING_DEFINED (t) = 0;
 
   /* Set type canonical based on equivalence class.  */
-  if (flag_isoc23)
+  if (flag_isoc23 && !C_TYPE_VARIABLE_SIZE (t))
 {
   if (c_struct_htab == NULL)
c_struct_htab = hash_table::create_ggc (61);
diff --git a/gcc/c/c-objc-common.cc b/gcc/c/c-objc-common.cc
index 283f6a8ae26..738e899a2a9 100644
--- a/gcc/c/c-objc-common.cc
+++ b/gcc/c/c-objc-common.cc
@@ -420,11 +420,6 @@ c_var_mod_p (tree x, tree fn ATTRIBUTE_UNUSED)
 alias_set_type
 c_get_alias_set (tree t)
 {
-  /* Structs with variable size can alias different incompatible
- structs.  Let them alias anything.   */
-  if (RECORD_OR_UNION_TYPE_P (t) && C_TYPE_VARIABLE_SIZE (t))
-return 0;
-
   return c_common_get_alias_set (t);
 }
 
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 09b2c265a46..48934802148 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1167,6 +1167,28 @@ common_type (tree t1, tree t2)
   return c_common_type (t1, t2);
 }
 
+
+
+/* Helper function for comptypes.  For two compatible types, return 1
+   if they pass consistency checks.  In particular we test that
+   TYPE_CANONICAL is set correctly, i.e. the two types can alias.  */
+
+static bool
+comptypes_verify (tree type1, tree type2)
+{
+  if (TYPE_CANONICAL (type1) != TYPE_CANONICAL (type2)
+  && !TYPE_STRUCTURAL_EQUALITY_P (type1)
+  && !TYPE_STRUCTURAL_EQUALITY_P (type2))
+{
+  /* FIXME: check other types. */
+  if (RECORD_OR_UNION_TYPE_P (type1)
+ || TREE_CODE (type1) == ENUMERAL_TYPE
+ || TREE_CODE (type2) == ENUMERAL_TYPE)
+   return false;
+}
+  return true;
+}
+
 struct comptypes_data {
   bool enum_and_int_p;
   bool different_types_p;
@@ -1188,6 +1210,8 @@ comptypes (tree type1, tree type2)
   struct comptypes_data data = { };
   bool ret = comptypes_internal (type1, type2, &data);
 
+  gcc_checking_assert (!ret || comptypes_verify (type1, type2));
+
   return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
@@ -1201,6 +1225,8 @@ comptypes_same_p (tree type1, tree type2)
   struct comptypes_data data = { };
   bool ret = comptypes_internal (type1, type2, &data);
 
+  gcc_checking_assert (!ret || comptypes_verify (type1, type2));
+
   if (data.different_types_p)
 return false;
 
@@ -1218,6 +1244,8 @@ comptypes_check_enum_int (tree type1, tree type2, bool 
*enum_and_int_p)
   bool ret = comptypes_internal (type1, type2, &data);
   *enum_and_int_p = data.enum_and_int_p;
 
+  gcc_checking_assert (!ret || comptypes_verify (type1, type2));
+
   return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
@@ -1232,6 +1260,8 @@ comptypes_check_different_types (tree type1, tree type2,
   bool ret = comptypes_internal (type1, type2, &data);
   *different_types_p = data.different_types_p;
 
+  gcc_checking_assert (!ret || comptypes_verify (type1, type2));
+
   return ret ? (data.warning_needed ? 2 : 

[gcc r15-933] C: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-30 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:867d1264fe71d4291194373d1a1c409cac97a597

commit r15-933-g867d1264fe71d4291194373d1a1c409cac97a597
Author: Martin Uecker 
Date:   Sun May 19 23:13:22 2024 +0200

C: allow aliasing of compatible types derived from enumeral types [PR115157]

Aliasing of enumeral types with the underlying integer is now allowed
by setting the aliasing set to zero.  But this does not allow aliasing
of derived types which are compatible as required by ISO C.  Instead,
initially set structural equality.  Then set TYPE_CANONICAL and update
pointers and main variants when the type is completed (as done for
structures and unions in C23).
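
The kind of code this is about can be sketched as follows. The sketch assumes that the implementation chooses unsigned int as the type compatible with enum E (this is implementation-defined; GCC does so on common targets when no enumerator is negative, and the size check below is only a weak guard for that assumption). The function names are invented for the example.

```c
#include <assert.h>

enum E { E0 = 0, E1 = 1 };

/* Weak guard for the assumption that enum E is compatible with
   unsigned int on this implementation.  */
_Static_assert(sizeof(enum E) == sizeof(unsigned int),
               "sketch assumes unsigned int as underlying type");

/* Direct case: if e and u refer to the same object, the store through u
   must be visible when reading through e.  */
int store_and_reload(enum E *e, unsigned int *u)
{
    assert(e && u);
    *e = E0;
    *u = 1;
    return (int)*e;
}

/* Derived case: enum E * and unsigned int * are compatible pointer types
   under the same assumption, so the two stores below may hit the same
   pointer object and the reload must see the second one.  */
int store_and_reload_ptr(enum E **pe, unsigned int **pu, unsigned int *target)
{
    assert(pe && pu && target);
    *pe = 0;
    *pu = target;
    return *pe != 0;
}

int demo_enum_alias(void)
{
    enum E x = E0;
    return store_and_reload(&x, (unsigned int *)&x);
}

int demo_pointer_alias(void)
{
    enum E x = E1;
    enum E *p = 0;
    return store_and_reload_ptr(&p, (unsigned int **)&p, (unsigned int *)&x);
}
```

Both demo functions have to return 1. An optimizer that placed the enum-derived pointer type and the integer-derived pointer type in unrelated alias sets could wrongly fold the second function to 0, which is the class of problem the commit message describes for derived types.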

PR tree-optimization/115157
PR tree-optimization/115177

gcc/c/
* c-decl.cc (shadow_tag_warned,parser_xref_tag,start_enum,
finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / TYPE_CANONICAL.
* c-objc-common.cc (c_get_alias_set): Remove special case.
(get_aka_type): Add special case.

gcc/c-family/
* c-attribs.cc (handle_hardbool_attribute): Set TYPE_CANONICAL
for hardbools.

gcc/
* godump.cc (go_output_typedef): Use TYPE_MAIN_VARIANT instead
of TYPE_CANONICAL.

gcc/testsuite/
* gcc.dg/enum-alias-1.c: New test.
* gcc.dg/enum-alias-2.c: New test.
* gcc.dg/enum-alias-3.c: New test.
* gcc.dg/enum-alias-4.c: New test.

Diff:
---
 gcc/c-family/c-attribs.cc   |  1 +
 gcc/c/c-decl.cc | 11 +--
 gcc/c/c-objc-common.cc  |  7 ++-
 gcc/godump.cc   | 10 +++---
 gcc/testsuite/gcc.dg/enum-alias-1.c | 24 
 gcc/testsuite/gcc.dg/enum-alias-2.c | 25 +
 gcc/testsuite/gcc.dg/enum-alias-3.c | 26 ++
 gcc/testsuite/gcc.dg/enum-alias-4.c | 22 ++
 8 files changed, 112 insertions(+), 14 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 605469dd7dd..e3833ed5f20 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -1074,6 +1074,7 @@ handle_hardbool_attribute (tree *node, tree name, tree 
args,
 
   TREE_SET_CODE (*node, ENUMERAL_TYPE);
   ENUM_UNDERLYING_TYPE (*node) = orig;
+  TYPE_CANONICAL (*node) = TYPE_CANONICAL (orig);
 
   tree false_value;
   if (args)
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index b691b91b3db..6e6606c9570 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5051,7 +5051,7 @@ shadow_tag_warned (const struct c_declspecs *declspecs, 
int warned)
  if (t == NULL_TREE)
{
  t = make_node (code);
- if (flag_isoc23 && code != ENUMERAL_TYPE)
+ if (flag_isoc23 || code == ENUMERAL_TYPE)
SET_TYPE_STRUCTURAL_EQUALITY (t);
  pushtag (input_location, name, t);
}
@@ -8828,7 +8828,7 @@ parser_xref_tag (location_t loc, enum tree_code code, 
tree name,
  the forward-reference will be altered into a real type.  */
 
   ref = make_node (code);
-  if (flag_isoc23 && code != ENUMERAL_TYPE)
+  if (flag_isoc23 || code == ENUMERAL_TYPE)
 SET_TYPE_STRUCTURAL_EQUALITY (ref);
   if (code == ENUMERAL_TYPE)
 {
@@ -9919,6 +9919,7 @@ start_enum (location_t loc, struct c_enum_contents 
*the_enum, tree name,
 {
   enumtype = make_node (ENUMERAL_TYPE);
   TYPE_SIZE (enumtype) = NULL_TREE;
+  SET_TYPE_STRUCTURAL_EQUALITY (enumtype);
   pushtag (loc, name, enumtype);
   if (fixed_underlying_type != NULL_TREE)
{
@@ -9935,6 +9936,8 @@ start_enum (location_t loc, struct c_enum_contents 
*the_enum, tree name,
  TYPE_SIZE (enumtype) = NULL_TREE;
  TYPE_PRECISION (enumtype) = TYPE_PRECISION (fixed_underlying_type);
  ENUM_UNDERLYING_TYPE (enumtype) = fixed_underlying_type;
+ TYPE_CANONICAL (enumtype) = TYPE_CANONICAL (fixed_underlying_type);
+ c_update_type_canonical (enumtype);
  layout_type (enumtype);
}
 }
@@ -10094,6 +10097,10 @@ finish_enum (tree enumtype, tree values, tree 
attributes)
   ENUM_UNDERLYING_TYPE (enumtype) =
c_common_type_for_size (TYPE_PRECISION (tem), TYPE_UNSIGNED (tem));
 
+  TYPE_CANONICAL (enumtype) =
+   TYPE_CANONICAL (ENUM_UNDERLYING_TYPE (enumtype));
+  c_update_type_canonical (enumtype);
+
   layout_type (enumtype);
 }
 
diff --git a/gcc/c/c-objc-common.cc b/gcc/c/c-objc-common.cc
index 42a62c84fe7..283f6a8ae26 100644
--- a/gcc/c/c-objc-common.cc
+++ b/gcc/c/c-objc-common.cc
@@ -130,6 +130,8 @@ get_aka_type (tree type)
 
   result = get_aka_type (orig_type);
 }
+  else if (TREE_CODE (type) == ENUMERAL_TYPE)
+return type;
   else
 {
   tree canonical = TYPE_CANONICAL (type);
@@ -418,11 +420,6 @@ c_var_mod_p (tree x, tree fn ATTRIBUTE_UNUSED)
 alias_set_type
 

[gcc r15-912] C23: fix aliasing for structures/unions with incomplete types

2024-05-29 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:86b98d939989427ff025bcfd536ad361fcdc699c

commit r15-912-g86b98d939989427ff025bcfd536ad361fcdc699c
Author: Martin Uecker 
Date:   Sat Mar 30 19:49:48 2024 +0100

C23: fix aliasing for structures/unions with incomplete types

When incomplete structure/union types are completed later, compatibility
of struct types that contain pointers to such types changes.  When forming
equivalence classes for TYPE_CANONICAL, we therefore need to be conservative
and treat all structs with the same tag which are pointer targets as
equivalent for purposes of determining equivalency of structure/union
types which contain such types as members. This avoids having to update
TYPE_CANONICAL of such structure/unions recursively. The pointer types
themselves are updated in c_update_type_canonical.
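
The underlying difficulty is that type compatibility in C is not transitive, so equivalence classes have to be coarser than compatibility. A minimal sketch of this, using the array case that the comment added by this patch also mentions: a pointer to an array of unknown size is compatible with pointers to arrays of any size, although those are not compatible with each other. The function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* char (*)[] is compatible with both char (*)[3] and char (*)[4],
   although char (*)[3] and char (*)[4] are not compatible with each
   other.  */
char first_char(char (*buf)[])
{
    assert(buf != NULL);
    return (*buf)[0];
}

char demo_three(void)
{
    char a3[3] = "ab";
    return first_char(&a3);     /* char (*)[3] converts without a cast */
}

char demo_four(void)
{
    char a4[4] = "xyz";
    return first_char(&a4);     /* char (*)[4] converts without a cast */
}
```

Any grouping of types used for aliasing therefore has to put all three pointer types into one class, and by the same reasoning structs that contain them as members.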

gcc/c/
* c-typeck.cc (comptypes_internal): Add flag to track
whether a struct is the target of a pointer.
(tagged_types_tu_compatible_p): When forming equivalence
classes, treat nested pointed-to structs as equivalent.

gcc/testsuite/
* gcc.dg/c23-tag-incomplete-alias-1.c: New test.

Diff:
---
 gcc/c/c-typeck.cc | 43 +--
 gcc/testsuite/gcc.dg/c23-tag-incomplete-alias-1.c | 36 +++
 2 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index ad4c7add562..09b2c265a46 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1172,6 +1172,7 @@ struct comptypes_data {
   bool different_types_p;
   bool warning_needed;
   bool anon_field;
+  bool pointedto;
   bool equiv;
 
   const struct tagged_tu_seen_cache* cache;
@@ -1235,8 +1236,36 @@ comptypes_check_different_types (tree type1, tree type2,
 }
 
 
-/* Like comptypes, but if it returns nonzero for struct and union
-   types considered equivalent for aliasing purposes.  */
+/* Like comptypes, but if it returns true for struct and union types
+   considered equivalent for aliasing purposes, i.e. for setting
+   TYPE_CANONICAL after completing a struct or union.
+
+   This function must return false only for types which are not
+   compatible according to C language semantics (cf. comptypes),
+   otherwise the middle-end would make incorrect aliasing decisions.
+   It may return true for some similar types that are not compatible
+   according to those stricter rules.
+
+   In particular, we ignore size expression in arrays so that the
+   following structs are in the same equivalence class:
+
+   struct foo { char (*buf)[]; };
+   struct foo { char (*buf)[3]; };
+   struct foo { char (*buf)[4]; };
+
+   We also treat unions / structs with members which are pointers to
+   structures or unions with the same tag as equivalent (if they are not
+   incompatible for other reasons).  Although incomplete structure
+   or union types are not compatible to any other type, they may become
+   compatible to different types when completed.  To avoid having to update
+   TYPE_CANONICAL at this point, we only consider the tag when forming
+   the equivalence classes.  For example, the following types with tag
+   'foo' are all considered equivalent:
+
+   struct bar;
+   struct foo { struct bar *x };
+   struct foo { struct bar { int a; } *x };
+   struct foo { struct bar { char b; } *x };  */
 
 bool
 comptypes_equiv_p (tree type1, tree type2)
@@ -1357,6 +1386,7 @@ comptypes_internal (const_tree type1, const_tree type2,
   /* Do not remove mode information.  */
   if (TYPE_MODE (t1) != TYPE_MODE (t2))
return false;
+  data->pointedto = true;
   return comptypes_internal (TREE_TYPE (t1), TREE_TYPE (t2), data);
 
 case FUNCTION_TYPE:
@@ -1375,7 +1405,7 @@ comptypes_internal (const_tree type1, const_tree type2,
 
if ((d1 == NULL_TREE) != (d2 == NULL_TREE))
  data->different_types_p = true;
-   /* Ignore size mismatches.  */
+   /* Ignore size mismatches when forming equivalence classes.  */
if (data->equiv)
  return true;
/* Sizes must match unless one is missing or variable.  */
@@ -1515,6 +1545,12 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree 
t2,
   if (TYPE_NAME (t1) != TYPE_NAME (t2))
 return false;
 
+  /* When forming equivalence classes for TYPE_CANONICAL in C23, we treat
+ structs with the same tag as equivalent, but only when they are targets
+ of pointers inside other structs.  */
+  if (data->equiv && data->pointedto)
+return true;
+
   if (!data->anon_field && NULL_TREE == TYPE_NAME (t1))
 return false;
 
@@ -1610,6 +1646,7 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree 
t2,
  return false;
 
data->anon_field = !DECL_NAME (s1);
+   data->pointedto = false;
 
data->cache = 
if (!comptypes_internal (TREE_TYPE (s1), TREE_TYPE (s2), data))
diff --git 

[gcc r15-825] c: Fix for some variably modified types not being recognized [PR114831]

2024-05-24 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:9f1798c1a93257526196a3c19828e40fb28ac551

commit r15-825-g9f1798c1a93257526196a3c19828e40fb28ac551
Author: Martin Uecker 
Date:   Sat May 18 14:40:02 2024 +0200

c: Fix for some variably modified types not being recognized [PR114831]

We did not evaluate expressions with variably modified types correctly
in typeof and did not produce warnings when jumping over declarations
using typeof.  After addressof or array-to-pointer decay we construct
new pointer types that have to be marked variably modified if the pointer
target is variably modified.
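
What "variably modified pointer" means here can be seen in a small sketch that uses sizeof instead of typeof, so that it does not depend on this fix. The function names are invented for the example, and the code assumes a compiler with support for variable length arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Array-to-pointer decay: a has type int[1][n], and the decayed pointer
   has type int (*)[n].  Its target is a variable length array, so the
   pointer type itself is variably modified.  */
size_t row_bytes_after_decay(int n)
{
    assert(n > 0);
    int a[1][n];
    int (*row)[n] = a;
    return sizeof *row;     /* computed at run time: n * sizeof(int) */
}

/* Address-of: &b has type int (*)[n], again a variably modified
   pointer type.  */
size_t row_bytes_after_addressof(int n)
{
    assert(n > 0);
    int b[n];
    int (*p)[n] = &b;
    return sizeof *p;       /* computed at run time: n * sizeof(int) */
}
```

Because the size of the pointer target is only known at run time, expressions and declarations using such types have to be evaluated, and jumps over them have to be diagnosed, in the same way as for the arrays themselves.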

2024-05-18 Martin Uecker  

PR c/114831
gcc/c/
* c-typeck.cc (array_to_pointer_conversion, build_unary_op):
Propagate flag to pointer target.

gcc/testsuite/
* gcc.dg/pr114831-1.c: New test.
* gcc.dg/pr114831-2.c: New test.
* gcc.dg/gnu23-varmod-1.c: New test.
* gcc.dg/gnu23-varmod-2.c: New test.

Diff:
---
 gcc/c/c-typeck.cc |  9 +
 gcc/testsuite/gcc.dg/gnu23-varmod-1.c | 12 
 gcc/testsuite/gcc.dg/gnu23-varmod-2.c | 16 
 gcc/testsuite/gcc.dg/pr114831-1.c | 27 +++
 gcc/testsuite/gcc.dg/pr114831-2.c | 16 
 5 files changed, 80 insertions(+)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7ecca9f58c6..2d092357e0f 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1891,8 +1891,12 @@ array_to_pointer_conversion (location_t loc, tree exp)
 
   copy_warning (exp, orig_exp);
 
+  bool varmod = C_TYPE_VARIABLY_MODIFIED (restype);
+
   ptrtype = build_pointer_type (restype);
 
+  C_TYPE_VARIABLY_MODIFIED (ptrtype) = varmod;
+
   if (INDIRECT_REF_P (exp))
 return convert (ptrtype, TREE_OPERAND (exp, 0));
 
@@ -4630,6 +4634,7 @@ build_unary_op (location_t location, enum tree_code code, 
tree xarg,
   tree eptype = NULL_TREE;
   const char *invalid_op_diag;
   bool int_operands;
+  bool varmod;
 
   int_operands = EXPR_INT_CONST_OPERANDS (xarg);
   if (int_operands)
@@ -5113,8 +5118,12 @@ build_unary_op (location_t location, enum tree_code 
code, tree xarg,
   gcc_assert (TREE_CODE (arg) != COMPONENT_REF
  || !DECL_C_BIT_FIELD (TREE_OPERAND (arg, 1)));
 
+  varmod = C_TYPE_VARIABLY_MODIFIED (argtype);
+
   argtype = build_pointer_type (argtype);
 
+  C_TYPE_VARIABLY_MODIFIED (argtype) = varmod;
+
   /* ??? Cope with user tricks that amount to offsetof.  Delete this
 when we have proper support for integer constant expressions.  */
   val = get_base_address (arg);
diff --git a/gcc/testsuite/gcc.dg/gnu23-varmod-1.c 
b/gcc/testsuite/gcc.dg/gnu23-varmod-1.c
new file mode 100644
index 000..add10d13573
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu23-varmod-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } 
+ * { dg-options "-std=gnu23" } */
+
+int foo(int n)
+{
+   int (*a(void))[n] { return 0; };
+   goto err;   /* { dg-error "jump into scope" "variably modified" } */
+   typeof((n++,a)) b2; 
+err:
+   return n;
+}
+
diff --git a/gcc/testsuite/gcc.dg/gnu23-varmod-2.c 
b/gcc/testsuite/gcc.dg/gnu23-varmod-2.c
new file mode 100644
index 000..c36af1d1647
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu23-varmod-2.c
@@ -0,0 +1,16 @@
+/* { dg-do run } 
+ * { dg-options "-std=gnu23" } */
+
+int foo(int n)
+{
+   int (*a(void))[n] { return 0; };
+   typeof((n++,a)) b2;
+   return n;
+}
+
+int main()
+{
+   if (2 != foo(1))
+   __builtin_abort();
+}
+
diff --git a/gcc/testsuite/gcc.dg/pr114831-1.c 
b/gcc/testsuite/gcc.dg/pr114831-1.c
new file mode 100644
index 000..ed30a494b3c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114831-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile }
+ * { dg-options "-std=c23" } */
+
+void f(int n)
+{
+   int a[n];
+   goto foo;   /* { dg-error "jump into scope" "variably modified" } */
+   typeof(a) b1;   
+foo:
+}
+
+void g(int n)
+{
+   int a2[1][n];
+   goto foo;   /* { dg-error "jump into scope" "variably modified" } */
+   typeof((n++,a2)) b2;
+foo:
+}
+
+void h(int n)
+{
+   int a[n];
+   typeof(a) b1;   
+   goto foo;   /* { dg-error "jump into scope" "variably modified" } */
+   typeof() b;
+foo:
+}
diff --git a/gcc/testsuite/gcc.dg/pr114831-2.c 
b/gcc/testsuite/gcc.dg/pr114831-2.c
new file mode 100644
index 000..ecfd87988c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114831-2.c
@@ -0,0 +1,16 @@
+/* { dg-do run } 
+ * { dg-options "-std=c23" } */
+
+int foo(int n)
+{
+   int a[1][n];
+   typeof((n++,a)) b2;
+   return n;
+}
+
+int main()
+{
+   if (2 != foo(1))
+   __builtin_abort();
+}
+


Re: strtol(3) with QChar arguments

2024-05-05 Thread Martin Uecker via Gcc
On Sunday, 05.05.2024 at 15:13 +0200, Alejandro Colomar wrote:
> Hi Martin,
> 
> I was wondering why C23 didn't use QChar for strtol(3).  It has the same
> problems that string functions have: a const input string and a
> non-const output string (the endptr).

I am not sure whether strtol was discussed.

> 
> I think endptr should have the same constness of the string passed to
> strtol(3), no?
> 
> Should this be addressed for C3x?  For liba2i.git, I'm working on
> const-generic versions of strtol(3) wrappers, which have helped simplify
> the const/non-const mix of pointers in shadow.git.

One potential issue is that for strtol such a change would break
all callers that pass a const-qualified pointer as the first argument
and also provide an argument for endptr, which currently has to be
a pointer to a non-const pointer.

For the functions we changed this breaks only cases where
a const qualified pointer is passed and then the result
is assigned to a non-const pointer, which could already be
considered questionable in existing code.
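
A minimal sketch of the breakage described above. The name strtol_c is
hypothetical (it is not a libc function); it only stands in for a
const-preserving prototype, and parse_today / parse_const are
illustrative callers:

```c
#include <stdlib.h>

/* Hypothetical const-preserving variant (not a real libc function):
   endptr gets the same constness as the input string. */
static long strtol_c(const char *str, const char **endptr, int base)
{
    char *end;
    long value = strtol(str, &end, base);   /* standard prototype */
    if (endptr)
        *endptr = end;                      /* char * converts to const char * */
    return value;
}

/* Typical existing caller: const input, non-const end pointer. */
static long parse_today(const char *s, size_t *consumed)
{
    char *end;                              /* accepted by today's strtol */
    long value = strtol(s, &end, 10);
    *consumed = (size_t)(end - s);
    return value;
}

/* The same caller written against the const-preserving prototype. */
static long parse_const(const char *s, size_t *consumed)
{
    const char *end;                        /* must now be const char * */
    long value = strtol_c(s, &end, 10);
    /* Passing the address of a plain char * here would be rejected:
       char ** does not convert implicitly to const char **. */
    *consumed = (size_t)(end - s);
    return value;
}
```

Callers shaped like parse_today are the ones that would stop compiling
if strtol itself took the shape of strtol_c for const input.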

Martin

> 
> Have a lovely day!
> Alex
> 



Re: Generated files in libgfortran for Fortran intrinsic procedures (was: Updated Sourceware infrastructure plans)

2024-04-18 Thread Martin Uecker via Gcc
On Thursday, 18.04.2024 at 14:01 +0200, Tobias Burnus wrote:
> Hi Janne,
> 
> Janne Blomqvist wrote:
> > back when I was active I did think about this
> > issue. IMHO the best of my ideas was to convert these into C++
> > templates.

I haven't looked at libgfortran, but I have not found it problematic
at all to use C in similar numerical code, and this helps
with portability.

Either I use macros, which I keep short and then do not find
inferior to templates (having used C++ for years previously), or,
if there is really a lot of code that needs to be specialized
for a type, I simply use includes:

#define matmul_type double
#include "matmul_impl.c"
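
For comparison, a single-file sketch of the short-macro variant. The
names DEFINE_MATMUL, matmul_r4 and matmul_r8 are illustrative only and
are not taken from libgfortran; the include variant would put the same
function body into matmul_impl.c and expand it once per type:

```c
#include <stddef.h>

/* Expands to one n x n matrix multiply specialised for TYPE.
   Computes c = a * b; all matrices are row-major. */
#define DEFINE_MATMUL(TYPE, SUFFIX)                                   \
    static void matmul_##SUFFIX(size_t n, const TYPE *a,              \
                                const TYPE *b, TYPE *c)               \
    {                                                                 \
        for (size_t i = 0; i < n; i++)                                \
            for (size_t j = 0; j < n; j++) {                          \
                TYPE sum = 0;                                         \
                for (size_t k = 0; k < n; k++)                        \
                    sum += a[i * n + k] * b[k * n + j];               \
                c[i * n + j] = sum;                                   \
            }                                                         \
    }

/* One instantiation per element type. */
DEFINE_MATMUL(float, r4)
DEFINE_MATMUL(double, r8)
```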

Martin


> 
> I think this will work – but we have to be super careful:
> 
> With C++, there is the problem that we definitely do not want to add 
> dependency on libstdc++ nor to use some features which require special 
> hardware support (like exceptions [always bad], symbol aliases, ...). — 
> On some systems, a full C++ support might be not available, like 
> embedded systems (including some odd embedded OS) or offloading devices.
> 
> The libstdc++ dependency would be detected by linking as we currently 
> do. For in-language features, we have to ensure the appropriate flags 
> -fno-exceptions (and probably a few more). And it should be clear what 
> language features to use.
> 
> If we do, I think that would surely be an option.
> 
> > What we're essentially doing with the M4 stuff and the
> > proposed in-house Python reimplementation is to make up for lack of
> > monomorphization in plain old C. Rather than doing some DIY templates,
> > switch the implementation language to something which has that feature
> > built-in, in this case C++.  No need to convert the entire libgfortran
> > to C++ if you don't want to, just those objects that are generated
> > from the M4 templates. Something like
> > 
> > template <typename T>
> > void matmul(T* a, T* b, T* c, ...)
> > {
> > // actual matmul code here
> > }
> > 
> > extern "C" {
> >// Instantiate template for every type and export the symbol
> >void matmul_r4(gfc_array_r4* a, gfc_array_r4* b, gfc_array_r4* c, ...)
> >{
> >  matmul(a, b, c, ...);
> >}
> >// And so on for other types
> > }
> 
> Cheers,
> 
> Tobias



Re: Sourceware mitigating and preventing the next xz-backdoor

2024-04-06 Thread Martin Uecker via Gcc
On Saturday, 06.04.2024 at 15:00 +0200, Richard Biener wrote:
> On Fri, Apr 5, 2024 at 11:18 PM Andrew Sutton via Gcc  wrote:
> > 
> > > 
> > > 
> > > 
> > > > I think the key difference here is that Autotools allows arbitrarily
> > > generated code to be executed at any time. More modern build systems
> > > require the use of specific commands/files to run arbitrary code, e.g.
> > > CMake (IIRC [`execute_process()`][2] and [`ExternalProject`][3]), Meson
> > > ([`run_command()`][1]), Cargo ([`build.rs`][4]).\
> > > 
> > > To me it seems that Cargo is the absolute worst case with respect to
> > > supply chain attacks.
> > > 
> > > It pulls in dependencies recursively from a relatively uncurated
> > > list of projects, puts the source of all those dependencies into a
> > > hidden directory in home, and runs build.rs automatically with
> > > user permissions.
> > > 
> > 
> > 100% this. Wait until you learn how proc macros work.
> 
> proc macro execution should be heavily sandboxed, otherwise it seems
> compiling something is enough to get arbitrary code executed with the
> permission of the compiling user.  I mean it's not rocket science - browsers
> do this for javascript.  Hmm, we need a webassembly target ;)

This would be useful anyhow. 

And locking down the compiler using landlock to only access specified
files / directories would also be nice in general.

Martin





[gcc r14-9805] Revert "Fix ICE with -g and -std=c23 related to incomplete types [PR114361]"

2024-04-05 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:8057f9aa1f7e70490064de796d7a8d42d446caf8

commit r14-9805-g8057f9aa1f7e70490064de796d7a8d42d446caf8
Author: Martin Uecker 
Date:   Fri Apr 5 12:14:56 2024 +0200

Revert "Fix ICE with -g and -std=c23 related to incomplete types [PR114361]"

This reverts commit 871bb5ad2dd56343d80b6a6d269e85efdce5  because it
breaks LTO and needs a bit more work. See PR 114574.

Diff:
---
 gcc/c/c-decl.cc |  1 -
 gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c | 14 --
 gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c | 13 -
 gcc/testsuite/gcc.dg/pr114361.c | 11 ---
 4 files changed, 39 deletions(-)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index f2083b9d96f..c747abe9f4e 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9722,7 +9722,6 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   C_TYPE_VARIABLE_SIZE (x) = C_TYPE_VARIABLE_SIZE (t);
   C_TYPE_VARIABLY_MODIFIED (x) = C_TYPE_VARIABLY_MODIFIED (t);
   C_TYPE_INCOMPLETE_VARS (x) = NULL_TREE;
-  TYPE_CANONICAL (x) = TYPE_CANONICAL (t);
 }
 
   /* Update type location to the one of the definition, instead of e.g.
diff --git a/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c 
b/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c
deleted file mode 100644
index 82d652569e9..000
--- a/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c
+++ /dev/null
@@ -1,14 +0,0 @@
-/* { dg-do compile }
- * { dg-options "-std=c23 -g" } */
-
-struct a;
-typedef struct a b;
-
-void g() {
-struct a { b* x; };
-}
-
-struct a { b* x; };
-
-
-
diff --git a/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c 
b/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c
deleted file mode 100644
index bc47a04ece5..000
--- a/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c
+++ /dev/null
@@ -1,13 +0,0 @@
-/* { dg-do compile }
- * { dg-options "-std=c23 -g" } */
-
-struct a;
-typedef struct a b;
-
-void f() {
-   extern struct a { b* x; } t;
-}
-
-extern struct a { b* x; } t;
-
-
diff --git a/gcc/testsuite/gcc.dg/pr114361.c b/gcc/testsuite/gcc.dg/pr114361.c
deleted file mode 100644
index 0f3feb53566..000
--- a/gcc/testsuite/gcc.dg/pr114361.c
+++ /dev/null
@@ -1,11 +0,0 @@
-/* PR c/114361 */
-/* { dg-do compile } */
-/* { dg-options "-std=gnu23 -g" } */
-
-void f()
-{
-typedef struct foo bar;
-typedef __typeof( ({ (struct foo { bar *x; }){ }; }) ) wuz;
-struct foo { wuz *x; };
-}
-


Re: Sourceware mitigating and preventing the next xz-backdoor

2024-04-03 Thread Martin Uecker via Gcc
On Wednesday, 03.04.2024 at 13:46 -0500, Jonathon Anderson via Gcc wrote:
> Hello all,
> 
> On Wed, 2024-04-03 at 16:00 +0200, Michael Matz wrote:
> > > My takeaway is that software needs to become less complex. Do 
> > > we really still need complex build systems such as autoconf?
> > 
> > (And, FWIW, testing for features isn't "complex".  And have you looked at 
> > other build systems?  I have, and none of them are less complex, just 
> > opaque in different ways from make+autotools).
> 
> Some brief opinions from a humble end-user:
> 
> I think the key difference here is that Autotools allows arbitrarily 
> generated code to be executed at any time. More modern build systems require 
> the use of specific commands/files to run arbitrary code, e.g. CMake (IIRC 
> [`execute_process()`][2] and [`ExternalProject`][3]), Meson 
> ([`run_command()`][1]), Cargo ([`build.rs`][4]).\

To me it seems that Cargo is the absolute worst case with respect to
supply chain attacks.

It pulls in dependencies recursively from a relatively uncurated
list of projects, puts the source of all those dependencies into a 
hidden directory in home, and runs build.rs automatically with
user permissions.

Martin





> IMHO there are similarities here to the memory "safety" of Rust: Rust code 
> can have memory errors, but it can only come from Rust code declared as 
> `unsafe` (or bugs in the compiler itself). The scope is limited and those 
> scopes can be audited with more powerful microscopes... and removed if/when 
> the build system gains first-class support upstream.
> 
> There are other features in the newest build systems listed here (Meson and 
> Cargo) that make this particular attack vector harder. These build systems 
> don't have release tarballs with auxiliary files (e.g. [Meson's is very close 
> to `git archive`][5]), nor do their DSLs allow writing files to the source 
> tree. One could imagine a build/CI worker where the source tree is a 
> read-only bind-mount of a `git archive` extract, that might help defend 
> against attacks of this specific design.
> 
> It's also worth noting that Meson and Cargo use non-Turing-complete 
> declarative DSLs for their build configuration. This implies there is an 
> upper bound on the [cyclomatic complexity][6]-per-line of the build script 
> DSL itself. That doesn't mean you can't write complex Meson code (I have), 
> but it ends up being suspiciously long and thus clear something complex and 
> out of the ordinary is being done.
> 
> Of course, this doesn't make the build system any less complex, but projects 
> using newer build systems seem easier to secure and audit than those using 
> overly flexible build systems like Autotools and maybe even CMake. IMHO using 
> a late-model build system is a relatively low technical hurdle to overcome 
> for the benefits noted above, switching should be considered and in a 
> positive light.
> 
> (For context: my team recently switched our main C/C++ project from Autotools 
> to Meson. The one-time refactor itself was an effort, but after that we got 
> back up to speed quickly and we haven't looked back. Other projects may have 
> an easier time using an unofficial port in the [Meson WrapDB][7] as a 
> starting point.)
> 
> -Jonathon
> 
> [1]: https://mesonbuild.com/External-commands.html
> [2]: 
> https://cmake.org/cmake/help/latest/command/execute_process.html#execute-process
> [3]: https://cmake.org/cmake/help/latest/module/ExternalProject.html
> [4]: https://doc.rust-lang.org/cargo/reference/build-scripts.html
> [5]: https://mesonbuild.com/Creating-releases.html
> [6]: https://en.wikipedia.org/wiki/Cyclomatic_complexity
> [7]: https://mesonbuild.com/Wrapdb-projects.html



Re: Sourceware mitigating and preventing the next xz-backdoor

2024-04-03 Thread Martin Uecker via Gcc
On Wednesday, 03.04.2024 at 18:02 +0200, Michael Matz wrote:
> Hello,
> 
> On Wed, 3 Apr 2024, Martin Uecker wrote:
> 
> > The backdoor was hidden in a complicated autoconf script...
> 
> Which itself had multiple layers and could just as well have been a 
> complicated cmake function.

Don't get me wrong, cmake is in no way better. Another problem was
actually hidden in a cmake test in upstream git in plain
sight:

https://git.tukaani.org/?p=xz.git;a=commitdiff;h=f9cf4c05edd14dedfe63833f8ccbe41b55823b00;hp=af071ef7702debef4f1d324616a0137a5001c14c

> 
> > > (And, FWIW, testing for features isn't "complex".  And have you looked at 
> > > other build systems?  I have, and none of them are less complex, just 
> > > opaque in different ways from make+autotools).
> > 
> > I ask a very specific question: To what extent is testing 
> > for features instead of semantic versions and/or supported
> > standards still necessary?
> 
> I can't answer this with absolute certainty, but points to consider: the 
> semantic versions need to be maintained just as well, in some magic way.  

It would certainly need to be maintained, just as the autoconf
configuration needs to be maintained now.

> Because ultimately software depend on features of dependencies not on 
> arbitrary numbers given to them.  The numbers encode these features, in 
> the best case, when there are no errors.  So, no, version numbers are not 
> a replacement for feature tests, they are a proxy.  One that is manually 
> maintained, and hence prone to errors.

Tests are also prone to errors and - as the example above shows -
also very fragile and susceptible to manipulation.

> 
> Now, supported standards: which one? ;-)  Or more in earnest: while on 
> this mailing list here we could chose a certain set, POSIX, some 
> languages, Windows, MacOS (versions so-and-so).  What about other 
> software relying on other 3rdparty feature providers (libraries or system 
> services)?  Without standards?
> 
> So, without absolute certainty, but with a little bit of it: yes, feature 
> tests are required in general.  That doesn't mean that we could not 
> do away with quite some of them for (e.g.) GCC, those that hold true on 
> any platform we support.  But we can't get rid of the infrastructure for 
> that, and can't get rid of certain classes of tests.
> 
> > This seems like a problematic approach that may have been necessary 
> > decades ago, but it seems it may be time to move on.
> 
> I don't see that.  Many aspects of systems remain non-standardized.

This is just part of the problem.

Martin

> 
> 
> Ciao,
> Michael.



Re: Sourceware mitigating and preventing the next xz-backdoor

2024-04-03 Thread Martin Uecker via Gcc
On Wednesday, 03.04.2024 at 16:00 +0200, Michael Matz wrote:
> Hello,
> 
> On Wed, 3 Apr 2024, Martin Uecker via Gcc wrote:
> 
> > > > Seems reasonable, but note that it wouldn't make any difference to
> > > > this attack.  The liblzma library was modified to corrupt the sshd
> > > > binary, when sshd was linked against liblzma.  The actual attack
> > > > occurred via a connection to a corrupt sshd.  If sshd was running as
> > > > root, as is normal, the attacker had root access to the machine.  None
> > > > of the attacking steps had anything to do with having root access
> > > > while building or installing the program.
> > 
> > There does not seem to be a single good solution against something like this.
> > 
> > My takeaway is that software needs to become less complex. Do 
> > we really still need complex build systems such as autoconf?
> 
> Do we really need complex languages like C++ to write our software in?  
> SCNR :)  

Probably not.

> Complexity lies in the eye of the beholder, but to be honest in 
> the software that we're dealing with here, the build system or autoconf 
> does _not_ come to mind first when thinking about complexity.

The backdoor was hidden in a complicated autoconf script...

> 
> (And, FWIW, testing for features isn't "complex".  And have you looked at 
> other build systems?  I have, and none of them are less complex, just 
> opaque in different ways from make+autotools).

I ask a very specific question: To what extent is testing
for features instead of semantic versions and/or supported
standards still necessary?  This seems like a problematic approach
that may have been necessary decades ago, but it seems it may be
time to move on.
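
As one illustration of relying on a supported standard rather than a
build-time probe, the sketch below selects a code path from the
standard __STDC_VERSION__ macro. The names STATIC_CHECK,
USES_C11_STATIC_ASSERT and uses_c11_static_assert are made up for this
example:

```c
/* Choose an implementation from the declared language standard
   instead of compiling and running a configure-time test program. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#  define STATIC_CHECK(cond, msg) _Static_assert(cond, msg)
#  define USES_C11_STATIC_ASSERT 1
#else
/* Pre-C11 fallback: a negative array size fails to compile. */
#  define STATIC_CHECK(cond, msg) typedef char static_check_[(cond) ? 1 : -1]
#  define USES_C11_STATIC_ASSERT 0
#endif

STATIC_CHECK(sizeof(long) >= sizeof(int), "long must not be narrower than int");

/* Reports which path was selected at compile time. */
static int uses_c11_static_assert(void)
{
    return USES_C11_STATIC_ASSERT;
}
```

This only covers what a language standard guarantees; it does not
answer the question for non-standardized platform features.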


Martin




Re: Sourceware mitigating and preventing the next xz-backdoor

2024-04-03 Thread Martin Uecker via Gcc
On Tuesday, 02.04.2024 at 13:28 -0700, Ian Lance Taylor via Gcc wrote:
> > On Tue, Apr 2, 2024 at 1:21 PM Paul Koning via Gcc  wrote:
> > > > 
> > > > Would it help to require (rather than just recommend) "don't use root 
> > > > except for the actual 'install' step" ?
> > 
> > Seems reasonable, but note that it wouldn't make any difference to
> > this attack.  The liblzma library was modified to corrupt the sshd
> > binary, when sshd was linked against liblzma.  The actual attack
> > occurred via a connection to a corrupt sshd.  If sshd was running as
> > root, as is normal, the attacker had root access to the machine.  None
> > of the attacking steps had anything to do with having root access
> > while building or installing the program.

There does not seem to be a single good solution against something like this.

My takeaway is that software needs to become less complex. Do 
we really still need complex build systems such as autoconf?
Are there still so many different configurations with subtle differences 
that every single feature needs to be tested individually by
running code at build time?

Martin






[gcc r14-9763] Fix ICE with -g and -std=c23 related to incomplete types [PR114361]

2024-04-02 Thread Martin Uecker via Gcc-cvs
https://gcc.gnu.org/g:871bb5ad2dd56343d80b6a6d269e85efdce5

commit r14-9763-g871bb5ad2dd56343d80b6a6d269e85efdce5
Author: Martin Uecker 
Date:   Thu Mar 28 19:15:40 2024 +0100

Fix ICE with -g and -std=c23 related to incomplete types [PR114361]

We did not copy TYPE_CANONICAL to the incomplete variants when
completing a structure.

PR c/114361

gcc/c/
* c-decl.cc (finish_struct): Set TYPE_CANONICAL when completing
structure types.

gcc/testsuite/
* gcc.dg/pr114361.c: New test.
* gcc.dg/c23-tag-incomplete-1.c: New test.
* gcc.dg/c23-tag-incomplete-2.c: New test.

Diff:
---
 gcc/c/c-decl.cc |  1 +
 gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c | 14 ++
 gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c | 13 +
 gcc/testsuite/gcc.dg/pr114361.c | 11 +++
 4 files changed, 39 insertions(+)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index c747abe9f4e..f2083b9d96f 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9722,6 +9722,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   C_TYPE_VARIABLE_SIZE (x) = C_TYPE_VARIABLE_SIZE (t);
   C_TYPE_VARIABLY_MODIFIED (x) = C_TYPE_VARIABLY_MODIFIED (t);
   C_TYPE_INCOMPLETE_VARS (x) = NULL_TREE;
+  TYPE_CANONICAL (x) = TYPE_CANONICAL (t);
 }
 
   /* Update type location to the one of the definition, instead of e.g.
diff --git a/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c 
b/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c
new file mode 100644
index 000..82d652569e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-tag-incomplete-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile }
+ * { dg-options "-std=c23 -g" } */
+
+struct a;
+typedef struct a b;
+
+void g() {
+struct a { b* x; };
+}
+
+struct a { b* x; };
+
+
+
diff --git a/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c 
b/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c
new file mode 100644
index 000..bc47a04ece5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-tag-incomplete-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile }
+ * { dg-options "-std=c23 -g" } */
+
+struct a;
+typedef struct a b;
+
+void f() {
+   extern struct a { b* x; } t;
+}
+
+extern struct a { b* x; } t;
+
+
diff --git a/gcc/testsuite/gcc.dg/pr114361.c b/gcc/testsuite/gcc.dg/pr114361.c
new file mode 100644
index 000..0f3feb53566
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114361.c
@@ -0,0 +1,11 @@
+/* PR c/114361 */
+/* { dg-do compile } */
+/* { dg-options "-std=gnu23 -g" } */
+
+void f()
+{
+typedef struct foo bar;
+typedef __typeof( ({ (struct foo { bar *x; }){ }; }) ) wuz;
+struct foo { wuz *x; };
+}
+


Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
On Monday, 18.03.2024 at 14:21 +0100, Richard Biener wrote:
> On Mon, Mar 18, 2024 at 12:56 PM Martin Uecker  wrote:
> > 
> > On Monday, 18.03.2024 at 11:55 +0100, Martin Uecker wrote:
> > > On Monday, 18.03.2024 at 09:26 +0100, Richard Biener wrote:
> > > > On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
> > > > 
> > > 
> > > > 
> > > > Let me give you a complicating example made valid in C++:
> > > > 
> > > > struct B { float x; float y; };
> > > > struct X { int n; char buf[8]; } x, y;
> > > > 
> > > > void foo(struct B *b)
> > > > {
> > > >   memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);
> > > 
> > > Let's make it an explicit store for the moment
> > > (should not make a difference though):
> > > 
> > > *(struct B*)x.buf = *b;
> > > 
> > > >   y = x; // (*)
> > > > }
> > > > 
> > > > What's the effective type of 'x' in the 'y = x' copy?
> > > 
> > > Good point. The existing wording would take the declared
> > > type of x as the effective type, but this may not be
> > > what you are interested in. Let's assume that x has no declared
> > > type but that it had effective type struct X before the
> > > store to x.buf (because of an even earlier store to
> > > x with type struct X).
> > > 
> > > There is a general question how stores to subobjects
> > > affect effective types and I do not think this is clear
> > > even before this proposed change.
> > 
> > Actually, I think this is not allowed because:
> > 
> > "An object shall have its stored value accessed only by an
> > lvalue expression that has one of the following types:
> > 
> > — a type compatible with the effective type of the object,
> > ...
> > — an aggregate or union type that includes one of the
> > aforementioned types among its members (including,
> > recursively, a member of a subaggregate or contained union), or
> > 
> > — a character type."
> > 
> > ... and we would need to move "a character type" above
> > in the list to make it defined.
> 
> So after
> 
> *(struct B*)x.buf = *b;
> 
> 'x' cannot be used to access itself?  In particular also
> an access to 'x.n' is affected by this?

According to the current wording, and assuming x has no
declared type, x.buf would acquire an effective
type of struct B. Then if x.buf is read as part of
x, it is accessed with an lvalue of type struct X (which
does not include a struct B but a character buffer).

So yes, currently it would be undefined behavior,
and the proposed wording would not change this. Clearly,
we should include an additional change to fix this.
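
For contrast, a sketch of the variant that stays within the current
rules: x has a declared type, and struct B is moved in and out of the
buffer with memcpy rather than through a struct B lvalue. The helper
names put_b, get_b and roundtrip_sum are illustrative, and the buffer
is sized with sizeof (struct B) instead of the literal 8 from the
quoted example:

```c
#include <string.h>

struct B { float x; float y; };
struct X { int n; char buf[sizeof(struct B)]; };

/* Copies *b into x->buf bytewise. x has a declared type, so its
   effective type stays struct X and buf stays an array of char. */
static void put_b(struct X *x, const struct B *b)
{
    memcpy(x->buf, b, sizeof *b);
}

/* Copies the bytes back out into a real struct B object. */
static struct B get_b(const struct X *x)
{
    struct B out;
    memcpy(&out, x->buf, sizeof out);
    return out;
}

static float roundtrip_sum(void)
{
    struct X x = { .n = 1 }, y;
    struct B b = { 1.5f, 2.5f };

    put_b(&x, &b);
    y = x;                      /* whole-aggregate copy via struct X lvalues */
    struct B r = get_b(&y);
    return r.x + r.y;
}
```

The explicit store *(struct B*)x.buf = *b discussed above is what
falls outside this pattern.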

> 
> You are right that the current wording of the standard doesn't
> clarify any of this but this kind of storage abstraction is used
> commonly in the embedded world when there's no runtime
> library providing allocation.  And you said you want to make
> the standard closer to implementation practice ...

Well, we are working on it... Any help is much appreciated.

> 
> Elsewhere when doing 'y = x' people refer to the wording that
> aggregates are copied elementwise but it's not specified how
> those elementwise accesses work - the lvalues are still of type
> X here or are new lvalues implicitly formed and fall under the
> other wordings? 

I think there is no wording for elementwise copy.

My understanding is that the 

"...an aggregate or union type that includes..."

wording above is supposed to define this via an lvalue
access with aggregate or union type.  It blesses the
implied access to the elements via the access with 
an lvalue which has the type of the aggregate.  


>  Can I thus form an effective type of X by
> storing its subobjects at respective offsets (ignoring padding,
> for example) and can I then use an lvalue of type 'X' to access
> the whole aggregate?

I think this is defined behavior.  The subobjects get
their effective types via the individual stores and then
the access using an lvalue of type 'X' is ok according to
the "...an aggregate or union type that includes..."
rule.
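
A sketch of the pattern in question, under the reading given above.
The struct layout and the function name build_and_read are assumptions
made for the example; the storage comes from malloc so that it has no
declared type:

```c
#include <stddef.h>
#include <stdlib.h>

struct X { int n; float f; };

static int build_and_read(void)
{
    void *mem = malloc(sizeof(struct X));   /* no declared type */
    if (!mem)
        return -1;

    /* Store each member at its offset; each store gives that
       subobject its effective type (int, then float). */
    *(int *)((char *)mem + offsetof(struct X, n)) = 1;
    *(float *)((char *)mem + offsetof(struct X, f)) = 2.0f;

    /* Read the whole aggregate with an lvalue of type struct X,
       which includes int and float among its members. */
    struct X whole = *(struct X *)mem;

    free(mem);
    return whole.n + (int)whole.f;
}
```

Padding bytes, if any, are never written here; only the member values
are relied upon.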


Martin




Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
very awkward or inefficient 
> to write code that is completely "safe" (in terms of having fully 
> defined behaviour from the C standards or from implementation-dependent 
> behaviour).  Making your own dynamic memory allocation functions is one 
> such case.  So I have a tendency to jump on any suggestion of changes to 
> the C (or C++) standards that could let people write such essential code 
> in a safer or more efficient manner.

That something is undefined does not automatically mean it is 
forbidden or unsafe.  It simply means it is not portable.  I think
in the embedded space it will be difficult to make everything well
defined.  But I fully agree that widely used techniques should
ideally be based on defined behavior and we should  change the
standard accordingly.

> 
> > > (It is also not uncommon to have the backing space allocated by the
> > > linker, but then it falls under the existing "no declared type" case.)
> > 
> > Yes, although with the change we would make the "no declared type" also
> > be byte arrays, so there is then simply no difference anymore.
> > 
> 
> Fair enough.  (Linker-defined storage does not just have no declared 
> type, it has no directly declared size or other properties either.  The 
> start and the stop of the storage area is typically declared as "extern 
> uint8_t __space_start[], __space_stop[];", or perhaps as single 
> characters or uint32_t types.  The space in between is just calculated 
> as the difference between pointers to these.)
> 
> > > 
> > > 
> > > I would not want uint32_t to be considered an "alias anything" type, but
> > > I have occasionally seen such types used for memory store backings.  It
> > > is perhaps worth considering defining "byte type" as "non-atomic
> > > character type, [u]int8_t (if they exist), or other
> > > implementation-defined types".
> > 
> > This could make sense, the question is whether we want to encourage
> > the use of other types for this use case, as this would then not
> > be portable.
> 
> I think uint8_t should be highly portable, except to targets where it 
> does not exist (and in this day and age, that basically means some DSP 
> devices that have 16-bit, 24-bit or 32-bit char).
> 
> There is precedent for this wording, however, in 6.7.2.1p5 for 
> bit-fields - "A bit-field shall have a type that is a qualified or 
> unqualified version of _Bool, signed int, unsigned int, or some other 
> implementation-defined type".
> 
> I think it should be clear enough that using an implementation-defined 
> type rather than a character type would potentially limit portability. 
> For the kinds of systems I am thinking of, extreme portability is 
> normally not of prime concern - efficiency on a particular target with a 
> particular compiler is often more important.

Thanks, I will bring back this information to WG14.
> 
> > 
> > Are there important reasons for not using "unsigned char" ?
> > 
> 
> What is "important" is often a subjective matter.  One reason many 
> people use "uint8_t" is that they prefer to be explicit about sizes, and 
> would rather have a hard error if the code is used on a target that 
> doesn't support the size.  Some coding standards, such as the very 
> common (though IMHO somewhat flawed) MISRA standard, strongly encourage 
> size-specific types and consider the use of "int" or "unsigned char" as 
> a violation of their rules and directives.  Many libraries and code 
> bases with a history older than C99 have their own typedef names for 
> size-specific types or low-level storage types, such as "sys_uint8", 
> "BYTE", "u8", and so on, and users may prefer these for consistency. 
> And for people with a background in hardware or assembly (not uncommon 
> for small systems embedded programming), or other languages such as 
> Rust, "unsigned char" sounds vague, poorly defined, and somewhat 
> meaningless as a type name for a raw byte of memory or a minimal sized 
> unsigned integer.
> 
> Of course most alternative names for bytes would be typedefs of 
> "unsigned char" and therefore work just the same way.  But as noted 
> before, uint8_t could be defined in another manner on some systems (and 
> on GCC for the AVR, it /is/ defined in a different way - though I have 
> no idea why).
> 
> And bigger types, such as uint32_t, have been used to force alignment 
> for backing store (either because the compiler did not support _Alignas, 
> or the programmer did not know about it).  (But I am not suggesting that 
> plain &qu

Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
On Monday, 18.03.2024 at 11:55 +0100, Martin Uecker wrote:
> On Monday, 18.03.2024 at 09:26 +0100, Richard Biener wrote:
> > On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
> > 
> 
> > 
> > Let me give you a complicating example made valid in C++:
> > 
> > struct B { float x; float y; };
> > struct X { int n; char buf[8]; } x, y;
> > 
> > void foo(struct B *b)
> > {
> >   memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);
> 
> Let's make it an explicit store for the moment
> (should not make a difference though):
> 
> *(struct B*)x.buf = *b;
> 
> >   y = x; // (*)
> > }
> > 
> > What's the effective type of 'x' in the 'y = x' copy? 
> 
> Good point. The existing wording would take the declared
> type of x as the effective type, but this may not be
> what you are interested in. Let's assume that x has no declared
> type but that it had effective type struct X before the
> store to x.buf (because of an even earlier store to 
> x with type struct X).
> 
> There is a general question how stores to subobjects
> affect effective types and I do not think this is clear
> even before this proposed change.

Actually, I think this is not allowed because:

"An object shall have its stored value accessed only by an
lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,
...
— an aggregate or union type that includes one of the
aforementioned types among its members (including,
recursively, a member of a subaggregate or contained union), or

— a character type."

... and we would need to move "a character type" above
in the list to make it defined.

Martin




Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc



Hi David,

On Monday, 18.03.2024 at 10:00 +0100, David Brown wrote:
> Hi,
> 
> I would be very glad to see this change in the standards.
> 
> 
> Should "byte type" include all character types (signed, unsigned and 
> plain), or should it be restricted to "unsigned char" since that is the 
> "byte" type ?  (I think allowing all character types makes sense, but 
> only unsigned char is guaranteed to be suitable for general object 
> backing store.)

At the moment, the special types that can access all others are
the non-atomic character types.  So for symmetry reasons, it
seems that this is also what we want for backing store.
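
The direction that is already defined today can be sketched as
follows: any object may be read or written bytewise through a
character-type lvalue. The helper names copy_bytes and same_after_copy
are illustrative; the reverse direction (a character array serving as
storage for objects of other types) is what the proposed change
concerns and is not shown here:

```c
#include <stddef.h>
#include <string.h>

/* Copies an object representation one byte at a time through
   unsigned char lvalues, which may access any object. */
static void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Returns 1 if the bytewise copy reproduces the representation of v. */
static int same_after_copy(double v)
{
    double out;
    copy_bytes(&out, &v, sizeof v);
    return memcmp(&out, &v, sizeof v) == 0;
}
```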

I am not sure what you mean by "only unsigned char". Are you talking
about C++?  "unsigned char" has no special role in C.

> 
> Should it also include "uint8_t" (if it exists) ?  "uint8_t" is often an 
> alias for "unsigned char", but it could be something different, like an 
> alias for __UINT8_TYPE__, or "unsigned int 
> __attribute__((mode(QImode)))", which is used in the AVR gcc port.

I think this might be a reason to not include it, as it could
affect aliasing analysis. At least, this would be a different
independent change to consider.

> 
> In my line of work - small-systems embedded development - it is common 
> to have "home-made" or specialised memory allocation systems rather than 
> relying on a generic heap.  This is, I think, some of the "existing 
> practice" that you are considering here - there is a "backing store" of 
> some sort that can be allocated and used as objects of a type other than 
> the declared type of the backing store.  While a simple unsigned char 
> array is a very common kind of backing store, there are others that are 
> used, and it would be good to be sure of the correctness guarantees for 
> these.  Possibilities that I have seen include:
> 
> unsigned char heap1[N];
> 
> uint8_t heap2[N];
> 
> union {
>   double dummy_for_alignment;
>   char heap[N];
> } heap3;
> 
> struct {
>   uint32_t capacity;
>   uint8_t * p_next_free;
>   uint8_t heap[N];
> } heap4;
> 
> uint32_t heap5[N];
> 
> Apart from this last one, if "uint8_t" is guaranteed to be a "byte 
> type", then I believe your wording means that these unions and structs 
> would also work as "byte arrays".  But it might be useful to add a 
> footnote clarifying that.
> 

I need to think about this. 

> (It is also not uncommon to have the backing space allocated by the 
> linker, but then it falls under the existing "no declared type" case.)

Yes, although with the change we would make the "no declared type" also 
be byte arrays, so there is then simply no difference anymore.
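
For reference, the kind of home-made allocator David describes might look like the sketch below. All names (heap, heap_used, bump_alloc, HEAP_SIZE) are hypothetical. The code only hands out suitably aligned pointers into an unsigned char array; whether storing objects of other types through those pointers is defined is exactly the question the proposed wording is meant to settle.

```c
#include <stdalign.h>
#include <stddef.h>

#define HEAP_SIZE 1024u

/* Backing store: a byte array aligned for any standard type. */
static alignas(max_align_t) unsigned char heap[HEAP_SIZE];
static size_t heap_used;

/* Hand out size bytes aligned to align, which must be a power of two
   not larger than alignof(max_align_t).  Returns NULL when the
   backing store is exhausted.  Nothing is ever freed. */
void *bump_alloc(size_t size, size_t align)
{
    size_t start = (heap_used + (align - 1)) & ~(align - 1);

    if (start > HEAP_SIZE || size > HEAP_SIZE - start)
        return NULL;
    heap_used = start + size;
    return &heap[start];
}
```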

> 
> 
> I would not want uint32_t to be considered an "alias anything" type, but 
> I have occasionally seen such types used for memory store backings.  It 
> is perhaps worth considering defining "byte type" as "non-atomic 
> character type, [u]int8_t (if they exist), or other 
> implementation-defined types".

This could make sense, the question is whether we want to encourage
the use of other types for this use case, as this would then not
be portable.

Are there important reasons for not using "unsigned char"?

> 
> Some other compilers might guarantee not to do type-based alias analysis 
> and thus view all types as "byte types" in this way.  For gcc, there 
> could be a kind of reverse "may_alias" type attribute to create such types.
> 
> 
> 
> There are a number of other features that could make allocation 
> functions more efficient and safer in use, and which could ideally be 
> standardised in the C standards or at least added as gcc extensions, but 
> I think that's more than you are looking for here!

It is possible to submit a proposal to WG14.

Martin


> 
> David
> 
> 
> 
> On 18/03/2024 08:03, Martin Uecker via Gcc wrote:
> > 
> > Hi,
> > 
> > can you please take a quick look at this? This is intended to align
> > the C standard with existing practice with respect to aliasing by
> > removing the special rules for "objects with no declared type" and
> > making it fully symmetric and only based on types with non-atomic
> > character types being able to alias everything.
> > 
> > 
> > Unrelated to this change, I have another question:  I wonder if GCC
> > (or any other compiler) actually exploits the " or is copied as an
> > array of  byte type, " rule to  make assumptions about the effective
> > types of the target array? I know comp

Re: aliasing

2024-03-18 Thread Martin Uecker via Gcc
Am Montag, dem 18.03.2024 um 09:26 +0100 schrieb Richard Biener:
> On Mon, Mar 18, 2024 at 8:03 AM Martin Uecker  wrote:
> > 
> > 
> > Hi,
> > 
> > can you please take a quick look at this? This is intended to align
> > the C standard with existing practice with respect to aliasing by
> > removing the special rules for "objects with no declared type" and
> > making it fully symmetric and only based on types with non-atomic
> > character types being able to alias everything.
> > 
> > 
> > Unrelated to this change, I have another question:  I wonder if GCC
> > (or any other compiler) actually exploits the " or is copied as an
> > array of  byte type, " rule to  make assumptions about the effective
> > types of the target array?
> 
> We do not make assumptions about this anymore.  We did in the
> past (might be a distant past) transform say
> 
> struct X { int i; float f; } a, b;
> 
> void foo ()
> {
>   __builtin_memcpy (&a, &b, sizeof (struct X));
> }
> 
> into
> 
>   a = b;
> 
> which has an lvalue of type struct X.  But this assumed b's effective
> type was X.  Nowadays we treat the copy as using alias set zero.
> That effectively means the destination gets its effective type "cleared"
> (all subsequent accesses are valid to access storage with the effective
> type of a byte array).

Ok, thanks!  I wonder whether we should remove this special rule
from the standard.  I mostly worried about the "copied as an
array of  byte type" wording which seems difficult to precisely
define.
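
For comparison, the two forms of the copy side by side, as a sketch using the struct X from Richard's example (the function names are hypothetical). When the source really holds a struct X, both produce the same member values; the difference discussed above is only in what a compiler may assume about effective types:

```c
#include <string.h>

struct X { int i; float f; };

/* Byte-wise copy: makes no claim about the type stored in *src. */
void copy_with_memcpy(struct X *dst, const struct X *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Assignment: reads *src through an lvalue of type struct X. */
void copy_with_assignment(struct X *dst, const struct X *src)
{
    *dst = *src;
}
```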

> 
> > I know compilers do this for memcpy...
> > Maybe also if a loop is transformed to memcpy?
> 
> We currently do not preserve the original effective type of the destination
> (or the effective type used to access the source) when doing this.  With
> some tricks we could (we also lose alignment guarantees of the original
> accesses).
> 
> > Martin
> > 
> > 
> > Add the following definition after 3.5, paragraph 2:
> > 
> > byte array
> > object having either no declared type or an array of objects declared with 
> > a byte type
> > 
> > byte type
> > non-atomic character type

This essentially becomes the "alias anything" type.

> > 
> > Modify 6.5,paragraph 6:
> > The effective type of an object that is not a byte array, for an access to its
> > stored value, is the declared type of the object.97) If a value is
> > stored into a byte array through an lvalue having a byte type, then
> > the type of the lvalue becomes the effective type of the object for that
> > access and for subsequent accesses that do not modify the stored value.
> > If a value is copied into a byte array using memcpy or memmove, or is
> > copied as an array of byte type, then the effective type of the
> > modified object for that access and for subsequent accesses that do not
> > modify the value is the effective type of the object from which the
> > value is copied, if it has one. For all other accesses to a byte array,
> > the effective type of the object is simply the type of the lvalue used
> > for the access.
> 
> What's the purpose of this change?  To me this reads more confusing and
> complicated than what I find in the c23 draft from April last year.

Note that C23 has been finalized. This change is proposed for the
revision after c23. 

> 
> I'll note that GCC does not take advantage of "The effective type of an
> object for an access to its stored value is the declared type of the object",
> instead it always relies on the type of the lvalue (treating non-atomic
> character types specially, as well as treating all string ops like memcpy
> or strcpy as using a character type for the access) and the effective type
> of the object for that access and for subsequent accesses that do not
> modify the stored value always becomes that of the lvalue type used for
> the access.

Understood.

> 
> Let me give you an example of a complication, made valid in C++:
> 
> struct B { float x; float y; };
> struct X { int n; char buf[8]; } x, y;
> 
> void foo(struct B *b)
> {
>   memcpy (x.buf, b, sizeof (struct B)); // in C++:  new (x.buf) B (*b);

Let's make it an explicit store for the moment
(should not make a difference though):

*(struct B*)x.buf = *b;

>   y = x; // (*)
> }
> 
> What's the effective type of 'x' in the 'y = x' copy? 

Good point. The existing wording would take the declared
type of x as the effective type, but this may not be
what you are interested in. Let's assume that x has no declared
type but that it had effective type struct X before the
store to x.buf (because of an even earlier store to 
x with type struct X).

There is a general question how stores to subobjects
affect effective types and I do not think this is clear
even before this proposed change.


>  With your new
> wording, does 'B' transfer to x.buf with memcpy?  

Yes, it would. At least this is the intention.

Note that this would currently be undefined behavior
because x.buf has a declared type. So this is the main
thing we want to change, i.e. making this defined.
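
For contrast, a sketch of a variant that goes through memcpy in both directions, so that the buffer is only ever accessed as bytes and no lvalue of type struct B is applied to x.buf. The helper names store_b and load_b are hypothetical, and the static assertion encodes the assumption that struct B fits into buf:

```c
#include <string.h>

struct B { float x; float y; };
struct X { int n; char buf[8]; };

_Static_assert(sizeof(struct B) <= sizeof(((struct X *)0)->buf),
               "struct B must fit into buf");

/* Put the bytes of *b into the buffer. */
void store_b(struct X *x, const struct B *b)
{
    memcpy(x->buf, b, sizeof *b);
}

/* Get them back out into a real struct B object. */
struct B load_b(const struct X *x)
{
    struct B out;

    memcpy(&out, x->buf, sizeof out);
    return out;
}
```

The explicit store `*(struct B*)x.buf = *b;` is the case the proposed wording is meant to make defined.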

> What's the

aliasing

2024-03-18 Thread Martin Uecker via Gcc


Hi,

can you please take a quick look at this? This is intended to align
the C standard with existing practice with respect to aliasing by
removing the special rules for "objects with no declared type" and
making it fully symmetric and only based on types with non-atomic
character types being able to alias everything.


Unrelated to this change, I have another question:  I wonder if GCC
(or any other compiler) actually exploits the " or is copied as an
array of  byte type, " rule to  make assumptions about the effective
types of the target array? I know compilers do this for memcpy...  
Maybe also if a loop is transformed to memcpy?

Martin


Add the following definition after 3.5, paragraph 2:

byte array
object having either no declared type or an array of objects declared with a 
byte type

byte type
non-atomic character type

Modify 6.5,paragraph 6:
The effective type of an object that is not a byte array, for an access to its
stored value, is the declared type of the object.97) If a value is
stored into a byte array through an lvalue having a byte type, then
the type of the lvalue becomes the effective type of the object for that
access and for subsequent accesses that do not modify the stored value.
If a value is copied into a byte array using memcpy or memmove, or is 
copied as an array of byte type, then the effective type of the
modified object for that access and for subsequent accesses that do not
modify the value is the effective type of the object from which the
value is copied, if it has one. For all other accesses to a byte array,
the effective type of the object is simply the type of the lvalue used
for the access.

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3230.pdf
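
As a reminder of how effective types already work for storage with no declared type, here is a small sketch (reuse_storage is a hypothetical name). Each store through a non-character lvalue sets the effective type for the reads that follow it, so the same allocation can hold an int first and a float later:

```c
#include <stdlib.h>

/* Returns 43.0f on success and -1.0f if the allocation fails. */
float reuse_storage(void)
{
    size_t n = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *p = malloc(n);

    if (p == NULL)
        return -1.0f;

    *(int *)p = 42;          /* effective type is now int */
    int i = *(int *)p;       /* read with the matching type */

    *(float *)p = 1.0f;      /* a new store: effective type is now float */
    float f = *(float *)p;   /* read with the matching type */

    free(p);
    return f + (float)i;
}
```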




Re: Question on -fwrapv and -fwrapv-pointer

2023-09-20 Thread Martin Uecker via Gcc
Am Mittwoch, dem 20.09.2023 um 13:40 -0700 schrieb Kees Cook via Gcc:
> On Sat, Sep 16, 2023 at 10:36:52AM +0200, Martin Uecker wrote:
> > > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> > > > On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  wrote:
> > > > > > 
> > > > > > On Thu, 2023-09-14 at 21:41 +, Qing Zhao wrote:
> > > > > > > > > CLANG already provided -fsanitize=unsigned-integer-overflow. 
> > > > > > > > > GCC
> > > > > > > > > might need to do the same.
> > > > > > > > 
> > > > > > > > NO. There is no such thing as unsigned integer overflow. That 
> > > > > > > > option
> > > > > > > > is badly designed and the GCC community has rejected a few 
> > > > > > > > times now
> > > > > > > > having that sanitizer before. It is bad form to have a 
> > > > > > > > sanitizer for
> > > > > > > > well defined code.
> > > > > > > 
> > > > > > > Even though unsigned integer overflow is well defined, it might be
> > > > > > > unintentional, shall we warn user about this?
> > > > > > 
> > > > > > *Everything* could be unintentional and should be warned then.  GCC 
> > > > > > is a
> > > > > > compiler, not an advanced AI educating the programmers.
> > > > > 
> > > > > Well, you are right in some sense. -:)
> > > > > 
> > > > > However, overflow is one important source for security flaws, it’s 
> > > > > important  for compilers to detect
> > > > > overflows in the programs in general.
> > > > 
> > > > Except it is NOT an overflow. Rather it is wrapping. That is a big
> > > > point here. unsigned wraps and does NOT overflow. Yes there is a major
> > > > difference.
> > > 
> > > Right, yes. I will try to pick my language very carefully. :)
> > > 
> > > The practical problem I am trying to solve in the 30 million lines of
> > > Linux kernel code is that of catching arithmetic wrap-around. The
> > > problem is one of evolving the code -- I can't just drop -fwrapv and
> > > -fwrapv-pointer because it's not possible to fix all the cases at once.
> > > (And we really don't want to reintroduce undefined behavior.)
> > > 
> > > So, for signed, pointer, and unsigned types, we need:
> > > 
> > > a) No arithmetic UB -- everything needs to have deterministic behavior.
> > >The current solution here is "-fno-strict-overflow", which eliminates
> > >the UB and makes sure everything wraps.
> > > 
> > > b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
> > >would work with -fsanitize=[signed-integer|pointer]-overflow except
> > >due to "a)" we always wrap. And there isn't currently coverage like
> > >this for unsigned (in GCC).
> > > 
> > > Our problem is that the kernel is filled with a mix of places where there
> > > is intended wrap-around and unintended wrap-around. We can chip away at
> > > fixing the intended wrap-around that we can find with static analyzers,
> > > etc, but at the end of the day there is a long tail of finding the places
> > > where intended wrap-around is hiding. But when the refactoring is
> > > sufficiently complete, we can move the wrap-around warning to a trap,
> > > and the kernel will no longer have this class of security flaw.
> > > 
> > > As a real-world example, here is a bug where a u8 wraps around causing
> > > an under-allocation that allowed for a heap overwrite:
> > > 
> > > https://git.kernel.org/linus/6311071a0562
> > > https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L5422
> > > 
> > > If there were more than 255 elements in a linked list, the allocation
> > > would be too small, and the second loop would write past the end of the
> > > allocation. This is a pretty classic allocation underflow and linear
> > > heap write overflow security flaw. (And it would be trivially stopped by
> > > trapping on the u8 wrap around.)
> > > 
> > > So, I want to be able to catch that at run-time. But we also have code
> > > doing things like "if (ulong + offset < ulong) { ... }":
> > > 
> > > https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/artpec6_crypto.c#L1187
> > > 
> > > This is easy for a static analyzer to find and we can replace it with a
> > > non-wrapping test (e.g. __builtin_add_overflow()), but we'll not find
> > > them all immediately, especially for the signed and pointer cases.
> > > 
> > > So, I need to retain the "everything wraps" behavior while still being
> > > able to detect when it happens.
> > 
> > 
> > Hi Kees,
> > 
> > I have a couple of questions:
> > 
> > Currently, my thinking was that you would use signed integers
> > if you want the usual integer arithmetic rules we know from
> > elementary school and if you overflow this is clearly a bug 
> > you can diagnose with UBsan.
> > 
> > There are people who think that signed overflow should be
> > defined to wrap, but I think this would be a severe
> > mistake because then code would start to rely on it, which
> > makes it then difficult to 

[C PATCH, v2] Add Walloc-size to warn about insufficient size in allocations [PR71219]

2023-09-18 Thread Martin Uecker via Gcc-patches



Compared to the previous version I changed the name of the
warning to "Walloc-size" which matches "Wanalyzer-allocation-size"
but is still in line with the other -Walloc-something warnings
we have. I also added it to Wextra.

I found PR71219 that requests the warning and points out that 
it is recommended by the C secure coding guidelines and added
the PR to the commit log  (although the version with cast is not
diagnosed so far.)  
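
A sketch of the kind of code this is about, going by the description in this mail rather than by the test file; struct b and the function names are made up for illustration. The first allocation is fine, the second passes a constant size smaller than the pointed-to type and is the case the warning targets, and the third is the variant with a cast that is not diagnosed so far:

```c
#include <stdlib.h>

struct b { int x[10]; };

/* Enough storage for the target type: no warning expected. */
struct b *alloc_ok(void)
{
    struct b *p = malloc(sizeof *p);
    return p;
}

/* Constant size smaller than sizeof (struct b): the case that
   -Walloc-size is meant to diagnose. */
struct b *alloc_too_small(void)
{
    struct b *p = malloc(sizeof(int));
    return p;
}

/* Same mistake hidden behind a cast: not diagnosed so far. */
struct b *alloc_too_small_cast(void)
{
    struct b *p = (struct b *)malloc(sizeof(int));
    return p;
}
```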

I did not have time to implement the extensions suggested
on the list, i.e. warn when the size is not a multiple
of the size of the type and warn if the size is not
suitable for a flexible array member. (This is also a bit
more complicated than it seems.)

Bootstrapped and regression tested on x86_64.


Martin


Add option Walloc-size that warns about allocations that have
insufficient storage for the target type of the pointer the
storage is assigned to.

PR c/71219
gcc:
* doc/invoke.texi: Document -Walloc-size option.

gcc/c-family:

* c.opt (Walloc-size): New option.

gcc/c:
* c-typeck.cc (convert_for_assignment): Add warning.

gcc/testsuite:

* gcc.dg/Walloc-size-1.c: New test.
---
 gcc/c-family/c.opt   |  4 
 gcc/c/c-typeck.cc| 27 +
 gcc/doc/invoke.texi  | 10 
 gcc/testsuite/gcc.dg/Walloc-size-1.c | 36 
 4 files changed, 77 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/Walloc-size-1.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 7348ad42ee0..9ba08a1fb6d 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -319,6 +319,10 @@ Walloca
 C ObjC C++ ObjC++ Var(warn_alloca) Warning
 Warn on any use of alloca.
 
+Walloc-size
+C ObjC Var(warn_alloc_size) Warning
+Warn when allocating insufficient storage for the target type of the assigned pointer.
+
 Walloc-size-larger-than=
 C ObjC C++ LTO ObjC++ Var(warn_alloc_size_limit) Joined Host_Wide_Int ByteSize Warning Init(HOST_WIDE_INT_MAX)
 -Walloc-size-larger-than=   Warn for calls to allocation functions that
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index e2bfd2caf85..c759c6245ed 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -7384,6 +7384,33 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
"request for implicit conversion "
"from %qT to %qT not permitted in C++", rhstype, type);
 
+  /* Warn of new allocations that are not big enough for the target
+type.  */
+  tree fndecl;
+  if (warn_alloc_size
+ && TREE_CODE (rhs) == CALL_EXPR
+ && (fndecl = get_callee_fndecl (rhs)) != NULL_TREE
+ && DECL_IS_MALLOC (fndecl))
+   {
+ tree fntype = TREE_TYPE (fndecl);
+ tree fntypeattrs = TYPE_ATTRIBUTES (fntype);
+ tree alloc_size = lookup_attribute ("alloc_size", fntypeattrs);
+ if (alloc_size)
+   {
+ tree args = TREE_VALUE (alloc_size);
+ int idx = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1;
+ /* For calloc only use the second argument.  */
+ if (TREE_CHAIN (args))
+   idx = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN (args))) - 1;
+ tree arg = CALL_EXPR_ARG (rhs, idx);
+ if (TREE_CODE (arg) == INTEGER_CST
+ && tree_int_cst_lt (arg, TYPE_SIZE_UNIT (ttl)))
+warning_at (location, OPT_Walloc_size, "allocation of "
+"insufficient size %qE for type %qT with "
+"size %qE", arg, ttl, TYPE_SIZE_UNIT (ttl));
+   }
+   }
+
   /* See if the pointers point to incompatible address spaces.  */
   asl = TYPE_ADDR_SPACE (ttl);
   asr = TYPE_ADDR_SPACE (ttr);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 33befee7d6b..a4fbcf5e1b5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8086,6 +8086,16 @@ always leads to a call to another @code{cold} function such as wrappers of
 C++ @code{throw} or fatal error reporting functions leading to @code{abort}.
 @end table
 
+@opindex Wno-alloc-size
+@opindex Walloc-size
+@item -Walloc-size
+Warn about calls to allocation functions decorated with attribute
+@code{alloc_size} that specify insufficient size for the target type of
+the pointer the result is assigned to, including those to the built-in
+forms of the functions @code{aligned_alloc}, @code{alloca},
+@code{calloc},
+@code{malloc}, and @code{realloc}.
+
 @opindex Wno-alloc-zero
 @opindex Walloc-zero
 @item -Walloc-zero
diff --git a/gcc/testsuite/gcc.dg/Walloc-size-1.c b/gcc/testsuite/gcc.dg/Walloc-size-1.c
new file mode 100644
index 000..61806f58192
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Walloc-size-1.c
@@ -0,0 +1,36 @@
+/* Tests the warnings for insufficient allocation size.
+   { dg-do compile }
+   { dg-options "-Walloc-size" }
+ * */
+#include 
+#include 
+
+struct 

Re: Question on -fwrapv and -fwrapv-pointer

2023-09-18 Thread Martin Uecker via Gcc
Am Montag, dem 18.09.2023 um 10:47 +0200 schrieb Richard Biener via Gcc:
> On Mon, Sep 18, 2023 at 10:17 AM Martin Uecker  wrote:
> > 
> > Am Montag, dem 18.09.2023 um 09:31 +0200 schrieb Richard Biener via Gcc:
> > > On Sat, Sep 16, 2023 at 10:38 AM Martin Uecker via Gcc  
> > > wrote:
> > > > 
> > > > 
> > > > 
> > > > (moved to gcc@)
> > > > 
> > > > > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> > > > > > On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  
> > > > > > wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > On Thu, 2023-09-14 at 21:41 +, Qing Zhao wrote:
> > > > > > > > > > > CLANG already provided 
> > > > > > > > > > > -fsanitize=unsigned-integer-overflow. GCC
> > > > > > > > > > > might need to do the same.
> > > > > > > > > > 
> > > > > > > > > > NO. There is no such thing as unsigned integer overflow. 
> > > > > > > > > > That option
> > > > > > > > > > is badly designed and the GCC community has rejected a few 
> > > > > > > > > > times now
> > > > > > > > > > having that sanitizer before. It is bad form to have a 
> > > > > > > > > > sanitizer for
> > > > > > > > > > well defined code.
> > > > > > > > > 
> > > > > > > > > Even though unsigned integer overflow is well defined, it 
> > > > > > > > > might be
> > > > > > > > > unintentional, shall we warn user about this?
> > > > > > > > 
> > > > > > > > *Everything* could be unintentional and should be warned then.  
> > > > > > > > GCC is a
> > > > > > > > compiler, not an advanced AI educating the programmers.
> > > > > > > 
> > > > > > > Well, you are right in some sense. -:)
> > > > > > > 
> > > > > > > However, overflow is one important source for security flaws, 
> > > > > > > it’s important  for compilers to detect
> > > > > > > overflows in the programs in general.
> > > > > > 
> > > > > > Except it is NOT an overflow. Rather it is wrapping. That is a big
> > > > > > point here. unsigned wraps and does NOT overflow. Yes there is a 
> > > > > > major
> > > > > > difference.
> > > > > 
> > > > > Right, yes. I will try to pick my language very carefully. :)
> > > > > 
> > > > > The practical problem I am trying to solve in the 30 million lines of
> > > > > Linux kernel code is that of catching arithmetic wrap-around. The
> > > > > problem is one of evolving the code -- I can't just drop -fwrapv and
> > > > > -fwrapv-pointer because it's not possible to fix all the cases at 
> > > > > once.
> > > > > (And we really don't want to reintroduce undefined behavior.)
> > > > > 
> > > > > So, for signed, pointer, and unsigned types, we need:
> > > > > 
> > > > > a) No arithmetic UB -- everything needs to have deterministic 
> > > > > behavior.
> > > > >The current solution here is "-fno-strict-overflow", which 
> > > > > eliminates
> > > > >the UB and makes sure everything wraps.
> > > > > 
> > > > > b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
> > > > >would work with -fsanitize=[signed-integer|pointer]-overflow except
> > > > >due to "a)" we always wrap. And there isn't currently coverage like
> > > > >this for unsigned (in GCC).
> > > > > 
> > > > > Our problem is that the kernel is filled with a mix of places where 
> > > > > there
> > > > > is intended wrap-around and unintended wrap-around. We can chip away 
> > > > > at
> > > > > fixing the intended wrap-around that we can find wit

Re: Question on -fwrapv and -fwrapv-pointer

2023-09-18 Thread Martin Uecker via Gcc
Am Montag, dem 18.09.2023 um 09:31 +0200 schrieb Richard Biener via Gcc:
> On Sat, Sep 16, 2023 at 10:38 AM Martin Uecker via Gcc  
> wrote:
> > 
> > 
> > 
> > (moved to gcc@)
> > 
> > > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> > > > On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  wrote:
> > > > > > 
> > > > > > On Thu, 2023-09-14 at 21:41 +, Qing Zhao wrote:
> > > > > > > > > CLANG already provided -fsanitize=unsigned-integer-overflow. 
> > > > > > > > > GCC
> > > > > > > > > might need to do the same.
> > > > > > > > 
> > > > > > > > NO. There is no such thing as unsigned integer overflow. That 
> > > > > > > > option
> > > > > > > > is badly designed and the GCC community has rejected a few 
> > > > > > > > times now
> > > > > > > > having that sanitizer before. It is bad form to have a 
> > > > > > > > sanitizer for
> > > > > > > > well defined code.
> > > > > > > 
> > > > > > > Even though unsigned integer overflow is well defined, it might be
> > > > > > > unintentional, shall we warn user about this?
> > > > > > 
> > > > > > *Everything* could be unintentional and should be warned then.  GCC 
> > > > > > is a
> > > > > > compiler, not an advanced AI educating the programmers.
> > > > > 
> > > > > Well, you are right in some sense. -:)
> > > > > 
> > > > > However, overflow is one important source for security flaws, it’s 
> > > > > important  for compilers to detect
> > > > > overflows in the programs in general.
> > > > 
> > > > Except it is NOT an overflow. Rather it is wrapping. That is a big
> > > > point here. unsigned wraps and does NOT overflow. Yes there is a major
> > > > difference.
> > > 
> > > Right, yes. I will try to pick my language very carefully. :)
> > > 
> > > The practical problem I am trying to solve in the 30 million lines of
> > > Linux kernel code is that of catching arithmetic wrap-around. The
> > > problem is one of evolving the code -- I can't just drop -fwrapv and
> > > -fwrapv-pointer because it's not possible to fix all the cases at once.
> > > (And we really don't want to reintroduce undefined behavior.)
> > > 
> > > So, for signed, pointer, and unsigned types, we need:
> > > 
> > > a) No arithmetic UB -- everything needs to have deterministic behavior.
> > >The current solution here is "-fno-strict-overflow", which eliminates
> > >the UB and makes sure everything wraps.
> > > 
> > > b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
> > >would work with -fsanitize=[signed-integer|pointer]-overflow except
> > >due to "a)" we always wrap. And there isn't currently coverage like
> > >this for unsigned (in GCC).
> > > 
> > > Our problem is that the kernel is filled with a mix of places where there
> > > is intended wrap-around and unintended wrap-around. We can chip away at
> > > fixing the intended wrap-around that we can find with static analyzers,
> > > etc, but at the end of the day there is a long tail of finding the places
> > > where intended wrap-around is hiding. But when the refactoring is
> > > sufficiently complete, we can move the wrap-around warning to a trap,
> > > and the kernel will no longer have this class of security flaw.
> > > 
> > > As a real-world example, here is a bug where a u8 wraps around causing
> > > an under-allocation that allowed for a heap overwrite:
> > > 
> > > https://git.kernel.org/linus/6311071a0562
> > > https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L5422
> > > 
> > > If there were more than 255 elements in a linked list, the allocation
> > > would be too small, and the second loop would write past the end of the
> > > allocation. This is a pretty classic allocation underflow and linear
> > > heap write overflow security flaw. (And it would be trivially stopped by
> > > trapping on the u8 wrap around

Re: Question on -fwrapv and -fwrapv-pointer

2023-09-16 Thread Martin Uecker via Gcc



(moved to gcc@)

> On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> > On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  wrote:
> > >
> > >
> > >
> > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  wrote:
> > > >
> > > > On Thu, 2023-09-14 at 21:41 +, Qing Zhao wrote:
> > >  CLANG already provided -fsanitize=unsigned-integer-overflow. GCC
> > >  might need to do the same.
> > > >>>
> > > >>> NO. There is no such thing as unsigned integer overflow. That option
> > > >>> is badly designed and the GCC community has rejected a few times now
> > > >>> having that sanitizer before. It is bad form to have a sanitizer for
> > > >>> well defined code.
> > > >>
> > > >> Even though unsigned integer overflow is well defined, it might be
> > > >> unintentional, shall we warn user about this?
> > > >
> > > > *Everything* could be unintentional and should be warned then.  GCC is a
> > > > compiler, not an advanced AI educating the programmers.
> > >
> > > Well, you are right in some sense. -:)
> > >
> > > However, overflow is one important source for security flaws, it’s 
> > > important  for compilers to detect
> > > overflows in the programs in general.
> > 
> > Except it is NOT an overflow. Rather it is wrapping. That is a big
> > point here. unsigned wraps and does NOT overflow. Yes there is a major
> > difference.
> 
> Right, yes. I will try to pick my language very carefully. :)
> 
> The practical problem I am trying to solve in the 30 million lines of
> Linux kernel code is that of catching arithmetic wrap-around. The
> problem is one of evolving the code -- I can't just drop -fwrapv and
> -fwrapv-pointer because it's not possible to fix all the cases at once.
> (And we really don't want to reintroduce undefined behavior.)
> 
> So, for signed, pointer, and unsigned types, we need:
> 
> a) No arithmetic UB -- everything needs to have deterministic behavior.
>The current solution here is "-fno-strict-overflow", which eliminates
>the UB and makes sure everything wraps.
> 
> b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
>would work with -fsanitize=[signed-integer|pointer]-overflow except
>due to "a)" we always wrap. And there isn't currently coverage like
>this for unsigned (in GCC).
> 
> Our problem is that the kernel is filled with a mix of places where there
> is intended wrap-around and unintended wrap-around. We can chip away at
> fixing the intended wrap-around that we can find with static analyzers,
> etc, but at the end of the day there is a long tail of finding the places
> where intended wrap-around is hiding. But when the refactoring is
> sufficiently complete, we can move the wrap-around warning to a trap,
> and the kernel will no longer have this class of security flaw.
> 
> As a real-world example, here is a bug where a u8 wraps around causing
> an under-allocation that allowed for a heap overwrite:
> 
> https://git.kernel.org/linus/6311071a0562
> https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L5422
> 
> If there were more than 255 elements in a linked list, the allocation
> would be too small, and the second loop would write past the end of the
> allocation. This is a pretty classic allocation underflow and linear
> heap write overflow security flaw. (And it would be trivially stopped by
> trapping on the u8 wrap around.)
> 
> So, I want to be able to catch that at run-time. But we also have code
> doing things like "if (ulong + offset < ulong) { ... }":
> 
> https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/artpec6_crypto.c#L1187
> 
> This is easy for a static analyzer to find and we can replace it with a
> non-wrapping test (e.g. __builtin_add_overflow()), but we'll not find
> them all immediately, especially for the signed and pointer cases.
> 
> So, I need to retain the "everything wraps" behavior while still being
> able to detect when it happens.
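
The replacement mentioned above can be sketched as follows. This is not the kernel code; the function names are hypothetical, and __builtin_add_overflow is the GCC/Clang builtin referred to in the quoted text. Both functions answer the same question, but the second one does not rely on the addition wrapping:

```c
#include <stdbool.h>

/* The idiom found in existing code: relies on unsigned wrap-around. */
bool sum_wraps_idiom(unsigned long base, unsigned long offset)
{
    return base + offset < base;
}

/* The non-wrapping test: the builtin reports whether the mathematical
   sum fits into the result type. */
bool sum_wraps_builtin(unsigned long base, unsigned long offset)
{
    unsigned long end;

    return __builtin_add_overflow(base, offset, &end);
}
```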


Hi Kees,

I have a couple of questions:

Currently, my thinking was that you would use signed integers
if you want the usual integer arithmetic rules we know from
elementary school and if you overflow this is clearly a bug 
you can diagnose with UBsan.

There are people who think that signed overflow should be
defined to wrap, but I think this would be a severe
mistake because then code would start to rely on it, which
makes it then difficult to differentiate between bugs and
intended uses (e.g. the unfortunate situation you have 
with the kernel).

I assume you want to combine UBSan plus wrapping for
production use?  Or only for testing?   Or in other words:
why would testing with UBSan and production with wrapping
not be sufficient to find and fix all bugs? 

Wrapping would not be correct because it may lead to
logic errors, use-after-free, etc.  I assume it is still
preferred because it is more deterministic than whatever comes
out of the optimizer when it assumes that overflow is UB.  Is
this the reasoning applied here?
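
A minimal sketch of the two behaviours being contrasted, assuming GCC
or Clang (the function names are made up for illustration). A plain
"a + b" on int is UB on overflow, which is what UBSan diagnoses. The
helpers below instead give a deterministic wrapped result, with or
without an overflow report.

```c
#include <limits.h>
#include <stdbool.h>

/* Deterministic wrapping add: the arithmetic is done on unsigned, where
   wrap-around is defined.  Converting back to int is
   implementation-defined before C23; GCC and Clang wrap.  */
static int add_wrapping(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}

/* Returns true if the exact sum does not fit in an int.  *sum receives
   the wrapped value either way, so a caller can both detect the overflow
   and keep the wrapping result.  */
static bool add_reports_overflow(int a, int b, int *sum)
{
    return __builtin_add_overflow(a, b, sum);
}
```

Neither helper makes the wrapped value correct for the program logic; it
only makes the outcome predictable.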


For unsigned the intended use case is 

Re: Complex numbers in compilers - upcoming GNU Tools Cauldron.

2023-09-12 Thread Martin Uecker via Gcc
On Tuesday, 12.09.2023 at 11:25 +0200, Richard Biener via Gcc wrote:
> On Tue, Sep 5, 2023 at 10:44 PM Toon Moene  wrote:
> > 
> > This is going to be an interesting discussion.
> > 
> > In the upcoming GNU Tools Cauldron meeting the representation of complex
> > numbers in GCC will be discussed from the following "starting point":
> > 
> > "Complex numbers are used to describe many physical phenomenons and are
> > of prime importance in data signal processing. Nevertheless, despite
> > being part of the C and C++ standards since C99, they are still not
> > completely first class citizens in mainstream compilers."
> > 
> > *This* is from the Fortran 66 Standard (http://moene.org/~toon/f66.pdf -
> > a photocopy of the 1966 Standard):
> > 
> > - - - - -
> > 
> > Chapter 4. Data Types:
> > ...
> > 4.2.4 Complex Type.
> > 
> > A complex datum is processor approximation to the value of a complex number.
> > ...
> > 
> > - - - - -
> > 
> > I can recall people complaining about the way complex arithmetic was
> > handled by compilers since the late 70s.
> > 
> > This is even obvious in weather forecasting software I have to deal with
> > *today* (all written in Fortran). Some models use complex variables to
> > encode the "spectral" (wave-decomposed) computations in parts where that
> > is useful - others just "degrade" those algorithms to explicitly use reals.
> 
> Lack of applications / benchmarks using complex numbers is also a
> problem for any work on this.
> 

I could probably provide some examples such as an FFT,
complex Gaussian random number generation, Mandelbrot
set computation, etc.
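
As a rough idea of what such a test case could look like, here is a
minimal Mandelbrot kernel using C99 complex arithmetic. This is only a
sketch: the function name and the iteration limit are made up, and it is
not taken from any existing benchmark.

```c
#include <complex.h>

/* Returns the number of iterations of z = z*z + c (starting from z = 0)
   before |z| exceeds 2, or MAX_ITER if that never happens.  */
static int mandelbrot_iterations(double complex c, int max_iter)
{
    double complex z = 0.0;

    for (int i = 0; i < max_iter; i++) {
        double re = creal(z);
        double im = cimag(z);

        /* Compare |z|^2 against 4 to avoid a square root.  */
        if (re * re + im * im > 4.0)
            return i;

        z = z * z + c;
    }
    return max_iter;
}
```

The loop body is dominated by one complex multiply and one complex add,
which is the kind of code where the representation of complex values in
the compiler matters.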

Martin






[C PATCH 1/6 v2] c: reorganize recursive type checking

2023-09-10 Thread Martin Uecker via Gcc-patches


Thanks Joseph, below is a revised version of this patch
with slight additional changes to the comment of
tagged_types_tu_compatible_p.

OK for trunk?

Martin

On Wednesday, 06.09.2023 at 20:59, Joseph Myers wrote:
> On Sat, 26 Aug 2023, Martin Uecker via Gcc-patches wrote:
> 
> > -static int
> > +static bool
> >  comp_target_types (location_t location, tree ttl, tree ttr)
> 
> The comment above this function should be updated to refer to returning 
> true, not to returning 1.  And other comments on common_pointer_type and 
> inside that function should be updated to refer to comp_target_types 
> returning true, not nonzero.
> 
> > @@ -1395,17 +1382,13 @@ free_all_tagged_tu_seen_up_to (const struct 
> > tagged_tu_seen_cache *tu_til)
> >  
> >  /* Return 1 if two 'struct', 'union', or 'enum' types T1 and T2 are
> > compatible.  If the two types are not the same (which has been
> > -   checked earlier), this can only happen when multiple translation
> > -   units are being compiled.  See C99 6.2.7 paragraph 1 for the exact
> > -   rules.  ENUM_AND_INT_P and DIFFERENT_TYPES_P are as in
> > -   comptypes_internal.  */
> > +   checked earlier).  */
> >  
> > -static int
> > +static bool
> >  tagged_types_tu_compatible_p (const_tree t1, const_tree t2,
> > - bool *enum_and_int_p, bool *different_types_p)
> > + struct comptypes_data* data)
> 
> Similarly, this comment should be updated for the new return type.  Also 
> the GNU style is "struct comptypes_data *data" with space before not after 
> '*'.
> 
> > @@ -1631,9 +1603,9 @@ tagged_types_tu_compatible_p (const_tree t1, 
> > const_tree t2,
> > Otherwise, the argument types must match.
> > ENUM_AND_INT_P and DIFFERENT_TYPES_P are as in comptypes_internal.  */
> >  
> > -static int
> > +static bool
> >  function_types_compatible_p (const_tree f1, const_tree f2,
> > -bool *enum_and_int_p, bool *different_types_p)
> > +struct comptypes_data *data)
> 
> Another comment to update for a changed return type.
> 
> >  /* Check two lists of types for compatibility, returning 0 for
> > -   incompatible, 1 for compatible, or 2 for compatible with
> > -   warning.  ENUM_AND_INT_P and DIFFERENT_TYPES_P are as in
> > -   comptypes_internal.  */
> > +   incompatible, 1 for compatible.  ENUM_AND_INT_P and
> > +   DIFFERENT_TYPES_P are as in comptypes_internal.  */
> >  
> > -static int
> > +static bool
> >  type_lists_compatible_p (const_tree args1, const_tree args2,
> > -bool *enum_and_int_p, bool *different_types_p)
> > +struct comptypes_data *data)
> 
> This one also needs updating to remove references to parameters that no 
> longer exist.
> 

c: reorganize recursive type checking

Reorganize recursive type checking to use a structure to
store information collected during the recursion and
returned to the caller (warning_needed, enum_and_int_p,
different_types_p).
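
The idea can be sketched on a toy type model. This is not GCC code:
struct type, struct cmp_data and the toy_* functions are hypothetical
stand-ins, and the compatibility rules are deliberately simplified. It
only shows one struct pointer replacing several bool * out-parameters
in the recursion, with a wrapper exposing a single flag to the caller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy type representation; not GCC's tree.  */
enum kind { K_INT, K_ENUM, K_POINTER };

struct type
{
  enum kind kind;
  const struct type *target;   /* Pointee for K_POINTER, else NULL.  */
};

/* Everything the recursion wants to report back, in one place.  */
struct cmp_data
{
  bool enum_and_int_p;      /* An enum was matched against an int.  */
  bool different_types_p;   /* The two types were not identical.  */
};

/* Return true if A and B are compatible in this toy model, recording
   side information in DATA along the way.  */
static bool
toy_comptypes_internal (const struct type *a, const struct type *b,
                        struct cmp_data *data)
{
  if (a == b)
    return true;

  data->different_types_p = true;

  if (a->kind == K_POINTER && b->kind == K_POINTER)
    return toy_comptypes_internal (a->target, b->target, data);

  if ((a->kind == K_ENUM && b->kind == K_INT)
      || (a->kind == K_INT && b->kind == K_ENUM))
    {
      data->enum_and_int_p = true;
      return true;
    }

  return a->kind == K_INT && b->kind == K_INT;
}

/* Wrapper that exposes just one of the collected flags.  */
static bool
toy_comptypes_check_enum_int (const struct type *a, const struct type *b,
                              bool *enum_and_int_p)
{
  struct cmp_data data = { 0 };
  bool ret = toy_comptypes_internal (a, b, &data);
  *enum_and_int_p = data.enum_and_int_p;
  return ret;
}

/* Sample types for experimenting.  */
static const struct type toy_int = { K_INT, NULL };
static const struct type toy_enum = { K_ENUM, NULL };
static const struct type toy_ptr_int = { K_POINTER, &toy_int };
static const struct type toy_ptr_enum = { K_POINTER, &toy_enum };
```

Adding a further piece of information later then only means adding a
member to the struct, not touching every function signature.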

gcc/c:
* c-typeck.cc (struct comptypes_data): Add structure.
(tagged_types_tu_compatible_p,
function_types_compatible_p, type_lists_compatible_p,
comptypes_internal): Add structure to interface, change
return type to bool, and adapt calls.
(comp_target_types): Change return type to bool.
(comptypes, comptypes_check_enum_int,
comptypes_check_different_types): Adapt calls.
---
 gcc/c/c-typeck.cc | 282 --
 1 file changed, 121 insertions(+), 161 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index e2bfd2caf85..e55e887da14 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -90,12 +90,14 @@ static bool require_constant_elements;
 static bool require_constexpr_value;
 
 static tree qualify_type (tree, tree);
-static int tagged_types_tu_compatible_p (const_tree, const_tree, bool *,
-bool *);
-static int comp_target_types (location_t, tree, tree);
-static int function_types_compatible_p (const_tree, const_tree, bool *,
-   bool *);
-static int type_lists_compatible_p (const_tree, const_tree, bool *, bool *);
+struct comptypes_data;
+static bool tagged_types_tu_compatible_p (const_tree, const_tree,
+ struct comptypes_data *);
+static bool comp_target_types (location_t, tree, tree);
+static bool function_types_compatible_p (const_tree, const_tree,
+struct comptypes_data *);
+static bool type_lists_compatible_p (const_tree, const_tree,
+   

[C PATCH] c: flag for tag compatibility rules

2023-08-26 Thread Martin Uecker via Gcc-patches


Add a flag to turn tag compatibility rules on or off
independent from the language version.

gcc/c-family:
* c.opt (flag_tag_compat): New flag.

gcc/c:
* c-decl.cc (diagnose_mismatched_decls, start_struct,
finish_struct, start_enum, finish_enum): Support flag.
* c-typeck.cc (composite_type_internal): Support flag.

gcc/doc:
* invoke.texi: Document flag.

gcc/testsuite:
* gcc.dg/asan/pr81460.c: Turn off tag compatibility.
* gcc.dg/c99-tag-1.c: Turn off tag compatibility.
* gcc.dg/c99-tag-2.c: Turn off tag compatibility.
* gcc.dg/decl-3.c: Turn off tag compatibility.
* gcc.dg/enum-redef-1.c: Turn off tag compatibility.
* gcc.dg/pr17188-1.c: Turn off tag compatibility.
* gcc.dg/pr18809-1.c: Turn off tag compatibility.
* gcc.dg/pr39084.c: Turn off tag compatibility.
* gcc.dg/pr79983.c: Turn off tag compatibility.
---
 gcc/c-family/c.opt  |  3 +++
 gcc/c/c-decl.cc | 12 ++--
 gcc/c/c-typeck.cc   |  2 +-
 gcc/doc/invoke.texi |  5 +
 gcc/testsuite/gcc.dg/asan/pr81460.c |  1 +
 gcc/testsuite/gcc.dg/c99-tag-1.c|  2 +-
 gcc/testsuite/gcc.dg/c99-tag-2.c|  2 +-
 gcc/testsuite/gcc.dg/decl-3.c   |  1 +
 gcc/testsuite/gcc.dg/enum-redef-1.c |  2 ++
 gcc/testsuite/gcc.dg/pr17188-1.c|  2 +-
 gcc/testsuite/gcc.dg/pr18809-1.c|  1 +
 gcc/testsuite/gcc.dg/pr39084.c  |  2 +-
 gcc/testsuite/gcc.dg/pr79983.c  |  2 +-
 13 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 2242524cd3e..f95f12ba249 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -2214,6 +2214,9 @@ Enum(strong_eval_order) String(some) Value(1)
 EnumValue
 Enum(strong_eval_order) String(all) Value(2)
 
+ftag-compat
+C Var(flag_tag_compat) Init(1)
+
 ftemplate-backtrace-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(template_backtrace_limit) 
Init(10)
 Set the maximum number of template instantiation notes for a single warning or 
error.
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 2137ba8b845..6d1e0d5c382 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -2094,7 +2094,7 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl,
  given scope.  */
   if (TREE_CODE (olddecl) == CONST_DECL)
 {
-  if (flag_isoc2x
+  if ((flag_isoc2x || flag_tag_compat)
  && TYPE_NAME (DECL_CONTEXT (newdecl))
  && DECL_CONTEXT (newdecl) != DECL_CONTEXT (olddecl)
  && TYPE_NAME (DECL_CONTEXT (newdecl)) == TYPE_NAME (DECL_CONTEXT 
(olddecl)))
@@ -8723,7 +8723,7 @@ start_struct (location_t loc, enum tree_code code, tree 
name,
 
   /* For C2X, even if we already have a completed definition,
  we do not use it. We will check for consistency later.  */
-  if (flag_isoc2x && ref && TYPE_SIZE (ref))
+  if ((flag_isoc2x || flag_tag_compat) && ref && TYPE_SIZE (ref))
 ref = NULL_TREE;
 
   if (ref && TREE_CODE (ref) == code)
@@ -9515,7 +9515,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
 }
 
   /* Check for consistency with previous definition */
-  if (flag_isoc2x)
+  if (flag_isoc2x || flag_tag_compat)
 {
   tree vistype = previous_tag (t);
   if (vistype
@@ -9534,7 +9534,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   C_TYPE_BEING_DEFINED (t) = 0;
 
   /* Set type canonical based on equivalence class.  */
-  if (flag_isoc2x)
+  if (flag_isoc2x || flag_tag_compat)
 {
   if (NULL == c_struct_htab)
 c_struct_htab = hash_table<c_struct_hasher>::create_ggc (61);
@@ -9672,7 +9672,7 @@ start_enum (location_t loc, struct c_enum_contents 
*the_enum, tree name,
   if (name != NULL_TREE)
 enumtype = lookup_tag (ENUMERAL_TYPE, name, true, );
 
-  if (flag_isoc2x && enumtype != NULL_TREE
+  if ((flag_isoc2x || flag_tag_compat) && enumtype != NULL_TREE
   && TREE_CODE (enumtype) == ENUMERAL_TYPE
   && TYPE_VALUES (enumtype) != NULL_TREE)
 enumtype = NULL_TREE;
@@ -9941,7 +9941,7 @@ finish_enum (tree enumtype, tree values, tree attributes)
 struct_parse_info->struct_types.safe_push (enumtype);
 
   /* Check for consistency with previous definition */
-  if (flag_isoc2x)
+  if (flag_isoc2x || flag_tag_compat)
 {
   tree vistype = previous_tag (enumtype);
   if (vistype
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 357367eab09..b99f0c3e2fd 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -512,7 +512,7 @@ composite_type_internal (tree t1, tree t2, struct 
composite_cache* cache)
 
 case RECORD_TYPE:
 case UNION_TYPE:
-  if (flag_isoc2x && !comptypes_same_p (t1, t2))
+  if ((flag_isoc2x || flag_tag_compat) && !comptypes_same_p (t1, t2))
{
  gcc_checking_assert (COMPLETE_TYPE_P (t1) && COMPLETE_TYPE_P (t2));
  gcc_checking_assert (comptypes (t1, t2));
diff --git a/gcc/doc/invoke.texi 

[C PATCH 6/6] c23: construct composite type for tagged types

2023-08-26 Thread Martin Uecker via Gcc-patches



Support for constructing composite type for structs and unions
in C23.
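
For readers less familiar with the term: composite types already exist
for arrays and functions in earlier C versions, and this patch extends
the construction to structs and unions. A minimal sketch of the
long-standing array case follows (the function name is made up; this is
not one of the new tests). The conditional operator on two pointers to
compatible types yields a pointer to their composite type.

```c
#include <stddef.h>

/* Returns the element count of the array type that the result of the
   conditional operator points to.  */
static size_t composite_array_length(int pick_first)
{
    static int storage[3];
    int (*unknown_bound)[] = &storage;   /* pointer to int[]  */
    int (*known_bound)[3] = &storage;    /* pointer to int[3] */

    /* int[] and int[3] are compatible; their composite type is int[3],
       so the pointed-to type is complete and sizeof can be applied.  */
    return sizeof(*(pick_first ? unknown_bound : known_bound))
           / sizeof(int);
}
```

The result is 3 whichever operand is selected at run time, because the
result type is determined at compile time from both operands.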

gcc/c:
* c-typeck.cc (composite_type_internal): Adapted from
composite_type to support structs and unions.
(composite_type): New wrapper function.
(build_conditional_operator): Return composite type.

gcc/testsuite:
* gcc.dg/c2x-tag-composite-1.c: New test.
* gcc.dg/c2x-tag-composite-2.c: New test.
* gcc.dg/c2x-tag-composite-3.c: New test.
* gcc.dg/c2x-tag-composite-4.c: New test.
---
 gcc/c/c-typeck.cc  | 114 +
 gcc/testsuite/gcc.dg/c2x-tag-composite-1.c |  26 +
 gcc/testsuite/gcc.dg/c2x-tag-composite-2.c |  16 +++
 gcc/testsuite/gcc.dg/c2x-tag-composite-3.c |  17 +++
 gcc/testsuite/gcc.dg/c2x-tag-composite-4.c |  21 
 5 files changed, 176 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-composite-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-composite-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-composite-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-composite-4.c

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 2489fa1e3d1..357367eab09 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -381,8 +381,15 @@ build_functype_attribute_variant (tree ntype, tree otype, 
tree attrs)
nonzero; if that isn't so, this may crash.  In particular, we
assume that qualifiers match.  */
 
+struct composite_cache {
+  tree t1;
+  tree t2;
+  tree composite;
+  struct composite_cache* next;
+};
+
 tree
-composite_type (tree t1, tree t2)
+composite_type_internal (tree t1, tree t2, struct composite_cache* cache)
 {
   enum tree_code code1;
   enum tree_code code2;
@@ -425,7 +432,8 @@ composite_type (tree t1, tree t2)
   {
tree pointed_to_1 = TREE_TYPE (t1);
tree pointed_to_2 = TREE_TYPE (t2);
-   tree target = composite_type (pointed_to_1, pointed_to_2);
+   tree target = composite_type_internal (pointed_to_1,
+  pointed_to_2, cache);
 t1 = build_pointer_type_for_mode (target, TYPE_MODE (t1), false);
t1 = build_type_attribute_variant (t1, attributes);
return qualify_type (t1, t2);
@@ -433,7 +441,8 @@ composite_type (tree t1, tree t2)
 
 case ARRAY_TYPE:
   {
-   tree elt = composite_type (TREE_TYPE (t1), TREE_TYPE (t2));
+   tree elt = composite_type_internal (TREE_TYPE (t1), TREE_TYPE (t2),
+   cache);
int quals;
tree unqual_elt;
tree d1 = TYPE_DOMAIN (t1);
@@ -501,9 +510,61 @@ composite_type (tree t1, tree t2)
return build_type_attribute_variant (t1, attributes);
   }
 
-case ENUMERAL_TYPE:
 case RECORD_TYPE:
 case UNION_TYPE:
+  if (flag_isoc2x && !comptypes_same_p (t1, t2))
+   {
+ gcc_checking_assert (COMPLETE_TYPE_P (t1) && COMPLETE_TYPE_P (t2));
+ gcc_checking_assert (comptypes (t1, t2));
+
+ /* If a composite type for these two types is already under
+construction, return it.  */
+
+ for (struct composite_cache *c = cache; c != NULL; c = c->next)
+   if (c->t1 == t1 && c->t2 == t2)
+  return c->composite;
+
+ /* Otherwise, create a new type node and link it into the cache.  */
+
+ tree n = make_node (code1);
+ struct composite_cache cache2 = { t1, t2, n, cache };
+ cache = &cache2;
+
+ tree f1 = TYPE_FIELDS (t1);
+ tree f2 = TYPE_FIELDS (t2);
+ tree fields = NULL_TREE;
+
+ for (tree a = f1, b = f2; a && b;
+  a = DECL_CHAIN (a), b = DECL_CHAIN (b))
+   {
+ tree ta = TREE_TYPE (a);
+ tree tb = TREE_TYPE (b);
+
+ gcc_assert (DECL_NAME (a) == DECL_NAME (b));
+ gcc_assert (comptypes (ta, tb));
+
+ tree f = build_decl (input_location, FIELD_DECL, DECL_NAME (a),
+  composite_type_internal (ta, tb, cache));
+
+ DECL_FIELD_CONTEXT (f) = n;
+ DECL_CHAIN (f) = fields;
+ fields = f;
+   }
+
+ TYPE_NAME (n) = TYPE_NAME (t1);
+ TYPE_FIELDS (n) = nreverse (fields);
+ TYPE_ATTRIBUTES (n) = attributes;
+ layout_type (n);
+ n = build_type_attribute_variant (n, attributes);
+ n = qualify_type (n, t1);
+
+ gcc_checking_assert (comptypes (n, t1));
+ gcc_checking_assert (comptypes (n, t2));
+
+ return n;
+   }
+  /* FALLTHRU */
+case ENUMERAL_TYPE:
   if (attributes != NULL)
{
  /* Try harder not to create a new aggregate type.  */
@@ -518,7 +579,8 @@ composite_type (tree t1, tree t2)
   /* Function types: prefer the one that specified arg types.
 If both do, merge the arg types.  Also merge the return types.  */
   {
-   tree valtype = composite_type (TREE_TYPE (t1), TREE_TYPE 

[C PATCH 5/6] c23: aliasing of compatible tagged types

2023-08-26 Thread Martin Uecker via Gcc-patches



Tell the backend which types are equivalent by setting
TYPE_CANONICAL to one struct in the set of equivalent
structs. Structs are considered equivalent by ignoring
all sizes of arrays nested in types below field level.
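
The find-or-insert step can be sketched on a toy model. This is not the
patch's code: struct toy_struct, the linear table (in place of GCC's
hash table) and the tag-only equivalence test are simplifying
assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a struct type: a tag plus one nested array length.  */
struct toy_struct
{
  const char *tag;
  size_t nested_array_len;            /* Ignored by the equivalence test.  */
  const struct toy_struct *canonical; /* Set by toy_set_canonical.  */
};

#define TOY_MAX 16
static struct toy_struct *toy_table[TOY_MAX];
static size_t toy_count;

/* Toy equivalence: same tag, array sizes ignored.  */
static int
toy_equiv_p (const struct toy_struct *a, const struct toy_struct *b)
{
  return strcmp (a->tag, b->tag) == 0;
}

/* Find-or-insert: the first struct seen in each equivalence class
   becomes the canonical one for all later members of that class.  */
static const struct toy_struct *
toy_set_canonical (struct toy_struct *t)
{
  for (size_t i = 0; i < toy_count; i++)
    if (toy_equiv_p (toy_table[i], t))
      return t->canonical = toy_table[i];

  if (toy_count < TOY_MAX)
    toy_table[toy_count++] = t;
  return t->canonical = t;
}
```

Two toy structs with the same tag but different nested array lengths end
up sharing one canonical entry, which mirrors the intent of letting such
types alias each other.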

gcc/c:
* c-decl.cc (c_struct_hasher): Hash table for struct
types.
(c_struct_hasher::hash, c_struct_hasher::equal): New functions.
(finish_struct): Set TYPE_CANONICAL to first struct in
equivalence class.
* c-objc-common.cc (c_get_alias_set): Let structs or
unions with variable size alias anything.
* c-tree.h (comptypes_equiv): New prototype.
* c-typeck.cc (comptypes_equiv): New function.
(comptypes_internal): Implement equivalence mode.
(tagged_types_tu_compatible_p): Implement equivalence mode.

gcc/testsuite:
* gcc.dg/c2x-tag-2.c: Remove xfail.
* gcc.dg/c2x-tag-6.c: Remove xfail.
* gcc.dg/c2x-tag-alias-1.c: New test.
* gcc.dg/c2x-tag-alias-2.c: New test.
* gcc.dg/c2x-tag-alias-3.c: New test.
* gcc.dg/c2x-tag-alias-4.c: New test.
* gcc.dg/c2x-tag-alias-5.c: New test.
* gcc.dg/c2x-tag-alias-6.c: New test.
* gcc.dg/c2x-tag-alias-7.c: New test.
* gcc.dg/c2x-tag-alias-8.c: New test.
---
 gcc/c/c-decl.cc| 48 +
 gcc/c/c-objc-common.cc |  5 ++
 gcc/c/c-tree.h |  1 +
 gcc/c/c-typeck.cc  | 31 
 gcc/testsuite/gcc.dg/c2x-tag-2.c   |  2 +-
 gcc/testsuite/gcc.dg/c2x-tag-6.c   |  2 +-
 gcc/testsuite/gcc.dg/c2x-tag-alias-1.c | 48 +
 gcc/testsuite/gcc.dg/c2x-tag-alias-2.c | 73 +++
 gcc/testsuite/gcc.dg/c2x-tag-alias-3.c | 48 +
 gcc/testsuite/gcc.dg/c2x-tag-alias-4.c | 73 +++
 gcc/testsuite/gcc.dg/c2x-tag-alias-5.c | 30 
 gcc/testsuite/gcc.dg/c2x-tag-alias-6.c | 77 
 gcc/testsuite/gcc.dg/c2x-tag-alias-7.c | 98 ++
 gcc/testsuite/gcc.dg/c2x-tag-alias-8.c | 90 +++
 14 files changed, 624 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-4.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-5.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-7.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-alias-8.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index b514e8a35ee..2137ba8b845 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -603,6 +603,36 @@ public:
   auto_vec typedefs_seen;
 };
 
+
+/* Hash table for structs and unions.  */
+struct c_struct_hasher : ggc_ptr_hash<tree_node>
+{
+  static hashval_t hash (tree t);
+  static bool equal (tree, tree);
+};
+
+/* Hash a RECORD or UNION.  */
+hashval_t
+c_struct_hasher::hash (tree type)
+{
+  inchash::hash hstate;
+
+  hstate.add_int (TREE_CODE (type));
+  hstate.add_object (TYPE_NAME (type));
+
+  return hstate.end ();
+}
+
+/* Compare two RECORD or UNION types.  */
+bool
+c_struct_hasher::equal (tree t1,  tree t2)
+{
+  return comptypes_equiv_p (t1, t2);
+}
+
+/* All tagged types so that TYPE_CANONICAL can be set correctly.  */
+static GTY (()) hash_table<c_struct_hasher> *c_struct_htab;
+
 /* Information for the struct or union currently being parsed, or
NULL if not parsing a struct or union.  */
 static class c_struct_parse_info *struct_parse_info;
@@ -9503,6 +9533,24 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
 
   C_TYPE_BEING_DEFINED (t) = 0;
 
+  /* Set type canonical based on equivalence class.  */
+  if (flag_isoc2x)
+{
+  if (NULL == c_struct_htab)
+   c_struct_htab = hash_table<c_struct_hasher>::create_ggc (61);
+
+  hashval_t hash = c_struct_hasher::hash (t);
+
+  tree *e = c_struct_htab->find_slot_with_hash (t, hash, INSERT);
+  if (*e)
+   TYPE_CANONICAL (t) = *e;
+  else
+   {
+ TYPE_CANONICAL (t) = t;
+ *e = t;
+   }
+}
+
   tree incomplete_vars = C_TYPE_INCOMPLETE_VARS (TYPE_MAIN_VARIANT (t));
   for (x = TYPE_MAIN_VARIANT (t); x; x = TYPE_NEXT_VARIANT (x))
 {
diff --git a/gcc/c/c-objc-common.cc b/gcc/c/c-objc-common.cc
index e4aed61ed00..992225bbb29 100644
--- a/gcc/c/c-objc-common.cc
+++ b/gcc/c/c-objc-common.cc
@@ -389,6 +389,11 @@ c_get_alias_set (tree t)
   if (TREE_CODE (t) == ENUMERAL_TYPE)
 return get_alias_set (ENUM_UNDERLYING_TYPE (t));
 
+  /* Structs with variable size can alias different incompatible
+ structs.  Let them alias anything.   */
+  if (RECORD_OR_UNION_TYPE_P (t) && C_TYPE_VARIABLE_SIZE (t))
+return 0;
+
   return c_common_get_alias_set (t);
 }
 
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 511fd9ee0e5..1a8e8f072bd 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h

[C PATCH 4/6] c23: tag compatibility rules for enums

2023-08-26 Thread Martin Uecker via Gcc-patches



Allow redefinition of enum types and enumerators.
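
A minimal sketch of what this permits (not one of the new tests; the
enum and function names are made up). The redefinition is guarded by a
version check so that the file still compiles in pre-C23 modes. It
assumes that a compiler announcing C23 through __STDC_VERSION__ also
implements these rules.

```c
enum color { RED, GREEN, BLUE };

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
/* C23: same tag, same enumerators, same values, so this is accepted
   as a redefinition of the same type.  */
enum color { RED, GREEN, BLUE };

/* By contrast, a redefinition with different values conflicts with the
   first one and is expected to be diagnosed:
     enum color { RED = 1, GREEN, BLUE };  */
#endif

/* The enumerators keep their original values either way.  */
static int color_index(enum color c)
{
    return (int)c;
}
```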

gcc/c:
* c-decl.cc (start_enum): Allow redefinition.
(finish_enum): Diagnose conflicts.
(build_enumerator): Set context.
(diagnose_mismatched_decls): Diagnose conflicting enumerators.
(push_decl): Preserve context for enumerators.

gcc/testsuite/:
* gcc.dg/c2x-tag-enum-1.c: New test.
* gcc.dg/c2x-tag-enum-2.c: New test.
* gcc.dg/c2x-tag-enum-3.c: New test.
* gcc.dg/c2x-tag-enum-4.c: New test.
---
 gcc/c/c-decl.cc   | 47 --
 gcc/c/c-typeck.cc |  5 ++-
 gcc/testsuite/gcc.dg/c2x-tag-enum-1.c | 56 +++
 gcc/testsuite/gcc.dg/c2x-tag-enum-2.c | 23 +++
 gcc/testsuite/gcc.dg/c2x-tag-enum-3.c |  7 
 gcc/testsuite/gcc.dg/c2x-tag-enum-4.c | 22 +++
 6 files changed, 155 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-enum-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-enum-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-enum-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-enum-4.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index c5c6a853fa9..b514e8a35ee 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -2064,9 +2064,24 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl,
  given scope.  */
   if (TREE_CODE (olddecl) == CONST_DECL)
 {
-  auto_diagnostic_group d;
-  error ("redeclaration of enumerator %q+D", newdecl);
-  locate_old_decl (olddecl);
+  if (flag_isoc2x
+ && TYPE_NAME (DECL_CONTEXT (newdecl))
+ && DECL_CONTEXT (newdecl) != DECL_CONTEXT (olddecl)
+ && TYPE_NAME (DECL_CONTEXT (newdecl)) == TYPE_NAME (DECL_CONTEXT 
(olddecl)))
+   {
+ if (!simple_cst_equal (DECL_INITIAL (olddecl), DECL_INITIAL 
(newdecl)))
+   {
+ auto_diagnostic_group d;
+ error ("conflicting redeclaration of enumerator %q+D", newdecl);
+ locate_old_decl (olddecl);
+   }
+   }
+  else
+   {
+ auto_diagnostic_group d;
+ error ("redeclaration of enumerator %q+D", newdecl);
+ locate_old_decl (olddecl);
+   }
   return false;
 }
 
@@ -3227,8 +3242,11 @@ pushdecl (tree x)
 
   /* Must set DECL_CONTEXT for everything not at file scope or
  DECL_FILE_SCOPE_P won't work.  Local externs don't count
- unless they have initializers (which generate code).  */
+ unless they have initializers (which generate code).  We
+ also exclude CONST_DECLs because enumerators will get the
+ type of the enum as context.  */
   if (current_function_decl
+  && TREE_CODE (x) != CONST_DECL
   && (!VAR_OR_FUNCTION_DECL_P (x)
  || DECL_INITIAL (x) || !TREE_PUBLIC (x)))
 DECL_CONTEXT (x) = current_function_decl;
@@ -9606,9 +9624,15 @@ start_enum (location_t loc, struct c_enum_contents 
*the_enum, tree name,
   if (name != NULL_TREE)
 enumtype = lookup_tag (ENUMERAL_TYPE, name, true, );
 
+  if (flag_isoc2x && enumtype != NULL_TREE
+  && TREE_CODE (enumtype) == ENUMERAL_TYPE
+  && TYPE_VALUES (enumtype) != NULL_TREE)
+enumtype = NULL_TREE;
+
   if (enumtype == NULL_TREE || TREE_CODE (enumtype) != ENUMERAL_TYPE)
 {
   enumtype = make_node (ENUMERAL_TYPE);
+  TYPE_SIZE (enumtype) = NULL_TREE;
   pushtag (loc, name, enumtype);
   if (fixed_underlying_type != NULL_TREE)
{
@@ -9868,6 +9892,20 @@ finish_enum (tree enumtype, tree values, tree attributes)
   && !in_sizeof && !in_typeof && !in_alignof)
 struct_parse_info->struct_types.safe_push (enumtype);
 
+  /* Check for consistency with previous definition */
+  if (flag_isoc2x)
+{
+  tree vistype = previous_tag (enumtype);
+  if (vistype
+ && TREE_CODE (vistype) == TREE_CODE (enumtype)
+ && !C_TYPE_BEING_DEFINED (vistype))
+   {
+ TYPE_STUB_DECL (vistype) = TYPE_STUB_DECL (enumtype);
+ if (!comptypes_same_p (enumtype, vistype))
+   error("conflicting redefinition of enum %qT", enumtype);
+   }
+}
+
   C_TYPE_BEING_DEFINED (enumtype) = 0;
 
   return enumtype;
@@ -10047,6 +10085,7 @@ build_enumerator (location_t decl_loc, location_t loc,
 
   decl = build_decl (decl_loc, CONST_DECL, name, TREE_TYPE (value));
   DECL_INITIAL (decl) = value;
+  DECL_CONTEXT (decl) = the_enum->enum_type;
   pushdecl (decl);
 
   return tree_cons (decl, value, NULL_TREE);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 802c727d9d3..2b79cbba950 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -1396,6 +1396,9 @@ tagged_types_tu_compatible_p (const_tree t1, const_tree 
t2,
 {
 case ENUMERAL_TYPE:
   {
+   if (!comptypes (ENUM_UNDERLYING_TYPE (t1), ENUM_UNDERLYING_TYPE (t2)))
+ return false;
+
/* Speed up the case where the type values are in the same order.  */
tree tv1 = TYPE_VALUES (t1);
tree 

[C PATCH 3/6] c23: tag compatibility rules for struct and unions

2023-08-26 Thread Martin Uecker via Gcc-patches



Implement redeclaration and compatibility rules for
structures and unions in C23.
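
A minimal sketch of the redeclaration rule (not one of the c2x-tag-*.c
tests; struct span and span_sum() are made up). The second definition is
guarded by a version check so that the file still compiles in pre-C23
modes. It assumes that a compiler announcing C23 through
__STDC_VERSION__ also implements these rules.

```c
#include <stddef.h>

struct span { size_t len; const int *data; };

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
/* C23: a second definition with the same tag and the same members (in
   the same order, with the same types) redeclares the same type.  */
struct span { size_t len; const int *data; };
#endif

/* Works with either definition, since both denote one type.  */
static int span_sum(struct span s)
{
    int total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += s.data[i];
    return total;
}
```

This is what makes it practical to generate a struct definition from a
macro in several places without a shared typedef.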

gcc/c/:
* c-decl.cc (previous_tag): New function.
(get_parm_info): Turn off warning for C2X.
(start_struct): Allow redefinitions.
(finish_struct): Diagnose conflicts.
* c-tree.h (comptypes_same_p): Add prototype.
* c-typeck.cc (comptypes_same_p): New function.
(comptypes_internal): Activate comparison of tagged
types.
(convert_for_assignment): Ignore qualifiers.
(digest_init): Add error.
(initialized_elementwise_p): Allow compatible types.

gcc/testsuite/:
* gcc.dg/c2x-enum-7.c: Remove warning.
* gcc.dg/c2x-tag-1.c: New test.
* gcc.dg/c2x-tag-2.c: New test.
* gcc.dg/c2x-tag-3.c: New test.
* gcc.dg/c2x-tag-4.c: New test.
* gcc.dg/c2x-tag-5.c: New test.
* gcc.dg/c2x-tag-6.c: New test.
* gcc.dg/c2x-tag-7.c: New test.
* gcc.dg/c2x-tag-8.c: New test.
* gcc.dg/c2x-tag-9.c: New test.
* gcc.dg/c2x-tag-10.c: New test.
---
 gcc/c/c-decl.cc   | 56 ++---
 gcc/c/c-tree.h|  1 +
 gcc/c/c-typeck.cc | 38 +
 gcc/testsuite/gcc.dg/c2x-enum-7.c |  6 +--
 gcc/testsuite/gcc.dg/c2x-tag-1.c  | 68 +++
 gcc/testsuite/gcc.dg/c2x-tag-10.c | 31 ++
 gcc/testsuite/gcc.dg/c2x-tag-2.c  | 43 +++
 gcc/testsuite/gcc.dg/c2x-tag-3.c  | 16 
 gcc/testsuite/gcc.dg/c2x-tag-4.c  | 19 +
 gcc/testsuite/gcc.dg/c2x-tag-5.c  | 26 
 gcc/testsuite/gcc.dg/c2x-tag-6.c  | 34 
 gcc/testsuite/gcc.dg/c2x-tag-7.c  | 28 +
 gcc/testsuite/gcc.dg/c2x-tag-8.c  | 25 
 gcc/testsuite/gcc.dg/c2x-tag-9.c  | 12 ++
 14 files changed, 387 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-10.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-4.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-5.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-7.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-8.c
 create mode 100644 gcc/testsuite/gcc.dg/c2x-tag-9.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 1f9eb44dbaa..c5c6a853fa9 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -1993,6 +1993,24 @@ locate_old_decl (tree decl)
decl, TREE_TYPE (decl));
 }
 
+static tree
+previous_tag (tree type)
+{
+  struct c_binding *b = NULL;
+  tree name = TYPE_NAME (type);
+
+  if (name)
+b = I_TAG_BINDING (name);
+
+  if (b)
+b = b->shadowed;
+
+  if (b && B_IN_CURRENT_SCOPE (b))
+return b->decl;
+
+  return NULL_TREE;
+}
+
 /* Subroutine of duplicate_decls.  Compare NEWDECL to OLDDECL.
Returns true if the caller should proceed to merge the two, false
if OLDDECL should simply be discarded.  As a side effect, issues
@@ -8442,11 +8460,14 @@ get_parm_info (bool ellipsis, tree expr)
  if (TREE_CODE (decl) != UNION_TYPE || b->id != NULL_TREE)
{
  if (b->id)
-   /* The %s will be one of 'struct', 'union', or 'enum'.  */
-   warning_at (b->locus, 0,
-   "%<%s %E%> declared inside parameter list"
-   " will not be visible outside of this definition or"
-   " declaration", keyword, b->id);
+   {
+ /* The %s will be one of 'struct', 'union', or 'enum'.  */
+ if (!flag_isoc2x)
+   warning_at (b->locus, 0,
+   "%<%s %E%> declared inside parameter list"
+   " will not be visible outside of this 
definition or"
+   " declaration", keyword, b->id);
+   }
  else
/* The %s will be one of 'struct', 'union', or 'enum'.  */
warning_at (b->locus, 0,
@@ -8651,6 +8672,12 @@ start_struct (location_t loc, enum tree_code code, tree 
name,
 
   if (name != NULL_TREE)
 ref = lookup_tag (code, name, true, );
+
+  /* For C2X, even if we already have a completed definition,
+ we do not use it. We will check for consistency later.  */
+  if (flag_isoc2x && ref && TYPE_SIZE (ref))
+ref = NULL_TREE;
+
   if (ref && TREE_CODE (ref) == code)
 {
   if (TYPE_STUB_DECL (ref))
@@ -9439,6 +9466,25 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   warning_at (loc, 0, "union cannot be made transparent");
 }
 
+  /* Check for consistency with previous definition */
+  if (flag_isoc2x)
+{
+  tree vistype = previous_tag (t);
+  if (vistype
+ && TREE_CODE (vistype) == TREE_CODE (t)
+ && 

[C PATCH 2/6] c23: recursive type checking of tagged type

2023-08-26 Thread Martin Uecker via Gcc-patches




Adapt the old and unused code for type checking for C23.
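
The core difficulty in such a check is self-referential types. Here is
a sketch on a toy model; it is not the patch's code, and struct
toy_type, struct toy_seen and toy_tagged_compatible_p are hypothetical
names. A list of pairs currently under comparison is threaded through
the recursion on the stack, and a pair that is seen again is assumed
compatible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model: a struct type is a tag plus named fields; a field is either
   an int (pointee == NULL) or a pointer to another toy struct.  */
struct toy_type;

struct toy_field
{
  const char *name;
  const struct toy_type *pointee;
};

struct toy_type
{
  const char *tag;
  size_t nfields;
  const struct toy_field *fields;
};

/* Pairs currently being compared, linked through the call stack.  */
struct toy_seen
{
  const struct toy_seen *next;
  const struct toy_type *t1;
  const struct toy_type *t2;
};

static bool
toy_tagged_compatible_p (const struct toy_type *t1, const struct toy_type *t2,
                         const struct toy_seen *cache)
{
  if (t1 == t2)
    return true;

  if (strcmp (t1->tag, t2->tag) != 0 || t1->nfields != t2->nfields)
    return false;

  /* A pair that is already under comparison is assumed compatible;
     otherwise 'struct a { struct a *next; }' would recurse forever.  */
  for (const struct toy_seen *c = cache; c != NULL; c = c->next)
    if (c->t1 == t1 && c->t2 == t2)
      return true;

  const struct toy_seen entry = { cache, t1, t2 };

  for (size_t i = 0; i < t1->nfields; i++)
    {
      const struct toy_field *f1 = &t1->fields[i];
      const struct toy_field *f2 = &t2->fields[i];

      if (strcmp (f1->name, f2->name) != 0)
        return false;
      if ((f1->pointee == NULL) != (f2->pointee == NULL))
        return false;
      if (f1->pointee != NULL
          && !toy_tagged_compatible_p (f1->pointee, f2->pointee, &entry))
        return false;
    }

  return true;
}

/* Two separately built descriptions of 'struct a { struct a *next; }'.  */
static const struct toy_type toy_a1, toy_a2;
static const struct toy_field toy_a1_fields[] = { { "next", &toy_a1 } };
static const struct toy_field toy_a2_fields[] = { { "next", &toy_a2 } };
static const struct toy_type toy_a1 = { "a", 1, toy_a1_fields };
static const struct toy_type toy_a2 = { "a", 1, toy_a2_fields };
```

Because the list lives on the stack of the recursion, nothing has to be
allocated or freed, and no global state survives the top-level call.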

gcc/c/:
* c-typeck.cc (struct comptypes_data): Add anon_field flag.
(comptypes, comptypes_check_enum_int,
comptypes_check_different_types): Remove old cache.
(tagged_types_tu_compatible_p): Rewrite.
---
 gcc/c/c-typeck.cc | 261 +++---
 1 file changed, 58 insertions(+), 203 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index ed1520ed6ba..41ef05f005c 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -190,20 +190,14 @@ remove_c_maybe_const_expr (tree expr)
 return expr;
 }
 
-/* This is a cache to hold if two types are compatible or not.  */
+/* This is a cache to hold if two types are seen.  */
 
 struct tagged_tu_seen_cache {
   const struct tagged_tu_seen_cache * next;
   const_tree t1;
   const_tree t2;
-  /* The return value of tagged_types_tu_compatible_p if we had seen
- these two types already.  */
-  int val;
 };
 
-static const struct tagged_tu_seen_cache * tagged_tu_seen_base;
-static void free_all_tagged_tu_seen_up_to (const struct tagged_tu_seen_cache 
*);
-
 /* Do `exp = require_complete_type (loc, exp);' to make sure exp
does not have an incomplete type.  (That includes void types.)
LOC is the location of the use.  */
@@ -1043,10 +1037,12 @@ common_type (tree t1, tree t2)
 }
 
 struct comptypes_data {
-
   bool enum_and_int_p;
   bool different_types_p;
   bool warning_needed;
+  bool anon_field;
+
+  const struct tagged_tu_seen_cache* cache;
 };
 
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
@@ -1056,13 +1052,9 @@ struct comptypes_data {
 int
 comptypes (tree type1, tree type2)
 {
-  const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
-
   struct comptypes_data data = { };
   bool ret = comptypes_internal (type1, type2, &data);
 
-  free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
-
   return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
@@ -1072,14 +1064,10 @@ comptypes (tree type1, tree type2)
 int
 comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
 {
-  const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
-
   struct comptypes_data data = { };
   bool ret = comptypes_internal (type1, type2, &data);
   *enum_and_int_p = data.enum_and_int_p;
 
-  free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
-
   return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
@@ -1090,14 +1078,10 @@ int
 comptypes_check_different_types (tree type1, tree type2,
 bool *different_types_p)
 {
-  const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
-
   struct comptypes_data data = { };
   bool ret = comptypes_internal (type1, type2, &data);
   *different_types_p = data.different_types_p;
 
-  free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
-
   return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
@@ -1334,53 +1318,7 @@ comp_target_types (location_t location, tree ttl, tree 
ttr)
 
 /* Subroutines of `comptypes'.  */
 
-
-
-/* Allocate the seen two types, assuming that they are compatible. */
-
-static struct tagged_tu_seen_cache *
-alloc_tagged_tu_seen_cache (const_tree t1, const_tree t2)
-{
-  struct tagged_tu_seen_cache *tu = XNEW (struct tagged_tu_seen_cache);
-  tu->next = tagged_tu_seen_base;
-  tu->t1 = t1;
-  tu->t2 = t2;
-
-  tagged_tu_seen_base = tu;
-
-  /* The C standard says that two structures in different translation
- units are compatible with each other only if the types of their
- fields are compatible (among other things).  We assume that they
- are compatible until proven otherwise when building the cache.
- An example where this can occur is:
- struct a
- {
-   struct a *next;
- };
- If we are comparing this against a similar struct in another TU,
- and did not assume they were compatible, we end up with an infinite
- loop.  */
-  tu->val = 1;
-  return tu;
-}
-
-/* Free the seen types until we get to TU_TIL. */
-
-static void
-free_all_tagged_tu_seen_up_to (const struct tagged_tu_seen_cache *tu_til)
-{
-  const struct tagged_tu_seen_cache *tu = tagged_tu_seen_base;
-  while (tu != tu_til)
-{
-  const struct tagged_tu_seen_cache *const tu1
-   = (const struct tagged_tu_seen_cache *) tu;
-  tu = tu1->next;
-  XDELETE (CONST_CAST (struct tagged_tu_seen_cache *, tu1));
-}
-  tagged_tu_seen_base = tu_til;
-}
-
-/* Return 1 if two 'struct', 'union', or 'enum' types T1 and T2 are
+/* Return true if two 'struct', 'union', or 'enum' types T1 and T2 are
compatible.  If the two types are not the same (which has been
checked earlier).  */
 
@@ -1406,189 +1344,106 @@ tagged_types_tu_compatible_p (const_tree t1, 
const_tree t2,
 && DECL_ORIGINAL_TYPE (TYPE_NAME (t2)))
 t2 = DECL_ORIGINAL_TYPE (TYPE_NAME (t2));
 
-  /* C90 didn't have the requirement that the two tags be the same.  */
-  if 

[C PATCH 1/6] c: reorganize recursive type checking

2023-08-26 Thread Martin Uecker via Gcc-patches




Reorganize recursive type checking to use a structure to
store information collected during the recursion and
returned to the caller (warning_needed, enum_and_int_p,
different_types_p).
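
The reorganization described above can be sketched outside of GCC as follows. This is a toy model, not the code from c-typeck.cc: `toy_type`, `toy_comptypes_internal`, and the rule that mixing an enum with an int needs a warning are all hypothetical. They only illustrate how the flags travel through the recursion in one structure while the public wrapper keeps the old 0/1/2 return convention.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's tree nodes; not the real representation.  */
enum toy_kind { TOY_INT, TOY_ENUM, TOY_POINTER };

struct toy_type {
  enum toy_kind kind;
  const struct toy_type *target;   /* pointee, only used for TOY_POINTER */
};

/* Information collected during the recursion and handed back to the caller.  */
struct toy_comptypes_data {
  bool enum_and_int_p;
  bool different_types_p;
  bool warning_needed;
};

/* Recursive worker: returns only true/false, side results go into DATA.  */
static bool
toy_comptypes_internal (const struct toy_type *t1, const struct toy_type *t2,
                        struct toy_comptypes_data *data)
{
  if (t1 == t2)
    return true;

  if (t1->kind == TOY_POINTER && t2->kind == TOY_POINTER)
    return toy_comptypes_internal (t1->target, t2->target, data);

  if ((t1->kind == TOY_INT && t2->kind == TOY_ENUM)
      || (t1->kind == TOY_ENUM && t2->kind == TOY_INT))
    {
      /* Toy rule (an assumption for this sketch): compatible, but flagged.  */
      data->enum_and_int_p = true;
      data->different_types_p = true;
      data->warning_needed = true;
      return true;
    }

  return t1->kind == TOY_INT && t2->kind == TOY_INT;
}

/* Public wrapper: maps the bool plus warning_needed back to 0, 1 or 2.  */
static int
toy_comptypes (const struct toy_type *t1, const struct toy_type *t2)
{
  struct toy_comptypes_data data = { 0 };
  bool ret = toy_comptypes_internal (t1, t2, &data);
  return ret ? (data.warning_needed ? 2 : 1) : 0;
}
```

The wrapper is the only place that knows about the integer encoding, so adding another flag later only touches the structure and the code that sets it.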

gcc/c:
* c-typeck.cc (struct comptypes_data): Add structure.
(tagged_types_tu_compatible_p,
function_types_compatible_p, type_lists_compatible_p,
comptypes_internal): Add structure to interface, change
return type to bool, and adapt calls.
(comp_target_types): Change return type to bool.
(comptypes, comptypes_check_enum_int,
comptypes_check_different_types): Adapt calls.
---
 gcc/c/c-typeck.cc | 266 --
 1 file changed, 114 insertions(+), 152 deletions(-)

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index e6ddf37d412..ed1520ed6ba 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -90,12 +90,14 @@ static bool require_constant_elements;
 static bool require_constexpr_value;
 
 static tree qualify_type (tree, tree);
-static int tagged_types_tu_compatible_p (const_tree, const_tree, bool *,
-bool *);
-static int comp_target_types (location_t, tree, tree);
-static int function_types_compatible_p (const_tree, const_tree, bool *,
-   bool *);
-static int type_lists_compatible_p (const_tree, const_tree, bool *, bool *);
+struct comptypes_data;
+static bool tagged_types_tu_compatible_p (const_tree, const_tree,
+ struct comptypes_data *);
+static bool comp_target_types (location_t, tree, tree);
+static bool function_types_compatible_p (const_tree, const_tree,
+struct comptypes_data *);
+static bool type_lists_compatible_p (const_tree, const_tree,
+struct comptypes_data *);
 static tree lookup_field (tree, tree);
 static int convert_arguments (location_t, vec, tree,
  vec *, vec *, tree,
@@ -125,7 +127,8 @@ static tree find_init_member (tree, struct obstack *);
 static void readonly_warning (tree, enum lvalue_use);
 static int lvalue_or_else (location_t, const_tree, enum lvalue_use);
 static void record_maybe_used_decl (tree);
-static int comptypes_internal (const_tree, const_tree, bool *, bool *);
+static bool comptypes_internal (const_tree, const_tree,
+   struct comptypes_data *data);
 
 /* Return true if EXP is a null pointer constant, false otherwise.  */
 
@@ -1039,6 +1042,13 @@ common_type (tree t1, tree t2)
   return c_common_type (t1, t2);
 }
 
+struct comptypes_data {
+
+  bool enum_and_int_p;
+  bool different_types_p;
+  bool warning_needed;
+};
+
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
or various other operations.  Return 2 if they are compatible
but a warning may be needed if you use them together.  */
@@ -1047,12 +1057,13 @@ int
 comptypes (tree type1, tree type2)
 {
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
-  int val;
 
-  val = comptypes_internal (type1, type2, NULL, NULL);
+  struct comptypes_data data = { };
+  bool ret = comptypes_internal (type1, type2, &data);
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
-  return val;
+  return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
 /* Like comptypes, but if it returns non-zero because enum and int are
@@ -1062,12 +1073,14 @@ int
 comptypes_check_enum_int (tree type1, tree type2, bool *enum_and_int_p)
 {
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
-  int val;
 
-  val = comptypes_internal (type1, type2, enum_and_int_p, NULL);
+  struct comptypes_data data = { };
+  bool ret = comptypes_internal (type1, type2, &data);
+  *enum_and_int_p = data.enum_and_int_p;
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
-  return val;
+  return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
 /* Like comptypes, but if it returns nonzero for different types, it
@@ -1078,40 +1091,40 @@ comptypes_check_different_types (tree type1, tree type2,
 bool *different_types_p)
 {
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
-  int val;
 
-  val = comptypes_internal (type1, type2, NULL, different_types_p);
+  struct comptypes_data data = { };
+  bool ret = comptypes_internal (type1, type2, &data);
+  *different_types_p = data.different_types_p;
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
-  return val;
+  return ret ? (data.warning_needed ? 2 : 1) : 0;
 }
 
-/* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
-   or various other operations.  Return 2 if they are compatible
-   but a warning may be needed if you use them together.  If
-   ENUM_AND_INT_P is not NULL, and one type is an enum and the other a
-   compatible integer type, then this sets *ENUM_AND_INT_P to true;
-   *ENUM_AND_INT_P is 

c23 type compatibility rules, v2

2023-08-26 Thread Martin Uecker via Gcc-patches



This is a revised series for the C23 rules for type
compatibility.

1/6 c: reorganize recursive type checking
2/6 c23: recursive type checking of tagged type
3/6 c23: tag compatibility rules for struct and unions
4/6 c23: tag compatibility rules for enums
5/6 c23: aliasing of compatible tagged types
6/6 c23: construct composite type for tagged types
x/x c: flag for tag compatibility rules


1. simplifies type checking without functionality changes
as a preparation step. (This is based on a similar preparatory
patch I posted before for checking size expressions).

2. implements the new rules in comptypes for tagged types but 
the code still remains unused. This removes a lot of old
code because we now require union members to have the same
order and merges the code for structs and unions.

3. implements the rules for structs and unions.

4. does the same for enum types and enumerators. 

5. sets TYPE_CANONICAL based on an equivalence class of types
which makes aliasing work correctly. For this there is a new
comptypes_equiv_p that does relaxed checking (ignoring size
expressions in nested types but not for fields).

6. adds support for the composite type.

There is an extra patch that adds a flag to activate
the compatibility rules independently from language mode
and activates it by default.

1-2 should cause no change in function. 3-6 implement the
new semantics for C23.

Bootstrapped and regression tested on x86_64 (also with the
extra patch).

Martin







[committed] fix misleading indentation breaking bootstrap

2023-08-20 Thread Martin Uecker via Gcc-patches


Committed as obvious.


fix misleading indentation breaking bootstrap

Fix indentation issue introduced by 966f3c13
"Fix format attribute for printf".

gcc/c-family/ChangeLog:

* c-format.cc: Fix indentation.

diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index 122ff9bd1cd..b3ef2d44ce9 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -1214,8 +1214,8 @@ check_function_format (const_tree fn, tree attrs, int 
nargs,
skipped_default_format = true;
break;
  }
-   if (skipped_default_format)
- continue;
+  if (skipped_default_format)
+continue;
}
 
  if (warn_format)




c: Support for -Wuseless-cast [PR84510]

2023-08-10 Thread Martin Uecker via Gcc-patches



This patch adds the missing support for -Wuseless-cast
to the C FE as requested by some users. It found about 
50 useless casts in one of my projects without false 
positives.

(I also implemented a detection for various
unneeded pointer casts in convert_for_assignment
such as unneeded casts from / to void or casts
followed by an implicit conversion to the original
type, but I did not figure out how to reliably 
identify casts there... But this would be a potential
future enhancement.)
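
As a small illustration of what the warning is about (a sketch modelled on the cases in the test file of this patch, with hypothetical function names), the first helper casts an expression to the type it already has, while the other two casts perform a real conversion and are not expected to be reported:

```c
enum color { RED = 1 };
enum shape { CIRCLE = 1 };

/* Cast to the expression's own type: a candidate for -Wuseless-cast.  */
static int
same_type (int i)
{
  return (int) i;
}

/* unsigned long -> int is a real conversion, so the cast is not useless.  */
static int
from_unsigned_long (unsigned long v)
{
  return (int) v;
}

/* Two distinct enum types: the cast changes the type.  */
static enum shape
to_shape (enum color c)
{
  return (enum shape) c;
}
```

Compiling this with `-Wuseless-cast` on a GCC that includes the patch should, under these assumptions, flag only `same_type`.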


Bootstrapped and regression tested on x86_64-pc-linux-gnu.



c: Support for -Wuseless-cast [PR84510]

Add support for -Wuseless-cast to C (and ObjC).

PR c/84510

gcc/c/:
* c-typeck.cc (build_c_cast): Add warning.

gcc/doc/:
* invoke.texi: Update.

gcc/testsuite/:
* Wuseless-cast.c: New test.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0ed87fcc7be..c7b567ba7ab 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1490,7 +1490,7 @@ C++ ObjC++ Var(warn_zero_as_null_pointer_constant) Warning
 Warn when a literal '0' is used as null pointer.
 
 Wuseless-cast
-C++ ObjC++ Var(warn_useless_cast) Warning
+C ObjC C++ ObjC++ Var(warn_useless_cast) Warning
 Warn about useless casts.
 
 Wsubobject-linkage
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7cf411155c6..6f2fff51683 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -6062,9 +6062,13 @@ build_c_cast (location_t loc, tree type, tree expr)
 
   if (type == TYPE_MAIN_VARIANT (TREE_TYPE (value)))
 {
-  if (RECORD_OR_UNION_TYPE_P (type))
-   pedwarn (loc, OPT_Wpedantic,
-"ISO C forbids casting nonscalar to the same type");
+  if (RECORD_OR_UNION_TYPE_P (type)
+ && pedwarn (loc, OPT_Wpedantic,
+ "ISO C forbids casting nonscalar to the same type"))
+ ;
+  else if (warn_useless_cast)
+   warning_at (loc, OPT_Wuseless_cast,
+   "useless cast to type %qT", type);
 
   /* Convert to remove any qualifiers from VALUE's type.  */
   value = convert (type, value);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 674f956f4b8..75ca72f3190 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -4772,7 +4772,7 @@ pointers after reallocation.
 
 @opindex Wuseless-cast
 @opindex Wno-useless-cast
-@item -Wuseless-cast @r{(C++ and Objective-C++ only)}
+@item -Wuseless-cast @r{(C, Objective-C, C++ and Objective-C++ only)}
 Warn when an expression is cast to its own type.  This warning does not
 occur when a class object is converted to a non-reference type as that
 is a way to create a temporary:
diff --git a/gcc/testsuite/gcc.dg/Wuseless-cast.c 
b/gcc/testsuite/gcc.dg/Wuseless-cast.c
new file mode 100644
index 000..86e87584b87
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wuseless-cast.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-Wuseless-cast" } */
+
+void foo(void)
+{  
+   // casts to the same type
+   int i = 0;
+   const int ic = 0;
+   struct foo { int x; } x = { 0 };
+   int q[3];
+   (int)ic;/* { dg-warning "useless cast" } */
+   (int)i; /* { dg-warning "useless cast" } */
+   (const int)ic;  /* { dg-warning "useless cast" } */
+   (const int)i;   /* { dg-warning "useless cast" } */
+   (struct foo)x;  /* { dg-warning "useless cast" } */
+   (int(*)[3])&q;  /* { dg-warning "useless cast" } */
+   (_Atomic(int))i;/* { dg-warning "useless cast" } */
+
+   // not the same
+   int n = 3;
+   (int(*)[n])&q;  // no warning
+   int j = (int)0UL;
+   enum X { A = 1 } xx = { A };
+   enum Y { B = 1 } yy = (enum Y)xx;
+}
+




Re: [V2][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-08-08 Thread Martin Uecker via Gcc-patches



I am sure this has been discussed before, but seeing that you
test for a specific formula, let me point out the following:

There are at least three different size expressions that could
make sense. Consider

struct foo { int a; short b; char t[]; }; 

Most people seem to use

sizeof(struct foo) + N * sizeof(foo->t);

which for N == 3 yields 11 bytes on x86-64 because the formula
adds the padding of the original struct. There is an example
in the  C standard that uses this formula.


But the minimum size of an object which stores N elements is

max(sizeof (struct foo), offsetof(struct foo, t[N]))

which is 9 bytes. 

This is what clang uses for statically allocated objects with
initialization, while GCC uses the rule above (11 bytes). But 
bdos / bos  then returns the smaller size of 9 which is a bit
confusing.


https://godbolt.org/z/K1hvaK1ns

https://github.com/llvm/llvm-project/issues/62929
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956


Then there is also the size of a similar struct where the FAM
is replaced with an array of static size:

struct foo { int a; short b; char t[3]; }; 

This would make the most sense to me, but it has 12 bytes
because the padding is according to the usual alignment
rules.
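
The three candidates can be compared with a short sketch. The helper names and `struct foo_fixed3` are made up for this illustration, and the concrete numbers in the note below assume a typical ABI with a 4-byte `int`, a 2-byte `short`, and 4-byte struct alignment, as on x86-64.

```c
#include <stddef.h>

struct foo { int a; short b; char t[]; };

/* Hypothetical counterpart with an array of static size instead of the FAM.  */
struct foo_fixed3 { int a; short b; char t[3]; };

/* Common formula: struct size (including its tail padding) plus N elements.  */
static size_t
fam_size_padded (size_t n)
{
  return sizeof (struct foo) + n * sizeof (char);
}

/* Minimum size: the end of element N-1, but never less than the struct.  */
static size_t
fam_size_minimal (size_t n)
{
  size_t end = offsetof (struct foo, t) + n * sizeof (char);
  return end > sizeof (struct foo) ? end : sizeof (struct foo);
}
```

Under the stated ABI assumptions, `fam_size_padded (3)` is 11, `fam_size_minimal (3)` is 9, and `sizeof (struct foo_fixed3)` is 12, matching the numbers above.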


Martin



Am Montag, dem 07.08.2023 um 09:16 -0700 schrieb Kees Cook:
> On Fri, Aug 04, 2023 at 07:44:28PM +, Qing Zhao wrote:
> > This is the 2nd version of the patch, per our discussion based on the
> > review comments for the 1st version, the major changes in this version
> > are:
> 
> Thanks for the update!
> 
> > 
> > 1. change the name "element_count" to "counted_by";
> > 2. change the parameter for the attribute from a STRING to an
> > Identifier;
> > 3. Add logic and testing cases to handle anonymous structure/unions;
> > 4. Clarify documentation to permit the situation when the allocation
> > size is larger than what's specified by "counted_by", at the same time,
> > it's user's error if allocation size is smaller than what's specified by
> > "counted_by";
> > 5. Add a complete testing case for using counted_by attribute in
> > __builtin_dynamic_object_size when there is mismatch between the
> > allocation size and the value of "counted_by", the expecting behavior
> > for each case and the explanation on why in the comments. 
> 
> All the "normal" test cases I have are passing; this is wonderful! :)
> 
> I'm still seeing unexpected situations when I've intentionally set
> counted_by to be smaller than alloc_size, but I assume it's due to not
> yet having the patch you mention below.
> 
> > As discussed, I plan to add two more separate patch sets after this initial
> > patch set is approved and committed.
> > 
> > set 1. A new warning option and a new sanitizer option for the user error
> >    when the allocation size is smaller than the value of "counted_by".
> > set 2. An improvement to __builtin_dynamic_object_size  for the following
> >    case:
> > 
> > struct A
> > {
> > size_t foo;
> > int array[] __attribute__((counted_by (foo)));
> > };
> > 
> > extern struct fix * alloc_buf ();
> > 
> > int main ()
> > {
> > struct fix *p = alloc_buf ();
> > __builtin_object_size(p->array, 0) == sizeof(struct A) + p->foo * 
> > sizeof(int);
> >   /* with the current algorithm, it’s UNKNOWN */ 
> > __builtin_object_size(p->array, 2) == sizeof(struct A) + p->foo * 
> > sizeof(int);
> >   /* with the current algorithm, it’s UNKNOWN */
> > }
> 
> Should the above be bdos instead of bos?
> 
> > Bootstrapped and regression tested on both aarch64 and X86, no issue.
> 
> I've updated the Linux kernel's macros for the name change and done
> build tests with my first pass at "easy" cases for adding counted_by:
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/counted_by=adc5b3cb48a049563dc673f348eab7b6beba8a9b
> 
> Everything is working as expected. :)
> 
> -Kees
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging





[C PATCH] Support typename as selector in _Generic

2023-08-05 Thread Martin Uecker via Gcc-patches


Clang now has an extension which accepts a typename for
_Generic.  This is simple to implement and is useful.

Do we want this?

Clang calls it a "Clang extension" in the pedantic
warning.  I changed it to "an extension".  I am not
sure what the policy is.

Do we need an extra warning option? Clang has one.

No documentation so far.

Bootstrapped and regression tested on x86_64-pc-linux-gnu.

Martin


c: Support typename as selector in _Generic

Support typenames as first argument to _Generic which is an
extension supported by Clang. It makes it easier to test
for types with qualifiers in combination with typeof.
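
To show why a type operand helps with qualifiers, here is a sketch in standard C11 that does not use the extension itself (the function names are hypothetical). When the selector is an expression, it undergoes lvalue conversion and loses its top-level qualifiers, so a `const int` object selects the `int` association. Going through a pointer is one common way to keep the qualifier visible.

```c
/* The controlling expression is converted to an rvalue, so 'const' is
   dropped and the 'int' association is selected.  */
static int
qualifier_seen_by_value (void)
{
  const int ci = 0;
  return _Generic (ci, const int: 1, int: 0);
}

/* The pointee type of '&ci' keeps its qualifier, so the
   'const int *' association is selected.  */
static int
qualifier_seen_by_pointer (void)
{
  const int ci = 0;
  return _Generic (&ci, const int *: 1, int *: 0);
}
```

With the extension, `_Generic (const int, const int: 1, int: 0)` names the qualified type directly, as the new test in this patch does.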

gcc/c/:
* c-parser.cc (c_parser_generic_selection): Support typename
in _Generic selector.

gcc/testsuite/:
* gnu2x-generic.c: New test.


diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 57a01dc2fa3..9aea2425294 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -9312,30 +9312,51 @@ c_parser_generic_selection (c_parser *parser)
   if (!parens.require_open (parser))
 return error_expr;
 
-  c_inhibit_evaluation_warnings++;
   selector_loc = c_parser_peek_token (parser)->location;
-  selector = c_parser_expr_no_commas (parser, NULL);
-  selector = default_function_array_conversion (selector_loc, selector);
-  c_inhibit_evaluation_warnings--;
 
-  if (selector.value == error_mark_node)
+  if (c_token_starts_typename (c_parser_peek_token (parser)))
 {
-  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
-  return selector;
-}
-  mark_exp_read (selector.value);
-  selector_type = TREE_TYPE (selector.value);
-  /* In ISO C terms, rvalues (including the controlling expression of
- _Generic) do not have qualified types.  */
-  if (TREE_CODE (selector_type) != ARRAY_TYPE)
-selector_type = TYPE_MAIN_VARIANT (selector_type);
-  /* In ISO C terms, _Noreturn is not part of the type of expressions
- such as &abort, but in GCC it is represented internally as a type
- qualifier.  */
-  if (FUNCTION_POINTER_TYPE_P (selector_type)
-  && TYPE_QUALS (TREE_TYPE (selector_type)) != TYPE_UNQUALIFIED)
-selector_type
-  = build_pointer_type (TYPE_MAIN_VARIANT (TREE_TYPE (selector_type)));
+  /* Language extension introduced by Clang.  */
+  pedwarn (selector_loc, OPT_Wpedantic, "passing a type argument as "
+  "first argument to %<_Generic%> is an extension");
+  struct c_type_name *type_name;
+  c_inhibit_evaluation_warnings++;
+  type_name = c_parser_type_name (parser);
+  c_inhibit_evaluation_warnings--;
+  if (NULL == type_name)
+   {
+ c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
+ return error_expr;
+   }
+  /* Qualifiers are preserved.  */
+  selector_type = groktypename (type_name, NULL, NULL);
+}
+  else
+{
+  c_inhibit_evaluation_warnings++;
+  selector = c_parser_expr_no_commas (parser, NULL);
+  selector = default_function_array_conversion (selector_loc, selector);
+  c_inhibit_evaluation_warnings--;
+
+  if (selector.value == error_mark_node)
+   {
+ c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
+ return selector;
+   }
+  mark_exp_read (selector.value);
+  selector_type = TREE_TYPE (selector.value);
+  /* In ISO C terms, rvalues (including the controlling expression of
+_Generic) do not have qualified types.  */
+  if (TREE_CODE (selector_type) != ARRAY_TYPE)
+   selector_type = TYPE_MAIN_VARIANT (selector_type);
+  /* In ISO C terms, _Noreturn is not part of the type of expressions
+such as &abort, but in GCC it is represented internally as a type
+qualifier.  */
+  if (FUNCTION_POINTER_TYPE_P (selector_type)
+ && TYPE_QUALS (TREE_TYPE (selector_type)) != TYPE_UNQUALIFIED)
+  selector_type
+   = build_pointer_type (TYPE_MAIN_VARIANT (TREE_TYPE (selector_type)));
+}
 
   if (!c_parser_require (parser, CPP_COMMA, "expected %<,%>"))
 {
@@ -9401,7 +9422,7 @@ c_parser_generic_selection (c_parser *parser)
   assoc.expression = c_parser_expr_no_commas (parser, NULL);
 
   if (!match)
- c_inhibit_evaluation_warnings--;
+   c_inhibit_evaluation_warnings--;
 
   if (assoc.expression.value == error_mark_node)
{
diff --git a/gcc/testsuite/gcc.dg/gnu2x-generic.c 
b/gcc/testsuite/gcc.dg/gnu2x-generic.c
new file mode 100644
index 000..82b09578072
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu2x-generic.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+_Static_assert(_Generic(const int, const int: 1, int: 0), "");
+_Static_assert(_Generic(  int, const int: 0, int: 1), "");
+_Static_assert(_Generic(int[4], int[4]: 1), "");
+_Static_assert(_Generic(typeof(int[4]), int[4]: 1), "");
+
+void foo(int n)
+{
+   _Static_assert(_Generic(int[n++], int[4]: 1), "");
+}
+
+#pragma GCC diagnostic warning "-Wpedantic"

[committed] c: Less warnings for parameters declared as arrays [PR98536]

2023-08-05 Thread Martin Uecker via Gcc-patches


I split the patch into two parts and committed only
the FE parts which were already approved and the tests.
This solves one of the two issues.

Bootstrapped and regression tested on x86_64-pc-linux-gnu.


Less warnings for parameters declared as arrays [PR98536]

To avoid false positives, tune the warnings for parameters declared
as arrays with size expressions.  Do not warn when more bounds are
specified in the declaration than before.
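
A minimal sketch of the pattern that is no longer diagnosed, assuming the behaviour described above (the function `fill` is made up for this example): a prototype with an unspecified bound is followed by declarations that spell the bound out.

```c
/* Earlier declaration, e.g. in a header: bound left unspecified.  */
void fill (int n, char a[*]);

/* Later declaration specifies more bounds than before; with this change
   it is no longer expected to trigger -Wvla-parameter.  */
void fill (int n, char a[n]);

void
fill (int n, char a[n])
{
  for (int i = 0; i < n; i++)
    a[i] = 'x';
}
```

The opposite order, where a later declaration drops a bound that was specified before, still satisfies `newunspec > curunspec` in the patch and therefore continues to be diagnosed.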

PR c/98536

c-family/
* c-warn.cc (warn_parm_array_mismatch): Do not warn if more
bounds are specified.

gcc/testsuite:
* gcc.dg/Wvla-parameter-4.c: Adapt test.
* gcc.dg/attr-access-2.c: Adapt test.


diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index d4d62c48b20..b7c5d7c01a2 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -3599,23 +3599,13 @@ warn_parm_array_mismatch (location_t origloc, tree 
fndecl, tree newparms)
  continue;
}
 
- if (newunspec != curunspec)
+ if (newunspec > curunspec)
{
  location_t warnloc = newloc, noteloc = origloc;
  const char *warnparmstr = newparmstr.c_str ();
  const char *noteparmstr = curparmstr.c_str ();
  unsigned warnunspec = newunspec, noteunspec = curunspec;
 
- if (newunspec < curunspec)
-   {
- /* If the new declaration has fewer unspecified bounds
-point the warning to the previous declaration to make
-it clear that that's the one to change.  Otherwise,
-point it to the new decl.  */
- std::swap (warnloc, noteloc);
- std::swap (warnparmstr, noteparmstr);
- std::swap (warnunspec, noteunspec);
-   }
  if (warning_n (warnloc, OPT_Wvla_parameter, warnunspec,
 "argument %u of type %s declared with "
 "%u unspecified variable bound",
@@ -3643,14 +3633,10 @@ warn_parm_array_mismatch (location_t origloc, tree 
fndecl, tree newparms)
}
 
   /* Iterate over the lists of VLA variable bounds, comparing each
-pair for equality, and diagnosing mismatches.  The case of
-the lists having different lengths is handled above so at
-this point they do .  */
-  for (tree newvbl = newa->size, curvbl = cura->size; newvbl;
+pair for equality, and diagnosing mismatches.  */
+  for (tree newvbl = newa->size, curvbl = cura->size; newvbl && curvbl;
   newvbl = TREE_CHAIN (newvbl), curvbl = TREE_CHAIN (curvbl))
{
- gcc_assert (curvbl);
-
  tree newpos = TREE_PURPOSE (newvbl);
  tree curpos = TREE_PURPOSE (curvbl);
 
diff --git a/gcc/testsuite/gcc.dg/Wvla-parameter-4.c 
b/gcc/testsuite/gcc.dg/Wvla-parameter-4.c
index 599ad19a3e4..f35faea361a 100644
--- a/gcc/testsuite/gcc.dg/Wvla-parameter-4.c
+++ b/gcc/testsuite/gcc.dg/Wvla-parameter-4.c
@@ -12,11 +12,6 @@ typedef int IA3[3];
 /* Verify the warning points to the declaration with more unspecified
bounds, guiding the user to specify them rather than making them all
unspecified.  */
-void* f_pIA3ax (IA3 *x[*]); // { dg-warning "argument 1 of type 
'int \\\(\\\*\\\[\\\*]\\\)\\\[3]' .aka '\[^\n\r\}\]+'. declared with 1 
unspecified variable bound" }
-void* f_pIA3ax (IA3 *x[*]);
-void* f_pIA3ax (IA3 *x[n]); // { dg-message "subsequently declared 
as 'int \\\(\\\*\\\[n]\\\)\\\[3]' with 0 unspecified variable bounds" "note" }
-void* f_pIA3ax (IA3 *x[n]) { return x; }
-
 
 void* f_pIA3an (IA3 *x[n]);  // { dg-message "previously declared 
as 'int \\\(\\\*\\\[n]\\\)\\\[3]' with 0 unspecified variable bounds" "note" }
 void* f_pIA3an (IA3 *x[n]);
diff --git a/gcc/testsuite/gcc.dg/attr-access-2.c 
b/gcc/testsuite/gcc.dg/attr-access-2.c
index 76baddffc9f..616b7a9527c 100644
--- a/gcc/testsuite/gcc.dg/attr-access-2.c
+++ b/gcc/testsuite/gcc.dg/attr-access-2.c
@@ -60,16 +60,6 @@ RW (2, 1) void f10 (int n, char a[n])   // { dg-warning 
"attribute 'access *\\\(
 // { dg-warning "argument 2 of type 
'char\\\[n]' declared as a variable length array"  "" { target *-*-* } .-1 }
 { (void) (void) }
 
-
-/* The following is diagnosed to point out declarations with the T[*]
-   form in headers where specifying the bound is just as important as
-   in the definition (to detect bugs).  */
-  void f11 (int, char[*]);  // { dg-warning "argument 2 of type 
'char\\\[\\\*\\\]' declared with 1 unspecified variable bound" }
-  void f11 (int m, char a[m]);  // { dg-message "subsequently declared 
as 'char\\\[m]' with 0 unspecified variable bounds" "note" }
-RW (2, 1) void f11 (int n, char arr[n]) // { dg-message "subsequently declared 
as 'char\\\[n]' with 0 unspecified 

[C PATCH] _Generic should not warn in non-active branches [PR68193,PR97100]

2023-08-04 Thread Martin Uecker via Gcc-patches



Here is a patch to reduce false positives in _Generic.

Bootstrapped and regression tested on x86_64-linux.

Martin

c: _Generic should not warn in non-active branches [PR68193,PR97100]

To avoid false diagnostics, use c_inhibit_evaluation_warnings when
a generic association is known to match during parsing.  We may still
generate false positives if the default branch comes earlier than
a specific association that matches.
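
A typical source of such false positives is a type-generic macro where one association only makes sense for a wider type. The sketch below is an illustration with made-up names, not one of the PR test cases: for an `int` argument the `long long` association is never selected, yet its shift by 32 can be diagnosed as exceeding the width of `int` when warnings are not inhibited for that branch.

```c
/* Hypothetical type-generic macro: upper 32 bits for 'long long',
   always 0 for 'int'.  */
#define HIGH_WORD(x) \
  _Generic ((x), int: 0, long long: (long long) ((x) >> 32))

static long long
high_word_ll (long long v)
{
  return HIGH_WORD (v);
}

static long long
high_word_int (int v)
{
  /* Only the 'int' association is active here; the shift in the other
     association is parsed but never evaluated.  */
  return HIGH_WORD (v);
}
```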

PR c/68193
PR c/97100

gcc/c/:
* c-parser.cc (c_parser_generic_selection): Inhibit evaluation
warnings in branches that are known not to be taken during parsing.

gcc/testsuite/ChangeLog:
* gcc.dg/pr68193.c: New test.


diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 24a6eb6e459..d1863b301e0 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -9350,7 +9350,17 @@ c_parser_generic_selection (c_parser *parser)
  return error_expr;
}
 
+  bool match = assoc.type == NULL_TREE
+  || comptypes (assoc.type, selector_type);
+
+  if (!match)
+   c_inhibit_evaluation_warnings++;
+
   assoc.expression = c_parser_expr_no_commas (parser, NULL);
+
+  if (!match)
+ c_inhibit_evaluation_warnings--;
+
   if (assoc.expression.value == error_mark_node)
{
  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
@@ -9387,7 +9397,7 @@ c_parser_generic_selection (c_parser *parser)
  match_found = associations.length ();
}
}
-  else if (comptypes (assoc.type, selector_type))
+  else if (match)
{
  if (match_found < 0 || matched_assoc.type == NULL_TREE)
{
diff --git a/gcc/testsuite/gcc.dg/pr68193.c b/gcc/testsuite/gcc.dg/pr68193.c
new file mode 100644
index 000..2267593e363
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68193.c
@@ -0,0 +1,15 @@
+/*  pr69193 */
+/* { dg-do compile } */
+/* { dg-options "-Wall" } */
+
+int
+main (void)
+{
+  int i = 0;
+  int j = _Generic (i,
+   int: 0,
+   long int: (i = (long int) 9223372036854775808UL));
+  return i + j;
+}
+
+




Re: [C PATCH]: Add Walloc-type to warn about insufficient size in allocations

2023-08-02 Thread Martin Uecker via Gcc-patches
Am Mittwoch, dem 02.08.2023 um 16:45 + schrieb Qing Zhao:
> 
> > On Aug 1, 2023, at 10:31 AM, Martin Uecker  wrote:
> > 
> > Am Dienstag, dem 01.08.2023 um 13:27 + schrieb Qing Zhao:
> > > 
> > > > On Aug 1, 2023, at 3:51 AM, Martin Uecker via Gcc-patches 
> > > >  wrote:
> > > > 
> > 
> > 
> > > > > Hi Martin,
> > > > > Just wondering if it'd be a good idea perhaps to warn if alloc size is
> > > > > not a multiple of TYPE_SIZE_UNIT instead of just less-than ?
> > > > > So it can catch cases like:
> > > > > int *p = malloc (sizeof (int) + 2); // probably intended malloc
> > > > > (sizeof (int) * 2)
> > > > > 
> > > > > FWIW, this is caught using -fanalyzer:
> > > > > f.c: In function 'f':
> > > > > f.c:3:12: warning: allocated buffer size is not a multiple of the
> > > > > pointee's size [CWE-131] [-Wanalyzer-allocation-size]
> > > > >3 |   int *p = __builtin_malloc (sizeof(int) + 2);
> > > > >  |^~
> > > > > 
> > > > > Thanks,
> > > > > Prathamesh
> > > > 
> > > > Yes, this is probably a good idea.  It might need special
> > > > logic for flexible array members then...
> > > 
> > > Why special logic for FAM on such warning? (Not a multiple of 
> > > TYPE_SIZE_UNIT for the element).
> > > 
> > 
> > For
> > 
> > struct { int n; char buf[]; } *p = malloc(sizeof *p + n);
> > p->n = n;
> > 
> > the size would not be a multiple.
> 
> But n is still a multiple of sizeof (char), right? Do I miss anything here?

Right, for a struct with FAM we could check that it is
sizeof () plus a multiple of the element size of the FAM.
Still special logic... 
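
The check could look roughly like the following sketch. This is not code from the patch; `alloc_size_plausible` and its parameters are hypothetical, and the sizes would come from the type information inside the compiler rather than from run-time values.

```c
#include <stdbool.h>
#include <stddef.h>

struct cmsg { int n; char buf[]; };   /* FAM with 1-byte elements */
struct imsg { int n; int v[]; };      /* FAM with int elements */

/* Accept an allocation size if it covers the struct itself and the
   remainder is a whole number of FAM elements.  */
static bool
alloc_size_plausible (size_t alloc, size_t struct_size, size_t elem_size)
{
  if (alloc < struct_size)
    return false;
  return elem_size == 0 || (alloc - struct_size) % elem_size == 0;
}
```

For `struct cmsg` any size of at least `sizeof (struct cmsg)` passes, which is why `malloc (sizeof *p + n)` must not be reported, while for `struct imsg` a size such as `sizeof (struct imsg) + 6` would be flagged if `int` is 4 bytes wide.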

Martin


> Qing
> > 
> > Martin
> > 
> > 
> > 
> > 
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging




Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-08-02 Thread Martin Uecker via Gcc-patches
Am Dienstag, dem 01.08.2023 um 15:45 -0700 schrieb Kees Cook:
> On Mon, Jul 31, 2023 at 08:14:42PM +, Qing Zhao wrote:
> > /* In general, Due to type casting, the type for the pointee of a pointer
> >does not say anything about the object it points to,
> >So, __builtin_object_size can not directly use the type of the pointee
> >to decide the size of the object the pointer points to.
> > 
> >there are only two reliable ways:
> >A. observed allocations  (call to the allocation functions in the 
> > routine)
> >B. observed accesses (read or write access to the location of the 
> >  pointer points to)
> > 
> >that provide information about the type/existence of an object at
> >the corresponding address.
> > 
> >for A, we use the "alloc_size" attribute for the corresponding allocation
> >functions to determine the object size;
> > 
> >For B, we use the SIZE info of the TYPE attached to the corresponding 
> > access.
> >(We treat counted_by attribute as a complement to the SIZE info of the 
> > TYPE
> > for FMA) 
> > 
> >The only other way in C which ensures that a pointer actually points
> >to an object of the correct type is 'static':
> > 
> >void foo(struct P *p[static 1]);   
> > 
> >See https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624814.html
> >for more details.  */
> 
> This is a great explanation; thank you!
> 
> In the future I might want to have a new builtin that will allow
> a program to query a pointer when neither A nor B have happened. But
> for the first version of the __counted_by infrastructure, the above
> limitations seen fine.
> 
> For example, maybe __builtin_counted_size(p) (which returns sizeof(*p) +
> sizeof(*p->flex_array_member) * p->counted_by_member). Though since
> there might be multiple flex array members, maybe this can't work. :)

We had a _Lengthof proposal for arrays (instead of sizeof/sizeof)
and thought about how to extend this to structs with FAM. The
problem is that it can not rely on an attribute.

With GCC's VLA in structs you could do 

struct foo { int n; char buf[n_init]; } *p = malloc(sizeof *p);
p->n_init = n;

and get sizeof and bounds checking with UBSan
https://godbolt.org/z/d4nneqs3P

(but also compiler bugs and other issues)


Also see my experimental container library, where you can do:

vec_decl(int);
vec(int)* v = vec_alloc(int);

vec_push(&v, 1);
vec_push(&v, 3);

auto p = &vec_array(v);
(*p)[1] = 1; // bounds check

Here, "vec_array()" gives you a regular C array view
of the vector content with the correct dynamic size, so you
can apply "sizeof" and have bounds checking with UBSan, and
it just works (with Clang / GCC without changes).
https://github.com/uecker/noplate



Martin









Re: [C PATCH]: Add Walloc-type to warn about insufficient size in allocations

2023-08-01 Thread Martin Uecker via Gcc-patches
On Tuesday, 01.08.2023 at 13:27 +, Qing Zhao wrote:
> 
> > On Aug 1, 2023, at 3:51 AM, Martin Uecker via Gcc-patches 
> >  wrote:
> > 


> > > Hi Martin,
> > > Just wondering if it'd be a good idea perhaps to warn if alloc size is
> > > not a multiple of TYPE_SIZE_UNIT instead of just less-than ?
> > > So it can catch cases like:
> > > int *p = malloc (sizeof (int) + 2); // probably intended malloc
> > > (sizeof (int) * 2)
> > > 
> > > FWIW, this is caught using -fanalyzer:
> > > f.c: In function 'f':
> > > f.c:3:12: warning: allocated buffer size is not a multiple of the
> > > pointee's size [CWE-131] [-Wanalyzer-allocation-size]
> > >    3 |   int *p = __builtin_malloc (sizeof(int) + 2);
> > >  |^~
> > > 
> > > Thanks,
> > > Prathamesh
> > 
> > Yes, this is probably a good idea.  It might need special
> > logic for flexible array members then...
> 
> Why special logic for FAM on such warning? (Not a multiple of TYPE_SIZE_UNIT 
> for the element).
> 

For

struct { int n; char buf[]; } *p = malloc(sizeof *p + n);
p->n = n;

the size would not be a multiple.
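
To make this concrete, here is a minimal sketch. The struct tag "msg" and
the helper names are made up for illustration and are not part of the
patch. It shows that the usual allocation size for a struct with a
flexible array member is in general not a multiple of the struct size,
so a plain "not a multiple of TYPE_SIZE_UNIT" check would fire on
correct code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct with a flexible array member (FAM). */
struct msg { int n; char buf[]; };

/* Bytes needed for a struct msg carrying n payload bytes. */
size_t msg_alloc_size(size_t n)
{
    return sizeof(struct msg) + n;
}

/* Nonzero if a naive "size must be a multiple of the pointee size"
   check would complain about this (correct) allocation size. */
int multiple_check_would_warn(size_t n)
{
    return msg_alloc_size(n) % sizeof(struct msg) != 0;
}

/* The allocation pattern from the mail, wrapped in a function. */
struct msg *msg_new(int n)
{
    assert(n >= 0);
    struct msg *p = malloc(msg_alloc_size((size_t)n));
    if (p)
        p->n = n;
    return p;
}
```

Any such check would therefore have to treat pointee types ending in a
FAM specially, for example by only requiring at least sizeof of the
struct.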

Martin






Re: [C PATCH]: Add Walloc-type to warn about insufficient size in allocations

2023-08-01 Thread Martin Uecker via Gcc-patches
On Tuesday, 01.08.2023 at 02:11 +0530, Prathamesh Kulkarni wrote:
> On Fri, 21 Jul 2023 at 16:52, Martin Uecker via Gcc-patches
>  wrote:
> > 
> > 
> > 
> > This patch adds a warning for allocations with insufficient size
> > based on the "alloc_size" attribute and the type of the pointer
> > the result is assigned to. While it is theoretically legal to
> > assign to the wrong pointer type and cast it to the right type
> > later, this almost always indicates an error. Since this catches
> > common mistakes and is simple to diagnose, it is suggested to
> > add this warning.
> > 
> > 
> > Bootstrapped and regression tested on x86.
> > 
> > 
> > Martin
> > 
> > 
> > 
> > Add option Walloc-type that warns about allocations that have
> > insufficient storage for the target type of the pointer the
> > storage is assigned to.
> > 
> > gcc:
> > * doc/invoke.texi: Document -Walloc-type option.
> > 
> > gcc/c-family:
> > 
> > * c.opt (Walloc-type): New option.
> > 
> > gcc/c:
> > * c-typeck.cc (convert_for_assignment): Add Walloc-type warning.
> > 
> > gcc/testsuite:
> > 
> > * gcc.dg/Walloc-type-1.c: New test.
> > 
> > 
> > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> > index 4abdc8d0e77..8b9d148582b 100644
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -319,6 +319,10 @@ Walloca
> >  C ObjC C++ ObjC++ Var(warn_alloca) Warning
> >  Warn on any use of alloca.
> > 
> > +Walloc-type
> > +C ObjC Var(warn_alloc_type) Warning
> > +Warn when allocating insufficient storage for the target type of the
> > assigned pointer.
> > +
> >  Walloc-size-larger-than=
> >  C ObjC C++ LTO ObjC++ Var(warn_alloc_size_limit) Joined Host_Wide_Int
> > ByteSize Warning Init(HOST_WIDE_INT_MAX)
> >  -Walloc-size-larger-than=   Warn for calls to allocation
> > functions that
> > diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
> > index 7cf411155c6..2e392f9c952 100644
> > --- a/gcc/c/c-typeck.cc
> > +++ b/gcc/c/c-typeck.cc
> > @@ -7343,6 +7343,32 @@ convert_for_assignment (location_t location,
> > location_t expr_loc, tree type,
> > "request for implicit conversion "
> > "from %qT to %qT not permitted in C++", rhstype,
> > type);
> > 
> > +  /* Warn of new allocations are not big enough for the target
> > type.  */
> > +  tree fndecl;
> > +  if (warn_alloc_type
> > + && TREE_CODE (rhs) == CALL_EXPR
> > + && (fndecl = get_callee_fndecl (rhs)) != NULL_TREE
> > + && DECL_IS_MALLOC (fndecl))
> > +   {
> > + tree fntype = TREE_TYPE (fndecl);
> > + tree fntypeattrs = TYPE_ATTRIBUTES (fntype);
> > + tree alloc_size = lookup_attribute ("alloc_size",
> > fntypeattrs);
> > + if (alloc_size)
> > +   {
> > + tree args = TREE_VALUE (alloc_size);
> > + int idx = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1;
> > + /* For calloc only use the second argument.  */
> > + if (TREE_CHAIN (args))
> > +   idx = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN
> > (args))) - 1;
> > + tree arg = CALL_EXPR_ARG (rhs, idx);
> > + if (TREE_CODE (arg) == INTEGER_CST
> > + && tree_int_cst_lt (arg, TYPE_SIZE_UNIT (ttl)))
> Hi Martin,
> Just wondering if it'd be a good idea perhaps to warn if alloc size is
> not a multiple of TYPE_SIZE_UNIT instead of just less-than ?
> So it can catch cases like:
> int *p = malloc (sizeof (int) + 2); // probably intended malloc
> (sizeof (int) * 2)
> 
> FWIW, this is caught using -fanalyzer:
> f.c: In function 'f':
> f.c:3:12: warning: allocated buffer size is not a multiple of the
> pointee's size [CWE-131] [-Wanalyzer-allocation-size]
> 3 |   int *p = __builtin_malloc (sizeof(int) + 2);
>   |^~
> 
> Thanks,
> Prathamesh

Yes, this is probably a good idea.  It might need special
logic for flexible array members then...


Martin


> > +warning_at (location, OPT_Walloc_type, "allocation of
> > "
> > +"insufficient size %qE for type %qT with
> > "
> > +"size %qE", arg, ttl, TYPE_SIZE_UNIT
> > (ttl));
> > +   }
>

Re: [C PATCH]: Add Walloc-type to warn about insufficient size in allocations

2023-07-31 Thread Martin Uecker via Gcc-patches
On Monday, 31.07.2023 at 15:39 -0400, Siddhesh Poyarekar wrote:
> On 2023-07-21 07:21, Martin Uecker via Gcc-patches wrote:
> > 
> > 
> > This patch adds a warning for allocations with insufficient size
> > based on the "alloc_size" attribute and the type of the pointer
> > the result is assigned to. While it is theoretically legal to
> > assign to the wrong pointer type and cast it to the right type
> > later, this almost always indicates an error. Since this catches
> > common mistakes and is simple to diagnose, it is suggested to
> > add this warning.
> >   

...

> > 
> 
> Wouldn't this be much more useful in later phases with ranger feedback 
> like with the warn_access warnings?  That way the comparison won't be 
> limited to constant sizes.

Possibly. Having it in the FE made it simple to implement and
also reliable.  One thing I considered is also looking deeper
into the argument to detect obvious mistakes, e.g. whether the
type in a sizeof is the right one. Such extensions would be
easier in the FE.

But I wouldn't mind replacing or extending this with something
smarter emitted from later phases. I probably do not have time
to work on this myself in the near future though.

Martin




[PING 3] [PATCH] Less warnings for parameters declared as arrays [PR98541, PR98536]

2023-07-31 Thread Martin Uecker via Gcc-patches


Joseph, I would appreciate it if you could take a look at this.

This fixes the remaining issues that require me to turn the
warnings off with -Wno-vla-parameter and -Wno-nonnull in my
projects.

On Monday, 03.04.2023 at 21:34 +0200, Martin Uecker wrote:
> 
> With the relatively new warnings (11..) affecting VLA bounds,
> I now get a lot of false positives with -Wall. In general, I find
> the new warnings very useful, but they seem a bit too
> aggressive and some minor tweaks are needed, otherwise they are
> too noisy.  This patch suggests two changes:
> 
> 1. For VLA bounds non-null is implied only when 'static' is
> used (similar to clang) and not already when a bound > 0 is
> specified:
> 
> int foo(int n, char buf[static n]);
> 
> foo(10, 0); // warning with 'static' but not without.
> 
> 
> (It also seems problematic to require a size of 0 to indicate
> that the pointer may be null, because 0 is not allowed in
> ISO C as a size. It is also inconsistent with how arrays with
> a static bound behave.)
> 
> There seems to be agreement about this change in PR98541.
> 
> 
> 2. GCC always warns when the number of unspecified
> bounds is different between two declarations:
> 
> int foo(int n, char buf[*]);
> int foo(int n, char buf[n]);
> 
> or
> 
> int foo(int n, char buf[n]);
> int foo(int n, char buf[*]);
> 
> But the first version is useful if the size expression
> can not be specified in a header (e.g. because it uses
> a macro or variable not available there) and there is
> currently no easy way to avoid this.  The warning for
> both cases was by design,  but I suggest to limit the
> warning to the second case. 
> 
> Note that the logic currently applied by GCC is too
> simplistic anyway, as GCC does not warn for
> 
> int foo(int x, int y, double m[*][y]);
> int foo(int x, int y, double m[x][*]);
> 
> because the number of specified / unspecified bounds
> is the same.  So I suggest going with the attached
> patch now and adding more precise warnings later
> if there is more experience with these warnings
> in general and if this then still seems desirable.
> 
> 
> Martin
> 
> 
> Less warnings for parameters declared as arrays [PR98541, PR98536]
> 
> To avoid false positives, tune the warnings for parameters declared
> as arrays with size expressions.  Only warn about null arguments with
> 'static'.  Also do not warn when more bounds are specified in the new
> declaration than before.
> 
> PR c/98541
> PR c/98536
> 
> c-family/
> * c-warn.cc (warn_parm_array_mismatch): Do not warn if more
> bounds are specified.
> 
> gcc/
> * gimple-ssa-warn-access.cc
>   (pass_waccess::maybe_check_access_sizes): For VLA bounds
> in parameters, only warn about null pointers with 'static'.
> 
> gcc/testsuite:
> * gcc.dg/Wnonnull-4: Adapt test.
> * gcc.dg/Wstringop-overflow-40.c: Adapt test.
> * gcc.dg/Wvla-parameter-4.c: Adapt test.
> * gcc.dg/attr-access-2.c: Adapt test.
> 
> 
> diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
> index 9ac43a1af6e..f79fb876142 100644
> --- a/gcc/c-family/c-warn.cc
> +++ b/gcc/c-family/c-warn.cc
> @@ -3599,23 +3599,13 @@ warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
> continue;
>   }
>  
> -   if (newunspec != curunspec)
> +   if (newunspec > curunspec)
>   {
> location_t warnloc = newloc, noteloc = origloc;
> const char *warnparmstr = newparmstr.c_str ();
> const char *noteparmstr = curparmstr.c_str ();
> unsigned warnunspec = newunspec, noteunspec = curunspec;
>  
> -   if (newunspec < curunspec)
> - {
> -   /* If the new declaration has fewer unspecified bounds
> -  point the warning to the previous declaration to make
> -  it clear that that's the one to change.  Otherwise,
> -  point it to the new decl.  */
> -   std::swap (warnloc, noteloc);
> -   std::swap (warnparmstr, noteparmstr);
> -   std::swap (warnunspec, noteunspec);
> - }
> if (warning_n (warnloc, OPT_Wvla_parameter, warnunspec,
>"argument %u of type %s declared with "
>"%u unspecified variable bound",
> @@ -3641,16 +3631,11 @@ warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
> continue;
>   }
>   }
> -
>/* Iterate over the lists of VLA variable bounds, comparing each
> -  pair for equality, and diagnosing mismatches.  The case of
> -  the lists having different lengths is handled above so at
> -  this point they do .  */
> -  for (tree newvbl = newa->size, curvbl = cura->size; newvbl;
> +  pair for equality, and 

user branches

2023-07-29 Thread Martin Uecker via Gcc


Hi all,

is it still possible to have user branches in the repository?

If so, how do I create one?  Simply pushing to users/uecker/vla
or something is rejected.

Martin


[C PATCH]: Add Walloc-type to warn about insufficient size in allocations

2023-07-21 Thread Martin Uecker via Gcc-patches



This patch adds a warning for allocations with insufficient size
based on the "alloc_size" attribute and the type of the pointer 
the result is assigned to. While it is theoretically legal to
assign to the wrong pointer type and cast it to the right type
later, this almost always indicates an error. Since this catches
common mistakes and is simple to diagnose, it is suggested to
add this warning.
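
As an illustration of the kind of mistake this is meant to catch, here
is a minimal sketch. The helper names are made up for this example, and
the comparison function only mirrors the constant-size check in the
patch; it is not the compiler code itself. The classic bug is writing
"sizeof p" (size of the pointer) where "sizeof *p" (size of the pointee)
was intended.

```c
#include <assert.h>
#include <stddef.h>

struct b { int x[10]; };

/* Mirrors the comparison made by the warning: requested bytes versus
   the size of the type the result pointer points to. */
int allocation_too_small(size_t requested, size_t target_size)
{
    return requested < target_size;
}

/* What "malloc(sizeof p)" would request: the size of a pointer. */
size_t wrong_request(void)
{
    struct b *p = NULL;
    (void)p;
    return sizeof p;
}

/* What "malloc(sizeof *p)" requests: the size of struct b.
   sizeof does not evaluate its operand here, so p is never dereferenced. */
size_t right_request(void)
{
    struct b *p = NULL;
    (void)p;
    return sizeof *p;
}
```

With these helpers, allocation_too_small(wrong_request(), sizeof(struct b))
is true, while the same check with right_request() is false.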
 

Bootstrapped and regression tested on x86. 


Martin



Add option Walloc-type that warns about allocations that have
insufficient storage for the target type of the pointer the
storage is assigned to.

gcc:
* doc/invoke.texi: Document -Walloc-type option.

gcc/c-family:

* c.opt (Walloc-type): New option.

gcc/c:
* c-typeck.cc (convert_for_assignment): Add Walloc-type warning.

gcc/testsuite:

* gcc.dg/Walloc-type-1.c: New test.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 4abdc8d0e77..8b9d148582b 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -319,6 +319,10 @@ Walloca
 C ObjC C++ ObjC++ Var(warn_alloca) Warning
 Warn on any use of alloca.
 
+Walloc-type
+C ObjC Var(warn_alloc_type) Warning
+Warn when allocating insufficient storage for the target type of the assigned pointer.
+
 Walloc-size-larger-than=
 C ObjC C++ LTO ObjC++ Var(warn_alloc_size_limit) Joined Host_Wide_Int ByteSize Warning Init(HOST_WIDE_INT_MAX)
 -Walloc-size-larger-than=   Warn for calls to allocation functions that
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7cf411155c6..2e392f9c952 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -7343,6 +7343,32 @@ convert_for_assignment (location_t location, location_t expr_loc, tree type,
"request for implicit conversion "
"from %qT to %qT not permitted in C++", rhstype, type);
 
+  /* Warn of new allocations that are not big enough for the target type.  */
+  tree fndecl;
+  if (warn_alloc_type
+ && TREE_CODE (rhs) == CALL_EXPR
+ && (fndecl = get_callee_fndecl (rhs)) != NULL_TREE
+ && DECL_IS_MALLOC (fndecl))
+   {
+ tree fntype = TREE_TYPE (fndecl);
+ tree fntypeattrs = TYPE_ATTRIBUTES (fntype);
+ tree alloc_size = lookup_attribute ("alloc_size", fntypeattrs);
+ if (alloc_size)
+   {
+ tree args = TREE_VALUE (alloc_size);
+ int idx = TREE_INT_CST_LOW (TREE_VALUE (args)) - 1;
+ /* For calloc only use the second argument.  */
+ if (TREE_CHAIN (args))
+   idx = TREE_INT_CST_LOW (TREE_VALUE (TREE_CHAIN (args))) - 1;
+ tree arg = CALL_EXPR_ARG (rhs, idx);
+ if (TREE_CODE (arg) == INTEGER_CST
+ && tree_int_cst_lt (arg, TYPE_SIZE_UNIT (ttl)))
+warning_at (location, OPT_Walloc_type, "allocation of "
+"insufficient size %qE for type %qT with "
+"size %qE", arg, ttl, TYPE_SIZE_UNIT (ttl));
+   }
+   }
+
   /* See if the pointers point to incompatible address spaces.  */
   asl = TYPE_ADDR_SPACE (ttl);
   asr = TYPE_ADDR_SPACE (ttr);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 88e3c625030..6869bed64c3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8076,6 +8076,15 @@ always leads to a call to another @code{cold} function such as wrappers of
 C++ @code{throw} or fatal error reporting functions leading to @code{abort}.
 @end table
 
+@opindex Wno-alloc-type
+@opindex Walloc-type
+@item -Walloc-type
+Warn about calls to allocation functions decorated with attribute
+@code{alloc_size} that specify insufficient size for the target type of
+the pointer the result is assigned to, including those to the built-in
+forms of the functions @code{aligned_alloc}, @code{alloca}, @code{calloc},
+@code{malloc}, and @code{realloc}.
+
 @opindex Wno-alloc-zero
 @opindex Walloc-zero
 @item -Walloc-zero
diff --git a/gcc/testsuite/gcc.dg/Walloc-type-1.c b/gcc/testsuite/gcc.dg/Walloc-type-1.c
new file mode 100644
index 000..bc62e5e9aa3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Walloc-type-1.c
@@ -0,0 +1,37 @@
+/* Tests the warnings for insufficient allocation size. 
+   { dg-do compile }
+ * { dg-options "-Walloc-type" } 
+ * */
+#include 
+#include 
+
+struct b { int x[10]; };
+
+void fo0(void)
+{
+struct b *p = malloc(sizeof *p);
+}
+
+void fo1(void)
+{
+struct b *p = malloc(sizeof p); /* { dg-warning "allocation of insufficient size" } */
+}
+
+void fo2(void)
+{
+struct b *p = alloca(sizeof p); /* { dg-warning "allocation of insufficient size" } */
+}
+
+void fo3(void)
+{
+struct b *p = calloc(1, sizeof p); /* { dg-warning "allocation of insufficient size" } */
+}
+
+void g(struct b* p);
+
+void fo4(void)
+{
+g(malloc(4));  /* { dg-warning "allocation of insufficient size" } */
+}
+
+





[PING 2] [PATCH] Less warnings for parameters declared as arrays [PR98541, PR98536]

2023-07-19 Thread Martin Uecker via Gcc-patches



Ok for gcc-14 now?


On Tuesday, 04.04.2023 at 19:31 -0600, Jeff Law wrote:
> 
> 
> On 4/3/23 13:34, Martin Uecker via Gcc-patches wrote:
> > 
> > 
> > With the relatively new warnings (11..) affecting VLA bounds,
> > I now get a lot of false positives with -Wall. In general, I find
> > the new warnings very useful, but they seem a bit too
> > aggressive and some minor tweaks are needed, otherwise they are
> > too noisy.  This patch suggests two changes:
> > 
> > 1. For VLA bounds non-null is implied only when 'static' is
> > used (similar to clang) and not already when a bound > 0 is
> > specified:
> > 
> > int foo(int n, char buf[static n]);
> > 
> > foo(10, 0); // warning with 'static' but not without.
> > 
> > 
> > (It also seems problematic to require a size of 0 to indicate
> > that the pointer may be null, because 0 is not allowed in
> > ISO C as a size. It is also inconsistent with how arrays with
> > a static bound behave.)
> > 
> > There seems to be agreement about this change in PR98541.
> > 
> > 
> > 2. GCC always warns when the number of unspecified
> > bounds is different between two declarations:
> > 
> > int foo(int n, char buf[*]);
> > int foo(int n, char buf[n]);
> > 
> > or
> > 
> > int foo(int n, char buf[n]);
> > int foo(int n, char buf[*]);
> > 
> > But the first version is useful if the size expression
> > can not be specified in a header (e.g. because it uses
> > a macro or variable not available there) and there is
> > currently no easy way to avoid this.  The warning for
> > both cases was by design,  but I suggest to limit the
> > warning to the second case.
> > 
> > Note that the logic currently applied by GCC is too
> > simplistic anyway, as GCC does not warn for
> > 
> > int foo(int x, int y, double m[*][y]);
> > int foo(int x, int y, double m[x][*]);
> > 
> > because the number of specified / unspecified bounds
> > is the same.  So I suggest going with the attached
> > patch now and adding more precise warnings later
> > if there is more experience with these warnings
> > in general and if this then still seems desirable.
> > 
> > 
> > Martin
> > 
> > 
> >  Less warnings for parameters declared as arrays [PR98541, PR98536]
> >  
> >  To avoid false positives, tune the warnings for parameters declared
> >  as arrays with size expressions.  Only warn about null arguments with
> >  'static'.  Also do not warn when more bounds are specified in the new
> >  declaration than before.
> >  
> >  PR c/98541
> >  PR c/98536
> >  
> >  c-family/
> >  * c-warn.cc (warn_parm_array_mismatch): Do not warn if more
> >  bounds are specified.
> >  
> >  gcc/
> >  * gimple-ssa-warn-access.cc
> >    (pass_waccess::maybe_check_access_sizes): For VLA bounds
> >  in parameters, only warn about null pointers with 'static'.
> >  
> >  gcc/testsuite:
> >  * gcc.dg/Wnonnull-4: Adapt test.
> >  * gcc.dg/Wstringop-overflow-40.c: Adapt test.
> >  * gcc.dg/Wvla-parameter-4.c: Adapt test.
> >  * gcc.dg/attr-access-2.c: Adapt test.
> Neither appears to be a regression.  Seems like it should defer to
> gcc-14.
> jeff
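
For context, the two declaration styles discussed in the quoted patch
description can be sketched as follows. The function names are made up
for this illustration. The prototype uses [*] as it might appear in a
header where the bound expression is not available, the definition
names the bound and adds 'static', and the last function leaves out
'static' so that a null pointer remains a legitimate argument as far as
ISO C is concerned.

```c
#include <assert.h>
#include <stddef.h>

/* As it might appear in a header: the bound is left unspecified with [*]. */
int sum(int n, const int buf[*]);

/* Definition: the bound is named. 'static' promises that buf points to
   at least n elements, so passing a null pointer here is invalid. */
int sum(int n, const int buf[static n])
{
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += buf[i];
    return total;
}

/* Without 'static' the bound documents intent only; ISO C still allows
   callers to pass a null pointer, so the function checks for it. */
int sum_or_zero(int n, const int buf[n])
{
    if (buf == NULL)
        return 0;
    return sum(n, buf);
}
```

Both declarations of sum are compatible because the parameter is
adjusted to a pointer in each case; the warnings under discussion are
purely diagnostic.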




Re: [PATCH] core: Support heap-based trampolines

2023-07-19 Thread Martin Uecker via Gcc-patches
On Wednesday, 19.07.2023 at 15:23 +0100, Iain Sandoe wrote:
> Hi Martin,
> 
> > On 19 Jul 2023, at 11:43, Martin Uecker via Gcc-patches 
> >  wrote:
> > 
> > On Wednesday, 19.07.2023 at 10:29 +0100, Iain Sandoe wrote:
> 
> > > > On 19 Jul 2023, at 10:04, Martin Uecker 
> > > > wrote:
> > > 
> > > > > > On 17 Jul 2023, 
> > > > > 
> > > > 
> > > > > > > You mention setjmp/longjmp - on darwin and other platforms
> > > > > requiring
> > > > > > > non-stack based trampolines
> > > > > > > does the system runtime provide means to deal with this issue
> > > > > > > like
> > > > > an
> > > > > > > alternate allocation method
> > > > > > > or a way to register cleanup?
> > > > > > 
> > > > > > There is an alternate mechanism relying on system libraries
> > > > > > that is
> > > > > possible on darwin specifically (I don’t know for other targets)
> > > > > but
> > > > > it will only work for signed binaries, and would require us to
> > > > > codesign everything produced by gcc. During development, it was
> > > > > deemed too big an ask and the current strategy was chosen (Iain
> > > > > can
> > > > > surely add more background on that if needed).
> > > > > 
> > > > > I do not think that this solves the setjmp/longjmp issue -
> > > > > since
> > > > > there’s still a notional allocation that takes place (it’s just
> > > > > that
> > > > > the mechanism for determining permissions is different).
> > > > > 
> > > > > It is also a big barrier for the general user - and prevents
> > > > > normal
> > > > > folks from distributing GCC - since codesigning requires an
> > > > > external
> > > > > certificate (i.e. I would really rather avoid it).
> > > > > 
> > > > > > > Was there ever an attempt to provide a "generic" trampoline
> > > > > > > driven
> > > > > by
> > > > > > > a more complex descriptor?
> 
> > > > > > My own opinion is that executable stack should go away on all
> > > > > targets at some point, so a truly generic solution to the problem
> > > > > would be great.
> > > > > 
> > > > > indeed it would.
> > > 
> > > > I think we need a solution rather sooner than later on all archs.
> > > 
> > > AFAICS the  heap-based trampolines can work for any arch**, this
> > > issue is about
> > > system security policy, rather than arch, specifically?
> > > 
> > > It seems to me that for any system security policy that permits JIT,
> > > (but not
> > > executable stack) the heap-based trampolines are viable.
> > 
> > I agree. 
> > 
> > BTW; One option we discussed before, was to map a page with 
> > pre-allocated trampolines, which look up the address of
> > a callee and the static chain in a table based on its own
> > address. Then no code generation is involved.
> 
> That reads similar to the scheme Apple have implemented for libobjc and 
> libffi.
> In order to be extensible (i.e to allow the table to grow at runtime), it 
> means
> having some loadable executable object; if that is implemented in a way shared
> between users (delivered as part of the implementation) then, for Darwin at
> least, it must be codesigned - which is somewhere I really want to avoid going
> with GCC.  
> 
> > The difficult part is avoiding leaks with longjmp / setjmp.
> > One idea was to have a shadow stack consisting of the
> > pre-allocated trampolines, but this probably causes other
> > issues...
> 
> With a per-thread table, I *think* for most targets, we discussed in the team
> maintaining a ’tide mark’ of the stack as part of the saved data in the
> trampoline (not used as part of the execution, but only as part of the
> allocation management)… but ..
> 
> > I wonder how difficult it is to have longjmp / setjmp walk 
> > the stack in C?   This would also be useful for C++
> > interoperability and to free  heap-allocated VLAs.
> 
> … this would be a better solution (as we can see trampolines are a small
> leak c.f. the general uses)?
> 
> > As a user of nested functions, from my side it would also 
>

Re: [PATCH] core: Support heap-based trampolines

2023-07-19 Thread Martin Uecker via Gcc-patches
On Wednesday, 19.07.2023 at 10:29 +0100, Iain Sandoe wrote:
> Hi Martin,
> 
> > On 19 Jul 2023, at 10:04, Martin Uecker 
> > wrote:
> 
> > > > On 17 Jul 2023, 
> > > 
> > 
> > > > > You mention setjmp/longjmp - on darwin and other platforms
> > > requiring
> > > > > non-stack based trampolines
> > > > > does the system runtime provide means to deal with this issue
> > > > > like
> > > an
> > > > > alternate allocation method
> > > > > or a way to register cleanup?
> > > > 
> > > > There is an alternate mechanism relying on system libraries
> > > > that is
> > > possible on darwin specifically (I don’t know for other targets)
> > > but
> > > it will only work for signed binaries, and would require us to
> > > codesign everything produced by gcc. During development, it was
> > > deemed too big an ask and the current strategy was chosen (Iain
> > > can
> > > surely add more background on that if needed).
> > > 
> > > I do not think that this solves the setjmp/longjmp issue -
> > > since
> > > there’s still a notional allocation that takes place (it’s just
> > > that
> > > the mechanism for determining permissions is different).
> > > 
> > > It is also a big barrier for the general user - and prevents
> > > normal
> > > folks from distributing GCC - since codesigning requires an
> > > external
> > > certificate (i.e. I would really rather avoid it).
> > > 
> > > > > Was there ever an attempt to provide a "generic" trampoline
> > > > > driven
> > > by
> > > > > a more complex descriptor?
> > > 
> > > We did look at the “unused address bits” mechanism that Ada has
> > > used
> > > - but that is not really available to a non-private ABI (unless
> > > the
> > > system vendor agrees to change ABI to leave a bit spare) for the
> > > base
> > > arch either the bits are not there (e.g. X86) or reserved (e.g.
> > > AArch64).
> > > 
> > > Andrew Burgess did the original work he might have comments on
> > > alternatives we tried
> > > 
> > 
> > For reference, I proposed a patch for this in 2018. It was not
> > accepted because minimum alignment for functions would increase
> > for some archs:
> > 
> > https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg01532.html
> 
> Right - that was the one we originally looked at and has the issue
> that it breaks ABI - and thus would need vendor buy-in to alter as you say.
> 
> > > > > (well, it could be a bytecode interpreter and the trampoline
> > > > > being
> > > > > bytecode on the stack?!)
> > > > 
> > > > My own opinion is that executable stack should go away on all
> > > targets at some point, so a truly generic solution to the problem
> > > would be great.
> > > 
> > > indeed it would.
> 
> > I think we need a solution rather sooner than later on all archs.
> 
> AFAICS the  heap-based trampolines can work for any arch**, this
> issue is about
> system security policy, rather than arch, specifically?
> 
> It seems to me that for any system security policy that permits JIT,
> (but not
> executable stack) the heap-based trampolines are viable.

I agree. 

BTW; One option we discussed before, was to map a page with 
pre-allocated trampolines, which look up the address of
a callee and the static chain in a table based on its own
address. Then no code generation is involved.

The difficult part is avoiding leaks with longjmp / setjmp.
One idea was to have a shadow stack consisting of the
pre-allocated trampolines, but this probably causes other
issues...

I wonder how difficult it is to have longjmp / setjmp walk 
the stack in C?   This would also be useful for C++
interoperability and to free  heap-allocated VLAs.


As a user of nested functions, from my side it would also be
ok to simply add a wide function pointer type that contains
address + static chain.  This would require changing code,
but would also work with Clang's blocks and solve other
language interoperability problems, while avoiding all
existing ABI issues.
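
To illustrate the idea in plain C that works today, here is a rough
sketch of such a "wide" function pointer emulated with a struct. All
names (wide_fn, wide_call, counter, ...) are made up for this example;
a real language feature would of course look different. The point is
only that no trampoline, and hence no executable stack or heap, is
needed when the static chain travels next to the code address.

```c
#include <assert.h>

/* Hypothetical "wide" function pointer: code address plus static chain. */
struct wide_fn {
    int (*code)(void *chain, int arg);
    void *chain;
};

/* Call through a wide pointer by passing the chain explicitly. */
int wide_call(struct wide_fn f, int arg)
{
    assert(f.code != 0);
    return f.code(f.chain, arg);
}

/* State that a nested function would otherwise reach through the
   static chain of its enclosing frame. */
struct counter { int step; int total; };

int counter_add(void *chain, int arg)
{
    struct counter *c = chain;
    c->total += c->step * arg;
    return c->total;
}

/* Bundle the function with its environment. */
struct wide_fn make_counter_fn(struct counter *c)
{
    return (struct wide_fn){ .code = counter_add, .chain = c };
}
```

The cost is visible in the sketch as well: every API taking a callback
has to accept the wide type instead of a plain function pointer, which
is the "changing code" part.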

> 
> This seems to be a useful step forward; and we can add some other
> mechanism to the flag’s supported list if someone develops one?

I think it is a useful step forward.

Martin


> 
> Iain
> 
> ** modulo the target maintainers implementing the builtins.
> 
> 
> 



Re: [PATCH] core: Support heap-based trampolines

2023-07-19 Thread Martin Uecker via Gcc-patches



> 
> > On 17 Jul 2023, 
> 

> >> You mention setjmp/longjmp - on darwin and other platforms
> requiring
> >> non-stack based trampolines
> >> does the system runtime provide means to deal with this issue like
> an
> >> alternate allocation method
> >> or a way to register cleanup?
> > 
> > There is an alternate mechanism relying on system libraries that is
> possible on darwin specifically (I don’t know for other targets) but
> it will only work for signed binaries, and would require us to
> codesign everything produced by gcc. During development, it was
> deemed too big an ask and the current strategy was chosen (Iain can
> surely add more background on that if needed).
> 
> I do not think that this solves the setjmp/longjmp issue - since
> there’s still a notional allocation that takes place (it’s just that
> the mechanism for determining permissions is different).
> 
> It is also a big barrier for the general user - and prevents normal
> folks from distributing GCC - since codesigning requires an external
> certificate (i.e. I would really rather avoid it).
> 
> >> Was there ever an attempt to provide a "generic" trampoline driven
> by
> >> a more complex descriptor?
> 
> We did look at the “unused address bits” mechanism that Ada has used
> - but that is not really available to a non-private ABI (unless the
> system vendor agrees to change ABI to leave a bit spare) for the base
> arch either the bits are not there (e.g. X86) or reserved (e.g.
> AArch64).
> 
> Andrew Burgess did the original work he might have comments on
> alternatives we tried
> 

For reference, I proposed a patch for this in 2018. It was not
accepted because minimum alignment for functions would increase
for some archs:

https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg01532.html



> >> (well, it could be a bytecode interpreter and the trampoline being
> >> bytecode on the stack?!)
> > 
> > My own opinion is that executable stack should go away on all
> targets at some point, so a truly generic solution to the problem
> would be great.
> 
> indeed it would.
> 

I think we need a solution sooner rather than later on all archs.

Martin

> > Having something that works reliably across all targets, like you
> suggest, is a much bigger project than this patch, and I am not aware
> of any previous attempt at it.
> 
> The bytecode interpreter idea is neat;  (a) I wonder about
> performance and (b) it is, as FX says, a bigger project - certainly
> bigger than the voluntary Darwin time available :(
> 
> Iain
> 
> > 
> > 
> >> Otherwise I suggest to split the patch into libgcc, generic and
> target parts.
> > 
> > 
> 




Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-07-19 Thread Martin Uecker via Gcc-patches
On Monday, 17.07.2023 at 16:40 -0700, Kees Cook wrote:
> On Mon, Jul 17, 2023 at 09:17:48PM +, Qing Zhao wrote:
> > 
> > > On Jul 13, 2023, at 4:31 PM, Kees Cook 
> > > wrote:
> > > 
> > > In the bug, the problem is that "p" isn't known to be allocated,
> > > if I'm
> > > reading that correctly?
> > 
> > I think that the major point in PR109557
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> > 
> > for the following pointer p.3_1, 
> > 
> > p.3_1 = p;
> > _2 = __builtin_object_size (p.3_1, 0);
> > 
> > Question: why the size of p.3_1 cannot use the TYPE_SIZE of the
> > pointee of p when the TYPE_SIZE can be determined at compile time?
> > 
> > Answer:  From just knowing the type of the pointee of p, the
> > compiler cannot determine the size of the object.  
> 
> Why is that? "p" points to "struct P", which has a fixed size. There
> must be an assumption somewhere that a pointer is allocated,
> otherwise
> __bos would almost never work?

It often does not work, because it relies on the optimizer
propagating the information instead of the type system.

This is why it would be better to have proper *bounds* checks,
and not just object size checks. It is not quite clear to me
how BOS and bounds checking are supposed to work together,
but FAMs should be bounds checked.

...

> 
> > 
> > > This may be
> > > desirable in a few situations. One example would be a large
> > > allocation
> > > that is slowly filled up by the program.
> > 
> > So, for such situation, whenever the allocation is filled up, the
> > field that hold the “counted_by” attribute should be increased at
> > the same time,
> > Then, the “counted_by” value always sync with the real allocation. 
> > > I.e. the counted_by member is
> > > slowly increased during runtime (but not beyond the true
> > > allocation size).
> > 
> > Then there should be source code to increase the “counted_by” field
> > whenever the allocated space increased too. 
> > > 
> > > Of course allocation size is only available in limited
> > > situations, so
> > > the loss of that info is fine: we have counted_by for everything
> > > else.
> > 
> > The point is: allocation size should synced with the value of
> > “counted_by”. LLVM’s RFC also have the similar requirement:
> > https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18
> 
> Right, I'm saying it would be nice if __alloc_size was checked as
> well,
> in the sense that if it is available, it knows without question what
> the
> size of the allocation is. If __alloc_size and __counted_by conflict,
> the smaller of the two should be the truth.
> 
> But, as I said, if there is some need to explicitly ignore
> __alloc_size
> when __counted_by is present, I can live with it; we just need to
> document it.
> 
> If the RFC and you agree that the __counted_by variable can only ever
> be
> (re)assigned after the flex array has been (re)allocated, then I
> guess
> we'll see how it goes. :) I think most places in the kernel using
> __counted_by will be fine, but I suspect we may have cases where we
> need
> to update it like in the loop I described above. If that's true, we
> can
> revisit the requirement then. :)

It should be the other way round: You should first set
'count' and then reassign the pointer, because you can then
often check the pointer assignment (reading 'count').  The
other way round this works only sometimes, i.e. if both
assignments are close together and the optimizer can see this.
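
The ordering can be sketched with a hand-written check. This is only a
minimal sketch under assumptions: 'struct buf', 'buf_set' and the explicit
'avail' parameter are hypothetical and stand in for what a compiler or
sanitizer might verify; they are not part of the proposed attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure: 'data' is meant to point to 'count' elements. */
struct buf {
    size_t count;
    int *data;
};

/* Set the count first, then assign the pointer.  Because 'count' is
   already up to date, the pointer assignment can be checked by reading it.
   'avail' is the number of elements behind 'p'; here the caller supplies
   it, which is an assumption of this sketch.
   Returns 0 on success, -1 if the new storage is too small. */
int buf_set(struct buf *b, size_t count, int *p, size_t avail)
{
    size_t old = b->count;

    b->count = count;            /* step 1: update the count */
    if (avail < b->count) {      /* step 2: check the assignment against it */
        b->count = old;          /* keep the old, consistent state */
        return -1;
    }
    b->data = p;
    return 0;
}
```

With the opposite order, the check at the pointer assignment would read a
stale 'count' and could only be deferred until the count is written.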



Martin






Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-07-18 Thread Martin Uecker via Gcc-patches
On Tuesday, 18.07.2023 at 16:25 +, Qing Zhao wrote:
> 
> 
> > On Jul 18, 2023, at 12:03 PM, Martin Uecker 
> > wrote:
> > 
> > On Tuesday, 18.07.2023 at 15:37 +, Qing Zhao wrote:
> > > 
> > > 
> > > > On Jul 17, 2023, at 7:40 PM, Kees Cook 
> > > > wrote:
> > > > 
> > > > On Mon, Jul 17, 2023 at 09:17:48PM +, Qing Zhao wrote:
> > > > > 
> > > > > > On Jul 13, 2023, at 4:31 PM, Kees Cook
> > > > > > 
> > > > > > wrote:
> > > > > > 
> > > > > > In the bug, the problem is that "p" isn't known to be
> > > > > > allocated, if I'm
> > > > > > reading that correctly?
> > > > > 
> > > > > I think that the major point in PR109557
> > > > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> > > > > 
> > > > > for the following pointer p.3_1, 
> > > > > 
> > > > > p.3_1 = p;
> > > > > _2 = __builtin_object_size (p.3_1, 0);
> > > > > 
> > > > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of
> > > > > the
> > > > > pointee of p when the TYPE_SIZE can be determined at compile
> > > > > time?
> > > > > 
> > > > > Answer:  From just knowing the type of the pointee of p, the
> > > > > compiler cannot determine the size of the object.  
> > > > 
> > > > Why is that? "p" points to "struct P", which has a fixed size.
> > > > There
> > > > must be an assumption somewhere that a pointer is allocated,
> > > > otherwise
> > > > __bos would almost never work?
> > > 
> > > My understanding from the comments in PR109557 was: 
> > > 
> > > In general the pointer could point to the first object of an
> > > array
> > > that has more elements, or to an object of a different type. 
> > > Without seeing the real allocation to the pointer, the compiler
> > > cannot assume that the pointer point to an object that has
> > > the exact same type as its declaration. 
> > > 
> > > > Sid and Martin, is the above understanding correct?
> > 
> > Yes. 
> > 
> > In the example, it *could* work because the compiler
> > could inline 'store' or otherwise use its knowledge about
> > the function definition to conclude that 'p' points to
> > an object of size 'sizeof (struct P)'. But this is fragile
> > because it relies on optimization and will not work across
> > TUs.
> > 
> > > Honestly, I am still not 100% clear on this yet.
> > 
> > > Jakub, Sid and Martin, could  you please explain a little bit
> > > more
> > > here, especially for the following small example?
> > > 
> > > [opc@qinzhao-ol8u3-x86 109557]$ cat t.c
> > > > #include <assert.h>
> > > > #include <stdlib.h>
> > > struct P {
> > >   int k;
> > >   int x[10]; 
> > > } *p;
> > > 
> > > void store(int a, int b) 
> > > {
> > >   p = (struct P *)malloc (sizeof (struct P));
> > >   p->k = a;
> > >   p->x[b] = 0;
> > >   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> > > (int[10]));
> > >   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct
> > > P));
> > >   return;
> > > }
> > > 
> > > int main()
> > > {
> > >   store(7, 7);
> > >   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> > > (int[10]));
> > >   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct
> > > P));
> > >   free (p);
> > > }
> > > [opc@qinzhao-ol8u3-x86 109557]$ sh t
> > > /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c
> > > > a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x,
> > > > 1) == sizeof (int[10])' failed.
> > > t: line 19: 859944 Aborted (core dumped) ./a.out
> > > 
> > > 
> > > In the above, among the 4 assertions, only the last one failed.
> > 
> > I don't know why this is the case. 
> > 
> > > 
> > > > Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as
> > > > the size of the object, 
> > 
> > I do not think it can do this in general. Is this how it 
> > is implemented?
> 
> No. -:)
> 
>  I guess that the implementation of this should base on your
> following case,  “observed accesses”:
> Although I am not 100% sure on the definition of “observed accesses”.
> 
> p->x  is an access to the field of the object, so compiler can assume
> that the object exist and have
> the type associate with this access?
> 

Only if the access happens at run-time, but the argument to BDOS is
not evaluated, so there is no access. At least, this is my guess based 
on C's semantics. 
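
As an analogy only (this says nothing about how GCC implements BDOS), the
same principle can be seen with sizeof, whose operand is likewise not
evaluated when no variable length array is involved. The function name
below is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct P {
    int k;
    int x[10];
};

/* sizeof does not evaluate its operand here, so 'p->x' is never accessed
   at run time, even though 'p' is a null pointer.  Only the type of the
   expression is used. */
size_t size_without_access(void)
{
    struct P *p = NULL;
    return sizeof(p->x);
}
```

Because nothing is accessed, the expression cannot serve as evidence that
an object of that type exists at the address.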


> On the other hand, “p” is just a plain pointer, no observed access.
>
> > > This would then seem incorrect to me.  
> > 
> > > but cannot use the TYPE_SIZE of the pointee of the pointer “p” as
> > > the
> > > size of the object? 
> > 
> > In general, the type of a pointer does not say anything about the
> > object it points to, because you could cast the pointer to a
> > different
> > type, pass it around, and then cast it back before use. 
> 
> Okay, I see.
> > 
> > Only observed allocations or observed accesses provide information
> > about the type / existence of an object at the corresponding
> > address.
> 
> What will be included in “observed accesses”?

Any read or write access can be used to determine that there must

Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-07-18 Thread Martin Uecker via Gcc-patches
On Tuesday, 18.07.2023 at 15:37 +, Qing Zhao wrote:
> 
> 
> > On Jul 17, 2023, at 7:40 PM, Kees Cook 
> > wrote:
> > 
> > On Mon, Jul 17, 2023 at 09:17:48PM +, Qing Zhao wrote:
> > > 
> > > > On Jul 13, 2023, at 4:31 PM, Kees Cook 
> > > > wrote:
> > > > 
> > > > In the bug, the problem is that "p" isn't known to be
> > > > allocated, if I'm
> > > > reading that correctly?
> > > 
> > > I think that the major point in PR109557
> > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557):
> > > 
> > > for the following pointer p.3_1, 
> > > 
> > > p.3_1 = p;
> > > _2 = __builtin_object_size (p.3_1, 0);
> > > 
> > > Question: why the size of p.3_1 cannot use the TYPE_SIZE of the
> > > pointee of p when the TYPE_SIZE can be determined at compile
> > > time?
> > > 
> > > Answer:  From just knowing the type of the pointee of p, the
> > > compiler cannot determine the size of the object.  
> > 
> > Why is that? "p" points to "struct P", which has a fixed size.
> > There
> > must be an assumption somewhere that a pointer is allocated,
> > otherwise
> > __bos would almost never work?
> 
> My understanding from the comments in PR109557 was: 
> 
> In general the pointer could point to the first object of an array
> that has more elements, or to an object of a different type. 
> Without seeing the real allocation to the pointer, the compiler
> cannot assume that the pointer point to an object that has
> the exact same type as its declaration. 
> 
> Sid and Martin, is the above understanding correct?

Yes. 

In the example, it *could* work because the compiler
could inline 'store' or otherwise use its knowledge about
the function definition to conclude that 'p' points to
an object of size 'sizeof (struct P)'. But this is fragile
because it relies on optimization and will not work across
TUs.

> Honestly, I am still not 100% clear on this yet.

> Jakub, Sid and Martin, could  you please explain a little bit more
> here, especially for the following small example?
> 
> [opc@qinzhao-ol8u3-x86 109557]$ cat t.c
> #include <assert.h>
> #include <stdlib.h>
> struct P {
>   int k;
>   int x[10]; 
> } *p;
> 
> void store(int a, int b) 
> {
>   p = (struct P *)malloc (sizeof (struct P));
>   p->k = a;
>   p->x[b] = 0;
>   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> (int[10]));
>   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>   return;
> }
> 
> int main()
> {
>   store(7, 7);
>   assert (__builtin_dynamic_object_size (p->x, 1) == sizeof
> (int[10]));
>   assert (__builtin_dynamic_object_size (p, 1) == sizeof (struct P));
>   free (p);
> }
> [opc@qinzhao-ol8u3-x86 109557]$ sh t
> /home/opc/Install/latest/bin/gcc -O -fdump-tree-all t.c
> a.out: t.c:21: main: Assertion `__builtin_dynamic_object_size (p->x,
> 1) == sizeof (int[10])' failed.
> t: line 19: 859944 Aborted (core dumped) ./a.out
> 
> 
> In the above, among the 4 assertions, only the last one failed.

I don't know why this is the case. 

> 
> Why GCC can use the TYPE_SIZE of the pointee of the pointer “p->x” as
> the size of the object, 

I do not think it can do this in general. Is this how it 
is implemented? This would then seem incorrect to me.  

> but cannot use the TYPE_SIZE of the pointee of the pointer “p” as the
> size of the object? 

In general, the type of a pointer does not say anything about the
object it points to, because you could cast the pointer to a different
type, pass it around, and then cast it back before use. 

Only observed allocations or observed accesses provide information
about the type / existence of an object at the corresponding address.

The only other way in C which ensures that a pointer actually points
to an object of the correct type is 'static':

void foo(struct P *p[static 1]);
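
A small sketch of both points, assuming an array parameter of struct P as
the intended form of the 'static' declaration. The function names are made
up for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct P {
    int k;
    int x[10];
};

/* The parameter type only says "pointer to struct P".  From the type alone
   the callee cannot tell whether one element or many are behind it, so the
   element count has to be passed separately. */
int sum_k(const struct P *p, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += p[i].k;
    return s;
}

/* 'static 1' is a promise by the caller that 'p' points to the first
   element of an array with at least one struct P. */
int first_k(const struct P p[static 1])
{
    return p[0].k;
}

int demo_array_behind_pointer(void)
{
    struct P arr[3] = { { .k = 1 }, { .k = 2 }, { .k = 3 } };
    const struct P *p = arr;  /* same type as a pointer to a single struct P */

    return sum_k(p, 3) + first_k(p);
}
```

Here 'p' has type pointer to struct P, but the object it points into is
three times as large as sizeof (struct P).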



Martin


> 
> > 
> > > Therefore the bug has been closed. 
> > > 
> > > In your following testing 5:
> > > 
> > > > I'm not sure this is a reasonable behavior, but
> > > > let me get into the specific test, which looks like this:
> > > > 
> > > > TEST(counted_by_seen_by_bdos)
> > > > {
> > > >   struct annotated *p;
> > > >   int index = MAX_INDEX + unconst;
> > > > 
> > > >   p = alloc_annotated(index);
> > > > 
> > > >   REPORT_SIZE(p->array);
> > > > /* 1 */ EXPECT_EQ(sizeof(*p), offsetof(typeof(*p), array));
> > > >   /* Check array size alone. */
> > > > /* 2 */ EXPECT_EQ(__builtin_object_size(p->array, 1),
> > > > SIZE_MAX);
> > > > /* 3 */ EXPECT_EQ(__builtin_dynamic_object_size(p->array, 1),
> > > > p->foo * sizeof(*p->array));
> > > >   /* Check check entire object size. */
> > > > /* 4 */ EXPECT_EQ(__builtin_object_size(p, 1), SIZE_MAX);
> > > > /* 5 */ EXPECT_EQ(__builtin_dynamic_object_size(p, 1),
> > > > sizeof(*p) + p->foo * sizeof(*p->array));
> > > > }
> > > > 
> > > > Test 5 should pass as well, since, again, p can be examined.
> > > > Passing p
> > > > to __bdos implies it is allocated and the __counted_by
> > > > annotation can be
> > > > examined.
> > > 
> > > 

Re: [V1][PATCH 0/3] New attribute "element_count" to annotate bounds for C99 FAM(PR108896)

2023-07-06 Thread Martin Uecker via Gcc-patches
On Thursday, 06.07.2023 at 18:56 +, Qing Zhao wrote:
> Hi, Kees,
> 
> I have updated my V1 patch with the following changes:
> A. changed the name to "counted_by"
> B. changed the argument from a string to an identifier
> C. updated the documentation and testing cases accordingly.
> 
> And then used this new gcc to test 
> https://github.com/kees/kernel-tools/blob/trunk/fortify/array-bounds.c (with 
> the following change)
> [opc@qinzhao-ol8u3-x86 Kees]$ !1091
> diff array-bounds.c array-bounds.c.org
> 32c32
> < # define __counted_by(member)   __attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)   
> > __attribute__((__element_count__(#member)))
> 34c34
> < # define __counted_by(member)   __attribute__((counted_by (member)))
> ---
> > # define __counted_by(member)   /* 
> > __attribute__((__element_count__(#member))) */
> 
> Then I got the following result:
> [opc@qinzhao-ol8u3-x86 Kees]$ ./array-bounds 2>&1 | grep -v ^'#'
> TAP version 13
> 1..12
> ok 1 global.fixed_size_seen_by_bdos
> ok 2 global.fixed_size_enforced_by_sanitizer
> not ok 3 global.unknown_size_unknown_to_bdos
> not ok 4 global.unknown_size_ignored_by_sanitizer
> ok 5 global.alloc_size_seen_by_bdos
> ok 6 global.alloc_size_enforced_by_sanitizer
> not ok 7 global.element_count_seen_by_bdos
> ok 8 global.element_count_enforced_by_sanitizer
> not ok 9 global.alloc_size_with_smaller_element_count_seen_by_bdos
> not ok 10 global.alloc_size_with_smaller_element_count_enforced_by_sanitizer
> ok 11 global.alloc_size_with_bigger_element_count_seen_by_bdos
> ok 12 global.alloc_size_with_bigger_element_count_enforced_by_sanitizer
> 
> The same as your previous results. Then I took a look at all the failed 
> testing: 3, 4, 7, 9, and 10. And studied the reasons for all of them.
> 
> In summary, there are two major issues:
> 1.  The reason for the failed testing 7 is the same issue as I observed in 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109557
> Which is not a bug, it’s an expected behavior. 
> 
> 2. The common issue for  the failed testing 3, 4, 9, 10 is:
> 
> for the following annotated structure: 
> 
> 
> struct annotated {
> unsigned long flags;
> size_t foo;
> int array[] __attribute__((counted_by (foo)));
> };
> 
> 
> struct annotated *p;
> int index = 16;
> 
> p = malloc(sizeof(*p) + index * sizeof(*p->array));  // allocated real size 
> 
> p->foo = index + 2;  // p->foo was set by a different value than the real 
> size of p->array as in test 9 and 10
> or
> p->foo was not set to any value as in test 3 and 4
> 
> 
> 
> i.e, the value of p->foo is NOT synced with the number of elements allocated 
> for the array p->array.  
> 
> I think that this should be considered a user error, and the 
> documentation of the attribute should include
> this requirement.  (In LLVM’s RFC, this requirement is included in the 
> programming model: 
> https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854#maintaining-correctness-of-bounds-annotations-18)
> 
> We can add a new warning option -Wcounted-by to report such user error if 
> needed.
> 
> What’s your opinion on this?


Additionally, we could also have a sanitizer that
checks this at run-time.
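
To illustrate the requirement being discussed, here is a minimal sketch of
an allocation helper that keeps the count in sync with the allocation. The
COUNTED_BY macro and the body of alloc_annotated are assumptions made for
this example; the macro expands to nothing on compilers that do not report
support for the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the attribute only where the compiler reports it (assumption: the
   final spelling is 'counted_by', as in the updated patch). */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)  /* not available: expands to nothing */
#endif

struct annotated {
    unsigned long flags;
    size_t foo;
    int array[] COUNTED_BY(foo);
};

/* Allocate room for 'count' elements and record exactly that number in
   the counting member, so both always agree. */
struct annotated *alloc_annotated(size_t count)
{
    struct annotated *p;

    /* Reject sizes that would overflow the size computation. */
    if (count > (SIZE_MAX - sizeof(*p)) / sizeof(*p->array))
        return NULL;

    p = malloc(sizeof(*p) + count * sizeof(*p->array));
    if (p == NULL)
        return NULL;

    p->flags = 0;
    p->foo = count;  /* keep the count in sync with the allocation */
    return p;
}
```

A run-time check as suggested above would flag code that sets p->foo to a
value larger than what was actually allocated.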


Personally, I am still not very happy that in the
following example the two 'n's refer to different
entities:

void f(int n)
{
struct foo {
int n;   
int (*p[])[n] [[counted_by(n)]];
};
}

But I guess it will be difficult to convince everybody
that it would be wise to use a new syntax for
disambiguation:

void f(int n)
{
struct foo {
int n;   
int (*p[])[n] [[counted_by(.n)]];
};
}
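
The existing lookup rule behind this concern can be shown in plain ISO C.
This sketch uses an enumeration constant instead of a function parameter
so that no variably modified member is needed; 'foo_array_len' is a made-up
name.

```c
#include <assert.h>
#include <stddef.h>

enum { n = 4 };  /* ordinary identifier 'n' in the enclosing scope */

struct foo {
    int n;       /* member 'n': lives in the member name space */
    int a[n];    /* this 'n' is the enumeration constant above, not the member */
};

size_t foo_array_len(void)
{
    struct foo s = { .n = 1 };  /* '.n' here is a designator for the member */

    (void)s.n;
    return sizeof s.a / sizeof s.a[0];
}
```

So an unadorned 'n' inside the struct declaration already refers to the
outer entity today, whereas the attribute argument is meant to name the
member.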

Martin


> 
> thanks.
> 
> Qing
> 
> 
> > On May 26, 2023, at 4:40 PM, Kees Cook  wrote:
> > 
> > On Thu, May 25, 2023 at 04:14:47PM +, Qing Zhao wrote:
> > > GCC will pass the number of elements info from the attached attribute to 
> > > both 
> > > __builtin_dynamic_object_size and bounds sanitizer to check the 
> > > out-of-bounds
> > > or dynamic object size issues during runtime for flexible array members.
> > > 
> > > This new feature will provide nice protection to flexible array members 
> > > (which
> > > currently are completely ignored by both __builtin_dynamic_object_size and
> > > bounds sanitizers).
> > 
> > Testing went pretty well, though I think I found some bdos issues:
> > 
> > - some things that bdos can't know the size of, and correctly returned
> >  SIZE_MAX in the past, now thinks are 0-sized.
> > - while bdos correctly knows the size of an element_count-annotated
> >  flexible array, it doesn't know the size of the containing object
> >  (i.e. it returns SIZE_MAX).
> > 
> > Also, I think I found a precedence issue:
> > 
> > - if both __alloc_size and 'element_count' are in use, the _smallest_
> >  of the two is what I would expect to be enforced by the sanitizer
> >  and reported by __bdos. As is, alloc_size appears to be used when
> >  

Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)

2023-06-16 Thread Martin Uecker via Gcc-patches
On Friday, 16.06.2023 at 16:21 +, Joseph Myers wrote:
> On Fri, 16 Jun 2023, Martin Uecker via Gcc-patches wrote:
> 
> > > Note that no expressions can start with the '.' token at present.  As 
> > > soon 
> > > as you invent a new kind of expression that can start with that token, 
> > > you 
> > > have syntactic ambiguity.
> > > 
> > > struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; 
> > > };
> > > 
> > > Is ".c=.c" a use of the existing syntax for designated initializers, with 
> > > the first ".c" being a designator and the second being a use of the new 
> > > kind of expression, or is it an assignment expression, where both the LHS 
> > > and the RHS of the assignment use the new kind of expression?  And do 
> > > those .c, when they use the new kind of expression, refer to the inner or 
> > > outer struct definition?
> > 
> > I would treat this as one integrated feature. Essentially .c is
> > something like this->c for the current struct for designated
> > initializer *and* size expressions because it is semantically 
> > so close. In the initializer I would allow only 
> > the current use for designated initialization for all names of
> > members of the currently initialized struct, so .c = .c would 
> > be invalid. It should never refer to the outer struct if there
> 
> I'm not clear on what the intended disambiguation rule here is, when "." 
> is seen in initializer list context - does this rule depend on whether the 
> following identifier is a member of the struct being initialized, so 
> ".c=.c" would be OK above if the initialized struct didn't have a member 
> called c but the outer struct definition did? 

When initializers are parsed it is already clear what
the names of the members of the inner struct are, so
one can differentiate between designated initializers 
and potential other uses in an expression. 

So the main rule is: if you parse .something in a context
where a designator is allowed and "something" is a member
of the current struct, then it is a designator.

So for 

struct foo { int c; int buf[(struct { int d; }){ .d = .c }]; };

one knows during parsing that the .d is a designator
and that .c is not. For

struct foo { int c; int buf[(struct { int d; }){ .c = .c }]; };

one knows that both uses of .c are not.

Whether these different use cases should be allowed or not
is a different question, but my point is that there does
not seem to be a problem directly identifying the uses 
as a designator as usual. To me, this seems to imply that
it is safe to use the same syntax.
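
The main rule can be written down as a toy classifier. This is only a
model of the rule as stated here, not parser code from GCC; the names
'dot_kind' and 'classify_dot' are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum dot_kind {
    DOT_DESIGNATOR,  /* existing designated-initializer syntax */
    DOT_MEMBER_REF   /* the proposed new kind of expression */
};

/* Classify ".name":
   - 'designator_allowed' says whether the parser is at a position where a
     designator may appear;
   - 'members' lists the member names of the struct currently being
     initialized, which are known by the time the initializer is parsed. */
enum dot_kind classify_dot(const char *name, bool designator_allowed,
                           const char *const members[], size_t n_members)
{
    if (designator_allowed) {
        for (size_t i = 0; i < n_members; i++) {
            if (strcmp(members[i], name) == 0)
                return DOT_DESIGNATOR;
        }
    }
    return DOT_MEMBER_REF;
}
```

For the inner struct with the single member d, '.d' at a designator
position is classified as a designator, while '.c' is not, at either
position.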

>  That seems like a rather 
> messy rule.  And does "would allow only" apply other than in the ambiguous 
> context?  That seems to be implied by ".c=.c" being invalid above, because 
> to make it invalid you need to disallow the new construct being used for 
> the second .c, not just make the first .c interpreted as a designator.

Yes. 
> 
> Again, this sort of thing needs a detailed written specification, with 
> multiple iterations discussed among different implementations. 

Oh, I agree with this.

>  The above 
> paragraph doesn't make clear to me any of: the disambiguation rules; what 
> is allowed in what context; how name lookup works (consider tricky cases 
> such as a reference to an identifier declared *later* in the same struct, 
> possibly in the context of C2x tag compatibility where a previous 
> definition of the struct is visible); when these expressions get 
> evaluated; what the underlying principles are behind those choices.

I also agree that all this needs careful consideration and written
rules.  My point is merely that there does not seem to be a
fundamental issue differentiating the new feature from 
designators during parsing, so there may not be a risk using 
the same syntax.

> Using a token (existing or new) other than '.' - one that doesn't 
> introduce ambiguity in any context where expressions can be used - would 
> help significantly, although some of the issues would still apply.

The cost of using a new symbol is that one has two different
syntaxes for something which is semantically equivalent, i.e.
a notation to refer to a member of the current struct.

Martin

> 




Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)

2023-06-16 Thread Martin Uecker via Gcc-patches
On Thursday, 15.06.2023 at 16:55 +, Joseph Myers wrote:
> On Thu, 15 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
...
> > 1. Update the routine “c_parser_postfix_expression” (is this the right 
> > place? ) to accept the new designator syntax.
> 
> Any design that might work with an expression is the sort of thing that 
> would likely involve many iterations on the specification (i.e. proposed 
> wording changes to the C standard) for the interpretation of the new kinds 
> of expressions, including how to resolve syntactic ambiguities and how 
> name lookup works, before it could be considered ready to implement, and 
> then a lot more work on the specification based on implementation 
> experience.
> 
> Note that no expressions can start with the '.' token at present.  As soon 
> as you invent a new kind of expression that can start with that token, you 
> have syntactic ambiguity.
> 
> struct s1 { int c; char a[(struct s2 { int c; char b[.c]; }) {.c=.c}.c]; };
> 
> Is ".c=.c" a use of the existing syntax for designated initializers, with 
> the first ".c" being a designator and the second being a use of the new 
> kind of expression, or is it an assignment expression, where both the LHS 
> and the RHS of the assignment use the new kind of expression?  And do 
> those .c, when they use the new kind of expression, refer to the inner or 
> outer struct definition?

I would treat this as one integrated feature. Essentially .c is
something like this->c for the current struct for designated
initializer *and* size expressions because it is semantically 
so close. In the initializer I would allow only 
the current use for designated initialization for all names of
members of the currently initialized struct, so .c = .c would 
be invalid. It should never refer to the outer struct if there
is a member with the same name in the inner struct, i.e. the
outside member is then hidden. 

So this would be ok:

struct s1 { int d; char a[(struct s2 { int c; char b[.c]; }) {.c=.d}.c]; };

Here the use of .d would be ok because it is not from the struct
currently initialized, but from an outside scope.

Martin






Re: [RFC] Add stdckdint.h header for C23

2023-06-11 Thread Martin Uecker via Gcc-patches


Hi Jakub,

two comments which may or may not be helpful:

Clang extended _Generic in a similar way:
https://github.com/llvm/llvm-project/commit/12728e144994efe84715f4e5dbb8c3104e9f0b5a

Although for _Generic you can achieve the same by checking
for compatibility of pointers to the type, and I do not think
this helps with the classification problem.


If I am not missing something, you should be able to check
for an enumerated type using _Generic by checking that the
type is not compatible to another enum type:

enum type_check { _X = 1 };

#define type_is_enum(x) \
_Generic(x, unsigned int: _Generic(x, enum type_check: 0, default: 1), \
default: 0)

https://godbolt.org/z/j6z4a4Mdn
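
A usage sketch of the same idea, with assumptions stated up front: it
relies on the implementation choosing unsigned int as the compatible type
of an enumeration without negative enumerators (as GCC documents), so it
is not portable to other choices such as short enums. The names
TYPE_IS_ENUM, type_check_tag and color are made up for this example.

```c
#include <assert.h>

/* Helper enum: any other enumerated type is not compatible with it,
   even if both are compatible with unsigned int. */
enum type_check_tag { TYPE_CHECK_DUMMY = 1 };

/* Yields 1 if 'x' has an enumerated type that is compatible with
   unsigned int, and 0 otherwise (including for unsigned int itself,
   which is compatible with the helper enum). */
#define TYPE_IS_ENUM(x) \
    _Generic((x), \
        unsigned int: _Generic((x), enum type_check_tag: 0, default: 1), \
        default: 0)

enum color { RED, GREEN, BLUE };

int demo_type_is_enum(void)
{
    enum color c = GREEN;
    unsigned int u = 0;
    int i = 0;

    (void)c; (void)u; (void)i;  /* _Generic does not evaluate its operand */

    /* Pack the three results into one value: enum, unsigned, int. */
    return TYPE_IS_ENUM(c) * 100 + TYPE_IS_ENUM(u) * 10 + TYPE_IS_ENUM(i);
}
```

Note that an enumeration constant such as RED has type int, so the macro
has to be applied to an object or expression of the enumerated type.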

For C23 with fixed underlying type this may become more
complicated. Maybe this becomes too messy.

Martin





Re: [C PATCH 3/4] introduce ubsan checking for assigment of VM types 3/4

2023-05-31 Thread Martin Uecker via Gcc-patches
On Tuesday, 30.05.2023 at 22:59 +, Joseph Myers wrote:
> On Mon, 29 May 2023, Martin Uecker via Gcc-patches wrote:
> 
> >     Support instrumentation of function arguments for functions
> >     called via a declaration.  We can support only simple size
> 
> What do you mean by "via a declaration"?
> 
> If the *definition* is visible (and known to be the definition used
> at runtime rather than being interposed) then you can determine in
> some cases  that there is UB from bad bounds.  If only some other
> declaration is visible, or the definition might be interposed, VLA
> sizes in the declaration are equivalent to [*]; it's suspicious if
> they don't match, but it's not UB and so it would seem rather
> questionable for UBSan to treat it as such (cf. the rejection in 
> GCC of sanitization for some questionable cases of unsigned integer
> overflow that aren't UB either).

You are right that it is UB only with the additional
assumption that the bounds in the seen declaration are
the same as the ones in the definition.   But we now warn
about any mismatch since GCC 11 with -Wall based on the
understanding  that any such mismatch should be considered
a bug. There also does not seem  to be any valid use case
for having mismatching bounds and I think the intention 
of WG14 is clearly that they can be used for checking 
(cf. WG14 charter). So I think this is a different
situation from unsigned integer overflow.

From a practical point of view it is certainly very useful 
for users to be able to verify these bounds at run-time. 
But we could make it a separate UBSan option if it is
really a concern.
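
For illustration, the kind of mismatch in question can be emulated with a
hand-written check. This is a sketch under assumptions, not the
instrumentation from the patch: 'sum_vm' and 'checked_sum' are made-up
names, and the caller passes its own bound explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Callee whose parameter type carries a run-time bound. */
int sum_vm(size_t n, int (*a)[n])
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += (*a)[i];
    return s;
}

/* Hand-written stand-in for the run-time check: the argument has type
   int (*)[m], the parameter expects int (*)[n].  The types are compatible
   at compile time, but the call is only valid if n == m at run time. */
bool checked_sum(size_t n, size_t m, int (*a)[m], int *out)
{
    if (n != m)
        return false;  /* the mismatch a sanitizer would report */
    *out = sum_vm(n, a);
    return true;
}
```

Without such a check, calling sum_vm with a bound larger than the actual
array would read past the end of the array.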

BTW: There was a similar discussion years ago about making
certain bound checks for arrays part of UBSan because it is
not clear that the bounds in the type of 'x' in x[n] are
relevant rather than the ones of the underlying array
(which may be different).  In the end both GCC and clang
have these UBSan checks now and I think  everybody is
happy about it.


> > + /*  Give up.  If we do not understand a size expression,
> > we can
> > + also not instrument any of the others because it may
> > have
> > + side effects affecting them.  (We could restart and
> > instrument
> > + the only the ones with integer constants.)   */
> > +   warning_at (location, 0, "Function call not
> > instrumented.");
> > +   return void_node;
> 
> This is not a properly formatted diagnostic message (should start
> with a 
> lowercase letter and not end with '.').

Thanks. I would probably remove this warning and re-introduce it
with another patch that also adds an option for it.

Martin







Re: [C PATCH 3/4] introduce ubsan checking for assigment of VM types 3/4

2023-05-29 Thread Martin Uecker via Gcc-patches



c: introduce ubsan checking for assigment of VM types 3/4

Support instrumentation of function arguments for functions
called via a declaration.  We can support only simple size
expressions without side effects, because the UBSan
instrumentation is done before the call, but the expressions
are evaluated in the callee.

gcc/c-family:
* c-ubsan.cc (ubsan_instrument_vm_assign): Add arguments
for size expressions.
* c-ubsan.h (ubsan_instrument_vm_assign): Ditto.

gcc/c:
* c-typeck.cc (process_vm_constraints): Add support
for instrumenting function arguments.

gcc/testsuite/gcc.dg:
* ubsan/vm-bounds-2.c: Update.
* ubsan/vm-bounds-3.c: New test.
* ubsan/vm-bounds-4.c: New test.

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 59ef9708188..a8f95aa39e8 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -337,19 +337,13 @@ ubsan_instrument_vla (location_t loc, tree size)
 /* Instrument assignment of variably modified types.  */
 
 tree
-ubsan_instrument_vm_assign (location_t loc, tree a, tree b)
+ubsan_instrument_vm_assign (location_t loc, tree a, tree as, tree b, tree bs)
 {
   tree t, tt;
 
   gcc_assert (TREE_CODE (a) == ARRAY_TYPE);
   gcc_assert (TREE_CODE (b) == ARRAY_TYPE);
 
-  tree as = TYPE_MAX_VALUE (TYPE_DOMAIN (a));
-  tree bs = TYPE_MAX_VALUE (TYPE_DOMAIN (b));
-
-  as = fold_build2 (PLUS_EXPR, sizetype, as, size_one_node);
-  bs = fold_build2 (PLUS_EXPR, sizetype, bs, size_one_node);
-
   t = build2 (NE_EXPR, boolean_type_node, as, bs);
   if (flag_sanitize_trap & SANITIZE_VLA)
 tt = build_call_expr_loc (loc, builtin_decl_explicit (BUILT_IN_TRAP), 0);
diff --git a/gcc/c-family/c-ubsan.h b/gcc/c-family/c-ubsan.h
index 1b07b0645f2..42be1d691a8 100644
--- a/gcc/c-family/c-ubsan.h
+++ b/gcc/c-family/c-ubsan.h
@@ -26,7 +26,7 @@ extern tree ubsan_instrument_shift (location_t, enum 
tree_code, tree, tree);
 extern tree ubsan_instrument_vla (location_t, tree);
 extern tree ubsan_instrument_return (location_t);
 extern tree ubsan_instrument_bounds (location_t, tree, tree *, bool);
-extern tree ubsan_instrument_vm_assign (location_t, tree, tree);
+extern tree ubsan_instrument_vm_assign (location_t, tree, tree, tree, tree);
 extern bool ubsan_array_ref_instrumented_p (const_tree);
 extern void ubsan_maybe_instrument_array_ref (tree *, bool);
 extern void ubsan_maybe_instrument_reference (tree *);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index a8fccc6f6ed..aeddac315fc 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -3408,7 +3408,8 @@ static tree
 convert_argument (location_t ploc, tree function, tree fundecl,
  tree type, tree origtype, tree val, tree valtype,
  bool npc, tree rname, int parmnum, int argnum,
- bool excess_precision, int warnopt)
+ bool excess_precision, int warnopt,
+ vec *instr_vec)
 {
   /* Formal parm type is specified by a function prototype.  */
 
@@ -3567,7 +3568,7 @@ convert_argument (location_t ploc, tree function, tree 
fundecl,
val, origtype, ic_argpass,
npc, fundecl, function,
parmnum + 1, warnopt,
-   NULL);
+   instr_vec);
 
   if (targetm.calls.promote_prototypes (fundecl ? TREE_TYPE (fundecl) : 0)
   && INTEGRAL_TYPE_P (type)
@@ -3582,15 +3583,111 @@ convert_argument (location_t ploc, tree function, tree 
fundecl,
 
 static tree
 process_vm_constraints (location_t location,
-   vec *instr_vec)
+   vec *instr_vec,
+   tree function, tree fundecl, vec *values)
 {
   unsigned int i;
   struct instrument_data* d;
   tree instr_expr = void_node;
+  tree args = NULL;
+
+  /* Find the arguments for the function declaration / type.  */
+  if (function)
+{
+  if (FUNCTION_DECL == TREE_CODE (function))
+   {
+ fundecl = function;
+ args = DECL_ARGUMENTS (fundecl);
+   }
+  else
+   {
+ /* Functions called via pointers are not yet supported.  */
+ return void_node;
+   }
+}
 
   FOR_EACH_VEC_SAFE_ELT (instr_vec, i, d)
 {
-  tree in = ubsan_instrument_vm_assign (location, d->t1, d->t2);
+  tree t1 = d->t1;
+  tree t2 = d->t2;
+
+  gcc_assert (ARRAY_TYPE == TREE_CODE (t1));
+  gcc_assert (ARRAY_TYPE == TREE_CODE (t2));
+
+  tree as = TYPE_MAX_VALUE (TYPE_DOMAIN (t1));
+  tree bs = TYPE_MAX_VALUE (TYPE_DOMAIN (t2));
+
+  gcc_assert (as);
+  gcc_assert (bs);
+
+  as = fold_build2 (PLUS_EXPR, sizetype, as, size_one_node);
+  bs = fold_build2 (PLUS_EXPR, sizetype, bs, size_one_node);
+
+   

Re: [C PATCH 2/4] introduce ubsan checking for assignment of VM types 2/4

2023-05-29 Thread Martin Uecker via Gcc-patches



c: introduce ubsan checking for assignment of VM types 2/4

When checking compatibility of types during assignment, collect
all pairs of types where the outermost bound needs to match at
run-time.  This list is then processed to add UBSan checks for
each bound.
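
As a rough model of what processing the collected list amounts to, the sketch below walks over pairs of bounds and reports each mismatch. The names struct bound_pair and check_bound_pairs are invented for this illustration, and the message text is simplified; the patch itself builds tree expressions that call the UBSan handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one collected pair of array types.  Each
   field holds the maximum index of the array domain, so the bound is
   the maximum index plus one.  */
struct bound_pair
{
  size_t lhs_max;
  size_t rhs_max;
};

/* Check every collected pair and return the number of mismatches.  */
static size_t
check_bound_pairs (const struct bound_pair *pairs, size_t count)
{
  assert (pairs != NULL || count == 0);

  size_t mismatches = 0;
  for (size_t i = 0; i < count; i++)
    {
      size_t as = pairs[i].lhs_max + 1;
      size_t bs = pairs[i].rhs_max + 1;

      /* Models the emitted condition: report only if the bounds differ.  */
      if (as != bs)
        {
          fprintf (stderr, "bound %zu does not match bound %zu\n", as, bs);
          mismatches++;
        }
    }
  return mismatches;
}
```

In this model a recovering handler corresponds to counting and continuing; a trapping or aborting variant would stop at the first mismatch instead.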

gcc/c-family:
* c-ubsan.cc (ubsan_instrument_vm_assign): New function.
* c-ubsan.h (ubsan_instrument_vm_assign): New function.

gcc/c:
* c-typeck.cc (struct instrument_data): New structure.
(comp_target_types_instr, convert_for_assignment_instrument): New
interfaces for existing functions.
(struct comptypes_data): Add instrumentation.
(comptypes_check_enum_int_intr): New interface.
(comptypes_check_enum_int): Old interface (calls new).
(comptypes_internal): Collect VLA types needed for UBSan.
(comp_target_types_instr): New interface.
(comp_target_types): Old interface (calls new).
(function_types_compatible_p): No instrumentation for function
arguments.
(process_vm_constraints): New function.
(convert_for_assignment_instrument): New interface.
(convert_for_assignment): Instrument assignments.
* sanitizer.def: Add sanitizer builtins.

gcc/testsuite:
* gcc.dg/ubsan/vm-bounds-1.c: New test.
* gcc.dg/ubsan/vm-bounds-1b.c: New test.
* gcc.dg/ubsan/vm-bounds-2.c: New test.

libsanitizer/ubsan:
* ubsan_checks.inc: Add UBSan check.
* ubsan_handlers.cpp (handleVMBoundsMismatch): New function.
* ubsan_handlers.h (struct VMBoundsMismatchData): New structure.
(vm_bounds_mismatch): New handler.

diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
index 51aa83a378d..59ef9708188 100644
--- a/gcc/c-family/c-ubsan.cc
+++ b/gcc/c-family/c-ubsan.cc
@@ -334,6 +334,48 @@ ubsan_instrument_vla (location_t loc, tree size)
   return t;
 }
 
+/* Instrument assignment of variably modified types.  */
+
+tree
+ubsan_instrument_vm_assign (location_t loc, tree a, tree b)
+{
+  tree t, tt;
+
+  gcc_assert (TREE_CODE (a) == ARRAY_TYPE);
+  gcc_assert (TREE_CODE (b) == ARRAY_TYPE);
+
+  tree as = TYPE_MAX_VALUE (TYPE_DOMAIN (a));
+  tree bs = TYPE_MAX_VALUE (TYPE_DOMAIN (b));
+
+  as = fold_build2 (PLUS_EXPR, sizetype, as, size_one_node);
+  bs = fold_build2 (PLUS_EXPR, sizetype, bs, size_one_node);
+
+  t = build2 (NE_EXPR, boolean_type_node, as, bs);
+  if (flag_sanitize_trap & SANITIZE_VLA)
+tt = build_call_expr_loc (loc, builtin_decl_explicit (BUILT_IN_TRAP), 0);
+  else
+{
+  tree data = ubsan_create_data ("__ubsan_vm_data", 1, ,
+ubsan_type_descriptor (a, 
UBSAN_PRINT_ARRAY),
+ubsan_type_descriptor (b, 
UBSAN_PRINT_ARRAY),
+ubsan_type_descriptor (sizetype),
+NULL_TREE, NULL_TREE);
+  data = build_fold_addr_expr_loc (loc, data);
+  enum built_in_function bcode
+   = (flag_sanitize_recover & SANITIZE_VLA)
+ ? BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH
+ : BUILT_IN_UBSAN_HANDLE_VM_BOUNDS_MISMATCH_ABORT;
+  tt = builtin_decl_explicit (bcode);
+  tt = build_call_expr_loc (loc, tt, 3, data,
+   ubsan_encode_value (as),
+   ubsan_encode_value (bs));
+}
+  t = build3 (COND_EXPR, void_type_node, t, tt, void_node);
+
+  return t;
+}
+
+
 /* Instrument missing return in C++ functions returning non-void.  */
 
 tree
diff --git a/gcc/c-family/c-ubsan.h b/gcc/c-family/c-ubsan.h
index fef1033e1e4..1b07b0645f2 100644
--- a/gcc/c-family/c-ubsan.h
+++ b/gcc/c-family/c-ubsan.h
@@ -26,6 +26,7 @@ extern tree ubsan_instrument_shift (location_t, enum 
tree_code, tree, tree);
 extern tree ubsan_instrument_vla (location_t, tree);
 extern tree ubsan_instrument_return (location_t);
 extern tree ubsan_instrument_bounds (location_t, tree, tree *, bool);
+extern tree ubsan_instrument_vm_assign (location_t, tree, tree);
 extern bool ubsan_array_ref_instrumented_p (const_tree);
 extern void ubsan_maybe_instrument_array_ref (tree *, bool);
 extern void ubsan_maybe_instrument_reference (tree *);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 2a1b7321b45..a8fccc6f6ed 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -94,6 +94,9 @@ struct comptypes_data;
 static int tagged_types_tu_compatible_p (const_tree, const_tree,
 struct comptypes_data *);
 static int comp_target_types (location_t, tree, tree);
+struct instrument_data;
+static int comp_target_types_instr (location_t, tree, tree,
+   vec *);
 static int function_types_compatible_p (const_tree, const_tree,
struct comptypes_data *);
 

Re: [C PATCH 4/4] introduce ubsan checking for assignment of VM types 4/4

2023-05-29 Thread Martin Uecker via Gcc-patches





c: introduce ubsan checking for assignment of VM types 4/4

Support instrumentation of functions called via pointers.  To do so,
record the declaration with the parameter types, so that it can be
retrieved later.
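
For orientation, here is a small sketch of the kind of call this part is about: a function with a variably modified parameter invoked through a pointer. The names row_size and call_via_pointer are made up for the example, and the call shown has matching bounds, so it is valid with or without instrumentation.

```c
#include <assert.h>
#include <stddef.h>

/* Callee with a variably modified parameter: A points to char[n].  */
static size_t
row_size (int n, char (*a)[n])
{
  /* sizeof on a VM type is evaluated at run time and yields n here.  */
  return sizeof (*a);
}

/* Call FP through a pointer with a local array whose bound matches
   the first argument.  Passing a different first argument, e.g.
   fp (n + 1, &buf), is the kind of mismatch the check targets.  */
static size_t
call_via_pointer (size_t (*fp) (int n, char (*a)[n]), int n)
{
  assert (fp != NULL && n > 0);

  char buf[n];
  return fp (n, &buf);
}
```

At such a call site only the function type is available, which is why the parameter declarations have to be recorded somewhere they can be found again.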

gcc/c:
* c-decl.cc (get_parm_info): Record function declaration
for arguments.
* c-typeck.cc (process_vm_constraints): Instrument functions
called via pointers.

gcc/testsuite/gcc.dg:
* ubsan/vm-bounds-2.c: Add warning.
* ubsan/vm-bounds-5.c: New test.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 1af51c4acfc..c33adf7e5fe 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -8410,6 +8410,9 @@ get_parm_info (bool ellipsis, tree expr)
 declared types.  The back end may override this later.  */
  DECL_ARG_TYPE (decl) = type;
  types = tree_cons (0, type, types);
+
+ /* Record the decl for use of UBSan bounds checking.  */
+ TREE_PURPOSE (types) = decl;
}
  break;
 
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index aeddac315fc..43e7b96a55f 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -3601,9 +3601,20 @@ process_vm_constraints (location_t location,
}
   else
{
- /* Functions called via pointers are not yet supported.  */
- return void_node;
+ while (FUNCTION_TYPE != TREE_CODE (function))
+   function = TREE_TYPE (function);
+
+ args = TREE_PURPOSE (TYPE_ARG_TYPES (function));
+
+ if (!args)
+   {
+ /* FIXME: this can happen when forming composite types for the
+conditional operator.  */
+ warning_at (location, 0, "Function call not instrumented.");
+ return void_node;
+   }
}
+  gcc_assert (PARM_DECL == TREE_CODE (args));
 }
 
   FOR_EACH_VEC_SAFE_ELT (instr_vec, i, d)
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c 
b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
index 22f06231eaa..093cbddd2ea 100644
--- a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-2.c
@@ -31,7 +31,7 @@ void f(void)
 
int u = 3; int v = 4;
char a[u][v];
-   (1 ? f1 : f2)(u, v, a);
+   (1 ? f1 : f2)(u, v, a); /* { dg-warning "Function call not 
instrumented." } */
 }
 
 /* size expression in parameter */
diff --git a/gcc/testsuite/gcc.dg/ubsan/vm-bounds-5.c 
b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-5.c
new file mode 100644
index 000..1a251e39deb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ubsan/vm-bounds-5.c
@@ -0,0 +1,72 @@
+/* { dg-do run } */
+/* { dg-options "-fsanitize=vla-bound" } */
+
+
+void foo1(void (*p)(int n, char (*a)[n]))
+{
+   char A0[3];
+   (*p)(3, );
+   (*p)(4, );   /* */
+   /* { dg-output "bound 4 of type 'char \\\[\\\*\\\]' does not match 
bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+}
+
+void b0(int n, char (*a)[n]) { }
+
+
+int n;
+
+void foo2(void (*p)(int n, char (*a)[n]))
+{
+   n = 4;
+   char A0[3];
+   (*p)(3, );
+   (*p)(4, );
+   /* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not 
match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+}
+
+void foo3(void (*p)(int n0, char (*a)[n]))
+{
+   n = 4;
+   char A0[3];
+   (*p)(3, );   /* */
+   /* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not 
match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+   (*p)(4, );   /* */
+   /* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not 
match bound 3 of type 'char \\\[3\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+}
+
+void foo4(void (*p)(int n, char (*a)[n]))
+{
+   n = 3;
+   char A0[3];
+   (*p)(3, );
+   (*p)(4, );   /* */
+   /* { dg-output "\[^\n\r]*bound 4 of type 'char \\\[\\\*\\\]' does not 
match bound 3 of type 'char \\\[3\\\]'" } */
+}
+
+
+void foo5(void (*p)(int n0, char (*a)[n]))
+{
+   n = 3;
+   char A0[3];
+   (*p)(3, );
+   (*p)(4, );
+}
+
+
+void b1(int n0, char (*a)[n]) { }
+
+
+
+int main()
+{
+   foo1();
+
+   foo2();
+   foo3(); // we should diagnose mismatch and run-time discrepancies
+
+   foo4();
+   foo5(); // we should diagnose mismatch and run-time discrepancies
+}
+
+
+




[C PATCH 1/4] introduce ubsan checking for assignment of VM types 1/4

2023-05-29 Thread Martin Uecker via Gcc-patches


Hi Joseph and Martin,

this series adds UBSan checking for assignment of variably-modified
types, i.e. it checks that size expressions on both sides of the 
assignment match.
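
To make "size expressions on both sides match" concrete, the sketch below compares the run-time bound of a pointed-to VLA type with the bound expected on the other side. The helper vm_assign_ok is invented for this illustration and is not part of the series; the real checks are inserted by the compiler at the assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): returns 1 if a pointer to
   char[src_n] could be assigned to a pointer to char[dst_n] without a
   bound mismatch, and 0 otherwise.  */
static int
vm_assign_ok (size_t dst_n, size_t src_n)
{
  assert (src_n > 0);           /* a VLA needs a positive bound */

  char src[src_n];              /* bound known only at run time */
  char (*from)[src_n] = &src;   /* pointer to a variably modified type */

  /* sizeof on a VM type is evaluated at run time, so this compares
     the two bounds, much like a check for
       char (*to)[dst_n] = from;
     would have to do.  */
  return sizeof (*from) == dst_n * sizeof (char);
}
```

With equal bounds the assignment is fine; with different bounds the types are not compatible at run time, which is the case the new instrumentation is meant to diagnose.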

1. no functional change, adds a structure argument to the
comptypes family functions in the C FE.

2. checking for all assignments except function arguments
including the libsanitizer changes (no upstream discussion so far)

3. checking for function arguments, but only when the function is
referenced using its declaration.

4. checking for functions called via a pointer


Q1: Should this be -fsanitize=vla-bound ? I used it because it is
related and does not have much functionality.

Q2: I now have warnings when a function can not be instrumented
because size expressions are too complicated or information was
lost before. Probably this needs to have a flag.

Martin



c: introduce ubsan checking for assignment of VM types 1/4

Reorganize recursive type checking to use a structure to
store information collected during the recursion and
returned to the caller (enum_and_int_p, different_types_p).
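
The shape of this refactoring can be sketched outside of GCC as follows. Everything prefixed with toy_ is invented for the illustration and the compatibility rules are deliberately simplistic; only the pattern (one structure instead of several bool pointers, with the old interface kept as a thin wrapper) reflects the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_INT, TOY_ENUM, TOY_CHAR };

/* Information collected during the comparison and returned to the
   caller, instead of two separate bool * out-parameters.  */
struct toy_data
{
  bool enum_and_int_p;
  bool different_types_p;
};

/* Returns 1 if the two kinds are treated as compatible, 0 otherwise,
   and records what was seen in DATA.  */
static int
toy_comptypes_internal (enum toy_kind a, enum toy_kind b,
                        struct toy_data *data)
{
  assert (data != NULL);

  if (a == b)
    return 1;

  data->different_types_p = true;
  if ((a == TOY_ENUM && b == TOY_INT) || (a == TOY_INT && b == TOY_ENUM))
    {
      data->enum_and_int_p = true;
      return 1;
    }
  return 0;
}

/* Old-style interface: calls the new one and copies the flag out.  */
static int
toy_comptypes_check_enum_int (enum toy_kind a, enum toy_kind b,
                              bool *enum_and_int_p)
{
  struct toy_data data = { false, false };
  int val = toy_comptypes_internal (a, b, &data);
  *enum_and_int_p = data.enum_and_int_p;
  return val;
}
```

Adding another piece of collected information later then only needs a new field, not a change to every signature in the recursion.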

gcc/c:
* c-typeck.cc (struct comptypes_data): Add structure.
(tagged_types_tu_compatible_p,
function_types_compatible_p, type_lists_compatible_p,
comptypes_internal): Add structure to interface and
adapt calls.
(comptypes, comptypes_check_enum_int,
comptypes_check_different_types): Adapt calls.

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 22e240a3c2a..2a1b7321b45 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -90,12 +90,14 @@ static bool require_constant_elements;
 static bool require_constexpr_value;
 
 static tree qualify_type (tree, tree);
-static int tagged_types_tu_compatible_p (const_tree, const_tree, bool *,
-bool *);
+struct comptypes_data;
+static int tagged_types_tu_compatible_p (const_tree, const_tree,
+struct comptypes_data *);
 static int comp_target_types (location_t, tree, tree);
-static int function_types_compatible_p (const_tree, const_tree, bool *,
-   bool *);
-static int type_lists_compatible_p (const_tree, const_tree, bool *, bool *);
+static int function_types_compatible_p (const_tree, const_tree,
+   struct comptypes_data *);
+static int type_lists_compatible_p (const_tree, const_tree,
+   struct comptypes_data *);
 static tree lookup_field (tree, tree);
 static int convert_arguments (location_t, vec, tree,
  vec *, vec *, tree,
@@ -125,7 +127,8 @@ static tree find_init_member (tree, struct obstack *);
 static void readonly_warning (tree, enum lvalue_use);
 static int lvalue_or_else (location_t, const_tree, enum lvalue_use);
 static void record_maybe_used_decl (tree);
-static int comptypes_internal (const_tree, const_tree, bool *, bool *);
+static int comptypes_internal (const_tree, const_tree,
+  struct comptypes_data *data);
 
 /* Return true if EXP is a null pointer constant, false otherwise.  */
 
@@ -1039,6 +1042,12 @@ common_type (tree t1, tree t2)
   return c_common_type (t1, t2);
 }
 
+struct comptypes_data {
+
+  bool enum_and_int_p;
+  bool different_types_p;
+};
+
 /* Return 1 if TYPE1 and TYPE2 are compatible types for assignment
or various other operations.  Return 2 if they are compatible
but a warning may be needed if you use them together.  */
@@ -1049,7 +1058,9 @@ comptypes (tree type1, tree type2)
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, NULL, NULL);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, );
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
   return val;
@@ -1064,7 +1075,10 @@ comptypes_check_enum_int (tree type1, tree type2, bool 
*enum_and_int_p)
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, enum_and_int_p, NULL);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, );
+  *enum_and_int_p = data.enum_and_int_p;
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
   return val;
@@ -1080,7 +1094,10 @@ comptypes_check_different_types (tree type1, tree type2,
   const struct tagged_tu_seen_cache * tagged_tu_seen_base1 = 
tagged_tu_seen_base;
   int val;
 
-  val = comptypes_internal (type1, type2, NULL, different_types_p);
+  struct comptypes_data data = { };
+  val = comptypes_internal (type1, type2, );
+  *different_types_p = data.different_types_p;
+
   free_all_tagged_tu_seen_up_to (tagged_tu_seen_base1);
 
   return val;
@@ -1089,19 +1106,18 @@ comptypes_check_different_types (tree type1, tree type2,
 /* Return 1 if TYPE1 and TYPE2 are 

Re: [V1][PATCH 2/3] Use the element_count attribute info in builtin object size [PR108896].

2023-05-27 Thread Martin Uecker via Gcc-patches


Thank you for working on this!


Here are a couple of comments:

How is the size for an object with FAM defined? 

There are at least three possible choices:

offsetof(..) + N * sizeof
sizeof(..) + N * sizeof
or the size of a struct with the replacement array.

Or is this not relevant here?
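
To show that the three choices can give different answers, here is a sketch that computes each of them for one example struct. The struct layout, the count FAM_COUNT and the function names are assumptions picked for the illustration (a char FAM after a long makes padding visible on common targets); none of this is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAM_COUNT 3   /* example element count, chosen arbitrarily */

/* Struct with a flexible array member.  */
struct flex
{
  long tag;
  char b;
  char c[];
};

/* Same struct with the FAM replaced by a fixed-size array.  */
struct flex_fixed
{
  long tag;
  char b;
  char c[FAM_COUNT];
};

/* Choice 1: offset of the FAM plus the elements.  */
static size_t
size_from_offsetof (size_t n)
{
  assert (n <= SIZE_MAX - offsetof (struct flex, c));
  return offsetof (struct flex, c) + n * sizeof (char);
}

/* Choice 2: sizeof the struct (including padding) plus the elements.  */
static size_t
size_from_sizeof (size_t n)
{
  assert (n <= SIZE_MAX - sizeof (struct flex));
  return sizeof (struct flex) + n * sizeof (char);
}

/* Choice 3: size of the struct with the replacement array.  */
static size_t
size_from_replacement (void)
{
  return sizeof (struct flex_fixed);
}
```

The offsetof-based value is the smallest of the three; the other two add trailing padding in different places, so which one the attribute means affects what __builtin_dynamic_object_size should report.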


I would personally prefer an attribute which does
not use a string, but uses C expressions, so that
one could write something like this (although I would
limit it initially to the most simple case) 

struct {
  struct bar { int n; }* ptr;
  int buf[] [[element_count(.ptr->n + 3)]];
};

Of course, we could still support this later even
if we use a string now.

Martin




On Thursday, 25.05.2023 at 16:14 +, Qing Zhao wrote:
> 2023-05-17 Qing Zhao 
> 
> gcc/ChangeLog:
> 
>   PR C/108896
>   * tree-object-size.cc (addr_object_size): Use the element_count
>   attribute info.
>   * tree.cc (component_ref_has_element_count_p): New function.
>   (component_ref_get_element_count): New function.
>   * tree.h (component_ref_has_element_count_p): New prototype.
>   (component_ref_get_element_count): New prototype.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR C/108896
>   * gcc.dg/flex-array-element-count-2.c: New test.
> ---
>  .../gcc.dg/flex-array-element-count-2.c   | 56 +++
>  gcc/tree-object-size.cc   | 37 ++--
>  gcc/tree.cc   | 93 +++
>  gcc/tree.h| 10 ++
>  4 files changed, 189 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/flex-array-element-count-2.c
> 
> diff --git a/gcc/testsuite/gcc.dg/flex-array-element-count-2.c 
> b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
> new file mode 100644
> index 000..5a280e8c731
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/flex-array-element-count-2.c
> @@ -0,0 +1,56 @@
> +/* test the attribute element_count and its usage in
> + * __builtin_dynamic_object_size.  */ 
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#include "builtin-object-size-common.h"
> +
> +#define expect(p, _v) do { \
> +size_t v = _v; \
> +if (p == v) \
> + __builtin_printf ("ok:  %s == %zd\n", #p, p); \
> +else \
> + {  \
> +   __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> +   FAIL (); \
> + } \
> +} while (0);
> +
> +struct flex {
> +  int b;
> +  int c[];
> +} *array_flex;
> +
> +struct annotated {
> +  int b;
> +  int c[] __attribute__ ((element_count ("b")));
> +} *array_annotated;
> +
> +void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
> +{
> +  array_flex
> += (struct flex *)malloc (sizeof (struct flex)
> ++ normal_count *  sizeof (int));
> +  array_flex->b = normal_count;
> +
> +  array_annotated
> += (struct annotated *)malloc (sizeof (struct annotated)
> + + attr_count *  sizeof (int));
> +  array_annotated->b = attr_count;
> +
> +  return;
> +}
> +
> +void __attribute__((__noinline__)) test ()
> +{
> +expect(__builtin_dynamic_object_size(array_flex->c, 1), -1);
> +expect(__builtin_dynamic_object_size(array_annotated->c, 1),
> +array_annotated->b * sizeof (int));
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +  setup (10,10);   
> +  test ();
> +  DONE ();
> +}
> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index 9a936a91983..f9aadd59054 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -585,6 +585,7 @@ addr_object_size (struct object_size_info *osi, 
> const_tree ptr,
>    if (pt_var != TREE_OPERAND (ptr, 0))
>  {
>    tree var;
> +  tree element_count_ref = NULL_TREE;
>  
> 
>    if (object_size_type & OST_SUBOBJECT)
>   {
> @@ -600,11 +601,12 @@ addr_object_size (struct object_size_info *osi, 
> const_tree ptr,
>   var = TREE_OPERAND (var, 0);
>     if (var != pt_var && TREE_CODE (var) == ARRAY_REF)
>   var = TREE_OPERAND (var, 0);
> -   if (! TYPE_SIZE_UNIT (TREE_TYPE (var))
> +   if (! component_ref_has_element_count_p (var)
> +  && ((! TYPE_SIZE_UNIT (TREE_TYPE (var))
>     || ! tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (var)))
>     || (pt_var_size && TREE_CODE (pt_var_size) == INTEGER_CST
>     && tree_int_cst_lt (pt_var_size,
> -   TYPE_SIZE_UNIT (TREE_TYPE (var)
> +   TYPE_SIZE_UNIT (TREE_TYPE (var)))
>   var = pt_var;
>     else if (var != pt_var && TREE_CODE (pt_var) == MEM_REF)
>   {
> @@ -612,6 +614,7 @@ addr_object_size (struct object_size_info *osi, 
> const_tree ptr,
>     /* For >fld, compute object size if fld isn't a flexible array
>    member.  */
>     bool is_flexible_array_mem_ref = false;
> +
>     while (v && v != pt_var)
>     

[C PATCH] -Wstringop-overflow for parameters with forward-declared sizes

2023-05-26 Thread Martin Uecker via Gcc-patches


This is a minor change so that parameters that have
forward declarations work with -Wstringop-overflow.


Bootstrapped and regression tested on x86_64. 



c: -Wstringop-overflow for parameters with forward-declared sizes

Warnings from -Wstringop-overflow do not appear for parameters declared
as VLAs when the bound refers to a parameter forward declaration. This
is fixed by splitting the loop that passes through parameters into two,
first only recording the positions of all possible size expressions
and then processing the parameters.
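
The effect of splitting the loop can be sketched with a toy model of a parameter list. All names below (struct parm, struct pos_table, resolve_single_loop, resolve_split_loops) are invented for the illustration; the actual change is in build_attr_access_from_parms and uses GCC's own data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARMS 8

/* Toy parameter: its name and the name of the parameter used as its
   array bound (NULL if it has none).  */
struct parm
{
  const char *name;
  const char *bound;
};

/* Toy table mapping recorded parameter names to positions.  */
struct pos_table
{
  const char *names[MAX_PARMS];
  int pos[MAX_PARMS];
  int count;
};

static void
record_parm (struct pos_table *t, const char *name, int pos)
{
  assert (t->count < MAX_PARMS);
  t->names[t->count] = name;
  t->pos[t->count] = pos;
  t->count++;
}

/* Position recorded for NAME, or -1 if it has not been recorded.  */
static int
lookup_parm (const struct pos_table *t, const char *name)
{
  for (int i = 0; i < t->count; i++)
    if (strcmp (t->names[i], name) == 0)
      return t->pos[i];
  return -1;
}

/* One loop: a bound that names a later parameter is not found.  */
static void
resolve_single_loop (const struct parm *parms, int n, int *out)
{
  struct pos_table t = { .count = 0 };
  for (int i = 0; i < n; i++)
    {
      record_parm (&t, parms[i].name, i);
      out[i] = parms[i].bound ? lookup_parm (&t, parms[i].bound) : -1;
    }
}

/* Two loops: record all positions first, then process.  */
static void
resolve_split_loops (const struct parm *parms, int n, int *out)
{
  struct pos_table t = { .count = 0 };
  for (int i = 0; i < n; i++)
    record_parm (&t, parms[i].name, i);
  for (int i = 0; i < n; i++)
    out[i] = parms[i].bound ? lookup_parm (&t, parms[i].bound) : -1;
}
```

For a list modelled on foo (int x; char buf[x], int x), i.e. buf with bound x followed by x, the single loop leaves the bound of buf unresolved while the split version finds x at position 1.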

PR c/109970

gcc/c-family:

* c-attribs.cc (build_attr_access_from_parms): Split loop to first
record all parameters.

gcc/testsuite:

* gcc.dg/pr109970.c: New test.

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 072cfb69147..e2792ca6898 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -5278,6 +5278,15 @@ build_attr_access_from_parms (tree parms, bool 
skip_voidptr)
   tree argtype = TREE_TYPE (arg);
   if (DECL_NAME (arg) && INTEGRAL_TYPE_P (argtype))
arg2pos.put (arg, argpos);
+}
+
+  argpos = 0;
+  for (tree arg = parms; arg; arg = TREE_CHAIN (arg), ++argpos)
+{
+  if (!DECL_P (arg))
+   continue;
+
+  tree argtype = TREE_TYPE (arg);
 
   tree argspec = DECL_ATTRIBUTES (arg);
   if (!argspec)
diff --git a/gcc/testsuite/gcc.dg/pr109970.c b/gcc/testsuite/gcc.dg/pr109970.c
new file mode 100644
index 000..d234e10455f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr109970.c
@@ -0,0 +1,15 @@
+/* PR109970
+ * { dg-do compile }
+ * { dg-options "-Wstringop-overflow" }
+ * */
+
+void bar(int x, char buf[x]);
+void foo(int x; char buf[x], int x);
+
+int main()
+{
+   char buf[10];
+   bar(11, buf);   /* { dg-warning "accessing 11 bytes in a region of size 
10" } */
+   foo(buf, 11);   /* { dg-warning "accessing 11 bytes in a region of size 
10" } */
+}
+




Re: [committed] testsuite: Require trampolines for nested-vla tests

2023-05-25 Thread Martin Uecker via Gcc-patches


Thanks! I will try to not forget this next time.

On Thursday, 25.05.2023 at 21:20 +0300, Dimitar Dimitrov wrote:
> Three recent test cases declare nested C functions, so they fail on
> targets lacking support for trampolines. Fix by adding the necessary
> filter.
> 
> Committed as obvious.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/nested-vla-1.c: Require effective target trampolines.
>   * gcc.dg/nested-vla-2.c: Ditto.
>   * gcc.dg/nested-vla-3.c: Ditto.
> 
> CC: Martin Uecker 
> Signed-off-by: Dimitar Dimitrov 
> ---
>  gcc/testsuite/gcc.dg/nested-vla-1.c | 1 +
>  gcc/testsuite/gcc.dg/nested-vla-2.c | 1 +
>  gcc/testsuite/gcc.dg/nested-vla-3.c | 1 +
>  3 files changed, 3 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.dg/nested-vla-1.c 
> b/gcc/testsuite/gcc.dg/nested-vla-1.c
> index 5b62c2c213a..d1b3dc3c5f8 100644
> --- a/gcc/testsuite/gcc.dg/nested-vla-1.c
> +++ b/gcc/testsuite/gcc.dg/nested-vla-1.c
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-std=gnu99" } */
> +/* { dg-require-effective-target trampolines } */
>  
> 
>  
> 
>  int main()
> diff --git a/gcc/testsuite/gcc.dg/nested-vla-2.c 
> b/gcc/testsuite/gcc.dg/nested-vla-2.c
> index d83c90a0b16..294b01d370e 100644
> --- a/gcc/testsuite/gcc.dg/nested-vla-2.c
> +++ b/gcc/testsuite/gcc.dg/nested-vla-2.c
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-std=gnu99" } */
> +/* { dg-require-effective-target trampolines } */
>  
> 
>  
> 
>  int main()
> diff --git a/gcc/testsuite/gcc.dg/nested-vla-3.c 
> b/gcc/testsuite/gcc.dg/nested-vla-3.c
> index 1ffb482da3b..d2ba04adab8 100644
> --- a/gcc/testsuite/gcc.dg/nested-vla-3.c
> +++ b/gcc/testsuite/gcc.dg/nested-vla-3.c
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-std=gnu99" } */
> +/* { dg-require-effective-target trampolines } */
>  
> 
>  
> 
>  int main()




Re: [C PATCH v3] Fix ICEs related to VM types in C 2/2

2023-05-23 Thread Martin Uecker via Gcc-patches
On Tuesday, 23.05.2023 at 10:18 +0200, Richard Biener wrote:
> On Tue, May 23, 2023 at 8:24 AM Martin Uecker 
> wrote:
> > 
> > On Tuesday, 23.05.2023 at 08:13 +0200, Richard Biener wrote:
> > > On Mon, May 22, 2023 at 7:24 PM Martin Uecker via Gcc-patches
> > >  wrote:
> > > > 
> > > > 
> > > > 
> > > > This version contains the middle-end changes for PR109450
> > > > and test cases as before.  The main middle-end change is that
> > > > we use gimplify_type_sizes also for parameters and remove
> > > > the special code that also walked into pointers (which is
> > > > incorrect).
> > > > 
> > > > In addition, in the C FE this patch now also adds DECL_EXPR
> > > > for vm-types which are pointed-to by parameters declared
> > > > as arrays.  The new function created contains the exact
> > > > code previously used only for regular pointers, and is
> > > > now also called for parameters declared as arrays.
> > > > 
> > > > 
> > > > Martin
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > >     Fix ICEs related to VM types in C 2/2 [PR109450]
> > > > 
> > > >     Size expressions were sometimes lost and not gimplified
> > > > correctly,
> > > >     leading to ICEs and incorrect evaluation order.  Fix this
> > > > by 1) not
> > > >     recursing pointers when gimplifying parameters in the
> > > > middle-end
> > > >     (the code is merged with gimplify_type_sizes), which is
> > > > incorrect
> > > >     because it might access variables declared later for
> > > > incomplete
> > > >     structs, and 2) adding a decl expr for variably-modified
> > > > arrays
> > > >     that are pointed to by parameters declared as arrays.
> > > > 
> > > >     PR c/109450
> > > > 
> > > >     gcc/
> > > >     * c/c-decl.cc (add_decl_expr): New function.
> > > >     (grokdeclarator): Add decl expr for size expression
> > > > in
> > > >     types pointed to by parameters declared as arrays.
> > > >     * function.cc (gimplify_parm_type): Remove
> > > > function.
> > > >     (gimplify_parameters): Call gimplify_parm_sizes.
> > > >     * gimplify.cc (gimplify_type_sizes): Make function
> > > > static.
> > > >     (gimplify_parm_sizes): New function.
> > > > 
> > > >     gcc/testsuite/
> > > >     * gcc.dg/pr109450-1.c: New test.
> > > >     * gcc.dg/pr109450-2.c: New test.
> > > >     * gcc.dg/vla-26.c: New test.
> > > > 
> > > > diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
> > > > index 494d3cf1747..c35347734b2 100644
> > > > --- a/gcc/c/c-decl.cc
> > > > +++ b/gcc/c/c-decl.cc
> > > > @@ -6490,6 +6490,55 @@ smallest_type_quals_location (const
> > > > location_t *locations,
> > > >    return loc;
> > > >  }
> > > > 
> > > > +
> > > > +/* We attach an artificial TYPE_DECL to pointed-to type
> > > > +   and arrange for it to be included in a DECL_EXPR.  This
> > > > +   forces the sizes evaluation at a safe point and ensures it
> > > > +   is not deferred until e.g. within a deeper conditional
> > > > context.
> > > > +
> > > > +   PARM contexts have no enclosing statement list that
> > > > +   can hold the DECL_EXPR, so we need to use a BIND_EXPR
> > > > +   instead, and add it to the list of expressions that
> > > > +   need to be evaluated.
> > > > +
> > > > +   TYPENAME contexts do have an enclosing statement list,
> > > > +   but it would be incorrect to use it, as the size should
> > > > +   only be evaluated if the containing expression is
> > > > +   evaluated.  We might also be in the middle of an
> > > > +   expression with side effects on the pointed-to type size
> > > > +   "arguments" prior to the pointer declaration point and
> > > > +   the fake TYPE_DECL in the enclosing context would force
> > > > +   the size evaluation prior to the side effects.  We
> > > > therefore

Re: [C PATCH v3] Fix ICEs related to VM types in C 2/2

2023-05-23 Thread Martin Uecker via Gcc-patches
On Tuesday, 23.05.2023 at 08:13 +0200, Richard Biener wrote:
> On Mon, May 22, 2023 at 7:24 PM Martin Uecker via Gcc-patches
>  wrote:
> > 
> > 
> > 
> > This version contains the middle-end changes for PR109450
> > and test cases as before.  The main middle-end change is that
> > we use gimplify_type_sizes also for parameters and remove
> > the special code that also walked into pointers (which is
> > incorrect).
> > 
> > In addition, in the C FE this patch now also adds DECL_EXPR
> > for vm-types which are pointed-to by parameters declared
> > as arrays.  The new function created contains the exact
> > code previously used only for regular pointers, and is
> > now also called for parameters declared as arrays.
> > 
> > 
> > Martin
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Fix ICEs related to VM types in C 2/2 [PR109450]
> > 
> > Size expressions were sometimes lost and not gimplified correctly,
> > leading to ICEs and incorrect evaluation order.  Fix this by 1) not
> > recursing pointers when gimplifying parameters in the middle-end
> > (the code is merged with gimplify_type_sizes), which is incorrect
> > because it might access variables declared later for incomplete
> > structs, and 2) adding a decl expr for variably-modified arrays
> > that are pointed to by parameters declared as arrays.
> > 
> > PR c/109450
> > 
> > gcc/
> > * c/c-decl.cc (add_decl_expr): New function.
> > (grokdeclarator): Add decl expr for size expression in
> > types pointed to by parameters declared as arrays.
> > * function.cc (gimplify_parm_type): Remove function.
> > (gimplify_parameters): Call gimplify_parm_sizes.
> > * gimplify.cc (gimplify_type_sizes): Make function static.
> > (gimplify_parm_sizes): New function.
> > 
> > gcc/testsuite/
> > * gcc.dg/pr109450-1.c: New test.
> > * gcc.dg/pr109450-2.c: New test.
> > * gcc.dg/vla-26.c: New test.
> > 
> > diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
> > index 494d3cf1747..c35347734b2 100644
> > --- a/gcc/c/c-decl.cc
> > +++ b/gcc/c/c-decl.cc
> > @@ -6490,6 +6490,55 @@ smallest_type_quals_location (const location_t 
> > *locations,
> >    return loc;
> >  }
> > 
> > +
> > +/* We attach an artificial TYPE_DECL to pointed-to type
> > +   and arrange for it to be included in a DECL_EXPR.  This
> > +   forces the sizes evaluation at a safe point and ensures it
> > +   is not deferred until e.g. within a deeper conditional context.
> > +
> > +   PARM contexts have no enclosing statement list that
> > +   can hold the DECL_EXPR, so we need to use a BIND_EXPR
> > +   instead, and add it to the list of expressions that
> > +   need to be evaluated.
> > +
> > +   TYPENAME contexts do have an enclosing statement list,
> > +   but it would be incorrect to use it, as the size should
> > +   only be evaluated if the containing expression is
> > +   evaluated.  We might also be in the middle of an
> > +   expression with side effects on the pointed-to type size
> > +   "arguments" prior to the pointer declaration point and
> > +   the fake TYPE_DECL in the enclosing context would force
> > +   the size evaluation prior to the side effects.  We therefore
> > +   use BIND_EXPRs in TYPENAME contexts too.  */
> > +static void
> > +add_decl_expr(location_t loc, enum decl_context decl_context, tree type, 
> > tree *expr)
> > +{
> > +  tree bind = NULL_TREE;
> > +  if (decl_context == TYPENAME || decl_context == PARM || decl_context == 
> > FIELD)
> > +{
> > +  bind = build3 (BIND_EXPR, void_type_node, NULL_TREE, NULL_TREE, 
> > NULL_TREE);
> > +  TREE_SIDE_EFFECTS (bind) = 1;
> > +  BIND_EXPR_BODY (bind) = push_stmt_list ();
> > +  push_scope ();
> > +}
> > +
> > +  tree decl = build_decl (loc, TYPE_DECL, NULL_TREE, type);
> > +  pushdecl (decl);
> > +  DECL_ARTIFICIAL (decl) = 1;
> > +  add_stmt (build_stmt (DECL_SOURCE_LOCATION (decl), DECL_EXPR, decl));
> > +  TYPE_NAME (type) = decl;
> > +
> > +  if (bind)
> > +{
> > +  pop_scope ();
> > +  BIND_EXPR_BODY (bind) = pop_stmt_list (BIND_EXPR_BODY (bind));
> > +  if (*expr)
> > +   *expr = build2 (COMPOUND_EXPR, void_type_node, *expr, bind);
> > +  else
> 
