Re: [PATCH] libstdc++: Test 17_intro/names.cc with -D_FORTIFY_SOURCE=2 [PR116210]

2024-10-04 Thread Jakub Jelinek
On Fri, Oct 04, 2024 at 10:03:36AM -0400, Siddhesh Poyarekar wrote:
> > Add a new testcase that repeats 17_intro/names.cc but with
> > _FORTIFY_SOURCE defined, to find problems in Glibc fortify wrappers like
> > https://sourceware.org/bugzilla/show_bug.cgi?id=32052 (which is fixed
> > now).
> > 
> > libstdc++-v3/ChangeLog:
> > 
> > PR libstdc++/116210
> > * testsuite/17_intro/names.cc (sz): Undef for versions of Glibc
> > that use it in the fortify wrappers.
> > * testsuite/17_intro/names_fortify.cc: New test.
> > ---
> >   libstdc++-v3/testsuite/17_intro/names.cc | 7 +++
> >   libstdc++-v3/testsuite/17_intro/names_fortify.cc | 6 ++
> >   2 files changed, 13 insertions(+)
> >   create mode 100644 libstdc++-v3/testsuite/17_intro/names_fortify.cc
> > 
> > diff --git a/libstdc++-v3/testsuite/17_intro/names.cc 
> > b/libstdc++-v3/testsuite/17_intro/names.cc
> > index 6b9a3639aad..bbf45b93dee 100644
> > --- a/libstdc++-v3/testsuite/17_intro/names.cc
> > +++ b/libstdc++-v3/testsuite/17_intro/names.cc
> > @@ -377,4 +377,11 @@
> >   #undef y
> >   #endif
> > +#if defined __GLIBC_PREREQ && defined _FORTIFY_SOURCE
> > +# if __GLIBC_PREREQ(2,35) && ! __GLIBC_PREREQ(2,41)
> > +// https://sourceware.org/bugzilla/show_bug.cgi?id=32052
> > +#  undef sz
> > +# endif
> > +#endif
> 
> We've backported the fix to stable branches, so the version check isn't
> really that reliable.

That doesn't matter that much.  The worst that happens is that with those
older fixed glibc versions the testing will not test that symbol.
What is more important is that it is checked on the latest glibc,
so when people test gcc with that version, they'll notice if it regresses.

Jakub



Re: [patch,avr] Implement TARGET_FLOATN_MODE

2024-10-04 Thread Jakub Jelinek
On Fri, Oct 04, 2024 at 08:09:48AM -0600, Jeff Law wrote:
> 
> 
> On 10/4/24 7:46 AM, Georg-Johann Lay wrote:
> > This patch implements TARGET_FLOATN_MODE which maps
> > _Float32[x] to SFmode and _Float64[x] to DFmode.
> > 
> > There is currently no library support for extended float types,
> > but these settings are more reasonable for avr (and they make
> > more tests pass).
> > 
> > Ok for trunk?
> > 
> > Johann
> > 
> > -- 
> > 
> > AVR: Implement TARGET_FLOATN_MODE.
> > 
> > gcc/
> >  * config/avr/avr.cc (avr_floatn_mode): New static function.
> >  (TARGET_FLOATN_MODE): New define.
> OK

This is certainly incorrect.

As specified by e.g. ISO C23 H.2.3 Extended floating types, the requirement
on the extended floating types is:
"For each of its basic formats, IEC 60559 specifies an extended format whose 
maximum exponent and
precision exceed those of the basic format it is associated with. Extended 
formats are intended for
arithmetic with more precision and exponent range than is available in the 
basic formats used for
the input data."
So, while SFmode is a good mode to use for _Float32 and DFmode is a good
mode to use for _Float64, SFmode isn't a good mode to use for _Float32x and
neither is DFmode a good mode to use for _Float64x.
I'd expect you want DFmode for _Float32x and opt_scalar_float_mode () for
_Float64x.

Jakub



Re: [PATCH] libstdc++: Test 17_intro/names.cc with -D_FORTIFY_SOURCE=2 [PR116210]

2024-10-04 Thread Jakub Jelinek
On Fri, Oct 04, 2024 at 12:52:11PM +0100, Jonathan Wakely wrote:
> This doesn't really belong in our testsuite, because the sole purpose of
> the new test is to find bugs in the Glibc wrappers (like the one linked
> below). But maybe it's a kindness to do it in our testsuite, because we
> already have this test in place, and one Glibc bug was already found
> thanks to Sam running the existing test with _FORTIFY_SOURCE defined.
> 
> Should we do this?

I think so.  While those bugs are glibc bugs, libstdc++ uses libc headers
and so if they have namespace cleanness issues, so does libstdc++.

> Add a new testcase that repeats 17_intro/names.cc but with
> _FORTIFY_SOURCE defined, to find problems in Glibc fortify wrappers like
> https://sourceware.org/bugzilla/show_bug.cgi?id=32052 (which is fixed
> now).
> 
> libstdc++-v3/ChangeLog:
> 
>   PR libstdc++/116210
>   * testsuite/17_intro/names.cc (sz): Undef for versions of Glibc
>   that use it in the fortify wrappers.
>   * testsuite/17_intro/names_fortify.cc: New test.

Jakub



Re: [PATCH 1/2] c++: add -Wdeprecated-literal-operator [CWG2521]

2024-10-04 Thread Jakub Jelinek
On Fri, Oct 04, 2024 at 12:19:03PM +0200, Jakub Jelinek wrote:
> Though, maybe the tests should have both the deprecated syntax and the
> non-deprecated one...

Here is a variant of the patch which does that.

Tested on x86_64-linux and i686-linux, ok for trunk?

2024-10-04  Jakub Jelinek  

* g++.dg/cpp26/unevalstr1.C: Revert the 2024-10-03 changes, instead
expect extra warnings.  Add another set of tests without space
between " and _.
* g++.dg/cpp26/unevalstr2.C: Expect extra warnings for C++23.  Add
another set of tests without space between " and _.

--- gcc/testsuite/g++.dg/cpp26/unevalstr1.C.jj  2024-10-04 12:28:08.820899177 
+0200
+++ gcc/testsuite/g++.dg/cpp26/unevalstr1.C 2024-10-04 14:15:35.563531334 
+0200
@@ -83,21 +83,57 @@ extern "\o{0103}" { int f14 (); }   // { d
 [[nodiscard ("\x{20}")]] int h19 ();   // { dg-error "numeric escape sequence 
in unevaluated string" }
 [[nodiscard ("\h")]] int h20 ();   // { dg-error "unknown escape sequence" 
}
 
-float operator ""_my0 (const char *);
-float operator "" ""_my1 (const char *);
-float operator L""_my2 (const char *); // { dg-error "invalid encoding prefix 
in literal operator" }
-float operator u""_my3 (const char *); // { dg-error "invalid encoding prefix 
in literal operator" }
-float operator U""_my4 (const char *); // { dg-error "invalid encoding prefix 
in literal operator" }
-float operator u8""_my5 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
-float operator L"" ""_my6 (const char *);  // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator u"" ""_my7 (const char *);  // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator U"" ""_my8 (const char *);  // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator u8"" ""_my9 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" L""_my10 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" u""_my11 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" U""_my12 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" u8""_my13 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "\0"_my14 (const char *);   // { dg-error "expected empty 
string after 'operator' keyword" }
-float operator "\x00"_my15 (const char *); // { dg-error "expected empty 
string after 'operator' keyword" }
-float operator "\h"_my16 (const char *);   // { dg-error "expected empty 
string after 'operator' keyword" }
+float operator "" _my0 (const char *);
+float operator "" "" _my1 (const char *);
+float operator L"" _my2 (const char *);// { dg-error "invalid 
encoding prefix in literal operator" }
+float operator u"" _my3 (const char *);// { dg-error "invalid 
encoding prefix in literal operator" }
+float operator U"" _my4 (const char *);// { dg-error "invalid 
encoding prefix in literal operator" }
+float operator u8"" _my5 (const char *);   // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator L"" "" _my6 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator u"" "" _my7 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator U"" "" _my8 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator u8"" "" _my9 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
+float operator "" L"" _my10 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
+float operator "" u"" _my11 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
+float operator "" U"" _my12 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
+float operator "" u8"" _my13 (const char *);   // { dg-error &q

Re: [PATCH 1/2] c++: add -Wdeprecated-literal-operator [CWG2521]

2024-10-04 Thread Jakub Jelinek
On Thu, Oct 03, 2024 at 12:39:59PM -0400, Jason Merrill wrote:
>   * g++.dg/cpp26/unevalstr1.C

This patch didn't touch unevalstr2.C which now FAILs because of the new
warnings in C++23 mode (the testcase is target { c++11 && c++23_down }, so
isn't run in C++26).

The intent in both of those tests was to test the separate (now deprecated)
syntax, so instead of removing the space between closing " and _ I've
adjusted the testcase to expect those 17 extra warnings.  And I've also
adjusted the unevalstr1.C testcase to do the same, when it is removed from
C++29 or whatever, that can be just guarded by #if.
Though, maybe the tests should have both the deprecated syntax and the
non-deprecated one...

Tested on x86_64-linux and i686-linux, ok for trunk?

2024-10-04  Jakub Jelinek  

* g++.dg/cpp26/unevalstr1.C: Revert the 2024-10-03 changes, instead 
expect
extra warnings.
* g++.dg/cpp26/unevalstr2.C: Expect extra warnings for C++23.

--- gcc/testsuite/g++.dg/cpp26/unevalstr1.C.jj  2024-10-03 20:10:07.392515277 
+0200
+++ gcc/testsuite/g++.dg/cpp26/unevalstr1.C 2024-10-04 11:04:18.309736501 
+0200
@@ -83,21 +83,38 @@ extern "\o{0103}" { int f14 (); }   // { d
 [[nodiscard ("\x{20}")]] int h19 ();   // { dg-error "numeric escape sequence 
in unevaluated string" }
 [[nodiscard ("\h")]] int h20 ();   // { dg-error "unknown escape sequence" 
}
 
-float operator ""_my0 (const char *);
-float operator "" ""_my1 (const char *);
-float operator L""_my2 (const char *); // { dg-error "invalid encoding prefix 
in literal operator" }
-float operator u""_my3 (const char *); // { dg-error "invalid encoding prefix 
in literal operator" }
-float operator U""_my4 (const char *); // { dg-error "invalid encoding prefix 
in literal operator" }
-float operator u8""_my5 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
-float operator L"" ""_my6 (const char *);  // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator u"" ""_my7 (const char *);  // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator U"" ""_my8 (const char *);  // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator u8"" ""_my9 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" L""_my10 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" u""_my11 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" U""_my12 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "" u8""_my13 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
-float operator "\0"_my14 (const char *);   // { dg-error "expected empty 
string after 'operator' keyword" }
-float operator "\x00"_my15 (const char *); // { dg-error "expected empty 
string after 'operator' keyword" }
-float operator "\h"_my16 (const char *);   // { dg-error "expected empty 
string after 'operator' keyword" }
+float operator "" _my0 (const char *);
+float operator "" "" _my1 (const char *);
+float operator L"" _my2 (const char *);// { dg-error "invalid 
encoding prefix in literal operator" }
+float operator u"" _my3 (const char *);// { dg-error "invalid 
encoding prefix in literal operator" }
+float operator U"" _my4 (const char *);// { dg-error "invalid 
encoding prefix in literal operator" }
+float operator u8"" _my5 (const char *);   // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator L"" "" _my6 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator u"" "" _my7 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator U"" "" _my8 (const char *); // { dg-error "invalid encoding 
prefix in literal operator" }
+float operator u8"" "" _my9 (const char *);// { dg-error "invalid encoding 
prefix in literal operator" }
+float operator "" L"" _my10 (const char *);// { dg-error "invalid encoding 
prefix in l

[PATCH] i386: Fix up ix86_expand_int_compare with TImode comparisons of SUBREGs from V8{H,B}Fmode against zero [PR116921]

2024-10-04 Thread Jakub Jelinek
Hi!

The following testcase ICEs, because the ix86_expand_int_compare
optimization to use {,v}ptest assumes there are instructions for all
16-byte vector modes.  That isn't the case, we only have one for
V16QI, V8HI, V4SI, V2DI, V1TI, V4SF and V2DF, not for
V8HF nor V8BF.

The following patch fixes that by using the V8HI instruction instead
for those 2 modes.  tmp can't be a SUBREG, because it is SUBREG_REG
of another SUBREG, so we don't need to worry about gen_lowpart
failing.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-10-04  Jakub Jelinek  

PR target/116921
* config/i386/i386-expand.cc (ix86_expand_int_compare): Add a SUBREG
to V8HImode from V8HFmode or V8BFmode before generating a ptest.

* gcc.target/i386/pr116921.c: New test.

--- gcc/config/i386/i386-expand.cc.jj   2024-10-03 17:27:28.328227793 +0200
+++ gcc/config/i386/i386-expand.cc  2024-10-03 18:11:18.514076904 +0200
@@ -3095,6 +3095,9 @@ ix86_expand_int_compare (enum rtx_code c
   && GET_MODE_SIZE (GET_MODE (SUBREG_REG (op0))) == 16)
 {
   tmp = SUBREG_REG (op0);
+  if (GET_MODE_INNER (GET_MODE (tmp)) == HFmode
+ || GET_MODE_INNER (GET_MODE (tmp)) == BFmode)
+   tmp = gen_lowpart (V8HImode, tmp);
   tmp = gen_rtx_UNSPEC (CCZmode, gen_rtvec (2, tmp, tmp), UNSPEC_PTEST);
 }
   else
--- gcc/testsuite/gcc.target/i386/pr116921.c.jj 2024-10-03 18:16:36.368711747 
+0200
+++ gcc/testsuite/gcc.target/i386/pr116921.c2024-10-03 18:17:25.702034243 
+0200
@@ -0,0 +1,12 @@
+/* PR target/116921 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -msse4" } */
+
+long x;
+_Float16 __attribute__((__vector_size__ (16))) f;
+
+void
+foo (void)
+{
+  x -= !(__int128) (f / 2);
+}

Jakub



[PATCH] i386: Fix up *minmax3_2 splitter [PR116925]

2024-10-04 Thread Jakub Jelinek
Hi!

While *minmax3_1 correctly uses
   if (MEM_P (operands[1]))
 operands[1] = force_reg (mode, operands[1]);
to ensure operands[1] is not a MEM, *minmax3_2 does it wrongly
by calling force_reg but ignoring its return value.

The following borderingly obvious patch fixes that, bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk?

Didn't find similar other errors in the backend with force_reg calls.

2024-10-04  Jakub Jelinek  

PR target/116925
* config/i386/sse.md (*minmax3_2): Assign force_reg result
back to operands[2] instead of throwing it away.

* g++.target/i386/avx-pr116925.C: New test.

--- gcc/config/i386/sse.md.jj   2024-10-01 09:38:57.0 +0200
+++ gcc/config/i386/sse.md  2024-10-03 17:33:12.071507421 +0200
@@ -3269,7 +3269,7 @@ (define_insn_and_split "*minmax3_2
  u = UNSPEC_IEEE_MAX;
 
if (MEM_P (operands[2]))
- force_reg (mode, operands[2]);
+ operands[2] = force_reg (mode, operands[2]);
rtvec v = gen_rtvec (2, operands[2], operands[1]);
rtx tmp = gen_rtx_UNSPEC (mode, v, u);
emit_move_insn (operands[0], tmp);
--- gcc/testsuite/g++.target/i386/avx-pr116925.C.jj 2024-10-03 
17:36:10.12406 +0200
+++ gcc/testsuite/g++.target/i386/avx-pr116925.C2024-10-03 
17:35:26.805656671 +0200
@@ -0,0 +1,12 @@
+// PR target/116925
+// { dg-do compile }
+// { dg-options "-O2 -mavx -ffloat-store" }
+
+typedef float V __attribute__((vector_size (16)));
+V a, b, c;
+
+void
+foo ()
+{
+  c = a > b ? a : b;
+}

Jakub



[PATCH] diagnostic, pch: Fix up the new diagnostic PCH methods for ubsan checking [PR116936]

2024-10-04 Thread Jakub Jelinek
Hi!

The PR notes that the new pch_save/pch_restore methods I've added
recently invoke UB if either m_classification_history.address ()
or m_push_list.address () is NULL (which can happen if those vectors
are empty (and in the pch_save case nothing has been pushed into them
before either).  While the corresponding length is necessarily 0,
fwrite (NULL, something, 0, f) or
fread (NULL, something, 0, f) still invoke UB.

The following patch fixes that by not calling fwrite/fread if the
corresponding length is 0.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-10-04  Jakub Jelinek  

PR pch/116936
* diagnostic.cc (diagnostic_option_classifier::pch_save): Only call
fwrite if corresponding length is non-zero.
(diagnostic_option_classifier::pch_restore): Only call fread if
corresponding length is non-zero.

--- gcc/diagnostic.cc.jj2024-10-01 09:38:58.014961851 +0200
+++ gcc/diagnostic.cc   2024-10-02 20:33:37.922953272 +0200
@@ -165,11 +165,13 @@ diagnostic_option_classifier::pch_save (
   unsigned int lengths[2] = { m_classification_history.length (),
  m_push_list.length () };
   if (fwrite (lengths, sizeof (lengths), 1, f) != 1
-  || fwrite (m_classification_history.address (),
-sizeof (diagnostic_classification_change_t),
-lengths[0], f) != lengths[0]
-  || fwrite (m_push_list.address (), sizeof (int),
-lengths[1], f) != lengths[1])
+  || (lengths[0]
+ && fwrite (m_classification_history.address (),
+sizeof (diagnostic_classification_change_t),
+lengths[0], f) != lengths[0])
+  || (lengths[1]
+ && fwrite (m_push_list.address (), sizeof (int),
+lengths[1], f) != lengths[1]))
 return -1;
   return 0;
 }
@@ -187,11 +189,13 @@ diagnostic_option_classifier::pch_restor
   gcc_checking_assert (m_push_list.is_empty ());
   m_classification_history.safe_grow (lengths[0]);
   m_push_list.safe_grow (lengths[1]);
-  if (fread (m_classification_history.address (),
-sizeof (diagnostic_classification_change_t),
-lengths[0], f) != lengths[0]
-  || fread (m_push_list.address (), sizeof (int),
-   lengths[1], f) != lengths[1])
+  if ((lengths[0]
+   && fread (m_classification_history.address (),
+sizeof (diagnostic_classification_change_t),
+lengths[0], f) != lengths[0])
+  || (lengths[1]
+ && fread (m_push_list.address (), sizeof (int),
+   lengths[1], f) != lengths[1]))
 return -1;
   return 0;
 }

Jakub



[PATCH] ssa-math-opts, i386: Improve spaceship expansion [PR116896]

2024-10-04 Thread Jakub Jelinek
Hi!

The PR notes that we don't emit optimal code for C++ spaceship
operator if the result is returned as an integer rather than the
result just being compared against different values and different
code executed based on that.
So e.g. for
template 
auto foo (T x, T y) { return x <=> y; }
for both floating point types, signed integer types and unsigned integer
types.  auto in that case is std::strong_ordering or std::partial_ordering,
which are fancy C++ abstractions around struct with signed char member
which is -1, 0, 1 for the strong ordering and -1, 0, 1, 2 for the partial
ordering (but for -ffast-math 2 is never the case).
I'm afraid functions like that are fairly common and unless they are
inlined, we really need to map the comparison to those -1, 0, 1 or
-1, 0, 1, 2 values.

Now, for floating point spaceship I've in the past already added an
optimization (with tree-ssa-math-opts.cc discovery and named optab, the
optab only defined on x86 though right now), which ensures there is just
a single comparison instruction and then just tests based on flags.
Now, if we have code like:
  auto a = x <=> y;
  if (a == std::partial_ordering::less)
bar ();
  else if (a == std::partial_ordering::greater)
baz ();
  else if (a == std::partial_ordering::equivalent)
qux ();
  else if (a == std::partial_ordering::unordered)
corge ();
etc., that results in decent code generation, the spaceship named pattern
on x86 optimizes for the jumps, so emits comparisons on the flags, followed
by setting the result to -1, 0, 1, 2 and subsequent jump pass optimizes that
well.  But if the result needs to be stored into an integer and just
returned that way or there are no immediate jumps based on it (or turned
into some non-standard integer values like -42, 0, 36, 75 etc.), then CE
doesn't do a good job for that, we end up with say
comiss  %xmm1, %xmm0
jp  .L4
seta%al
movl$0, %edx
leal-1(%rax,%rax), %eax
cmove   %edx, %eax
ret
.L4:
movl$2, %eax
ret
The jp is good, that is the unlikely case and can't be easily handled in
straight line code due to the layout of the flags, but the rest uses cmov
which often isn't a win and a weird math.
With the patch below we can get instead
xorl%eax, %eax
comiss  %xmm1, %xmm0
jp  .L2
seta%al
sbbl$0, %eax
ret
.L2:
movl$2, %eax
ret

The patch changes the discovery in the generic code, by detecting if
the future .SPACESHIP result is just used in a PHI with -1, 0, 1 or
-1, 0, 1, 2 values (the latter for HONOR_NANS) and passes that as a flag in
a new argument to .SPACESHIP ifn, so that the named pattern is told whether
it should optimize for branches or for loading the result into a -1, 0, 1
(, 2) integer.  Additionally, it doesn't detect just floating point <=>
anymore, but also integer and unsigned integer, but in those cases only
if an integer -1, 0, 1 is wanted (otherwise == and > or similar comparisons
result in good code).
The backend then can for those integer or unsigned integer <=>s return
effectively (x > y) - (x < y) in a way that is efficient on the target
(so for x86 with ensuring zero initialization first when needed before
setcc; one for floating point and unsigned, where there is just one setcc
and the second one optimized into sbb instruction, two for the signed int
case).  So e.g. for signed int we now emit
xorl%edx, %edx
xorl%eax, %eax
cmpl%esi, %edi
setl%dl
setg%al
subl%edx, %eax
ret
and for unsigned
xorl%eax, %eax
cmpl%esi, %edi
seta%al
sbbb$0, %al
ret

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, I wonder if other targets wouldn't benefit from defining the
named optab too...

2024-10-04  Jakub Jelinek  

PR middle-end/116896
* optabs.def (spaceship_optab): Use spaceship$a4 rather than
spaceship$a3.
* internal-fn.cc (expand_SPACESHIP): Expect 3 call arguments
rather than 2, expand the last one, expect 4 operands of
spaceship_optab.
* tree-ssa-math-opts.cc: Include cfghooks.h.
(optimize_spaceship): Check if a single PHI is initialized to
-1, 0, 1, 2 or -1, 0, 1 values, in that case pass 1 as last (new)
argument to .SPACESHIP and optimize away the comparisons,
otherwise pass 0.  Also check for integer comparisons rather than
floating point, in that case do it only if there is a single PHI
with -1, 0, 1 values and pass 1 to last argument of .SPACESHIP
if the <=> is signed, 2 if unsigned.
* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Add
another rtx argument.
(ix86_expand_int_spaceship): Declare.
* config/i

Re: [PATCH] C/116735 - ICE in build_counted_by_ref

2024-10-02 Thread Jakub Jelinek
On Wed, Oct 02, 2024 at 11:48:16AM -0400, Marek Polacek wrote:
> > +  error_at (DECL_SOURCE_LOCATION (field_decl),
> > +   "argument %qE to the %qE attribute is not a field declaration"
> > +   " in the same structure as %qD", fieldname,
> > +   (get_attribute_name (attr_counted_by)),
> 
> Why use get_attribute_name when we know it must be "counted_by"?  And below
> too.

There might be a reason if the message would be used by multiple
spots with different attributes and the other uses would need that %qE,
rather than say %qs or % (to make it easier for translators).
If the message is only for this attribute, just use %, or
if it would be for several attributes but in each case you'd know the name
as constant literal, %qs with "counted_by" operand would be best.

That said, the ()s around the call are also superfluous, so if it isn't
changed, it should be just
get_attribute_name (attr_counted_by),

Jakub



Re: [RFC PATCH] Allow limited extended asm at toplevel

2024-10-02 Thread Jakub Jelinek
On Wed, Oct 02, 2024 at 01:59:03PM +0200, Richard Biener wrote:
> As you are using input constraints to mark symbol uses maybe we can
> use output constraints with a magic identifier (and a constraint letter
> specifying 'identifier'):
> 
> asm (".globl %0; %0: ret" : "_D" (extern int foo()) : ...);
> 
> In the BOF it was noted that LTO wants to be able to rename / localize
> symbols so both use and definition should be used in a way to support
> this (though changing visibility is difficult - the assembler might
> tie to GOT uses, and .globl is hard to replace).

Seems we have quite a few free letters on the constraint side, I think
'-.:;[]@(){}|_`
are all currently rejected in both input and output constraints
(we only allow ISALPHA constraint chars for the target specific ones).
So, using one of those for "the inline asm defines this global symbol",
another for "the inline asm defines this local symbol", and another
"the inline asm uses this function" might be possible;
Of course it can be also just one of those special characters with
some qualifier after it, we'd just need to tweak the generated
insn_constraint_len for that.  I think the asm uses some variable can
be expressed with "m" (var) just fine.
Anyway, for function and var definitions, should the compiler be told
more information (e.g. whether it is weak or non-weak), or should one
assume that from the actually used FUNCTION_DECL/VAR_DECL?  Should the
global vs. local be also implied from it?  Though for variables, how does
one declare a static variable declaration but not definition?

And another thing is what the argument should be, what you wrote about
would be hard to parse (parsing dependent on the constraints, we usually
don't parse the constraint until the expression is parsed).
extern int foo ();
asm ("... : : "_" (foo));
probably would (but maybe just extern "C" for C++).

And we'd need to decide whether for LTO toplevel inline asm with
extended asm (as a new extension) is actually required not to rely on
the exact function/variable name at least in the non-global case and
the compiler is allowed to rename them (i.e. use %c0 etc. rather than
the actual name).

Jakub



[RFC PATCH] Allow limited extended asm at toplevel

2024-10-02 Thread Jakub Jelinek
Hi!

In the Cauldron IPA/LTO BoF we've discussed toplevel asms and it was
discussed it would be nice to tell the compiler something about what
the toplevel asm does.  Sure, I'm aware the kernel people said they
aren't willing to use something like that, but perhaps other projects
do.  And for kernel perhaps we should add some new option which allows
some dumb parsing of the toplevel asms and gather something from that
parsing.

The following patch is just a small step towards that, namely, allow
some subset of extended inline asm outside of functions.
The patch is unfinished, LTO streaming (out/in) of the ASM_EXPRs isn't
implemented, nor any cgraph/varpool changes to find out references etc.

The patch allows something like:

int a[2], b;
enum { E1, E2, E3, E4, E5 };
struct S { int a; char b; long long c; };
asm (".section blah; .quad %P0, %P1, %P2, %P3, %P4; .previous"
 : : "m" (a), "m" (b), "i" (42), "i" (E4), "i" (sizeof (struct S)));

Even for non-LTO, that could be useful e.g. for getting enumerators from
C/C++ as integers into the toplevel asm, or sizeof/offsetof etc.

The restrictions I've implemented are:
1) asm qualifiers aren't still allowed, so asm goto or asm inline can't be
   specified at toplevel, asm volatile has the volatile ignored for C++ with
   a warning and is an error in C like before
2) I see good use for mainly input operands, output maybe to make it clear
   that the inline asm may write some memory, I don't see a good use for
   clobbers, so the patch doesn't allow those (and of course labels because
   asm goto can't be specified)
3) the patch allows only constraints which don't allow registers, so
   typically "m" or "i" or other memory or immediate constraints; for
   memory, it requires that the operand is addressable and its address
   could be used in static var initializer (so that no code actually
   needs to be emitted for it), for others that they are constants usable
   in the static var initializers
4) the patch disallows + (there is no reload of the operands, so I don't
   see benefits of tying some operands together), nor % (who cares if
   something is commutative in this case), or & (again, no code is emitted
   around the asm), nor the 0-9 constraints

Right now there is no way to tell the compiler that the inline asm defines
some symbol, I wonder if we can find some unused constraint letter or
sequence or other syntax to denote that.  Note, I think we want to be
able to specify that an inline asm defines a function or variable and
be able to provide the type etc. thereof.  So
extern void foo (void);
extern int var;
asm ("%P0: ret" : : "defines" (foo));
asm ("%P0: .quad 0" : : "defines" (var));
where the exact "defines" part is TBD.

Another question is whether all targets have something like x86 P print
modifier which doesn't add any stuff around the printed expressions
(perhaps there are targets which don't do that just for %0 etc.), or
whether we want to add something that will be usable on all targets.

Thoughts on this?

2024-10-02  Jakub Jelinek  

gcc/
* output.h (insn_noperands): Declare.
* final.cc (insn_noperands): No longer static.
* varasm.cc (assemble_asm): Handle ASM_EXPR.
* doc/extend.texi (Basic @code{asm}, Extended @code{asm}): Document
that extended asm is now allowed outside of functions with certain
restrictions.
gcc/c/
* c-parser.cc (c_parser_asm_string_literal): Add forward declaration.
(c_parser_asm_definition): Parse also extended asm without
clobbers/labels.
* c-typeck.cc (build_asm_expr): Allow extended asm outside of
functions and check extra restrictions.
gcc/cp/
* cp-tree.h (finish_asm_stmt): Add TOPLEV_P argument.
* parser.cc (cp_parser_asm_definition): Parse also extended asm
without clobbers/labels outside of functions.
* semantics.cc (finish_asm_stmt): Add TOPLEV_P argument, if set,
check extra restrictions for extended asm outside of functions.
* pt.cc (tsubst_stmt): Adjust finish_asm_stmt caller.

--- gcc/output.h.jj 2024-10-02 10:02:08.031896380 +0200
+++ gcc/output.h2024-10-02 11:27:13.383943702 +0200
@@ -338,6 +338,9 @@ extern rtx_insn *current_output_insn;
The precise value is the insn being output, to pass to error_for_asm.  */
 extern const rtx_insn *this_is_asm_operands;
 
+/* Number of operands of this insn, for an `asm' with operands.  */
+extern unsigned int insn_noperands;
+
 /* Carry information from ASM_DECLARE_OBJECT_NAME
to ASM_FINISH_DECLARE_OBJECT.  */
 extern int size_directive_output;
--- gcc/final.cc.jj 2024-10-02 10:02:08.031896380 +0200
+++ gcc/final.cc2024-10-02 11:27:13.382943715 +0200
@@ -149,7 +149,7 @@ extern const int lengt

[committed] range-cache: Fix ranger ICE if number of bbs increases [PR116899]

2024-10-01 Thread Jakub Jelinek
Hi!

Ranger cache during initialization reserves number of basic block slots
in the m_workback vector (in an inefficient way by doing create (0)
and safe_grow_cleared (n) and truncate (0) rather than say create (n)
or reserve (n) after create).  The problem is that some passes split bbs and/or
create new basic blocks and so when one is unlucky, the quick_push into that
vector can ICE.

The following patch replaces those 4 quick_push calls with safe_push.

I've also gathered some statistics from compiling 63 gcc sources (picked those
that dependent on gimple-range-cache.h just because I had to rebuild them once
for the instrumentation), and that showed that in 81% of cases nothing has
been pushed into the vector at all (and note, not everything was small, there
were even cases with 1+ basic blocks), another 18.5% of cases had just 1-4
elements in the vector at most, 0.08% had 5-8 and 19 out of 305386 cases
had at most 9-11 elements, nothing more.  So, IMHO reserving number of basic
block in the vector is a waste of compile time memory and with the safe_push
calls, the patch just initializes the vector to vNULL.

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved in the PR
by Andrew, committed to trunk.

2024-10-01  Jakub Jelinek  

PR middle-end/116899
* gimple-range-cache.cc (ranger_cache::ranger_cache): Set m_workback
to vNULL instead of creating it, growing and then truncating.
(ranger_cache::fill_block_cache): Use safe_push rather than quick_push
on m_workback.
(ranger_cache::range_from_dom): Likewise.

* gcc.dg/bitint-111.c: New test.

--- gcc/gimple-range-cache.cc.jj2024-09-30 18:50:52.314056272 +0200
+++ gcc/gimple-range-cache.cc   2024-09-30 19:33:22.160685645 +0200
@@ -997,9 +997,7 @@ update_list::pop ()
 
 ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
 {
-  m_workback.create (0);
-  m_workback.safe_grow_cleared (last_basic_block_for_fn (cfun));
-  m_workback.truncate (0);
+  m_workback = vNULL;
   m_temporal = new temporal_cache;
 
   // If DOM info is available, spawn an oracle as well.
@@ -1560,7 +1558,7 @@ ranger_cache::fill_block_cache (tree nam
   // Visit each block back to the DEF.  Initialize each one to UNDEFINED.
   // m_visited at the end will contain all the blocks that we needed to set
   // the range_on_entry cache for.
-  m_workback.quick_push (bb);
+  m_workback.safe_push (bb);
   undefined.set_undefined ();
   m_on_entry.set_bb_range (name, bb, undefined);
   gcc_checking_assert (m_update->empty_p ());
@@ -1634,7 +1632,7 @@ ranger_cache::fill_block_cache (tree nam
  // the list.
  gcc_checking_assert (!m_on_entry.bb_range_p (name, pred));
  m_on_entry.set_bb_range (name, pred, undefined);
- m_workback.quick_push (pred);
+ m_workback.safe_push (pred);
}
 }
 
@@ -1729,7 +1727,7 @@ ranger_cache::range_from_dom (vrange &r,
 
   // This block has an outgoing range.
   if (gori ().has_edge_range_p (name, bb))
-   m_workback.quick_push (prev_bb);
+   m_workback.safe_push (prev_bb);
   else
{
  // Normally join blocks don't carry any new range information on
@@ -1753,7 +1751,7 @@ ranger_cache::range_from_dom (vrange &r,
break;
  }
  if (all_dom)
-   m_workback.quick_push (prev_bb);
+   m_workback.safe_push (prev_bb);
}
}
 
--- gcc/testsuite/gcc.dg/bitint-111.c.jj2024-09-30 19:36:11.997330014 
+0200
+++ gcc/testsuite/gcc.dg/bitint-111.c   2024-09-30 19:37:02.085635292 +0200
@@ -0,0 +1,16 @@
+/* PR middle-end/116899 */
+/* { dg-do compile { target bitint575 } } */
+/* { dg-options "-O2" } */
+
+float f;
+_BitInt(255) b;
+
+void
+foo (signed char c)
+{
+  for (;;)
+{
+  c %= (unsigned _BitInt(512)) 0;  /* { dg-warning "division by zero" } */
+  f /= b >= c;
+}
+}

Jakub



Re: [PATCH v2] libgcc, libstdc++: Make TU-local declarations in headers external linkage [PR115126]

2024-10-01 Thread Jakub Jelinek
On Tue, Oct 01, 2024 at 11:10:03AM +0100, Jonathan Wakely wrote:
> Let's use an inline variable. A function-local static requires
> __cxa_guard_acquire, which (for some targets, including the ones
> affected by this change) uses __gthread_active_p which will
> recursively re-enter the variable's initialization.
> 
> So something like:
> 
> #pragma GCC diagnostic push
> #pragma GCC diagnostic ignored "-Wc++17-extensions"
> inline volatile int __gthread_active = -1;
> #pragma GCC diagnostic pop

Note, only for #ifdef __cplusplus, for C there is no such thing as inline
variables and in that case it should use
static volatile int __ghtread_active = -1;
instead.

Jakub



[committed] range-cache: Fix ICE on SSA_NAME with def_stmt not yet in the IL [PR116898]

2024-10-01 Thread Jakub Jelinek
Hi!

Some passes like the bitint lowering queue some statements on edges and only
commit them at the end of the pass.  If they use ranger at the same time,
the ranger might see such SSA_NAMEs and ICE on those.  The following patch
instead just punts on them.

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by Andrew
in the PR, committed to trunk.

2024-10-01  Jakub Jelinek  

PR middle-end/116898
* gimple-range-cache.cc (ranger_cache::block_range): If a SSA_NAME
with NULL def_bb isn't SSA_NAME_IS_DEFAULT_DEF, return false instead
of failing assertion.  Formatting fix.

* gcc.dg/bitint-110.c: New test.

--- gcc/gimple-range-cache.cc.jj2024-08-12 10:49:12.687608080 +0200
+++ gcc/gimple-range-cache.cc   2024-09-30 18:50:52.314056272 +0200
@@ -1284,13 +1284,16 @@ ranger_cache::block_range (vrange &r, ba
   gimple *def_stmt = SSA_NAME_DEF_STMT (name);
   basic_block def_bb = NULL;
   if (def_stmt)
-   def_bb = gimple_bb (def_stmt);;
+   def_bb = gimple_bb (def_stmt);
   if (!def_bb)
{
  // If we get to the entry block, this better be a default def
  // or range_on_entry was called for a block not dominated by
- // the def.  
- gcc_checking_assert (SSA_NAME_IS_DEFAULT_DEF (name));
+ // the def.  But it could be also SSA_NAME defined by a statement
+ // not yet in the IL (such as queued edge insertion), in that case
+ // just punt.
+ if (!SSA_NAME_IS_DEFAULT_DEF (name))
+   return false;
  def_bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
}
 
--- gcc/testsuite/gcc.dg/bitint-110.c.jj2024-09-30 18:52:50.098424063 
+0200
+++ gcc/testsuite/gcc.dg/bitint-110.c   2024-09-30 18:54:38.183925375 +0200
@@ -0,0 +1,20 @@
+/* PR middle-end/116898 */
+/* { dg-do compile { target bitint575 } } */
+/* { dg-options "-O -finstrument-functions -fnon-call-exceptions" } */
+
+_BitInt(127) a;
+_BitInt(511) b;
+
+void
+foo (_BitInt(31) c)
+{
+  do
+{
+  c %= b;
+again:
+}
+  while (c);
+  a /= 0;  /* { dg-warning "division by zero" } */
+  c -= a;
+  goto again;
+}

Jakub



Patch ping Re: [PATCH] opts: Fix up regenerate-opt-urls dependencies

2024-09-30 Thread Jakub Jelinek
Hi!

On Sat, Sep 21, 2024 at 07:43:25PM +0200, Jakub Jelinek wrote:
> It seems that we currently require
> 1) enabling at least c,c++,fortran,d in --enable-languages
> 2) first doing make html
> before one can successfully regenerate-opt-urls, otherwise without 2)
> one gets
> make regenerate-opt-urls
> make: *** No rule to make target 
> '/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gcc/Option-Index.html', 
> needed by 'regenerate-opt-urls'.  Stop.
> or say if not configuring d after make html one still gets
> make regenerate-opt-urls
> make: *** No rule to make target 
> '/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gdc/Option-Index.html', 
> needed by 'regenerate-opt-urls'.  Stop.
> 
> Now, I believe neither 1) nor 2) is really necessary.
> The regenerate-opt-urls goal has dependency on 3 Option-Index.html files,
> but those files don't have dependencies how to generate them.
> make html has dependency on $(HTMLS_BUILD) which adds
> $(build_htmldir)/gcc/index.html and lang.html among other things, where
> the former actually builds not just index.html but also Option-Index.html
> and tons of other files, and lang.html is filled in by configure depending
> on configured languages, so sometimes will include gfortran.html and
> sometimes d.html.
> 
> The following patch adds dependencies of the Option-Index.html on their
> corresponding index.html files and that is all that seems to be needed,
> make regenerate-opt-urls then works even without prior make html and
> even if just a subset of c/c++, fortran and d is enabled.

I'd like to ping this patch.  Bootstrapped/regtested on x86_64-linux and
i686-linux several times (and tested that make regenerate-opt-urls just
works in that case even without make html or configuring in d,fortran).

> 2024-09-21  Jakub Jelinek  
> 
>   * Makefile.in ($(OPT_URLS_HTML_DEPS)): Add dependencies of the
>   Option-Index.html files on the corresponding index.html files.
>   Don't mention the requirement that all languages that have their own
>   HTML manuals to be enabled.
> 
> --- gcc/Makefile.in.jj2024-09-18 15:03:25.979207519 +0200
> +++ gcc/Makefile.in   2024-09-21 19:26:31.160949856 +0200
> @@ -3640,12 +3640,12 @@ $(build_htmldir)/gccinstall/index.html:
>   $(SHELL) $(srcdir)/doc/install.texi2html
>  
>  # Regenerate the .opt.urls files from the generated html, and from the .opt
> -# files.  Doing so requires all languages that have their own HTML manuals
> -# to be enabled.
> +# files.
>  .PHONY: regenerate-opt-urls
>  OPT_URLS_HTML_DEPS = $(build_htmldir)/gcc/Option-Index.html \
>   $(build_htmldir)/gdc/Option-Index.html \
>   $(build_htmldir)/gfortran/Option-Index.html
> +$(OPT_URLS_HTML_DEPS): %/Option-Index.html: %/index.html
>  
>  regenerate-opt-urls: $(srcdir)/regenerate-opt-urls.py $(OPT_URLS_HTML_DEPS)
>   $(srcdir)/regenerate-opt-urls.py $(build_htmldir) $(shell dirname 
> $(srcdir))

Jakub



Re: [PATCH] doc: Document struct-layout-1.exp for ABI checks

2024-09-29 Thread Jakub Jelinek
On Sun, Sep 08, 2024 at 08:48:57AM +0300, Dimitar Dimitrov wrote:
> This test helped discover PR116621, so it is worth being documented.
> 
> gcc/ChangeLog:
> 
>   * doc/sourcebuild.texi: Document struct-layout-1.exp.
> 
> Signed-off-by: Dimitar Dimitrov 

LGTM.

Jakub



RFC: C++ and C23 zero initialization of padding bits

2024-09-28 Thread Jakub Jelinek
Hi!

C++ has
https://eel.is/c++draft/dcl.init#general-6.2
https://eel.is/c++draft/dcl.init#general-6.3
which says that during zero-initialization padding bits of structures
and unions are zero initialized, and in
https://eel.is/c++draft/dcl.init#general-9.3
says that in certain cases value-initialization is zero-initialization.
E.g. () initialization is value-initialization.

Now, looking at C23, it has something similar, called default
initialization, which also requires zeroing padding bits, unlike C++
in different cases.  Initialization with {} initializer new in C23 is
default-initialization, and if an initializer initializes some members
but not others, the rest are also default-initialized.  Reading C17,
that was apparently the case already there, eventhough it wasn't called
default initialization - "the remainder of the aggregate shall be initialized
implicitly the same as objects that have static storage duration" and
the static/thread_local storage duration initialization talking about the
clearing of padding bits.

Of course, if this is initialization of file/namespace scope or thread_local
variable, we do zero initialize padding bits forever.  A different thing are
automatic variable initializers.  For structures I think whether we clear
the padding bits or not is right now purely an optimization decision during
gimplification, whether we decide to optimize by clearing the whole
structure or not from performance POV.  And for unions we did clear the
padding (the whole union; the patch I've just posted changes that though).

And I think D has something similar.

My main question is when the standards say that the padding bits need to be
cleared, whether a valid program can actually observe that.

I'd hope that structure assignment is element-wise copying and so doesn't
need to preserve those bits.  What about memcpy, or *(unsigned char *),
or for C++ std::bit_cast inspection of the bits (constexpr for C++ or not)?

Or is the wording about clearing padding bits in both standards just useless
and it is still UB to depend on the value of the padding bits?

struct A { unsigned char a; unsigned int b; };
struct B { unsigned char c; struct A d; struct A e; unsigned long long f; };

void
foo (void)
{
#ifdef __cplusplus
  struct B a (); // value-initialization -> zero-initialization
  struct B b = { 0 }; // I think this is zero-initialization of d and e and f.
#else
  struct B a = {}; // New in C23, default-initialization of whole.
  struct B b = { 0 }; // I think this is default-initialization of d and e and 
f,
  // even in C17 and possibly earlier.
#endif
  /* I think both standards say that the padding in between c and d here
 is cleared, can one inspect it like this?  Similarly padding between e and 
f.  */
  if (*((unsigned char *)&a + 1) != 0)
abort ();
  /* My reading of either of the standards is that the padding in between
 b.c and b.d and b.e and b.f is uninitialized.  */
  /* My reading of either of the standards is that the padding in between
 b.d.a and b.d.b and padding between b.e.a and b.e.b is zero initialized
 (and similarly also for a.d.a and a.d.b etc.).  */
  if (*((unsigned char *)&b + offsetof (struct B, e) + 1) != 0)
abort ();
}

If the padding bits need to be cleared and it needs to be observable for
C++ zero-initialization and for C default-initialization, I wonder if we
shouldn't introduce CONSTRUCTOR_CLEAR_PADDING flag, let the FEs set it on
CONSTRUCTORs where the standards guarantee clearing of padding bits and
during gimplification don't take it just as an optimization, but as a
conformance issue (say in the code in expr.cc I was touching inside of
CONSTRUCTOR_CLEAR_PADDING ctors never set *p_complete to -1, but set it to
0 in those cases instead to force the zeroing.  And a question is if we
should have some flag too for whether missing constructor elts result in
those elements to be effectively CONSTRUCTOR_CLEAR_PADDING or not (or
whether it is enough that we set *p_complete = 0 in that case anyway).

And another question is whether we want to keep such flag on the CONSTRUCTOR
preserved in GIMPLE, or whether we declare a mere struct whatever a = {};
in GIMPLE clears the padding too, or whether we use
MEM  [&a, 0] = {};
for that.  And whether the DSE memory trimming shouldn't use some other
representation if the padding bits in it aren't needed to be cleared.
And whether we perform some optimizations which might break the padding
bits, SRA, or the DSE memory trimming (in case there is a store but some
padding bits left somewhere), etc.

Thoughts on this?

Jakub



[PATCH] expr: Don't clear whole unions [PR116416]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 04:01:33PM +0200, Jakub Jelinek wrote:
> So, I think we should go with (but so far completely untested except
> for pr78687.C which is optimized with Marek's patch and the above testcase
> which doesn't have the clearing anymore) the following patch.

That patch had a bug in type_has_padding_at_level_p and so it didn't
bootstrap.

Here is a full patch which does.  It regressed the infoleak-1.c test
which I've adjusted, but I think the test had undefined behavior.
In particular the question is whether
  union un_b { unsigned char j; unsigned int i; } u = {0};
leaves (or can leave) some bits uninitialized or not.

I believe it can, it is an explicit initialization of the j member
which is just 8-bit (but see my upcoming mail on padding bits in C23/C++)
and nothing in the C standard from what I can see seems to imply the padding
bits in the union beyond the actually initialized field in this case would
be initialized.
Though, looking at godbolt, clang and icc 19 and older gcc all do zero
initialize the whole union before storing the single member in there (if
non-zero, otherwise just clear).

So whether we want to do this or do it by default is another question.

Anyway, bootstrapped/regtested on x86_64-linux and i686-linux successfully.

2024-09-28  Jakub Jelinek  

PR c++/116416
* expr.cc (categorize_ctor_elements_1): Fix up union handling of
*p_complete.  Clear it only if num_fields is 0 and the union has
at least one FIELD_DECL, set to -1 if either union has no fields
and non-zero size, or num_fields is 1 and complete_ctor_at_level_p
returned false.
* gimple-fold.cc (type_has_padding_at_level_p): Fix up UNION_TYPE
handling, return also true for UNION_TYPE with no FIELD_DECLs
and non-zero size, handle QUAL_UNION_TYPE like UNION_TYPE.

* gcc.dg/plugin/infoleak-1.c (test_union_2b, test_union_4b): Expect
diagnostics.

--- gcc/expr.cc.jj  2024-09-04 12:09:22.598808244 +0200
+++ gcc/expr.cc 2024-09-27 15:34:20.929282525 +0200
@@ -7218,7 +7218,36 @@ categorize_ctor_elements_1 (const_tree c
 
   if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor),
num_fields, elt_type))
-*p_complete = 0;
+{
+  if (TREE_CODE (TREE_TYPE (ctor)) == UNION_TYPE
+ || TREE_CODE (TREE_TYPE (ctor)) == QUAL_UNION_TYPE)
+   {
+ /* The union case is more complicated.  */
+ if (num_fields == 0)
+   {
+ /* If the CONSTRUCTOR doesn't have any elts, it is
+incomplete if the union has at least one field.  */
+ for (tree f = TYPE_FIELDS (TREE_TYPE (ctor));
+  f; f = DECL_CHAIN (f))
+   if (TREE_CODE (f) == FIELD_DECL)
+ {
+   *p_complete = 0;
+   break;
+ }
+ /* Otherwise it has padding if the union has non-zero size.  */
+ if (*p_complete > 0
+ && !integer_zerop (TYPE_SIZE (TREE_TYPE (ctor
+   *p_complete = -1;
+   }
+ /* Otherwise the CONSTRUCTOR should have exactly one element.
+It is then never incomplete, but if complete_ctor_at_level_p
+returned false, it has padding.  */
+ else if (*p_complete > 0)
+   *p_complete = -1;
+   }
+  else
+   *p_complete = 0;
+}
   else if (*p_complete > 0
   && type_has_padding_at_level_p (TREE_TYPE (ctor)))
 *p_complete = -1;
--- gcc/gimple-fold.cc.jj   2024-09-09 11:25:43.197048840 +0200
+++ gcc/gimple-fold.cc  2024-09-27 18:51:30.036002116 +0200
@@ -4814,12 +4814,22 @@ type_has_padding_at_level_p (tree type)
return false;
   }
 case UNION_TYPE:
+case QUAL_UNION_TYPE:
+  bool any_fields;
+  any_fields = false;
   /* If any of the fields is smaller than the whole, there is padding.  */
   for (tree f = TYPE_FIELDS (type); f; f = DECL_CHAIN (f))
-   if (TREE_CODE (f) == FIELD_DECL)
- if (simple_cst_equal (TYPE_SIZE (TREE_TYPE (f)),
-   TREE_TYPE (type)) != 1)
-   return true;
+   if (TREE_CODE (f) != FIELD_DECL)
+ continue;
+   else if (simple_cst_equal (TYPE_SIZE (TREE_TYPE (f)),
+  TYPE_SIZE (type)) != 1)
+ return true;
+   else
+ any_fields = true;
+  /* If the union doesn't have any fields and still has non-zero size,
+all of it is padding.  */
+  if (!any_fields && !integer_zerop (TYPE_SIZE (type)))
+   return true;
   return false;
 case ARRAY_TYPE:
 case COMPLEX_TYPE:
--- gcc/testsuite/gcc.dg/plugin/infoleak-1.c.jj 2022-09-11 22:28:56.356164436 
+0200
+++ gcc/testsuite/gcc.dg/plugin/infoleak-1.c2024-09-28 08:20:25.806973480 
+0200
@@ -123,9 +123,1

Re: [PATCH v2 4/4] tree-object-size: Fall back to wholesize for non-const offset

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:29PM -0400, Siddhesh Poyarekar wrote:
> Don't bail out early if the offset to a pointer in __builtin_object_size
> is a variable, return the wholesize instead since that is a better
> fallback for maximum estimate.  This should keep checks in place for
> fortified functions to constrain overflows to at lesat some extent.
> 
> gcc/ChangeLog:
> 
>   PR middle-end/77608
>   * tree-object-size.cc (plus_stmt_object_size): Drop check for
>   constant offset.
>   * testsuite/gcc.dg/builtin-object-size-1.c (test14): New test.
> 
> Signed-off-by: Siddhesh Poyarekar 
> ---
>  gcc/testsuite/gcc.dg/builtin-object-size-1.c | 19 +++
>  gcc/tree-object-size.cc  |  7 ---
>  2 files changed, 23 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-1.c 
> b/gcc/testsuite/gcc.dg/builtin-object-size-1.c
> index 6ffe12be683..5a24087ae1e 100644
> --- a/gcc/testsuite/gcc.dg/builtin-object-size-1.c
> +++ b/gcc/testsuite/gcc.dg/builtin-object-size-1.c
> @@ -791,6 +791,25 @@ test13 (unsigned cond)
>  #endif
>  }
>  
> +void
> +__attribute__ ((noinline))
> +test14 (unsigned off)
> +{
> +  char *buf2 = malloc (10);
> +  char *p;
> +  size_t t;
> +
> +  p = &buf2[off];
> +
> +#ifdef __builtin_object_size
> +  if (__builtin_object_size (p, 0) != 10 - off)
> +FAIL ();
> +#else
> +  if (__builtin_object_size (p, 0) != 10)
> +FAIL ();
> +#endif
> +}
> +
>  int
>  main (void)
>  {
> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index 1b569c3d12b..ebd2a4650aa 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1595,8 +1595,7 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>  op1 = try_collapsing_offset (op1, NULL_TREE, NOP_EXPR, object_size_type);
>  
>/* Handle PTR + OFFSET here.  */
> -  if (size_valid_p (op1, object_size_type)
> -  && (TREE_CODE (op0) == SSA_NAME || TREE_CODE (op0) == ADDR_EXPR))
> +  if ((TREE_CODE (op0) == SSA_NAME || TREE_CODE (op0) == ADDR_EXPR))
>  {
>if (TREE_CODE (op0) == SSA_NAME)
>   {
> @@ -1621,7 +1620,9 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>if (size_unknown_p (bytes, 0))
>   ;
>else if ((object_size_type & OST_DYNAMIC)
> -|| bytes != wholesize || compare_tree_int (op1, offset_limit) <= 
> 0)
> +|| bytes != wholesize
> +|| (size_valid_p (op1, object_size_type)
> +&& compare_tree_int (op1, offset_limit) <= 0))
>   bytes = size_for_offset (bytes, op1, wholesize);
>/* In the static case, with a negative offset, the best estimate for
>minimum size is size_unknown but for maximum size, the wholesize is a

LGTM.

Jakub



Re: [PATCH v2 1/4] tree-object-size: use size_for_offset in more cases

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:26PM -0400, Siddhesh Poyarekar wrote:
> When wholesize != size, there is a reasonable opportunity for static
> object sizes also to be computed using size_for_offset, so use that.
> 
> gcc/ChangeLog:
> 
>   * tree-object-size.cc (plus_stmt_object_size): Call
>   SIZE_FOR_OFFSET for some negative offset cases.
>   * testsuite/gcc.dg/builtin-object-size-3.c (test9): Adjust test.
>   * testsuite/gcc.dg/builtin-object-size-4.c (test8): Likewise.
> ---
>  gcc/testsuite/gcc.dg/builtin-object-size-3.c | 6 +++---
>  gcc/testsuite/gcc.dg/builtin-object-size-4.c | 6 +++---
>  gcc/tree-object-size.cc  | 2 +-
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-3.c 
> b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
> index 3f58da3d500..ec2c62c9640 100644
> --- a/gcc/testsuite/gcc.dg/builtin-object-size-3.c
> +++ b/gcc/testsuite/gcc.dg/builtin-object-size-3.c
> @@ -574,7 +574,7 @@ test9 (unsigned cond)
>if (__builtin_object_size (&p[-4], 2) != (cond ? 6 : 10))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (&p[-4], 2) != 0)
> +  if (__builtin_object_size (&p[-4], 2) != 6)
>  FAIL ();
>  #endif
>  
> @@ -585,7 +585,7 @@ test9 (unsigned cond)
>if (__builtin_object_size (p, 2) != ((cond ? 2 : 6) + cond))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 2) != 0)
> +  if (__builtin_object_size (p, 2) != 2)
>  FAIL ();
>  #endif
>  
> @@ -598,7 +598,7 @@ test9 (unsigned cond)
>!= sizeof (y) - __builtin_offsetof (struct A, c) - 8 + cond)
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 2) != 0)
> +  if (__builtin_object_size (p, 2) != sizeof (y) - __builtin_offsetof 
> (struct A, c) - 8)
>  FAIL ();
>  #endif
>  }
> diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-4.c 
> b/gcc/testsuite/gcc.dg/builtin-object-size-4.c
> index b3eb36efb74..7bcd24c4150 100644
> --- a/gcc/testsuite/gcc.dg/builtin-object-size-4.c
> +++ b/gcc/testsuite/gcc.dg/builtin-object-size-4.c
> @@ -482,7 +482,7 @@ test8 (unsigned cond)
>if (__builtin_object_size (&p[-4], 3) != (cond ? 6 : 10))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (&p[-4], 3) != 0)
> +  if (__builtin_object_size (&p[-4], 3) != 6)
>  FAIL ();
>  #endif
>  
> @@ -493,7 +493,7 @@ test8 (unsigned cond)
>if (__builtin_object_size (p, 3) != ((cond ? 2 : 6) + cond))
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 3) != 0)
> +  if (__builtin_object_size (p, 3) != 2)
>  FAIL ();
>  #endif
>  
> @@ -505,7 +505,7 @@ test8 (unsigned cond)
>if (__builtin_object_size (p, 3) != sizeof (y.c) - 8 + cond)
>  FAIL ();
>  #else
> -  if (__builtin_object_size (p, 3) != 0)
> +  if (__builtin_object_size (p, 3) != sizeof (y.c) - 8)
>  FAIL ();
>  #endif
>  }

The testcase changes look reasonable to me.

> diff --git a/gcc/tree-object-size.cc b/gcc/tree-object-size.cc
> index 6544730e153..f8fae0cbc82 100644
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1527,7 +1527,7 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>if (size_unknown_p (bytes, 0))
>   ;
>else if ((object_size_type & OST_DYNAMIC)
> -|| compare_tree_int (op1, offset_limit) <= 0)
> +|| bytes != wholesize || compare_tree_int (op1, offset_limit) <= 
> 0)
>   bytes = size_for_offset (bytes, op1, wholesize);
>/* In the static case, with a negative offset, the best estimate for
>minimum size is size_unknown but for maximum size, the wholesize is a

The coding conventions say that in cases like this where the whole condition
doesn't fit on a single line, each ||/&& operand should be on a separate
line.
So, the patch should be adding || bytes != wholesize on a separate line.

That said, there is a pre-existing problem, the tree direct comparisons
(bytes != wholesize here, && wholesize != sz in size_for_offset (note,
again, it should be on a separate line), maybe others).

We do INTEGER_CST caching, either using small array for small values or
hash table for larger ones, so INTEGER_CSTs with the same value of the
same type should be pointer equal unless they are TREE_OVERFLOW or similar,
but for anything else, unless you guarantee that in the "same" case
you assign the same tree to size/wholesize rather than say
perform size_binop twice, I'd expect instead comparisons with
operand_equal_p or something similar.

Though, because this patch is solely for the __builtin_object_size case
and the sizes in that case should be solely INTEGER_CSTs, I guess this patch
is ok with the formatting nit fix (and ideally the one in size_for_offset
too).

Jakub



Re: [PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 09:54:13AM -0400, Lewis Hyatt wrote:
> A couple comments that may be helpful...
> 
> -This is also PR 64117 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64117)
> 
> -I submitted a patch last year for that but did not get any response
> (https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635648.html).

Oops, sorry, wasn't aware of that.
Note, I've already committed it.

> I guess I never pinged it because I am still trying to ping two other
> ones :). My patch did not switch to vec so it was not as nice as this
> one. I wonder though, if some of the testcases I added could be
> incorporated? In particular the testcase from my patch

Maybe.

> pragma-diagnostic-3.C I believe will still be failing after this one.
> There is an issue with C++ because it processes the pragmas twice,
> once in early mode and once in normal mode, that makes it do the wrong
> thing for this case:
> 
> t.h:
> 
>  #pragma GCC diagnostic push
>  #pragma GCC diagnostic ignored...
>  //no pop at end of the file
> 
> t.c
> 
>  #include "t.h"
>  #pragma GCC diagnostic pop
>  //expect to be at the initial state here, but won't be if t.h is a PCH
> 
> In my patch I had separated the PCH restore from a more general "state
> restore" logic so that the C++ frontend can restore the state after
> the first pass through the data.

People shouldn't be doing this, without PCH or with it, and especially not
with PCH, that is simply too ugly.
That said, this was the reason why I have saved also the m_push_list
vector and not just the history.  If that isn't enough and there is some
other state on the libcpp side, I'd think we should go with a sorry and tell
the user not to do this with PCH.  It becomes a nightmare already if e.g.
the command line -Werror=something -Wno-error=somethingelse overrides
differ between the PCH creation and PCH reading (my first version of
the patch saved m_classify_diagnostic array but that broke tests relying
on those differences).

Jakub



Re: [PATCH v6] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 02:01:19PM +, Qing Zhao wrote:
> +  /* Currently, only when the array_ref is an indirect_ref to a call to the
> + .ACCESS_WITH_SIZE, return true.
> + More cases can be included later when the counted_by attribute is
> + extended to other situations.  */
> +  if ((TREE_CODE (array_ref) == INDIRECT_REF)

The ()s around the == are useless.

> +  && is_access_with_size_p (TREE_OPERAND (array_ref, 0)))
> +return true;
> +  return false;
> +}
> +
> +/* Get the reference to the counted-by object associated with the ARRAY_REF. 
>  */
> +static tree
> +get_counted_by_ref (tree array_ref)
> +{
> +  /* Currently, only when the array_ref is an indirect_ref to a call to the
> + .ACCESS_WITH_SIZE, get the corresponding counted_by ref.
> + More cases can be included later when the counted_by attribute is
> + extended to other situations.  */
> +  if ((TREE_CODE (array_ref) == INDIRECT_REF)

Again.

> + if (TREE_CODE (TREE_TYPE (ref)) != ARRAY_TYPE)
> +   {
> + error_at (loc, "the argument must be an array"
> +"%<__builtin_counted_by_ref%>");
> + expr.set_error ();
> + break;
> +   }
> +
> + /* if the array ref is inside TYPEOF or ALIGNOF, the call to

Comments should start with capital letter, i.e. If

> +.ACCESS_WITH_SIZE was not genereated by the routine

s/genereated/generated/

> +build_component_ref by default, we should generate it here.  */
> + if ((in_typeof || in_alignof)
&& TREE_CODE (ref) == COMPONENT_REF)

The above && ... fits on the same line as the rest of the condition.

> +   ref = handle_counted_by_for_component_ref (loc, ref);
> +
> + if (has_counted_by_object (ref))
> +   expr.value
> + = get_counted_by_ref (ref);

This too.

> + else
> +   expr.value
> + = build_int_cst (build_pointer_type (void_type_node), 0);

else
  expr.value = null_pointer_node;
instead.

> +/*
> + * For the COMPONENT_REF ref, check whether it has a counted_by attribute,
> + * if so, wrap this COMPONENT_REF with the corresponding CALL to the
> + * function .ACCESS_WITH_SIZE.
> + * Otherwise, return the ref itself.
> + */

We don't use this style of comments.  No *s at the start of each line, /*
should be immediately followed after space with the first line and */
should be right after . and two spaces.

Jakub



Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 12:14:47PM +0200, Richard Biener wrote:
> I can investigate a bit when there's a testcase showing the issue.

The testcase is pr78687.C with Marek's cp-gimplify.cc patch.

Or the
struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; } 
u; unsigned long w; };
void bar (struct S *);

void
foo ()
{
  struct S s = { .u = { .v = {} }, .w = 2 };
  bar (&s);
}
I've mentioned shows the same behavior except for SRA (which
is there of course prevented through the object escaping).
Though, not really sure right now if this reduced testcase
in C or C++ actually isn't supposed to clear the whole object rather than
just the initialized fields and what exactly is CONSTRUCTOR_NO_CLEARING
vs. !CONSTRUCTOR_NO_CLEARING supposed to mean.

On pr78687.C with Marek's patch, CONSTRUCTOR_NO_CLEARING is cleared with
  /* The result of a constexpr function must be completely initialized.

 However, in C++20, a constexpr constructor doesn't necessarily have
 to initialize all the fields, so we don't clear CONSTRUCTOR_NO_CLEARING
 in order to detect reading an unitialized object in constexpr instead
 of value-initializing it.  (reduced_constant_expression_p is expected to
 take care of clearing the flag.)  */
  if (TREE_CODE (result) == CONSTRUCTOR
  && (cxx_dialect < cxx20
  || !DECL_CONSTRUCTOR_P (fun)))
clear_no_implicit_zero (result);

generic.texi says:
"Unrepresented fields will be cleared (zeroed), unless the
CONSTRUCTOR_NO_CLEARING flag is set, in which case their value becomes
undefined."
Now, for RECORD_TYPE, I think the !CONSTRUCTOR_NO_CLEARING meaning is clear,
if the flag isn't set, then if there is no constructor_elt for certain
FIELD_DECL, that FIELD_DECL is implicitly zeroed.  The state of padding bits
is fuzzy, we don't really have a flag whether the CONSTRUCTOR clears also
padding bits (aka C++ zero initialization) or not.
The problem above is with UNION_TYPE.  If the CONSTRUCTOR for it is empty,
that should IMHO still imply zero initialization of the whole thing, we
don't really know which union member is active.  But if the CONSTRUCTOR
has one elt (it should never have more than one), shouldn't that mean
(unless there is a new flag which says that padding bits are cleared too)
that CONSTRUCTOR_NO_CLEARING and !CONSTRUCTOR_NO_CLEARING behave the same,
in particular that the selected FIELD_DECL is initialized to whatever
the initializer is but nothing else is?

The reason why the gimplifier clears the whole struct is because
on (with Marek's patch on the pr78687.C testcase)
D.10137 = 
{._storage={.D.9542={.D.9123={._tail={.D.9181={._tail={.D.9240={._head={}}},
 ._which=2}};
or that
s = {.u={.v={}}, .w=2};
in the above testcase, categorize_ctor_elements yields
valid_const_initializer = true
num_nonzero_elements = 1
num_unique_nonzero_elements = 1
num_ctor_elements = 1
complete_p = false
- there is a single non-empty initializer in both CONSTRUCTORs,
they aren't CONSTRUCTOR_NO_CLEARING and the reason complete_p is false
is that categorize_ctor_elements_1 does:
  if (*p_complete && !complete_ctor_at_level_p (TREE_TYPE (ctor),
num_fields, elt_type))
*p_complete = 0;
  else if (*p_complete > 0
   && type_has_padding_at_level_p (TREE_TYPE (ctor)))
*p_complete = -1;
and for UNION_TYPE/QUAL_UNION_TYPE complete_ctor_at_level_p does:
  if (num_elts == 0)
return false;
...
  /* ??? We could look at each element of the union, and find the
 largest element.  Which would avoid comparing the size of the
 initialized element against any tail padding in the union.
 Doesn't seem worth the effort...  */
  return simple_cst_equal (TYPE_SIZE (type), TYPE_SIZE (last_type)) == 1;
Now, given the documentation of categorize_ctor_elements:
   * whether the constructor is complete -- in the sense that every
 meaningful byte is explicitly given a value --
 and place it in *P_COMPLETE:
 -  0 if any field is missing
 -  1 if all fields are initialized, and there's no padding
 - -1 if all fields are initialized, but there's padding
I'd argue this handling of UNION_TYPE/QUAL_UNION_TYPE is wrong
(though note that type_has_padding_at_level_p returns true if any of the
union members is smaller than the whole, rather than checking whether
the actually initialized one has the same size as whole), it will
set *p_complete to 0 as if any field is missing, even when no field
is missing, just the union has padding.

So, I think we should go with (but so far completely untested except
for pr78687.C which is optimized with Marek's patch and the above testcase
which doesn't have the clearing anymore) the following patch.

2024-09-27  Jakub Jelinek  

PR c++/116416
* expr.cc (categorize_ctor_

Re: [PATCH v2 3/4] tree-object-size: Handle PHI + CST type offsets

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:28PM -0400, Siddhesh Poyarekar wrote:
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1473,7 +1473,7 @@ merge_object_sizes (struct object_size_info *osi, tree 
> dest, tree orig)
> with all of its targets are constants.  */
>  
>  static tree
> -try_collapsing_offset (tree op, int object_size_type)
> +try_collapsing_offset (tree op, tree cst, tree_code code, int 
> object_size_type)
>  {
>gcc_assert (!(object_size_type & OST_DYNAMIC));

I believe you were just returning op here if it isn't SSA_NAME.
Now, if it is INTEGER_CST, it will misbehave whenever cst != NULL,
as it will just return op rather than op + cst or op - cst.

> @@ -1485,13 +1485,41 @@ try_collapsing_offset (tree op, int object_size_type)
>switch (gimple_code (stmt))
>  {
>  case GIMPLE_ASSIGN:
> -  /* Peek through casts.  */
> -  if (gimple_assign_rhs_code (stmt) == NOP_EXPR)
> +  switch (gimple_assign_rhs_code (stmt))
>   {
> -   tree ret = try_collapsing_offset (gimple_assign_rhs1 (stmt),
> - object_size_type);
> -   if (TREE_CODE (ret) == INTEGER_CST)
> - return ret;
> +   /* Peek through casts.  */
> + case NOP_EXPR:

That would be CASE_CONVERT:

> + {

Wrong indentation, { should be 2 columns to the left of case, and tree ret
4 columns.

> +   tree ret = try_collapsing_offset (gimple_assign_rhs1 (stmt),
> + NULL_TREE, NOP_EXPR,
> + object_size_type);

This loses cst and code from the caller if it isn't NULL_TREE, NOP_EXPR,
again causing miscompilations.
I'd think it should just pass through cst, code from the caller here
(but then one needs to be extra careful, because cst can have different
type from op because of the recusion on casts).

> +   if (TREE_CODE (ret) == INTEGER_CST)
> + return ret;
> +   break;
> + }
> +   /* Propagate constant offsets to PHI expressions.  */
> + case PLUS_EXPR:
> + case MINUS_EXPR:
> + {
> +   tree rhs1 = gimple_assign_rhs1 (stmt);
> +   tree rhs2 = gimple_assign_rhs2 (stmt);
> +   tree ret = NULL_TREE;
> +

And I think this should just punt on code != NOP_EXPR (or cst != NULL_TREE,
one is enough).  Because if you have
  # _2 = PHI...
  _3 = _2 + 3;
  _4 = 7 - _3;
or something similar, it will handle just the inner operation and not the
outer one.

> +   if (TREE_CODE (rhs1) == INTEGER_CST)
> + ret = try_collapsing_offset (rhs2, rhs1,
> +  gimple_assign_rhs_code (stmt),
> +  object_size_type);
> +   else if (TREE_CODE (rhs2) == INTEGER_CST)
> + ret = try_collapsing_offset (rhs1, rhs2,
> +  gimple_assign_rhs_code (stmt),
> +  object_size_type);
> +
> +   if (ret && TREE_CODE (ret) == INTEGER_CST)
> + return ret;
> +   break;
> + }
> + default:
> +   break;
>   }
>break;
>  case GIMPLE_PHI:
> @@ -1507,14 +1535,20 @@ try_collapsing_offset (tree op, int object_size_type)
> if (TREE_CODE (rhs) != INTEGER_CST)
>   return op;
>  
> +   if (cst)
> + rhs = fold_build2 (code, sizetype,
> +fold_convert (ptrdiff_type_node, rhs),
> +fold_convert (ptrdiff_type_node, cst));

This can be done on wide_int too.  But one needs to think through the cast
cases, what happened if the cast was widening (from signed or unsigned),
what happened if the cast was narrowing.

> +   else
> + rhs = fold_convert (ptrdiff_type_node, rhs);
> +
> /* Note that this is the *opposite* of what we usually do with
>sizes, because the maximum offset estimate here will give us a
>minimum size estimate and vice versa.  */
> -   enum tree_code code = (object_size_type & OST_MINIMUM
> -  ? MAX_EXPR : MIN_EXPR);
> +   enum tree_code selection_code = (object_size_type & OST_MINIMUM
> +? MAX_EXPR : MIN_EXPR);
>  
> -   off = fold_build2 (code, ptrdiff_type_node, off,
> -  fold_convert (ptrdiff_type_node, rhs));
> +   off = fold_build2 (selection_code, ptrdiff_type_node, off, rhs);
>   }
> return fold_convert (sizetype, off);
>   }
> @@ -1558,7 +1592,7 @@ plus_stmt_object_size (struct object_size_info *osi, 
> tree var, gimple *stmt)
>  return false;
>  
>if (!(object_size_type & OST_DYNAMIC) && TREE_CODE (op1) != INTEGER_CST)
> -op1 = try_collapsing_offset (op1, object_size_type);
> +op1 = try_collapsing_offset (op1, NULL_TREE, NO

Re: [PATCH v2 2/4] tree-object-size: Fold PHI node offsets with constants [PR116556]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 12:40:27PM -0400, Siddhesh Poyarekar wrote:
> --- a/gcc/tree-object-size.cc
> +++ b/gcc/tree-object-size.cc
> @@ -1468,6 +1468,63 @@ merge_object_sizes (struct object_size_info *osi, tree 
> dest, tree orig)
>return bitmap_bit_p (osi->reexamine, SSA_NAME_VERSION (orig));
>  }
>  
> +/* For constant sizes, try collapsing a non-constant offset to a constant if
> +   possible.  The case handled at the moment is when the offset is a PHI node
> +   with all of its targets are constants.  */
> +
> +static tree
> +try_collapsing_offset (tree op, int object_size_type)
> +{
> +  gcc_assert (!(object_size_type & OST_DYNAMIC));
> +
> +  if (TREE_CODE (op) != SSA_NAME)
> +return op;
> +
> +  gimple *stmt = SSA_NAME_DEF_STMT (op);
> +
> +  switch (gimple_code (stmt))
> +{
> +case GIMPLE_ASSIGN:
> +  /* Peek through casts.  */
> +  if (gimple_assign_rhs_code (stmt) == NOP_EXPR)

Do you really want to handle all casts?  That could be widening, narrowing,
from non-integral types, ...
If only same mode, then gimple_assign_unary_nop_p probably should be used
instead, if any from integral types (same precision, widening, narrowing),
then perhaps CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
but verify that INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt)))
before actually recursing.
Note, I think narrowing or widening casts to sizetype are fine, but when
you recurse through, other casts might not be anymore.
Consider
  short int _1;
  unsigned int _2;
  size_t _3;
...
  # _1 = PHI <-10(7), 12(8), 4(9), -42(10)>
  _2 = (unsigned int) _1;
  _3 = (size_t) _2;
If the recursion finds minimum or maximum from the signed short int _1
values (cast to ptrdiff_type_node), pretending that is the minimum or
maximum for _3 is just wrong, as the cast from signed to unsigned will
turn negative values to something larger than the smaller positive values.

Similarly, consider casts from unsigned __int128 -> unsigned short -> size_t
(or signed short in between), what is minimum/maximum in 128-bits (the code
for PHIs actually ignores the upper bits and looks only for signed 64-bits,
but if there is unfolded cast from INTEGER_CST you actually could have even
large value) isn't necessarily minimum after cast to 16-bit (unsigned or
signed).

So, unless you want to get all the possibilities into account, perhaps only
recurse through casts from integer types to integer types with precision
of sizetype?

And perhaps also look for INTEGER_CST type returned from the recursive
call and if it doesn't have sizetype precision, either convert it to
sizetype or ignore.

> + {
> +   tree ret = try_collapsing_offset (gimple_assign_rhs1 (stmt),
> + object_size_type);
> +   if (TREE_CODE (ret) == INTEGER_CST)
> + return ret;
> + }
> +  break;
> +case GIMPLE_PHI:
> + {
> +   tree off = ((object_size_type & OST_MINIMUM)
> +   ? TYPE_MIN_VALUE (ptrdiff_type_node)
> +   : TYPE_MAX_VALUE (ptrdiff_type_node));

Because you only process constants, I wonder whether using
wide_int off and performing the min/max on wide_int wouldn't be
better, no need to create INTEGER_CSTs for all the intermediate
results.
That would be
  unsigned int prec = TYPE_PRECISION (ptrdiff_type_node);
  wide_int off = ((object_size_type & OST_MINIMUM)
  ? wi::to_wide (TYPE_MIN_VALUE (ptrdiff_type_node))
  : wi::to_wide (TYPE_MAX_VALUE (ptrdiff_type_node)));

> +
> +   for (unsigned i = 0; i < gimple_phi_num_args (stmt); i++)
> + {
> +   tree rhs = gimple_phi_arg_def (stmt, i);
> +

I wonder if it wouldn't be useful to recurse here,
  rhs = try_collapsing_offset (rhs, object_size_type);
but guess with some extra counter argument and only allow some small
constant levels of nesting (but also do that for the cast cases).

> +   if (TREE_CODE (rhs) != INTEGER_CST)
> + return op;
> +
> +   /* Note that this is the *opposite* of what we usually do with
> +  sizes, because the maximum offset estimate here will give us a
> +  minimum size estimate and vice versa.  */
> +   enum tree_code code = (object_size_type & OST_MINIMUM
> +  ? MAX_EXPR : MIN_EXPR);
> +
> +   off = fold_build2 (code, ptrdiff_type_node, off,
> +  fold_convert (ptrdiff_type_node, rhs));

And this could be

  wide_int w = wi::to_wide (rhs, prec);
  if (object_size_type & OST_MINIMUM)
off = wi::smax (off, w);
  else
off = wi::smin (off, w);

> + }
> +   return fold_convert (sizetype, off);

  return wide_int_to_tree (sizetype, off);

> + }
> +default:
> +  break;
> +}
> +
> +  /* Nothing worked, so return OP untouched.  */
> +  return op;
> +}
>  
>  /* Compute 

Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2024 at 08:16:43AM +0200, Richard Biener wrote:
> > __attribute__((noinline))
> > struct ref_proxy f ()
> > {
> >struct ref_proxy ptr;
> >struct ref_proxy D.10036;
> >struct ref_proxy type;
> >struct ref_proxy type;
> >struct qual_option D.10031;
> >struct ref_proxy D.10030;
> >struct qual_option inner;
> >struct variant t;
> >struct variant D.10026;
> >struct variant D.10024;
> >struct inplace_ref D.10023;
> >struct inplace_ref ptr;
> >struct ref_proxy D.9898;
> > 
> > [local count: 1073741824]:
> >MEM  [(struct variant *)&D.10024] = {};
> 
> Without actually checking it might be that SRA chokes on the above.
> The IL is basically a huge chain of aggregate copies interspersed
> with clobbers and occasional scalar inits and I fear that we really
> only have SRA dealing with this.

True.

> Is there any reason to use the char[40] init instead of a aggregate
> {} init of type variant?

It is dse1 which introduces that:
-  D.10137 = {};
+  MEM  [(struct variant *)&D.10137] = {};
in particular maybe_trim_constructor_store.

So, if SRA chokes on it, it better be fixed to deal with that,
DSE can create that any time.
Though, not sure how to differentiate that from the actual C++ zero
initialization where it is supposed to clear also padding bits if any.
I think a CONSTRUCTOR flag for that would be best, though e.g. in GIMPLE
representing such clears including padding bits with
MEM  [(struct whatever *)&D.whatever] = {};
might be an option too.  But then how to represent the DSE constructor
trimming such that it is clear that it still doesn't initialize the padding
bits?
Anyway, even if padding bits are zero initialized, if SRA could see that
nothing really inspects those padding bits, it would be nice to still optimize
it.

That said, it is
a union of a struct with 5 pointers (i.e. 40 bytes) and an empty struct
(1 byte, with padding) followed by size_t which_ field (the = 2 store).

And, I believe when not constexpr evaluating this, there actually is no
clearing before the = 2; store,
void 
eggs::variants::detail::_storage, true, true>::_storage<2, option_2> (struct _storage * 
const thi
s, struct index which, struct option_2 & args#0)
{
  struct index D.10676;

  *this = {CLOBBER(bob)};
  {
_1 = &this->D.9542;

eggs::variants::detail::_union, true>::_union<2, option_2> (_1, D.10676, args#0);
this->_which = 2;
  }
}
and the call in there is just 3 nested inline calls which do some clobbers
too, take address of something and call another inline and in the end
it is just a clobber and nothing else.

So, another thing to look at is whether the CONSTRUCTOR is
CONSTRUCTOR_NO_CLEARING or not and if that is ok, and/or whether
the gimplification is correct in that case (say if it would be
struct S { union U { struct T { void *a, *b, *c, *d, *e; } t; struct V {} v; } 
u; unsigned long w; };
void bar (S *);

void
foo ()
{
  S s = { .u = { .v = {} }, .w = 2 };
  bar (&s);
}
why do we expand it as
  s = {};
  s.w = 2;
when just
  s.w = 2;
or maybe
  s.u.v = {};
  s.w = 2;
would be enough.  Because when the large union has just a small member
(in this case empty struct) active, clearing the whole union is really a
waste of time.

> I would suggest to open a bugreport.

Yes.

Jakub



[PATCH] diagnostic: Save/restore diagnostic context history and push/pop state for PCH [PR116847]

2024-09-26 Thread Jakub Jelinek
Hi!

The following patch on top of the just posted cleanup patch
saves/restores the m_classification_history and m_push_list
vectors for PCH.  Without that as the testcase shows during parsing
of the templates we don't report ignored diagnostics, but after loading
PCH header when instantiating those templates those warnings can be
emitted.  This doesn't show up on x86_64-linux build because configure
injects there -fcf-protection -mshstk flags during library build (and so
also during PCH header creation), but make check doesn't use those flags
and so the PCH header is ignored.

Bootstrapped on i686-linux so far, bootstrap/regtest on x86_64-linux and
i686-linux still pending, ok for trunk if it passes it?

2024-09-26  Jakub Jelinek  

PR libstdc++/116847
gcc/
* diagnostic.h (diagnostic_option_classifier): Add pch_save and
pch_restore method declarations.
(diagnostic_context): Add pch_save and pch_restore inline method
definitions.
* diagnostic.cc (diagnostic_option_classifier::pch_save): New method.
(diagnostic_option_classifier::pch_restore): Likewise.
gcc/c-family/
* c-pch.cc: Include diagnostic.h.
(c_common_write_pch): Call global_dc->pch_save.
(c_common_read_pch): Call global_dc->pch_restore.
gcc/testsuite/
* g++.dg/pch/pr116847.C: New test.
* g++.dg/pch/pr116847.Hs: New test.

--- gcc/diagnostic.h.jj 2024-09-26 15:42:05.696731787 +0200
+++ gcc/diagnostic.h2024-09-26 17:32:16.419161700 +0200
@@ -256,6 +256,9 @@ public:
   diagnostic_t
   update_effective_level_from_pragmas (diagnostic_info *diagnostic) const;
 
+  int pch_save (FILE *);
+  int pch_restore (FILE *);
+
 private:
   /* Each time a diagnostic's classification is changed with a pragma,
  we record the change and the location of the change in an array of
@@ -551,6 +554,18 @@ public:
  const char *, const char *, va_list *,
  diagnostic_t) ATTRIBUTE_GCC_DIAG(7,0);
 
+  int
+  pch_save (FILE *f)
+  {
+return m_option_classifier.pch_save (f);
+  }
+
+  int
+  pch_restore (FILE *f)
+  {
+return m_option_classifier.pch_restore (f);
+  }
+
 private:
   void error_recursion () ATTRIBUTE_NORETURN;
 
--- gcc/diagnostic.cc.jj2024-09-26 16:11:51.664502977 +0200
+++ gcc/diagnostic.cc   2024-09-26 22:59:05.204135316 +0200
@@ -156,6 +156,46 @@ diagnostic_option_classifier::fini ()
   m_push_list.release ();
 }
 
+/* Save the diagnostic_option_classifier state to F for PCH
+   output.  Returns 0 on success, -1 on error.  */
+
+int
+diagnostic_option_classifier::pch_save (FILE *f)
+{
+  unsigned int lengths[2] = { m_classification_history.length (),
+ m_push_list.length () };
+  if (fwrite (lengths, sizeof (lengths), 1, f) != 1
+  || fwrite (m_classification_history.address (),
+sizeof (diagnostic_classification_change_t),
+lengths[0], f) != lengths[0]
+  || fwrite (m_push_list.address (), sizeof (int),
+lengths[1], f) != lengths[1])
+return -1;
+  return 0;
+}
+
+/* Read the diagnostic_option_classifier state from F for PCH
+   read.  Returns 0 on success, -1 on error.  */
+
+int
+diagnostic_option_classifier::pch_restore (FILE *f)
+{
+  unsigned int lengths[2];
+  if (fread (lengths, sizeof (lengths), 1, f) != 1)
+return -1;
+  gcc_checking_assert (m_classification_history.is_empty ());
+  gcc_checking_assert (m_push_list.is_empty ());
+  m_classification_history.safe_grow (lengths[0]);
+  m_push_list.safe_grow (lengths[1]);
+  if (fread (m_classification_history.address (),
+sizeof (diagnostic_classification_change_t),
+lengths[0], f) != lengths[0]
+  || fread (m_push_list.address (), sizeof (int),
+   lengths[1], f) != lengths[1])
+return -1;
+  return 0;
+}
+
 /* Save all diagnostic classifications in a stack.  */
 
 void
--- gcc/c-family/c-pch.cc.jj2024-02-01 16:00:37.149739756 +0100
+++ gcc/c-family/c-pch.cc   2024-09-26 18:12:31.097700113 +0200
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.
 #include "c-pragma.h"
 #include "langhooks.h"
 #include "hosthooks.h"
+#include "diagnostic.h"
 
 /* This is a list of flag variables that must match exactly, and their
names for the error message.  The possible values for *flag_var must
@@ -178,7 +179,8 @@ c_common_write_pch (void)
   cpp_write_pch_state (parse_in, pch_outfile);
   timevar_pop (TV_PCH_CPP_SAVE);
 
-  if (fseek (pch_outfile, 0, SEEK_SET) != 0
+  if (global_dc->pch_save (pch_outfile) < 0
+  || fseek (pch_outfile, 0, SEEK_SET) != 0
   || fwrite (get_ident (), IDENT_LENGTH, 1, pch_outfile) != 1)
 fatal_error (input_location, "cannot write %s: %m", pch_file);
 
@@ -359,6 +361,10 @@ c_common_read_pch (cpp_reader *pfile, co
   linemap_line_start (line_table, saved_loc.line, 

[PATCH] diagnostic: Use vec instead of custom array reallocations for m_classification_history/m_push_list [PR116847]

2024-09-26 Thread Jakub Jelinek
Hi!

diagnostic.h already relies on vec.h, it uses auto_vec in one spot.

The following patch converts m_classification_history and m_push_list
hand-managed arrays to vec templates.
The main advantage is exponential rather than linear reallocation,
e.g. with current libstdc++ headers if one includes all the standard
headers there could be ~ 300 reallocations of the m_classification_history
array (sure, not all of them will result in actually copying the data, but
still).
In addition to that it fixes some formatting issues in the code.

Bootstrapped on i686-linux so far, bootstrap/regtest on x86_64-linux and
i686-linux still pending, ok for trunk if it passes it?

2024-09-26  Jakub Jelinek  

PR libstdc++/116847
* diagnostic.h (diagnostic_option_classifier): Change type
of m_classification_history from diagnostic_classification_change_t *
to vec.  Change type of
m_push_list from int * to vec.  Remove m_n_classification_history
and m_n_push members.
* diagnostic.cc (diagnostic_option_classifier::init): Set m_push_list
to vNULL rather than nullptr.  Don't initialize m_n_push.  Initialize
m_classification_history to vNULL.
(diagnostic_option_classifier::fini): Call release () method on
m_push_list instead of free on it.  Call release () on
m_classification_history.  Don't clear m_n_push.
(diagnostic_option_classifier::push): Adjust for m_push_list and
m_classification_history being vectors rather than custom allocated
arrays with counter.
(diagnostic_option_classifier::pop): Likewise.
(classify_diagnostic): Adjust for m_classification_history being
vector rather than custom allocated array with counter.
(update_effective_level_from_pragmas): Likewise.

--- gcc/diagnostic.h.jj 2024-09-21 12:28:13.452940750 +0200
+++ gcc/diagnostic.h2024-09-26 15:42:05.696731787 +0200
@@ -287,14 +287,10 @@ private:
  binary-wise or end-to-front, to find the most recent
  classification for a given diagnostic, given the location of the
  diagnostic.  */
-  diagnostic_classification_change_t *m_classification_history;
-
-  /* The size of the above array.  */
-  int m_n_classification_history;
+  vec m_classification_history;
 
   /* For pragma push/pop.  */
-  int *m_push_list;
-  int m_n_push;
+  vec m_push_list;
 };
 
 /* A bundle of options relating to printing the user's source code
--- gcc/diagnostic.cc.jj2024-09-21 12:28:13.438940942 +0200
+++ gcc/diagnostic.cc   2024-09-26 16:11:51.664502977 +0200
@@ -143,8 +143,8 @@ diagnostic_option_classifier::init (int
   m_classify_diagnostic = XNEWVEC (diagnostic_t, n_opts);
   for (int i = 0; i < n_opts; i++)
 m_classify_diagnostic[i] = DK_UNSPECIFIED;
-  m_push_list = nullptr;
-  m_n_push = 0;
+  m_push_list = vNULL;
+  m_classification_history = vNULL;
 }
 
 void
@@ -152,8 +152,8 @@ diagnostic_option_classifier::fini ()
 {
   XDELETEVEC (m_classify_diagnostic);
   m_classify_diagnostic = nullptr;
-  free (m_push_list);
-  m_n_push = 0;
+  m_classification_history.release ();
+  m_push_list.release ();
 }
 
 /* Save all diagnostic classifications in a stack.  */
@@ -161,8 +161,7 @@ diagnostic_option_classifier::fini ()
 void
 diagnostic_option_classifier::push ()
 {
-  m_push_list = (int *) xrealloc (m_push_list, (m_n_push + 1) * sizeof (int));
-  m_push_list[m_n_push ++] = m_n_classification_history;
+  m_push_list.safe_push (m_classification_history.length ());
 }
 
 /* Restore the topmost classification set off the stack.  If the stack
@@ -173,19 +172,13 @@ diagnostic_option_classifier::pop (locat
 {
   int jump_to;
 
-  if (m_n_push)
-jump_to = m_push_list [-- m_n_push];
+  if (!m_push_list.is_empty ())
+jump_to = m_push_list.pop ();
   else
 jump_to = 0;
 
-  const int i = m_n_classification_history;
-  m_classification_history =
-(diagnostic_classification_change_t *) xrealloc (m_classification_history, 
(i + 1)
-* sizeof 
(diagnostic_classification_change_t));
-  m_classification_history[i].location = where;
-  m_classification_history[i].option = jump_to;
-  m_classification_history[i].kind = DK_POP;
-  m_n_classification_history ++;
+  diagnostic_classification_change_t v = { where, jump_to, DK_POP };
+  m_classification_history.safe_push (v);
 }
 
 /* Initialize the diagnostic message outputting machinery.  */
@@ -880,31 +873,27 @@ classify_diagnostic (const diagnostic_co
  the pragmas were.  */
   if (where != UNKNOWN_LOCATION)
 {
-  int i;
+  unsigned i;
 
   /* Record the command-line status, so we can reset it back on DK_POP. */
   if (old_kind == DK_UNSPECIFIED)
{
- old_kind = !context->option_enabled_p (option_id)
-   ? DK_IGNORED : DK_ANY;
+ old_kind = (!context->option_enabled_p (option_id)
+  

Re: [PATCH v2] libgcc, libstdc++: Make TU-local declarations in headers external linkage [PR115126]

2024-09-26 Thread Jakub Jelinek
On Thu, Sep 26, 2024 at 08:34:45PM +1000, Nathaniel Shead wrote:
> --- a/libgcc/gthr-posix.h
> +++ b/libgcc/gthr-posix.h
> @@ -44,6 +44,21 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  # endif
>  #endif
>  
> +#ifdef __has_attribute
> +# if __has_attribute(__always_inline__)
> +#  define __GTHREAD_ALWAYS_INLINE __attribute__((__always_inline__))
> +# endif
> +#endif
> +#ifndef __GTHREAD_ALWAYS_INLINE
> +# define __GTHREAD_ALWAYS_INLINE
> +#endif
> +
> +#ifdef __cplusplus
> +# define __GTHREAD_INLINE inline __GTHREAD_ALWAYS_INLINE
> +#else
> +# define __GTHREAD_INLINE static inline
> +#endif

Thanks.

> @@ -182,22 +197,29 @@ __gthrw(pthread_setschedparam)
>  
>  #if defined(__FreeBSD__) || (defined(__sun) && defined(__svr4__))
>  
> -static volatile int __gthread_active = -1;
> +#pragma GCC visibility push(hidden)
> +__GTHREAD_INLINE volatile int *
> +__gthread_active (void)
> +{
> +  static volatile int __gthread_active_var = -1;
> +  return &__gthread_active_var;
> +}
> +#pragma GCC visibility pop

I think something like the above

> -static void
> +__GTHREAD_INLINE void
>  __gthread_trigger (void)
>  {
> -  __gthread_active = 1;
> +  *__gthread_active () = 1;
>  }
>  
> -static inline int
> +__GTHREAD_INLINE int
>  __gthread_active_p (void)
>  {
>static pthread_mutex_t __gthread_active_mutex = PTHREAD_MUTEX_INITIALIZER;
>static pthread_once_t __gthread_active_once = PTHREAD_ONCE_INIT;

is needed also around this one.

> @@ -257,13 +279,15 @@ __gthrw2(__gthrw_(__pthread_key_create),
>  # define GTHR_ACTIVE_PROXY   __gthrw_(pthread_cancel)
>  #endif
>  
> -static inline int
> +#pragma GCC visibility push(hidden)
> +__GTHREAD_INLINE int
>  __gthread_active_p (void)
>  {
>static void *const __gthread_active_ptr
>  = __extension__ (void *) >HR_ACTIVE_PROXY;
>return __gthread_active_ptr != 0;
>  }
> +#pragma GCC visibility pop

And this one but you've added it to that one already.

Otherwise LGTM for the libgcc side, will defer to Jon for libstdc++ side.

Jakub



[PATCH] pretty-print: Fix up allocate_object

2024-09-26 Thread Jakub Jelinek
On Thu, Aug 29, 2024 at 06:58:12PM -0400, David Malcolm wrote:
> The following patch rewrites the internals of pp_format.

> The tokens and token lists are allocated on the chunk_obstack, and so
> there's no additional heap activity required, with the memory reclaimed
> when the chunk_obstack is freed after phase 3 of formatting.

> +static void *
> +allocate_object (size_t sz, obstack &s)
> +{
> +  /* We must not be half-way through an object.  */
> +  gcc_assert (obstack_base (&s) == obstack_next_free (&s));
> +
> +  obstack_grow (&s, obstack_base (&s), sz);
> +  void *buf = obstack_finish (&s);
> +  return buf;
>  }

I think this is wrong.  I hoped it would be the reason of the
unexpected libstdc++ warnings on certain architectures after
seeing
==4027220== Source and destination overlap in memcpy(0x4627154, 0x4627154, 12)
==4027220==at 0x404B93E: memcpy (vg_replace_strmem.c:1123)
==4027220==by 0xAAD5618: allocate_object(unsigned int, obstack&) 
(pretty-print.cc:1183)
==4027220==by 0xAAD8C0E: operator new (pretty-print.cc:1210)
==4027220==by 0xAAD8C0E: make (pretty-print-format-impl.h:305)
==4027220==by 0xAAD8C0E: format_phase_1 (pretty-print.cc:1659)
==4027220==by 0xAAD8C0E: pretty_printer::format(text_info&) 
(pretty-print.cc:1618)
==4027220==by 0xAAA840E: pp_format (pretty-print.h:583)
==4027220==by 0xAAA840E: 
diagnostic_context::report_diagnostic(diagnostic_info*) (diagnostic.cc:1260)
==4027220==by 0xAAA8703: 
diagnostic_context::diagnostic_impl(rich_location*, diagnostic_metadata const*, 
diagnostic_option_id, char const*, char**, diagnostic_t) (diagnostic.cc:1404)
==4027220==by 0xAAB8682: warning(diagnostic_option_id, char const*, ...) 
(diagnostic-global-context.cc:166)
==4027220==by 0x97725F5: warn_deprecated_use(tree_node*, tree_node*) 
(tree.cc:12485)
==4027220==by 0x8B6694B: mark_used(tree_node*, int) (decl2.cc:6121)
==4027220==by 0x8C9E25E: tsubst_expr(tree_node*, tree_node*, int, 
tree_node*) [clone .part.0] (pt.cc:21626)
==4027220==by 0x8C9E5E6: tsubst_expr(tree_node*, tree_node*, int, 
tree_node*) [clone .part.0] (pt.cc:20935)
==4027220==by 0x8C9E1D7: tsubst_expr(tree_node*, tree_node*, int, 
tree_node*) [clone .part.0] (pt.cc:20424)
==4027220==by 0x8C9DF2E: tsubst_expr(tree_node*, tree_node*, int, 
tree_node*) [clone .part.0] (pt.cc:20496)
==4027220== 
etc. valgrind warnings, unfortunately it is not, but still
I think this is a bug.
If the obstack has enough space in it, i.e. if obstack_room (&s) >= sz,
then obstack_grow from obstack_base will copy uninitialized bytes
through memcpy (obstack_base (&s), obstack_base (&s), sz);
(which pedantically isn't valid due to the overlap, and so
the reason why valgrind complains, but in reality I think most
implementations can handle it fine, after all, we also use it for
structure assignments which could have full or no overlap but never
partial).
If obstack_room (&s) < sz, then obstack_grow will first
_obstack_newchunk (&s, sz); which will allocate new memory and
copy the existing data of the object (but the above assertion
guartantees it will copy 0 bytes) and then the memcpy copies
sz bytes from the old base to the new (if unlucky, that could crash
as there could be end of page and unmapped next page in between).

I think we should use obstack_blank instead of obstack_grow, which
does everything obstack_grow does, except for the memcpy of the
uninitialized data.

Bootstrapped/regtested on i686-linux, ok for trunk?

2024-09-25  Jakub Jelinek  

* pretty-print.cc (allocate_object): Use obstack_blank rather than
obstack_grow.

--- gcc/pretty-print.cc.jj  2024-09-24 15:14:54.116162039 +0200
+++ gcc/pretty-print.cc 2024-09-25 22:12:57.776029725 +0200
@@ -1180,7 +1180,7 @@ allocate_object (size_t sz, obstack &s)
   /* We must not be half-way through an object.  */
   gcc_assert (obstack_base (&s) == obstack_next_free (&s));
 
-  obstack_grow (&s, obstack_base (&s), sz);
+  obstack_blank (&s, sz);
   void *buf = obstack_finish (&s);
   return buf;
 }


Jakub



[PATCH] c++: Add testcase for DR 2874

2024-09-25 Thread Jakub Jelinek
Hi!

Seems we already allow the partial specializations the way the DR clarifies,
so this patch just adds a testcase which verifies that.

Tested on x86_64-linux, ok for trunk?

2024-09-25  Jakub Jelinek  

* g++.dg/DRs/dr2874.C: New test.

--- gcc/testsuite/g++.dg/DRs/dr2874.C.jj2024-09-25 15:13:52.858410172 
+0200
+++ gcc/testsuite/g++.dg/DRs/dr2874.C   2024-09-25 15:15:24.042168486 +0200
@@ -0,0 +1,13 @@
+// DR 2874 - Qualified declarations of partial specializations
+// { dg-do compile { target c++11 } }
+
+namespace N {
+  template 
+  struct A;
+}
+
+template <>
+struct N::A ;
+
+template 
+struct N::A ; 


Jakub



[PATCH] c++: Add testcase for DR 2836

2024-09-25 Thread Jakub Jelinek
Hi!

Seems we already handle it the way the DR clarifies, if double/long double
and std::float64_t have the same mode, foo has long double type (while
x + y would be _Float64 in C23), so this patch just adds a testcase which
verifies that.

Tested on x86_64-linux, powerpc64-linux and powerpc64le-linux, ok for trunk?

2024-09-25  Jakub Jelinek  

* g++.dg/DRs/dr2836.C: New test.

--- gcc/testsuite/g++.dg/DRs/dr2836.C.jj2024-09-25 14:12:37.430315269 
+0200
+++ gcc/testsuite/g++.dg/DRs/dr2836.C   2024-09-25 14:16:44.868944354 +0200
@@ -0,0 +1,30 @@
+// DR 2836 - Conversion rank of long double and extended floating-point types
+// { dg-do compile { target c++23 } }
+// { dg-additional-options "-mlong-double-64" { target powerpc*-*-* s390*-*-* 
} }
+
+#include 
+
+#if defined (__STDCPP_FLOAT64_T__) \
+&& __LDBL_MAX_EXP__ == __FLT64_MAX_EXP__ \
+&& __LDBL_MANT_DIG__ == __FLT64_MANT_DIG__ \
+&& __DBL_MAX_EXP__ == __FLT64_MAX_EXP__ \
+&& __DBL_MANT_DIG__ == __FLT64_MANT_DIG__
+auto
+foo (long double x, std::float64_t y)
+{
+  return x + y;
+}
+
+template struct integral_constant {
+  static constexpr T value = v;
+};
+typedef integral_constant false_type;
+typedef integral_constant true_type;
+template
+struct is_same : false_type {};
+template 
+struct is_same : true_type {};
+
+static_assert (is_same ::value);
+
+#endif


Jakub



[PATCH] c++: Add testcase for DR 2728

2024-09-25 Thread Jakub Jelinek
Hi!

Seems we already handle delete expressions the way the DR clarifies,
so this patch just adds a testcase which verifies that.

Tested on x86_64-linux, ok for trunk?

2024-09-25  Jakub Jelinek  

* g++.dg/DRs/dr2728.C: New test.

--- gcc/testsuite/g++.dg/DRs/dr2728.C.jj2024-09-25 13:44:15.102534622 
+0200
+++ gcc/testsuite/g++.dg/DRs/dr2728.C   2024-09-25 13:44:09.876605940 +0200
@@ -0,0 +1,20 @@
+// DR 2728 - Evaluation of conversions in a delete-expression
+// { dg-do run }
+
+struct S {
+  S (int *x) : p (x) {}
+  operator int * () const { ++s; return p; }
+  int *p;
+  static int s;
+};
+int S::s;
+
+int
+main ()
+{
+  int *a = new int;
+  S s (a);
+  delete s;
+  if (S::s != 1)
+__builtin_abort ();
+}

Jakub



Re: [PATCH] Speed up get_bitmask_from_range

2024-09-25 Thread Jakub Jelinek
On Wed, Sep 25, 2024 at 01:42:04PM +0200, Richard Biener wrote:
> When min != max we know min ^ max != 0.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
>   * value-range.cc (get_bitmask_from_range): Remove redundant
>   compare of xorv with zero.

LGTM.

Jakub



gcc-patches@gcc.gnu.org

2024-09-25 Thread Jakub Jelinek
On Wed, Sep 25, 2024 at 01:41:51PM +0200, Richard Biener wrote:
> wide_int_storage shows up high in the profile for the testcase in
> PR114855 where the apparent issue is that the conditional jump
> on 'precision' after the (inlined) memcpy stalls the pipeline due
> to the data dependence and required store-to-load forwarding.  We
> can add scheduling freedom by instead testing precision as from the
> source which speeds up the function by 30%.  I've applied the
> same logic to the copy CTOR.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
>   * wide-int.h (wide_int_storage::wide_int_storage): Branch
>   on source precision to avoid data dependence on memcpy
>   destination.
>   (wide_int_storage::operator=): Likewise.

LGTM.

Jakub



Re: [PATCH] libgcc, libstdc++: Make more entities no longer TU-local [PR115126]

2024-09-25 Thread Jakub Jelinek
On Wed, Sep 25, 2024 at 12:18:07PM +0100, Jonathan Wakely wrote:
> > >  And whether similarly we couldn't use
> > > __attribute__((__visibility__ ("hidden"))) on the static block scope
> > > vars for C++ (again, if compiler supports that), so that the changes
> > > don't affect ABI of C++ libraries.
> >
> > That sounds good too.
> 
> Can you use visibility attributes on a local static? I get a warning
> that it's ignored.

Indeed :(

And #pragma GCC visibility push(hidden)/#pragma GCC visibility pop around
just the static block scope var definition does nothing.
If it is around the whole inline function though, then it seems to work.
Though, unsure if we want that around the whole header; wonder what it would
do with the weakrefs.

Jakub



Re: [PATCH] libgcc, libstdc++: Make more entities no longer TU-local [PR115126]

2024-09-25 Thread Jakub Jelinek
On Wed, Sep 25, 2024 at 10:43:50AM +0100, Jonathan Wakely wrote:
> > libgcc/ChangeLog:
> >
> > * gthr-posix.h (__GTHREAD_INLINE): New macro.
> > (__gthread_active): Convert from variable to function.
> > (__gthread_trigger): Mark as __GTHREAD_INLINE instead of static.
> > (__gthread_active_p): Likewise.
> > (__gthread_create): Likewise.
> > (__gthread_join): Likewise.
> > (__gthread_detach): Likewise.
> > (__gthread_equal): Likewise.
> > (__gthread_self): Likewise.
> > (__gthread_yield): Likewise.
> > (__gthread_once): Likewise.
> > (__gthread_key_create): Likewise.
> > (__gthread_key_delete): Likewise.
> > (__gthread_getspecific): Likewise.
> > (__gthread_setspecific): Likewise.
> > (__gthread_mutex_init_function): Likewise.
> > (__gthread_mutex_destroy): Likewise.
> > (__gthread_mutex_lock): Likewise.
> > (__gthread_mutex_trylock): Likewise.
> > (__gthread_mutex_timedlock): Likewise.
> > (__gthread_mutex_unlock): Likewise.
> > (__gthread_recursive_mutex_init_function): Likewise.
> > (__gthread_recursive_mutex_lock): Likewise.
> > (__gthread_recursive_mutex_trylock): Likewise.
> > (__gthread_recursive_mutex_timedlock): Likewise.
> > (__gthread_recursive_mutex_unlock): Likewise.
> > (__gthread_recursive_mutex_destroy): Likewise.
> > (__gthread_cond_init_function): Likewise.
> > (__gthread_cond_broadcast): Likewise.
> > (__gthread_cond_signal): Likewise.
> > (__gthread_cond_wait): Likewise.
> > (__gthread_cond_timedwait): Likewise.
> > (__gthread_cond_wait_recursive): Likewise.
> > (__gthread_cond_destroy): Likewise.
> > (__gthread_rwlock_rdlock): Likewise.
> > (__gthread_rwlock_tryrdlock): Likewise.
> > (__gthread_rwlock_wrlock): Likewise.
> > (__gthread_rwlock_trywrlock): Likewise.
> > (__gthread_rwlock_unlock): Likewise.
> > * gthr-single.h (__GTHREAD_INLINE): New macro.
> > (__gthread_active_p): Mark as __GTHREAD_INLINE instead of static.
> > (__gthread_once): Likewise.
> > (__gthread_key_create): Likewise.
> > (__gthread_key_delete): Likewise.
> > (__gthread_getspecific): Likewise.
> > (__gthread_setspecific): Likewise.
> > (__gthread_mutex_destroy): Likewise.
> > (__gthread_mutex_lock): Likewise.
> > (__gthread_mutex_trylock): Likewise.
> > (__gthread_mutex_unlock): Likewise.
> > (__gthread_recursive_mutex_lock): Likewise.
> > (__gthread_recursive_mutex_trylock): Likewise.
> > (__gthread_recursive_mutex_unlock): Likewise.
> > (__gthread_recursive_mutex_destroy): Likewise.

I'm worried about ABI consequences of these changes.
>From what I can see, this doesn't change anything important for C,
the inlines are still static inline and the conversion of global static
vars to function-local in static inline still keeps those static.

But for C++ this means significant changes.
Say
_ZZ16__gthread_activevE20__gthread_active_var
etc. will now be STB_GNU_UNIQUE symbol, exported from whatever shared
libraries and binaries are compiled with C++ and include those headers
and actually use them somewhere.

On one side, it has benefits (e.g. every TU won't separately test and
store whether the threads are active or not), on the other side it can
prevent dlclose of some shared libraries and create interdependencies
between libraries.

So, I wonder if we couldn't define the C++ __GTHREAD_INLINE to
static inline __attribute__((__always_inline__)) (or at least when
using a compiler which supports the attribute, we can now test it
using preprocessor...).  And whether similarly we couldn't use
__attribute__((__visibility__ ("hidden"))) on the static block scope
vars for C++ (again, if compiler supports that), so that the changes
don't affect ABI of C++ libraries.

Jakub



Re: [PATCH] i386, v2: Add GENERIC and GIMPLE folders of __builtin_ia32_{min,max}* [PR116738]

2024-09-25 Thread Jakub Jelinek
On Wed, Sep 25, 2024 at 11:36:33AM +0200, Richard Biener wrote:
> I'll note it would be much simpler if we could write x > y ? x : y in
> the intrinsic header.

Unfortunately, not so much.

It can do that only for simple cases like _mm_min_p{s,d} or similar,
which aren't masked, aren't the "scalar" ones and aren't the "rounding"
cases (rounding for these intrinsics is solely about disabling exceptions).

The "scalar" cases are actually vector ops, but the x < y ? x : y is
only in the first lane, the rest of lanes come from x and one really
can't represent it as say __builtin_shuffle with x < y ? x : y and x
operands because that throws different exceptions.

The masked cases are similarly a permutation, though this time with
yet another operand rather than one of the provided one.  Again, for
exceptions it can't be simple shuffle.  And one can actually mix the masked
and scalar case together.

And the round case is verification that the argument is constant 4 or 8,
if 4 it acts as the non-rounded one (but possibly masked etc.), if 8
with exceptions disabled.  For disabled exceptions, I'm afraid using
UNLE/UNGE rather than GT/LT doesn't help because they still exception on
sNaNs.

Jakub



[PATCH] i386, v2: Add GENERIC and GIMPLE folders of __builtin_ia32_{min,max}* [PR116738]

2024-09-25 Thread Jakub Jelinek
On Wed, Sep 25, 2024 at 10:17:50AM +0800, Hongtao Liu wrote:
> > +   for (int i = 0; i < 2; ++i)
> > + {
> > +   unsigned count = vector_cst_encoded_nelts (args[i]), j;
> > +   for (j = 0; j < count; ++j)
> > + if (!tree_expr_nan_p (VECTOR_CST_ENCODED_ELT (args[i], 
> > j)))
> Is this a typo? I assume you want to check if the component is NAN, so
> tree_expr_nan_p, not !tree_expr_nan_p?
> > +   break;
> > +   if (j < count)
> > + break;
> Also this break just break the outer loop(for (int i = 0; i < 2;
> i++)), but according to comments, it wants to break the outer switch?

You're right, thanks for catching that.  Fortunately both meant just that
it got NaNs optimized too and optimized the rest as it should.

I just wanted to avoid return NULL_TREE; or goto and screwed it up.

Here is a fixed version, tested additionally on looking at gimple dump on
typedef float __v4sf __attribute__((vector_size (16)));
__v4sf foo (void) { return __builtin_ia32_minss ((__v4sf) { __builtin_nanf 
(""), 0.f, 0.f, 0.f }, (__v4sf) { __builtin_inff (), 1.0f, 2.0f, 3.0f }); }
__v4sf bar (void) { return __builtin_ia32_minss ((__v4sf) { -__builtin_inff (), 
0.f, 0.f, 0.f }, (__v4sf) { __builtin_inff (), 1.0f, 2.0f, 3.0f }); }

Ok for trunk if it passes bootstrap/regtest?

2024-09-25  Jakub Jelinek  

PR target/116738
* config/i386/i386.cc (ix86_fold_builtin): Handle
IX86_BUILTIN_M{IN,AX}{S,P}{S,H,D}*.
(ix86_gimple_fold_builtin): Handle IX86_BUILTIN_M{IN,AX}P{S,H,D}*.

* gcc.target/i386/avx512f-pr116738-1.c: New test.
* gcc.target/i386/avx512f-pr116738-2.c: New test.

--- gcc/config/i386/i386.cc.jj  2024-09-24 18:54:24.120313544 +0200
+++ gcc/config/i386/i386.cc 2024-09-25 10:21:00.922417024 +0200
@@ -18507,6 +18507,8 @@ ix86_fold_builtin (tree fndecl, int n_ar
= (enum ix86_builtins) DECL_MD_FUNCTION_CODE (fndecl);
   enum rtx_code rcode;
   bool is_vshift;
+  enum tree_code tcode;
+  bool is_scalar;
   unsigned HOST_WIDE_INT mask;
 
   switch (fn_code)
@@ -18956,6 +18958,131 @@ ix86_fold_builtin (tree fndecl, int n_ar
}
  break;
 
+   case IX86_BUILTIN_MINSS:
+   case IX86_BUILTIN_MINSH_MASK:
+ tcode = LT_EXPR;
+ is_scalar = true;
+ goto do_minmax;
+
+   case IX86_BUILTIN_MAXSS:
+   case IX86_BUILTIN_MAXSH_MASK:
+ tcode = GT_EXPR;
+ is_scalar = true;
+ goto do_minmax;
+
+   case IX86_BUILTIN_MINPS:
+   case IX86_BUILTIN_MINPD:
+   case IX86_BUILTIN_MINPS256:
+   case IX86_BUILTIN_MINPD256:
+   case IX86_BUILTIN_MINPS512:
+   case IX86_BUILTIN_MINPD512:
+   case IX86_BUILTIN_MINPS128_MASK:
+   case IX86_BUILTIN_MINPD128_MASK:
+   case IX86_BUILTIN_MINPS256_MASK:
+   case IX86_BUILTIN_MINPD256_MASK:
+   case IX86_BUILTIN_MINPH128_MASK:
+   case IX86_BUILTIN_MINPH256_MASK:
+   case IX86_BUILTIN_MINPH512_MASK:
+ tcode = LT_EXPR;
+ is_scalar = false;
+ goto do_minmax;
+
+   case IX86_BUILTIN_MAXPS:
+   case IX86_BUILTIN_MAXPD:
+   case IX86_BUILTIN_MAXPS256:
+   case IX86_BUILTIN_MAXPD256:
+   case IX86_BUILTIN_MAXPS512:
+   case IX86_BUILTIN_MAXPD512:
+   case IX86_BUILTIN_MAXPS128_MASK:
+   case IX86_BUILTIN_MAXPD128_MASK:
+   case IX86_BUILTIN_MAXPS256_MASK:
+   case IX86_BUILTIN_MAXPD256_MASK:
+   case IX86_BUILTIN_MAXPH128_MASK:
+   case IX86_BUILTIN_MAXPH256_MASK:
+   case IX86_BUILTIN_MAXPH512_MASK:
+ tcode = GT_EXPR;
+ is_scalar = false;
+   do_minmax:
+ gcc_assert (n_args >= 2);
+ if (TREE_CODE (args[0]) != VECTOR_CST
+ || TREE_CODE (args[1]) != VECTOR_CST)
+   break;
+ mask = HOST_WIDE_INT_M1U;
+ if (n_args > 2)
+   {
+ gcc_assert (n_args >= 4);
+ /* This is masked minmax.  */
+ if (TREE_CODE (args[3]) != INTEGER_CST
+ || TREE_SIDE_EFFECTS (args[2]))
+   break;
+ mask = TREE_INT_CST_LOW (args[3]);
+ unsigned elems = TYPE_VECTOR_SUBPARTS (TREE_TYPE (args[0]));
+ mask |= HOST_WIDE_INT_M1U << elems;
+ if (mask != HOST_WIDE_INT_M1U
+ && TREE_CODE (args[2]) != VECTOR_CST)
+   break;
+ if (n_args >= 5)
+   {
+ if (!tree_fits_uhwi_p (args[4]))
+   break;
+ if (tree_to_uhwi (args[4]) != 4
+ && tree_to_uhwi (args[4]) != 8)
+   break;
+   }
+ if (mask == (HOST_WIDE_INT_M1U << elems))
+   return args[2];
+   }
+ /* Punt on NaN

Re: [PATCH] c++, v2: Implement C++23 P2718R0 - Wording for P2644R1 Fix for Range-based for Loop [PR107637]

2024-09-24 Thread Jakub Jelinek
On Tue, Sep 24, 2024 at 01:34:44PM -0400, Jason Merrill wrote:
> Let's also give an error for trying to disable it in C++23+.
> Missing function comment, maybe just use the one below?
> Please add a comment to this and range-for4 explaining that this is to get
> the fix enabled in GNU modes.
> 
> OK with those changes.

Done, committed now, thanks for the review.

I've also committed the following tweak for the status page:

diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
index d986fc79..76f6ef6d 100644
--- a/htdocs/projects/cxx-status.html
+++ b/htdocs/projects/cxx-status.html
@@ -576,7 +576,7 @@
 
Wording for P2644R1 Fix for Range-based for Loop 
https://wg21.link/p2718";>P2718R0
-   https://gcc.gnu.org/PR107637";>No
+   15
__cpp_range_based_for >= 202211L 
 
 

[PATCH] libcpp: Add -Wleading-whitespace= warning

2024-09-24 Thread Jakub Jelinek
Hi!

The following patch on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663388.html
patch adds -Wleading-whitespace= warning option.
This warning doesn't care how much one actually indents which line
in the source (that is something that can't be easily done in the
preprocessor without doing syntactic analysis), but just simple checks
on what kind of whitespace is used in the indentation.
I think it is still useful to get warnings about such issues early,
while git diagnoses some of it in patches (e.g. the tab after space
case), getting the warnings earlier might help avoiding such issues
sooner.

There are projects which ban use of tabs and require just spaces,
others which require indentation just with horizontal tabs, and finally
projects which want indentation with tabs for multiples of tabstop size
followed by spaces (fewer than tabstop size), like GCC.
For all 3 kinds the warning diagnoses indentation with '\v' or '\f'
characters (unless line contains just whitespace), and for the last one
also cases where a space in the indentation is followed by horizontal
tab or where there are N or more consecutive spaces in the indentation
(for -ftabstop=N).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

BTW, for additional testing I've enabled the warnings (without -Werror
for them) in stage3.  There are many warnings (both trailing and leading
whitespace), some of them something that can be easily fixed in the headers
or source files, but others with whitespace issues in generated sources,
so if we enable the warnings, either we'd need to adjust the generators
or disable the warnings in (some of the) generated files.

2024-09-24  Jakub Jelinek  

libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_leading_whitespace and cpp_tabstop members.
(enum cpp_warning_reason): Add CPP_W_LEADING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document new
line note kinds.
* init.cc (cpp_create_reader): Set cpp_tabstop to 8.
* lex.cc (find_leading_whitespace_issues): New function.
(_cpp_clean_line): Use it.
(_cpp_process_line_notes): Handle 'L', 'S' and 'T' line notes.
(lex_raw_string): Clear type on 'L', 'S' and 'T' line notes
inside of raw string literals.
gcc/
* doc/invoke.texi (Wleading-whitespace=): Document.
gcc/c-family/
* c.opt (Wleading-whitespace=): New option.
* c-opts.cc (c_common_post_options): Set cpp_opts->cpp_tabstop
to global_dc->m_tabstop.
gcc/testsuite/
* c-c++-common/cpp/Wleading-whitespace-1.c: New test.
* c-c++-common/cpp/Wleading-whitespace-2.c: New test.
* c-c++-common/cpp/Wleading-whitespace-3.c: New test.
* c-c++-common/cpp/Wleading-whitespace-4.c: New test.

--- libcpp/include/cpplib.h.jj  2024-09-23 16:08:40.846050280 +0200
+++ libcpp/include/cpplib.h 2024-09-23 17:09:32.250056701 +0200
@@ -594,9 +594,15 @@ struct cpp_options
   /* True if -finput-charset= option has been used explicitly.  */
   bool cpp_input_charset_explicit;
 
+  /* -Wleading-whitespace= value.  */
+  unsigned char cpp_warn_leading_whitespace;
+
   /* -Wtrailing-whitespace= value.  */
   unsigned char cpp_warn_trailing_whitespace;
 
+  /* -ftabstop= value.  */
+  unsigned int cpp_tabstop;
+
   /* Dependency generation.  */
   struct
   {
@@ -713,6 +719,7 @@ enum cpp_warning_reason {
   CPP_W_BIDIRECTIONAL,
   CPP_W_INVALID_UTF8,
   CPP_W_UNICODE,
+  CPP_W_LEADING_WHITESPACE,
   CPP_W_TRAILING_WHITESPACE
 };
 
--- libcpp/internal.h.jj2024-09-23 16:08:40.846050280 +0200
+++ libcpp/internal.h   2024-09-23 18:19:46.642467051 +0200
@@ -318,7 +318,8 @@ struct _cpp_line_note
 
   /* Type of note.  The 9 'from' trigraph characters represent those
  trigraphs, '\\' an escaped newline, ' ' an escaped newline with
- intervening space, 'W' trailing whitespace, 0 represents a note that
+ intervening space, 'W' trailing whitespace, 'L', 'S' and 'T' for
+ leading whitespace issues, 0 represents a note that
  has already been handled, and anything else is invalid.  */
   unsigned int type;
 };
--- libcpp/init.cc.jj   2024-09-20 08:57:03.041075703 +0200
+++ libcpp/init.cc  2024-09-23 17:24:53.564421636 +0200
@@ -246,6 +246,7 @@ cpp_create_reader (enum c_lang lang, cpp
   CPP_OPTION (pfile, cpp_warn_invalid_utf8) = 0;
   CPP_OPTION (pfile, cpp_warn_unicode) = 1;
   CPP_OPTION (pfile, cpp_input_charset_explicit) = 0;
+  CPP_OPTION (pfile, cpp_tabstop) = 8;
 
   /* Default CPP arithmetic to something sensible for the host for the
  benefit of dumb users like fix-header.  */
--- libcpp/lex.cc.jj2024-09-23 16:08:40.847050267 +0200
+++ libcpp/lex.cc   2024-09-24 09:32:57.293210930 +0200
@@ -818,6 +818,59 @@ _

[PATCH] i386: Add GENERIC and GIMPLE folders of __builtin_ia32_{min,max}* [PR116738]

2024-09-24 Thread Jakub Jelinek
Hi!

The following patch adds GENERIC and GIMPLE folders for various
x86 min/max builtins.
As discussed, these builtins have effectively x < y ? x : y
(or x > y ? x : y) behavior.
The GENERIC folding is done if all the (relevant) arguments are
constants (such as VECTOR_CST for vectors) and is done because
the GIMPLE folding can't easily handle masking, rounding and the
ss/sd cases (in a way that it would be pattern recognized back to the
corresponding instructions).  The GIMPLE folding is also done just
for TARGET_SSE4 or later when optimizing, otherwise it is apparently
not matched back.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-24  Jakub Jelinek  

PR target/116738
* config/i386/i386.cc (ix86_fold_builtin): Handle
IX86_BUILTIN_M{IN,AX}{S,P}{S,H,D}*.
(ix86_gimple_fold_builtin): Handle IX86_BUILTIN_M{IN,AX}P{S,H,D}*.

* gcc.target/i386/avx512f-pr116738-1.c: New test.
* gcc.target/i386/avx512f-pr116738-2.c: New test.

--- gcc/config/i386/i386.cc.jj  2024-09-12 10:56:57.344683959 +0200
+++ gcc/config/i386/i386.cc 2024-09-23 15:15:40.154783766 +0200
@@ -18507,6 +18507,8 @@ ix86_fold_builtin (tree fndecl, int n_ar
= (enum ix86_builtins) DECL_MD_FUNCTION_CODE (fndecl);
   enum rtx_code rcode;
   bool is_vshift;
+  enum tree_code tcode;
+  bool is_scalar;
   unsigned HOST_WIDE_INT mask;
 
   switch (fn_code)
@@ -18956,6 +18958,133 @@ ix86_fold_builtin (tree fndecl, int n_ar
}
  break;
 
+   case IX86_BUILTIN_MINSS:
+   case IX86_BUILTIN_MINSH_MASK:
+ tcode = LT_EXPR;
+ is_scalar = true;
+ goto do_minmax;
+
+   case IX86_BUILTIN_MAXSS:
+   case IX86_BUILTIN_MAXSH_MASK:
+ tcode = GT_EXPR;
+ is_scalar = true;
+ goto do_minmax;
+
+   case IX86_BUILTIN_MINPS:
+   case IX86_BUILTIN_MINPD:
+   case IX86_BUILTIN_MINPS256:
+   case IX86_BUILTIN_MINPD256:
+   case IX86_BUILTIN_MINPS512:
+   case IX86_BUILTIN_MINPD512:
+   case IX86_BUILTIN_MINPS128_MASK:
+   case IX86_BUILTIN_MINPD128_MASK:
+   case IX86_BUILTIN_MINPS256_MASK:
+   case IX86_BUILTIN_MINPD256_MASK:
+   case IX86_BUILTIN_MINPH128_MASK:
+   case IX86_BUILTIN_MINPH256_MASK:
+   case IX86_BUILTIN_MINPH512_MASK:
+ tcode = LT_EXPR;
+ is_scalar = false;
+ goto do_minmax;
+
+   case IX86_BUILTIN_MAXPS:
+   case IX86_BUILTIN_MAXPD:
+   case IX86_BUILTIN_MAXPS256:
+   case IX86_BUILTIN_MAXPD256:
+   case IX86_BUILTIN_MAXPS512:
+   case IX86_BUILTIN_MAXPD512:
+   case IX86_BUILTIN_MAXPS128_MASK:
+   case IX86_BUILTIN_MAXPD128_MASK:
+   case IX86_BUILTIN_MAXPS256_MASK:
+   case IX86_BUILTIN_MAXPD256_MASK:
+   case IX86_BUILTIN_MAXPH128_MASK:
+   case IX86_BUILTIN_MAXPH256_MASK:
+   case IX86_BUILTIN_MAXPH512_MASK:
+ tcode = GT_EXPR;
+ is_scalar = false;
+   do_minmax:
+ gcc_assert (n_args >= 2);
+ if (TREE_CODE (args[0]) != VECTOR_CST
+ || TREE_CODE (args[1]) != VECTOR_CST)
+   break;
+ mask = HOST_WIDE_INT_M1U;
+ if (n_args > 2)
+   {
+ gcc_assert (n_args >= 4);
+ /* This is masked minmax.  */
+ if (TREE_CODE (args[3]) != INTEGER_CST
+ || TREE_SIDE_EFFECTS (args[2]))
+   break;
+ mask = TREE_INT_CST_LOW (args[3]);
+ unsigned elems = TYPE_VECTOR_SUBPARTS (TREE_TYPE (args[0]));
+ mask |= HOST_WIDE_INT_M1U << elems;
+ if (mask != HOST_WIDE_INT_M1U
+ && TREE_CODE (args[2]) != VECTOR_CST)
+   break;
+ if (n_args >= 5)
+   {
+ if (!tree_fits_uhwi_p (args[4]))
+   break;
+ if (tree_to_uhwi (args[4]) != 4
+ && tree_to_uhwi (args[4]) != 8)
+   break;
+   }
+ if (mask == (HOST_WIDE_INT_M1U << elems))
+   return args[2];
+   }
+ /* Punt on NaNs, unless exceptions are disabled.  */
+ if (HONOR_NANS (args[0])
+ && (n_args < 5 || tree_to_uhwi (args[4]) != 8))
+   for (int i = 0; i < 2; ++i)
+ {
+   unsigned count = vector_cst_encoded_nelts (args[i]), j;
+   for (j = 0; j < count; ++j)
+ if (!tree_expr_nan_p (VECTOR_CST_ENCODED_ELT (args[i], j)))
+   break;
+   if (j < count)
+ break;
+ }
+ {
+   tree res = const_binop (tcode,
+   truth_type_for (TREE_TYPE (args[0])),
+   args[0], args[1]);
+   if (res == NULL_TREE || TREE_CODE (res) != VECTOR_CST)
+ break;

[committed] i386: Fix comment typo

2024-09-24 Thread Jakub Jelinek
Hi!

Found a comment typo, fixed as obvious.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2024-09-24  Jakub Jelinek  

* config/i386/i386-expand.cc (ix86_expand_round_builtin): Fix comment
typo, insead -> instead.

--- gcc/config/i386/i386-expand.cc.jj   2024-09-20 08:57:02.496083163 +0200
+++ gcc/config/i386/i386-expand.cc  2024-09-23 11:01:14.128079764 +0200
@@ -12748,7 +12748,7 @@ ix86_expand_round_builtin (const struct
  /* Skip erasing embedded rounding for below expanders who
 generates multiple insns.  In ix86_erase_embedded_rounding
 the pattern will be transformed to a single set, and emit_insn
-appends the set insead of insert it to chain.  So the insns
+appends the set instead of insert it to chain.  So the insns
 emitted inside define_expander would be ignored.  */
  switch (icode)
{

Jakub



[PATCH] c++, v2: Implement C++23 P2718R0 - Wording for P2644R1 Fix for Range-based for Loop [PR107637]

2024-09-24 Thread Jakub Jelinek
On Mon, Sep 23, 2024 at 03:46:36PM -0400, Jason Merrill wrote:
> > -frange-based-for-ext-temps
> > or do you have better suggestion?
> 
> I'd probably drop "based", "range-for" seems enough.
> 
> > Shall we allow also disabling it in C++23 or later modes, or override
> > user choice unconditionally for C++23+ and only allow users to
> > enable/disable it in C++11-C++20?
> 
> Hmm, I think the latter.
> 
> > What about the __cpp_range_based_for predefined macro?
> > Shall it be defined to the C++23 202211L value if the switch is on?
> > While that could be done in theory for C++17 and later code, for C++11/14
> > __cpp_range_based_for is 200907L and doesn't include the C++17
> > 201603L step.  Or keep the macro only for C++23 and later?
> 
> I think update the macro for 17 and later.

Ok.

Here is a new patch.

> > > > @@ -44600,11 +44609,14 @@ cp_convert_omp_range_for (tree &this_pre
> > > >  else
> > > > {
> > > >   range_temp = build_range_temp (init);
> > > > + tree name = DECL_NAME (range_temp);
> > > >   DECL_NAME (range_temp) = NULL_TREE;
> > > >   pushdecl (range_temp);
> > > > + DECL_NAME (range_temp) = name;
> > > >   cp_finish_decl (range_temp, init,
> > > >   /*is_constant_init*/false, NULL_TREE,
> > > >   LOOKUP_ONLYCONVERTING);
> > > > + DECL_NAME (range_temp) = NULL_TREE;
> > > 
> > > This messing with the name needs a rationale.  What wants it to be null?
> > 
> > I'll add comments.  The first = NULL_TREE; is needed so that pushdecl
> > doesn't register the temporary for name lookup, the = name now is so that
> > cp_finish_decl recognizes the temporary as range based for temporary
> > for the lifetime extension, and the last one is just to preserve previous
> > behavior, not have it visible in debug info etc.
> 
> But cp_convert_range_for doesn't ever set the name to NULL_TREE, why should
> the OMP variant be different?
> 
> Having it visible to name lookup in the debugger seems beneficial. Having it
> visible to the code seems less useful, but not important to prevent.

So, in the end it works fine even for the OpenMP case when not inside of a
template, all I had to add is the renaming of the symbol at the end after
pop_scope from "__for_range " to "__for_range" etc.
It doesn't work unfortunately during instantiation, we only create a single
scope in that case for the whole loop nest rather than one for each loop in
it and changing that isn't easy.  With the "__for_range " name in, if there
are 2+ range based for loops in the OpenMP loop nest (collapsed or ordered),
one gets then errors about defining it multiple times.
I'll try to fix that up at incrementally later, for now I just went with
a new flag to the function, so that it does the DECL_NAME dances only when
called from the instantiation (and confirmed actually all 3 spots are
needed, clearing before pushdecl, resetting back before cp_finish_decl and
clearing after cp_finish_decl, the last one so that pop_scope doesn't ICE
on seeing the name change).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-24  Jakub Jelinek  

PR c++/107637
gcc/
* omp-general.cc (find_combined_omp_for, find_nested_loop_xform):
Handle CLEANUP_POINT_EXPR like TRY_FINALLY_EXPR.
* doc/invoke.texi (frange-for-ext-temps): Document.  Add
-fconcepts to the C++ option list.
gcc/c-family/
* c.opt (frange-for-ext-temps): New option.
* c-opts.cc (c_common_post_options): Set flag_range_for_ext_temps
for C++23 or later or for C++11 or later in !flag_iso mode if
the option wasn't set by user.
* c-cppbuiltin.cc (c_cpp_builtins): Change __cpp_range_based_for
value for flag_range_for_ext_temps from 201603L to 202212L in C++17
or later.
* c-omp.cc (c_find_nested_loop_xform_r): Handle CLEANUP_POINT_EXPR
like TRY_FINALLY_EXPR.
gcc/cp/
* cp-tree.h: Implement C++23 P2718R0 - Wording for P2644R1 Fix for
Range-based for Loop.
(cp_convert_omp_range_for): Add bool tmpl_p argument.
(find_range_for_decls): Declare.
* parser.cc (cp_convert_range_for): For flag_range_for_ext_temps call
push_stmt_list () before cp_finish_decl for range_temp and save it
temporarily to FOR_INIT_STMT.
(cp_convert_omp_range_for): Add tmpl_p argument.  If set, remember
DECL_NAME of range_temp and for cp_finish_decl call restore it before
clearing it again, if u

Re: [PATCH] c++: Implement C++23 P2718R0 - Wording for P2644R1 Fix for Range-based for Loop [PR107637]

2024-09-23 Thread Jakub Jelinek
On Mon, Sep 23, 2024 at 11:32:59AM -0400, Jason Merrill wrote:
> On 8/9/24 9:06 PM, Jakub Jelinek wrote:
> > Hi!
> > 
> > The following patch implements the C++23 P2718R0 paper
> > - Wording for P2644R1 Fix for Range-based for Loop.
> > As all the temporaries from __for_range initialization should have life
> > extended until the end of __for_range scope, this patch disables (for C++23
> > and later only and if !processing_template_decl) CLEANUP_POINT_EXPR wrapping
> > of the __for_range declaration, also disables -Wdangling-reference warning
> > as well as the rest of extend_ref_init_temps (we know the __for_range 
> > temporary
> > is not TREE_STATIC and as all the temporaries from the initializer will be 
> > life
> > extended, we shouldn't try to handle temporaries referenced by references 
> > any
> > differently) and adds an extra push_stmt_list/pop_stmt_list before
> > cp_finish_decl of __for_range and after end of the for body and wraps all
> > that into CLEANUP_POINT_EXPR.
> > I had to repeat that also for OpenMP range loops because those are handled
> > differently.
> 
> Let's add a flag for this, not just control it with cxx_dialect.  We might
> want to consider enabling it by default in earlier modes when not being
> strictly conforming?

-frange-based-for-ext-temps
or do you have better suggestion?

Shall we allow also disabling it in C++23 or later modes, or override
user choice unconditionally for C++23+ and only allow users to
enable/disable it in C++11-C++20?

What about the __cpp_range_based_for predefined macro?
Shall it be defined to the C++23 202211L value if the switch is on?
While that could be done in theory for C++17 and later code, for C++11/14
__cpp_range_based_for is 200907L and doesn't include the C++17
201603L step.  Or keep the macro only for C++23 and later?
And if one can override -std=c++23 -fno-range-based-for-ext-temps,
shall it use the C++17 version?

> > @@ -44600,11 +44609,14 @@ cp_convert_omp_range_for (tree &this_pre
> > else
> > {
> >   range_temp = build_range_temp (init);
> > + tree name = DECL_NAME (range_temp);
> >   DECL_NAME (range_temp) = NULL_TREE;
> >   pushdecl (range_temp);
> > + DECL_NAME (range_temp) = name;
> >   cp_finish_decl (range_temp, init,
> >   /*is_constant_init*/false, NULL_TREE,
> >   LOOKUP_ONLYCONVERTING);
> > + DECL_NAME (range_temp) = NULL_TREE;
> 
> This messing with the name needs a rationale.  What wants it to be null?

I'll add comments.  The first = NULL_TREE; is needed so that pushdecl
doesn't register the temporary for name lookup, the = name now is so that
cp_finish_decl recognizes the temporary as range based for temporary
for the lifetime extension, and the last one is just to preserve previous
behavior, not have it visible in debug info etc.

Jakub



Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-23 Thread Jakub Jelinek
On Sun, Sep 22, 2024 at 08:45:40AM -0600, Sandra Loosemore wrote:
> On 9/21/24 22:52, Jakub Jelinek wrote:
> > On Sat, Sep 21, 2024 at 08:08:29PM -0600, Sandra Loosemore wrote:
> > > On 9/20/24 01:41, Jakub Jelinek wrote:
> > > > > +
> > > > > +   /* Check for unknown properties.  */
> > > > > if (omp_ts_map[ts_code].valid_properties == NULL)
> > > > >   continue;
> > > > > -
> > > > 
> > > > Why?
> > > 
> > > Why what?  I made this change because when I added another check in this
> > > loop I was temporarily confused about the control flow and I thought this
> > > would help.  If you're asking why it's doing the null check here, it's
> > > because it doesn't make sense to check properties on all selectors, like
> > > properties that can have implementation-defined values for other 
> > > compilers.
> > 
> > I meant why the empty line was removed?
> 
> To make it more clear that the null check is part of the code chunk to
> diagnose unknown properties -- each of which is now formatted the same way,
> with a blank line, comment to explain what it's doing, and no blank lines
> within the chunk.

Ok.

Jakub



Re: [PATCH] doc: Remove @code wrapping of fortran option names [PR116801]

2024-09-22 Thread Jakub Jelinek
On Sun, Sep 22, 2024 at 11:47:09PM +0200, Andreas Schwab wrote:
> On Sep 22 2024, Jakub Jelinek wrote:
> 
> > On Sun, Sep 22, 2024 at 10:52:37PM +0200, Andreas Schwab wrote:
> >> On Sep 22 2024, Mikael Morin wrote:
> >> 
> >> > @@ -370,7 +370,7 @@ Set the default accessibility of module entities to 
> >> > @code{PRIVATE}.
> >> >  Use-associated entities will not be accessible unless they are 
> >> > explicitly
> >> >  declared as @code{PUBLIC}.
> >> >  
> >> > -@opindex @code{ffixed-line-length-}@var{n}
> >> > +@opindex ffixed-line-length-@var{n}
> >> 
> >> Shouldn't all the @var{...} parts be dropped as well, throughout?
> >
> > We have it all over the other manuals:
> 
> But it causes them to not show up in the urls files.

It is more important what should appear in the manual.
The urls generating script can be always adjusted to account for the var
part after =/-

Jakub



Re: [PATCH] doc: Remove @code wrapping of fortran option names [PR116801]

2024-09-22 Thread Jakub Jelinek
On Sun, Sep 22, 2024 at 10:52:37PM +0200, Andreas Schwab wrote:
> On Sep 22 2024, Mikael Morin wrote:
> 
> > @@ -370,7 +370,7 @@ Set the default accessibility of module entities to 
> > @code{PRIVATE}.
> >  Use-associated entities will not be accessible unless they are explicitly
> >  declared as @code{PUBLIC}.
> >  
> > -@opindex @code{ffixed-line-length-}@var{n}
> > +@opindex ffixed-line-length-@var{n}
> 
> Shouldn't all the @var{...} parts be dropped as well, throughout?

We have it all over the other manuals:
grep @opindex.*@var doc/*.texi d/*.texi
doc/invoke.texi:@opindex fstrict-flex-arrays=@var{level}
doc/invoke.texi:@opindex Wlarger-than-@var{byte-size}
doc/invoke.texi:@opindex fmin-function-alignment=@var{n}
doc/invoke.texi:@opindex fdump-rtl-@var{pass}
doc/invoke.texi:@opindex masm=@var{dialect}
doc/invoke.texi:@opindex missue-rate=@var{number}
doc/invoke.texi:@opindex mbranch-cost=@var{number}
doc/invoke.texi:@opindex mflush-trap=@var{number}
doc/invoke.texi:@opindex mflush-func=@var{name}
doc/invoke.texi:@opindex mcustom-@var{insn}
doc/invoke.texi:@opindex mno-custom-@var{insn}
doc/invoke.texi:@opindex matomic-model=@var{model}
doc/invoke.texi:@opindex multcost=@var{number}
doc/invoke.texi:@opindex mdiv=@var{strategy}
doc/invoke.texi:@opindex mdivsi3_libfunc=@var{name}
doc/invoke.texi:@opindex mbranch-cost=@var{num}
doc/invoke.texi:@opindex mdebug-main=@var{prefix}
doc/invoke.texi:@opindex mpointer-size=@var{size}
doc/invoke.texi:@opindex masm=@var{dialect}
doc/invoke.texi:@opindex mtune-ctrl=@var{feature-list}
doc/invoke.texi:@opindex mstringop-strategy=@var{alg}
doc/invoke.texi:@opindex mmemcpy-strategy=@var{strategy}
doc/invoke.texi:@opindex mmemset-strategy=@var{strategy}

Jakub



Re: [PATCH] doc: Remove @code wrapping of fortran option names [PR116801]

2024-09-22 Thread Jakub Jelinek
On Sun, Sep 22, 2024 at 04:17:11PM +0200, Mikael Morin wrote:
> -@opindex @code{std=}@var{std} option
> +@opindex std=@var{std} option

IMHO this one should be just
@opindex std=@var{std}

The option part is superfluous.

> -@opindex @code{idirafter @var{dir}}
> +@opindex idirafter @var{dir}
> -@opindex @code{imultilib @var{dir}}
> +@opindex imultilib @var{dir}
> -@opindex @code{iprefix @var{prefix}}
> +@opindex iprefix @var{prefix}
> -@opindex @code{isysroot @var{dir}}
> +@opindex isysroot @var{dir}
> -@opindex @code{iquote @var{dir}}
> +@opindex iquote @var{dir}
> -@opindex @code{isystem @var{dir}}
> +@opindex isystem @var{dir}
> -@opindex @code{fintrinsic-modules-path} @var{dir}
> +@opindex fintrinsic-modules-path @var{dir}

For consistency with the other parts of the documentation, I'd
leave the " @var{dir}" part from all of these (sure, keep it as
is in @item/@itemx), seems nothing in the non-Fortran manual
uses arguments after space in the index:
grep @opindex..*' ' */*.texi
doc/invoke.texi:@opindex  mnopm
fortran/invoke.texi:@opindex @code{std=}@var{std} option
fortran/invoke.texi:@opindex @code{idirafter @var{dir}}
fortran/invoke.texi:@opindex @code{imultilib @var{dir}}
fortran/invoke.texi:@opindex @code{iprefix @var{prefix}}
fortran/invoke.texi:@opindex @code{isysroot @var{dir}}
fortran/invoke.texi:@opindex @code{iquote @var{dir}}
fortran/invoke.texi:@opindex @code{isystem @var{dir}}
fortran/invoke.texi:@opindex @code{fintrinsic-modules-path} @var{dir}
(grep before your patch; the mnopm case is just superfluous space
before the option name, should be fixed one day if it isn't ignored).

Otherwise LGTM.

Jakub



Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-21 Thread Jakub Jelinek
On Sat, Sep 21, 2024 at 08:08:29PM -0600, Sandra Loosemore wrote:
> On 9/20/24 01:41, Jakub Jelinek wrote:
> > > +
> > > +   /* Check for unknown properties.  */
> > > if (omp_ts_map[ts_code].valid_properties == NULL)
> > >   continue;
> > > -
> > 
> > Why?
> 
> Why what?  I made this change because when I added another check in this
> loop I was temporarily confused about the control flow and I thought this
> would help.  If you're asking why it's doing the null check here, it's
> because it doesn't make sense to check properties on all selectors, like
> properties that can have implementation-defined values for other compilers.

I meant why the empty line was removed?

Jakub



[PATCH] opts: Fix up regenerate-opt-urls dependencies

2024-09-21 Thread Jakub Jelinek
Hi!

It seems that we currently require
1) enabling at least c,c++,fortran,d in --enable-languages
2) first doing make html
before one can successfully regenerate-opt-urls, otherwise without 2)
one gets
make regenerate-opt-urls
make: *** No rule to make target 
'/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gcc/Option-Index.html', needed 
by 'regenerate-opt-urls'.  Stop.
or say if not configuring d after make html one still gets
make regenerate-opt-urls
make: *** No rule to make target 
'/home/jakub/src/gcc/obj12x/gcc/HTML/gcc-15.0.0/gdc/Option-Index.html', needed 
by 'regenerate-opt-urls'.  Stop.

Now, I believe neither 1) nor 2) is really necessary.
The regenerate-opt-urls goal has dependency on 3 Option-Index.html files,
but those files don't have dependencies how to generate them.
make html has dependency on $(HTMLS_BUILD) which adds
$(build_htmldir)/gcc/index.html and lang.html among other things, where
the former actually builds not just index.html but also Option-Index.html
and tons of other files, and lang.html is filled in by configure depending
on configured languages, so sometimes will include gfortran.html and
sometimes d.html.

The following patch adds dependencies of the Option-Index.html on their
corresponding index.html files and that is all that seems to be needed,
make regenerate-opt-urls then works even without prior make html and
even if just a subset of c/c++, fortran and d is enabled.

Ok for trunk?

2024-09-21  Jakub Jelinek  

* Makefile.in ($(OPT_URLS_HTML_DEPS)): Add dependencies of the
Option-Index.html files on the corresponding index.html files.
Don't mention the requirement that all languages that have their own
HTML manuals to be enabled.

--- gcc/Makefile.in.jj  2024-09-18 15:03:25.979207519 +0200
+++ gcc/Makefile.in 2024-09-21 19:26:31.160949856 +0200
@@ -3640,12 +3640,12 @@ $(build_htmldir)/gccinstall/index.html:
$(SHELL) $(srcdir)/doc/install.texi2html
 
 # Regenerate the .opt.urls files from the generated html, and from the .opt
-# files.  Doing so requires all languages that have their own HTML manuals
-# to be enabled.
+# files.
 .PHONY: regenerate-opt-urls
 OPT_URLS_HTML_DEPS = $(build_htmldir)/gcc/Option-Index.html \
$(build_htmldir)/gdc/Option-Index.html \
$(build_htmldir)/gfortran/Option-Index.html
+$(OPT_URLS_HTML_DEPS): %/Option-Index.html: %/index.html
 
 regenerate-opt-urls: $(srcdir)/regenerate-opt-urls.py $(OPT_URLS_HTML_DEPS)
$(srcdir)/regenerate-opt-urls.py $(build_htmldir) $(shell dirname 
$(srcdir))

Jakub



Re: *PING* [PATCH v3 10/10] fortran: Add -finline-intrinsics flag for MINLOC/MAXLOC [PR90608]

2024-09-21 Thread Jakub Jelinek
On Sat, Sep 21, 2024 at 04:32:38PM +0200, Mikael Morin wrote:
> Thanks for the tip.
> The Makefile dependencies seem to be incomplete.
> Here is what I get:

AFAIK right now one needs to configure at least c,c++,fortran,d
in --enable-languages (or some superset thereof) and
make -C gcc html
make -C gcc regenerate-opt-urls
or alternatively, if one doesn't actually configure all those languages,
make -C gcc html fortran.html d.html
make -C gcc regenerate-opt-urls
should handle it too.

And it should certainly update the existing source *.opt.urls files, works
for me...

I think we should just fix the regenerate-opt-urls dependencies to do the
right thing automatically.

Jakub



Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-21 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 07:03:45PM -0400, Jason Merrill wrote:
> > The CALL_EXPR case in cp_fold uses !flag_no_inline instead, that makes more
> > sense to me.
> > Because checking "noinline" attribute (which means don't inline this
> > function) on current_function_decl rather than on functions being "inlined"
> > (the constexpr functions being non-manifestly constant evaluated) is just
> > weird.
> > If we really wanted, we could honor "noinline" during constant evaluation
> > on the CALL_EXPR/AGGR_INIT_EXPR fndecls, but dunno if whenever doing the
> > non-manifestly constant evaluated cases or just in special cases like these
> > two (CALL_EXPR in cp_fold, this in cp_fold_r).
> 
> Checking noinline in non-manifestly constant-evaluated cases might make
> sense.

Though, if somebody marks some function explicitly constexpr they should be
prepared to get some constexpr evaluation of it, doesn't have to be strictly
standard required one.
And for -fimplicit-constexpr we already have "noinline" attribute check, so
maybe it is ok as is.

Jakub



Re: [PATCH] c++: compile time evaluation of prvalues [PR116416]

2024-09-20 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 06:18:15PM -0400, Marek Polacek wrote:
> --- a/gcc/cp/cp-gimplify.cc
> +++ b/gcc/cp/cp-gimplify.cc
> @@ -1473,6 +1473,20 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void 
> *data_)
>that case, strip it in favor of this one.  */
>if (tree &init = TARGET_EXPR_INITIAL (stmt))
>   {
> +   tree fn;
> +   if ((data->flags & ff_genericize)
> +   /* Give the user an option to opt out.  */
> +   && !((fn = current_function_decl)
> +&& lookup_attribute ("noinline",
> + DECL_ATTRIBUTES (fn
> + {
> +   tree folded = maybe_constant_init (init, TARGET_EXPR_SLOT (stmt));
> +   if (folded != init && TREE_CONSTANT (folded))
> + {
> +   init = folded;
> +   break;
> + }
> + }

The CALL_EXPR case in cp_fold uses !flag_no_inline instead, that makes more
sense to me.
Because checking "noinline" attribute (which means don't inline this
function) on current_function_decl rather than on functions being "inlined"
(the constexpr functions being non-manifestly constant evaluated) is just
weird.
If we really wanted, we could honor "noinline" during constant evaluation
on the CALL_EXPR/AGGR_INIT_EXPR fndecls, but dunno if whenever doing the
non-manifestly constant evaluated cases or just in special cases like these
two (CALL_EXPR in cp_fold, this in cp_fold_r).

Jakub



C++ patch ping

2024-09-20 Thread Jakub Jelinek
Hi!

I'd like to ping 16 C++ patches:

https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660046.html
  c++: Implement C++23 P2718R0 - Wording for P2644R1 Fix for Range-based for 
Loop [PR107637]

https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662507.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662750.html
  CWG 2867 - Order of initialization for structured bindings - rest of 
implementation [PR115769]

https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661904.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661905.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661906.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662330.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662331.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662333.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662334.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662336.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662379.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662380.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662381.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662545.html
  P2552R3 - On the ignorability of standard attributes - series [PR110345]

https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659836.html
  c++: Attempt to implement C++26 P3034R1 - Module Declarations Shouldn't be 
Macros [PR114461]

Thanks

Jakub



Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-20 Thread Jakub Jelinek
On Wed, Sep 18, 2024 at 02:50:39PM -0600, Sandra Loosemore wrote:
> Here's the revised patch, covering both restrictions.  OK to commit?

Thanks.

> +   /* Each trait-property can only be specified once in a trait-selector
> +  other than the construct selector set.  FIXME: only handles
> +  name-list properties, not clause-list properties, since the
> +  "requires" selector is not implemented yet (PR 113067).  */
> +   if (tss_code != OMP_TRAIT_SET_CONSTRUCT)
> + for (tree p1 = OMP_TS_PROPERTIES (ts); p1; p1 = TREE_CHAIN (p1))
> +   {
> + if (OMP_TP_NAME (p1) != OMP_TP_NAMELIST_NODE)
> +   break;
> + const char *n1 = omp_context_name_list_prop (p1);
> + if (!n1)
> +   continue;
> + for (tree p2 = TREE_CHAIN (p1); p2; p2 = TREE_CHAIN (p2))
> +   {
> + const char *n2 = omp_context_name_list_prop (p2);
> + if (!n2)
> +   continue;
> + if (!strcmp (n1, n2))
> +   {
> + error_at (loc,
> +   "trait-property %qs specified more "
> +   "than once in %qs selector",
> +   n1, OMP_TS_NAME (ts));
> + return error_mark_node;
> +   }
> +   }
> +   }

This is O(n^2) in number of properties, but let's hope people are sane and
don't try:
#define A(x) n#x,
#define B(x) A(x##0) A(x##1) A(x##2) A(x##3) A(x##4) A(x##5) A(x##6) A(x##7) 
A(x##8) A(x##9)
#define C(x) B(x##0) B(x##1) B(x##2) B(x##3) B(x##4) B(x##5) B(x##6) B(x##7) 
B(x##8) B(x##9)
#define D(x) C(x##0) C(x##1) C(x##2) C(x##3) C(x##4) C(x##5) C(x##6) C(x##7) 
C(x##8) C(x##9)
#define E(x) D(x##0) D(x##1) D(x##2) D(x##3) D(x##4) D(x##5) D(x##6) D(x##7) 
D(x##8) D(x##9)
isa(E(1), avx512f)
or something similar.
Otherwise, we'd need a hash table if the list is longer than say 20 entries
(of course, that could be detected in the first inner loop and switch for
that to the different implementation).

> +
> +   /* Check for unknown properties.  */
> if (omp_ts_map[ts_code].valid_properties == NULL)
>   continue;
> -

Why?

Otherwise LGTM.

Jakub



Re: [PATCH] i386: Fix up _mm_min_ss etc. handling of zeros and NaNs [PR116738]

2024-09-19 Thread Jakub Jelinek
On Fri, Sep 20, 2024 at 08:01:58AM +0200, Richard Biener wrote:
> > P.S. I have a patch to replace UNSPEC_IEEE_M{AX,IN} with IF_THEN_ELSE
> > (except for the 3dNOW! PFMIN/MAX, those actually are documented to behave
> > differently), but it actually doesn't improve anything much, as
> > simplify_const_relational_operation nor simplify_ternary_operation aren't
> > able to fold comparisons with two CONST_VECTOR operands or IF_THEN_ELSE
> > with 3 CONST_VECTOR operands.
> > So, maybe better approach will be to generic fold the builtins with constant
> > arguments (maybe leaving NaNs to runtime).
> 
> It would be possible to fold them in the gimple folding hook to VEC_COND_EXPRs
> with the chance the min/max operation being lost when expanding to RTL.

Sure, but we don't actually pattern recognize 
typedef float v4sf __attribute__((vector_size (sizeof (4 * sizeof (float);

v4sf
foo (v4sf x, v4sf y)
{
  return x < y ? y : x;
}
back to maxpd etc.  So it wouldn't be an optimization in most cases, at
least until we do that, user was looking for such insn or better with 
_mm_max_ps...
Maybe we should.

For scalar ('-Dvector_size(x)=') this is currently matched in ce2.

Exception-wise, seems the insn raise Invalid on NaN input (either) and if y
is SNaN, actually propagate it rather than turn it into QNaN, so I think it
is actually an exact match for x < y ? y : x (or x > y ? y : x).

Jakub



Re: Re: *PING* [PATCH v3 10/10] fortran: Add -finline-intrinsics flag for MINLOC/MAXLOC [PR90608]

2024-09-19 Thread Jakub Jelinek
On Mon, Sep 16, 2024 at 10:52:43AM +0200, Mikael Morin wrote:
> > While I understand the intent of 'positive form' vs 'negative form', the
> > above might be clearer as
> > 
> > Usage of intrinsics can be implemented either by generating a call
> > to the libgfortran library function or by directly generating inline
> > code.  For most intrinsics, only a single variant is available, and
> > there is no choice of implementation.  However, some intrinsics can
> > use a library function or inline code, wher inline code typically offers
> > opportunities for additional optimization over a library function.
> > With @code{-finline-intrinsics=...} or 
> > @code{-fno-inline-intrinsics=...}, the
> > choice applies only to the intrinsics present in the comma-separated 
> > list
> > provided as argument.
> > 
> > > > +For each intrinsic, if no choice of implementation was made through 
> > > > either of
> > > > +the flag variants, a default behaviour is chosen depending on 
> > > > optimization:
> > > > +library calls are generated when not optimizing or when optimizing for 
> > > > size;
> > > > +otherwise inline code is preferred.
> > > > +
> > 
> > 
> > OK with consideration the above comments.
> > 
> 
> Harald actually gave a partial green light on this already, but obviously
> there was still room for improvement.
> Thanks for the review, I'm incorporating the changes you suggested.
> 
> I was (and still am) waiting for a review from someone knowledgeable in the
> options system.  I'm considering proceeding without, as I prefer seeing this
> pushed sooner than later.

Just note lang.opt.urls will need to be updated, either you do it right away
with make regenerate-opt-urls or commit, wait for a nag-mail from CI and
commit incrementally the patch it creates.

Jakub



[PATCH] c++: Use type_id_in_expr_sentinel in 6 further spots in the parser

2024-09-19 Thread Jakub Jelinek
Hi!

The following patch uses type_id_in_expr_sentinel in a few spots which
did it all manually.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-19  Jakub Jelinek  

* parser.cc (cp_parser_postfix_expression): Use
type_id_in_expr_sentinel instead of manually saving+setting/restoring
parser->in_type_id_in_expr_p around cp_parser_type_id calls.
(cp_parser_has_attribute_expression): Likewise.
(cp_parser_cast_expression): Likewise.
(cp_parser_sizeof_operand): Likewise.

--- gcc/cp/parser.cc.jj 2024-09-07 09:31:20.708482757 +0200
+++ gcc/cp/parser.cc2024-09-19 10:46:21.916155154 +0200
@@ -7554,7 +7554,6 @@ cp_parser_postfix_expression (cp_parser
tree type;
cp_expr expression;
const char *saved_message;
-   bool saved_in_type_id_in_expr_p;
 
/* All of these can be handled in the same way from the point
   of view of parsing.  Begin by consuming the token
@@ -7569,11 +7568,11 @@ cp_parser_postfix_expression (cp_parser
/* Look for the opening `<'.  */
cp_parser_require (parser, CPP_LESS, RT_LESS);
/* Parse the type to which we are casting.  */
-   saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
-   parser->in_type_id_in_expr_p = true;
-   type = cp_parser_type_id (parser, CP_PARSER_FLAGS_TYPENAME_OPTIONAL,
- NULL);
-   parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
+   {
+ type_id_in_expr_sentinel s (parser);
+ type = cp_parser_type_id (parser, CP_PARSER_FLAGS_TYPENAME_OPTIONAL,
+   NULL);
+   }
/* Look for the closing `>'.  */
cp_parser_require_end_of_template_parameter_list (parser);
/* Restore the old message.  */
@@ -7643,7 +7642,6 @@ cp_parser_postfix_expression (cp_parser
   {
tree type;
const char *saved_message;
-   bool saved_in_type_id_in_expr_p;
 
/* Consume the `typeid' token.  */
cp_lexer_consume_token (parser->lexer);
@@ -7658,10 +7656,10 @@ cp_parser_postfix_expression (cp_parser
   expression.  */
cp_parser_parse_tentatively (parser);
/* Try a type-id first.  */
-   saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
-   parser->in_type_id_in_expr_p = true;
-   type = cp_parser_type_id (parser);
-   parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
+   {
+ type_id_in_expr_sentinel s (parser);
+ type = cp_parser_type_id (parser);
+   }
/* Look for the `)' token.  Otherwise, we can't be sure that
   we're not looking at an expression: consider `typeid (int
   (3))', for example.  */
@@ -7916,10 +7914,8 @@ cp_parser_postfix_expression (cp_parser
else
  {
/* Parse the type.  */
-   bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
-   parser->in_type_id_in_expr_p = true;
+   type_id_in_expr_sentinel s (parser);
type = cp_parser_type_id (parser);
-   parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
parens.require_close (parser);
  }
 
@@ -9502,11 +9498,11 @@ cp_parser_has_attribute_expression (cp_p
  expression.  */
   cp_parser_parse_tentatively (parser);
 
-  bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
-  parser->in_type_id_in_expr_p = true;
-  /* Look for the type-id.  */
-  oper = cp_parser_type_id (parser);
-  parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
+  {
+type_id_in_expr_sentinel s (parser);
+/* Look for the type-id.  */
+oper = cp_parser_type_id (parser);
+  }
 
   cp_parser_parse_definitely (parser);
 
@@ -10268,15 +10264,13 @@ cp_parser_cast_expression (cp_parser *pa
cp_parser_simulate_error (parser);
   else
{
- bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
- parser->in_type_id_in_expr_p = true;
+ type_id_in_expr_sentinel s (parser);
  /* Look for the type-id.  */
  type = cp_parser_type_id (parser);
  /* Look for the closing `)'.  */
  cp_token *close_paren = parens.require_close (parser);
  if (close_paren)
close_paren_loc = close_paren->location;
- parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
}
 
   /* Restore the saved message.  */
@@ -34299,13 +34293,11 @@ cp_parser_sizeof_operand (cp_parser* par
cp_parser_simulate_error (parser);
   else
{
- bool saved_in_type_id_in_expr_p = parser->in_type_id_in_expr_p;
- parser->in_type_id_in_expr_p = true;
+ type_id_in_expr_sentinel s (parser);
  /* Look for the type-id.  */
  type = cp_

[PATCH] i386: Fix up _mm_min_ss etc. handling of zeros and NaNs [PR116738]

2024-09-19 Thread Jakub Jelinek
Hi!

min/max patterns for intrinsics which on x86 result in the second
input operand if the two operands are both zeros or one or both of them
are a NaN shouldn't use SMIN/SMAX RTL, because that is similarly to
MIN_EXPR/MAX_EXPR undefined what will be the result in those cases.

The following patch adds an expander which uses either a new pattern with
UNSPEC_IEEE_M{AX,IN} or use the S{MIN,MAX} representation of the same.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

P.S. I have a patch to replace UNSPEC_IEEE_M{AX,IN} with IF_THEN_ELSE
(except for the 3dNOW! PFMIN/MAX, those actually are documented to behave
differently), but it actually doesn't improve anything much, as
simplify_const_relational_operation nor simplify_ternary_operation aren't
able to fold comparisons with two CONST_VECTOR operands or IF_THEN_ELSE
with 3 CONST_VECTOR operands.
So, maybe better approach will be to generic fold the builtins with constant
arguments (maybe leaving NaNs to runtime).

2024-09-19  Uros Bizjak  
        Jakub Jelinek  

PR target/116738
* config/i386/subst.md (mask_scalar_operand_arg34,
mask_scalar_expand_op3, round_saeonly_scalar_mask_arg3): New
subst attributes.
* config/i386/sse.md
(_vm3):
Change from define_insn to define_expand, rename the old define_insn
to ...
(*_vm3):
... this.

(_ieee_vm3):
New define_insn.

* gcc.target/i386/sse-pr116738.c: New test.

--- gcc/config/i386/subst.md.jj 2024-09-18 15:49:42.200791315 +0200
+++ gcc/config/i386/subst.md2024-09-19 12:32:51.048626421 +0200
@@ -366,6 +366,8 @@ (define_subst_attr "mask_scalar_operand4
 (define_subst_attr "mask_scalarcz_operand4" "mask_scalarcz" "" "%{%5%}%N4")
 (define_subst_attr "mask_scalar4_dest_false_dep_for_glc_cond" "mask_scalar" 
"1" "operands[4] == CONST0_RTX(mode)")
 (define_subst_attr "mask_scalarc_dest_false_dep_for_glc_cond" "mask_scalarc" 
"1" "operands[3] == CONST0_RTX(V8HFmode)")
+(define_subst_attr "mask_scalar_operand_arg34" "mask_scalar" "" ", 
operands[3], operands[4]")
+(define_subst_attr "mask_scalar_expand_op3" "mask_scalar" "3" "5")
 
 (define_subst "mask_scalar"
   [(set (match_operand:SUBST_V 0)
@@ -473,6 +475,7 @@ (define_subst_attr "round_saeonly_scalar
 (define_subst_attr "round_saeonly_scalar_constraint" "round_saeonly_scalar" 
"vm" "v")
 (define_subst_attr "round_saeonly_scalar_prefix" "round_saeonly_scalar" "vex" 
"evex")
 (define_subst_attr "round_saeonly_scalar_nimm_predicate" 
"round_saeonly_scalar" "nonimmediate_operand" "register_operand")
+(define_subst_attr "round_saeonly_scalar_mask_arg3" "round_saeonly_scalar" "" 
", operands[]")
 
 (define_subst "round_saeonly_scalar"
   [(set (match_operand:SUBST_V 0)
--- gcc/config/i386/sse.md.jj   2024-09-10 16:26:02.875151133 +0200
+++ gcc/config/i386/sse.md  2024-09-19 12:43:31.693030695 +0200
@@ -,7 +,27 @@ (define_insn "*ieee_3
   (const_string "*")))
(set_attr "mode" "")])
 
-(define_insn 
"_vm3"
+(define_expand 
"_vm3"
+  [(set (match_operand:VFH_128 0 "register_operand")
+   (vec_merge:VFH_128
+ (smaxmin:VFH_128
+   (match_operand:VFH_128 1 "register_operand")
+   (match_operand:VFH_128 2 "nonimmediate_operand"))
+(match_dup 1)
+(const_int 1)))]
+  "TARGET_SSE"
+{
+  if (!flag_finite_math_only || flag_signed_zeros)
+{
+  emit_insn 
(gen__ieee_vm3
+(operands[0], operands[1], operands[2]
+ 
+ ));
+  DONE;
+}
+})
+
+(define_insn 
"*_vm3"
   [(set (match_operand:VFH_128 0 "register_operand" "=x,v")
(vec_merge:VFH_128
  (smaxmin:VFH_128
@@ -3348,6 +3368,25 @@ (define_insn "_vm3")
+   (set_attr "mode" "")])
+
+(define_insn 
"_ieee_vm3"
+  [(set (match_operand:VFH_128 0 "register_operand" "=x,v")
+   (vec_merge:VFH_128
+ (unspec:VFH_128
+   [(match_operand:VFH_128 1 "register_operand" "0,v")
+(match_operand:VFH_128 2 "nonimmediate_operand" 
"xm,")]
+   IEEE_MAXMIN)
+(match_dup 1)
+(const_int 1)))]
+  "TARGET_SSE"
+  "@
+   \t{%2, %0|%0, %2}
+   v\t{%2, 
%1, %0|%0, %1, 
%2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sse")
+   (set_attr "btv

[PATCH] libcpp, v2: Add -Wtrailing-whitespace= warning

2024-09-19 Thread Jakub Jelinek
On Thu, Sep 19, 2024 at 08:17:24AM +0200, Richard Biener wrote:
> On Wed, Sep 18, 2024 at 7:33 PM Jakub Jelinek  wrote:
> >
> > On Wed, Sep 18, 2024 at 06:17:58PM +0100, Richard Sandiford wrote:
> > > +1  I'd much rather learn about this kind of error before the code reaches
> > > a review tool :)
> > >
> > > >From a quick check, it doesn't look like Clang has this, so there is no
> > > existing name to follow.
> >
> > I was considering also -Wtrailing-whitespace, but
> > 1) git diff really warns just about trailing spaces/tabs, not form feeds or
> > vertical tabs
> > 2) gcc source contains tons of spots with form feed in it (though,
> > I think pretty much always as the sole character on a line).
> > And not really sure how people use vertical tabs in the source if at all.
> > Perhaps form feed could be not warned if at end of line if it isn't the sole
> > character on a line...
> 
> Generally I like diagnosing this early.  For the above I'd say
> -Wtrailing-whitespace=
> with a set of things to diagnose (and a sane default - just spaces and
> tabs - for
> -Wtrailiing-whitespace) would be nice.  As for naming possibly follow the
> is{space,blank,cntrl} character classifications?  If those are a good
> fit, that is.

Here is a patch which currently allows blank (' ' '\t') and space (' ' '\t'
'\f' '\v'), cntrl not yet added, not anything non-ASCII, but in theory could
be added later (though, non-ASCII would be just for inside of comments,
say non-breaking space etc. in the source is otherwise an error).

Bootstrapped/regtested on x86_64-linux and i686-linux.

2024-09-19  Jakub Jelinek  

libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_trailing_whitespace member.
(enum cpp_warning_reason): Add CPP_W_TRAILING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document 'W' line note.
* lex.cc (_cpp_clean_line): Add 'W' line note for trailing whitespace
except for trailing whitespace after backslash.  Formatting fix.
(_cpp_process_line_notes): Emit -Wtrailing-whitespace diagnostics.
Formatting fixes.
(lex_raw_string): Clear type on 'W' notes.
gcc/
* doc/invoke.texi (Wtrailing-whitespace): Document.
gcc/c-family/
* c.opt (Wtrailing-whitespace=): New option.
(Wtrailing-whitespace): New alias.
gcc/testsuite/
* c-c++-common/cpp/Wtrailing-whitespace-1.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-2.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-3.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-4.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-5.c: New test.

--- libcpp/include/cpplib.h.jj  2024-09-13 16:09:32.690455174 +0200
+++ libcpp/include/cpplib.h 2024-09-19 16:59:09.674903649 +0200
@@ -594,6 +594,9 @@ struct cpp_options
   /* True if -finput-charset= option has been used explicitly.  */
   bool cpp_input_charset_explicit;
 
+  /* -Wtrailing-whitespace= value.  */
+  unsigned char cpp_warn_trailing_whitespace;
+
   /* Dependency generation.  */
   struct
   {
@@ -709,7 +712,8 @@ enum cpp_warning_reason {
   CPP_W_EXPANSION_TO_DEFINED,
   CPP_W_BIDIRECTIONAL,
   CPP_W_INVALID_UTF8,
-  CPP_W_UNICODE
+  CPP_W_UNICODE,
+  CPP_W_TRAILING_WHITESPACE
 };
 
 /* Callback for header lookup for HEADER, which is the name of a
--- libcpp/internal.h.jj2024-09-18 09:45:36.832570227 +0200
+++ libcpp/internal.h   2024-09-19 16:54:56.610321817 +0200
@@ -318,8 +318,8 @@ struct _cpp_line_note
 
   /* Type of note.  The 9 'from' trigraph characters represent those
  trigraphs, '\\' an escaped newline, ' ' an escaped newline with
- intervening space, 0 represents a note that has already been handled,
- and anything else is invalid.  */
+ intervening space, 'W' trailing whitespace, 0 represents a note that
+ has already been handled, and anything else is invalid.  */
   unsigned int type;
 };
 
--- libcpp/lex.cc.jj2024-09-13 16:09:32.720454758 +0200
+++ libcpp/lex.cc   2024-09-19 16:58:37.434339128 +0200
@@ -928,7 +928,7 @@ _cpp_clean_line (cpp_reader *pfile)
  if (p == buffer->next_line || p[-1] != '\\')
break;
 
- add_line_note (buffer, p - 1, p != d ? ' ': '\\');
+ add_line_note (buffer, p - 1, p != d ? ' ' : '\\');
  d = p - 2;
  buffer->next_line = p - 1;
}
@@ -943,6 +943,20 @@ _cpp_clean_line (cpp_reader *pfile)
}
}
}
+ done:
+  if (d > buffer->next_line
+ && CPP_OPTION (pfile, cpp_warn_trailing_whitespace))
+   switch (CPP

Re: [PATCH RFC] build: update bootstrap req to C++14

2024-09-19 Thread Jakub Jelinek
On Thu, Sep 19, 2024 at 10:21:15AM -0400, Jason Merrill wrote:
> On 9/19/24 7:57 AM, Richard Biener wrote:
> > On Wed, Sep 18, 2024 at 6:22 PM Jason Merrill  wrote:
> > > 
> > > Tested x86_64-pc-linux-gnu with 5.5.0 bootstrap compiler.  Thoughts?
> > 
> > I'm fine with this in general - do we have needs of bumping the requirement 
> > for
> > GCC 15 though?  IMO we should bump once we are requiring actual C++14
> > in some place.
> 
> Jakub's dwarf2asm patch yesterday uses C++14 if available, and I remember

And libcpp too.

> seeing a couple of other patches that would have been simpler with C++14
> available.

It was just a few lines and if I removed the now never true
HAVE_DESIGNATED_INITIALIZERS cases, it wouldn't even add any new lines, just
change some to others.  Both of those patches were just minor optimizations,
it is fine if they don't happen during stage1.

We also have some spots with
#if __cpp_inline_variables < 201606L
#else
#endif
conditionals but that doesn't mean we need to bump to C++17.

Sure, bumping the required C++ version means we can remove the corresponding
conditionals, and more importantly stop worrying about working around GCC
4.8.x/4.9 bugs (I think that is actually more important).
The price is stopping to use some of the cfarm machines for testing or
using IBM Advanced Toolchain or hand-built GCC 14 there as the system
compiler there.
At some point we certainly want to do that, the question is if the benefits
right now overweight the pain.

> > As of the version requirement as you say only some minor versions of the 
> > GCC 5
> > series are OK I would suggest to say we recommend using GCC 6 or later
> > but GCC 5.5 should also work?
> 
> Aren't we already specifying a minor revision with 4.8.3 for C++11?
> 
> Another possibility would be to just say GCC 5, and adjust that upward if we
> run into problems.

I think for the oldest supported version we need some CFarm machines around
with that compiler so that all people can actually test issues with it.
Dunno which distros shipped GCC 5 in long term support versions if any and
at which minor those are.

Jakub



Re: [PATCH] libcpp: Add -Wtrailing-blanks warning

2024-09-19 Thread Jakub Jelinek
On Thu, Sep 19, 2024 at 09:07:06AM +0200, Jakub Jelinek wrote:
> space is ' ' '\t' '\n' '\r' '\f' '\v' in the C locale,
> blank is ' ' '\t'
> cntrl is a lot of chars but not ' '
> if we extend by the safe-ctype
> vspace '\r' '\n'
> nvspace ' ' '\t' '\f' '\v' '\0'
> Obviously, we shouldn't look at '\r' and '\n', those aren't trailing
> characters, those are line separators.
> 
> Would we need to consider all UTF-8 (or EBCDIC-UTF) control characters is
> cntrl?
> ..0009; Control # Cc  [10] ..
> 000B..000C; Control # Cc   [2] ..
> 000E..001F; Control # Cc  [18] ..
> 007F..009F; Control # Cc  [33] ..
> 00AD  ; Control # Cf   SOFT HYPHEN
> 061C  ; Control # Cf   ARABIC LETTER MARK
> 180E  ; Control # Cf   MONGOLIAN VOWEL SEPARATOR
> 200B  ; Control # Cf   ZERO WIDTH SPACE
> 200E..200F; Control # Cf   [2] LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
> 2028  ; Control # Zl   LINE SEPARATOR
> 2029  ; Control # Zp   PARAGRAPH SEPARATOR
> 202A..202E; Control # Cf   [5] LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT 
> OVERRIDE
> 2060..2064; Control # Cf   [5] WORD JOINER..INVISIBLE PLUS
> 2065  ; Control # Cn   
> 2066..206F; Control # Cf  [10] LEFT-TO-RIGHT ISOLATE..NOMINAL DIGIT SHAPES
> FEFF  ; Control # Cf   ZERO WIDTH NO-BREAK SPACE
> FFF0..FFF8; Control # Cn   [9] ..
> FFF9..FFFB; Control # Cf   [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR 
> ANNOTATION TERMINATOR
> 13430..1343F  ; Control # Cf  [16] EGYPTIAN HIEROGLYPH VERTICAL 
> JOINER..EGYPTIAN HIEROGLYPH END WALLED ENCLOSURE
> 1BCA0..1BCA3  ; Control # Cf   [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND 
> FORMAT UP STEP
> 1D173..1D17A  ; Control # Cf   [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL 
> END PHRASE
> E ; Control # Cn   
> E0001 ; Control # Cf   LANGUAGE TAG
> E0002..E001F  ; Control # Cn  [30] ..
> E0080..E00FF  ; Control # Cn [128] ..
> E01F0..E0FFF  ; Control # Cn [3600] ..
> 
> Wonder why anybody would be interested to find just trailing spaces and not
> trailing tabs or vice versa, so if we have categories, blank would be one,
> then perhaps nvspace as something not including '\0', so just ' ' '\t' '\f'
> '\v' and if really needed, control characters with added ' ', but how to
> call that and would it really need to parse UTF-8/EBCDIC and look at
> pregenerated tables?

And there are also:
0009..000D; White_Space # Cc   [5] ..
0020  ; White_Space # Zs   SPACE
0085  ; White_Space # Cc   
00A0  ; White_Space # Zs   NO-BREAK SPACE
1680  ; White_Space # Zs   OGHAM SPACE MARK
2000..200A; White_Space # Zs  [11] EN QUAD..HAIR SPACE
2028  ; White_Space # Zl   LINE SEPARATOR
2029  ; White_Space # Zp   PARAGRAPH SEPARATOR
202F  ; White_Space # Zs   NARROW NO-BREAK SPACE
205F  ; White_Space # Zs   MEDIUM MATHEMATICAL SPACE
3000  ; White_Space # Zs   IDEOGRAPHIC SPACE

Jakub



Re: [PATCH] libcpp: Add -Wtrailing-blanks warning

2024-09-19 Thread Jakub Jelinek
On Thu, Sep 19, 2024 at 08:17:24AM +0200, Richard Biener wrote:
> On Wed, Sep 18, 2024 at 7:33 PM Jakub Jelinek  wrote:
> >
> > On Wed, Sep 18, 2024 at 06:17:58PM +0100, Richard Sandiford wrote:
> > > +1  I'd much rather learn about this kind of error before the code reaches
> > > a review tool :)
> > >
> > > >From a quick check, it doesn't look like Clang has this, so there is no
> > > existing name to follow.
> >
> > I was considering also -Wtrailing-whitespace, but
> > 1) git diff really warns just about trailing spaces/tabs, not form feeds or
> > vertical tabs
> > 2) gcc source contains tons of spots with form feed in it (though,
> > I think pretty much always as the sole character on a line).
> > And not really sure how people use vertical tabs in the source if at all.
> > Perhaps form feed could be not warned if at end of line if it isn't the sole
> > character on a line...
> 
> Generally I like diagnosing this early.  For the above I'd say
> -Wtrailing-whitespace=
> with a set of things to diagnose (and a sane default - just spaces and
> tabs - for
> -Wtrailiing-whitespace) would be nice.  As for naming possibly follow the
> is{space,blank,cntrl} character classifications?  If those are a good
> fit, that is.

I think the character classifications risk problems.

space is ' ' '\t' '\n' '\r' '\f' '\v' in the C locale,
blank is ' ' '\t'
cntrl is a lot of chars but not ' '
if we extend by the safe-ctype
vspace '\r' '\n'
nvspace ' ' '\t' '\f' '\v' '\0'
Obviously, we shouldn't look at '\r' and '\n', those aren't trailing
characters, those are line separators.

Would we need to consider all UTF-8 (or EBCDIC-UTF) control characters is
cntrl?
..0009; Control # Cc  [10] ..
000B..000C; Control # Cc   [2] ..
000E..001F; Control # Cc  [18] ..
007F..009F; Control # Cc  [33] ..
00AD  ; Control # Cf   SOFT HYPHEN
061C  ; Control # Cf   ARABIC LETTER MARK
180E  ; Control # Cf   MONGOLIAN VOWEL SEPARATOR
200B  ; Control # Cf   ZERO WIDTH SPACE
200E..200F; Control # Cf   [2] LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
2028  ; Control # Zl   LINE SEPARATOR
2029  ; Control # Zp   PARAGRAPH SEPARATOR
202A..202E; Control # Cf   [5] LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT 
OVERRIDE
2060..2064; Control # Cf   [5] WORD JOINER..INVISIBLE PLUS
2065  ; Control # Cn   
2066..206F; Control # Cf  [10] LEFT-TO-RIGHT ISOLATE..NOMINAL DIGIT SHAPES
FEFF  ; Control # Cf   ZERO WIDTH NO-BREAK SPACE
FFF0..FFF8; Control # Cn   [9] ..
FFF9..FFFB; Control # Cf   [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR 
ANNOTATION TERMINATOR
13430..1343F  ; Control # Cf  [16] EGYPTIAN HIEROGLYPH VERTICAL 
JOINER..EGYPTIAN HIEROGLYPH END WALLED ENCLOSURE
1BCA0..1BCA3  ; Control # Cf   [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND 
FORMAT UP STEP
1D173..1D17A  ; Control # Cf   [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL 
END PHRASE
E ; Control # Cn   
E0001 ; Control # Cf   LANGUAGE TAG
E0002..E001F  ; Control # Cn  [30] ..
E0080..E00FF  ; Control # Cn [128] ..
E01F0..E0FFF  ; Control # Cn [3600] ..

Wonder why anybody would be interested to find just trailing spaces and not
trailing tabs or vice versa, so if we have categories, blank would be one,
then perhaps nvspace as something not including '\0', so just ' ' '\t' '\f'
'\v' and if really needed, control characters with added ' ', but how to
call that and would it really need to parse UTF-8/EBCDIC and look at
pregenerated tables?

Jakub



Re: [PATCH] libcpp: Add -Wtrailing-blanks warning

2024-09-18 Thread Jakub Jelinek
On Wed, Sep 18, 2024 at 06:17:58PM +0100, Richard Sandiford wrote:
> +1  I'd much rather learn about this kind of error before the code reaches
> a review tool :)
> 
> >From a quick check, it doesn't look like Clang has this, so there is no
> existing name to follow.

I was considering also -Wtrailing-whitespace, but 
1) git diff really warns just about trailing spaces/tabs, not form feeds or
vertical tabs
2) gcc source contains tons of spots with form feed in it (though,
I think pretty much always as the sole character on a line).
And not really sure how people use vertical tabs in the source if at all.
Perhaps form feed could be not warned if at end of line if it isn't the sole
character on a line...

Jakub



Re: [PATCH] dwarf2asm: Use constexpr for eh_data_format_name initialization for C++14

2024-09-18 Thread Jakub Jelinek
On Wed, Sep 18, 2024 at 10:51:49AM -0600, Jeff Law wrote:
> On 9/18/24 10:04 AM, Jakub Jelinek wrote:
> > Hi!
> > 
> > Similarly to the previous patch, dwarf2asm.cc had
> > HAVE_DESIGNATED_INITIALIZERS support, and as fallback a huge switch.
> > The switch from what I can see is expanded as a jump table with 256
> > label pointers and code at those labels then loads addresses of
> > string literals.
> > The following patch instead uses a table with 256 const char * pointers,
> > NULL for ICE, non-NULL for returning something, similarly to the
> > HAVE_DESIGNATED_INITIALIZERS case.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > 2024-09-18  Jakub Jelinek  
> > 
> > * dwarf2asm.cc (eh_data_format_name): Use constexpr initialization
> > of format_names table for C++14 instead of a large switch.
> Do we want to defer this while the discussion around moving to c++14 plays
> out?  Or would you prefer to move forward and if we move to c++14 come back
> and simplify?

I'm fine either way.
Note, the #if HAVE_DESIGNATED_INITIALIZERS parts could be removed too now
that we compile the stuff with C++ (for years).  clang++ actually does
support int a[3] = { [2] = 1, [0] = 2 }; but warns about it by default.

Jakub



[PATCH] libcpp: Add -Wtrailing-blanks warning

2024-09-18 Thread Jakub Jelinek
Hi!

Trailing blanks is something even git diff diagnoses; while it is a coding
style issue, if it is so common that git diff diagnoses it, I think it could
be useful to various projects to check that at compile time.

Dunno if it should be included in -Wextra, currently it isn't, and due to
tons of trailing whitespace in our sources, haven't enabled it for when
building gcc itself either.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, git diff also diagnoses indentation with tab following space, wonder
if we couldn't have trivial warning options where one would simply ask for
checking of indentation with no tabs, just spaces vs. indentation with
tabs followed by spaces (but never tab width or more spaces in the
indentation).  I think that would be easy to do also on the libcpp side.
Checking how much something should be exactly indented requires syntax
analysis (at least some limited one) and can consider columns of first token
on line, but what the exact indentation blanks were is something only libcpp
knows.

2024-09-18  Jakub Jelinek  

libcpp/
* include/cpplib.h (struct cpp_options): Add cpp_warn_trailing_blanks
member.
(enum cpp_warning_reason): Add CPP_W_TRAILING_BLANKS.
* internal.h (struct _cpp_line_note): Document 'B' line note.
* lex.cc (_cpp_clean_line): Add 'B' line note for trailing blanks
except for trailing whitespace after backslash.  Formatting fix.
(_cpp_process_line_notes): Emit -Wtrailing-blanks diagnostics.
Formatting fixes.
(lex_raw_string): Clear type on 'B' notes.
gcc/
* doc/invoke.texi (Wtrailing-blanks): Document.
gcc/c-family/
* c.opt (Wtrailing-blanks): New option.
gcc/testsuite/
* c-c++-common/cpp/Wtrailing-blanks.c: New test.

--- libcpp/include/cpplib.h.jj  2024-09-13 16:09:32.690455174 +0200
+++ libcpp/include/cpplib.h 2024-09-18 13:01:26.289338279 +0200
@@ -594,6 +594,9 @@ struct cpp_options
   /* True if -finput-charset= option has been used explicitly.  */
   bool cpp_input_charset_explicit;
 
+  /* True if -Wtrailing-blanks.  */
+  bool cpp_warn_trailing_blanks;
+
   /* Dependency generation.  */
   struct
   {
@@ -709,7 +712,8 @@ enum cpp_warning_reason {
   CPP_W_EXPANSION_TO_DEFINED,
   CPP_W_BIDIRECTIONAL,
   CPP_W_INVALID_UTF8,
-  CPP_W_UNICODE
+  CPP_W_UNICODE,
+  CPP_W_TRAILING_BLANKS
 };
 
 /* Callback for header lookup for HEADER, which is the name of a
--- libcpp/internal.h.jj2024-09-18 09:45:36.832570227 +0200
+++ libcpp/internal.h   2024-09-18 12:36:04.386099371 +0200
@@ -318,8 +318,8 @@ struct _cpp_line_note
 
   /* Type of note.  The 9 'from' trigraph characters represent those
  trigraphs, '\\' an escaped newline, ' ' an escaped newline with
- intervening space, 0 represents a note that has already been handled,
- and anything else is invalid.  */
+ intervening space, 'B' trailing blanks, 0 represents a note that
+ has already been handled, and anything else is invalid.  */
   unsigned int type;
 };
 
--- libcpp/lex.cc.jj2024-09-13 16:09:32.720454758 +0200
+++ libcpp/lex.cc   2024-09-18 14:00:46.344062046 +0200
@@ -928,7 +928,7 @@ _cpp_clean_line (cpp_reader *pfile)
  if (p == buffer->next_line || p[-1] != '\\')
break;
 
- add_line_note (buffer, p - 1, p != d ? ' ': '\\');
+ add_line_note (buffer, p - 1, p != d ? ' ' : '\\');
  d = p - 2;
  buffer->next_line = p - 1;
}
@@ -943,6 +943,11 @@ _cpp_clean_line (cpp_reader *pfile)
}
}
}
+ done:
+  if (d > buffer->next_line
+ && ISBLANK (d[-1])
+ && CPP_OPTION (pfile, cpp_warn_trailing_blanks))
+   add_line_note (buffer, d - 1, 'B');
 }
   else
 {
@@ -955,7 +960,6 @@ _cpp_clean_line (cpp_reader *pfile)
s++;
 }
 
- done:
   *d = '\n';
   /* A sentinel note that should never be processed.  */
   add_line_note (buffer, d + 1, '\n');
@@ -1013,13 +1017,23 @@ _cpp_process_line_notes (cpp_reader *pfi
 
   if (note->type == '\\' || note->type == ' ')
{
- if (note->type == ' ' && !in_comment)
-   cpp_error_with_line (pfile, CPP_DL_WARNING, 
pfile->line_table->highest_line, col,
-"backslash and newline separated by space");
+ if (note->type == ' ')
+   {
+ if (!in_comment)
+   cpp_error_with_line (pfile, CPP_DL_WARNING,
+pfile->line_table->highest_line, col,
+"backslash and newline separated by "
+   

[PATCH] dwarf2asm: Use constexpr for eh_data_format_name initialization for C++14

2024-09-18 Thread Jakub Jelinek
Hi!

Similarly to the previous patch, dwarf2asm.cc had
HAVE_DESIGNATED_INITIALIZERS support, and as fallback a huge switch.
The switch from what I can see is expanded as a jump table with 256
label pointers and code at those labels then loads addresses of
string literals.
The following patch instead uses a table with 256 const char * pointers,
NULL for ICE, non-NULL for returning something, similarly to the
HAVE_DESIGNATED_INITIALIZERS case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-18  Jakub Jelinek  

* dwarf2asm.cc (eh_data_format_name): Use constexpr initialization
of format_names table for C++14 instead of a large switch.

--- gcc/dwarf2asm.cc.jj 2024-01-03 11:51:28.611771716 +0100
+++ gcc/dwarf2asm.cc2024-09-18 10:12:30.916464439 +0200
@@ -488,14 +488,22 @@ eh_data_format_name (int format)
 {
 #if HAVE_DESIGNATED_INITIALIZERS
 #define S(p, v)[p] = v,
+#elif __cpp_constexpr >= 201304L
+#define S(p, v)names[p] = v;
 #else
 #define S(p, v)case p: return v;
 #endif
 
 #if HAVE_DESIGNATED_INITIALIZERS
   __extension__ static const char * const format_names[256] = {
+#elif __cpp_constexpr >= 201304L
+  static constexpr struct format_names_s {
+const char *names[256];
+constexpr format_names_s () : names {}
+{
 #else
-  switch (format) {
+  switch (format)
+{
 #endif
 
   S(DW_EH_PE_absptr, "absolute")
@@ -635,8 +643,15 @@ eh_data_format_name (int format)
   gcc_assert (format >= 0 && format < 0x100 && format_names[format]);
 
   return format_names[format];
+#elif __cpp_constexpr >= 201304L
+}
+  } format_names;
+
+  gcc_assert (format >= 0 && format < 0x100 && format_names.names[format]);
+
+  return format_names.names[format];
 #else
-  }
+}
   gcc_unreachable ();
 #endif
 }

Jakub



Re: [PATCH v5] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-17 Thread Jakub Jelinek
On Tue, Sep 17, 2024 at 11:13:09AM +0200, Jakub Jelinek wrote:
> So maybe better
>   tree arg = e_p->value;
>   tree f;
>   if ((in_typeof || in_alignof)
>   && TREE_CODE (arg) == COMPONENT_REF
>   && (f = TREE_OPERAND (arg, 1))
>   && TREE_CODE (f) == FIELD_DECL
>   && c_flexible_array_member_type_p (TREE_TYPE (f))
>   && lookup_attribute ("counted_by", DECL_ATTRIBUTES (f))
>   && DECL_NAME (f))

Of course with
  {
auto save_in_typeof = in_typeof;
auto save_in_alignof = in_alignof;
in_typeof = 0;
in_alignof = 0;
> arg = build_component_ref (EXPR_LOCATION (arg),
>TREE_OPERAND (arg, 0),
>DECL_NAME (f), UNKNOWN_LOCATION,
>UNKNOWN_LOCATION, true);
in_typeof = save_in_typeof;
in_alignof = save_in_alignof;
  }
Why don't we have the sentinel classes C++ FE has in the C FE?

Or outline the counted_by specific part of build_component_ref into
a separate function and call just that.

>   if (has_counted_by_object (arg))
> expr.value = get_counted_by_ref (arg);
>   else
> expr.value = null_pointer_node;

BTW, concerning just counted_by attribute, how does it play together
with _Atomic on the size member?  Perhaps it should be disallowed.
The thing is that _Atomic access lowering is done in the FE, so too
early for tree-object-size, which would either need to reimplement it by
hand (sure, not that difficult, it needs to just atomically load).

Jakub



Re: [PATCH v5] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-17 Thread Jakub Jelinek
On Sat, Sep 14, 2024 at 08:58:28PM +0200, Jakub Jelinek wrote:
>   if (has_counted_by_object (e_p->value))
> expr.value = get_counted_by_ref (e_p->value);
>   else if (in_typeof && TREE_CODE (e_p->value) == COMPONENT_REF)
> {
>   tree counted_by_type = NULL_TREE;
>   tree arg = (*cexpr_
>   if (build_counted_by_ref (TREE_OPERAND (e_p->value, 0),
> TREE_OPERAND (e_p->value, 1),
> &counted_by_type))
> expr.value
>   = build_zero_cst (build_pointer_type (counted_by_type));
>   else
> expr.value = null_pointer_node;
> }
>   else
> expr.value = null_pointer_node;
> 
> (plus make build_counted_by_ref non-static and add prototype).
> 
> Completely untested.

Note, the above I think ought to work with say
__typeof (__builtin_counted_by_ref (obj->fam))
or
__alignof (*__builtin_counted_by_ref (obj->fam))
but will not work with say
struct with_fam obj;
void foo () {
...
__typeof (char [__builtin_counted_by_ref (obj.fam) == &obj.size ? 1 : -1])
...
}
etc. (and similar for alignof; something like that should go into the
testsuite).
So maybe better
tree arg = e_p->value;
tree f;
if ((in_typeof || in_alignof)
&& TREE_CODE (arg) == COMPONENT_REF
&& (f = TREE_OPERAND (arg, 1))
&& TREE_CODE (f) == FIELD_DECL
&& c_flexible_array_member_type_p (TREE_TYPE (f))
&& lookup_attribute ("counted_by", DECL_ATTRIBUTES (f))
&& DECL_NAME (f))
  arg = build_component_ref (EXPR_LOCATION (arg),
 TREE_OPERAND (arg, 0),
 DECL_NAME (f), UNKNOWN_LOCATION,
 UNKNOWN_LOCATION, true);
if (has_counted_by_object (arg))
  expr.value = get_counted_by_ref (arg);
else
  expr.value = null_pointer_node;

Jakub



Re: [PATCH v5] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-14 Thread Jakub Jelinek
On Wed, Sep 11, 2024 at 09:13:40PM +, Qing Zhao wrote:
> @@ -11741,6 +11770,55 @@ c_parser_postfix_expression (c_parser *parser)
>   set_c_expr_source_range (&expr, loc, close_paren_loc);
>   break;
> }
> + case RID_BUILTIN_COUNTED_BY_REF:
> +   {
> + vec *cexpr_list;
> + c_expr_t *e_p;
> + location_t close_paren_loc;
> +
> + in_builtin_counted_by_ref++;

I think this in_builtin_counted_by_ref stuff here plus the whole c-typeck.cc
hunk should be replaced with:

> + c_parser_consume_token (parser);
> + if (!c_parser_get_builtin_args (parser,
> + "__builtin_counted_by_ref",
> + &cexpr_list, false,
> + &close_paren_loc))
> +   {
> + expr.set_error ();
> + in_builtin_counted_by_ref--;
> + break;
> +   }
> + if (vec_safe_length (cexpr_list) != 1)
> +   {
> + error_at (loc, "wrong number of arguments to "
> +"%<__builtin_counted_by_ref%>");
> + expr.set_error ();
> + in_builtin_counted_by_ref--;
> + break;
> +   }
> +
> + e_p = &(*cexpr_list)[0];
> +
> + if (TREE_CODE (TREE_TYPE (e_p->value)) != ARRAY_TYPE)
> +   {
> + error_at (loc, "the argument must be an array"
> +"%<__builtin_counted_by_ref%>");
> + expr.set_error ();
> + in_builtin_counted_by_ref--;
> + break;
> +   }
> +

if (has_counted_by_object (e_p->value))
  expr.value = get_counted_by_ref (e_p->value);
else if (in_typeof && TREE_CODE (e_p->value) == COMPONENT_REF)
  {
tree counted_by_type = NULL_TREE;
tree arg = (*cexpr_
if (build_counted_by_ref (TREE_OPERAND (e_p->value, 0),
  TREE_OPERAND (e_p->value, 1),
  &counted_by_type))
  expr.value
= build_zero_cst (build_pointer_type (counted_by_type));
else
  expr.value = null_pointer_node;
  }
else
  expr.value = null_pointer_node;

(plus make build_counted_by_ref non-static and add prototype).

Completely untested.

> + if (has_counted_by_object ((*cexpr_list)[0].value))
> +   expr.value
> + = get_counted_by_ref ((*cexpr_list)[0].value);
> + else
> +   expr.value
> + = build_int_cst (build_pointer_type (void_type_node), 0);

Jakub



Re: [Patch, v3] gcn/mkoffload.cc: Use #embed for including the generated ELF file

2024-09-13 Thread Jakub Jelinek
On Fri, Sep 13, 2024 at 04:24:47PM +0200, Tobias Burnus wrote:
> As #embed is now supported by GCC (thanks!), I could commit this patch :-)

Thanks to Joseph for the reviews.

> Committed as r15-3629-g508ef585243d46 →
> https://gcc.gnu.org/r15-3629-g508ef585243d46
> 
> Unless I missed something, we need to wait for a few pending patches before
> there is a real speed up. However, first, that will come then automatically
> to GCN compilations and, secondly, the generated code is already much nicer
> thanks to #embed + seems to be a tiny tiny bit faster already.

Indeed, only the "dumb" support is in now including the 2 extensions,
so it will work, just not will speed things up (at least not significantly).
The rest whenever it is reviewed (and adjusted according to patch review).

Jakub



Re: [PATCH] libcpp, genmatch: Use gcc_diag instead of printf for libcpp diagnostics

2024-09-13 Thread Jakub Jelinek
On Fri, Sep 13, 2024 at 01:19:02PM +0200, Richard Biener wrote:
> The genmatch.cc parts are OK - I wonder how much the new dependences
> push gimple-match* and generic-match* compilation to the end?

Only time will tell.
Looking at my past 3 x86_64-linux bootstraps (-j32) but I'm doing i686-linux
bootstrap concurrently (also -j32) - 2 without the patch, one with it,
the g*-match*.cc files were generated always in the same minute as
config.state and in the 2 older bootstraps first g*-match*.o was finished
~4 minutes after config.state and last ~6 minutes after it, while
in the latest bootstrap first g*-match*.o was ready ~1 minute after
config.state and last ~4 minutes after it.
Toplevel config.status to make -j32 bootstrap time was 86minutes
in the unpatched build, 76minutes in patched, but this really varies
quite a lot.
Basically, the old dependencies forced all non-generated files to wait
until all generated files were generated, the new dependencies force
all non-generated *.o files to wait until all but g*match* generated files
were generated and all non-generated *.o files but libcommon.a to wait
until all g*match* generated files were generated.  So, one pretty much
needs to generate the non-g*match* first, then compile libcommon.a (36
small objects), then generate g*match* and then compile the rest.
So, perhaps on > 36 slow CPUs in theory it might be slightly slower,
because there will be another time frame where those further CPUs are idle.

Jakub



[PATCH] libcpp: Fix up UB in finish_embed

2024-09-13 Thread Jakub Jelinek
Hi!

Jonathan reported on IRC that certain unnamed proprietary static analyzer
is unhappy about the new finish_embed function and it is actually right.
On a testcase like:
#embed __FILE__ limit (0) if_empty (0)
params->if_empty.count is 1, limit is 0, so count is 0 (we need just
a single token and one fits into pfile->directive_result).  Because
count is 0, we don't allocate toks, so it stays NULL, and then in
1301  if (prefix->count)
1302{
1303  *tok = *prefix->base_run.base;
1304  tok = toks;
1305  tokenrun *cur_run = &prefix->base_run;
1306  while (cur_run)
1307{
1308  size_t cnt = (cur_run->next ? cur_run->limit
1309: prefix->cur_token) - cur_run->base;
1310  cpp_token *t = cur_run->base;
1311  if (cur_run == &prefix->base_run)
1312{
1313  t++;
1314  cnt--;
1315}
1316  memcpy (tok, t, cnt * sizeof (cpp_token));
1317  tok += cnt;
1318  cur_run = cur_run->next;
1319}
1320}
the *tok = *prefix->base_run.base; assignment will copy the only
token.  cur_run is still non-NULL, cnt will be initially 1 and
then decremented to 0, but we invoke UB because we do
memcpy (NULL, cur_run->base + 1, 0 * sizeof (cpp_token));
and then the loop stops because cur_run->next must be NULL.

As we don't really copy anything, toks can be anything non-NULL,
so the following patch fixes that by initializing toks also to
&pfile->directive_result (just something known to be non-NULL).
This should be harmless even for the
#embed __FILE__ limit (1)
case (no non-empty prefix/suffix) where toks isn't allocated
either, but in that case prefix->count will be 0 and in the
1321  for (size_t i = 0; i < limit; ++i)
1322{
1323  tok->src_loc = params->loc;
1324  tok->type = CPP_NUMBER;
1325  tok->flags = NO_EXPAND;
1326  if (i == 0)
1327tok->flags |= PREV_WHITE;
1328  tok->val.str.text = s;
1329  tok->val.str.len = sprintf ((char *) s, "%d", buffer[i]);
1330  s += tok->val.str.len + 1;
1331  if (tok == &pfile->directive_result)
1332tok = toks;
1333  else
1334tok++;
1335  if (i < limit - 1)
1336{
1337  tok->src_loc = params->loc;
1338  tok->type = CPP_COMMA;
1339  tok->flags = NO_EXPAND;
1340  tok++;
1341}
1342}
loop limit will be 1, so tok is initially &pfile->directive_result,
that is stilled in, then tok = toks; (previously setting tok to NULL,
now to &pfile->directive_result again) and because 0 < 1 - 1 is
false, nothing further will happen and the loop will finish (and as
params->suffix.count will be 0, nothing further will use tok).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-12  Jakub Jelinek  

* files.cc (finish_embed): Initialize toks to tok rather
than NULL.

--- libcpp/files.cc.jj  2024-09-12 18:16:49.995409074 +0200
+++ libcpp/files.cc 2024-09-12 22:55:01.619765364 +0200
@@ -1284,7 +1284,7 @@ finish_embed (cpp_reader *pfile, _cpp_fi
 }
   uchar *s = len ? _cpp_unaligned_alloc (pfile, len) : NULL;
   _cpp_buff *tok_buff = NULL;
-  cpp_token *toks = NULL, *tok = &pfile->directive_result;
+  cpp_token *tok = &pfile->directive_result, *toks = tok;
   size_t count = 0;
   if (limit)
 count = (params->prefix.count + limit * 2 - 1

Jakub



[PATCH] c++: Don't emit deprecated/unavailable attribute diagnostics when creating cdtor thunks [PR116678]

2024-09-13 Thread Jakub Jelinek
Hi!

Another spot where we mark_used a function (in this case ctor or dtor)
even when it is just artificially used inside of thunks (emitted on mingw
with -Os for the testcase).

Bootstrapped/regtested on x86_64-linux and i686-linux and tested with
a cross compiler to x86_64-mingw on the testcase, ok for trunk?

2024-09-13  Jakub Jelinek  

PR c++/116678
* optimize.cc: Include decl.h.
(maybe_thunk_body): Temporarily change deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.

* g++.dg/warn/deprecated-20.C: New test.

--- gcc/cp/optimize.cc.jj   2024-05-21 10:19:34.484528180 +0200
+++ gcc/cp/optimize.cc  2024-09-12 13:56:30.710186354 +0200
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.
 #include "coretypes.h"
 #include "target.h"
 #include "cp-tree.h"
+#include "decl.h"
 #include "stringpool.h"
 #include "cgraph.h"
 #include "debug.h"
@@ -287,6 +288,11 @@ maybe_thunk_body (tree fn, bool force)
   if (ctor_omit_inherited_parms (fns[0]))
 return 0;
 
+  /* Don't diagnose deprecated or unavailable cdtors just because they
+ have thunks emitted for them.  */
+  auto du = make_temp_override (deprecated_state,
+   UNAVAILABLE_DEPRECATED_SUPPRESS);
+
   DECL_ABSTRACT_P (fn) = false;
   if (!DECL_WEAK (fn))
 {
--- gcc/testsuite/g++.dg/warn/deprecated-20.C.jj2024-09-12 
13:54:58.850422325 +0200
+++ gcc/testsuite/g++.dg/warn/deprecated-20.C   2024-09-12 13:55:27.510036714 
+0200
@@ -0,0 +1,16 @@
+// PR c++/116678
+// { dg-do compile }
+// { dg-options "-Os -pedantic" }
+
+struct S
+{
+  [[deprecated]] S () { s = 1; }   // { dg-bogus "'S::S\\\(\\\)' is 
deprecated" }
+  S (int x) { s = x; } // { dg-warning "C\\\+\\\+11 attributes 
only available with" "" { target c++98_only } .-1 }
+  ~S () {}
+  int s;
+};
+
+int
+main ()
+{
+}

Jakub



Re: [PATCH] libcpp: Implement clang -Wheader-guard warning [PR96842]

2024-09-12 Thread Jakub Jelinek
On Thu, Sep 12, 2024 at 11:12:26AM -0400, David Malcolm wrote:
> We were chatting on IRC about how it would be nice to be able to use
> %qs in libcppp diagnostics; here is an example (rather than using
> \"%s\").

Yeah, I'm working on a patch for that.

> Not a blocker, but it occurs to me that ideally we'd group the warning
> and note into a diagnostic group, but unfortunately there's no way to
> express that currently via the interface libcpp has.  We would need to
> add {begin,end}_group hooks, which in turn suggests that maybe that
> libcpp's interface into diagnostics should be an abstract base class
> with various vfuncs, rather than a callback.

I haven't added auto_diagnostic_group because nothing in libcpp does that,
yes, we need some solution for that.

> Also not a blocker, but it would nice to have a fix-it hint here, by
> using the rich_location overload of cpp_error_at and adding a fix-it
> hint to the rich_location.

And yes, I was thinking about fix-it hint, but I think that depends on
better locations there first, currently the patch uses just the lines
with the directives.
I was hoping that can be done incrementally.

Jakub



[PATCH] libcpp, v5: Add support for gnu::base64 #embed parameter

2024-09-12 Thread Jakub Jelinek
On Wed, Sep 11, 2024 at 10:23:20PM +, Joseph Myers wrote:
> On Fri, 30 Aug 2024, Jakub Jelinek wrote:
> 
> > +should be no newlines in the string literal and because this parameter
> > +is meant namely for use by the preprocessor itself, there is no support
> > +for any escape sequences in the string literal argument.  If 
> > @code{gnu::base64}
> 
> Given the "no escape sequences" rule, I think there should be a test for 
> that - testing rejection of a string that would be valid if escape 
> sequences were processed (for example, valid base64 but with the 
> individual characters encoded using \x), but is not valid because they are 
> not processed.  As far as I can see, the existing tests with escape 
> sequences are invalid for other reasons (they use \n as the escape 
> sequence).

Thanks.

Here is an updated patch and before that just the incremental diff from
the previous patch.

--- gcc/testsuite/c-c++-common/cpp/embed-18.c   2024-08-30 18:43:18.132097274 
+0200
+++ gcc/testsuite/c-c++-common/cpp/embed-18.c   2024-09-12 12:29:25.919231207 
+0200
@@ -17,6 +17,12 @@
 #embed "." gnu::base64("") /* { dg-error "'gnu::base64' argument not valid 
base64 encoded string" } */
 #embed "." gnu::base64("a===") /* { dg-error "'gnu::base64' argument not valid 
base64 encoded string" } */
 #embed "." 
gnu::base64("TG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVyIGFkaXBpc2NpbmcgZWxpdCwg\nc2VkIGRvIGVpdXNtb2QgdGVtcG9yIGluY2lkaWR1bnQgdXQgbGFib3JlIGV0IGRvbG9yZSBtYWdu\nYSBhbGlxdWEuCg==")
 /* { dg-error "'gnu::base64' argument not valid base64 encoded string" } */
+#embed "." gnu::base64("\x53\x41\x3d\x3d") /* { dg-error "'gnu::base64' 
argument not valid base64 encoded string" } */
+#embed "." gnu::base64("\123\101\075\075") /* { dg-error "'gnu::base64' 
argument not valid base64 encoded string" } */
+#embed "." gnu::base64("\u0053\u0041\u003d\u003d") /* { dg-error 
"'gnu::base64' argument not valid base64 encoded string" } */
+#embed "." gnu::base64("\u{53}\u{41}\u{3d}\u{3d}") /* { dg-error 
"'gnu::base64' argument not valid base64 encoded string" } */
+#embed "." gnu::base64("\U0053\U0041\U003d\U003d") /* { 
dg-error "'gnu::base64' argument not valid base64 encoded string" } */
+#embed "." gnu::base64("\N{LATIN CAPITAL LETTER S}\N{LATIN CAPITAL LETTER 
A}\N{LATIN CAPITAL LETTER A}\N{LATIN CAPITAL LETTER A}") /* { dg-error 
"'gnu::base64' argument not valid base64 encoded string" } */
 #embed "embed-18.c" gnu::base64("SA==") /* { dg-error "'gnu::base64' parameter 
can be only used with \\\".\\\"" } */
 #embed  gnu::base64("SA==") /* { dg-error "'gnu::base64' parameter 
can be only used with \\\".\\\"" } */
 #embed <.> gnu::base64("SA==") /* { dg-error "'gnu::base64' parameter can be 
only used with \\\".\\\"" } */

Tested on x86_64-linux.

2024-09-12  Jakub Jelinek  

libcpp/
* internal.h (struct cpp_embed_params): Add base64 member.
(_cpp_free_embed_params_tokens): Declare.
* directives.cc (DIRECTIVE_TABLE): Add IN_I flag to T_EMBED.
(save_token_for_embed, _cpp_free_embed_params_tokens): New functions.
(EMBED_PARAMS): Add gnu::base64 entry.
(_cpp_parse_embed_params): Parse gnu::base64 parameter.  If
-fpreprocessed without -fdirectives-only, require #embed to have
gnu::base64 parameter.  Diagnose conflict between gnu::base64 and
limit or gnu::offset parameters.
(do_embed): Use _cpp_free_embed_params_tokens.
* files.cc (finish_embed, base64_dec_fn): New functions.
(base64_dec): New array.
(B64D0, B64D1, B64D2, B64D3): Define.
(finish_base64_embed): New function.
(_cpp_stack_embed): Use finish_embed.  Handle params->base64
using finish_base64_embed.
* macro.cc (builtin_has_embed): Call _cpp_free_embed_params_tokens.
gcc/
* doc/cpp.texi (Binary Resource Inclusion): Document gnu::base64
parameter.
gcc/testsuite/
* c-c++-common/cpp/embed-17.c: New test.
* c-c++-common/cpp/embed-18.c: New test.
* c-c++-common/cpp/embed-19.c: New test.
* c-c++-common/cpp/embed-27.c: New test.
* gcc.dg/cpp/embed-6.c: New test.
* gcc.dg/cpp/embed-7.c: New test.

--- libcpp/internal.h.jj2024-09-12 11:33:53.173949013 +0200
+++ libcpp/internal.h   2024-09-12 11:45:39.455488076 +0200
@@ -638,7 +638,7 @@ struct cpp_embed_params
   locati

Re: [PATCH] cselib: Discard useless locs of preserved VALUEs [PR116627]

2024-09-11 Thread Jakub Jelinek
On Wed, Sep 11, 2024 at 11:26:27PM +0200, Jakub Jelinek wrote:
> I think we need to discuard useless locs even from the preserved VALUEs.
> That IMHO shouldn't create any further useless VALUEs, the preserved
> VALUEs are never useless, so we don't need to iterate with it, can do it
> just once, but IMHO it needs to be done because actually
> discard_useless_values.
> 
> The following patch does that.

Note, I've verified the patch on x86_64-linux cc1plus didn't change
anything at all on the resulting cc1plus binary (compared it to one
bootstrapped without this patch with the patch later applied and
make cc1plus done in the stage3, the only change in the binary was
16 bytes of executable_checksum).

Jakub



[PATCH] libcpp: Implement clang -Wheader-guard warning [PR96842]

2024-09-11 Thread Jakub Jelinek
Hi!

The following patch implements the clang -Wheader-guard warning, which warns
if a valid multiple inclusion header guard's #ifndef/#if !defined directive
is immediately (no other non-line directives nor other (non-comment)
tokens in between) followed by #define directive for some different macro,
which in get_suggestion rules is close enough to the actual header guard
macro (i.e. likely misspelling), the #define is object-like with empty
definition (I've followed what clang implements) and the macro isn't defined
later on (at least not on the final #endif at the end of a header).

In this case it emits a warning, so that
#ifndef STDIO_H
#define STDOI_H
...
#endif
or similar misspellings can be caught.

clang enables this warning by default, but I've put it into -Wall instead
as it still seems to be a style warning, nothing more severe; if a header
doesn't survive multiple inclusion because of the misspelling, users will
get different diagnostics.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-11  Jakub Jelinek  

PR preprocessor/96842
libcpp/
* include/cpplib.h (struct cpp_options): Add warn_header_guard member.
(enum cpp_warning_reason): Add CPP_W_HEADER_GUARD enumerator.
* internal.h (struct cpp_reader): Add mi_def_cmacro, mi_loc and
mi_def_loc members.
(_cpp_defined_macro_p): Constify type pointed by argument type.
Formatting fix.
* init.cc (cpp_create_reader): Clear
CPP_OPTION (pfile, warn_header_guard).
* directives.cc (struct if_stack): Add def_loc and mi_def_cmacro
members.
(DIRECTIVE_TABLE): Add IF_COND flag to define.
(do_define): Set ifs->mi_def_cmacro on a define immediately following
#ifndef directive for the guard.  Clear pfile->mi_valid.  Formatting
fix.
(do_endif): Copy over pfile->mi_def_cmacro and pfile->mi_def_loc
if ifs->mi_def_cmacro is set and pfile->mi_cmacro isn't a defined
macro.
(push_conditional): Clear mi_def_cmacro and mi_def_loc members.
* files.cc (_cpp_pop_file_buffer): Emit -Wheader-guard diagnostics.
gcc/
* doc/invoke.texi (Wheader-guard): Document.
gcc/c-family/
* c.opt (Wheader-guard): New option.
* c.opt.urls: Regenerated.
* c-ppoutput.cc (init_pp_output): Initialize also cb->get_suggestion.
gcc/testsuite/
* c-c++-common/cpp/Wheader-guard-1.c: New test.
* c-c++-common/cpp/Wheader-guard-1-1.h: New test.
* c-c++-common/cpp/Wheader-guard-1-2.h: New test.
* c-c++-common/cpp/Wheader-guard-1-3.h: New test.
* c-c++-common/cpp/Wheader-guard-1-4.h: New test.
* c-c++-common/cpp/Wheader-guard-1-5.h: New test.
* c-c++-common/cpp/Wheader-guard-1-6.h: New test.
* c-c++-common/cpp/Wheader-guard-1-7.h: New test.
* c-c++-common/cpp/Wheader-guard-1-8.h: New test.
* c-c++-common/cpp/Wheader-guard-1-9.h: New test.
* c-c++-common/cpp/Wheader-guard-1-10.h: New test.
* c-c++-common/cpp/Wheader-guard-1-11.h: New test.
* c-c++-common/cpp/Wheader-guard-1-12.h: New test.
* c-c++-common/cpp/Wheader-guard-2.c: New test.
* c-c++-common/cpp/Wheader-guard-2.h: New test.
* c-c++-common/cpp/Wheader-guard-3.c: New test.
* c-c++-common/cpp/Wheader-guard-3.h: New test.

--- libcpp/include/cpplib.h.jj  2024-09-03 16:47:47.323031836 +0200
+++ libcpp/include/cpplib.h 2024-09-11 16:39:36.373680969 +0200
@@ -435,6 +435,10 @@ struct cpp_options
   /* Different -Wimplicit-fallthrough= levels.  */
   unsigned char cpp_warn_implicit_fallthrough;
 
+  /* Nonzero means warn about a define of a different macro right after
+ #ifndef/#if !defined header guard directive.  */
+  unsigned char warn_header_guard;
+
   /* Nonzero means we should look for header.gcc files that remap file
  names.  */
   unsigned char remap;
@@ -702,7 +706,8 @@ enum cpp_warning_reason {
   CPP_W_EXPANSION_TO_DEFINED,
   CPP_W_BIDIRECTIONAL,
   CPP_W_INVALID_UTF8,
-  CPP_W_UNICODE
+  CPP_W_UNICODE,
+  CPP_W_HEADER_GUARD
 };
 
 /* Callback for header lookup for HEADER, which is the name of a
--- libcpp/internal.h.jj2024-09-03 16:47:47.324031823 +0200
+++ libcpp/internal.h   2024-09-11 17:09:26.481097532 +0200
@@ -493,9 +493,11 @@ struct cpp_reader
  been used.  */
   bool seen_once_only;
 
-  /* Multiple include optimization.  */
+  /* Multiple include optimization and -Wheader-guard warning.  */
   const cpp_hashnode *mi_cmacro;
   const cpp_hashnode *mi_ind_cmacro;
+  const cpp_hashnode *mi_def_cmacro;
+  location_t mi_loc, mi_def_loc;
   bool mi_valid;
 
   /* Lexing.  */
@@ -676,7 +678,8 @@ _cpp_in_main_source_file (cpp_reader *pf
 }
 
 /* True if NODE is a macro for the purposes of ifdef, defined etc.  */
-inline bool _cpp_defined_macro_p (cpp_hashnode *node)
+inline bool
+_cpp_defined_macro_p (const cpp_hashn

[PATCH] c++: Disable deprecated/unavailable diagnostics when creating thunks for methods with such attributes [PR116636]

2024-09-11 Thread Jakub Jelinek
Hi!

On the following testcase, we emit false positive warnings/errors about using
the deprecated or unavailable methods when creating thunks for them, even
when nothing (in the testcase so far) actually used those.

The following patch temporarily disables that diagnostics when creating
the thunks.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-11  Jakub Jelinek  

PR c++/116636
* method.cc: Include decl.h.
(use_thunk): Temporarily change deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.

* g++.dg/warn/deprecated-19.C: New test.

--- gcc/cp/method.cc.jj 2024-09-06 13:43:37.823301244 +0200
+++ gcc/cp/method.cc2024-09-11 12:19:57.420486173 +0200
@@ -26,6 +26,7 @@ along with GCC; see the file COPYING3.
 #include "coretypes.h"
 #include "target.h"
 #include "cp-tree.h"
+#include "decl.h"
 #include "stringpool.h"
 #include "cgraph.h"
 #include "varasm.h"
@@ -283,6 +284,11 @@ use_thunk (tree thunk_fndecl, bool emit_
   /* Thunks are always addressable; they only appear in vtables.  */
   TREE_ADDRESSABLE (thunk_fndecl) = 1;
 
+  /* Don't diagnose deprecated or unavailable functions just because they
+ have thunks emitted for them.  */
+  auto du = make_temp_override (deprecated_state,
+UNAVAILABLE_DEPRECATED_SUPPRESS);
+
   /* Figure out what function is being thunked to.  It's referenced in
  this translation unit.  */
   TREE_ADDRESSABLE (function) = 1;
--- gcc/testsuite/g++.dg/warn/deprecated-19.C.jj2024-09-11 
12:50:25.34263 +0200
+++ gcc/testsuite/g++.dg/warn/deprecated-19.C   2024-09-11 13:05:29.210222060 
+0200
@@ -0,0 +1,22 @@
+// PR c++/116636
+// { dg-do compile }
+// { dg-options "-pedantic -Wdeprecated" }
+
+struct A {
+  virtual int foo () = 0;
+};
+struct B : virtual A {
+  [[deprecated]] int foo () { return 0; }  // { dg-message "declared here" 
}
+}; // { dg-warning "C\\\+\\\+11 
attributes only available with" "" { target c++98_only } .-1 }
+struct C : virtual A {
+  [[gnu::unavailable]] int foo () { return 0; }// { dg-message 
"declared here" }
+}; // { dg-warning "C\\\+\\\+11 
attributes only available with" "" { target c++98_only } .-1 }
+
+void
+bar ()
+{
+  B b;
+  b.foo ();// { dg-warning "'virtual int 
B::foo\\\(\\\)' is deprecated" }
+  C c;
+  c.foo ();// { dg-error "'virtual int 
C::foo\\\(\\\)' is unavailable" }
+}

Jakub



[PATCH] cselib: Discard useless locs of preserved VALUEs [PR116627]

2024-09-11 Thread Jakub Jelinek
Hi!

remove_useless_values iteratively discards useless locs (locs of
cselib_val which refer to non-preserved VALUEs with no locations),
which in turn can make further values useless until no further VALUEs
are made useless and then discards the useless VALUEs.

Preserved VALUEs (something done during var-tracking only I think)
live in a different hash table, cselib_preserved_hash_table rather
than cselib_hash_table.  cselib_find_slot first looks up slot in
cselib_preserved_hash_table and only if not found looks it up in
cselib_hash_table (and INSERTs only into the latter), whereas preservation
of a VALUE results in move of a cselib_val from the latter to the former
hash table.

The testcase in the PR (apparently too fragile, it only reproduces on 14
branch with various flags on a single arch, not on trunk) ICEs, because
we have a preserved VALUE (QImode with (const_int 0) as one of the locs).
In a different BB SImode r2 is looked up, a non-preserved VALUE is created
for it, and the r13-2916 added code attempts to lookup also SUBREGs of that
in narrower modes, among those QImode, so adds to that SImode r2
non-preserve VALUE a new loc of (subreg:QI (value:SI) 0).  That SImode
value is considered useless, so remove_useless_value discards it, but
nothing discarded it from the preserved VALUE's loc_list, so when looking
something up in the hash table we ICE trying to derevence CSELIB_VAL
of the discarded VALUE.

I think we need to discuard useless locs even from the preserved VALUEs.
That IMHO shouldn't create any further useless VALUEs, the preserved
VALUEs are never useless, so we don't need to iterate with it, can do it
just once, but IMHO it needs to be done because actually
discard_useless_values.

The following patch does that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-11  Jakub Jelinek  

PR target/116627
* cselib.cc (remove_useless_values): Discard useless locs
even from preserved cselib_vals in cselib_preserved_hash_table
hash table.

--- gcc/cselib.cc.jj2024-04-26 11:46:54.960269768 +0200
+++ gcc/cselib.cc   2024-09-11 10:54:05.018242593 +0200
@@ -751,6 +751,11 @@ remove_useless_values (void)
   }
   *p = &dummy_val;
 
+  if (cselib_preserve_constants)
+cselib_preserved_hash_table->traverse  (NULL);
+  gcc_assert (!values_became_useless);
+
   n_useless_values += n_useless_debug_values;
   n_debug_values -= n_useless_debug_values;
   n_useless_debug_values = 0;

Jakub



Re: [PATCH] c++: Implement for namespace statics CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-11 Thread Jakub Jelinek
On Wed, Sep 11, 2024 at 10:16:18PM +1000, Nathaniel Shead wrote:
> In the header_module_p case, it is valid to have internal linkage
> definitions (e.g. in an anonymous namespace), but in that case the
> {static,tls}_aggregates lists should still be in place to be streamed
> and everything should work as "normal".

As the patch doesn't touch the streaming of {static,tls}_aggregates
in that case, I guess that means CWG 2867 will not be fixed for those
cases (i.e. temporaries from the structured binding base initialization
will be destructed at the end of that initialization, rather than at the
end of subsequent get initializers); perhaps we should stream the
STATIC_INIT_DECOMP_*BASE_P flags say by streaming there integer_zero_node
or integer_one_node right before the decls and on streaming it back set
the flags again.  For the !header_module_p case, we'll need a testcase too
to make sure it works properly.

Jakub



Re: [PATCH v4] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-10 Thread Jakub Jelinek
On Tue, Sep 10, 2024 at 09:28:04PM +, Qing Zhao wrote:
> @@ -11741,6 +11770,54 @@ c_parser_postfix_expression (c_parser *parser)
>   set_c_expr_source_range (&expr, loc, close_paren_loc);
>   break;
> }
> + case RID_BUILTIN_COUNTED_BY_REF:
> +   {
> + vec *cexpr_list;
> + c_expr_t *e_p;
> + location_t close_paren_loc;
> +
> + in_builtin_counted_by_ref = true;
> +
> + c_parser_consume_token (parser);
> + if (!c_parser_get_builtin_args (parser,
> + "__builtin_counted_by_ref",
> + &cexpr_list, false,
> + &close_paren_loc))
> +   {
> + expr.set_error ();
> + goto error_exit;

Up to Joseph or Marek as C maintainers/reviewers, but I think it is a bad
idea to use such a generic name for a label inside of handling of one
specific keyword.

Either use RAII and just break; instead of goto error_exit;, like
struct in_builtin_counted_by_ref_sentinel {
  ~in_builtin_counted_by_ref_sentinel ()
  { in_builtin_counted_by_ref = false; }
} ibcbr_sentinel;
or add those in_builtin_counted_by_ref = false; lines before each break;

Or set it just when parsing the args?

Anyway, I'm not even convinced a global variable like that is a good idea.
The argument can contain arbitrary expressions in there (e.g. comma
expression, statement expression, ...), I strongly doubt you want to
have that special handling in all the places in the grammar rather than just
for the last COMPONENT_REF in there.  And, there is no reason why
you couldn't have e.g. nested call inside of the argument:
__builtin_counted_by_ref (ptr[*__builtin_counted_by_ref 
(something->fam1)]->fam2)
and that on the other side clears in_builtin_counted_by_ref after parsing
the inner one.

Jakub



Re: [PATCH v2] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-10 Thread Jakub Jelinek
On Tue, Sep 10, 2024 at 06:31:23PM +, Qing Zhao wrote:
> 
> 
> > On Sep 10, 2024, at 14:09, Jakub Jelinek  wrote:
> > 
> > On Tue, Sep 10, 2024 at 06:02:45PM +, Qing Zhao wrote:
> >>> #define alloc(P, FAM, COUNT) ({ \
> >>> __auto_type __p = &(P); \
> >>> __auto_type __c = (COUNT); \
> >>> size_t __size = sizeof(*(*__p)) + sizeof(*(*__p)->FAM) * __c; \
> > 
> > Shouldn't that be
> >  size_t __size = offsetof(__typeof(*__p), FAM) + sizeof(*(*__p)->FAM) * 
> > __c; \
> > ?
> 
> Yeah, I think that the correct size computation should be:
> 
> #define MAX(A, B) (A > B) ? (A) : (B)
> size_t __size = MAX (sizeof (*(*__p)), offsetof(__typeof(*__p), FAM) + 
> sizeof(*(*__p)->FAM) * __c); \

No, why?  sizeof (*(*__p)) should be always >= offsetof(__typeof(*__p), FAM),
you can't have an offset outside of a structure (ok, except doing something
like use fld[100] as FAM).  offsetof + sizeof (elt) * count is the actually
needed size, say if it is
struct S { size_t a; char b; __attribute__((counted_by (a))) char c[]; };
then you don't really need 2 * sizeof (size_t) + N size of N elements
in the flexible array, just sizeof (size_t) + 1 + N is enough.

Or is counted_by attribute handling it in some weird way?

Jakub



[PATCH] c++: Implement for namespace statics CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-10 Thread Jakub Jelinek
Hi!

The following patch on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662507.html
patch adds CWG 2867 support for namespace locals.

Those vars are just pushed into {static,tls}_aggregates chain, then
pruned from those lists, separated by priority and finally emitted into
the corresponding dynamic initialization functions.
The patch adds two flags used on the TREE_LIST nodes in those lists,
one marks the structured binding base variable and/or associated ref
extended temps, another marks the vars initialized using get methods.
The flags are preserved across the pruning, for splitting into by priority
all associated decls of a structured binding using tuple* are forced
into the same priority as the first one, and finally when actually emitting
code, CLEANUP_POINT_EXPRs are disabled in the base initializer(s) and
code from the bases and non-bases together is wrapped into a single
CLEANUP_POINT_EXPR.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, I haven't touched the module handling; from what I can see,
prune_vars_needing_no_initialization is destructive to the
{static,tls}_aggregates lists (keeps the list NULL at the end or if there
are errors or it contains some DECL_EXTERNAL decls, keeps in there just
those, not the actual vars that need dynamic initialization) and
the module writing is done only afterwards, so I think it could work
reasonably only if header_module_p ().  Can namespace scope structured
bindings appear in header_module_p () or !header_module_p () modules?
How would a testcase using them look like?  Especially when structured
bindings can't be extern nor templates nor inline there can be just one
definition, so the module would need to be included in a single file, no?
In any case, the patch shouldn't make the modules case any worse, it
just adds TREE_LIST flags which will not be streamed for modules and so
if one can use structured bindings in modules, possibly CWG 2867 would be
not fixed for those but nothing worse than that.

2024-09-10  Jakub Jelinek  

PR c++/115769
gcc/cp/
* cp-tree.h (STATIC_INIT_DECOMP_BASE_P): Define.
(STATIC_INIT_DECOMP_NONBASE_P): Define.
* decl.cc (cp_finish_decl): Mark nodes in {static,tls}_aggregates
with 
* decl2.cc (decomp_handle_one_var, decomp_finalize_var_list): New
functions.
(emit_partial_init_fini_fn): Use them.
(prune_vars_needing_no_initialization): Clear
STATIC_INIT_DECOMP_*BASE_P flags if needed.
(partition_vars_for_init_fini): Use same priority for
consecutive STATIC_INIT_DECOMP_*BASE_P vars and propagate
those flags to new TREE_LISTs when possible.  Formatting fix.
(handle_tls_init): Use decomp_handle_one_var and
decomp_finalize_var_list functions.
gcc/testsuite/
* g++.dg/DRs/dr2867-5.C: New test.
* g++.dg/DRs/dr2867-6.C: New test.
* g++.dg/DRs/dr2867-7.C: New test.
* g++.dg/DRs/dr2867-8.C: New test.

--- gcc/cp/cp-tree.h.jj 2024-09-07 09:31:20.601484156 +0200
+++ gcc/cp/cp-tree.h2024-09-09 15:53:44.924112247 +0200
@@ -470,6 +470,7 @@ extern GTY(()) tree cp_global_trees[CPTI
   BASELINK_FUNCTIONS_MAYBE_INCOMPLETE_P (in BASELINK)
   BIND_EXPR_VEC_DTOR (in BIND_EXPR)
   ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P (in ATOMIC_CONSTR)
+  STATIC_INIT_DECOMP_BASE_P (in the TREE_LIST for {static,tls}_aggregates)
2: IDENTIFIER_KIND_BIT_2 (in IDENTIFIER_NODE)
   ICS_THIS_FLAG (in _CONV)
   DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (in VAR_DECL)
@@ -489,6 +490,8 @@ extern GTY(()) tree cp_global_trees[CPTI
   IMPLICIT_CONV_EXPR_BRACED_INIT (in IMPLICIT_CONV_EXPR)
   PACK_EXPANSION_AUTO_P (in *_PACK_EXPANSION)
   contract_semantic (in ASSERTION_, PRECONDITION_, POSTCONDITION_STMT)
+  STATIC_INIT_DECOMP_NONBASE_P (in the TREE_LIST
+   for {static,tls}_aggregates)
3: IMPLICIT_RVALUE_P (in NON_LVALUE_EXPR or STATIC_CAST_EXPR)
   ICS_BAD_FLAG (in _CONV)
   FN_TRY_BLOCK_P (in TRY_BLOCK)
@@ -5947,6 +5950,21 @@ extern bool defer_mangling_aliases;
 
 extern bool flag_noexcept_type;
 
+/* True if this TREE_LIST in {static,tls}_aggregates is a for dynamic
+   initialization of namespace scope structured binding base or related
+   extended ref init temps.  Temporaries from the initialization of
+   STATIC_INIT_DECOMP_BASE_P dynamic initializers should be destroyed only
+   after the last STATIC_INIT_DECOMP_NONBASE_P dynamic initializer following
+   it.  */
+#define STATIC_INIT_DECOMP_BASE_P(NODE) \
+  TREE_LANG_FLAG_1 (TREE_LIST_CHECK (NODE))
+
+/* True if this TREE_LIST in {static,tls}_aggregates is a for dynamic
+   initialization of namespace scope structured binding non-base
+   variable using get.  */
+#define STATIC_INIT_DECOMP_NONBASE_P(NODE) \
+  TREE_LANG_FLAG_2 (TREE_LIST_CHECK (NODE))
+
 /* A list of namespace-scope objects which have constructors or

Re: [PATCH v2] Provide new GCC builtin __builtin_counted_by_ref [PR116016]

2024-09-10 Thread Jakub Jelinek
On Tue, Sep 10, 2024 at 06:02:45PM +, Qing Zhao wrote:
> > #define alloc(P, FAM, COUNT) ({ \
> >  __auto_type __p = &(P); \
> >  __auto_type __c = (COUNT); \
> >  size_t __size = sizeof(*(*__p)) + sizeof(*(*__p)->FAM) * __c; \

Shouldn't that be
  size_t __size = offsetof(__typeof(*__p), FAM) + sizeof(*(*__p)->FAM) * __c; \
?

> >  if ((*__p = malloc(__size))) { \ 
> >__auto_type ret = __builtin_counted_by_ref((*__p)->FAM); \
> > *_Generic(ret, void *: &(size_t){0}, default: ret) = __c; \
> >  } \
> > })
> > 
> > to have brackets around the macro arguments to avoid accidents,
> > to reduce compile time due to multiple evaluation of the macro
> > arguments, and to avoid warnings for the null pointer dereference
> > on clang.

Jakub



Re: [Patch][RFC] Fortran/OpenMP: Middle-end support for mapping of DT with allocatable components

2024-09-10 Thread Jakub Jelinek
On Tue, Sep 10, 2024 at 12:19:33PM +0200, Tobias Burnus wrote:
> Background: OpenMP states that for 'map(var)', all allocatable components
> of 'var' will automatically also be mapped ('deep mapping').

Not a review, just a comment.  This kind of recursive mapping is also
what needs to happen for declare mapper, so wonder if that shouldn't be
solved together; and some way to merge mappings of one field after another
with the same way if consecutive fields (with possibly some padding bits
in between) are mapped the same way.

Jakub



Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-09 Thread Jakub Jelinek
On Mon, Sep 09, 2024 at 02:55:25PM -0600, Sandra Loosemore wrote:
> On 9/9/24 05:01, Jakub Jelinek wrote:
> > 
> > I think also testing the device={kind(any,any)} and device={kind("any",any)}
> > and device={kind(any,"any"))} would be useful.
> 
> Hmmm, it looks like GCC does not presently check for the restriction
> 
> "Each trait-property may only be specified once in a trait selector other
> than those in the construct selector set."

Seems that restriction is new in 5.1 (which is why it hasn't been checked in
the original code).

Jakub



Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-09 Thread Jakub Jelinek
On Mon, Sep 09, 2024 at 02:10:05PM +, Qing Zhao wrote:
> Okay, now after finishing reading all the discussion so far, I realized that 
> we are back to the previous pointer approach:
> 
> __builtin_get_counted_by (p->FAM)
> 
> Works as:
> 
> If (has_counted_by (p->FAM))
>   return &p->COUNT;
> else
>   return (void *)0;
> 
> Then the user will use it as:
> 
> auto p = __builtin_get_counted_by(__p->FAM)
> *_Generic(p, void*: &(int){}, default: p) = COUNT;
> 
> The major reason for back to the previous pointer approach is (from Martin):
> 
> "The initial proposal already has this as "Alternative Design"
> and then there is this response to my comment:
> 
> https://discourse.llvm.org/t/rfc-introducing-new-clang-builtin-builtin-get-counted-by/80836/27
> 
> which seems to imply that she is open to this idea.

I'd strongly prefer the pointer variant over some ugly hack like what is
proposed for the returning lvalue case.  Yes, returning various different
pointer types is still not a normal builtin, but it is in line with various
overloaded builtins where the return type depends on the argument types
or other details, like __atomic_{load,exchange,compare_exchage} etc.

clang with -fbounds-safety can impose some restrictions like that the
pointer shouldn't escape the current function and be just dereferenced
and not say cast to integer etc.
If they "return lvalue", what prevents &__builtin_whatever (p->FAM)?
And I still don't understand how they can avoid taking the address of the
counter, can't one bypass that by say
(int *) ((char *)&obj + offsetof (struct something, obj))
?

Jakub



Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-09 Thread Jakub Jelinek
On Sun, Sep 08, 2024 at 09:15:23AM -0600, Sandra Loosemore wrote:
> On 8/16/24 06:58, Jakub Jelinek wrote:
> > 
> > If this can apply (perhaps with small fuzz) to vanilla trunk, guess it can
> > be committed right now, doesn't need to wait for the rest of the
> > metadirective patch set.
> 
> OK.  I've tested the slightly cleaned-up version of the patch which is
> attached; I'll push it in the next day or two if there is no further
> objection.
> 
> -Sandra

> From 23a82bea26805f2a430354d56b444d5bb7eaed3f Mon Sep 17 00:00:00 2001
> From: Sandra Loosemore 
> Date: Fri, 6 Sep 2024 20:58:13 +
> Subject: [PATCH] OpenMP: Reject other properties with kind(any)
> 
> TR13 (pre-6.0) of the OpenMP spec says:
> 
> "If trait-property any is specified in the kind trait-selector of the
> device selector set or the target_device selector sets, no other
> trait-property may be specified in the same selector set."
> 
> This restriction dates back to OpenMP 5.1 (where it had slightly
> different wording).  GCC's implementation was based on the 5.0 spec, and
> several testcases include selectors that are now considered invalid.
> This patch adds a diagnostic and fixes the testcases.
> 
> gcc/ChangeLog
>   * omp-general.cc (omp_check_context_selector): Reject other
>   properties in the same selector set with kind(any).
> 
> gcc/testsuite/ChangeLog
>   * c-c++-common/gomp/declare-variant-10.c: Fix broken tests.
>   * c-c++-common/gomp/declare-variant-3.c: Likewise.
>   * c-c++-common/gomp/declare-variant-9.c: Likewise.
>   * c-c++-common/gomp/declare-variant-any.c: New.
>   * gfortran.dg/gomp/declare-variant-10.f90: Fix broken tests.
>   * gfortran.dg/gomp/declare-variant-3.f90: Likewise.
>   * gfortran.dg/gomp/declare-variant-9.f90: Likewise.
>   * gfortran.dg/gomp/declare-variant-any.f90: New.
> ---
>  gcc/omp-general.cc| 35 +++
>  .../c-c++-common/gomp/declare-variant-10.c|  4 +--
>  .../c-c++-common/gomp/declare-variant-3.c | 10 ++
>  .../c-c++-common/gomp/declare-variant-9.c |  4 +--
>  .../c-c++-common/gomp/declare-variant-any.c   | 10 ++
>  .../gfortran.dg/gomp/declare-variant-10.f90   |  4 +--
>  .../gfortran.dg/gomp/declare-variant-3.f90| 12 ++-
>  .../gfortran.dg/gomp/declare-variant-9.f90|  2 +-
>  .../gfortran.dg/gomp/declare-variant-any.f90  | 28 +++
>  9 files changed, 86 insertions(+), 23 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/gomp/declare-variant-any.c
>  create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-variant-any.f90
> 
> diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
> index 0b61335dba4..aa53c00bec5 100644
> --- a/gcc/omp-general.cc
> +++ b/gcc/omp-general.cc
> @@ -1288,6 +1288,8 @@ omp_check_context_selector (location_t loc, tree ctx)
>for (tree tss = ctx; tss; tss = TREE_CHAIN (tss))
>  {
>enum omp_tss_code tss_code = OMP_TSS_CODE (tss);
> +  bool saw_any_prop = false;
> +  bool saw_other_prop = false;
>  
>/* We can parse this, but not handle it yet.  */
>if (tss_code == OMP_TRAIT_SET_TARGET_DEVICE)
> @@ -1324,6 +1326,31 @@ omp_check_context_selector (location_t loc, tree ctx)
> else
>   ts_seen[ts_code] = true;
>  
> +

I think just one empty line is enough, no need for two.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/gomp/declare-variant-any.c
> @@ -0,0 +1,10 @@
> +extern int f1 (int);
> +extern int f2 (int);
> +extern int f3 (int);
> +extern int f4 (int);
> +
> +#pragma omp declare variant (f1) match (device={kind(any,gpu)})  /* { 
> dg-error "no other trait-property may be specified" } */
> +#pragma omp declare variant (f2) match (device={kind(cpu,"any")})  /* { 
> dg-error "no other trait-property may be specified" } */
> +#pragma omp declare variant (f3) match (device={kind("any"),arch(x86_64)})  
> /* { dg-error "no other trait-property may be specified" } */
> +#pragma omp declare variant (f4) match (device={arch(x86_64),kind(any)})  /* 
> { dg-error "no other trait-property may be specified" } */

I think also testing the device={kind(any,any)} and device={kind("any",any)}
and device={kind(any,"any"))} would be useful.
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/gomp/declare-variant-any.f90
> @@ -0,0 +1,28 @@
> +integer function f1 (x)
> +  integer, intent(in) :: x
> +  f1 = x + 1
> +end function
> +integer function f2 (x)
> +  integer, intent(in) :: x
> +  f2 = x + 2
> +end function
> +integer function f3 (x)
> +  integer, intent(in) :: 

[committed] testsuite: Fix up pr116588.c test [PR116588]

2024-09-09 Thread Jakub Jelinek
On Sat, Sep 07, 2024 at 01:58:46PM -0400, Andrew MacLeod wrote:
The test as committed without the tree-vrp.cc change only FAILs with
FAIL: gcc.dg/pr116588.c scan-tree-dump-not vrp2 "0 != 0"
The DEBUG code in there was just to make it easier to debug, but doesn't
actually fail when the test is miscompiled.
We don't need such debugging code in simple tests like that, but it is
useful if they abort when miscompiled.

With this patch without the tree-vrp.cc change I see
FAIL: gcc.dg/pr116588.c execution test
FAIL: gcc.dg/pr116588.c scan-tree-dump-not vrp2 "0 != 0"
and with it it passes.

Tested on x86_64-linux with
make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} dg.exp=pr116588.c'
and committed to trunk as obvious.

2024-09-09  Jakub Jelinek  

PR tree-optimization/116588
* gcc.dg/pr116588.c: Remove -DDEBUG from dg-options.
(main): Remove debugging code and simplify.

--- gcc/testsuite/gcc.dg/pr116588.c.jj  2024-09-09 09:27:39.155082488 +0200
+++ gcc/testsuite/gcc.dg/pr116588.c 2024-09-09 09:31:37.951970837 +0200
@@ -1,7 +1,7 @@
 /* PR tree-optimization/116588 */
 /* { dg-do run { target bitint575 } } */
 /* { dg-require-effective-target int128 } */
-/* { dg-options "-O2 -fno-vect-cost-model -fno-tree-dominator-opts 
-fno-tree-fre --param=vrp-block-limit=0  -DDEBUG -fdump-tree-vrp2-details" } */
+/* { dg-options "-O2 -fno-vect-cost-model -fno-tree-dominator-opts 
-fno-tree-fre --param=vrp-block-limit=0 -fdump-tree-vrp2-details" } */
 
 int a;
 __int128 b, c;
@@ -18,15 +18,8 @@ foo (_BitInt (129) e)
 int
 main ()
 {
-  __int128 x = foo (0);
-#ifdef DEBUG
-  for (unsigned i = 0; i < sizeof (x); i++)
-__builtin_printf ("%02x", i[(volatile unsigned char *) &x]);
-  __builtin_printf("\n");
-#else
-  if (x)
-__builtin_abort();
-#endif
+  if (foo (0))
+__builtin_abort ();
 }
 
 /* { dg-final { scan-tree-dump-not "0 != 0" "vrp2" } } */


Jakub



Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-07 Thread Jakub Jelinek
On Sat, Sep 07, 2024 at 07:50:29AM +0200, Martin Uecker wrote:
> >   2. Apple's implementation supports expressions in the '__counted_by'
> > attribute, thus the 'count' may be an R-value that can't have its
> > address taken.
> > 
> > [1] 
> > https://discourse.llvm.org/t/rfc-introducing-new-clang-builtin-builtin-get-counted-by/80836/
> 
> Yes, this would be a limitation.

But then you really can't return an lvalue either, all you can return is an
rvalue in that case.  So just return a (void*)0 in that case.

Jakub



[PATCH] libiberty: Fix up > 64K section handling in simple_object_elf_copy_lto_debug_section [PR116614]

2024-09-06 Thread Jakub Jelinek
Hi!

cat abc.C
#define A(n) struct T##n {} t##n;
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) 
A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) 
B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) 
C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) 
D(n##8) D(n##9)
E(1) E(2) E(3)
int main () { return 0; }
./xg++ -B ./ -o abc{.o,.C} -flto -flto-partition=1to1 -O2 -g 
-fdebug-types-section -c
./xgcc -B ./ -o abc{,.o} -flto -flto-partition=1to1 -O2
(not included in testsuite as it takes a while to compile) FAILs with
lto-wrapper: fatal error: Too many copied sections: Operation not supported
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The following patch fixes that.  Most of the 64K+ section support for
reading and writing was already there years ago (and especially reading used
quite often already) and a further bug fixed in it in the PR104617 fix.

Yet, the fix isn't solely about removing the
  if (new_i - 1 >= SHN_LORESERVE)
{
  *err = ENOTSUP;
  return "Too many copied sections";
}
5 lines, the missing part was that the function only handled reading of
the .symtab_shndx section but not copying/updating of it.
If the result has less than 64K-epsilon sections, that actually wasn't
needed, but e.g. with -fdebug-types-section one can exceed that pretty
easily (reported to us on WebKitGtk build on ppc64le).
Updating the section is slightly more complicated, because it basically
needs to be done in lock step with updating the .symtab section, if one
doesn't need to use SHN_XINDEX in there, the section should (or should be
updated to) contain SHN_UNDEF entry, otherwise needs to have whatever would
be overwise stored but couldn't fix.  But repeating due to that all the
symtab decisions what to discard and how to rewrite it would be ugly.

So, the patch instead emits the .symtab_shndx section (or sections) last
and prepares the content during the .symtab processing and in a second
pass when going just through .symtab_shndx sections just uses the saved
content.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-06  Jakub Jelinek  

PR lto/116614
* simple-object-elf.c (SHN_COMMON): Align comment with neighbouring
comments.
(SHN_HIRESERVE): Use uppercase hex digits instead of lowercase for
consistency.
(simple_object_elf_find_sections): Formatting fixes.
(simple_object_elf_fetch_attributes): Likewise.
(simple_object_elf_attributes_merge): Likewise.
(simple_object_elf_start_write): Likewise.
(simple_object_elf_write_ehdr): Likewise.
(simple_object_elf_write_shdr): Likewise.
(simple_object_elf_write_to_file): Likewise.
(simple_object_elf_copy_lto_debug_section): Likewise.  Don't fail for
new_i - 1 >= SHN_LORESERVE, instead arrange in that case to copy
over .symtab_shndx sections, though emit those last and compute their
section content when processing associated .symtab sections.  Handle
simple_object_internal_read failure even in the .symtab_shndx reading
case.

--- libiberty/simple-object-elf.c.jj2024-01-03 12:07:48.461085637 +0100
+++ libiberty/simple-object-elf.c   2024-09-06 13:34:12.796669098 +0200
@@ -128,9 +128,9 @@ typedef struct {
 
 #define SHN_UNDEF  0   /* Undefined section */
 #define SHN_LORESERVE  0xFF00  /* Begin range of reserved indices */
-#define SHN_COMMON 0xFFF2  /* Associated symbol is in common */
+#define SHN_COMMON 0xFFF2  /* Associated symbol is in common */
 #define SHN_XINDEX 0x  /* Section index is held elsewhere */
-#define SHN_HIRESERVE  0x  /* End of reserved indices */
+#define SHN_HIRESERVE  0x  /* End of reserved indices */
 
 
 /* 32-bit ELF program header.  */
@@ -569,8 +569,8 @@ simple_object_elf_find_sections (simple_
 void *data,
 int *err)
 {
-  struct simple_object_elf_read *eor =
-(struct simple_object_elf_read *) sobj->data;
+  struct simple_object_elf_read *eor
+= (struct simple_object_elf_read *) sobj->data;
   const struct elf_type_functions *type_functions = eor->type_functions;
   unsigned char ei_class = eor->ei_class;
   size_t shdr_size;
@@ -662,8 +662,8 @@ simple_object_elf_fetch_attributes (simp
const char **errmsg ATTRIBUTE_UNUSED,
int *err ATTRIBUTE_UNUSED)
 {
-  struct simple_object_elf_read *eor =
-(struct simple_object_elf_read *) sobj->data;
+  struct simple_object_elf_read *eor
+= (struct simple_object_elf_read *) sobj->data;
   struct

[PATCH] c++, v3: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-06 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 10:31:48PM +0200, Franz Sirl wrote:
> Hmm, it just occured to me, how about adding !NONVIRTUAL here? When
> NONVIRTUAL is true, there is no conditional stmt at all, or?

Yeah, that makes sense, the problem doesn't happen in that case.

Here is an adjusted patch, bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?

2024-09-06  Jakub Jelinek  

PR c++/116449
* typeck.cc (get_member_function_from_ptrfunc): Use save_expr
on instance_ptr and function even if it doesn't have side-effects,
as long as it isn't a decl.

* g++.dg/ubsan/pr116449.C: New test.

--- gcc/cp/typeck.cc.jj 2024-09-02 17:07:30.115098114 +0200
+++ gcc/cp/typeck.cc2024-09-04 19:08:24.127490242 +0200
@@ -4188,10 +4188,23 @@ get_member_function_from_ptrfunc (tree *
   if (!nonvirtual && is_dummy_object (instance_ptr))
nonvirtual = true;
 
-  if (TREE_SIDE_EFFECTS (instance_ptr))
-   instance_ptr = instance_save_expr = save_expr (instance_ptr);
+  /* Use save_expr even when instance_ptr doesn't have side-effects,
+unless it is a simple decl (save_expr won't do anything on
+constants), so that we don't ubsan instrument the expression
+multiple times.  See PR116449.  */
+  if (TREE_SIDE_EFFECTS (instance_ptr)
+ || (!nonvirtual && !DECL_P (instance_ptr)))
+   {
+ instance_save_expr = save_expr (instance_ptr);
+ if (instance_save_expr == instance_ptr)
+   instance_save_expr = NULL_TREE;
+ else
+   instance_ptr = instance_save_expr;
+   }
 
-  if (TREE_SIDE_EFFECTS (function))
+  /* See above comment.  */
+  if (TREE_SIDE_EFFECTS (function)
+ || (!nonvirtual && !DECL_P (function)))
function = save_expr (function);
 
   /* Start by extracting all the information from the PMF itself.  */
--- gcc/testsuite/g++.dg/ubsan/pr116449.C.jj2024-09-04 18:58:46.106764285 
+0200
+++ gcc/testsuite/g++.dg/ubsan/pr116449.C   2024-09-04 18:58:46.106764285 
+0200
@@ -0,0 +1,14 @@
+// PR c++/116449
+// { dg-do compile }
+// { dg-options "-O2 -Wall -fsanitize=undefined" }
+
+struct C { void foo (int); void bar (); int c[16]; };
+typedef void (C::*P) ();
+struct D { P d; };
+static D e[1] = { { &C::bar } };
+
+void
+C::foo (int x)
+{
+  (this->*e[c[x]].d) ();
+}


Jakub



[PATCH] c++: Fix up pedantic handling of alignas [PR110345]

2024-09-06 Thread Jakub Jelinek
Hi!

The following patch on top of the PR110345 P2552R3 series:
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661904.html   

  
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661905.html   

  
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661906.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662330.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662331.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662333.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662334.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662336.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662379.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662380.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662381.html
emits pedantic pedwarns for alignas appertaining to incorrect entities.
As the middle-end and attribute exclusions look for "aligned" attribute,
the patch transforms alignas into "internal "::aligned attribute (didn't
use [[aligned (x)]] so that people can't type it that way).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-09-06  Jakub Jelinek  

PR c++/110345
gcc/c-family/
* c-common.h (attr_aligned_exclusions): Declare.
(handle_aligned_attribute): Likewise.
* c-attribs.cc (handle_aligned_attribute): No longer
static.
(attr_aligned_exclusions): Use extern instead of static.
gcc/cp/
* cp-tree.h (enum cp_tree_index): Add CPTI_INTERNAL_IDENTIFIER.
(internal_identifier): Define.
(internal_attribute_table): Declare.
* parser.cc (cp_parser_exception_declaration): Error on alignas
on exception declaration.
(cp_parser_std_attribute_spec): Turn alignas into internal
ns aligned attribute rather than gnu.
* decl.cc (initialize_predefined_identifiers): Initialize
internal_identifier.
* tree.cc (handle_alignas_attribute): New function.
(internal_attributes): New variable.
(internal_attribute_table): Likewise.
* cp-objcp-common.h (cp_objcp_attribute_table): Add
internal_attribute_table entry.
gcc/testsuite/
* g++.dg/cpp0x/alignas1.C: Add dg-options "".
* g++.dg/cpp0x/alignas2.C: Likewise.
* g++.dg/cpp0x/alignas7.C: Likewise.
* g++.dg/cpp0x/alignas21.C: New test.
* g++.dg/ext/bitfield9.C: Expect a warning.
* g++.dg/cpp2a/is-layout-compatible3.C: Add dg-options -pedantic.
Expect a warning.

--- gcc/c-family/c-common.h.jj  2024-09-06 13:43:37.311307920 +0200
+++ gcc/c-family/c-common.h 2024-09-06 15:33:47.497616169 +0200
@@ -1645,8 +1645,10 @@ extern int parse_tm_stmt_attr (tree, int
 extern int tm_attr_to_mask (tree);
 extern tree tm_mask_to_attr (int);
 extern tree find_tm_attribute (tree);
+extern const struct attribute_spec::exclusions attr_aligned_exclusions[];
 extern const struct attribute_spec::exclusions attr_cold_hot_exclusions[];
 extern const struct attribute_spec::exclusions attr_noreturn_exclusions[];
+extern tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
--- gcc/c-family/c-attribs.cc.jj2024-09-06 13:43:37.300308064 +0200
+++ gcc/c-family/c-attribs.cc   2024-09-06 16:00:55.864465359 +0200
@@ -100,7 +100,6 @@ static tree handle_destructor_attribute
 static tree handle_mode_attribute (tree *, tree, tree, int, bool *);
 static tree handle_section_attribute (tree *, tree, tree, int, bool *);
 static tree handle_special_var_sec_attribute (tree *, tree, tree, int, bool *);
-static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
 static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,
  int, bool *);
 static tree handle_strict_flex_array_attribute (tree *, tree, tree,
@@ -192,7 +191,7 @@ static tree handle_null_terminated_strin
   { name, function, type, variable }
 
 /* Define attributes that are mutually exclusive with one another.  */
-static const struct attribute_spec::exclusions attr_aligned_exclusions[] =
+extern const struct attribute_spec::exclusions attr_aligned_exclusions[] =
 {
   /* Attribute name exclusion applies to:
function, type, variable */
@@ -2806,7 +2805,7 @@ common_handle_aligned_attribute (tree *n
 /* Handle a "aligned" attribute; arguments as in
struct attribute_spec.handler.  */
 
-static tree
+tree
 handle_a

Re: [PATCH][PR116569] match.pd: Check trunc_mod vector obtap before folding.

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 02:10:19PM +, Kyrylo Tkachov wrote:
> > This is certainly wrong.
> > PROP_gimple_any is set already at the end of gimplification, so certainly
> > doesn't include any other early gimple passes.
> > And, not all statements are folded during gimplification, e.g. in OpenMP
> > regions folding is postponed until the omp lowering pass and folded only
> > there (at which point PROP_gimple_any is already set).
> > 
> > What exactly are you trying to ensure this optimization goes before?
> > For non-VL vectors I guess vector lowering, but that is done far later
> > and we already have a different predicate for that.
> > For VL vectors, what transforms that if user write % ?
> 
> There’s currently no way to write this in a generic VLA way. The SVE 
> intrinsics for this would be opaque to GIMPLE and the generic vector 
> extension doesn’t support VLA for now.
> The problem is the fold-minus-1.c test case that wants to see the fold happen 
> early on, and I think that makes sense from a canonicalization POV but when 
> the vectorizer has expanded a vector mod later on we don’t want to put it 
> back together.
> I agree gimple_any doesn’t look like the right thing. Is there a better check 
> to use?

If it never works with SVE or RISC-V VL vectors, then match.pd shouldn't do
it unless it works.  Testcase can be always adjusted or limited to targets
which do support that.
Or, do it for VECTOR_INTEGER_TYPE_P only
if ((optimize_vectors_before_lowering_p ()
 && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
|| target_supports_op_p (type, TRUNC_MOD_EXPR, optab_vector))
i.e. do it before vector lowering for vectors which can be lowered,
and when the optabs works.
Anyway, I think it would be useful to check whether it actually results
in better generated code on targets which support TRUNC_DIV_EXPR
and MULT_EXPR on vectors but not TRUNC_MOD_EXPR (if there are any).

Jakub



Re: [PATCH][PR116569] match.pd: Check trunc_mod vector obtap before folding.

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 01:46:01PM +, Jennifer Schmitz wrote:
> In the pattern X - (X / Y) * Y to X % Y, this patch guards the
> simplification for vector types by a check for:
> 1) Support of the mod optab for vectors OR
> 2) Application during early gimple passes (using PROP_gimple_any).
> This is to prevent reverting vectorization of modulo to div/mult/sub
> if the target does not support vector mod optab, while still allowing
> the simplification during early gimple passes (as tested, for example,
> in gcc.dg/fold-minus-1.c).
> 
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
> 
> Signed-off-by: Jennifer Schmitz 
> 
> gcc/
>   PR tree-optimization/116569
>   * generic-match-head.cc (optimize_early_gimple_p): Add inline
>   function with mask for early gimple passes.
>   * gimple-match-head.cc (optimize_early_gimple_p): Likewise.
>   * match.pd: Guard simplification to trunc_mod with check for
>   mod optab support.
> 
> gcc/testsuite/
>   PR tree-optimization/116569
>   * gcc.dg/torture/pr116569.c: New test.

This is certainly wrong.
PROP_gimple_any is set already at the end of gimplification, so certainly
doesn't include any other early gimple passes.
And, not all statements are folded during gimplification, e.g. in OpenMP
regions folding is postponed until the omp lowering pass and folded only
there (at which point PROP_gimple_any is already set).

What exactly are you trying to ensure this optimization goes before?
For non-VL vectors I guess vector lowering, but that is done far later
and we already have a different predicate for that.
For VL vectors, what transforms that if user write % ?

Jakub



[PATCH] c++, v2: Implement for static locals CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-06 Thread Jakub Jelinek
Hi!

On Wed, Aug 14, 2024 at 06:11:35PM +0200, Jakub Jelinek wrote:
> Here is the I believe ABI compatible version, which uses the separate
> guard variables, so different structured binding variables can be
> initialized in different threads, but the thread that did the artificial
> base initialization will keep temporaries live at least until the last
> guard variable is released (i.e. when even that variable has been
> initialized).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux on top of the
> https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660354.html
> patch, ok for trunk?
> 
> As for namespace scope structured bindings and this DR, all of
> set_up_extended_ref_temp, cp_finish_decl -> expand_static_init and
> cp_finish_decl -> cp_finish_decomp -> cp_finish_decl -> expand_static_init
> in that case just push some decls into the static_aggregates or
> tls_aggregates chains.
> So, we can end up e.g. with the most important decl for a extended ref
> temporary (which initializes some temporaries), then perhaps some more
> of those, then DECL_DECOMPOSITION_P base, then n times optionally some further
> extended refs and DECL_DECOMPOSITION_P non-base and I think we need
> to one_static_initialization_or_destruction all of them together, by
> omitting CLEANUP_POINT_EXPR from the very first one (or all until the
> DECL_DECOMPOSITION_P base?), say through temporarily clearing
> stmts_are_full_exprs_p and then wrapping whatever
> one_static_initialization_or_destruction produces for all of those into
> a single CLEANUP_POINT_EXPR argument.
> Perhaps remember static_aggregates or tls_aggregates early before any
> check_initializer etc. calls and then after cp_finish_decomp cut that
> TREE_LIST nodes and pass that as a separate TREE_VALUE in the list.
> Though, not sure what to do about modules.cc uses of these, it needs
> to save/restore that stuff somehow too.

Now that the CWG 2867 patch for automatic structured bindings is in,
here is an updated version of the block scope static structured bindings
CWG 2867 patch.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

No patch for the namespace scope structured bindings yet, will work on that
soon.

2024-09-05  Jakub Jelinek  

PR c++/115769
* decl.cc: Partially implement CWG 2867 - Order of initialization
for structured bindings.
(cp_finish_decl): If need_decomp_init, for function scope structure
binding bases, temporarily clear stmts_are_full_exprs_p before
calling expand_static_init, after it call cp_finish_decomp and wrap
code emitted by both into maybe_cleanup_point_expr_void and ensure
cp_finish_decomp isn't called again.

* g++.dg/DRs/dr2867-3.C: New test.
* g++.dg/DRs/dr2867-4.C: New test.

--- gcc/cp/decl.cc.jj   2024-09-04 19:55:59.046491602 +0200
+++ gcc/cp/decl.cc  2024-09-04 20:04:35.695952219 +0200
@@ -9140,7 +9140,24 @@ cp_finish_decl (tree decl, tree init, bo
 initializer.  It is not legal to redeclare a static data
 member, so this issue does not arise in that case.  */
   else if (var_definition_p && TREE_STATIC (decl))
-   expand_static_init (decl, init);
+   {
+ if (decomp && DECL_FUNCTION_SCOPE_P (decl))
+   {
+ tree sl = push_stmt_list ();
+ auto saved_stmts_are_full_exprs_p = stmts_are_full_exprs_p ();
+ current_stmt_tree ()->stmts_are_full_exprs_p = 0;
+ expand_static_init (decl, init);
+ current_stmt_tree ()->stmts_are_full_exprs_p
+   = saved_stmts_are_full_exprs_p;
+ cp_finish_decomp (decl, decomp);
+ decomp = NULL;
+ sl = pop_stmt_list (sl);
+ sl = maybe_cleanup_point_expr_void (sl);
+ add_stmt (sl);
+   }
+ else
+   expand_static_init (decl, init);
+   }
 }
 
   /* If a CLEANUP_STMT was created to destroy a temporary bound to a
--- gcc/testsuite/g++.dg/DRs/dr2867-3.C.jj  2024-08-13 21:05:42.876446125 
+0200
+++ gcc/testsuite/g++.dg/DRs/dr2867-3.C 2024-08-13 21:05:42.876446125 +0200
@@ -0,0 +1,159 @@
+// CWG2867 - Order of initialization for structured bindings.
+// { dg-do run { target c++11 } }
+// { dg-options "" }
+
+#define assert(X) do { if (!(X)) __builtin_abort(); } while (0)
+
+namespace std {
+  template struct tuple_size;
+  template struct tuple_element;
+}
+
+int a, c, d, i;
+
+struct A {
+  A () { assert (c == 3); ++c; }
+  ~A () { ++a; }
+  template  int &get () const { assert (c == 5 + I); ++c; return i; }
+};
+
+template <> struct std::tuple_size  { static const int value = 4; };
+template  struct std::tuple_element  { using type = int; };
+template <> struct std::tuple_size  { static const int value = 4; };
+template  struct std:

Re: [PATCH] fab: Factor out the main folding part of pass_fold_builtins::execute [PR116601]

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 09:51:38AM +0200, Richard Biener wrote:
> On Fri, Sep 6, 2024 at 9:31 AM Jakub Jelinek  wrote:
> >
> > On Fri, Sep 06, 2024 at 12:21:20AM -0700, Andrew Pinski wrote:
> > > This is an alternative patch to fix PR tree-optimization/116601 by 
> > > factoring
> > > out the main part of pass_fold_builtins::execute into its own function so 
> > > that
> > > we don't need to repeat the code for doing the eh cleanup. It also fixes 
> > > the
> > > problem I saw with the atomics which might skip over a statement; though 
> > > I don't
> > > have a testcase for that.
> >
> > I'm worried about using this elsewhere, various fab foldings are meant to be
> > done only in that pass and not earlier.
> > E.g. the __builtin_constant_p folding, __builtin_assume_aligned, stack
> > restore, unreachable, va_{start,end,copy}.
> 
> Maybe we can document this fact better or name the function differently?

Some of it is documented already in the source.
case BUILT_IN_CONSTANT_P:
  /* Resolve __builtin_constant_p.  If it hasn't been
 folded to integer_one_node by now, it's fairly
 certain that the value simply isn't constant.  */
  result = integer_zero_node;
or
case BUILT_IN_VA_START:
case BUILT_IN_VA_END:
case BUILT_IN_VA_COPY:
  /* These shouldn't be folded before pass_stdarg.  */
  result = optimize_stdarg_builtin (stmt);
Obviously, if either is done much earlier, then the former can fold to 1
(e.g. if it is before IPA or shortly after IPA and not all usual propagation
after inlining etc. is done already), or pass_stdarg hasn't been done, etc.

Jakub



  1   2   3   4   5   6   7   8   9   10   >