[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2023-06-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271

Sam James  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
 CC||sjames at gcc dot gnu.org

--- Comment #15 from Sam James  ---
(In reply to cuilili from comment #14)
> This regression has been fixed with the commit below and we can close this
> ticket.
>  
> https://gcc.gnu.org/g:1b9a5cc9ec08e9f239dd2096edcc447b7a72f64a

Thanks. Do you have a gcc.gnu.org email address? If so, if you change your
Bugzilla to that, you should have permissions to modify bugs yourself.

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2023-06-07 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 104271, which changed state.

Bug 104271 Summary: [12 Regression] 538.imagick_r run-time at -Ofast 
-march=native regressed by 26% on Intel Cascade Lake server CPU
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2023-06-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|12.4|12.3

[Bug tree-optimization/71414] 2x slower than clang summing small float array, GCC should consider larger vectorization factor for "unrolling" reductions

2023-06-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414

--- Comment #14 from Hongtao.liu  ---
(In reply to Richard Biener from comment #13)
> The target now has the ability to tell the vectorizer to choose a larger VF
> based on the cost info it got for the default VF, so the x86 backend could
> make use of that.  For example with the following patch we'll unroll the
> vectorized loops 4 times (of course the actual check for small reduction
> loops and a register pressure estimate is missing).  That generates
> 
> .L4:
>         vaddps  (%rax), %zmm1, %zmm1
>         vaddps  64(%rax), %zmm2, %zmm2
>         addq    $256, %rax
>         vaddps  -128(%rax), %zmm0, %zmm0
>         vaddps  -64(%rax), %zmm3, %zmm3
>         cmpq    %rcx, %rax
>         jne     .L4
>         movq    %rdx, %rax
>         andq    $-64, %rax
>         vaddps  %zmm3, %zmm0, %zmm0
>         vaddps  %zmm2, %zmm1, %zmm1
>         vaddps  %zmm1, %zmm0, %zmm1
> ... more epilog ...
> 
> with -march=znver4 on current trunk.
> 
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index d4ff56ee8dd..53c09bb9d9c 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -23615,8 +23615,18 @@ class ix86_vector_costs : public vector_costs
>   stmt_vec_info stmt_info, slp_tree node,
>   tree vectype, int misalign,
>   vect_cost_model_location where) override;
> +  void finish_cost (const vector_costs *uncast_scalar_costs);
>  };
>  
> +void
> +ix86_vector_costs::finish_cost (const vector_costs *uncast_scalar_costs)
> +{
> +  auto *scalar_costs
> += static_cast (uncast_scalar_costs);
> +  m_suggested_unroll_factor = 4;
> +  vector_costs::finish_cost (scalar_costs);

I remember we have posted a patch for that:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604186.html

One regression observed is that the VF of the epilog loop increases (from xmm
to ymm) after unrolling the vectorized loops, which regressed performance for
low-trip-count loops (similar to -mprefer-vector-width=512).

Also, for the case in the PR, I'm trying to enable
-fvariable-expansion-in-unroller when -funroll-loops is given, and the
partial sums will break the reduction chain.
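
To make the "partial sums break the reduction chain" point concrete, here is a
minimal C sketch of the transformation at source level. The function names are
made up for this illustration, and the four accumulators only mirror the four
vector accumulators in the quoted assembly; this is not what the unroller
literally emits.

```c
#include <stddef.h>

/* Single reduction chain: every addition depends on the previous one,
   so the loop is bound by floating-point add latency. */
float sum_single_chain(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent partial sums (hypothetical hand-written equivalent of
   variable expansion with an unroll factor of 4). */
float sum_four_chains(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Scalar epilogue for the remaining 0..3 elements. */
    for (; i < n; i++)
        s0 += a[i];
    /* The partial sums are combined once, after the loop. */
    return (s0 + s1) + (s2 + s3);
}
```

Because the second form reassociates the additions, its result can differ in
rounding from the single-chain sum, which is why a compiler only does this for
floating point when reassociation is permitted (for example with -Ofast).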

[Bug tree-optimization/71414] 2x slower than clang summing small float array, GCC should consider larger vectorization factor for "unrolling" reductions

2023-06-07 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414

--- Comment #15 from rguenther at suse dot de  ---
On Wed, 7 Jun 2023, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414
> 
> --- Comment #14 from Hongtao.liu  ---
> (In reply to Richard Biener from comment #13)
> > The target now has the ability to tell the vectorizer to choose a larger VF
> > based on the cost info it got for the default VF, so the x86 backend could
> > make use of that.  For example with the following patch we'll unroll the
> > vectorized loops 4 times (of course the actual check for small reduction
> > loops and a register pressure estimate is missing).  That generates
> > 
> > .L4:
> >         vaddps  (%rax), %zmm1, %zmm1
> >         vaddps  64(%rax), %zmm2, %zmm2
> >         addq    $256, %rax
> >         vaddps  -128(%rax), %zmm0, %zmm0
> >         vaddps  -64(%rax), %zmm3, %zmm3
> >         cmpq    %rcx, %rax
> >         jne     .L4
> >         movq    %rdx, %rax
> >         andq    $-64, %rax
> >         vaddps  %zmm3, %zmm0, %zmm0
> >         vaddps  %zmm2, %zmm1, %zmm1
> >         vaddps  %zmm1, %zmm0, %zmm1
> > ... more epilog ...
> > 
> > with -march=znver4 on current trunk.
> > 
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index d4ff56ee8dd..53c09bb9d9c 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -23615,8 +23615,18 @@ class ix86_vector_costs : public vector_costs
> >   stmt_vec_info stmt_info, slp_tree node,
> >   tree vectype, int misalign,
> >   vect_cost_model_location where) override;
> > +  void finish_cost (const vector_costs *uncast_scalar_costs);
> >  };
> >  
> > +void
> > +ix86_vector_costs::finish_cost (const vector_costs *uncast_scalar_costs)
> > +{
> > +  auto *scalar_costs
> > += static_cast (uncast_scalar_costs);
> > +  m_suggested_unroll_factor = 4;
> > +  vector_costs::finish_cost (scalar_costs);
> 
> I remember we have posted a patch for that:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604186.html
> 
> One regression observed is that the VF of the epilog loop increases (from
> xmm to ymm) after unrolling the vectorized loops, which regressed
> performance for low-trip-count loops (similar to -mprefer-vector-width=512).

Ah, yeah.  We could resort to checking estimated_number_of_iterations
to guide us with profile feedback.  I'm also (again) working on
fully masked epilogues which should reduce the impact on low-trip
count loops.

> Also, for the case in the PR, I'm trying to enable
> -fvariable-expansion-in-unroller when -funroll-loops is given, and the
> partial sums will break the reduction chain.

Probably also a good idea - maybe -fvariable-expansion-in-unroller can
be made smarter and guided by register pressure?

[Bug tree-optimization/110151] warning: 'strncpy' output truncated copying 10 bytes from a string of length 26 [-Wstringop-truncation]

2023-06-07 Thread yinyuefengyi at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110151

Xionghu Luo (luoxhu at gcc dot gnu.org)  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Xionghu Luo (luoxhu at gcc dot gnu.org)  ---
duplicate.

*** This bug has been marked as a duplicate of bug 107473 ***

[Bug tree-optimization/107473] Unexpected warning / error with strncpy

2023-06-07 Thread yinyuefengyi at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107473

Xionghu Luo (luoxhu at gcc dot gnu.org)  changed:

   What|Removed |Added

 CC||yinyuefengyi at gmail dot com

--- Comment #2 from Xionghu Luo (luoxhu at gcc dot gnu.org)  ---
*** Bug 110151 has been marked as a duplicate of this bug. ***

[Bug middle-end/88781] [meta-bug] bogus/missing -Wstringop-truncation warnings

2023-06-07 Thread yinyuefengyi at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88781
Bug 88781 depends on bug 110151, which changed state.

Bug 110151 Summary: warning: 'strncpy' output truncated copying 10 bytes from a 
string of length 26 [-Wstringop-truncation]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110151

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/110122] [13/14 Regression] using an aggregate with a member variable with a user defined copy constructor in a class NTTP causes capture and use of the `this` pointer in a generic lambda to p

2023-06-07 Thread waffl3x at protonmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110122

--- Comment #5 from waffl3x  ---
(In reply to Patrick Palka from comment #4)
> 
> Yes, it seems the original testcase is exhibiting two bugs (one of which a
> GCC 13 regression), whereas the second testcase exhibits one
> (non-regression) bug.
> 
> In your original testcase Bar's copy constructor shouldn't be needed since
> the template parameter V isn't being copied anywhere, but we somehow end
> up with an illegitimate use of the constructor (bug #1) and then we're also
> failing to synthesize it (bug #2).
> 
> In the second testcase Bar's copy constructor is legitimately needed since
> we're arguably making a copy of V when writing the specialization
> Doppelganger, but we fail to synthesize the constructor (bug #2).
> 

Ah okay, got it. Bug #1 is pretty harmless then, isn't it? There should never
be a situation where the illegitimate use of the constructor is an error on
its own; it only popped up because bug #2 happened.
On the other hand, this makes me wonder what's going on during class template
instantiation. At worst it's probably an inefficiency rather than anything
potentially harmful, though.
> 
> Ah, does it work for you to give Bar an explicitly defaulted copy and
> default ctor?
Yes actually, that does seem to do the trick.
https://godbolt.org/z/x7eYzY6dz

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-06-07 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038

Martin Jambor  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Martin Jambor  ---
Let's mark it as fixed then.

[Bug middle-end/110142] [14 Regression] x264 from SPECCPU 2017 miscompares from g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0

2023-06-07 Thread avieira at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110142

--- Comment #1 from avieira at gcc dot gnu.org ---
Found the issue to be with passing a subtype to vect_recog_widen_op_pattern in
vect_recog_widen_{plus,minus}_pattern where we didn't before. Removing those
and letting it default to a NULL pointer seems to fix the codegen issue.  Will
test patches locally and send in a patch when done.

[Bug target/110152] New: [14 Regression] ICE on 3dnow-1.c since r14-1166

2023-06-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110152

Bug ID: 110152
   Summary: [14 Regression] ICE on 3dnow-1.c since r14-1166
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

Starting with r14-1166-gaffee7dcfa1ee272d43ac7cb I'm seeing
+FAIL: gcc.target/i386/3dnow-1.c (internal compiler error: Segmentation fault
signal terminated program cc1)
+FAIL: gcc.target/i386/3dnow-1.c (test for excess errors)
+FAIL: gcc.target/i386/3dnow-2.c (internal compiler error: Segmentation fault
signal terminated program cc1)
+FAIL: gcc.target/i386/3dnow-2.c (test for excess errors)
+FAIL: gcc.target/i386/mmx-1.c (internal compiler error: Segmentation fault
signal terminated program cc1)
+FAIL: gcc.target/i386/mmx-1.c (test for excess errors)
+FAIL: gcc.target/i386/mmx-2.c (internal compiler error: Segmentation fault
signal terminated program cc1)
+FAIL: gcc.target/i386/mmx-2.c (test for excess errors)
on i686-linux.  When the testcase is compiled with -m32 -m3dnow -mno-sse, I see
an ix86_expand_vector_init_general (true, E_V4HImode, ..., ...) call, which
since the above change recurses into
ix86_expand_vector_init_general (false, E_V2SImode, ..., ...), but as mmx_ok is
false in that case, it will not handle V2SImode using
ix86_expand_vector_init_concat, but recurses endlessly calling itself with
(false, E_V2SImode, ..., ...).

I think we should pass mmx_ok instead of false in the nwords == 2 case.
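
The recursion described above can be modelled with a small toy in C. This is
only a sketch under stated assumptions: the enum, the function name, the
`pass_mmx_ok` switch and the depth guard are all invented for illustration and
do not correspond to the real code in i386-expand.cc.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two machine modes mentioned in the report. */
enum vec_mode { MODE_V4HI, MODE_V2SI };

/* Guard so the toy terminates instead of overflowing the stack. */
#define MAX_DEPTH 16

/* Toy model of the recursion.  Returns the depth at which the vector was
   handled, or -1 if MAX_DEPTH was reached (standing in for the endless
   recursion).  pass_mmx_ok selects between passing `false` (the behaviour
   described in the report) and passing mmx_ok through (the suggestion). */
int expand_init(bool mmx_ok, enum vec_mode mode, bool pass_mmx_ok, int depth)
{
    if (depth >= MAX_DEPTH)
        return -1;
    if (mode == MODE_V2SI && mmx_ok)
        return depth;   /* handled directly, recursion ends */
    /* Two-word case: rebuild as V2SI and recurse. */
    return expand_init(pass_mmx_ok ? mmx_ok : false, MODE_V2SI,
                       pass_mmx_ok, depth + 1);
}
```

In this model, `expand_init (true, MODE_V4HI, false, 0)` never reaches the
base case because the flag is dropped on the first recursive call, while
passing the flag through terminates after one level.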

[Bug middle-end/110142] [14 Regression] x264 from SPECCPU 2017 miscompares from g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0

2023-06-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110142

--- Comment #2 from Tamar Christina  ---
Thank you!

[Bug target/110152] [14 Regression] ICE on 3dnow-1.c since r14-1166

2023-06-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110152

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Priority|P3  |P1

[Bug target/110152] [14 Regression] ICE on 3dnow-1.c since r14-1166

2023-06-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110152

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2023-06-07

--- Comment #1 from Jakub Jelinek  ---
2023-06-07  Jakub Jelinek  

PR target/110152
* config/i386/i386-expand.cc (ix86_expand_vector_init_general): For
n_words == 2 recurse with mmx_ok as first argument rather than false.

--- gcc/config/i386/i386-expand.cc.jj   2023-06-03 15:32:04.489410367 +0200
+++ gcc/config/i386/i386-expand.cc  2023-06-07 10:31:34.715981752 +0200
@@ -16371,7 +16371,7 @@ quarter:
  machine_mode concat_mode = tmp_mode == DImode ? V2DImode : V2SImode;
  rtx tmp = gen_reg_rtx (concat_mode);
  vals = gen_rtx_PARALLEL (concat_mode, gen_rtvec_v (2, words));
- ix86_expand_vector_init_general (false, concat_mode, tmp, vals);
+ ix86_expand_vector_init_general (mmx_ok, concat_mode, tmp, vals);
  emit_move_insn (target, gen_lowpart (mode, tmp));
}
   else if (n_words == 4)

seems to fix it.

[Bug libstdc++/110145] 20_util/to_chars/double.cc fails for -m32 -fexcess-precision=standard

2023-06-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110145

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jsm28 at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
Actually, even simple
double a = 1e126;

int
main ()
{
  __builtin_printf ("%.13a\n", a);
}
behaves differently in C between -m64 and -m32 -fexcess-precision=standard.
-m64 gives
0x1.7a2ecc414a03fp+418
while -m32 -fexcess-precision=standard gives
0x1.7a2ecc414a040p+418
Now, 1e126L is 0xb.d176620a501fc0p+415, which is that
0x1.7a2ecc414a03fp+418
exactly.  So I wonder why we try to round that up.

[Bug c++/110153] New: [modules] Static module mapper format cannot handle header unit paths with spaces

2023-06-07 Thread boris at kolpackov dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110153

Bug ID: 110153
   Summary: [modules] Static module mapper format cannot handle
header unit paths with spaces
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: boris at kolpackov dot net
  Target Milestone: ---

The static module mapper format (-fmodule-mapper=[?], as described
here[1] and implemented in c++tools/resolver.cc) uses spaces to separate the
module names from BMI file names and as a result cannot handle header unit
names that contain spaces. While header names containing spaces are not very
likely, header unit names include the directory components, which could
plausibly contain spaces.

Possible ways to fix this that came to mind:

1. Use the same quoting/escaping mechanism as in the dynamic mapper (see the
libcody documentation[2]).

2. Currently the separator is either a space or a tab. We could change it to be
only tab (assuming that paths with tabs are a lot less likely than with
spaces).


[1] https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Module-Mapper.html
[2] https://github.com/urnathan/libcody#packet-encoding
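
To illustrate the ambiguity and the effect of option 2, here is a minimal C
sketch of a line splitter. It is not the code in c++tools/resolver.cc; the
function name, buffer handling, and the example paths are assumptions made up
for this illustration.

```c
#include <string.h>

/* Split "NAME<sep>FILE" at the first character found in `seps`.
   Copies the name into `name` (capacity `cap`) and returns a pointer to the
   start of the file part, or NULL if there is no separator or the name does
   not fit. */
const char *split_mapping(const char *line, const char *seps,
                          char *name, size_t cap)
{
    size_t len = strcspn(line, seps);
    if (line[len] == '\0' || len >= cap)
        return NULL;
    memcpy(name, line, len);
    name[len] = '\0';
    /* Skip the run of separator characters before the file name. */
    return line + len + strspn(line + len, seps);
}
```

With the separator set " \t", a line such as
"/opt/my project/foo.h<TAB>foo.gcm" is cut at the space inside the directory
name; with "\t" alone the header unit name survives intact.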

[Bug libstdc++/110145] 20_util/to_chars/double.cc fails for -m32 -fexcess-precision=standard

2023-06-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110145

--- Comment #5 from Jakub Jelinek  ---
Now, putting a breakpoint on round_for_format, it seems the constant is actually
0xb.d176620a501fbffb650e5a93bc3d89854bea8f289011b2bp+415 or so before rounding,
aka
0x1.7a2ecc414a03f7ff6ca1cb527787b130a97d51e51202365p+418, and long double on
ia32 has 63 bits of precision, so it is correctly rounded to
0x1.7a2ecc414a03f800p+418L when rounding to long double.
And that rounded to double is 0x1.7a2ecc414a040p+418.
While rounding the original long number immediately to 52 bits of precision
yields 0x1.7a2ecc414a03f0p+418.

double a = 1e126L;
double b = 1e126;

int
main ()
{
  __builtin_printf ("%.13a\n%.13a\n", a, b);
}

confirms this.
So, the issue is that with excess precision, we do double rounding of
constants, first to long double and then to double, while without excess
precision they are rounded just once, immediately to double.
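
The same effect can be sketched at run time with the C library instead of
compile-time constants. This is only an analogy under stated assumptions: the
function names are invented, and whether the two results differ depends on the
target's long double format and on strtod/strtold being correctly rounded.

```c
#include <stdlib.h>

/* Round the decimal string to double in a single step. */
double round_once(const char *s)
{
    return strtod(s, NULL);
}

/* Round to long double first, then narrow to double: two rounding steps. */
double round_twice(const char *s)
{
    volatile long double wide = strtold(s, NULL);
    return (double) wide;
}
```

Per the analysis above, on a target with the x87 extended format one would
expect round_twice ("1e126") to land on 0x1.7a2ecc414a040p+418 while
round_once ("1e126") gives 0x1.7a2ecc414a03fp+418; where long double is wider
or the same as double, both may agree.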

[Bug libstdc++/110145] 20_util/to_chars/double.cc fails for -m32 -fexcess-precision=standard

2023-06-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110145

--- Comment #6 from Jakub Jelinek  ---
So perhaps just wrap that single test with #if FLT_EVAL_METHOD == 0 ||
FLT_EVAL_METHOD == 1 (to make sure double constants are evaluated to double
precision)?
Or use 0x1.7a2ecc414a03fp+418 instead of 1e126?  Or both (wrap the 1e126 test
with the #if and add 0x1.7a2ecc414a03fp+418 after it)?
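
A minimal C sketch of how such a guard could look, assuming a helper that
supplies the test input. The function name is hypothetical and the real test
is C++ in the libstdc++ testsuite, so this only shows the preprocessor logic.

```c
#include <float.h>

/* Hypothetical helper: returns the test input as a double that has been
   rounded only once, whatever the evaluation method is. */
double test_input(void)
{
#if FLT_EVAL_METHOD == 0 || FLT_EVAL_METHOD == 1
    /* Double constants are evaluated to double precision here. */
    return 1e126;
#else
    /* The hex literal is exactly representable as a double, so evaluating
       it in a wider format first cannot change the result. */
    return 0x1.7a2ecc414a03fp+418;
#endif
}
```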

[Bug c/110154] New: When compiling __builtin_frame_address with a relatively large argument, GCC-trunk takes up a significant amount of time.

2023-06-07 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110154

Bug ID: 110154
   Summary: When compiling __builtin_frame_address with a
relatively large argument, GCC-trunk takes up a
significant amount of time.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 141242068 at smail dot nju.edu.cn
  Target Milestone: ---

This behavior can be verified on https://gcc.godbolt.org/z/7967MTMrP

When compiling the program below with gcc:
```
void *h() {
  return __builtin_frame_address(0xF);
}
```


gcc takes over 40 seconds to finish:
```
$ time gcc-14 -c -O2 a.c
gcc-14 -c -O2 a.c  43.33s user 1.04s system 99% cpu 44.381 total
```

In contrast, clang returns immediately:
```
$ clang-17 -c -O2 a.c
:2:10: error: argument value 1048575 is outside the valid range [0,
65535]
2 |   return __builtin_frame_address(0xF);
  |  ^   ~~~
1 error generated.
Compiler returned: 1
```

[Bug c/110154] When compiling __builtin_frame_address with a relatively large argument, GCC-trunk takes up a significant amount of time.

2023-06-07 Thread 141242068 at smail dot nju.edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110154

wierton <141242068 at smail dot nju.edu.cn> changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from wierton <141242068 at smail dot nju.edu.cn> ---
not a bug

[Bug tree-optimization/110087] Missing if conversion

2023-06-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087

--- Comment #7 from Uroš Bizjak  ---
Similar conversion, not performed by gcc:

--cut here--
#include 

_Bool foo (void);

int bar (int r)
{
  if (foo ())
    r++;

  return r;
}
--cut here--

gcc -O2:

        movl    %edi, %ebx
        call    foo
        cmpb    $1, %al
        sbbl    $-1, %ebx
        movl    %ebx, %eax

could be:

        movl    %edi, %ebx
        callq   foo
        movzbl  %al, %eax
        addl    %ebx, %eax
[Bug tree-optimization/110087] Missing if conversion

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087

--- Comment #8 from Andrew Pinski  ---
(In reply to Uroš Bizjak from comment #7)
> Similar conversion, not performed by gcc:
> 
> --cut here--
> #include 
> 
> _Bool foo (void);
> 
> int bar (int r)
> {
>   if (foo ())
>     r++;
> 
>   return r;
> }
> --cut here--
> 
> gcc -O2:
> 
>         movl    %edi, %ebx
>         call    foo
>         cmpb    $1, %al
>         sbbl    $-1, %ebx
>         movl    %ebx, %eax
> 
> could be:
> 
>         movl    %edi, %ebx
>         callq   foo
>         movzbl  %al, %eax
>         addl    %ebx, %eax

Please file this separately, since it is a different issue. Though I think
there are dups of that one too.

[Bug libgcc/109712] [13/14 Regression] Segmentation fault in linear_search_fdes

2023-06-07 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #28 from Carlos Galvez  ---
The proposed patch fixes the issue on our side, thank you!

I realize my comment about doesn't make sense - I was mixing unions in C (where
type punning is fine) and C++ (UB). But then I don't understand why valgrind
would point at that variable as uninitialized...

[Bug tree-optimization/110155] New: Missing if conversion

2023-06-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110155

Bug ID: 110155
   Summary: Missing if conversion
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ubizjak at gmail dot com
  Target Milestone: ---

Following testcase:

--cut here--
#include 

_Bool foo (void);

int bar (int r)
{
  if (foo ())
    r++;

  return r;
}
--cut here--

compiles (gcc -O2) to:

        movl    %edi, %ebx
        call    foo
        cmpb    $1, %al
        sbbl    $-1, %ebx
        movl    %ebx, %eax

could be performed without compare as:

        movl    %edi, %ebx
        callq   foo
        movzbl  %al, %eax
        addl    %ebx, %eax
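
At source level, the requested conversion amounts to the following C sketch.
The function names are invented for this illustration, and the flag is passed
as a parameter here instead of coming from a call to foo ().

```c
#include <stdbool.h>

/* Branchy form, as written in the test case. */
int add_if(int r, bool flag)
{
    if (flag)
        r++;
    return r;
}

/* Branchless form: a bool converts to exactly 0 or 1, so it can be added
   directly, which is what the movzbl/addl sequence does. */
int add_flag(int r, bool flag)
{
    return r + (int) flag;
}
```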

[Bug tree-optimization/110087] Missing if conversion

2023-06-07 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110087

--- Comment #9 from Uroš Bizjak  ---
(In reply to Andrew Pinski from comment #8)
> Please file this separately, since it is a different issue.
PR110155.

[Bug middle-end/108410] x264 averaging loop not optimized well for avx512

2023-06-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410

--- Comment #4 from Richard Biener  ---
Adding fully masked AVX512 and AVX512 with a masked epilog data:

size   scalar     128     256     512    512e    512f
   1     9.42   11.32    9.35   11.17   15.13   16.89
   2     5.72    6.53    6.66    6.66    7.62    8.56
   3     4.49    5.10    5.10    5.74    5.08    5.73
   4     4.10    4.33    4.29    5.21    3.79    4.25
   6     3.78    3.85    3.86    4.76    2.54    2.85
   8     3.64    1.89    3.76    4.50    1.92    2.16
  12     3.56    2.21    3.75    4.26    1.26    1.42
  16     3.36    0.83    1.06    4.16    0.95    1.07
  20     3.39    1.42    1.33    4.07    0.75    0.85
  24     3.23    0.66    1.72    4.22    0.62    0.70
  28     3.18    1.09    2.04    4.20    0.54    0.61
  32     3.16    0.47    0.41    0.41    0.47    0.53
  34     3.16    0.67    0.61    0.56    0.44    0.50
  38     3.19    0.95    0.95    0.82    0.40    0.45
  42     3.09    0.58    1.21    1.13    0.36    0.40

text sizes are not much different:

         1389    1837    2125    1629    1721    1689

the AVX2 size is large because we completely peel the scalar epilogue,
same for the SSE case.  The scalar epilogue of the 512 loop iterates
32 times (too many for peeling), the masked loop/epilogue are quite
large due to the EVEX encoded instructions so the saved scalar/vector
epilogues do not show.

The AVX512 masked epilogue case now looks like:

        .p2align 3
.L5:
        vmovdqu8        (%r8,%rax), %zmm0
        vpavgb  (%rsi,%rax), %zmm0, %zmm0
        vmovdqu8        %zmm0, (%rdi,%rax)
        addq    $64, %rax
        cmpq    %rcx, %rax
        jne     .L5
        movl    %edx, %ecx
        andl    $-64, %ecx
        testb   $63, %dl
        je      .L19
.L4:
        movl    %ecx, %eax
        subl    %ecx, %edx
        movl    $255, %ecx
        cmpl    %ecx, %edx
        cmova   %ecx, %edx
        vpbroadcastb    %edx, %zmm0
        vpcmpub $6, .LC0(%rip), %zmm0, %k1
        vmovdqu8        (%rsi,%rax), %zmm0{%k1}{z}
        vmovdqu8        (%r8,%rax), %zmm1{%k1}{z}
        vpavgb  %zmm1, %zmm0, %zmm0
        vmovdqu8        %zmm0, (%rdi,%rax){%k1}
.L19:
        vzeroupper
        ret

where there's a missed optimization around the saturation to 255.

The fully masked AVX512 loop is

        vmovdqa64       .LC0(%rip), %zmm3
        movl    $255, %eax
        cmpl    %eax, %ecx
        cmovbe  %ecx, %eax
        vpbroadcastb    %eax, %zmm0
        vpcmpub $6, %zmm3, %zmm0, %k1
        .p2align 4
        .p2align 3
.L4:
        vmovdqu8        (%rsi,%rax), %zmm1{%k1}
        vmovdqu8        (%r8,%rax), %zmm2{%k1}
        movl    %r10d, %edx
        movl    $255, %ecx
        subl    %eax, %edx
        cmpl    %ecx, %edx
        cmova   %ecx, %edx
        vpavgb  %zmm2, %zmm1, %zmm0
        vmovdqu8        %zmm0, (%rdi,%rax){%k1}
        vpbroadcastb    %edx, %zmm0
        addq    $64, %rax
        movl    %r9d, %edx
        subl    %eax, %edx
        vpcmpub $6, %zmm3, %zmm0, %k1
        cmpl    $64, %edx
        ja      .L4
        vzeroupper
        ret

which is a much larger loop body due to the mask creation.  At least
that interleaves nicely (dependence wise) with the loop control and
vectorized stmts.  What needs to be optimized somehow is what IVOPTs
makes out of the decreasing remaining scalar iters IV with the 
IV required for the memory accesses.  Without IVOPTs the body looks
like

.L4:
        vmovdqu8        (%rsi), %zmm1{%k1}
        vmovdqu8        (%rdx), %zmm2{%k1}
        movl    $255, %eax
        movl    %ecx, %r8d
        subl    $64, %ecx
        addq    $64, %rsi
        addq    $64, %rdx
        vpavgb  %zmm2, %zmm1, %zmm0
        vmovdqu8        %zmm0, (%rdi){%k1}
        addq    $64, %rdi
        cmpl    %eax, %ecx
        cmovbe  %ecx, %eax
        vpbroadcastb    %eax, %zmm0
        vpcmpub $6, %zmm3, %zmm0, %k1
        cmpl    $64, %r8d
        ja      .L4

and the key thing to optimize is

  ivtmp_78 = ivtmp_77 + 4294967232; // -64
  _79 = MIN_EXPR ;
  _80 = (unsigned char) _79;
  _81 = {_80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
_80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
_80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
_80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80, _80,
_80, _80};

that is, we want to broadcast a saturated (to vector element precision) value.
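
For readers who want the mask computation spelled out, here is a scalar C
model of the fully masked loop. It is a sketch under assumptions: the function
names are invented, and it assumes the .LC0 constant holds the lane indices
0..63, so that the compare with predicate 6 (greater-than) enables exactly the
lanes below the saturated remaining count.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 64   /* bytes per 512-bit vector */

/* Rounding average of two bytes, the per-lane operation of vpavgb. */
static uint8_t avg_u8(uint8_t a, uint8_t b)
{
    return (uint8_t) ((a + b + 1) >> 1);
}

/* Scalar model of the fully masked loop: each outer iteration stands for
   one 64-byte vector iteration, and only active lanes touch memory. */
void avg_masked(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i += LANES) {
        size_t remaining = n - i;
        /* Saturate the remaining count to 255 so it fits in one byte
           (the MIN_EXPR followed by the narrowing conversion). */
        uint8_t sat = (uint8_t) (remaining < 255 ? remaining : 255);
        for (unsigned lane = 0; lane < LANES; lane++) {
            /* Lane is active iff broadcast(sat) > lane index. */
            if (sat > lane)
                dst[i + lane] = avg_u8(a[i + lane], b[i + lane]);
        }
    }
}
```

Saturating to 255 is safe because any count of 64 or more enables all lanes
anyway; the value only has to be exact in the range 0..63.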

[Bug libgcc/109712] [13/14 Regression] Segmentation fault in linear_search_fdes

2023-06-07 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #29 from Carlos Galvez  ---
*my comment about uninitialized "ob.s.b.encoding".

[Bug tree-optimization/110155] Missing if conversion

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110155

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
   Keywords||missed-optimization
   Last reconfirmed||2023-06-07
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
(simplify
 (cond (ne onezerovaluep@1 integer_zerop)
  (plus @2 integer_onep) @2)
 (plus @2 @1)
)

Plus could be ior or xor. That is, there is already a pattern which could be
extended to plus.
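
The ior and xor variants of the same identity can be written down in C as
follows. This is a sketch with invented function names; it assumes x is
always 0 or 1, which is the precondition the pattern above relies on.

```c
/* Conditional forms: x is assumed to be 0 or 1. */
unsigned cond_ior(unsigned x, unsigned y) { return x != 0 ? (y | 1u) : y; }
unsigned cond_xor(unsigned x, unsigned y) { return x != 0 ? (y ^ 1u) : y; }

/* Simplified forms: the zero-or-one value is used directly as the operand. */
unsigned flat_ior(unsigned x, unsigned y) { return y | x; }
unsigned flat_xor(unsigned x, unsigned y) { return y ^ x; }
```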

[Bug modula2/110126] Variables are reported as unused when only referenced by ASM statements

2023-06-07 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110126

Gaius Mulley  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-06-07
 Ever confirmed|0   |1

--- Comment #3 from Gaius Mulley  ---
Ah yes indeed, so there are two issues:

   (1)  gm2 -Wall -c foo.mod generates an incorrect warning.
   (2)  gm2 cannot concatenate strings before an ASM statement.

will fix - thanks for the report(s)

[Bug libgomp/110156] New: libgomp leaking when executed in a thread

2023-06-07 Thread christophe.beauregard at ec dot gc.ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110156

Bug ID: 110156
   Summary: libgomp leaking when executed in a thread
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: christophe.beauregard at ec dot gc.ca
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55277
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55277&action=edit
Simple test program

We discovered a situation where the bullseye version of GraphicsMagick suddenly
caused an application to explode in VSS memory use (capping at about 20G) and a
much slower RSS leak. The VSS memory consumption is the memory mapped thread
arenas that apparently aren't being reused. The RSS leak will eventually result
in an OOM situation when the process is run for long enough.

It's possible to force the thread arenas to free up by explicitly calling
omp_pause_resource_all() at thread termination, but I'm not certain that's a
100% fix, and IMHO it's far more than an application *should* be aware of about
something buried inside a library.
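
A minimal sketch of that workaround in C, assuming a pthread-style thread
entry point. The function name and the summation loop are placeholders, and
the choice of omp_pause_soft over omp_pause_hard is an assumption; the
attached test program may do this differently.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Thread entry point with a pthread-compatible signature.  `arg` points
   to a long that receives the result of the parallel loop. */
void *worker(void *arg)
{
    long sum = 0;
#pragma omp parallel for reduction(+ : sum)
    for (int i = 1; i <= 1000; i++)
        sum += i;
    *(long *) arg = sum;

#ifdef _OPENMP
    /* Workaround from the report: ask the OpenMP runtime to release its
       resources before this thread terminates.  The return value (zero on
       success) is ignored in this sketch. */
    (void) omp_pause_resource_all(omp_pause_soft);
#endif
    return NULL;
}
```

The function can be handed to pthread_create as is; when built without
-fopenmp the pragma and the pause call drop out and the loop runs serially.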

The issue seems to be reliably reproducible on a system configuration with
multiple sockets and multiple cores, including VMs (even when run on
single-socket hosts).

I get the same results against gcc-9, 10, and a two-day-old git clone. The
following is from a 2-socket-4-core Haswell VM running Debian bullseye:

cpb@bullseye64:~$ gcc --version
gcc (GCC) 14.0.0 20230606 (experimental)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

cpb@bullseye64:~$ gcc -o gomparena gomparena.c -fopenmp
cpb@bullseye64:~$ ./gomparena   
1686141812
   VSZ   RSS USER COMMAND
297820  2128 cpb  ./gomparena
4
1686141813
   VSZ   RSS USER COMMAND
494428  2152 cpb  ./gomparena
7
1686141814
   VSZ   RSS USER COMMAND
691036  2316 cpb  ./gomparena
10
1686141815
   VSZ   RSS USER COMMAND
887644  2328 cpb  ./gomparena
13
1686141816
   VSZ   RSS USER COMMAND
1149788 2340 cpb  ./gomparena
17
1686141817
   VSZ   RSS USER COMMAND
1346396 2356 cpb  ./gomparena
20
1686141819
   VSZ   RSS USER COMMAND
1543004 2372 cpb  ./gomparena
23
1686141820
   VSZ   RSS USER COMMAND
1739612 2388 cpb  ./gomparena
26
1686141821
   VSZ   RSS USER COMMAND
1936220 2404 cpb  ./gomparena
29
1686141822
   VSZ   RSS USER COMMAND
2132828 2420 cpb  ./gomparena
32

On a two-socket-20-core machine, ten iterations hit the 20G/300-ish thread
arena max.

[Bug sanitizer/110157] New: Address sanitizer crashes when accessing variables through procedure callback

2023-06-07 Thread bardeau at iram dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110157

Bug ID: 110157
   Summary: Address sanitizer crashes when accessing variables
through procedure callback
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bardeau at iram dot fr
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

Created attachment 55278
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55278&action=edit
Main program, library, and Makefile

Hi,

in the gfortran 13.* branch, the attached code crashes as follows. This is a
regression compared to gfortran 12.* releases (and below) which run correctly.

The sample code is simple but needs to be compiled in a library separated from
the main executable (no crash if not). I attach the Makefile which compiles and
links all the parts.

The -fsanitize=address option has to be present (hence my report to the
sanitizer).

In short, the 'gfits_setsort' procedure calls 'quicksort' with a local
(contained) procedure passed as argument. The callback of this procedure by
'quicksort' results in the crash. The main point is that the 'key' variable is
accessed in the called back procedure. In this example, 'key' is a dummy
variable received by 'gfits_setsort', but the same issue is also true if 'key'
is a variable local to 'gfits_setsort'.

$ gfortran --version
GNU Fortran (GCC) 13.1.1 20230606
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ make clean && make
rm -f *.o *.so test
gfortran -fimplicit-none -fsanitize=address -fPIC -c header.f90 -o header.o
gfortran -shared header.o -o libgfits.so
gfortran -fimplicit-none -fsanitize=address -fPIC -c test.f90 -o test.o
gfortran -fsanitize=address test.o -L./ -lgfits -o test

$ export LD_LIBRARY_PATH=".:$LD_LIBRARY_PATH"

$ ./test
 >>> Calling ugt

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f821433a3ff in ???
#1  0x7f8211700038 in ???
Segmentation fault (core dumped)

[Bug rtl-optimization/68274] __builtin_unreachable pessimizes code

2023-06-07 Thread matt at godbolt dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68274

--- Comment #5 from Matt Godbolt  ---
Amazing: thank you Andrew!

[Bug tree-optimization/110155] Missing if conversion

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110155

--- Comment #2 from Andrew Pinski  ---
Created attachment 55279
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55279&action=edit
Simple patch which I am testing

Note that the patterns I am modifying cause a lot of "garbage" statements to be
generated.
They explicitly produce `(-((typeof(y))t) & z)` where t is zero_one_valued_p, but
that is already going to be transformed into `((typeof(y))t) * z` by another
pattern (around line 2058 in match.pd). I will improve that too.
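
As an illustration of why the two forms above are interchangeable, here is a small C++17 sketch. It is not taken from the patch or from match.pd; the function names are made up, and `t` is assumed to be restricted to 0 or 1, which is the property `zero_one_valued_p` describes.

```cpp
#include <cassert>

// Hypothetical helpers, not GCC code. `t` plays the role of a
// zero_one_valued_p operand: it can only be 0 or 1.
constexpr int select_with_mask(bool t, int z) {
    // -(int)t is all-ones when t == 1 and zero when t == 0,
    // so the AND either keeps z or clears it.
    return -static_cast<int>(t) & z;
}

constexpr int select_with_mult(bool t, int z) {
    // Multiplying by 0 or 1 gives the same selection.
    return static_cast<int>(t) * z;
}

// True when both forms produce the same value for the given inputs.
constexpr bool same_result(bool t, int z) {
    return select_with_mask(t, z) == select_with_mult(t, z);
}

static_assert(select_with_mask(true, 42) == 42);
static_assert(select_with_mask(false, 42) == 0);
static_assert(same_result(true, -7) && same_result(false, -7));
```

Since both forms compute the same value, emitting the mask form only for a later pattern to rewrite it as a multiply is the kind of intermediate "garbage" the comment refers to.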

[Bug libstdc++/110145] 20_util/to_chars/double.cc fails for -m32 -fexcess-precision=standard

2023-06-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110145

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-06-07
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
Created attachment 55280
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55280&action=edit
gcc14-pr110145.patch

Untested fix.

[Bug c++/109655] Prior friend declaration causes "confused by earlier errors, bailing out" (with no error message) with missing constraint on out-of-class class template member definition

2023-06-07 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109655

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 CC||ppalka at gcc dot gnu.org

[Bug target/109800] [11/12/13/14 Regression] arm: ICE (segfault) loading double with -mpure-code -mbig-endian

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109800

--- Comment #3 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Alex Coplan:

https://gcc.gnu.org/g:98682182e394a0ebc96ba74d7958912ab328dee8

commit r13-7425-g98682182e394a0ebc96ba74d7958912ab328dee8
Author: Alex Coplan 
Date:   Thu May 25 13:34:46 2023 +0100

arm: Fix ICE due to infinite splitting [PR109800]

In r11-966-g9a182ef9ee011935d827ab5c6c9a7cd8e22257d8 we introduce a
simplification to emit_move_insn that attempts to simplify moves of the
form:

(set (subreg:M1 (reg:M2 ...)) (constant C))

where M1 and M2 are of equal mode size. That is problematic for the splitter
vfp.md:no_literal_pool_df_immediate in the arm backend, which tries to pun an
lvalue DFmode pseudo into DImode and assign a constant to it with
emit_move_insn, as the new transformation simply undoes this, and we end up
splitting indefinitely.

This patch changes things around in the arm backend so that we use a
DImode temporary (instead of DFmode) and first load the DImode constant
into the pseudo, and then pun the pseudo into DFmode as an rvalue in a
reg -> reg move. I believe this should be semantically equivalent but
avoids the pathological behaviour seen in the PR.

gcc/ChangeLog:

PR target/109800
* config/arm/arm.md (movdf): Generate temporary pseudo in DImode
instead of DFmode.
* config/arm/vfp.md (no_literal_pool_df_immediate): Rather than punning an
lvalue DFmode pseudo into DImode, use a DImode pseudo and pun it into
DFmode as an rvalue.

gcc/testsuite/ChangeLog:

PR target/109800
* gcc.target/arm/pure-code/pr109800.c: New test.

(cherry picked from commit f5298d9969b4fa34ff3aecd54b9630e22b2984a5)

[Bug tree-optimization/110062] missed vectorization in graphicsmagick

2023-06-07 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110062

--- Comment #5 from Jan Hubicka  ---
In sharpening, the number of iterations depends on the sharpen radius. Not sure
what it is for the benchmark, but in normal situations the number of iterations
is indeed not very large.

However, clang simply SLP-vectorizes the red and green channels into a vector of
size 2.

[Bug tree-optimization/110155] Missing if conversion

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110155

--- Comment #3 from Andrew Pinski  ---
Note mult is incorrect in the patch. Also note minus will not work either as
there is a :c there.

[Bug tree-optimization/97711] Failure to optimise "x & 1 ? x - 1 : x" to "x & -2"

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97711

Andrew Pinski  changed:

   What|Removed |Added

 Depends on|103216  |110155

--- Comment #7 from Andrew Pinski  ---
I have a better patch for this one, PR 110155.
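
For readers unfamiliar with the transformation in this bug's title, the following C++17 sketch checks that the two expressions agree. The function names are hypothetical, and the check assumes two's complement `int`, which holds on all GCC targets.

```cpp
#include <cassert>

// The conditional form from the bug title: subtract 1 only when x is odd.
constexpr int clear_low_bit_branchy(int x) {
    return (x & 1) ? x - 1 : x;
}

// The masked form: -2 has every bit set except bit 0.
constexpr int clear_low_bit_masked(int x) {
    return x & -2;
}

// Compare both forms over a closed range [lo, hi]; hi must be below INT_MAX.
constexpr bool forms_agree(int lo, int hi) {
    for (int x = lo; x <= hi; ++x)
        if (clear_low_bit_branchy(x) != clear_low_bit_masked(x))
            return false;
    return true;
}

static_assert(clear_low_bit_branchy(7) == 6);
static_assert(clear_low_bit_masked(7) == 6);
static_assert(clear_low_bit_masked(-3) == -4);
static_assert(forms_agree(-1000, 1000));
```

The subtraction cannot overflow, because `x - 1` is only evaluated for odd `x` and the minimum `int` is even.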


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103216
[Bug 103216] missed optimization, phiopt/vrp?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110155
[Bug 110155] Missing if conversion

[Bug tree-optimization/103216] missed optimization, phiopt/vrp?

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103216

Andrew Pinski  changed:

   What|Removed |Added

URL|https://gcc.gnu.org/piperma |
   |il/gcc-patches/2021-Novembe |
   |r/584411.html   |
   Keywords|patch   |

--- Comment #9 from Andrew Pinski  ---
So to fix this in a better way I propose to extend the:
`/* (zero_one == 0) ? y : z <op> y -> ((typeof(y))zero_one * z) <op> y */`
patterns to handle not just zero_one but any value with popcount(nz) == 1.
Also treat `signed < 0` as `(signed>>signbit)&1`.

I will do that over the weekend.
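
A rough C++17 sketch of the `signed < 0` idea follows. It is an illustration only, not the proposed match.pd change: the helper names are hypothetical, `+` stands in for the operator in the pattern, and the shift is done on an unsigned copy so the example does not depend on how signed right shift is defined.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: 1 if x is negative, 0 otherwise, computed from the
// sign bit instead of a comparison.
constexpr unsigned sign_bit(int x) {
    return (static_cast<unsigned>(x) >> (sizeof(int) * CHAR_BIT - 1)) & 1u;
}

// Select form: (cond == 0) ? y : z + y, with cond being "x is negative".
constexpr int add_if_negative_branchy(int x, int y, int z) {
    return (x < 0) ? z + y : y;
}

// Multiply form: the sign bit acts as the zero_one value.
constexpr int add_if_negative_mult(int x, int y, int z) {
    return static_cast<int>(sign_bit(x)) * z + y;
}

static_assert(sign_bit(-1) == 1u && sign_bit(0) == 0u && sign_bit(INT_MIN) == 1u);
static_assert(add_if_negative_branchy(-5, 10, 3) == 13);
static_assert(add_if_negative_mult(-5, 10, 3) == 13);
static_assert(add_if_negative_mult(5, 10, 3) == add_if_negative_branchy(5, 10, 3));
```

Once the comparison is expressed as a 0-or-1 value, the existing zero_one patterns can apply to it without a separate case for sign tests.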

[Bug middle-end/78115] Missed optimization for "int modulo 2^31"

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78115

--- Comment #6 from Andrew Pinski  ---
What this needs above and beyond PR 103216 is supporting `x ? a - b : a`.

[Bug middle-end/54571] Missed optimization converting between bit sets

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54571

--- Comment #5 from Andrew Pinski  ---
The `popcount(nz) == 1` comment part of PR 103216 will fix this issue.

[Bug libstdc++/110158] New: Cannot use union with std::string inside in constant expression

2023-06-07 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110158

Bug ID: 110158
   Summary: Cannot use union with std::string inside in constant
expression
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fchelnokov at gmail dot com
  Target Milestone: ---

This program

constexpr bool f() {
union U{
std::string s;
constexpr ~U(){ s.~basic_string(); }
} u{};
return true;
}

static_assert( f() );

is accepted in MSVC and Clang with libc++. But in GCC it produces the error:

error: non-constant condition for static assertion
   in 'constexpr' expansion of 'f()'
   in 'constexpr' expansion of '(& u)->f()::U::~U()'
   in 'constexpr' expansion of
'((f()::U*)this)->f()::U::s.std::__cxx11::basic_string::~basic_string()'
/opt/compiler-explorer/gcc-13.1.0/include/c++/13.1.0/bits/basic_string.h:803:19:
  in 'constexpr' expansion of
'((std::__cxx11::basic_string*)this)->std::__cxx11::basic_string::_M_dispose()'
error: accessing 'std::__cxx11::basic_string_M_allocated_capacity' member instead of initialized
'std::__cxx11::basic_string_M_local_buf' member in
constant expression

Online demo: https://gcc.godbolt.org/z/bbf4Yo3v9

[Bug target/109541] [12/13/14 regression] ICE in extract_constrain_insn on when building rhash-1.4.3

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109541

--- Comment #18 from CVS Commits  ---
The master branch has been updated by Vladimir Makarov :

https://gcc.gnu.org/g:8cc8707446b77f9413654b31704f5a639673c916

commit r14-1610-g8cc8707446b77f9413654b31704f5a639673c916
Author: Vladimir N. Makarov 
Date:   Wed Jun 7 09:51:54 2023 -0400

RA: Constrain class of pic offset table pseudo to general regs

On some targets an integer pseudo can be assigned to a FP reg.  For
pic offset table pseudo it means we will reload the pseudo in this
case and, as a consequence, memory containing the pseudo might be
recognized as a wrong one.  The patch fixes this problem.

PR target/109541

gcc/ChangeLog:

* ira-costs.cc: (find_costs_and_classes): Constrain classes of pic
offset table pseudo to a general reg subset.

gcc/testsuite/ChangeLog:

* gcc.target/sparc/pr109541.c: New.

[Bug target/109541] [12/13/14 regression] ICE in extract_constrain_insn on when building rhash-1.4.3

2023-06-07 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109541

--- Comment #19 from Eric Botcazou  ---
Thanks Vladimir!  Would you be OK with a backport to the 13 branch?

[Bug tree-optimization/97711] Failure to optimise "x & 1 ? x - 1 : x" to "x & -2"

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97711

--- Comment #8 from Andrew Pinski  ---
Well the patch for PR 110155 will fix f but not g. I will add the POINTER_PLUS
pattern this weekend.

[Bug libgcc/109712] [13/14 Regression] Segmentation fault in linear_search_fdes

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #30 from CVS Commits  ---
The master branch has been updated by Florian Weimer :

https://gcc.gnu.org/g:49310a993308492348119f4033e4db0bda4fe46a

commit r14-1614-g49310a993308492348119f4033e4db0bda4fe46a
Author: Florian Weimer 
Date:   Tue Jun 6 11:01:07 2023 +0200

libgcc: Fix eh_frame fast path in find_fde_tail

The eh_frame value is only used by linear_search_fdes, not the binary
search directly in find_fde_tail, so the bug is not immediately
apparent with most programs.

Fixes commit e724b0480bfa5ec04f39be8c7290330b495c59de ("libgcc:
Special-case BFD ld unwind table encodings in find_fde_tail").

libgcc/

PR libgcc/109712
* unwind-dw2-fde-dip.c (find_fde_tail): Correct fast path for
parsing eh_frame.

[Bug libgcc/109712] [13/14 Regression] Segmentation fault in linear_search_fdes

2023-06-07 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712

--- Comment #31 from Florian Weimer  ---
Will propose a backport to 13 in ~2 weeks.

[Bug d/110113] gdc -fpreview=dip1021 crash in d/dmd/root/aav.d:127 dmd_aaGetRvalue from DsymbolTable::lookup(Identifier const*)

2023-06-07 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110113

--- Comment #6 from ibuclaw at gcc dot gnu.org ---
Full reduction without any imports.

---
class LUBench { }
void lup(ulong , ulong , int , int = 1)
{
new LUBench;
}
void lup_3200(ulong iters, ulong flops)
{
lup(iters, flops, 3200);
}
float raytrace()
{
struct V
{
float x, y, z;
auto normalize() { }
struct Tid { }
Tid spawnLinked() { }
string[] namesByTid;
class MessageBox { }
auto cross() { }
}
}

[Bug d/110113] gdc -fpreview=dip1021 crash in d/dmd/root/aav.d:127 dmd_aaGetRvalue from DsymbolTable::lookup(Identifier const*)

2023-06-07 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110113

--- Comment #7 from ibuclaw at gcc dot gnu.org ---
Same, but without any compiler errors.

This is reproducible in upstream dmd too.

dmd -lowmem -preview=dip1021 pr110113.d -o-

---
class LUBench { }
void lup(ulong , ulong , int , int = 1)
{
new LUBench;
}
void lup_3200(ulong iters, ulong flops)
{
lup(iters, flops, 3200);
}
void raytrace()
{
struct V
{
float x, y, z;
auto normalize() { }
struct Tid { }
auto spawnLinked() { }
string[] namesByTid;
class MessageBox { }
auto cross() { }
}
}

[Bug target/106907] gcc/config/rs6000/rs6000.cc:23155: strange expression ?

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106907

--- Comment #6 from CVS Commits  ---
The releases/gcc-13 branch has been updated by jeevitha :

https://gcc.gnu.org/g:dda4745eb1c9b063c6004baef54aa4cec97edf3d

commit r13-7426-gdda4745eb1c9b063c6004baef54aa4cec97edf3d
Author: Jeevitha Palanisamy 
Date:   Tue Jun 6 06:19:02 2023 -0500

rs6000: Remove duplicate expression [PR106907]

PR106907 lists a few warnings reported by cppcheck. This patch addresses the
duplicate expression issue: the same expression was used twice in a logical
AND (&&) operation, which gives the same result as using it once, so the
duplicate is removed.

2023-06-06  Jeevitha Palanisamy  

gcc/
PR target/106907
* config/rs6000/rs6000.cc (vec_const_128bit_to_bytes): Remove
duplicate expression.

(cherry picked from commit c4deccd44655c5d748dfed200a37f2b678c32fe8)

[Bug c++/110159] New: ICEs for C++ Contracts test cases with '-fno-exceptions'

2023-06-07 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110159

Bug ID: 110159
   Summary: ICEs for C++ Contracts test cases with
'-fno-exceptions'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: tschwinge at gcc dot gnu.org
CC: jason at gcc dot gnu.org
  Target Milestone: ---

Given native x86_64-pc-linux-gnu build of one-week-old commit
2720bbd597f56742a17119dfe80edc2ba86af255, running 'check-gcc-c++' with
'-fno-exceptions':

$ make check-gcc-c++ RUNTESTFLAGS='--target_board=unix/-fno-exceptions
dg.exp=contracts\*'

..., I see ICEs for C++ Contracts test cases, see below.

[...]/g++.dg/contracts/contracts-assume6.C: In function 'void fun(int)':
[...]/g++.dg/contracts/contracts-assume6.C:13:1: internal compiler error:
Segmentation fault
0x16ffb4f crash_signal
[...]/gcc/toplev.cc:314
0xe41fc8 contains_struct_check(tree_node*, tree_node_structure_enum, char
const*, int, char const*)
[...]/gcc/tree.h:3656
0xe41fc8 build_addr_func(tree_node*, int)
[...]/gcc/cp/call.cc:278
0xe4220d build_call_a(tree_node*, int, tree_node**)
[...]/gcc/cp/call.cc:366
0xed0dff build_contract_check(tree_node*)
[...]/gcc/cp/contracts.cc:1814
0xebdec7 cp_genericize_r
[...]/gcc/cp/cp-gimplify.cc:1500
0x19ed0e4 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*,
void*), void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
[...]/gcc/tree.cc:11341
0x1140b3f c_genericize_control_stmt(tree_node**, int*, void*, tree_node*
(*)(tree_node**, int*, void*), tree_node* (*)(tree_node**, int*, tree_node*
(*)(tree_node**, int*, void*), void*, hash_set >*))
[...]/gcc/c-family/c-gimplify.cc:534
0xebdbf1 cp_genericize_r
[...]/gcc/cp/cp-gimplify.cc:1861
0x19ed0e4 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*,
void*), void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
[...]/gcc/tree.cc:11341
0xebd1b9 cp_genericize_tree
[...]/gcc/cp/cp-gimplify.cc:1894
0xebd536 cp_genericize(tree_node*)
[...]/gcc/cp/cp-gimplify.cc:2036
0xf0a5ff finish_function(bool)
[...]/gcc/cp/decl.cc:18312
0xed2729 finish_function_contracts(tree_node*)
[...]/gcc/cp/contracts.cc:2050
0xf09895 finish_function(bool)
[...]/gcc/cp/decl.cc:18354
0x1008677 cp_parser_function_definition_after_declarator
[...]/gcc/cp/parser.cc:32057
0x1009b53 cp_parser_function_definition_from_specifiers_and_declarator
[...]/gcc/cp/parser.cc:31971
0x1009b53 cp_parser_init_declarator
[...]/gcc/cp/parser.cc:22822
0xfde638 cp_parser_simple_declaration
[...]/gcc/cp/parser.cc:15435
0x10143db cp_parser_declaration
[...]/gcc/cp/parser.cc:15121

'gcc/cp/call.cc':

275 tree
276 build_addr_func (tree function, tsubst_flags_t complain)
277 {
278   tree type = TREE_TYPE (function);

358 tree
359 build_call_a (tree function, int n, tree *argarray)
360 {
361   tree decl;
362   tree result_type;
363   tree fntype;
364   int i;
365 
366   function = build_addr_func (function, tf_warning_or_error);

'gcc/cp/contracts.cc':

   1775 tree
   1776 build_contract_check (tree contract)
   1777 {
   [...]
   1814 finish_expr_stmt (build_call_a (terminate_fn, 0, nullptr));

So I suppose 'terminate_fn' isn't initialized here, which is normally done by
'init_exception_processing', which isn't called for '-fno-exceptions'.  I'm
happy to have a try at addressing this, but will need guidance at which level
additional 'if (flag_exceptions)' or similar is necessary.

This code exists as of the initial commit
r13-4160-g2efb237ffc68ec9bb17982434f5941bfa14f8b50 "c++: implement P1492
contracts".

[-PASS:-]{+FAIL:+} g++.dg/contracts/contracts-assume6.C(test for
errors, line 29)
[-PASS:-]{+FAIL:+} g++.dg/contracts/contracts-assume6.C(test for
errors, line 42)
[-PASS:-]{+FAIL:+} g++.dg/contracts/contracts-assume6.C(test for
errors, line 49)
[-PASS:-]{+FAIL:+} g++.dg/contracts/contracts-assume6.C(test for
warnings, line 24)
[-PASS:-]{+FAIL: g++.dg/contracts/contracts-assume6.C   (internal compiler
error: Segmentation fault)+}
{+FAIL:+} g++.dg/contracts/contracts-assume6.C   (test for excess errors)

UNSUPPORTED: g++.dg/contracts/contracts-comdat1.C  -std=c++14
UNSUPPORTED: g++.dg/contracts/contracts-comdat1.C  -std=c++17
[-PASS:-]{+UNRESOLVED:+} g++.dg/contracts/contracts-comdat1.C  -std=c++20 
scan-assembler-not (weak|globl)[^\\n]*_Z1fi.pre
[-PASS:-

[Bug target/110100] __builtin_aarch64_st64b stores to the wrong address

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110100

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:713613541254039a34e1dd8fd4a613a299af1fd6

commit r14-1615-g713613541254039a34e1dd8fd4a613a299af1fd6
Author: Alex Coplan 
Date:   Tue Jun 6 11:04:45 2023 +0100

aarch64: Fix whitespace in ls64 builtin implementation [PR110100]

The ls64 builtin code was using incorrect GNU style with eight spaces where
there should be a tab. Fixed thusly.

gcc/ChangeLog:

PR target/110100
* config/aarch64/aarch64-builtins.cc
(aarch64_init_ls64_builtins_types):
Replace eight consecutive spaces with tabs.
(aarch64_init_ls64_builtins): Likewise.
(aarch64_expand_builtin_ls64): Likewise.
* config/aarch64/aarch64.md (ld64b): Likewise.
(st64b): Likewise.
(st64bv): Likewise
(st64bv0): Likewise.

[Bug target/110100] __builtin_aarch64_st64b stores to the wrong address

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110100

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:737a0b749a7bc3e7cb904ea2d4b18dc130514b85

commit r14-1616-g737a0b749a7bc3e7cb904ea2d4b18dc130514b85
Author: Alex Coplan 
Date:   Tue Jun 6 11:52:19 2023 +0100

aarch64: Fix wrong code with st64b builtin [PR110100]

The st64b pattern incorrectly had an output constraint on the register
operand containing the destination address for the store, leading to
wrong code. This patch fixes that.

gcc/ChangeLog:

PR target/110100
* config/aarch64/aarch64-builtins.cc (aarch64_expand_builtin_ls64):
Use input operand for the destination address.
* config/aarch64/aarch64.md (st64b): Fix constraint on address
operand.

gcc/testsuite/ChangeLog:

PR target/110100
* gcc.target/aarch64/acle/pr110100.c: New test.

[Bug target/110132] aarch64: Bogus -Wbuiltin-declaration-mismatch with ls64 builtins

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110132

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:9963029a24f2d2510b82e7106fae3f364da33c5d

commit r14-1617-g9963029a24f2d2510b82e7106fae3f364da33c5d
Author: Alex Coplan 
Date:   Tue Jun 6 15:19:03 2023 +0100

aarch64: Allow compiler to define ls64 builtins [PR110132]

This patch refactors the ls64 builtins to allow the compiler to define them
directly instead of having wrapper functions in arm_acle.h. This should not
only be easier to maintain, but it also makes two important correctness fixes:
 - It fixes PR110132, where the builtins ended up getting declared with
   invisible bindings in the C FE, so the FE ended up synthesizing
   incompatible implicit definitions for these builtins.
 - It allows the builtins to be used with LTO, which didn't work previously.

We also take the opportunity to add test coverage from C++ for these
builtins.

gcc/ChangeLog:

PR target/110132
* config/aarch64/aarch64-builtins.cc
(aarch64_general_simulate_builtin):
New. Use it ...
(aarch64_init_ls64_builtins): ... here. Switch to declaring public
ACLE
names for builtins.
(aarch64_general_init_builtins): Ensure we invoke the arm_acle.h
setup if in_lto_p, just like we do for SVE.
* config/aarch64/arm_acle.h: (__arm_ld64b): Delete.
(__arm_st64b): Delete.
(__arm_st64bv): Delete.
(__arm_st64bv0): Delete.

gcc/testsuite/ChangeLog:

PR target/110132
* lib/target-supports.exp
(check_effective_target_aarch64_asm_FUNC_ok):
Extend to ls64.
* g++.target/aarch64/acle/acle.exp: New.
* g++.target/aarch64/acle/ls64.C: New test.
* g++.target/aarch64/acle/ls64_lto.C: New test.
* gcc.target/aarch64/acle/ls64_lto.c: New test.
* gcc.target/aarch64/acle/pr110132.c: New test.

[Bug target/110100] __builtin_aarch64_st64b stores to the wrong address

2023-06-07 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110100

--- Comment #5 from Alex Coplan  ---
Fixed on trunk for GCC 14, keeping open for backports (I think we need this
back to GCC 12).

[Bug target/110132] aarch64: Bogus -Wbuiltin-declaration-mismatch with ls64 builtins

2023-06-07 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110132

--- Comment #4 from Alex Coplan  ---
Fixed on trunk, keeping open for backports (I think we need this back to GCC
12).

[Bug c++/110160] New: g++ rejects concept as cyclical with non-matching function signature

2023-06-07 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110160

Bug ID: 110160
   Summary: g++ rejects concept as cyclical with non-matching
function signature
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: danakj at orodu dot net
  Target Milestone: ---

Godbolt: https://godbolt.org/z/d4PhfMqvq

Code:
```
#include 
#include 

template 
concept StreamCanReceiveString = requires(T& t, std::string s) {
{ t << s };
};

struct NotAStream {};
struct UnrelatedType {};

template 
S& operator<<(S& s, UnrelatedType) {
return s;
}

static_assert(!StreamCanReceiveString);

static_assert(StreamCanReceiveString);
```

What happens here is that GCC fails to resolve the expression
`StreamCanReceiveString`.

1. StreamCanReceiveString tries to do NotAStream << std::string.
2. There is a templated operator<< that takes `StreamCanReceiveString` and
`UnrelatedType`
3. Since `UnrelatedType` is not std::string, this is not an overload candidate.
4. Clang and MSVC therefore do not try to recursively solve
`StreamCanReceiveString`, and they accept the code. GCC, however, tries to
solve the concept and then fails due to recursion.

[Bug c++/110160] g++ rejects concept as cyclical with non-matching function signature

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110160

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=99599

--- Comment #1 from Andrew Pinski  ---
I think this is a dup of bug 99599, specifically bug 99599 comment #2 .

[Bug modula2/110161] New: Comparing a typed procedure variable to 0 gives ICE or assertions

2023-06-07 Thread admin--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110161

Bug ID: 110161
   Summary: Comparing a typed procedure variable to 0 gives ICE or
assertions
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: ad...@tho-otto.de
  Target Milestone: ---

In the following example:

MODULE foo;

TYPE xProc = PROCEDURE(): BOOLEAN;
VAR x: xProc;

BEGIN
  IF x = 0 THEN END;
END foo.

I get:

cc1gm2: internal compiler error: assert failed

(unfortunately without any line number information).
Same happens when comparing to NIL or PROC(0). Only xProc(0) works.

[Bug d/110113] gdc -fpreview=dip1021 crash in d/dmd/root/aav.d:127 dmd_aaGetRvalue from DsymbolTable::lookup(Identifier const*)

2023-06-07 Thread ibuclaw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110113

ibuclaw at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
URL||https://github.com/dlang/dm
   ||d/pull/14837
   Last reconfirmed||2023-06-07

--- Comment #8 from ibuclaw at gcc dot gnu.org ---
Regression caused by upstream.

https://github.com/dlang/dmd/pull/14837

[Bug libstdc++/110145] 20_util/to_chars/double.cc fails for -m32 -fexcess-precision=standard

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110145

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:88e7f1f7ee67462713a89104ae07e99b191d5e2c

commit r14-1619-g88e7f1f7ee67462713a89104ae07e99b191d5e2c
Author: Jakub Jelinek 
Date:   Wed Jun 7 19:27:35 2023 +0200

libstdc++: Fix up 20_util/to_chars/double.cc test for excess precision
[PR110145]

This test apparently contains 3 problematic floating point constants,
1e126, 4.91e-6 and 5.547e-6.  These constants suffer from double rounding
when -fexcess-precision=standard evaluates double constants in the precision
of Intel extended 80-bit long double.
As written in the PR, e.g. the first one is
0x1.7a2ecc414a03f7ff6ca1cb527787b130a97d51e51202365p+418
in the precision of GCC's internal format, 80-bit long double has
63-bit precision, so the above constant rounded to long double is
0x1.7a2ecc414a03f800p+418L
(the least significant bit in the 0 before p isn't there already).
0x1.7a2ecc414a03f800p+418L rounded to IEEE double is
0x1.7a2ecc414a040p+418.
Now, if excess precision doesn't happen and we round the GCC's internal
format number directly to double, it is
0x1.7a2ecc414a03fp+418 and that is the number the test expects.
One can see it on x86-64 (where excess precision to long double doesn't
happen) where double(1e126L) != 1e126.
The other two constants suffer from the same problem.

The following patch tweaks the testcase, such that those problematic
constants are used only if FLT_EVAL_METHOD is 0 or 1 (i.e. when we have a
guarantee that the constants will be evaluated in double precision),
plus adds corresponding tests with hexadecimal constants which don't
suffer from this excess precision problem, they are exact in double
and long double can hold all double values.

2023-06-07  Jakub Jelinek  

PR libstdc++/110145
* testsuite/20_util/to_chars/double.cc: Include .
(double_to_chars_test_cases,
double_scientific_precision_to_chars_test_cases_2,
double_fixed_precision_to_chars_test_cases_2): #if out 1e126,
4.91e-6
and 5.547e-6 tests if FLT_EVAL_METHOD is negative or larger than 1.
Add unconditional tests with corresponding double constants
0x1.7a2ecc414a03fp+418, 0x1.4981285e98e79p-18 and
0x1.7440bbff418b9p-18.
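
The double rounding described in the commit message above can be reproduced with a short C++17 sketch (hexadecimal floating literals need C++17). The constants are copied from the message; the variable names and the helper are made up for illustration, and the final check is only enabled when FLT_EVAL_METHOD is 0.

```cpp
#include <cassert>
#include <cfloat>

// Value the test expects: the decimal constant rounded once, directly to double.
constexpr double rounded_once = 0x1.7a2ecc414a03fp+418;

// Value after the first rounding step, to 80-bit extended precision.
constexpr long double as_extended = 0x1.7a2ecc414a03f800p+418L;

// Second rounding step: the trailing "8" is an exact tie, and
// round-to-nearest-even moves the last double digit from ...03f up to ...040.
constexpr double rounded_twice = static_cast<double>(as_extended);

// Hypothetical helper: true when going through `ext` changes the double result.
constexpr bool double_rounding_changes_value(long double ext, double direct) {
    return static_cast<double>(ext) != direct;
}

static_assert(rounded_twice == 0x1.7a2ecc414a040p+418);
static_assert(double_rounding_changes_value(as_extended, rounded_once));

#if FLT_EVAL_METHOD == 0
// Without excess precision, the decimal literal rounds straight to double.
static_assert(1e126 == rounded_once);
#endif
```

The hexadecimal constants avoid the problem because they are exact in double, so no intermediate precision can change them.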

[Bug tree-optimization/94566] conversion between std::strong_ordering and int

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94566

Andrew Pinski  changed:

   What|Removed |Added

 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com

--- Comment #12 from Andrew Pinski  ---
Aldy or Andrew, why don't we get a range in conv1 for
  SR.4_4 = sD.8798._M_valueD.7665;

Even though the range we have is [-1,1] according to the
__builtin_unreachable()?
It seems like we should get that range. Once we do get that the code works.
E.g. If we add:
  signed char *t = (signed char*)&s;
  signed char tt = *t;
  if (tt < -1 || tt > 1) __builtin_unreachable();

at the front, before the other ifs, we get the code we are expecting.

conv2 has a similar issue too, though it has also a different issue of ordering
for the comparisons.

[Bug c++/110160] g++ rejects concept as cyclical with non-matching function signature

2023-06-07 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110160

--- Comment #2 from danakj at orodu dot net ---
Ugh, yeah, I guess it is. It means you can't redirect through a template
function that uses concepts with G++.

[Bug target/109725] [14 Regression] ICE: RTL check: expected code 'const_int', have 'reg' in riscv_print_operand, at config/riscv/riscv.cc:4430

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109725

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Dimitar Dimitrov :

https://gcc.gnu.org/g:7f26e76c9848aeea9ec10ea701a6168464a4a9c2

commit r14-1621-g7f26e76c9848aeea9ec10ea701a6168464a4a9c2
Author: Dimitar Dimitrov 
Date:   Mon Jun 5 21:39:16 2023 +0300

riscv: Fix scope for memory model calculation

During libgcc configure stage for riscv32-none-elf, when
"--enable-checking=yes,rtl" has been activated, the following error
is observed:

  during RTL pass: final
  conftest.c: In function 'main':
  conftest.c:16:1: internal compiler error: RTL check: expected code
'const_int', have 'reg' in riscv_print_operand, at config/riscv/riscv.cc:4462
 16 | }
| ^
  0x843c4d rtl_check_failed_code1(rtx_def const*, rtx_code, char const*,
int, char const*)
  /mnt/nvme/dinux/local-workspace/gcc/gcc/rtl.cc:916
  0x8ea823 riscv_print_operand
 
/mnt/nvme/dinux/local-workspace/gcc/gcc/config/riscv/riscv.cc:4462
  0xde84b5 output_operand(rtx_def*, int)
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:3632
  0xde8ef8 output_asm_insn(char const*, rtx_def**)
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:3544
  0xded33b output_asm_insn(char const*, rtx_def**)
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:3421
  0xded33b final_scan_insn_1
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:2841
  0xded6cb final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:2887
  0xded8b7 final_1
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:1979
  0xdee518 rest_of_handle_final
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:4240
  0xdee518 execute
  /mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:4318

Fix by moving the calculation of memmodel to the cases where it is used.

Regression tested for riscv32-none-elf. No changes in gcc.sum and
g++.sum.

PR target/109725

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand): Calculate
memmodel only when it is valid.

Signed-off-by: Dimitar Dimitrov 

[Bug c++/99599] [11/12/13/14 Regression] Concepts requirement falsely reporting cyclic dependency, breaks tag_invoke pattern

2023-06-07 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99599

danakj at orodu dot net changed:

   What|Removed |Added

 CC||danakj at orodu dot net

--- Comment #14 from danakj at orodu dot net ---
*** Bug 110160 has been marked as a duplicate of this bug. ***

[Bug c++/110160] g++ rejects concept as cyclical with non-matching function signature

2023-06-07 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110160

danakj at orodu dot net changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from danakj at orodu dot net ---
Okay I've got a workaround based on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99599#c6. It's probably worse for
compile times, but it is what it is.

Thanks for the link.

*** This bug has been marked as a duplicate of bug 99599 ***

[Bug c++/99599] [11/12/13/14 Regression] Concepts requirement falsely reporting cyclic dependency, breaks tag_invoke pattern

2023-06-07 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99599

--- Comment #15 from danakj at orodu dot net ---
The workaround listed in Comment #6 does not work for templated types,
unfortunately, making Clang and MSVC more expressive here than GCC.

https://godbolt.org/z/obhsqhrbx

```
#include 
#include 
#include 

#if defined(__GNUC__) && !defined(__clang__)
#define COMPILER_IS_GCC 1
#else
#define COMPILER_IS_GCC 0
#endif

namespace sus::string::__private {
template 
A& format_to_stream(A&, B);

template 
concept StreamCanReceiveString = requires(T& t, std::basic_string s) {
{ operator<<(t, s) };
};

/// Consumes the string `s` and streams it to the output stream `os`.
template  S>
S& format_to_stream(S& os, const std::basic_string& s) {
os << s;
return os;
}

}  // namespace sus::string::__private

namespace sus::option {
template 
class Option {};

using namespace ::sus::string::__private;
template <
class T, 
#if COMPILER_IS_GCC
std::same_as > Sus_ValueType,  // Does not deduce T.  *
#endif
StreamCanReceiveString Sus_StreamType
>
inline Sus_StreamType& operator<<(
Sus_StreamType& stream,
#if COMPILER_IS_GCC
const Sus_ValueType& value
#else
const ::sus::option::Option& value  // Does deduce T.   
#endif
) {
return format_to_stream(stream, std::string());
}

}  // namespace sus::option

int main() {
std::stringstream s;
s << sus::option::Option();
}
```

[Bug c++/110153] [modules] Static module mapper format cannot handle header unit paths with spaces

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110153

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=89249

--- Comment #1 from Andrew Pinski  ---
Many build systems (make included here) have issues with spaces.

Even GCC's LTO does not handle spaces that well, see PR 89249.

[Bug c++/110162] New: redundant move in initialization

2023-06-07 Thread jincikang at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110162

Bug ID: 110162
   Summary: redundant move in initialization
   Product: gcc
   Version: 13.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jincikang at gmail dot com
  Target Milestone: ---

$ cat main.cpp
```cpp
// main.cpp
#include <string>

class HttpMessage {
public:
    std::string* body() noexcept {
        return &body_;
    }

    const std::string* body() const noexcept {
        return &body_;
    }

    void set_body(std::string s) {
        body_ = std::move(s);
    }
private:
    std::string body_;
};

class HttpResponse : private HttpMessage {
public:
    using HttpMessage::body;
    using HttpMessage::set_body;
private:
};

class HttpRequest : private HttpMessage {
public:
    using HttpMessage::body;
    using HttpMessage::set_body;
};

int main() {
    [[maybe_unused]] auto post = [](const HttpRequest& request, HttpResponse* response) {
        response->set_body(std::move(*request.body()));
    };
}
```
$ g++ -std=c++2a -Werror -Wall -Wextra main.cpp
Error: redundant move in initialization [-Werror=redundant-move]
   35 | response->set_body(std::move(*request.body()));
  |~^

# OK.
$ clang++ -std=c++2a -Werror -Wall -Wextra main.cpp
# Ok
$ g++-12 -std=c++2a -Werror -Wall -Wextra main.cpp

[Bug sanitizer/110157] Address sanitizer crashes when accessing variables through procedure callback

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110157

--- Comment #1 from Andrew Pinski  ---
If anything, what is most likely happening is that the stack is not being
recorded as executable, which is needed for nested functions.

[Bug c++/110162] redundant move in initialization

2023-06-07 Thread jincikang at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110162

--- Comment #1 from jinci kang  ---
# OK.
$ g++ -std=c++2a -Werror -Wall main.cpp

[Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result

2023-06-07 Thread dimitar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562

--- Comment #4 from Dimitar Dimitrov  ---
The ideal PRU code sequence for the snippet would be:

char test(uint64_t a, uint64_t b)
{
return a && b;
}
or      r14, r14, r15
or      r16, r16, r17
umin    r14, r14, 1
umin    r14, r14, r16
ret

Thus I'm trying to implement the following conversion in
emit_store_flag_int():

   "X != 0" -> "UMIN (X, 1)"

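As a rough cross-check of the conversion above, here is a minimal C++ sketch. It assumes r14/r15 hold the two 32-bit halves of `a` and r16/r17 those of `b`; the function names are made up for illustration and are not part of GCC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// "X != 0" expressed as UMIN (X, 1): for unsigned X the minimum of X and 1
// is 0 when X is zero and 1 otherwise.
std::uint32_t nonzero_as_umin(std::uint32_t x) {
    return std::min<std::uint32_t>(x, 1u);
}

// Follows the instruction sequence: OR the 32-bit halves of each operand,
// clamp the first result to {0, 1}, then take the minimum with the second.
char and_nonzero64(std::uint64_t a, std::uint64_t b) {
    std::uint32_t a_any = static_cast<std::uint32_t>(a) | static_cast<std::uint32_t>(a >> 32);
    std::uint32_t b_any = static_cast<std::uint32_t>(b) | static_cast<std::uint32_t>(b >> 32);
    std::uint32_t r = nonzero_as_umin(a_any);
    r = std::min(r, b_any);
    return static_cast<char>(r);
}

// Compares the sketch against the plain "a && b" on a few sample values.
void check_matches_logical_and() {
    const std::uint64_t samples[] = {0u, 1u, 2u, 0x100000000ull,
                                     0xffffffff00000000ull, ~0ull};
    for (std::uint64_t a : samples)
        for (std::uint64_t b : samples)
            assert(and_nonzero64(a, b) == static_cast<char>(a && b));
}
```

The second `std::min` works without clamping `b_any` because the first operand is already 0 or 1.
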
[Bug sanitizer/110157] [13/14 Regression] Address sanitizer does not like nested function trampolines any more

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110157

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-06-07
   Target Milestone|--- |13.2
Summary|Address sanitizer crashes   |[13/14 Regression] Address
   |when accessing variables|sanitizer does not like
   |through procedure callback  |nested function trampolines
   ||any more

--- Comment #2 from Andrew Pinski  ---
Reduced GNU C testcase (just compile and run with -fsanitize=address):
```
void quicksort(_Bool (*ugt)())
{
  __builtin_printf(">>> Calling ugt\n");
  _Bool t = ugt();
  __builtin_printf(">>> Done ugt\n");
}

void gfits_setsort(int key)
{
  _Bool sort_gt()
  {
return key > 0;
  }
  quicksort(sort_gt);
}

int main()
{
gfits_setsort(1);
}
```


```
AddressSanitizer:DEADLYSIGNAL
=================================================================
==1==ERROR: AddressSanitizer: SEGV on unknown address 0x7f346f900034 (pc 0x7f346f900034 bp 0x7ffe64ea8b90 sp 0x7ffe64ea8b68 T0)
==1==The signal is caused by a READ memory access.
==1==Hint: PC is at a non-executable region. Maybe a wild jump?
    #0 0x7f346f900034  (<unknown module>)
    #1 0x40134f in gfits_setsort /app/example.cpp:14
    #2 0x40139f in main /app/example.cpp:19
    #3 0x7f3471eb3082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) (BuildId: 1878e6b475720c7c51969e69ab2d276fae6d1dee)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (<unknown module>)
==1==ABORTING
```

[Bug c++/110162] redundant move in initialization

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110162

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #2 from Andrew Pinski  ---
I think the GCC diagnostic is correct, the std::move is redundant here.

[Bug c++/110162] redundant move in initialization

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110162

--- Comment #3 from Andrew Pinski  ---
See
https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/C_002b_002b-Dialect-Options.html#index-Wno-redundant-move

[Bug c++/110162] redundant move in initialization

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110162

--- Comment #4 from Andrew Pinski  ---
See https://gcc.gnu.org/gcc-13/porting_to.html also.

[Bug target/106562] PRU: Inefficient code for zero check of 64-bit (boolean) AND result

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106562

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=104296

--- Comment #5 from Andrew Pinski  ---
(In reply to Dimitar Dimitrov from comment #4)
> Thus I'm trying to implement the following conversion in
> emit_store_flag_int():
> 
>    "X != 0" -> "UMIN (X, 1)"

That is basically what I mention in PR 104296.

[Bug c++/110158] Cannot use union with std::string inside in constant expression

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110158

--- Comment #1 from Andrew Pinski  ---
Here is a slightly reduced testcase (for a slightly different issue still
dealing with unions):
```
struct str1
{
//  bool a;
  char *var;
  union {
char t[15];
int allocated;
  };
  constexpr str1() : var(new char[2]) { t[0] = 0; }
  constexpr ~str1() {if (var != t) delete[] var; }
};

typedef str1 str;
constexpr bool f1() {
str t{};
return true;
}
static_assert( f1() );

constexpr bool f() {
union U{
str s;
constexpr ~U(){ s.~str(); }
} u{};
return true;
}

static_assert( f() );

```

[Bug c++/107198] [13/14 Regression] ICE in cp_gimplify_expr, at cp/cp-gimplify.cc:752 since r13-3175-g6ffbf87ca66f4ed9

2023-06-07 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107198

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed|2022-10-10 00:00:00 |2023-6-7
 CC||tschwinge at gcc dot gnu.org

--- Comment #4 from Thomas Schwinge  ---
Reconfirmed.

Given native x86_64-pc-linux-gnu build of one-week-old commit
2720bbd597f56742a17119dfe80edc2ba86af255, running 'g++.dg/eh/aggregate1.C' with
'-fno-exceptions':

$ make check-gcc-c++ RUNTESTFLAGS='--target_board=unix/-fno-exceptions
dg.exp=aggregate1.C'

..., I see ICEs not for '-std=c++98', but for '-std=c++14' and higher:

UNSUPPORTED: g++.dg/eh/aggregate1.C  -std=c++98: exception handling
disabled
FAIL: g++.dg/eh/aggregate1.C  -std=c++14 (internal compiler error: in
cp_gimplify_expr, at cp/cp-gimplify.cc:782)
UNSUPPORTED: g++.dg/eh/aggregate1.C  -std=c++14: exception handling
disabled
FAIL: g++.dg/eh/aggregate1.C  -std=c++17 (internal compiler error: in
cp_gimplify_expr, at cp/cp-gimplify.cc:782)
UNSUPPORTED: g++.dg/eh/aggregate1.C  -std=c++17: exception handling
disabled
FAIL: g++.dg/eh/aggregate1.C  -std=c++20 (internal compiler error: in
cp_gimplify_expr, at cp/cp-gimplify.cc:782)
UNSUPPORTED: g++.dg/eh/aggregate1.C  -std=c++20: exception handling
disabled

[...]/g++.dg/eh/aggregate1.C: In constructor 'A::A()':
[...]/g++.dg/eh/aggregate1.C:18:47: error: exception handling disabled, use
'-fexceptions' to enable
[...]/g++.dg/eh/aggregate1.C: In function 'void try_idx(int)':
[...]/g++.dg/eh/aggregate1.C:40:25: error: 'x' was not declared in this
scope
[...]/g++.dg/eh/aggregate1.C:39:40: internal compiler error: in
cp_gimplify_expr, at cp/cp-gimplify.cc:782
0x6f7024 cp_gimplify_expr(tree_node**, gimple**, gimple**)
[...]/gcc/cp/cp-gimplify.cc:782
0x13d6cfd gimplify_expr(tree_node**, gimple**, gimple**, bool
(*)(tree_node*), int)
[...]/gcc/gimplify.cc:16331
0x13dcf9d gimplify_init_ctor_eval_range
[...]/gcc/gimplify.cc:4929
0x13dcf9d gimplify_init_ctor_eval
[...]/gcc/gimplify.cc:5008
0x13dce55 gimplify_init_ctor_eval
[...]/gcc/gimplify.cc:5033
0x13dd671 gimplify_init_constructor
[...]/gcc/gimplify.cc:5447
0x13ea18d gimplify_modify_expr
[...]/gcc/gimplify.cc:6127
0x13d76ea gimplify_expr(tree_node**, gimple**, gimple**, bool
(*)(tree_node*), int)
[...]/gcc/gimplify.cc:16422
0x13e8567 gimplify_stmt(tree_node**, gimple**)
[...]/gcc/gimplify.cc:7238
0x13e8567 gimplify_compound_expr
[...]/gcc/gimplify.cc:6431
0x13d7aae gimplify_expr(tree_node**, gimple**, gimple**, bool
(*)(tree_node*), int)
[...]/gcc/gimplify.cc:16412
0x13d80f8 gimplify_cleanup_point_expr
[...]/gcc/gimplify.cc:7238
0x13d80f8 gimplify_expr(tree_node**, gimple**, gimple**, bool
(*)(tree_node*), int)
[...]/gcc/gimplify.cc:16815
0x13da0a6 gimplify_stmt(tree_node**, gimple**)
[...]/gcc/gimplify.cc:7238
0x13d89a8 gimplify_statement_list
[...]/gcc/gimplify.cc:2019
0x13d89a8 gimplify_expr(tree_node**, gimple**, gimple**, bool
(*)(tree_node*), int)
[...]/gcc/gimplify.cc:16867
0x13da0a6 gimplify_stmt(tree_node**, gimple**)
[...]/gcc/gimplify.cc:7238
0x13d89a8 gimplify_statement_list
[...]/gcc/gimplify.cc:2019
0x13d89a8 gimplify_expr(tree_node**, gimple**, gimple**, bool
(*)(tree_node*), int)
[...]/gcc/gimplify.cc:16867
0x13d80f8 gimplify_cleanup_point_expr
[...]/gcc/gimplify.cc:7238

[Bug c++/110162] redundant move in initialization

2023-06-07 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110162

--- Comment #5 from Jonathan Wakely  ---
(In reply to Andrew Pinski from comment #4)
> See https://gcc.gnu.org/gcc-13/porting_to.html also.

I don't think this is related to the new rules.

The std::move here is redundant because request is const, so request.body()
calls the const overload which returns const std::string* and so
std::move(*request.body()) produces a const std::string&& which cannot be
moved. It can only be copied. So the move is redundant.
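
A minimal sketch of that point, assuming nothing beyond the standard library (the function name is hypothetical): `std::move` applied to a const string produces a `const std::string&&`, which the move constructor cannot bind to, so the copy constructor is selected and the source keeps its contents.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move on a const lvalue yields a const rvalue reference.
static_assert(std::is_same<decltype(std::move(std::declval<const std::string&>())),
                           const std::string&&>::value,
              "std::move preserves constness");

// Hypothetical demonstration: "moving" from a const string copies it.
void demonstrate_const_move() {
    const std::string source = "a string long enough to need a heap allocation";
    std::string target = std::move(source);  // picks the copy constructor
    assert(target == source);                // the source is unchanged
    assert(!source.empty());
}
```
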

[Bug c++/110162] redundant move in initialization

2023-06-07 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110162

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #6 from Jonathan Wakely  ---
(In reply to jinci kang from comment #0)
> $ g++ -std=c++2a -Werror -Wall -Wextra main.cpp
> Error: redundant move in initialization [-Werror=redundant-move]
>35 | response->set_body(std::move(*request.body()));
>   |~^

You turned this warning on with -Wextra and then you turned it into an error
with -Werror.

Either stop doing that, or fix the code to avoid the warning.

[Bug c++/99599] [11/12/13/14 Regression] Concepts requirement falsely reporting cyclic dependency, breaks tag_invoke pattern

2023-06-07 Thread danakj at orodu dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99599

--- Comment #16 from danakj at orodu dot net ---
Well for anyone who hits the same issue, it appears that GCC _does_ follow
Clang and MSVC in not considering the overload and chasing through the concept
resolution if the non-concept types are templates and do not match the caller's
arguments.

So you need to do:

1) For non-GCC just use:

  template auto invoke_tag(bar_tag, T it);

2) For GCC non-template type bar_tag use:

  template T, fooable U> auto invoke_tag(T, U it);

3) For GCC template type bar_tag, back to 1)

  template auto invoke_tag(bar_tag, T it);


Note also that 2) uses same_as, not convertible_to as in Comment #6, otherwise
you can get ambiguous overload resolution if multiple types convert to one,
which does not occur in Clang/MSVC with the regular type parameter. This _does_
again result in more code that will compile in Clang/MSVC than in GCC, as it
prevents conversions from types that don't have an overload.

The macros to do this get rather exciting, if that's of interest to someone in
the future:
https://github.com/chromium/subspace/pull/253/commits/719500c4d2cbfcfd238d7ee3c5b3d371f40e46c1

[Bug tree-optimization/94566] conversion between std::strong_ordering and int

2023-06-07 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94566

--- Comment #13 from Andrew Macleod  ---
(In reply to Andrew Pinski from comment #12)
> Aldy or Andrew, why in conv1 we don't get a range for 
>   SR.4_4 = sD.8798._M_valueD.7665;
> 
> Even though the range we have is [-1,1] according to the
> __builtin_unreachable()?
> It seems like we should get that range. Once we do get that the code works.
> E.g. If we add:
>   signed char *t = (signed char*)&s;
>   signed char tt = *t;
>   if (tt < -1 || tt > 1) __builtin_unreachable();
> 
> In the front before the other ifs, we get the code we are expecting.
> 
> conv2 has a similar issue too, though it has also a different issue of
> ordering for the comparisons.

It's because the unreachable is after the branches, and we have multiple uses of
SR.4_4 before the unreachable.

   [local count: 1073741824]:
  SR.4_4 = s._M_value;
  if (SR.4_4 == -1)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 536870913]:
  if (SR.4_4 == 0)
goto ; [50.00%]
  else
goto ; [50.00%]

   [local count: 268435456]:
  if (SR.4_4 == 1)
goto ; [100.00%]
  else
goto ; [0.00%]

   [count: 0]:
  __builtin_unreachable ();

   [local count: 1073741824]:
  # _1 = PHI <-1(2), 0(3), 1(4)>

We know when we get to bb5 that SR.4_4 is [-1, 1] for sure.
But we don't know that before we reach that spot.
If there was a call to
  foo(SR.4_4)
in bb 3 for instance, we wouldn't be able to propagate [-1,1] to the call to
foo because it happens before we know for sure, and foo may go and do
something if it has a value of 6 and exit the compilation, thus never
returning.

So we can only provide a range of [-1, 1] AFTER the unreachable, or if there is
only a single use of it. The multiple uses are what trick it.

This has come up before. We need some sort of backwards propagation that can
propagate discovered values earlier into the IL to a point where it is known to
be safe (i.e., it wouldn't be able to propagate it past a call to foo(), for
instance).
In cases like this, we could discover it is safe to propagate that range back
to the def point, and then we could set the global.

Until we add some smarts, either to the builtin unreachable elimination code,
or elsewhere, which is aware of how to handle such side effects, we can't set
the global because we don't know if it is safe at each use before the
unreachable call.
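
For readers who want the source-level shape of the two situations, here is a small sketch. It uses a plain `signed char` in place of the strong_ordering's internal value, which is an assumption made purely for illustration, and it relies on the GCC/Clang builtin `__builtin_unreachable`.

```cpp
#include <cassert>

// Shape discussed above: three uses of `v` come before the unreachable call,
// so the [-1, 1] range is only established after those uses.
int conv_unreachable_last(signed char v) {
    if (v == -1) return -1;
    if (v == 0) return 0;
    if (v == 1) return 1;
    __builtin_unreachable();  // reached only if v is outside [-1, 1]
}

// Shape from the quoted suggestion: the range hint comes first, so every
// later use of `v` is dominated by it.
int conv_unreachable_first(signed char v) {
    if (v < -1 || v > 1) __builtin_unreachable();
    if (v == -1) return -1;
    if (v == 0) return 0;
    return 1;
}

// Both forms agree on every value that is allowed to reach them.
void check_conv_forms() {
    for (signed char v = -1; v <= 1; ++v) {
        assert(conv_unreachable_last(v) == v);
        assert(conv_unreachable_first(v) == v);
    }
}
```

Both functions compute the same result for valid inputs; the difference is only in where the compiler learns the range.
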

[Bug rtl-optimization/110163] New: [14 Regression] Comparing against a constant string is inefficient on some targets

2023-06-07 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110163

Bug ID: 110163
   Summary: [14 Regression] Comparing against a constant string is
inefficient on some targets
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: law at gcc dot gnu.org
  Target Milestone: ---

Comparing against a constant string is expanded by inline_string_cmp and on
some targets the generated code can be inefficient.  This can be seen in
spec2017's omnetpp benchmark, particularly when the inline string comparison
limits are increased.

The problem is the expansion code arranges to do all the arithmetic and tests
in SImode.  On RV64 this introduces a sign extension for each test  due to how
RV64 expresses 32bit ops.

It would be better to do all the computations in word_mode, then convert the
final result to SImode, at least for RV64 and likely for other targets.

I experimented with starting to build out cost checks to determine what mode to
use for the internal computations.  That ran afoul of x86 where the cost of a
byte load is different than the cost of an extended byte load, even though they
use the exact same instruction.

There's also a need to cost out the computations, test & branch in the
different modes as well once the x86 hurdle is behind us.

I've set work on this aside for now.  But the discussion can be found in these
two threads:

https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620601.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620577.html

#include <string.h>
int
foo (char *x)
{
   return strcmp (x, "lowerLayout");
}

Compiled with -O2 --param builtin-string-cmp-inline-length=100 on rv64 should
show the issue.
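
A source-level analogy of the two strategies, not GCC's actual expansion: the sketch below compares against the same literal byte by byte, once with the arithmetic done in `int` (standing in for SImode) and once in `long` (standing in for word_mode on a 64-bit target) with only the final result narrowed. All function names here are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static const char kLiteral[] = "lowerLayout";

// Differences computed in int, one test per byte.
int cmp_narrow(const char* x) {
    for (std::size_t i = 0; i < sizeof kLiteral; ++i) {
        int d = static_cast<unsigned char>(x[i]) -
                static_cast<unsigned char>(kLiteral[i]);
        if (d != 0 || kLiteral[i] == '\0') return d;
    }
    return 0;
}

// Same comparison with the arithmetic done in long; only the value that is
// finally returned is converted to int.
int cmp_wide(const char* x) {
    for (std::size_t i = 0; i < sizeof kLiteral; ++i) {
        long d = static_cast<long>(static_cast<unsigned char>(x[i])) -
                 static_cast<long>(static_cast<unsigned char>(kLiteral[i]));
        if (d != 0 || kLiteral[i] == '\0') return static_cast<int>(d);
    }
    return 0;
}

static int sign(int v) { return (v > 0) - (v < 0); }

// Both variants must agree with each other and with strcmp's sign.
void check_against_strcmp(const char* x) {
    assert(cmp_narrow(x) == cmp_wide(x));
    assert(sign(cmp_narrow(x)) == sign(std::strcmp(x, kLiteral)));
}
```

Whether the wide form actually saves sign extensions depends on the target and on how the compiler lowers it; the sketch only shows that the two forms are interchangeable in result.
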

[Bug c++/51571] No named return value optimization while adding a dummy scope

2023-06-07 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51571

--- Comment #11 from Jason Merrill  ---
(In reply to CVS Commits from comment #9)
> This implements the guaranteed copy elision specified by P2025

Or not; I just noticed that P2025 also requires a fix for PR53637.

[Bug ipa/109886] UBSAN error: shift exponent 64 is too large for 64-bit type when compiling gcc.c-torture/compile/pr96796.c

2023-06-07 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109886

Andrew Macleod  changed:

   What|Removed |Added

 CC||amacleod at redhat dot com

--- Comment #6 from Andrew Macleod  ---
(In reply to Martin Jambor from comment #5)
> (In reply to Aldy Hernandez from comment #4)
> > (In reply to Andrew Pinski from comment #3)

> > > That is correct. The generated code has a VIEW_CONVERT_EXPR from an integer
> > > type to a RECORD_TYPE.
> > 
> > Eeeech.  In that case, then what you suggest is reasonable.  Bail if
> > param_type is not supported by the underlying range.  Maybe the IPA experts
> > could opine?
> 
> With LTOed type mismatches or with K&R style code, IPA has to be prepared
> to deal with such cases, unfortunately.  So a check like that indeed looks
> reasonable.

The new range-op dispatch code is coming shortly. When an unsupported type is
passed in to any ranger routine, we'll simply return false instead of trapping
like we do now.

[Bug tree-optimization/97711] Failure to optimise "x & 1 ? x - 1 : x" to "x & -2"

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97711

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2023-June/62
   ||0985.html

--- Comment #9 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620985.html

[Bug tree-optimization/110155] Missing if conversion

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110155

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2023-June/62
   ||0985.html

--- Comment #4 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620985.html

[Bug c++/110164] New: Improve diagnostic for incomplete standard library types due to missing include

2023-06-07 Thread rs2740 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110164

Bug ID: 110164
   Summary: Improve diagnostic for incomplete standard library
types due to missing include
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rs2740 at gmail dot com
  Target Milestone: ---

If I forget to include a header before using a sufficiently well-known standard
library type, GCC helpfully reminds me of the header:

$ echo 'std::array x;' | g++ -x c++ -

<stdin>:1:6: error: ‘array’ in namespace ‘std’ does not name a template type
<stdin>:1:1: note: ‘std::array’ is defined in header ‘<array>’; did you forget
to ‘#include <array>’?

But if I happen to have a different standard library header included that
happens to bring in a forward declaration of the type, the error message is
less helpful:

$ echo -e '#include \nstd::array x;' | g++ -x c++ -

<stdin>:2:21: error: aggregate ‘std::array x’ has incomplete type and
cannot be defined

It would be nice if the latter case also has a hint about the potential missing
include.

[Bug c++/110164] Improve diagnostic for incomplete standard library types due to missing include

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110164

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Last reconfirmed||2023-06-07
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=39730

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug c++/53637] NRVO not applied where there are two different variables involved

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53637

--- Comment #10 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:28db36e2cfca1b7106adc8d371600fa3a325c4e2

commit r14-1624-g28db36e2cfca1b7106adc8d371600fa3a325c4e2
Author: Jason Merrill 
Date:   Wed Jun 7 05:15:02 2023 -0400

c++: allow NRV and non-NRV returns [PR58487]

Now that we support NRV from an inner block, we can also support non-NRV
returns from other blocks, since once the NRV is out of scope a later return
expression can't possibly alias it.

This fixes 58487 and half-fixes 53637: now one of the returns is elided, but
not the other.

Fixing the remaining xfails in these testcases will require a very different
approach, probably involving a full tree/block walk from finalize_nrv, and
check_return_expr only adding to a list of potential return variables.

PR c++/58487
PR c++/53637

gcc/cp/ChangeLog:

* cp-tree.h (INIT_EXPR_NRV_P): New.
* semantics.cc (finalize_nrv_r): Check it.
* name-lookup.h (decl_in_scope_p): Declare.
* name-lookup.cc (decl_in_scope_p): New.
* typeck.cc (check_return_expr): Allow non-NRV
returns if the NRV is no longer in scope.

gcc/testsuite/ChangeLog:

* g++.dg/opt/nrv26.C: New test.
* g++.dg/opt/nrv26a.C: New test.
* g++.dg/opt/nrv27.C: New test.
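
For orientation, a function of roughly the shape the commit message describes might look like the sketch below. This is an assumption for illustration, not one of the new testcases: the named variable lives in an inner block, and the other return is only reached once that variable is out of scope. `Counted` and `make` are hypothetical names.

```cpp
#include <cassert>

// Counts copy constructions so that elision can be observed if desired.
struct Counted {
    static int copies;
    int value;
    explicit Counted(int v) : value(v) {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
};
int Counted::copies = 0;

Counted make(bool use_named) {
    if (use_named) {
        Counted named(1);  // NRV candidate declared in an inner block
        return named;
    }
    return Counted(2);     // non-NRV return; `named` is no longer in scope
}

// Only the returned values are checked; how many copies happen depends on
// the compiler version and language mode.
void check_returned_values() {
    assert(make(true).value == 1);
    assert(make(false).value == 2);
}
```

Inspecting `Counted::copies` after a call to `make(true)` is one way to see whether a given compiler elides the copy of `named`.
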

[Bug c++/58487] Missed return value optimization

2023-06-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58487

--- Comment #7 from CVS Commits  ---
The trunk branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:28db36e2cfca1b7106adc8d371600fa3a325c4e2

commit r14-1624-g28db36e2cfca1b7106adc8d371600fa3a325c4e2
Author: Jason Merrill 
Date:   Wed Jun 7 05:15:02 2023 -0400

c++: allow NRV and non-NRV returns [PR58487]

Now that we support NRV from an inner block, we can also support non-NRV
returns from other blocks, since once the NRV is out of scope a later return
expression can't possibly alias it.

This fixes 58487 and half-fixes 53637: now one of the returns is elided, but
not the other.

Fixing the remaining xfails in these testcases will require a very different
approach, probably involving a full tree/block walk from finalize_nrv, and
check_return_expr only adding to a list of potential return variables.

PR c++/58487
PR c++/53637

gcc/cp/ChangeLog:

* cp-tree.h (INIT_EXPR_NRV_P): New.
* semantics.cc (finalize_nrv_r): Check it.
* name-lookup.h (decl_in_scope_p): Declare.
* name-lookup.cc (decl_in_scope_p): New.
* typeck.cc (check_return_expr): Allow non-NRV
returns if the NRV is no longer in scope.

gcc/testsuite/ChangeLog:

* g++.dg/opt/nrv26.C: New test.
* g++.dg/opt/nrv26a.C: New test.
* g++.dg/opt/nrv27.C: New test.

[Bug tree-optimization/109791] -Wstringop-overflow warning with -O3 and _GLIBCXX_USE_CXX11_ABI=0

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109791

--- Comment #17 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #16)
> (In reply to Richard Biener from comment #15)
> > Created attachment 55155 [details]
> > patch unfolding such PHIs
> > 
> > Updated PHI unfolding patch.  Tests fine besides mentioned diagnostic
> > regressions.
> 
> I was looking into doing the opposite in forwprop but maybe I can skip
> addresses.

Oh yes I see it was mentioned before in PR 102138.

[Bug tree-optimization/109959] `(a > 1) ? 0 : (a == 1)` is not optimized when spelled out at -O2+

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109959

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=102138

--- Comment #5 from Andrew Pinski  ---
This is basically PR 102138 .

[Bug tree-optimization/109959] `(a > 1) ? 0 : (a == 1)` is not optimized when spelled out at -O2+

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109959

--- Comment #6 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> This is basically PR 102138 .

Except it works at -O1 because the cast is pushed out of the phi by phiopt but
the cast is the same as a & 1 here :(.

For comment #0 we could just match this for unsigned type
a_2(D) > 1 ? 0 : a_2(D) == a_2(D) <= 1 ? a_2(D) : 0 -> (unsigned)(a == 1)

For comment #3 we need to pattern match this now:
  _7 = (_Bool) a_6(D);
  _9 = a_6(D) <= 1;
  _10 = _7 & _9;
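
The two shapes can be sanity-checked at the source level with a short sketch. Two things here are assumptions: the comment #0 line is read as saying `a > 1 ? 0 : a` and `a <= 1 ? a : 0` are the same expression, and the `(_Bool)` cast is modelled as `a & 1`, following the remark above that the cast is the same as `a & 1` here. The function names are made up.

```cpp
#include <cassert>
#include <climits>

// Comment #0 shape, in both spellings.
constexpr unsigned spelled_out(unsigned a) { return a > 1 ? 0u : a; }
constexpr unsigned flipped(unsigned a) { return a <= 1 ? a : 0u; }

// Comment #3 shape, with the truncating cast written as a & 1.
constexpr unsigned masked(unsigned a) {
    return (a & 1u) & static_cast<unsigned>(a <= 1);
}

// The form all of them should simplify to.
constexpr unsigned target(unsigned a) { return static_cast<unsigned>(a == 1); }

// Checks every value below `limit`, plus the largest unsigned value.
void check_all_forms(unsigned limit) {
    for (unsigned a = 0; a < limit; ++a) {
        assert(spelled_out(a) == target(a));
        assert(flipped(a) == target(a));
        assert(masked(a) == target(a));
    }
    assert(spelled_out(UINT_MAX) == 0u && masked(UINT_MAX) == 0u);
}
```
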

[Bug target/105617] [12/13/14 Regression] Slp is maybe too aggressive in some/many cases

2023-06-07 Thread already5chosen at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617

--- Comment #19 from Michael_S  ---
(In reply to Mason from comment #18)
> Hello Michael_S,
> 
> As far as I can see, massaging the source helps GCC generate optimal code
> (in terms of instruction count, not convinced about scheduling).
> 
> #include 
> typedef unsigned long long u64;
> void add4i(u64 dst[4], const u64 A[4], const u64 B[4])
> {
>   unsigned char c = 0;
>   c = _addcarry_u64(c, A[0], B[0], dst+0);
>   c = _addcarry_u64(c, A[1], B[1], dst+1);
>   c = _addcarry_u64(c, A[2], B[2], dst+2);
>   c = _addcarry_u64(c, A[3], B[3], dst+3);
> }
> 
> 
> On godbolt, gcc-{11.4, 12.3, 13.1, trunk} -O3 -march=znver1 all generate
> the expected:
> 
> add4i:
>         movq    (%rdx), %rax
>         addq    (%rsi), %rax
>         movq    %rax, (%rdi)
>         movq    8(%rsi), %rax
>         adcq    8(%rdx), %rax
>         movq    %rax, 8(%rdi)
>         movq    16(%rsi), %rax
>         adcq    16(%rdx), %rax
>         movq    %rax, 16(%rdi)
>         movq    24(%rdx), %rax
>         adcq    24(%rsi), %rax
>         movq    %rax, 24(%rdi)
>         ret
> 
> I'll run a few benchmarks to test optimal scheduling.

That's not merely "massaging the source". That's changing semantics.
Think about what happens when dst points to the middle of A or of B.
The change of semantics effectively prevented the vectorizer from doing harm.

And yes, for the common non-aliasing case the scheduling is problematic, too.
It would probably not cause a slowdown on the latest and greatest cores, but
could be slow on less great cores, including your default Zen1.
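
To make the aliasing point concrete, here is a portable sketch that avoids the intrinsic. `add_carry` is a hypothetical stand-in for `_addcarry_u64`, and the "store last" variant is an assumption about what a version that reads all inputs before writing would look like. With `dst` pointing one element into `A`, the two variants produce different results.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Portable add with carry-in and carry-out.
static unsigned add_carry(unsigned c, u64 a, u64 b, u64* out) {
    u64 s = a + b;
    unsigned c1 = s < a;   // carry out of a + b
    u64 r = s + c;
    unsigned c2 = r < s;   // carry out of adding the carry-in
    *out = r;
    return c1 | c2;
}

// Each limb is stored as soon as it is computed, as in the quoted rewrite.
void add4_store_each(u64* dst, const u64* a, const u64* b) {
    unsigned c = 0;
    for (int i = 0; i < 4; ++i) c = add_carry(c, a[i], b[i], dst + i);
}

// All inputs are consumed before the first store.
void add4_store_last(u64* dst, const u64* a, const u64* b) {
    u64 r[4];
    unsigned c = 0;
    for (int i = 0; i < 4; ++i) c = add_carry(c, a[i], b[i], &r[i]);
    for (int i = 0; i < 4; ++i) dst[i] = r[i];
}

// With dst overlapping the middle of A, early stores feed later loads.
void check_overlap_changes_result() {
    const u64 b[4] = {10, 10, 10, 10};
    u64 x[5] = {1, 2, 3, 4, 0};
    u64 y[5] = {1, 2, 3, 4, 0};
    add4_store_each(x + 1, x, b);
    add4_store_last(y + 1, y, b);
    bool differ = false;
    for (int i = 0; i < 5; ++i) differ = differ || (x[i] != y[i]);
    assert(differ);
}
```

When the three arrays do not overlap, both variants return the same four limbs.
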

[Bug tree-optimization/110165] New: [13/14 Regression] wrong code with signed 1 bit integers sometimes since r13-4459-g6508d5e5a1a8

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110165

Bug ID: 110165
   Summary: [13/14 Regression] wrong code with signed 1 bit
integers sometimes since r13-4459-g6508d5e5a1a8
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: pinskia at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Full testcase:
```
struct s
{
  int t : 1;
};

[[gnu::noipa]]
int f(struct s t, int a, int b)
{
    int bd = t.t;
    if (bd) a|=b;
    return a;
}

int main(void)
{
    struct s t;
    for(int i = 0;i <= 1; i++)
    {
        int a = 0x10;
        int b = 0x0f;
        int c = a | b;
        int r = f((struct s){i}, a, b);
        int exp = i == 1 ? a | b : a;
        if (exp != r)
            __builtin_abort();
    }
}
```

Found while improving these match patterns.

[Bug tree-optimization/110165] [13/14 Regression] wrong code with signed 1 bit integers sometimes since r13-4459-g6508d5e5a1a8

2023-06-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110165

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-06-07
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Target Milestone|--- |13.2
