[Bug c/110047] RFE: Add a warning for use of bare "unsigned" (possibly under -Wimplicit-int?)

2023-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110047

--- Comment #1 from Richard Biener  ---
Maybe just diagnose at the point of conversions that are not just sign
conversions but truncations/extensions?

Note even then this will have a high rate of false positives (I'm myself
always short-cutting 'unsigned int' to 'unsigned' ...) so it's more of
a coding-style diagnostic where then warning for all plain 'unsigned'
might be appropriate as well.

So, maybe split it even.  -Wconversion-bare-unsigned and -Wbare-unsigned?

[Bug target/110039] [14 Regression] FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tw[0-9]+ 2

2023-05-30 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110039

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 CC||rsandifo at gcc dot gnu.org

--- Comment #1 from rsandifo at gcc dot gnu.org ---
I guess adding an extra pattern means that we'll have three forms
for this (on top of the existing alt1 and alt2 patterns).  But that
probably can't be helped given that the DI form has presumably not
changed.

[Bug c/110048] undefined reference when build with O0

2023-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110048

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Richard Biener  ---
This is C99 inline semantics - the 'inline' function is only a declaration, not
a definition so you need an additional

void foo (void);

somewhere to create an out-of-line instance.

Or use -fgnu89-inline

[Bug target/110044] #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works

2023-05-30 Thread vital.had at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044

--- Comment #3 from Sergey Fedorov  ---
(In reply to Eric Gallager from comment #2)
> possible dup of either bug 60972 and/or bug 68160?

From those reports it looks like the bug, if it is the same one, has not been
addressed since GCC 4.9. Would it be helpful to compare against Apple's gcc
code, which seems to handle the issue correctly?

[Bug tree-optimization/110035] Missed optimization for dependent assignment statements

2023-05-30 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #6 from rguenther at suse dot de  ---
On Tue, 30 May 2023, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
> 
> Andrew Pinski  changed:
> 
>What|Removed |Added
> 
>Keywords||missed-optimization
>  Ever confirmed|0   |1
>Severity|normal  |enhancement
>Last reconfirmed||2023-05-30
>  Status|UNCONFIRMED |NEW
> 
> --- Comment #2 from Andrew Pinski  ---
> More obvious Reduced testcase:
> ```
> struct MyClass
> {
> unsigned long long arr[128];
> };
> 
> [[gnu::noipa]]
> void sink(void *m){}
> void gg(MyClass &a)
> {
>   MyClass c = a;
>   MyClass *b = new MyClass;
>   *b = c;
>   sink(b);
> }
> ```
> 
> There might be a dup of this issue too.

But we cannot move the load of 'a' across the call to operator new,
since that call can possibly clobber 'a' (a program can replace operator
new with a version that has observable side effects).

[Bug c/110048] New: undefined reference when build with O0

2023-05-30 Thread yinyuefengyi at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110048

Bug ID: 110048
   Summary: undefined reference when build with O0
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yinyuefengyi at gmail dot com
  Target Milestone: ---

The case below has failed to link at -O0 since GCC 5.1; is this a regression?
Clang has always failed to link it, though.
The case links successfully at -O1 and above, or when 'inline' is removed.

https://godbolt.org/z/9PEhWrov8

inline void foo(void)
{
}

int main(void)
{
  foo();
}

[Bug target/59666] IBM long double arithmetic results invalid in non-default rounding modes

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59666

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org,
   ||iains at gcc dot gnu.org

--- Comment #9 from Eric Gallager  ---
(In reply to Sergey Fedorov from comment #8)
> (In reply to Vincent Lefèvre from comment #1)
> > (In reply to Joseph S. Myers from comment #0)
> > It seems to be like that "by design" (though this is not satisfactory) and
> > part of the ppc64 ABI for instance:
> > 
> > http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#PREC
> > 
> > "The software support is restricted to round-to-nearest mode. Programs that
> > use extended precision must ensure that this rounding mode is in effect when
> > extended-precision calculations are performed."
> 
> Also true for AIX:
> https://www.ibm.com/docs/sr/xcafbg/9.0.0?topic=SS3KZ4_9.0.0/com.ibm.xlf111.bg.doc/xlfopg/fp-overview.html
> 
> Does anyone know whether this is also true for Darwin on PowerPC though?

I don't, but Iain might...

[Bug target/110044] #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #2 from Eric Gallager  ---
possible dup of either bug 60972 and/or bug 68160?

[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041

Eric Gallager  changed:

   What|Removed |Added

 CC||egallager at gcc dot gnu.org

--- Comment #4 from Eric Gallager  ---
GCC also has its own -Wmisleading-indentation flag; I wonder why that didn't
catch this?

[Bug c/110047] New: RFE: Add a warning for use of bare "unsigned" (possibly under -Wimplicit-int?)

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110047

Bug ID: 110047
   Summary: RFE: Add a warning for use of bare "unsigned"
(possibly under -Wimplicit-int?)
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: enhancement
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: egallager at gcc dot gnu.org
Blocks: 87403
  Target Milestone: ---

When I was first learning C, one thing that confused me was how you can just
use plain "unsigned" as a type, without specifying the length (long, short,
int, etc.). Thus, I thought that casting to unsigned would just change the sign
like a call to abs(), without realizing that there was an implicit "int"
involved. I made a testcase:

$ cat bare_unsigned.c
#include <limits.h>

unsigned var; /* debatable */

unsigned long foo(void)
{
long variable = LONG_MAX;
unsigned long uvariable = (unsigned)variable; /* warn here */
return uvariable;
}
$

The one where I added the "debatable" comment is debatable because I actually
see a lot of declarations in that form pretty often, and it's probably not very
harmful in that case, but the case with the cast, where it says "warn here", is
probably more deserving of a warning, as there's a change of size involved. It
might make sense to include this under -Wimplicit-int, or maybe create a new
warning -Wbare-unsigned for it?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
[Bug 87403] [Meta-bug] Issues that suggest a new warning

[Bug c/29970] mixing ({...}) with VLA leads to massive breakage

2023-05-30 Thread muecker at gwdg dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29970

--- Comment #18 from Martin Uecker  ---
What is not fixed is returning structs with VLA members as in the first three
test cases, e.g. the second one still ICEs.

[Bug middle-end/70802] IRA memory cost calculation incorrect for immediates

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70802

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71768
 Resolution|--- |FIXED
   Target Milestone|--- |8.0
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
This is also fixed for GCC 8 by r8-6056-g5cce817119cd31d18fbfc1c8245519d86b5e9.

[Bug rtl-optimization/71768] Missed trivial rematerialiation oppurtunity

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71768

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||7.5.0
  Known to work||14.0, 8.1.0, 9.1.0
   Target Milestone|--- |8.0
 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #4 from Andrew Pinski  ---
Fixed for GCC 8 by r8-6056-g5cce817119cd31d18fbfc1c8245519d86b5e9.

[Bug target/27663] missed-optimization transforming a byte array to unsigned long

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27663

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed|2007-07-25 17:29:16 |2023-5-30
  Component|middle-end  |target
  Known to fail||5.1.0

--- Comment #9 from Andrew Pinski  ---
Starting around GCC 5 or so, a call to __bswapsi2 is done here.

My bet is that if the avr target adds a bswapsi2 pattern (which either expands
or splits into the best moves), this will be optimized correctly.

[Bug target/109971] [14 regression] Several powerpc64 vector test cases fail after r14-1242-gf574e2dfae7905

2023-05-30 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971

--- Comment #10 from Kewen Lin  ---
(In reply to JuzheZhong from comment #9)
> (In reply to Kewen Lin from comment #8)
> > I did SPEC2017 int/fp evaluation on Power10 at Ofast and an extra explicit
> > --param=vect-partial-vector-usage=2 (the default is 1 on Power), baseline
> > r14-1241 vs. new r14-1242, the results showed that it can offer some
> > speedups for 500.perlbench_r 1.12%, 525.x264_r 1.96%, 544.nab_r 1.91%,
> > 549.fotonik3d_r 1.25%, but it degraded 510.parest_r by 5.01%.
> > 
> > I just tested Juzhe's new proposed fix which makes the loop closing iv
> > SCEV-ed, it can fix the degradation of 510.parest_r, also the miss
> > optimization on cunroll (in #c5), the test failures are gone as well. One
> > SPEC2017 re-evaluation with that fix is ongoing, I'd expect it won't degrade
> > anything.
> 
> Thanks so much. You mean you are trying this patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620086.html ?

Yes, it means that Richi's concern (niter analysis but all analyses relying on
SCEV are pessimized) does affect the exposed degradation and failures. Thanks
for looking into it.

> 
> I believe it can improve even more for IBM's target.

Hope so, I'll post the new SPEC2017 results once the run finishes.

By the way, the SPEC2017 run with --param=vect-partial-vector-usage=2 here is
mainly to verify the expectation on the decrement IV change. The normal
SPEC2017 runs still use --param=vect-partial-vector-usage=1, which isn't
affected by this change and which generally beats the former because of the
cost of setting up the length.

[Bug target/109971] [14 regression] Several powerpc64 vector test cases fail after r14-1242-gf574e2dfae7905

2023-05-30 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971

--- Comment #9 from JuzheZhong  ---
(In reply to Kewen Lin from comment #8)
> I did SPEC2017 int/fp evaluation on Power10 at Ofast and an extra explicit
> --param=vect-partial-vector-usage=2 (the default is 1 on Power), baseline
> r14-1241 vs. new r14-1242, the results showed that it can offer some
> speedups for 500.perlbench_r 1.12%, 525.x264_r 1.96%, 544.nab_r 1.91%,
> 549.fotonik3d_r 1.25%, but it degraded 510.parest_r by 5.01%.
> 
> I just tested Juzhe's new proposed fix which makes the loop closing iv
> SCEV-ed, it can fix the degradation of 510.parest_r, also the miss
> optimization on cunroll (in #c5), the test failures are gone as well. One
> SPEC2017 re-evaluation with that fix is ongoing, I'd expect it won't degrade
> anything.

Thanks so much. You mean you are trying this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620086.html ?

I believe it can improve even more for IBM's target.

[Bug target/109971] [14 regression] Several powerpc64 vector test cases fail after r14-1242-gf574e2dfae7905

2023-05-30 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971

Kewen Lin  changed:

   What|Removed |Added

   Keywords|testsuite-fail  |missed-optimization
   Assignee|linkw at gcc dot gnu.org   |juzhe.zhong at rivai dot ai

--- Comment #8 from Kewen Lin  ---
I did SPEC2017 int/fp evaluation on Power10 at Ofast and an extra explicit
--param=vect-partial-vector-usage=2 (the default is 1 on Power), baseline
r14-1241 vs. new r14-1242, the results showed that it can offer some speedups
for 500.perlbench_r 1.12%, 525.x264_r 1.96%, 544.nab_r 1.91%, 549.fotonik3d_r
1.25%, but it degraded 510.parest_r by 5.01%.

I just tested Juzhe's newly proposed fix, which makes the loop closing iv
SCEV-ed. It fixes the degradation of 510.parest_r and the missed optimization
on cunroll (in #c5), and the test failures are gone as well. One SPEC2017
re-evaluation with that fix is ongoing; I expect it won't degrade anything.

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-05-30 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038

--- Comment #2 from cuilili  ---
(In reply to Richard Biener from comment #1)
> Probably best to limit the values to reassoc-width by adding the
> appropriate IntegerRange attribute in params.opt
> 
> IntegerRange(0, 256)
> 
> maybe?

"rewrite_expr_tree_parallel" got a wrong width from "get_reassociation_width" 

The number of ops is 4, width is 2147483647.

get_reassociation_width:
...
  width_min = 1;
  while (width > width_min)
{
  int width_mid = (width + width_min) / 2;   --> (width + 1) out of bounds
...

Richard suggested that limiting tree-reassoc-width to IntegerRange(0, 256)
would solve the ICE. I also added a width constraint in
rewrite_expr_tree_parallel; here is the patch.


https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620154.html

1. Limit the value of tree-reassoc-width to IntegerRange(0, 256).
2. Add width limit in rewrite_expr_tree_parallel.

[Bug fortran/105847] namelist-object-name can be a renamed host associated entity

2023-05-30 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105847

--- Comment #5 from Jerry DeLisle  ---
Hi Steve, I will see if I can get all this tested and committed this coming
weekend.

[Bug c/50486] Missed -Wsign-conversion with signed -> unsigned casting and enums

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50486

Eric Gallager  changed:

   What|Removed |Added

Summary|No warning at signed -> unsigned casting |Missed -Wsign-conversion with signed -> unsigned casting and enums

--- Comment #4 from Eric Gallager  ---
updating the title a bit

[Bug c/29970] mixing ({...}) with VLA leads to massive breakage

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29970

--- Comment #17 from Eric Gallager  ---
(In reply to CVS Commits from comment #16)
> The master branch has been updated by Martin Uecker :
> 
> https://gcc.gnu.org/g:4e6bf0b9dd5585df1a1472d6a93b9fff72fe2524
> 
> commit r12-5338-g4e6bf0b9dd5585df1a1472d6a93b9fff72fe2524
> Author: Martin Uecker 
> Date:   Wed Nov 17 14:20:59 2021 +0100
> 
> Fix ICE when mixing VLAs and statement expressions [PR91038]
> 
> When returning VM-types from statement expressions, this can
> lead to an ICE when declarations from the statement expression
> are referred to later. Most of these issues can be addressed by
> gimplifying the base expression earlier in gimplify_compound_lval.
> Another issue is fixed by wrapping the pointer expression in
> pointer_int_sum. This fixes PR91038 and some of the test cases
> from PR29970 (structs with VLA members need further work).
> 
> gcc/
> PR c/91038
> PR c/29970
> * gimplify.c (gimplify_var_or_parm_decl): Update comment.
> (gimplify_compound_lval): Gimplify base expression first.
> (gimplify_target_expr): Add comment.
> 
> gcc/c-family/
> PR c/91038
> PR c/29970
> * c-common.c (pointer_int_sum): Make sure pointer expressions
> are evaluated first when the size expression depends on for
> variably-modified types.
> 
> gcc/testsuite/
> PR c/91038
> PR c/29970
> * gcc.dg/vla-stexp-3.c: New test.
> * gcc.dg/vla-stexp-4.c: New test.
> * gcc.dg/vla-stexp-5.c: New test.
> * gcc.dg/vla-stexp-6.c: New test.
> * gcc.dg/vla-stexp-7.c: New test.
> * gcc.dg/vla-stexp-8.c: New test.
> * gcc.dg/vla-stexp-9.c: New test.

Is this fixed now, or is it staying open for backports?

[Bug c++/55077] implement and enable by default -Wliteral-conversion

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55077

--- Comment #10 from Eric Gallager  ---
(In reply to David Binderman from comment #9)
> -Wfloat-conversion does the deed: any chance of getting it someplace useful
> like -Wall or -Wextra anytime soon ?
> 
> I will put it into my local compiler.

I think the point here is that the proposed -Wliteral-conversion warns for a
smaller number of cases than -Wfloat-conversion does, and thus would be safer
to enable more widely than -Wfloat-conversion is.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #7 from Frank J. T. Wojcik  ---
After playing with this a bit more, I was able to find a Generator which
produces a slightly wider bound, but still nowhere near 0x1.fep+127.
This also includes the requested change to SimpleGen.

#include <cstdint>
#include <cstdio>
#include <random>

#define MAXOFFSET 3

// Return 25-bit values centered around 0x100 (the halfway-point of all
// possible 25-bit outputs). Widening the Generator does not alter the
// program's output.
class SimpleGen {
 public:
using result_type = uint32_t;
result_type val, offset, ctr = 0;
static constexpr result_type min() { return 0; }
static constexpr result_type max() { return 0x1ff; }
result_type operator()()   { offset = (ctr & 1) ?
((ctr / 2) / (2 * MAXOFFSET)) :
((ctr / 2) % (2 * MAXOFFSET));
 val = 0x100 + offset - MAXOFFSET;
 printf("\tG 0x%07x\n", val);
 ++ctr; return val; }
};

int main(void) {
SimpleGen gen;
std::normal_distribution<float> norm {0, 1};

printf("min: %+f %a\nmax: %+f %a\n\n", norm.min(), norm.min(),
norm.max(), norm.max());

for (int i = 0; i < ((2 * MAXOFFSET) * (2 * MAXOFFSET - 1) * 2) ; i++) {
float r = norm(gen);
printf("%d %f %a\n", i, r, r);
}
}

Build output:
$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.cpp -fno-strict-aliasing
-fwrapv -fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Actual outputs (excerpts only!):
min: -340282346638528859811704183484516925440.00 -0x1.fep+127
max: +340282346638528859811704183484516925440.00 0x1.fep+127

G 0x100
G 0x0ff
30 -8.157336 -0x1.0508e6p+3
31 0.00 0x0p+0

G 0x101
G 0x0ff
32 -8.157336 -0x1.0508e6p+3
33 0.00 0x0p+0

G 0x0ff
G 0x100
40 0.00 0x0p+0
41 -8.157336 -0x1.0508e6p+3

G 0x102
G 0x100
42 0.00 0x0p+0
43 7.985583 0x1.ff13ccp+2

G 0x0ff
G 0x101
48 0.00 0x0p+0
49 -8.157336 -0x1.0508e6p+3

G 0x102
G 0x101
50 0.00 0x0p+0
51 7.985583 0x1.ff13ccp+2

G 0x100
G 0x102
58 7.985583 0x1.ff13ccp+2
59 0.00 0x0p+0

Expected outputs (excerpt only!):
min: -8.157336 -0x1.0508e6p+3
max: +7.985583 0x1.ff13ccp+2

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #6 from Frank J. T. Wojcik  ---
(In reply to Andrew Pinski from comment #5)
> 2 things, I think your result_type in SimpleGen needs to be public.

Sure, I'll change that.

> Second is LLVM's libc++ outputs:
> -inf -inf
> inf inf

I only have clang 6.0 on my system and its outputs are identical to g++'s. And
I do not have access to MSVC, but I believe it also produces similar values.
These data points are why I bring up the possibility that there is some detail
in the specification that I'm missing.

Still, the behavior I'm looking for is useful, and it does seem to be what the
spec is requiring, so I would like a fix if appropriate.

[Bug tree-optimization/110035] Missed optimization for dependent assignment statements

2023-05-30 Thread ptk.prasertsuk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #5 from Pontakorn Prasertsuk  ---
(In reply to Andrew Pinski from comment #3)
> We don't even optimize:
> ```
> struct MyClass
> {
> unsigned long long arr[128];
> };
> 
> [[gnu::noipa]]
> void sink(void *m);
> void gg(MyClass &a, MyClass *b)
> {
>   MyClass c = a;
>   *b = c;
>   sink(b);
> }
> ```
> 
> As I mentioned there are dups of the above testcase.

Would you mind pointing me to the original issue?

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #5 from Andrew Pinski  ---
2 things, I think your result_type in SimpleGen needs to be public.


Second is LLVM's libc++ outputs:
-inf -inf
inf inf

[Bug tree-optimization/110035] Missed optimization for dependent assignment statements

2023-05-30 Thread ptk.prasertsuk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #4 from Pontakorn Prasertsuk  ---
(In reply to Richard Biener from comment #1)
> Ick - convoluted C++.  We end up with
> 
> void ff (struct MyClass & obj)
> {
>   vector(2) long unsigned int vect_SR.16;
>   vector(2) long unsigned int vect_SR.15;
>   vector(2) long unsigned int vect_SR.14;
>   void * _6;
> 
>[local count: 1073741824]:
>   vect_SR.14_5 = MEM  [(struct MyClass &)obj_2(D)];
>   vect_SR.15_28 = MEM  [(struct MyClass &)obj_2(D) + 16];
>   vect_SR.16_30 = MEM  [(struct MyClass &)obj_2(D) + 32];
>   _6 = operator new (48);
>   MEM  [(struct MyClass2 *)_6] = vect_SR.14_5;
>   MEM  [(struct MyClass2 *)_6 + 16B] = vect_SR.15_28;
>   MEM  [(struct MyClass2 *)_6 + 32B] = vect_SR.16_30;
>   HandleMyClass2 (_6); [tail call]
> 
> and the issue is that 'operator new (48)' can alter what 'obj' points to,
> so we cannot move the loads across the call and we get spilling.
> 
> There is no inter-procedural analysis in GCC that would tell us that
> 'obj_2(D)' (the MyClass & obj argument of ff) does not point to an
> object that did not escape.  In fact 'ff' has global visibility
> and it might have other callers.
> 
> If you add -fwhole-program then you get the function inlined to main and
> 
> main:
> .LFB652:
>         .cfi_startproc
>         subq    $8, %rsp
>         .cfi_def_cfa_offset 16
>         movl    $48, %edi
>         call    _Znwm
>         movq    $0, (%rax)
>         movq    %rax, %rdi
>         movq    $0, 8(%rax)
>         movq    $0, 16(%rax)
>         movq    $0, 24(%rax)
>         movq    $0, 32(%rax)
>         movq    $0, 40(%rax)
>         call    _Z14HandleMyClass2Pv
>         xorl    %eax, %eax
>         addq    $8, %rsp
>         .cfi_def_cfa_offset 8
>         ret
> 
> (not using vectors because 'main' is considered cold).  Do you cite an
> inline copy of ff() for clang?

Hi Richard,

The clang snippet I provided is not inlined into 'main' function.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #4 from Frank J. T. Wojcik  ---
(In reply to Andrew Pinski from comment #3)
> With 24bit precision, maybe it is ~8 standard deviations away from the mean.
> But the generator argument can change for each call though so that does not
> mean the next call to operator() could produce one with more bits ...
> 
> Also the standard says: "as determined by the current values of d's
> parameters"
> 
> The parameters is only mean and standard deviations and not the generator.

I would agree with all of this also, I think. :)

But can you or someone demonstrate *any* generator which produces (e.g.) the
current value of max() for std::normal_distribution<float> {0, 1}? I can find
no generator implementation which does that, and by my reading of the
implementation there cannot be one.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #3 from Andrew Pinski  ---
With 24bit precision, maybe it is ~8 standard deviations away from the mean.
But the generator argument can change for each call though so that does not
mean the next call to operator() could produce one with more bits ...

Also the standard says: "as determined by the current values of d's parameters"

The parameters are only the mean and standard deviation, not the generator.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #2 from Frank J. T. Wojcik  ---
(In reply to Andrew Pinski from comment #1)
> The heavy weight goes to potentially. The way I understand it is not the max
> of what operator() has produced currently but will potentially return in the
> future.

And I would agree with that interpretation. My inquiry is based on the fact
that I can find *no* Generator outputs which produce the currently-given max()
value.

The goal of the demo program was to show how I arrived at my "7.985583" value
for what seems to be the actual maximum value, and that I didn't make up some
arbitrary value.

If it helps, imagine the min/max printf() to appear before any generation has
been done, or even without any generation. I would expect the same result,
modulo order of printouts.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #1 from Andrew Pinski  ---
The key word is "potentially". As I understand it, max() is not the largest
value operator() has produced so far but the largest it could potentially
return in the future.

[Bug libstdc++/110045] New: std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

Bug ID: 110045
   Summary: std::normal_distribution (and likely others)
give wrong min() and max() values
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gccbugs at elkpod dot com
  Target Milestone: ---

From my (non-expert) reading of the C++ spec (I'm using
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf for the
current 2023-05-10 draft), the min() and max() methods for
std::normal_distribution are returning incorrect values.

"28.5.3.6 Random number distribution requirements" says that "A class D meets
the requirements of a random number distribution if the expressions shown in
Table 97 are valid and have the indicated semantics", and that in Table 97 "x"
is a "(possibly const) value[s] of D", and that "x.min()" "Returns glb", and
that "x.max()" "Returns lub", and that "glb and lub are values of T
respectively corresponding to the greatest lower bound and the least upper
bound on the values potentially returned by d's operator(), as determined by
the current values of d's parameters".

By my reading, this means that if I have "std::normal_distribution<float> norm
{0, 1};", then "norm.max()" should return the least upper bound on the values
potentially returned by norm's operator(). It does not seem to do that. 

For example, the highest value I was able to get norm() to generate was
0x1.ff13ccp+2 (== 7.985583). norm.max() reports 0x1.fep+127 (==
340282346638528859811704183484516925440.00). While the values returned are
technically valid bounds, they do not seem to be the least upper or greatest
lower. I could find no Generator outputs which produced a value of
0x1.fep+127 from std::normal_distribution.

It is possible that my interpretation of the spec is wrong, and that somehow
the min/max values should encompass all possible instantiations of
std::normal_distribution, or some other loophole may exist. I could not find a
better forum to find the answer to that; sorry.

I have looked at the implementation of std::normal_distribution, and written
the following code to generate what I believe to be the actual extreme values
that are "potentially returned by norm's operator()".

Reproduction code:
#include <cstdint>
#include <cstdio>
#include <random>

int32_t offsets[8] = { -1, +0, +0, -1, +0, +1, +1, +0 };
class SimpleGen {
using result_type = uint32_t;
 public:
result_type val, ctr = 0;
static constexpr result_type min() { return 0; }
static constexpr result_type max() { return 0xff; }
result_type operator()()   { val = 0x80 + offsets[(ctr++)%8];
 printf("\tG 0x%06x\n", val);
 return val; }
};

int main(void) {
SimpleGen gen;
std::normal_distribution<float> norm {0, 1};

for (int i = 0; i < 8; i++) {
float r = norm(gen);
printf("%d %f %a\n", i, r, r);
}
printf("\n%f %a\n%f %a\n", norm.min(), norm.min(), norm.max(), norm.max());
}

Build output:
$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.c -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Output:
G 0x7f
G 0x80
0 0.00 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
G 0x80
G 0x7f
2 -7.985583 -0x1.ff13ccp+2
3 0.00 0x0p+0
G 0x80
G 0x81
4 7.985583 0x1.ff13ccp+2
5 0.00 0x0p+0
G 0x81
G 0x80
6 0.00 0x0p+0
7 7.985583 0x1.ff13ccp+2

-340282346638528859811704183484516925440.00 -0x1.fep+127
340282346638528859811704183484516925440.00 0x1.fep+127

Expected output:
G 0x7f
G 0x80
0 0.00 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
G 0x80
G 0x7f
2 -7.985583 -0x1.ff13ccp+2
3 0.00 0x0p+0
G 0x80
G 0x81
4 7.985583 0x1.ff13ccp+2
5 0.00 0x0p+0
G 0x81
G 0x80
6 0.00 0x0p+0
7 7.985583 0x1.ff13ccp+2

-7.985583 -0x1.ff13ccp+2
7.985583 0x1.ff13ccp+2

$ g++-12.2 -v
Using built-in specs.
COLLECT_GCC=g++-12.2
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-slackware-linux/12.2.0/lto-wrapper
Target: x86_64-slackware-linux
Configured with: ../gcc-12.2.0/configure --prefix=/usr/local
--program-suffix=-12.2 -enable-languages=c,c++,lto --enable-lto
--disable-multilib --with-gnu-ld --enable-threads --verbose
--target=x86_64-slackware-linux --build=x86_64-slackware-linux
--host=x86_64-slackware-linux --enable-tls --with-fpmath=avx
--enable-__cxa_atexit --enable-gnu-indirect-function --enable-bootstrap
--enable-libssp
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC)

[Bug target/108938] Missing bswap detection

2023-05-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938

Hongtao.liu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #17 from Hongtao.liu  ---
Fixed for GCC14.

[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float

2023-05-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804

--- Comment #6 from Hongtao.liu  ---
Fixed for GCC14.

[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804

--- Comment #5 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:3279b6223066d36d2e6880a137f80a46d3c82c8f

commit r14-1421-g3279b6223066d36d2e6880a137f80a46d3c82c8f
Author: liuhongt 
Date:   Wed Feb 22 17:54:46 2023 +0800

Enhance NARROW FLOAT_EXPR vectorization by truncating integer to lower
precision.

Similar to WIDEN FLOAT_EXPR, when the direct optab does not exist, try an
intermediate integer type whenever gimple ranger can tell it's safe.

i.e.
When there's no direct optab for vector long long -> vector float, but
the value range of the integer can be represented as int, try vector int
-> vector float if available.

gcc/ChangeLog:

PR tree-optimization/108804
* tree-vect-patterns.cc (vect_get_range_info): Remove static.
* tree-vect-stmts.cc (vect_create_vectorized_demotion_stmts):
Add new parameter narrow_src_p.
(vectorizable_conversion): Enhance NARROW FLOAT_EXPR
vectorization by truncating to lower precision.
* tree-vectorizer.h (vect_get_range_info): New declare.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr108804.c: New test.
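
For illustration only, a minimal C sketch of the scalar equivalence this commit relies on: when the 64-bit value is known to fit in an int, converting through a narrower integer gives the same float. The function names and the 0xffff mask are assumptions made for this example and are not taken from the testcase in the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Direct conversion: uint64_t -> float.  The mask bounds the value
   range to [0, 65535], which is what lets a range analysis prove the
   value fits in a 32-bit int.  (Hypothetical example.) */
void convert_direct(float *dst, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (float)(src[i] & 0xffff);
}

/* Same computation written with an explicit narrowing step:
   uint64_t -> int32_t -> float.  Because the masked value always fits
   in int32_t, the result is identical to convert_direct.  */
void narrow_then_convert(float *dst, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t narrowed = (int32_t)(src[i] & 0xffff);
        dst[i] = (float)narrowed;
    }
}
```

Whether a given loop is actually vectorized this way depends on the target and on what the vectorizer can prove; the sketch only shows why the narrowing is value-preserving.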

[Bug tree-optimization/110043] [14 Regression] ice in size_remaining, at pointer-query.cc:875

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110043

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-05-30
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Note you can make this valid C++ code too:
```
__int128 g_116_1;
extern char g_521[][8];
void func_24() {
  for (; g_116_1 >= 0;)
    g_521[g_116_1][g_116_1] &= 0;
}
```

Confirmed.

[Bug tree-optimization/110043] [14 Regression] ice in size_remaining, at pointer-query.cc:875

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110043

Andrew Pinski  changed:

   What|Removed |Added

Summary|ice in size_remaining, at   |[14 Regression] ice in
   |pointer-query.cc:875|size_remaining, at
   ||pointer-query.cc:875
  Component|c   |tree-optimization
   Target Milestone|--- |14.0
   Keywords||ice-on-valid-code

[Bug target/110044] #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ABI, wrong-code

--- Comment #1 from Andrew Pinski  ---
I suspect the issue is inside darwin_rs6000_special_round_type_align.
But I can't seem to figure it out just by looking at the code.

[Bug c/109836] -Wpointer-sign should be enabled by default

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109836

Eric Gallager  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #5 from Eric Gallager  ---
(In reply to Eric Gallager from comment #4)
> How about:
> 
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 0d0ad0a6374..f046d91d03b 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1178,7 +1178,7 @@ C ObjC C++ ObjC++ Var(warn_pointer_arith) Warning
> LangEnabledBy(C ObjC C++ ObjC+
>  Warn about function pointer arithmetic.
>  
>  Wpointer-sign
> -C ObjC Var(warn_pointer_sign) Warning LangEnabledBy(C ObjC,Wall ||
> Wpedantic)
> +C ObjC Var(warn_pointer_sign) Warning LangEnabledBy(C ObjC,Wall ||
> Wpedantic || Wextra)
>  Warn when a pointer differs in signedness in an assignment.
>  
>  Wpointer-compare

I sent this to the gcc-patches mailing list:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620137.html
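
For readers unfamiliar with the option, here is a small C sketch of the kind of signedness mismatch -Wpointer-sign reports. The function name is made up for this example; the cast shows one way to make the conversion explicit.

```c
#include <string.h>

/* Hypothetical helper: length of a NUL-terminated byte string held in
   an unsigned char buffer.  */
size_t byte_string_length(const unsigned char *buf)
{
    /* Writing strlen(buf) here would pass 'const unsigned char *'
       where 'const char *' is expected; in C, GCC reports that
       mismatch under -Wpointer-sign.  The explicit cast makes the
       conversion intentional and avoids the warning.  */
    return strlen((const char *)buf);
}
```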

[Bug target/110044] New: #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works

2023-05-30 Thread vital.had at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044

Bug ID: 110044
   Summary: #pragma pack(push, 1) may not force packing, while
__attribute__((packed, aligned(1))) works
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vital.had at gmail dot com
CC: iains at gcc dot gnu.org
  Target Milestone: ---
Target: powerpc-apple-darwin

Problem: #pragma pack(push, 1) may not work correctly on ppc (32-bit); seems to
be present across GCC versions, confirmed to affect gcc7, gcc11 and gcc12.
Old Apple GCC 4.2 is not affected, at the same time.


Test code:

#include <iostream>
#include <stdint.h>

#pragma pack(push, 1)

/* struct from OpenEXR; should be packed with the pragma directive  */
typedef struct
{
uint32_t x_size;
uint32_t y_size;
uint8_t  level_and_round;
} exr_attr_tiledesc_t;

/* same struct but reordered */
typedef struct
{
uint8_t  level_and_round;
uint32_t x_size;
uint32_t y_size;
} new1_exr_attr_tiledesc_t;

/* same as first struct but with packed forced */
typedef struct
{
uint32_t x_size;
uint32_t y_size;
uint8_t  level_and_round;
} __attribute__((packed, aligned(1))) new2_exr_attr_tiledesc_t;

#pragma pack(pop)

int main() {
std::cout << sizeof(exr_attr_tiledesc_t) << " "
  << sizeof(new1_exr_attr_tiledesc_t) << " "
  << sizeof(new2_exr_attr_tiledesc_t) << "\n";

return 0;
}


On Mac OS X Leopard (10.5 PowerPC):
`g++-mp-7 main.cxx && ./a.out` gives: 12 9 9
`g++ main.cxx && ./a.out`  gives: 9  9 9
`g++* -arch ppc64 && ./a.out`  gives: 9 9 9

On Mac OS X Snow Leopard (10A190 PowerPC):
`g++-mp-11 main.cxx && ./a.out` gives: 12 9 9
`g++-mp-12 main.cxx && ./a.out` gives: 12 9 9
`g++ main.cxx && ./a.out`   gives: 9 9 9

where g++ stands for Xcode gcc-4.2.

Discussion in: https://github.com/macports/macports-ports/pull/18872
Also see: https://trac.macports.org/ticket/63490
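
A plain C variant of the check above may help when bisecting. It is only a sketch: the type and function names are invented for this example. The attribute-packed layout is asserted at compile time, since the report shows it as 9 everywhere, and the pragma-packed layout is queried at run time so that an affected target can report the unexpected value instead of failing to build.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)
/* Packed only by the surrounding pragma (same members as the OpenEXR
   struct in the report; the type name is hypothetical).  */
typedef struct {
    uint32_t x_size;
    uint32_t y_size;
    uint8_t  level_and_round;
} pragma_packed_t;
#pragma pack(pop)

/* Packed by the explicit attribute instead.  */
typedef struct {
    uint32_t x_size;
    uint32_t y_size;
    uint8_t  level_and_round;
} __attribute__((packed, aligned(1))) attr_packed_t;

/* The attribute form is reported as 9 bytes on every compiler listed
   above, so it is safe to check at compile time.  */
_Static_assert(sizeof(attr_packed_t) == 9, "attribute-packed size");

/* Run-time queries; on an affected target the first is reported to
   return 12 rather than 9.  */
size_t pragma_packed_size(void) { return sizeof(pragma_packed_t); }
size_t attr_packed_size(void)   { return sizeof(attr_packed_t); }

/* Offset of the trailing byte member in the pragma-packed struct.  */
size_t pragma_packed_tail_offset(void)
{
    return offsetof(pragma_packed_t, level_and_round);
}
```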

[Bug c/109826] Incompatible pointer types in ?: not covered by -Wincompatible-pointer-types

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109826

Eric Gallager  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-05-30
 Ever confirmed|0   |1

--- Comment #4 from Eric Gallager  ---
(anyways, confirmed)

[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

--- Comment #6 from Andrew Pinski  ---
Created attachment 55219
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55219&action=edit
untested patch

I am going to test this on both x86_64 and aarch64 tonight.

[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2023-05-30
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #5 from Andrew Pinski  ---
I have a patch which adds support for paradoxical subregs.
Since paradoxical subregs as the dest always assign the full register still,
there is no reason to reject it.

[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

--- Comment #4 from Andrew Pinski  ---
Because of the subreg.

[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

--- Comment #3 from Andrew Pinski  ---
bb_valid_for_noce_process_p returns false for the zero_extract case ...

[Bug c/110043] New: ice in size_remaining, at pointer-query.cc:875

2023-05-30 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110043

Bug ID: 110043
   Summary: ice in size_remaining, at pointer-query.cc:875
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

For this C source code:

__int128 g_116_1;
char g_521[][8];
func_24() {
  for (; g_116_1 >= 0;)
    g_521[g_116_1][g_116_1] &= 0;
}

compiled by recent gcc, does this:

$ ~/gcc/results/bin/gcc -c -w -O1 bug927.c
during GIMPLE pass: waccess
bug927.c: In function ‘func_24’:
bug927.c:3:1: internal compiler error: in size_remaining, at
pointer-query.cc:875
3 | func_24() {
  | ^~~
0xd1823d
access_ref::size_remaining(generic_wide_int
 >*) const
../../trunk.year/gcc/pointer-query.cc:875

The bug seems to exist since sometime before 20220515.

[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension

2023-05-30 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592

--- Comment #10 from Jeffrey A. Law  ---
Created attachment 55218
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55218&action=edit
(Incomplete) Patch

[Bug tree-optimization/108041] ivopts results in extra instruction in simple loop

2023-05-30 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041

--- Comment #4 from Jeffrey A. Law  ---
Patch was for a different problem.  Sorry.

[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension

2023-05-30 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592

--- Comment #9 from Jeffrey A. Law  ---
Weird, I don't see the attachment either.  I'll extract & upload it again.

WRT costing.  fwprop and combine will both query the target rtx costs and will
reject when the target costing model indicates the change isn't actually
profitable.

As you'd noted before, combine will internally transform a sign/zero extension
into a pair of shifts.  The whole point of that internal canonicalization is to
expose cases where the shifts can combine with other nearby operations.  So
there's no significant risk to detecting and creating the extension form
earlier.
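
As a source-level illustration of the shift-pair form mentioned above, here is a minimal C sketch for an 8-bit field in a 32-bit value. The function names are made up, and the signed variant relies on GCC's documented implementation-defined behavior (out-of-range unsigned-to-signed conversion wraps, and >> on a negative value is an arithmetic shift).

```c
#include <stdint.h>

/* Sign-extend the low 8 bits using a left/right shift pair.  The left
   shift is done on the unsigned value to avoid signed overflow; the
   conversion to int32_t and the arithmetic right shift are
   implementation-defined in ISO C but well defined with GCC.  */
int32_t sext8_via_shifts(uint32_t x)
{
    return (int32_t)(x << 24) >> 24;
}

/* The same operation written as a plain narrowing conversion.  */
int32_t sext8_direct(uint32_t x)
{
    return (int8_t)x;
}

/* Zero-extend the low 8 bits using a logical shift pair.  */
uint32_t zext8_via_shifts(uint32_t x)
{
    return (x << 24) >> 24;
}

/* The same operation written as a plain narrowing conversion.  */
uint32_t zext8_direct(uint32_t x)
{
    return (uint8_t)x;
}
```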

[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error

2023-05-30 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822

Matthias Kretz (Vir)  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Matthias Kretz (Vir)  ---
Resolved on all branches.

[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822

--- Comment #5 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Matthias Kretz
:

https://gcc.gnu.org/g:39a60f2d5f7bf6806a4c4d7d1f52f139e157e01a

commit r11-10835-g39a60f2d5f7bf6806a4c4d7d1f52f139e157e01a
Author: Matthias Kretz 
Date:   Fri May 26 12:23:44 2023 +0200

libstdc++: Correct NTTP and simd_mask ctor call

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.

(cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)

[Bug testsuite/52641] Test cases fail for 16-bit int targets

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52641

--- Comment #21 from CVS Commits  ---
The master branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

commit r14-1418-ge4c986fde56a6248f8fbe6cf0704e1da34b055d8
Author: Georg-Johann Lay 
Date:   Tue May 30 22:04:57 2023 +0200

testsuite/52641: Fix more of implicit int=32 assumption fallout.

gcc/testsuite/
PR testsuite/52641
* gcc.dg/torture/pr107451.c: Require int32plus.
* gcc.dg/torture/pr108574-3.c: Use __INT32_TYPE__ instead of int.
* gcc.dg/torture/pr109940.c: Use __INTPTR_TYPE__ instead of long.
* gcc.dg/torture/pr95248.c: Require size24plus.
* gcc.dg/torture/pr95295-3.c: Use var_* with at least 32 bits int.
* gcc.dg/torture/pr98640.c: Cast to __INT32_TYPE__ instead of int.
* gcc.dg/tree-ssa/pr103771.c: Use int with at least 32 bits.
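
A minimal sketch of the pattern these testsuite changes apply, assuming a compiler that predefines __INT32_TYPE__ (GCC does). The typedef and function names are invented for this example.

```c
/* __INT32_TYPE__ expands to whichever basic type is exactly 32 bits
   wide on the target, so the code does not depend on 'int' being
   32 bits.  */
typedef __INT32_TYPE__ int32;

/* With a 16-bit 'int', the expression '1 << 20' would overflow.
   Casting the operand to the 32-bit type first keeps the shift well
   defined on such targets.  */
int32 bit20_mask(void)
{
    return (int32)1 << 20;
}
```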

[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822

--- Comment #4 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Matthias Kretz
:

https://gcc.gnu.org/g:467887d5750d03d438ab704437b2c5e5da78497e

commit r12-9666-g467887d5750d03d438ab704437b2c5e5da78497e
Author: Matthias Kretz 
Date:   Fri May 26 12:23:44 2023 +0200

libstdc++: Correct NTTP and simd_mask ctor call

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.

(cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)

[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction

2023-05-30 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #50 from Oleg Endo  ---
Actually, let's take any further discussion of shift patterns to PR 54089.

[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction

2023-05-30 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #49 from Oleg Endo  ---
(In reply to Alexander Klepikov from comment #48)
> I made tests (including *.c files from GCC testsuite) and everything looks
> fine for now. But I'm still afraid that pattern for 'ashrsi3_libcall_expand'
> is too wide. It is possible to narrow it down as much as possible by adding
> distinct attribute and set when emitting 'ashrsi3_libcall_collapsed' and
> then check it and fail if not set:
> 

For this kind of change, the whole GCC test suite needs to be run for at least
big/little -m2,-m4 variants.


+(define_insn_and_split "ashrsi3_libcall_expand"
+  [(parallel [(set (match_operand:SI 0 "arith_reg_dest")
+   (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand")
+   (match_operand:SI 2 "const_int_operand"))
+   )(clobber (reg:SI T_REG))
+   (clobber (reg:SI PR_REG))
+  ])]

The 'parallel' construct looks strange.

[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

--- Comment #2 from Andrew Pinski  ---
Here is a bitfield testcase which shows this was a latent issue:
```
struct f
{
  unsigned t:3;
  unsigned t1:4;
};
unsigned f2(struct f);
unsigned
f1(int t, struct f y)
{
  int tt = 0;
  if(t)
    tt = y.t1;
  return tt;
}
```


We should produce:
```
ubfx w8, w1, #3, #4
cmp w0, #0
csel w0, wzr, w8, eq
ret
```

But currently produces:
```
cbz w0, .L3
ubfx x0, x1, 3, 4
ret
.L3:
mov w0, 0
ret
```

The IR is similar too:
```
(insn 11 10 12 3 (set (subreg:DI (reg:QI 96) 0)
(zero_extract:DI (subreg:DI (reg/v:SI 95 [ y ]) 0)
(const_int 4 [0x4])
(const_int 3 [0x3]))) "/app/example.cpp":12:11 832 {*extzvdi}
 (expr_list:REG_DEAD (reg/v:SI 95 [ y ])
(nil)))
(insn 12 11 24 3 (set (reg:SI 93 [  ])
(zero_extend:SI (reg:QI 96))) "/app/example.cpp":13:10 146
{*zero_extendqisi2_aarch64}
 (expr_list:REG_DEAD (reg:QI 96)
(nil)))
```
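
For reference, a source-level sketch of what the if-converted form computes for the bitfield testcase: the field is extracted unconditionally and then selected against zero. The names f1_branchy and f1_select are hypothetical, and the sketch says nothing about which form the RTL if-conversion pass actually picks.

```c
struct f
{
  unsigned t:3;
  unsigned t1:4;
};

/* Same shape as the testcase: the extraction sits behind a branch.  */
unsigned f1_branchy(int t, struct f y)
{
  int tt = 0;
  if (t)
    tt = y.t1;
  return tt;
}

/* Branchless shape: extract first, then select.  This corresponds to
   the ubfx + csel sequence shown as the desired output.  */
unsigned f1_select(int t, struct f y)
{
  unsigned extracted = y.t1;
  return t ? extracted : 0u;
}
```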

[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

--- Comment #1 from Andrew Pinski  ---
I am still looking into this. This is definitely a latent bug and maybe even
can be reproduced with some bitfield extractions too.

[Bug rtl-optimization/110042] New: [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042

Bug ID: 110042
   Summary: [14 Regression] missed cmov optimization after
r14-1014-gc5df248509b489364c573e8
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64-linux-gnu

Take:
```
unsigned
f1(int t, int t1)
{
  int tt = 0;
  if(t)
    tt = (t1&0x8)!=0;
  return tt;
}
```
On aarch64 we should produce:
```
cmp w0, 0
ubfx x0, x1, 3, 1
csel w0, w0, wzr, ne
ret
```

But on the trunk we get:
```
cbz w0, .L3
ubfx x0, x1, 3, 1
ret
.p2align 2,,3
.L3:
mov w0, 0
ret
```


The difference in the IR is:
old:
```
(insn 11 10 12 3 (set (reg:SI 97)
(lshiftrt:SI (reg/v:SI 96 [ t1 ])
(const_int 3 [0x3]))) "/app/example.cpp":7:18 782
{*aarch64_lshr_sisd_or_int_si3}
 (expr_list:REG_DEAD (reg/v:SI 96 [ t1 ])
(nil)))
(insn 12 11 25 3 (set (reg:SI 94 [  ])
(and:SI (reg:SI 97)
(const_int 1 [0x1]))) "/app/example.cpp":7:18 533 {andsi3}
 (expr_list:REG_DEAD (reg:SI 97)
(nil)))
```
new:
```
(insn 11 10 12 3 (set (subreg:DI (reg:SI 97) 0)
(zero_extract:DI (subreg:DI (reg/v:SI 96 [ t1 ]) 0)
(const_int 1 [0x1])
(const_int 3 [0x3]))) "/app/example.cpp":7:18 832 {*extzvdi}
 (expr_list:REG_DEAD (reg/v:SI 96 [ t1 ])
(nil)))
(insn 12 11 24 3 (set (reg:SI 94 [  ])
(reg:SI 97)) "/app/example.cpp":8:10 64 {*movsi_aarch64}
 (expr_list:REG_DEAD (reg:SI 97)
(nil)))
```
noce_try_cmove_arith handles the old one but not the new one for some reason:
```
if-conversion succeeded through noce_try_cmove_arith
```

[Bug tree-optimization/50286] Missed optimization, fails to propagate bool

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50286

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED
   Target Milestone|--- |13.0

--- Comment #3 from Andrew Pinski  ---
Fixed in GCC 13.

Checking profitability of path (backwards):  bb:3 (8 insns) bb:5 (latch)
  Control statement insns: 2
  Overall: 6 insns

 Registering killing_def (path_oracle) i_9
 Registering killing_def (path_oracle) _10
 Registering killing_def (path_oracle) _11
Checking profitability of path (backwards): 
  [1] Registering jump thread: (5, 3) incoming edge;  (3, 4) nocopy; 
path: 5->3->4 SUCCESS
Jump threading proved probability of edge 3->4 too small (it is 11.0% (guessed)
should be always (guessed))

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-05-30 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #11 from Georg-Johann Lay  ---
Created attachment 55217
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55217&action=edit
Proposed patch for postreload.cc to record clobbers of next insn + test case.

This patch solves the problem for avr and tests with no additional regressions
for avr.

--

rtl-optimization/101188: Don't bypass clobbers of some insns that are
optimized or are optimization candidates.

gcc/
PR rtl-optimization/101188
* postreload.cc (reload_cse_move2add): Record clobbers of next
insn using move2add_note_store.

gcc/testsuite/
PR rtl-optimization/101188
* gcc.c-torture/execute/pr101188.c: New test.

[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation

2023-05-30 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041

Uroš Bizjak  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
Version|unknown |14.0
 Resolution|--- |FIXED
   Target Milestone|--- |14.0
 Target||x86

--- Comment #3 from Uroš Bizjak  ---
Fixed.

[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:2720bbd597f56742a17119dfe80edc2ba86af255

commit r14-1416-g2720bbd597f56742a17119dfe80edc2ba86af255
Author: Uros Bizjak 
Date:   Tue May 30 20:38:20 2023 +0200

i386: Fix misleading identation in i386-expand.cc [PR110041]

gcc/ChangeLog:

PR target/110041
* config/i386/i386-expand.cc (ix86_expand_vecop_qihi2):
Fix misleading identation.

[Bug tree-optimization/58483] missing optimization opportunity for const std::vector compared to std::array

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58483

--- Comment #17 from Andrew Pinski  ---
Here is related reduced testcase:
```

int f(void)
{
int tt = 100;
int t[3] = {10,20,30};
int *t1 = new int[3];
__builtin_memcpy(t1, t, sizeof(t));
for(int *i = t1; i != &t1[3]; i++)
  tt += *i;
delete[] t1;
return tt;
}
```
Note in the above testcase we can remove the memcpy but not the operator
new/delete.  This is unlike the original testcase where memcpy is not removed
either.
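
A rough C analogue of the reduced testcase, using malloc/free in place of operator new/delete (that substitution is an assumption of this sketch, not something from the report). It only shows the value the loop computes, i.e. what a fully folded version would return; it makes no claim about what GCC does with the malloc form.

```c
#include <stdlib.h>
#include <string.h>

/* C analogue of the testcase: copy three constants into a heap buffer
   and sum them on top of the initial value 100.  */
int sum_via_heap(void)
{
    int tt = 100;
    int t[3] = {10, 20, 30};
    int *t1 = malloc(sizeof t);
    if (t1 == NULL)
        return -1;                 /* allocation failure */
    memcpy(t1, t, sizeof t);
    for (int *i = t1; i != t1 + 3; i++)
        tt += *i;
    free(t1);
    return tt;
}

/* What the function reduces to once the copy, the loop and the
   allocation are all folded away: 100 + 10 + 20 + 30.  */
int sum_folded(void)
{
    return 160;
}
```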

[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation

2023-05-30 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041

David Binderman  changed:

   What|Removed |Added

 CC||uros at gcc dot gnu.org

--- Comment #1 from David Binderman  ---
Adding author of code.

[Bug target/110041] New: gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation

2023-05-30 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041

Bug ID: 110041
   Summary: gcc/config/i386/i386-expand.cc:23394:5: warning:
misleading indentation
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

I just tried a build of gcc trunk with clang. It said:

gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation;
statement is not part of the previous 'else' [-Wmisleading-indentation]

git blame says:

52ff3f7b86 (Uros Bizjak  2023-05-25 19:40:26 +0200 23394) if
(code != MULT && op2vec)

It might be worth tidying this up.

[Bug debug/63572] [10/11/12/13/14 Regression] ICF breaks user debugging experience

2023-05-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63572

--- Comment #31 from Jakub Jelinek  ---
In theory, what we could do (expensive though) is keep the IL of the functions
that were ICF merged with the picked up candidate, just mark them in cgraph
specially so that e.g. IPA doesn't consider references to functions/vars from
the other copies as distinct references, compile those functions right after
compiling their chosen ICF winner (or right before it), but don't emit into
assembly, instead compare with how the ICF winner was compiled
and emit just debug info for the other copies after building some mapping
between the debug related labels in the different functions.  If we compiled it
into different code, something bad happened (e.g. some debug counter or
similar) and we'd just not emit the debug info for the other copies (like we
don't emit it currently for those).

[Bug tree-optimization/110035] Missed optimization for dependent assignment statements

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #3 from Andrew Pinski  ---
We don't even optimize:
```
struct MyClass
{
unsigned long long arr[128];
};

[[gnu::noipa]]
void sink(void *m);
void gg(MyClass &a, MyClass *b)
{
  MyClass c = a;
  *b = c;
  sink(b);
}
```

As I mentioned there are dups of the above testcase.

[Bug middle-end/106776] Unexpected use-after-free warning

2023-05-30 Thread drfiemost at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106776

--- Comment #6 from Leandro Nini  ---
Can't reproduce anymore with gcc 13.1.0
Still there in gcc 12.3.0

[Bug tree-optimization/110035] Missed optimization for dependent assignment statements

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Ever confirmed|0   |1
   Severity|normal  |enhancement
   Last reconfirmed||2023-05-30
 Status|UNCONFIRMED |NEW

--- Comment #2 from Andrew Pinski  ---
In the case of x86_64, it is just moving the loads across the operator new, I
think:
  vect_SR.14_5 = MEM  [(struct MyClass
&)obj_2(D)];
  vect_SR.15_28 = MEM  [(struct MyClass &)obj_2(D)
+ 16];
  vect_SR.16_30 = MEM  [(struct MyClass &)obj_2(D)
+ 32];
  _6 = operator new (48);
  MEM  [(struct MyClass2 *)_6] = vect_SR.14_5;
  MEM  [(struct MyClass2 *)_6 + 16B] =
vect_SR.15_28;
  MEM  [(struct MyClass2 *)_6 + 32B] =
vect_SR.16_30;
  HandleMyClass2 (_6); [tail call]

Other targets are moving across the operator new too:

  D.14580.__obj = *obj_2(D);
  _6 = operator new (48);
  MEM[(struct MyClass2 *)_6].f = D.14580;


More obvious Reduced testcase:
```
struct MyClass
{
unsigned long long arr[128];
};

[[gnu::noipa]]
void sink(void *m){}
void gg(MyClass &a)
{
  MyClass c = a;
  MyClass *b = new MyClass;
  *b = c;
  sink(b);
}
```

There might be a dup of this issue too.

[Bug tree-optimization/101024] Missed min expression at phiopt1

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101024

--- Comment #10 from Andrew Pinski  ---
The only thing left to do to remove minmax_replacement, is the improvement
mentioned in PR 95699 (or rather r11-1504-g2e0f4a18bc978c7362 ).

[Bug target/87913] max(n, 1) code generation

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87913

--- Comment #7 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:45466eecf5ef669164c0922e5be8fd288b144886

commit r14-1412-g45466eecf5ef669164c0922e5be8fd288b144886
Author: Andrew Pinski 
Date:   Tue May 16 14:26:41 2023 -0700

Add a != MIN/MAX_VALUE_CST ? CST-+1 : a to minmax_from_comparison

This patch adds the support for match that was implemented for PR 87913 in
phiopt.
It implements it by adding support to minmax_from_comparison for the check.
It uses the range information if available which allows to produce MIN/MAX
expression
when comparing against the lower/upper bound of the range instead of
lower/upper
of the type.

minmax-20.c is the new testcase which tests the ranges part.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* fold-const.cc (minmax_from_comparison): Add support for NE_EXPR.
* match.pd ((cond (cmp (convert1? x) c1) (convert2? x) c2)
pattern):
Add ne as a possible cmp.
((a CMP b) ? minmax : minmax pattern): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/minmax-22.c: New test.
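
To make the pattern in the commit title concrete, here is a small C sketch of the identity it uses: comparing against the minimum (or maximum) value and substituting the neighbouring constant is the same as a MAX (or MIN). Function names are made up; the type-bound case is shown, while the commit also covers bounds known from range information.

```c
#include <limits.h>

/* a != MIN ? a : MIN + 1, for unsigned where MIN is 0.
   This is max(n, 1), the form discussed in PR 87913.  */
unsigned max1_ternary(unsigned n)
{
    return n != 0 ? n : 1;
}

/* Explicit MAX form for comparison.  */
unsigned max1_explicit(unsigned n)
{
    return n > 1 ? n : 1;
}

/* a != MAX ? a : MAX - 1, for int.  This is min(a, INT_MAX - 1).  */
int min_ternary(int a)
{
    return a != INT_MAX ? a : INT_MAX - 1;
}

/* Explicit MIN form for comparison.  */
int min_explicit(int a)
{
    return a < INT_MAX - 1 ? a : INT_MAX - 1;
}
```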

[Bug target/106907] gcc/config/rs6000/rs6000.cc:23155: strange expression ?

2023-05-30 Thread jeevitha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106907

Jeevitha  changed:

   What|Removed |Added

 CC||jeevitha at gcc dot gnu.org

--- Comment #4 from Jeevitha  ---
(In reply to Andreas Schwab from comment #3)
> Should probably be written as swapped != !BYTES_BIG_ENDIAN.

I bootstrapped and regtested; there is no regression with this change.

[Bug target/110040] rs6000 port emits dead mfvsrd instruction for simple test case

2023-05-30 Thread jeevitha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110040

Jeevitha  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 CC||bergner at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jeevitha at gcc dot gnu.org
   Keywords||missed-optimization
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2023-05-30
 Ever confirmed|0   |1
 Target||powerpc64le-linux

--- Comment #1 from Jeevitha  ---
I am working on this.

[Bug target/110040] New: rs6000 port emits dead mfvsrd instruction for simple test case

2023-05-30 Thread jeevitha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110040

Bug ID: 110040
   Summary: rs6000 port emits dead mfvsrd instruction for simple
test case
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jeevitha at gcc dot gnu.org
  Target Milestone: ---

GCC Trunk generates a dead mfvsrd for the following test case.

[jeevitha@ltcden2-lp1 ~]$ cat bug.c 
#include <altivec.h>

void
foo (signed long *dst, vector signed __int128 src)
{
  *dst = (signed long) src[0];
}

[jeevitha@ltcden2-lp1 ~]$ gcc bug.c  -O2 -mcpu=power9 -S -o bug.s
[jeevitha@ltcden2-lp1 ~]$ cat bug.s
.file   "bug.c"
.machine power9
.abiversion 2
.section".text"
.align 2
.p2align 4,,15
.globl foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
mfvsrd 11,34   #dead instruction
mfvsrld 10,34
std 10,0(3)
blr

[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822

--- Comment #3 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Matthias Kretz
:

https://gcc.gnu.org/g:717a14e727bce409ac7e7f10c413530e704f4ca7

commit r13-7393-g717a14e727bce409ac7e7f10c413530e704f4ca7
Author: Matthias Kretz 
Date:   Fri May 26 12:23:44 2023 +0200

libstdc++: Correct NTTP and simd_mask ctor call

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.

(cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)

[Bug target/110027] Misaligned vector store on detect_stack_use_after_return

2023-05-30 Thread oconnor663 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027

Jack O'Connor  changed:

   What|Removed |Added

 CC||oconnor663 at gmail dot com

--- Comment #3 from Jack O'Connor  ---
Thanks to Andrew Pinski's comment about -fstack-protector-strong, I can now
reproduce this issue on Godbolt: https://godbolt.org/z/47a695sWY. So the
minimal set of flags to reproduce on most distros (other than Arch Linux) is:
-mavx512f -fsanitize=address -fstack-protector-strong

[Bug target/110026] [Bug] 5% performance drop on important benchmark after r260951.

2023-05-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110026

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ra

--- Comment #3 from Andrew Pinski  ---
(In reply to d_vampile from comment #2)
> O0 does miss a lot of optimizations. However, for the problem I mentioned,
> the GPRs used before and the FP registers after modification are used. When
> vectorization is not applicable, the X0 register is faster than the D0
> register. Is it appropriate to modify here?


Well the generic_tunings has:
  { 4, /* load_int.  */
4, /* store_int.  */
4, /* load_fp.  */
4, /* store_fp.  */
4, /* load_pred.  */
4 /* store_pred.  */
  }, /* memmov_cost.  */


Which says the load/store of fp has the same cost as ints (gprs) (this is the
same as a53's tuning).

If anything, that should be changed.

Or you should use -mcpu=* where applicable.

[Bug target/110023] 10% performance drop on important benchmark after r247544.

2023-05-30 Thread d_vampile at 163 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110023

--- Comment #2 from d_vampile  ---
(In reply to Andrew Pinski from comment #1)
> This is almost definitely an aarch64 cost model issue ...

Do you mean that the vectorizer cost model of the underlying hardware causes
the policy of not peeling the loop to be chosen after r247544? If so, why does
loop peeling result in performance improvements?
For the following code, I understand that this is a very standard vectorized
effective loop.
for (j=0; j

[Bug target/110026] [Bug] 5% performance drop on important benchmark after r260951.

2023-05-30 Thread d_vampile at 163 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110026

--- Comment #2 from d_vampile  ---
(In reply to Jakub Jelinek from comment #1)
> Note, any benchmarking for speed with -O rather than -O2/-O3 is
> intentionally missing various optimizations which can greatly improve
> performance.

O0 does miss a lot of optimizations. However, for the problem I mentioned, the
GPRs used before and the FP registers after modification are used. When
vectorization is not applicable, the X0 register is faster than the D0
register. Is it appropriate to modify here?

[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Matthias Kretz :

https://gcc.gnu.org/g:668d43502f465d48adbc1fe2956b979f36657e5f

commit r14-1409-g668d43502f465d48adbc1fe2956b979f36657e5f
Author: Matthias Kretz 
Date:   Fri May 26 12:23:44 2023 +0200

libstdc++: Correct NTTP and simd_mask ctor call

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.

[Bug target/107172] [13 Regression] wrong code with "-O1 -ftree-vrp" on x86_64-linux-gnu since r13-1268-g8c99e307b20c502e

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107172

--- Comment #51 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:69185294f322dd53d4e1592115014c5488302e2e

commit r14-1405-g69185294f322dd53d4e1592115014c5488302e2e
Author: Roger Sayle 
Date:   Tue May 30 14:40:50 2023 +0100

PR target/107172: Avoid "unusual" MODE_CC comparisons in simplify-rtx.cc

I believe that a better (or supplementary) fix to PR target/107172 is to
avoid producing incorrect (but valid) RTL in
simplify_const_relational_operation when presented with questionable
(obviously invalid) expressions, such as those produced during combine.
Just as with the "first do no harm" clause with the Hippocratic Oath,
simplify-rtx (probably) shouldn't unintentionally transform invalid RTL
expressions, into incorrect (non-equivalent) but valid RTL that may be
inappropriately recognized by recog.

In this specific case, many GCC backends represent their flags register via
MODE_CC, whose representation is intentionally "opaque" to the middle-end.
The only use of MODE_CC comprehensible to the middle-end's RTL optimizers
is relational comparisons between the result of a COMPARE rtx (op0) and zero
(op1).  Any other uses of MODE_CC should be left alone, and some might argue
indicate representational issues in the backend.

In practice, CPUs occasionally have numerous instructions that affect the
flags register(s) other than comparisons [AVR's setc, powerpc's mtcrf,
x86's clc, stc and cmc and x86_64's ptest that sets C and Z flags in
non-obvious ways, c.f. PR target/109973].  Currently care has to be taken,
wrapping these in UNSPEC, to avoid combine inappropriately merging flags
setters with flags consumers (such as conditional jumps).  It's safer to
teach simplify_const_relational_operation not to modify expressions that
it doesn't understand/recognize.

2023-05-30  Roger Sayle  

gcc/ChangeLog
PR target/107172
* simplify-rtx.cc (simplify_const_relational_operation): Return
early if we have a MODE_CC comparison that isn't a COMPARE against
const0_rtx.

[Bug target/109973] [13/14 Regression] Wrong code for AVX2 since 13.1 by combining VPAND and VPTEST since r13-2006-ga56c1641e9d25e

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109973

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:69185294f322dd53d4e1592115014c5488302e2e

commit r14-1405-g69185294f322dd53d4e1592115014c5488302e2e
Author: Roger Sayle 
Date:   Tue May 30 14:40:50 2023 +0100

PR target/107172: Avoid "unusual" MODE_CC comparisons in simplify-rtx.cc

I believe that a better (or supplementary) fix to PR target/107172 is to
avoid producing incorrect (but valid) RTL in
simplify_const_relational_operation when presented with questionable
(obviously invalid) expressions, such as those produced during combine.
Just as with the "first do no harm" clause with the Hippocratic Oath,
simplify-rtx (probably) shouldn't unintentionally transform invalid RTL
expressions, into incorrect (non-equivalent) but valid RTL that may be
inappropriately recognized by recog.

In this specific case, many GCC backends represent their flags register via
MODE_CC, whose representation is intentionally "opaque" to the middle-end.
The only use of MODE_CC comprehensible to the middle-end's RTL optimizers
is relational comparisons between the result of a COMPARE rtx (op0) and zero
(op1).  Any other uses of MODE_CC should be left alone, and some might argue
indicate representational issues in the backend.

In practice, CPUs occasionally have numerous instructions that affect the
flags register(s) other than comparisons [AVR's setc, powerpc's mtcrf,
x86's clc, stc and cmc and x86_64's ptest that sets C and Z flags in
non-obvious ways, c.f. PR target/109973].  Currently care has to be taken,
wrapping these in UNSPEC, to avoid combine inappropriately merging flags
setters with flags consumers (such as conditional jumps).  It's safer to
teach simplify_const_relational_operation not to modify expressions that
it doesn't understand/recognize.

2023-05-30  Roger Sayle  

gcc/ChangeLog
PR target/107172
* simplify-rtx.cc (simplify_const_relational_operation): Return
early if we have a MODE_CC comparison that isn't a COMPARE against
const0_rtx.

[Bug target/106887] ICE in extract_insn, at recog.cc:2791 since r13-2111-g6910cad55ffc330d

2023-05-30 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106887

Martin Jambor  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||jamborm at gcc dot gnu.org
 Status|NEW |RESOLVED

--- Comment #5 from Martin Jambor  ---
This issue has been fixed (I cannot reproduce it with GCC 13 nor master).

[Bug target/110039] [14 Regression] FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tw[0-9]+ 2

2023-05-30 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110039

ktkachov at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug target/110039] New: FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tw[0-9]+ 2

2023-05-30 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110039

Bug ID: 110039
   Summary: FAIL: gcc.target/aarch64/rev16_2.c
scan-assembler-times rev16\\tw[0-9]+ 2
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

I think after g:d8545fb2c71683f407bfd96706103297d4d6e27b the test regresses on
aarch64.
We now generate:
__rev16_32_alt:
rev w0, w0
ror w0, w0, 16
ret

__rev16_32:
rev w0, w0
ror w0, w0, 16
ret

whereas before it was:
__rev16_32_alt:
rev16   w0, w0
ret

__rev16_32:
rev16   w0, w0
ret

I think the GIMPLE at expand time is better and the RTL that it tries to match
is simpler:
Failed to match this instruction:
(set (reg:SI 95)
(rotate:SI (bswap:SI (reg:SI 96))
(const_int 16 [0x10])))

So maybe it's simply a matter of adding that pattern to aarch64.md.

Anyway, filing this here to track the regression
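
As a small illustration of why the RTL above is equivalent to a single rev16, here is a sketch in portable C++ (the helper names bswap32, rot16 and rev16_32 are made up for this example and are not taken from the testcase or from aarch64.md). It assumes rev16 on a w register swaps the two bytes inside each 16-bit halfword:

```cpp
#include <cstdint>

// Byte-swap a 32-bit value without compiler builtins
// (this models the bswap:SI in the RTL above).
constexpr std::uint32_t bswap32(std::uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
       | ((x << 8) & 0x00ff0000u) | (x << 24);
}

// Rotate by 16 bits; for a 32-bit value left and right rotation
// by 16 give the same result (this models the rotate:SI by 16).
constexpr std::uint32_t rot16(std::uint32_t x)
{
  return (x << 16) | (x >> 16);
}

// Swap the bytes inside each 16-bit halfword, which is the
// operation this sketch assumes rev16 performs.
constexpr std::uint32_t rev16_32(std::uint32_t x)
{
  return ((x & 0xff00ff00u) >> 8) | ((x & 0x00ff00ffu) << 8);
}

// The composition rotate(bswap(x), 16) matches the halfword byte swap.
static_assert(rot16(bswap32(0xAABBCCDDu)) == rev16_32(0xAABBCCDDu),
              "rotate(bswap(x), 16) should equal rev16(x)");
```

For example, 0xAABBCCDD byte-swaps to 0xDDCCBBAA, and rotating that by 16 gives 0xBBAADDCC, which is the same value obtained by swapping the bytes of each halfword directly.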

[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction

2023-05-30 Thread klepikov.alex+bugs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #48 from Alexander Klepikov  ---
I ran tests (including *.c files from the GCC testsuite) and everything looks
fine for now. But I'm still afraid that the pattern for 'ashrsi3_libcall_expand'
is too wide. It can be narrowed down as far as possible by adding a distinct
attribute, setting it when emitting 'ashrsi3_libcall_collapsed', and then
checking it and failing if it is not set:

(define_attr "libcall_collapsed"
 "ashrsi3,nil"
 (const_string "nil"))

(define_insn "ashrsi3_libcall_collapsed"
  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
(ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0")
 (match_operand:SI 2 "const_int_operand")))
   (clobber (reg:SI T_REG))
   (clobber (reg:SI PR_REG))]
  "TARGET_SH1"
  "OOPS"
  [(set_attr "type" "dyn_shift")
   (set_attr "libcall_collapsed" "ashrsi3")
   (set_attr "needs_delay_slot" "yes")])

 if (get_attr_libcall_collapsed(insn) != LIBCALL_COLLAPSED_ASHRSI3)
return false;

It will be super safe then, but a little bit ugly.

[Bug c++/110037] GCC accepts private member access of enclosing class through friend function of inner class

2023-05-30 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110037

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Patrick Palka  ---
dup

*** This bug has been marked as a duplicate of bug 106756 ***

[Bug c++/106756] [CWG1699] Overbroad friendship for nested classes

2023-05-30 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106756

Patrick Palka  changed:

   What|Removed |Added

 CC||jlame646 at gmail dot com

--- Comment #5 from Patrick Palka  ---
*** Bug 110037 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/110035] Missed optimization for dependent assignment statements

2023-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #1 from Richard Biener  ---
Ick - convoluted C++.  We end up with

void ff (struct MyClass & obj)
{
  vector(2) long unsigned int vect_SR.16;
  vector(2) long unsigned int vect_SR.15;
  vector(2) long unsigned int vect_SR.14;
  void * _6;

   [local count: 1073741824]:
  vect_SR.14_5 = MEM  [(struct MyClass &)obj_2(D)];
  vect_SR.15_28 = MEM  [(struct MyClass &)obj_2(D) + 16];
  vect_SR.16_30 = MEM  [(struct MyClass &)obj_2(D) + 32];
  _6 = operator new (48);
  MEM  [(struct MyClass2 *)_6] = vect_SR.14_5;
  MEM  [(struct MyClass2 *)_6 + 16B] = vect_SR.15_28;
  MEM  [(struct MyClass2 *)_6 + 32B] = vect_SR.16_30;
  HandleMyClass2 (_6); [tail call]

and the issue is that 'operator new (48)' can alter what 'obj' points to,
so we cannot move the loads across the call and we get spilling.

There is no inter-procedural analysis in GCC that would tell us that
'obj_2(D)' (the MyClass & obj argument of ff) does not point to an
object that did not escape.  In fact 'ff' has global visibility
and it might have other callers.

If you add -fwhole-program then you get the function inlined to main and

main:
.LFB652:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $48, %edi
call _Znwm
movq $0, (%rax)
movq %rax, %rdi
movq $0, 8(%rax)
movq $0, 16(%rax)
movq $0, 24(%rax)
movq $0, 32(%rax)
movq $0, 40(%rax)
call _Z14HandleMyClass2Pv
xorl %eax, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret

(not using vectors because 'main' is considered cold).  Do you cite an
inline copy of ff() for clang?

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-05-30 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-05-30
   Target Milestone|--- |14.0
 Ever confirmed|0   |1
   Priority|P3  |P1
 Status|UNCONFIRMED |NEW
 CC||lili.cui at intel dot com

--- Comment #1 from Richard Biener  ---
Probably best to limit the values to reassoc-width by adding the
appropriate IntegerRange attribute in params.opt

IntegerRange(0, 256)

maybe?

[Bug target/110036] [12/13 regression] riscv_asan_shadow_offset mismatch with libsanitizer

2023-05-30 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110036

Andreas Schwab  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Andreas Schwab  ---
Fixed on all branches.

[Bug target/110036] [12/13 regression] riscv_asan_shadow_offset mismatch with libsanitizer

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110036

--- Comment #3 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Andreas Schwab:

https://gcc.gnu.org/g:2910660f00c74d12d17e3114870e287804a3332c

commit r12-9661-g2910660f00c74d12d17e3114870e287804a3332c
Author: Andreas Schwab 
Date:   Sun May 28 12:08:22 2023 +0200

riscv: update riscv_asan_shadow_offset

gcc/
PR target/110036
* config/riscv/riscv.cc (riscv_asan_shadow_offset): Update to
match libsanitizer.

[Bug libstdc++/86880] Incorrect mersenne_twister_engine equality comparison between rotated states

2023-05-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86880

--- Comment #3 from Jonathan Wakely  ---
We have the same problem with std::subtract_with_carry_engine. Its equality
operator doesn't work for rotated states.

[Bug libstdc++/60441] Incorrect textual representation for std::mersenne_twister_engine

2023-05-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60441

--- Comment #3 from Jonathan Wakely  ---
We have the same problem with std::subtract_with_carry_engine.

[Bug libstdc++/86880] Incorrect mersenne_twister_engine equality comparison between rotated states

2023-05-30 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86880

--- Comment #2 from Jonathan Wakely  ---
This would fix the equality operator to correctly compare rotated states:

--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -601,8 +601,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   friend bool
   operator==(const mersenne_twister_engine& __lhs,
 const mersenne_twister_engine& __rhs)
-  { return (std::equal(__lhs._M_x, __lhs._M_x + state_size, __rhs._M_x)
-   && __lhs._M_p == __rhs._M_p); }
+  {
+   const _UIntType* const __lx = __lhs._M_x;
+   const _UIntType* const __rx = __rhs._M_x;
+   size_t __lp = __lhs._M_p % state_size;
+   size_t __rp = __rhs._M_p % state_size;
+   size_t __n1, __n2;
+   if (__lp > __rp)
+ {
+   __n1 = state_size - __lp;
+   __n2 = __lp - __rp;
+ }
+   else
+ {
+   __n1 = state_size - __rp;
+   __n2 = __rp - __lp;
+ }
+   if (!std::equal(__lx + __lp, __lx + __lp + __n1, __rx + __rp))
+ return false;
+   if (__n1 == state_size) // i.e. __lhs._M_p == 0 && __rhs._M_p == 0
+ return true;
+   __lp = (__lp + __n1) % state_size;
+   __rp = (__rp + __n1) % state_size;
+   if (!std::equal(__lx + __lp, __lx + __lp + __n2, __rx + __rp))
+ return false;
+   __lp = (__lp + __n2) % state_size;
+   __rp = (__rp + __n2) % state_size;
+   size_t __n3 = state_size - __n1 - __n2;
+   if (!std::equal(__lx + __lp, __lx + __lp + __n3, __rx + __rp))
+ return false;
+   return true;
+  }

   /**
* @brief Inserts the current state of a % mersenne_twister_engine


But the testcase above still fails, because mteB and mteA have different
content in the array, even though they produce the same sequence of numbers.
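
To make the idea of "comparing rotated states" concrete, here is a minimal standalone sketch. It is not the libstdc++ code and not the patch above: rotated_equal is a hypothetical helper that walks two fixed-size circular buffers in logical order, each starting at its own position, and compares element by element. The patch does the same comparison with up to three std::equal calls over contiguous segments, which avoids the per-element modulo.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper for illustration only: returns true if the two
// circular buffers hold the same sequence when lhs is read starting at
// index lpos and rhs is read starting at index rpos (both wrapping at N).
template <typename T, std::size_t N>
bool rotated_equal(const std::array<T, N>& lhs, std::size_t lpos,
                   const std::array<T, N>& rhs, std::size_t rpos)
{
  for (std::size_t i = 0; i < N; ++i)
    {
      // Map the logical offset i to a physical index in each buffer.
      if (lhs[(lpos + i) % N] != rhs[(rpos + i) % N])
        return false;
    }
  return true;
}
```

With a = {1, 2, 3, 4} read from position 0 and b = {3, 4, 1, 2} read from position 2, both logical sequences are 1, 2, 3, 4, so the helper reports them equal even though the arrays differ element by element. As noted above, this kind of comparison only helps when the stored words really are a rotation of each other.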

[Bug target/110036] [12/13 regression] riscv_asan_shadow_offset mismatch with libsanitizer

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110036

--- Comment #2 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Andreas Schwab:

https://gcc.gnu.org/g:acf4fac6c5d14b30dca6cbde75f8b7db89850e04

commit r13-7389-gacf4fac6c5d14b30dca6cbde75f8b7db89850e04
Author: Andreas Schwab 
Date:   Sun May 28 12:08:22 2023 +0200

riscv: update riscv_asan_shadow_offset

gcc/
PR target/110036
* config/riscv/riscv.cc (riscv_asan_shadow_offset): Update to
match libsanitizer.

[Bug c/109999] [OpenMP] Bogus error message: talks about '"#pragma omp" clause' instead of '"target" clause

2023-05-30 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10

Tobias Burnus  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Tobias Burnus  ---
FIXED for mainline/GCC 14.

[Bug c/109999] [OpenMP] Bogus error message: talks about '"#pragma omp" clause' instead of '"target" clause

2023-05-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:a899401404186843f38462c8fc9de733f19ce864

commit r14-1404-ga899401404186843f38462c8fc9de733f19ce864
Author: Tobias Burnus 
Date:   Tue May 30 12:49:09 2023 +0200

OpenMP: Improve C/C++ parsing error message [PR10]

Replace
  error: expected '#pragma omp' clause before ...
by the more readable/clearer
  error: expected an OpenMP clause before ...

(And likewise for '#pragma acc' and OpenACC.)

PR c/10

gcc/c/ChangeLog:

* c-parser.cc (c_parser_oacc_all_clauses,
c_parser_omp_all_clauses): Improve error wording.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_oacc_all_clauses,
cp_parser_omp_all_clauses): Improve error wording.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/asyncwait-1.c: Update dg-error.
* c-c++-common/goacc/clauses-fail.c: Likewise.
* c-c++-common/goacc/data-2.c: Likewise.
* c-c++-common/gomp/declare-target-2.c: Likewise.
* c-c++-common/gomp/directive-1.c: Likewise.
* g++.dg/goacc/data-1.C: Likewise.

[Bug c/109827] Pointer/integer mismatch in ?: not covered by -Wint-conversion

2023-05-30 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109827

Eric Gallager  changed:

   What|Removed |Added

   Last reconfirmed||2023-05-30
 Status|UNCONFIRMED |NEW
 CC||egallager at gcc dot gnu.org
 Ever confirmed|0   |1
 Blocks||44209

--- Comment #1 from Eric Gallager  ---
Confirmed.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44209
[Bug 44209] [meta-bug] Some warnings are not linked to diagnostics options
