date:20220624

[Bug middle-end/101836] __builtin_object_size(P->M, 1) where M is an array and the last member of a struct fails

2022-06-24 Thread foom at fuhm dot net via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

James Y Knight  changed:

   What|Removed |Added

 CC||foom at fuhm dot net

--- Comment #31 from James Y Knight  ---
It doesn't make sense to have a mode in which `int array[0]` is accepted but is
not a flex array.

Either that should be a compilation error (as the standard specifies), or it
should be a flex array. Accepting it as an extension but having it do the wrong
thing is not useful or helpful.

Note that Clang has a dedicated warning flag for zero-length arrays:
-Wzero-length-array, so anyone who wants to prohibit them may use
-Werror=zero-length-array. It would be helpful if GCC could follow suit there.

The other proposed modes:
- Treat all trailing arrays as flexible arrays (the default behavior);
- Only treat [], [0], and [1] as flexible arrays;
- Only treat [] and [0] as flexible arrays;
do make sense.

[Bug libfortran/106083] New: gfortran.dg/ieee/large_1.f90 fails on powerpc64 with ieee longlongs

2022-06-24 Thread seurer at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106083

Bug ID: 106083
   Summary: gfortran.dg/ieee/large_1.f90 fails on powerpc64 with
ieee longlongs
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

make  -k check-fortran RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/large_1.f90"

FAIL: gfortran.dg/ieee/large_1.f90   -O0  execution test
FAIL: gfortran.dg/ieee/large_1.f90   -O1  execution test
FAIL: gfortran.dg/ieee/large_1.f90   -O2  execution test
FAIL: gfortran.dg/ieee/large_1.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/ieee/large_1.f90   -O3 -g  execution test
FAIL: gfortran.dg/ieee/large_1.f90   -Os  execution test


This occurs with the compiler configured with  --with-long-double-format=ieee
on a distro (Fedora 36) which also used  --with-long-double-format=ieee.  

It fails when compiled with gcc 10, 12, and trunk but works when compiled with
gcc 11.  The code generated for gcc 11 is identical to that for gcc 12 which is
why I suspect a library difference.


spawn -ignore SIGHUP
/home/seurer/gcc/git/build/gcc-ieee/gcc/testsuite/gfortran/../../gfortran
-B/home/seurer/gcc/git/build/gcc-ieee/gcc/testsuite/gfortran/../../
-B/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libgfortran/
/home/seurer/gcc/git/gcc-ieee/gcc/testsuite/gfortran.dg/ieee/large_1.f90
-fdiagnostics-plain-output -fdiagnostics-plain-output -O0 -pedantic-errors
-fintrinsic-modules-path
/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libgfortran/
-fno-unsafe-math-optimizations -frounding-math -fsignaling-nans
-B/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libgfortran/.libs
-L/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libatomic/.libs
-B/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libquadmath/.libs
-L/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libquadmath/.libs
-L/home/seurer/gcc/git/build/gcc-ieee/powerpc64le-unknown-linux-gnu/./libquadmath/.libs
-lm -o ./large_1.exe
PASS: gfortran.dg/ieee/large_1.f90   -O0  (test for excess errors)
...
spawn [open ...]
STOP 2
FAIL: gfortran.dg/ieee/large_1.f90   -O0  execution test


The code from large_1.f90 that fails is just:

! Testing IEEE modules on large real kinds

program test

  use ieee_arithmetic
  implicit none

  ! k1 and k2 will be large real kinds, if supported, and single/double
  ! otherwise
  integer, parameter :: k1 = &
    max(ieee_selected_real_kind(precision(0.d0) + 1), kind(0.))
  if (ieee_is_finite(ieee_value(0._k1, ieee_negative_inf))) STOP 2
end program test

[Bug c++/20710] g++ should warn when hiding non-virtual method in base class

2022-06-24 Thread jason at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20710

Jason Merrill  changed:

   What|Removed |Added

 CC||EisahLee at gmx dot de

--- Comment #11 from Jason Merrill  ---
*** Bug 67345 has been marked as a duplicate of this bug. ***

[Bug c++/67345] -Woverloaded-virtual false negative: Does not warn on overloaded virtual function

2022-06-24 Thread jason at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67345

Jason Merrill  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE
 CC||jason at gcc dot gnu.org

--- Comment #3 from Jason Merrill  ---
Dup.

*** This bug has been marked as a duplicate of bug 20710 ***

[Bug c++/87656] Useful flags to enable with -Wall or -Wextra

2022-06-24 Thread jason at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87656
Bug 87656 depends on bug 87729, which changed state.

Bug 87729 Summary: Please include -Woverloaded-virtual in -Wall
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87729

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug c++/87729] Please include -Woverloaded-virtual in -Wall

2022-06-24 Thread jason at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87729

Jason Merrill  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from Jason Merrill  ---
Done for GCC 13.

[Bug c++/20423] Warning -Woverloaded-virtual triggers to often

2022-06-24 Thread jason at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20423

Jason Merrill  changed:

   What|Removed |Added

   Target Milestone|--- |13.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #13 from Jason Merrill  ---
For GCC 13 the requested semantics will be available in -Woverloaded-virtual=1
or -Wall.

[Bug c++/20423] Warning -Woverloaded-virtual triggers to often

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20423

--- Comment #12 from CVS Commits  ---
The master branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:113844d68e94f4e9c0e946db351ba7d3d4a1335a

commit r13-1262-g113844d68e94f4e9c0e946db351ba7d3d4a1335a
Author: Jason Merrill 
Date:   Fri Jun 24 14:40:12 2022 -0400

c++: Include -Woverloaded-virtual in -Wall [PR87729]

This seems like a good warning to have in -Wall, as requested.  But as
pointed out in PR20423, some users want a warning only when a derived
function doesn't override any base function.  So let's put that lesser
version in -Wall (and -Woverloaded-virtual=1) while leaving the semantics
for the existing option the same.

PR c++/87729
PR c++/20423

gcc/c-family/ChangeLog:

* c.opt (Woverloaded-virtual): Add levels, include in -Wall.

gcc/ChangeLog:

* doc/invoke.texi: Document changes.

gcc/cp/ChangeLog:

* class.cc (warn_hidden): Handle -Woverloaded-virtual=1.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Woverloaded-virt1.C: New test.
* g++.dg/warn/Woverloaded-virt2.C: New test.

[Bug c++/87729] Please include -Woverloaded-virtual in -Wall

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87729

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:113844d68e94f4e9c0e946db351ba7d3d4a1335a

commit r13-1262-g113844d68e94f4e9c0e946db351ba7d3d4a1335a
Author: Jason Merrill 
Date:   Fri Jun 24 14:40:12 2022 -0400

c++: Include -Woverloaded-virtual in -Wall [PR87729]

This seems like a good warning to have in -Wall, as requested.  But as
pointed out in PR20423, some users want a warning only when a derived
function doesn't override any base function.  So let's put that lesser
version in -Wall (and -Woverloaded-virtual=1) while leaving the semantics
for the existing option the same.

PR c++/87729
PR c++/20423

gcc/c-family/ChangeLog:

* c.opt (Woverloaded-virtual): Add levels, include in -Wall.

gcc/ChangeLog:

* doc/invoke.texi: Document changes.

gcc/cp/ChangeLog:

* class.cc (warn_hidden): Handle -Woverloaded-virtual=1.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Woverloaded-virt1.C: New test.
* g++.dg/warn/Woverloaded-virt2.C: New test.

[Bug c++/20423] Warning -Woverloaded-virtual triggers to often

2022-06-24 Thread jason at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20423

Jason Merrill  changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

[Bug rtl-optimization/106082] [13 Regression] Recent change broke m68k

2022-06-24 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106082

--- Comment #1 from Jeffrey A. Law  ---
/* { dg-require-effective-target trampolines } */

extern void abort (void);
extern void exit (int);

static void recursive (int n, void (*proc) (void))
{
  __label__ l1;

  void do_goto (void)
  {
    goto l1;
  }

  if (n == 3)
    recursive (n - 1, do_goto);
  else if (n > 0)
    recursive (n - 1, proc);
  else
    (*proc) ();
  return;

l1:
  if (n == 3)
    exit (0);
  else
    abort ();
}

int main ()
{
  recursive (10, abort);
  abort ();
}

[Bug rtl-optimization/106082] New: [13 Regression] Recent change broke m68k

2022-06-24 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106082

Bug ID: 106082
   Summary: [13 Regression] Recent change broke m68k
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: law at gcc dot gnu.org
  Target Milestone: ---
Target: m68k

This change:
commit 4f77738c3b44cb6b7bfe2a7ef823a5d9d75c0e79 (HEAD, refs/bisect/bad)
Author: Richard Biener 
Date:   Thu Apr 14 14:06:22 2022 +0200

rtl-optimization/105231 - distribute_notes and REG_EH_REGION

The following mitigates a problem in combine distribute_notes which
places an original REG_EH_REGION based on only may_trap_p which is
good to test whether a non-call insn can possibly throw but not if
actually it does or we care.  That's something we decided at RTL
expansion time where we possibly still know the insn evaluates
to a constant.

In fact, the REG_EH_REGION note with lp > 0 can only come from the
original i3 and an assert is added to that effect.  That means we only
need to retain the note on i3 or, if that cannot trap, drop it but we
should never move it to i2.

The following places constraints on the insns to combine with
non-call exceptions since we cannot handle the case where we
have more than one EH side-effect in the IL.  The patch also
makes sure we can accumulate that on i3 and do not split
a possible exception raising part of it to i2.  As a special
case we do not place any restriction on all externally
throwing insns when there is no REG_EH_REGION present.

2022-04-22  Richard Biener  

PR rtl-optimization/105231
* combine.cc (distribute_notes): Assert that a REG_EH_REGION
with landing pad > 0 is from i3.  Put any REG_EH_REGION note
on i3 or drop it if the insn can not trap.
(try_combine): Ensure that we can merge REG_EH_REGION notes
with non-call exceptions.  Ensure we are not splitting a
trapping part of an insn with non-call exceptions when there
is any REG_EH_REGION note to preserve.

* gcc.dg/torture/pr105231.c: New testcase.

Broke various tests on the m68k.  This is from nested-func-5.c:
./xgcc -B./ -O2 j.c -c
j.c: In function ‘recursive’:
j.c:28:1: error: in basic block 2:
   28 | }
  | ^
j.c:28:1: error: flow control insn inside a basic block
(call_insn 23 22 24 2 (call (mem:QI (symbol_ref:SI ("__clear_cache")) [0  S1
A8])
(const_int 8 [0x8])) "j.c":6:13 406 {*call}
 (expr_list:REG_CALL_DECL (symbol_ref:SI ("__clear_cache"))
(nil))
(expr_list (use (mem:SI (plus:SI (reg/f:SI 15 %sp)
(scratch:SI)) [0  S4 A8]))
(nil)))
during RTL pass: combine
j.c:28:1: internal compiler error: in rtl_verify_bb_insns, at cfgrtl.cc:2797


This can be seen with a m68k-linux-gnu cross compiler.

Sorry I took so long to report this.  I'd assumed the failures were a result of
the binutils work which started warning for bogus rwx segments which happened
about the same time.

[Bug c++/87729] Please include -Woverloaded-virtual in -Wall

2022-06-24 Thread dblaikie at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87729

David Blaikie  changed:

   What|Removed |Added

 CC||dblaikie at gmail dot com

--- Comment #7 from David Blaikie  ---
FWIW, I implemented (or at least tuned) overloaded-virtual in Clang - it
doesn't quite match GCC's behavior. (specifically it doesn't warn on two
overloads within the same class - it specifically warns on the case where a
user might've mismatched what was intended to be an override but instead became
an overload).

At least that's my recollection.

[Bug target/106022] [12/13 Regression] Enable vectorizer generates extra load

2022-06-24 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106022

--- Comment #12 from H.J. Lu  ---
(In reply to Richard Biener from comment #11)
> (In reply to H.J. Lu from comment #9)
> > (In reply to Richard Biener from comment #8)
> > > (In reply to H.J. Lu from comment #6)
> > > > Created attachment 53169 [details]
> > > > A patch
> > > > 
> > > > This patch multiplies the vector store cost by the number of scalar
> > > > elements in a word to properly compare scalar store cost against
> > > > vector store cost.
> > > 
> > > But that's not "properly" but "wrong" ...
> > > 
> > > Note we already cost the vector load from the constant pool so the vector
> > > side costing is correct.
> > > 
> > > What's eventually imprecise is the scalar cost where you could anticipate
> > > store merging, but adjusting the vector cost side is just wrong.
> > 
> > I tried to adjust the scalar cost.  When the scalar cost of storing a byte
> > is 6, dividing it by 8 (the number of scalar elements in a word) becomes 0.
> > Will it work?
> 
> No, I think you would need to pattern match an actual store sequence,
> for example by looking at
> 
>   if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
>       && pow2p_hwi (DR_GROUP_STORE_COUNT (stmt_info)))
>     /* cost a possibly merged store only once (but with larger mode?) */
>     if (DR_GROUP_FIRST_ELEMENT (stmt_info) == stmt_info)
>       ...

That information isn't available in add_stmt_cost.  I will
count the number of scalar stores and vector stores, then
compare them in finish_cost.

> So costing the whole sequence of scalar stores a single time, with
> adjusted mode.
> 
> store-merging also handles non-QImode stores btw.

[Bug fortran/105813] ICE in gfc_simplify_unpack, at fortran/simplify.cc:8490

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105813

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:f21f17f95c0237f4f987a5fa9f1fa9c7e0db3c40

commit r13-1260-gf21f17f95c0237f4f987a5fa9f1fa9c7e0db3c40
Author: Harald Anlauf 
Date:   Fri Jun 24 22:21:39 2022 +0200

Fortran: fix checking of arguments to UNPACK when MASK is a variable
[PR105813]

gcc/fortran/ChangeLog:

PR fortran/105813
* check.cc (gfc_check_unpack): Try to simplify MASK argument to
UNPACK so that checking of the VECTOR argument can work when MASK
is a variable.

gcc/testsuite/ChangeLog:

PR fortran/105813
* gfortran.dg/unpack_vector_1.f90: New test.

[Bug tree-optimization/101868] [9 Regression] Incorrect reordering in -O2 with LTO

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101868

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Dimitar Dimitrov :

https://gcc.gnu.org/g:b1d0d3520e96802dee37e8fc1c56e19c13d598b1

commit r13-1257-gb1d0d3520e96802dee37e8fc1c56e19c13d598b1
Author: Dimitar Dimitrov 
Date:   Sun May 15 17:30:52 2022 +0300

testsuite: Remove reliance on argc in lto/pr101868_0.c

Some embedded targets do not pass any argv arguments.  When argc is
zero, this causes spurious failures for lto/pr101868_0.c.  Fix by
following the strategy in r0-114701-g2c49569ecea56d.  Use a volatile
variable instead of argc to inject a runtime value into the test.

I validated the following:
  - No changes in testresults for x86_64-pc-linux-gnu.
  - The spurious failures are fixed for PRU target.
  - lto/pr101868_0.c still fails on x86_64-pc-linux-gnu, if
the PR/101868 fix (r12-2254-gfedcf3c476aff7) is reverted.

PR tree-optimization/101868

gcc/testsuite/ChangeLog:

* gcc.dg/lto/pr101868_0.c (zero): New volatile variable.
(main): Use it instead of argc.

Signed-off-by: Dimitar Dimitrov

[Bug tree-optimization/94026] combine missed opportunity to simplify comparisons with zero

2022-06-24 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94026

--- Comment #11 from Segher Boessenkool  ---
Wrt rs6000: we have shift+mask+compare in just one insn (it is basic powerpc),
and our
  (define_insn "*and3_imm_dot_shifted"
pattern outputs this as just an "andi." insn when it can.  But indeed the shift
wasn't optimised away for us either.

[Bug c++/87729] Please include -Woverloaded-virtual in -Wall

2022-06-24 Thread jason at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87729

Jason Merrill  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
   Target Milestone|--- |13.0
 CC||jason at gcc dot gnu.org

[Bug d/105413] gdc extended assembler cannot constraints r8 - r15

2022-06-24 Thread ibuclaw at gdcproject dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105413

Iain Buclaw  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Iain Buclaw  ---
@register attribute has been added, meaning that you can now have the following
as an alternative to your example.

---
import gcc.attributes : register;

@register("rax") SYSCALL rax = ident; // rax - syscall number
@register("rdi") size_t rdi = arg1;   // rdi - arg1
@register("rsi") size_t rsi = arg2;   // rsi - arg2
@register("rdx") size_t rdx = arg3;   // rdx - arg3
@register("r10") size_t r10 = arg4;   // r10 - arg4
asm @nogc nothrow {
  "syscall"
  : "=r" (rax)
  // inputs:
  : "r" (rax),
    "r" (rdi),
    "r" (rsi),
    "r" (rdx),
    "r" (r10),
    "m"( *cast(ubyte*)arg1)   // "dummy" input instead of full memory clobber
  // clobbers
  : "rcx", "r11";  // Clobbers rax, and rcx and r11.
}
return rax;
---

https://godbolt.org/z/PvnTsea9T

[Bug d/105413] gdc extended assembler cannot constraints r8 - r15

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105413

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Iain Buclaw :

https://gcc.gnu.org/g:91418c42089cd1cbe71edcd6b2f5b26559819372

commit r13-1255-g91418c42089cd1cbe71edcd6b2f5b26559819372
Author: Iain Buclaw 
Date:   Thu Jun 23 18:24:07 2022 +0200

d: Add `@register' attribute to compiler and library.

The `@register` attribute specifies that a local or `__gshared` variable
is to be given a register storage-class in the C sense of the term, and
will be placed into a register named `registerName`.

The variable needs to be boiled down to a data type that fits the target
register.  It also cannot have either thread-local or `extern` storage.
It is an error to take the address of a register variable.

PR d/105413

gcc/d/ChangeLog:

* d-attribs.cc (d_handle_register_attribute): New function.
(d_langhook_attribute_table): Add register attribute.
* d-codegen.cc (d_mark_addressable): Error if taken address of
register variable.
(build_frame_type): Error if register variable has non-local
references.
* d-tree.h (d_mark_addressable): Add complain parameter.
* decl.cc (get_symbol_decl): Mark register variables DECL_REGISTER.
Error when register variable declared thread-local or extern.
* expr.cc (ExprVisitor::visit (IndexExp *)): Don't complain about
marking register vectors as addressable in an ARRAY_REF.

libphobos/ChangeLog:

* libdruntime/gcc/attributes.d (register): Define.

gcc/testsuite/ChangeLog:

* gdc.dg/attr_register1.d: New test.
* gdc.dg/attr_register2.d: New test.
* gdc.dg/attr_register3.d: New test.

[Bug tree-optimization/94026] combine missed opportunity to simplify comparisons with zero

2022-06-24 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94026

--- Comment #10 from Segher Boessenkool  ---
So on Arm we get

Trying 6 -> 8:
6: r119:SI=r123:SI>>0x8
  REG_DEAD r123:SI
8: {cc:CC_NZ=cmp(r119:SI&0x6,0);clobber scratch;}
  REG_DEAD r119:SI
Failed to match this instruction:
(parallel [
(set (reg:CC_NZ 100 cc)
(compare:CC_NZ (and:SI (lshiftrt:SI (reg:SI 123)
(const_int 8 [0x8]))
(const_int 6 [0x6]))
(const_int 0 [0])))
(clobber (scratch:SI))
])
Failed to match this instruction:
(set (reg:CC_NZ 100 cc)
(compare:CC_NZ (and:SI (lshiftrt:SI (reg:SI 123)
(const_int 8 [0x8]))
(const_int 6 [0x6]))
(const_int 0 [0])))

instead of something like

(set (reg:CC_NZ 100 cc)
 (compare:CC_NZ (and:SI (reg:SI 123)
(const_int 1536))
(const_int 0)))

which is correct for every CC mode even, not just NZ?

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-24 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #25 from Jakub Jelinek  ---
(In reply to Linus Torvalds from comment #23)
> (In reply to Jakub Jelinek from comment #22)
> > 
> > If the wider registers are narrowed before register allocation, it is just
> > a pair like (reg:SI 123) (reg:SI 256) and it can be allowed anywhere.
> 
> That was more what I was thinking - why is the DImode information being kept
> so long?

This is what is being discussed here.
One possibility is to lower these multi-word operations during expansion from
GIMPLE to RTL (after all, the generic code usually does that without anything
special needed on the backend side unless one declares the backend can do that
better).  One counter-argument to that is the x86 STV pass, which uses vector
operations for 2-word operations when possible and won't really work when they
are lowered during expansion.

Another is splitting those before register allocation, which is what some
patterns did and what other patterns didn't.

Or it can be split after register allocation.

My understanding was that Roger tried to change some patterns from splitting
after RA to before RA and it didn't improve this testcase, so in the end
changed some other patterns from splitting before RA to after RA.

[Bug tree-optimization/94026] combine missed opportunity to simplify comparisons with zero

2022-06-24 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94026

--- Comment #9 from Segher Boessenkool  ---
This is all handled in combine, nothing is specific to rs6000 (only the
description of all of our insns is, of course, but there is really no way
around that, nor should there be :-) )

Why does combine not optimise this for Arm?  Of course it would be good if
this would be optimised early as well, but that does not mean we should not
try to optimise it late as well!

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-24 Thread torvalds--- via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #24 from Linus Torvalds  ---
(In reply to Linus Torvalds from comment #23)
> 
> And this now brings back my memory of the earlier similar discussion - it
> wasn't about DImode code generation, it was about bitfield code generation
> being horrendous,

Searching around, it's this one from 2011:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696

not really related to this issue, apart from the superficially similar issue
with oddly bad code generation.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-24 Thread torvalds--- via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #23 from Linus Torvalds  ---
(In reply to Jakub Jelinek from comment #22)
> 
> If the wider registers are narrowed before register allocation, it is just
> a pair like (reg:SI 123) (reg:SI 256) and it can be allowed anywhere.

That was more what I was thinking - why is the DImode information being kept so
long?

I realize that you want to do a lot of the early CSE etc operations at that
higher level, but by the time you are actually allocating registers and
thinking about spilling them, why is it still a DImode thing?

And this now brings back my memory of the earlier similar discussion - it
wasn't about DImode code generation, it was about bitfield code generation
being horrendous, where gcc was keeping the whole "this is a bitfield"
information around for a long time and as a result generating truly horrendous
code. When it looked like it instead should just have turned it into a load and
shift early, and then doing all the sane optimizations at that level (ie
rewriting simple bitfield code to just do loads and shifts generated *much*
better code than using bitfields).

But this is just my personal curiosity at this point - it looks like Roger
Sayle's patch has fixed the immediate problem, so the big issue is solved. And
maybe the fact that clang is doing so much better is due to something else
entirely - it just _looks_ like it might be this artificial constraint by gcc
that makes it do bad register and spill choices.

[Bug analyzer/106066] crash dump when "-fdump-analyzer" enabled

2022-06-24 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106066

--- Comment #4 from David Malcolm  ---
(In reply to David Malcolm from comment #2)
> Thanks for filing this bug.
> 
> I can reproduce both crashes with trunk.

Correction: for src/ssl_crtlist.c I'm seeing the same crash as in comment #0
(in dump_mem_ref), rather than in c_tree_printer.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-24 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #22 from Jakub Jelinek  ---
(In reply to Linus Torvalds from comment #21)
> Whee.
> 
> Why does gcc have that constraint, btw? I tried to look at the clang code
> generation once more, and I don't *think* clang has the same constraint, and
> maybe that is why it does so much better?

Registers in RTL have just a single register number and mode (ok, it has some
extra info, but not a set of registers).  When it is a pseudo register, that
doesn't constrain anything, it is just
(reg:DI 175).
But when it is a hard register, it still has just a single register number,
so there is no way to express a non-consecutive set of registers through it,
so
(reg:DI 4)
needs to be the di:si pair etc.
If the wider registers are narrowed before register allocation, it is just
a pair like (reg:SI 123) (reg:SI 256) and it can be allowed anywhere.
If we wanted RA to allocate non-consecutive registers, we'd need to represent
that differently (say as concatenation of SImode registers), but then it
wouldn't be accepted by constraints and predicates of most of the
define_insn_and_split patterns.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-24 Thread torvalds--- via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #21 from Linus Torvalds  ---
(In reply to CVS Commits from comment #20)
>
> One might think
> that splitting early gives the register allocator more freedom to
> use available registers, but in practice the constraint that double
> word values occupy consecutive registers (when ultimately used as a
> DImode value) is the greater constraint. 

Whee.

Why does gcc have that constraint, btw? I tried to look at the clang code
generation once more, and I don't *think* clang has the same constraint, and
maybe that is why it does so much better?

Yes, x86 itself inherently has a couple of forced register pairings (notably
%edx:%eax for 64-bit multiplication and division), and obviously the whole
calling convention requires well-defined pairings, but in the general case it
seems to be a mistake to keep DImode values as DImode values and force them to
be consecutive registers when used.

Maybe I misunderstand. But now that this comes up I have this dim memory of
actually having had a discussion like this before on bugzilla, where gcc
generated horrible DImode code.

> GCC 11  [use %ecx to address memory, require a 24-byte stack frame]
> sub esp, 24
> mov ecx, DWORD PTR [esp+40]
> 
> GCC 12 [use %eax to address memory, require a 44-byte stack frame]
> sub esp, 44
> mov eax, DWORD PTR [esp+64]

I just checked the current git -tip, and this does seem to fix the original
case too, with the old horrid 2620 bytes of stack frame now being a *much*
improved 404 bytes!

So your patch - or other changes - does fix it for me, unless I did something
wrong in my testing (which is possible).

Thanks. I'm not sure what the gcc policy on closing the bug is (and I don't
even know if I am allowed), so I'm not marking this closed, but it seems to be
fixed as far as I am concerned, and I hope it gets released as a dot-release
for the gcc-12 series.

[Bug analyzer/106066] crash dump when "-fdump-analyzer" enabled

2022-06-24 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106066

--- Comment #3 from David Malcolm  ---
Minimal reproducer for crash in comment #0 (crash in dump_mem_ref seen with
_do_poll):

struct s {
  unsigned int f;
};
int use(unsigned int);
static struct s *arr;

void test(int n) {
  int i;
  for (i = 0; i < n; i++) {
    unsigned int n, e;
    e = arr[i].f;
    n = e ? 42 : 0;
    use(n);
  }
}

$ ./xgcc -B. -fanalyzer -fdump-analyzer -O1
../../src/gcc/testsuite/gcc.dg/analyzer/pr106066.c
during IPA pass: analyzer
../../src/gcc/testsuite/gcc.dg/analyzer/pr106066.c:12:16: internal compiler
error: Segmentation fault
   12 | n = e ? 42 : 0;
  | ~~~^~~
0x13fac05 crash_signal
../../src/gcc/toplev.cc:322
0xa3c54f tree_class_check(tree_node*, tree_code_class, char const*, int, char
const*)
../../src/gcc/tree.h:3638
0x15428d7 dump_mem_ref
../../src/gcc/tree-pretty-print.cc:1700
0x1544ce3 dump_generic_node(pretty_printer*, tree_node*, int, dump_flag, bool)
../../src/gcc/tree-pretty-print.cc:2061
0x1547439 dump_generic_node(pretty_printer*, tree_node*, int, dump_flag, bool)
../../src/gcc/tree-pretty-print.cc:2425
0x19af603 ana::dump_tree(pretty_printer*, tree_node*)
../../src/gcc/analyzer/region-model.cc:87
0x19af646 ana::dump_quoted_tree(pretty_printer*, tree_node*)
../../src/gcc/analyzer/region-model.cc:97
0x199d935 ana::sm_state_map::print(ana::region_model const*, bool, bool,
pretty_printer*) const
../../src/gcc/analyzer/program-state.cc:240
0x199fa94 ana::program_state::dump_to_pp(ana::extrinsic_state const&, bool,
bool, pretty_printer*) const
../../src/gcc/analyzer/program-state.cc:899
0x19761d5 ana::exploded_graph::get_or_create_node(ana::program_point const&,
ana::program_state const&, ana::exploded_node*)
../../src/gcc/analyzer/engine.cc:2584
0x1978504
ana::exploded_graph::maybe_process_run_of_before_supernode_enodes(ana::exploded_node*)
../../src/gcc/analyzer/engine.cc:3447
0x1977706 ana::exploded_graph::process_worklist()
../../src/gcc/analyzer/engine.cc:3113
0x197d252 ana::impl_run_checkers(ana::logger*)
../../src/gcc/analyzer/engine.cc:5833
0x197d66b ana::run_checkers()
../../src/gcc/analyzer/engine.cc:5907
0x1970646 execute
../../src/gcc/analyzer/analyzer-pass.cc:88
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug analyzer/106066] crash dump when "-fdump-analyzer" enabled

2022-06-24 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106066

David Malcolm  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2022-06-24
 Ever confirmed|0   |1

--- Comment #2 from David Malcolm  ---
Thanks for filing this bug.

I can reproduce both crashes with trunk.

[Bug tree-optimization/94026] combine missed opportunity to simplify comparisons with zero

2022-06-24 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94026

--- Comment #8 from Jeffrey A. Law  ---
I don't think so -- the goal here is to optimize this in gimple so that all
targets benefit rather than every target having to customize a solution for
this idiom.

If Roger's patch is sound you might even be able to simplify the ppc backend
ever so slightly.

[Bug middle-end/106081] missed vectorization

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106081

--- Comment #1 from Jan Hubicka  ---
This is an attempt to vectorize by hand, but it seems we do not generate
vpmovsxwd for the vector short->double conversion

struct pixels
{
short a __attribute__ ((vector_size(4*2)));
} *pixels;
struct dpixels
{
double a __attribute__ ((vector_size(8*4)));
};
typedef double v4df __attribute__ ((vector_size (32)));

struct dpixels
test(double *k)
{
struct dpixels results={};
for (int u=0; u<1;u++,k--)
{
results.a += *k*__builtin_convertvector (pixels[u].a, v4df);
}
return results;
}

Clang seems to do the right thing here.
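A minimal sketch of the short->double vector conversion discussed above, written so that it can be compiled and checked on its own. The names v4hi and accumulate_pixel are made up for illustration and are not from the report; the vectors are loaded and stored through memcpy only to avoid passing 32-byte vectors by value on targets without AVX. It assumes a compiler that provides __builtin_convertvector (GCC 9 or later, or Clang).

```c
#include <string.h>

/* Hypothetical typedefs: four shorts and four doubles per vector.  */
typedef short  v4hi __attribute__ ((vector_size (4 * sizeof (short))));
typedef double v4df __attribute__ ((vector_size (4 * sizeof (double))));

/* acc[i] += k * (double) px[i] for i = 0..3, done as one vector operation.  */
void
accumulate_pixel (double acc[4], const short px[4], double k)
{
  v4hi in;
  v4df sum;

  memcpy (&in, px, sizeof in);
  memcpy (&sum, acc, sizeof sum);

  /* Element-wise short->double conversion; the scalar k is broadcast
     across the vector by the GNU vector extension.  */
  sum += k * __builtin_convertvector (in, v4df);

  memcpy (acc, &sum, sizeof sum);
}
```

Whether the conversion ends up as vpmovsxwd plus vcvtdq2pd depends on the compiler version and on flags such as -mavx2, so the generated code should be inspected rather than assumed.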

[Bug middle-end/106059] [13 regression] cc.dg/vect/pr79347.c fails after r13-1171-g9f55aee9dca759

2022-06-24 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106059

--- Comment #5 from Segher Boessenkool  ---
Thank you for the quick fix!

[Bug middle-end/106081] New: missed vectorization

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106081

Bug ID: 106081
   Summary: missed vectorization
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

This testcase (derived from ImageMagick)

struct pixels
{
short a,b,c,d;
} *pixels;
struct dpixels
{
double a,b,c,d;
};

double
test(double *k)
{
struct dpixels results={};
for (int u=0; u<1;u++,k--)
{
results.a += *k*pixels[u].a;
results.b += *k*pixels[u].b;
results.c += *k*pixels[u].c;
results.d += *k*pixels[u].d;
}
return results.a+results.b*2+results.c*3+results.d*4;
}

gets vectorized by clang:
test:   # @test
.cfi_startproc
# %bb.0:
movq    pixels(%rip), %rax
vxorpd  %xmm0, %xmm0, %xmm0
xorl    %ecx, %ecx
.p2align    4, 0x90
.LBB0_1:    # =>This Inner Loop Header: Depth=1
vpmovsxwd   (%rax), %xmm1
vbroadcastsd    (%rdi,%rcx,8), %ymm2
addq    $8, %rax
decq    %rcx
vcvtdq2pd   %xmm1, %ymm1
vfmadd231pd %ymm2, %ymm1, %ymm0 # ymm0 = (ymm1 * ymm2) + ymm0
cmpq    $-1, %rcx   # imm = 0xD8F0
jne .LBB0_1
# %bb.2:
vpermilpd   $1, %xmm0, %xmm1    # xmm1 = xmm0[1,0]
vfmadd132sd .LCPI0_0(%rip), %xmm0, %xmm1 # xmm1 = (xmm1 * mem) + xmm0
vextractf128    $1, %ymm0, %xmm0
vfmadd231sd .LCPI0_1(%rip), %xmm0, %xmm1 # xmm1 = (xmm0 * mem) + xmm1
vpermilpd   $1, %xmm0, %xmm0    # xmm0 = xmm0[1,0]
vfmadd132sd .LCPI0_2(%rip), %xmm1, %xmm0 # xmm0 = (xmm0 * mem) + xmm1
vzeroupper
retq

but not by GCC.
Original loop is:
0.94 :   423cb0: vmovdqu (%rsi,%rdi,8),%xmm5 // morphology.c:2984
 : 2983   if ( IsNaN(*k) ) continue;
0.29 :   423cb5: vpermilpd $0x1,(%rcx),%xmm4
 : 2982   for (u=0; u < (ssize_t) kernel->width; u++, k--) {
0.46 :   423cbb: add    $0x2,%rdi
0.07 :   423cbf: add    $0xfff0,%rcx
 : 2984   result.red += (*k)*k_pixels[u].red;
0.03 :   423cc3: vpshufb %xmm12,%xmm5,%xmm6
6.81 :   423cc8: vcvtdq2pd %xmm6,%xmm6
   13.05 :   423ccc: vfmadd231pd %xmm6,%xmm4,%xmm1
 : 2985   result.green   += (*k)*k_pixels[u].green;
   17.45 :   423cd1: vpshufb %xmm15,%xmm5,%xmm6 // morphology.c:2985
0.33 :   423cd6: vcvtdq2pd %xmm6,%xmm6
0.00 :   423cda: vfmadd231pd %xmm6,%xmm4,%xmm3
 : 2986   result.blue+= (*k)*k_pixels[u].blue;
   15.28 :   423cdf: vpshufb %xmm13,%xmm5,%xmm6 // morphology.c:2986
 : 2987   result.opacity += (*k)*k_pixels[u].opacity;
0.00 :   423ce4: vpshufb %xmm8,%xmm5,%xmm5
 : 2986   result.blue+= (*k)*k_pixels[u].blue;
0.00 :   423ce9: vcvtdq2pd %xmm6,%xmm6
 : 2987   result.opacity += (*k)*k_pixels[u].opacity;
0.21 :   423ced: vcvtdq2pd %xmm5,%xmm5
 : 2986   result.blue+= (*k)*k_pixels[u].blue;
0.97 :   423cf1: vfmadd231pd %xmm6,%xmm4,%xmm0
 : 2987   result.opacity += (*k)*k_pixels[u].opacity;
   19.16 :   423cf6: vfmadd231pd %xmm5,%xmm4,%xmm2 // morphology.c:2987
 : 2982   for (u=0; u < (ssize_t) kernel->width; u++, k--) {
   14.51 :   423cfb: cmp    %rdi,%rbp // morphology.c:2982
0.00 :   423cfe: jne    423cb0

Changing short to double makes it vectorized:
.L2:
vmovupd (%rax), %ymm4
vmovupd 64(%rax), %ymm2
subq    $-128, %rax
subq    $32, %rdx
vunpcklpd   -96(%rax), %ymm4, %ymm1
vunpckhpd   -96(%rax), %ymm4, %ymm0
vmovupd -64(%rax), %ymm4
vunpckhpd   -32(%rax), %ymm2, %ymm2
vunpcklpd   -32(%rax), %ymm4, %ymm4
vpermpd $27, 32(%rdx), %ymm3
vpermpd $216, %ymm1, %ymm1
vpermpd $216, %ymm0, %ymm0
vpermpd $216, %ymm2, %ymm2
vpermpd $216, %ymm4, %ymm4
vunpcklpd   %ymm2, %ymm0, %ymm10
vunpckhpd   %ymm2, %ymm0, %ymm0
vunpckhpd   %ymm4, %ymm1, %ymm9
vunpcklpd   %ymm4, %ymm1, %ymm1
vpermpd $216, %ymm10, %ymm10
vpermpd $216, %ymm0, %ymm0
vfmadd231pd %ymm3, %ymm10, %ymm6
vfmadd231pd %ymm3, %ymm0, %ymm8
vpermpd $216, %ymm9, %ymm9
vpermpd $216, %ymm1, %ymm1
vfmadd231pd %ymm3, %ymm1, %ymm5
vfmadd231pd %ymm3, %ymm9, %ymm7
cmpq    %rax, %rcx
jne .L2

However, clang's code looks shorter:
LBB0_1:    # =>This Inner Loop Header: Depth=1
vbroadcastsd    (%rdi,%rcx,8), %ymm1
vfmadd231pd (%rax), %ymm1, %ymm0    # ymm0 = (ymm1 * mem) + ymm0

[Bug tree-optimization/94026] combine missed opportunity to simplify comparisons with zero

2022-06-24 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94026

--- Comment #7 from Segher Boessenkool  ---
For Power, both the original testcase and the one in comment 5 generate perfect
code, for all -mcpu= I tested.  Should this be a target bug?

[Bug target/105991] [12/13 Regression] rldicl+sldi+add generated instead of rldimi

2022-06-24 Thread segher at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105991

Segher Boessenkool  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #8 from Segher Boessenkool  ---
Yes, this needs a backport.

[Bug middle-end/106080] Labels as values triggering -Wdangling-pointer

2022-06-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106080

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://github.com/ocaml/oc
   ||aml/issues/11358
URL|https://github.com/ocaml/oc |
   |aml/issues/11358|

--- Comment #1 from Andrew Pinski  ---
Storing the addresses of local labels in globals might cause issues, as you
cannot use computed gotos to jump to them without undefined results, so the
warning message should be changed.
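For contrast, a small sketch of the labels-as-values usage that stays inside one function, which is the case the GNU C extension is meant for. The function name run and the three-opcode "bytecode" are invented for illustration and have nothing to do with the OCaml runtime sources. The label addresses live in a function-local static table and are only ever jumped to from within the same function.

```c
/* Hypothetical mini interpreter: opcode 0 = halt, 1 = increment,
   2 = decrement.  Returns the accumulator when halt is reached.
   Requires the GNU C labels-as-values extension (GCC or Clang).  */
int
run (const unsigned char *code)
{
  /* Addresses of labels local to this function.  */
  static void *dispatch[] = { &&op_halt, &&op_inc, &&op_dec };
  int acc = 0;

  goto *dispatch[*code++];

op_inc:
  acc++;
  goto *dispatch[*code++];

op_dec:
  acc--;
  goto *dispatch[*code++];

op_halt:
  return acc;
}
```

The sketch does no bounds checking on the opcode, so it is only safe for programs made of the three opcodes above and terminated by a halt.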

[Bug fortran/106071] single where run error

2022-06-24 Thread kargl at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106071

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org
   Last reconfirmed||2022-06-24
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from kargl at gcc dot gnu.org ---
(In reply to han.wu from comment #0)
> two layer where work ok, why single where work error? 
> 

Clearly a bug due to calling a function with side effects.  While valid, it is
a questionable programming style.

[Bug middle-end/106080] New: Labels as values triggering -Wdangling-pointer

2022-06-24 Thread david at tarides dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106080

Bug ID: 106080
   Summary: Labels as values triggering -Wdangling-pointer
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: david at tarides dot com
  Target Milestone: ---

Given file1.c:

```
char * caml_instr_base;
void f (void) { return; }
```

and

file2.c:

```
extern char * caml_instr_base;
void f (void);

int main (void) {
lbl_ACC0:
  caml_instr_base = &&lbl_ACC0;
  /* Uncommenting this call suppresses the dangling-pointer warning */
  /*f();*/

  return 0;
}
```

GCC 12.1.0 (from the published images on Docker Hub), invoked as `gcc -Wall -c
file1.c file2.c`, gives:

file2.c:6:19: warning: storing the address of local variable 'lbl_ACC0' in
'caml_instr_base' [-Wdangling-pointer=]

The message is at least unclear (lbl_ACC0 is a label, not a local variable)
and, further, if the call to f is uncommented then the warning completely
disappears.

[Bug c++/105931] [12/13 regression] ICE in cxx_eval_constant_expression

2022-06-24 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105931

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #8 from Patrick Palka  ---
Should be fully fixed for 12.2/trunk now, thanks and sorry for the delay.

[Bug c++/105931] [12/13 regression] ICE in cxx_eval_constant_expression

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105931

--- Comment #7 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:5cf4746c3d4e80a360bd15b31136339d6812597e

commit r12-8513-g5cf4746c3d4e80a360bd15b31136339d6812597e
Author: Patrick Palka 
Date:   Thu Jun 23 16:36:43 2022 -0400

c++: constexpr folding in unevaluated context [PR105931]

Changing the type of N from int to unsigned in decltype82.C (from
r13-986-g0ecb6b906f215e) reveals another spot where we perform constexpr
evaluation in an unevaluated context for sake of warnings, this time
from the call to shorten_compare in cp_build_binary_op, which calls
fold_for_warn.

We could (and probably should) suppress the shorten_compare warnings
when in an unevaluated context, but there's probably other callers of
fold_for_warn that are similarly affected.  So this patch takes the
approach of directly suppressing fold_for_warn when in an unevaluated
context.

PR c++/105931

gcc/cp/ChangeLog:

* expr.cc (fold_for_warn): Don't fold when in an unevaluated
context.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype82a.C: New test.

(cherry picked from commit b00b95198e6720eb23a2618870d67800f6180fdd)

[Bug ipa/106077] Invalid IPA-SRA with non-call exceptions

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106077

--- Comment #1 from Jan Hubicka  ---
Also note that the dominance check is written the wrong way round, so it only
passes for the first BB in the function:

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 96b020fb2dd..6b2df2f3ff0 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -2403,8 +2403,8 @@ process_scan_results (cgraph_node *node, struct function
*fun,
pdoms_calculated = true;
  }
if (dominated_by_p (CDI_POST_DOMINATORS,
-   gimple_bb (call_stmt),
-   single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun))))
+   single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun)),
+   gimple_bb (call_stmt)))
  csum->m_arg_flow[argidx].safe_to_import_accesses = true;
  }

[Bug target/106069] [12/13 Regression] wrong code with -O -fno-tree-forwprop -maltivec on ppc64le

2022-06-24 Thread mpolacek at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106069

--- Comment #3 from Marek Polacek  ---
Sure.  (If you're looking for a ppc64le machine, the compile farm has a few.)

$ diff -up q95.s q96.s
--- q95.s   2022-06-23 23:08:22.870777519 +
+++ q96.s   2022-06-23 23:08:10.990476157 +
@@ -12,12 +12,12 @@ _Z3fooPhPj:
 0: addis 2,12,.TOC.-.LCF0@ha
addi 2,2,.TOC.-.LCF0@l
.localentry _Z3fooPhPj,.-_Z3fooPhPj
-   lxsiwzx 50,0,4
+   lxsiwzx 49,0,4
xxlxor 0,0,0
-   xxpermdi 50,0,50,0
+   xxpermdi 49,0,49,0
addi 9,4,8
-   lxsiwzx 51,0,9
-   xxpermdi 51,0,51,0
+   lxsiwzx 50,0,9
+   xxpermdi 50,0,50,0
addi 9,4,20
lxsiwzx 44,0,9
xxpermdi 44,0,44,0
@@ -28,8 +28,8 @@ _Z3fooPhPj:
lxsiwzx 32,0,9
xxpermdi 32,0,32,0
addi 9,4,32
-   lxsiwzx 49,0,9
-   xxpermdi 49,0,49,0
+   lxsiwzx 34,0,9
+   xxpermdi 34,0,34,0
addi 9,4,36
lxsiwzx 43,0,9
xxpermdi 43,0,43,0
@@ -40,8 +40,8 @@ _Z3fooPhPj:
lxsiwzx 33,0,9
xxpermdi 33,0,33,0
addi 9,4,48
-   lxsiwzx 34,0,9
-   xxpermdi 34,0,34,0
+   lxsiwzx 35,0,9
+   xxpermdi 35,0,35,0
addi 9,4,52
lxsiwzx 38,0,9
xxpermdi 38,0,38,0
@@ -51,14 +51,14 @@ _Z3fooPhPj:
addi 9,4,60
lxsiwzx 39,0,9
xxpermdi 39,0,39,0
-   xxlor 48,50,50
-   xxlor 35,50,50
+   xxlor 47,49,49
+   xxlor 51,49,49
addis 9,2,.LC0@toc@ha
addi 9,9,.LC0@toc@l
lvx 4,0,9
addis 9,2,.LC1@toc@ha
addi 9,9,.LC1@toc@l
-   lvx 15,0,9
+   lvx 16,0,9
li 9,10
mtctr 9
 .L2:
@@ -68,34 +68,30 @@ _Z3fooPhPj:
xxlxor 45,45,37
xxlxor 32,32,33
vrlw 0,0,4
-   vadduwm 8,3,12
-   vadduwm 9,16,13
-   vadduwm 10,19,0
-   vadduwm 3,8,12
-   vadduwm 16,9,13
-   vadduwm 19,10,0
-   xxlxor 40,40,35
+   vadduwm 8,19,12
+   vadduwm 9,15,13
+   vadduwm 10,18,0
+   vadduwm 19,8,12
+   vadduwm 15,9,13
+   vadduwm 18,10,0
+   xxlxor 40,40,51
xxlxor 39,39,40
-   xxlxor 41,34,41
-   vrlw 2,9,15
+   xxlxor 41,35,41
+   vrlw 3,9,16
xxlxor 42,38,42
-   vrlw 6,10,15
+   vrlw 6,10,16
vadduwm 5,5,7
-   vadduwm 1,1,2
-   vadduwm 17,17,6
+   vadduwm 1,1,3
+   vadduwm 2,2,6
vadduwm 11,11,14
xxlxor 44,37,44
vrlw 12,12,4
xxlxor 45,33,45
vrlw 13,13,4
-   xxlxor 32,49,32
+   xxlxor 32,34,32
vrlw 0,0,4
bdnz .L2
-   vadduwm 3,3,18
-   xxmrglw 51,51,35
-   xxmrglw 50,50,48
-   xxmrglw 50,50,51
-   vspltw 0,18,3
+   vspltw 0,17,0
mfvsrwz 9,32
stb 9,0(3)
blr

[Bug middle-end/106078] Invalid loop invariant motion with non-call-exceptions

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106078

--- Comment #4 from Richard Biener  ---
(In reply to Richard Biener from comment #2)
> (In reply to Jan Hubicka from comment #1)
> > This is version that does not need -fnon-call-exceptions
> > If called test (NULL, 0) it should be indefinitely increasing val rather
> > then segfaulting.  Seems clang gets this one right.
> > 
> > int array[1];
> > volatile int val;
> > int test(short *b,int s)
> > {
> > for (int i = 0; i<1;i++)
> >   {
> > for (int j = 0; j < 10; j+=s)
> > val++;
> > array[i]+=*b;
> >   }
> > }
> 
> For this one it's PRE hoisting *b across the endless loop (PRE handles
> calls as possibly not returning but not loops as possibly not terminating...)
> So it's a different bug.

Btw, C++ requiring forward progress makes the testcase undefined.

[Bug middle-end/106078] Invalid loop invariant motion with non-call-exceptions

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106078

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Keywords||wrong-code
 Status|NEW |ASSIGNED

--- Comment #3 from Richard Biener  ---
(In reply to Jan Hubicka from comment #0)
> Here I think it is invalid to move *b out of the loop with
> -fnon-call-exceptions:
> 
> int array[1];
> int test(short *b,int e, int f)
> {
> for (int i = 0; i<1;i++)
>   {
> e/=f;
> array[i]+=*b+e;
>   }
> }

LIM uses nonpure_call_p to see whether a stmt possibly terminates a block
(and 'contains_call').  We probably have to treat const/pure throwing
calls and general throwing (or trapping) stmts the same way.

[Bug middle-end/106078] Invalid loop invariant motion with non-call-exceptions

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106078

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2022-06-24
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener  ---
(In reply to Jan Hubicka from comment #1)
> This is version that does not need -fnon-call-exceptions
> If called test (NULL, 0) it should be indefinitely increasing val rather
> then segfaulting.  Seems clang gets this one right.
> 
> int array[1];
> volatile int val;
> int test(short *b,int s)
> {
> for (int i = 0; i<1;i++)
>   {
> for (int j = 0; j < 10; j+=s)
> val++;
> array[i]+=*b;
>   }
> }

For this one it's PRE hoisting *b across the endless loop (PRE handles
calls as possibly not returning but not loops as possibly not terminating...)
So it's a different bug.

[Bug libfortran/106079] [12/13 regression] gfortran.dg/boz_15.f90 fails after gcc-12-6498-g07c60b8e33

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106079

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |12.2

[Bug tree-optimization/103035] [meta-bug] YARPGen bugs

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103035
Bug 103035 depends on bug 106025, which changed state.

Bug 106025 Summary: [13 Regression] Incorrect optimization at -O2 leads to 
infinite test execution time since r13-469-g9a53101caadae1b5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106025

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |DUPLICATE

[Bug tree-optimization/106070] [13 Regression] Wrong code with -O1 since r13-469-g9a53101caadae1b5

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106070

--- Comment #8 from Richard Biener  ---
*** Bug 106025 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/106025] [13 Regression] Incorrect optimization at -O2 leads to infinite test execution time since r13-469-g9a53101caadae1b5

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106025

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Richard Biener  ---
And as expected fixed by the fix for PR106070.

*** This bug has been marked as a duplicate of bug 106070 ***

[Bug tree-optimization/103035] [meta-bug] YARPGen bugs

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103035
Bug 103035 depends on bug 106070, which changed state.

Bug 106070 Summary: [13 Regression] Wrong code with -O1 since 
r13-469-g9a53101caadae1b5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106070

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/106070] [13 Regression] Wrong code with -O1 since r13-469-g9a53101caadae1b5

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106070

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Richard Biener  ---
Fixed.

[Bug tree-optimization/106070] [13 Regression] Wrong code with -O1 since r13-469-g9a53101caadae1b5

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106070

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:b36a1c964f99758de1f3b169628965d3c3af812b

commit r13-1243-gb36a1c964f99758de1f3b169628965d3c3af812b
Author: Richard Biener 
Date:   Fri Jun 24 13:37:22 2022 +0200

middle-end/106070 - bogus cond-expr folding

The following fixes up r13-469-g9a53101caadae1b5 by properly
implementing what operand_equal_for_comparison_p did.

2022-06-24  Richard Biener  

PR middle-end/106070
* match.pd (a != b ? a : b): Fix translation of
operand_equal_for_comparison_p.

* gcc.dg/torture/pr106070.c: New testcase.

[Bug target/88469] [7/8 regression] AAPCS/AAPCS64 - Struct with 64-bit bitfield (128-bit on AArch64) may be passed in wrong registers

2022-06-24 Thread jenimjohn at orgs dot com.co via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

Jenim  changed:

   What|Removed |Added

 CC||jenimjohn at orgs dot com.co

--- Comment #19 from Jenim  ---
003c21e0 <_ZN11cgraph_node11create_edgeEPS_P5gcall13profile_count>:
  3c21e0:   e24dd008sub sp, sp, #8
  3c21e4:   e309c130movwip, #37168  ; 0x9130
  3c21e8:   e340c133movtip, #307; 0x133
  3c21ec:   e92d4370push{r4, r5, r6, r8, r9, lr}
  3c21f0:   e24dd018sub sp, sp, #24
  3c21f4:   e1a05001mov r5, r1
  3c21f8:   e1a06000mov r6, r0
  3c21fc:   e58d3034str r3, [sp, #52]   ; 0x34
  3c2200:   e1a03002mov r3, r2
  3c2204:   e1cd83d4ldrdr8, [sp, #52]   ; 0x34
  3c2208:   e1a02001mov r2, r1
  3c220c:   e28d1018add r1, sp, #24
  3c2210:   e3a0e000mov lr, #0
  3c2214:   e1cd81f0strdr8, [sp, #16]
  3c2218:   e9110003ldmdb   r1, {r0, r1}
  3c221c:   e58de008str lr, [sp, #8]
  3c2220:   e88d0003stm sp, {r0, r1}
  3c2224:   e1a01006mov r1, r6
  3c2228:   e59cldr r0, [ip]
  3c222c:   eb43bl  3c1f40
<_ZN12symbol_table11create_edgeEP11cgraph_nodeS1_P5gcall13profile_countb>
  3c2230:   e1a04000mov r4, r0
  3c2234:   eb071320bl  586ebc
<_Z24initialize_inline_failedP11cgraph_edge>
  3c2238:   e5953044ldr r3, [r5, #68]   ; 0x44
  3c223c:   e1a4mov r0, r4
  3c2240:   e353cmp r3, #0
  3c2244:   e5843014str r3, [r4, #20]
  3c2248:   15834010strne   r4, [r3, #16]
  3c224c:   e5963040ldr r3, [r6, #64]   ; 0x40
  3c2250:   e353cmp r3, #0
  3c2254:   e584301cstr r3, [r4, #28]
  3c2258:   15834018strne   r4, [r3, #24]
  3c225c:   e5864040str r4, [r6, #64]   ; 0x40
  3c2260:   e5854044str r4, [r5, #68]   ; 0x44
  3c2264:   e28dd018add sp, sp, #24
  3c2268:   e8bd4370pop {r4, r5, r6, r8, r9, lr}
  3c226c:   e28dd008add sp, sp, #8
  3c2270:   e12fff1ebx  lr

[Bug libfortran/106079] New: [12/13 regression] gfortran.dg/boz_15.f90 fails after gcc-12-6498-g07c60b8e33

2022-06-24 Thread seurer at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106079

Bug ID: 106079
   Summary: [12/13 regression] gfortran.dg/boz_15.f90 fails after
gcc-12-6498-g07c60b8e33
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:07c60b8e33c614a6cdd9fe3de7f409319b6a239a, gcc-12-6498-g07c60b8e33
make  -k check-gcc-fortran RUNTESTFLAGS="dg.exp=gfortran.dg/boz_15.f90"
FAIL: gfortran.dg/boz_15.f90   -O0  execution test
FAIL: gfortran.dg/boz_15.f90   -O1  execution test
FAIL: gfortran.dg/boz_15.f90   -O2  execution test
FAIL: gfortran.dg/boz_15.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/boz_15.f90   -O3 -g  execution test
FAIL: gfortran.dg/boz_15.f90   -Os  execution test
# of expected passes        6
# of unexpected failures    6


commit 07c60b8e33c614a6cdd9fe3de7f409319b6a239a (HEAD)
Author: Jakub Jelinek 
Date:   Tue Jan 4 10:37:48 2022 +0100

fortran, libgfortran: -mabi=ieeelongdouble I/O


At line 20 of file
/home/seurer/gcc/git/gcc-12-test/gcc/testsuite/gfortran.dg/boz_15.f90
Fortran runtime error: Bad value during integer read

Error termination. Backtrace:
#0  0x7fffa583af03 in formatted_transfer_scalar_read
at /home/seurer/gcc/git/gcc-12-test/libgfortran/io/transfer.c:1513
#1  0x7fffa583c307 in formatted_transfer
at /home/seurer/gcc/git/gcc-12-test/libgfortran/io/transfer.c:2339
#2  0x7fffa5836f47 in wrap_scalar_transfer
at /home/seurer/gcc/git/gcc-12-test/libgfortran/io/transfer.c:2382
#3  0x100104d3 in ???
#4  0x10010ce3 in ???
#5  0x7fffa5237f2b in ???
#6  0x7fffa5238107 in ???
#7  0x in ???
FAIL: gfortran.dg/boz_15.f90   -O0  execution test

This occurs with the compiler configured with  --with-long-double-format=ieee
on a distro (Fedora 36) which also used  --with-long-double-format=ieee.

[Bug c++/106057] Missed stmt_can_throw_external check in stmt_kills_ref_p

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106057

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Jan Hubicka:

https://gcc.gnu.org/g:7fd34782b95bbe1b4dc9936b8923f86d4aaee379

commit r13-1241-g7fd34782b95bbe1b4dc9936b8923f86d4aaee379
Author: Jan Hubicka 
Date:   Fri Jun 24 13:52:44 2022 +0200

Fix stmt_kills_ref_p WRT external throws

Add missing check to stmt_kills_ref_p for case that function
is terminated by EH before call return value kills the ref. In the PR
I tried to construct testcase but I don't know how to do that until I
annotate EH code with fnspec attributes which I will do in separate patch
and add a testcase.

PR ipa/106057
* tree-ssa-alias.cc (stmt_kills_ref_p): Check for external throw.

[Bug middle-end/106078] Invalid loop invariant motion with non-call-exceptions

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106078

--- Comment #1 from Jan Hubicka  ---
This is a version that does not need -fnon-call-exceptions.
If called as test (NULL, 0), it should increase val indefinitely rather than
segfaulting.  Seems clang gets this one right.

int array[1];
volatile int val;
int test(short *b,int s)
{
for (int i = 0; i<1;i++)
  {
for (int j = 0; j < 10; j+=s)
val++;
array[i]+=*b;
  }
}

[Bug c++/98992] attribute malloc error associating a member deallocator with an allocator

2022-06-24 Thread rdiezmail-gcc at yahoo dot de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98992

R. Diez  changed:

   What|Removed |Added

 CC||rdiezmail-gcc at yahoo dot de

--- Comment #1 from R. Diez  ---
I have been using the following up to GCC 11.3:

struct MyClass
{
  static void FreeMemory ( const void * pMem ) throw();

  #if __GNUC_PREREQ(11, 0)
__attribute__ (( malloc, malloc( MyClass::FreeMemory, 1 ) ))
  #else
__attribute__ (( malloc ))
  #endif

  static void * AllocMemory ( size_t Size ) throw();

  [...]
};

However, GCC 12.1 does not want to accept it anymore:

error: 'malloc' attribute argument 1 does not name a function

I tried placing the attribute outside the class, like this:

__attribute__ (( malloc, malloc( MyClass::FreeMemory, 1 ) ))
void * MyClass::AllocMemory ( size_t Size ) throw()
{
  return malloc( Size );
}

But then I got 2 errors:

error: 'static void MyClass::FreeMemory(const void*)' is protected within this
context
  710 | __attribute__ (( malloc, malloc( MyClass::FreeMemory, 1 ) ))
  |   ^~

error: 'malloc' attribute argument 1 does not name a function

That cannot be right. GCC should not insist that the deallocator is public.
After all, the allocator AllocMemory is not public.
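For reference, a plain C sketch of the allocator/deallocator pairing with free functions, where the access question does not arise. The names PAIRED_WITH, my_alloc and my_free are made up for this example. It assumes a GCC-compatible compiler; the two-argument form of the malloc attribute is only used when __GNUC__ is at least 11, and the deallocator has to be declared before the attribute refers to it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Use the deallocator-pairing form only where it is available.  */
#if defined (__GNUC__) && __GNUC__ >= 11
# define PAIRED_WITH(dealloc) __attribute__ ((malloc, malloc (dealloc, 1)))
#else
# define PAIRED_WITH(dealloc) __attribute__ ((malloc))
#endif

/* The deallocator must be visible before it is named in the attribute.  */
void my_free (void *p);

/* Memory returned by my_alloc is to be released with my_free, passing
   the pointer as argument 1.  */
PAIRED_WITH (my_free) void *my_alloc (size_t size);

void
my_free (void *p)
{
  free (p);
}

void *
my_alloc (size_t size)
{
  return malloc (size);
}
```

This says nothing about how the attribute should behave for protected static member functions, which is the actual subject of this report.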

[Bug tree-optimization/106063] [12/13 Regression] ICE: in gimple_expand_vec_cond_expr, at gimple-isel.cc:281 with -O2 -fno-tree-forwprop --param=evrp-mode=legacy-first

2022-06-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106063

--- Comment #3 from Tamar Christina  ---
(In reply to Richard Biener from comment #2)
> 
> but after vector lowering only vector operations that are handled by the
> target may be introduced.  The pattern
> 

We can't tell that we're after veclower, can we? So does it make sense to just
never introduce a vector operation the target has no optab for?

[Bug middle-end/106078] New: Invalid loop invariant motion with non-call-exceptions

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106078

Bug ID: 106078
   Summary: Invalid loop invariant motion with non-call-exceptions
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

Here I think it is invalid to move *b out of the loop with
-fnon-call-exceptions:

int array[1];
int test(short *b,int e, int f)
{
for (int i = 0; i<1;i++)
  {
e/=f;
array[i]+=*b+e;
  }
}

jan@localhost:~> more t.s
.text
.file   "t.c"
.globl  _Z4testPsii # -- Begin function _Z4testPsii
.p2align    4, 0x90
.type   _Z4testPsii,@function
_Z4testPsii:    # @_Z4testPsii
.cfi_startproc
# %bb.0:
movl    %edx, %ecx
movl    %esi, %eax
movswl  (%rdi), %esi
xorl    %edi, %edi
.p2align    4, 0x90
.LBB0_1:    # =>This Inner Loop Header: Depth=1
cltd
idivl   %ecx
movl    %eax, %edx
addl    %esi, %edx
addl    %edx, array(,%rdi,4)
addq    $1, %rdi
jmp .LBB0_1
.Lfunc_end0:

The compiled code will segfault instead of dividing by zero.
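A source-level sketch of what the hoisting amounts to, to make the ordering problem concrete. The functions loop_original and loop_hoisted and the bound N are invented for illustration; the second function is a hand-written model of the transformation, not GCC output. For valid inputs both compute the same values. With b == NULL and f == 0, the original reaches the division first, while the hoisted form dereferences b before any division happens.

```c
#define N 4   /* hypothetical small trip count */

/* Original order: the division (which may trap when f == 0) comes
   before the load of *b in every iteration.  */
void
loop_original (int *array, const short *b, int e, int f)
{
  for (int i = 0; i < N; i++)
    {
      e /= f;
      array[i] += *b + e;
    }
}

/* Hand-hoisted order: the load of *b is moved in front of the loop, so
   it now executes before the first division.  */
void
loop_hoisted (int *array, const short *b, int e, int f)
{
  int tmp = *b;

  for (int i = 0; i < N; i++)
    {
      e /= f;
      array[i] += tmp + e;
    }
}
```

So the two forms are only interchangeable if nothing between the loop entry and the load can trap or throw, which is exactly what -fnon-call-exceptions changes for the division.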

[Bug middle-end/106075] Wrong DSE with -fnon-call-exceptions

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106075

--- Comment #5 from Jan Hubicka  ---
Also note that the longjmp testcase will not get misoptimized since we consider
longjmp as using all global memory.

[Bug middle-end/106075] Wrong DSE with -fnon-call-exceptions

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106075

--- Comment #4 from Jan Hubicka  ---
PR106077 demonstrates a related problem where ipa-sra concludes it is safe to
move a dereference earlier in the code.  It uses a dominator test for that.

[Bug ipa/106077] New: Invalid IPA-SRA with non-call exceptions

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106077

Bug ID: 106077
   Summary: Invalid IPA-SRA with non-call exceptions
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

When compiled with -O2 -fno-ipa-cp, GCC makes t() trap instead of throwing the
non-call exception from wrap:

short e,f;
static __attribute__ ((noinline))
int a(int *b)
{
return *b;
}
static __attribute__ ((noinline))
__attribute__ ((optimize("non-call-exceptions")))
int wrap(int *b,int e, int f)
{
e/=f;
return a(b)+e;
}

int
t()
{
return wrap(0,1,0);
}

[Bug tree-optimization/106070] [13 Regression] Wrong code with -O1 since r13-469-g9a53101caadae1b5

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106070

--- Comment #5 from Richard Biener  ---
+Applying pattern match.pd:4591, gimple-match.cc:59644
...
@@ -31,7 +41,7 @@
   var_4.3_3 = (unsigned int) var_4.2_2;
   iftmp.0_9 = (long unsigned int) var_2.1_1;
   iftmp.0_10 = (long unsigned int) var_4.2_2;
   _4 = var_2.1_1 != var_4.3_3;
-  iftmp.0_6 = _4 ? iftmp.0_10 : iftmp.0_9;
+  iftmp.0_6 = (long unsigned int) var_4.3_3;

[local count: 955630225]:
   # a_16 = PHI 

so that's

 var_2 != (unsigned) var_4 ? -> (unsigned long) var_4 : (unsigned long) var_2

which we turn into (unsigned long) (unsigned) var_4.  I thought that's what
generic also does but I have to double-check operand_equal_for_comparison_p
here.

Ah, it checks

  /* Discard a single widening conversion from ARG1 and see if the inner
 value is the same as ARG0.  */
  if (CONVERT_EXPR_P (arg1)
  && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (arg1, 0)))
  && TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (arg1, 0)))
 < TYPE_PRECISION (TREE_TYPE (arg1))
  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))

and the operand_equal_p matches against the non-NOP-stripped arg0.  So it's
either a same precision or widening of the comparison op.

Hmm, will have to think how to translate that into a sign check to make it
fit match.pd (I don't want to export and use operand_equal_for_comparison_p)
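
To illustrate why the fold above gives wrong code, here is a minimal sketch in
C.  It assumes var_2 is an unsigned int and var_4 is a signed int (the dump
suggests this but does not state it), and it uses unsigned long long as the
wide type so the widening is also visible where long is 32 bits.  The function
names are made up for illustration and do not appear in the PR.

```c
#include <assert.h>
#include <limits.h>

/* The original selection: the "not equal" arm sign-extends var_4.  */
unsigned long long
select_original (unsigned int var_2, int var_4)
{
  return var_2 != (unsigned int) var_4
         ? (unsigned long long) var_4
         : (unsigned long long) var_2;
}

/* What the fold produced: a zero-extension of (unsigned) var_4.  */
unsigned long long
select_folded (unsigned int var_2, int var_4)
{
  (void) var_2;
  return (unsigned long long) (unsigned int) var_4;
}

/* The two only differ when var_4 is negative and the operands compare
   unequal; this is an illustrative self-check, not the PR testcase.  */
void
fold_demo (void)
{
  assert (select_original (0u, -1) == ULLONG_MAX);
  assert (select_folded (0u, -1) == (unsigned long long) UINT_MAX);
  assert (select_original (0u, 7) == select_folded (0u, 7));
}
```

The widening conversion in the "not equal" arm has to be discarded before the
operands are compared, which is what the quoted
operand_equal_for_comparison_p check looks for.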

[Bug tree-optimization/106064] Wrong code comparing two global zero-sized arrays

2022-06-24 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106064

--- Comment #8 from Jakub Jelinek  ---
The IMHO UB case is a != b when one address is at the start of one object and
the other address is at the end of another one, which happens more often for
zero-sized objects because the start address is the same as the end address.
For integral comparisons we try to be more conservative.

[Bug tree-optimization/106076] Sub-optimal code is generated for checking bitfields via proxy functions

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106076

Richard Biener  changed:

               What|Removed                     |Added

     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-06-24
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization

--- Comment #2 from Richard Biener  ---
Confirmed.  That's optimize_bit_field_compare in fold-const.c, which doesn't
work on GIMPLE (after inlining).  The fold-const.c part is also premature, so
moving it to GIMPLE would be appreciated.

[Bug middle-end/106075] Wrong DSE with -fnon-call-exceptions

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106075

Richard Biener  changed:

               What|Removed                     |Added

             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-06-24
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
Related:

int a = 1;
int c,d;
void
test()
{
  a=12345;
  c/d;
  a=1;
}

where the possibly throwing division (by zero) does not have virtual operands.
Likewise

void __attribute__((noreturn,const)) foo ()
{
  longjmp ();
}

int a = 1;
int c,d;
void
test()
{
  a=12345;
  foo ();
  a=1;
}

but then we can simply declare 'const' invalid on 'foo'.

For the non-VOP case we'd need to assign a "context" counter to stmts
(in UID for example) and increment it when seeing a possible (implicit)
control flow terminating statement.  When the DSE walk discovers a new
context it has to consider that a use.  The expense is an extra whole-IL
walk over the function [with -fnon-call-exceptions].  Note there's also
the possibility to create a const externally throwing function but its
semantics are disputed (see another PR for that).
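
The "context" counter idea can be sketched on a toy statement list.  This is
only an illustration under simplifying assumptions, not GCC's DSE: the
statement kinds, struct layout and function names are all hypothetical, and
STMT_OTHER is assumed to neither read nor write the tracked variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy statement kinds; real GIMPLE is far richer than this.  */
enum stmt_kind { STMT_STORE, STMT_MAY_THROW, STMT_OTHER };

struct stmt
{
  enum stmt_kind kind;
  int var;     /* Variable written by a STMT_STORE, unused otherwise.  */
  int context; /* Filled in by assign_contexts.  */
};

/* The extra walk: every statement that may terminate control flow
   implicitly starts a new context for the statements after it.  */
void
assign_contexts (struct stmt *stmts, size_t n)
{
  int context = 0;
  for (size_t i = 0; i < n; i++)
    {
      stmts[i].context = context;
      if (stmts[i].kind == STMT_MAY_THROW)
        context++;
    }
}

/* Return true if the store at index I is overwritten by a later store
   to the same variable before the walk reaches a new context.
   Reaching a new context counts as a use, so the store is kept.  */
bool
store_is_dead (const struct stmt *stmts, size_t n, size_t i)
{
  for (size_t j = i + 1; j < n; j++)
    {
      if (stmts[j].context != stmts[i].context)
        return false;
      if (stmts[j].kind == STMT_STORE && stmts[j].var == stmts[i].var)
        return true;
    }
  return false;
}

/* Models "a=12345; c/d; a=1;" with and without a throwing division.  */
void
dse_context_demo (void)
{
  struct stmt throwing[] = { { STMT_STORE, 0, 0 },
                             { STMT_MAY_THROW, -1, 0 },
                             { STMT_STORE, 0, 0 } };
  struct stmt plain[] = { { STMT_STORE, 0, 0 },
                          { STMT_OTHER, -1, 0 },
                          { STMT_STORE, 0, 0 } };
  assign_contexts (throwing, 3);
  assign_contexts (plain, 3);
  assert (!store_is_dead (throwing, 3, 0)); /* a=12345 must stay */
  assert (store_is_dead (plain, 3, 0));     /* a=12345 can go */
}
```

In this toy model the first store survives only when a possibly throwing
statement sits between the two stores, which is the behaviour wanted for the
testcase above.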

[Bug tree-optimization/106064] Wrong code comparing two global zero-sized arrays

2022-06-24 Thread acoplan at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106064

--- Comment #7 from Alex Coplan  ---
(In reply to Richard Biener from comment #6)
> Note that a != b is probably undefined (but zero size objects are a GNU
> extension).

Just to clarify, are you saying this is undefined specifically for zero size
objects or undefined in general for distinct objects? I thought for the latter
== and != were ok, but < and such were UB.

[Bug tree-optimization/106073] [12/13 Regression] wrong code at -O3 on x86_64-linux-gnu

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106073

Richard Biener  changed:

               What|Removed                     |Added

           Priority|P3                          |P2
            Version|unknown                     |12.1.1

[Bug tree-optimization/106070] [13 Regression] Wrong code with -O1 since r13-469-g9a53101caadae1b5

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106070

Richard Biener  changed:

               What|Removed                        |Added

           Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|---                            |13.0
             Status|NEW                            |ASSIGNED

--- Comment #4 from Richard Biener  ---
Ah, smaller.  Nice.

[Bug tree-optimization/106064] Wrong code comparing two global zero-sized arrays

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106064

--- Comment #6 from Richard Biener  ---
Note that a != b is probably undefined (but zero size objects are a GNU
extension).

[Bug tree-optimization/106063] [12/13 Regression] ICE: in gimple_expand_vec_cond_expr, at gimple-isel.cc:281 with -O2 -fno-tree-forwprop --param=evrp-mode=legacy-first

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106063

Richard Biener  changed:

               What|Removed                     |Added

   Target Milestone|---                         |12.2
           Priority|P3                          |P2
            Summary|[13 Regression] ICE: in     |[12/13 Regression] ICE: in
                   |gimple_expand_vec_cond_expr |gimple_expand_vec_cond_expr
                   |, at gimple-isel.cc:281     |, at gimple-isel.cc:281
                   |with -O2 -fno-tree-forwprop |with -O2 -fno-tree-forwprop
                   |--param=evrp-mode=legacy-fi |--param=evrp-mode=legacy-fi
                   |rst                         |rst

[Bug tree-optimization/106063] [13 Regression] ICE: in gimple_expand_vec_cond_expr, at gimple-isel.cc:281 with -O2 -fno-tree-forwprop --param=evrp-mode=legacy-first

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106063

Richard Biener  changed:

               What|Removed                     |Added

     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |tnfchris at gcc dot gnu.org
   Last reconfirmed|                            |2022-06-24

--- Comment #2 from Richard Biener  ---
Interestingly

typedef __int128 __attribute__((__vector_size__ (16))) V;
typedef unsigned __int128 __attribute__((__vector_size__ (16))) uV;
typedef V bV __attribute__((vector_mask));
V __GIMPLE (ssa,startwith("isel"))
foo (V v)
{
  uV _1;
  bV _2;
  V _4;

  __BB(2,guessed_local(1073741824)):
  _1 = __VIEW_CONVERT (v_3(D));
  _2 = _1 <= _Literal (uV) { _Literal (unsigned __int128) 15 };
  _4 = _2 ? _Literal (V) { _Literal (__int128) -1 } : _Literal (V) { 0 };
  return _4;

}

works, even with -fdisable-tree-veclower21

The issue is that we cannot expand the unsigned comparison _2 = _1 <= { 15 };
which is introduced in VRP2 from the equality compare:

--- t2.c.197t.threadfull2   2022-06-24 12:25:08.204648605 +0200
+++ t2.c.198t.vrp2  2022-06-24 12:25:08.204648605 +0200
@@ -10,13 +10,15 @@
 ;; 2 succs { 1 }
 V foo (V v)
 {
+  vector(1) uint128_t _1;
   vector(1)  _2;
   V _4;
   vector(1) __int128 _6;

[local count: 1073741824]:
   _6 = v_3(D) & { -16 };
-  _2 = _6 == { 0 };
+  _1 = VIEW_CONVERT_EXPR(v_3(D));
+  _2 = _1 <= { 15 };
   _4 = VEC_COND_EXPR <_2, { -1 }, { 0 }>;
   return _4;

but after vector lowering only vector operations that are handled by the
target may be introduced.  The pattern

/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
   where ~Y + 1 == pow2 and Z = ~Y.  */
(for cst (VECTOR_CST INTEGER_CST)
 (for cmp (eq ne)
  icmp (le gt)
  (simplify
   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
(with { tree csts = bitmask_inv_cst_vector_p (@1); }
 (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
  (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
   (icmp @0 { csts; })
   (with { tree utype = unsigned_type_for (TREE_TYPE (@1)); }
 (icmp (view_convert:utype @0) { csts; }

fails to check that in the vector case.  Caused by r12-5650-g29df53fe349073
(thus latent on the branch).
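
For readers unfamiliar with the pattern, here is a scalar sketch in C of the
rewrite it performs, using the constants from the dump above (Y = -16, so
~Y + 1 == 16 and Z = ~Y == 15).  The function names are made up.  Note that
the ICE itself is about the vector form of the unsigned compare not being
checked against target support after vector lowering; the scalar code below
only shows the arithmetic identity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* (X & Y) == 0 with Y = -16: true when no bit above the low four is set.  */
bool
masked_is_zero (int32_t x)
{
  return (x & -16) == 0;
}

/* X <= Z with Z = ~Y = 15, after viewing X as unsigned.  */
bool
unsigned_le (int32_t x)
{
  return (uint32_t) x <= 15u;
}

/* Both forms agree, including for negative X.  */
void
bitmask_cmp_demo (void)
{
  const int32_t samples[] = { 0, 1, 15, 16, 17, 1000, -1, -16, INT32_MIN };
  for (unsigned int i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (masked_is_zero (samples[i]) == unsigned_le (samples[i]));
}
```

The view-convert to an unsigned type is what makes negative values compare as
large, and it is that unsigned comparison which has to be supported by the
target in the vector case.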

[Bug tree-optimization/106064] Wrong code comparing two global zero-sized arrays

2022-06-24 Thread acoplan at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106064

Alex Coplan  changed:

               What|Removed                     |Added

                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Alex Coplan  ---
Bisection shows this changed with
r9-5411-g93aa3c4aca3647645cd5bce724f9d2126de4b5ea on AArch64:

commit 93aa3c4aca3647645cd5bce724f9d2126de4b5ea (refs/bisect/bad)
Author: Jakub Jelinek 
Date:   Tue Jan 15 09:11:00 2019 +0100

re PR tree-optimization/88775 (Optimize std::string assignment)

PR tree-optimization/88775
* match.pd (cmp (convert1?@2 addr@0) (convert2? addr@1)): Optimize
equal == 0 equality pointer comparisons some more if compared in
integral types and either one points to an automatic var and the
other to a global, or we can prove at least one points to the middle
or both point to start or both point to end.

* gcc.dg/tree-ssa/pr88775-1.c: New test.
* gcc.dg/tree-ssa/pr88775-2.c: New test.

[Bug tree-optimization/106064] Wrong code comparing two global zero-sized arrays

2022-06-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106064

--- Comment #4 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Mikael Pettersson from comment #2)
> > Seems target-dependent. I can't reproduce on x86_64-linux-gnu or
> > sparc64-linux-gnu: both compile f() to return 1 and g() to perform a runtime
> > computation. But ppc64-linux-gnu and armv7l-linux-gnueabi behave as your
> > aarch64 example: f() returns 1 and g() returns 0 (unconditionally, no
> > runtime computations).
> 
> Most likely section anchors are the cause of the difference between the
> targets.
> The ones which implement section anchors return different values between the
> functions. I suspect MIPS has similar behavior too.

It is also most likely why aarch64 changed between 8 and 9 too.

[Bug tree-optimization/106064] Wrong code comparing two global zero-sized arrays

2022-06-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106064

--- Comment #3 from Andrew Pinski  ---
(In reply to Mikael Pettersson from comment #2)
> Seems target-dependent. I can't reproduce on x86_64-linux-gnu or
> sparc64-linux-gnu: both compile f() to return 1 and g() to perform a runtime
> computation. But ppc64-linux-gnu and armv7l-linux-gnueabi behave as your
> aarch64 example: f() returns 1 and g() returns 0 (unconditionally, no
> runtime computations).

Most likely section anchors are the cause of the difference between the
targets.  The ones which implement section anchors return different values
between the functions.  I suspect MIPS has similar behavior too.

[Bug tree-optimization/106076] Sub-optimal code is generated for checking bitfields via proxy functions

2022-06-24 Thread kyrylo.bohdanenko at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106076

--- Comment #1 from Kyrylo Bohdanenko  ---
The provided assembly is for -O2/-O3

[Bug ipa/106061] [13 Regression] during GIMPLE pass: einline ICE: verify_cgraph_node failed (edge points to wrong declaration) with -Og since r13-1204-gd68d366425369649

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106061

Richard Biener  changed:

               What|Removed                     |Added

           Priority|P3                          |P1

[Bug tree-optimization/106055] [13 Regression] ICE in replace_uses_by with -floop-parallelize-all and returns_twice

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106055

Richard Biener  changed:

               What|Removed                        |Added

           Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
             Status|UNCONFIRMED                    |ASSIGNED
     Ever confirmed|0                              |1
   Last reconfirmed|                               |2022-06-24

--- Comment #2 from Richard Biener  ---
I will have a look.

[Bug tree-optimization/106076] New: Sub-optimal code is generated for checking bitfields via proxy functions

2022-06-24 Thread kyrylo.bohdanenko at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106076

Bug ID: 106076
   Summary: Sub-optimal code is generated for checking bitfields
via proxy functions
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyrylo.bohdanenko at gmail dot com
  Target Milestone: ---

Consider the following struct:

#include <cstdint>

struct SomeClass {
    uint16_t dummy1 : 1;
    uint16_t cfg2 : 1;
    uint16_t cfg3 : 1;
    uint16_t dummy2 : 1;
    uint16_t dummy3 : 1;
    uint16_t dummy4 : 1;
    uint16_t cfg1 : 1;
    uint16_t dummy5 : 1;
    uint16_t cfg4 : 1;

    constexpr bool checkA() const { return cfg1 || cfg2 || cfg3; }
    constexpr bool checkB() const { return cfg4; }

    constexpr bool checkA_B() const { return (cfg1 || cfg2 || cfg3) || cfg4; }
    constexpr bool checkA_B_SLOW() const { return checkA() || checkB(); }
};


For the following functions (which do the same thing) GCC generates different
assembly.

bool check(const SomeClass& rt) {
    return rt.checkA_B();
}

bool check_SLOW(const SomeClass& rt) {
    return rt.checkA_B_SLOW();
}

Compiled as:

g++ -std=c++17 -S 

The assembly:

; demangled: check(SomeClass const&)
_Z5checkRK9SomeClass:
        endbr64
        testw   $326, (%rdi)
        setne   %al
        ret

; demangled: check_SLOW(SomeClass const&)
_Z10check_SLOWRK9SomeClass:
        endbr64
        movzwl  (%rdi), %edx
        movl    $1, %eax
        testb   $70, %dl
        jne     .L3
        movzbl  %dh, %eax
        andl    $1, %eax
.L3:
        ret

As we can see, for check_SLOW GCC decided to check the result on a
byte-by-byte basis, introducing a conditional jump in between. It looks like
GCC did not fully analyse the code after inlining checkA() and checkB().

FYI, Clang generates the first form of the assembly for both functions from
the same code.
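
As a cross-check of the two constants in the assembly, here is a small sketch
in C with explicit masks instead of bit-fields.  It assumes the bit-fields are
allocated from the least significant bit in declaration order (as the x86-64
output above implies), so cfg2, cfg3, cfg1 and cfg4 land on bits 1, 2, 6 and 8.
The enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum
{
  MASK_A = (1 << 1) | (1 << 2) | (1 << 6), /* cfg2, cfg3, cfg1 */
  MASK_B = 1 << 8,                         /* cfg4 */
  MASK_A_B = MASK_A | MASK_B
};

/* The form generated for check(): one 16-bit test.  */
bool
check_one_test (uint16_t word)
{
  return (word & MASK_A_B) != 0;
}

/* The form generated for check_SLOW(): low byte first, then bit 8.  */
bool
check_two_steps (uint16_t word)
{
  if ((word & 0xff) & MASK_A)
    return true;
  return ((word >> 8) & 1) != 0;
}

/* The masks match the immediates in the assembly, and both forms
   agree for every possible 16-bit value.  */
void
bitfield_mask_demo (void)
{
  assert (MASK_A == 70);
  assert (MASK_A_B == 326);
  for (unsigned int w = 0; w <= 0xffffu; w++)
    assert (check_one_test ((uint16_t) w) == check_two_steps ((uint16_t) w));
}
```

Since the two forms are equivalent for all inputs, merging them into the
single testw is purely a missed optimization and not a correctness issue.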

[Bug fortran/106048] [10/11/12/13 Regression] ICE in ubsan_encode_value, at ubsan.cc:143 / verify_gimple failed

2022-06-24 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106048

Richard Biener  changed:

               What|Removed                     |Added

   Target Milestone|---                         |10.4

[Bug tree-optimization/106064] Wrong code comparing two global zero-sized arrays

2022-06-24 Thread mikpelinux at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106064

Mikael Pettersson  changed:

               What|Removed                     |Added

                 CC|                            |mikpelinux at gmail dot com

--- Comment #2 from Mikael Pettersson  ---
Seems target-dependent. I can't reproduce on x86_64-linux-gnu or
sparc64-linux-gnu: both compile f() to return 1 and g() to perform a runtime
computation. But ppc64-linux-gnu and armv7l-linux-gnueabi behave as your
aarch64 example: f() returns 1 and g() returns 0 (unconditionally, no runtime
computations).

[Bug target/105991] [12/13 Regression] rldicl+sldi+add generated instead of rldimi

2022-06-24 Thread roger at nextmovesoftware dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105991

Roger Sayle  changed:

               What|Removed                     |Added

             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|12.2                        |13.0

--- Comment #7 from Roger Sayle  ---
This should now be fixed on mainline.  If anyone feels strongly that the fix
should be backported to the GCC 12 branch, please feel free to reopen this PR.
Thanks again to Marek.

[Bug middle-end/106075] Wrong DSE with -fnon-call-exceptions

2022-06-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106075

--- Comment #2 from Andrew Pinski  ---
Oh wait, -fdelete-dead-exceptions won't change that here, or will it?

[Bug middle-end/106075] Wrong DSE with -fnon-call-exceptions

2022-06-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106075

--- Comment #1 from Andrew Pinski  ---
There is another option to not remove stores for non-call exceptions.

[Bug middle-end/106075] New: Wrong DSE with -fnon-call-exceptions

2022-06-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106075

Bug ID: 106075
   Summary: Wrong DSE with -fnon-call-exceptions
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

In the following testcase:

int a = 1;
short *b;
void
test()
{
a=12345;
*b=0;
a=1;
}

we should not optimize out a=12345 when compiling with -fnon-call-exceptions.
The store through b may throw, and thus the store a=12345 is live.

[Bug c++/106074] New: Spurious Wstringop-overflow for int-to-string with SSE4

2022-06-24 Thread ed at catmur dot uk via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106074

Bug ID: 106074
   Summary: Spurious Wstringop-overflow for int-to-string with
SSE4
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ed at catmur dot uk
  Target Milestone: ---

Code adapted from
https://github.com/capnproto/capnproto/blob/be7a80b4add706ddaeb41689221146b86c3e0f5f/c%2B%2B/src/kj/string.c%2B%2B#L157-L203
:

auto f(short i) {
  struct R { char data[6]; } result;
  bool negative = i < 0;
  unsigned u = i;
  if (negative)
u = -u;
  unsigned char reverse[5];
  unsigned char* p = reverse;
  if (u == 0)
*p++ = 0;
  else
for (; u > 0; u /= 10)
  *p++ = u % 10;
  char* p2 = result.data;
  if (negative)
*p2++ = '-';
  while (p > reverse) {
#ifdef ASSUME
if (p2 >= ()[1]) __builtin_unreachable();
#endif
*p2++ = '0' + *--p;
  }
  if (p2 < ()[1])
*p2 = '\0';
  return result;
}

When compiled with -O3 -msse4, from 12.1.0 through 13.0.0 20220619:

: In function 'auto f(short int)':
:21:11: warning: writing 1 byte into a region of size 0
[-Wstringop-overflow=]
   21 | *p2++ = '0' + *--p;
  | ~~^~~~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
2 |   struct R { char data[6]; } result;
  |   ^~~~
:2:30: note: at offset 7 into destination object 'result' of size 6
2 |   struct R { char data[6]; } result;
  |  ^~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
2 |   struct R { char data[6]; } result;
  |   ^~~~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
:2:30: note: at offset 7 into destination object 'result' of size 6
2 |   struct R { char data[6]; } result;
  |  ^~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
2 |   struct R { char data[6]; } result;
  |   ^~~~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
:2:30: note: at offset 7 into destination object 'result' of size 6
2 |   struct R { char data[6]; } result;
  |  ^~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
2 |   struct R { char data[6]; } result;
  |   ^~~~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
:2:30: note: at offset 7 into destination object 'result' of size 6
2 |   struct R { char data[6]; } result;
  |  ^~
:2:19: note: at offset 6 into destination object 'f(short
int)::R::data' of size 6
2 |   struct R { char data[6]; } result;
  |   ^~~~

Adding the assumption at line 19 (-DASSUME) works around this.

[Bug tree-optimization/106070] [13 Regression] Wrong code with -O1 since r13-469-g9a53101caadae1b5

2022-06-24 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106070

Martin Liška  changed:

               What|Removed                     |Added

           Keywords|needs-bisection             |
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
            Summary|[13 Regression] Wrong code  |[13 Regression] Wrong code
                   |with -O1                    |with -O1 since
                   |                            |r13-469-g9a53101caadae1b5

--- Comment #3 from Martin Liška  ---
Started with r13-469-g9a53101caadae1b5.

[Bug middle-end/106059] [13 regression] cc.dg/vect/pr79347.c fails after r13-1171-g9f55aee9dca759

2022-06-24 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106059

Martin Liška  changed:

               What|Removed                     |Added

         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #4 from Martin Liška  ---
Fixed, sorry for the stupid mistake.

[Bug middle-end/106059] [13 regression] cc.dg/vect/pr79347.c fails after r13-1171-g9f55aee9dca759

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106059

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Martin Liska :

https://gcc.gnu.org/g:268b5c81e93ac3ff44fc8ace22ce504d8faa4b07

commit r13-1240-g268b5c81e93ac3ff44fc8ace22ce504d8faa4b07
Author: Martin Liska 
Date:   Thu Jun 23 22:59:11 2022 +0200

profile-count: fix /= and *= operators

PR middle-end/106059

gcc/ChangeLog:

* profile-count.h: *= and /= operators need to modify this
object.

[Bug tree-optimization/106073] [12/13 Regression] wrong code at -O3 on x86_64-linux-gnu

2022-06-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106073

Andrew Pinski  changed:

               What|Removed                     |Added

             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Keywords|                            |needs-bisection, wrong-code
   Last reconfirmed|                            |2022-06-24
            Summary|wrong code at -O3 on        |[12/13 Regression] wrong
                   |x86_64-linux-gnu            |code at -O3 on
                   |                            |x86_64-linux-gnu
   Target Milestone|---                         |12.2

--- Comment #2 from Andrew Pinski  ---
Confirmed.
When we remove:
au = (int*) 
The testcase works; au is otherwise unused.
I noticed that the major difference between compiling with and without that
line was that a loop was unrolled and then SLP vectorized.

I have not looked into it any further really.

[Bug tree-optimization/106073] wrong code at -O3 on x86_64-linux-gnu

2022-06-24 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106073

--- Comment #1 from Zhendong Su  ---
Compiler Explorer: https://godbolt.org/z/o3jq85vYK

[Bug tree-optimization/106073] New: wrong code at -O3 on x86_64-linux-gnu

2022-06-24 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106073

Bug ID: 106073
   Summary: wrong code at -O3 on x86_64-linux-gnu
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

It seems to be a regression from 11.*. The test is still quite complicated, but
it seems difficult to reduce much further.

[507] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--prefix=/local/suz-local/software/local/gcc-trunk --enable-sanitizers
--enable-languages=c,c++ --disable-werror --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.0.0 20220624 (experimental) [master r12-4647-g3f861a5c8fd] (GCC) 
[508] % 
[508] % gcctk -O2 small.c; ./a.out
[509] % 
[509] % gcctk -O3 small.c
[510] % ./a.out
Aborted
[511] % 
[511] % cat small.c
int a, f = 1, h, l, m = 1, o, r = 4, q, s, x, e, aa, ab, ac, *ad, ae = 5, **y,
**af, ag, ah, ai, aj;
static int c[6], d, g[6][5], n, *v = , ak;
volatile int p;
const volatile int al;
static volatile int t, u, w = 3, z, am, an;
static int ao();
void ap();
static void aq() {
  int ar[4] = {6, 6, 6, 6}, as[1], i, j;
  as[0] = 0;
  if (m) {
int at[11] = {4, 4, 6, 5, 7, 0, 7, 6, 7, 6, 6}, *au, *av[7], k;
au = (int*) 
for (i = 0; i < 1; i++)
  for (j = 0; j < 1; j++)
for (k = 0; k < 7; k++) {
  (t || n) && u;
  av[k] = 0;
}
y = av;
while (o) {
  int *b[2] = {as, ar};
  *af = at;
}
m = 0;
  }
}
inline void ap() {
  for (; l <= 4; l++) {
*v = 0;
aq();
if (a)
  break;
for (; q; q++)
  ;
  }
}
int ao() {
  int be = 0, j;
  if (n)
aa = d = 0;
  l = 0;
  for (; be < 2; be++) {
int bf[7][2];
for (ai = 0; ai < 7; ai++)
  for (j = 0; j < 2; j++)
bf[ai][j] = 5;
if (be) {
  for (; h >= 0; h--) {
while (z >= w) {
  ap();
  *ad = 0;
}
ap();
  }
  return bf[3][0];
}
if (bf[3][0])
  continue;
while (1)
  ;
  }
  return 0;
}
static void aw() {
  for (; ah; ah++) {
p = 0;
p = 0;
  }
  int ax = ~e;
 L1:
  e = a = 0;
 L2:
  if (!r)
goto L3;
  if (!ax)
goto L2;
  if (d)
goto L1;
  if (!ae)
goto L1;
  if (w && x <= 808 && f)
ag = ao();
  g[0][4] = ag;
  if (a) {
int bd;
n++;
while (n)
  for (bd = 0; bd < 7; bd++) {
am;
am;
am;
am;
d = c[d ^ am];
  }
  } else {
  L3:
an;
for (; ak; ak++) {
  int bc = 7;
  for (; bc >= 0; bc--) {
al;
al;
d = f && an;
an;
  }
}
  }
}
int main() {
  int k;
  for (; aj < 6; aj++)
c[0] = aj;
  aw();
  for (aj = 0; aj < 6; aj++)
for (k = 0; k < 5; k++)
  d = c[d ^ g[aj][k]];
  if (d != 5)
__builtin_abort();
  return 0;
}

[Bug c++/106072] New: Bogus -Wnonnull warning breaks rust bootstrap

2022-06-24 Thread ro at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106072

Bug ID: 106072
   Summary: Bogus -Wnonnull warning breaks rust bootstrap
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: msebor at gcc dot gnu.org
  Target Milestone: ---
Target: sparc*-sun-solaris2.11

Created attachment 53198
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53198&action=edit
Workaround patch

Bootstrapping the devel/rust/master branch on Solaris/SPARC (both 32 and
64-bit) breaks like this:

In file included from /vol/gcc/src/git/rust/gcc/rust/parse/rust-parse.h:727,
 from
/vol/gcc/src/git/rust/gcc/rust/expand/rust-macro-builtins.cc:25:
/vol/gcc/src/git/rust/gcc/rust/parse/rust-parse-impl.h: In member function
'Rust::AST::ClosureParam
Rust::Parser::parse_closure_param() [with
ManagedTokenSource = Rust::Lexer]':
/vol/gcc/src/git/rust/gcc/rust/parse/rust-parse-impl.h:9022:70: error: 'this'
pointer is null [-Werror=nonnull]
 9022 | std::move (type), std::move (outer_attrs));
  | 

The error doesn't occur on i386-pc-solaris2.11, amd64-pc-solaris2.11,
i686-pc-linux-gnu, x86_64-pc-linux-gnu.

I've hacked around this with the attached patch.

The preprocessed file is 3.5 MB, unfortunately; I'm currently trying to reduce
it to something manageable, but this is slow going.

[Bug fortran/106071] New: single where run error

2022-06-24 Thread han.wu--- via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106071

Bug ID: 106071
   Summary: single where run error
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: han...@compiler-dev.com
  Target Milestone: ---

A two-level (nested) WHERE works correctly; why does a single WHERE give the
wrong result?

example one:

program test
  real :: x(3) = 1, y(3) = 2
  logical :: m(3) = .true., m2(3) = .false.
  where (m)
x = f1()
!where (m2)
!  y = f2()
!end where
  end where
  if (any(x/=3)) then
print *, 'FAIL', x
  !if (any(x/=3 .or. y/=3)) then
  !  print *,'FAIL',x,y
  else
print *,'ok'
  end if
contains
  function f1()
m = .false.
!m2 = .true.
f1 = 3
  end function
  !function f2()
  !  m2 = .false.
  !  f2 = 3
  !end function
end

gfortran-12:
ASM generation compiler returned: 0
Execution build compiler returned: 0
Program returned: 0
 FAIL   3.   1.   1.   

example two:
program test
  real :: x(3) = 1, y(3) = 2
  logical :: m(3) = .true., m2(3) = .false.
  where (m)
x = f1()
where (m2)
  y = f2()
end where
  end where
  if (any(x/=3 .or. y/=3)) then
print *,'FAIL',x,y
  else
print *,'ok'
  end if
contains
  function f1()
m = .false.
m2 = .true.
f1 = 3
  end function
  function f2()
m2 = .false.
f2 = 3
  end function
end

ASM generation compiler returned: 0
Execution build compiler returned: 0
Program returned: 0
 ok

Looking forward to hearing from you!

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-24 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #20 from CVS Commits  ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:3b8794302b52a819ca3ea78238e9b5025d1c56dd

commit r13-1239-g3b8794302b52a819ca3ea78238e9b5025d1c56dd
Author: Roger Sayle 
Date:   Fri Jun 24 07:15:08 2022 +0100

PR target/105930: Split *xordi3_doubleword after reload on x86.

This patch addresses PR target/105930 which is an ia32 stack frame size
regression in high-register pressure XOR-rich cryptography functions
reported by Linus Torvalds.  The underlying problem is once the limited
number of registers on the x86 are exhausted, the register allocator
has to decide which to spill, where some eviction choices lead to much
poorer code, but these consequences are difficult to predict in advance.

The patch below, which splits xordi3_doubleword and iordi3_doubleword
after reload (instead of before), significantly reduces the amount of
spill code and stack frame size, in what might appear to be an arbitrary
choice.

My explanation of this behaviour is that the mixing of pre-reload split
SImode instructions and post-reload split DImode instructions is
confusing some of the heuristics used by reload.  One might think
that splitting early gives the register allocator more freedom to
use available registers, but in practice the constraint that double
word values occupy consecutive registers (when ultimately used as a
DImode value) is the greater constraint.  Instead, I believe in this
case, the pseudo registers used in memory addressing, appear to be
double counted for split SImode instructions compared to unsplit
DImode instructions.  For the reduced test case in comment #13, this
leads to %eax being used to hold the long-lived argument pointer "v",
blocking the use of the ax:dx pair for processing double word values.
The important lines are at the very top of the assembly output:

GCC 11  [use %ecx to address memory, require a 24-byte stack frame]
sub esp, 24
mov ecx, DWORD PTR [esp+40]

GCC 12 [use %eax to address memory, require a 44-byte stack frame]
sub esp, 44
mov eax, DWORD PTR [esp+64]

2022-06-24  Roger Sayle  
Uroš Bizjak  

gcc/ChangeLog
PR target/105930
* config/i386/i386.md (*di3_doubleword): Split after
reload.  Use rtx_equal_p to avoid creating memory-to-memory moves,
and emit NOTE_INSN_DELETED if operand[2] is zero (i.e. with -O0).
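
For readers not familiar with doubleword splitting, the following C sketch
shows what splitting a DImode XOR into two SImode XORs amounts to on a 32-bit
target.  It only illustrates the arithmetic; it says nothing about when the
split happens relative to reload, which is what the patch changes.  The
function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* One 64-bit (DImode) XOR expressed as two independent 32-bit (SImode)
   XORs on the low and high halves.  */
uint64_t
xor_doubleword_split (uint64_t a, uint64_t b)
{
  uint32_t lo = (uint32_t) a ^ (uint32_t) b;
  uint32_t hi = (uint32_t) (a >> 32) ^ (uint32_t) (b >> 32);
  return ((uint64_t) hi << 32) | lo;
}

/* The split form gives the same result as the plain 64-bit XOR.  */
void
xor_split_demo (void)
{
  const uint64_t a = UINT64_C (0x0123456789abcdef);
  const uint64_t b = UINT64_C (0xfedcba9876543210);
  assert (xor_doubleword_split (a, b) == (a ^ b));
  assert (xor_doubleword_split (a, 0) == a);
}
```

Because the two halves are independent, the split form no longer requires a
consecutive register pair, which is why the timing of the split interacts
with the register allocator as described above.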

[Bug fortran/106047] ICE in structure_alloc_comps, at fortran/trans-array.cc:9574

2022-06-24 Thread gscfq--- via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106047

G. Steinmetz  changed:

               What|Removed                     |Added

           Keywords|ice-on-invalid-code         |ice-on-valid-code

--- Comment #3 from G. Steinmetz  ---

Regarding these variants, all of them actually work :


$ cat zc1.f90
program p
   type t
  integer, allocatable :: c
   end type
   type(t) :: x
   integer :: n = storage_size(x)
   print *, n
end


$ cat zc2.f90
program p
   type t
  logical, allocatable :: c
   end type
   type(t) :: x
   integer :: n = storage_size(x)
   print *, n
end


$ cat zc3.f90
program p
   type t
  real, allocatable :: c
   end type
   type(t) :: x
   integer :: n = storage_size(x)
   print *, n
end


$ cat zc4.f90
program p
   type t
  character, allocatable :: c
   end type
   type(t) :: x
   integer :: n = storage_size(x)
   print *, n
end


$ cat zc5.f90
program p
   type t
  class(*), allocatable :: c
   end type
   type(t) :: x
   integer :: n = storage_size(x)
   print *, n
end


$ gfortran-13-20220619 zc1.f90 ; a.out
  64
$ gfortran-13-20220619 zc2.f90 ; a.out
  64
$ gfortran-13-20220619 zc3.f90 ; a.out
  64
$ gfortran-13-20220619 zc4.f90 ; a.out
  64
$ gfortran-13-20220619 zc5.f90 ; a.out
 192

---

When testing additional options, zc5.f90 never gives an ICE, but the other
ones do, e.g. at line 9129 instead of line 9574:


$ gfortran-13-20220619 -c zc1.f90 -fcoarray=single
zc1.f90:5:15:

5 |type(t) :: x
  |   1
internal compiler error: Segmentation fault
0xcd90df crash_signal
../../gcc/toplev.cc:322
0x7c4366 structure_alloc_comps
../../gcc/fortran/trans-array.cc:9129
0x7c7348 gfc_nullify_alloc_comp(gfc_symbol*, tree_node*, int, int)
../../gcc/fortran/trans-array.cc:10149
0x7c7348 gfc_trans_deferred_array(gfc_symbol*, gfc_wrapped_block*)
../../gcc/fortran/trans-array.cc:11206
0x7dabac gfc_trans_deferred_vars(gfc_symbol*, gfc_wrapped_block*)
../../gcc/fortran/trans-decl.cc:4994
0x7dd1a5 gfc_generate_function_code(gfc_namespace*)
../../gcc/fortran/trans-decl.cc:7756
0x75f53e translate_all_program_units
../../gcc/fortran/parse.cc:6669
0x75f53e gfc_parse_file()
../../gcc/fortran/parse.cc:6956
0x7acb6f gfc_be_parse_file
../../gcc/fortran/f95-lang.cc:229
