[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

--- Comment #5 from Maxim Kuvyrkov  ---
Looking at this problem more, I think the issue is due to the ARM target trying
hard to avoid UNSUPPORTED tests instead of embracing them.

For the vectorization NEON check we have ...
===
proc check_effective_target_arm_neon_ok_nocache { } {
global et_arm_neon_flags
set et_arm_neon_flags ""
if { [check_effective_target_arm32] } {
foreach flags {"" "-mfloat-abi=softfp" "-mfpu=neon" "-mfpu=neon
-mfloat-abi=softfp" "-mfpu=neon -mfloat-abi=softfp -march=armv7-a"
"-mfloat-abi=hard" "-mfpu=neon -mfloat-abi=hard" "-mfpu=neon -mfloat-abi=hard
-march=armv7-a"} {
if { [check_no_compiler_messages_nocache arm_neon_ok object {
#include <arm_neon.h>
...
===
... where the target tries to find a set of flags compatible with _any_ of the
built multilibs to run the testsuite.

I think this is excessive, since each multilib should be tested on its own
merits, and if armv7-m does not support vectorization, there should be no
effort to try and switch to armv7-a or armv8-m+mve multilib in order to run
vectorization tests.  In other words, vectorization tests should be marked
UNSUPPORTED in armv7-m, and PASS/FAIL in armv7-a and/or armv8-m+mve.

In practical terms, my proposed solution to this problem is to remove all
"foreach flags" options except for the default "".

ARM maintainers, what am I missing?
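
To make the proposal concrete, here is a minimal Python sketch of the probing loop. It is a model, not the testsuite's Tcl code: the candidate list is copied from the "foreach flags" quoted above, while toy_neon_compiles() and the two base flag strings are hypothetical stand-ins for the real compile probe and for multilib settings.

```python
# Candidate flag sets, copied from the "foreach flags" list quoted above.
CANDIDATES = [
    "",
    "-mfloat-abi=softfp",
    "-mfpu=neon",
    "-mfpu=neon -mfloat-abi=softfp",
    "-mfpu=neon -mfloat-abi=softfp -march=armv7-a",
    "-mfloat-abi=hard",
    "-mfpu=neon -mfloat-abi=hard",
    "-mfpu=neon -mfloat-abi=hard -march=armv7-a",
]


def last_value(args, prefix, default=""):
    """Return the value of the last option that starts with prefix."""
    value = default
    for arg in args:
        if arg.startswith(prefix):
            value = arg[len(prefix):]
    return value


def toy_neon_compiles(base_flags, extra_flags):
    """Hypothetical stand-in for the compile-only probe: NEON is treated as
    usable when the last -march is an A-profile one and the last -mfpu
    selects neon."""
    args = base_flags.split() + extra_flags.split()
    march = last_value(args, "-march=")
    mfpu = last_value(args, "-mfpu=")
    return march.endswith("-a") and mfpu.startswith("neon")


def probe(base_flags, candidates):
    """Return the first candidate that passes the probe, or None."""
    for flags in candidates:
        if toy_neon_compiles(base_flags, flags):
            return flags
    return None


# Hypothetical multilib settings, for illustration only.
M_PROFILE = "-mthumb -march=armv7-m -mfloat-abi=softfp -mfpu=auto"
A_PROFILE = "-march=armv7-a -mfpu=neon -mfloat-abi=hard"

print(repr(probe(M_PROFILE, CANDIDATES)))  # full list: switches to armv7-a
print(repr(probe(M_PROFILE, [""])))        # default only: no flags found
print(repr(probe(A_PROFILE, [""])))        # default only: passes as-is
```

In this model the full list lets an armv7-m multilib "pass" by switching architecture, whereas probing only the default "" reports no usable flags there, which is what would make the vectorization tests UNSUPPORTED.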

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

--- Comment #3 from Maxim Kuvyrkov  ---
Changing from compile-only to link test is as simple as changing "object" to
"executable" in
[check_no_compiler_messages_nocache arm_neon_ok object ...]
.

However, ... this pattern of checking for ARM architectural features is shared
by 20+ check_effective_target_arm_* routines.  IMO, we should either update all
of these to be link tests (unless there is a good reason to keep them as
compile-only that we can document in the comments), or just accept this
vectorization test failure on ARM targets that don't support vectorization.

[Bug testsuite/114568] [14 regression] g++.dg/vect/pr84556.cc fails to produce executable since r14-9706-gb8e7aaaf350a45

2024-04-03 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114568

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org,
   ||rearnsha at gcc dot gnu.org

--- Comment #1 from Maxim Kuvyrkov  ---
The test now fails with linker error:
.../arm-eabi/bin/ld: error: /tmp/cc2Q27GE.o: conflicting architecture profiles
A/M

This is due to command line having
-mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp -mfpu=auto ...
-mfpu=neon -mfloat-abi=softfp -march=armv7-a

The first part comes from toolchain configuration settings, and the second part
(-mfpu=neon -mfloat-abi=softfp -march=armv7-a) comes from
check_effective_target_arm_neon_ok_nocache().

Surprisingly (to me), GCC accepts such mixed options, which makes
check_effective_target_arm_neon_ok_nocache() succeed, since it's only doing a
compilation test.  The linker, though, fails.
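
The difference between the two kinds of check can be sketched with a toy Python model. It assumes, consistent with the behaviour described above, that the last -march on the command line wins, and that the runtime libraries were built for the toolchain's default M profile. The PROFILE table and both check functions are hypothetical simplifications, not GCC or testsuite code.

```python
def last_option(cmdline, prefix):
    """Return the value of the last option starting with prefix, or None."""
    value = None
    for arg in cmdline.split():
        if arg.startswith(prefix):
            value = arg[len(prefix):]
    return value


# Illustrative mapping for the two -march values in this report only.
PROFILE = {"armv7-a": "A", "armv7-m": "M"}


def object_profile(cmdline):
    """Profile recorded in the object, assuming the last -march wins."""
    return PROFILE[last_option(cmdline, "-march=")]


def compile_check(cmdline):
    """Toy compile-only probe: passes for an A-profile object with NEON."""
    return (object_profile(cmdline) == "A"
            and last_option(cmdline, "-mfpu=") == "neon")


def link_check(cmdline, runtime_profile):
    """Toy link probe: also requires the object's profile to match the
    profile of the runtime libraries it is linked against."""
    return compile_check(cmdline) and object_profile(cmdline) == runtime_profile


TOOLCHAIN = "-mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp -mfpu=auto"
PROBE = "-mfpu=neon -mfloat-abi=softfp -march=armv7-a"
cmdline = TOOLCHAIN + " " + PROBE

print(compile_check(cmdline))    # compile-only probe
print(link_check(cmdline, "M"))  # link probe against M-profile libraries
```

Under these assumptions the compile-only probe accepts the mixed command line while the link probe rejects it, mirroring the A/M conflict reported by ld.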

Richard E., is it expected that GCC accepts conflicting -march= options?

[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets

2024-03-28 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #10 from Maxim Kuvyrkov  ---
Fixed.

[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets

2024-03-13 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

--- Comment #8 from Maxim Kuvyrkov  ---
Patch posted:
https://patchwork.sourceware.org/project/gcc/patch/20240313105839.2785627-1-maxim.kuvyr...@linaro.org/

[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets

2024-03-12 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

Maxim Kuvyrkov  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mkuvyrkov at gcc dot gnu.org

--- Comment #7 from Maxim Kuvyrkov  ---
Working on this, including reviewing gcc.dg/vect/, g++.dg/vect/ and
gfortran.dg/vect/ testsuites.

[Bug target/99829] MVE: ICE in lra_assign at -O3

2024-03-12 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99829

--- Comment #5 from Maxim Kuvyrkov  ---
Hi Vladimir,

The email you got from Linaro CI was about tests with the ICE changing "names"
due to the assert in lra-assigns.cc changing line numbers.  E.g.,
FAIL: c-c++-common/torture/vshuf-v16qi.c -O3 -g  (internal compiler error: in
lra_assign, at lra-assigns.cc:1651)
... becoming ...
FAIL: c-c++-common/torture/vshuf-v16qi.c -O3 -g  (internal compiler error: in
lra_assign, at lra-assigns.cc:1657)
These are not real regressions, and that report can be ignored.
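
As an illustration of why these count as the same failure, a results comparison could normalize the ICE location before comparing test names. The following Python sketch is hypothetical and is not what Linaro CI does; normalize() and its regular expression are assumptions made for this example.

```python
import re

# Matches "internal compiler error: in <function>, at <file>:<line>" and
# captures everything except the trailing line number.
_ICE_LINE = re.compile(r"(internal compiler error: in \w+, at [\w./+-]+):\d+")


def normalize(result_line):
    """Collapse runs of whitespace and drop the source line number from an
    ICE message, so that only the function and file remain."""
    collapsed = " ".join(result_line.split())
    return _ICE_LINE.sub(r"\1", collapsed)


old = ("FAIL: c-c++-common/torture/vshuf-v16qi.c -O3 -g  (internal compiler "
       "error: in lra_assign, at lra-assigns.cc:1651)")
new = ("FAIL: c-c++-common/torture/vshuf-v16qi.c -O3 -g  (internal compiler "
       "error: in lra_assign, at lra-assigns.cc:1657)")

print(normalize(old))
print(normalize(old) == normalize(new))
```

With such a normalization the two FAIL lines compare equal, so a shift in the assert's line number alone would not be reported as a new failure.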

This PR, though, is about fixing the ICE triggered by this test.  We are seeing
the ICEs using these configure options:
https://ci.linaro.org/job/tcwg_gnu_embed_check_gcc--master-thumb_m55_hard_eabi-build/lastSuccessfulBuild/artifact/artifacts/notify/configure-make.txt/*view*/
.

Where did you see the timeouts, btw?

[Bug testsuite/114307] [ARM] Vectorization tests not disabled for vector-less targets

2024-03-12 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

--- Comment #5 from Maxim Kuvyrkov  ---
Thanks to Christophe L. for explaining the logic behind "is-effective-target
arm_neon_ok".

The issue is that "/* { dg-options "-O3" } */" in pr113576.c overrides
DEFAULT_VECTCFLAGS set by check_vect_support_and_set_flags, which contains
"-mfpu=neon" -- as added by "is-effective-target arm_neon_ok".

It so happens that pr113576.c is the only test that uses both "dg-options" and
"tree-vect.h", thus the new error.

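The override can be modelled in a few lines of Python. This is a simplified sketch rather than DejaGnu's implementation: effective_flags() and the default flag list are hypothetical, target selectors inside the directives are ignored, and the only behaviour modelled is that "dg-options" replaces the default flags while "dg-additional-options" appends to whatever is in effect.

```python
import re

_DG_OPTIONS = re.compile(r'\{\s*dg-options\s+"([^"]*)"')
_DG_ADDITIONAL = re.compile(r'\{\s*dg-additional-options\s+"([^"]*)"')


def effective_flags(source_text, default_flags):
    """Return the flags a test would be compiled with in this toy model."""
    match = _DG_OPTIONS.search(source_text)
    # dg-options replaces the defaults; without it the defaults are kept.
    flags = match.group(1).split() if match else list(default_flags)
    # dg-additional-options is appended in either case.
    for extra in _DG_ADDITIONAL.findall(source_text):
        flags += extra.split()
    return flags


# Placeholder defaults; the real list is built by the testsuite.
DEFAULTS = ["-O2", "-mfpu=neon"]

print(effective_flags('/* { dg-options "-O3" } */', DEFAULTS))
print(effective_flags('/* { dg-additional-options "-O3" } */', DEFAULTS))
```

In this model the first test loses "-mfpu=neon" because its "dg-options" replaces the defaults, while the second keeps it.
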
[Bug target/99829] MVE: ICE in lra_assign at -O3

2024-03-12 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99829

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||vmakarov at gcc dot gnu.org

--- Comment #3 from Maxim Kuvyrkov  ---
Hi Vladimir,

Could you take a look at this, please?

[Bug testsuite/114307] [ARM] GCC generates instruction that assembler rejects

2024-03-12 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

--- Comment #4 from Maxim Kuvyrkov  ---
What I don't understand is why vect.exp tests run at all for 32-bit ARM
non-NEON targets.

All vect.exp tests are predicated on check_vect_support_and_set_flags, which, as
far as I can tell, has only an arm_neon_ok clause for 32-bit ARM.

I'm reproducing this problem locally to understand .exp flow.  If anyone knows
the answer off-hand -- please share.

Thanks!

[Bug target/114307] New: [ARM] GCC generates instruction that assembler rejects

2024-03-11 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114307

Bug ID: 114307
   Summary: [ARM] GCC generates instruction that assembler rejects
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

Recently added vectorization test "gcc.dg/vect/pr113576.c" fails to build for
arm-linux-gnueabihf with:
===
/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/x86_64-pc-linux-gnu/bin/arm-linux-gnueabihf-gcc
--sysroot=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/x86_64-pc-linux-gnu/arm-linux-gnueabihf/libc
/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/snapshots/gcc.git~master/gcc/testsuite/gcc.dg/vect/pr113576.c
-fdiagnostics-plain-output -O3 -lm -o ./pr113576.exe
/tmp/ccRWeLpQ.s: Assembler messages:
/tmp/ccRWeLpQ.s:37: Error: selected FPU does not support instruction -- `vorr
d6,d6,d7'
compiler exited with status 1
output is:
/tmp/ccRWeLpQ.s: Assembler messages:
/tmp/ccRWeLpQ.s:37: Error: selected FPU does not support instruction -- `vorr
d6,d6,d7'

comp_output (pruned) is:
/tmp/ccRWeLpQ.s: Assembler messages:
/tmp/ccRWeLpQ.s:37: Error: selected FPU does not support instruction -- `vorr
d6,d6,d7'

FAIL: gcc.dg/vect/pr113576.c (test for excess errors)
===

The toolchain uses tip-of-trunk binutils for the build.

The relevant configure flags are: --with-float=hard --with-fpu=vfpv3-d16
--with-mode=thumb --with-tune=cortex-a9 --with-arch=armv7-a

Full configure options are at
https://ci.linaro.org/job/tcwg_gnu_cross_check_gcc--master-arm-build/lastSuccessfulBuild/artifact/artifacts/notify/configure-make.txt/*view*/

[Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267

2024-02-21 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu.org
   Assignee|mkuvyrkov at gcc dot gnu.org   |unassigned at gcc dot gnu.org

--- Comment #4 from Maxim Kuvyrkov  ---
Reply from Richard Earnshaw on gcc-patches@ to my patch to make the testcase
accept both "bic" and "bics" instructions:

The test was added (r6-823-g0454e698401a3e) specifically to check that a BICS
instruction was being generated.  Whether or not that is right is somewhat
debatable, but this change seems to be papering over a different issue.

Either we should generate BICS, making this change incorrect, or we should
disable the test for thumb code on the basis that this isn't really a win.

But really, we should fix the compiler to do better here.  We really want
something like

BICS  r0, r0, r1  // r0 is 0 or non-zero
MOVNE r0, #1  // convert all non-zero to 1

in Arm state (ie using the BICS instruction to set the result to zero); and in
thumb2, perhaps something like:

BICS  r0, r0, r1
ITne
MOVNE r0, #1

or maybe even better:

BIC  r0, r0, r1
SUBS r1, r0, #1
SBC  r0, r0, r1

which is slightly better than BICS because SUBS breaks a condition-code chain
(all the flag bits are set).

There are similar quality issues for other NE(arith-op, 0) cases; we just don't
have tests for those.
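
The BIC/SUBS/SBC sequence above can be checked with a small Python simulation of 32-bit arithmetic. The helper names are made up for this sketch. It models the ARM convention that the carry flag after a subtraction is 1 when no borrow occurs, and that SBC computes a - b - (1 - carry).

```python
MASK = 0xFFFFFFFF


def subs(a, b):
    """32-bit a - b; carry is 1 when no borrow occurs (a >= b, unsigned)."""
    return (a - b) & MASK, 1 if a >= b else 0


def sbc(a, b, carry):
    """32-bit subtract with carry: a - b - (1 - carry)."""
    return (a - b - (1 - carry)) & MASK


def ne_zero_via_sbc(r0, r1):
    """Simulate the three-instruction sequence on 32-bit register values."""
    r0 = r0 & ~r1 & MASK       # BIC  r0, r0, r1
    r1, carry = subs(r0, 1)    # SUBS r1, r0, #1
    return sbc(r0, r1, carry)  # SBC  r0, r0, r1


for a, b in [(0, 0), (5, 5), (5, 4), (0x80000000, 0), (MASK, MASK), (MASK, 0)]:
    expected = int((a & ~b & MASK) != 0)
    print(hex(a), hex(b), ne_zero_via_sbc(a, b), expected)
```

When the BIC result x is non-zero, SUBS leaves carry set and SBC yields x - (x - 1) = 1. When x is zero, SUBS borrows, carry is clear, and SBC yields 0 - 0xFFFFFFFF - 1, which wraps to 0.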

[Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267

2024-01-25 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #3 from Maxim Kuvyrkov  ---
Copy-pasting my comment from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267#c15 :

I've looked into the reason for the above failures, and it seems to be not an
issue.

After the patch fwprop1 decides to do an additional propagation, which was
considered as "would increase complexity of pattern" before the patch.  This
results in change from "bics; mov" to "bic; subs".  If I understand ARM
assembler correctly, handling of sign was shifted from "bics" to "subs"
instruction.

This is the actual code: BEFORE:
bics    r0, r0, r1  @ 9 [c=4 l=4]  *andsi_notsi_si_compare0_scratch
mov     r0, #1  @ 23[c=4 l=4]  *thumb2_movsi_vfp/1
it      eq
moveq   r0, #0  @ 26[c=8 l=4]  *p *thumb2_movsi_vfp/2
bx      lr  @ 29[c=8 l=4]  *thumb2_return

and AFTER:
bic     r0, r0, r1  @ 8 [c=4 l=4]  andsi_notsi_si
subs    r0, r0, #0  @ 22[c=4 l=4]  cmpsi2_addneg/0
it      ne
movne   r0, #1  @ 23[c=8 l=4]  *p *thumb2_movsi_vfp/2
bx      lr  @ 26[c=8 l=4]  *thumb2_return

If I don't hear anything to the contrary, I'll update the testcase to accept
both "bic" and "bics".
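
A rough Python approximation of what such a relaxed pattern would match is shown below. It uses re.findall in place of the testsuite's Tcl matching, and the STRICT, RELAXED and count_matches names are hypothetical. The two assembly snippets are the BEFORE and AFTER sequences quoted above with the annotations stripped.

```python
import re

BEFORE = (
    "\tbics\tr0, r0, r1\n"
    "\tmov\tr0, #1\n"
    "\tit\teq\n"
    "\tmoveq\tr0, #0\n"
    "\tbx\tlr\n"
)
AFTER = (
    "\tbic\tr0, r0, r1\n"
    "\tsubs\tr0, r0, #0\n"
    "\tit\tne\n"
    "\tmovne\tr0, #1\n"
    "\tbx\tlr\n"
)

# Pattern in the style of the existing test, and a relaxed variant where
# the trailing "s" is optional.
STRICT = r"bics\tr[0-9]+, r[0-9]+, r[0-9]+"
RELAXED = r"bics?\tr[0-9]+, r[0-9]+, r[0-9]+"


def count_matches(pattern, asm):
    """Count non-overlapping matches of pattern in the assembly text."""
    return len(re.findall(pattern, asm))


print(count_matches(STRICT, BEFORE), count_matches(STRICT, AFTER))
print(count_matches(RELAXED, BEFORE), count_matches(RELAXED, AFTER))
```

The strict pattern matches only the BEFORE sequence, while the relaxed one matches both.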

[Bug rtl-optimization/111267] [14 Regression] Codegen regression from i386 argument passing changes

2024-01-25 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267

--- Comment #15 from Maxim Kuvyrkov  ---
(In reply to Maxim Kuvyrkov from comment #13)
> We are seeing scan-assembler failures in a single 32-bit arm test.  This
> affects both linux and bare-metal targets: arm-linux-gnueabihf and
> arm-none-eabi.
> 
>   === gcc tests ===
> 
> Running gcc:gcc.target/arm/arm.exp ...
> FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
> r[0-9]+ 2
> FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
> r[0-9]+, .sl #2 1

I've looked into the reason for the above failures, and it seems to be not an
issue.

After the patch fwprop1 decides to do an additional propagation, which was
considered as "would increase complexity of pattern" before the patch.  This
results in change from "bics; mov" to "bic; subs".  If I understand ARM
assembler correctly, handling of sign was shifted from "bics" to "subs"
instruction.

This is the actual code: BEFORE:
bics    r0, r0, r1  @ 9 [c=4 l=4]  *andsi_notsi_si_compare0_scratch
mov     r0, #1  @ 23[c=4 l=4]  *thumb2_movsi_vfp/1
it      eq
moveq   r0, #0  @ 26[c=8 l=4]  *p *thumb2_movsi_vfp/2
bx      lr  @ 29[c=8 l=4]  *thumb2_return

and AFTER:
bic     r0, r0, r1  @ 8 [c=4 l=4]  andsi_notsi_si
subs    r0, r0, #0  @ 22[c=4 l=4]  cmpsi2_addneg/0
it      ne
movne   r0, #1  @ 23[c=8 l=4]  *p *thumb2_movsi_vfp/2
bx      lr  @ 26[c=8 l=4]  *thumb2_return

If I don't hear anything to the contrary, I'll update the testcase to accept
both "bic" and "bics".

[Bug rtl-optimization/111267] [14 Regression] Codegen regression from i386 argument passing changes

2024-01-24 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #13 from Maxim Kuvyrkov  ---
We are seeing scan-assembler failures in a single 32-bit arm test.  This
affects both linux and bare-metal targets: arm-linux-gnueabihf and
arm-none-eabi.

=== gcc tests ===

Running gcc:gcc.target/arm/arm.exp ...
FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
r[0-9]+ 2
FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
r[0-9]+, .sl #2 1

[Bug target/113485] [14 regression] ICE with -fno-guess-branch-probability on aarch64 starting with r14-7187-g74e3e839ab2d36

2024-01-23 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113485

Maxim Kuvyrkov  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot gnu.org

--- Comment #3 from Maxim Kuvyrkov  ---
Hi Richard,

Would you please investigate this?

[Bug bootstrap/113445] [14 Regression] bootstrap failure on f95-lang.cc: ‘-fcompare-debug’ failure since r14-8174

2024-01-18 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113445

--- Comment #9 from Maxim Kuvyrkov  ---
(In reply to Jakub Jelinek from comment #8)
> Committed now.

Thanks, Jakub.

[Bug bootstrap/113445] [14 Regression] bootstrap failure on f95-lang.cc: ‘-fcompare-debug’ failure since r14-8174

2024-01-17 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113445

--- Comment #4 from Maxim Kuvyrkov  ---
Testing patch
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643267.html .

[Bug rtl-optimization/111554] [12/13 regression] Timeout with with "-O3 -fno-dse -fno-inline -fno-store-merging -fno-toplevel-reorder -fno-tree-dce -fno-tree-dse" since r12-2097-g9f34b780b0461e

2024-01-17 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111554

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Maxim Kuvyrkov  ---
Richard,

Thanks for confirming.  I was about to do the same.

[Bug rtl-optimization/96388] scheduling takes forever with -fPIC

2024-01-16 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96388

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Maxim Kuvyrkov  ---
Fixed.

[Bug rtl-optimization/111554] [12/13/14 regression] Timeout with with "-O3 -fno-dse -fno-inline -fno-store-merging -fno-toplevel-reorder -fno-tree-dce -fno-tree-dse" since r12-2097-g9f34b780b0461e

2023-11-20 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111554

--- Comment #12 from Maxim Kuvyrkov  ---
(In reply to Richard Biener from comment #11)
> (In reply to Maxim Kuvyrkov from comment #10)
> > This is, mostly, a dup of PR96388.
> > With the problem in PR96388 fixed, I'm seeing strange -ftime-report
> > reporting that [wrongly?] attributes a lot of time to alias analysis.
> 
> You can try -ftime-report -ftime-report-details (yeah, two options),
> this lists "dependent" times in other timevars beyond the passes using them.

Ack, thanks, Richard.

Posted patch in
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637419.html .

[Bug rtl-optimization/96388] scheduling takes forever with -fPIC

2023-11-20 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96388

--- Comment #16 from Maxim Kuvyrkov  ---
Posted patch in
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637419.html

[Bug rtl-optimization/111554] [12/13/14 regression] Timeout with with "-O3 -fno-dse -fno-inline -fno-store-merging -fno-toplevel-reorder -fno-tree-dce -fno-tree-dse" since r12-2097-g9f34b780b0461e

2023-11-19 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111554

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #10 from Maxim Kuvyrkov  ---
This is, mostly, a dup of PR96388.
With the problem in PR96388 fixed, I'm seeing strange -ftime-report reporting
that [wrongly?] attributes a lot of time to alias analysis.

[Bug rtl-optimization/96388] scheduling takes forever with -fPIC

2023-11-18 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96388

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #15 from Maxim Kuvyrkov  ---
Finished analysis.  Will post a patch next week.

[Bug rtl-optimization/111554] [12/13/14 regression] Timeout with with "-O3 -fno-dse -fno-inline -fno-store-merging -fno-toplevel-reorder -fno-tree-dce -fno-tree-dse" since r12-2097-g9f34b780b0461e

2023-10-30 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111554

Maxim Kuvyrkov  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mkuvyrkov at gcc dot gnu.org

--- Comment #9 from Maxim Kuvyrkov  ---
Taking.

[Bug rtl-optimization/96388] scheduling takes forever with -fPIC

2023-10-30 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96388

Maxim Kuvyrkov  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mkuvyrkov at gcc dot gnu.org

--- Comment #14 from Maxim Kuvyrkov  ---
Taking.

[Bug bootstrap/107119] Bootstrap ICE on 32-bit ARM after r13-2871-g1b74b5cb4e9

2022-10-03 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107119

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Maxim Kuvyrkov  ---
(In reply to Jeffrey A. Law from comment #2)
> Does it still happen after this:
> Author: Jeff Law 
> Date:   Tue Sep 27 01:44:38 2022 -0400
> 
> Fix ICEs due to recent jump-to-return optimization
> 
> gcc/
> * cfgrtl.cc (fixup_reorder_chain): Verify that simple_return
> and return are available before trying to use them.

After this commit the bootstrap still failed, but something among the commits
below, made on September 29th, fixed it:
ada: Fix checking of Refined_State with nested package renamings (detail)
ada: Improve efficiency of slice-of-component assignment (detail)
ada: Further tweak new expansion of contracts (detail)
ada: Remove duplicated doc comment section (detail)
support -gz=zstd for both linker and assembler (detail)
driver, cppdefault: Unbreak bootstrap on Debian/Ubuntu [PR107059] (detail)
Add OPTIONS_H_EXTRA to GTFILES (detail)
aarch64: Rename AARCH64_ISA architecture-level macros (detail)
aarch64: Rename AARCH64_FL architecture-level macros (detail)
aarch64: Rename AARCH64_FL_FOR_ARCH macros (detail)
aarch64: Add "V" to aarch64-arches.def names (detail)
aarch64: Small config.gcc cleanups (detail)
aarch64: Avoid redundancy in aarch64-cores.def (detail)
aarch64: Remove AARCH64_FL_RCPC8_4 [PR107025] (detail)
aarch64: Fix transitive closure of features (detail)
aarch64: Reorder an entry in aarch64-option-extensions.def (detail)
aarch64: Simplify feature definitions (detail)
aarch64: Simplify generation of .arch strings (detail)
aarch64: Avoid std::string in static data (detail)
aarch64: Tweak constness of option-related data (detail)
aarch64: Make more use of aarch64_feature_flags (detail)
aarch64: Tweak contents of flags_on/off fields (detail)
aarch64: Tweak handling of -mgeneral-regs-only (detail)
aarch64: Remove redundant TARGET_* checks (detail)
data-ref: Fix ranges_maybe_overlap_p test (detail)
place `const volatile' objects in read-only sections (detail)
c++: import/export NTTP objects (detail)
libstdc++: Guard use of new built-in with __has_builtin (detail)
c++: introduce TRAIT_TYPE alongside TRAIT_EXPR (detail)
c++: implement __remove_cv, __remove_reference and __remove_cvref (detail)
OpenACC: whole struct vs. component mappings (PR107028) (detail)

[Bug bootstrap/107119] Bootstrap ICE on 32-bit ARM after r13-2871-g1b74b5cb4e9

2022-10-01 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107119

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||jeffreyalaw at gmail dot com,
   ||mkuvyrkov at gcc dot gnu.org

--- Comment #1 from Maxim Kuvyrkov  ---
Hi Jeff,

Would you please take a look at this?  Please let me know if you need any
assistance with reproducing this.

I've seen this only on 32-bit ARM; doesn't affect AArch64.

[Bug bootstrap/107119] New: Bootstrap ICE on 32-bit ARM after r13-2871-g1b74b5cb4e9

2022-10-01 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107119

Bug ID: 107119
   Summary: Bootstrap ICE on 32-bit ARM after
r13-2871-g1b74b5cb4e9
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

After basepoints/gcc-13-2871-g1b74b5cb4e9:
===
commit 1b74b5cb4e9d7191f298245063a8f9c3a1bbeff4
Author: Jeff Law 
Date:   Sun Sep 25 12:23:59 2022 -0400

   [RFA] Minor improvement to coremark, avoid unconditional jump to return
===

Bootstrap on 32-bit ARM (armv8l-linux-gnueabihf) fails with:

00:04:19 during RTL pass: bbro
00:04:19
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libgomp/env.c:
In function ‘gomp_get_initial_icv_item’:
00:04:19
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/libgomp/env.c:1964:1:
internal compiler error: in target_gen_simple_return, at
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/config/arm/arm.md:8998
00:04:19  1964 | }
00:04:19   | ^
00:04:19   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/config/arm/arm.md:8998
00:04:19 0x7df80b force_nonfallthru_and_redirect(edge_def*, basic_block_def*,
rtx_def*)
00:04:19../../../../../../gcc/gcc/cfgrtl.cc:1711
00:04:19 0x7e3681 fixup_reorder_chain
00:04:19../../../../../../gcc/gcc/cfgrtl.cc:4070
00:04:19 0x7e4791 cfg_layout_finalize()
00:04:19../../../../../../gcc/gcc/cfgrtl.cc:4565
00:04:19 0x1a80f1b execute
00:04:19../../../../../../gcc/gcc/bb-reorder.cc:2663
00:04:19 Please submit a full bug report, with preprocessed source (by using
-freport-bug).
00:04:19 Please include the complete backtrace with any bug report.
00:04:19 See <https://gcc.gnu.org/bugs/> for instructions.

Configure line was:

.../configure --with-gnu-as --with-gnu-ld --disable-libmudflap --enable-lto
--enable-shared --without-included-gettext --enable-nls --with-system-zlib
--disable-sjlj-exceptions --enable-gnu-unique-object --enable-linker-build-id
--disable-libstdcxx-pch --enable-c99 --enable-clocale=gnu
--enable-libstdcxx-debug --enable-long-long --with-cloog=no --with-ppl=no
--with-isl=no --disable-multilib --with-float=hard --with-fpu=neon-fp-armv8
--with-mode=thumb --with-arch=armv8-a --enable-threads=posix --enable-multiarch
--enable-libstdcxx-time=yes --enable-gnu-indirect-function
--enable-checking=yes --enable-bootstrap --enable-languages=c,c++,fortran,to

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2022-08-02 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 90957, which changed state.

Bug 90957 Summary: 453.povray is miscompiled with -flto -static
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90957

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

[Bug lto/90957] 453.povray is miscompiled with -flto -static

2022-08-02 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90957

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Resolution|--- |WONTFIX
 Status|WAITING |RESOLVED

--- Comment #2 from Maxim Kuvyrkov  ---
From further analysis, it seems this problem was due to how old armv7 kernels
(3.10) allocated the process stack.  The stack bumped against a special
kernel-provided page, so when the compiler generated bigger-than-usual stack
frames, the stack would overflow and the process would crash.

Non-ancient kernels don't have this problem.

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2021-10-01 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 90378, which changed state.

Bug 90378 Summary: [9/10/11/12 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WORKSFORME

[Bug rtl-optimization/90378] [9/10/11/12 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm

2021-10-01 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Resolution|--- |WORKSFORME
 Status|NEW |RESOLVED

--- Comment #13 from Maxim Kuvyrkov  ---
This problem has not occurred since we moved aarch32 benchmarking to
a machine with a newer kernel.  Closing.

[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev

2021-10-01 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|ASSIGNED|UNCONFIRMED
 Ever confirmed|1   |0

[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev

2021-10-01 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org
   Assignee|thomas.preudhomme at celest dot fr |unassigned at gcc dot gnu.org

--- Comment #16 from Maxim Kuvyrkov  ---
AFAIK, Thomas isn't working on this.

[Bug target/95958] [meta-bug] Inefficient arm_neon.h code for AArch64

2021-08-17 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95958
Bug 95958 depends on bug 91598, which changed state.

Bug 91598 Summary: [9 regression] 60% speed drop on neon intrinsic loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/47562] [meta-bug] keep track of Neon Intrinsics enhancements

2021-08-17 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47562
Bug 47562 depends on bug 91598, which changed state.

Bug 91598 Summary: [9 regression] 60% speed drop on neon intrinsic loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/91598] [9 regression] 60% speed drop on neon intrinsic loop

2021-08-17 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Maxim Kuvyrkov  ---
Closing.

[Bug target/91598] [9 regression] 60% speed drop on neon intrinsic loop

2021-08-12 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

--- Comment #12 from Maxim Kuvyrkov  ---
(In reply to Tamar Christina from comment #11)
> Can this issue be closed? all inline assembly have been removed from
> arm_neon.h but backporting these are extremely unlikely.. (some are
> intrusive)

Hi Tamar,

Looking at this, I now remember that I have a couple of minor approved patches
that I forgot to commit.

Let me retest and commit those, and we can close this.

[Bug testsuite/101506] [12 Regression] gcc.target/aarch64/vect-fmaxv-fminv-compile.c fail after gcc-12-2292-g1dd3f2109585

2021-07-19 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101506

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||rsandifo at gcc dot gnu.org

--- Comment #3 from Maxim Kuvyrkov  ---
Richard S., could you investigate, please?  You've added the testcase in 2018
and it hasn't been touched since.

[Bug tree-optimization/101506] [12 Regression] gcc.target/aarch64/vect-fmaxv-fminv-compile.c fail after gcc-12-2292-g1dd3f2109585

2021-07-19 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101506

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #1 from Maxim Kuvyrkov  ---
Hi Richard,

Would you please take a look?

[Bug tree-optimization/101506] New: [12 Regression] gcc.target/aarch64/vect-fmaxv-fminv-compile.c fail after gcc-12-2292-g1dd3f2109585

2021-07-19 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101506

Bug ID: 101506
   Summary: [12 Regression]
gcc.target/aarch64/vect-fmaxv-fminv-compile.c fail
after gcc-12-2292-g1dd3f2109585
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

On aarch64-linux-gnu I get 2 new failures after
===
commit 1dd3f21095858fbfd3e28a149578d5fb67e75f95
Author: Richard Biener 
Date:   Tue Jul 13 13:59:15 2021 +0200

   Support reduction def re-use for epilogue with different vector size

   The following adds support for re-using the vector reduction def
   from the main loop in vectorized epilogue loops on architectures
   which use different vector sizes for the epilogue.  That's only
   x86 as far as I am aware.
===

=== gcc tests ===

Running gcc.target/aarch64/aarch64.exp ...
PASS: gcc.target/aarch64/vect-fmaxv-fminv-compile.c (test for excess errors)
FAIL: gcc.target/aarch64/vect-fmaxv-fminv-compile.c scan-assembler fminnmv
FAIL: gcc.target/aarch64/vect-fmaxv-fminv-compile.c scan-assembler fmaxnmv
PASS: gcc.target/aarch64/vect-fmaxv-fminv-compile.c scan-assembler fminnmp
PASS: gcc.target/aarch64/vect-fmaxv-fminv-compile.c scan-assembler fmaxnmp
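
A before/after comparison like the one above can be automated.  The following
is a minimal Python sketch, assuming two DejaGnu-style result listings held as
strings.  The helper names parse_results and find_regressions are hypothetical
and are not part of the GCC testsuite tooling, and the "before" listing is an
assumed baseline in which all five tests pass.

```python
import re

# Matches DejaGnu result lines such as
#   "FAIL: gcc.target/aarch64/foo.c scan-assembler fminnmv"
RESULT_RE = re.compile(r"^(PASS|FAIL|XPASS|XFAIL|UNSUPPORTED|UNRESOLVED): (.+)$")


def parse_results(text):
    """Map each test name to its status; later lines override earlier ones."""
    results = {}
    for line in text.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            status, name = match.groups()
            results[name] = status
    return results


def find_regressions(before_text, after_text):
    """Return sorted names of tests that passed before and fail now."""
    before = parse_results(before_text)
    after = parse_results(after_text)
    return sorted(
        name
        for name, status in after.items()
        if status == "FAIL" and before.get(name) == "PASS"
    )


TEST = "gcc.target/aarch64/vect-fmaxv-fminv-compile.c"

# Assumed baseline (not taken from the report): everything passes.
BEFORE = "\n".join([
    f"PASS: {TEST} (test for excess errors)",
    f"PASS: {TEST} scan-assembler fminnmv",
    f"PASS: {TEST} scan-assembler fmaxnmv",
    f"PASS: {TEST} scan-assembler fminnmp",
    f"PASS: {TEST} scan-assembler fmaxnmp",
])

# Results as listed in the report.
AFTER = "\n".join([
    f"PASS: {TEST} (test for excess errors)",
    f"FAIL: {TEST} scan-assembler fminnmv",
    f"FAIL: {TEST} scan-assembler fmaxnmv",
    f"PASS: {TEST} scan-assembler fminnmp",
    f"PASS: {TEST} scan-assembler fmaxnmp",
])

for name in find_regressions(BEFORE, AFTER):
    print("REGRESSION:", name)
```

Keying on the full test name (file plus directive) keeps the individual
scan-assembler checks of one source file distinct.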

[Bug rtl-optimization/90378] [9/10/11/12 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm

2021-06-01 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

--- Comment #12 from Maxim Kuvyrkov  ---
I'm now more of the opinion that this problem is due to the unfortunate stack
layout on older ARMv7 systems.  I.e., LTO increases stack usage, which causes
the process to run out of stack.  I don't think anything needs to be fixed on
the GCC side, and, hopefully, we can fix this on our end by upgrading from the
3.10 kernel to something more modern.

[Bug middle-end/100537] Bootstrap-O3 and bootstrap-debug fail on 32-bit ARM after gcc-12-657-ga076632e274a

2021-05-11 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100537

--- Comment #2 from Maxim Kuvyrkov  ---
So far I have managed to reproduce this only with an armhf Go build, so one
needs a go1 binary from either a native armhf system or an armhf
cross-toolchain to reproduce.

To reproduce on a native system, configure GCC with:
../gcc/configure --disable-bootstrap --disable-multilib --with-float=hard
--enable-languages=go

To reproduce using a cross-toolchain and Linaro scripts:
git clone https://git.linaro.org/toolchain/abe.git
cd abe
./configure
./abe.sh --target arm-linux-gnueabihf --build all --extraconfigdir
config/master

[Bug middle-end/100537] Bootstrap-O3 and bootstrap-debug fail on 32-bit ARM after gcc-12-657-ga076632e274a

2021-05-11 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100537

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
             Target|                            |arm-linux-gnueabihf

--- Comment #1 from Maxim Kuvyrkov  ---
A reduced testcase is coming ...

[Bug middle-end/100537] New: Bootstrap-O3 and bootstrap-debug fail on 32-bit ARM after gcc-12-657-ga076632e274a

2021-05-11 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100537

            Bug ID: 100537
           Summary: Bootstrap-O3 and bootstrap-debug fail on 32-bit ARM
                    after gcc-12-657-ga076632e274a
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

Patch a076632e274abe344ca7648b7c7f299273d4cbe0 appears to have broken
bootstrap-O3 and bootstrap-debug at least on 32-bit armhf.

00:33:32 In function ‘syscall.forkExec’:
00:33:32 go1: error: address taken, but ADDRESSABLE bit not set
00:33:32 PHI argument
00:33:32 &go..C479;
00:33:32 for PHI node
00:33:32 err$__object_78 = PHI 
00:33:32 during GIMPLE pass: fre
00:33:32 go1: internal compiler error: verify_ssa failed
00:33:32 0x9c18d7 verify_ssa(bool, bool)
00:33:32   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-ssa.c:1214
00:33:32 0x6f8d5b execute_function_todo
00:33:32   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/passes.c:2049
00:33:32 0x6f9abf do_per_function
00:33:32   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/passes.c:1687
00:33:32 0x6f9abf execute_todo
00:33:32   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/passes.c:2096
00:33:32 Please submit a full bug report,
00:33:32 with preprocessed source if appropriate.
00:33:32 Please include the complete backtrace with any bug report.
00:33:32 See <https://gcc.gnu.org/bugs/> for instructions.
00:33:32 Makefile:3001: recipe for target 'syscall.lo' failed
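
When triaging logs like the one above, it can help to reduce the backtrace to
(function, file, line) triples.  The following is a minimal Python sketch,
assuming the layout seen in these CI logs: an optional HH:MM:SS prefix, a
"0x... function" line, and the source location on a later line.  The helper
name extract_frames is hypothetical.  Function signatures that wrap across
lines are only captured up to the line break.

```python
import re

# A frame line looks like "0x9c18d7 verify_ssa(bool, bool)", optionally
# preceded by a CI timestamp such as "00:33:32 ".
FRAME_RE = re.compile(r"^(?:\d\d:\d\d:\d\d\s+)?0x[0-9a-f]+\s+(.+)$")
# A location line is an absolute path followed by ":<line number>".
LOCATION_RE = re.compile(r"^(?:\d\d:\d\d:\d\d\s+)?(/\S+):(\d+)$")


def extract_frames(log_text):
    """Return (function, file, line) tuples from a GCC ICE backtrace."""
    frames = []
    pending = None  # function name still waiting for its location line
    for raw in log_text.splitlines():
        line = raw.strip()
        frame = FRAME_RE.match(line)
        if frame:
            pending = frame.group(1)
            continue
        location = LOCATION_RE.match(line)
        if location and pending is not None:
            frames.append((pending, location.group(1), int(location.group(2))))
            pending = None
    return frames


# Two frames copied from the log above.
SAMPLE = "\n".join([
    "00:33:32 0x9c18d7 verify_ssa(bool, bool)",
    "00:33:32",
    "/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/"
    "gcc.git~master/gcc/tree-ssa.c:1214",
    "00:33:32 0x6f8d5b execute_function_todo",
    "00:33:32",
    "/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/"
    "gcc.git~master/gcc/passes.c:2049",
])

for function, path, lineno in extract_frames(SAMPLE):
    # Print only the file's base name to keep the summary short.
    print(function, path.rsplit("/", 1)[-1], lineno)
```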

[Bug bootstrap/99983] [9/10 regression] ICE in bootstrap while building libstdc++

2021-04-09 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99983

--- Comment #8 from Maxim Kuvyrkov  ---
I'll revert on gcc-10 and then backport the revert to gcc-9.

[Bug bootstrap/99983] [10 regression] ICE in bootstrap while building libstdc++

2021-04-09 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99983

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Build|powerpc64*-linux-gnu        |powerpc64*-linux-gnu
                   |                            |x86_64-linux-gnu
                   |                            |aarch64-linux-gnu
                   |                            |arm-linux-gnueabihf
             Target|powerpc64*-linux-gnu        |powerpc64*-linux-gnu
                   |                            |x86_64-linux-gnu
                   |                            |aarch64-linux-gnu
                   |                            |arm-linux-gnueabihf
                 CC|                            |fdumont at gcc dot gnu.org,
                   |                            |mkuvyrkov at gcc dot gnu.org
               Host|powerpc64*-linux-gnu        |powerpc64*-linux-gnu
                   |                            |x86_64-linux-gnu
                   |                            |aarch64-linux-gnu
                   |                            |arm-linux-gnueabihf

--- Comment #3 from Maxim Kuvyrkov  ---
(In reply to seurer from comment #1)
> The failures shown were on a power 8 LE system for
> g:348fb9db7858b0fe852da3cd1195b90b2211b983, r10-9675.  I have something
> running to look for what revision started it.

It appears to be
===
commit 1c4e8a96cd695c03ff85299bf2392476feae99bb
Author: François Dumont 
Date: Mon Jan 20 19:15:43 2020 +0100

libstdc++: Fix unordered containers move constructors noexcept qualification

_Hashtable move constructor is wrongly qualified as noexcept(true) regardless
of _Equal and _H1 copy constructor qualifications.
_Hashtable allocator-aware move constructor is missing its noexcept
qualification like the depending unordered containers ones.

This backport also includes the changes from r11-8062.
===

I have also confirmed this on x86_64, aarch64, and aarch32.

[Bug tree-optimization/98415] [11 Regression] GCC crashes on Linux kernel build after r11-6271-g69165332a914f1167c3077fa1f57afc64fd8a667

2020-12-21 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98415

Maxim Kuvyrkov  changed:

           What    |Removed                       |Added
------------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #2 from Maxim Kuvyrkov  ---
Jakub, would you please take a look at this?

Let me know if you need any assistance in reproducing this.

[Bug tree-optimization/98415] [11 Regression] GCC crashes on Linux kernel build after r11-6271-g69165332a914f1167c3077fa1f57afc64fd8a667

2020-12-21 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98415

--- Comment #1 from Maxim Kuvyrkov  ---
Created attachment 49826
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49826&action=edit
Reduced testcase

[Bug tree-optimization/98415] New: [11 Regression] GCC crashes on Linux kernel build after r11-6271-g69165332a914f1167c3077fa1f57afc64fd8a667

2020-12-21 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98415

            Bug ID: 98415
           Summary: [11 Regression] GCC crashes on Linux kernel build
                    after
                    r11-6271-g69165332a914f1167c3077fa1f57afc64fd8a667
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

GCC crashes while building one of the Linux kernel drivers after
r11-6271-g69165332a914f1167c3077fa1f57afc64fd8a667, at least for AArch64 and
ARM (likely other architectures as well, but I didn't test).

Attached is a reduced (down from 3.5M!) testcase for GCC targeting
aarch64-linux-gnu.

Before r11-6271-g69165332a914f1167c3077fa1f57afc64fd8a667 cc1 cleanly compiles
the attached testcase:
$ ./cc1 channel.i -o channel.s -quiet -Wall -Wundef -Werror
-mstack-protector-guard=sysreg -mstack-protector-guard-reg=sp_el0
-mstack-protector-guard-offset=1344 -O2 -Wno-pointer-sign
-Wno-packed-not-aligned -fno-strict-aliasing

and after we get:

$ ./cc1 channel.i -o channel.s -quiet -Wall -Wundef -Werror
-mstack-protector-guard=sysreg -mstack-protector-guard-reg=sp_el0
-mstack-protector-guard-offset=1344 -O2 -Wno-pointer-sign
-Wno-packed-not-aligned -fno-strict-aliasing
free(): invalid next size (fast)
during GIMPLE pass: ccp
channel.i: In function ‘ak’:
channel.i:19:1: internal compiler error: Aborted
   19 | }
  | ^
0xf440b3 crash_signal
   
/home/maxim.kuvyrkov/tcwg_kernel/abe/snapshots/gcc.git~master/gcc/toplev.c:327
0xac0e4a ~dom_info
   
/home/maxim.kuvyrkov/tcwg_kernel/abe/snapshots/gcc.git~master/gcc/dominance.c:284
0xac2a5b calculate_dominance_info(cdi_direction)
   
/home/maxim.kuvyrkov/tcwg_kernel/abe/snapshots/gcc.git~master/gcc/dominance.c:733
0xf99929 cleanup_tree_cfg_noloop
   
/home/maxim.kuvyrkov/tcwg_kernel/abe/snapshots/gcc.git~master/gcc/tree-cfgcleanup.c:1086
0xf99929 cleanup_tree_cfg(unsigned int)
   
/home/maxim.kuvyrkov/tcwg_kernel/abe/snapshots/gcc.git~master/gcc/tree-cfgcleanup.c:1187
0xe5634c execute_function_todo
   
/home/maxim.kuvyrkov/tcwg_kernel/abe/snapshots/gcc.git~master/gcc/passes.c:2008
0xe5705e execute_todo
   
/home/maxim.kuvyrkov/tcwg_kernel/abe/snapshots/gcc.git~master/gcc/passes.c:2096
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


Interestingly, the original testcase crashes with a different backtrace:

00:18:04 realloc(): invalid next size
00:18:04 during GIMPLE pass: pre
00:18:04 drivers/net/wireless/broadcom/brcm80211/brcmsmac/channel.c: In
function ‘brcms_reg_notifier’:
00:18:04 drivers/net/wireless/broadcom/brcm80211/brcmsmac/channel.c:695:13:
internal compiler error: Aborted
00:18:04   695 | static void brcms_reg_notifier(struct wiphy *wiphy,
00:18:04   | ^~
00:18:04 0xd95a3f crash_signal
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/toplev.c:327
00:18:04 0x1adb9cc xrealloc
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/libiberty/xmalloc.c:179
00:18:04 0xf8772a void va_heap::reserve(vec*&, unsigned int, bool)
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/vec.h:290
00:18:04 0xf8772a vec::reserve(unsigned int,
bool)
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/vec.h:1778
00:18:04 0xf7ddd9 vec::safe_grow(unsigned int,
bool)
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/vec.h:1926
00:18:04 0xf7ddd9 vec::safe_grow_cleared(unsigned int, bool)
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/vec.h:1945
00:18:04 0xf7ddd9 add_to_value
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-pre.c:715
00:18:04 0xf7edd7 phi_translate_1
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-pre.c:1740
00:18:04 0xf7f7cf phi_translate
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-pre.c:1807
00:18:04 0xf82aeb phi_translate_set
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-pre.c:1852
00:18:04 0xf8305a compute_antic_aux
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-pre.c:2192
00:18:04 0xf8305a compute_antic
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-pre.c:2538
00:18:04 0xf8562b execute
00:18:04   
/home/tcwg-buildslave/workspace/tcwg_kernel_

[Bug rtl-optimization/98246] [11 Regression] Linux kernel build fails after 6fbec038f7a7

2020-12-11 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98246

Maxim Kuvyrkov  changed:

           What    |Removed                       |Added
------------------------------------------------------------------------------
                 CC|                              |mkuvyrkov at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com

--- Comment #1 from Maxim Kuvyrkov  ---
H.J., would you please take a look?  Please let me know if you need assistance
with reproducing this.

Thanks!

[Bug rtl-optimization/98246] New: [11 Regression] Linux kernel build fails after 6fbec038f7a7

2020-12-11 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98246

            Bug ID: 98246
           Summary: [11 Regression] Linux kernel build fails after
                    6fbec038f7a7
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

Linux kernel build fails at least for AArch64 and armhf after


commit 6fbec038f7a7ddf29f074943611b53210d17c40c
Author: H.J. Lu 
Date:   Mon Feb 3 11:55:43 2020 -0800

Use SHF_GNU_RETAIN to preserve symbol definitions


The error is:
00:04:49 In file included from ./include/linux/serial_sci.h:6,
00:04:49  from drivers/tty/serial/sh-sci.c:42:
00:04:49 ./include/linux/compiler-gcc.h:47:45: error:
‘__UNIQUE_ID___earlycon_hscif362’ causes a section type conflict with
‘banner’
00:04:49    47 | #define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_,
prefix), __COUNTER__)
00:04:49   | ^~~~

This likely also happens on other architectures.

To reproduce, a vanilla aarch64-linux-gnu stage1 cross compiler and any recent
Linux version are sufficient.

Compiler build log:
https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-mainline-defconfig/24/artifact/artifacts/build-first_bad/04-build_abe-stage1/console.log/*view*/

Kernel build log:
https://ci.linaro.org/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-mainline-defconfig/24/artifact/artifacts/build-first_bad/05-build_linux/console.log/*view*/

[Bug target/97323] [10/11 Regression] ICE 'verify_type' failed on arm-linux-gnueabihf

2020-10-28 Thread mkuvyrkov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97323

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkuvyrkov at gcc dot gnu.org,
                   |                            |rth at gcc dot gnu.org

--- Comment #8 from Maxim Kuvyrkov  ---
Hi Richard,

Would you be interested in looking at this bug?  The original testcase is from
the QEMU source: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972789 .

[Bug rtl-optimization/96998] GCC ICEs in on building AArch64 Linux kernel after basepoints/gcc-11-2903-g6b3034eaba83

2020-09-09 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96998

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |acoplan at gcc dot gnu.org,
                   |                            |mkuvyrkov at gcc dot gnu.org
             Target|                            |aarch64-linux-gnu

--- Comment #1 from Maxim Kuvyrkov  ---
Alex, would you please look into this?

Let me know if you have problems reproducing the crash and I'll make a smaller
reproducer.

[Bug rtl-optimization/96998] New: GCC ICEs in on building AArch64 Linux kernel after basepoints/gcc-11-2903-g6b3034eaba83

2020-09-09 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96998

            Bug ID: 96998
           Summary: GCC ICEs in on building AArch64 Linux kernel after
                    basepoints/gcc-11-2903-g6b3034eaba83
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

GCC ICEs when building the Linux kernel for the AArch64 target after
basepoints/gcc-11-2903-g6b3034eaba83:
===
commit 6b3034eaba83935d9f6dfb20d2efbdb34b5b00bf
Author: Alex Coplan 
Date:   Thu Aug 27 09:49:57 2020 +0100

lra: Canonicalize mult to shift in address reloads
===

The error message is:
===
00:01:02 during RTL pass: reload
00:01:02 mm/backing-dev.c: In function ‘clear_bdi_congested’:
00:01:02 mm/backing-dev.c:930:1: internal compiler error: in
lra_set_insn_recog_data, at lra.c:1006
00:01:02   930 | }
00:01:02   | ^
00:01:03 0xb4f991 lra_set_insn_recog_data(rtx_insn*)
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:1004
00:01:03 0xb4feb7 lra_get_insn_recog_data
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra-int.h:488
00:01:03 0xb4feb7 lra_update_insn_regno_info(rtx_insn*)
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:1625
00:01:03 0xb50027 lra_update_insn_regno_info(rtx_insn*)
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:1774
00:01:03 0xb50027 lra_push_insn_1
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:1780
00:01:03 0xb50027 lra_push_insn(rtx_insn*)
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:1788
00:01:03 0xb50027 push_insns
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:1831
00:01:03 0xb51c0c push_insns
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/rtl.h:886
00:01:03 0xb51c0c lra_process_new_insns(rtx_insn*, rtx_insn*, rtx_insn*, char
const*)
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:1879
00:01:03 0xb6981e curr_insn_transform
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra-constraints.c:4572
00:01:03 0xb6b7cd lra_constraints(bool)
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra-constraints.c:5066
00:01:03 0xb52c4c lra(_IO_FILE*)
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/lra.c:2415
00:01:03 0xb032a9 do_reload
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/ira.c:5525
00:01:03 0xb032a9 execute
00:01:03   
/home/tcwg-buildslave/workspace/tcwg_kernel_0/abe/snapshots/gcc.git~master/gcc/ira.c:5711
00:01:03 Please submit a full bug report,
00:01:03 with preprocessed source if appropriate.
00:01:03 Please include the complete backtrace with any bug report.
00:01:03 See <https://gcc.gnu.org/bugs/> for instructions.
00:01:03 scripts/Makefile.build:283: recipe for target 'mm/backing-dev.o'
failed
00:01:03 make[1]: *** [mm/backing-dev.o] Error 1
===

The toolchain configuration is a no-frills default aarch64-linux-gnu
toolchain, but only the linux-next tree seems to be affected.  It's possible
that the ICE is due to a wrong register constraint in a poorly-tested patch in
linux-next.  Nevertheless, the ICE is worth investigating.

The linux-next tree as tested was "7a6956579ce69 Add linux-next specific files
for 20200903".

More details on the toolchain build:
https://ci.linaro.org/view/tcwg_kernel-gnu/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-next-allnoconfig/4/artifact/artifacts/build-first_bad/04-build_abe-stage1/console.log/*view*/
More details on the linux build:
https://ci.linaro.org/view/tcwg_kernel-gnu/job/tcwg_kernel-gnu-bisect-gnu-master-aarch64-next-allnoconfig/4/artifact/artifacts/build-first_bad/05-build_linux/console.log/*view*/

[Bug target/88836] [SVE] Redundant PTEST in loop test

2020-08-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88836

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkuvyrkov at gcc dot gnu.org

--- Comment #3 from Maxim Kuvyrkov  ---
Hi Richard,
Hi Kyrill,

Does Kugan's patch tentatively look good?

If it does, we'll pick it up, retest, fix/improve if necessary and submit for
review.

[Bug tree-optimization/96022] ICE during GIMPLE pass: slp in operator[], at vec.h:867

2020-07-08 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96022

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |mkuvyrkov at gcc dot gnu.org
         Resolution|FIXED                       |---

--- Comment #5 from Maxim Kuvyrkov  ---
Hi Richard,

This causes ICEs on many vectorization testcases for arm-linux-gnueabihf.  The
full list of regressions is at [1].  Sum and log files are at [2].

[1]
https://ci.linaro.org/view/tcwg_cross/job/tcwg_cross-bisect-gnu-master-arm-check_cross/7/artifact/artifacts/build-first_bad/results/*view*/
[2]
https://ci.linaro.org/view/tcwg_cross/job/tcwg_cross-bisect-gnu-master-arm-check_cross/7/artifact/artifacts/build-first_bad/sumfiles/
 

Typical error log:

spawn -ignore SIGHUP
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/x86_64-unknown-linux-gnu/bin/arm-linux-gnueabihf-gcc
--sysroot=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/sysroots/arm-linux-gnueabihf
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/testsuite/gcc.dg/vect/pr55857-1.c
-fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never -fdiagnostics-urls=never
--sysroot=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/sysroots/arm-linux-gnueabihf
-mfpu=neon -ffast-math -ftree-vectorize -fno-tree-loop-distribute-patterns
-fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details -S -o pr55857-1.s
during GIMPLE pass: vect
dump file: pr55857-1.c.163t.vect
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/testsuite/gcc.dg/vect/pr55857-1.c:
In function 'foo':
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/testsuite/gcc.dg/vect/pr55857-1.c:4:1:
internal compiler error: Segmentation fault
0xcd933f crash_signal
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/toplev.c:328
0xf889ee get_vectype_for_scalar_type(vec_info*, tree_node*, _slp_tree*)
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vect-stmts.c:10999
0xf889ee vectorizable_shift
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vect-stmts.c:5404
0xfa3dbf vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*, _slp_tree*,
_slp_instance*, vec*)
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vect-stmts.c:10555
0xfbc5b9 vect_analyze_loop_operations
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vect-loop.c:1613
0xfbeb14 vect_analyze_loop_2
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vect-loop.c:2164
0xfbeb14 vect_analyze_loop(loop*, vec_info_shared*)
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vect-loop.c:2612
0xfe206c try_vectorize_loop_1
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vectorizer.c:955
0xfe2f09 vectorize_loops()
   
/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/gcc/tree-vectorizer.c:1189
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
compiler exited with status 1
FAIL: gcc.dg/vect/pr55857-1.c (internal compiler error)
FAIL: gcc.dg/vect/pr55857-1.c (test for excess errors)


GCC was configured as a typical armhf cross-compiler:
 --disable-multilib --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb
--with-tune=cortex-a9 --with-arch=armv7-a --enable-threads=posix
--enable-multiarch --enable-libstdcxx-time=yes --enable-gnu-indirect-function
--enable-checking=yes --disable-bootstrap --enable-languages=c,c++,fortran,lto
--build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu
--target=arm-linux-gnueabihf

[Bug tree-optimization/96049] basepoints/gcc-11-1649-gd2adb79eac66 crashes on building 416.gamess with -O3 -flto

2020-07-07 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96049

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|WAITING                     |RESOLVED

--- Comment #4 from Maxim Kuvyrkov  ---
Hi Richard,

This is indeed fixed on current mainline.

For the record, we benchmark cross-toolchains for both aarch64-linux-gnu and
arm-linux-gnueabihf (we want to benchmark LTO builds, and at least the ARMv7
machines don't have enough RAM).  The SPEC2k6 harness runs natively on the
benchmarking ARMv8 and ARMv7 boards, and it calls compiler wrappers for remote
compilation on x86_64 boxes.

Configure options are fairly standard for a cross-toolchain:
--with-gnu-as --with-gnu-ld --disable-libmudflap --enable-lto --enable-shared
--without-included-gettext --enable-nls --with-system-zlib
--disable-sjlj-exceptions --enable-gnu-unique-object --enable-linker-build-id
--disable-libstdcxx-pch --enable-c99 --enable-clocale=gnu
--enable-libstdcxx-debug --enable-long-long --with-cloog=no --with-ppl=no
--with-isl=no --disable-multilib --enable-fix-cortex-a53-835769
--enable-fix-cortex-a53-843419 --with-arch=armv8-a --enable-threads=posix
--enable-multiarch --enable-libstdcxx-time=yes --enable-gnu-indirect-function
--disable-libsanitizer
--with-build-sysroot=/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/sysroots/aarch64-linux-gnu
--with-sysroot=/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/builds/destdir/x86_64-unknown-linux-gnu/aarch64-linux-gnu/libc
--enable-checking=yes --disable-bootstrap --enable-languages=c,c++,fortran,lto
--build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu
--target=aarch64-linux-gnu
--prefix=/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/builds/destdir/x86_64-unknown-linux-gnu

[Bug tree-optimization/96049] basepoints/gcc-11-1649-gd2adb79eac66 crashes on building 416.gamess with -O3 -flto

2020-07-03 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96049

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkuvyrkov at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #1 from Maxim Kuvyrkov  ---
Richard, would you please take a look?

[Bug tree-optimization/96049] New: basepoints/gcc-11-1649-gd2adb79eac66 crashes on building 416.gamess with -O3 -flto

2020-07-03 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96049

            Bug ID: 96049
           Summary: basepoints/gcc-11-1649-gd2adb79eac66 crashes on
                    building 416.gamess with -O3 -flto
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

Linaro benchmarking CI flagged this commit because the compiler ICEs while
building SPEC2k6's 416.gamess with -O3 -flto -funconstrained-commons
-std=legacy for aarch64-linux-gnu:
1593354814.04: during GIMPLE pass: vect
1593354814.04: ecp.fppized.f: In function 'ecpdra':
1593354814.04: ecp.fppized.f:620: internal compiler error: in
compute_live_loop_exits, at tree-ssa-loop-manip.c:247
1593354814.04:   620 |   SUBROUTINE
ECPDRA(IC4C,NPNP,FP,FQ,COEFI,COEFQ,COEFJ,
1593354814.04:   | 
1593354815.35: 0xde105c compute_live_loop_exits
1593354815.35: 
/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-loop-manip.c:247
1593354815.35: 0xde105c add_exit_phis_var
1593354815.35: 
/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-loop-manip.c:334
1593354815.35: 0xde105c add_exit_phis
1593354815.35: 
/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-loop-manip.c:356
1593354815.35: 0xde105c rewrite_into_loop_closed_ssa_1(bitmap_head*, unsigned
int, int, loop*)
1593354815.35: 
/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/snapshots/gcc.git~master/gcc/tree-ssa-loop-manip.c:678
1593354815.35: 0xf539a7 vectorize_loops()
1593354815.35: 
/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/snapshots/gcc.git~master/gcc/tree-vectorizer.c:1260
1593354815.40: Please submit a full bug report,
1593354815.40: with preprocessed source if appropriate.
1593354815.40: Please include the complete backtrace with any bug report.
1593354815.40: See <https://gcc.gnu.org/bugs/> for instructions.
1593354815.44: lto-wrapper: fatal error:
/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/builds/destdir/x86_64-unknown-linux-gnu/bin/aarch64-linux-gnu-gfortran
returned 1 exit status
1593354815.44: compilation terminated.
1593354815.44:
/home/tcwg-buildslave/workspace/tcwg_bmk_0/abe/builds/destdir/x86_64-unknown-linux-gnu/lib/gcc/aarch64-linux-gnu/11.0.0/../../../../aarch64-linux-gnu/bin/ld:
error: lto-wrapper failed
1593354815.45: collect2: error: ld returned 1 exit status
1593354815.52: specmake: *** [gamess] Error 1

[Bug middle-end/95052] [9/10/11 Regression] Excess padding of partially initialized strings/char arrays

2020-05-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

--- Comment #11 from Maxim Kuvyrkov  ---
Created attachment 48634
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48634&action=edit
Crash testcase (from Linux kernel)

[Bug middle-end/95052] [9/10/11 Regression] Excess padding of partially initialized strings/char arrays

2020-05-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

--- Comment #10 from Maxim Kuvyrkov  ---
To reproduce:

1. Configure GCC for arm-linux-gnueabihf (e.g., for x86_64->armhf cross):
--with-gnu-as --with-gnu-ld --disable-libmudflap --enable-lto --enable-shared
--without-included-gettext --enable-nls --with-system-zlib
--disable-sjlj-exceptions --enable-gnu-unique-object --enable-linker-build-id
--disable-libstdcxx-pch --enable-c99 --enable-clocale=gnu
--enable-libstdcxx-debug --enable-long-long --with-cloog=no --with-ppl=no
--with-isl=no --disable-multilib --with-float=hard --with-fpu=vfpv3-d16
--with-mode=thumb --with-tune=cortex-a9 --with-arch=armv7-a
--enable-threads=posix --enable-multiarch --enable-libstdcxx-time=yes
--enable-gnu-indirect-function --disable-libssp --disable-libquadmath
--disable-threads --without-headers --with-newlib --disable-libmudflap
--disable-bootstrap --disable-decimal-float --disable-libgomp
--disable-libatomic --disable-libsanitizer --disable-plugins --disable-libitm
--disable-shared --with-glibc-version=2.18 --disable-libstdcxx --disable-libvtv
--enable-languages=c,c++ --build=x86_64-unknown-linux-gnu
--host=x86_64-unknown-linux-gnu --target=arm-linux-gnueabihf

2. Try compiling attached pre-processed file:
.../cc1 -fpreprocessed libata-eh.i -quiet -dumpdir drivers/ata/ -dumpbase
libata-eh.c -dumpbase-ext .c -mlittle-endian -mabi=aapcs-linux -mfpu=vfp -marm
-mfloat-abi=soft -mtune=cortex-a9 -mtls-dialect=gnu -march=armv7-a -O2 -Wall
-Wundef -Werror=strict-prototypes -Wno-trigraphs
-Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security
-Wno-frame-address -Wformat-truncation=0 -Wformat-overflow=0
-Wno-address-of-packed-member -Wframe-larger-than=1024
-Wno-unused-but-set-variable -Wimplicit-fallthrough=3 -Wunused-const-variable=0
-Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-stringop-truncation
-Wno-zero-length-bounds -Wno-array-bounds -Wstringop-overflow=0 -Wno-restrict
-Wno-maybe-uninitialized -Werror=date-time -Werror=incompatible-pointer-types
-Werror=designated-init -Wno-packed-not-aligned -std=gnu90 -version
-fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -fno-dwarf2-cfi-asm
-fno-ipa-sra -funwind-tables -fno-delete-null-pointer-checks
-fno-allow-store-data-races -fstack-protector-strong -fomit-frame-pointer
-fno-var-tracking-assignments -fno-strict-overflow -fno-merge-all-constants
-fmerge-constants -fstack-check=no -fconserve-stack -fmacro-prefix-map=./=  -o
libata-eh.s

3. It should crash with:
...
during RTL pass: expand
drivers/ata/libata-eh.c: In function ‘ata_eh_link_report’:
drivers/ata/libata-eh.c:2293:8: internal compiler error: in store_expr, at
expr.c:5845
0x8e8a56 store_expr(tree_node*, rtx_def*, int, bool, bool)

[Bug middle-end/95052] [9/10/11 Regression] Excess padding of partially initialized strings/char arrays

2020-05-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95052

Maxim Kuvyrkov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mkuvyrkov at gcc dot gnu.org

--- Comment #9 from Maxim Kuvyrkov  ---
Hi Jakub,

The above patch causes GCC to crash when building the Linux kernel for
arm-linux-gnueabihf:
# 00:02:09 drivers/char/tpm/eventlog/tpm1.c:148:7: internal compiler error: in
store_expr, at expr.c:5845
# 00:02:09 make[3]: *** [drivers/char/tpm/eventlog/tpm1.o] Error 1
# 00:02:14 make[2]: *** [drivers/char/tpm] Error 2
# 00:02:14 make[1]: *** [drivers/char] Error 2
# 00:02:58 drivers/ata/libata-eh.c:2293:8: internal compiler error: in
store_expr, at expr.c:5845
# 00:02:58 make[2]: *** [drivers/ata/libata-eh.o] Error 1
# 00:03:22 make[1]: *** [drivers/ata] Error 2
# 00:05:35 make: *** [drivers] Error 2

I've started a reproduction build to make a testcase.

[Bug rtl-optimization/90378] [9/10 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm

2020-02-17 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

--- Comment #3 from Maxim Kuvyrkov  ---
The miscompilation went away for a while, but has reappeared just now after

commit 2a07345c4f8dabc286fc470e76c53473e5bc3eb7
Author: Vladimir N. Makarov 
Date:   Fri Jan 31 14:26:26 2020 -0500

   Fix for PR 91333 - suboptimal register allocation for inline asm

   2020-01-31  Vladimir Makarov  

   PR rtl-optimization/91333
   * ira-color.c (bucket_allocno_compare_func): Move conflict hard
   reg preferences comparison up.

   2020-01-31  Vladimir Makarov  

   PR rtl-optimization/91333
   * gcc.target/i386/pr91333.c: New.


The miscompilation occurs in the same configuration: arm-linux-gnueabihf at -O2
-flto.

I'll try to narrow this down to a single object.

[Bug tree-optimization/90576] [10 regression] SPEC CPU2006 450.soplex miscompiled with -Os -flto after r271413

2020-01-14 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90576

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Maxim Kuvyrkov  ---
Confirmed fixed on both aarch64-linux-gnu and arm-linux-gnueabihf for -Os
-flto.  Thanks!

[Bug tree-optimization/90576] [10 regression] SPEC CPU2006 450.soplex miscompiled with -Os -flto after r271413

2020-01-09 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90576

--- Comment #7 from Maxim Kuvyrkov  ---
Apologies for the delay.  I've kicked off SPEC2k6 builds and will report results
tomorrow.

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-10 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #28 from Maxim Kuvyrkov  ---
(In reply to Ilya Leoshkevich from comment #27)
> With
> 
> -DSPEC_CPU -DNDEBUG -DPERL_CORE   -O3 -save-temps=obj
> -fopt-info-vec-optimized   -DSPEC_CPU_LP64 -DSPEC_CPU_LINUX_X64
> -fgnu89-inline
> 
> on gcc113 I can see 2% slowdown:
> 
> r277511 (without this fix): 880.09s
> r277515 (with this fix):    897.85s
> 
> The function that degraded the most is indeed S_regmatch:
> 
> $ perf diff perf-9760321.data perf-44b2b4c.data
>     32.24%            exe            [.] S_regmatch
>      8.92%            exe            [.] S_find_byclass.isra.0
>      6.80%   +0.28%   libc-2.19.so   [.] 0x0007dec0
>      5.20%            exe            [.] S_regtry
> 
> However, the "shape" of S_regmatch did not change, that is, when all
> offsets and register numbers are replaced with "x" in the objdump
> output, the old and the new versions are identical.  This hints at some
> microarchitectural effect - aliasing in the branch predictor maybe?
> 
> From my perspective, this happens too often, so I use the following test
> to rule this out: just add a nop at the beginning of the problematic
> function. This changes all the offsets and makes aliasing situation
> completely different.  And indeed, by adding a single nop to S_regmatch,
> I get wildly different results (for now this is just 1 repeat, I will
> run best-of-3 overnight):
> 
> r277511 (without this fix): 929.1s
> r277515 (with this fix):    931.48s

Hi Ilya,

Thanks for the analysis.  It doesn't seem like we can do anything useful about
this regression.

[For completeness, I see the same 5% slowdown with "-O3 -funroll-loops" as with
plain -O3 on Cortex-A57.]

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-09 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

--- Comment #25 from Maxim Kuvyrkov  ---
(In reply to Ilya Leoshkevich from comment #24)
> I got the following results on gcc113:
> 
> 400.perlbench
> 
> Compiler flags: -DSPEC_CPU -DNDEBUG -DPERL_CORE   -march=native -g -O3
> -funroll-loops -fopt-info-vec-optimized   -DSPEC_CPU -DNDEBUG
> -DPERL_CORE -DSPEC_CPU_LINUX -DSPEC_CPU_BIGENDIAN -D_GNU_SOURCE
> -DSPEC_CPU_LP64 -fno-strict-aliasing -std=gnu90
> 
> r277511 (without this fix): 884.11s
> r277515 (with this fix):    874.93s
> 
> Maxim, could you please share compiler flags with which you are seeing the
> regression?

Hi Ilya,

Thank you for looking into this.

The flags were "-O3 -save-temps=obj -c -o av.o -DSPEC_CPU -DNDEBUG -DPERL_CORE
-DSPEC_CPU_LP64 -DSPEC_CPU_LINUX_X64 -fgnu89-inline" .  From
"-DSPEC_CPU_BIGENDIAN" I'm guessing a Power architecture, and I've confirmed
the regression on AArch64 Cortex-A57.

I'll start a run with "-funroll-loops -fopt-info-vec-optimized" to check
whether they make the problem go away.

[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)

2019-12-06 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #21 from Maxim Kuvyrkov  ---
(In reply to iii from comment #18)
> Author: iii
> Date: Mon Oct 28 13:09:54 2019
> New Revision: 277515
> 
> URL: https://gcc.gnu.org/viewcvs?rev=277515&root=gcc&view=rev
> Log:
> Move jump threading before reload

Hi Ilya,

This patch regresses performance of SPEC CPU2006's 400.perlbench on
aarch64-linux-gnu -O3 by 5% with most of the slowdown in the hottest function
S_regmatch (this is for gcc-9-branch).

benchmark,symbol,rel_sample,rel_size,results-0:sample,results-1:sample,results-0:size,results-1:size
400.perlbench,perlbench_base.default, 105,100,9281,9761,1281408,1285488
400.perlbench,[.] S_regmatch, 107,100,3641,3910,16460,16460
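
For readers unfamiliar with this report format, here is a minimal sketch of how
the rel_sample and rel_size columns appear to relate to the raw numbers.  It
assumes (this is a guess, not something the report states) that each relative
column is the second result as a rounded percentage of the first; the helper
name relative_percent is made up for illustration.

```python
import csv
import io

# The header and the two data rows quoted above.
REPORT = """\
benchmark,symbol,rel_sample,rel_size,results-0:sample,results-1:sample,results-0:size,results-1:size
400.perlbench,perlbench_base.default, 105,100,9281,9761,1281408,1285488
400.perlbench,[.] S_regmatch, 107,100,3641,3910,16460,16460
"""


def relative_percent(before, after):
    # Assumption: rel_* is the new value as a rounded percentage of the old one.
    return round(100 * after / before)


for row in csv.DictReader(io.StringIO(REPORT)):
    rel_sample = relative_percent(int(row["results-0:sample"]),
                                  int(row["results-1:sample"]))
    rel_size = relative_percent(int(row["results-0:size"]),
                                int(row["results-1:size"]))
    print(row["symbol"], rel_sample, rel_size)
```

Under that assumption the sample counts reproduce the 105 and 107 in the
report, while code size is unchanged to within a percent.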

Is there a less intrusive way to fix the problem on gcc-9-branch?

[Bug libfortran/78314] [aarch64] ieee_support_halting does not report unsupported fpu traps correctly

2019-11-19 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314

--- Comment #29 from Maxim Kuvyrkov  ---
(In reply to Richard Earnshaw from comment #28)
> The last release of gcc-7 has now been made, so it's end-of-life and no
> further fixes for it will be made.

Well, yes, but I'm about to build the final Linaro GCC 7.5 cross-toolchain
releases, and I could fix the problem there.  E.g., by backporting a fix from
GCC-8 or reverting Uros's patch.  Backporting from gcc-8 is preferred to
reverting, obviously.

[Bug libfortran/78314] [aarch64] ieee_support_halting does not report unsupported fpu traps correctly

2019-11-15 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #27 from Maxim Kuvyrkov  ---
Hi,

GCC 7.5 was released with this bug, which manifests as:
=== gfortran Summary ===

FAIL: gfortran.dg/ieee/ieee_6.f90   -O0  execution test
FAIL: gfortran.dg/ieee/ieee_6.f90   -O1  execution test
FAIL: gfortran.dg/ieee/ieee_6.f90   -O2  execution test
FAIL: gfortran.dg/ieee/ieee_6.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/ieee/ieee_6.f90   -O3 -g  execution test
FAIL: gfortran.dg/ieee/ieee_6.f90   -Os  execution test
XPASS: gfortran.dg/ieee/ieee_8.f90   -O0  execution test
XPASS: gfortran.dg/ieee/ieee_8.f90   -O1  execution test
XPASS: gfortran.dg/ieee/ieee_8.f90   -O2  execution test
XPASS: gfortran.dg/ieee/ieee_8.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
XPASS: gfortran.dg/ieee/ieee_8.f90   -O3 -g  execution test
XPASS: gfortran.dg/ieee/ieee_8.f90   -Os  execution test

Do we care enough about this to fix on GCC-7?

[Bug tree-optimization/83661] sincos does not handle sin(2x)

2019-09-02 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83661

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #7 from Maxim Kuvyrkov  ---
Hi Christophe,

Re. synchronization of optimized-routines and the popular C libraries -- this
is done on regular cycles by Szabolcs (Arm).

FYI, Linaro and Arm are merging cortex-strings into optimized-routines, so soon
there will be a single repo for Arm implementations of C library functions. 
This and more will be discussed at SAN19 Connect, and there is even a
main-track session on optimized-routines.

[Bug target/91598] [8/9/10 regression] 60% speed drop on neon intrinsic loop

2019-08-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

--- Comment #2 from Maxim Kuvyrkov  ---
Created attachment 46784
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46784&action=edit
Patch for 70% of the regression

[Bug target/91598] [8/9/10 regression] 60% speed drop on neon intrinsic loop

2019-08-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

--- Comment #1 from Maxim Kuvyrkov  ---
Created attachment 46783
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46783&action=edit
Testcase

Testcase reported on
https://lists.linaro.org/pipermail/linaro-toolchain/2019-August/006983.html

[Bug target/91598] New: [8/9/10 regression] 60% speed drop on neon intrinsic loop

2019-08-29 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598

Bug ID: 91598
   Summary: [8/9/10 regression] 60% speed drop on neon intrinsic
loop
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

Performance of the attached neon loop drops on Cortex-A53 by about 60% between
GCC 7 and GCC 8.  Performance of trunk is the same as GCC 8.

Two separate changes, both related to the instruction scheduler, cause the
regression.  The first change, r253235, is responsible for 70% of the
regression.
===
haifa-sched: fix autopref_rank_for_schedule qsort comparator

* haifa-sched.c (autopref_rank_for_schedule): Order 'irrelevant'
insns
first, always call autopref_rank_data otherwise.



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@253235
138bc75d-0d04-0410-961f-82ee72b054a4
===

After this change instead of
r1 = [rb + 0]
r2 = [rb + 8]
r3 = [rb + 16]
r4 = 
r5 = 
r6 = 

we got
r1 = [rb + 0]

r2 = [rb + 8]

r3 = [rb + 16]


which, apparently, the Cortex-A53 autoprefetcher doesn't recognize.  This schedule
happens because the "r2 =" load gets lower priority than the "irrelevant"
instruction due to the above patch.

If we think about it, the fact that "r1 = [rb + 0]" can be scheduled means that
true dependencies of all similar base+offset loads are resolved.  Therefore,
for autoprefetcher-friendly schedule we should prioritize memory reads before
"irrelevant" instructions.

On the other hand, following similar logic, we want to delay memory stores as
much as possible to start scheduling them only after all potential producers
are scheduled.  I.e., for autoprefetcher-friendly schedule we should prioritize
"irrelevant" instructions before memory writes.

An obvious patch implementing the above is attached.  It recovers 70% of the
regressed performance on this testcase.
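
To illustrate the ordering argued for above, here is a minimal sketch in
Python.  It is not the attached patch and does not use haifa-sched.c's data
structures: the Insn tuple, the kind names, and the rank() helper are all
hypothetical, and dependencies between instructions are ignored (the real
scheduler only chooses among instructions that are already ready).

```python
from collections import namedtuple

# Hypothetical model of a ready-list entry; not GCC's rtx_insn.
Insn = namedtuple("Insn", ["text", "kind", "offset"])

# Lower rank is scheduled earlier: loads, then "irrelevant" insns, then stores.
KIND_RANK = {"load": 0, "irrelevant": 1, "store": 2}


def rank(insn):
    # Within loads and stores, keep ascending offsets so the access
    # pattern stays sequential; irrelevant insns carry no offset.
    offset = insn.offset if insn.offset is not None else 0
    return (KIND_RANK[insn.kind], offset)


def autopref_order(ready):
    # sorted() is stable, so insns with equal rank keep their input order.
    return sorted(ready, key=rank)


ready = [
    Insn("r1 = [rb + 0]", "load", 0),
    Insn("r4 = op(r1)", "irrelevant", None),
    Insn("r2 = [rb + 8]", "load", 8),
    Insn("[rb + 0] = r4", "store", 0),
    Insn("r3 = [rb + 16]", "load", 16),
]

for insn in autopref_order(ready):
    print(insn.text)
```

With this ranking the three base+offset loads are emitted back to back, the
"irrelevant" instruction follows, and the store comes last.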

The second part of the regression is due to the compiler previously getting
lucky when scheduling the inline asms that represent the intrinsics.  After
===
Set default sched pressure algorithm

The Arm backend sets the default sched-pressure algorithm to
SCHED_PRESSURE_MODEL.
Benchmarking on AArch64 shows this speeds up floating point performance on
SPEC -
eg. CactusBSSN improves by ~16%.  The gains are mostly due to less
spilling,
so enable this on AArch64 by default.

gcc/
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Set PARAM_SCHED_PRESSURE_ALGORITHM to SCHED_PRESSURE_MODEL.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@254378
138bc75d-0d04-0410-961f-82ee72b054a4
===
the compiler no longer gets lucky on this testcase.

The solution here is to convert intrinsics in arm-neon.h to builtins/UNSPECs
and attach scheduler descriptions to the UNSPECs.

[Bug lto/90957] New: 453.povray is miscompiled with -flto -static

2019-06-21 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90957

Bug ID: 90957
   Summary: 453.povray is miscompiled with -flto -static
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 46509
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46509&action=edit
Script to reproduce

453.povray is miscompiled with -static -flto on arm-linux-gnueabihf.  The
failure is confirmed on trunk and gcc-8-branch; I didn't check other releases.

Failure seems to occur only when both -static -flto are used together; -Ox
flags don't affect the outcome.

I've reduced the problem to a single object, povms.o.  The failure occurs when
povms.o is compiled with "-flto -static" and the rest of the benchmark is compiled
with "-static".  Compiling povms.o with "-static" makes the failure go away.

Attached is a script to reproduce: it uses Arm's pre-built cross-toolchain as
the base install and replaces the compiler with a freshly built one.  You need to
supply your own sources of 453.povray and an aarch32 machine.  Tested on armv7,
but armv8 should do as well.

# Produce good binary
./povray-build.sh true

# Produce bad binary
./povray-build.sh false

[Bug tree-optimization/90576] New: [10 regression] SPEC CPU2006 450.soplex miscompiled with -Os -flto after r271413

2019-05-22 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90576

Bug ID: 90576
   Summary: [10 regression] SPEC CPU2006 450.soplex miscompiled
with -Os -flto after r271413
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

After
===
commit ce7b4f267706c23405705d848c1dcf686496f262
Author: hubicka 
Date:   Mon May 20 12:01:40 2019 +

   * tree-ssa-alias.c (compare_sizes): New function.
   (sompare_type_sizes): New function
   (aliasing_component_refs_p): Use it.
   (indirect_ref_may_alias_decl_p): Likewise.


   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@271413
138bc75d-0d04-0410-961f-82ee72b054a4
===
GCC miscompiles 450.soplex with -Os -flto at least on AArch64 and AArch32.  The
benchmark finishes within seconds with
===
450.soplex: copy 0 non-zero return code (exit code=11, signal=0)
===

FWIW, "-Os -fno-lto" seems to work.

Considering that both AArch64 and AArch32 are affected, and given the nature of
the patch, this likely affects other architectures.

Honza, would you please investigate?  Please let me know if it doesn't readily
reproduce for you, and I'll help with a testcase.

[Bug rtl-optimization/90378] New: [9 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm

2019-05-07 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

Bug ID: 90378
   Summary: [9 regression] -Os -flto miscompiles 454.calculix
after r266385 on Arm
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

I'm investigating a miscompilation of 454.calculix, which occurs only with LTO
on ARMv7.  The miscompilation results in a SIGSEGV and seems to occur only on ARMv7
hardware (cortex-a15); the same binary runs fine on ARMv8 hardware.

Miscompilation occurs at and after
===
commit 8fc3599dcab36e0b905fa442c0fc0a905280eea2 (HEAD)
Author: vmakarov 
Date:   Thu Nov 22 17:25:57 2018 +

2018-11-22  Vladimir Makarov  

PR rtl-optimization/87718
* ira-costs.c: Remove trailing white-spaces.
(record_operand_costs): Add a special treatment for moves
involving a hard register.

2018-11-22  Vladimir Makarov  

PR rtl-optimization/87718
* gcc.target/i386/pr82361-1.c: Check only the first operand of
moves.



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@266385
138bc75d-0d04-0410-961f-82ee72b054a4
===
but I do not have hard evidence that this patch is the root cause.  It may be
triggering a latent bug or even exposing undefined behavior in the benchmark.

I'm struggling to reduce the 300-object LTO compilation down to something
manageable.  Advice and ideas on where to look are appreciated.

[Bug bootstrap/89656] [9 Regression] profiledbootstrap failure on aarch64-linux since r269453

2019-03-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89656

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #2 from Maxim Kuvyrkov  ---
Hi Jakub,

The above patch fixes the ICE when building the Linux kernel for arm
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89655)

[Bug middle-end/89655] GCC crashes building linux kernel for arm 32-bit (culprit r269453)

2019-03-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89655

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #6 from Maxim Kuvyrkov  ---
Jakub's patch in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89656 fixes the ICE
when building the Linux kernel for arm.

[Bug middle-end/89655] GCC crashes building linux kernel for arm 32-bit (culprit r269453)

2019-03-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89655

--- Comment #3 from Maxim Kuvyrkov  ---
ICE backtrace:
*** WARNING *** there are active plugins, do not report this as a bug unless
you can reproduce it without enabling any plugins.
Event| Plugins
PLUGIN_ATTRIBUTES| latent_entropy_plugin
PLUGIN_START_UNIT| latent_entropy_plugin
during GIMPLE pass: vrp
net/6lowpan/nhc.c: In function ‘lowpan_nhc_do_uncompression’:
net/6lowpan/nhc.c:157:5: internal compiler error: in min_value, at
wide-int.cc:332
0xf4e9ee wi::min_value(unsigned int, signop)
   
/home/maxim.kuvyrkov/tcwg_kernel-gnu/abe/snapshots/gcc.git~master/gcc/wide-int.cc:332
0xe33c2c set_range_info(tree_node*, value_range_kind,
generic_wide_int > const&,
generic_wide_int > const&)
   
/home/maxim.kuvyrkov/tcwg_kernel-gnu/abe/snapshots/gcc.git~master/gcc/tree-ssanames.c:384
0xe34056 set_range_info(tree_node*, value_range_base const&)
   
/home/maxim.kuvyrkov/tcwg_kernel-gnu/abe/snapshots/gcc.git~master/gcc/tree-ssanames.c:408
0xeacb5a vrp_prop::vrp_finalize(bool)
   
/home/maxim.kuvyrkov/tcwg_kernel-gnu/abe/snapshots/gcc.git~master/gcc/tree-vrp.c:6692
0xebda8e execute_vrp
   
/home/maxim.kuvyrkov/tcwg_kernel-gnu/abe/snapshots/gcc.git~master/gcc/tree-vrp.c:6780
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug middle-end/89655] GCC crashes building linux kernel for arm 32-bit (culprit r269453)

2019-03-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89655

Maxim Kuvyrkov  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot 
gnu.org

--- Comment #1 from Maxim Kuvyrkov  ---
Hi Richi,

Would you please investigate?  Ping me on IRC (maximk) if you have problems
with reproducing the ICE.

[Bug middle-end/89655] New: GCC crashes building linux kernel for arm 32-bit (culprit r269453)

2019-03-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89655

Bug ID: 89655
   Summary: GCC crashes building linux kernel for arm 32-bit
(culprit r269453)
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

Created attachment 45933
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45933&action=edit
Reproducer tarball

After r269453 GCC ICEs when building linux kernel for AArch32 in allmodconfig
and allyesconfig kernel configurations.

I've reduced the crash down to a single preprocessed file, but could not
eliminate all kernel plugins -- one plugin is still left.

The attached tarball has preprocessed source, binary plugin and GCC's
configuration string.

Culprit:

commit 791a496442cb02f7ab6b50e291e1f0669e09e99d
Author: rguenth 
Date:   Thu Mar 7 12:46:44 2019 +

   2019-03-07  Richard Biener  

PR tree-optimization/89595
* tree-ssa-dom.c (dom_opt_dom_walker::optimize_stmt): Take
stmt iterator as reference, take boolean output parameter to
indicate whether the stmt was removed and thus the iterator
already advanced.
(dom_opt_dom_walker::before_dom_children): Re-iterate over
stmts created by folding.

* gcc.dg/torture/pr89595.c: New testcase.


   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@269453
138bc75d-0d04-0410-961f-82ee72b054a4


Reproduction instructions from Linaro TCWG's automated bisections (just in
case):
Reproduce builds:

mkdir investigate-gcc-791a496442cb02f7ab6b50e291e1f0669e09e99d
cd investigate-gcc-791a496442cb02f7ab6b50e291e1f0669e09e99d

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh
https://ci.linaro.org/job/tcwg_kernel-bisect-gnu-master-arm-lts-allyesconfig/11/artifact/artifacts/manifests/build-baseline.sh
curl -o artifacts/manifests/build-parameters.sh
https://ci.linaro.org/job/tcwg_kernel-bisect-gnu-master-arm-lts-allyesconfig/11/artifact/artifacts/manifests/build-parameters.sh
curl -o artifacts/test.sh
https://ci.linaro.org/job/tcwg_kernel-bisect-gnu-master-arm-lts-allyesconfig/11/artifact/artifacts/test.sh
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_kernel-build.sh @@ artifacts/manifests/build-baseline.sh

cd gcc

# Reproduce first_bad build
git checkout --detach 791a496442cb02f7ab6b50e291e1f0669e09e99d
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 02a7fc594d3a0b1f2b2fc8d5fd8ea425ff45d418
../artifacts/test.sh

cd ..


[Bug gcov-profile/85871] g++.dg/gcov/gcov-8.C random failures

2018-05-30 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85871

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #5 from Maxim Kuvyrkov  ---
(In reply to Christophe Lyon from comment #0)
> Hi,
> 
> It seems g++.dg/gcov/gcov-8.C is failing randomly as one can see by looking
> at reports on https://gcc.gnu.org/ml/gcc-testresults/
...
> FAIL: g++.dg/gcov/gcov-8.C  -std=gnu++98  gcov failed: File
> '/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_32-build/snapshots/
> gcc.git~master_rev_e4ddfbc4c1d49271c7c17e574c522a53f9152522/gcc/testsuite/
> g++.dg/gcov/gcov-8.C'

Is this an unfortunate copy-paste, or does the "gcov failed: File ..." part appear
on the same line as "FAIL: g++.dg/gcov/gcov-8.C  -std=gnu++98"?

If it is the latter, then the problem is likely in the testing harness mixing up
file descriptors and sending the first line of output to the wrong file.  This
causes the failure, since the pattern for the first line doesn't match.

[Bug target/85434] Address of stack protector guard spilled to stack on ARM

2018-05-01 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85434

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #10 from Maxim Kuvyrkov  ---
Patch posted: https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01264.html

[Bug target/69770] [ARM] -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

2018-02-21 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69770

--- Comment #4 from Maxim Kuvyrkov  ---
Update from Charles:
===
When I tried to bootstrap the compiler with the patch, it failed because of
issues with -fPIC. I'm not sure it's worth pursuing the amount of work required
to resolve the -fPIC issue.
===

[Bug target/69770] [ARM] -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

2018-02-21 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69770

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
 CC||mkuvyrkov at gcc dot gnu.org
   Assignee|cbaylis at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

[Bug target/43725] Poor instructions selection, scheduling and registers allocation for ARM NEON intrinsics

2017-09-28 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43725

Maxim Kuvyrkov  changed:

   What|Removed |Added

 Status|ASSIGNED|UNCONFIRMED
 Ever confirmed|1   |0

[Bug target/65139] Improve register allocation for aarch64_*_sisd_or_int3 patterns

2017-06-06 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65139

--- Comment #4 from Maxim Kuvyrkov  ---
Kugan posted a patch for this, but it was rejected.

Kugan, would you please put a link to your patch in the comments and unassign
yourself (unless you plan to work on this further).

Thanks.

[Bug target/77728] [5 Regression] Miscompilation multiple vector iteration on ARM

2017-05-10 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77728

--- Comment #60 from Maxim Kuvyrkov  ---
(In reply to Jakub Jelinek from comment #59)
> And another thing was the bug
> mentioned here, introduced for arm32 in 5.2 and for aarch64 only during
> development of GCC 7.  So there was no release for aarch64 with that bug.

Thanks Jakub, I missed the last part about aarch64 being affected only in
development versions of GCC 7.

[Bug target/77728] [5 Regression] Miscompilation multiple vector iteration on ARM

2017-05-10 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77728

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #58 from Maxim Kuvyrkov  ---
Shouldn't the release note [*] also specify AArch64 as the affected target, not
just ARM/AArch32?

[*] https://gcc.gnu.org/gcc-7/changes.html

[Bug rtl-optimization/79149] bad optimization on MIPS and ARM leading to excessive stack usage in some cases

2017-01-20 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149

--- Comment #6 from Maxim Kuvyrkov  ---
Without looking at the code (it's 11pm), my guess is that the first scheduling
pass is misbehaving in some way; most likely it is doing a lot of interblock moves.
One of the big differences between x86 and ARM/MIPS scheduling is that x86
disables interblock scheduling.  Does -fno-schedule-insns fix the warnings on
ARM/MIPS?

[Bug rtl-optimization/79059] Information from CCmode is not propagated across basic blocks

2017-01-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79059

--- Comment #3 from Maxim Kuvyrkov  ---
(In reply to Richard Biener from comment #2)
> Somewhat related to PR78200 (x86 has the CCmodes nicely combined but the
> branches are the wrong order).

Indeed it is related.  It is [expectedly] mcf's primal_bea_mpp where this is
triggered.

[Bug rtl-optimization/79059] Information from CCmode is not propagated across basic blocks

2017-01-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79059

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||bernds at gcc dot gnu.org,
   ||ramana.radhakrishnan@gmail.
   ||com, rearnsha at arm dot com

--- Comment #1 from Maxim Kuvyrkov  ---
Adding people who might be interested.

[Bug rtl-optimization/79059] New: Information from CCmode is not propagated across basic blocks

2017-01-11 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79059

Bug ID: 79059
   Summary: Information from CCmode is not propagated across basic
blocks
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mkuvyrkov at gcc dot gnu.org
  Target Milestone: ---

This bug report is motivated by a performance regression [1] in 429.mcf on
AArch64, but is relevant to all targets that use CCmode and rely on combine
optimization.  The description below assumes the AArch64 ISA.

Sample code:
===
BB1:
...
add w1, w1, w2
cmp w1, #0
b.nz BB1

BB2:
ccmp ..., eq  // Some instruction that needs only NZ bits of CC register
cmp w2, w3    // Set CC to something new
b.eq BB1
===

The high-level issue is that the "add" and "cmp" instructions in BB1 can't be
combined into "adds" because register liveness info at the top of BB2 advertises
that it needs the "CC" register [2].  While BB2 really needs only part of the CC
register to be valid (the NZ flags), liveness info cannot express that.  Therefore
liveness info marks all of CC as used, thus preventing the combine optimization.

I've considered several ways to improve on the situation, but none of them seem
particularly appealing.  I would appreciate improvements and suggestions on
these or other approaches.

#1 Make register liveness info include mode information.

The current state can be viewed as all registers listing their widest mode.  We
can [incrementally] set more precise modes on registers (e.g., CC_REGNUM) when
cases like the above present themselves.  This would be a substantial overall
project, with several milestones each of which is worthy in itself.  I.e.,
phase_1: add mode field, set it conservatively, and verify it is propagated
correctly through dataflow;
phase_2: improve handling of CC modes for the above motivating example;
phase_3: improve handling of modes for non-CC registers when examples present
themselves.

The main advantage of this approach is that it will benefit many architectures
and will improve liveness information for all registers, not just CC_REGNUM. 
The main disadvantage -- it is a big project.
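
A toy model of what approach #1 would buy for the motivating example, written
in Python.  Everything here is hypothetical: GCC's dataflow tracks CC_REGNUM as
a single unit, and the flag sets and the can_combine() helper exist only to
show the decision that finer-grained liveness would enable.

```python
# Hypothetical flag-level liveness model; GCC tracks CC_REGNUM as one unit.
ALL_FLAGS = frozenset("NZCV")

# Flags on which "adds" agrees with the "add; cmp #0" pair it would replace.
ADDS_MATCHES_CMP_ON = frozenset("NZ")


def can_combine(live_flags_after):
    # Combining is safe only if every flag that is still live afterwards
    # is one that "adds" sets identically to "cmp".
    return live_flags_after <= ADDS_MATCHES_CMP_ON


# Today: liveness at the top of BB2 says "all of CC is live".
print(can_combine(ALL_FLAGS))        # False
# With mode-aware liveness: BB2 reads only N and Z before redefining CC.
print(can_combine(frozenset("NZ")))  # True
```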

#2 Split CC_REGNUM into separate registers: CC_NZ_REGNUM, CC_CV_REGNUM.

This would require substantial rework of the aarch64 backend.  All patterns need
to be audited, and some patterns will need to be duplicated.  It might be possible
to reduce pattern duplication by inventing additional iterators in MD files, or
otherwise automating the conversion.

This work needs to be done entirely in aarch64 backend, which, IMO, is bad
since other targets do not benefit.

#3 

Suggestions and comments are welcome.

[1] The regression occurred after a legitimate patch (IIRC, rev. 232442 by
Kyrill Tkachov) made GCC generate "ccmp" instruction in BB2 instead of starting
BB2 with "cmp w1, #0".

[2] "adds" instruction sets NZ flags just like "cmp" instruction would, but CV
flags are set differently.  Therefore "cmp" can be substituted with "adds" only
when CV flags are unused.
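
The claim in [2] can be checked with a small flag model.  The sketch below, in
Python, models 32-bit flag-setting addition the way the architecture describes
it (cmp with an immediate is a flag-setting subtract, computed as an add of the
inverted operand with carry-in 1).  The function names are made up for
illustration and this is a model, not compiler code.

```python
MASK = 0xFFFFFFFF


def add_with_carry(x, y, carry_in):
    """32-bit add with carry-in; returns (result, (N, Z, C, V))."""
    unsigned_sum = x + y + carry_in
    result = unsigned_sum & MASK
    n = result >> 31
    z = int(result == 0)
    c = int(unsigned_sum > MASK)
    # Signed overflow: both operands have a sign that the result lacks.
    v = int(((x ^ result) & (y ^ result) & 0x80000000) != 0)
    return result, (n, z, c, v)


def adds(w1, w2):
    # adds w1, w1, w2
    return add_with_carry(w1, w2, 0)


def cmp_imm0(w1):
    # cmp w1, #0 is a flag-setting subtract: w1 + NOT(0) + 1.
    return add_with_carry(w1, MASK, 1)[1]


def add_then_cmp(w1, w2):
    # add w1, w1, w2 ; cmp w1, #0
    result = (w1 + w2) & MASK
    return result, cmp_imm0(result)


for a, b in [(1, 2), (0x7FFFFFFF, 1), (0xFFFFFFFF, 1), (0x80000000, 0x80000000)]:
    _, fused = adds(a, b)
    _, split = add_then_cmp(a, b)
    print(hex(a), hex(b), "adds NZCV:", fused, "add+cmp NZCV:", split)
```

In this model "cmp w1, #0" always produces C=1 and V=0, whereas "adds" sets C
and V from the addition itself, so only N and Z are guaranteed to agree.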

[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication

2016-12-12 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438

--- Comment #12 from Maxim Kuvyrkov  ---
(In reply to Andrew Pinski from comment #11)
> (In reply to Maxim Kuvyrkov from comment #9)
> > which then becomes for aarch64:
> > .L4:
> > ld2 {v0.2d - v1.2d}, [x1]
> > add w2, w2, 1
> > cmp w2, w7
> > eor v0.16b, v2.16b, v0.16b
> > umov x4, v0.d[1]
> > st1 {v0.d}[0], [x1]
> > add x1, x1, 32
> > str x4, [x1, -16]
> > bcc .L4
> 
> 
> What I did for thunderx was create a vector cost model which caused this
> loop not to be vectorized, to keep the regression from happening.  Note this
> might actually be better code for some micro arch. I need to check with the
> new processor we have in house but that is next week or so.  I don't know how
> much I can share next week though.

You are making an orthogonal point to this bug report: whether or not to
vectorize such a loop.  But if the loop is vectorized, then on any
microarchitecture it is better to have "st2" vs "umov; st1; str".

[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication

2016-12-12 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438

--- Comment #10 from Maxim Kuvyrkov  ---
(In reply to Maxim Kuvyrkov from comment #9)
> which then becomes for aarch64:
> .L4:
>   ld2 {v0.2d - v1.2d}, [x1]
>   add w2, w2, 1
>   cmp w2, w7
>   eor v0.16b, v2.16b, v0.16b
>   umov x4, v0.d[1]
>   st1 {v0.d}[0], [x1]
>   add x1, x1, 32
>   str x4, [x1, -16]
>   bcc .L4

IIUC,
umov x4, v0.d[1]
st1 {v0.d}[0], [x1]
str x4, [x1, -16]
could become just
st2 {v0.2d - v1.2d}, [x1]
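
To make the equivalence concrete, here is a minimal scalar sketch in C of one
loop iteration.  It treats the memory at [x1] as four 64-bit words and compares
the current store sequence with a single interleaved store.  All function names
are hypothetical, and the model only looks at single-threaded memory contents;
whether rewriting the gap words is acceptable in general is a separate
question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of "ld2 {v0.2d - v1.2d}, [x1]": de-interleave four words.  */
static void ld2_2d(const uint64_t m[4], uint64_t v0[2], uint64_t v1[2])
{
  v0[0] = m[0]; v1[0] = m[1];
  v0[1] = m[2]; v1[1] = m[3];
}

/* Model of "st2 {v0.2d - v1.2d}, [x1]": re-interleave four words.  */
static void st2_2d(uint64_t m[4], const uint64_t v0[2], const uint64_t v1[2])
{
  m[0] = v0[0]; m[1] = v1[0];
  m[2] = v0[1]; m[3] = v1[1];
}

/* Model of "umov; st1; str": store only the two lanes of v0,
   skipping the gap words in between.  */
static void store_with_gaps(uint64_t m[4], const uint64_t v0[2])
{
  m[0] = v0[0];
  m[2] = v0[1];
}

/* Run one iteration both ways; return 1 if memory ends up identical.  */
static int same_result(const uint64_t init[4], uint64_t mask)
{
  uint64_t a[4], b[4], v0[2], v1[2];

  memcpy(a, init, sizeof a);
  memcpy(b, init, sizeof b);

  ld2_2d(init, v0, v1);
  v0[0] ^= mask;               /* the "eor" on the state lanes */
  v0[1] ^= mask;

  store_with_gaps(a, v0);      /* current code */
  st2_2d(b, v0, v1);           /* proposed code: gap words written back unchanged */

  return memcmp(a, b, sizeof a) == 0;
}
```

Because v1 still holds exactly what was loaded from the gap words, the
interleaved store writes those words back unchanged, so same_result() returns 1
for any input.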

[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication

2016-12-12 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438

Maxim Kuvyrkov  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #9 from Maxim Kuvyrkov  ---
I've looked into another case where the inability to handle stores with gaps
generates sub-optimal code.  I'm interested in spending some time on fixing
this, provided I get some guidance on the vectorizer.

Is it substantially more difficult to handle stores with gaps compared to loads
with gaps?

The following is [minimally] reduced from 462.libquantum:quantum_sigma_x(),
which is the #2 function in the 462.libquantum profile.  This loop accounts for
about 25% of total 462.libquantum time.

===
struct node_struct
{
  float _Complex gap;
  unsigned long long state;
};

struct reg_struct
{
  int size;
  struct node_struct *node;
};

void
func(int target, struct reg_struct *reg)
{
  int i;

  for(i=0; i<reg->size; i++)
    reg->node[i].state ^= ((unsigned long long) 1 << target);
}
===

This loop vectorizes into
  :
  # vectp.8_39 = PHI 
  vect_array.10 = LOAD_LANES (MEM[(long long unsigned int *)vectp.8_39]);
  vect__5.11_41 = vect_array.10[0];
  vect__5.12_42 = vect_array.10[1];
  vect__7.13_44 = vect__5.11_41 ^ vect_cst__43;
  _48 = BIT_FIELD_REF ;
  MEM[(long long unsigned int *)ivtmp_45] = _48;
  ivtmp_50 = ivtmp_45 + 16;
  _51 = BIT_FIELD_REF ;
  MEM[(long long unsigned int *)ivtmp_50] = _51;

which then becomes for aarch64:
.L4:
ld2 {v0.2d - v1.2d}, [x1]
add w2, w2, 1
cmp w2, w7
eor v0.16b, v2.16b, v0.16b
umov x4, v0.d[1]
st1 {v0.d}[0], [x1]
add x1, x1, 32
str x4, [x1, -16]
bcc .L4

[Bug debug/61033] [4.9 Regression] Infinite loop in variable tracking

2016-02-24 Thread mkuvyrkov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61033

--- Comment #15 from Maxim Kuvyrkov  ---
Author: mkuvyrkov
Date: Wed Feb 24 16:41:52 2016
New Revision: 233675

URL: https://gcc.gnu.org/viewcvs?rev=233675&root=gcc&view=rev
Log:
PR debug/61033
* g++.dg/pr61033.C: Add a regression testcase for PR debug/61033.

Added:
trunk/gcc/testsuite/g++.dg/pr61033.C
Modified:
trunk/gcc/testsuite/ChangeLog
