[Bug target/92051] New: Many aarch64 SVE tests fail with ICE (expected integer_cst, have poly_int_cst in to_wide)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92051

Bug ID: 92051
Summary: Many aarch64 SVE tests fail with ICE (expected integer_cst, have poly_int_cst in to_wide)
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: sje at gcc dot gnu.org
Target Milestone: ---

I am seeing several hundred aarch64 sve tests fail with an ICE since Oct 8, 2019. One such failure is gcc.target/aarch64/sve/while_1.c, the ICE is:

0x6150f7 tree_check_failed(tree_node const*, char const*, int, char const*, ...)
        ../../gcc/gcc/tree.c:9924
0x679e13 tree_int_cst_elt_check(tree_node const*, int, char const*, int, char const*)
        ../../gcc/gcc/tree.h:3455
0x679e13 wi::to_wide(tree_node const*)
        ../../gcc/gcc/tree.h:5795
0xe89417 value_range_base::lower_bound(unsigned int) const
        ../../gcc/gcc/tree-vrp.c:6136
0xe8954f value_range_base::lower_bound(unsigned int) const
        ../../gcc/gcc/tree-vrp.c:6123
0x1412203 range_operator::fold_range(tree_node*, value_range_base const&, value_range_base const&) const
        ../../gcc/gcc/range-op.cc:156
0xe92287 range_fold_binary_expr(value_range_base*, tree_code, tree_node*, value_range_base const*, value_range_base const*)
        ../../gcc/gcc/tree-vrp.c:1915
0xf0ba47 vr_values::extract_range_from_binary_expr(value_range*, tree_code, tree_node*, tree_node*, tree_node*)
        ../../gcc/gcc/vr-values.c:808
0xf126cf vr_values::extract_range_from_assignment(value_range*, gassign*)
        ../../gcc/gcc/vr-values.c:1466
0x1357e73 evrp_range_analyzer::record_ranges_from_stmt(gimple*, bool)
        ../../gcc/gcc/gimple-ssa-evrp-analyze.c:307
0xd213e7 dom_opt_dom_walker::before_dom_children(basic_block_def*)
        ../../gcc/gcc/tree-ssa-dom.c:1503
0x1331f3f dom_walker::walk(basic_block_def*)
        ../../gcc/gcc/domwalk.c:309
0xd2099f execute
        ../../gcc/gcc/tree-ssa-dom.c:724
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions
[Bug tree-optimization/90836] Missing popcount pattern matching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90836 --- Comment #4 from Steve Ellcey --- Author: sje Date: Tue Oct 8 21:53:03 2019 New Revision: 276722 URL: https://gcc.gnu.org/viewcvs?rev=276722&root=gcc&view=rev Log: 2019-10-08 Dmitrij Pochepko PR tree-optimization/90836 * lib/target-supports.exp (check_effective_target_popcount) (check_effective_target_popcountll): New effective targets. * gcc.dg/tree-ssa/popcount4.c: New test. * gcc.dg/tree-ssa/popcount4l.c: New test. * gcc.dg/tree-ssa/popcount4ll.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount4.c trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount4l.c trunk/gcc/testsuite/gcc.dg/tree-ssa/popcount4ll.c Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/lib/target-supports.exp
[Bug tree-optimization/90836] Missing popcount pattern matching
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90836 --- Comment #3 from Steve Ellcey --- Author: sje Date: Tue Oct 8 21:50:05 2019 New Revision: 276721 URL: https://gcc.gnu.org/viewcvs?rev=276721&root=gcc&view=rev Log: 2019-10-08 Dmitrij Pochepko PR tree-optimization/90836 * gcc/match.pd (popcount): New pattern. Modified: trunk/gcc/ChangeLog trunk/gcc/match.pd
[Bug middle-end/91983] New: g++.dg/tree-ssa/pr61034.C regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91983

Bug ID: 91983
Summary: g++.dg/tree-ssa/pr61034.C regression
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: sje at gcc dot gnu.org
Target Milestone: ---

The g++.dg/tree-ssa/pr61034.C test has been failing since around Sept 15, 2019. Looking at gcc-testresults, it appears to be failing on aarch64, x86, power, and probably more.

FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++14 scan-tree-dump-times fre3 "free" 14
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++14 scan-tree-dump-times fre3 ";; Function" 1
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++14 scan-tree-dump-times optimized "free" 0
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++17 scan-tree-dump-times fre3 "free" 14
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++17 scan-tree-dump-times fre3 ";; Function" 1
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++17 scan-tree-dump-times optimized "free" 0
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++98 scan-tree-dump-times fre3 "free" 14
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++98 scan-tree-dump-times fre3 ";; Function" 1
FAIL: g++.dg/tree-ssa/pr61034.C -std=gnu++98 scan-tree-dump-times optimized "free" 0
[Bug target/91982] New: gcc.target/aarch64/sve/clastb_*.c tests failing with segfault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91982 Bug ID: 91982 Summary: gcc.target/aarch64/sve/clastb_*.c tests failing with segfault Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- Target: aarch64 A number of gcc.target/aarch64/sve/clastb_* tests (1-8) are failing with segfaults. This seems to have started around September 30, 2019. One example: % install/bin/gcc -O2 -ftree-vectorize -march=armv8.2-a+sve clastb_1.c during GIMPLE pass: vect clastb_1.c: In function 'condition_reduction': clastb_1.c:9:1: internal compiler error: Segmentation fault 9 | condition_reduction (int *a, int min_v) | ^~~ 0xc0103f crash_signal ../../gcc/gcc/toplev.c:326 0x7f5118 dominated_by_p(cdi_direction, basic_block_def const*, basic_block_def const*) ../../gcc/gcc/dominance.c:1118 0xe4dcdb vect_transform_stmt(_stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*) ../../gcc/gcc/tree-vect-stmts.c:10903 0xe5054f vect_transform_loop_stmt ../../gcc/gcc/tree-vect-loop.c:8351 0xe5a51f vect_transform_loop(_loop_vec_info*) ../../gcc/gcc/tree-vect-loop.c:8578 0xe7eda7 try_vectorize_loop_1 ../../gcc/gcc/tree-vectorizer.c:983 0xe7f7d3 vectorize_loops() ../../gcc/gcc/tree-vectorizer.c:1115 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions.
[Bug c++/91889] [10 Regression] error: call of overloaded ‘to_value_ptr(B*&)’ is ambiguous
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91889 --- Comment #10 from Steve Ellcey --- (In reply to Marek Polacek from comment #9) > I'll raise it with CWG; suspending until then. Not sure if it matters but there seem to be 8 instances of this problem in Boost (get_color, get_left, get_next, get_parent, get_previous, get_right, to_hook_ptr, and to_value_ptr). The actual number of errors you get during a boost build is much higher due to hitting these overload issues numerous times during the build.
[Bug c++/91889] New: Boost does not build with top-of-tree GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91889

Bug ID: 91889
Summary: Boost does not build with top-of-tree GCC
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: sje at gcc dot gnu.org
Target Milestone: ---

A recent g++ change broke the boost build. It is dying with many (many) errors like this:

./boost/intrusive/list.hpp:1448:7: required from here
./boost/intrusive/detail/list_iterator.hpp:93:41: error: call of overloaded 'get_next(boost::intrusive::list_node*&)' is ambiguous
 93 | node_ptr p = node_traits::get_next(members_.nodeptr_);
 |~^~~
In file included from ./boost/intrusive/list_hook.hpp:20,
 from ./boost/intrusive/list.hpp:20,
 from ./boost/fiber/context.hpp:29,
 from libs/fiber/src/algo/algorithm.cpp:9:

Marek thinks it is due to his recent patch:

commit 5ac76b02008255b7f427e6309c2dc3e42bd64561
Author: mpolacek
Date: Mon Sep 23 17:37:54 2019 +

PR c++/91844 - Implement CWG 2352, Similar types and reference binding.
* call.c (reference_related_p): Use similar_type_p instead of same_type_p.
(reference_compatible_p): Update implementation to match CWG 2352.
* cp-tree.h (similar_type_p): Declare.
* typeck.c (similar_type_p): New.
* g++.dg/cpp0x/pr33930.C: Add dg-error.
* g++.dg/cpp0x/ref-bind1.C: New test.
* g++.dg/cpp0x/ref-bind2.C: New test.
* g++.dg/cpp0x/ref-bind3.C: New test.
* g++.old-deja/g++.pt/spec35.C: Remove dg-error.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276058 138bc75d-0d04-0410-961f-82ee72b054a4
[Bug tree-optimization/91885] New: ICE when compiling SPEC 2017 blender benchmark with -O3 -fprofile-generate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91885 Bug ID: 91885 Summary: ICE when compiling SPEC 2017 blender benchmark with -O3 -fprofile-generate Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- When compiling the SPEC 2017 526.blender_r benchmark for peak, the compilation that tries to generate profile information aborts with an ICE. This happens to me on an Aarch64 machine (ThunderX2) with -O3 -fprofile-generate. Below is a small testcase cutdown from blender that shows the ICE. % /extra/sellcey/gcc-tot/install/bin/gcc -O3 -fprofile-generate x.i during GIMPLE pass: vect x.i: In function 'IMB_indexer_open': x.i:18:20: internal compiler error: in execute_todo, at passes.c:2032 18 | struct anim_index *IMB_indexer_open(const char *name) { |^~~~ 0xb87933 execute_todo ../../gcc/gcc/passes.c:2032 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. 
% cat x.i typedef signed long int __int64_t; typedef unsigned long int __uint64_t; typedef __int64_t int64_t; typedef __uint64_t uint64_t; inline void BLI_endian_switch_int64(int64_t *val) { uint64_t tval = *val; *val = ((tval >> 56)) | ((tval << 40) & 0x00ffll) | ((tval << 24) & 0xff00ll) | ((tval << 8) & 0x00ffll) | ((tval >> 8) & 0xff00ll) | ((tval >> 24) & 0x00ffll) | ((tval >> 40) & 0xff00ll) | ((tval << 56)); } typedef struct anim_index_entry { unsigned long long seek_pos_dts; unsigned long long pts; } anim_index_entry; extern struct anim_index_entry *MEM_callocN(int); struct anim_index { int num_entries; struct anim_index_entry *entries; }; struct anim_index *IMB_indexer_open(const char *name) { char header[13]; struct anim_index *idx; int i; idx->entries = MEM_callocN(8); if (((1 == 0) != (header[8] == 'V'))) { for (i = 0; i < idx->num_entries; i++) { BLI_endian_switch_int64((int64_t *)&idx->entries[i].seek_pos_dts); BLI_endian_switch_int64((int64_t *)&idx->entries[i].pts); } } }
[Bug c++/91222] [10 Regression] 507.cactuBSSN_r build fails in warn_types_mismatch at ipa-devirt.c:1006 since r273571
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91222 --- Comment #17 from Steve Ellcey --- I tested Jason's patch on my Aarch64 box and it fixed the ICE. Any chance we could check that patch in so that we could build SPEC 2017 with -flto? I don't know if we want to allow this mismatch or not but we certainly don't want GCC to ICE and this patch does fix that. I guess it also allows the types to match or it wouldn't have created an executable. If we don't want to allow the type match we would have to have SPEC modify the test sources to get it to work.
[Bug bootstrap/91825] Top-of-tree GCC does not bootstrap (uninitialized warning)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91825 --- Comment #4 from Steve Ellcey --- The original bootstrap failure is on aarch64.
[Bug bootstrap/91825] New: Top-of-tree GCC does not bootstrap (uninitialized warning)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91825 Bug ID: 91825 Summary: Top-of-tree GCC does not bootstrap (uninitialized warning) Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- GCC bootstrap currently fails with this error: /home/sellcey/tot/src/gcc/gcc/expmed.c: In function ‘rtx_def* emit_store_flag_1(rtx, rtx_code, rtx, rtx, machine_mode, int, int, machine_mode)’: /home/sellcey/tot/src/gcc/gcc/expmed.c:5602:19: error: ‘int_mode’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 5602 | scalar_int_mode int_mode; | ^~~~ This appears to be due to: https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00923.html
[Bug middle-end/91599] New: GCC does not say where warning is happening
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91599

Bug ID: 91599
Summary: GCC does not say where warning is happening
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: sje at gcc dot gnu.org
Target Milestone: ---

When compiling the following source file, GCC gives a warning. The warning notes that the declaration is on line 2 but it does not say which line the actual write is on (line 12). This message started showing up with Martin Sebor's patch for PR c++/83431, though I don't know if he added it or if he just made it show up in places where it wasn't happening before.

% cat x.c
struct charseq {
  unsigned char bytes[0];
};
struct locale_ctype_t {
  struct charseq *mboutdigits[10];
};
void ctype_finish (struct locale_ctype_t *ctype)
{
  long unsigned int cnt;
  for (cnt = 0; cnt < 20; ++cnt) {
    static struct charseq replace[2];
    replace[0].bytes[1] = '\0';
    ctype->mboutdigits[cnt] = &replace[0];
  }
}

% install/bin/gcc -O2 -c x.c
x.c: In function ‘ctype_finish’:
cc1: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
x.c:2:18: note: destination object declared here
 2 |unsigned char bytes[0];
 |

It would be nice if the warning said the write was on line 12 as well as saying that the declaration is on line 2. This test case is cut down from code in glibc where the code doing the write was less easy to find.
[Bug c++/89179] compiler error: in ggc_set_mark, at ggc-page.c:1532
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89179 --- Comment #17 from Steve Ellcey --- The bug I was seeing on aarch64 turns out to be PR 91404. It has now been fixed. I don't know if that patch will also fix the original bug seen on Darwin or not.
[Bug c++/89179] compiler error: in ggc_set_mark, at ggc-page.c:1532
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89179 --- Comment #16 from Steve Ellcey --- I built ggc-page.c with GCC_DEBUG_LEVEL 5 and I see: Allocating object, requested size=360, actual=360 at 0x8726c210 on 0x10549200 Freeing object, actual size=360, at 0x8726c210 on 0x10549200 But then I wind up calling gt_ggc_mx_symtab_node with x_p of 0x8726c210. I don't think I should be calling this (via ggc_mark_roots and ggc_collect) if it has already been freed.
[Bug c++/89179] compiler error: in ggc_set_mark, at ggc-page.c:1532
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89179 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #14 from Steve Ellcey --- I think I may be seeing this same bug on aarch64 when building the RAJA library based on where I am dying in ggc_set_mark. I have not been able to create a preprocessed test case because when I compile the preprocessed sources the bug does not happen. Here is my segfault dump: min.cpp:191:1: internal compiler error: Segmentation fault 191 | } | ^ 0xf03b5f crash_signal ../../gcc/gcc/toplev.c:326 0x9cc86c lookup_page_table_entry ../../gcc/gcc/ggc-page.c:632 0x9cc86c ggc_set_mark(void const*) ../../gcc/gcc/ggc-page.c:1531 0xc6fe47 gt_ggc_mx_symtab_node(void*) /home/sellcey/gcc-raja/obj-gcc/gcc/gtype-desc.c:1302 0xe17503 gt_ggc_ma_order ./gt-passes.h:31 0xbe44f3 ggc_mark_root_tab ../../gcc/gcc/ggc-common.c:77 0xbe4813 ggc_mark_roots() ../../gcc/gcc/ggc-common.c:94 0x9cd1fb ggc_collect() ../../gcc/gcc/ggc-page.c:2201 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. When I look at gt_ggc_mx_symtab_node, the initial x_p pointer that comes in is reasonable (0xa020c210) but after xlimit = ((*xlimit).next); The value of xlimit becomes 0xa5a5a5a5a5a5a5a5. That looks like a bogus value something might have put into memory to poison it but I didn't see that specific string in the GCC source tree anywhere.
[Bug driver/91406] New: gcc -Q -v lies about what flags are enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91406

Bug ID: 91406
Summary: gcc -Q -v lies about what flags are enabled
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: driver
Assignee: unassigned at gcc dot gnu.org
Reporter: sje at gcc dot gnu.org
Target Milestone: ---

If you run 'gcc -Q -v x.c' and look at the 'options enabled:' list, it is not accurate. For example, on aarch64 it will show '-fprefetch-loop-arrays', which is not on by default for generic aarch64 compiles (even at -O3). The problem is that this flag is initialized to -1 and it might be overridden by aarch64_override_options_internal in some cases to turn it on, but if it is not overridden it stays at -1. option_enabled (opts-common.c) then checks whether it is zero or non-zero, and if it is non-zero it returns true and says the option is enabled. Note that in this case the compiler will not actually generate prefetch instructions because the gate function is checking for 'x > 0', not 'x != 0' like option_enabled does.

This can affect any option in common.opt (or elsewhere) that is initialized to -1. There are also flags that are initialized to 1 but probably should not show up when compiling at -O0, because in that case the pass that would check the flag is never called; -faggressive-loop-optimizations is one example. If you run a '-Q -v -O0' compilation on x86 the list of enabled options will include -faggressive-loop-optimizations, which I am sure is not actually run. I guess you could claim it is enabled but not run, but that seems unhelpful.

I could fix the specific aarch64 '-fprefetch-loop-arrays' bug by having aarch64_override_options_internal set the prefetch flag to 0 in those cases where it is not setting it to 1. That way it would never be -1 when option_enabled checks it, but I am not sure this is the right/best fix.
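The mismatch described in this report can be shown in isolation. The following is a minimal sketch, not GCC code: the enum and the three function names are hypothetical, and only the two comparisons ('x != 0' for the listing, 'x > 0' for the gate) are taken from the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state option value: -1 = never set, 0 = off, 1 = on. */
enum { FLAG_UNSET = -1, FLAG_OFF = 0, FLAG_ON = 1 };

/* The "options enabled:" style check described in the report: anything
   non-zero counts as enabled, so -1 is listed too. */
static bool listed_as_enabled(int flag)
{
    return flag != 0;
}

/* The pass-gate style check described in the report: only a strictly
   positive value lets the pass run. */
static bool pass_would_run(int flag)
{
    return flag > 0;
}

/* True when the listing and the actual behaviour disagree. */
bool listing_is_misleading(int flag)
{
    /* This sketch only models the three values -1, 0 and 1. */
    assert(flag >= FLAG_UNSET && flag <= FLAG_ON);
    return listed_as_enabled(flag) != pass_would_run(flag);
}
```

Under these assumptions only the unset value (-1) makes the two checks disagree, which is why forcing the flag to 0 or 1 before the listing is printed would hide the symptom for this one option.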
[Bug middle-end/91242] ICE on aarch64 SVE tests - gcc.target/aarch64/sve/clastb_[146].c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91242 --- Comment #7 from Steve Ellcey --- (In reply to Martin Liška from comment #5) > (In reply to Jaydeep Chauhan from comment #4) > > Hello, > > > > With latest trunk issue is not reproducible for all three > > case(clastb_1.c,clastb_4.c,clastb_6.c). > > > > Command line options: > > > > gcc/cc1 gcc-10.0/gcc/testsuite/gcc.target/aarch64/sve/clastb_6.c > > -march=armv8.2-a+sve Are you running the test by hand? When I look at the failure in my gcc test run log file I see that it was compiled with -O2 -ftree-vectorize as well as -march=armv8.2-a+sve. I didn't specify those options, they got added on from the dg-options entry in the test program. From my gcc testsuite log file: spawn -ignore SIGHUP /home/sellcey/tot/obj/gcc/gcc/xgcc -B/home/sellcey/tot/obj/gcc/gcc/ -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -march=armv8.2-a+sve -O2 -ftree-vectorize --save-temps -ffat-lto-objects -fno-ident -c -o clastb_1.o /home/sellcey/tot/src/gcc/gcc/testsuite/gcc.target/aarch64/sve/clastb_1.c
[Bug fortran/91253] New: gfortran.dg/continuation_6.f fails when using latest glibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91253

Bug ID: 91253
Summary: gfortran.dg/continuation_6.f fails when using latest glibc
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
Assignee: unassigned at gcc dot gnu.org
Reporter: sje at gcc dot gnu.org
Target Milestone: ---

When testing GCC with the latest GLIBC, specifically one which creates a math-vector-fortran.h header file, the gfortran.dg/continuation_6.f test fails. If I take the f951 command line:

/home/sellcey/tot/install/usr/libexec/gcc/aarch64-linux-gnu/10.0.0/f951 continuation_6.f -ffixed-form -quiet -mlittle-endian -mabi=lp64 -auxbase continuation_6 -O2 -Wall -std=f2003 -version -ffixed-form -fintrinsic-modules-path /home/sellcey/tot/install/usr/lib/gcc/aarch64-linux-gnu/10.0.0/finclude -fpre-include=/home/sellcey/tot/install/usr/include/finclude/math-vector-fortran.h -o c.s

and remove '-fpre-include=/home/.' then the test passes and prints out the expected warning message. If the -fpre-include option is there then the warning message does not appear. I wonder if we turn off warning messages while processing this header file and forget to turn them back on?
[Bug tree-optimization/83518] [8/9 Regression] Missing optimization: useless instructions should be dropped
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83518

--- Comment #21 from Steve Ellcey ---
(In reply to Richard Biener from comment #20)
> (In reply to Steve Ellcey from comment #19)
> It should have been fixed by r273732 (checked with a cc1 cross to aarch64,
> albeit on a not clean tree...)

OK, I retested with top-of-tree that includes this patch and the tests are not failing for me now. My earlier testing did not have this patch in it.
[Bug tree-optimization/83518] [8/9 Regression] Missing optimization: useless instructions should be dropped
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83518 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #19 from Steve Ellcey --- Should this defect be reopened? One of the tests that was added is failing for me on aarch64. FAIL: g++.dg/tree-ssa/pr83518.C -std=gnu++98 scan-tree-dump optimized "return 15;" FAIL: g++.dg/tree-ssa/pr83518.C -std=gnu++14 scan-tree-dump optimized "return 15;" FAIL: g++.dg/tree-ssa/pr83518.C -std=gnu++17 scan-tree-dump optimized "return 15;"
[Bug middle-end/91242] New: ICE on aarch64 SVE tests - gcc.target/aarch64/sve/clastb_[146].c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91242 Bug ID: 91242 Summary: ICE on aarch64 SVE tests - gcc.target/aarch64/sve/clastb_[146].c Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- I get ICE on these GCC tests: FAIL: gcc.target/aarch64/sve/clastb_1.c -march=armv8.2-a+sve (internal compiler error) FAIL: gcc.target/aarch64/sve/clastb_4.c -march=armv8.2-a+sve (internal compiler error) FAIL: gcc.target/aarch64/sve/clastb_6.c -march=armv8.2-a+sve (internal compiler error) After this checkin: commit 8482ddd3ae29c2c74f7e01fa0422ee697689e98c Author: marxin Date: Mon Jun 10 07:04:39 2019 + Fix build with --enable-gather-detailed-mem-stats. 2019-06-10 Martin Liska * hash-map.h: Pass default value to hash_table ctor. * hash-table.h: Add default value to call of a ctor. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272104 138bc75d-0d04-0410-96 1f-82ee72b054a4 Compiler executable checksum: a0a240676a930bb65e6bc0ad707e5aee hash table checking failed: equal operator returns true for a pair of values with a different hash value during GIMPLE pass: dom clastb_6.c: In function ‘condition_reduction’: clastb_6.c:13:1: internal compiler error: in hashtab_chk_error, at hash-table.c:137 13 | condition_reduction (TYPE *a, TYPE min_v) | ^~~ 0x604667 hashtab_chk_error() ../../gcc/gcc/hash-table.c:137 0xffbcaf hash_table::verify(std::pair > const*> const&, unsigned int) ../../gcc/gcc/hash-table.h:1036 0xffbcaf hash_table::find_slot_with_hash(std::pair > const*> const&, unsigned int, insert_option) ../../gcc/gcc/hash-table.h:971 0xfdeacf build_poly_int_cst(tree_node*, poly_int<2u, generic_wide_int > > const&) ../../gcc/gcc/tree.c:1681 0xfe3303 force_fit_type(tree_node*, poly_int<2u, generic_wide_int > > const&, int, bool) ../../gcc/gcc/tree.c:1445 0x92b58b int_const_binop(tree_code, tree_node const*, tree_node const*, int) 
../../gcc/gcc/fold-const.c:1196 0x93e28b const_binop ../../gcc/gcc/fold-const.c:1241 0x94080f const_binop(tree_code, tree_node*, tree_node*, tree_node*) ../../gcc/gcc/fold-const.c:1719 0x10a7de3 gimple_resimplify2 ../../gcc/gcc/gimple-match-head.c:255 0x11d122b gimple_simplify(gimple*, gimple_match_op*, gimple**, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*)) ../../gcc/gcc/gimple-match-head.c:930 0x99a14f gimple_fold_stmt_to_constant_1(gimple*, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*)) ../../gcc/gcc/gimple-fold.c:6301 0x104f6d7 vr_values::vrp_visit_assignment_or_call(gimple*, tree_node**, value_range*) ../../gcc/gcc/vr-values.c:2037 0x148ff0f evrp_range_analyzer::record_ranges_from_stmt(gimple*, bool) ../../gcc/gcc/gimple-ssa-evrp-analyze.c:299 0xeb record_temporary_equivalences_from_stmts_at_dest ../../gcc/gcc/tree-ssa-threadedge.c:293 0xf01543 thread_through_normal_block ../../gcc/gcc/tree-ssa-threadedge.c:1064 0xf02f6f thread_across_edge ../../gcc/gcc/tree-ssa-threadedge.c:1372 0xf038bb thread_outgoing_edges(basic_block_def*, gcond*, const_and_copies*, avail_exprs_stack*, evrp_range_analyzer*, tree_node* (*)(gimple*, gimple*, avail_exprs_stack*, basic_block_def*)) ../../gcc/gcc/tree-ssa-threadedge.c:1463 0xe0bd8b dom_opt_dom_walker::after_dom_children(basic_block_def*) ../../gcc/gcc/tree-ssa-dom.c:1549 0x1463b17 dom_walker::walk(basic_block_def*) ../../gcc/gcc/domwalk.c:354 0xe0e7bf execute ../../gcc/gcc/tree-ssa-dom.c:724 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report.
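The "hash table checking failed" message in this report states an invariant: two values that compare equal must also hash to the same value. The sketch below illustrates that invariant only. It is not GCC's hash_table::verify; the element type, both hash functions and the checker are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element: two coefficients plus a field that equality
   deliberately ignores. */
struct elem {
    long coeff[2];
    int scratch;
};

typedef unsigned (*hash_fn)(const struct elem *);

/* Equality looks at the coefficients only. */
static bool elem_equal(const struct elem *a, const struct elem *b)
{
    return a->coeff[0] == b->coeff[0] && a->coeff[1] == b->coeff[1];
}

/* Consistent hash: uses exactly the fields that elem_equal compares. */
unsigned hash_consistent(const struct elem *e)
{
    return (unsigned)e->coeff[0] * 31u + (unsigned)e->coeff[1];
}

/* Inconsistent hash: also mixes in a field that elem_equal ignores, so
   equal elements can end up with different hash values. */
unsigned hash_inconsistent(const struct elem *e)
{
    return hash_consistent(e) + (unsigned)e->scratch;
}

/* Returns false if any two equal elements hash differently under h. */
bool table_is_consistent(const struct elem *tab, size_t n, hash_fn h)
{
    assert(h != NULL);
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (elem_equal(&tab[i], &tab[j]) && h(&tab[i]) != h(&tab[j]))
                return false;
    return true;
}
```

With two elements whose coefficients match but whose scratch fields differ, the check passes for hash_consistent and fails for hash_inconsistent. Whether the failure in this report comes from the hash function or from the equality operator is not something the backtrace alone settles.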
[Bug bootstrap/91176] [10 regression] AArch64 bootstrap fails since r273479
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91176 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #8 from Steve Ellcey --- The patch in comment #5 fixed the bootstrap on aarch64 that I was getting.
[Bug bootstrap/90873] [10 regression] -Wmaybe-uninitialized warning in gcc/tree-ssa-forwprop.c breaks 32-bit bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90873 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #5 from Steve Ellcey --- This had also broken the glibc build on my aarch64 system but the patch in Comment #4 seems to have fixed things and I can build again.
[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #51 from Steve Ellcey --- Author: sje Date: Thu Apr 11 18:03:49 2019 New Revision: 270289 URL: https://gcc.gnu.org/viewcvs?rev=270289&root=gcc&view=rev Log: 2018-04-11 Steve Ellcey PR rtl-optimization/87763 * gcc.target/aarch64/combine_bfxil.c: Change some bfxil checks to bfi. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/combine_bfxil.c
[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #50 from Steve Ellcey --- Author: sje Date: Thu Apr 11 18:02:41 2019 New Revision: 270288 URL: https://gcc.gnu.org/viewcvs?rev=270288&root=gcc&view=rev Log: 2018-04-11 Steve Ellcey PR rtl-optimization/87763 * config/aarch64/aarch64.md (*aarch64_bfi4_noshift_alt): New Instruction. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64.md
[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #48 from Steve Ellcey --- (In reply to Richard Biener from comment #47) > What's the state of regressions left? Can we xfail the rest and defer the > bug? I submitted a patch to fix gcc.target/aarch64/lsl_asr_sbfiz.c That email is https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00404.html The other regressions I have are: FAIL: gcc.target/aarch64/insv_1.c scan-assembler bfi\tx[0-9]+, x[0-9]+, 0, 8 FAIL: gcc.target/aarch64/insv_1.c scan-assembler bfi\tx[0-9]+, x[0-9]+, 16, 5 FAIL: gcc.target/aarch64/insv_1.c scan-assembler movk\tx[0-9]+, 0x1d6b, lsl 32 I don't have a patch for those.
[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #46 from Steve Ellcey --- Author: sje Date: Wed Apr 10 20:29:57 2019 New Revision: 270267 URL: https://gcc.gnu.org/viewcvs?rev=270267&root=gcc&view=rev Log: 2018-04-10 Steve Ellcey PR rtl-optimization/87763 * gcc.target/aarch64/combine_bfxil.c: Change some bfxil checks to bfi. * gcc.target/aarch64/combine_bfi_2.c: New test. Added: trunk/gcc/testsuite/gcc.target/aarch64/combine_bfi_2.c Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/combine_bfxil.c
[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #45 from Steve Ellcey --- Author: sje Date: Wed Apr 10 20:28:19 2019 New Revision: 270266 URL: https://gcc.gnu.org/viewcvs?rev=270266&root=gcc&view=rev Log: 2018-04-10 Steve Ellcey PR rtl-optimization/87763 * config/aarch64/aarch64-protos.h (aarch64_masks_and_shift_for_bfi_p): New prototype. * config/aarch64/aarch64.c (aarch64_masks_and_shift_for_bfi_p): New function. * config/aarch64/aarch64.md (*aarch64_bfi5_shift): New instruction. (*aarch64_bfi5_shift_alt): Ditto. (*aarch64_bfi4_noand): Ditto. (*aarch64_bfi4_noand_alt): Ditto. (*aarch64_bfi4_noshift): Ditto. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64-protos.h trunk/gcc/config/aarch64/aarch64.c trunk/gcc/config/aarch64/aarch64.md
[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #34 from Steve Ellcey --- I submitted a patch that would fix gcc.target/aarch64/combine_bfi_1.c back in February but have not gotten any feedback on the final version of the patch despite a couple of pings. I have resubmitted the patch again today to see if one of the Aarch64 maintainers will look at it. https://gcc.gnu.org/ml/gcc-patches/2019-04/msg00045.html
[Bug fortran/89724] New: Fortran diagnostics give wrong line number because of math-vector-fortran.h header file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89724 Bug ID: 89724 Summary: Fortran diagnostics give wrong line number because of math-vector-fortran.h header file Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- I am seeing some Fortran regressions in my testing, but only when I build and test with the latest Glibc. The regressions are: FAIL: gfortran.dg/continuation_6.f -O (test for warnings, line 261) FAIL: gfortran.dg/continuation_9.f90 -O (test for warnings, line ) FAIL: gfortran.dg/continuation_9.f90 -O (test for warnings, line ) FAIL: gfortran.dg/continuation_9.f90 -O (test for warnings, line ) FAIL: gfortran.dg/continuation_9.f90 -O (test for excess errors) FAIL: gfortran.dg/tab_continuation.f -O Nonconforming tab (test for errors, line ) FAIL: gfortran.dg/tab_continuation.f -O Nonconforming tab (test for errors, line ) FAIL: gfortran.dg/tab_continuation.f -O Nonconforming tab (test for errors, line ) FAIL: gfortran.dg/tab_continuation.f -O Nonconforming tab (test for errors, line ) When I run continuation_9.f90 by hand I get: % install/usr/bin/gfortran -std=f95 -c continuation_9.f90 f951: Warning: ‘&’ not allowed by itself in line 23 f951: Warning: ‘&’ not allowed by itself in line 24 f951: Warning: ‘&’ not allowed by itself in line 25 Instead of % install/usr/bin/gfortran -std=f95 -c continuation_9.f90 f951: Warning: ‘&’ not allowed by itself in line 3 f951: Warning: ‘&’ not allowed by itself in line 4 f951: Warning: ‘&’ not allowed by itself in line 5 The reason for the line number changes is that the Fortran program is reading the file install/usr/include/finclude/math-vector-fortran.h and these lines are getting counted as part of the line numbers. This file is coming from glibc and was recently added. If I move it out of the way then I get the previous results.
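The line-number shift in this report is plain arithmetic, sketched below with hypothetical function names. The figure of 20 extra lines is inferred only from the two sets of numbers quoted in the report (23, 24, 25 versus 3, 4, 5); it has not been checked against the actual length of math-vector-fortran.h.

```c
#include <assert.h>

/* Line number a diagnostic would show if the lines of a pre-included
   header are counted together with the user's source lines, as the
   report describes. */
int reported_line(int preincluded_lines, int source_line)
{
    assert(preincluded_lines >= 0 && source_line >= 1);
    return preincluded_lines + source_line;
}

/* Number of extra lines implied by a reported and an expected line. */
int preincluded_lines_from(int reported, int expected)
{
    assert(reported >= expected);
    return reported - expected;
}
```

Applied to the numbers in the report, an expected line of 3 and a reported line of 23 imply 20 counted header lines, and the same offset accounts for lines 4 and 5 showing up as 24 and 25.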
[Bug target/89719] New: [9 regression] gcc.target/aarch64/spellcheck_[456].c testsuite failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89719 Bug ID: 89719 Summary: [9 regression] gcc.target/aarch64/spellcheck_[456].c testsuite failures Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- FAIL: gcc.target/aarch64/spellcheck_4.c (test for errors, line ) FAIL: gcc.target/aarch64/spellcheck_4.c (test for excess errors) FAIL: gcc.target/aarch64/spellcheck_5.c (test for errors, line ) FAIL: gcc.target/aarch64/spellcheck_5.c (test for excess errors) FAIL: gcc.target/aarch64/spellcheck_6.c (test for errors, line ) FAIL: gcc.target/aarch64/spellcheck_6.c (test for excess errors) These tests started failing on or before March 11, 2019. You can see them in the Linaro regression results too. https://gcc.gnu.org/ml/gcc-testresults/2019-03/msg01918.html I tried to find the checkin that caused this regression but my attempts to build a working compiler from older sources failed. That is, I could build an older compiler, but that compiler still exhibited this failure.
[Bug target/89628] aarch64_vector_pcs does not use v24-v31 as temp regs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89628 Steve Ellcey changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-03-08 Ever confirmed|0 |1 --- Comment #1 from Steve Ellcey --- I think the problem here is that we are not setting REG_ALLOC_ORDER and/or ADJUST_REG_ALLOC_ORDER on aarch64. I am not sure which one we should use, but I will look into it.
[Bug libfortran/78314] [aarch64] ieee_support_halting does not report unsupported fpu traps correctly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314 --- Comment #24 from Steve Ellcey --- See email strings at: https://gcc.gnu.org/ml/fortran/2019-01/msg00276.html https://gcc.gnu.org/ml/fortran/2019-02/msg00057.html For more discussion.
[Bug middle-end/82479] missing popcount builtin detection
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82479 Steve Ellcey changed: What|Removed |Added Status|NEW |RESOLVED CC||sje at gcc dot gnu.org Known to work||9.0 Resolution|--- |FIXED --- Comment #14 from Steve Ellcey --- It looks like the fix for this is checked in. I verified that on Aarch64, when compiling bits.cpp from 531.deepsjeng_r in SPEC 2017, GCC generated a __builtin_popcount call. It looks like this went in after the 8.* branch was created so 9.1 would be the first version to have it.
[Bug libfortran/78314] [aarch64] ieee_support_halting does not report unsupported fpu traps correctly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #22 from Steve Ellcey --- It looks like the recent checkins are causing gfortran.dg/ieee/ieee_6.f90 to fail on aarch64. Reading through the comments it looks like this isn't a new problem but the failure showing up in the GCC testsuite run is new.
[Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201 --- Comment #7 from Steve Ellcey --- (In reply to Richard Biener from comment #6) > If Martins bisection to power.fppized.o is correct you can bisect the loop > via the vect_loop or vect_slp debug counters (or first try with just > -fno-tree-{loop,slp}-vectorize to narrow down to loop vs. BB vectorization). I will let one of the x86 experts try that. I was just surprised to find that one of the most popular benchmarks fails on one of the most popular targets and that it has been that way for about a year.
[Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #5 from Steve Ellcey --- Has anyone looked into this any more to see what optimization is causing this failure? In my testing: -Ofast fails -Ofast -fno-unsafe-math-optimizations works -Ofast -fno-tree-loop-vectorize works -O3 works So it seems to be some combination of unsafe math optimizations and vectorization that is causing the failure.
[Bug debug/87451] FAIL: gcc.dg/debug/dwarf2/inline5.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87451 --- Comment #11 from Steve Ellcey --- (In reply to Richard Biener from comment #10) > (In reply to Steve Ellcey from comment #9) > Looks like that's because of different expected comment characters, > # vs. // in your file. The pattern for the comment stuff is > > \[^#/!\]*\[#/!\] DW > > skip until first comment-char (ok), then consume comment (bogus). Adding > + might help. Can you check that? Yes, that patch fixed the failure I was seeing on aarch64.
[Bug debug/87451] FAIL: gcc.dg/debug/dwarf2/inline5.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87451 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #9 from Steve Ellcey --- Created attachment 45559 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45559&action=edit Assembly output from aarch64-linux-gnu This test is still failing on aarch64. Attached is the .s file from a top-of-tree GCC build on aarch64-linux-gnu.
[Bug target/85711] ICE in aarch64_classify_address, at config/aarch64/aarch64.c:5678
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85711 --- Comment #3 from Steve Ellcey --- Author: sje Date: Wed Jan 23 22:43:42 2019 New Revision: 268219 URL: https://gcc.gnu.org/viewcvs?rev=268219&root=gcc&view=rev Log: 2019-01-23 Bin Cheng Steve Ellcey PR target/85711 * recog.c (address_operand): Return false on wrong mode for address. (constrain_operands): Check for mode with 'p' constraint. Modified: trunk/gcc/ChangeLog trunk/gcc/recog.c
[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #21 from Steve Ellcey --- If I look at this specific example: int f2 (int x, int y) { return (x & ~0x0ff000) | ((y & 0x0ff) << 12); } Before the combine change, I see in x.c.260r.combine: Trying 8, 9 -> 15: 8: r98:SI=x1:SI<<0xc&0xff000 REG_DEAD x1:SI 9: r99:SI=x0:SI&0xfff00fff REG_DEAD x0:SI 15: x0:SI=r98:SI|r99:SI REG_DEAD r98:SI REG_DEAD r99:SI Successfully matched this instruction: (set (zero_extract:SI (reg/i:SI 0 x0) (const_int 8 [0x8]) (const_int 12 [0xc])) (zero_extend:SI (reg:QI 1 x1 [ y ]))) allowing combination of insns 8, 9 and 15 original costs 4 + 4 + 4 = 12 replacement cost 4 deferring deletion of insn with uid = 9. Immediately after the combine change, I get: Trying 8, 9 -> 15: 8: r98:SI=r101:SI<<0xc&0xff000 REG_DEAD r101:SI 9: r99:SI=r100:SI&0xfff00fff REG_DEAD r100:SI 15: x0:SI=r98:SI|r99:SI REG_DEAD r98:SI REG_DEAD r99:SI Failed to match this instruction: (set (reg/i:SI 0 x0) (ior:SI (and:SI (reg:SI 100) (const_int -1044481 [0xfff00fff])) (and:SI (ashift:SI (reg:SI 101) (const_int 12 [0xc])) (const_int 1044480 [0xff000] Successfully matched this instruction: (set (reg:SI 99) (ashift:SI (reg:SI 101) (const_int 12 [0xc]))) Failed to match this instruction: (set (reg/i:SI 0 x0) (ior:SI (and:SI (reg:SI 100) (const_int -1044481 [0xfff00fff])) (and:SI (reg:SI 99) (const_int 1044480 [0xff000] Is this because of x0 (a hard register) at the destination in insn 15?
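For readers following the combine dumps above: f2 is a bit-field insert. It clears bits 12..19 of x and fills them with the low 8 bits of y, which is the operation the zero_extract pattern in the first dump expresses. A minimal C sketch of that equivalence follows; insert_field is a hypothetical reference helper written for this illustration, not GCC code.

```c
/* The function from the comment above.  */
int f2 (int x, int y)
{
  return (x & ~0x0ff000) | ((y & 0x0ff) << 12);
}

/* Hypothetical reference helper (illustration only): insert the low
   WIDTH bits of SRC into DST starting at bit LSB.  WIDTH must be
   less than 32.  */
unsigned insert_field (unsigned dst, unsigned src, unsigned lsb, unsigned width)
{
  unsigned mask = ((1u << width) - 1u) << lsb;
  return (dst & ~mask) | ((src << lsb) & mask);
}

/* Returns 1 when f2 agrees with an 8-bit insert at bit 12.  */
int f2_is_insert (int x, int y)
{
  return (unsigned) f2 (x, y) == insert_field ((unsigned) x, (unsigned) y, 12, 8);
}
```

Calling f2_is_insert with a few values, including negative x or y, should return 1 each time. That is why a single bit-field insert instruction is an acceptable replacement for the and/shift/or sequence.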
[Bug fortran/88912] Fortran compiler segfaults when pre-include file is not found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88912 --- Comment #3 from Steve Ellcey --- It is quite possible I am using the option incorrectly (though that should not result in a segfault of course). Should some other flag be adding this to the command line for me?
[Bug fortran/88912] New: Fortran compiler segfaults when pre-include file is not found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88912 Bug ID: 88912 Summary: Fortran compiler segfaults when pre-include file is not found Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- I am using the new -fpre-include= option with Fortran and when the file I am trying to preinclude does not exist the compiler segfaults. % install/usr/bin/gfortran -fpre-include=/tmp/foo.h -Ofast -S x.f90 : internal compiler error: Segmentation fault 0xcb1f37 crash_signal /home/sellcey/gcc-vect-fortran/src/gcc/gcc/toplev.c:326 0x6ca5dc load_file /home/sellcey/gcc-vect-fortran/src/gcc/gcc/fortran/scanner.c:2481 0x6ca76b gfc_new_file() /home/sellcey/gcc-vect-fortran/src/gcc/gcc/fortran/scanner.c:2681 0x6f0b97 gfc_init /home/sellcey/gcc-vect-fortran/src/gcc/gcc/fortran/f95-lang.c:250 0x6164b3 lang_dependent_init /home/sellcey/gcc-vect-fortran/src/gcc/gcc/toplev.c:1929 0x6164b3 do_compile /home/sellcey/gcc-vect-fortran/src/gcc/gcc/toplev.c:2161 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. % install/usr/bin/gfortran -v Using built-in specs. COLLECT_GCC=install/usr/bin/gfortran COLLECT_LTO_WRAPPER=/home/sellcey/gcc-vect-fortran/install/usr/libexec/gcc/aarch64-linux-gnu/9.0.0/lto-wrapper Target: aarch64-linux-gnu Configured with: /home/sellcey/gcc-vect-fortran/src/gcc/configure --prefix=/home/sellcey/gcc-vect-fortran/install/usr --target=aarch64-linux-gnu --host=aarch64-linux-gnu --build=aarch64-linux-gnu --enable-gnu-indirect-function --with-sysroot=/home/sellcey/gcc-vect-fortran/install --enable-languages=c,c++,fortran --disable-libsanitizer --disable-bootstrap --enable-threads --enable-shared Thread model: posix gcc version 9.0.0 20190117 (experimental) (GCC)
[Bug fortran/88898] [Regression 9] gomp is broken by r268045
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88898 --- Comment #6 from Steve Ellcey --- Author: sje Date: Fri Jan 18 00:41:40 2019 New Revision: 268054 URL: https://gcc.gnu.org/viewcvs?rev=268054&root=gcc&view=rev Log: 2018-01-17 Steve Ellcey PR fortran/88898 * gfortran.dg/gomp/declare-simd-2.f90: Add aarch64 target specifier to warning checks. * gfortran.dg/gomp/pr79154-1.f90: Ditto. * gfortran.dg/gomp/pr83977.f90: Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gfortran.dg/gomp/declare-simd-2.f90 trunk/gcc/testsuite/gfortran.dg/gomp/pr79154-1.f90 trunk/gcc/testsuite/gfortran.dg/gomp/pr83977.f90
[Bug fortran/88898] [Regression 9] gomp is broken by r268045
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88898 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #4 from Steve Ellcey --- I think this is my fault. My patch shouldn't have affected x86 at all but I see my build/test on x86 only tested C and C++, I didn't have Fortran configured in when I checked for regressions. The problem is the warnings I added to the Fortran tests, like: -function f1 (a, b, c, d, e, f) +function f1 (a, b, c, d, e, f) ! { dg-warning "GCC does not currently support mixed size types for 'simd' functions" } I didn't add a '{ target aarch64-*-* }' clause to the messages. I will work on a patch, it is only the tests that should need to be changed, not the compiler.
[Bug target/85711] ICE in aarch64_classify_address, at config/aarch64/aarch64.c:5678
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85711 --- Comment #2 from Steve Ellcey --- This has been failing for quite a while now and there is apparently a fix for it. Can we get it fixed for GCC 9.0 release?
[Bug target/88682] new test case c-c++-common/pr51628-10.c fails starting with its introduction in r267313
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88682 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #5 from Steve Ellcey --- It looks like this test is violating strict aliasing. If I compile with -fno-strict-aliasing then it works. I think pointing p.i (type __int128_t) to something of type unaligned_int128_t is a standards violation in C or C++ but I am not a language lawyer. FYI: I get the same behavior with C or C++ on aarch64. It works with -O1 on aarch64 but fails with -O2 and the difference is the ordering of loads and stores.
[Bug rtl-optimization/87763] [9.0 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 --- Comment #1 from Steve Ellcey --- I looked at one of the failing tests (gcc.target/aarch64/cvtf_1.c); the code looks worse than before, with an extra instruction generated in each of the routines. Here is an example from one function where there is an extra fmov that was not there before. The test runs at -O1 but the extra instruction appears at all optimization levels.

void cvt_int32_t_to_float (int a, float b)
{
  float c;
  c = (float) a;
  if ( (c - b) > 0.1) abort();
}

Which used to generate:

cvt_int32_t_to_float:
.LFB0:
        .cfi_startproc
        scvtf   s1, w0
        fsub    s0, s1, s0
        fcvt    d0, s0
        adrp    x0, .LC0
        ldr     d1, [x0, #:lo12:.LC0]
        fcmpe   d0, d1
        bgt     .L9
        ret
.L9:
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
        mov     x29, sp
        bl      abort
        .cfi_endproc

Now generates:

cvt_int32_t_to_float:
.LFB0:
        .cfi_startproc
        fmov    s1, w0
        scvtf   s1, s1
        fsub    s1, s1, s0
        fcvt    d1, s1
        adrp    x0, .LC0
        ldr     d0, [x0, #:lo12:.LC0]
        fcmpe   d1, d0
        bgt     .L9
        ret
.L9:
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
        mov     x29, sp
        bl      abort
        .cfi_endproc
[Bug rtl-optimization/87763] New: [9.0 Regression] aarch64 target testcases fail after r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763 Bug ID: 87763 Summary: [9.0 Regression] aarch64 target testcases fail after r265398 Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org CC: segher at gcc dot gnu.org Target Milestone: --- The following tests fail on aarch64 after r265398 (combine: Do not combine moves from hard registers).

FAIL: gcc.dg/vect/vect-nop-move.c -flto -ffat-lto-objects scan-rtl-dump combine "deleting noop move"
FAIL: gcc.dg/vect/vect-nop-move.c scan-rtl-dump combine "deleting noop move"
FAIL: gcc.target/aarch64/combine_bfi_1.c scan-assembler-times \\tbfi\\t 5
FAIL: gcc.target/aarch64/combine_bfxil.c scan-assembler-times bfxil\\t 13
FAIL: gcc.target/aarch64/cvtf_1.c scan-assembler scvtf\td[0-9]+, x[0-9]+
FAIL: gcc.target/aarch64/cvtf_1.c scan-assembler scvtf\ts[0-9]+, w[0-9]+
FAIL: gcc.target/aarch64/cvtf_1.c scan-assembler ucvtf\td[0-9]+, x[0-9]+
FAIL: gcc.target/aarch64/cvtf_1.c scan-assembler ucvtf\ts[0-9]+, w[0-9]+
FAIL: gcc.target/aarch64/insv_1.c scan-assembler bfi\tx[0-9]+, x[0-9]+, 0, 8
FAIL: gcc.target/aarch64/insv_1.c scan-assembler bfi\tx[0-9]+, x[0-9]+, 16, 5
FAIL: gcc.target/aarch64/insv_1.c scan-assembler movk\tx[0-9]+, 0x1d6b, lsl 32
FAIL: gcc.target/aarch64/lsl_asr_sbfiz.c scan-assembler sbfiz\tw
FAIL: gcc.target/aarch64/sve/tls_preserve_1.c -march=armv8.2-a+sve scan-assembler-not \\tst[rp]\\t[dqv]
FAIL: gcc.target/aarch64/test_frame_16.c scan-assembler-times sub\tsp, sp, #[0-9]+ 2
FAIL: gcc.target/aarch64/tst_5.c scan-assembler tst\t(x|w)[0-9]+,[ \t]*255
FAIL: gcc.target/aarch64/tst_5.c scan-assembler tst\t(x|w)[0-9]+,[ \t]*65535
FAIL: gcc.target/aarch64/tst_6.c scan-assembler tst\t(x|w)[0-9]+,[ \t]*65535
FAIL: gcc.target/aarch64/va_arg_3.c scan-assembler-not x7
FAIL: gcc.target/aarch64/vdup_n_1.c scan-assembler-times dup\\tv[0-9]+.16b, w[0-9]+ 3
FAIL: gcc.target/aarch64/vdup_n_1.c scan-assembler-times dup\\tv[0-9]+.2d, x[0-9]+ 2
FAIL: gcc.target/aarch64/vdup_n_1.c scan-assembler-times dup\\tv[0-9]+.2s, w[0-9]+ 2
FAIL: gcc.target/aarch64/vdup_n_1.c scan-assembler-times dup\\tv[0-9]+.4h, w[0-9]+ 3
FAIL: gcc.target/aarch64/vdup_n_1.c scan-assembler-times dup\\tv[0-9]+.4s, w[0-9]+ 2
FAIL: gcc.target/aarch64/vdup_n_1.c scan-assembler-times dup\\tv[0-9]+.8b, w[0-9]+ 3
FAIL: gcc.target/aarch64/vdup_n_1.c scan-assembler-times dup\\tv[0-9]+.8h, w[0-9]+ 3
FAIL: gcc.target/aarch64/vect_combine_zeroes_1.c scan-assembler-not mov\tv[0-9]+.8b, v[0-9]+.8b
FAIL: g++.dg/ext/pr82625.C -std=gnu++17 (test for excess errors)
[Bug tree-optimization/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625 --- Comment #24 from Steve Ellcey --- Author: sje Date: Fri Oct 5 15:26:40 2018 New Revision: 264874 URL: https://gcc.gnu.org/viewcvs?rev=264874&root=gcc&view=rev Log: 2018-10-05 Steve Ellcey PR tree-optimization/71625 * /gcc.target/aarch64/vclz.c (test_vclz_s8): Add noinline attribute. (test_vclz_s16): Ditto. (test_vclz_s32): Ditto. (test_vclzq_s8): Ditto. (test_vclzq_s16): Ditto. (test_vclzq_s32): Ditto. (test_vclz_u8): Ditto. (test_vclz_u16): Ditto. (test_vclz_u32): Ditto. (test_vclzq_u8): Ditto. (test_vclzq_u16): Ditto. (test_vclzq_u32): Ditto. * gcc.target/aarch64/vneg_s.c (test_vneg_s8): Ditto. (test_vneg_s16): Ditto. (test_vneg_s32): Ditto. (test_vneg_s64): Ditto. (test_vnegd_s64): Ditto. (test_vnegq_s8): Ditto. (test_vnegq_s16): Ditto. (test_vnegq_s32): Ditto. (test_vnegq_s64): Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/vclz.c trunk/gcc/testsuite/gcc.target/aarch64/vneg_s.c
[Bug testsuite/87433] [9 Regression] gcc.dg/zero_bits_compound-1.c and gcc.target/aarch64/ashltidisi.c tests fail after combine two to two instruction patch on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87433 Steve Ellcey changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #7 from Steve Ellcey --- Fixed the failures by updating the two tests.
[Bug testsuite/87433] [9 Regression] gcc.dg/zero_bits_compound-1.c and gcc.target/aarch64/ashltidisi.c tests fail after combine two to two instruction patch on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87433 --- Comment #6 from Steve Ellcey --- Author: sje Date: Fri Sep 28 14:44:15 2018 New Revision: 264692 URL: https://gcc.gnu.org/viewcvs?rev=264692&root=gcc&view=rev Log: 2018-09-28 Steve Ellcey PR testsuite/87433 * gcc.target/aarch64/ashltidisi.c: Expect 3 asr instructions instead of 4. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/ashltidisi.c
[Bug testsuite/87433] [9 Regression] gcc.dg/zero_bits_compound-1.c and gcc.target/aarch64/ashltidisi.c tests fail after combine two to two instruction patch on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87433 --- Comment #5 from Steve Ellcey --- Author: sje Date: Fri Sep 28 14:41:45 2018 New Revision: 264691 URL: https://gcc.gnu.org/viewcvs?rev=264691&root=gcc&view=rev Log: 2018-09-28 Steve Ellcey PR testsuite/87433 * gcc.dg/zero_bits_compound-1.c: Do not run on aarch64*-*-*. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/zero_bits_compound-1.c
[Bug tree-optimization/71625] missing strlen optimization on different array initialization style
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71625 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #21 from Steve Ellcey --- Maybe this is already known but this patch: +2018-08-13 Martin Sebor + + PR tree-optimization/71625 + * c-common.c (braced_list_to_string): New function. + * c-common.h (braced_list_to_string): Declare it. + Caused two regressions on aarch64: FAIL: gcc.target/aarch64/vclz.c scan-assembler-times clz\\tv[0-9]+.16b, v[0-9]+.16b 2 FAIL: gcc.target/aarch64/vneg_s.c scan-assembler-times neg\\tv[0-9]+.16b, v[0-9]+.16b 1 FAIL: gcc.target/aarch64/vneg_s.c scan-assembler-times neg\\tv[0-9]+.8b, v[0-9]+.8b 1
[Bug testsuite/87433] [9 Regression] gcc.dg/zero_bits_compound-1.c and gcc.target/aarch64/ashltidisi.c tests fail after combine two to two instruction patch on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87433 --- Comment #3 from Steve Ellcey --- Based on that email string, gcc.target/aarch64/ashltidisi.c can be fixed by looking for 3 asr instructions instead of 4. That seems simple enough. The new code has two fewer instructions than the old code:

< NEW CODE
> OLD CODE

11,12c11,13
<       lsr     w1, w0, 11
<       lsl     x0, x0, 53
---
>       uxtw    x1, w0
>       lsl     x0, x1, 53
>       lsr     x1, x1, 11
24,25c25,27
<       sbfx    x1, x0, 11, 21
<       lsl     x0, x0, 53
---
>       sxtw    x1, w0
>       lsl     x0, x1, 53
>       asr     x1, x1, 11

But gcc.dg/zero_bits_compound-1.c is not an aarch64-specific test; we are seeing 'and' expressions in the rtl on aarch64. Based on the email string that seems OK. It generates the same number of instructions as before but there is some opportunity for doing some of them in parallel. But the test doesn't seem to be failing on the x86 or s390 platforms so should we just not run it on aarch64 anymore? That would seem like the right fix to me; I don't think the test makes sense on aarch64 anymore.
[Bug tree-optimization/61247] vectorization fails for unsigned is used for IV but casted to int before using as the index (and then casted for internal type)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61247 --- Comment #4 from Steve Ellcey --- Here is a simpler C version of the problem. On aarch64 in LP64 mode setting TYPE to int, long int, or unsigned long int allows for vectorization but using unsigned int does not get vectorized. In ILP32 mode, all types allow the loop to be vectorized. /* int gets vectorized long int gets vectorized unsigned long int gets vectorized unsigned int does NOT get vectorized in LP64 mode */ typedef unsigned int TYPE; void f(TYPE N, int *C, int *A, int val) { TYPE i,j; for (i=0; i
[Bug middle-end/87433] New: gcc.dg/zero_bits_compound-1.c and gcc.target/aarch64/ashltidisi.c regressions on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87433 Bug ID: 87433 Summary: gcc.dg/zero_bits_compound-1.c and gcc.target/aarch64/ashltidisi.c regressions on aarch64 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- The tests gcc.dg/zero_bits_compound-1.c and gcc.target/aarch64/ashltidisi.c have been failing since this checkin: commit 9fa26361aee8ed622921a36dd26d0b0ed0c75641 Author: segher Date: Mon Jul 30 13:18:17 2018 + combine: Allow combining two insns to two insns This patch allows combine to combine two insns into two. This helps in many cases, by reducing instruction path length, and also allowing further combinations to happen. PR85160 is a typical example of code that it can improve. This patch does not allow such combinations if either of the original instructions was a simple move instruction. In those cases combining the two instructions increases register pressure without improving the code. With this move test register pressure does no longer increase noticably as far as I can tell. (At first I also didn't allow either of the resulting insns to be a move instruction. But that is actually a very good thing to have, as should have been obvious). PR rtl-optimization/85160 * combine.c (is_just_move): New function. (try_combine): Allow combining two instructions into two if neither of the original instructions was a move.
[Bug target/71727] -O3 -mstrict-align produces code which assumes unaligned vector accesses work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71727 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #6 from Steve Ellcey --- Is there any reason this defect cannot be closed out?
[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538 --- Comment #2 from Steve Ellcey --- While I agree that we want users to use the __sync and atomic primitives, it still seems like it would be useful in some cases to know if the LSE operations are available and if GCC is generating code for them. I.e. is TARGET_LSE set or not. I don't like the idea of hiding information (or making it hard to determine) just because it might be misused. I will probably create a patch for this and submit it to gcc-patches to see if there is any support for it.
[Bug middle-end/86540] New: pr77445-2.c and ssa-dom-thread-7.c regressions since May 20, 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86540 Bug ID: 86540 Summary: pr77445-2.c and ssa-dom-thread-7.c regressions since May 20, 2018 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- According to Christophe Lyon, Martin Liska is aware of these failures and will fix them but I wanted to create a bug report to ensure they do not get forgotten. I see these regressions on aarch64 but I am not sure if they are platform specific. https://gcc.gnu.org/ml/gcc/2018-06/msg00289.html https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01479.html
[Bug target/86538] New: GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538 Bug ID: 86538 Summary: GCC should define a macro to specify if LSE is enabled or not Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- Right now there is no predefined macro in GCC that can tell if LSE is enabled or not. If you compile with -march=armv8.1-a+lse or -march=armv8.1-a+nolse you get the same set of predefined macros and so there is no way a user can tell if they can/should use LSE instructions. Other features (CRYPTO, RDMA, SVE, AES, SHA, etc.) do have macros associated with them. (See aarch64_update_cpp_builtins in gcc/config/aarch64/aarch64-c.c)
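To illustrate what the request would enable, here is a minimal C sketch of a compile-time feature check. The macro name __ARM_FEATURE_ATOMICS is an assumption borrowed from the ACLE naming scheme; the compiler described in this report does not define any LSE macro, so the check below would simply report 0 there. The function name is hypothetical.

```c
/* Returns 1 when the compiler advertises LSE through the assumed
   __ARM_FEATURE_ATOMICS macro, and 0 otherwise.  The macro name is
   an assumption for illustration; it is not defined by the GCC
   version this report was filed against.  */
int lse_macro_defined (void)
{
#ifdef __ARM_FEATURE_ATOMICS
  return 1;
#else
  return 0;
#endif
}
```

With such a macro, code built with +lse and +nolse could select different implementations at preprocessing time, as is already possible for the other features listed above.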
[Bug other/86153] [9 regression] test case g++.dg/pr83239.C fails starting with r261585
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86153 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #4 from Steve Ellcey --- If we wanted to restore the previous inlining behavior, adding the option '--param early-inlining-insns=30' seems to fix the failure for me on aarch64.
[Bug testsuite/86016] New tests for r260978 report excess errors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86016 Steve Ellcey changed: What|Removed |Added Last reconfirmed|2018-06-01 00:00:00 |2018-6-25 CC||dave.pagan at oracle dot com, ||sje at gcc dot gnu.org --- Comment #2 from Steve Ellcey --- Adding David Pagan since it looks like Jeff checked this patch in for him.
[Bug target/79924] aarch64: untranslated diagnostics in aarch64_err_no_fpadvsimd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79924 --- Comment #2 from Steve Ellcey --- Author: sje Date: Tue Jun 5 22:21:36 2018 New Revision: 261218 URL: https://gcc.gnu.org/viewcvs?rev=261218&root=gcc&view=rev Log: 2018-06-05 Steve Ellcey PR target/79924 * gcc.target/aarch64/mgeneral-regs_1.c: Update error message. * gcc.target/aarch64/mgeneral-regs_2.c: Ditto. * gcc.target/aarch64/mgeneral-regs_3.c: Ditto. * gcc.target/aarch64/nofp_1.c: Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/mgeneral-regs_1.c trunk/gcc/testsuite/gcc.target/aarch64/mgeneral-regs_2.c trunk/gcc/testsuite/gcc.target/aarch64/mgeneral-regs_3.c trunk/gcc/testsuite/gcc.target/aarch64/nofp_1.c
[Bug target/79924] aarch64: untranslated diagnostics in aarch64_err_no_fpadvsimd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79924 --- Comment #1 from Steve Ellcey --- Author: sje Date: Tue Jun 5 22:20:13 2018 New Revision: 261217 URL: https://gcc.gnu.org/viewcvs?rev=261217&root=gcc&view=rev Log: 2018-06-05 Steve Ellcey PR target/79924 * config/aarch64/aarch64-protos.h (aarch64_err_no_fpadvsimd): Remove second argument. * config/aarch64/aarch64.c (aarch64_err_no_fpadvsimd): Remove second argument, change how error is called. (aarch64_layout_arg): Remove second argument from aarch64_err_no_fpadvsimd call. (aarch64_init_cumulative_args): Ditto. (aarch64_gimplify_va_arg_expr): Ditto. * config/aarch64/aarch64.md (mov): Ditto. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64-protos.h trunk/gcc/config/aarch64/aarch64.c trunk/gcc/config/aarch64/aarch64.md
[Bug target/68256] Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68256 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #11 from Steve Ellcey --- FYI: This caused a regression on aarch64. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923
[Bug c/84923] gcc.dg/attr-weakref-1.c failed on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923 Steve Ellcey changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-04-24 CC||sje at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Steve Ellcey --- I am seeing this failure also.
[Bug tree-optimization/85483] New: Many failures on test gcc.target/aarch64/sve/vcond_1.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85483 Bug ID: 85483 Summary: Many failures on test gcc.target/aarch64/sve/vcond_1.c Product: gcc Version: 8.0.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- I am seeing a bunch of failures in the gcc.target/aarch64/sve/vcond_1.c test on aarch64. You can see them on the test results list at: https://gcc.gnu.org/ml/gcc-testresults/2018-04/msg01710.html They seem to have shown up in the last 24 hours, but I am not sure exactly which patch caused them.
[Bug middle-end/85383] [8 regression] many ICE failures at gcc/toplev.c:325 starting with r259346
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85383 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #2 from Steve Ellcey --- This looks like the same bug I am seeing on Aarch64 with the SPEC 2017 510.parest_r benchmark. I am compiling it with -Ofast -flto=32 -mcpu=native on a T99.
[Bug middle-end/85160] New: GCC generates mvn/and instructions instead of bic on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85160 Bug ID: 85160 Summary: GCC generates mvn/and instructions instead of bic on aarch64 Product: gcc Version: unknown Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- Target: aarch64 With this test case: int foo(int a, int b, int *c, int i, int j) { int x,y; x = ((a & (~c[i])) >> 7) | ((a & (~c[j])) >> 9); y = ((b & (~c[i])) >> 9) | ((b & (~c[j])) >> 7); return x | y; } GCC -O2 generates 2 'mvn' instructions and 4 'and' instructions. LLVM -O2 generates 4 'bic' instructions instead. GCC: foo: ldr w3, [x2, w3, sxtw 2] ldr w2, [x2, w4, sxtw 2] mvn w3, w3 mvn w2, w2 and w4, w3, w1 and w1, w2, w1 and w3, w3, w0 and w2, w2, w0 asr w4, w4, 9 asr w1, w1, 7 orr w3, w4, w3, asr 7 orr w2, w1, w2, asr 9 orr w0, w3, w2 ret LLVM: foo: ldr w8, [x2, w3, sxtw #2] ldr w9, [x2, w4, sxtw #2] bic w10, w0, w8 bic w8, w1, w8 asr w8, w8, #9 bic w11, w0, w9 orr w8, w8, w10, asr #7 bic w9, w1, w9 orr w8, w8, w11, asr #9 orr w0, w8, w9, asr #7 ret I am not sure if this should be considered target specific or not, the 'bic' instruction is aarch64 specific but GCC knows how to use it. I think combine didn't try to replace the mvn instructions because it is used by two subsequent instructions and that may be a generic combine issue.
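The equivalence behind the report can be sketched in C: each 'mvn' followed by 'and' computes a & ~b, which is what a single 'bic' produces. In the sketch below, bic and foo_bic are hypothetical helpers written for illustration; foo is the test case from the report.

```c
/* Hypothetical helper modelling what one aarch64 'bic' computes.  */
static int bic (int a, int b)
{
  return a & ~b;
}

/* Test case from the report.  */
int foo (int a, int b, int *c, int i, int j)
{
  int x, y;
  x = ((a & (~c[i])) >> 7) | ((a & (~c[j])) >> 9);
  y = ((b & (~c[i])) >> 9) | ((b & (~c[j])) >> 7);
  return x | y;
}

/* The same computation written as four and-not operations, which
   mirrors the shape of the LLVM output quoted above.  */
int foo_bic (int a, int b, int *c, int i, int j)
{
  int ci = c[i], cj = c[j];
  return (bic (a, ci) >> 7) | (bic (a, cj) >> 9)
         | (bic (b, ci) >> 9) | (bic (b, cj) >> 7);
}
```

Both functions evaluate the same expressions, so they return the same value for any input. The difference the report describes is only in instruction selection: two inversions shared by four 'and' instructions versus four 'bic' instructions with no separate inversion.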
[Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114 --- Comment #9 from Steve Ellcey --- > Can you let me know if my workaround helped? If useful I could backport it > to GCC7 as well. Yes, the patch helped. I ran spec 2017 fp rate and saw a small improvement (0.7%). Most of the speed up was in 508.namd_r, 526.blender_r, and 544.nab_r. 507.cactuBSSN_r and 511.povray_r slowed down a little bit.
[Bug target/83335] [8 regression][aarch64,ilp32] gcc.target/aarch64/asm-2.c ICEs since 255481
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83335 Steve Ellcey changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from Steve Ellcey --- This should be fixed now.
[Bug target/83335] [8 regression][aarch64,ilp32] gcc.target/aarch64/asm-2.c ICEs since 255481
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83335 --- Comment #6 from Steve Ellcey --- Author: sje Date: Thu Feb 22 17:08:10 2018 New Revision: 257908 URL: https://gcc.gnu.org/viewcvs?rev=257908&root=gcc&view=rev Log: 2018-02-22 Steve Ellcey PR target/83335 * gcc/testsuite/gcc.target/aarch64/asm-2.c: Add dg-error for ILP32 mode. * gcc/testsuite/gcc.target/aarch64/asm-4.c: New test. Added: trunk/gcc/testsuite/gcc.target/aarch64/asm-4.c Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/asm-2.c
[Bug target/83335] [8 regression][aarch64,ilp32] gcc.target/aarch64/asm-2.c ICEs since 255481
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83335 --- Comment #5 from Steve Ellcey --- Author: sje Date: Thu Feb 22 17:06:31 2018 New Revision: 257907 URL: https://gcc.gnu.org/viewcvs?rev=257907&root=gcc&view=rev Log: 2018-02-22 Steve Ellcey PR target/83335 * config/aarch64/aarch64.c (aarch64_print_address_internal): Change gcc_assert call to output_operand_lossage. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64.c
[Bug testsuite/83983] FAIL: g++.dg/lto/pr83121 (test for LTO warnings, pr83121_0.C line 8)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83983 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #3 from Steve Ellcey --- I tried digging into this some but was not able to come up with a fix. I compared x86, which gives the expected error messages with aarch64 which gives different errors and tried to find out where they diverged. I tracked it down to get_odr_type in ipa-devirt.c. Just before: if (val->type != type && (!val->types_set || !val->types_set->add (type))) I added this print statement: fprintf(stderr,"%p %p %p\n", (void *) type, (void *) val->type, (void *) val->types_set); On x86 I see: 0x7f98aa018dc8 0x7f98aa018930 (nil) 0x7f98aa0189d8 0x7f98aa018d20 (nil) 0x7f98aa0189d8 0x7f98aa018d20 0x32c9610 0x7f0222cbfc78 0x7f0222cbf9d8 (nil) 0x7f0222cbfa80 0x7f0222cbfa80 (nil) On Aarch64 I see: 0x4002c859aaa0 0x4002c859a758 (nil) 0x4002c859a6b0 0x4002c859a6b0 (nil) 0x4002c859a6b0 0x4002c859a6b0 (nil) 0x4001f45aa9f8 0x4001f45aa758 (nil) 0x4001f45aa800 0x4001f45aa800 (nil) I think the second line where type and val->type are the same for Aarch64 but not for x86 is where the problem is but I am not sure why we have this difference. I think it may be a bug in the odr hash function with an accidental hash collision but I am not sure.
[Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114 --- Comment #6 from Steve Ellcey --- (In reply to Wilco from comment #5) > (In reply to Steve Ellcey from comment #4) > > While teaching the reassociation pass about fma's seems like the right > > answer would it be reasonable (and simpler) to do the fma pass > > (pass_optimize_widening_mul) before > > the reassociation pass (pass_reassoc) to get the most fma's? > > > > That fixes my small test case but I haven't done a bigger performance check > > to see what the overall impact would be. > > I don't know what else that would affect since the reassociation phase runs > very early - and it's late at this stage. My patch seems much safer. Even > easier might be to return 1 for FLOAT_MODE PLUS_EXPR in > aarch64_reassociation_width. Then we can fix the reassociation phase in GCC9. Moving the fma phase did not have a good performance impact (it was worse). Your patch of setting the reassociation width to 1 did help performance on ThunderX2.
[Bug tree-optimization/84114] global reassociation pass prevents fma usage, generates slower code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114 --- Comment #4 from Steve Ellcey --- While teaching the reassociation pass about fma's seems like the right answer would it be reasonable (and simpler) to do the fma pass (pass_optimize_widening_mul) before the reassociation pass (pass_reassoc) to get the most fma's? That fixes my small test case but I haven't done a bigger performance check to see what the overall impact would be.
[Bug tree-optimization/84114] New: global reassociation pass prevents fma usage, generates slower code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84114 Bug ID: 84114 Summary: global reassociation pass prevents fma usage, generates slower code Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- Created attachment 43279 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43279&action=edit Test case The example code comes from milc in SPEC2006. GCC on x86 or aarch64 generates better code with -O3 than it does with -Ofast or '-O3 -ffast-math'. On x86 compiling with '-mfma -O3' I get 5 vfmadd231sd instructions, 1 vmulsd instruction and 6 vmovsd. With '-mfma -Ofast' I get 3 vfmadd231sd, 2 vaddsd, 3 vmulsd, and 6 vmovsd. That is two extra instructions. The problem seems to be that -Ofast turns on -ffast-math and that enables the global reassociation pass (tree-ssa-reassoc.c) and the code changes done there create some temporary variables which inhibit the recognition and use of fma instructions. Using -O3 and -Ofast on aarch64 shows the same change.
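The attached test case is not reproduced here. As a rough sketch of the effect, assuming a sum-of-products kernel of the same general shape, the two functions below show a serial chain and a split form with temporaries. `dot4` and `dot4_split` are hypothetical names, and the split is written by hand to illustrate the idea; it is not claimed to be the exact output of tree-ssa-reassoc.c.

```c
/* Hypothetical kernel, not the attached test case: a serial sum of
   products. Evaluated left to right, the first product is a multiply
   and each following '+ a[k] * b[k]' can become one fused multiply-add. */
double dot4(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* Hand-written split of the same sum into two partial sums held in
   temporaries. Each partial sum needs its own plain multiply, and the
   final t0 + t1 is a plain add, so fewer operations can be fused. */
double dot4_split(const double *a, const double *b)
{
    double t0 = a[0] * b[0] + a[1] * b[1];
    double t1 = a[2] * b[2] + a[3] * b[3];
    return t0 + t1;
}
```

Comparing the assembly for the two functions at -O3 with fma enabled is one way to see whether the split form costs extra multiply and add instructions on a given target.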
[Bug c/65345] ICE with _Generic selection on _Atomic int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65345 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #29 from Steve Ellcey --- It looks like this has been fixed on all the different platforms and my googling hasn't found any pr65345-[12].c failures in gcc-testresults since 2016. Should we close this out?
[Bug target/83726] [8 Regression] ICE: in final_scan_insn, at final.c:3063
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83726 --- Comment #7 from Steve Ellcey --- I tested the patch on my aarch64 box, I got three regressions: FAIL: gcc.target/aarch64/pr78733.c scan-assembler adr FAIL: gcc.target/aarch64/pr79041-2.c scan-assembler adr FAIL: gfortran.fortran-torture/compile/pr83081.f90, -O3 -g (internal compiler error) pr83081.f90 looks like a new test so it may not actually be a regression related to this patch, but the first two are. pr79041-2 generates: t: mov x0, 0 mov x1, 65536 ret with the patch rather than: t: adr x0, .LC0 ldp x0, x1, [x0] ret .size t, .-t .align 4 .LC0: .xword 0 .xword 65536
[Bug target/82066] #pragma GCC target documentation does not say it is implemented for aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82066 Steve Ellcey changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||sje at gcc dot gnu.org Known to work||8.0 Resolution|--- |FIXED --- Comment #4 from Steve Ellcey --- Yes, this is fixed for 8.0.
[Bug target/83726] [8 Regression] ICE: in final_scan_insn, at final.c:3063
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83726 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #4 from Steve Ellcey --- This looks like the same thing as PR 83632.
[Bug target/83285] non-atomic stores can removed with seq_cst (and store release) on AArch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83285 Steve Ellcey changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from Steve Ellcey --- This was fixed for 8.0 with this patch: commit 31d7a9b35fef974dad13881df6d319b7b08600e8 Author: amonakov Date: Mon Sep 4 10:16:37 2017 + optabs: ensure atomic_load/stores have compiler barriers PR rtl-optimization/57448 PR target/67458 PR target/81316 * optabs.c (expand_atomic_load): Place compiler memory barriers if using atomic_load pattern. (expand_atomic_store): Likewise. testsuite/ * gcc.dg/atomic/pr80640-2.c: New testcase. * gcc.dg/atomic/pr81316.c: New testcase.
[Bug target/83285] non-atomic stores can removed with seq_cst (and store release) on AArch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83285 Steve Ellcey changed: What|Removed |Added CC||sje at gcc dot gnu.org --- Comment #2 from Steve Ellcey --- This may be fixed on top-of-tree (though I don't know what change might have fixed it). If I build with ToT (8.0 prerelease) I see all three stores. Example from comment #1 compiled with -O3: % more x.s .arch armv8-a .file "x.c" .text .align 2 .p2align 3,,7 .global _Z7seq_cstRiRSt6atomicIiE .type _Z7seq_cstRiRSt6atomicIiE, %function _Z7seq_cstRiRSt6atomicIiE: .LFB342: .cfi_startproc mov w2, 1 str w2, [x0] mov w2, 2 stlr w2, [x1] mov w1, 3 str w1, [x0] ret .cfi_endproc .LFE342: .size _Z7seq_cstRiRSt6atomicIiE, .-_Z7seq_cstRiRSt6atomicIiE .ident "GCC: (GNU) 8.0.0 20180105 (experimental)" .section .note.GNU-stack,"",@progbits
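The source from comment #1 is not quoted in this message. Judging only from the mangled name (seq_cst(int&, std::atomic<int>&)) and the three stores in the listing, a C11 rendering of the function might look like the sketch below. It is an assumption, not the original C++ test case.

```c
#include <stdatomic.h>

/* Assumed C11 equivalent of the function in the listing: a plain
   store, a sequentially consistent atomic store, then another plain
   store to the same location as the first. */
void seq_cst(int *a, atomic_int *b)
{
    *a = 1;                                            /* str  w2, [x0] */
    atomic_store_explicit(b, 2, memory_order_seq_cst); /* stlr w2, [x1] */
    *a = 3;                                            /* str  w1, [x0] */
}
```

The bug was that the first plain store could be removed even though the atomic store sits between the two plain stores; the listing above shows all three stores present.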
[Bug target/83335] [8 regression][aarch64,ilp32] gcc.target/aarch64/asm-2.c ICEs since 255481
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83335 --- Comment #3 from Steve Ellcey --- Proposed patch https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00348.html
[Bug rtl-optimization/83500] gcc.dg/tree-prof/switch-case-1.c fails on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83500 Steve Ellcey changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #4 from Steve Ellcey --- Since the test is now passing I will close this defect as fixed.
[Bug target/83466] Wrong TLS GD sequence for ILP32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83466 --- Comment #4 from Steve Ellcey --- Created attachment 43027 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43027&action=edit Patch file being tested I am testing this patch for regressions, I have verified that it does fix the small test case.
[Bug rtl-optimization/83500] gcc.dg/tree-prof/switch-case-1.c fails on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83500 --- Comment #3 from Steve Ellcey --- The test now passes for me.
[Bug rtl-optimization/83500] New: gcc.dg/tree-prof/switch-case-1.c fails on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83500 Bug ID: 83500 Summary: gcc.dg/tree-prof/switch-case-1.c fails on aarch64 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org Target Milestone: --- This test started failing on aarch64-linux-gnu with this checkin: commit b33f4eb038b5c30bf57de6bb10f40e11481c6be6 Author: hubicka Date: Sat Oct 7 16:33:26 2017 + * tree-switch-conversion.c (do_jump_if_equal, emit_cmp_and_jump_insns): Update profile. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@253512 138bc75d-0d04-0410-961f-82ee72b054a4 Before the change I see these lines in the expand file: % grep 'basic block.* count 4000' *expand ;; basic block 17, loop depth 0, count 4000, freq 4000, maybe hot ;; basic block 18, loop depth 0, count 4000, freq 4000, maybe hot % grep 'basic block.* count 2000' *expand ;; basic block 23, loop depth 0, count 2000, freq 2000, maybe hot After the change I see these lines in the expand file: % grep 'basic block.* count 4000' *expand ;; basic block 17, loop depth 0, count 4000, freq 4000, maybe hot ;; basic block 18, loop depth 0, count 4000, freq 4000, maybe hot % grep 'basic block.* count 2000' *expand ;; basic block 8, loop depth 0, count 2000 (adjusted), freq 2000, maybe hot ;; basic block 23, loop depth 0, count 2000, freq 2000, maybe hot Having two lines with 'count 2000' is causing the test failure. I do not know if the test needs to be changed or if this indicates an actual problem in the compiler.
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356 --- Comment #8 from Steve Ellcey --- Author: sje Date: Tue Nov 21 00:18:14 2017 New Revision: 254977 URL: https://gcc.gnu.org/viewcvs?rev=254977&root=gcc&view=rev Log: 2017-11-20 Steve Ellcey PR target/81356 * gfortran.dg/pr45636.f90 (aarch64*-*-*): Remove from xfail list. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gfortran.dg/pr45636.f90
[Bug target/81356] __builtin_strcpy is not good for copying an empty string on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81356 --- Comment #7 from Steve Ellcey --- Author: sje Date: Fri Nov 17 22:44:32 2017 New Revision: 254901 URL: https://gcc.gnu.org/viewcvs?rev=254901&root=gcc&view=rev Log: 2017-11-17 Steve Ellcey PR target/81356 * config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p): Remove. (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64.c
[Bug target/79868] aarch64: diagnostic "malformed target %s value" not translateable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79868 Steve Ellcey changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED Target Milestone|--- |8.0 --- Comment #11 from Steve Ellcey --- Fixed on ToT for 8.0.
[Bug tree-optimization/80925] [8 Regression] vect peeling failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80925 --- Comment #27 from Steve Ellcey --- (In reply to Richard Biener from comment #26) > Fixed? I still see these vect failures on aarch64: FAIL: gcc.dg/vect/pr65947-14.c execution test FAIL: gcc.dg/vect/pr65947-14.c -flto -ffat-lto-objects execution test FAIL: g++.dg/vect/slp-pr56812.cc -std=c++11 scan-tree-dump-times slp1 "basic block vectorized" 1 (found 0 times) FAIL: g++.dg/vect/slp-pr56812.cc -std=c++14 scan-tree-dump-times slp1 "basic block vectorized" 1 (found 0 times) FAIL: g++.dg/vect/slp-pr56812.cc -std=c++98 scan-tree-dump-times slp1 "basic block vectorized" 1 (found 0 times) I don't think the pr65947-14.c failure is related to this change, but the slp-pr56812.cc failure is one of the failures listed in the original report.
[Bug target/79868] aarch64: diagnostic "malformed target %s value" not translateable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79868 --- Comment #10 from Steve Ellcey --- Author: sje Date: Thu Nov 2 21:58:05 2017 New Revision: 254360 URL: https://gcc.gnu.org/viewcvs?rev=254360&root=gcc&view=rev Log: PR target/79868 * gcc.target/aarch64/spellcheck_1.c: Update dg-error string to match new format. * gcc.target/aarch64/spellcheck_2.c: Ditto. * gcc.target/aarch64/spellcheck_3.c: Ditto. * gcc.target/aarch64/target_attr_11.c: Ditto. * gcc.target/aarch64/target_attr_12.c: Ditto. * gcc.target/aarch64/target_attr_17.c: Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/spellcheck_1.c trunk/gcc/testsuite/gcc.target/aarch64/spellcheck_2.c trunk/gcc/testsuite/gcc.target/aarch64/spellcheck_3.c trunk/gcc/testsuite/gcc.target/aarch64/target_attr_11.c trunk/gcc/testsuite/gcc.target/aarch64/target_attr_12.c trunk/gcc/testsuite/gcc.target/aarch64/target_attr_17.c
[Bug target/79868] aarch64: diagnostic "malformed target %s value" not translateable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79868 --- Comment #9 from Steve Ellcey --- Author: sje Date: Thu Nov 2 21:56:00 2017 New Revision: 254359 URL: https://gcc.gnu.org/viewcvs?rev=254359&root=gcc&view=rev Log: PR target/79868 * config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Remove second argument from aarch64_process_target_attr call. * config/aarch64/aarch64-protos.h (aarch64_process_target_attr): Ditto. * config/aarch64/aarch64.c (aarch64_attribute_info): Change field type. (aarch64_handle_attr_arch): Remove second argument. (aarch64_handle_attr_cpu): Ditto. (aarch64_handle_attr_tune): Ditto. (aarch64_handle_attr_isa_flags): Ditto. (aarch64_process_one_target_attr): Ditto. (aarch64_process_target_attr): Ditto. (aarch64_option_valid_attribute_p): Remove second argument on aarch64_process_target_attr call. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64-c.c trunk/gcc/config/aarch64/aarch64-protos.h trunk/gcc/config/aarch64/aarch64.c
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #17 from Steve Ellcey --- Yes, this fixed my SPEC problem.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #14 from Steve Ellcey --- (In reply to Segher Boessenkool from comment #13) > I have a simpler patch. It is testing... Can you attach your patch to this defect so I can test it as well?
[Bug target/82786] New: aarch64 frame patch caused a number of target specific test failures.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82786 Bug ID: 82786 Summary: aarch64 frame patch caused a number of target specific test failures. Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: sje at gcc dot gnu.org CC: wdijkstr at arm dot com Target Milestone: --- Target: aarch64-*-* This patch: 2017-10-26 Wilco Dijkstra * config/aarch64/aarch64.c (aarch64_layout_frame): Ensure LR is always stored at the bottom of the callee-saves. Remove rarely used frame layout which saves callee-saves at top of frame, so the store of LR can be used as a valid probe in all cases. Caused several tests to fail: FAIL: gcc.target/aarch64/test_frame_10.c scan-assembler ldp\tx19, x30, \\[sp, [0-9]+\\] FAIL: gcc.target/aarch64/test_frame_10.c scan-assembler-times stp\tx19, x30, \\[sp, [0-9]+\\] 1 (found 0 times) FAIL: gcc.target/aarch64/test_frame_2.c scan-assembler ldp\tx19, x30, \\[sp\\], [0-9]+ FAIL: gcc.target/aarch64/test_frame_2.c scan-assembler-times stp\tx19, x30, \\[sp, -[0-9]+\\]! 1 (found 0 times) FAIL: gcc.target/aarch64/test_frame_4.c scan-assembler ldp\tx19, x30, \\[sp\\], [0-9]+ FAIL: gcc.target/aarch64/test_frame_4.c scan-assembler-times stp\tx19, x30, \\[sp, -[0-9]+\\]! 1 (found 0 times) FAIL: gcc.target/aarch64/test_frame_7.c scan-assembler ldp\tx19, x30, \\[sp\\] FAIL: gcc.target/aarch64/test_frame_7.c scan-assembler-times stp\tx19, x30, \\[sp] 1 (found 0 times)
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #12 from Steve Ellcey --- Created attachment 42491 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42491&action=edit Patch that fixes the test case Here is a possible patch. It fixes the test case and I am doing a bootstrap and make check on it.
[Bug rtl-optimization/82683] Combine: GCC generates bad code with -tune=thunderx2t99
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683 --- Comment #11 from Steve Ellcey --- I think I see where this is going wrong but I don't know what to do about it. In try_combine, line 3288 we have i2 and i3 of: (insn 18 16 19 3 (set (reg:DI 91) (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2]))) "x.i":31 674 {*aarch64_ashl_sisd_or_int_di3} (expr_list:REG_DEAD (reg:DI 83 [ _26 ]) (nil))) (insn 19 18 20 3 (set (reg/f:DI 78 [ _7 ]) (plus:DI (reg/f:DI 76 [ _4 ]) (reg:DI 91))) "x.i":31 94 {*adddi3_aarch64} (expr_list:REG_DEAD (reg:DI 91) (expr_list:REG_DEAD (reg/f:DI 76 [ _4 ]) (nil)))) After that if statement we have i2 and i3 looking like: (insn 18 16 19 3 (set (reg:DI 91) (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2]))) "x.i":31 674 {*aarch64_ashl_sisd_or_int_di3} (expr_list:REG_DEAD (reg:DI 83 [ _26 ]) (nil))) (insn 19 18 20 3 (set (reg/f:DI 78 [ _7 ]) (plus:DI (ashift:DI (reg:DI 83 [ _26 ]) (const_int 2 [0x2])) (reg/f:DI 76 [ _4 ]))) "x.i":31 94 {*adddi3_aarch64} (expr_list:REG_DEAD (reg:DI 91) (expr_list:REG_DEAD (reg/f:DI 76 [ _4 ]) (nil)))) The REG_DEAD note for reg 83 on insn 18 is now bogus, because insn 19 uses reg 83 after the substitution. I don't know where or how this should be fixed up though.