[Bug target/105325] power10: Error: operand out of range
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105325 acsawdey at gcc dot gnu.org changed: What|Removed |Added CC||acsawdey at gcc dot gnu.org --- Comment #12 from acsawdey at gcc dot gnu.org --- I do have a patch for this one that has been sitting around and that I forgot about. I am looking at reviving it so that I can at least post it.
[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197 --- Comment #5 from acsawdey at gcc dot gnu.org ---
Bisection reveals that this starts with this commit:

  20d70cd2719815d9ea853314775ae5787648ece5 is the first bad commit
  commit 20d70cd2719815d9ea853314775ae5787648ece5
  Author: Alan Modra
  Date:   Thu May 9 08:37:26 2019 +0930

      [RS6000] PR89271, gcc.target/powerpc/vsx-simode2.c

      This patch makes a number of corrections to rs6000_register_move_cost, adds a new register union class, GEN_OR_VSX_REGS, and adjusts insn alternative costs to suit.
[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197 --- Comment #4 from acsawdey at gcc dot gnu.org --- I was compiling with -mcpu=power9, yes: /home2/sawdey/work/gcc/trunk/build/gcc/xgcc -B/home2/sawdey/work/gcc/trunk/build/gcc -O3 -mcpu=power9 bug2.c
[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197 --- Comment #2 from acsawdey at gcc dot gnu.org ---
From the reload dump:

  0 Non input pseudo reload: reject++
  1 Non-pseudo reload: reject+=2
  1 Non input pseudo reload: reject++
  alt=0,overall=16,losers=2,rld_nregs=2
  0 Non input pseudo reload: reject++
  alt=1,overall=7,losers=1,rld_nregs=1
  alt=2,overall=6,losers=1,rld_nregs=0
  [...]
  Choosing alt 2 in insn 9: (0) wa (1) Z {*movqi_internal}

The addressing for insn 9 is just reg+const, so why did it think it would have to reload one register for alt 1 (d-form) and 0 for alt 2, which is x-form?
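To make the dump easier to read, here is a minimal sketch of the comparison it implies, assuming the winner is simply the alternative with the lowest "overall" value. The struct `alt_cost` and the function `pick_alternative` are hypothetical names for illustration only. This is not LRA's actual selection code, which may weigh more than one number.

```c
#include <stddef.h>

/* Hypothetical record for one "alt=N,overall=...,losers=...,rld_nregs=..."
   line of the dump.  The field names mirror the dump, not GCC's sources.  */
struct alt_cost {
    int alt;
    int overall;
    int losers;
    int rld_nregs;
};

/* Return the alt number with the smallest "overall" value.  The first
   entry wins ties.  Returns -1 when the list is empty.  */
int pick_alternative(const struct alt_cost *alts, size_t n)
{
    int best = -1;
    int best_overall = 0;

    for (size_t i = 0; i < n; i++) {
        if (best < 0 || alts[i].overall < best_overall) {
            best = alts[i].alt;
            best_overall = alts[i].overall;
        }
    }
    return best;
}
```

Fed the three lines from the dump (overall 16, 7 and 6), this picks alt 2, matching "Choosing alt 2". The open question in the comment is why alt 1 was charged a reload register at all.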
[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197 --- Comment #1 from acsawdey at gcc dot gnu.org --- Looking at trunk, after expand we have this: (note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 2 5 3 2 (set (reg/v/f:DI 117 [ a ]) (reg:DI 3 3 [ a ])) "bug2.c":3:1 -1 (nil)) (insn 3 2 4 2 (set (reg/v/f:DI 118 [ b ]) (reg:DI 4 4 [ b ])) "bug2.c":3:1 -1 (nil)) (note 4 3 7 2 NOTE_INSN_FUNCTION_BEG) (insn 7 4 9 2 (set (reg:DI 119) (mem:DI (reg/v/f:DI 118 [ b ]) [0 MEM [(void *)b_3(D)]+0 S8 A8])) "bug2.c":4:3 -1 (nil)) (insn 9 7 8 2 (set (reg:QI 120) (mem:QI (plus:DI (reg/v/f:DI 118 [ b ]) (const_int 8 [0x8])) [0 MEM [(void *)b_3(D)]+8 S1 A8])) "bug2.c":4:3 -1 (nil)) (insn 8 9 10 2 (set (mem:DI (reg/v/f:DI 117 [ a ]) [0 MEM [(void *)a_2(D)]+0 S8 A8]) (reg:DI 119)) "bug2.c":4:3 -1 (nil)) (insn 10 8 0 2 (set (mem:QI (plus:DI (reg/v/f:DI 117 [ a ]) (const_int 8 [0x8])) [0 MEM [(void *)a_2(D)]+8 S1 A8]) (reg:QI 120)) "bug2.c":4:3 -1 (nil)) Which is the expected code, DI and QI loads/stores that should produce D-form instructions. But it looks like reload put the QI into hard reg 32 which is a fp reg: (insn 9 17 8 2 (set (reg:QI 32 0 [orig:120 MEM [(void *)b_3(D)]+8 ] [120]) (mem:QI (reg:DI 10 10 [124]) [0 MEM [(void *)b_3(D)]+8 S1 A8])) "bug2.c":4:3 549 {*movqi_internal} which leads to the lxsibzx/stxsibx on output.
[Bug target/103197] New: ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197 Bug ID: 103197 Summary: ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte Product: gcc Version: 10.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: ---
This got broken sometime in the gcc 10 timeframe. For this test case:

  #include <string.h>
  void m(char *a, char *b)
  {
    memcpy(a,b,9);
  }

AT13 (gcc 9.3.1) produces:

  m:
  .LFB0:
          .cfi_startproc
          ld 10,0(4)
          lbz 9,8(4)
          std 10,0(3)
          stb 9,8(3)
          blr
          .long 0
          .byte 0,0,0,0,0,0,0,0
          .cfi_endproc

which is the expected code to copy 9 bytes. AT14 (gcc 10.3.1), gcc 11, and current trunk all produce:

  m:
  .LFB0:
          .cfi_startproc
          addi 10,4,8
          ld 9,0(4)
          lxsibzx 0,0,10
          std 9,0(3)
          addi 9,3,8
          stxsibx 0,0,9
          blr
          .long 0
          .byte 0,0,0,0,0,0,0,0
          .cfi_endproc

which is really bad, mixing gpr and vsx. The inline expansion code in expand_block_move() does not attempt to generate vsx code at all unless the size is at least 16 bytes.
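As a rough C-level picture of the expected expansion, the sketch below splits the 9-byte copy into one 8-byte move and one 1-byte move, the same shape as the ld/std plus lbz/stb sequence. The function name `copy9` is made up for illustration. It is not what expand_block_move() emits, only a model of the intended decomposition.

```c
#include <stdint.h>
#include <string.h>

/* Copy exactly 9 bytes as one doubleword move plus one byte move.
   Both temporaries stand in for general-purpose registers.  */
void copy9(char *dst, const char *src)
{
    uint64_t dword;
    unsigned char byte;

    memcpy(&dword, src, 8);          /* like: ld   r,0(src) */
    byte = (unsigned char)src[8];    /* like: lbz  r,8(src) */
    memcpy(dst, &dword, 8);          /* like: std  r,0(dst) */
    dst[8] = (char)byte;             /* like: stb  r,8(dst) */
}
```

Both loads are issued before either store, as in the AT13 output, so the last byte never needs to leave the integer register file.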
[Bug target/100996] rs6000 p10 vector add-add fusion should work with -m32 but doesn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100996 acsawdey at gcc dot gnu.org changed: What|Removed |Added Ever confirmed|0 |1 Target||powerpc-*-*-* Last reconfirmed||2021-06-09 Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED
[Bug target/100996] New: rs6000 p10 vector add-add fusion should work with -m32 but doesn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100996 Bug ID: 100996 Summary: rs6000 p10 vector add-add fusion should work with -m32 but doesn't Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: ---
The fusion-p10-addadd.c test case does not get vector add-add fusion when compiling with -m32:

  /home/sawdey/work/gcc/trunk/build/gcc/xgcc -B/home/sawdey/work/gcc/trunk/build/gcc/ /home/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.target/powerpc/fusion-p10-addadd.c -m32 -fdiagnostics-plain-output -mcpu=power10 -O3 -dap -fno-ident -S

  typedef vector long vlong;
  vlong vaddadd(vlong a, vlong b, vlong c) { return a+b+c; }

  vaddadd:
  .LFB3:
          .cfi_startproc
          vadduwm 2,2,3    # 8    [c=4 l=4]  addv4si3
          vadduwm 2,2,4    # 14   [c=4 l=4]  addv4si3
          blr              # 24   [c=4 l=4]  simple_return
          .cfi_endproc
[Bug target/97926] ICE in patch_jump_insn, at cfgrtl.c:1298
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97926 --- Comment #3 from acsawdey at gcc dot gnu.org ---
So the underlying problem here is that the unordered comparisons are not allowed with -ffinite-math-only due to this predicate:

  ;; Return 1 if OP is a comparison operation that is valid for a branch
  ;; instruction. We check the opcode against the mode of the CC value.
  ;; validate_condition_mode is an assertion.
  (define_predicate "branch_comparison_operator"
     (and (match_operand 0 "comparison_operator")
          (match_test "GET_MODE_CLASS (GET_MODE (XEXP (op, 0))) == MODE_CC")
          (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCFPmode")
            (if_then_else (match_test "flag_finite_math_only")
              (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
              (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
            (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
          (match_test "validate_condition_mode (GET_CODE (op), GET_MODE (XEXP (op, 0))), 1")))

But ubsan_instrument_float_cast() generates this:

  t = fold_build2 (UNLE_EXPR, boolean_type_node, expr, min);
  tt = fold_build2 (UNGE_EXPR, boolean_type_node, expr, max);

which eventually leads to the ICE. Even if this branch wasn't rewritten by patch_jump_insn() it would not be recognized and would eventually ICE. Segher is working on a change to that predicate for PR98092, though, which may work around this.
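The CCFPmode arm of that predicate can be modelled in a few lines of C. This is only a sketch of the two match_code lists quoted above. The helper names `in_code_list` and `ccfp_branch_code_ok` are invented here, and the real predicate also checks the operand's mode class and calls validate_condition_mode, which this model ignores.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if NAME is a whole entry of the comma-separated LIST.  */
static bool in_code_list(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    while (*p) {
        const char *comma = strchr(p, ',');
        size_t n = comma ? (size_t)(comma - p) : strlen(p);

        if (n == len && strncmp(p, name, len) == 0)
            return true;
        if (!comma)
            break;
        p = comma + 1;
    }
    return false;
}

/* Model of the CCFPmode case only: which comparison codes the quoted
   predicate accepts, depending on flag_finite_math_only.  */
bool ccfp_branch_code_ok(const char *code, bool finite_math_only)
{
    const char *allowed = finite_math_only
        ? "lt,le,gt,ge,eq,ne,unordered,ordered"
        : "lt,gt,eq,unordered,unge,unle,ne,ordered";

    return in_code_list(allowed, code);
}
```

With the finite-math flag set, both `ccfp_branch_code_ok("unle", true)` and `ccfp_branch_code_ok("unge", true)` are false, which is the mismatch with the UNLE_EXPR/UNGE_EXPR trees that ubsan builds.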
[Bug target/97926] ICE in patch_jump_insn, at cfgrtl.c:1298
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97926 --- Comment #2 from acsawdey at gcc dot gnu.org ---
patch_jump_insn() is running into a land mine -- the insn before modification is invalid:

  (gdb) p insn_invalid_p(insn, true)
  $4 = 1
  (gdb) pr insn
  (jump_insn 18 17 114 6 (set (pc)
          (if_then_else (unle (reg:CCFP 131)
                  (const_int 0 [0]))
              (label_ref 21)
              (pc))) "../../gcc/gcc/testsuite/c-c++-common/ubsan/float-cast-overflow-11.c":9:10 -1
       (nil)
   -> 21)

So verify_changes() fails because the same insn with a different label_ref inserted is also invalid.
[Bug target/99070] ICE in extract_constrain_insn, at recog.c:2670
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99070 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #6 from acsawdey at gcc dot gnu.org --- Fixed in trunk.
[Bug target/99070] ICE in extract_constrain_insn, at recog.c:2670
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99070 --- Comment #4 from acsawdey at gcc dot gnu.org --- OK, I see the fail with -mcpu=power9. Looks like I botched something with addressing and allowed D-form addresses when it should be DS-form. On power10 this would result in selection of a prefix D-form load, which then causes the ld-cmpi to be split. On anything previous we just ICE.
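For readers less familiar with the two address forms, here is a small sketch of the displacement constraints as I understand them from the Power ISA encoding: D-form carries a signed 16-bit displacement, while DS-form carries a 14-bit field with two implied low zero bits, so the displacement must also be a multiple of 4. The function names are illustrative only and are not taken from the rs6000 back end.

```c
#include <stdbool.h>

/* D-form (e.g. lbz, lwz): any signed 16-bit displacement.  */
bool valid_d_form_offset(long off)
{
    return off >= -32768 && off <= 32767;
}

/* DS-form (e.g. ld, std): signed 16-bit displacement whose low two
   bits are zero, because the encoding stores only the upper 14 bits.  */
bool valid_ds_form_offset(long off)
{
    return valid_d_form_offset(off) && (off % 4) == 0;
}
```

An offset such as 6 passes the D-form check but not the DS-form one, which is the kind of address that would slip through if D-form were allowed where DS-form is required.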
[Bug target/99070] ICE in extract_constrain_insn, at recog.c:2670
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99070 acsawdey at gcc dot gnu.org changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org --- Comment #2 from acsawdey at gcc dot gnu.org --- What are the build flags for this compiler?
[Bug rtl-optimization/98692] Unitialized Values reported only with -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98692 --- Comment #7 from acsawdey at gcc dot gnu.org --- The inline expansion should be disabled by -Os, the patterns for cmpstr[n]si both have this: if (optimize_insn_for_size_p ()) FAIL;
[Bug target/98688] C++ modules support does not work on PowerPC with opaque MMA types vector_pair/vector_quad
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98688 --- Comment #3 from acsawdey at gcc dot gnu.org --- Yeah it's pretty clear that something needs to be output, as with that code I get an error like this: In module imported at mma-module-2.C:1:1: mma_foo0: In function ‘int bar(__vector_quad*, vec_t*, __vector_pair*)’: mma_foo0: error: failed to read compiled module cluster 2: Bad file data mma_foo0: note: compiled module file is ‘gcm.cache/mma_foo0.gcm’ mma-module-2.C:7:5: fatal error: failed to load binding ‘::foo0@mma_foo0’ 7 | foo0 (dst, vec, pvecp); | ^~~~ This is with a little test case of two files: export module mma_foo0; typedef unsigned char vec_t __attribute__((vector_size(16))); export void foo0 (__vector_quad *dst, vec_t *vec, __vector_pair *pvecp) { __vector_quad acc; __vector_pair vecp0 = *pvecp; vec_t vec1 = vec[1]; __builtin_mma_xvf64ger (&acc, vecp0, vec1); __builtin_mma_xvf64gerpp (&acc, vecp0, vec1); __builtin_mma_xvf64gerpn (&acc, vecp0, vec1); dst[0] = acc; } typedef unsigned char vec_t __attribute__((vector_size(16))); int bar(__vector_quad *dst, vec_t *vec, __vector_pair *pvecp) { foo0 (dst, vec, pvecp); }
[Bug target/98688] C++ modules support does not work on PowerPC with opaque MMA types vector_pair/vector_quad
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98688 --- Comment #1 from acsawdey at gcc dot gnu.org ---
I don't know if this is the right thing to do, but ignoring the opaque type here makes the ICE go away. I suspect I need to construct a module test case using vector_pair/vector_quad to really test this though.

  diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
  index d2093916c9e..3ec0b04def3 100644
  --- a/gcc/cp/module.cc
  +++ b/gcc/cp/module.cc
  @@ -8831,6 +8831,10 @@ trees_out::type_node (tree type)
         }
         break;
   
  +    case OPAQUE_TYPE:
  +      /* No additional data. */
  +      break;
  +
       case OFFSET_TYPE:
         tree_node (TYPE_OFFSET_BASETYPE (type));
         break;
[Bug target/98688] New: C++ modules support does not work on PowerPC with opaque MMA types vector_pair/vector_quad
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98688 Bug ID: 98688 Summary: C++ modules support does not work on PowerPC with opaque MMA types vector_pair/vector_quad Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: --- Similar to PR98645, we run into trouble if we try to compile code using vector_pair/vector_quad using -fmodule-header: /home/sawdey/work/gcc/trunk2/build/gcc/xg++ -B/home/sawdey/work/gcc/trunk2/build/gcc/ /home/sawdey/work/gcc/trunk2/gcc/gcc/testsuite/gcc.target/powerpc/mma-builtin-2.c -std=c++2a -fmodule-header -S -o mb2.s /home/sawdey/work/gcc/trunk2/gcc/gcc/testsuite/gcc.target/powerpc/mma-builtin-2.c: internal compiler error: in type_node, at cp/module.cc:8779 0x10468287 trees_out::type_node(tree_node*) ../../gcc/gcc/cp/module.cc:8779 0x1046446b trees_out::tree_node(tree_node*) ../../gcc/gcc/cp/module.cc:9106 0x10467dcb trees_out::type_node(tree_node*) ../../gcc/gcc/cp/module.cc:8773 0x1046446b trees_out::tree_node(tree_node*) ../../gcc/gcc/cp/module.cc:9106 0x10465d57 trees_out::core_vals(tree_node*) ../../gcc/gcc/cp/module.cc:6088 0x1046783b trees_out::tree_node_vals(tree_node*) ../../gcc/gcc/cp/module.cc:7141 0x1046783b trees_out::fn_parms_init(tree_node*) ../../gcc/gcc/cp/module.cc:10037 0x10461833 trees_out::decl_value(tree_node*, depset*) ../../gcc/gcc/cp/module.cc:7738 0x1046e163 depset::hash::find_dependencies() ../../gcc/gcc/cp/module.cc:13199 0x1046eae7 module_state::write(elf_out*, cpp_reader*) ../../gcc/gcc/cp/module.cc:17568 0x10470313 finish_module_processing(cpp_reader*) ../../gcc/gcc/cp/module.cc:19747 0x103ae82f c_parse_final_cleanups() ../../gcc/gcc/cp/decl2.c:5178 0x1072cad7 c_common_parse_file() ../../gcc/gcc/c-family/c-opts.c:1233 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. 
See <https://gcc.gnu.org/bugs/> for instructions. It looks like trees_out::type_node() needs to understand opaque type, and possibly whatever reads that in needs to understand it on the way in as well.
[Bug c++/97947] [11 Regression] ICE in digest_init_r, at cp/typeck2.c:1145
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97947 acsawdey at gcc dot gnu.org changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org --- Comment #3 from acsawdey at gcc dot gnu.org --- I'll take a look at this. Probably missed something when adding OPAQUE_TYPE.
[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791 --- Comment #10 from acsawdey at gcc dot gnu.org --- For now, disabling use of POImode for expansion of memcpy/memmove to avoid this problem while we figure out the real fix: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553672.html
[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791 --- Comment #9 from acsawdey at gcc dot gnu.org --- I did post a small patch that fixes this, but more for the purpose of provoking discussion than because I am sure it is the right way to fix this. https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553523.html
[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791 --- Comment #8 from acsawdey at gcc dot gnu.org --- Another small test case, reduced from my compile failure of c/c-typeck.c and modified to provoke truncation from POImode to various other modes: typedef int *a; struct b { a ba; }; enum c { c1=1 }; struct e { union eu { char f_char; short f_short; int f_int; long f_long; int *f_ptr; long long f_ll; } u; c g; a h; b i; }; a d(bool, bool, bool); e j(int, e, bool, bool); void k() { int l; for (;;) { e expr; l = sizeof(struct e); expr = j(l, expr, true, false); d(expr.u.f_char, false, __null); d(expr.u.f_short, false, __null); d(expr.u.f_int, false, __null); d(expr.u.f_long, false, __null); d(expr.u.f_ptr, false, __null); d(expr.u.f_ll, false, __null); } }
[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791 --- Comment #7 from acsawdey at gcc dot gnu.org --- I wonder if this other case works properly when compiled with -m64. Trying to generate a stxvp with a 32-bit address seems odd.
[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791 acsawdey at gcc dot gnu.org changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org --- Comment #5 from acsawdey at gcc dot gnu.org ---
Nope, this is my patch that added vector pair to memcpy/memmove expansion. We apparently don't have the right patterns defined for this to extract things from the POImode reg that it uses. This is the code in expr.c:

  if (GET_MODE_CLASS (from_mode) == MODE_PARTIAL_INT)
    {
      rtx new_from;
      scalar_int_mode full_mode
        = smallest_int_mode_for_size (GET_MODE_BITSIZE (from_mode));
      convert_optab ctab = unsignedp ? zext_optab : sext_optab;
      enum insn_code icode;

      icode = convert_optab_handler (ctab, full_mode, from_mode);
      gcc_assert (icode != CODE_FOR_nothing);

convert_optab_handler doesn't find anything to go from POImode to DImode, so the assert fires.
[Bug target/96791] ICE in convert_mode_scalar, at expr.c:412
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96791 --- Comment #3 from acsawdey at gcc dot gnu.org --- This also requires -mbig which may be implicit in the original poster's build. But I see it failing as well.
[Bug target/96787] rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787 --- Comment #3 from acsawdey at gcc dot gnu.org --- Never mind that, all I'm seeing is the lack of save/restore of r2 in the power10 version.
[Bug target/96787] rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787 --- Comment #2 from acsawdey at gcc dot gnu.org --- I'm seeing some load-past-store code motion that happens when compiling for power10 vs power9 that makes me suspicious.
[Bug target/96787] rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787 --- Comment #1 from acsawdey at gcc dot gnu.org --- Created attachment 49123 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49123&action=edit hashtab.c with target power9 attribute on htab_delete()
[Bug target/96787] New: rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96787 Bug ID: 96787 Summary: rs6000 mcpu=power10 miscompiles libiberty htab_delete() causing bootstrap failure Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: --- Created attachment 49122 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49122&action=edit hashtab.c with no added attributes
Building r11-2827 configured using --with-cpu=power10, I am seeing some kind of compile failure in libiberty hashtab.o function htab_delete(). Putting __attribute__ ((target("cpu=power9"))) in front of that function clears the problem. The manifestation is that genmddeps segfaults. I've attached asm output with (.fixed.s) and without (.broken.s) the attribute on htab_delete().
[Bug c/96151] bootstrap fails due to ICE in c_omp_split_clauses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96151 --- Comment #1 from acsawdey at gcc dot gnu.org --- This compile is successful like this but fails if I add -mcpu=power9. /home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/xg++ -B/home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/ -B/opt/binutils-gcc-p10/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++ -B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/home2/sawdey/work/gcc/mamboCI/gcc-master/libstdc++-v3/libsupc++ -L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -Ic-family -I../../gcc-master/gcc -I../../gcc-master/gcc/c-family -I../../gcc-master/gcc/../include -I../../gcc-master/gcc/../libcpp/include -I../../gcc-master/gcc/../libdecnumber -I../../gcc-master/gcc/../libdecnumber/dpd -I../libdecnumber -I../../gcc-master/gcc/../libbacktrace -o c-family/c-omp.o -MT c-family/c-omp.o -MMD -MP -MF c-family/.deps/c-omp.TPo ../../gcc-master/gcc/c-family/c-omp.c
[Bug c/96151] New: bootstrap fails due to ICE in c_omp_split_clauses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96151 Bug ID: 96151 Summary: bootstrap fails due to ICE in c_omp_split_clauses Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: --- Started to see this on trunk last night. Tested again and still see it with r11-2018. configured with: /home2/sawdey/work/gcc/mamboCI/gcc-master/configure --prefix=/opt/binutils-gcc-p10 --enable-languages=all --enable-bootstrap --with-cpu=power9 /home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/xg++ -B/home2/sawdey/work/gcc/mamboCI/build-mambo/./prev-gcc/ -B/opt/binutils-gcc-p10/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++ -B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -B/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/home2/sawdey/work/gcc/mamboCI/gcc-master/libstdc++-v3/libsupc++ -L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -L/home2/sawdey/work/gcc/mamboCI/build-mambo/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -DIN_GCC_FRONTEND -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. 
-Ic-family -I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc -I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family -I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../include -I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libcpp/include -I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libdecnumber -I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libdecnumber/dpd -I../libdecnumber -I/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/../libbacktrace -o c-family/c-omp.o -MT c-family/c-omp.o -MMD -MP -MF c-family/.deps/c-omp.TPo /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family/c-omp.c during RTL pass: expand /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family/c-omp.c: In function ‘void c_omp_split_clauses(location_t, tree_code, omp_clause_mask, tree, tree_node**)’: /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/c-family/c-omp.c:1561:1: internal compiler error: in reduce_to_bit_field_precision, at expr.c:11530 1561 | c_omp_split_clauses (location_t loc, enum tree_code code, | ^~~ 0x10decc9f reduce_to_bit_field_precision /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:11530 0x10dde35b expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8786 0x10de5443 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:10152 0x10ddc6c7 expand_expr_real(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8469 0x10db55f7 expand_expr /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.h:282 0x10ddac03 expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**, rtx_def**, expand_modifier) /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8065 0x10ddc8f7 expand_cond_expr_using_cmove /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:8518 0x10de3c97 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) 
/home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/expr.c:9869 0x10b79587 expand_gimple_stmt_1 /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:3786 0x10b798d3 expand_gimple_stmt /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:3847 0x10b83887 expand_gimple_basic_block /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:5888 0x10b861d7 execute /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/cfgexpand.c:6572 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. Makefile:1124: recipe for target 'c-family/c-omp.o' failed
[Bug target/95347] rs6000 mcpu=future generating stfs instead of pstfs for pc-relative references
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95347 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #4 from acsawdey at gcc dot gnu.org --- This is fixed now.
[Bug target/95347] rs6000 mcpu=future generating stfs instead of pstfs for pc-relative references
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95347 acsawdey at gcc dot gnu.org changed: What|Removed |Added Last reconfirmed||2020-06-02 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #2 from acsawdey at gcc dot gnu.org --- Turns out that lfs/plfs has the same problem. Patch for that coming shortly.
[Bug target/95347] New: rs6000 mcpu=future generating stfs instead of pstfs for pc-relative references
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95347 Bug ID: 95347 Summary: rs6000 mcpu=future generating stfs instead of pstfs for pc-relative references Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: ---
Problem exists in r11-639.

  /home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/xgcc -B/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/ /home2/sawdey/work/gcc/mamboCI/gcc-master/gcc/testsuite/gcc.c-torture/execute/pr79354.c -mcpu=future -mpcrel -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -fdiagnostics-urls=never -O1 -w -lm -o ./pr79354.exe --save-temps
  ./pr79354.s: Assembler messages:
  ./pr79354.s:31: Error: missing operand

The relevant piece of the asm output:

  xscvuxdsp 0,32
  pstfs 0,.LANCHOR0+16@pcrel
  stfs 0,.LANCHOR0+20@pcrel
  lwa 10,0(3)
  pstw 10,.LANCHOR0+20@pcrel

The extended mnemonic "pstfs Fx,value" is equivalent to "pstfs Fx,value(0),1" and is only valid for pstfs, not stfs.
[Bug target/94740] ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94740 --- Comment #1 from acsawdey at gcc dot gnu.org ---
Reduced test case:

  struct __attribute__((scalar_storage_order("big-endian"))) {
    int a;
    int b[];
  } c;
  int d;
  int e() { d = c.b[0]; }
[Bug target/94740] New: ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94740 Bug ID: 94740 Summary: ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1 Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: --- Compiler is trunk 3bcdb5dec72b6d7b197821c2b814bc9fc07f4628 on ppc64le power9 host. ~/work/gcc/trunk/build/gcc/xgcc -B/home2/sawdey/work/gcc/trunk/build/gcc /home2/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.dg/sso/t5.c -mcpu=future -mpcrel -O1 -lm -o ./t5.exe during RTL pass: reload /home2/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.dg/sso/t5.c: In function ‘main’: /home2/sawdey/work/gcc/trunk/gcc/gcc/testsuite/gcc.dg/sso/t5.c:73:1: internal compiler error: in set_address_disp, at rtlanal.c:6254 73 | } | ^ 0x10a50ca3 set_address_disp ../../gcc/gcc/rtlanal.c:6254 0x10a50ca3 set_address_disp ../../gcc/gcc/rtlanal.c:6252 0x10a50ca3 decompose_automod_address ../../gcc/gcc/rtlanal.c:6297 0x10a50ca3 decompose_address(address_info*, rtx_def**, machine_mode, unsigned char, rtx_code) ../../gcc/gcc/rtlanal.c:6457 0x10887973 process_address_1 ../../gcc/gcc/lra-constraints.c:3367 0x10889b9b process_address ../../gcc/gcc/lra-constraints.c:3641 0x10889b9b curr_insn_transform ../../gcc/gcc/lra-constraints.c:3956 0x1088f95f lra_constraints(bool) ../../gcc/gcc/lra-constraints.c:5029 0x1087119f lra(_IO_FILE*) ../../gcc/gcc/lra.c:2440 0x10810b9b do_reload ../../gcc/gcc/ira.c:5523 0x10810b9b execute ../../gcc/gcc/ira.c:5709 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions.
[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622 acsawdey at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #5 from acsawdey at gcc dot gnu.org --- Fixed in trunk.
[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622 --- Comment #3 from acsawdey at gcc dot gnu.org --- I'm wondering if the same problem exists for atomic_store, store_quadpti, and pstq vs stq?
[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622 --- Comment #2 from acsawdey at gcc dot gnu.org --- Solution is going to be to always use plq if prefixed, which makes sense anyway for little endian because it avoids the ugly doubleword swap.
[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622 --- Comment #1 from acsawdey at gcc dot gnu.org ---
Compiling with -dap we see:

  sync                     # 7    [c=12 l=4]  *hwsync
  plq 8,.LANCHOR0@pcrel    # 8    [c=8 l=12]  load_quadpti
  mr 10,9                  # 9    [c=4 l=4]  *movdi_internal64/2
  mr 11,8                  # 10   [c=4 l=4]  *movdi_internal64/2

I think the problem is that atomic_load thinks it always needs to do a doubleword swap if little endian for TImode, which is true for lq, but not for plq.
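The rule described in that comment can be written down as a tiny model. This is a sketch under the assumption that the only inputs that matter are endianness and whether the prefixed load (plq) was used. The names `needs_dword_swap`, `reg_pair` and `fixup_after_load` are hypothetical and do not correspond to anything in the rs6000 atomic_load expander.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per the comment above: a doubleword swap is needed after lq on
   little endian, but not after plq.  */
bool needs_dword_swap(bool little_endian, bool prefixed)
{
    return little_endian && !prefixed;
}

/* The two 64-bit halves of a TImode value as they land in the
   register pair after the quadword load.  */
struct reg_pair {
    uint64_t first;
    uint64_t second;
};

/* Exchange the halves only when the model says the load left them
   in the wrong order.  */
struct reg_pair fixup_after_load(struct reg_pair loaded,
                                 bool little_endian, bool prefixed)
{
    if (needs_dword_swap(little_endian, prefixed)) {
        uint64_t tmp = loaded.first;
        loaded.first = loaded.second;
        loaded.second = tmp;
    }
    return loaded;
}
```

In these terms the reported bug corresponds to swapping whenever `little_endian` is true and ignoring `prefixed`, which is what the two `mr` instructions after the plq do.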
[Bug target/94622] testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622 acsawdey at gcc dot gnu.org changed: What|Removed |Added Last reconfirmed||2020-04-16 Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED
[Bug target/94622] New: testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622 Bug ID: 94622 Summary: testsuite/gcc.dg/atomic/c11-atomic-exec-1.c fails on powerpc64le with -mpcrel Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: --- Compile command: /home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/xgcc -B/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/ /home2/sawdey/work/gcc/mamboCI/pike-trunk/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c -B/home2/sawdey/work/gcc/mamboCI/build-mambo/powerpc64le-unknown-linux-gnu/./libatomic/ -L/home2/sawdey/work/gcc/mamboCI/build-mambo/powerpc64le-unknown-linux-gnu/./libatomic/.libs -latomic -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -fdiagnostics-urls=never -O1 -std=c11 -pedantic-errors -lm -mpcrel -mcpu=future -o c11-atomic-exec-1.exe Compiler is trunk from about a week ago. Reduced test case: extern void abort (void); extern void exit (int); static void test_simple_assign (void) { do { do { static volatile _Atomic (long double) b = (long double) ((1)); if (b != ((long double) ((1 abort (); } while (0); } while (0); } int main (void) { test_simple_assign (); exit (0); } The problem seems to be that with -mpcrel, we generate a plq for the load of the long double constant and are swapping around the doublewords, which is only needed for lq not plq. 
The generated code with -mpcrel:

    plq 8,.LANCHOR0@pcrel
    mr 10,9
    mr 11,8
    cmpw 0,10,10
    bne- 0,$+4
    isync
    std 9,32(1)
    std 8,40(1)
    plfd 0,.LC0@pcrel
    plfd 1,.LC0+8@pcrel
    lfd 12,32(1)
    lfd 13,40(1)
    fcmpu 0,12,0
    bne 0,$+8
    fcmpu 0,13,1
    bne 0,.L4

And with -mno-pcrel:

    addis 9,2,.LANCHOR0@toc@ha
    addi 9,9,.LANCHOR0@toc@l
    lq 10,0(9)
    mr 8,10
    mr 9,11
    mr 10,11
    mr 11,8
    cmpw 0,10,10
    bne- 0,$+4
    isync
    std 9,32(1)
    std 8,40(1)
    addis 9,2,.LC0@toc@ha
    addi 9,9,.LC0@toc@l
    lfd 0,0(9)
    lfd 1,8(9)
    lfd 12,32(1)
    lfd 13,40(1)
    fcmpu 0,12,0
    bne 0,$+8
    fcmpu 0,13,1
    bne 0,.L4
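The doubleword swap problem in this report can be modelled in a few lines of C. This is only an illustrative sketch, not GCC code: `pair128`, `model_load`, `fixup_always`, `fixup_if_needed`, `same_value` and `check_swap_model` are hypothetical names, and the assumption that lq needs the swap on little endian while plq does not is taken from the report itself rather than derived here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit (TImode) value held as two
   64-bit doublewords.  */
typedef struct { uint64_t first, second; } pair128;

static pair128 swap_doublewords (pair128 v)
{
  pair128 r = { v.second, v.first };
  return r;
}

/* Model of the load.  Assumption from the report: on little endian,
   lq delivers the doublewords swapped and plq does not.  */
pair128 model_load (pair128 mem, bool load_swaps)
{
  return load_swaps ? swap_doublewords (mem) : mem;
}

/* Behaviour described in the report: swap after every load.  */
pair128 fixup_always (pair128 loaded)
{
  return swap_doublewords (loaded);
}

/* Swap only when the load instruction actually delivered swapped halves.  */
pair128 fixup_if_needed (pair128 loaded, bool load_swaps)
{
  return load_swaps ? swap_doublewords (loaded) : loaded;
}

bool same_value (pair128 a, pair128 b)
{
  return a.first == b.first && a.second == b.second;
}

void check_swap_model (void)
{
  pair128 mem = { 1, 2 };

  /* lq-like load: the unconditional swap restores the value.  */
  assert (same_value (fixup_always (model_load (mem, true)), mem));
  /* plq-like load: the unconditional swap corrupts the value.  */
  assert (!same_value (fixup_always (model_load (mem, false)), mem));
  /* Conditional fixup is right in both cases.  */
  assert (same_value (fixup_if_needed (model_load (mem, true), true), mem));
  assert (same_value (fixup_if_needed (model_load (mem, false), false), mem));
}
```

Calling `check_swap_model ()` exercises both cases. The unconditional swap is only harmless when every load behaves like lq, which matches the two `mr` instructions following the plq in the -mpcrel listing.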
[Bug target/94542] test gcc/testsuite/gcc.dg/tls/pr24428-2.c generates incorrect code on ppc64le with -mpcrel -mcpu=future -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94542 acsawdey at gcc dot gnu.org changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org Status|UNCONFIRMED |NEW Last reconfirmed||2020-04-09 Ever confirmed|0 |1
[Bug target/94542] New: test gcc/testsuite/gcc.dg/tls/pr24428-2.c generates incorrect code on ppc64le with -mpcrel -mcpu=future -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94542 Bug ID: 94542 Summary: test gcc/testsuite/gcc.dg/tls/pr24428-2.c generates incorrect code on ppc64le with -mpcrel -mcpu=future -O2 Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: ---
The test case is:

    __thread double thrtest[81];

    int main ()
    {
      double *p, *e;
      e = &thrtest[81];
      for (p = &thrtest[0]; p < e; ++p)
        *p = 1.0;
      return 0;
    }

Generated code for p and e is:

    paddi 9,13,thrtest@tprel
    pla 8,thrtest+648@pcrel

The second should also be using a @tprel relocation. Because it didn't, the loop runs off the end of allocated memory and segfaults. This test runs correctly when compiled with -O0.

Compile command:

    /home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/xgcc -B/home2/sawdey/work/gcc/mamboCI/build-mambo/gcc/ /home2/sawdey/work/gcc/mamboCI/pike-trunk/gcc/testsuite/gcc.dg/tls/pr24428-2.c -O2 -mpcrel -mcpu=future -S -o pr24428-2.exe.s
[Bug target/92379] rs6000.c:5598:13: runtime error: shift exponent 64 is too large for 64-bit type 'long int'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92379 acsawdey at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #9 from acsawdey at gcc dot gnu.org --- Fix checked in to trunk.
[Bug target/92379] rs6000.c:5598:13: runtime error: shift exponent 64 is too large for 64-bit type 'long int'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92379 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED CC||acsawdey at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org --- Comment #6 from acsawdey at gcc dot gnu.org --- I've reproduced this with current trunk, going to see if I can cook up a patch quick.
[Bug target/93129] PPC memset not using vector instruction on >= Power8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93129 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2020-01-06 CC||acsawdey at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org Ever confirmed|0 |1
[Bug target/93130] PPC simple memset not inlined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93130 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2020-01-06 CC||acsawdey at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org Ever confirmed|0 |1
[Bug jit/87808] gcc_lib_dir is missing from libgccjit's search path when driver is not installed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87808 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-07-22 CC||acsawdey at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #6 from acsawdey at gcc dot gnu.org --- I'm also seeing this same problem simply from not having the gcc driver in PATH. Using the example from downstream redhat BZ 1566178: [sawdey@marlin trunk]$ /home2/sawdey/work/gcc/trunk/install/bin/gcc -Wl,-rpath,/home2/sawdey/work/gcc/trunk/install/lib -g -Wall -Werror t.c -lgccjit [sawdey@marlin trunk]$ ./a.out ld: cannot find crtbeginS.o: No such file or directory ld: cannot find -lgcc ld: cannot find -lgcc_s libgccjit.so: error: error invoking gcc driver gcc_jit_result_get_code: NULL result Segmentation fault (core dumped) [sawdey@marlin trunk]$ PATH=/home2/sawdey/work/gcc/trunk/install/bin:$PATH ./a.out hello foo Using strace the failing version makes this ld command, with no path for crtbeginS.o: /usr/bin/ld --eh-frame-hdr -shared -m elf64lppc -o /tmp/libgccjit-lMbVEL/fake.so /usr/lib/powerpc64le-linux-gnu/crti.o crtbeginS.o -L/lib/powerpc64le-linux-gnu -L/lib/../lib64 -L/usr/lib/powerpc64le-linux-gnu /tmp/cchtuxH0.o -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc --push-state --as-needed -lgcc_s --pop-state crtendS.o /usr/lib/powerpc64le-linux-gnu/crtn.o When it can find the driver, this is the ld command, with the full path to the correct installed bits: /usr/bin/ld --eh-frame-hdr -shared -m elf64lppc -o /tmp/libgccjit-t81bpM/fake.so /usr/lib/powerpc64le-linux-gnu/crti.o /home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/crtbeginS.o -L/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0 -L/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/../../../../lib64 -L/lib/powerpc64le-linux-gnu -L/lib/../lib64 -L/usr/lib/powerpc64le-linux-gnu 
-L/home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/../../.. /tmp/ccD4Cbi4.o -lgcc --push-state --as-needed -lgcc_s -pop-state -lc -lgcc --push-state --as-needed -lgcc_s --pop-state /home2/sawdey/work/gcc/trunk/install/lib/gcc/powerpc64le-unknown-linux-gnu/10.0.0/crtendS.o /usr/lib/powerpc64le-linux-gnu/crtn.o This is with trunk, configured with --disable-bootstrap --enable-languages=c,c++,jit --enable-host-shared --prefix=/home2/sawdey/work/gcc/trunk/install
[Bug rtl-optimization/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #7 from acsawdey at gcc dot gnu.org --- Fixed in trunk.
[Bug rtl-optimization/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308 --- Comment #6 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Fri Feb 15 15:41:25 2019 New Revision: 268942 URL: https://gcc.gnu.org/viewcvs?rev=268942&root=gcc&view=rev Log: 2019-02-15 Aaron Sawdey PR rtl-optimization/88308 * shrink-wrap.c (move_insn_for_shrink_wrap): Fix LABEL_NUSES counts on copied instruction. Modified: trunk/gcc/ChangeLog trunk/gcc/shrink-wrap.c
[Bug target/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308 --- Comment #5 from acsawdey at gcc dot gnu.org --- After some more digging, it appears that the problem is move_insn_for_shrink_wrap() is deleting and re-creating insns to move them from one BB to another. The label reference count gets decremented in delete_insn() but does not get re-incremented when the new insn is created in a different BB. If you add -fno-shrink-wrap, the ICE does not occur.
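The reference-count problem described in this comment can be shown with a toy model. This is a sketch under stated assumptions, not the shrink-wrap.c code: `label_model`, `insn_model`, `delete_insn_model`, `move_insn_buggy`, `move_insn_fixed` and `check_label_uses` are hypothetical names, and `nuses` merely stands in for what GCC tracks as LABEL_NUSES.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical label with a use count (stand-in for LABEL_NUSES).  */
struct label_model { int nuses; };

/* Hypothetical insn that may reference a label and lives in a block.  */
struct insn_model { struct label_model *label; int bb; };

/* Deleting an insn drops one use of the label it references.  */
static void delete_insn_model (struct insn_model *insn)
{
  if (insn->label != NULL)
    insn->label->nuses--;
}

/* Move by delete-and-recreate without restoring the use count.  */
struct insn_model move_insn_buggy (struct insn_model *old, int new_bb)
{
  struct insn_model copy = { old->label, new_bb };
  delete_insn_model (old);
  return copy;
}

/* Same move, but the new insn's label reference is counted again.  */
struct insn_model move_insn_fixed (struct insn_model *old, int new_bb)
{
  struct insn_model copy = { old->label, new_bb };
  delete_insn_model (old);
  if (copy.label != NULL)
    copy.label->nuses++;
  return copy;
}

void check_label_uses (void)
{
  struct label_model l1 = { 4 };
  struct insn_model a = { &l1, 1 };
  move_insn_buggy (&a, 2);
  assert (l1.nuses == 3);   /* one use has been lost */

  struct label_model l2 = { 4 };
  struct insn_model b = { &l2, 1 };
  move_insn_fixed (&b, 2);
  assert (l2.nuses == 4);   /* count is preserved */
}
```

In the buggy variant the label still has a real user after the move, but its count is one too low, so a later pass that trusts the count could treat the label as less used than it is.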
[Bug target/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308 --- Comment #4 from acsawdey at gcc dot gnu.org --- Tracked down the difference between -m32 and -m64. In the -m64 case, rs6000_emit_move calls force_const_mem, which sets LABEL_PRESERVE_P on any label_ref that it finds; that is what marks the jump table label for preservation. In the -m32 case, none of the if tests inside the E_SImode/E_DImode case succeed, and as a result rs6000_emit_move does not call force_const_mem. It really seems to me like the label for the jump table should be marked for preservation somewhere more definite than this.
[Bug target/88308] ICE in maybe_record_trace_start, at dwarf2cfi.c:2309
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88308 --- Comment #3 from acsawdey at gcc dot gnu.org ---
One difference between compiling this -m32 and -m64 is that the label for the jump table is marked /s in the 64-bit version:

    (code_label/s 22 21 23 4 (nil) [4 uses])
    (jump_table_data 23 22 24 (addr_diff_vec:SI (label_ref:DI 22)
             [
                (label_ref:DI 25)
                (label_ref:DI 54)
                (label_ref:DI 83)

In the 32-bit version it is not, and that label plus the jump_table_data insn that follows are not present in the dumps after split2:

    (code_label 23 22 24 4 (nil) [4 uses])
    ;; Insn is not within a basic block
    (jump_table_data 24 23 25 (addr_diff_vec:SI (label_ref:SI 23)
             [
                (label_ref:SI 26)
                (label_ref:SI 64)
                (label_ref:SI 102)
                (label_ref:SI 140)
                (label_ref:SI 178)

The significance of this is that tablejump_p() looks at the next insn to determine if it is in fact a tablejump:

    bool
    tablejump_p (const rtx_insn *insn, rtx_insn **labelp,
                 rtx_jump_table_data **tablep)
    {
      if (!JUMP_P (insn))
        return false;

      rtx target = JUMP_LABEL (insn);
      if (target == NULL_RTX || ANY_RETURN_P (target))
        return false;

      rtx_insn *label = as_a<rtx_insn *> (target);
      rtx_insn *table = next_insn (label);
      if (table == NULL_RTX || !JUMP_TABLE_DATA_P (table))
        return false;

Since the label insn and jump table insn seem to be gone, this returns false. Then in create_trace_edges() we end up in the final stanza of the if (JUMP_P (insn)):

    else
      {
        rtx_insn *lab = JUMP_LABEL_AS_INSN (insn);
        gcc_assert (lab != NULL);
        maybe_record_trace_start (lab, insn);
      }

And so we try to create a trace for the jump table label, which leads to the ICE.
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #11 from acsawdey at gcc dot gnu.org --- This is fixed in trunk and gcc-8-branch. Hopefully I got this into 8 in time for it to get into 8.3.
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #10 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Sat Feb 9 17:11:06 2019 New Revision: 268725 URL: https://gcc.gnu.org/viewcvs?rev=268725&root=gcc&view=rev Log: 2019-02-09 Aaron Sawdey Backported from mainline 2019-02-05 Aaron Sawdey PR target/89112 * config/rs6000/rs6000.md (tf_): Generate a local label for the long branch case. 2019-02-05 Aaron Sawdey PR target/89112 * config/rs6000/rs6000-string.c (do_ifelse, expand_cmp_vec_sequence, expand_compare_loop, expand_block_compare_gpr, expand_strncmp_align_check, expand_strncmp_gpr_sequence): Insert REG_BR_PROB notes in inline expansion of memcmp/strncmp. Add #include "profile-count.h" and "predict.h" for types and functions needed to work with REG_BR_PROB notes. 2019-02-09 Aaron Sawdey * config/rs6000/rs6000-string.c (expand_compare_loop, expand_block_compare): Insert REG_BR_PROB notes in inline expansion of memcmp/strncmp. Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/config/rs6000/rs6000-string.c branches/gcc-8-branch/gcc/config/rs6000/rs6000.md
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #9 from acsawdey at gcc dot gnu.org --- The fixes for this are in trunk now. I will backport to gcc-8-branch in a week and then this can be closed.
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #8 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Feb 5 16:32:06 2019 New Revision: 268547 URL: https://gcc.gnu.org/viewcvs?rev=268547&root=gcc&view=rev Log: 2019-02-05 Aaron Sawdey PR target/89112 * config/rs6000/rs6000-string.c (do_ifelse, expand_cmp_vec_sequence, expand_compare_loop, expand_block_compare_gpr, expand_strncmp_align_check, expand_strncmp_gpr_sequence): Insert REG_BR_PROB notes in inline expansion of memcmp/strncmp. Add #include "profile-count.h" and "predict.h" for types and functions needed to work with REG_BR_PROB notes. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000-string.c
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #7 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Feb 5 16:30:45 2019 New Revision: 268546 URL: https://gcc.gnu.org/viewcvs?rev=268546&root=gcc&view=rev Log: 2019-02-05 Aaron Sawdey PR target/89112 * config/rs6000/rs6000.md (tf_): Generate a local label for the long branch case. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000.md
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #5 from acsawdey at gcc dot gnu.org ---
This patch fixes the issue on trunk:

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 268403)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -12639,8 +12639,8 @@
   else
     {
       static char seq[96];
-      char *bcs = output_cbranch (operands[3], "$+8", 1, insn);
-      sprintf(seq, " $+12\;%s;b %%l0", bcs);
+      char *bcs = output_cbranch (operands[3], ".L%=", 1, insn);
+      sprintf(seq, " .L%%=\;%s\;b %%l0\;.L%%=:", bcs);
       return seq;
     }
 }

I'm testing now, I will get this posted. Once approved for backport I'll apply the same thing to gcc-8-branch for inclusion in the next 8 release.
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #4 from acsawdey at gcc dot gnu.org ---
Well I can't blame this one on the linker or optimization. The splitting for the case where the branch destination is too far is wrong in tf_:

    static char seq[96];
    char *bcs = output_cbranch (operands[3], "$+8", 1, insn);
    sprintf(seq, " $+12\;%s;b %%l0", bcs);
    return seq;

This is wrong in both gcc 8 and 9. I'll get this fixed right away.

The longer term question is how do I convince gcc to keep the code for a memcmp expansion together? I think this is happening because it thinks some of the code is cold and is throwing it at the end of the function.
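One way to see why hard-coded offsets such as "$+8" and "$+12" are fragile is to format the same expansion with a local label instead, so that the skip target no longer depends on how many instructions end up in between. The following is a minimal sketch, not the rs6000.md code: `format_long_branch`, `check_long_branch` and the `.Lskip<N>` label naming are hypothetical, and the caller is assumed to pass an already inverted condition and a unique id.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: emit a long-range replacement for a
   "decrement CTR and branch if the condition holds" instruction.
   Both skip branches target a local label rather than a byte offset,
   so inserting or removing an instruction cannot misdirect them.
   Returns the snprintf result (length of the formatted text).  */
int format_long_branch (char *buf, size_t size, const char *inverted_cond,
                        const char *far_label, unsigned id)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size,
                   "\tbdz .Lskip%u\n"     /* CTR reached zero: skip */
                   "\t%s,.Lskip%u\n"      /* condition false: skip */
                   "\tb %s\n"             /* unconditional long branch */
                   ".Lskip%u:\n",
                   id, inverted_cond, id, far_label, id);
}

void check_long_branch (void)
{
  char buf[128];
  int len = format_long_branch (buf, sizeof buf, "beq 0", ".L939", 7);

  assert (len == (int) strlen (buf));
  assert (strcmp (buf,
                  "\tbdz .Lskip7\n"
                  "\tbeq 0,.Lskip7\n"
                  "\tb .L939\n"
                  ".Lskip7:\n") == 0);
}
```

With relative offsets, every branch in the sequence silently assumes a fixed instruction count between itself and its target. With a label, the assembler resolves the distance.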
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #3 from acsawdey at gcc dot gnu.org ---
It appears that gcc decided to split the bdnzt generated by the memcmp expansion because the destination was out of range, and produced this:

    bdz $+12
    beq 0,$+8
    b $+8;b .L939
    bne 0,.L937 ; --> to setb code

So after the second iteration the bdz should branch to the bne, which branches to a setb if there was a difference, or falls through and does an overlapping compare to get the last 4 bytes of the 36 being compared.

But the disassembly when I look at things in gdb has an extra branch in there which messes things up:

    0x10008b90: bdz 0x10008b9c
    0x10008b94: beq 0x10008b9c
    0x10008b98: b 0x10008ba0
    0x10008b9c: b 0x1b8c
    0x10008ba0: bne 0x1bac

So now the bdz branches to a branch to b8c, which is back to the top of the loop to compare another 16 bytes, which is of course wrong. It's possible this all happened because I didn't generate labels in the splitter, so multiple conditional branches had to be split because they were out of range.
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 --- Comment #2 from acsawdey at gcc dot gnu.org --- I'm seeing this on both gcc-8-branch and trunk, but only with -mcpu=power9. I'll figure out what happened here and get it fixed in trunk then back ported to 8.
[Bug target/89112] Incorrect code generated by rs6000 memcmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89112 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-01-30 Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org Ever confirmed|0 |1
[Bug target/88027] PowerPC generates slightly weird code for memset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from acsawdey at gcc dot gnu.org --- Backport to 8 tested ok and is now checked in as 267580.
[Bug target/88027] PowerPC generates slightly weird code for memset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-01-03 Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #5 from acsawdey at gcc dot gnu.org --- This is fixed in trunk but I should backport to 8 now too.
[Bug target/88027] PowerPC generates slightly weird code for memset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027 --- Comment #3 from acsawdey at gcc dot gnu.org ---
This appears to have to do with alignment. In this test case, expand_block_clear() sees alignment of only 8 bits for the pointer p. If you declare a local struct st and pass that to __builtin_memset, it sees alignment of 128 bits and generates 4 stxv or stvx.

There is a bug here though:

    for (offset = 0;
         bytes > 0;
         offset += clear_bytes, bytes -= clear_bytes)
      {
        machine_mode mode = BLKmode;
        rtx dest;

        if (TARGET_ALTIVEC
            && ((bytes >= 16 && align >= 128)
                || (bytes >= 32 && TARGET_EFFICIENT_UNALIGNED_VSX)))

The intention here was to only do unaligned VSX if there were at least 32 bytes to clear. However, because bytes is decremented, what this actually does is to always do the last 16 bytes using std if it is unaligned. This doesn't make a lot of sense and would be an easy fix.
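The effect of testing the remaining byte count inside the loop can be reproduced with a small simulation. This is a simplified sketch, not expand_block_clear() itself: `vector_bytes_buggy`, `vector_bytes_fixed`, `scalar_chunk` and `check_clear_model` are hypothetical names, the TARGET_ALTIVEC test is assumed true, and the "fixed" variant is just one reading of the stated intention (decide once, from the total size), not necessarily the committed change.

```c
#include <assert.h>
#include <stdbool.h>

/* Largest scalar store the model uses for the remaining bytes.  */
static int scalar_chunk (int bytes)
{
  if (bytes >= 8) return 8;
  if (bytes >= 4) return 4;
  if (bytes >= 2) return 2;
  return 1;
}

/* Returns how many bytes are cleared with 16-byte vector stores when
   the "at least 32 bytes" test is applied to the REMAINING count.  */
int vector_bytes_buggy (int bytes, int align_bits, bool unaligned_vsx_ok)
{
  int vec = 0;
  while (bytes > 0)
    {
      int clear_bytes;
      if ((bytes >= 16 && align_bits >= 128)
          || (bytes >= 32 && unaligned_vsx_ok))
        {
          clear_bytes = 16;
          vec += 16;
        }
      else
        clear_bytes = scalar_chunk (bytes);
      bytes -= clear_bytes;
    }
  return vec;
}

/* Same loop, but the 32-byte threshold is checked once against the
   TOTAL size before the loop starts.  */
int vector_bytes_fixed (int bytes, int align_bits, bool unaligned_vsx_ok)
{
  bool use_unaligned = bytes >= 32 && unaligned_vsx_ok;
  int vec = 0;
  while (bytes > 0)
    {
      int clear_bytes;
      if (bytes >= 16 && (align_bits >= 128 || use_unaligned))
        {
          clear_bytes = 16;
          vec += 16;
        }
      else
        clear_bytes = scalar_chunk (bytes);
      bytes -= clear_bytes;
    }
  return vec;
}

void check_clear_model (void)
{
  /* 64 unaligned bytes: the last 16 fall back to two 8-byte stores.  */
  assert (vector_bytes_buggy (64, 8, true) == 48);
  assert (vector_bytes_fixed (64, 8, true) == 64);
  /* Aligned case is unaffected.  */
  assert (vector_bytes_buggy (64, 128, false) == 64);
  assert (vector_bytes_fixed (64, 128, false) == 64);
  /* Fewer than 32 unaligned bytes: no vector stores either way.  */
  assert (vector_bytes_buggy (24, 8, true) == 0);
  assert (vector_bytes_fixed (24, 8, true) == 0);
}
```

For 64 unaligned bytes the buggy loop issues three vector stores and then two 8-byte stores, which corresponds to the tail of std instructions discussed in this bug.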
[Bug target/88027] PowerPC generates slightly weird code for memset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88027 acsawdey at gcc dot gnu.org changed: What|Removed |Added CC||acsawdey at gcc dot gnu.org --- Comment #2 from acsawdey at gcc dot gnu.org --- What can I say? expand_block_clear() steps through the block to be cleared, using smaller writes at the end if necessary. The rtx is generated for the write by: dest = adjust_address (orig_dest, mode, offset); emit_move_insn (dest, CONST0_RTX (mode)); My guess is scheduling moved the gpr stores up.
[Bug target/87474] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87474 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from acsawdey at gcc dot gnu.org --- Fixed on trunk.
[Bug target/87474] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87474 --- Comment #4 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Oct 2 17:31:53 2018 New Revision: 264799 URL: https://gcc.gnu.org/viewcvs?rev=264799&root=gcc&view=rev Log: 2018-10-02 Aaron Sawdey PR target/87474 * config/rs6000/rs6000-string.c (expand_strn_compare): Check that both P8_VECTOR and VSX are enabled. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000-string.c
[Bug target/87474] ICE in extract_insn, at recog.c:2305
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87474 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org --- Comment #3 from acsawdey at gcc dot gnu.org --- This looks like I screwed up the conditions, clearly it shouldn't be trying to generate the vector/vsx strncmp expansion with -mno-power8-vector.
[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from acsawdey at gcc dot gnu.org --- Fix committed to trunk and gcc-8-branch.
[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222 --- Comment #6 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Jun 26 16:43:38 2018 New Revision: 262157 URL: https://gcc.gnu.org/viewcvs?rev=262157&root=gcc&view=rev Log: 2018-06-26 Aaron Sawdey Backport from trunk 2018-06-22 Aaron Sawdey PR target/86222 * config/rs6000/rs6000-string.c (expand_strn_compare): Handle -m32 correctly. Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/config/rs6000/rs6000-string.c
[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222 --- Comment #5 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Fri Jun 22 15:37:36 2018 New Revision: 261906 URL: https://gcc.gnu.org/viewcvs?rev=261906&root=gcc&view=rev Log: Forgot PR target/86222 in ChangeLog Modified: trunk/gcc/ChangeLog
[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222 --- Comment #4 from acsawdey at gcc dot gnu.org --- Well when compiling this with -m32 -mcpu=power[6789] I get this for the rtx of the length argument: (const_int -2147483648 [0x8000]) So when I am doing UINTVAL (bytes_rtx) I get 0x8000 and things go awry. In the tree optimized dump I am seeing this, as Martin did: _2 = strncmpD.898 (&aD.2760, &bD.2761, 2147483648); [tail call] So somewhere in between something appears to be gratuitously sign extending this.
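The sign-extension effect can be reproduced outside the compiler. The sketch below assumes a 64-bit host integer and that a 32-bit constant is stored sign-extended, as the const_int in this comment is. `simode_const_value`, `read_unsigned_wide`, `read_unsigned_simode` and `check_sign_extension` are hypothetical names, and masking to 32 bits is shown only as one possible way to recover the intended length, not necessarily what the committed fix does.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit pattern the way a sign-extended constant would be
   held in a 64-bit signed host integer.  */
int64_t simode_const_value (uint32_t bits)
{
  return bits >= 0x80000000u ? (int64_t) bits - 4294967296LL
                             : (int64_t) bits;
}

/* Reading the stored value as a full-width unsigned integer keeps
   the sign-extension bits.  */
uint64_t read_unsigned_wide (int64_t value)
{
  return (uint64_t) value;
}

/* Reading it back at the constant's own 32-bit width drops them.  */
uint64_t read_unsigned_simode (int64_t value)
{
  return (uint64_t) value & 0xffffffffu;
}

void check_sign_extension (void)
{
  int64_t stored = simode_const_value (0x80000000u);

  assert (stored == -2147483648LL);
  assert (read_unsigned_wide (stored) == 0xffffffff80000000ULL);
  assert (read_unsigned_simode (stored) == 2147483648ULL);
}
```

A bound of PTRDIFF_MAX + 1 on a 32-bit target is exactly the smallest value with the top bit set, so it is the first length for which the full-width unsigned reading and the 32-bit reading disagree.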
[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222 --- Comment #3 from acsawdey at gcc dot gnu.org --- OK, so this requires -m32 and also -mcpu=power6 or higher. I have reproduced it so should have a fix shortly.
[Bug target/86222] ICE in final_scan_insn_1 calling strncmp() with a bound of PTRDIFF_MAX + 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86222 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |FIXED --- Comment #19 from acsawdey at gcc dot gnu.org --- Backported to gcc 7 and 6. Closing again.
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 --- Comment #18 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Apr 24 00:19:43 2018 New Revision: 259590 URL: https://gcc.gnu.org/viewcvs?rev=259590&root=gcc&view=rev Log: 2018-04-23 Aaron Sawdey Backport from mainline 2018-04-16 Aaron Sawdey PR target/83660 * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Mark vec_extract expression as having side effects to make sure it gets a cleanup point. 2018-04-23 Aaron Sawdey Backport from mainline 2018-04-16 Aaron Sawdey PR target/83660 * gcc.target/powerpc/pr83660.C: New test. Added: branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr83660.C Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/config/rs6000/rs6000-c.c branches/gcc-7-branch/gcc/testsuite/ChangeLog
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 --- Comment #17 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Apr 24 00:14:21 2018 New Revision: 259586 URL: https://gcc.gnu.org/viewcvs?rev=259586&root=gcc&view=rev Log: 2018-04-23 Aaron Sawdey Backport from mainline 2018-04-16 Aaron Sawdey PR target/83660 * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Mark vec_extract expression as having side effects to make sure it gets a cleanup point. 2018-04-23 Aaron Sawdey Backport from mainline 2018-04-16 Aaron Sawdey PR target/83660 * gcc.target/powerpc/pr83660.C: New test. Added: branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr83660.C Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/config/rs6000/rs6000-c.c branches/gcc-6-branch/gcc/testsuite/ChangeLog
[Bug target/85436] [7 Regression] ICE compiling go code with -mcpu=power9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85436 --- Comment #1 from acsawdey at gcc dot gnu.org --- Created attachment 43966 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43966&action=edit shorter reduced test case I've further reduced the test case and now it's only 38 lines, so it should be easier to work with.
[Bug target/85436] New: [7 Regression] ICE compiling go code with -mcpu=power9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85436 Bug ID: 85436 Summary: [7 Regression] ICE compiling go code with -mcpu=power9 Product: gcc Version: 7.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org CC: bergner at gcc dot gnu.org, segher at gcc dot gnu.org, wschmidt at gcc dot gnu.org Target Milestone: --- Target: powerpc64le-linux-gnu Created attachment 43964 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43964&action=edit reduced test case To reproduce: Build gcc-7-branch using: configure --disable-bootstrap --enable-languages=c,c++,go --with-long-double-128 --enable-secureplt --disable-multilib --without-ppl --without-cloog --without-libelf gcc/gccgo -Bgcc -O3 -Lpowerpc64le-linux-gnu/libgo -S bug_reduced.go -mcpu=power9 This affects 259009 through the head of the gcc-7-branch.
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #16 from acsawdey at gcc dot gnu.org --- Possibly need backports to both 7 and 6.
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #15 from acsawdey at gcc dot gnu.org --- Fixed in 259403.
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 --- Comment #14 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Mon Apr 16 14:50:06 2018 New Revision: 259403 URL: https://gcc.gnu.org/viewcvs?rev=259403&root=gcc&view=rev Log: 2018-04-16 Aaron Sawdey PR target/83660 * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Mark vec_extract expression as having side effects to make sure it gets a cleanup point. 2018-04-16 Aaron Sawdey PR target/83660 * gcc.target/powerpc/pr83660.C: New test. Added: trunk/gcc/testsuite/gcc.target/powerpc/pr83660.C Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000-c.c trunk/gcc/testsuite/ChangeLog
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 --- Comment #12 from acsawdey at gcc dot gnu.org ---
This function is called from cp/semantics.c maybe_cleanup_point_expr():

    tree
    fold_build_cleanup_point_expr (tree type, tree expr)
    {
      /* If the expression does not have side effects then we don't have to wrap
         it with a cleanup point expression.  */
      if (!TREE_SIDE_EFFECTS (expr))
        return expr;

In the vec_extract case it bails out due to no side effects and does not put in the cleanup point. So in fact a more minimal version of Jakub's patch also works. If you mark that this has side effects, then the cleanup point is added for us by the existing code:

Index: config/rs6000/rs6000-c.c
===
--- config/rs6000/rs6000-c.c (revision 259353)
+++ config/rs6000/rs6000-c.c (working copy)
@@ -6704,6 +6704,8 @@
       stmt = convert (innerptrtype, stmt);
       stmt = build_binary_op (loc, PLUS_EXPR, stmt, arg2, 1);
       stmt = build_indirect_ref (loc, stmt, RO_NULL);
+      if (c_dialect_cxx ())
+        TREE_SIDE_EFFECTS (stmt) = 1;
       return stmt;
     }

Any comments on whether this is the right way to fix this? I think the vec_insert case does not need to be changed because the MODIFY_EXPR used there will mark that there are side effects for us.
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |acsawdey at gcc dot gnu.org
[Bug target/83660] ICE with vec_extract inside expression statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660 acsawdey at gcc dot gnu.org changed: What|Removed |Added CC||acsawdey at gcc dot gnu.org --- Comment #11 from acsawdey at gcc dot gnu.org ---
Looking at the dump of an analogous test case for vec_insert:

    #include <altivec.h>
    typedef __vector unsigned int uvec32_t __attribute__((__aligned__(16)));
    uvec32_t get_word(uvec32_t v)
    {
      return({const unsigned _B1 = 32;
              vec_insert(10, (uvec32_t)v, 2);});
    }

It seems that we do get an additional cleanup_point like you are proposing to add for vec_extract, which is maybe why that does not get into trouble:

    ;; Function __vector(4) unsigned int get_word(__vector(4) unsigned int) (null)
    ;; enabled by -tree-original
    { < = { const unsigned int _B1 = 32; <>; <> + 8) = 10;, D.3231>>; }>>; }

I've gotten as far as seeing that something is calling fold_build_cleanup_point_expr an additional time compared to the vec_extract example.
[Bug target/85321] Missing documentation and option misc for ppc64le
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85321 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED CC||acsawdey at gcc dot gnu.org Resolution|--- |FIXED --- Comment #6 from acsawdey at gcc dot gnu.org --- All fixed in 259324.
[Bug target/85321] Missing documentation and option misc for ppc64le
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85321 --- Comment #5 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Wed Apr 11 15:25:42 2018 New Revision: 259324 URL: https://gcc.gnu.org/viewcvs?rev=259324&root=gcc&view=rev Log: 2018-04-11 Aaron Sawdey PR target/85321 * doc/invoke.texi (RS/6000 and PowerPC Options): Document options -mcall- and -mtraceback=. Remove options -mabi=spe and -mabi=no-spe from PowerPC section. * config/rs6000/sysv4.opt (mcall-): Improve help text. * config/rs6000/rs6000.opt (mblock-compare-inline-limit=): Trim help text that is too long. * config/rs6000/rs6000.opt (mblock-compare-inline-loop-limit=): Trim help text that is too long. * config/rs6000/rs6000.opt (mstring-compare-inline-limit=): Trim help text that is too long. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000.opt trunk/gcc/config/rs6000/sysv4.opt trunk/gcc/doc/invoke.texi
[Bug target/85321] Missing documentation and option misc for ppc64le
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85321 --- Comment #4 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Apr 10 22:05:41 2018 New Revision: 259302 URL: https://gcc.gnu.org/viewcvs?rev=259302&root=gcc&view=rev Log: 2018-04-10 Aaron Sawdey PR target/85321 * doc/invoke.texi (RS/6000 and PowerPC Options): Document options -mblock-compare-inline-limit, -mblock-compare-inline-loop-limit, and -mstring-compare-inline-limit. Modified: trunk/gcc/ChangeLog trunk/gcc/doc/invoke.texi
[Bug target/83822] trunk/gcc/config/rs6000/rs6000-string.c:970]: (style) Redundant condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83822 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #5 from acsawdey at gcc dot gnu.org --- Fixed in 258975.
[Bug target/83822] trunk/gcc/config/rs6000/rs6000-string.c:970]: (style) Redundant condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83822 --- Comment #4 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Fri Mar 30 12:17:31 2018 New Revision: 258975 URL: https://gcc.gnu.org/viewcvs?rev=258975&root=gcc&view=rev Log: 2018-03-30 Aaron Sawdey PR target/83822 * config/rs6000/rs6000-string.c (expand_compare_loop): Fix redundant condition. * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Fix redundant condition. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000-c.c trunk/gcc/config/rs6000/rs6000-string.c
[Bug target/83707] g++.dg/eh/simd-3.C fails on power7 -m32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83707 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #6 from acsawdey at gcc dot gnu.org --- Apparently fixed so closing.
[Bug target/83707] g++.dg/eh/simd-3.C fails on power7 -m32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83707 --- Comment #5 from acsawdey at gcc dot gnu.org --- I can also confirm with trunk 258957 I do not see this fail with -m32 -mcpu=power7.
[Bug target/84743] default widths for parallel reassociation now hurt rather than help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743 acsawdey at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #6 from acsawdey at gcc dot gnu.org --- Updated parallel reassociation widths, which give better performance than no parallel reassociation, are now checked in, so this can be closed.
[Bug target/84743] default widths for parallel reassociation now hurt rather than help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743 --- Comment #5 from acsawdey at gcc dot gnu.org --- Author: acsawdey Date: Tue Mar 13 16:28:09 2018 New Revision: 258495 URL: https://gcc.gnu.org/viewcvs?rev=258495&root=gcc&view=rev Log: 2018-03-13 Aaron Sawdey PR target/84743 * config/rs6000/rs6000.c (rs6000_reassociation_width): Disable parallel reassociation for int modes. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000.c
[Bug target/84743] default widths for parallel reassociation now hurt rather than help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743 acsawdey at gcc dot gnu.org changed: What|Removed |Added Priority|P1 |P3 --- Comment #4 from acsawdey at gcc dot gnu.org --- This turned out to be a system with a bad clock. The reassociation widths still need to be checked and corrected but the performance differences are mostly in the 0.5% range with just one that is about 2% (xz).
[Bug target/84743] default widths for parallel reassociation now hurt rather than help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743 --- Comment #3 from acsawdey at gcc dot gnu.org --- Yes I'm digging into this now and omnetpp is at the top of the list. I can see if there is a difference between cpu2006 and 2017 as well. For gcc7 I used 2006 to determine the widths.
[Bug target/84743] New: default widths for parallel reassociation now hurt rather than help
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84743 Bug ID: 84743 Summary: default widths for parallel reassociation now hurt rather than help Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: acsawdey at gcc dot gnu.org Reporter: acsawdey at gcc dot gnu.org Target Milestone: --- Target: powerpc64le power8

The width settings in rs6000_reassociation_width() were chosen to help performance for SPEC CPU in gcc 7. Testing on power8 shows that in gcc 8 (258101) there are now major degradations in CPU2017 int with the default reassociation widths as compared to using --param tree-reassoc-width=1 to disable reassociation.

Benchmark          Default widths vs. tree-reassoc-width=1
500.perlbench_r     -5.98%
502.gcc_r           -1.16%
505.mcf_r          -12.44%
520.omnetpp_r      -39.00%
523.xalancbmk_r     -9.78%
525.x264_r          -1.76%
531.deepsjeng_r     -4.23%
548.exchange2_r     -0.66%
557.xz_r            -2.04%
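As background for the report above, here is a hand-written sketch of what a reassociation width means for a chain of integer additions. It is not compiler output and not code from rs6000.c; the function names are made up for illustration. With width 1 the sum stays a single serial dependency chain, while with width 2 the chain is split into two partial sums that can issue in parallel and are combined at the end.

```c
#include <stdint.h>

/* Width 1: one serial chain, ((a + b) + c) + d.
   Each add depends on the result of the previous one.  */
uint32_t sum_width1(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t t = a + b;
    t = t + c;
    t = t + d;
    return t;
}

/* Width 2: two independent partial sums, (a + b) + (c + d).
   The first two adds have no dependency on each other.  */
uint32_t sum_width2(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t lo = a + b;
    uint32_t hi = c + d;
    return lo + hi;
}
```

Unsigned arithmetic is used so that wraparound is well defined and both forms always return the same value. The wider form shortens the dependency chain but keeps more values live at once, so whether it is faster depends on the target, which is consistent with the report that the default widths needed to be re-measured.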
[Bug middle-end/84433] gcc 7 and before miscompile loop and remove exit due to incorrect range calculation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84433 --- Comment #8 from acsawdey at gcc dot gnu.org --- It looks like both gcc 7 and 8 assume that the statement

  ptrA->sA[ptrB->int1].zt = parm1;

will only be executed 14+1 times because of the declaration sA[15]. However, gcc 7 assumes the whole loop will only execute that number of times:

  Statement ptrA_14(D)->sA[ptrB__int1_lsm.11_22].zt = _34; is executed at most 14 (bounded by 14) + 1 times in loop 1.
  Analyzing # of iterations of loop 1
    exit condition [15, + , 4294967295] != 0
    bounds on difference of bases: -15 ... -15
    result: # of iterations 15, bounded by 15
  Loop 1 iterates 15 times.
  Loop 1 iterates at most 14 times.
  Loop 1 likely iterates at most 14 times.
  Analyzing # of iterations of loop 1
    exit condition [15, + , 4294967295] != 0
    bounds on difference of bases: -15 ... -15
    result: # of iterations 15, bounded by 15
  Removed pointless exit: if (ivtmp_24 != 0)

whereas gcc 8 does not:

  Statement ptrA_13(D)->sA[ptrB__int1_lsm.5_22].zt = _20; is executed at most 14 (bounded by 14) + 1 times in loop 1.
  Analyzing # of iterations of loop 1
    exit condition [15, + , 4294967295] != 0
    bounds on difference of bases: -15 ... -15
    result: # of iterations 15, bounded by 15
  Loop 1 iterates 15 times.
  Loop 1 iterates at most 15 times.
  Loop 1 likely iterates at most 15 times.

Neither gcc 7 nor gcc 8 produces any warnings for the revised test case with -Wall.
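To make the "14 + 1" bound concrete, here is a minimal hypothetical loop of the same general shape. It is not the PR's test case, which is not reproduced in this comment; the struct layouts and the function name fill are assumptions, and only the identifiers sA, zt, int1, ptrA, ptrB and parm1 are taken from the quoted statement. The counter counts down from 15, matching the exit condition [15, + , 4294967295] != 0, where 4294967295 is -1 modulo 2^32.

```c
/* Hypothetical types: only the member names come from the dump.  */
struct elem { int zt; };
struct A { struct elem sA[15]; };
struct B { int int1; };

/* Stores parm1 into successive elements of sA and returns how many
   times the store executed.  Because sA has 15 elements, the store
   can run at most 15 times (indices 0..14) without indexing past the
   end of the array, which is the bound the compiler derives from the
   declaration.  */
int fill(struct A *ptrA, struct B *ptrB, int parm1)
{
    int executed = 0;
    for (unsigned n = 15; n != 0; n--) {
        ptrA->sA[ptrB->int1].zt = parm1;
        ptrB->int1++;
        executed++;
    }
    return executed;
}
```

Called with ptrB->int1 initially 0, this version is well defined and the store runs exactly 15 times. The dumps above show the two compilers agreeing on that statement bound but deriving different maximum iteration counts for the loop itself.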