[Bug rtl-optimization/116516] [15 Regression] [lra] ICE in decompose_normal_address, at rtlanal.cc:6712 by r15-3213-g708ee71808ea61
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116516 Andrew Pinski changed: What|Removed |Added CC||sjames at gcc dot gnu.org --- Comment #5 from Andrew Pinski --- *** Bug 116523 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/116523] [15 regression] ICE when building hardinfo-0.6_alpha_pre20221113
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116523 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #2 from Andrew Pinski --- Dup. *** This bug has been marked as a duplicate of bug 116516 ***
[Bug target/116521] missing optimization: xtensa sibcall with windowed ABI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116521 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug target/116521] missing optimization: xtensa tail-call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116521 --- Comment #1 from Andrew Pinski --- The default ABI for xtensa is to use windowed registers which does not currently support sibcalling. include/xtensa-config.h : #define XSHAL_ABI XTHAL_ABI_WINDOWED #define XTHAL_ABI_WINDOWED 0 gcc/config/xtensa/xtensa.h: #define TARGET_WINDOWED_ABI_DEFAULT (XSHAL_ABI == XTHAL_ABI_WINDOWED) gcc/config/xtensa/xtensa.cc: if (xtensa_windowed_abi == -1) xtensa_windowed_abi = TARGET_WINDOWED_ABI_DEFAULT; /* Do not allow sibcalls when windowed registers ABI is in effect. */ if (TARGET_WINDOWED_ABI) return false;
[Bug tree-optimization/116520] Multiple condition lead to missing vectorization due to missing early break
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116520 Andrew Pinski changed: What|Removed |Added Keywords||missed-optimization Severity|normal |enhancement
[Bug rtl-optimization/116516] [15 Regression] [lra] ICE in decompose_normal_address, at rtlanal.cc:6712 by r15-3213-g708ee71808ea61
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116516 --- Comment #4 from Andrew Pinski --- Just for reference here is the code: ``` extern void my_func (int); typedef struct { int var; } info_t; extern void *_data_offs; void test() { info_t *info = (info_t *) ((void *)((void *)1) + ((unsigned int)&_data_offs)); my_func(info->var == 0); } ```
[Bug tree-optimization/116515] LTO link time prints warning for system headers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116515 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING Last reconfirmed||2024-08-28 --- Comment #2 from Andrew Pinski --- Can you attach the preprocessed source? Since the preprocessed source will still include which lines were from system headers and such, it should be good to go. Also not everyone has fmt installed and it could also depend on the version of fmt too.
[Bug tree-optimization/116518] GCC does not optimize-out useless operations. Clang does.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116518 --- Comment #2 from Andrew Pinski --- I tried to reduce it to: ``` struct s1 { struct { char t[1]; } t; unsigned long tt; }; struct s2 { char t[3]; }; int f() { s2 *t; { struct s1 a = {}; a.t.t[0] = 120; a.t.t[1] = 120; a.t.t[2] = 120; a.tt = 3; t = new s2; __builtin_memcpy(t, &a.t.t, 3); } delete t; return 3; } ``` But the above gets optimized out. There must be something more in the IR that I don't know about.
[Bug tree-optimization/116518] GCC does not optimize-out useless operations. Clang does.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116518 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-08-28 Status|UNCONFIRMED |NEW --- Comment #1 from Andrew Pinski --- ``` __builtin_memcpy (_25, &MEM[(const struct array *)&D.94502]._M_elems, 3); D.94502 ={v} {CLOBBER(eos)}; operator delete (_25, 3); ``` The memcpy here is not deleted even though it is dead.
[Bug tree-optimization/116518] GCC does not optimize-out useless operations. Clang does.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116518 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug tree-optimization/56456] [meta-bug] bogus/missing -Warray-bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456 Bug 56456 depends on bug 116519, which changed state. Bug 116519 Summary: Arm64(?): undue array bounds warning https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116519 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID
[Bug tree-optimization/116519] Arm64(?): undue array bounds warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116519 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- # RANGE [irange] int [-INF, -2][32, +INF] irq.1_1 = (intD.7) irq_5(D); if (irq.1_1 <= 31) The warning is correct here. GCC does not know `[-INF, -2]` is an invalid range for the IRQ. Changing `irq < 32` to be `irq < 32 && irq >= 0` also fixes the issue. Basically is_assignable_irq makes the range `[32, INF]` for unsigned and casting that to signed gets `[-INF, -2], [32, INF]` and then __irq_to_desc checks based on signed 32 so ...
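The range argument above can be illustrated with a minimal sketch. The function names below are hypothetical stand-ins, not the real `is_assignable_irq` or `__irq_to_desc`; the only point is that an upper-bound check on a signed value still admits every negative value, while the two-sided check pins the value to `[0, 31]`.

```c
/* Hypothetical stand-ins, not the helpers from the reported code. */

/* Upper-bound-only check on a signed value: every negative irq passes,
   so an array indexed by irq afterwards could be accessed out of bounds. */
static int upper_bound_only(int irq)
{
  return irq < 32;
}

/* Two-sided check: irq is known to be in [0, 31] when this returns 1. */
static int two_sided(int irq)
{
  return irq < 32 && irq >= 0;
}
```

With the two-sided form the compiler can see that the negative half of the range is excluded before the value is used as an index.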
[Bug target/116512] [12/13/14/15 Regression] vzeroupper emitted even though the upper half of the z registers are returned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116512 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-28 Ever confirmed|0 |1 Keywords||wrong-code Status|UNCONFIRMED |NEW --- Comment #5 from Andrew Pinski --- (In reply to Hongtao Liu from comment #4) > gdb shows crtl->return_rtx is > > 21(parallel/i:BLK [ > > 22(expr_list:REG_DEP_TRUE (reg:XI 20 xmm0) > > 23(const_int 0 [0])) > > 24]) Oh, so ix86_avx_u128_mode_exit does not understand parallel here.
[Bug target/116512] [12/13/14/15 Regression] vzeroupper emitted even though the upper half of the z registers are returned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116512 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-08-28 Known to work||6.1.0, 6.4.0, 7.2.0 Known to fail||6.5.0, 7.3.0, 7.5.0, 8.1.0 Keywords||wrong-code --- Comment #2 from Andrew Pinski --- Confirmed. It looks like the use of XImode is what is missed.
[Bug target/116512] [12/13/14/15 Regression] vzeroupper emitted even though the upper half of the z registers are returned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116512 Andrew Pinski changed: What|Removed |Added Known to work||7.1.0 Target Milestone|--- |12.5 Summary|Wrong vzeroupper at the |[12/13/14/15 Regression] |function epilogue |vzeroupper emitted even ||though the upper half of ||the z registers are ||returned Known to fail||9.1.0
[Bug c++/116511] [14/15 Regression] ICE with enum value used in requires
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116511 --- Comment #3 from Andrew Pinski --- Reduced and cleaned up some more: ``` template struct s1 { enum { e1 = 1 }; }; template struct s2 { enum { e1 = s1::e1 }; s2() requires(0 != e1); }; s2<8> a; ```
[Bug c++/116511] [14/15 Regression] ICE with enum value used in requires
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116511 Andrew Pinski changed: What|Removed |Added Summary|ICE segmentation fault |[14/15 Regression] ICE with ||enum value used in requires Last reconfirmed||2024-08-28 Target Milestone|--- |14.3 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #2 from Andrew Pinski --- Reduced and cleaned up testcase: ``` template struct int_vector_width { enum { WORD_BITS, ELEMENT_BITS }; }; template struct int_vector_trait { enum { ELEMENT_BITS = int_vector_width::ELEMENT_BITS }; void push_back() requires(0 != ELEMENT_BITS) ; }; void f() { int_vector_trait<8> vec; vec.push_back(); } ``` Confirmed.
[Bug c++/116511] ICE segmentation fault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116511 --- Comment #1 from Andrew Pinski --- Back trace: #0 tree_class_check (__g=0x25daeef "to_wide", __l=6460, __f=0x25d20d9 "../../gcc/tree.h", __class=tcc_type, __t=0x0) at ../../gcc/tree.h:3786 #1 wi::to_wide (t=t@entry=0x772eee38) at ../../gcc/tree.h:6460 #2 tree_int_cst_sgn (t=t@entry=0x772eee38) at ../../gcc/tree.cc:6509 #3 0x00b9f149 in write_integer_cst (cst=0x772eee38) at ../../gcc/cp/mangle.cc:2067 #4 0x00b9f63e in write_template_arg_literal (value=0x773072a0) at ../../gcc/tree.h:3779 #5 0x00ba0065 in write_expression (expr=0x772eee88) at ../../gcc/cp/mangle.cc:3943 #6 0x00ba2161 in write_constraint_expression (expr=0x772eee88) at ../../gcc/cp/mangle.cc:860 #7 write_encoding (decl=) at ../../gcc/cp/mangle.cc:968 #8 0x00ba220e in write_mangled_name (decl=0x77310200, top_level=) at ../../gcc/cp/mangle.cc:820 #9 0x00ba7b5f in mangle_decl_string (decl=0x77310200) at ../../gcc/cp/mangle.cc:4421 #10 0x00ba7d67 in get_mangled_id (decl=0x77310200) at ../../gcc/cp/mangle.cc:4442 #11 mangle_decl (decl=0x77310200) at ../../gcc/cp/mangle.cc:4480 #12 0x016e521e in decl_assembler_name (decl=decl@entry=0x77310200) at ../../gcc/tree.cc:725 #13 0x00e745fa in symbol_table::insert_to_assembler_name_hash (this=0x77406000, node=0x77302550, with_clones=) at ../../gcc/symtab.cc:175 #14 0x00e74765 in symbol_table::symtab_initialize_asm_name_hash (this=0x77406000) at ../../gcc/symtab.cc:267 #15 0x00e8ee9f in analyze_functions (first_time=) at ../../gcc/cgraphunit.cc:1423 #16 0x00e8f07e in symbol_table::finalize_compilation_unit (this=0x77406000) at ../../gcc/cgraphunit.cc:2560 #17 0x0139bf6d in compile_file () at ../../gcc/toplev.cc:478 #18 0x00a7de09 in do_compile () at ../../gcc/toplev.cc:2209 #19 toplev::main (this=this@entry=0x7fffdb4e, argc=, argc@entry=3, argv=, argv@entry=0x7fffdc78) at ../../gcc/toplev.cc:2369 #20 0x00a7ff4f in main (argc=3, argv=0x7fffdc78) at ../../gcc/main.cc:39
[Bug middle-end/116510] [15 Regression] ice in decompose, at wide-int.h:1049
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116510 --- Comment #6 from Andrew Pinski --- #6 0x0187822c in gimple_simplify_226 (res_op=0x7fffcd00, seq=0x7fffced0, valueize=0xd13880 , type=0x7741cb28, captures=0x7fffa160, cmp=EQ_EXPR) at gimple-match-9.cc:1994 1994 if (wi::bit_and_not (wi::to_wide (captures[1]), get_nonzero_bits (captures[0])) != 0 (gdb) p debug_tree(captures[1]) constant 92> $1 = void (gdb) p debug_tree(captures[0]) unit-size align:32 warn_if_not_align:0 symtab:0 alias-set 1 canonical-type 0x7741c5e8 precision:32 min max pointer_to_this > visited def_stmt _8 = gg_strescape_i.2_11 & 255; version:8 ptr-info 0x7733ddb0> ... (gdb) p debug_tree(switch_cond) unit-size align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7741cb28 precision:1 min max > arg:0 arg:0 arg:0 visited def_stmt _8 = gg_strescape_i.2_11 & 255; version:8 ptr-info 0x7733ddb0> arg:1 t5.c:6:5 start: t5.c:6:5 finish: t5.c:6:10> arg:1 arg:0 arg:1 t5.c:6:5 start: t5.c:6:5 finish: t5.c:6:10> t5.c:6:5 start: t5.c:6:5 finish: t5.c:6:10> t5.c:6:5 start: t5.c:6:5 finish: t5.c:6:10> (gdb) p debug_tree((tree)0x7730fb40) constant 92> CASE_HIGH/CASE_LOW have a type of `unsigned char` but index is type of `int`. There is a missing fold_convert in the predicate_bbs on CASE_HIGH/CASE_LOW. Note one more thing: + tree low = build2_loc (loc, GE_EXPR, +boolean_type_node, +index, CASE_LOW (label)); + tree high = build2_loc (loc, LE_EXPR, + boolean_type_node, + index, CASE_HIGH (label)); + case_cond = build2_loc (loc, TRUTH_AND_EXPR, + boolean_type_node, + low, high); Why use TRUTH_AND_EXPR here, just use AND_EXPR . Likewise, use TRUTH_NOT_EXPR. + switch_cond = build1_loc (loc, TRUTH_NOT_EXPR, boolean_type_node, + unshare_expr (switch_cond)); Also why not do something like what is done in convert_single_case_switch ?
[Bug tree-optimization/116510] [15 Regression] ice in decompose, at wide-int.h:1049
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116510 --- Comment #2 from Andrew Pinski --- There is a type mismatch going on ...
[Bug tree-optimization/116510] [15 Regression] ice in decompose, at wide-int.h:1049
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116510 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-28 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 CC||ak at gcc dot gnu.org --- Comment #1 from Andrew Pinski --- I am 99% sure it was introduced by r15-3167-gc9ccc3961f5b8d .
[Bug tree-optimization/116510] [15 Regression] ice in decompose, at wide-int.h:1049
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116510 Andrew Pinski changed: What|Removed |Added Component|c |tree-optimization Summary|ice in decompose, at|[15 Regression] ice in |wide-int.h:1049 |decompose, at ||wide-int.h:1049 Keywords||ice-on-valid-code Target Milestone|--- |15.0
[Bug target/107533] Inefficient code sequence for fp16 testcase on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107533 --- Comment #3 from Andrew Pinski --- (In reply to Ramana Radhakrishnan from comment #2) > yes the by-value parameters are a separate issue that I hope recent patches > on the list (I remember something flying past) should help correct. The patch improved both returns and argument passing: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660968.html
[Bug middle-end/116508] [15 Regression] `popcount(short) == 1` or char no longer expands to using `(arg ^ (arg - 1)) > arg - 1` trick after r15-2946-gfcc3af99498804
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116508 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=116509 --- Comment #5 from Andrew Pinski --- (In reply to Andrew Pinski from comment #4) > Let me file a bug for that. PR 116509. Once that is fixed the popcount128 will also be fixed.
[Bug target/116509] 128bit integer compares can be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116509 --- Comment #1 from Andrew Pinski --- Created attachment 59018 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59018&action=edit What LLVM produces This is what LLVM produces. GCC should be able to do similarly.
[Bug target/116509] New: 128bit integer compares can be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116509 Bug ID: 116509 Summary: 128bit integer compares can be improved Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: aarch64 Take: ``` int ltu(unsigned __int128 a, unsigned __int128 b) { return a < b; } int gtu(unsigned __int128 a, unsigned __int128 b) { return a > b; } int geu(unsigned __int128 a, unsigned __int128 b) { return a >= b; } int leu(unsigned __int128 a, unsigned __int128 b) { return a <= b; } int eq(unsigned __int128 a, unsigned __int128 b) { return a == b; } int ne(unsigned __int128 a, unsigned __int128 b) { return a != b; } int lt(__int128 a, __int128 b) { return a < b; } int gt(__int128 a, __int128 b) { return a > b; } int ge(__int128 a, __int128 b) { return a >= b; } int le(__int128 a, __int128 b) { return a <= b; } ``` These can be handled using ccmp/sbcs instead of what we currently produce.
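As a rough illustration of what these compares have to compute, here is a sketch of the unsigned less-than case written on 64-bit halves. The helper names are hypothetical, and this is not a claim about the exact sequence GCC or LLVM emit; it only shows the high/low decomposition that a compare followed by a conditional compare (or a subtract with borrow) has to implement.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned 128-bit a < b written on 64-bit halves.
   The high halves decide unless they are equal, then the low halves do. */
static int ltu_halves(uint64_t a_hi, uint64_t a_lo,
                      uint64_t b_hi, uint64_t b_lo)
{
  return a_hi < b_hi || (a_hi == b_hi && a_lo < b_lo);
}

/* Reference using the compiler's own 128-bit type. */
static int ltu_ref(unsigned __int128 a, unsigned __int128 b)
{
  return a < b;
}

/* Build a 128-bit value from two 64-bit halves. */
static unsigned __int128 make_u128(uint64_t hi, uint64_t lo)
{
  return ((unsigned __int128) hi << 64) | lo;
}
```

The other unsigned orderings follow by swapping operands or negating the result; the signed variants differ only in comparing the high halves as signed.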
[Bug middle-end/116508] [15 Regression] `popcount(short) == 1` or char no longer expands to using `(arg ^ (arg - 1)) > arg - 1` trick after r15-2946-gfcc3af99498804
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116508 --- Comment #4 from Andrew Pinski --- (In reply to Andrew Pinski from comment #3) > (In reply to Andrew Pinski from comment #2) > > part of the problem here is the use of OPTAB_DIRECT when it should use > > OPTAB_WIDEN instead. > > That fixes short but for char looks like there is still some cost issue > since there is a removal of the addv there. > > Note the related testcase should have been: > ``` > int f128(unsigned __int128 a) > { > return __builtin_popcountg(a) == 1; > } > ``` > > Which is not fixed with the OPTAB_WIDEN change. That is because we don't emit decent code for: ``` int gtu128(unsigned __int128 a, unsigned __int128 b) { return a > b; } ``` Let me file a bug for that.
[Bug middle-end/116508] [15 Regression] `popcount(short) == 1` or char no longer expands to using `(arg ^ (arg - 1)) > arg - 1` trick after r15-2946-gfcc3af99498804
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116508 --- Comment #3 from Andrew Pinski --- (In reply to Andrew Pinski from comment #2) > part of the problem here is the use of OPTAB_DIRECT when it should use > OPTAB_WIDEN instead. That fixes short but for char looks like there is still some cost issue since there is a removal of the addv there. Note the related testcase should have been: ``` int f128(unsigned __int128 a) { return __builtin_popcountg(a) == 1; } ``` Which is not fixed with the OPTAB_WIDEN change.
[Bug middle-end/116508] [15 Regression] `popcount(short) == 1` or char no longer expands to using `(arg ^ (arg - 1)) > arg - 1` trick after r15-2946-gfcc3af99498804
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116508 --- Comment #2 from Andrew Pinski --- part of the problem here is the use of OPTAB_DIRECT when it should use OPTAB_WIDEN instead.
[Bug target/116507] [15 Regression] movhi_aarch64 should use fmov if FP16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116507 --- Comment #1 from Andrew Pinski --- Hmm, the whole `*mov_aarch64` set of patterns is a mess and looks like it needs some cleanup too.
[Bug target/114224] popcount RTL cost seems wrong with cssc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114224 Andrew Pinski changed: What|Removed |Added URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2024-August/ ||661650.html Keywords||patch --- Comment #7 from Andrew Pinski --- Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661650.html I think this depends on the other patch (the first patch in the series is independent but was useful to help debug this). Note I used what I thought Ampere1B's cost would be as the generic cost. It is the only core in GCC that implements CSSC right now. The generic cost can be changed once more cores implement it; e.g. if cnt becomes one cycle in most common cores it should be changed, but that is future work.
[Bug c++/85282] CWG 727 (full specialization in non-namespace scope)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85282 Andrew Pinski changed: What|Removed |Added See Also||https://bugzilla.mozilla.or ||g/show_bug.cgi?id=1677690 --- Comment #19 from Andrew Pinski --- Looks like the xsimd library code (for riscv at least) uses this feature too.
[Bug target/116484] Allow constexpr expression in riscv_rvv_vector_bits attribute and arm_sve_vector_bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116484 --- Comment #7 from Andrew Pinski --- (In reply to Andrew Pinski from comment #6) > (In reply to Andrew Pinski from comment #5) > > (In reply to J Lee from comment #4) > > > Is this error also related to the same 'const' issue? > > > > No that is unrelated to this attribute issue. > > Well the original error message is. The rest are due to the first error > message. Except it does have some other error messages: /opt/compiler-explorer/libs/xsimd/trunk/include/xsimd/types/xsimd_rvv_register.hpp:121:213: error: explicit specialization in non-namespace scope 'struct xsimd::types::detail::rvv_type_info' Which is PR 85282. So this code won't be supported until that is implemented, which does not look likely any time soon.
[Bug target/116484] Allow constexpr expression in riscv_rvv_vector_bits attribute and arm_sve_vector_bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116484 --- Comment #6 from Andrew Pinski --- (In reply to Andrew Pinski from comment #5) > (In reply to J Lee from comment #4) > > Is this error also related to the same 'const' issue? > > No that is unrelated to this attribute issue. Well the original error message is. The rest are due to the first error message.
[Bug target/116484] Allow constexpr expression in riscv_rvv_vector_bits attribute and arm_sve_vector_bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116484 --- Comment #5 from Andrew Pinski --- (In reply to J Lee from comment #4) > Is this error also related to the same 'const' issue? No that is unrelated to this attribute issue.
[Bug target/114224] popcount RTL cost seems wrong with cssc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114224 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=116508 --- Comment #6 from Andrew Pinski --- (In reply to Andrew Pinski from comment #4) > > Plus there is a missing optimization. > It should just be: > ``` > fmov b31, w0 > ``` wait there is no fmov with b. > Oh that expansion issue is not just a cost issue. I will work on that > tomorrow. Today is just the cost issue. Filed that as PR 116508 for another day.
[Bug middle-end/116508] [15 Regression] `popcount(short) == 1` or char no longer expands to using `(arg ^ (arg - 1)) > arg - 1` trick after r15-2946-gfcc3af99498804
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116508 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-27 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Andrew Pinski --- .
[Bug middle-end/116508] New: [15 Regression] `popcount(short) == 1` or char no longer expands to using `(arg ^ (arg - 1)) > arg - 1` trick after r15-2946-gfcc3af99498804
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116508 Bug ID: 116508 Summary: [15 Regression] `popcount(short) == 1` or char no longer expands to using `(arg ^ (arg - 1)) > arg - 1` trick after r15-2946-gfcc3af99498804 Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: middle-end Assignee: pinskia at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: aarch64 Take: ``` int fs(unsigned short a) { return __builtin_popcountg(a) == 1; } int fc(unsigned char a) { return __builtin_popcountg(a) == 1; } ``` The `(arg ^ (arg - 1)) > arg - 1` trick is no longer even tried for these 2 functions. Note a related testcase: ``` int fc(unsigned char a) { return __builtin_popcountg(a) == 1; } ``` But the above is NOT a regression though.
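For reference, the trick named in the summary can be checked in plain C. This is a small sketch with hypothetical function names, written on `unsigned int` rather than on the narrower types from the testcase; it only confirms that the comparison agrees with `popcount == 1` and says nothing about how the expander chooses between the two forms.

```c
/* The trick from the summary, written for unsigned int.
   For arg == 0 the subtraction wraps to UINT_MAX and the compare is false. */
static int one_bit_trick(unsigned int arg)
{
  return (arg ^ (arg - 1)) > arg - 1;
}

/* Returns 1 if the trick agrees with popcount == 1 for every 16-bit value. */
static int trick_matches_popcount16(void)
{
  for (unsigned int v = 0; v <= 0xffffu; v++)
    if (one_bit_trick(v) != (__builtin_popcount(v) == 1))
      return 0;
  return 1;
}
```

Clearing the lowest set bit via `arg - 1` leaves a value that is smaller than `arg ^ (arg - 1)` exactly when no higher bit remains, which is why the compare is true only for a single set bit.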
[Bug target/116507] [15 Regression] movhi_aarch64 should use fmov if FP16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116507 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Last reconfirmed||2024-08-27 Target Milestone|--- |15.0
[Bug target/116507] New: [15 Regression] movhi_aarch64 should use fmov if FP16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116507 Bug ID: 116507 Summary: [15 Regression] movhi_aarch64 should use fmov if FP16 Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: pinskia at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: aarch64 Take: ``` _Float16 fs(unsigned short a) { return __builtin_bit_cast(_Float16, a); } ``` GCC 14 produced: ``` fmov h0, w0 ret ``` Which is decent. While GCC 15 produces: ``` dup v0.4h, w0 ret ``` This is due to a different mode being used in GCC 14 vs 15 for the allocated register. In GCC 14, HF was used while on the trunk, HI is used. Options used: `-O2 -march=armv9-a+fp16` . Note clang always produces: ``` fmov s0, w0 ret ``` But I am not sure if that is correct though. I will be fixing this one ...
[Bug testsuite/116500] gcc.dg/vect/vect-switch-ifcvt-1.c FAILs on sparc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116500 --- Comment #8 from Andrew Pinski --- (In reply to andi from comment #7) > Thanks. Updated patch. This one seems obvious so I'll commit soon. > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c > b/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c > index f5352ef8ed7a..2e3a9ae3c249 100644 > --- a/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c > +++ b/gcc/testsuite/gcc.dg/vect/vect-switch-ifcvt-1.c > @@ -1,4 +1,4 @@ > -/* { dg-require-effective-target vect_int } */ > +/* { dg-require-effective-target vect_condition } */ > #include "tree-vect.h" > > extern void abort (void); Most likely should be `{ vect_int && vect_condition }` too.
[Bug target/114224] popcount RTL cost seems wrong with cssc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114224 --- Comment #5 from Andrew Pinski --- (In reply to Andrew Pinski from comment #4) > Note after r15-2946-gfcc3af99498804, for: > ``` > int fc(unsigned char a) > { > return __builtin_popcountg(a) == 1; > } > ``` > > Without CSSC, GCC produces: > ``` > and w0, w0, 255 > fmov d31, x0 > cnt v31.8b, v31.8b > smov w0, v31.b[0] > cmp w0, 1 > cset w0, eq > ret > ``` > > Plus there is a missing optimization. > It should just be: > ``` > fmov b31, w0 > ``` Oh that expansion issue is not just a cost issue. I will work on that tomorrow. Today is just the cost issue.
[Bug target/114224] popcount RTL cost seems wrong with cssc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114224 --- Comment #4 from Andrew Pinski --- Note after r15-2946-gfcc3af99498804, for: ``` int fc(unsigned char a) { return __builtin_popcountg(a) == 1; } ``` Without CSSC, GCC produces: ``` and w0, w0, 255 fmov d31, x0 cnt v31.8b, v31.8b smov w0, v31.b[0] cmp w0, 1 cset w0, eq ret ``` Plus there is a missing optimization. It should just be: ``` fmov b31, w0 ``` I have a fix for that I think. But we still don't get the right costing here ...
[Bug testsuite/116500] gcc.dg/vect/vect-switch-ifcvt-1.c FAILs on sparc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116500 Andrew Pinski changed: What|Removed |Added Component|tree-optimization |testsuite Last reconfirmed||2024-08-27 Ever confirmed|0 |1 Summary|gcc.dg/vect/vect-switch-ifc |gcc.dg/vect/vect-switch-ifc |vt-1.c FAILs|vt-1.c FAILs on sparc Status|UNCONFIRMED |NEW --- Comment #6 from Andrew Pinski --- .
[Bug tree-optimization/116500] gcc.dg/vect/vect-switch-ifcvt-1.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116500 --- Comment #5 from Andrew Pinski --- (In reply to Andi Kleen from comment #4) > It seems sparc doesn't support comparisons in vectorization? I think you want to check vect_condition for this. (like bb-slp-69.c )
[Bug c++/112456] Diagnostic for [[nodiscard]] on a constructor could be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112456 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org Severity|normal |enhancement
[Bug target/115612] powerpc: define_insn_and_splits calling gen_reg_rtx unconditionally (-flate-combine disabled by default for powerpc port)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115612 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2024-08-27 Ever confirmed|0 |1 --- Comment #5 from Andrew Pinski --- .
[Bug target/116505] ICE: in gen_reg_rtx, at emit-rtl.cc:1177 with -O -fprofile-arcs -fprofile-values -flate-combine-instructions on powerpc64le with basic code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116505 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Andrew Pinski --- Dup. *** This bug has been marked as a duplicate of bug 115612 ***
[Bug target/115612] powerpc: define_insn_and_splits calling gen_reg_rtx unconditionally (-flate-combine disabled by default for powerpc port)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115612 Andrew Pinski changed: What|Removed |Added CC||zsojka at seznam dot cz --- Comment #4 from Andrew Pinski --- *** Bug 116505 has been marked as a duplicate of this bug. ***
[Bug c++/116502] [15 Regression] -Wunused-result warning cannot be suppressed if coroutine awaiter returns a reference after r15-2318-g2664c1bf83855b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116502 Andrew Pinski changed: What|Removed |Added Summary|[15 Regression] |[15 Regression] |-Wunused-result warning |-Wunused-result warning |cannot be suppressed if |cannot be suppressed if |coroutine awaiter returns a |coroutine awaiter returns a |reference |reference after ||r15-2318-g2664c1bf83855b Target Milestone|--- |15.0
[Bug c++/116499] [modules] Replace CMI term with BMI in documentation, commit messages, etc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116499 --- Comment #1 from Andrew Pinski --- Note BMI is also the name of an x86_64 target instruction set: the Bit Manipulation Instruction set.
[Bug tree-optimization/116501] wrong code with __builtin_sub_overflow_p() and _BitInt() at -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116501 Andrew Pinski changed: What|Removed |Added Target|x86_64-pc-linux-gnu |x86_64-pc-linux-gnu ||aarch64-linux-gnu Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Host|x86_64-pc-linux-gnu | Last reconfirmed||2024-08-27 --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497 --- Comment #14 from Andrew Pinski --- (In reply to andi from comment #13) > Or a test case for the intended register allocation benefits? > That's more complicated and won't be small. So what if it won't be small? It will help in understanding the overall benefit, that is, whether it is a good idea or not. Note you can fake high register pressure by using inline-asm and clobbers, which should make the testcase really small :).
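A minimal sketch of the inline-asm suggestion, assuming an x86_64 target and GNU extended asm. The function name is hypothetical and this is not a testcase from the bug; it only shows the shape: an empty asm statement that lists every SSE register as clobbered, so a floating-point value that is live across it cannot stay in an SSE register.

```c
/* Hypothetical reduction aid: d must survive an asm statement that claims
   to overwrite every SSE register, so it has to be saved somewhere else
   (typically the stack) and reloaded afterwards. */
double live_across_clobber(double d)
{
#if defined(__GNUC__) && defined(__x86_64__)
  __asm__ volatile ("" : : :
                    "xmm0", "xmm1", "xmm2", "xmm3",
                    "xmm4", "xmm5", "xmm6", "xmm7",
                    "xmm8", "xmm9", "xmm10", "xmm11",
                    "xmm12", "xmm13", "xmm14", "xmm15");
#endif
  return d * 2.0;
}
```

Comparing the assembly with and without the asm statement shows the extra spill and reload without needing a large real-world function.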
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497 --- Comment #5 from Andrew Pinski --- Musttail can never be used for correctness. Also LTO deals just fine with localizing a function. But again, what are you making hacks for? Code which is specific to one application rather than making improvements to GCC for all. I am sorry but that seems like the wrong approach. Improve optimizations instead of thinking of adding attributes.
[Bug target/116497] static functions ABI should be improved for SSE caller saved registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497 --- Comment #3 from Andrew Pinski --- >When writing threaded code interpreters by chaining functions with musttail >the normal ABI behavior of some caller saved registers can cause unnecessary >spills and fills compared to using indirect goto. THIS is why I call all of this attribute usage a hack since it means you will always need to keep on changing the sources of the application rather than ever doing improvements to GCC that would help code that didn't even know about the attributes. The same is true of this whole musttail attribute. It does nothing except provide an error message. There are better ways of implementing that inside GCC really than the attribute that was added. GCC has -fopt-info which should have been used instead. Here is another place where the attribute is just a way to hack around instead of improving GCC for ABI for static functions.
[Bug target/116497] static functions ABI should be improved for SSE caller saved registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug target/116497] Need no_caller_saved_registers with SSE support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116497 --- Comment #2 from Andrew Pinski --- Sounds more like the attribute is not needed but gcc should figure out how to improve the abi for static functions instead, like what is done already for 32bit x86. I think even musttail is also a bogus way of doing this. Attributes should not be used but rather improving gcc in more generic way.
[Bug c/97986] [12/13/14/15 Regression] ICE in force_constant_size when applying va_arg to VLA type since r6-91-gf8e89441bc5518f4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97986 Andrew Pinski changed: What|Removed |Added CC||hnarkaytis at gmail dot com --- Comment #8 from Andrew Pinski --- *** Bug 116495 has been marked as a duplicate of this bug. ***
[Bug c/116495] Crash on va_arg with variable size type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116495 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Andrew Pinski --- Dup. *** This bug has been marked as a duplicate of bug 97986 ***
[Bug c++/85973] [[nodiscard]] on class shall emit a warning for unused anonymous variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85973 Andrew Pinski changed: What|Removed |Added CC||leonid.satanovsky at gmail dot com --- Comment #5 from Andrew Pinski --- *** Bug 116441 has been marked as a duplicate of this bug. ***
[Bug c++/116441] [[nodiscard]] attribute ignored in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116441 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #6 from Andrew Pinski --- This is a dup of bug 85973. I even mentioned there that moving the attribute to the constructor makes the warning/error happen: > Note if we move the attribute to the constructor, then GCC will error out. *** This bug has been marked as a duplicate of bug 85973 ***
[Bug target/116493] widening reduction add could be better
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116493 --- Comment #1 from Andrew Pinski --- Forgot to mention this is at -O2 (or -O3).
[Bug target/116493] New: widening reduction add could be better
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116493 Bug ID: 116493 Summary: widening reduction add could be better Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: aarch64
Take:
```
unsigned int f(unsigned short *a)
{
  unsigned int t1 = 0;
  for(int i = 0; i < 16; i++)
    t1 += a[i];
  return t1;
}
```
GCC generates:
```
        ldp     q0, q31, [x0]
        uxtl    v30.4s, v0.4h
        uaddw2  v30.4s, v30.4s, v0.8h
        uaddw   v30.4s, v30.4s, v31.4h
        uaddw2  v30.4s, v30.4s, v31.8h
        addv    s31, v30.4s
        fmov    w0, s31
```
This could be improved in a few ways. First, the first `uxtl/uaddw2` pair could be changed to:
```
        uxtl    v30.4s, v0.4h
        uxtl2   v30.4s, v0.8h
```
That is, simplify:
  vect_patt_20.8_2 = vect__4.6_1 w+ { 0, 0, 0, 0 };
into just:
  vect_patt_20.8_2 = (vector(8) unsigned int)vect__4.6_1;
And then I think we could handle the widening add better for the second case.
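For readers who want to convince themselves of the first simplification, here is a scalar sketch of a single vector lane in C. It only models the arithmetic identity (a widening add of zero is the same as a zero extension); the helper names `widen_add_lane`, `zero_extend_lane` and `widen_add_zero_is_extend` are made up for illustration and are not GCC internals or intrinsics.

```c
#include <stdint.h>

/* One lane of a widening add (`w+`): a 32-bit accumulator plus a
   zero-extended 16-bit element.  Hypothetical helper name.  */
static uint32_t widen_add_lane(uint32_t acc, uint16_t x)
{
  return acc + (uint32_t)x;
}

/* One lane of a plain zero extension, which is what uxtl/uxtl2
   compute per element.  Hypothetical helper name.  */
static uint32_t zero_extend_lane(uint16_t x)
{
  return (uint32_t)x;
}

/* Returns 1 if `0 w+ x` equals the zero extension of x for every
   16-bit input, and 0 otherwise.  */
static int widen_add_zero_is_extend(void)
{
  for (uint32_t v = 0; v <= UINT16_MAX; v++)
    if (widen_add_lane(0, (uint16_t)v) != zero_extend_lane((uint16_t)v))
      return 0;
  return 1;
}
```

The exhaustive loop is cheap (65536 iterations), so there is no need to reason about it beyond running it.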
[Bug c++/116492] inherited constructors with concept in subclass overrides constructor in subclass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116492 --- Comment #3 from Andrew Pinski --- (In reply to Andrew Pinski from comment #1) > Slightly reduced: In this example if you comment out: ``` requires true_c ``` GCC does the correct thing.
[Bug c++/116492] inherited constructors with concept in subclass overrides constructor in subclass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116492 --- Comment #2 from Andrew Pinski --- >The same does not happen in clang, and in gcc with similar examples from other >classes I have tried. So it comes down to the concept on the constructor which is why you didn't run into similar examples from other classes. std::expected has requires statements on the constructors in some cases.
[Bug c++/116492] inherited constructors with concept in subclass overrides constructor in subclass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116492 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-26 Summary|inherited constructors in |inherited constructors with |subclass of std::expected |concept in subclass |can not be overridden |overrides constructor in ||subclass Blocks||67491 Status|UNCONFIRMED |NEW Keywords||wrong-code Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Slightly reduced:
```
template<typename T>
concept true_c = true;

template<typename T>
struct expected
{
  T a;
  bool hasvalue;
  constexpr expected(T a1) : a(a1), hasvalue(true){}
  constexpr expected() requires true_c<T> : a(), hasvalue(true){}
  expected(expected&&) = default;
  T &operator*() { if (!hasvalue) throw 1; return a; }
};

class my_expected : public expected<int>
{
public:
  using expected::expected;
  my_expected() : expected::expected(42) {}
};

int main()
{
  my_expected obj;
  return *obj != 42;
}
```
Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67491 [Bug 67491] [meta-bug] concepts issues
[Bug c++/116491] GCC defines macro linux if -std is not set, and does not define otherwise
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116491 --- Comment #7 from Andrew Pinski --- Or rather see https://cmake.org/cmake/help/latest/prop_tgt/CXX_EXTENSIONS.html#prop_tgt:CXX_EXTENSIONS . That is -std=c++17 vs -std=gnu++17. GCC defaults to gnu++17 in newer versions of GCC. So the bug is not in cmake but in your misunderstanding of the defaults and of what is documented (even in the cmake documentation :) ).
[Bug c++/116491] GCC defines macro linux if -std is not set, and does not define otherwise
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116491 --- Comment #6 from Andrew Pinski --- (In reply to Sergey Markelov from comment #3) > This is not a duplicate. The macro is defined conditionally, this is not a > correct behavior. Yes and that is by design. -std=gnu++17 enables some non-standard macros which are considered legacy macros while -std=c++17 does not enable those. This is all documented too, as mentioned in both PRs. The bug is that cmake does not have a way to specify that you want to use C++17 with GNU extensions; only that you want to use the C++17 language. That in itself is not a GCC issue but the way cmake is designed.
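A small sketch of the difference being described, written in C since the same predefined macros are involved there (compare -std=gnu17 with -std=c17). The function names are hypothetical. The assumption, based on the comments above, is that the unreserved `linux` spelling is only predefined in the GNU dialects on Linux targets, while the reserved `__linux__` spelling is available in both.

```c
#include <stdbool.h>

/* True when the legacy, non-reserved `linux` macro is predefined.
   Expected only with GNU dialects (e.g. -std=gnu17) on Linux targets.  */
static bool has_legacy_linux_macro(void)
{
#ifdef linux
  return true;
#else
  return false;
#endif
}

/* True when the reserved spelling `__linux__` is predefined.
   This one does not depend on the GNU vs. strict ISO dialect.  */
static bool has_reserved_linux_macro(void)
{
#ifdef __linux__
  return true;
#else
  return false;
#endif
}
```

Code that needs to detect Linux portably across dialects can test `__linux__` and never rely on `linux`, which also avoids the include-path surprise from bug 84400.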
[Bug c++/116491] GCC defines macro linux if -std is not set, and does not define otherwise
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116491 --- Comment #5 from Andrew Pinski --- Specifically https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84400#c2
[Bug c++/116491] GCC defines macro linux if -std is not set, and does not define otherwise
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116491 --- Comment #4 from Andrew Pinski --- (In reply to Sergey Markelov from comment #3) > This is not a duplicate. The macro is defined conditionally, this is not a > correct behavior. Yes it is. Please read that one and pr84400 .
[Bug target/65128] remove "linux" and "unix" from preprocessor macros from cpp-5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65128 Andrew Pinski changed: What|Removed |Added CC||mihaipop11 at gmail dot com --- Comment #6 from Andrew Pinski --- *** Bug 84400 has been marked as a duplicate of this bug. ***
[Bug c++/84400] “linux” string in path replaced with “1” when using “<>” angle brackets to include a header through a macro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84400 Andrew Pinski changed: What|Removed |Added Resolution|INVALID |DUPLICATE --- Comment #5 from Andrew Pinski --- Dup. *** This bug has been marked as a duplicate of bug 65128 ***
[Bug target/65128] remove "linux" and "unix" from preprocessor macros from cpp-5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65128 Andrew Pinski changed: What|Removed |Added CC||sergio_nsk at yahoo dot de --- Comment #5 from Andrew Pinski --- *** Bug 116491 has been marked as a duplicate of this bug. ***
[Bug c++/116491] GCC defines macro linux if -std is not set, and does not define otherwise
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116491 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #2 from Andrew Pinski --- This is expected behavior see pr 65128 *** This bug has been marked as a duplicate of bug 65128 ***
[Bug c/116489] Conflict between noinit and section __attribute__ makes object files (and static libraries) unnecessarily large
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116489 --- Comment #2 from Andrew Pinski --- >In any case, the behavior seems to be undocumented: Well considering the documentation says: place sections with the .noinit prefix you can assume they conflict :). Also "prefix" is almost always added with `.` which is not exactly documented but that might be documented in the binutils documentation rather than the GCC documentation.
[Bug c/116489] Conflict between noinit and section __attribute__ makes object files (and static libraries) unnecessarily large
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116489 Andrew Pinski changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #1 from Andrew Pinski --- The special casing is handled by the assembler and not by GCC. It uses `.noinit` and `.noinit.*` as the special casing. So you could name the section `.noinit.PSRAM` and it will work the way you want it to work.
[Bug target/114224] popcount RTL cost seems wrong with cssc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114224 --- Comment #3 from Andrew Pinski --- (In reply to Andrew Pinski from comment #2)
> Interesting:
> ```
> int h1(unsigned a)
> {
>   return __builtin_popcountg(a) == 1;
> }
> ```
> works.
>
> Anyways I will be adding POPCOUNT's rtl cost here.
>
> We don't even handle POPCOUNT for vector modes either ...
Nor do we handle:
```
(set (reg:DI 105)
    (zero_extend:DI (unspec:QI [
                (reg:V8QI 107)
            ] UNSPEC_ADDV)))
```
[Bug rtl-optimization/116488] [15 Regression] wrong code at -O{s,2,3} with "-fno-forward-propagate" on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116488 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-26 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 CC||law at gcc dot gnu.org, ||pinskia at gcc dot gnu.org --- Comment #1 from Andrew Pinski --- `-fno-ext-dce` fixes it.
[Bug rtl-optimization/116488] [15 Regression] wrong code at -O{s,2,3} with "-fno-forward-propagate" on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116488 Andrew Pinski changed: What|Removed |Added Version|unknown |15.0 Target Milestone|--- |15.0 Summary|wrong code at -O{s,2,3} |[15 Regression] wrong code |with|at -O{s,2,3} with |"-fno-forward-propagate" on |"-fno-forward-propagate" on |x86_64-linux-gnu|x86_64-linux-gnu
[Bug middle-end/116487] Miscompile with different optimization flags
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116487 Andrew Pinski changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #1 from Andrew Pinski ---
: In function 'b':
:19:18: warning: 'bf' is used uninitialized [-Wuninitialized]
   19 |   *bh = bg ^= bf || 0;
      |               ~~~^~~~
:16:11: note: 'bf' was declared here
   16 |   int64_t bf;
      |           ^~
: In function 'main':
:34:10: warning: iteration 1 invokes undefined behavior [-Waggressive-loop-optimizations]
   34 |         c[i] = crc;
      |         ~^
:27:12: note: within this loop
   27 |   for (; i < 256; i++) {
      |          ~~^
Undefined behavior == different output at different optimization level.
[Bug middle-end/116486] [15 Regression] wrong code with __builtin_stdc_first_leading_one() at -O2 -fno-tree-ccp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116486 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=115337 --- Comment #2 from Andrew Pinski --- Maybe r15-1014-g591d30c5c97e75 .
[Bug middle-end/116486] [15 Regression] wrong code with __builtin_stdc_first_leading_one() at -O2 -fno-tree-ccp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116486 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-08-26 --- Comment #1 from Andrew Pinski --- In GCC 14 (from evrp):
```
Folding statement: _4 = _3 / 0x1;
Queued stmt for removal.  Folds to: 0
Folding statement: _1 = .CLZ (_4, -1);
gimple_simplified to _1 = -1;
```
On the trunk:
```
Folding statement: _4 = _3 / 0x1;
Queued stmt for removal.  Folds to: 0
Folding statement: _1 = .CLZ (_4, -1);
Queued stmt for removal.  Folds to: 128
```
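To make the expected fold concrete, here is a rough C model of the two-argument `.CLZ (x, fallback)` form seen in the dump, on plain `unsigned int` rather than the wider type from the testcase. `clz_or` and `first_leading_one` are hypothetical helpers, and the `+ 1` mapping to `__builtin_stdc_first_leading_one` is my reading of the C23 `stdc_first_leading_one` semantics (1-based position from the most significant bit, 0 when no bit is set), not something taken from the GCC sources.

```c
/* Count leading zeros of x, returning `fallback` when x is zero.
   Models `.CLZ (x, fallback)`; hypothetical helper, not a GCC builtin.  */
static int clz_or(unsigned int x, int fallback)
{
  if (x == 0)
    return fallback;   /* __builtin_clz (0) is undefined, so guard it.  */
  return __builtin_clz(x);
}

/* Assumed relation: first-leading-one is clz + 1, and 0 for a zero
   input, which is why a fallback of -1 is convenient.  */
static int first_leading_one(unsigned int x)
{
  return clz_or(x, -1) + 1;
}
```

With a known-zero argument the GCC 14 result `_1 = -1` matches this model, while folding to the type precision (128) does not.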
[Bug c/116483] RFE: a notion for asm goto to indicate all labels in the function may be jumped to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116483 --- Comment #7 from Andrew Pinski --- (In reply to Alexander Monakov from comment #6) > with the caveats that you'd only get that for future gcc releases I think this caveat is fine as if adding the other feature to asm goto you would also have to wait for a future version of GCC too.
[Bug tree-optimization/116485] CFG cleanup should prune unreachable switch cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116485 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-26 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- .
[Bug tree-optimization/116460] [14/15 Regression] LTO ICE with -g during GIMPLE pass: forwprop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116460 --- Comment #22 from Andrew Pinski --- (In reply to Richard Biener from comment #20) > Unfortunately the following doesn't reproduce the issue. > > #include > #include > > void g(); > > void f(int nBands, double maxZErr) { > for (int iBand = 0; iBand < nBands; iBand++) >{ > g(); > std::vector noDataCandVec; > std::vector distCandVec = {0, 1, 10, 100, 5, 6}; > for (signed char dist : distCandVec) > noDataCandVec.push_back(1); > std::sort(noDataCandVec.begin(), noDataCandVec.end(), > std::greater()); > } > } > > I'll add the preprocessed source. Yes, it depends on some definitions not being inlined. Even the original code had a check on typeinfo which depended on `operator==` not being inlined, but I was able to change that to a simple call of `g()` with no definition. I didn't try to figure out which functions needed to be inlined or not when I was reducing the code. This is also why it originally needed LTO to hit the ICE: the usage across different TUs caused slightly different inlining decisions, and then, as definitions were removed, you could hit it without LTO.
[Bug middle-end/116480] [15 Regression] ICE: in operand_subword_force, at emit-rtl.cc:1824 at -O2 with __builtin_stdc_has_single_bit() and _BitInt(65...128) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116480 Andrew Pinski changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #8 from Andrew Pinski --- Fixed.
[Bug target/116484] Allow constexpr expression in riscv_rvv_vector_bits attribute and arm_sve_vector_bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116484 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-26 Status|UNCONFIRMED |NEW Summary|Allow constexpr expression |Allow constexpr expression |in riscv_rvv_vector_bits|in riscv_rvv_vector_bits |attribute |attribute and ||arm_sve_vector_bits Ever confirmed|0 |1 --- Comment #3 from Andrew Pinski --- Looks like arm_sve_vector_bits will have the same issue. The way the standard attribute allows constexpr, etc to work is to use default_conversion . e.g.: if (size && TREE_CODE (size) != IDENTIFIER_NODE && TREE_CODE (size) != FUNCTION_DECL) size = default_conversion (size); But default_conversion is only available from the C family frontends. Looks like a lang-hook is needed here ...
[Bug target/116484] Allow constexpr expression in riscv_rvv_vector_bits attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116484 Andrew Pinski changed: What|Removed |Added See Also||https://github.com/riscv-no ||n-isa/rvv-intrinsic-doc/iss ||ues/176 --- Comment #2 from Andrew Pinski --- The full documentation in the RVV documentation is not done so
[Bug tree-optimization/116460] [14/15 Regression] LTO ICE with -g during GIMPLE pass: forwprop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116460 --- Comment #16 from Andrew Pinski --- (In reply to Richard Biener from comment #14) > (In reply to Andrew Pinski from comment #13) > > (In reply to Andrew Pinski from comment #8) > > > ./a.ltrans6.ltrans.212t.forwprop4 > > > > > > Removing dead stmt noDataCandVec$_M_start_888 = PHI <_1783(176), > > > _577(186)> > > > ... > > > Removing dead stmt:_598 = _888 + 16; > > > > > > So it looks like we remove the statement defining _888 and then removing > > > the > > > use. > > > The removal of _888 happens directly from forwprop while _598 definition > > > removal comes from simple_dce_from_worklist . > > > > > > The ICE happens because the ssa name _888 has already been freed so the > > > type > > > is null (and not in this case a pointer) since this was originally a > > > pointer > > > plus. > > > > > > Trying to reduce this further. > > > > _888 definition is from a BB which is going to be removed so we should not > > need to mark its uses as being needed for dce worklist. But I am not sure > > how to detect that case. > > forwprop shouldn't remove _888 if there's a use left. When adding > simple_dce_from_worklist, did you remove some manual stmt removal > (adding to to_remove)? Having both is a bit ugly (see also > remove_prop_source_from_use), but the sets need to be separate to > avoid interactions like this. Richard, you are the one who added simple_dce_from_worklist to forwprop in the end; I had originally tried without the manual removal but ran into regressions, so I didn't submit that version.
[Bug c/116483] RFE: a notion for asm goto to indicate all labels in the function may be jumped to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116483 --- Comment #3 from Andrew Pinski --- > But for satisfying some tools analyzing the generated machine code Also this sounds like a limitation in the tool analyzing the generated code and outside of gcc; I know helping the tool along is useful but that sounds like a workaround instead of fixing the tool.
[Bug c/116483] RFE: a notion for asm goto to indicate all labels in the function may be jumped to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116483 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-08-26 Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING --- Comment #2 from Andrew Pinski --- I am trying to understand the use case here. Is it for someone analyzing the code after the fact or for something else. Can you provide a full example of what you want and explain why computed goto cannot be used instead of the inline-asm. It is not obvious from reading your explanation why you need the asm goto here.
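For anyone unfamiliar with the alternative mentioned here, this is a minimal sketch of computed goto (the GNU C "labels as values" extension) used as a dispatch mechanism. The opcode names and the `run` function are invented for illustration and say nothing about the reporter's actual code. The point is that every possible jump target appears in the `dispatch` table, so the compiler can see all of them.

```c
/* Hypothetical opcodes for a toy accumulator machine.  */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Runs `code` until OP_HALT and returns the accumulator.
   Uses the GNU C labels-as-values extension (&&label, goto *ptr).  */
static int run(const unsigned char *code, int acc)
{
  /* Every label that can be reached indirectly is listed here.  */
  static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };

  goto *dispatch[*code++];

do_inc:
  acc += 1;
  goto *dispatch[*code++];

do_double:
  acc *= 2;
  goto *dispatch[*code++];

do_halt:
  return acc;
}
```

Whether this covers the use case in this RFE depends on why the jump has to originate from inline asm, which is exactly the question asked above.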
[Bug c++/116484] Allow constexpr expression in riscv_rvv_vector_bits attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116484 --- Comment #1 from Andrew Pinski --- This attribute is not documented so ...
[Bug tree-optimization/116481] [12/13/14/15 Regression] `arrays of functions are not meaningful` error message happens with -W -Wall -O2 even though there are no arrays of function types used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116481 --- Comment #5 from Andrew Pinski --- (In reply to Bruno Haible from comment #4) > Why? This code is accessing read-only memory near the address of the 'tramp' > function. Why would it need 'volatile' when doing so? (I don't claim that > this is portable ISO C code; it's doing a low-level thing. But C is meant > to be used for low-level things.) Because it is accessing memory before the beginning of the function pointer. >But C is meant to be used for low-level things. There is still undefined behavior in C, and accessing before the start of an array (in this case the function pointer) is undefined. Standard C says function pointers and other pointers don't need to be the same size and need not be convertible to each other (though that is a requirement for POSIX). With -pedantic-errors we get an error message due to this :):
```
: In function 'is_trampoline':
:5:25: error: ISO C forbids initialization between function pointer and 'void *' [-Wpedantic]
    5 |   void* tramp_address = tramp;
      |                         ^
```
[Bug tree-optimization/116460] [14/15 Regression] LTO ICE with -g during GIMPLE pass: forwprop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116460 --- Comment #13 from Andrew Pinski --- (In reply to Andrew Pinski from comment #8) > ./a.ltrans6.ltrans.212t.forwprop4 > > Removing dead stmt noDataCandVec$_M_start_888 = PHI <_1783(176), _577(186)> > ... > Removing dead stmt:_598 = _888 + 16; > > So it looks like we remove the statement defining _888 and then removing the > use. > The removal of _888 happens directly from forwprop while _598 definition > removal comes from simple_dce_from_worklist . > > The ICE happens because the ssa name _888 has already been freed so the type > is null (and not in this case a pointer) since this was originally a pointer > plus. > > Trying to reduce this further. _888 definition is from a BB which is going to be removed so we should not need to mark its uses as being needed for dce worklist. But I am not sure how to detect that case.
[Bug tree-optimization/116481] [12/13/14/15 Regression] Compilation error caused by -Warray-bounds and -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116481 Andrew Pinski changed: What|Removed |Added Known to fail||12.1.0 Last reconfirmed||2024-08-25 Target Milestone|--- |12.5 Ever confirmed|0 |1 Keywords||needs-bisection Status|UNCONFIRMED |NEW Summary|Compilation error caused by |[12/13/14/15 Regression] |-Warray-bounds and -O2 |Compilation error caused by ||-Warray-bounds and -O2 Known to work||11.1.0 --- Comment #3 from Andrew Pinski --- Confirmed. Note I think this code is undefined really unless you use the volatile. Now we should not reject it though. The error message happens on x86_64 also.
[Bug tree-optimization/116481] Compilation error caused by -Warray-bounds and -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116481 Andrew Pinski changed: What|Removed |Added Blocks||56456 Target|hppa-linux-gnu |*-*-* --- Comment #2 from Andrew Pinski --- hppa-linux-gnu uses Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456 [Bug 56456] [meta-bug] bogus/missing -Warray-bounds
[Bug middle-end/116480] [15 Regression] ICE: in operand_subword_force, at emit-rtl.cc:1824 at -O2 with __builtin_stdc_has_single_bit() and _BitInt(65...128) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116480 Andrew Pinski changed: What|Removed |Added Keywords||patch URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2024-August/ ||661414.html --- Comment #6 from Andrew Pinski --- Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661414.html
[Bug tree-optimization/116463] [15 Regression] fast-math-complex-mls-{double,float}.c fail after r15-3087-gb07f8a301158e5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116463 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=105095 --- Comment #10 from Andrew Pinski --- The current failures/xpasses for aarch64 are:
XPASS: gcc.dg/vect/complex/fast-math-complex-add-half-float.c scan-tree-dump-times vect "stmt.*COMPLEX_ADD_ROT270" 1
XPASS: gcc.dg/vect/complex/fast-math-complex-add-half-float.c scan-tree-dump-times vect "stmt.*COMPLEX_ADD_ROT90" 1
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-double.c scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-float.c scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
FAIL: gcc.dg/vect/complex/fast-math-complex-mls-half-float.c scan-tree-dump vect "Found COMPLEX_ADD_ROT270"
[Bug rtl-optimization/116479] [15 Regression] wrong code with -O -funroll-loops -finline-stringops -fmodulo-sched --param=max-iterations-computation-cost=637924876 on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116479 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=110791 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-08-25 --- Comment #1 from Andrew Pinski --- Confirmed. Looks like fmodulo-sched didn't do anything for the testcase in GCC 14 but does on the trunk. So maybe a latent bug ...
[Bug rtl-optimization/116479] [15 Regression] wrong code with -O -funroll-loops -finline-stringops -fmodulo-sched --param=max-iterations-computation-cost=637924876 on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116479 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |15.0