[Bug tree-optimization/104368] [12 Regression] Failure to vectorise conditional grouped accesses after PR102659
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104368

Richard Biener changed:

What             |Removed     |Added
Ever confirmed   |0           |1
Last reconfirmed |            |2022-02-04
Status           |UNCONFIRMED |NEW
CC               |            |amacleod at redhat dot com

--- Comment #1 from Richard Biener ---
Confirmed. On x86 with AVX2 we don't get this vectorized anymore for the same reason.

t.c:5:15: missed: failed: evolution of base is not affine.
        base_address:
        offset from base address:
        constant offset from base address:
        step:
        base alignment: 0
        base misalignment: 0
        offset alignment: 0
        step alignment: 0
        base_object: *_8
Creating dr for *_12

if-conversion now produces

...
  _47 = (unsigned long) y_21(D);
..
  # i_26 = PHI
  _1 = (long unsigned int) i_26;
  _2 = _1 * 4;
  _3 = x_20(D) + _2;
  _4 = *_3;
  _45 = (unsigned int) i_26;
  _46 = _45 * 2;
  _5 = (int) _46;
  _6 = (long unsigned int) _5;
  _7 = _6 * 4;
  _48 = _47 + _7;
  _8 = (int *) _48;
  _49 = _4 > 0;
  _9 = .MASK_LOAD (_8, 32B, _49);
  _10 = _6 + 1;
  _11 = _10 * 4;
  _51 = _11 + _47;
  _12 = (int *) _51;
  _13 = .MASK_LOAD (_12, 32B, _49);
  _52 = (unsigned int) _9;
  _53 = (unsigned int) _13;
  _54 = _52 + _53;
  _14 = (int) _54;
  .MASK_STORE (_3, 32B, _49, _14);
  i_23 = i_26 + 1;
  if (n_19(D) > i_23)
    goto ; [89.00%]
  else
    goto ; [11.00%]

note that if-conversion is correct in rewriting i*2 and i*2 + 1 to unsigned arithmetic since that will now execute unconditionally and can overflow.

In the end the issue is that the multiplication by the element size is done in sizetype and so y[i*2] and y[i*2+1] might not be adjacent. What we miss is that iff the stmts were executed then because of undefined overflow they will always be adjacent.

IMHO the only good way to recover is to scrap the separate if-conversion step and do vectorization on the original IL. Or integrate the two passes as much as to allow dataref analysis on the not if-converted IL.

Another possibility (and long-standing TODO) is to teach SCEV analysis to derive assumptions we can version the loop on - in this case that i*2 + 1 does not overflow.

Note in this particular case we probably miss to see that i is in [0,INT_MAX-1] and thus (unsigned)i * 2 + 1 never wraps (unless I miss something). We have

  [local count: 955630226]:
  # RANGE [0, 2147483647] NONZERO 2147483647
  # i_26 = PHI
  # RANGE [0, 2147483646] NONZERO 2147483647
  _1 = (long unsigned int) i_26;
  # RANGE [0, 8589934584] NONZERO 8589934588
  _2 = _1 * 4;
  # PT = null { D.2435 } (nonlocal, restrict)
  _3 = x_20(D) + _2;
  _4 = MEM[(int *)_3 clique 1 base 1];
  _45 = (unsigned int) i_26;
  _46 = _45 * 2;
  _5 = (int) _46;
  _6 = (long unsigned int) _5;
  _7 = _6 * 4;
  _48 = _47 + _7;

so unfortunately while _1 has that correct range, i_26 does not and the ifcvt generated stmts don't either. It might be possible to throw ranger on the if-converted body.

Andrew - if we'd like to do that, in tree-if-conv.cc in tree_if_conversion () after we've produced the final IL (after the call to ifcvt_hoist_invariants), is there a way to invoke ranger on the stmts of the (single-BB) loop and have it adjust the global ranges? In particular - see above, it would need to somehow improve the global range of the i_26 IV. The pass creates blocks and destroys edges, so I'm not sure if we can reasonably use a caching instance over its lifetime so cost per loop would be a limiting factor.
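The loop shape discussed in this comment can be sketched roughly as follows. The original t.c is not part of this message, so the source below is only a reconstruction from the GIMPLE dump; the names cond_grouped_add, max_odd_index and run_example are hypothetical. The static_assert spells out the arithmetic behind the remark that (unsigned)i * 2 + 1 never wraps for i in [0, INT_MAX-1].

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical reconstruction of the loop implied by the dump: a conditional
// store fed by a grouped pair of loads, y[i*2] and y[i*2+1].
void cond_grouped_add(int *__restrict x, const int *__restrict y, int n)
{
  assert(n >= 0);
  for (int i = 0; i < n; ++i)
    if (x[i] > 0)
      x[i] = y[i * 2] + y[i * 2 + 1];
}

// Inside the loop i < n <= INT_MAX, so i is at most INT_MAX - 1.
// The largest odd index, computed in 64 bits, still fits in unsigned int.
constexpr unsigned long long max_odd_index = 2ull * (INT_MAX - 1) + 1;
static_assert(max_odd_index <= UINT_MAX,
              "(unsigned)i * 2 + 1 cannot wrap for i in [0, INT_MAX - 1]");

// Small driver that exercises the scalar semantics on fixed data.
std::vector<int> run_example()
{
  std::vector<int> x = {1, -1, 3, 0};
  std::vector<int> y = {10, 20, 30, 40, 50, 60, 70, 80};
  cond_grouped_add(x.data(), y.data(), static_cast<int>(x.size()));
  return x;
}
```

Only the elements of x that are positive are overwritten, which is why if-conversion has to turn the loads and the store into masked operations.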
[Bug c/17170] add warning for bitfield declarations where the presence of a signbit (or lack thereof) could lead to confusion [-Wdefault-bitfield-sign]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=17170

Eric Gallager changed:

What    |Removed                 |Added
Summary |-Wdefault-bitfield-sign |add warning for bitfield declarations where the presence of a signbit (or lack thereof) could lead to confusion [-Wdefault-bitfield-sign]

--- Comment #11 from Eric Gallager ---
making the title more descriptive
[Bug c++/12341] Request for additional warning for variable shadowing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12341 --- Comment #5 from Eric Gallager --- is this expected to be a new argument accepted by the `-Wshadow=` flag, or its own separate flag entirely?
[Bug target/104364] [12 Regression] OpenMP/nvptx regressions after "[nvptx] Add some support for .local atomics"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104364

--- Comment #6 from Thomas Schwinge ---
Thanks for having confirmed my findings and doubts -- seems I did correctly understand a thing or two. ;-)

(In reply to Tom de Vries from comment #5)
> (In reply to Thomas Schwinge from comment #0)
> > ... but only seen regressing for: [...]
> >
> > ... and never seen regressing for: [...]
> >
> > (What is the underlying characteristic here?)
>
> Good question.
>
> I've tested this using (recommended) driver 470.94 on boards: [...]
> while iterating over dimensions { -mptx=3.1 , -mptx=6.3 } x {
> GOMP_NVPTX_JIT=-O0, }.
>
> So I'm slightly surprised that I didn't see any regressions.

If indeed we're now generating some bad 'atom' code, it sure is confusing why execution anyway PASSes for quite a number of configurations? Are we just "lucky", or is there some more fundamental issue that we're not even properly using the concurrency in these configurations (and thus don't notice the 'atom' issues)?
[Bug middle-end/103641] [11/12 regression] Severe compile time regression in SLP vectorize step
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103641

Richard Biener changed:

What |Removed |Added
CC   |        |jakub at gcc dot gnu.org, kyrylo.tkachov at arm dot com

--- Comment #26 from Richard Biener ---
I'm testing a patch along comment#25 - CCing Kyrylo who seems to have authored the code. It doesn't address the issue noted by Roger that we use MAX_COST but as said in the comment it improves compile-time quite a bit and it should also produce better sequences since we base the cost on the vector mode that will be used rather than the scalar mode (assuming the vector ops are costed in a non-random way - which is likely where this will fail).
[Bug middle-end/104077] bogus/missing -Wdangling-pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104077
Bug 104077 depends on bug 104092, which changed state.

Bug 104092 Summary: [12 Regression] Invalid -Wdangling-pointer warning after writes by calls
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104092

What       |Removed  |Added
Status     |ASSIGNED |RESOLVED
Resolution |---      |FIXED
[Bug middle-end/104092] [12 Regression] Invalid -Wdangling-pointer warning after writes by calls
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104092

Richard Biener changed:

What       |Removed  |Added
Resolution |---      |FIXED
Status     |ASSIGNED |RESOLVED

--- Comment #3 from Richard Biener ---
Should be fixed now.
[Bug middle-end/90348] [9/10/11/12 Regression] Partition of char arrays is incorrect in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90348

--- Comment #24 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:551aa75778a4c5165d9533cd447c8fc822f583e1

commit r12-7044-g551aa75778a4c5165d9533cd447c8fc822f583e1
Author: Richard Biener
Date:   Wed Feb 2 14:24:39 2022 +0100

    Add CLOBBER_EOL to mark storage end-of-life clobbers

    This adds a flag to CONSTRUCTOR nodes indicating that for clobbers this marks the end-of-life of storage as opposed to just ending the lifetime of the object that occupied it. The dangling pointer diagnostics uses CLOBBERs but is confused by those emitted by the C++ frontend for example which emits them for the second purpose at the start of CTORs. The issue is also apparent for aarch64 in PR104092.

    Distinguishing the two cases is also necessary for the PR90348 fix.

    Since I'm going to add another flag I added an enum clobber_flags and a defaulted argument to build_clobber plus a convenient way to query the enum from the CTOR tree and specify it for gimple_clobber_p. Since 'CLOBBER' is already taken and I needed a name for the unspecified clobber we have now I used 'CLOBBER_UNDEF'.

    2022-02-03  Richard Biener

            PR middle-end/90348
            PR middle-end/104092

    gcc/
            * tree-core.h (clobber_kind): New enum.
            (tree_base::u::bits::address_space): Document use in CONSTRUCTORs.
            * tree.h (CLOBBER_KIND): Add.
            (build_clobber): Add clobber kind argument, defaulted to CLOBBER_UNDEF.
            * tree.cc (build_clobber): Likewise.
            * gimple.h (gimple_clobber_p): New overload with specified kind.
            * tree-streamer-in.cc (streamer_read_tree_bitfields): Stream CLOBBER_KIND.
            * tree-streamer-out.cc (streamer_write_tree_bitfields): Likewise.
            * tree-pretty-print.cc (dump_generic_node): Mark EOL CLOBBERs.
            * gimplify.cc (gimplify_bind_expr): Build storage end-of-life clobbers with CLOBBER_EOL.
            (gimplify_target_expr): Likewise.
            * tree-inline.cc (expand_call_inline): Likewise.
            * tree-ssa-ccp.cc (insert_clobber_before_stack_restore): Likewise.
            * gimple-ssa-warn-access.cc (pass_waccess::check_stmt): Only treat CLOBBER_EOL clobbers as ending lifetime of storage.

    gcc/lto/
            * lto-common.cc (compare_tree_sccs_1): Compare CLOBBER_KIND.

    gcc/testsuite/
            * gcc.dg/pr87052.c: Adjust.
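The distinction the commit message draws, between the end of an object's lifetime and the end of life of the storage it occupied, can be illustrated at the source level. The following is only a sketch in plain C++ and makes no claim about where GCC actually emits clobbers beyond what the message says; Counter and reuse_storage are hypothetical names.

```cpp
#include <cassert>
#include <new>

struct Counter {
  int value;
  explicit Counter(int v) : value(v) {}
};

// One piece of storage hosts two objects in sequence. Destroying the first
// object ends that object's lifetime, but the storage stays live and can be
// reused; only leaving the function ends the life of the storage itself.
int reuse_storage()
{
  alignas(Counter) unsigned char storage[sizeof(Counter)];

  Counter *first = new (storage) Counter(1);
  int seen = first->value;
  assert(seen == 1);
  first->~Counter();              // object lifetime ends, storage remains

  Counter *second = new (storage) Counter(41);  // new object, same storage
  seen += second->value;
  second->~Counter();

  return seen;
}                                 // storage end-of-life: pointers into it dangle from here on
```

A diagnostic that treats both events alike would flag the second use of the storage, which is the kind of confusion the message describes for clobbers emitted at the start of constructors.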
[Bug middle-end/104092] [12 Regression] Invalid -Wdangling-pointer warning after writes by calls
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104092

--- Comment #2 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:551aa75778a4c5165d9533cd447c8fc822f583e1

commit r12-7044-g551aa75778a4c5165d9533cd447c8fc822f583e1
Author: Richard Biener
Date:   Wed Feb 2 14:24:39 2022 +0100

    Add CLOBBER_EOL to mark storage end-of-life clobbers

    This adds a flag to CONSTRUCTOR nodes indicating that for clobbers this marks the end-of-life of storage as opposed to just ending the lifetime of the object that occupied it. The dangling pointer diagnostics uses CLOBBERs but is confused by those emitted by the C++ frontend for example which emits them for the second purpose at the start of CTORs. The issue is also apparent for aarch64 in PR104092.

    Distinguishing the two cases is also necessary for the PR90348 fix.

    Since I'm going to add another flag I added an enum clobber_flags and a defaulted argument to build_clobber plus a convenient way to query the enum from the CTOR tree and specify it for gimple_clobber_p. Since 'CLOBBER' is already taken and I needed a name for the unspecified clobber we have now I used 'CLOBBER_UNDEF'.

    2022-02-03  Richard Biener

            PR middle-end/90348
            PR middle-end/104092

    gcc/
            * tree-core.h (clobber_kind): New enum.
            (tree_base::u::bits::address_space): Document use in CONSTRUCTORs.
            * tree.h (CLOBBER_KIND): Add.
            (build_clobber): Add clobber kind argument, defaulted to CLOBBER_UNDEF.
            * tree.cc (build_clobber): Likewise.
            * gimple.h (gimple_clobber_p): New overload with specified kind.
            * tree-streamer-in.cc (streamer_read_tree_bitfields): Stream CLOBBER_KIND.
            * tree-streamer-out.cc (streamer_write_tree_bitfields): Likewise.
            * tree-pretty-print.cc (dump_generic_node): Mark EOL CLOBBERs.
            * gimplify.cc (gimplify_bind_expr): Build storage end-of-life clobbers with CLOBBER_EOL.
            (gimplify_target_expr): Likewise.
            * tree-inline.cc (expand_call_inline): Likewise.
            * tree-ssa-ccp.cc (insert_clobber_before_stack_restore): Likewise.
            * gimple-ssa-warn-access.cc (pass_waccess::check_stmt): Only treat CLOBBER_EOL clobbers as ending lifetime of storage.

    gcc/lto/
            * lto-common.cc (compare_tree_sccs_1): Compare CLOBBER_KIND.

    gcc/testsuite/
            * gcc.dg/pr87052.c: Adjust.
[Bug tree-optimization/104356] [12 Regression] divide by zero trap incorrectly optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356

--- Comment #41 from Richard Biener ---
(In reply to rguent...@suse.de from comment #40)
> On Thu, 3 Feb 2022, amacleod at redhat dot com wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356
> >
> > --- Comment #37 from Andrew Macleod ---
> > (In reply to Jakub Jelinek from comment #35)
> > > I meant something like:
> > > return Z / X;
> >
> > > and there evrp does with -O2 -gnatp optimize away the division.
> > > Though that is likely the X / boolean_range_Y case which you've disabled.
> > > In any case, I think you want to hear from Andrew/Aldy where exactly does
> > > VRP/ranger assume UB on integer division by zero.
> >
> > That divide is removed by the simplifier because it determines that X has a
> > range of [0,1] and I believe the simplifier chooses to ignore the 0 under
> > various circumstances.
> >
> > As for ranger, range-ops will return UNDEFINED for the range if x is known
> > to be [0,0]. This can be propagated around, and depending on how it ends up
> > being used as to what happens with it.
>
> I think that's OK as outgoing range (on the non-exceptional path - on the
> exceptional path the result isn't computed). That just may not be used
> to simplify the stmt producing the range itself of course.

That said, range-ops, from say [0,1] = [0,2] / y; may _not_ reason that 'y' is not 0 when non-call EH. That is, you need to be careful on the reverse ops but I think not on the forward ops.
[Bug tree-optimization/104356] [12 Regression] divide by zero trap incorrectly optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356

--- Comment #40 from rguenther at suse dot de ---
On Thu, 3 Feb 2022, amacleod at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356
>
> --- Comment #37 from Andrew Macleod ---
> (In reply to Jakub Jelinek from comment #35)
> > I meant something like:
> > return Z / X;
>
> > and there evrp does with -O2 -gnatp optimize away the division.
> > Though that is likely the X / boolean_range_Y case which you've disabled.
> > In any case, I think you want to hear from Andrew/Aldy where exactly does
> > VRP/ranger assume UB on integer division by zero.
>
> That divide is removed by the simplifier because it determines that X has a
> range of [0,1] and I believe the simplifier chooses to ignore the 0 under
> various circumstances.
>
> As for ranger, range-ops will return UNDEFINED for the range if x is known to
> be [0,0]. This can be propagated around, and depending on how it ends up
> being used as to what happens with it.

I think that's OK as outgoing range (on the non-exceptional path - on the exceptional path the result isn't computed). That just may not be used to simplify the stmt producing the range itself of course.
[Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376

Andrew Pinski changed:

What       |Removed |Added
Depends on |        |104378

--- Comment #3 from Andrew Pinski ---
Filed PR 104378 for the (31 - x) ^ 31 issue.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104378
[Bug 104378] (N - x) ^ N should be optimized to x if x <= N (unsigned) and N is a pow2 - 1
[Bug tree-optimization/104378] New: (N - x) ^ N should be optimized to x if x <= N (unsigned) and N is a pow2 - 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104378

Bug ID: 104378
Summary: (N - x) ^ N should be optimized to x if x <= N (unsigned) and N is a pow2 - 1
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---

Take:
#define n 8
#define N ((1u< N) __builtin_unreachable();
  return (N - x) ^ N;
}

This should be optimized to just return x;
Like it is done by LLVM.
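The identity in the summary can be checked by brute force. When N has the form 2^k - 1, subtracting any x <= N from N never borrows, so N - x equals N ^ x, and XORing with N again gives back x. The sketch below is not the reporter's test case; is_low_mask, identity_holds and self_check are hypothetical names, and N = 255 assumes the report's n = 8 is meant as 2^8 - 1.

```cpp
#include <cassert>
#include <cstdint>

// True if n has the form 2^k - 1, i.e. all of its low k bits are set.
constexpr bool is_low_mask(std::uint32_t n) { return (n & (n + 1)) == 0; }

// Returns true if (N - x) ^ N == x holds for every x in [0, N].
bool identity_holds(std::uint32_t N)
{
  for (std::uint32_t x = 0; x <= N; ++x) {
    if (((N - x) ^ N) != x)
      return false;
    if (x == N)
      break;  // avoid wrapping x when N is the maximum value
  }
  return true;
}

// The identity holds for a pow2 - 1 mask and fails for other values of N.
inline void self_check()
{
  assert(is_low_mask(255u) && identity_holds(255u));
  assert(!is_low_mask(5u) && !identity_holds(5u));  // x = 2: (5 - 2) ^ 5 == 6
}
```

The failing case shows why the "N is a pow2 - 1" condition in the summary matters: for N = 5 the subtraction borrows across bits and the XOR no longer undoes it.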
[Bug ipa/104377] New: Unreachable code in create_specialized_node of ipa-prop.c?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104377

Bug ID: 104377
Summary: Unreachable code in create_specialized_node of ipa-prop.c?
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ipa
Assignee: unassigned at gcc dot gnu.org
Reporter: fxue at os dot amperecomputing.com
CC: marxin at gcc dot gnu.org
Target Milestone: ---

For function create_specialized_node(), the "node" operated on always seems to be an original cgraph node, never a clone node. From the call graph related to the function, we know that ipcp_decision_stage () only passes raw cgraph nodes downwards to its callees. Then, the "node" reaching create_specialized_node() would not be a clone, so the code enclosed by "if (old_adjustments)" might be of no use. But I am not sure if there is something that I missed.

ipcp_driver
  |
  '--> ipcp_decision_stage
         |
         '--> decide_whether_version_node
                |
                |--> decide_about_value
                |      |
                '------'--> create_specialized_node
[Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376

--- Comment #2 from Andrew Pinski ---
The second issue can be seen with:
#include
uint32_t countLeadingZeros32(uint32_t x)
{
    if (x == 0)
        return 32;
    return (__builtin_clz(x)) ;
}

This gets optimized for aarch64 at the rtl level but not for x86_64 with -mlzcnt.
[Bug tree-optimization/104376] Failure to optimize clz equivalent to clz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376

Andrew Pinski changed:

What             |Removed     |Added
Severity         |normal      |enhancement
Last reconfirmed |            |2022-02-04
Status           |UNCONFIRMED |NEW
Ever confirmed   |0           |1
Target           |            |aarch64-*-* x86_64-*-* (with -mlzcnt)

--- Comment #1 from Andrew Pinski ---
There are two issues, both are tree level issues, though the second one works on the RTL level just fine.

Right now we have:
  _1 = __builtin_clz (x_5(D));
  _2 = 31 - _1;
  _3 = _2 ^ 31;

But the _3 can be optimized to just _1.
[Bug target/103197] ppc inline expansion of memcpy/memmove should not use lxsibzx/stxsibx for a single byte
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103197

--- Comment #12 from Segher Boessenkool ---
(In reply to HaoChen Gui from comment #11)
> Segher,
> Will you commit your patch in stage4? Several issues are supposed to be
> fixed by your patch. Thanks.

Yes, of course, but there have been complications. In particular this had to be *thoroughly* tested.
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

Óscar Fuentes changed:

What       |Removed   |Added
Status     |SUSPENDED |RESOLVED
Resolution |---       |FIXED

--- Comment #12 from Óscar Fuentes ---
(In reply to Jonathan Wakely from comment #9)
> (In reply to Óscar Fuentes from comment #6)
> > So IIUC you are applying modifications to libstdc++ that deviate from the
> > published standard expecting that the committee will accept those changes.
> > As a user, this is troublesome, because right now I need to special-case gcc
> > version >11.2 and maybe version
> > not accepted and is reverted.
>
> Why do you need to special case anything? What problem are these extra const
> qualifications causing you?

One project here consists of a compiler for a certain strict, statically typed language that transparently interacts with C++ code bases. We have a mechanism for inferring the signature of C/C++ functions and automatically creating wrappers for them, using a combination of macros and templates. For instance, this is how std::atomic_notify_all is reflected:

LP0_FFI_FN_OV("notify-all", void, (std::atomic*), std::atomic_notify_all);

The "_OV" means "overloaded", "void" is the type returned, (std::atomic*) is the argument list. If the returned type and argument list do not match an overload of std::atomic_notify_all, the C++ compiler throws an error.

For libstdc++ we could simply use

LP0_FFI_FN("notify-all", std::atomic_notify_all>);

and let our template machinery deduce the signature of std::atomic_notify_all, but other implementations (libc++) do provide the "volatile" overload, so we are forced to explicitly tell the compiler which overload we want.

Thus, if the function's signature differs from one implementation to another, we need to detect the correct signature and use it on each instantiation of std::atomic_notify_all et al. that we reflect. To make things worse, some distros picked up the change and incorporated it into their gcc 11.2 packages.

I'm afraid the only solution is a platform check at configure time plus the corresponding macro-sprinkling on our C++ sources. A hairy mess for what otherwise would be something quite simple and clean.

There are other technical inconveniences related to our precise use case that would be too long to explain here.

Anyway, thanks for explaining the state of affairs. I understand your POV, so I'm closing this issue.
[Bug tree-optimization/104376] New: Failure to optimize clz equivalent to clz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104376

Bug ID: 104376
Summary: Failure to optimize clz equivalent to clz
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---

#include
uint32_t countLeadingZeros32(uint32_t x)
{
    if (x == 0)
        return 32;
    return (31 - __builtin_clz(x)) ^ 31;
}

On x86, with `-mlzcnt`, GCC outputs this:

countLeadingZeros32(unsigned int):
  mov eax, 32
  test edi, edi
  je .L1
  mov eax, 31
  lzcnt edi, edi
  sub eax, edi
  xor eax, 31
.L1:
  ret

LLVM instead outputs this:

countLeadingZeros32(unsigned int):
  lzcnt eax, edi
  ret
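That the XOR form in the report computes the same value as a plain count of leading zeros can be checked with a small harness. This is a sketch only, assuming a GCC-compatible compiler (for __builtin_clz) and a 32-bit unsigned int; clz_via_xor, clz_direct and forms_agree are hypothetical names, not part of the report.

```cpp
#include <cassert>
#include <cstdint>

// The form from the report. __builtin_clz is undefined for 0, hence the guard.
std::uint32_t clz_via_xor(std::uint32_t x)
{
  if (x == 0)
    return 32;
  return (31 - __builtin_clz(x)) ^ 31;
}

// The form the report expects the compiler to reduce it to.
std::uint32_t clz_direct(std::uint32_t x)
{
  if (x == 0)
    return 32;
  return __builtin_clz(x);
}

// Compares both forms on each power of two and on its immediate neighbours.
bool forms_agree()
{
  for (int bit = 0; bit < 32; ++bit) {
    const std::uint32_t v = std::uint32_t(1) << bit;
    const std::uint32_t samples[3] = {v, v | 1u, v - 1u};
    for (std::uint32_t s : samples) {
      assert(clz_direct(s) <= 32);
      if (clz_via_xor(s) != clz_direct(s))
        return false;
    }
  }
  return true;
}
```

Since __builtin_clz(x) is always in [0, 31] for nonzero x, this is the N = 31 case of the (N - x) ^ N identity tracked in PR 104378.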
[Bug rtl-optimization/101885] [10/11/12 Regression] wrong code at -O3 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101885

--- Comment #12 from Segher Boessenkool ---
(In reply to Jakub Jelinek from comment #8)
> The failed match attempt
> (parallel [
>         (set (reg:QI 82 [ b_lsm_flag.26 ])
>             (and:QI (reg:QI 143)
>                 (reg:QI 145)))
>         (set (reg:CCZ 17 flags)
>             (compare:CCZ (and:QI (reg:QI 143)
>                     (reg:QI 145))
>                 (const_int 0 [0])))
>     ])
> actually looks almost good, except that it would need to try them in the
> other order in the parallel.
> I must say I forgot whether the flags first then operation ordering is now
> canonical everywhere, or whether some backends want one and others another
> one.

It is canonical, and has been since literally forever. 81ad201ac5f6 (from 2017) makes this explicit in our documentation. compare-elim.c used to get this wrong, but it was fixed in 4f0473fe89e6 (also 2017).

> But I vaguely remember there are various passes that only work with the
> ordering x86 has.

Only compare-elim did. It was fixed.
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

Thomas Rodgers changed:

What   |Removed  |Added
Status |REOPENED |SUSPENDED

--- Comment #11 from Thomas Rodgers ---
(In reply to Jonathan Wakely from comment #10)
> N.B. [member.functions] in the standard says
>
> "For a non-virtual member function described in the C++ standard library, an
> implementation may declare a different set of member function signatures,
> provided that any call to the member function that would select an overload
> from the set of declarations described in this document behaves as if that
> overload were selected."
>
> In general, being declared with a different signature is permitted.
>
> Do you have an example where a call to std::atomic::notify_one() that
> should be valid according to the standard either fails to compile or
> misbehaves, as a result of being const qualified?

Pending the outcome of whether there is an LWG issue with the wording, and given this, I am going to mark this issue SUSPENDED.
[Bug middle-end/101926] [meta-bug] struct/complex argument passing and return should be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926
Bug 101926 depends on bug 99712, which changed state.

Bug 99712 Summary: Cannot elide aggregate parameter setup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99712

What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |DUPLICATE
[Bug target/50883] [ARM] Suboptimal optimization for small structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50883

Andrew Pinski changed:

What |Removed |Added
CC   |        |rguenth at gcc dot gnu.org

--- Comment #8 from Andrew Pinski ---
*** Bug 99712 has been marked as a duplicate of this bug. ***
[Bug rtl-optimization/99712] Cannot elide aggregate parameter setup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99712

Andrew Pinski changed:

What       |Removed     |Added
Resolution |---         |DUPLICATE
Status     |UNCONFIRMED |RESOLVED

--- Comment #1 from Andrew Pinski ---
Dup of bug 50883.

*** This bug has been marked as a duplicate of bug 50883 ***
[Bug middle-end/101926] [meta-bug] struct/complex argument passing and return should be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926
Bug 101926 depends on bug 104372, which changed state.

Bug 104372 Summary: [ARM] Unnecessary writes to stack when passing aggregate in registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104372

What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |DUPLICATE
[Bug rtl-optimization/104372] [ARM] Unnecessary writes to stack when passing aggregate in registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104372

Andrew Pinski changed:

What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |DUPLICATE

--- Comment #2 from Andrew Pinski ---
Dup of bug 50883.

*** This bug has been marked as a duplicate of bug 50883 ***
[Bug target/50883] [ARM] Suboptimal optimization for small structures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50883

Andrew Pinski changed:

What |Removed |Added
CC   |        |palchak at google dot com

--- Comment #7 from Andrew Pinski ---
*** Bug 104372 has been marked as a duplicate of this bug. ***
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994 --- Comment #10 from Jonathan Wakely --- N.B. [member.functions] in the standard says "For a non-virtual member function described in the C++ standard library, an implementation may declare a different set of member function signatures, provided that any call to the member function that would select an overload from the set of declarations described in this document behaves as if that overload were selected." In general, being declared with a different signature is permitted. Do you have an example where a call to std::atomic::notify_one() that should be valid according to the standard either fails to compile or misbehaves, as a result of being const qualified?
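The point about [member.functions] can be made concrete with two stand-in classes. This is a sketch only: published_signature and const_qualified_signature are hypothetical and are not the libstdc++ std::atomic. Ordinary calls on a non-const object behave the same with either declaration; only code that spells out the exact member signature can observe the difference.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for illustration; not the libstdc++ classes.
struct published_signature {
  int calls = 0;
  void notify_one() { ++calls; }        // non-const member
};

struct const_qualified_signature {
  mutable int calls = 0;
  void notify_one() const { ++calls; }  // same member with an extra const
};

// An ordinary call on a non-const object works for both declarations.
template <class T>
int call_on_non_const_object()
{
  T obj;
  obj.notify_one();
  assert(obj.calls == 1);
  return obj.calls;
}

// Code that names the exact pointer-to-member type can tell them apart.
template <class T>
constexpr bool has_non_const_signature()
{
  return std::is_same<decltype(&T::notify_one), void (T::*)()>::value;
}
```

Wrapper generators that name explicit signatures fall into the second category, which is the kind of use described in comment #12 of this bug.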
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

--- Comment #9 from Jonathan Wakely ---
(In reply to Óscar Fuentes from comment #6)
> So IIUC you are applying modifications to libstdc++ that deviate from the
> published standard expecting that the committee will accept those changes.
> As a user, this is troublesome, because right now I need to special-case gcc
> version >11.2 and maybe version
> not accepted and is reverted.

Why do you need to special case anything? What problem are these extra const qualifications causing you?
[Bug libstdc++/103933] atomics: notify_one, notify_all marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103933

--- Comment #6 from Jonathan Wakely ---
(In reply to Jonathan Wakely from comment #5)
> except something very contrived that uses SFINAE of concepts

s/of/or/
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994 --- Comment #8 from Jonathan Wakely --- *** Bug 103933 has been marked as a duplicate of this bug. ***
[Bug libstdc++/103933] atomics: notify_one, notify_all marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103933

Jonathan Wakely changed:

What       |Removed     |Added
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |DUPLICATE

--- Comment #5 from Jonathan Wakely ---
(In reply to Óscar Fuentes from comment #4)
> Ok, thanks, but atomic<>::notify_(one|all) exist and show the problem, which
> was introduced as a "fix" for
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

Yes, I know. What exactly is the problem? If you call them on a non-const object, it still works. No valid code is rejected as a result of those const qualifications, except something very contrived that uses SFINAE of concepts to fail if they also work on const objects.

I'm closing this as a dup of PR 102994, I don't see any advantage to having two bugs about essentially the same thing.

*** This bug has been marked as a duplicate of bug 102994 ***
[Bug tree-optimization/104373] [12 regression] bogus -Wmaybe-uninitialized warning with array new
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104373

Andrew Pinski changed:

What             |Removed     |Added
Status           |UNCONFIRMED |NEW
Component        |c++         |tree-optimization
Ever confirmed   |0           |1
Last reconfirmed |            |2022-02-04

--- Comment #1 from Andrew Pinski ---
Confirmed. Though I am not 100% sure how to get this fixed. The warning is 100% bogus as the basic block where it is being warned about can never be reached.

  if (cleanup.5_20 != 0)
    goto ; [INV]
  else
    goto ; [INV]

  :
  if (_51(D) != 0B)
    goto ; [INV]
  else
    goto ; [INV]

The only path on which bb 16 is reached is when cleanup.5_20 is non-zero, and cleanup.5_20 is defined as:

  cleanup.5_20 = 0;

So in theory the uninitialized warning pass should have detected that and not warned.
[Bug c++/104373] [12 regression] bogus -Wmaybe-uninitialized warning with array new
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104373

Andrew Pinski changed:

What             |Removed |Added
Target Milestone |---     |12.0
Keywords         |        |diagnostic
[Bug c++/104367] Possible improvements for -Wmisleading-indentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104367

Andrew Pinski changed:

What      |Removed |Added
Severity  |normal  |enhancement
Component |c       |c++

--- Comment #3 from Andrew Pinski ---
In the C case, I had thought we warned about return with a statement for a void return type, but it looks like I was wrong. This would have shown the issue too. Obviously for C++, it is not that useful due to templates and such.

The non-null warning is not going to be useful in the general case really; it might have helped here, but that does not mean it will help in general.
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

--- Comment #7 from Thomas Rodgers ---
(In reply to Óscar Fuentes from comment #6)
> (In reply to Jonathan Wakely from comment #5)
> > (In reply to Óscar Fuentes from comment #4)
> > > The fix is wrong. It changes atomic_notify_one and atomic_notify_all
> > > instead of atomic<>::wait.
> >
> > It changed both.
> >
> > > So right now atomic<>::wait remains unfixed
> >
> > Are you sure?
>
> Sigh. Sorry. It would be nice if the commit message mentioned the change to
> atomic_notify_* and its motivation, though.
>
> > > and atomic_notify_(one|all) arg
> > > is wrongly marked as const.
> >
> > This will be the subject of a library issue, potentially fixing the
> > standard. The notify functions should be const too.
>
> So IIUC you are applying modifications to libstdc++ that deviate from the
> published standard expecting that the committee will accept those changes.
> As a user, this is troublesome, because right now I need to special-case gcc
> version >11.2 and maybe version
> not accepted and is reverted.

There is an ongoing discussion between myself and the SG1, LWG, and LEWG chairs (two of whom were authors of p1135, which proposes atomic wait/notify) as to whether there is a wording issue with the standard. None of the three major standard library implementations require (as a matter of implementation detail) notify_one/notify_all to be non-const, and indeed the early wording of p1135 had them marked const. Between r2 and r3 of p1135 this was changed; it cites the minutes of an LEWG discussion as part of the change rationale, but the minutes of that discussion do not give the motivation for the change.

One argument is that you would typically wait in a const context and notify in a non-const context, but by that rationale, the constness of atomic_ref::notify is somewhat weird.
[Bug target/104375] [x86] Failure to recognize bzhi pattern when shr is present
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104375

Andrew Pinski changed:

What     |Removed |Added
Severity |normal  |enhancement
[Bug c++/104079] [9/10/11 Regression] internal compiler error: in nothrow_spec_p, at cp/except.c:1192 since r9-4662-g0d699def39bb937e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104079

Patrick Palka changed:

What    |Removed |Added
Summary |[9/10/11/12 Regression] internal compiler error: in nothrow_spec_p, at cp/except.c:1192 since r9-4662-g0d699def39bb937e |[9/10/11 Regression] internal compiler error: in nothrow_spec_p, at cp/except.c:1192 since r9-4662-g0d699def39bb937e

--- Comment #9 from Patrick Palka ---
Fixed for GCC 12 so far.
[Bug c++/104079] [9/10/11/12 Regression] internal compiler error: in nothrow_spec_p, at cp/except.c:1192 since r9-4662-g0d699def39bb937e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104079 --- Comment #8 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:82e31c8973eb1a752c2ffd01005efe291d35cee3 commit r12-7041-g82e31c8973eb1a752c2ffd01005efe291d35cee3 Author: Patrick Palka Date: Thu Feb 3 18:54:23 2022 -0500 c++: dependence of member noexcept-spec [PR104079] Here a stale TYPE_DEPENDENT_P/_P_VALID value for f's function type after replacing the type's DEFERRED_NOEXCEPT with the parsed dependent noexcept-spec causes us to try to instantiate g's noexcept-spec ahead of time (since it in turn appears non-dependent), leading to an ICE. This patch fixes this by clearing TYPE_DEPENDENT_P_VALID in fixup_deferred_exception_variants appropriately (as in build_cp_fntype_variant). That turns out to fix the testcase for C++17 but not for C++11/14, because it's not until C++17 that a noexcept-spec is part of (and therefore affects dependence of) the function type. Since dependence of NOEXCEPT_EXPR is defined in terms of instantiation dependence, the most appropriate fix for earlier dialects seems to be to make instantiation dependence consider dependence of a noexcept-spec. PR c++/104079 gcc/cp/ChangeLog: * pt.cc (value_dependent_noexcept_spec_p): New predicate split out from ... (dependent_type_p_r): ... here. (instantiation_dependent_r): Use value_dependent_noexcept_spec_p to consider dependence of a noexcept-spec before C++17. * tree.cc (fixup_deferred_exception_variants): Clear TYPE_DEPENDENT_P_VALID. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/noexcept74.C: New test. * g++.dg/cpp0x/noexcept74a.C: New test.
[Bug tree-optimization/104368] [12 Regression] Failure to vectorise conditional grouped accesses after PR102659
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104368 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |12.0 Blocks||53947 Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug target/104362] [12 Regression] ICE in ix86_expand_epilogue, at config/i386/i386.c:9362 since r12-3117-g6e5401e87d02919b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362 Uroš Bizjak changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|12.0|11.4 Host|x86_64-linux-gnu|i386-linux-gnu Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #7 from Uroš Bizjak --- Fixed for gcc-11.4+
[Bug target/104362] [12 Regression] ICE in ix86_expand_epilogue, at config/i386/i386.c:9362 since r12-3117-g6e5401e87d02919b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362 --- Comment #6 from CVS Commits --- The releases/gcc-11 branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:731f4bf14fc89a595abb780a969d03e82b807763 commit r11-9537-g731f4bf14fc89a595abb780a969d03e82b807763 Author: Uros Bizjak Date: Fri Feb 4 00:21:11 2022 +0100 i386: Do not use %ecx DRAP for functions that use __builtin_eh_return [PR104362] %ecx can't be used for both DRAP register and eh_return. Adjust find_drap_reg to choose %edi for functions that uses __builtin_eh_return to avoid the assert in ix86_expand_epilogue that enforces this rule. 2022-02-03 Uroš Bizjak gcc/ChangeLog: PR target/104362 * config/i386/i386.c (find_drap_reg): For 32bit targets return DI_REG if function uses __builtin_eh_return. gcc/testsuite/ChangeLog: PR target/104362 * gcc.target/i386/pr104362.c: New test.
[Bug target/104375] New: [x86] Failure to recognize bzhi pattern when shr is present
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104375 Bug ID: 104375 Summary: [x86] Failure to recognize bzhi pattern when shr is present Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gabravier at gmail dot com Target Milestone: --- #include uint64_t bextr_u64(uint64_t w, unsigned off, unsigned int len) { return (w >> off) & ((1U << len) - 1U); } With -mbmi2, this can be optimized to use shrx followed by bzhi. This transformation is done by LLVM, but not by GCC. PS: Even in the case where the shr is removed and thus the bzhi pattern is recognized (e.g. `return w & ((1U << len) - 1U);`), it is still not compiled optimally, as it for some reason decides to put the result of the bzhi in an intermediary register before moving it to eax.
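The equivalence the report relies on can be checked with a small software model. This is only a sketch: `bzhi_model`, `shrx_model` and `bextr_via_shrx_bzhi` are hypothetical helper names that mimic the documented BMI2 behaviour (they are not intrinsics), and the comparison assumes off < 64 and len < 32, because the mask in the report is built from a 32-bit `1U`.

```cpp
#include <cstdint>

// Expression from the report (defined for off < 64 and len < 32).
inline std::uint64_t bextr_u64(std::uint64_t w, unsigned off, unsigned len) {
    return (w >> off) & ((1U << len) - 1U);
}

// Hypothetical model of 64-bit BZHI: clear every bit at position >= index.
// Only the low 8 bits of the index are used; an index of 64 or more
// leaves the source unchanged.
inline std::uint64_t bzhi_model(std::uint64_t src, unsigned index) {
    unsigned n = index & 0xFFu;
    return n >= 64 ? src : src & ((std::uint64_t{1} << n) - 1);
}

// Hypothetical model of 64-bit SHRX: logical right shift, count modulo 64.
inline std::uint64_t shrx_model(std::uint64_t src, unsigned count) {
    return src >> (count & 63u);
}

// The two-instruction sequence the report asks for: shrx, then bzhi.
inline std::uint64_t bextr_via_shrx_bzhi(std::uint64_t w, unsigned off, unsigned len) {
    return bzhi_model(shrx_model(w, off), len);
}
```

Within the stated ranges both functions return the same value, which is why a shift followed by bzhi can replace the shift, mask construction and and.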
[Bug target/104371] [x86] Failure to use optimize pxor+pcmpeqb+pmovmskb+cmp 0xFFFF pattern to ptest
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104371 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug analyzer/104369] False positive from -Wanalyzer-use-of-uninitialized-value with realloc moving buffer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104369 David Malcolm changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #2 from David Malcolm --- Should be fixed by the above commit.
[Bug analyzer/104369] False positive from -Wanalyzer-use-of-uninitialized-value with realloc moving buffer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104369 --- Comment #1 from CVS Commits --- The master branch has been updated by David Malcolm : https://gcc.gnu.org/g:3ef328c293a336df0aead2d72c0c5ed9781a9861 commit r12-7040-g3ef328c293a336df0aead2d72c0c5ed9781a9861 Author: David Malcolm Date: Wed Feb 2 16:39:12 2022 -0500 analyzer: fixes to realloc-handling [PR104369] This patch fixes various issues with how -fanalyzer handles "realloc" seen when debugging PR analyzer/104369. Previously it wasn't correctly copying over the contents of the old buffer for the success-with-move case, leading to false -Wanalyzer-use-of-uninitialized-value diagnostics. I also noticed that -fanalyzer failed to properly handle "realloc" for cases where the ptr's region had unknown dynamic extents, and an ICE for the case where a tainted value is used as a realloc size argument. This patch fixes these issues, including the false uninit diagnostics seen in PR analyzer/104369. gcc/analyzer/ChangeLog: PR analyzer/104369 * engine.cc (exploded_graph::process_node): Use the node for any diagnostics, avoiding ICE if a bifurcation update adds a saved_diagnostic, such as for a tainted realloc size. * region-model-impl-calls.cc (region_model::impl_call_realloc::success_no_move::update_model): Require the old pointer to be non-NULL to be able successfully grow in place. Use model->deref_rvalue rather than maybe_get_region to support the old pointer being symbolic. (region_model::impl_call_realloc::success_with_move::update_model): Likewise. Add a constraint that the new pointer != the old pointer. Use a sized_region when setting the value of the new region. Handle the case where we don't know the dynamic size of the old region by marking the new region as unknown. * sm-taint.cc (tainted_allocation_size::tainted_allocation_size): Update assertion to also allow for MEMSPACE_UNKNOWN. (tainted_allocation_size::emit): Likewise. (region_model::check_dynamic_size_for_taint): Likewise. 
gcc/testsuite/ChangeLog: PR analyzer/104369 * gcc.dg/analyzer/pr104369-1.c: New test. * gcc.dg/analyzer/pr104369-2.c: New test. * gcc.dg/analyzer/realloc-3.c: New test. * gcc.dg/analyzer/realloc-4.c: New test. * gcc.dg/analyzer/taint-realloc.c: New test. Signed-off-by: David Malcolm
[Bug c++/91082] Reference to function binds to pointer to function when given a template specialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91082 --- Comment #1 from Andrew Pinski --- clang rejects it with: :8:5: error: non-const lvalue reference to type 'void ()' cannot bind to a temporary of type '' static_cast(); ^ ICC rejects it with: (8): error: cannot determine which instance of function template "a" is intended static_cast(); ^ While MSVC accepts it and just produces a warning: (8): warning C4550: expression evaluates to a function which is missing an argument list
[Bug libstdc++/104361] Biased Reference Counting for the standard library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104361 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug libstdc++/104361] Biased Reference Counting for the standard library
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104361 --- Comment #2 from Marc Glisse --- I looked at this paper for a different project a while ago, and it doesn't seem like such a good match for C++ in general. While the basic idea looks simple (use 2 counters, one for the thread that created the object, one for the others), making it work in all cases is actually a lot of work. In particular the paper requires a runtime that periodically checks a queue in each thread.
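The "basic idea" mentioned in the comment (one counter for the creating thread, one for everyone else) can be sketched as below. `BiasedCount` is a hypothetical toy class, not anything from libstdc++ or from the paper's implementation, and it deliberately leaves out the hard parts the comment refers to: releasing references, merging the two counters, and the per-thread queue.

```cpp
#include <atomic>
#include <thread>

// Toy two-counter reference count (increment path only).
class BiasedCount {
public:
    // The owner defaults to the constructing thread; passing another id
    // is only meant for experiments and tests.
    explicit BiasedCount(std::thread::id owner = std::this_thread::get_id())
        : owner_(owner) {}

    void retain() {
        if (std::this_thread::get_id() == owner_)
            ++biased_;  // owner thread: plain, non-atomic increment
        else
            shared_.fetch_add(1, std::memory_order_relaxed);  // other threads: atomic
    }

    int biased() const { return biased_; }
    int shared() const { return shared_.load(std::memory_order_relaxed); }

private:
    std::thread::id owner_;
    int biased_ = 1;               // the reference held by the creator
    std::atomic<int> shared_{0};   // references taken by non-owner threads
};
```

Even in this reduced form the object needs to remember its owner and to branch on every increment; a complete scheme additionally has to decide which thread frees the object once both counters drop.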
[Bug lto/104366] [12 Regression] Regression: infinite loop in add_sibling_attributes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104366 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |12.0 Keywords||compile-time-hog
[Bug other/104374] New: attributes for signal safety and signal handling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104374 Bug ID: 104374 Summary: attributes for signal safety and signal handling Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: crrodriguez at opensuse dot org Target Milestone: --- It would be nice to have a set of function attributes such as: __attribute__((signal_handler)) (since "signal" is already used in the AVR port) for sa_handler functions, and __attribute__(("async-signal-safe")) for use with gcc builtins and the C library, to annotate functions that ought to be signal safe, either according to the standards or because the user defines them so. -Wall could then warn about, for example, using stdio in a signal handler or similar bugs.
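To make the request concrete, here is a minimal sketch of a handler that stays within what is safe to do from a signal handler. The names `got_signal`, `on_sigint` and `install_handler` are made up for illustration, and the attributes named in the report do not exist in GCC; the comments only mark where they would presumably be placed.

```cpp
#include <csignal>

// A flag type that may be written from a signal handler.
volatile std::sig_atomic_t got_signal = 0;

// The handler only stores to a volatile sig_atomic_t.
// Under the proposal, a (hypothetical) __attribute__((signal_handler))
// would be attached here, and a call to e.g. printf inside this function
// is the kind of mistake the requested warning would report.
extern "C" void on_sigint(int) {
    got_signal = 1;
}

// Installs the handler for SIGINT; returns true on success.
bool install_handler() {
    return std::signal(SIGINT, on_sigint) != SIG_ERR;
}
```

After `install_handler()` succeeds, `std::raise(SIGINT)` runs the handler and the main program can poll `got_signal`.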
[Bug fortran/66193] ICE for initialisation of some non-zero-sized arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66193 anlauf at gcc dot gnu.org changed: What|Removed |Added CC||anlauf at gcc dot gnu.org --- Comment #18 from anlauf at gcc dot gnu.org --- Created attachment 52346 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52346&action=edit WIP patch This fixes at least some of the testcases given in this PR and regtests OK. May need more testing and fine-tuning.
[Bug middle-end/104260] [12 Regression] Misplaced waccess3 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104260 Martin Sebor changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #5 from Martin Sebor --- Done.
[Bug middle-end/104260] [12 Regression] Misplaced waccess3 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104260 --- Comment #4 from CVS Commits --- The master branch has been updated by Martin Sebor : https://gcc.gnu.org/g:5a668ec0339c28b0725ded1e80d3276edb76b8b3 commit r12-7038-g5a668ec0339c28b0725ded1e80d3276edb76b8b3 Author: Martin Sebor Date: Thu Feb 3 14:51:46 2022 -0700 Adjust warn_access pass placement [PR104260]. Resolves: PR middle-end/104260 - Misplaced waccess3 pass gcc/ChangeLog: PR middle-end/104260 * passes.def (pass_warn_access): Adjust pass placement.
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994 --- Comment #6 from Óscar Fuentes --- (In reply to Jonathan Wakely from comment #5) > (In reply to Óscar Fuentes from comment #4) > > The fix is wrong. It changes atomic_notify_one and atomic_notify_all instead > > of atomic<>::wait. > > It changed both. > > > So right now atomic<>::wait remains unfixed > > Are you sure? Sigh. Sorry. It would be nice if the commit message mentioned the change to atomic_notify_* and its motivation, though. > > and atomic_notify_(one|all) arg > > is wrongly marked as const. > > This will be the subject of a library issue, potentially fixing the > standard. The notify functions should be const too. So IIUC you are applying modifications to libstdc++ that deviate from the published standard expecting that the committee will accept those changes. As a user, this is troublesome, because right now I need to special-case gcc version >11.2 and maybe version
[Bug c/104367] Possible improvements for -Wmisleading-indentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104367 Martin Sebor changed: What|Removed |Added CC||msebor at gcc dot gnu.org --- Comment #2 from Martin Sebor --- Declaring bar() with attribute nonnull triggers -Wnonnull but only at -O1: In function ‘foo’, inlined from ‘main’ at pr104367.c:15:5: pr104367.c:11:5: warning: argument 1 null where non-null expected [-Wnonnull] 11 | bar(x); | ^~ pr104367.c: In function ‘main’: pr104367.c:3:32: note: in a call to function ‘bar’ declared ‘nonnull’ 3 | __attribute__ ((nonnull)) void bar(int *x) { |^~~ At -O2 GCC notices the invalid access and emits a trap but doesn't warn: void foo (int * x) { int _3; [local count: 1073741824]: if (x_1(D) == 0B) goto ; [9.81%] else goto ; [90.19%] [local count: 105334072]: _3 ={v} MEM[(int *)0B]; __builtin_trap (); [local count: 1073741824]: return; }
[Bug target/104362] [12 Regression] ICE in ix86_expand_epilogue, at config/i386/i386.c:9362 since r12-3117-g6e5401e87d02919b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104362 --- Comment #5 from CVS Commits --- The master branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:599122fa690d55e5e14d74f4d514b2d8b6a98505 commit r12-7037-g599122fa690d55e5e14d74f4d514b2d8b6a98505 Author: Uros Bizjak Date: Thu Feb 3 22:24:21 2022 +0100 i386: Do not use %ecx DRAP for functions that use __builtin_eh_return [PR104362] %ecx can't be used for both DRAP register and eh_return. Adjust find_drap_reg to choose %edi for functions that uses __builtin_eh_return to avoid the assert in ix86_expand_epilogue that enforces this rule. 2022-02-03 Uroš Bizjak gcc/ChangeLog: PR target/104362 * config/i386/i386.cc (find_drap_reg): For 32bit targets return DI_REG if function uses __builtin_eh_return. gcc/testsuite/ChangeLog: PR target/104362 * gcc.target/i386/pr104362.c: New test.
[Bug analyzer/103872] testcase fail in gcc.dg/analyzer/pr103526.c on riscv64-unknown-elf-gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103872 David Malcolm changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #3 from David Malcolm --- Thanks; I can reproduce this, and am working on a fix (it's a bug in region_model::impl_call_memcpy)
[Bug libstdc++/103933] atomics: notify_one, notify_all marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103933 --- Comment #4 from Óscar Fuentes --- (In reply to Jonathan Wakely from comment #3) > (In reply to Óscar Fuentes from comment #1) > > Also, the template functions atomic_notify_one and atomic_notify_all take a > > const argument, when it should be non-const. > > > > The `volatile' arg overload is missing too. > > Because there is no atomic_flag::notify_one() or atomic_flag::notify_all() > in libstdc++. The volatile overloads are not useful, and are deprecated, and > are not present in libstdc++. Ok, thanks, but atomic<>::notify_(one|all) exist and show the problem, which was introduced as a "fix" for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994
[Bug target/104345] [12 Regression] "nvptx: Transition nvptx backend to STORE_FLAG_VALUE = 1" patch made some code generation worse
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104345 Roger Sayle changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |roger at nextmovesoftware dot com --- Comment #3 from Roger Sayle --- Additional patch proposed: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589802.html I need to figure out how to (re)produce Thomas' "used N registers" reports. If someone could summarize the effect of this patch (and previous patches) on register usage, that would be much appreciated (and help reviewers).
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994 --- Comment #5 from Jonathan Wakely --- (In reply to Óscar Fuentes from comment #4) > The fix is wrong. It changes atomic_notify_one and atomic_notify_all instead > of atomic<>::wait. It changed both. > So right now atomic<>::wait remains unfixed Are you sure? > and atomic_notify_(one|all) arg > is wrongly marked as const. This will be the subject of a library issue, potentially fixing the standard. The notify functions should be const too.
[Bug libstdc++/103933] atomics: notify_one, notify_all marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103933 --- Comment #3 from Jonathan Wakely --- (In reply to Óscar Fuentes from comment #1) > Also, the template functions atomic_notify_one and atomic_notify_all take a > const argument, when it should be non-const. > > The `volatile' arg overload is missing too. Because there is no atomic_flag::notify_one() or atomic_flag::notify_all() in libstdc++. The volatile overloads are not useful, and are deprecated, and are not present in libstdc++.
[Bug tree-optimization/85741] [meta-bug] bogus/missing -Wformat-overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85741 Bug 85741 depends on bug 104119, which changed state. Bug 104119 Summary: [12 Regression] unexpected -Wformat-overflow after strlen in ILP32 since Ranger integration https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104119 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/104119] [12 Regression] unexpected -Wformat-overflow after strlen in ILP32 since Ranger integration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104119 Martin Sebor changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Martin Sebor --- The warning has been avoided in this case by using the size of the source array as the upper bound. The heuristic the warning uses is still in place so when the size of the source array isn't known (e.g., when it's a flexible array member) it will still trigger.
[Bug c++/104373] New: [12 regression] bogus -Wmaybe-uninitialized warning with array new
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104373 Bug ID: 104373 Summary: [12 regression] bogus -Wmaybe-uninitialized warning with array new Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: s...@li-snyder.org Target Milestone: --- hi - With a recent checkout of gcc12 (20220203), on a x86_64-pc-linux-gnu host, the following source gives bogus -Wmaybe-uninitialized warnings with -Wall: -- void* operator new[](unsigned long, void* __p); struct allocator { ~allocator(); }; void *foo (void *p) { return p ? new(p) allocator[1] : new allocator[1]; } -- $ g++ -Wall -c gccbug.cc gccbug.cc: In function ‘void* foo(void*)’: gccbug.cc:11:51: warning: ‘’ may be used uninitialized [-Wmaybe-uninitialized] 11 | return p ? new(p) allocator[1] : new allocator[1]; | ^ gccbug.cc:11:51: note: ‘’ was declared here 11 | return p ? new(p) allocator[1] : new allocator[1]; | ^ gccbug.cc:11:32: warning: ‘’ may be used uninitialized [-Wmaybe-uninitialized] 11 | return p ? new(p) allocator[1] : new allocator[1]; |^ gccbug.cc:11:32: note: ‘’ was declared here 11 | return p ? new(p) allocator[1] : new allocator[1]; |^ From git bisect, this appears to have been introduced by this commit: commit beaee0a871b6485d20573fe050b1fd425581e56a (HEAD) Author: Jason Merrill Date: Sat Jan 1 16:00:22 2022 -0500 c++: temporary lifetime with array aggr init [PR94041] The previous patch fixed temporary lifetime for aggregate initialization of classes; this one extends that fix to arrays. This specifically reverses my r74790, the patch for PR12253, which was made wrong when these semantics were specified in DR201.
Since the array cleanup region encloses the regions for any temporaries, we don't need to add an additional region for the array object itself in either initialize_local_var or split_nonconstant_init; we do, however, need to tell split_nonconstant_init how to disable the cleanup once an enclosing object is fully constructed, at which point we want to run that destructor instead. FWIW, the warning goes away if the conditional expression in foo() is rewritten as an explicit if statement.
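The reporter's closing remark (the warning goes away when the conditional expression becomes an explicit if statement) could look like the sketch below. Two things here are assumptions rather than part of the report: the placement form is taken from `<new>` instead of the hand-written declaration, and the destructor gets an empty body so that the example links. Whether the diagnostic is actually avoided depends on the compiler revision in use.

```cpp
#include <new>  // declares the non-allocating placement operator new[]

struct allocator {
    ~allocator();
};

// Empty body: an assumption made only so this sketch links on its own.
allocator::~allocator() {}

// Same behaviour as the reproducer, written with an if statement
// instead of the conditional expression.
void *foo(void *p) {
    if (p)
        return new (p) allocator[1];
    return new allocator[1];
}
```

A caller that passes a null pointer owns the returned array and would release it with `delete[] static_cast<allocator *>(q)`.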
[Bug libstdc++/103933] atomics: notify_one, notify_all marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103933 --- Comment #2 from Óscar Fuentes --- The breakage mentioned on my previous message was introduced by a wrong fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994
[Bug libstdc++/102994] std::atomic::wait is not marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994 Óscar Fuentes changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #4 from Óscar Fuentes --- The fix is wrong. It changes atomic_notify_one and atomic_notify_all instead of atomic<>::wait. So right now atomic<>::wait remains unfixed and atomic_notify_(one|all) arg is wrongly marked as const.
[Bug tree-optimization/104119] [12 Regression] unexpected -Wformat-overflow after strlen in ILP32 since Ranger integration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104119 --- Comment #6 from CVS Commits --- The master branch has been updated by Martin Sebor : https://gcc.gnu.org/g:3c9f762ad02f398c27275688c3494332f69237f5 commit r12-7033-g3c9f762ad02f398c27275688c3494332f69237f5 Author: Martin Sebor Date: Thu Feb 3 13:27:16 2022 -0700 Constrain conservative string lengths to array sizes [PR104119]. Resolves: PR tree-optimization/104119 - unexpected -Wformat-overflow after strlen in ILP32 since Ranger integration gcc/ChangeLog: PR tree-optimization/104119 * gimple-ssa-sprintf.cc (struct directive): Change argument type. (format_none): Same. (format_percent): Same. (format_integer): Same. (format_floating): Same. (get_string_length): Same. (format_character): Same. (format_string): Same. (format_plain): Same. (format_directive): Same. (compute_format_length): Same. (handle_printf_call): Same. * tree-ssa-strlen.cc (get_range_strlen_dynamic): Same. Call get_maxbound. (get_range_strlen_phi): Same. (get_maxbound): New function. (strlen_pass::get_len_or_size): Adjust to parameter change. * tree-ssa-strlen.h (get_range_strlen_dynamic): Change argument type. gcc/testsuite/ChangeLog: PR tree-optimization/104119 * gcc.dg/tree-ssa/builtin-snprintf-13.c: New test. * gcc.dg/tree-ssa/builtin-sprintf-warn-29.c: New test.
[Bug libstdc++/103933] atomics: notify_one, notify_all marked const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103933 --- Comment #1 from Óscar Fuentes --- Also, the template functions atomic_notify_one and atomic_notify_all take a const argument, when it should be non-const. The `volatile' arg overload is missing too.
[Bug c++/104358] Assignable template lambda as function parameter is incorrectly reduced to type of "int"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104358 --- Comment #1 from qingzhe huang --- Sorry about the long description and here is the short version to highlight the core issue. Given this template function with a templated lambda as parameter: template using Lambda=decltype(+[](T){}); template auto foo(T&&, Lambda)->T; GCC considers the parameter "Lambda" as type "int" which is wrong: static_assert(is_same_v), int(*)(int&&, /*"Lambda"*/ int)>);
[Bug c++/104319] better error message for parsing error when >= or >> ends a template variable.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104319 --- Comment #10 from qingzhe huang --- Here I have another test case. It involves an anonymous template argument, which confused me a lot at the time; clang does a great job of explaining the reason. https://www.godbolt.org/z/YGfMncGeW template ::value, bool>=true //there should be a space in ">=" > struct TestStruct{};
[Bug rtl-optimization/104372] [ARM] Unnecessary writes to stack when passing aggregate in registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104372 --- Comment #1 from David Palchak --- Demo here: https://godbolt.org/z/Tbh5YP61h
[Bug rtl-optimization/104372] New: [ARM] Unnecessary writes to stack when passing aggregate in registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104372 Bug ID: 104372 Summary: [ARM] Unnecessary writes to stack when passing aggregate in registers Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: palchak at google dot com Target Milestone: --- Missed optimization when an aggregate is passed by value entirely in registers: struct Bar {}; struct Foo { Bar *addr; int size; }; Bar* MakeBar1(Bar *addr, int) noexcept { return addr; } Bar* MakeBar2(Foo foo) noexcept { return foo.addr; } When compiled with '-O2' using 'arm-unknown-linux-gnueabihf-g++ (GCC) 12.0.0' generates: MakeBar1(Bar*, int): bx lr MakeBar2(Foo): sub sp, sp, #8 add r3, sp, #8 stmdb r3, {r0, r1} add sp, sp, #8 bx lr The creation of a stack frame in MakeBar2 is completely unnecessary. For comparison, Clang 11.0.1 generates identical code for both functions that matches MakeBar1 shown here.
[Bug target/104335] [12 regression] build failure if go is included in languages after r12-6747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335 --- Comment #5 from Segher Boessenkool ---
(In reply to rdapp from comment #4)
> originally ifcvt would only pass e.g.
>
> (unle (reg:SF 129 [ _29 ])
> (reg/v:SF 118 [ highScore ]))
>
> as condition to rs6000_emit_cmove via emit_conditional_move (). (This is
> the example from the ICE).

And that works fine, right?

> dest = (reg/f:DI 122 [ bestFuzz$__object ])
>
> The check
>
> if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))

With compare_mode SFmode, and result_mode, what, SImode? Or VOIDmode? In either case this would be true.

> fails and we return false

Which does not match my argument assumptions then, so they must be wrong.

> i.e. the expander fails and returns NULL_RTX. This
> is fine as we will just not do anything with it in ifcvt.

But it was fine by accident, then :-(

> Now with the patch rs6000_emit_cmove is passed
>
> (unle (reg:CCFP 153)
> (const_int 0 [0])).
>
> This CC comparison has already been computed before so the backend actually
> needs to do less work and could just use it without preparing a comparison.
> This helps costing on the ifcvt side.

But this is done during expand only, costing is pretty irrelevant there. expand should emit correct code, and later passes optimise that. Many of our oldest problems are expand trying to "optimise" things :-( Or is this function called later as well?

> dest is the same as before

Which is?

> but now we have
>
> FLOAT_MODE_P (compare_mode) == false

That is wrong then.

> and
>
> if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))
>
> not causing a return false.

So you meant that FLOAT_MODE (compare_mode) is true?

> We then call rs6000_emit_int_cmove which in turn calls
> rs6000_generate_compare.
> There we try to
>
> emit_insn (gen_rtx_SET (compare_result,
>
> gen_rtx_COMPARE (comp_mode, op0, op1)));
>
> but ICE since the modes don't match or make sense.

Yes, rs6000_generate_compare does not expect the funny RTL for a condition code as input: it wants a real comparison, instead, so not a representation of the result of such a comparison.

> From an ifcvt point of view we would just expect a NULL_RTX if something
> cannot be handled. We did not pass CC compares to backends before, so I
> don't know how reasonable this assumption is :)

Hard to say, I don't find documentation for this. But, such a change is not in scope for stage 4, no matter how you look at it :-(
[Bug fortran/104311] [9/10/11/12 Regression] ICE out of memory since r9-6321-g4716603bf875ce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104311 --- Comment #9 from CVS Commits --- The master branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:4e4252db0348a7274663a892c3a96d3ed7702aff commit r12-7032-g4e4252db0348a7274663a892c3a96d3ed7702aff Author: Harald Anlauf Date: Tue Feb 1 23:33:24 2022 +0100 Fortran: reject simplifying TRANSFER for MOLD with storage size 0 gcc/fortran/ChangeLog: PR fortran/104311 * check.cc (gfc_calculate_transfer_sizes): Checks for case when storage size of SOURCE is greater than zero while the storage size of MOLD is zero and MOLD is an array shall not depend on SIZE. gcc/testsuite/ChangeLog: PR fortran/104311 * gfortran.dg/transfer_simplify_15.f90: New test.
[Bug tree-optimization/104356] [12 Regression] divide by zero trap incorrectly optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356 --- Comment #39 from Andrew Macleod ---
(In reply to Jakub Jelinek from comment #38)
> (In reply to Andrew Macleod from comment #37)
> > As for ranger, range-ops will return UNDEFINED for the range if x is known
> > to be [0,0]. This can be propagated around, and depending on how it ends up
> > being used as to what happens with it.
>
> Yeah, so e.g. with Eric's patch to disable the X / boolean_range_Y simplifier
> in match.pd, won't the ranger perform the same optimization?
> I mean with the wi_fold_in_parts, if range of the divisor has 2 values and
> one of them is 0, won't it try to union range of X / 1 (range of X) and
> range of X / 0 (undefined) and yield range of X? So say won't 7 /
> Y_with_bool_range yield
> [7,7] ?

It will... but ranger won't remove the divide unless the simplifier or folding does it. ie:

if (x == 7 || x == 0) return y/x;

Produces:

x_5(D) int [0, 0][7, 7]
: _8 = y_7(D) / x_5(D); // predicted unlikely by early return (on trees) predictor.
_8 : int [-306783378, 306783378]

Change it to:

if (y != 7) return 1; if (x == 7 || x == 0) return y/x;

and ranger will provide [7,7] / [0,0][7,7] then go thru folding and remove the statement:

Folding statement: _8 = y_5(D) / x_6(D);
Queued stmt for removal. Folds to: 1

However, ranger will indeed calculate this range as [1,1]. _8 will be registered as [1,1] and some follow-on code may get eliminated as a result... Even if the folder didn't remove this, it's possible that if we propagate the value of [1,1] it could make the divide dead code, and it could then be removed by some other passes.

If we have states where divide by zero cannot possibly ever be eliminated like this, then we could have the / 0 case return VARYING instead... we just need to decide in rangeops what behaviour you want. In fact, I think it did in GCC11... yeah, here it is.

// If we're definitely dividing by zero, there's nothing to do.
if (wi_zero_p (type, divisor_min, divisor_max)) { r.set_varying (type); return; }
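The difference between the two policies discussed here (UNDEFINED versus VARYING for a division by a known zero) can be illustrated with a toy interval model. This is a sketch under loud assumptions: `Range`, `join`, `divide` and `divide_by_pair` are hypothetical names invented for this example and bear no relation to the real range-ops or irange classes in GCC.

```cpp
#include <algorithm>

// Toy value range: either no value at all, a closed interval, or anything.
struct Range {
    enum Kind { Undefined, Interval, Varying } kind;
    long lo, hi;  // meaningful only when kind == Interval
};

// Union of two ranges: Undefined is the identity, Varying absorbs everything.
Range join(const Range &a, const Range &b) {
    if (a.kind == Range::Undefined) return b;
    if (b.kind == Range::Undefined) return a;
    if (a.kind == Range::Varying || b.kind == Range::Varying)
        return {Range::Varying, 0, 0};
    return {Range::Interval, std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// What a division by a known zero should produce.
enum class DivByZero { Undefined, Varying };

// Range of n / d for two single values.
Range divide(long n, long d, DivByZero policy) {
    if (d == 0) {
        if (policy == DivByZero::Undefined) return {Range::Undefined, 0, 0};
        return {Range::Varying, 0, 0};
    }
    return {Range::Interval, n / d, n / d};
}

// Range of n / {d0, d1}, folded one divisor value at a time and then unioned.
Range divide_by_pair(long n, long d0, long d1, DivByZero policy) {
    return join(divide(n, d0, policy), divide(n, d1, policy));
}
```

With the Undefined policy, 7 / {0, 7} collapses to the single value 1, matching the [1,1] result described above; with the Varying policy the zero case poisons the union, so nothing downstream can be folded on the strength of it.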
[Bug target/104371] New: [x86] Failure to use optimize pxor+pcmpeqb+pmovmskb+cmp 0xFFFF pattern to ptest
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104371 Bug ID: 104371 Summary: [x86] Failure to use optimize pxor+pcmpeqb+pmovmskb+cmp 0xFFFF pattern to ptest Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gabravier at gmail dot com Target Milestone: --- bool is_zero(__m128i x) { return _mm_movemask_epi8(_mm_cmpeq_epi8(x, _mm_setzero_si128())) == 0xFFFF; } This can be optimized to `return _mm_testz_si128(x, x);`. This optimization is done by LLVM, but not by GCC.
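Why the two forms agree can be seen in a portable scalar model that needs no SSE support. Everything below is illustrative: `Vec128` and the three functions are hypothetical stand-ins for the vector register and the instructions named in the report, not the intrinsics themselves.

```cpp
#include <array>
#include <cstdint>

using Vec128 = std::array<std::uint8_t, 16>;

// Model of pcmpeqb against zero followed by pmovmskb:
// bit i of the result is set iff byte i of x is zero.
unsigned movemask_of_cmpeq_zero(const Vec128 &x) {
    unsigned mask = 0;
    for (unsigned i = 0; i < 16; ++i)
        if (x[i] == 0) mask |= 1u << i;
    return mask;
}

// The pattern from the report: all 16 mask bits set.
bool is_zero_movemask(const Vec128 &x) {
    return movemask_of_cmpeq_zero(x) == 0xFFFF;
}

// Model of ptest x, x: true iff (x & x) has no bit set anywhere.
bool is_zero_testz(const Vec128 &x) {
    std::uint8_t any = 0;
    for (std::uint8_t b : x) any |= static_cast<std::uint8_t>(b & b);
    return any == 0;
}
```

Both predicates are true exactly when every byte is zero, so the four-instruction sequence and the single test compute the same answer.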
[Bug target/104363] hppa: __asm__ directive .global and multiple .symver not supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104363 --- Comment #7 from dave.anglin at bell dot net --- On 2022-02-03 12:13 p.m., danglin at gcc dot gnu.org wrote: > If I was to guess, I suspect the problem is with asm. Maybe a '\t' > is needed before .symver on hppa. The hppa assembler wants white space > before directives. That is fixed in this commit: https://github.com/smuellerDD/libkcapi/commit/3e9a1494dd2401094675fb54b1013022bd7933b8
[Bug fortran/104329] [12 Regression] ICE in resolve_omp_atomic, at fortran/openmp.cc:7827 (etc.) since r12-5793-g689407ef916503b2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104329 --- Comment #2 from Tobias Burnus ---
The problem is related to

x = ['123']

generates the AST

ASSIGN z1:_F.DA0(FULL) (parens z1:x(FULL))
ASSIGN z1:x(FULL) z1:_F.DA0(FULL)

The following should fix it - at least it fixes the three examples by erroring out with:

Error: !$OMP ATOMIC statement must set a scalar variable of intrinsic type at (1)

--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -7690 +7690 @@ resolve_omp_atomic (gfc_code *code)
- gfc_code *stmt = NULL, *capture_stmt = NULL;
+ gfc_code *stmt = NULL, *capture_stmt = NULL, *tailing_stmt = NULL;
@@ -7828 +7828,2 @@ resolve_omp_atomic (gfc_code *code)
- gcc_assert (!code->next->next);
+ /* Shall be NULL but can happen for invalid code. */
+ tailing_stmt = code->next->next;
@@ -7836 +7837,2 @@ resolve_omp_atomic (gfc_code *code)
- gcc_assert (!code->next);
+ /* Shall be NULL but can happen for invalid code. */
+ tailing_stmt = code->next;
@@ -7888,0 +7891,3 @@ resolve_omp_atomic (gfc_code *code)
+ /* Should be diagnosed above already. */
+ gcc_assert (tailing_stmt == NULL);
+
[Bug target/103686] ICE in rs6000_expand_new_builtin at rs6000-call.c:15946
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103686 Bill Schmidt changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #13 from Bill Schmidt --- Fixed now.
[Bug target/90524] [10/11 Regression] attribute name and argument mixed up in an error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90524 Martin Sebor changed: What|Removed |Added Status|ASSIGNED|NEW Summary|[10/11/12 Regression] |[10/11 Regression] |attribute name and argument |attribute name and argument |mixed up in an error|mixed up in an error |message |message Assignee|msebor at gcc dot gnu.org |unassigned at gcc dot gnu.org --- Comment #5 from Martin Sebor --- Fixed by Martin Liska in r12-7014. I'll leave that to others to backport.
[Bug target/103686] ICE in rs6000_expand_new_builtin at rs6000-call.c:15946
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103686 --- Comment #12 from CVS Commits --- The master branch has been updated by William Schmidt : https://gcc.gnu.org/g:48bd780ee327c9ae6ffc0641e73cc1f4939fb204 commit r12-7030-g48bd780ee327c9ae6ffc0641e73cc1f4939fb204 Author: Bill Schmidt Date: Wed Feb 2 21:30:27 2022 -0600 rs6000: Remove -m[no-]fold-gimple flag [PR103686] The -m[no-]fold-gimple flag was really intended primarily for internal testing while implementing GIMPLE folding for rs6000 vector built-in functions. It ended up leaking into other places, causing problems such as PR103686 identifies. Let's remove it. There are a number of tests in the testsuite that require adjustment. Some specify -mfold-gimple directly, which is the default, so that is handled by removing the option. Others unnecessarily specify -mno-fold-gimple, as the tests work fine without this. Again that is handled by removing the option. There are a couple of extra variants of tests specifically for -mno-fold-gimple; for those, we can just remove the whole test. gcc.target/powerpc/builtins-1.c was more problematic. It was written in such a way as to be extremely fragile. For this one, I rewrote the whole test in a different style, using individual functions to test each built-in function. These same tests are also largely covered by builtins-1-be-folded.c and builtins-1-le-folded.c, so I chose to explicitly make this test -mbig for simplicity, and use -O2 for clean code generation. I made some slight modifications to the expected instruction counts as a result, and tested on both 32- and 64-bit. 2022-02-02 Bill Schmidt gcc/ PR target/103686 * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Remove test for !rs6000_fold_gimple. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise. * config/rs6000/rs6000.opt (mfold-gimple): Remove. gcc/testsuite/ PR target/103686 * gcc.target/powerpc/builtins-1-be-folded.c: Remove -mfold-gimple option. 
* gcc.target/powerpc/builtins-1-le-folded.c: Likewise. * gcc.target/powerpc/builtins-1.c: Rewrite to use small functions and restrict to -O2 -mbig for predictability. Adjust instruction counts. * gcc.target/powerpc/builtins-5.c: Remove -mno-fold-gimple option. * gcc.target/powerpc/p8-vec-xl-xst.c: Likewise. * gcc.target/powerpc/pr83926.c: Likewise. * gcc.target/powerpc/pr86731-nogimplefold-longlong.c: Delete. * gcc.target/powerpc/pr86731-nogimplefold.c: Delete. * gcc.target/powerpc/swaps-p8-17.c: Remove -mno-fold-gimple option.
[Bug target/95082] [11/12] LE implementations of vec_cnttz_lsbb and vec_cntlz_lsbb are wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95082 --- Comment #7 from CVS Commits --- The master branch has been updated by William Schmidt : https://gcc.gnu.org/g:3f30f2d1dbb3228b8468b26239fe60c2974ce2ac commit r12-7029-g3f30f2d1dbb3228b8468b26239fe60c2974ce2ac Author: Bill Schmidt Date: Wed Feb 2 21:24:22 2022 -0600 rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082] These built-ins were misimplemented as always having big-endian semantics. 2022-01-18 Bill Schmidt gcc/ PR target/95082 * config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Handle endianness for vclzlsbb and vctzlsbb. * config/rs6000/rs6000-builtins.def (VCLZLSBB_V16QI): Change default pattern and indicate a different pattern will be used for big endian. (VCLZLSBB_V4SI): Likewise. (VCLZLSBB_V8HI): Likewise. (VCTZLSBB_V16QI): Likewise. (VCTZLSBB_V4SI): Likewise. (VCTZLSBB_V8HI): Likewise. gcc/testsuite/ PR target/95082 * gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to -mbig. * gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise. * gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New. * gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New. * gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to -mbig. * gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise. * gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New. * gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
[Bug target/104363] hppa: __asm__ directive .global and multiple .symver not supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104363 John David Anglin changed: What|Removed |Added CC||danglin at gcc dot gnu.org --- Comment #6 from John David Anglin --- For context, see: https://github.com/smuellerDD/libkcapi/issues/133#issuecomment-1024349323 Note that the following commit fixes the symbol issue on hppa with gcc-11: https://github.com/smuellerDD/libkcapi/commit/71d80bcffca26373149121e026d612146b4695d5 The patch predates the hppa issue and it doesn't have anything to do with hppa. It does mention -flto but this doesn't seem to apply here. As far as I remember, support for symbol versioning is done in generic code. If I was to guess, I suspect the problem is with asm. Maybe a '\t' is needed before .symver on hppa. The hppa assembler wants white space before directives.
[Bug libstdc++/103755] {has,use}_facet() and iostream constructor performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103755 Jonathan Wakely changed: What|Removed |Added Target Milestone|--- |13.0
[Bug tree-optimization/104356] [12 Regression] divide by zero trap incorrectly optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356 --- Comment #38 from Jakub Jelinek --- (In reply to Andrew Macleod from comment #37) > As for ranger, range-ops will return UNDEFINED for the range if x is known > to be [0,0]. This can be propagated around, and depending on how it ends up > being used as to what happens with it. Yeah, so e.g. with Eric's patch to disable the X / boolean_range_Y simplifier in match.pd, won't the ranger perform the same optimization? I mean with the wi_fold_in_parts, if range of the divisor has 2 values and one of them is 0, won't it try to union range of X / 1 (range of X) and range of X / 0 (undefined) and yield range of X? So say won't 7 / Y_with_bool_range yield [7,7] ?
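The question above can be made concrete with a toy model. The sketch below is not ranger's code; `Range`, `fold_div` and `fold_div_in_parts` are hypothetical names, and it only handles small unsigned ranges. It shows the reasoning being asked about: if the divisor range is folded one value at a time and an undefined sub-result contributes nothing to the union, then 7 / [0,1] comes out as [7,7].

```cpp
#include <algorithm>

// Toy closed interval over unsigned values; `undefined` models an
// UNDEFINED range. Hypothetical type, not GCC's irange.
struct Range {
    bool undefined;
    unsigned long lo;
    unsigned long hi;
};

Range make_range(unsigned long lo, unsigned long hi) { return Range{false, lo, hi}; }
Range make_undefined() { return Range{true, 0, 0}; }

// Union in which an undefined operand contributes nothing.
Range range_union(const Range &a, const Range &b) {
    if (a.undefined) return b;
    if (b.undefined) return a;
    return make_range(std::min(a.lo, b.lo), std::max(a.hi, b.hi));
}

// Fold x / d for one divisor value; dividing by zero gives UNDEFINED.
Range fold_div(const Range &x, unsigned long d) {
    if (x.undefined || d == 0) return make_undefined();
    return make_range(x.lo / d, x.hi / d);  // unsigned division is monotonic in x
}

// Fold the divisor range value by value and union the partial results.
// Assumes divisor.lo <= divisor.hi and a small divisor range.
Range fold_div_in_parts(const Range &x, const Range &divisor) {
    Range result = make_undefined();
    if (divisor.undefined) return result;
    for (unsigned long d = divisor.lo;; ++d) {
        result = range_union(result, fold_div(x, d));
        if (d == divisor.hi) break;
    }
    return result;
}
```

In this model, `fold_div_in_parts(make_range(7, 7), make_range(0, 1))` yields [7,7], while a divisor of exactly [0,0] stays undefined. Whether ranger actually behaves this way is the open question in the comment.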
[Bug tree-optimization/104356] [12 Regression] divide by zero trap incorrectly optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356

--- Comment #37 from Andrew Macleod ---
(In reply to Jakub Jelinek from comment #35)
> I meant something like:
> return Z / X;
> and there evrp does with -O2 -gnatp optimize away the division.
> Though that is likely the X / boolean_range_Y case which you've disabled.
> In any case, I think you want to hear from Andrew/Aldy where exactly does
> VRP/ranger assume UB on integer division by zero.

That divide is removed by the simplifier because it determines that X has a range of [0,1], and I believe the simplifier chooses to ignore the 0 under various circumstances.

As for ranger, range-ops will return UNDEFINED for the range if x is known to be [0,0]. This can be propagated around, and what happens with it depends on how it ends up being used.

This happens in range-ops.cc in operator_div::wi_fold():

  // If we're definitely dividing by zero, there's nothing to do.
  if (wi_zero_p (type, divisor_min, divisor_max))
    {
      r.set_undefined ();
      return;
    }

Likewise, the MOD operator does the same:

void
operator_trunc_mod::wi_fold (irange &r, tree type,
                             const wide_int &lh_lb, const wide_int &lh_ub,
                             const wide_int &rh_lb, const wide_int &rh_ub) const
{
  wide_int new_lb, new_ub, tmp;
  signop sign = TYPE_SIGN (type);
  unsigned prec = TYPE_PRECISION (type);

  // Mod 0 is undefined.
  if (wi_zero_p (type, rh_lb, rh_ub))
    {
      r.set_undefined ();
      return;
    }
[Bug tree-optimization/104356] [12 Regression] divide by zero trap incorrectly optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356 --- Comment #36 from Eric Botcazou --- > with System.Unsigned_Types; use System.Unsigned_Types; > > function F (X, Y : Unsigned) return Unsigned is > Z : Unsigned; > begin > if X >=2 then > return 0; > end if; > Z := Y; > if X = 1 then > Z := Y + 4; > end if; > return Z / X; > end; > and there evrp does with -O2 -gnatp optimize away the division. Indeed, I see this with GCC 11 & 12 (patched or not) but not with GCC 9 & 10. > Though that is likely the X / boolean_range_Y case which you've disabled. > In any case, I think you want to hear from Andrew/Aldy where exactly does > VRP/ranger assume UB on integer division by zero. Let's close this PR first, which is a 12 regression, and I'll open a PR for the VRP issue in 11 & 12.
[Bug tree-optimization/104356] [12 Regression] divide by zero trap incorrectly optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356 --- Comment #35 from Jakub Jelinek --- I meant something like: with System.Unsigned_Types; use System.Unsigned_Types; function F (X, Y : Unsigned) return Unsigned is Z : Unsigned; begin if X >=2 then return 0; end if; Z := Y; if X = 1 then Z := Y + 4; end if; return Z / X; end; and there evrp does with -O2 -gnatp optimize away the division. Though that is likely the X / boolean_range_Y case which you've disabled. In any case, I think you want to hear from Andrew/Aldy where exactly does VRP/ranger assume UB on integer division by zero.
[Bug c++/104359] GCC Treats bool with value != 1 as falsey when picking branches
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104359 Martin Sebor changed: What|Removed |Added CC||msebor at gcc dot gnu.org --- Comment #5 from Martin Sebor --- It is undefined but the issue/question keeps coming up. The store that makes the subsequent read undefined is clearly visible in the IL at all optimization levels so it would be quite easy to issue a helpful warning for the code.
[Bug tree-optimization/104356] [12 Regression] divide by zero trap is being removed now when it should not be in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104356

--- Comment #34 from Eric Botcazou ---
> For the X / bool_range_Y is X. case I think just !flag_non_call_exceptions
> would be better. If @1 has boolean range and is known to be non-zero, it is
> known to be 1, so we should be optimizing it elsewhere, that is the constant
> case.

OK, adjusted and successfully tested on x86-64/Linux.

> path isolation is already guarding is_divmod_with_given_divisor calls with
> !cfun->can_throw_non_call_exceptions.

Indeed.

> Can you try to rewrite the
> unsigned
> foo (unsigned x, unsigned y)
> {
>   if (x >= 2)
>     return 0;
>   if (x == 1)
>     y += 4;
>   return y / x;
> }
>
> testcase I've posted into Ada and see if it will optimize away the division
> in evrp or vrp?

For:

with System.Unsigned_Types; use System.Unsigned_Types;

function F (X, Y : Unsigned) return Unsigned is
begin
   if X >=2 then
      return 0;
   elsif X = 1 then
      return 2 * Y;
   else
      return Y / X;
   end if;
end;

I get with -O2 -gnatp:

_ada_f:
.LFB1:
        .cfi_startproc
        cmpl    $1, %edi
        ja      .L4
        je      .L6
        ud2
        .p2align 4,,10
        .p2align 3
.L4:
        xorl    %eax, %eax
        ret
        .p2align 4,,10
        .p2align 3
.L6:
        leal    (%rsi,%rsi), %eax
        ret
[Bug c++/104319] better error message for parsing error when >= or >> ends a template variable.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104319 --- Comment #9 from Jakub Jelinek --- Though, the cp_parser_next_token_ends_template_argument_p change can't be right. E.g. struct A{}; A<1>=2> a; is not A<1> =2> a; I bet we can't treat at least >= as terminating a template argument; perhaps we could fall back to that if tentative parsing with >= didn't work out. In any case, this is not GCC 12 material, as it is not a regression.
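A small sketch of the point being made, assuming A takes an int non-type template parameter (the template header is not visible in the comment as archived, and `a_value` is a hypothetical helper): since `>=` is a single token, `A<1>=2>` is the specialization whose argument is the expression `1 >= 2`, not `A<1>` followed by `=2>`.

```cpp
// Assumed declaration: a class template with one int non-type parameter.
template <int N>
struct A {
    static constexpr int value = N;
};

// Valid code: the argument is the constant expression (1 >= 2), which is
// false and converts to 0, so this declares an object of type A<0>.
A<1>=2> a;

// Hypothetical helper exposing the deduced template argument.
int a_value() { return decltype(a)::value; }
```

So a parser that treats `>=` as ending the template argument list would reject, or change the meaning of, code that is accepted today.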
[Bug go/104290] [12 Regression] trunk 20220126 fails to build libgo on i686-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290 Svante Signell changed: What|Removed |Added CC||svante.signell at gmail dot com --- Comment #1 from Svante Signell --- Created attachment 52345 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52345&action=edit Adding hurd to unixsock_readmsg_cloexec.go fixes the build of libgo Hello, The attached patch fixes the build of libgo for GNU/Hurd for gcc-12-20220126 when patching the generated libgo/Makefile as follows: change ../libbacktrace/libbacktrace.la to ../../libbacktrace/libbacktrace.la and remove libatomic from the linkage: LIBATOMIC = ../libatomic/libatomic_convenience.la PTHREAD_CFLAGS = -pthread -L../libatomic/.libs since libatomic is not built yet. It should be built before libgo but is not. The reason is unknown; it may be a problem with the Debian stuff. Additionally, continuing, the build of gotools fails: go1: error: '-fsplit-stack' currently only supported on GNU/Linux go1: error: '-fsplit-stack' is not supported by this compiler configuration The reason for this is also unknown so far; libgo and gotools built fine with split-stack on gcc-11. Is this problem related to libatomic not having been built yet? Thanks!
[Bug analyzer/104370] New: False positive from -Wanalyzer-mismatching-deallocation with reallocarray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104370 Bug ID: 104370 Summary: False positive from -Wanalyzer-mismatching-deallocation with reallocarray Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: analyzer Assignee: dmalcolm at gcc dot gnu.org Reporter: dmalcolm at gcc dot gnu.org Target Milestone: --- Created attachment 52344 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52344&action=edit Reduced reproducer From downstream report here: https://bugzilla.redhat.com/show_bug.cgi?id=2047926#c0 Compiling the attachment with -fanalyzer gives: : In function 'main': :21:15: warning: 'foo' should have been deallocated with 'free' but was deallocated with 'reallocarray' [CWE-762] [-Wanalyzer-mismatching-deallocation] 21 | new_foo = reallocarray(foo, 201, 200); | ^~~ 'main': events 1-5 | | 17 | foo = calloc(200, 200); | | ^~~~ | | | | | (1) allocated here (expects deallocation with 'free') | 18 | if (!foo) | |~ | || | |(2) assuming 'foo' is non-NULL | |(3) following 'false' branch (when 'foo' is non-NULL)... |.. | 21 | new_foo = reallocarray(foo, 201, 200); | | ~~~ | | | | | (4) ...to here | | (5) deallocated with 'reallocarray' here; allocation at (1) expects deallocation with 'free' | Compiler Explorer: https://godbolt.org/z/K7xaxrfcs Recent glibc headers declare reallocarray twice, with different attributes: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=c1760eaf3b575ad174fd88b252fd16bd525fa818
[Bug debug/104337] [9/10/11/12 Regression] ICE when compiling with optimize attribute and always_inline at -m32 -g3 -O0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104337 --- Comment #3 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:1d5c7584fd6e72bfdbede86cef5ff04ae35f9744 commit r12-7026-g1d5c7584fd6e72bfdbede86cef5ff04ae35f9744 Author: Richard Biener Date: Thu Feb 3 11:20:59 2022 +0100 debug/104337 - avoid messing with the abstract origin chain in NRV The following avoids NRV from massaging DECL_ABSTRACT_ORIGIN after variable creation since NRV runs _after_ the function was inlined and thus affects the inlined variables copy indirectly. We may adjust the abstract origin of a variable only at the point we create it, not further along the path since otherwise the (new) invariant that the abstract origin is always the ultimate origin cannot be maintained. The intent of what NRV does is OK I guess and it may improve the debug experience. But I also notice we do SET_DECL_VALUE_EXPR (found, result); DECL_HAS_VALUE_EXPR_P (found) = 1; the code is there since the merge from tree-ssa which added tree-nrv.c. Jakub added the DECL_VALUE_EXPR in g:938650d8fddb878f623e315f0b7fd94b217efa96 and Jason added the abstract origin setting conditional in g:7716876bbd3a The following takes the radical approach and removes the attempt to "optimize" the debug info. The gdb testsuites show no regressions. 2022-02-03 Richard Biener PR debug/104337 * tree-nrv.cc (pass_nrv::execute): Remove tieing result and found together via DECL_ABSTRACT_ORIGIN. * gcc.dg/debug/pr104337.c: New testcase.
[Bug analyzer/104369] New: False positive from -Wanalyzer-use-of-uninitialized-value with realloc moving buffer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104369 Bug ID: 104369 Summary: False positive from -Wanalyzer-use-of-uninitialized-value with realloc moving buffer Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: analyzer Assignee: dmalcolm at gcc dot gnu.org Reporter: dmalcolm at gcc dot gnu.org Target Milestone: --- Created attachment 52343 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52343&action=edit Reduced reproducer The attached reproducer emits two false positives from -Wanalyzer-use-of-uninitialized-value, both "when 'realloc' succeeds, moving buffer", the first of which is: : In function 'main': :79:34: warning: use of uninitialized value '*pollfds.fd' [CWE-457] [-Wanalyzer-use-of-uninitialized-value] 79 | pollfds[nsockets - 1].fd = accept(pollfds[0].fd, , ); | ^~~~ 'main': events 1-7 | | 62 | if (!pollfds) { | | ^ | | | | | (1) following 'false' branch (when 'pollfds' is non-NULL)... |.. | 67 | rc = ppoll(pollfds, nsockets, NULL, NULL); | | | | | | | (2) ...to here |.. | 74 | newpollfds = realloc(pollfds, nsockets * sizeof(*pollfds)); | |~ | || | |(3) when 'realloc' succeeds, moving buffer | |(4) region created on heap here | 75 | if (!newpollfds) { | | ~ | | | | | (5) following 'false' branch (when 'newpollfds' is non-NULL)... |.. | 78 | pollfds = newpollfds; | | | | | | | (6) ...to here | 79 | pollfds[nsockets - 1].fd = accept(pollfds[0].fd, , ); | | | | | | | (7) use of uninitialized value '*pollfds.fd' here | On Compiler Explorer: https://godbolt.org/z/EKrnsoaY4 From downstream report: https://bugzilla.redhat.com/show_bug.cgi?id=2047926#c5
[Bug tree-optimization/104368] New: [12 Regression] Failure to vectorise conditional grouped accesses after PR102659
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104368

Bug ID: 104368
Summary: [12 Regression] Failure to vectorise conditional grouped accesses after PR102659
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---

The following test regressed with PR102659, compiled with -O3 -march=armv8.2-a+sve:

void
f(int *restrict x, int *restrict y, int n)
{
  for (int i = 0; i < n; ++i)
    if (x[i] > 0)
      x[i] = y[i * 2] + y[i * 2 + 1];
}

Previously we treated the y[] accesses as a linear group and so could use LD2W. Now we treat them as individual gather loads instead:

.L3:
        ld1w    z1.s, p0/z, [x0, x3, lsl 2]
        lsl     z0.s, z2.s, #1
        cmpgt   p0.s, p0/z, z1.s, #0
        ld1w    z1.s, p0/z, [x1, z0.s, sxtw 2]  // Gather
        ld1w    z0.s, p0/z, [x5, z0.s, sxtw 2]  // Gather
        add     z0.s, z1.s, z0.s
        st1w    z0.s, p0, [x0, x3, lsl 2]
        incw    z2.s
        add     x3, x3, x4
        whilelo p0.s, w3, w2
        b.any   .L3
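To illustrate what "treating the y[] accesses as a linear group" means, here is a scalar sketch. It is only a model: `f_grouped`, `kLanes` and the fixed width of 4 are hypothetical, the `restrict` qualifiers are dropped, and real SVE code is vector-length agnostic. For each block of lanes, y[2*i] and y[2*i + 1] come from one contiguous region that is split into an even and an odd element stream under the predicate, which is roughly what a predicated LD2W does.

```cpp
#include <algorithm>

// Scalar reference: the loop from the report.
void f_reference(int *x, const int *y, int n) {
    for (int i = 0; i < n; ++i)
        if (x[i] > 0)
            x[i] = y[i * 2] + y[i * 2 + 1];
}

constexpr int kLanes = 4;  // hypothetical vector width, for illustration only

// Model of the grouped form: per block of lanes, compute a predicate, then
// deinterleave one contiguous region of y into even/odd element streams.
void f_grouped(int *x, const int *y, int n) {
    for (int base = 0; base < n; base += kLanes) {
        const int lanes = std::min(kLanes, n - base);
        bool mask[kLanes] = {};
        int even[kLanes] = {};
        int odd[kLanes] = {};

        for (int l = 0; l < lanes; ++l)
            mask[l] = x[base + l] > 0;           // predicate per lane

        const int *group = y + 2 * base;         // start of the contiguous pairs
        for (int l = 0; l < lanes; ++l) {
            if (mask[l]) {                        // inactive lanes are never read
                even[l] = group[2 * l];
                odd[l] = group[2 * l + 1];
            }
        }

        for (int l = 0; l < lanes; ++l)
            if (mask[l])
                x[base + l] = even[l] + odd[l];  // masked store
    }
}
```

Both functions compute the same result. The grouped form needs the compiler to prove that the two y[] accesses in each lane are adjacent, whereas the gather form in the assembly above indexes each element separately.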
[Bug lto/104366] [12 Regression] Regression: infinite loop in add_sibling_attributes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104366 --- Comment #3 from Martin Liška --- Created attachment 52342 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52342&action=edit Reduced test-case
[Bug target/104345] [12 Regression] "nvptx: Transition nvptx backend to STORE_FLAG_VALUE = 1" patch made some code generation worse
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104345 --- Comment #2 from Tom de Vries --- (In reply to Roger Sayle from comment #1) > The other patches in the "nvptx Boolean" series are: > patchq3: nvptx: Expand QI mode operations using SI mode instructions. > https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587999.html > > patchq4: nvptx: Fix and use BI mode logic instructions (e.g. and.pred). > https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588555.html > > [and purely for reference, my other outstanding nvptx patches are] > patchn: nvptx: Improved support for HFMode including neghf2 and abshf2. > https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587949.html > > patchw: nvptx: Add support for 64-bit mul.hi (and other) instructions. > https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588453.html FYI, I'm currently testing these.
[Bug c++/104365] Overload ambiguity not detected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104365

--- Comment #9 from Andris Pavenis ---
>> The warning should be in case when both
>> 1) there is preferred standard conversion sequence for parameter of one
>> overloaded method
>
> Standard conversions include T -> const T&, and derived-to-base conversions,
> and T* -> void*. I don't think anybody would be surprised that those
> conversions beat a user-defined one.

Perhaps there should be no warning in these cases. [const] char * -> bool is a good example which would deserve a warning. I do not have other examples currently.

>> 2) there is other user defined conversion sequences for one more more other
>> overloaded methods
>
> And non-member functions?

They should be handled in the same way as member functions.

>> 20220203-1.cpp:19:24: warning: call of overloaded 'Test(const char [4],
>> unsigned char[4])' is ambiguous
>
> "is ambiguous" is incorrect though, so it would have to be clear that there is
> no ambiguity in C++ terms, just potential for confusion.

Maybe 'suspicious use of overloaded ...' or something similar.
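A minimal sketch of the [const] char * -> bool case mentioned above. The names `pick` and `call_with_literal` are hypothetical and this is not the reporter's test case. A string literal converts to bool through standard conversions only (array-to-pointer, then a boolean conversion), while std::string needs a user-defined conversion, so overload resolution picks the bool overload without any ambiguity.

```cpp
#include <string>

// Two overloads: one reachable by a standard conversion sequence,
// one only through std::string's converting constructor.
int pick(bool) { return 1; }
int pick(const std::string &) { return 2; }

// "abc" has type const char[4]; array-to-pointer followed by the
// pointer-to-bool conversion is a standard conversion sequence, which
// ranks above the user-defined conversion to std::string.
int call_with_literal() { return pick("abc"); }
```

Here `call_with_literal()` returns 1, which is the surprising but well-defined outcome that the proposed warning would point out.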