[Bug c/110047] RFE: Add a warning for use of bare "unsigned" (possibly under -Wimplicit-int?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110047 --- Comment #1 from Richard Biener --- Maybe just diagnose at the point of conversions that are not just sign conversions but truncations/extensions? Note even then this will have a high rate of false positives (I'm myself always short-cutting 'unsigned int' to 'unsigned' ...) so it's more of a coding-style diagnostic where then warning for all plain 'unsigned' might be appropriate as well. So, maybe split it even. -Wconversion-bare-unsigned and -Wbare-unsigned?
[Bug target/110039] [14 Regression] FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tw[0-9]+ 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110039 rsandifo at gcc dot gnu.org changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #1 from rsandifo at gcc dot gnu.org --- I guess adding an extra pattern means that we'll have three forms for this (on top of the existing alt1 and alt2 patterns). But that probably can't be helped given that the DI form has presumably not changed.
[Bug c/110048] undefined reference when build with O0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110048

Richard Biener changed:

           What    |Removed     |Added
             Status|UNCONFIRMED |RESOLVED
         Resolution|---         |INVALID

--- Comment #1 from Richard Biener ---
This is C99 inline semantics - the 'inline' function is only a declaration, not a definition, so you need an additional

  void foo (void);

somewhere to create an out-of-line instance. Or use -fgnu89-inline.
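As a rough illustration of the C99 rule described in this comment, the sketch below pairs an inline definition with a plain file-scope declaration so that the translation unit also provides an out-of-line instance. The `int` return value and the `call_foo` wrapper are not from the report; they are assumptions added so the effect can be observed.

```c
/* Inline definition: under C99 rules this alone does not emit an
   out-of-line foo, so an unoptimized call may fail to link. */
inline int foo(void) { return 42; }

/* A file-scope declaration without 'inline' makes the definition above
   an external definition in this translation unit. */
int foo(void);

/* Hypothetical caller, standing in for main() in the report. */
int call_foo(void)
{
    return foo();
}
```

Removing the plain declaration should bring back the link failure at -O0 with a C99-or-later dialect, while -fgnu89-inline selects the older GNU rules instead.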
[Bug target/110044] #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044

--- Comment #3 from Sergey Fedorov ---
(In reply to Eric Gallager from comment #2)
> possible dup of either bug 60972 and/or bug 68160?

From those reports, it looks like the bug, if identical, has not been addressed since GCC 4.9. Would it help to compare against the Apple gcc code, which seems to handle the issue correctly?
[Bug tree-optimization/110035] Missed optimization for dependent assignment statements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #6 from rguenther at suse dot de ---
On Tue, 30 May 2023, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035
>
> Andrew Pinski changed:
>
>            What    |Removed     |Added
>            Keywords|            |missed-optimization
>      Ever confirmed|0           |1
>            Severity|normal      |enhancement
>    Last reconfirmed|            |2023-05-30
>              Status|UNCONFIRMED |NEW
>
> --- Comment #2 from Andrew Pinski ---
> More obvious Reduced testcase:
> ```
> struct MyClass
> {
>   unsigned long long arr[128];
> };
>
> [[gnu::noipa]]
> void sink(void *m){}
> void gg(MyClass &a)
> {
>   MyClass c = a;
>   MyClass *b = new MyClass;
>   *b = c;
>   sink(b);
> }
> ```
>
> There might be a dup of this issue too.

But we cannot move the load of 'a' across the call to operator new, since that call can possibly clobber 'a' (you can replace 'new' with something that has observable side effects).
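The constraint in this comment can be sketched in C with an allocator passed as a function pointer. Everything named here (`struct big`, `shared`, `clobbering_alloc`, `copy_then_alloc`, `demo`) is hypothetical and only stands in for the C++ testcase, where a replaced `operator new` plays the role of the allocator.

```c
#include <stddef.h>
#include <stdlib.h>

struct big { unsigned long long arr[128]; };

/* Hypothetical global object that the allocator can reach. */
static struct big shared;

/* Hypothetical allocator with an observable side effect on 'shared'. */
static void *clobbering_alloc(size_t n)
{
    shared.arr[0] = 999;
    return malloc(n);
}

/* The copy of *a is taken before the allocation call, so the result must
   hold the pre-call contents. Sinking the load below the call would
   change the observable result whenever the allocator modifies *a. */
static struct big *copy_then_alloc(const struct big *a,
                                   void *(*alloc)(size_t))
{
    struct big c = *a;
    struct big *b = alloc(sizeof *b);
    if (b)
        *b = c;
    return b;
}

/* Returns the first element seen by the copy (0 if allocation failed). */
unsigned long long demo(void)
{
    shared.arr[0] = 1;
    struct big *b = copy_then_alloc(&shared, clobbering_alloc);
    unsigned long long seen = b ? b->arr[0] : 0;
    free(b);
    return seen;
}
```

Because the compiler usually cannot prove that the allocator leaves `*a` alone, it has to keep the temporary copy alive across the call, which is the spilling discussed in this bug.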
[Bug c/110048] New: undefined reference when build with O0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110048

            Bug ID: 110048
           Summary: undefined reference when build with O0
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yinyuefengyi at gmail dot com
  Target Milestone: ---

The case below has failed to link at -O0 since gcc 5.1. Is it a regression? Clang has always failed to link it, though. The case links successfully at -O1 and above, or with 'inline' removed.

https://godbolt.org/z/9PEhWrov8

    inline void foo(void) { }

    int main(void)
    {
      foo();
    }
[Bug target/59666] IBM long double arithmetic results invalid in non-default rounding modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59666

Eric Gallager changed:

           What    |Removed |Added
                 CC|        |egallager at gcc dot gnu.org,
                   |        |iains at gcc dot gnu.org

--- Comment #9 from Eric Gallager ---
(In reply to Sergey Fedorov from comment #8)
> (In reply to Vincent Lefèvre from comment #1)
> > (In reply to Joseph S. Myers from comment #0)
> > It seems to be like that "by design" (though this is not satisfactory) and
> > part of the ppc64 ABI for instance:
> >
> > http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#PREC
> >
> > "The software support is restricted to round-to-nearest mode. Programs that
> > use extended precision must ensure that this rounding mode is in effect when
> > extended-precision calculations are performed."
>
> Also true for AIX:
> https://www.ibm.com/docs/sr/xcafbg/9.0.0?topic=SS3KZ4_9.0.0/com.ibm.xlf111.bg.doc/xlfopg/fp-overview.html
>
> Does anyone know whether this is also true for Darwin on PowerPC though?

I don't, but Iain might...
[Bug target/110044] #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044 Eric Gallager changed: What|Removed |Added CC||egallager at gcc dot gnu.org --- Comment #2 from Eric Gallager --- possible dup of either bug 60972 and/or bug 68160?
[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041 Eric Gallager changed: What|Removed |Added CC||egallager at gcc dot gnu.org --- Comment #4 from Eric Gallager --- GCC also has its own -Wmisleading-indentation flag; I wonder why that didn't catch this?
[Bug c/110047] New: RFE: Add a warning for use of bare "unsigned" (possibly under -Wimplicit-int?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110047

            Bug ID: 110047
           Summary: RFE: Add a warning for use of bare "unsigned" (possibly under -Wimplicit-int?)
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: diagnostic
          Severity: enhancement
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: egallager at gcc dot gnu.org
            Blocks: 87403
  Target Milestone: ---

When I was first learning C, one thing that confused me was how you can just use plain "unsigned" as a type, without specifying the length (long, short, int, etc.). Thus, I thought that casting to unsigned would just change the sign like a call to abs(), without realizing that there was an implicit "int" involved. I made a testcase:

$ cat bare_unsigned.c
#include <limits.h>

unsigned var; /* debatable */

unsigned long foo(void)
{
    long variable = LONG_MAX;
    unsigned long uvariable = (unsigned)variable; /* warn here */
    return uvariable;
}
$

The declaration marked "debatable" is debatable because I see declarations in that form pretty often, and it's probably not very harmful in that case. The cast marked "warn here" is probably more deserving of a warning, as there's a change of size involved. It might make sense to include this under -Wimplicit-int, or maybe create a new warning -Wbare-unsigned for it?

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87403
[Bug 87403] [Meta-bug] Issues that suggest a new warning
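The size change behind the "warn here" case can be made visible with a small comparison. This is only a sketch: the function names are made up for illustration, and whether any bits are actually lost depends on the target's data model (long wider than int, as on LP64).

```c
#include <limits.h>

/* Mirrors the cast in the testcase: bare 'unsigned' means 'unsigned int',
   so a wider long is reduced modulo UINT_MAX + 1 before widening again. */
unsigned long through_bare_unsigned(long value)
{
    return (unsigned)value;
}

/* Spelling out the intended width keeps every bit of a non-negative long. */
unsigned long through_unsigned_long(long value)
{
    return (unsigned long)value;
}

/* Returns 1 when the two casts disagree for LONG_MAX, which happens on
   targets where long is wider than int. */
int bare_cast_loses_bits(void)
{
    return through_bare_unsigned(LONG_MAX) != through_unsigned_long(LONG_MAX);
}
```

On a target where long and int have the same width, both functions return the same value and the cast is harmless, which is part of why the request separates plain declarations from size-changing conversions.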
[Bug c/29970] mixing ({...}) with VLA leads to massive breakage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29970 --- Comment #18 from Martin Uecker --- What is not fixed is returning structs with VLA members as in the first three test cases, e.g. the second one still ICEs.
[Bug middle-end/70802] IRA memory cost calculation incorrect for immediates
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70802 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=71768 Resolution|--- |FIXED Target Milestone|--- |8.0 Status|UNCONFIRMED |RESOLVED --- Comment #2 from Andrew Pinski --- This is also fixed for GCC 8 by r8-6056-g5cce817119cd31d18fbfc1c8245519d86b5e9 .
[Bug rtl-optimization/71768] Missed trivial rematerialiation oppurtunity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71768 Andrew Pinski changed: What|Removed |Added Known to fail||7.5.0 Known to work||14.0, 8.1.0, 9.1.0 Target Milestone|--- |8.0 Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #4 from Andrew Pinski --- Fixed for GCC 8 by r8-6056-g5cce817119cd31d18fbfc1c8245519d86b5e9 .
[Bug target/27663] missed-optimization transforming a byte array to unsigned long
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=27663

Andrew Pinski changed:

           What    |Removed             |Added
   Last reconfirmed|2007-07-25 17:29:16 |2023-5-30
          Component|middle-end          |target
      Known to fail|                    |5.1.0

--- Comment #9 from Andrew Pinski ---
Starting around GCC 5 or so, a call to __bswapsi2 is emitted here. My bet is that if the avr target adds a bswapsi2 pattern (which either expands or splits into the best moves), this will be optimized correctly.
[Bug target/109971] [14 regression] Several powerpc64 vector test cases fail after r14-1242-gf574e2dfae7905
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971 --- Comment #10 from Kewen Lin --- (In reply to JuzheZhong from comment #9) > (In reply to Kewen Lin from comment #8) > > I did SPEC2017 int/fp evaluation on Power10 at Ofast and an extra explicit > > --param=vect-partial-vector-usage=2 (the default is 1 on Power), baseline > > r14-1241 vs. new r14-1242, the results showed that it can offer some > > speedups for 500.perlbench_r 1.12%, 525.x264_r 1.96%, 544.nab_r 1.91%, > > 549.fotonik3d_r 1.25%, but it degraded 510.parest_r by 5.01%. > > > > I just tested Juzhe's new proposed fix which makes the loop closing iv > > SCEV-ed, it can fix the degradation of 510.parest_r, also the miss > > optimization on cunroll (in #c5), the test failures are gone as well. One > > SPEC2017 re-evaluation with that fix is ongoing, I'd expect it won't degrade > > anything. > > Thanks so much. You mean you are trying this patch: > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620086.html ? Yes, it means that Richi's concern (niter analysis but all analyses relying on SCEV are pessimized) does affect the exposed degradation and failures. Thanks for looking into it. > > I believe it can improve even more for IBM's target. Hope so, I'll post the new SPEC2017 results once the run finishes. btw, the SPEC2017 run with --param=vect-partial-vector-usage=2 here is mainly to verify the expectation on the decrement IV change, the normal SPEC2017 runs still use --param=vect-partial-vector-usage=1 which isn't affected by this change and it beats the former in general as the cost for length setting up.
[Bug target/109971] [14 regression] Several powerpc64 vector test cases fail after r14-1242-gf574e2dfae7905
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971 --- Comment #9 from JuzheZhong --- (In reply to Kewen Lin from comment #8) > I did SPEC2017 int/fp evaluation on Power10 at Ofast and an extra explicit > --param=vect-partial-vector-usage=2 (the default is 1 on Power), baseline > r14-1241 vs. new r14-1242, the results showed that it can offer some > speedups for 500.perlbench_r 1.12%, 525.x264_r 1.96%, 544.nab_r 1.91%, > 549.fotonik3d_r 1.25%, but it degraded 510.parest_r by 5.01%. > > I just tested Juzhe's new proposed fix which makes the loop closing iv > SCEV-ed, it can fix the degradation of 510.parest_r, also the miss > optimization on cunroll (in #c5), the test failures are gone as well. One > SPEC2017 re-evaluation with that fix is ongoing, I'd expect it won't degrade > anything. Thanks so much. You mean you are trying this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620086.html ? I believe it can improve even more for IBM's target.
[Bug target/109971] [14 regression] Several powerpc64 vector test cases fail after r14-1242-gf574e2dfae7905
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971 Kewen Lin changed: What|Removed |Added Keywords|testsuite-fail |missed-optimization Assignee|linkw at gcc dot gnu.org |juzhe.zhong at rivai dot ai --- Comment #8 from Kewen Lin --- I did SPEC2017 int/fp evaluation on Power10 at Ofast and an extra explicit --param=vect-partial-vector-usage=2 (the default is 1 on Power), baseline r14-1241 vs. new r14-1242, the results showed that it can offer some speedups for 500.perlbench_r 1.12%, 525.x264_r 1.96%, 544.nab_r 1.91%, 549.fotonik3d_r 1.25%, but it degraded 510.parest_r by 5.01%. I just tested Juzhe's new proposed fix which makes the loop closing iv SCEV-ed, it can fix the degradation of 510.parest_r, also the miss optimization on cunroll (in #c5), the test failures are gone as well. One SPEC2017 re-evaluation with that fix is ongoing, I'd expect it won't degrade anything.
[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038

--- Comment #2 from cuilili ---
(In reply to Richard Biener from comment #1)
> Probably best to limit the values to reassoc-width by adding the
> appropriate IntegerRange attribute in params.opt
>
> IntegerRange(0, 256)
>
> maybe?

"rewrite_expr_tree_parallel" got a wrong width from "get_reassociation_width". The number of ops is 4 and the width is 2147483647.

get_reassociation_width:
  ...
  width_min = 1;
  while (width > width_min)
    {
      int width_mid = (width + width_min) / 2;   --> (width + 1) out of bounds
      ...

Richard suggested that limiting tree-reassoc-width to IntegerRange(0, 256) would solve the ICE. I also added a width constraint in rewrite_expr_tree_parallel. Here is the patch:

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620154.html

1. Limit the value of tree-reassoc-width to IntegerRange(0, 256).
2. Add width limit in rewrite_expr_tree_parallel.
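For reference, the overflow pointed out in this comment is the usual binary-search midpoint problem. The sketch below is not the GCC patch; `midpoint_no_overflow` and `clamp_width` are hypothetical helpers showing two generic ways to avoid forming `width + width_min` when width is near INT_MAX.

```c
#include <assert.h>
#include <limits.h>

/* Midpoint of [lo, hi] without forming lo + hi, which overflows int
   when hi is close to INT_MAX. Requires 0 <= lo <= hi. */
int midpoint_no_overflow(int lo, int hi)
{
    assert(lo >= 0 && lo <= hi);
    return lo + (hi - lo) / 2;
}

/* Hypothetical helper: clamp a user-supplied width into [0, max_width],
   in the spirit of the IntegerRange(0, 256) limit discussed above. */
int clamp_width(int width, int max_width)
{
    if (width < 0)
        return 0;
    return width > max_width ? max_width : width;
}
```

Restricting the parameter range, as the posted patch does, makes the first form unnecessary in practice; the midpoint rewrite is shown only because it removes the overflow regardless of the accepted range.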
[Bug fortran/105847] namelist-object-name can be a renamed host associated entity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105847 --- Comment #5 from Jerry DeLisle --- Hi Steve, I will see if I can get all this tested and committed this coming weekend.
[Bug c/50486] Missed -Wsign-conversion with signed -> unsigned casting and enums
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50486 Eric Gallager changed: What|Removed |Added Summary|No warning at signed -> |Missed -Wsign-conversion |unsigned casting|with signed -> unsigned ||casting and enums --- Comment #4 from Eric Gallager --- updating the title a bit
[Bug c/29970] mixing ({...}) with VLA leads to massive breakage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29970 --- Comment #17 from Eric Gallager --- (In reply to CVS Commits from comment #16) > The master branch has been updated by Martin Uecker : > > https://gcc.gnu.org/g:4e6bf0b9dd5585df1a1472d6a93b9fff72fe2524 > > commit r12-5338-g4e6bf0b9dd5585df1a1472d6a93b9fff72fe2524 > Author: Martin Uecker > Date: Wed Nov 17 14:20:59 2021 +0100 > > Fix ICE when mixing VLAs and statement expressions [PR91038] > > When returning VM-types from statement expressions, this can > lead to an ICE when declarations from the statement expression > are referred to later. Most of these issues can be addressed by > gimplifying the base expression earlier in gimplify_compound_lval. > Another issue is fixed by wrapping the pointer expression in > pointer_int_sum. This fixes PR91038 and some of the test cases > from PR29970 (structs with VLA members need further work). > > gcc/ > PR c/91038 > PR c/29970 > * gimplify.c (gimplify_var_or_parm_decl): Update comment. > (gimplify_compound_lval): Gimplify base expression first. > (gimplify_target_expr): Add comment. > > gcc/c-family/ > PR c/91038 > PR c/29970 > * c-common.c (pointer_int_sum): Make sure pointer expressions > are evaluated first when the size expression depends on for > variably-modified types. > > gcc/testsuite/ > PR c/91038 > PR c/29970 > * gcc.dg/vla-stexp-3.c: New test. > * gcc.dg/vla-stexp-4.c: New test. > * gcc.dg/vla-stexp-5.c: New test. > * gcc.dg/vla-stexp-6.c: New test. > * gcc.dg/vla-stexp-7.c: New test. > * gcc.dg/vla-stexp-8.c: New test. > * gcc.dg/vla-stexp-9.c: New test. Is this fixed now, or is it staying open for backports?
[Bug c++/55077] implement and enable by default -Wliteral-conversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55077 --- Comment #10 from Eric Gallager --- (In reply to David Binderman from comment #9) > -Wfloat-conversion does the deed: any chance of getting it someplace useful > like -Wall or -Wextra anytime soon ? > > I will put it into my local compiler. I think the point here is that the proposed -Wliteral-conversion warns for a smaller number of cases than -Wfloat-conversion does, and thus would be safer to enable more widely than -Wfloat-conversion is.
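The difference in scope can be pictured with two conversions. This is a sketch under an assumption: GCC has no -Wliteral-conversion yet, so the split below reflects how the proposal is described in this bug rather than any implemented behaviour. The function names are illustrative only.

```c
/* A floating-point literal truncated to int: the narrow case that the
   proposed -Wliteral-conversion is meant to flag. -Wfloat-conversion
   covers it as well. */
int from_literal(void)
{
    int x = 2.5;   /* the stored value is 2 */
    return x;
}

/* A run-time double converted to int: flagged by -Wfloat-conversion only,
   which is why that option fires on far more existing code. */
int from_variable(double d)
{
    int x = d;
    return x;
}
```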
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #7 from Frank J. T. Wojcik ---
After playing with this a bit more, I was able to find a Generator which produces a slightly wider bound, but still nowhere near 0x1.fffffep+127. This also includes the requested change to SimpleGen.

#include <cstdio>
#include <cstdint>
#include <random>

#define MAXOFFSET 3

// Return 25-bit values centered around 0x1000000 (the halfway-point of all
// possible 25-bit outputs). Widening the Generator does not alter the
// program's output.
class SimpleGen {
public:
    using result_type = uint32_t;
    result_type val, offset, ctr = 0;
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return 0x1ffffff; }
    result_type operator()() {
        offset = (ctr & 1) ? ((ctr / 2) / (2 * MAXOFFSET))
                           : ((ctr / 2) % (2 * MAXOFFSET));
        val = 0x1000000 + offset - MAXOFFSET;
        printf("\tG 0x%07x\n", val);
        ++ctr;
        return val;
    }
};

int main(void) {
    SimpleGen gen;
    std::normal_distribution<float> norm {0, 1};

    printf("min: %+f %a\nmax: %+f %a\n\n",
           norm.min(), norm.min(), norm.max(), norm.max());
    for (int i = 0; i < ((2 * MAXOFFSET) * (2 * MAXOFFSET - 1) * 2); i++) {
        float r = norm(gen);
        printf("%d %f %a\n", i, r, r);
    }
}

Build output:

$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.cpp -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Actual outputs (excerpts only!):

min: -340282346638528859811704183484516925440.000000 -0x1.fffffep+127
max: +340282346638528859811704183484516925440.000000 0x1.fffffep+127

        G 0x1000000
        G 0x0ffffff
30 -8.157336 -0x1.0508e6p+3
31 0.000000 0x0p+0
        G 0x1000001
        G 0x0ffffff
32 -8.157336 -0x1.0508e6p+3
33 0.000000 0x0p+0
        G 0x0ffffff
        G 0x1000000
40 0.000000 0x0p+0
41 -8.157336 -0x1.0508e6p+3
        G 0x1000002
        G 0x1000000
42 0.000000 0x0p+0
43 7.985583 0x1.ff13ccp+2
        G 0x0ffffff
        G 0x1000001
48 0.000000 0x0p+0
49 -8.157336 -0x1.0508e6p+3
        G 0x1000002
        G 0x1000001
50 0.000000 0x0p+0
51 7.985583 0x1.ff13ccp+2
        G 0x1000000
        G 0x1000002
58 7.985583 0x1.ff13ccp+2
59 0.000000 0x0p+0

Expected outputs (excerpt only!):

min: -8.157336 -0x1.0508e6p+3
max: +7.985583 0x1.ff13ccp+2
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #6 from Frank J. T. Wojcik --- (In reply to Andrew Pinski from comment #5) > 2 things, I think your result_type in SimpleGen needs to be public. Sure, I'll change that. > Second is LLVM's libc++ outputs: > -inf -inf > inf inf I only have clang 6.0 on my system and its outputs are identical to g++'s. And I do not have access to MSVC, but I believe it also produces similar values. These data points are why I bring up the possibility that there is some detail in the specification that I'm missing. Still, the behavior I'm looking for is useful, and it does seem to be what the spec is requiring, so I would like a fix if appropriate.
[Bug tree-optimization/110035] Missed optimization for dependent assignment statements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035

--- Comment #5 from Pontakorn Prasertsuk ---
(In reply to Andrew Pinski from comment #3)
> We don't even optimize:
> ```
> struct MyClass
> {
>   unsigned long long arr[128];
> };
>
> [[gnu::noipa]]
> void sink(void *m);
> void gg(MyClass &a, MyClass *b)
> {
>   MyClass c = a;
>   *b = c;
>   sink(b);
> }
> ```
>
> As I mentioned there are dups of the above testcase.

Would you mind pointing me to the original issue?
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #5 from Andrew Pinski --- 2 things, I think your result_type in SimpleGen needs to be public. Second is LLVM's libc++ outputs: -inf -inf inf inf
[Bug tree-optimization/110035] Missed optimization for dependent assignment statements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035 --- Comment #4 from Pontakorn Prasertsuk --- (In reply to Richard Biener from comment #1) > Ick - convoluted C++. We end up with > > void ff (struct MyClass & obj) > { > vector(2) long unsigned int vect_SR.16; > vector(2) long unsigned int vect_SR.15; > vector(2) long unsigned int vect_SR.14; > void * _6; > >[local count: 1073741824]: > vect_SR.14_5 = MEM [(struct MyClass > &)obj_2(D)]; > vect_SR.15_28 = MEM [(struct MyClass > &)obj_2(D) + 16]; > vect_SR.16_30 = MEM [(struct MyClass > &)obj_2(D) + 32]; > _6 = operator new (48); > MEM [(struct MyClass2 *)_6] = vect_SR.14_5; > MEM [(struct MyClass2 *)_6 + 16B] = > vect_SR.15_28; > MEM [(struct MyClass2 *)_6 + 32B] = > vect_SR.16_30; > HandleMyClass2 (_6); [tail call] > > and the issue is that 'operator new (48)' can alter what 'obj' points to, > so we cannot move the loads across the call and we get spilling. > > There is no inter-procedural analysis in GCC that would tell us that > 'obj_2(D)' (the MyClass & obj argument of ff) does not point to an > object that did not escape. In fact 'ff' has global visibility > and it might have other callers. > > If you add -fwhole-program then you get the function inlined to main and > > main: > .LFB652: > .cfi_startproc > subq$8, %rsp > .cfi_def_cfa_offset 16 > movl$48, %edi > call_Znwm > movq$0, (%rax) > movq%rax, %rdi > movq$0, 8(%rax) > movq$0, 16(%rax) > movq$0, 24(%rax) > movq$0, 32(%rax) > movq$0, 40(%rax) > call_Z14HandleMyClass2Pv > xorl%eax, %eax > addq$8, %rsp > .cfi_def_cfa_offset 8 > ret > > (not using vectors because 'main' is considered cold). Do you cite an > inline copy of ff() for clang? Hi Richard, The clang snippet I provided is not inlined into 'main' function.
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #4 from Frank J. T. Wojcik --- (In reply to Andrew Pinski from comment #3) > With 24bit precision, maybe it is ~8 standard deviations away from the mean. > But the generator argument can change for each call though so that does not > mean the next call to operator() could produce one with more bits ... > > Also the standard says: "as determined by the current values of d's > parameters" > > The parameters is only mean and standard deviations and not the generator. I would agree with all of this also, I think. :) But can you or someone demonstrate *any* generator which produces (e.g.) the current value of max() for std::normal_distribution {0, 1}? I can find no generator implementation which does that, and by my reading of the implementation there cannot be one.
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #3 from Andrew Pinski --- With 24bit precision, maybe it is ~8 standard deviations away from the mean. But the generator argument can change for each call though so that does not mean the next call to operator() could produce one with more bits ... Also the standard says: "as determined by the current values of d's parameters" The parameters is only mean and standard deviations and not the generator.
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #2 from Frank J. T. Wojcik --- (In reply to Andrew Pinski from comment #1) > The heavy weight goes to potentially. The way I understand it is not the max > of what operator() has produced currently but will potentially return in the > future. And I would agree with that interpretation. My inquiry is based on the fact that I can find *no* Generator outputs which produce the currently-given max() value. The goal of the demo program was to show how I arrived at my "7.985583" value for what seems to be the actual maximum value, and that I didn't make up some arbitrary value. If it helps, imagine the min/max printf() to appear before any generation has been done, or even without any generation. I would expect the same result, modulo order of printouts.
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #1 from Andrew Pinski --- The heavy weight goes to potentially. The way I understand it is not the max of what operator() has produced currently but will potentially return in the future.
[Bug libstdc++/110045] New: std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

            Bug ID: 110045
           Summary: std::normal_distribution (and likely others) give wrong min() and max() values
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gccbugs at elkpod dot com
  Target Milestone: ---

From my (non-expert) reading of the C++ spec (I'm using https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf for the current 2023-05-10 draft), the min() and max() methods for std::normal_distribution are returning incorrect values.

"28.5.3.6 Random number distribution requirements" says that "A class D meets the requirements of a random number distribution if the expressions shown in Table 97 are valid and have the indicated semantics", and that in Table 97 "x" is a "(possibly const) value[s] of D", and that "x.min()" "Returns glb", and that "x.max()" "Returns lub", and that "glb and lub are values of T respectively corresponding to the greatest lower bound and the least upper bound on the values potentially returned by d's operator(), as determined by the current values of d's parameters".

By my reading, this means that if I have "std::normal_distribution<float> norm {0, 1};", then "norm.max()" should return the least upper bound on the values potentially returned by norm's operator(). It does not seem to do that. For example, the highest value I was able to get norm() to generate was 0x1.ff13ccp+2 (== 7.985583). norm.max() reports 0x1.fffffep+127 (== 340282346638528859811704183484516925440.000000). While the values returned are technically valid bounds, they do not seem to be the least upper or greatest lower. I could find no Generator outputs which produced a value of 0x1.fffffep+127 from std::normal_distribution.

It is possible that my interpretation of the spec is wrong, and that somehow the min/max values should encompass all possible instantiations of std::normal_distribution, or some other loophole may exist. I could not find a better forum to find the answer to that; sorry.

I have looked at the implementation of std::normal_distribution, and written the following code to generate what I believe to be the actual extreme values that are "potentially returned by norm's operator()".

Reproduction code:

#include <cstdio>
#include <cstdint>
#include <random>

int32_t offsets[8] = { -1, +0, +0, -1, +0, +1, +1, +0 };

class SimpleGen {
    using result_type = uint32_t;
public:
    result_type val, ctr = 0;
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return 0xffffff; }
    result_type operator()() {
        val = 0x800000 + offsets[(ctr++)%8];
        printf("\tG 0x%06x\n", val);
        return val;
    }
};

int main(void) {
    SimpleGen gen;
    std::normal_distribution<float> norm {0, 1};
    for (int i = 0; i < 8; i++) {
        float r = norm(gen);
        printf("%d %f %a\n", i, r, r);
    }
    printf("\n%f %a\n%f %a\n", norm.min(), norm.min(), norm.max(), norm.max());
}

Build output:

$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.c -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Output:

        G 0x7fffff
        G 0x800000
0 0.000000 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
        G 0x800000
        G 0x7fffff
2 -7.985583 -0x1.ff13ccp+2
3 0.000000 0x0p+0
        G 0x800000
        G 0x800001
4 7.985583 0x1.ff13ccp+2
5 0.000000 0x0p+0
        G 0x800001
        G 0x800000
6 0.000000 0x0p+0
7 7.985583 0x1.ff13ccp+2

-340282346638528859811704183484516925440.000000 -0x1.fffffep+127
340282346638528859811704183484516925440.000000 0x1.fffffep+127

Expected output:

        G 0x7fffff
        G 0x800000
0 0.000000 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
        G 0x800000
        G 0x7fffff
2 -7.985583 -0x1.ff13ccp+2
3 0.000000 0x0p+0
        G 0x800000
        G 0x800001
4 7.985583 0x1.ff13ccp+2
5 0.000000 0x0p+0
        G 0x800001
        G 0x800000
6 0.000000 0x0p+0
7 7.985583 0x1.ff13ccp+2

-7.985583 -0x1.ff13ccp+2
7.985583 0x1.ff13ccp+2

$ g++-12.2 -v
Using built-in specs.
COLLECT_GCC=g++-12.2
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-slackware-linux/12.2.0/lto-wrapper
Target: x86_64-slackware-linux
Configured with: ../gcc-12.2.0/configure --prefix=/usr/local --program-suffix=-12.2 -enable-languages=c,c++,lto --enable-lto --disable-multilib --with-gnu-ld --enable-threads --verbose --target=x86_64-slackware-linux --build=x86_64-slackware-linux --host=x86_64-slackware-linux --enable-tls --with-fpmath=avx --enable-__cxa_atexit --enable-gnu-indirect-function --enable-bootstrap --enable-libssp
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC)
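The reported extreme of 7.985583 can be reproduced by hand. The derivation below rests on assumptions that the report itself does not spell out: that the implementation draws variates with the Marsaglia polar method, that each uniform input is a multiple of $2^{-24}$ in $[0, 1)$, and that the extreme occurs when one input equals $1/2$ and the other sits one step away from $1/2$.

$$
\begin{aligned}
x &= 2 \cdot \tfrac{1}{2} - 1 = 0, \qquad
y = 2\left(\tfrac{1}{2} + 2^{-24}\right) - 1 = 2^{-23},\\
r^2 &= x^2 + y^2 = 2^{-46},\\
y \sqrt{\frac{-2 \ln r^2}{r^2}}
  &= 2^{-23} \sqrt{92 \ln 2 \cdot 2^{46}}
   = \sqrt{92 \ln 2} \approx 7.985583 .
\end{aligned}
$$

Under the same assumptions, a generator with one more bit of resolution gives $y = 2^{-24}$, $r^2 = 2^{-48}$ and $\sqrt{96 \ln 2} \approx 8.157336$, which agrees with the value reported in comment #7 of this bug.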
[Bug target/108938] Missing bswap detection
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108938 Hongtao.liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #17 from Hongtao.liu --- Fixed for GCC14.
[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804 --- Comment #6 from Hongtao.liu --- Fixed for GCC14.
[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804

--- Comment #5 from CVS Commits ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:3279b6223066d36d2e6880a137f80a46d3c82c8f

commit r14-1421-g3279b6223066d36d2e6880a137f80a46d3c82c8f
Author: liuhongt
Date:   Wed Feb 22 17:54:46 2023 +0800

    Enhance NARROW FLOAT_EXPR vectorization by truncating integer to lower precision.

    Similar like WIDEN FLOAT_EXPR, when direct_optab is not existed, try
    intermediate integer type whenever gimple ranger can tell it's safe.

    .i.e.
    When there's no direct optab for vector long long -> vector float, but
    the value range of integer can be represented as int, try vector int
    -> vector float if availble.

    gcc/ChangeLog:

            PR tree-optimization/108804
            * tree-vect-patterns.cc (vect_get_range_info): Remove static.
            * tree-vect-stmts.cc (vect_create_vectorized_demotion_stmts):
            Add new parameter narrow_src_p.
            (vectorizable_conversion): Enhance NARROW FLOAT_EXPR
            vectorization by truncating to lower precision.
            * tree-vectorizer.h (vect_get_range_info): New declare.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr108804.c: New test.
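A scalar, source-level analogue of the idea in this commit message is sketched below. It is not the vectorizer code and not the contents of pr108804.c; the mask value and the function names are assumptions chosen so that the value range provably fits in a 32-bit int.

```c
#include <stddef.h>
#include <stdint.h>

/* Direct form: each element goes from a 64-bit integer to float. */
void to_float_direct(const uint64_t *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (float)(in[i] & 0xffff);
}

/* Narrowed form: the mask bounds the value to [0, 65535], so it fits in
   a 32-bit int and the int -> float conversion gives the same result. */
void to_float_via_int(const uint64_t *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t narrow = (int32_t)(in[i] & 0xffff);
        out[i] = (float)narrow;
    }
}

/* Returns 1 if both forms agree on every element of a small sample. */
int conversions_agree(void)
{
    const uint64_t sample[4] = { 0, 1, 0xffff, 0x123456789abcdefULL };
    float a[4], b[4];
    to_float_direct(sample, a, 4);
    to_float_via_int(sample, b, 4);
    for (size_t i = 0; i < 4; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

The narrowing is only valid because the range is known; without the mask, a 64-bit value above INT32_MAX would be changed by the intermediate cast, which is why the commit relies on range information before choosing the narrower conversion.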
[Bug tree-optimization/110043] [14 Regression] ice in size_remaining, at pointer-query.cc:875
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110043

Andrew Pinski changed:

           What    |Removed     |Added
     Ever confirmed|0           |1
   Last reconfirmed|            |2023-05-30
             Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski ---
Note you can make this C++ valid code too:
```
__int128 g_116_1;
extern char g_521[][8];
void func_24() {
  for (; g_116_1 >= 0;)
    g_521[g_116_1][g_116_1] &= 0;
}
```

Confirmed.
[Bug tree-optimization/110043] [14 Regression] ice in size_remaining, at pointer-query.cc:875
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110043 Andrew Pinski changed: What|Removed |Added Summary|ice in size_remaining, at |[14 Regression] ice in |pointer-query.cc:875|size_remaining, at ||pointer-query.cc:875 Component|c |tree-optimization Target Milestone|--- |14.0 Keywords||ice-on-valid-code
[Bug target/110044] #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044 Andrew Pinski changed: What|Removed |Added Keywords||ABI, wrong-code --- Comment #1 from Andrew Pinski --- I suspect the issue is inside darwin_rs6000_special_round_type_align . But I can't seem to figure out just by looking at the code.
[Bug c/109836] -Wpointer-sign should be enabled by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109836 Eric Gallager changed: What|Removed |Added Keywords||patch --- Comment #5 from Eric Gallager --- (In reply to Eric Gallager from comment #4) > How about: > > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index 0d0ad0a6374..f046d91d03b 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -1178,7 +1178,7 @@ C ObjC C++ ObjC++ Var(warn_pointer_arith) Warning > LangEnabledBy(C ObjC C++ ObjC+ > Warn about function pointer arithmetic. > > Wpointer-sign > -C ObjC Var(warn_pointer_sign) Warning LangEnabledBy(C ObjC,Wall || > Wpedantic) > +C ObjC Var(warn_pointer_sign) Warning LangEnabledBy(C ObjC,Wall || > Wpedantic || Wextra) > Warn when a pointer differs in signedness in an assignment. > > Wpointer-compare I sent this to the gcc-patches mailing list: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620137.html
[Bug target/110044] New: #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110044 Bug ID: 110044 Summary: #pragma pack(push, 1) may not force packing, while __attribute__((packed, aligned(1))) works Product: gcc Version: 12.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: vital.had at gmail dot com CC: iains at gcc dot gnu.org Target Milestone: --- Target: powerpc-apple-darwin Problem: #pragma pack(push, 1) may not work correctly on ppc (32-bit); seems to be present across GCC versions, confirmed to affect gcc7, gcc11 and gcc12. Old Apple GCC 4.2 is not affected, at the same time. Test code: #include #include #pragma pack(push, 1) /* struct from OpenEXR; should be packed with the pragma directive */ typedef struct { uint32_t x_size; uint32_t y_size; uint8_t level_and_round; } exr_attr_tiledesc_t; /* same struct but reordered */ typedef struct { uint8_t level_and_round; uint32_t x_size; uint32_t y_size; } new1_exr_attr_tiledesc_t; /* same as first struct but with packed forced */ typedef struct { uint32_t x_size; uint32_t y_size; uint8_t level_and_round; } __attribute__((packed, aligned(1))) new2_exr_attr_tiledesc_t; #pragma pack(pop) int main() { std::cout << sizeof(exr_attr_tiledesc_t) << " " << sizeof(new1_exr_attr_tiledesc_t) << " " << sizeof(new2_exr_attr_tiledesc_t) << "\n"; return 0; } On Mac OS X Leopard (10.5 PowerPC): `g++-mp-7 main.cxx && ./a.out` gives: 12 9 9 `g++ main.cxx && ./a.out` gives: 9 9 9 `g++* -arch ppc64 && ./a.out` gives: 9 9 9 On Mac OS X Snow Leopard (10A190 PowerPC): `g++-mp-11 main.cxx && ./a.out` gives: 12 9 9 `g++-mp-12 main.cxx && ./a.out` gives: 12 9 9 `g++ main.cxx && ./a.out` gives: 9 9 9 where g++ stands for Xcode gcc-4.2. Discussion in: https://github.com/macports/macports-ports/pull/18872 Also see: https://trac.macports.org/ticket/63490
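For anyone wanting to check a given toolchain, the comparison in this report can be reduced to a plain C sketch like the one below. The type and function names are hypothetical; the field layout follows the first and third structs in the report. On most targets both sizes are expected to be 9, while the report says the pragma variant comes out as 12 on 32-bit powerpc-apple-darwin.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed only through the pragma, as in the first struct of the report. */
#pragma pack(push, 1)
typedef struct {
    uint32_t x_size;
    uint32_t y_size;
    uint8_t  level_and_round;
} tiledesc_pragma_t;
#pragma pack(pop)

/* Packed through the attribute, as in the third struct of the report. */
typedef struct {
    uint32_t x_size;
    uint32_t y_size;
    uint8_t  level_and_round;
} __attribute__((packed, aligned(1))) tiledesc_attr_t;

/* Size seen by the compiler for the pragma-packed struct. */
size_t size_with_pragma(void)
{
    return sizeof(tiledesc_pragma_t);
}

/* Size seen by the compiler for the attribute-packed struct. */
size_t size_with_attribute(void)
{
    return sizeof(tiledesc_attr_t);
}
```

Printing both values from a small `main` gives the same kind of output as the report's `12 9 9` versus `9 9 9` lines, restricted to the two variants that differ.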
[Bug c/109826] Incompatible pointer types in ?: not covered by -Wincompatible-pointer-types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109826 Eric Gallager changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-05-30 Ever confirmed|0 |1 --- Comment #4 from Eric Gallager --- (anyways, confirmed)
[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 --- Comment #6 from Andrew Pinski --- Created attachment 55219 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55219&action=edit untested patch I am going to test this on both x86_64 and aarch64 tonight.
[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2023-05-30 Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #5 from Andrew Pinski --- I have a patch which adds support for paradoxical subregs. Since paradoxical subregs as the dest always assign the full register still, there is no reason to reject it.
[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 --- Comment #4 from Andrew Pinski --- Because of the subreg.
[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 --- Comment #3 from Andrew Pinski --- bb_valid_for_noce_process_p returns false for the zero_extract case ...
[Bug c/110043] New: ice in size_remaining, at pointer-query.cc:875
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110043 Bug ID: 110043 Summary: ice in size_remaining, at pointer-query.cc:875 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: --- For this C source code: __int128 g_116_1; char g_521[][8]; func_24() { for (; g_116_1 >= 0;) g_521[g_116_1][g_116_1] &= 0; } compiled by recent gcc, does this: $ ~/gcc/results/bin/gcc -c -w -O1 bug927.c during GIMPLE pass: waccess bug927.c: In function ‘func_24’: bug927.c:3:1: internal compiler error: in size_remaining, at pointer-query.cc:875 3 | func_24() { | ^~~ 0xd1823d access_ref::size_remaining(generic_wide_int >*) const ../../trunk.year/gcc/pointer-query.cc:875 The bug seems to exist since sometime before 20220515.
[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 --- Comment #10 from Jeffrey A. Law --- Created attachment 55218 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55218&action=edit (Incomplete) Patch
[Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041 --- Comment #4 from Jeffrey A. Law --- Patch was for a different problem. Sorry.
[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 --- Comment #9 from Jeffrey A. Law --- Weird, I don't see the attachment either. I'll extract & upload it again. WRT costing. fwprop and combine will both query the target rtx costs and will reject when the target costing model indicates the change isn't actually profitable. As you'd noted before, combine will internally transform a sign/zero extension into a pair of shifts. The whole point of that internal canonicalization is to expose cases where the shifts can combine with other nearby operations. So there's no significant risk to detecting and creating the extension form earlier.
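The shift-pair form of an extension mentioned above can be written out in C as a rough illustration. The function names are hypothetical, and the signed variants rely on GCC's documented implementation-defined behaviour (modulo conversion to signed types and arithmetic right shift of negative values), so this is a sketch for GCC-like compilers rather than strictly portable code.

```c
#include <stdint.h>

/* Zero extension of the low 8 bits, written as a pair of shifts. */
uint32_t zext8_shifts(uint32_t x)
{
    return (x << 24) >> 24;
}

/* The same zero extension, written as a narrowing cast. */
uint32_t zext8_cast(uint32_t x)
{
    return (uint8_t)x;
}

/* Sign extension of the low 8 bits as a shift pair.
   Assumes arithmetic right shift on signed values (true for GCC). */
int32_t sext8_shifts(uint32_t x)
{
    return (int32_t)(x << 24) >> 24;
}

/* The same sign extension, written as a narrowing cast.
   Assumes modulo conversion to int8_t (true for GCC). */
int32_t sext8_cast(uint32_t x)
{
    return (int8_t)x;
}
```

Both spellings compute the same value, which is why recognizing the shift pair as an extension earlier does not change semantics; whether it is profitable is left to the target costs, as the comment says.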
[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822 Matthias Kretz (Vir) changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #6 from Matthias Kretz (Vir) --- Resolved on all branches.
[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822 --- Comment #5 from CVS Commits --- The releases/gcc-11 branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:39a60f2d5f7bf6806a4c4d7d1f52f139e157e01a commit r11-10835-g39a60f2d5f7bf6806a4c4d7d1f52f139e157e01a Author: Matthias Kretz Date: Fri May 26 12:23:44 2023 +0200 libstdc++: Correct NTTP and simd_mask ctor call Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109822 * include/experimental/bits/simd.h (to_native): Use int NTTP as specified in PTS2. (to_compatible): Likewise. Add missing tag to call mask generator ctor. * testsuite/experimental/simd/pr109822_cast_functions.cc: New test. (cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)
[Bug testsuite/52641] Test cases fail for 16-bit int targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52641 --- Comment #21 from CVS Commits --- The master branch has been updated by Georg-Johann Lay : https://gcc.gnu.org/g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8 commit r14-1418-ge4c986fde56a6248f8fbe6cf0704e1da34b055d8 Author: Georg-Johann Lay Date: Tue May 30 22:04:57 2023 +0200 testsuite/52641: Fix more of implicit int=32 assumption fallout. gcc/testsuite/ PR testsuite/52641 * gcc.dg/torture/pr107451.c: Require int32plus. * gcc.dg/torture/pr108574-3.c: Use __INT32_TYPE__ instead of int. * gcc.dg/torture/pr109940.c: Use __INTPTR_TYPE__ instead of long. * gcc.dg/torture/pr95248.c: Require size24plus. * gcc.dg/torture/pr95295-3.c: Use var_* with at least 32 bits int. * gcc.dg/torture/pr98640.c: Cast to __INT32_TYPE__ instead of int. * gcc.dg/tree-ssa/pr103771.c: Use int with at least 32 bits.
[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822 --- Comment #4 from CVS Commits --- The releases/gcc-12 branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:467887d5750d03d438ab704437b2c5e5da78497e commit r12-9666-g467887d5750d03d438ab704437b2c5e5da78497e Author: Matthias Kretz Date: Fri May 26 12:23:44 2023 +0200 libstdc++: Correct NTTP and simd_mask ctor call Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109822 * include/experimental/bits/simd.h (to_native): Use int NTTP as specified in PTS2. (to_compatible): Likewise. Add missing tag to call mask generator ctor. * testsuite/experimental/simd/pr109822_cast_functions.cc: New test. (cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)
[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #50 from Oleg Endo --- Actually, let's take any further discussion of shift patterns to PR 54089.
[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #49 from Oleg Endo --- (In reply to Alexander Klepikov from comment #48) > I made tests (including *.c files from GCC testsuite) and everything looks > fine for now. But I'm still afraid that pattern for 'ashrsi3_libcall_expand' > is too wide. It is possible to narrow it down as much as possible by adding > distinct attribute and set when emitting 'ashrsi3_libcall_collapsed' and > then check it and fail if not set: > For this kind of change, the whole GCC test suite needs to be run for at least big/little -m2,-m4 variants. +(define_insn_and_split "ashrsi3_libcall_expand" + [(parallel [(set (match_operand:SI 0 "arith_reg_dest") + (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand") + (match_operand:SI 2 "const_int_operand")) + )(clobber (reg:SI T_REG)) + (clobber (reg:SI PR_REG)) + ])] The 'parallel' construct looks strange.
[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 --- Comment #2 from Andrew Pinski --- Here is a bitfield testcase which shows this was a latent issue: ``` struct f { unsigned t:3; unsigned t1:4; }; unsigned f2(struct f); unsigned f1(int t, struct f y) { int tt = 0; if(t) tt = y.t1; return tt; } ``` We should produce: ``` ubfx w8, w1, #3, #4 cmp w0, #0 csel w0, wzr, w8, eq ret ``` But currently produces: ``` cbz w0, .L3 ubfx x0, x1, 3, 4 ret .L3: mov w0, 0 ret ``` The IR is similar too: ``` (insn 11 10 12 3 (set (subreg:DI (reg:QI 96) 0) (zero_extract:DI (subreg:DI (reg/v:SI 95 [ y ]) 0) (const_int 4 [0x4]) (const_int 3 [0x3]))) "/app/example.cpp":12:11 832 {*extzvdi} (expr_list:REG_DEAD (reg/v:SI 95 [ y ]) (nil))) (insn 12 11 24 3 (set (reg:SI 93 [ ]) (zero_extend:SI (reg:QI 96))) "/app/example.cpp":13:10 146 {*zero_extendqisi2_aarch64} (expr_list:REG_DEAD (reg:QI 96) (nil))) ```
[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |14.0
[Bug rtl-optimization/110042] [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 --- Comment #1 from Andrew Pinski --- I am still looking into this. This is definitely a latent bug and can maybe even be reproduced with some bitfield extractions too.
[Bug rtl-optimization/110042] New: [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110042 Bug ID: 110042 Summary: [14 Regression] missed cmov optimization after r14-1014-gc5df248509b489364c573e8 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Target: aarch64-linux-gnu Take: ``` unsigned f1(int t, int t1) { int tt = 0; if(t) tt = (t1&0x8)!=0; return tt; } ``` On aarch64 we should produce: ``` cmp w0, 0 ubfx x0, x1, 3, 1 csel w0, w0, wzr, ne ret ``` But on the trunk we get: ``` cbz w0, .L3 ubfx x0, x1, 3, 1 ret .p2align 2,,3 .L3: mov w0, 0 ret ``` The difference in the IR is: old: ``` (insn 11 10 12 3 (set (reg:SI 97) (lshiftrt:SI (reg/v:SI 96 [ t1 ]) (const_int 3 [0x3]))) "/app/example.cpp":7:18 782 {*aarch64_lshr_sisd_or_int_si3} (expr_list:REG_DEAD (reg/v:SI 96 [ t1 ]) (nil))) (insn 12 11 25 3 (set (reg:SI 94 [ ]) (and:SI (reg:SI 97) (const_int 1 [0x1]))) "/app/example.cpp":7:18 533 {andsi3} (expr_list:REG_DEAD (reg:SI 97) (nil))) ``` new: ``` (insn 11 10 12 3 (set (subreg:DI (reg:SI 97) 0) (zero_extract:DI (subreg:DI (reg/v:SI 96 [ t1 ]) 0) (const_int 1 [0x1]) (const_int 3 [0x3]))) "/app/example.cpp":7:18 832 {*extzvdi} (expr_list:REG_DEAD (reg/v:SI 96 [ t1 ]) (nil))) (insn 12 11 24 3 (set (reg:SI 94 [ ]) (reg:SI 97)) "/app/example.cpp":8:10 64 {*movsi_aarch64} (expr_list:REG_DEAD (reg:SI 97) (nil))) ``` noce_try_cmove_arith handles the old one but not the new one for some reason: ``` if-conversion succeeded through noce_try_cmove_arith ```
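As a reading aid, the branch-free code that the report expects can be mirrored in source form. The sketch below keeps the reported `f1` and adds a hypothetical `f1_select` that spells out the bit extract and the select as separate steps; it only illustrates that the two forms compute the same result and says nothing about how if-conversion itself works.

```c
/* The function from the report: a conditional single-bit test. */
unsigned f1(int t, int t1)
{
    int tt = 0;
    if (t)
        tt = (t1 & 0x8) != 0;
    return tt;
}

/* Hypothetical branch-free spelling of the same computation. */
unsigned f1_select(int t, int t1)
{
    /* Extract bit 3 unconditionally (the role of the bitfield extract). */
    unsigned bit = ((unsigned)t1 >> 3) & 1u;
    /* Choose between the extracted bit and zero (the role of the select). */
    return t != 0 ? bit : 0u;
}
```

Because the extract has no side effects, computing it before the condition is checked is always valid here, which is what makes the conditional-select form possible.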
[Bug tree-optimization/50286] Missed optimization, fails to propagate bool
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50286 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED Target Milestone|--- |13.0 --- Comment #3 from Andrew Pinski --- Fixed in GCC 13. Checking profitability of path (backwards): bb:3 (8 insns) bb:5 (latch) Control statement insns: 2 Overall: 6 insns Registering killing_def (path_oracle) i_9 Registering killing_def (path_oracle) _10 Registering killing_def (path_oracle) _11 Checking profitability of path (backwards): [1] Registering jump thread: (5, 3) incoming edge; (3, 4) nocopy; path: 5->3->4 SUCCESS Jump threading proved probability of edge 3->4 too small (it is 11.0% (guessed) should be always (guessed))
[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188 --- Comment #11 from Georg-Johann Lay --- Created attachment 55217 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55217&action=edit Proposed patch for postreload.cc to record clobbers of next insn + test case. This patch solves the problem for avr and tests with no additional regressions for avr. -- rtl-optimization/101188: Don't bypass clobbers of some insns that are optimized or are optimization candidates. gcc/ PR rtl-optimization/101188 * postreload.cc (reload_cse_move2add): Record clobbers of next insn using move2add_note_store. gcc/testsuite/ PR rtl-optimization/101188 * gcc.c-torture/execute/pr101188.c: New test.
[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Version|unknown |14.0 Resolution|--- |FIXED Target Milestone|--- |14.0 Target||x86 --- Comment #3 from Uroš Bizjak --- Fixed.
[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041 --- Comment #2 from CVS Commits --- The master branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:2720bbd597f56742a17119dfe80edc2ba86af255 commit r14-1416-g2720bbd597f56742a17119dfe80edc2ba86af255 Author: Uros Bizjak Date: Tue May 30 20:38:20 2023 +0200 i386: Fix misleading identation in i386-expand.cc [PR110041] gcc/ChangeLog: PR target/110041 * config/i386/i386-expand.cc (ix86_expand_vecop_qihi2): Fix misleading identation.
[Bug tree-optimization/58483] missing optimization opportunity for const std::vector compared to std::array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58483 --- Comment #17 from Andrew Pinski --- Here is related reduced testcase: ``` int f(void) { int tt = 100; int t[3] = {10,20,30}; int *t1 = new int[3]; __builtin_memcpy(t1, t, sizeof(t)); for(int *i = t1; i != &t1[3]; i++) tt += *i; delete[] t1; return tt; } ``` Note in the above testcase we can remove the memcpy but not the operator new/delete. This is unlike the original testcase where memcpy is not removed either.
[Bug target/110041] gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041 David Binderman changed: What|Removed |Added CC||uros at gcc dot gnu.org --- Comment #1 from David Binderman --- Adding author of code.
[Bug target/110041] New: gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110041 Bug ID: 110041 Summary: gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: --- I just tried a build of gcc trunk with clang. It said: gcc/config/i386/i386-expand.cc:23394:5: warning: misleading indentation; statement is not part of the previous 'else' [-Wmisleading-indentation] git blame says: 52ff3f7b86 (Uros Bizjak 2023-05-25 19:40:26 +0200 23394) if (code != MULT && op2vec) It might be worth tidying this up.
[Bug debug/63572] [10/11/12/13/14 Regression] ICF breaks user debugging experience
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63572 --- Comment #31 from Jakub Jelinek --- In theory, what we could do (expensive though) is keep the IL of the functions that were ICF merged with the picked up candidate, just mark them in cgraph specially so that e.g. IPA doesn't consider references to functions/vars from the other copies as distinct references, compile those functions right after compiling their chosen ICF winner (or right before it), but don't emit into assembly, instead compare with how the ICF winner and emit just debug info for the other copies after building some mapping between the debug related labels in the different functions. If we compiled it into different code, something bad happened (e.g. some debug counter or similar) and we'd just not emit the debug info for the other copies (like we don't emit it currently for those).
[Bug tree-optimization/110035] Missed optimization for dependent assignment statements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035 --- Comment #3 from Andrew Pinski --- We don't even optimize: ``` struct MyClass { unsigned long long arr[128]; }; [[gnu::noipa]] void sink(void *m); void gg(MyClass &a, MyClass *b) { MyClass c = a; *b = c; sink(b); } ``` As I mentioned there are dups of the above testcase.
[Bug middle-end/106776] Unexpected use-after-free warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106776 --- Comment #6 from Leandro Nini --- Can't reproduce anymore with gcc 13.1.0 Still there in gcc 12.3.0
[Bug tree-optimization/110035] Missed optimization for dependent assignment statements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035 Andrew Pinski changed: What|Removed |Added Keywords||missed-optimization Ever confirmed|0 |1 Severity|normal |enhancement Last reconfirmed||2023-05-30 Status|UNCONFIRMED |NEW --- Comment #2 from Andrew Pinski --- In the case of x86_64, it is just moving the loads across the operator new, I think: vect_SR.14_5 = MEM [(struct MyClass &)obj_2(D)]; vect_SR.15_28 = MEM [(struct MyClass &)obj_2(D) + 16]; vect_SR.16_30 = MEM [(struct MyClass &)obj_2(D) + 32]; _6 = operator new (48); MEM [(struct MyClass2 *)_6] = vect_SR.14_5; MEM [(struct MyClass2 *)_6 + 16B] = vect_SR.15_28; MEM [(struct MyClass2 *)_6 + 32B] = vect_SR.16_30; HandleMyClass2 (_6); [tail call] Other targets is moving across the operator new too: D.14580.__obj = *obj_2(D); _6 = operator new (48); MEM[(struct MyClass2 *)_6].f = D.14580; More obvious Reduced testcase: ``` struct MyClass { unsigned long long arr[128]; }; [[gnu::noipa]] void sink(void *m){} void gg(MyClass &a) { MyClass c = a; MyClass *b = new MyClass; *b = c; sink(b); } ``` There might be a dup of this issue too.
[Bug tree-optimization/101024] Missed min expression at phiopt1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101024 --- Comment #10 from Andrew Pinski --- The only thing left to do to remove minmax_replacement, is the improvement mentioned in PR 95699 (or rather r11-1504-g2e0f4a18bc978c7362 ).
[Bug target/87913] max(n, 1) code generation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87913 --- Comment #7 from CVS Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:45466eecf5ef669164c0922e5be8fd288b144886 commit r14-1412-g45466eecf5ef669164c0922e5be8fd288b144886 Author: Andrew Pinski Date: Tue May 16 14:26:41 2023 -0700 Add a != MIN/MAX_VALUE_CST ? CST-+1 : a to minmax_from_comparison This patch adds the support for match that was implemented for PR 87913 in phiopt. It implements it by adding support to minmax_from_comparison for the check. It uses the range information if available which allows to produce MIN/MAX expression when comparing against the lower/upper bound of the range instead of lower/upper of the type. minmax-20.c is the new testcase which tests the ranges part. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: * fold-const.cc (minmax_from_comparison): Add support for NE_EXPR. * match.pd ((cond (cmp (convert1? x) c1) (convert2? x) c2) pattern): Add ne as a possible cmp. ((a CMP b) ? minmax : minmax pattern): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/minmax-22.c: New test.
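The PR title refers to `max(n, 1)`. For an unsigned value, whose minimum is 0, a not-equal test against that minimum selects the same result as an explicit maximum, which is the kind of equivalence the NE_EXPR support relies on. A minimal sketch with hypothetical function names:

```c
/* Written with a not-equal test against the type's minimum value (0). */
unsigned clamp_ne(unsigned n)
{
    return n != 0 ? n : 1;
}

/* Written as an explicit maximum of n and 1. */
unsigned clamp_max(unsigned n)
{
    return n > 1 ? n : 1;
}
```

The two agree for every input: the only value where `n != 0` is false is 0, and for 0 both return 1; for any other value both return `n`. The commit message extends the same reasoning to the lower or upper bound of a known value range instead of the bound of the type.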
[Bug target/106907] gcc/config/rs6000/rs6000.cc:23155: strange expression ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106907 Jeevitha changed: What|Removed |Added CC||jeevitha at gcc dot gnu.org --- Comment #4 from Jeevitha --- (In reply to Andreas Schwab from comment #3) > Should probably be written as swapped != !BYTES_BIG_ENDIAN. I bootstrapped and regtested; there is no regression with this change.
[Bug target/110040] rs6000 port emits dead mfvsrd instruction for simple test case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110040 Jeevitha changed: What|Removed |Added Target Milestone|--- |14.0 CC||bergner at gcc dot gnu.org, ||meissner at gcc dot gnu.org, ||segher at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |jeevitha at gcc dot gnu.org Keywords||missed-optimization Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2023-05-30 Ever confirmed|0 |1 Target||powerpc64le-linux --- Comment #1 from Jeevitha --- I am working on this.
[Bug target/110040] New: rs6000 port emits dead mfvsrd instruction for simple test case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110040 Bug ID: 110040 Summary: rs6000 port emits dead mfvsrd instruction for simple test case Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jeevitha at gcc dot gnu.org Target Milestone: --- GCC Trunk generates a dead mfvsrd for the following test case. [jeevitha@ltcden2-lp1 ~]$ cat bug.c #include void foo (signed long *dst, vector signed __int128 src) { *dst = (signed long) src[0]; } [jeevitha@ltcden2-lp1 ~]$ gcc bug.c -O2 -mcpu=power9 -S -o bug.s [jeevitha@ltcden2-lp1 ~]$ cat bug.s .file "bug.c" .machine power9 .abiversion 2 .section".text" .align 2 .p2align 4,,15 .globl foo .type foo, @function foo: .LFB0: .cfi_startproc mfvsrd 11,34 #dead instruction mfvsrld 10,34 std 10,0(3) blr
[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822 --- Comment #3 from CVS Commits --- The releases/gcc-13 branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:717a14e727bce409ac7e7f10c413530e704f4ca7 commit r13-7393-g717a14e727bce409ac7e7f10c413530e704f4ca7 Author: Matthias Kretz Date: Fri May 26 12:23:44 2023 +0200 libstdc++: Correct NTTP and simd_mask ctor call Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109822 * include/experimental/bits/simd.h (to_native): Use int NTTP as specified in PTS2. (to_compatible): Likewise. Add missing tag to call mask generator ctor. * testsuite/experimental/simd/pr109822_cast_functions.cc: New test. (cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)
[Bug target/110027] Misaligned vector store on detect_stack_use_after_return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110027 Jack O'Connor changed: What|Removed |Added CC||oconnor663 at gmail dot com --- Comment #3 from Jack O'Connor --- Thanks to Andrew Pinski's comment about -fstack-protector-strong, I can now reproduce this issue on Godbolt: https://godbolt.org/z/47a695sWY. So the minimal set of flags to reproduce on most distros (other than Arch Linux) is: -mavx512f -fsanitize=address -fstack-protector-strong
[Bug target/110026] [Bug] 5% performance drop on important benchmark after r260951.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110026 Andrew Pinski changed: What|Removed |Added Keywords||ra --- Comment #3 from Andrew Pinski --- (In reply to d_vampile from comment #2) > O0 does miss a lot of optimizations. However, for the problem I mentioned, > the GPRs used before and the FP registers after modification are used. When > vectorization is not applicable, the X0 register is faster than the D0 > register. Is it appropriate to modify here? Well the generic_tunings has: { 4, /* load_int. */ 4, /* store_int. */ 4, /* load_fp. */ 4, /* store_fp. */ 4, /* load_pred. */ 4 /* store_pred. */ }, /* memmov_cost. */ Which says the load/store of fp has the same cost as ints (gprs) (this is the same as a53's tuning). If anything, that should be changed. Or you should use -mcpu=* where applicable.
[Bug target/110023] 10% performance drop on important benchmark after r247544.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110023 --- Comment #2 from d_vampile --- (In reply to Andrew Pinski from comment #1) > This is almost definitely an aarch64 cost model issue ... Do you mean that the vectorized cost_model of the underlying hardware causes the policy of not peeling the loop after r247544 to be chosen? ? So why does loop peeling result in performance improvements? For the following code, I understand that this is a very standard vectorized effective loop. for (j=0; j
[Bug target/110026] [Bug] 5% performance drop on important benchmark after r260951.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110026 --- Comment #2 from d_vampile --- (In reply to Jakub Jelinek from comment #1) > Note, any benchmarking for speed with -O rather than -O2/-O3 is > intentionally missing various optimizations which can greatly improve > performance. O0 does miss a lot of optimizations. However, for the problem I mentioned, the GPRs used before and the FP registers after modification are used. When vectorization is not applicable, the X0 register is faster than the D0 register. Is it appropriate to modify here?
[Bug libstdc++/109822] Converting std::experimental::simd masks yields an error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109822 --- Comment #2 from CVS Commits --- The master branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:668d43502f465d48adbc1fe2956b979f36657e5f commit r14-1409-g668d43502f465d48adbc1fe2956b979f36657e5f Author: Matthias Kretz Date: Fri May 26 12:23:44 2023 +0200 libstdc++: Correct NTTP and simd_mask ctor call Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109822 * include/experimental/bits/simd.h (to_native): Use int NTTP as specified in PTS2. (to_compatible): Likewise. Add missing tag to call mask generator ctor. * testsuite/experimental/simd/pr109822_cast_functions.cc: New test.
[Bug target/107172] [13 Regression] wrong code with "-O1 -ftree-vrp" on x86_64-linux-gnu since r13-1268-g8c99e307b20c502e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107172 --- Comment #51 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:69185294f322dd53d4e1592115014c5488302e2e commit r14-1405-g69185294f322dd53d4e1592115014c5488302e2e Author: Roger Sayle Date: Tue May 30 14:40:50 2023 +0100 PR target/107172: Avoid "unusual" MODE_CC comparisons in simplify-rtx.cc I believe that a better (or supplementary) fix to PR target/107172 is to avoid producing incorrect (but valid) RTL in simplify_const_relational_operation when presented with questionable (obviously invalid) expressions, such as those produced during combine. Just as with the "first do no harm" clause with the Hippocratic Oath, simplify-rtx (probably) shouldn't unintentionally transform invalid RTL expressions, into incorrect (non-equivalent) but valid RTL that may be inappropriately recognized by recog. In this specific case, many GCC backends represent their flags register via MODE_CC, whose representation is intentionally "opaque" to the middle-end. The only use of MODE_CC comprehensible to the middle-end's RTL optimizers is relational comparisons between the result of a COMPARE rtx (op0) and zero (op1). Any other uses of MODE_CC should be left alone, and some might argue indicate representational issues in the backend. In practice, CPUs occasionally have numerous instructions that affect the flags register(s) other than comparisons [AVR's setc, powerpc's mtcrf, x86's clc, stc and cmc and x86_64's ptest that sets C and Z flags in non-obvious ways, c.f. PR target/109973]. Currently care has to be taken, wrapping these in UNSPEC, to avoid combine inappropriately merging flags setters with flags consumers (such as conditional jumps). It's safer to teach simplify_const_relational_operation not to modify expressions that it doesn't understand/recognize. 
2023-05-30 Roger Sayle gcc/ChangeLog PR target/107172 * simplify-rtx.cc (simplify_const_relational_operation): Return early if we have a MODE_CC comparison that isn't a COMPARE against const0_rtx.
[Bug target/109973] [13/14 Regression] Wrong code for AVX2 since 13.1 by combining VPAND and VPTEST since r13-2006-ga56c1641e9d25e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109973 --- Comment #6 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:69185294f322dd53d4e1592115014c5488302e2e commit r14-1405-g69185294f322dd53d4e1592115014c5488302e2e Author: Roger Sayle Date: Tue May 30 14:40:50 2023 +0100 PR target/107172: Avoid "unusual" MODE_CC comparisons in simplify-rtx.cc I believe that a better (or supplementary) fix to PR target/107172 is to avoid producing incorrect (but valid) RTL in simplify_const_relational_operation when presented with questionable (obviously invalid) expressions, such as those produced during combine. Just as with the "first do no harm" clause with the Hippocratic Oath, simplify-rtx (probably) shouldn't unintentionally transform invalid RTL expressions, into incorrect (non-equivalent) but valid RTL that may be inappropriately recognized by recog. In this specific case, many GCC backends represent their flags register via MODE_CC, whose representation is intentionally "opaque" to the middle-end. The only use of MODE_CC comprehensible to the middle-end's RTL optimizers is relational comparisons between the result of a COMPARE rtx (op0) and zero (op1). Any other uses of MODE_CC should be left alone, and some might argue indicate representational issues in the backend. In practice, CPUs occasionally have numerous instructions that affect the flags register(s) other than comparisons [AVR's setc, powerpc's mtcrf, x86's clc, stc and cmc and x86_64's ptest that sets C and Z flags in non-obvious ways, c.f. PR target/109973]. Currently care has to be taken, wrapping these in UNSPEC, to avoid combine inappropriately merging flags setters with flags consumers (such as conditional jumps). It's safer to teach simplify_const_relational_operation not to modify expressions that it doesn't understand/recognize. 
2023-05-30 Roger Sayle gcc/ChangeLog PR target/107172 * simplify-rtx.cc (simplify_const_relational_operation): Return early if we have a MODE_CC comparison that isn't a COMPARE against const0_rtx.
[Bug target/106887] ICE in extract_insn, at recog.cc:2791 since r13-2111-g6910cad55ffc330d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106887 Martin Jambor changed: What|Removed |Added Resolution|--- |FIXED CC||jamborm at gcc dot gnu.org Status|NEW |RESOLVED --- Comment #5 from Martin Jambor --- This issue has been fixed (I cannot reproduce it with GCC 13 nor master).
[Bug target/110039] [14 Regression] FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tw[0-9]+ 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110039 ktkachov at gcc dot gnu.org changed: What|Removed |Added Target Milestone|--- |14.0
[Bug target/110039] New: FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tw[0-9]+ 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110039 Bug ID: 110039 Summary: FAIL: gcc.target/aarch64/rev16_2.c scan-assembler-times rev16\\tw[0-9]+ 2 Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ktkachov at gcc dot gnu.org Target Milestone: --- Target: aarch64 I think after g:d8545fb2c71683f407bfd96706103297d4d6e27b the test regresses on aarch64. We now generate: __rev16_32_alt: rev w0, w0 ror w0, w0, 16 ret __rev16_32: rev w0, w0 ror w0, w0, 16 ret whereas before it was: __rev16_32_alt: rev16 w0, w0 ret __rev16_32: rev16 w0, w0 ret I think the GIMPLE at expand time is better and the RTL that it tries to match is simpler: Failed to match this instruction: (set (reg:SI 95) (rotate:SI (bswap:SI (reg:SI 96)) (const_int 16 [0x10]))) So maybe it's simply a matter of adding that pattern to aarch64.md. Anyway, filing this here to track the regression
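To illustrate why the unmatched RTL above is a candidate for a single rev16: rotating a byte-swapped 32-bit value by 16 bits gives the same result as swapping the bytes within each 16-bit half. The following is only a minimal C++ sketch of that identity. The function names (rev16_32, bswap32, rotate16, check_rev16_identity) are made up for illustration and are not taken from GCC or from rev16_2.c.

```cpp
#include <cassert>
#include <cstdint>

// Swap the two bytes inside each 16-bit half of x: ABCD -> BADC.
// This models what a 32-bit rev16 computes.
constexpr std::uint32_t rev16_32(std::uint32_t x) {
  return ((x & 0xff00ff00u) >> 8) | ((x & 0x00ff00ffu) << 8);
}

// Full 32-bit byte swap: ABCD -> DCBA (models bswap:SI).
constexpr std::uint32_t bswap32(std::uint32_t x) {
  return ((x >> 24) & 0x000000ffu) | ((x >> 8) & 0x0000ff00u) |
         ((x << 8) & 0x00ff0000u) | ((x << 24) & 0xff000000u);
}

// Rotate by 16 bits (models rotate:SI ... (const_int 16)).
// For a 32-bit value, left and right rotation by 16 are the same.
constexpr std::uint32_t rotate16(std::uint32_t x) {
  return (x << 16) | (x >> 16);
}

// Hypothetical self-check: DCBA rotated by 16 is BADC, i.e. rev16.
inline void check_rev16_identity(std::uint32_t x) {
  assert(rotate16(bswap32(x)) == rev16_32(x));
}
```

Calling check_rev16_identity on a handful of values is enough to see the equivalence. Whether the backend should match this form with a new pattern is a separate question that the sketch does not address.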
[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #48 from Alexander Klepikov --- I ran tests (including *.c files from the GCC testsuite) and everything looks fine for now. But I'm still afraid that the pattern for 'ashrsi3_libcall_expand' is too wide. It could be narrowed down by adding a distinct attribute, setting it when emitting 'ashrsi3_libcall_collapsed', and then checking it and failing if it is not set: (define_attr "libcall_collapsed" "ashrsi3,nil" (const_string "nil")) (define_insn "ashrsi3_libcall_collapsed" [(set (match_operand:SI 0 "arith_reg_dest" "=r") (ashiftrt:SI (match_operand:SI 1 "arith_reg_operand" "0") (match_operand:SI 2 "const_int_operand"))) (clobber (reg:SI T_REG)) (clobber (reg:SI PR_REG))] "TARGET_SH1" "OOPS" [(set_attr "type" "dyn_shift") (set_attr "libcall_collapsed" "ashrsi3") (set_attr "needs_delay_slot" "yes")]) if (get_attr_libcall_collapsed(insn) != LIBCALL_COLLAPSED_ASHRSI3) return false; It will be super safe then, but a little bit ugly.
[Bug c++/110037] GCC accepts private member access of enclosing class through friend function of inner class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110037 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Patrick Palka --- dup *** This bug has been marked as a duplicate of bug 106756 ***
[Bug c++/106756] [CWG1699] Overbroad friendship for nested classes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106756 Patrick Palka changed: What|Removed |Added CC||jlame646 at gmail dot com --- Comment #5 from Patrick Palka --- *** Bug 110037 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/110035] Missed optimization for dependent assignment statements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035 --- Comment #1 from Richard Biener --- Ick - convoluted C++. We end up with void ff (struct MyClass & obj) { vector(2) long unsigned int vect_SR.16; vector(2) long unsigned int vect_SR.15; vector(2) long unsigned int vect_SR.14; void * _6; [local count: 1073741824]: vect_SR.14_5 = MEM [(struct MyClass &)obj_2(D)]; vect_SR.15_28 = MEM [(struct MyClass &)obj_2(D) + 16]; vect_SR.16_30 = MEM [(struct MyClass &)obj_2(D) + 32]; _6 = operator new (48); MEM [(struct MyClass2 *)_6] = vect_SR.14_5; MEM [(struct MyClass2 *)_6 + 16B] = vect_SR.15_28; MEM [(struct MyClass2 *)_6 + 32B] = vect_SR.16_30; HandleMyClass2 (_6); [tail call] and the issue is that 'operator new (48)' can alter what 'obj' points to, so we cannot move the loads across the call and we get spilling. There is no inter-procedural analysis in GCC that would tell us that 'obj_2(D)' (the MyClass & obj argument of ff) does not point to an object that did not escape. In fact 'ff' has global visibility and it might have other callers. If you add -fwhole-program then you get the function inlined to main and main: .LFB652: .cfi_startproc subq $8, %rsp .cfi_def_cfa_offset 16 movl $48, %edi call _Znwm movq $0, (%rax) movq %rax, %rdi movq $0, 8(%rax) movq $0, 16(%rax) movq $0, 24(%rax) movq $0, 32(%rax) movq $0, 40(%rax) call _Z14HandleMyClass2Pv xorl %eax, %eax addq $8, %rsp .cfi_def_cfa_offset 8 ret (not using vectors because 'main' is considered cold). Do you cite an inline copy of ff() for clang?
[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 Richard Biener changed: What|Removed |Added Last reconfirmed||2023-05-30 Target Milestone|--- |14.0 Ever confirmed|0 |1 Priority|P3 |P1 Status|UNCONFIRMED |NEW CC||lili.cui at intel dot com --- Comment #1 from Richard Biener --- Probably best to limit the values to reassoc-width by adding the appropriate IntegerRange attribute in params.opt IntegerRange(0, 256) maybe?
[Bug target/110036] [12/13 regression] riscv_asan_shadow_offset mismatch with libsanitizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110036 Andreas Schwab changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #4 from Andreas Schwab --- Fixed on all branches.
[Bug target/110036] [12/13 regression] riscv_asan_shadow_offset mismatch with libsanitizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110036 --- Comment #3 from CVS Commits --- The releases/gcc-12 branch has been updated by Andreas Schwab : https://gcc.gnu.org/g:2910660f00c74d12d17e3114870e287804a3332c commit r12-9661-g2910660f00c74d12d17e3114870e287804a3332c Author: Andreas Schwab Date: Sun May 28 12:08:22 2023 +0200 riscv: update riscv_asan_shadow_offset gcc/ PR target/110036 * config/riscv/riscv.cc (riscv_asan_shadow_offset): Update to match libsanitizer.
[Bug libstdc++/86880] Incorrect mersenne_twister_engine equality comparison between rotated states
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86880 --- Comment #3 from Jonathan Wakely --- We have the same problem with std::subtract_with_carry_engine. Its equality operator doesn't work for rotated states.
[Bug libstdc++/60441] Incorrect textual representation for std::mersenne_twister_engine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60441 --- Comment #3 from Jonathan Wakely --- We have the same problem with std::subtract_with_carry_engine.
[Bug libstdc++/86880] Incorrect mersenne_twister_engine equality comparison between rotated states
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86880 --- Comment #2 from Jonathan Wakely --- This would fix the equality operator to correctly compare rotated states: --- a/libstdc++-v3/include/bits/random.h +++ b/libstdc++-v3/include/bits/random.h @@ -601,8 +601,37 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION friend bool operator==(const mersenne_twister_engine& __lhs, const mersenne_twister_engine& __rhs) - { return (std::equal(__lhs._M_x, __lhs._M_x + state_size, __rhs._M_x) - && __lhs._M_p == __rhs._M_p); } + { + const _UIntType* const __lx = __lhs._M_x; + const _UIntType* const __rx = __rhs._M_x; + size_t __lp = __lhs._M_p % state_size; + size_t __rp = __rhs._M_p % state_size; + size_t __n1, __n2; + if (__lp > __rp) + { + __n1 = state_size - __lp; + __n2 = __lp - __rp; + } + else + { + __n1 = state_size - __rp; + __n2 = __rp - __lp; + } + if (!std::equal(__lx + __lp, __lx + __lp + __n1, __rx + __rp)) + return false; + if (__n1 == state_size) // i.e. __lhs._M_p == 0 && __rhs._M_p == 0 + return true; + __lp = (__lp + __n1) % state_size; + __rp = (__rp + __n1) % state_size; + if (!std::equal(__lx + __lp, __lx + __lp + __n2, __rx + __rp)) + return false; + __lp = (__lp + __n2) % state_size; + __rp = (__rp + __n2) % state_size; + size_t __n3 = state_size - __n1 - __n2; + if (!std::equal(__lx + __lp, __lx + __lp + __n3, __rx + __rp)) + return false; + return true; + } /** * @brief Inserts the current state of a % mersenne_twister_engine But the testcase above still fails, because mteB and mteA have different content in the array, even though they produce the same sequence of numbers.
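The idea behind the patch above can be sketched independently of libstdc++: treat each state array as a circular buffer and compare the two sequences starting from each engine's own position, rather than comparing the raw arrays and positions. The following is a simplified illustration only. The name rotated_equal and its signature are hypothetical, and it uses element-wise modular indexing instead of the three std::equal calls in the patch.

```cpp
#include <cassert>
#include <cstddef>

// Compare two circular buffers of length n as sequences.
// lx is read starting at index lp, rx starting at index rp; both wrap around.
// The positions are reduced modulo n first, mirroring the patch's
// `_M_p % state_size`, since a position may equal n.
template <typename T>
bool rotated_equal(const T* lx, std::size_t lp,
                   const T* rx, std::size_t rp, std::size_t n) {
  if (n == 0)
    return true;  // two empty buffers are trivially equal
  lp %= n;
  rp %= n;
  for (std::size_t i = 0; i < n; ++i) {
    if (lx[(lp + i) % n] != rx[(rp + i) % n])
      return false;
  }
  return true;
}
```

For example, {1, 2, 3, 4} read from position 0 and {3, 4, 1, 2} read from position 2 compare equal under this function, although the arrays differ element by element. As the comment notes, this kind of comparison alone does not make the original testcase pass, because the two engines there hold different array contents even though they generate the same numbers.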
[Bug target/110036] [12/13 regression] riscv_asan_shadow_offset mismatch with libsanitizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110036 --- Comment #2 from CVS Commits --- The releases/gcc-13 branch has been updated by Andreas Schwab : https://gcc.gnu.org/g:acf4fac6c5d14b30dca6cbde75f8b7db89850e04 commit r13-7389-gacf4fac6c5d14b30dca6cbde75f8b7db89850e04 Author: Andreas Schwab Date: Sun May 28 12:08:22 2023 +0200 riscv: update riscv_asan_shadow_offset gcc/ PR target/110036 * config/riscv/riscv.cc (riscv_asan_shadow_offset): Update to match libsanitizer.
[Bug c/109999] [OpenMP] Bogus error message: talks about '"#pragma omp" clause' instead of '"target" clause
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109999 Tobias Burnus changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #4 from Tobias Burnus --- FIXED for mainline/GCC 14.
[Bug c/109999] [OpenMP] Bogus error message: talks about '"#pragma omp" clause' instead of '"target" clause
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109999 --- Comment #3 from CVS Commits --- The master branch has been updated by Tobias Burnus : https://gcc.gnu.org/g:a899401404186843f38462c8fc9de733f19ce864 commit r14-1404-ga899401404186843f38462c8fc9de733f19ce864 Author: Tobias Burnus Date: Tue May 30 12:49:09 2023 +0200 OpenMP: Improve C/C++ parsing error message [PR109999] Replace error: expected '#pragma omp' clause before ... by the more readable/clearer error: expected an OpenMP clause before ... (And likewise for '#pragma acc' and OpenACC.) PR c/109999 gcc/c/ChangeLog: * c-parser.cc (c_parser_oacc_all_clauses, c_parser_omp_all_clauses): Improve error wording. gcc/cp/ChangeLog: * parser.cc (cp_parser_oacc_all_clauses, cp_parser_omp_all_clauses): Improve error wording. gcc/testsuite/ChangeLog: * c-c++-common/goacc/asyncwait-1.c: Update dg-error. * c-c++-common/goacc/clauses-fail.c: Likewise. * c-c++-common/goacc/data-2.c: Likewise. * c-c++-common/gomp/declare-target-2.c: Likewise. * c-c++-common/gomp/directive-1.c: Likewise. * g++.dg/goacc/data-1.C: Likewise.
[Bug c/109827] Pointer/integer mismatch in ?: not covered by -Wint-conversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109827 Eric Gallager changed: What|Removed |Added Last reconfirmed||2023-05-30 Status|UNCONFIRMED |NEW CC||egallager at gcc dot gnu.org Ever confirmed|0 |1 Blocks||44209 --- Comment #1 from Eric Gallager --- Confirmed. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44209 [Bug 44209] [meta-bug] Some warnings are not linked to diagnostics options