[Bug fortran/103434] New: Pointer subobject does not show to correct memory location
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103434

            Bug ID: 103434
           Summary: Pointer subobject does not show to correct memory location
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: baradi09 at gmail dot com
  Target Milestone: ---

Based on the discussion on Fortran Discourse (https://fortran-lang.discourse.group/t/is-the-section-of-a-pointer-to-an-array-a-valid-pointer/2331), I'd assume that the following code is standard conforming. However, the result with gfortran seems to be incorrect.

*** Code:

module test
  implicit none

  type :: pointer_wrapper
    real, pointer :: ptr(:) => null()
  end type pointer_wrapper

contains

  subroutine store_pointer(wrapper, ptr)
    type(pointer_wrapper), intent(out) :: wrapper
    real, pointer, intent(in) :: ptr(:)

    wrapper%ptr => ptr

  end subroutine store_pointer

  subroutine use_pointer(wrapper)
    type(pointer_wrapper), intent(inout) :: wrapper

    wrapper%ptr(:) = wrapper%ptr + 1.0

  end subroutine use_pointer

end module test

program testprog
  use test
  implicit none

  real, allocatable, target :: data(:,:)
  real, pointer :: ptr(:,:)
  type(pointer_wrapper) :: wrapper
  integer :: ii

  allocate(data(4, 2))
  ptr => data(:,:)
  data(:,:) = 0.0
  do ii = 1, size(data, dim=2)
    print *, "#", ii
    print *, "BEFORE ", ii, maxval(ptr(:,ii))
    call store_pointer(wrapper, ptr(:,ii))
    print *, "BETWEEN", ii, maxval(ptr(:,ii))
    call use_pointer(wrapper)
    print *, "AFTER ", ii, maxval(ptr(:,ii))
  end do

end program testprog

*** Output:

 # 1
 BEFORE 1 0.
 BETWEEN 1 0.
 AFTER 1 1.
 # 2
 BEFORE 2 1.
 BETWEEN 2 1.
 AFTER 2 1.

*** Expected output:

 # 1
 BEFORE 1 0.
 BETWEEN 1 0.
 AFTER 1 1.
 # 2
 BEFORE 2 0.
 BETWEEN 2 0.
 AFTER 2 1.

It seems as if the pointer stored by store_pointer refers to a memory region larger than it should be, so that data outside of the actual stride is also modified. Intel and NAG deliver the expected output.
[Bug ipa/103432] [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103432

Richard Biener changed:

What      |Removed  |Added
Priority  |P3       |P1
[Bug middle-end/103431] [12 Regression] wrong code with -O -fno-tree-bit-ccp -fno-tree-dominator-opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103431

Richard Biener changed:

What              |Removed  |Added
Priority          |P3       |P1
Keywords          |         |needs-bisection
Target Milestone  |---      |12.0
[Bug target/103271] ICE in assign_stack_temp_for_type with -ftrivial-auto-var-init=pattern and VLAs and -mno-strict-align on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103271

--- Comment #7 from rguenther at suse dot de ---
On Fri, 26 Nov 2021, wilson at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103271
>
> Jim Wilson changed:
>
> What  |Removed  |Added
> CC    |         |wilson at gcc dot gnu.org
>
> --- Comment #5 from Jim Wilson ---
> SiFive doesn't support -mno-strict-align so I've never tested it. I doubt
> that
> it works correctly, i.e. I doubt that it optimizes as intended. I've
> mentioned
> this to other RVI members, but there hasn't been anyone other than SiFive
> actively working on upstream gcc so I don't think anyone ever looked at it.
> It
> shouldn't give an ICE though.
>
> Looking at this, it appears to be another "if only we had a movti pattern"
> issue.
>
> In expand_DEFERRED_INIT in internal-fn.c, in the reg_lhs == TRUE case, there
> is
> a test
> && have_insn_for (SET, var_mode))
> which fails because var_mode is TImode and we don't have a movti pattern. The
> code calls build_zero_cst which returns a constructor with an array type. We
> then call expand_assignment which gets confused as it doesn't know the size of
> the array it is copying.

That seems to be the bug - in this path we shouldn't ever create an entity with VLA size since we do know the actual size. But it all is a bit awkward.
[Bug target/102768] [feature request] Add compiler support for aarch64 shadow call stack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102768 --- Comment #6 from ashimida --- RFC,v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585496.html
[Bug target/103271] ICE in assign_stack_temp_for_type with -ftrivial-auto-var-init=pattern and VLAs and -mno-strict-align on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103271 --- Comment #6 from Jim Wilson --- See also bug 103302 which can also be fixed by adding a movti pattern.
[Bug target/103302] wrong code with -fharden-compares
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103302 --- Comment #4 from Jim Wilson --- See also bug 103271 which can also be fixed by adding a movti pattern.
[Bug target/103271] ICE in assign_stack_temp_for_type with -ftrivial-auto-var-init=pattern and VLAs and -mno-strict-align on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103271

Jim Wilson changed:

What  |Removed  |Added
CC    |         |wilson at gcc dot gnu.org

--- Comment #5 from Jim Wilson ---
SiFive doesn't support -mno-strict-align so I've never tested it. I doubt that it works correctly, i.e. I doubt that it optimizes as intended. I've mentioned this to other RVI members, but there hasn't been anyone other than SiFive actively working on upstream gcc so I don't think anyone ever looked at it. It shouldn't give an ICE though.

Looking at this, it appears to be another "if only we had a movti pattern" issue.

In expand_DEFERRED_INIT in internal-fn.c, in the reg_lhs == TRUE case, there is a test
    && have_insn_for (SET, var_mode))
which fails because var_mode is TImode and we don't have a movti pattern. The code calls build_zero_cst which returns a constructor with an array type. We then call expand_assignment which gets confused as it doesn't know the size of the array it is copying.

However, if we had a movti pattern, then the code computes the size of the array, and creates a VIEW_CONVERT_EXPR to document the array size before calling expand_assignment. So it looks like it would work if we had a movti pattern. I verified that adding a dummy movti pattern makes the ICE go away.
[Bug target/103302] wrong code with -fharden-compares
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103302 --- Comment #3 from Jim Wilson --- Maybe the register allocator should remove clobbers of pseudos, instead of turning them into clobbers of hard register pairs. That would eliminate the ambiguity after register allocation. It is also true that we don't need hard reg clobbers. The clobbers are only there for tracking pseudo reg subregs.
[Bug target/103433] ICE in convert_move, at expr.c:219
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103433

Andrew Pinski changed:

What              |Removed           |Added
Status            |UNCONFIRMED       |NEW
Keywords          |                  |ice-on-valid-code
Host              |x86_64-linux      |
Last reconfirmed  |                  |2021-11-26
Ever confirmed    |0                 |1
CC                |                  |pinskia at gcc dot gnu.org
Component         |c                 |target
Target            |aarch64-none-elf  |aarch64*-*-*

--- Comment #1 from Andrew Pinski ---
Confirmed on the trunk.
[Bug c/103433] New: ICE in convert_move, at expr.c:219
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103433

            Bug ID: 103433
           Summary: ICE in convert_move, at expr.c:219
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ilyply2006 at hotmail dot com
  Target Milestone: ---

$ cat test.c
#include "arm_sve.h"

__attribute__((noinline)) void test_ldst_1 (svfloat32_t op0, svfloat32x2_t *op1)
{
  *op1 = *(svfloat32x2_t*)&op0;
}

$ ./aarch64-none-elf-gcc -v -save-temps -march=armv8.2-a+sve test.c -O3 -S
Using built-in specs.
COLLECT_GCC=./aarch64-none-elf-gcc
Target: aarch64-none-elf
Configured with: /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/snapshots/gcc/configure SHELL=/bin/sh --with-mpc=/work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu --with-mpfr=/work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu --with-gmp=/work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu --with-gnu-as --with-gnu-ld --disable-libmudflap --enable-lto --enable-shared --without-included-gettext --enable-nls --with-system-zlib --disable-sjlj-exceptions --enable-gnu-unique-object --enable-linker-build-id --disable-libstdcxx-pch --enable-c99 --enable-clocale=gnu --enable-libstdcxx-debug --enable-long-long --with-cloog=no --with-ppl=no --with-isl=no --enable-multilib --enable-fix-cortex-a53-835769 --enable-fix-cortex-a53-843419 --with-arch=armv8-a --enable-threads=no --disable-multiarch --with-newlib --with-build-sysroot= --with-sysroot=/work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/aarch64-none-elf/libc --enable-checking=release --disable-bootstrap --enable-languages=c,c++,lto --prefix=/work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=aarch64-none-elf
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 10.2.1 20201103 (GCC)
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-march=armv8.2-a+sve' '-O3' '-S' '-mlittle-endian' '-mabi=lp64'
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/libexec/gcc/aarch64-none-elf/10.2.1/cc1 -E -quiet -v test.c -march=armv8.2-a+sve -mlittle-endian -mabi=lp64 -O3 -fpch-preprocess -o test.i
ignoring nonexistent directory "/work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/aarch64-none-elf/libc/usr/local/include"
ignoring nonexistent directory "/work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/aarch64-none-elf/libc/usr/include"
#include "..." search starts here:
#include <...> search starts here:
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/lib/gcc/aarch64-none-elf/10.2.1/include
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/lib/gcc/aarch64-none-elf/10.2.1/include-fixed
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/lib/gcc/aarch64-none-elf/10.2.1/../../../../aarch64-none-elf/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-march=armv8.2-a+sve' '-O3' '-S' '-mlittle-endian' '-mabi=lp64'
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/builds/destdir/x86_64-pc-linux-gnu/libexec/gcc/aarch64-none-elf/10.2.1/cc1 -fpreprocessed test.i -quiet -dumpbase test.c -march=armv8.2-a+sve -mlittle-endian -mabi=lp64 -auxbase test -O3 -version -o test.s
GNU C17 (GCC) version 10.2.1 20201103 (aarch64-none-elf)
 compiled by GNU C version 7.5.0, GMP version 4.3.2, MPFR version 3.1.6, MPC version 1.0.3, isl version none
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C17 (GCC) version 10.2.1 20201103 (aarch64-none-elf)
 compiled by GNU C version 7.5.0, GMP version 4.3.2, MPFR version 3.1.6, MPC version 1.0.3, isl version none
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 2cefa28229609aee36b21907b2deb066
during RTL pass: expand
test.c: In function ‘test_ldst_1’:
test.c:5:10: internal compiler error: in convert_move, at expr.c:219
    5 | *op1 = *(svfloat32x2_t*)&op0;
      | ~^~~
0x8606f3 convert_move(rtx_def*, rtx_def*, int)
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/snapshots/gcc/gcc/expr.c:219
0x86773d store_expr(tree_node*, rtx_def*, int, bool, bool)
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/snapshots/gcc/gcc/expr.c:5832
0x867c55 expand_assignment(tree_node*, tree_node*, bool)
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/snapshots/gcc/gcc/expr.c:5516
0x75aed8 expand_gimple_stmt_1
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/snapshots/gcc/gcc/cfgexpand.c:3753
0x75aed8 expand_gimple_stmt
 /work/home/xjin/gcc/arm-gnu-toolchain/abe_build/
[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059

--- Comment #25 from Kewen Lin ---
Status update:
>
> The fusion related flags have been considered in the posted patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html.
> It's still being ping-ed for review since it's posted on Sep. 01.
> One RFC/Patch
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578555.html is also
> posted to see if we can avoid to change implicit option behavior for
> Power8/9.

The patch v3 (https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html) was approved with some additional required adjustments. But the test cases were written and tested on top of the above fusion related patch, so I am holding off on committing it.
[Bug target/102347] "fatal error: target specific builtin not available" with MMA and LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102347

Kewen Lin changed:

What              |Removed      |Added
CC                |             |segher at gcc dot gnu.org,
                  |             |wschmidt at gcc dot gnu.org
Ever confirmed    |0            |1
Status            |UNCONFIRMED  |ASSIGNED
Last reconfirmed  |             |2021-11-26

--- Comment #11 from Kewen Lin ---
Status update: one proposed fix was posted to gcc-patches@ on Sep 28 (https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580357.html). There was some discussion following that, and we eventually agreed that the proposed fix is safe. There are no newer versions of it, so the original one is still being pinged for review.
[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811

Andrew Pinski changed:

What              |Removed      |Added
Status            |UNCONFIRMED  |RESOLVED
Target Milestone  |---          |12.0
Resolution        |---          |FIXED

--- Comment #5 from Andrew Pinski ---
.
[Bug c++/98360] sizeof in template difference between g++/icc and clang++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98360

--- Comment #3 from Andrew Pinski ---
GCC, ICC and MSVC all agree that this is valid code and all produce 4. clang is the only one which rejects it.

Here is an even more reduced testcase:

template <typename T>
struct uintset
{
  T values[1];
  struct traits { };
  struct hash : traits
  {
    int foo () { return sizeof (uintset::values); }
  };
  hash h;
};

uintset<int> s;
int x = s.h.foo ();

If you remove the base class or change it to a non-dependent type, the code is accepted.

The defect reports in this area:
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#613
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#198

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2253.html is the paper which resolves 613.

I suspect GCC is correct if I go by this paper.
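The rule under discussion is the C++11 allowance, introduced by the resolution of CWG 613, for naming a non-static data member in an unevaluated operand such as sizeof. A minimal sketch of the plain, non-template case follows; it deliberately avoids the dependent base class that triggers the disagreement, and the names `holder` and `values_size` are made up for illustration rather than taken from the bug report.

```cpp
#include <cstddef>

// Hypothetical type for illustration; not part of the reported testcase.
struct holder {
    int values[1];
};

// sizeof is an unevaluated operand, so since C++11 the non-static member
// can be named through its class without any object of type holder.
constexpr std::size_t values_size() { return sizeof(holder::values); }

static_assert(sizeof(holder::values) == sizeof(int[1]),
              "member size is available without an object");
```

With a dependent base class in the picture, as in the testcase, the compiler cannot tell at template definition time whether the member might come from a base of the enclosing class, which is presumably where the implementations diverge.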
[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811 --- Comment #4 from Hongtao.liu --- Fixed in GCC12.
[Bug target/102811] vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102811

--- Comment #3 from CVS Commits ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:90cb088ece8d8cc1019d25629d1585e5b0234179

commit r12-5536-g90cb088ece8d8cc1019d25629d1585e5b0234179
Author: konglin1
Date: Wed Nov 10 09:37:32 2021 +0800

    i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]

    Add define_insn extendhfsf2 and truncsfhf2 for target_f16c.

    gcc/ChangeLog:

    PR target/102811
    * config/i386/i386.c (ix86_can_change_mode_class): Allow 16 bit data in XMM register for TARGET_SSE2.
    * config/i386/i386.md (extendhfsf2): Add extenndhfsf2 for TARGET_F16C.
    (extendhfdf2): Restrict extendhfdf for TARGET_AVX512FP16 only.
    (*extendhf2): Rename from extendhf2.
    (truncsfhf2): Likewise.
    (truncdfhf2): Likewise.
    (*trunc2): Likewise.

    gcc/testsuite/ChangeLog:

    PR target/102811
    * gcc.target/i386/pr90773-21.c: Allow pextrw instead of movw.
    * gcc.target/i386/pr90773-23.c: Ditto.
    * gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: New test.
[Bug middle-end/103419] FAIL: gcc.target/i386/pr102566-10b.c with -mx32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103419

Andrew Pinski changed:

What              |Removed  |Added
Status            |NEW      |RESOLVED
Resolution        |---      |FIXED
Target Milestone  |---      |12.0

--- Comment #6 from Andrew Pinski ---
.
[Bug middle-end/103419] FAIL: gcc.target/i386/pr102566-10b.c with -mx32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103419 --- Comment #5 from Hongtao.liu --- Fixed in GCC12.
[Bug middle-end/103419] FAIL: gcc.target/i386/pr102566-10b.c with -mx32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103419

--- Comment #4 from CVS Commits ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:379be00f45f65e0e8de72a50553dd9d2bab6cc08

commit r12-5535-g379be00f45f65e0e8de72a50553dd9d2bab6cc08
Author: liuhongt
Date: Thu Nov 25 13:51:57 2021 +0800

    Fix typo in r12-5486.

    gcc/ChangeLog:

    PR middle-end/103419
    * match.pd: Fix typo, use the type of second parameter, not first one.
[Bug testsuite/103335] [12 Regression] new test case gcc.dg/tree-ssa/modref-dse-4.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103335

Bug 103335 depends on bug 103282, which changed state.

Bug 103282 Summary: New test case gcc.dg/tree-ssa/modref-dse-5.c in r12-5292 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103282

What        |Removed   |Added
Status      |REOPENED  |RESOLVED
Resolution  |---       |FIXED
[Bug testsuite/103282] New test case gcc.dg/tree-ssa/modref-dse-5.c in r12-5292 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103282

Jan Hubicka changed:

What        |Removed   |Added
Resolution  |---       |FIXED
Status      |REOPENED  |RESOLVED

--- Comment #11 from Jan Hubicka ---
Fixed.
[Bug ipa/103432] [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103432

Andrew Pinski changed:

What      |Removed                        |Added
Assignee  |unassigned at gcc dot gnu.org  |hubicka at gcc dot gnu.org
Status    |NEW                            |ASSIGNED
[Bug ipa/103432] [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103432

Andrew Pinski changed:

What              |Removed                     |Added
Keywords          |                            |needs-reduction, wrong-code
Last reconfirmed  |2021-11-26 00:00:00         |
Target Milestone  |---                         |12.0
Status            |ASSIGNED                    |NEW
Assignee          |hubicka at gcc dot gnu.org  |unassigned at gcc dot gnu.org
Known to work     |                            |11.2.0
Component         |tree-optimization           |ipa

--- Comment #2 from Andrew Pinski ---
Confirmed, I have not reduced it but here is what is happening.

  outD.25694 = {};
  ...
  MEM[(struct DCTToD.21174 *)&D.25700 clique 3 base 1].data_D.21196 = &outD.25694;
  ...
  _ZN12_GLOBAL__N_121GenericTransposeBlockILm1ELm4ENS_7DCTFromENS_5DCTToEEEvRKT1_RKT2_.constprop.0D.25466 (&D.25767, &D.25766);
  ...
  _ZN12_GLOBAL__N_113IDCT1DWrapperILm4ELm1ENS_7DCTFromENS_5DCTToEEEvRKT1_RKT2_.constprop.0D.25467 (&D.25768, &D.25700);
  ...
  _3 = outD.25694[2];

FRE thinks _ZN12_GLOBAL__N_113IDCT1DWrapperILm4ELm1ENS_7DCTFromENS_5DCTToEEEvRKT1_RKT2_.constprop.0 does not touch out even though D.25700 is passed to it.
[Bug tree-optimization/103432] [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103432

Jan Hubicka changed:

What              |Removed                        |Added
Ever confirmed    |0                              |1
Target Milestone  |12.0                           |---
Keywords          |wrong-code                     |
Known to work     |11.2.0                         |
Status            |UNCONFIRMED                    |ASSIGNED
Last reconfirmed  |                               |2021-11-26
Assignee          |unassigned at gcc dot gnu.org  |hubicka at gcc dot gnu.org
CC                |                               |hubicka at gcc dot gnu.org
Component         |ipa                            |tree-optimization

--- Comment #1 from Jan Hubicka ---
It fails with
./xgcc -B ./ -O2 d.ii -fdbg-cnt=ipa_mod_ref_pta:189 -fdump-tree-all-details -fdump-ipa-all-details
and works
./xgcc -B ./ -O2 d.ii -fdbg-cnt=ipa_mod_ref_pta:188 -fdump-tree-all-details -fdump-ipa-all-details

The difference in optimized dump is:

 int main ()
 {
   struct DCTFrom D.11418;
@@ -2805,12 +2810,7 @@
   float x[4];
   struct DCTTo D.11356;
   struct DCTFrom D.11355;
-  float _3;
-  float _4;
-  double _6;
   struct FILE * stderr.3_8;
-  double _9;
-  struct FILE * stderr.4_11;
   float _12;
   float _13;
   double _15;
@@ -2996,30 +2996,10 @@
   {anonymous}::IDCT1DWrapper.constprop<4, 1, {anonymous}::DCTFrom, {anonymous}::DCTTo> (&D.11400, &D.11356);
   D.11400 ={v} {CLOBBER};
   D.11356 ={v} {CLOBBER};
-  _3 = out[2];
-  _4 = _3 - 1.0e+0;
-  actual_accuracy_5 = ABS_EXPR <_4>;
-  if (actual_accuracy_5 > 9.99974752427078783512115478515625e-7)
-    goto ; [0.04%]
-  else
-    goto ; [99.96%]
-
-   [local count: 429325]:
-  _6 = (double) actual_accuracy_5;
   stderr.3_8 = stderr;
-  fprintf (stderr.3_8, "ERROR: Too low accuracy: exp=%f act=%f\n", 9.99974752427078783512115478515625e-7, _6);
+  fprintf (stderr.3_8, "ERROR: Too low accuracy: exp=%f act=%f\n", 9.99974752427078783512115478515625e-7, 1.0e+0);
   exit (1);

-   [local count: 1072883004]:
-  _9 = (double) actual_accuracy_5;
-  stderr.4_11 = stderr;
-  fprintf (stderr.4_11, "OK: Good accuracy: exp=%f act=%f\n", 9.99974752427078783512115478515625e-7, _9);
-  x ={v} {CLOBBER};
-  out ={v} {CLOBBER};
-  coeffs ={v} {CLOBBER};
-  scratch_space ={v} {CLOBBER};
-  return 0;
-
 }

And I suppose we are not expected to optimize out the "Good accuracy" message :)

So it looks like out is modified by
  {anonymous}::IDCT1DWrapper.constprop<4, 1, {anonymous}::DCTFrom, {anonymous}::DCTTo> (&D.11400, &D.11356);
but for some reason ipa propagation gets no_indirect_clobber for param1. This seems wrong since to->data is written to, so it is an indirect clobber.

I may be able to look more into this only tomorrow - it is a bit late.
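The shape of the problem, a callee that writes to the caller's array only through a pointer stored inside an object passed by reference, can be sketched in isolation. This is a hand-written illustration and not the reduced testcase: `To`, `write_through` and `read_after_indirect_write` are assumed names, and the sketch is not expected to reproduce the miscompile.

```cpp
#include <cassert>

// Hypothetical stand-in for a DCTTo-like wrapper: it only holds a pointer.
struct To {
    float* data;
};

// The callee never names the caller's array directly; it stores through
// to.data. A mod/ref summary has to record this as an indirect write
// reachable from the parameter.
void write_through(const To& to) {
    assert(to.data != nullptr);
    to.data[2] = 1.0f;
}

float read_after_indirect_write() {
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    To to = {out};
    write_through(to);
    // This load must observe the store made inside write_through().
    return out[2];
}
```

If the summary for the callee claimed that nothing reachable from its parameter is clobbered, the load of `out[2]` could be replaced by the initial value, which matches the symptom in the dump above where the comparison is folded to a constant.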
[Bug target/103302] wrong code with -fharden-compares
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103302

Jim Wilson changed:

What  |Removed  |Added
CC    |         |wilson at gcc dot gnu.org

--- Comment #2 from Jim Wilson ---
It is the second reversed comparison that is wrong. This is the
  u32_0 <= (...)
on the first line of foo0. In the assembly file, this ends up as
        mv      a0,a6
        mv      a1,a7
        xor     a6,a0,a6
        xor     a7,a1,a7
        or      a6,a6,a7
        seqz    a6,a6
and note that it is comparing a value against itself when it should be comparing two different values.

The harden compare pass is generating RTL
(insn 156 152 155 6 (set (reg:TI 201) (asm_operands:TI ("") ("=g") 0 [ (reg:TI 77 [ _8 ]) ] [ (asm_input:TI ("0")) ] [])) -1 (nil))
(insn 155 156 153 6 (clobber (reg:TI 77 [ _8 ])) -1 (nil))
(insn 153 155 154 6 (set (subreg:DI (reg:TI 77 [ _8 ]) 0) (subreg:DI (reg:TI 201) 0)) -1 (nil))
(insn 154 153 160 6 (set (subreg:DI (reg:TI 77 [ _8 ]) 8) (subreg:DI (reg:TI 201) 8)) -1 (nil))

Then the asmcons pass is changing this to
(insn 851 152 849 5 (clobber (reg:TI 201)) -1 (nil))
(insn 849 851 850 5 (set (subreg:DI (reg:TI 201) 0) (subreg:DI (reg:TI 77 [ _8 ]) 0)) -1 (nil))
(insn 850 849 156 5 (set (subreg:DI (reg:TI 201) 8) (subreg:DI (reg:TI 77 [ _8 ]) 8)) -1 (nil))
(insn 156 850 155 5 (set (reg:TI 201) (asm_operands:TI ("") ("=g") 0 [ (reg:TI 201) ] [ (asm_input:TI ("0")) ] [])) -1 (expr_list:REG_DEAD (reg:TI 77 [ _8 ]) (nil)))
(insn 155 156 153 5 (clobber (reg:TI 77 [ _8 ])) -1 (nil))
(insn 153 155 154 5 (set (subreg:DI (reg:TI 77 [ _8 ]) 0) (subreg:DI (reg:TI 201) 0)) 135 {*movdi_64bit} (nil))
(insn 154 153 854 5 (set (subreg:DI (reg:TI 77 [ _8 ]) 8) (subreg:DI (reg:TI 201) 8)) 135 {*movdi_64bit} (expr_list:REG_DEAD (reg:TI 201) (nil)))

Then the register allocator puts both 77 and 201 in the same register, which means we are now clobbering values we need. In the reload dump I see
(insn 851 152 849 5 (clobber (reg:TI 16 a6 [201])) -1 (nil))
(insn 849 851 850 5 (set (reg:DI 16 a6 [201]) (reg:DI 10 a0 [orig:77 _8 ] [77])) 135 {*movdi_64bit} (nil))
(insn 850 849 907 5 (set (reg:DI 17 a7 [+8 ]) (reg:DI 11 a1 [ _8+8 ])) 135 {*movdi_64bit} (nil))
(insn 907 850 1014 5 (clobber (reg:TI 16 a6 [201])) -1 (nil))
so the insns 849 and 850 get optimized away, but we need them. Also, we have
(insn 854 154 852 5 (clobber (reg:TI 16 a6 [202])) -1 (nil))
(insn 852 854 853 5 (set (reg:DI 16 a6 [202]) (reg:DI 6 t1 [orig:86 _39 ] [86])) 135 {*movdi_64bit} (nil))
(insn 853 852 913 5 (set (reg:DI 17 a7 [+8 ]) (reg:DI 7 t2 [ _39+8 ])) 135 {*movdi_64bit} (nil))
(insn 913 853 1010 5 (clobber (reg:TI 16 a6 [202])) -1 (nil))
and the insns 852 and 853 get optimized away, but we need them. The comparison is supposed to be a0/a1 versus t1/t2, but we end up with comparing a6/a7 against itself.

asmcons is calling emit_move_insn to copy the asm source to the asm dest so it can simplify the asm. Since this is a multiword mode, and the riscv backend doesn't have a movti pattern, this ends up calling emit_move_multi_word which emits the extra clobber that causes the problem. I suppose we could fix this by adding a movti pattern to the riscv backend to avoid the clobbers but we shouldn't have to. Though it would be interesting to see if this maybe results in better code optimization.

It isn't clear exactly where the problem is. Maybe asmcons shouldn't try to fix an asm when the mode is larger than the word mode? This could be left to the register allocator to fix. Or maybe harden compare shouldn't generate RTL like this? This could be a harden compare issue, or maybe an issue with the RTL expander to emit the rtl differently. Looks like the same issue with the RTL expander calling emit_move_multi_word which generates the clobber. Or maybe a movti pattern is actually required now?

I did verify that disabling asmcons fixes the problem for this testcase. I had to hack the code in function.c to do that as there is no option to disable it.
[Bug ipa/103432] [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103432

Andrew Pinski changed:

What              |Removed            |Added
Keywords          |                   |wrong-code
Target Milestone  |---                |12.0
CC                |                   |marxin at gcc dot gnu.org
Component         |tree-optimization  |ipa
Known to work     |                   |11.2.0
[Bug tree-optimization/103432] New: [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103432

            Bug ID: 103432
           Summary: [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

Created attachment 51875
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51875&action=edit
dct_test.cc

I originally noticed the problem as failed tests on libjxl-0.5. I extracted a ~10KB self-contained single-file example. It still could be reduced, but it's quite tangled. Could you see what is obviously wrong with it?

Attached the reproducer as dct_test.cc:

$ g++-12.0.0 -std=c++11 -O2 -fno-tree-vectorize dct_test.cc -o dct_test
$ g++-12.0.0 -std=c++11 -O2 -fno-tree-vectorize -fno-ipa-modref dct_test.cc -o dct_test1

# good:
$ ./dct_test1
OK: Good accuracy: exp=0.01 act=0.00
OK: Good accuracy: exp=0.01 act=0.00

# bad:
$ ./dct_test
OK: Good accuracy: exp=0.01 act=0.00
ERROR: Too low accuracy: exp=0.01 act=1.00

$ g++-12.0.0 -v
Using built-in specs.
COLLECT_GCC=/nix/store/2lxwqh3k88x4jwyfwlsfnwrp78yq2ah2-gcc-12.0.0/bin/g++
COLLECT_LTO_WRAPPER=/nix/store/2lxwqh3k88x4jwyfwlsfnwrp78yq2ah2-gcc-12.0.0/libexec/gcc/x86_64-unknown-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.0.0 20211121 (experimental) (GCC)
[Bug c++/92385] extremely long and memory intensive compilation for brace construction of array member
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92385

Andrew Pinski changed:

What  |Removed  |Added
CC    |         |beyondstandard at gmail dot com

--- Comment #8 from Andrew Pinski ---
*** Bug 71165 has been marked as a duplicate of this bug. ***
[Bug c++/71165] std::array with aggregate initialization generates huge code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71165

Andrew Pinski changed:

What        |Removed  |Added
Resolution  |---      |DUPLICATE
Status      |NEW      |RESOLVED

--- Comment #4 from Andrew Pinski ---
Dup of bug 92385.

*** This bug has been marked as a duplicate of bug 92385 ***
[Bug c++/92385] extremely long and memory intensive compilation for brace construction of array member
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92385

Andrew Pinski changed:

What  |Removed  |Added
CC    |         |hehaochen at hotmail dot com

--- Comment #7 from Andrew Pinski ---
*** Bug 94957 has been marked as a duplicate of this bug. ***
[Bug c++/94957] Compilation slowww for simple code with big array of structs with constructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94957

Andrew Pinski changed:

What        |Removed   |Added
Status      |ASSIGNED  |RESOLVED
Resolution  |---       |DUPLICATE

--- Comment #6 from Andrew Pinski ---
Dup of bug 92385.

*** This bug has been marked as a duplicate of bug 92385 ***
[Bug c++/94957] Compilation slowww for simple code with -O1/2/3 and -g in GCC 8 and 9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94957

Andrew Pinski changed:

What  |Removed  |Added
CC    |         |ilord.tiran at yandex dot ru

--- Comment #5 from Andrew Pinski ---
*** Bug 98547 has been marked as a duplicate of this bug. ***
[Bug c++/98547] GCC spends many minutes instead of seconds building a file with array initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98547

Andrew Pinski changed:

What        |Removed      |Added
Resolution  |---          |DUPLICATE
Status      |UNCONFIRMED  |RESOLVED

--- Comment #2 from Andrew Pinski ---
Yes this is a dup of bug 94957.

*** This bug has been marked as a duplicate of bug 94957 ***
[Bug fortran/103418] random_number() does not accept pointer, intent(in) array argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103418

--- Comment #8 from Steve Kargl ---
On Thu, Nov 25, 2021 at 02:18:46PM -0800, Steve Kargl wrote:
> On Thu, Nov 25, 2021 at 10:10:32PM +, anlauf at gcc dot gnu.org wrote:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103418
> >
> > --- Comment #6 from anlauf at gcc dot gnu.org ---
> > Unfortunately the patch in comment#5 does not work for me. :-(
> >
> > Interestingly, the Intel compiler fails on the testcase, too.
> >
>
> Hmmm. I did have a number of other patches in my tree. I
> wonder if one of those helped. Unfortunately, I updated
> my git repository, where I cleared out all patch, and it
> takes a long time to rebuild gcc on my laptop.
>

For the record,

module test
  implicit none
contains
  subroutine change_pointer_target(ptr)
    real, pointer, intent(in) :: ptr(:)
    call random_number(ptr)
    ptr(:) = ptr + 1.0
  end subroutine change_pointer_target
end module test

program foo
  use test
  implicit none
  real, pointer :: a(:), b
  allocate(a(4), b)
  call random_number(b)
  call random_number(a)
  print '(5F8.5)', b, a
end program foo

% gfcx -o z a.f90 && ./z
 0.65287 0.82614 0.77541 0.61923 0.52961
[Bug c/98487] ICE: tree check: expected identifier_node, have tree_list in is_attribute_p, at attribs.h:155 [C2X attribute syntax, gnu::format and -Wsuggest-attribute=format]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98487

Andrew Pinski changed:

What              |Removed      |Added
Keywords          |             |ice-checking
Ever confirmed    |0            |1
Status            |UNCONFIRMED  |NEW
Last reconfirmed  |             |2021-11-25

--- Comment #2 from Andrew Pinski ---
Confirmed.
Simpler testcase:

#include

[[gnu::__format__(__printf__, 1, 2)]]
void do_printf(const char * const a0, ...)
{
  va_list ap;
  va_start(ap, a0);
  __builtin_vprintf(a0, ap);
  va_end(ap);
}

[[gnu::__format__(__scanf__, 1, 2)]]
void do_scanf(const char * const a0, ...)
{
  va_list ap;
  va_start(ap, a0);
  __builtin_vscanf(a0, ap);
  va_end(ap);
}

[[gnu::__format__(__strftime__, 1, 0)]]
void do_strftime(const char * const a0, struct tm * a1)
{
  char buff[256];
  __builtin_strftime(buff, sizeof(buff), a0, a1);
  puts(buff);
}
[Bug libstdc++/96416] [DR 3545] to_address() is broken by static_assert in pointer_traits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96416

--- Comment #21 from CVS Commits ---
The master branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:b8018e5c5ec0e9b6948182f13fba47c67b758d8a

commit r12-5532-gb8018e5c5ec0e9b6948182f13fba47c67b758d8a
Author: Jonathan Wakely
Date: Thu Nov 25 16:49:45 2021 +

    libstdc++: Make std::pointer_traits SFINAE-friendly [PR96416]

    This implements the resolution I'm proposing for LWG 3545, to avoid hard errors when using std::to_address for types that make pointer_traits ill-formed.

    Consistent with std::iterator_traits, instantiating std::pointer_traits for a non-pointer type will be well-formed, but give an empty type with no member types. This avoids the problematic cases for std::to_address. Additionally, the pointer_to member is now only declared when the element type is not cv void (and for C++20, when the function body would be well-formed). The rebind member was already SFINAE-friendly in our implementation.

    libstdc++-v3/ChangeLog:

    PR libstdc++/96416
    * include/bits/ptr_traits.h (pointer_traits): Reimplement to be SFINAE-friendly (LWG 3545).
    * testsuite/20_util/pointer_traits/lwg3545.cc: New test.
    * testsuite/20_util/to_address/1_neg.cc: Adjust dg-error line.
    * testsuite/20_util/to_address/lwg3545.cc: New test.
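The "empty type with no member types" idea from the commit message can be sketched with a small stand-alone traits class. This is only an illustration of the technique under assumed names (`sketch_pointer_traits`, `has_element_type`, `fancy_ptr`); it is not the libstdc++ implementation and covers far less than std::pointer_traits does.

```cpp
#include <type_traits>

// C++11-compatible void_t helper (std::void_t itself requires C++17).
template <typename...> struct make_void { using type = void; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: deliberately empty, so naming the traits for a
// non-pointer type such as int is well-formed and simply has no members.
template <typename Ptr, typename = void>
struct sketch_pointer_traits {};

// Class types that advertise an element_type get the member types.
template <typename Ptr>
struct sketch_pointer_traits<Ptr, void_t<typename Ptr::element_type>> {
    using pointer = Ptr;
    using element_type = typename Ptr::element_type;
};

// Raw pointers.
template <typename T>
struct sketch_pointer_traits<T*, void> {
    using pointer = T*;
    using element_type = T;
};

// Because the primary template is empty rather than ill-formed, other
// templates can probe for element_type without triggering a hard error.
template <typename Ptr, typename = void>
struct has_element_type : std::false_type {};

template <typename Ptr>
struct has_element_type<
    Ptr, void_t<typename sketch_pointer_traits<Ptr>::element_type>>
    : std::true_type {};

// Hypothetical pointer-like type used only to exercise the sketch.
struct fancy_ptr {
    using element_type = long;
};
```

A static_assert in the primary template would instead turn every probe on a non-pointer type into a hard error, which is the behaviour the bug title complains about.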
[Bug libstdc++/101608] ranges::fill/fill_n missing std::is_constant_evaluated() condition for __builtin_memset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101608 --- Comment #3 from CVS Commits --- The releases/gcc-11 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:7ae6e4e3831429d20eea1be285dbc6a4a005930f commit r11-9314-g7ae6e4e3831429d20eea1be285dbc6a4a005930f Author: Jonathan Wakely Date: Wed Nov 24 13:17:54 2021 + libstdc++: Do not use memset in constexpr calls to ranges::fill_n [PR101608] libstdc++-v3/ChangeLog: PR libstdc++/101608 * include/bits/ranges_algobase.h (__fill_n_fn): Check for constant evaluation before using memset. * testsuite/25_algorithms/fill_n/constrained.cc: Check byte-sized values as well. (cherry picked from commit 82c3657dd74896b39937bb0a2aaeba9b8ca105fd)
[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #14 from H.J. Lu --- (In reply to Richard Earnshaw from comment #13) > Also, note that the comment in gimple-fold.c prior to this change read: > > /* If we can perform the copy efficiently with first doing all loads > and then all stores inline it that way. Currently efficiently > means that we can load all the memory into a single integer > register which is what MOVE_MAX gives us. */ > > Which would imply that the AArch64 definition of MOVE_MAX is the correct one. The GCC manual has - Macro: MOVE_MAX The maximum number of bytes that a single instruction can move quickly between memory and registers or between two memory locations.
[Bug tree-optimization/98304] Failure to optimize bitwise arithmetic pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98304 --- Comment #2 from Andrew Pinski --- > @1 == (@2)-1 Should have been: @1 == -(@2-1) maybe check that @1 is a mask.
[Bug tree-optimization/98304] Failure to optimize bitwise arithmetic pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98304 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-11-25 Severity|normal |enhancement Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- _1 = MAX_EXPR ; _2 = _1 & -64; _4 = n_3(D) - _2; Something like: (simplify (minus @0 (bit_and (max @0 INTEGER_CST@1) INTEGER_CST@2)) (if (@1 == (@2)-1) (if (TYPE_SIGN (type) == UNSIGNED) (bit_and @0 @1) (cond (le @0 @1) @0 (bit_and @0 @1)) ) ) ) Note LLVM handles the unsigned case already. Also note also even though GCC can handle the loop case for signed, it only handles it on the RTL level, for gimple GCC produces: _3 = n_2(D) + -64; _8 = (unsigned int) n_2(D); _9 = _8 + 4294967232; // _9 = _3 - 64 _10 = _9 >> 6; // _10 = _9/64 _11 = (int) _10; _12 = _11 * -64; n_1 = _3 + _12;
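The proposed simplification can be checked by brute force. The following is only a sketch: the constant 63 is an assumption (the MAX_EXPR operands were lost from the dump above, and 63 is simply the complement of the -64 mask that does appear), and the function names are hypothetical.

```c
/* Original shape from the GIMPLE dump: n - (max(n, 63) & -64).
   The constant 63 is assumed; only the -64 mask is visible in the dump. */
int reduce_original(int n)
{
    int m = n > 63 ? n : 63;
    return n - (m & -64);
}

/* Signed form suggested in the comment: (cond (le @0 @1) @0 (bit_and @0 @1)). */
int reduce_simplified(int n)
{
    return n <= 63 ? n : (n & 63);
}

/* Returns 1 when both forms agree for every n in [lo, hi].
   Keep hi below INT_MAX so the loop counter does not overflow. */
int forms_agree(int lo, int hi)
{
    for (int n = lo; n <= hi; n++)
        if (reduce_original(n) != reduce_simplified(n))
            return 0;
    return 1;
}
```

Calling `forms_agree` over a range that straddles 63 and includes negative values exercises both arms of the conditional.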
[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto since r12-3903-g0288527f47cec669
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409 --- Comment #6 from hubicka at kam dot mff.cuni.cz --- > Started with r12-3903-g0288527f47cec669. This is a September change (for which we have PR102943); however, the regression range was g:1ae8edf5f73ca5c3 (or g:264f061997c0a534 on the second plot) and g:3e09331f6aeaf595, which is the latest regression visible on the graphs, appearing between Nov 12 and Nov 15. The September regression is there too, but it is tracked as PR102943.
[Bug rtl-optimization/79048] Unnecessary reload for flags setting insn when operands die
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79048 Roger Sayle changed: What|Removed |Added Target Milestone|--- |12.0 Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED CC||roger at nextmovesoftware dot com --- Comment #2 from Roger Sayle --- This issue appears to be fixed on mainline. The test case now generates: f1: orb %dil, %sil jne .L4 ret
[Bug fortran/103418] random_number() does not accept pointer, intent(in) array argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103418 --- Comment #7 from Steve Kargl --- On Thu, Nov 25, 2021 at 10:10:32PM +, anlauf at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103418 > > --- Comment #6 from anlauf at gcc dot gnu.org --- > Unfortunately the patch in comment#5 does not work for me. :-( > > Interestingly, the Intel compiler fails on the testcase, too. > Hmmm. I did have a number of other patches in my tree. I wonder if one of those helped. Unfortunately, I updated my git repository, where I cleared out all patches, and it takes a long time to rebuild gcc on my laptop.
[Bug tree-optimization/103423] [12 Regression] 19% cpu2006 wrf compile time regression with -flto since r12-3903-g0288527f47cec669
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103423 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- Martin, my original report here was about a regression on July 17 2021 (range g:0b7a11874d4eb428 and g:704e8a825c78b9a8), which seems unrelated to g:r12-3903-g0288527f47cec669 from Sep 21 2021. I think we are mixing up the cpu2006 and cpu2017 wrf benchmarks, which seem to regress at different times. > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103423 > > Martin Liška changed: > >What|Removed |Added > >See Also||https://gcc.gnu.org/bugzill >||a/show_bug.cgi?id=103409
[Bug c++/98030] error message for enum definition without ';' could be improved to include a fixit note
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98030 Andrew Pinski changed: What|Removed |Added Summary|error message for enum |error message for enum |definition without ';' |definition without ';' |could be improved |could be improved to ||include a fixit note Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2021-11-25 Severity|normal |enhancement --- Comment #3 from Andrew Pinski --- Confirmed.
[Bug fortran/103418] random_number() does not accept pointer, intent(in) array argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103418 --- Comment #6 from anlauf at gcc dot gnu.org --- Unfortunately the patch in comment#5 does not work for me. :-( Interestingly, the Intel compiler fails on the testcase, too.
[Bug middle-end/103431] [12 Regression] wrong code with -O -fno-tree-bit-ccp -fno-tree-dominator-opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103431 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Component|rtl-optimization|middle-end Last reconfirmed||2021-11-25 --- Comment #1 from Andrew Pinski --- Confirmed. reduced testcase (removing the globals): typedef unsigned __int128 B; __attribute__((noipa)) void f(unsigned short a) { B b = 5; int size = (sizeof(b)*8)-1; a /= 0xfffd; B b1 = (b << (a & size) | b >> (-(a & size) & size)); if (b1 != 5) __builtin_abort (); } int main (void) { f(0); } - CUT --- The gimple level does not change. In GCC 11 and the trunk, we have: _1 = (unsigned intD.9) a_8(D); _2 = _1 / 4294967293; a_9 = (short unsigned intD.18) _2; _13 = a_9 & 127; _3 = (intD.6) _13; b1_10 = 5 r<< _3; if (b1_10 != 5) It looks like the expansion from gimple to RTL of the rotate is different between the two versions.
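The reduced testcase relies on the masked-rotate idiom, in which both shift counts are masked so that a rotate count of zero stays well defined. A minimal sketch of the same idiom, using a 64-bit type instead of `unsigned __int128` for portability (the helper name `rotl64` is hypothetical):

```c
#include <stdint.h>

/* Rotate left. Masking both shift counts keeps every shift in [0, 63],
   so a count of 0 (or any multiple of 64) returns x unchanged instead of
   shifting by the full width, which would be undefined behaviour. */
uint64_t rotl64(uint64_t x, unsigned int n)
{
    const unsigned int mask = 63;
    return (x << (n & mask)) | (x >> (-n & mask));
}
```

In the testcase `a` becomes 0 after the division, so the rotate count is 0 and the value 5 is expected to survive unchanged; the sketch behaves the same way for a count of 0.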
[Bug rtl-optimization/103431] New: [12 Regression] wrong code with -O -fno-tree-bit-ccp -fno-tree-dominator-opts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103431 Bug ID: 103431 Summary: [12 Regression] wrong code with -O -fno-tree-bit-ccp -fno-tree-dominator-opts Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Created attachment 51874 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51874&action=edit reduced testcase Output: $ x86_64-pc-linux-gnu-gcc -O -fno-tree-bit-ccp -fno-tree-dominator-opts testcase.c $ ./a.out Aborted $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64//bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-5528-20211125184355-g9488d242066-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r12-5528-20211125184355-g9488d242066-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.0.0 20211125 (experimental) (GCC)
[Bug fortran/103418] random_number() does not accept pointer, intent(in) array argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103418 --- Comment #5 from Steve Kargl --- On Thu, Nov 25, 2021 at 09:02:34PM +, anlauf at gcc dot gnu.org wrote: > (In reply to kargl from comment #3) > > (In reply to anlauf from comment #2) > > > The nearly obvious fix: > > > > > > diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c > > > index 837eb0912c0..3859e18c6c3 100644 > > > --- a/gcc/fortran/check.c > > > +++ b/gcc/fortran/check.c > > > @@ -1031,7 +1031,7 @@ variable_check (gfc_expr *e, int n, bool allow_proc) > > > break; > > > } > > > > > > - if (!ref) > > > + if (!ref && !pointer) > > > { > > > gfc_error ("%qs argument of %qs intrinsic at %L cannot be " > > > "INTENT(IN)", gfc_current_intrinsic_arg[n]->name, > > > > > > regresses for gfortran.dg/move_alloc_8.f90, thus needs additional > > > investigation. > > > > Did you try the patch posted in Fortran Discourse? > > No. > > I'm afraid I also missed it on the usual channels where patches for gcc > are posted. > As explained on FD, I don't report problems found by other people who post them in FD, stackoverflow, or c.l.f. I encourage those people to report the problems themselves. That said, you found the right location to patch. The code looks convoluted to deal with CLASS, which messes up an array with the pointer attribute.
diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c index 6ea6e136d4f..e96bcdb1b44 100644 --- a/gcc/fortran/check.c +++ b/gcc/fortran/check.c @@ -1031,7 +1031,7 @@ variable_check (gfc_expr *e, int n, bool allow_proc) break; } - if (!ref) + if (!ref && !(pointer && e->ref && e->ref->type == REF_ARRAY)) { gfc_error ("%qs argument of %qs intrinsic at %L cannot be " "INTENT(IN)", gfc_current_intrinsic_arg[n]->name, @@ -1062,7 +1062,8 @@ variable_check (gfc_expr *e, int n, bool allow_proc) return true; gfc_error ("%qs argument of %qs intrinsic at %L must be a variable", -gfc_current_intrinsic_arg[n]->name, gfc_current_intrinsic, &e->where); +gfc_current_intrinsic_arg[n]->name, gfc_current_intrinsic, +&e->where); return false; }
[Bug fortran/103418] random_number() does not accept pointer, intent(in) array argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103418 --- Comment #4 from anlauf at gcc dot gnu.org --- (In reply to kargl from comment #3) > (In reply to anlauf from comment #2) > > The nearly obvious fix: > > > > diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c > > index 837eb0912c0..3859e18c6c3 100644 > > --- a/gcc/fortran/check.c > > +++ b/gcc/fortran/check.c > > @@ -1031,7 +1031,7 @@ variable_check (gfc_expr *e, int n, bool allow_proc) > > break; > > } > > > > - if (!ref) > > + if (!ref && !pointer) > > { > > gfc_error ("%qs argument of %qs intrinsic at %L cannot be " > > "INTENT(IN)", gfc_current_intrinsic_arg[n]->name, > > > > regresses for gfortran.dg/move_alloc_8.f90, thus needs additional > > investigation. > > Did you try the patch posted in Fortran Discourse? No. I'm afraid I also missed it on the usual channels where patches for gcc are posted.
[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #13 from Richard Earnshaw --- Also, note that the comment in gimple-fold.c prior to this change read: /* If we can perform the copy efficiently with first doing all loads and then all stores inline it that way. Currently efficiently means that we can load all the memory into a single integer register which is what MOVE_MAX gives us. */ Which would imply that the AArch64 definition of MOVE_MAX is the correct one.
[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #12 from Richard Earnshaw --- (In reply to Jakub Jelinek from comment #10) > Alternatively, couldn't we check next to that new > && have_insn_for (SET, mode) > also that > && known_le (GET_MODE_SIZE (mode), MOVE_MAX) > ? No, that would limit us to MOVE_MAX again, so what would be the point in having a more relaxed test earlier. I do wonder if MOVE_MAX * MOVE_RATIO should be replaced with the MOVE_BY_PIECES infrastructure, I just haven't had time to cook up a patch to try that, though.
[Bug middle-end/103406] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406 --- Comment #14 from joseph at codesourcery dot com --- There is no reasonable definition of how operands of binary + map to particular operands of a particular instruction and so no -f or -m option could sensibly be defined for that. When the result is a NaN, there is no requirement at all on what (quiet) NaN it is (beyond a preference for preservation of the payload of a NaN operand if there is at least one NaN operand).
[Bug tree-optimization/99520] Failure to detect bswap pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520 Roger Sayle changed: What|Removed |Added Target Milestone|--- |12.0 CC||roger at nextmovesoftware dot com Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #9 from Roger Sayle --- This PR is now fixed on mainline. Thanks to Jakub (my apologies: if I'd seen comment #2 I wouldn't have accidentally broken things, aka PR tree-optimization/103376; fortunately Jakub was able to quickly correct my oversight).
[Bug tree-optimization/98953] Failure to optimize two reads from adjacent addresses into one
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98953 Roger Sayle changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|roger at nextmovesoftware dot com |unassigned at gcc dot gnu.org --- Comment #4 from Roger Sayle --- The MULT_EXPR and PLUS_EXPR aspects of this PR are now resolved (i.e. the case in comment #1), but unfortunately the abs-based indexing used in the original report still causes problems. The bswap pass doesn't yet handle memory accesses of the form read[abs]/read[abs+1] (but does handle read[0]/read[1]).
[Bug libstdc++/101608] ranges::fill/fill_n missing std::is_constant_evaluated() condition for __builtin_memset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101608 --- Comment #2 from CVS Commits --- The master branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:82c3657dd74896b39937bb0a2aaeba9b8ca105fd commit r12-5530-g82c3657dd74896b39937bb0a2aaeba9b8ca105fd Author: Jonathan Wakely Date: Wed Nov 24 13:17:54 2021 + libstdc++: Do not use memset in constexpr calls to ranges::fill_n [PR101608] libstdc++-v3/ChangeLog: PR libstdc++/101608 * include/bits/ranges_algobase.h (__fill_n_fn): Check for constant evaluation before using memset. * testsuite/25_algorithms/fill_n/constrained.cc: Check byte-sized values as well.
[Bug tree-optimization/103345] missed optimization: add/xor individual bytes to form a word
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103345 Roger Sayle changed: What|Removed |Added Target Milestone|--- |12.0 Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Roger Sayle --- This PR should now be fixed (missed optimization implemented) on mainline.
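For readers unfamiliar with the pattern named in the bug title, here is a rough sketch of what "add/xor individual bytes to form a word" may look like. The function names are hypothetical and the code is not taken from the bug report. Because the shifted bytes occupy disjoint bit ranges, `+` and `^` yield the same word as `|`.

```c
#include <stdint.h>

/* Little-endian 32-bit load assembled with bitwise OR (the familiar form). */
uint32_t load_le32_or(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Same load assembled with addition; no carries occur because the
   shifted bytes never overlap. */
uint32_t load_le32_add(const unsigned char *p)
{
    return (uint32_t)p[0] + ((uint32_t)p[1] << 8) +
           ((uint32_t)p[2] << 16) + ((uint32_t)p[3] << 24);
}

/* Same load assembled with XOR; equivalent for the same reason. */
uint32_t load_le32_xor(const unsigned char *p)
{
    return (uint32_t)p[0] ^ ((uint32_t)p[1] << 8) ^
           ((uint32_t)p[2] << 16) ^ ((uint32_t)p[3] << 24);
}
```

All three variants are candidates for being turned into a single 32-bit load on a little-endian target.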
[Bug middle-end/103406] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406 Roger Sayle changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|roger at nextmovesoftware dot com |unassigned at gcc dot gnu.org Summary|[12 Regression] gcc -O0 |gcc -O0 behaves differently |behaves differently on |on "DBL_MAX related |"DBL_MAX related|operations" than gcc -O1 |operations" than gcc -O1|and above |and above | Target||x86_64 --- Comment #13 from Roger Sayle --- The Inf - Inf => 0.0 regression should now be fixed on mainline. Hmm. As hinted by Richard Biener's investigation, the underlying problem is even more pervasive. It turns out that on x86/IA64 chips, floating point addition is not commutative, i.e. x+y is not the same as y+x, as demonstrated by the test program below: #include const double pn = __builtin_nan(""); const double mn = -__builtin_nan(""); __attribute__ ((noinline, noclone)) double plus(double x, double y) { return x + y; } int main() { printf("%lf\n",plus(pn,mn)); printf("%lf\n",plus(mn,pn)); return 0; } Output: nan -nan Unfortunately, GCC assumes almost everywhere that FP addition is commutative and (as per comments #8 and #9) associative with negation/minus. This appears to be a target property, cf. libgcc's _FP_CHOOSENAN, but could in theory be resolved by a -fstrict-math mode (that implies -ftrapping-math) that disables commutativity (swapping of operands) throughout the compiler, including reload/fold-const etc., on affected Intel-like targets. Perhaps this PR is a duplicate now that the regression has been fixed?
[Bug c++/56119] Allows static member definition of template class in namespace not enclosing this class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56119 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=103426 CC||fchelnokov at gmail dot com --- Comment #3 from Andrew Pinski --- *** Bug 103426 has been marked as a duplicate of this bug. ***
[Bug c++/103426] Acceptance of invalid template specialization in a namespace not enclosing the specialized template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103426 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=56119 Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Andrew Pinski --- This is a dup of bug 56119. *** This bug has been marked as a duplicate of bug 56119 ***
[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-11-25 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Severity|normal |enhancement --- Comment #11 from Andrew Pinski --- Confirmed. I had thought there was another bug about this but I can't find it.
[Bug tree-optimization/103332] Spurious -Wstringop-overflow warnings in libstdc++ tests
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103332 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2021-11-25 Status|UNCONFIRMED |NEW --- Comment #4 from Andrew Pinski --- .
[Bug target/102117] s390: Inefficient code for 64x64=128 signed multiply for <= z13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102117 Roger Sayle changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |12.0 --- Comment #4 from Roger Sayle --- This should now be fixed on mainline.
[Bug tree-optimization/102958] std::u8string suboptimal compared to std::string, triggers warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102958 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2021-11-25 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #3 from Andrew Pinski --- Confirmed, interesting we don't detect this as strlen: [local count: 8687547547]: # __i_155 = PHI <__i_46(3), 0(2)> __i_46 = __i_155 + 1; _48 = MEM[(const char_type &)"123456789" + __i_46 * 1]; if (_48 != 0) goto ; [89.00%] else goto ; [11.00%] I thought there was code to do that detection now?
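At source level, the kind of loop that the comment expects to be recognized as strlen looks roughly like the sketch below. The helper name is hypothetical, and the loop is a plain C analogue rather than the libstdc++ code that produced the GIMPLE above.

```c
#include <stddef.h>

/* Count characters up to (not including) the terminating NUL by
   walking the string one element at a time. */
size_t length_by_loop(const char *s)
{
    size_t i = 0;
    while (s[i] != 0)
        ++i;
    return i;
}
```

For a string literal such as "123456789", a compiler that recognizes the idiom could fold the call to the constant 9.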
[Bug middle-end/103406] [12 Regression] gcc -O0 behaves differently on "DBL_MAX related operations" than gcc -O1 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103406 --- Comment #12 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:6ea5fb3cc7f3cc9b731d72183c66c23543876f5a commit r12-5529-g6ea5fb3cc7f3cc9b731d72183c66c23543876f5a Author: Roger Sayle Date: Thu Nov 25 19:02:06 2021 + PR middle-end/103406: Check for Inf before simplifying x-x. This is a simple one line fix to the regression PR middle-end/103406, where x - x is being folded to 0.0 even when x is +Inf or -Inf. In GCC 11 and previously, we'd check whether the type honored NaNs (which implicitly covered the case where the type honors infinities), but my patch to test whether the operand could potentially be NaN failed to also check whether the operand could potentially be Inf. 2021-11-25 Roger Sayle gcc/ChangeLog PR middle-end/103406 * match.pd (minus @0 @0): Check tree_expr_maybe_infinite_p. gcc/testsuite/ChangeLog PR middle-end/103406 * gcc.dg/pr103406.c: New test case.
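The reason the fold needs an infinity check can be seen with a small sketch, assuming IEEE 754 arithmetic and no -ffast-math style flags. The function names are hypothetical and do not mirror the match.pd predicate.

```c
#include <math.h>

/* Computes x - x at run time. Under IEEE 754 this is 0.0 for finite x,
   but NaN when x is +Inf, -Inf, or NaN. */
double self_difference(double x)
{
    return x - x;
}

/* Returns 1 when replacing x - x by 0.0 would preserve the result,
   i.e. when x is neither NaN nor infinite. */
int fold_to_zero_is_safe(double x)
{
    return !isnan(x) && !isinf(x);
}
```

Checking only for NaN, as the regressing patch did, would wrongly report the fold as safe for infinite operands.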
[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug c++/102454] coroutines: ICE in gimplify_var_or_parm_decl, at gimplify.c:2958
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102454 --- Comment #7 from Iain Sandoe --- I was leaving it to check if we needed to back port to 10.x as well.
[Bug c++/102213] Incorrect executable produced from valid input code with virtual consteval
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102213 --- Comment #2 from Andrew Pinski --- Note GCC 10 did a sorry message: sorry, unimplemented: 'virtual' 'consteval'
[Bug c++/102213] Incorrect executable produced from valid input code with virtual consteval
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102213 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Known to fail||11.1.0, 11.2.0 Status|UNCONFIRMED |NEW Last reconfirmed||2021-11-25 --- Comment #1 from Andrew Pinski --- Confirmed, at -O1 it is constant evaluated in the front-end and it works.
[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429 --- Comment #4 from Martin Liška --- So it's very funny what's happening here. iftoswitch pass is called for all e.g. f_dispatch_always_inline<10>, f_dispatch_always_inline<9> and so on until f_dispatch_always_inline<5> which is converted to switch. And then all early passes are called for f_dispatch_always_inline<4> which include einline and we end up with: __attribute__((always_inline)) void f_dispatch_always_inline<4> (int i) { : if (i_2(D) == 4) goto ; [INV] else goto ; [INV] : f<4> (); goto ; [INV] : switch (i_2(D)) [16.67%], case 5: [16.67%], case 6: [16.67%], case 7: [16.67%], case 8: [16.67%], case 9: [16.67%]> : : f<5> (); goto ; [100.00%] : : f<6> (); goto ; [100.00%] : : f<7> (); goto ; [100.00%] : : f<8> (); goto ; [100.00%] : : f<9> (); : : return; } which is a mixture of if and switch statements. So what we basically need is if-to-switch hybrid support for if-else chain combined with switches.
[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto since r12-3903-g0288527f47cec669
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409 Martin Liška changed: What|Removed |Added Summary|[12 Regression] 18% |[12 Regression] 18% |SPEC2017 WRF compile-time |SPEC2017 WRF compile-time |regression with -O2 -flto |regression with -O2 -flto |between g:264f061997c0a534 |since |and g:3e09331f6aeaf595 |r12-3903-g0288527f47cec669 Keywords|needs-bisection | CC||aldyh at gcc dot gnu.org, ||amacleod at redhat dot com --- Comment #5 from Martin Liška --- Started with r12-3903-g0288527f47cec669.
[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 --- Comment #11 from Jakub Jelinek --- Actually no, GET_MODE_SIZE in that case is the size of the whole operation. To me the previous change looks extremely ARM specific with load lines in mind which no other target has. If we want to support more than one SET covering it, there should be a loop to find out how large each load should be and we should decide that based on MOVE_MAX.
[Bug middle-end/103393] [12 Regression] Generating 256bit register usage with -mprefer-avx128 -mprefer-vector-width=128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #10 from Jakub Jelinek --- Alternatively, couldn't we check next to that new && have_insn_for (SET, mode) also that && known_le (GET_MODE_SIZE (mode), MOVE_MAX) ?
[Bug c++/102454] coroutines: ICE in gimplify_var_or_parm_decl, at gimplify.c:2958
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102454 --- Comment #6 from Arseny Solokha --- Should this PR be closed now?
[Bug c++/103430] New: ICE in gimplify_var_or_parm_decl, at gimplify.c:2975
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103430 Bug ID: 103430 Summary: ICE in gimplify_var_or_parm_decl, at gimplify.c:2975 Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: asolokha at gmx dot com CC: muecker at gwdg dot de Target Milestone: --- g:4e6bf0b9dd5585df1a1472d6a93b9fff72fe2524 fixes the long-standing issue w/ VLAs and statement expressions for the C front-end, but not for the C++ one. g++ still ICEs when compiling the following testcase, extracted from gcc/testsuite/gcc.dg/vla-stexp-9.c: void foo(void) { if (2 * sizeof(int) != sizeof((*({ int N = 2; int (*x)[9][N] = 0; x; })[1]))) __builtin_abort(); } % g++-12.0.0 -c mrzd2yqy.c mrzd2yqy.c: In function 'void foo()': mrzd2yqy.c:3:29: internal compiler error: in gimplify_var_or_parm_decl, at gimplify.c:2975 3 | if (2 * sizeof(int) != sizeof((*({ int N = 2; int (*x)[9][N] = 0; x; })[1]))) | ^~~~ 0x7a339b gimplify_var_or_parm_decl /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:2975 0xea3e1f gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15127 0xea98e0 internal_get_tmp_var /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:624 0xeac95e get_initialized_tmp_var(tree_node*, gimple**, gimple**, bool) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:679 0xeac95e gimplify_save_expr /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:6267 0xea4298 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14967 0xea3e7d gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) 
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14971 0xea3a51 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15432 0xea3a51 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15432 0xebbd7c gimplify_cond_expr /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:4329 0xea446c gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14623 0xea79e6 gimplify_stmt(tree_node**, gimple**) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:7024 0xea82af gimplify_bind_expr /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:1426 0xea444f gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14867 0xea79e6 gimplify_stmt(tree_node**, gimple**) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:7024 0xea82af gimplify_bind_expr /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:1426 0xea444f gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:14867 0xebeb96 gimplify_stmt(tree_node**, gimple**) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:7024 0xebeb96 gimplify_body(tree_node*, bool) /var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:15912 0xebf02d gimplify_function_tree(tree_node*) 
/var/tmp/portage/sys-devel/gcc-12.0.0_alpha20211121/work/gcc-12-20211121/gcc/gimplify.c:16066
[Bug target/93453] PPC: rldimi not taken into account to avoid shift+or
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93453 --- Comment #9 from Segher Boessenkool --- Yeah that looks better already, thanks. Please get rid of the debug stuff still in here, and send to gcc-patches@?
[Bug c++/103428] [11/12 Regression] Parameter packs not expanded with local struct in lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103428 Jakub Jelinek changed: What|Removed |Added Summary|Parameter packs not |[11/12 Regression] |expanded with local struct |Parameter packs not |in lambda |expanded with local struct ||in lambda Last reconfirmed||2021-11-25 Target Milestone|--- |11.3 Ever confirmed|0 |1 CC||jakub at gcc dot gnu.org, ||jason at gcc dot gnu.org, ||ppalka at gcc dot gnu.org Priority|P3 |P2 Status|UNCONFIRMED |NEW --- Comment #1 from Jakub Jelinek --- Started to be rejected again with r12-392-g2a6fc19e655e696bf0df9b7aaedf9848b23f07f3 11.1 accepts it since r11-8103-ge89055f90cff9fb6f565b9374e1ab74f805682fb
[Bug ipa/103227] [12 Regression] 58% exchange2 regression with -Ofast -march=native on zen3 since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227 --- Comment #13 from CVS Commits --- The master branch has been updated by Martin Jambor : https://gcc.gnu.org/g:5bc4cb04127a4805b6228b0a6cbfebdbd61314d2 commit r12-5527-g5bc4cb04127a4805b6228b0a6cbfebdbd61314d2 Author: Martin Jambor Date: Thu Nov 25 17:58:12 2021 +0100 ipa: Teach IPA-CP transformation about IPA-SRA modifications (PR 103227) PR 103227 exposed an issue with ordering of transformations of IPA passes. IPA-CP can create clones for constants passed by reference and at the same time IPA-SRA can also decide that the parameter does not need to be a pointer (or an aggregate) and plan to convert it into (a) simple scalar(s). Because no intermediate clone is created just for the purpose of ordering the transformations and because IPA-SRA transformation is implemented as part of clone materialization, the IPA-CP transformation happens only afterwards, reversing the order of the transformations compared to the ordering of analyses. IPA-CP transformation looks at planned substitutions for values passed by reference or in aggregates but finds that all the relevant parameters no longer exist. Currently it subsequently simply gives up, leading to clones created for no good purpose (and a huge regression of 548.exchange_r). This patch teaches it to recognize the situation, look up the new scalarized parameter and perform value substitution on it. On my desktop this has recovered the lost exchange2 run-time (and some more). I have disabled IPA-SRA in a Fortran testcase so that the dumping from the transformation phase can still be matched in order to verify that IPA-CP understands the IL after verifying that it does the right thing also with IPA-SRA. gcc/ChangeLog: 2021-11-23 Martin Jambor PR ipa/103227 * ipa-prop.h (ipa_get_param): New overload. Move bits of the existing one to the new one. * ipa-param-manipulation.h (ipa_param_adjustments): New member function get_updated_index_or_split.
* ipa-param-manipulation.c (ipa_param_adjustments::get_updated_index_or_split): New function. * ipa-prop.c (adjust_agg_replacement_values): Reimplement, add capability to identify scalarized parameters and perform substitution on them. (ipcp_transform_function): Create descriptors earlier, handle new return values of adjust_agg_replacement_values. gcc/testsuite/ChangeLog: 2021-11-23 Martin Jambor PR ipa/103227 * gcc.dg/ipa/pr103227-1.c: New test. * gcc.dg/ipa/pr103227-3.c: Likewise. * gcc.dg/ipa/pr103227-2.c: Likewise. * gfortran.dg/pr53787.f90: Disable IPA-SRA.
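To make the interaction described in the commit message more concrete, here is a minimal sketch of the kind of code where both passes could plausibly apply. It is an illustration only, not one of the committed testcases: the names `scale`, `caller_a`, `caller_b`, `scale_specialized` and `check_equivalence` are hypothetical, and whether GCC actually performs either transformation on this code depends on the optimization flags and its heuristics. The hand-written `scale_specialized` shows the intended end result: the by-reference parameter has been turned into a scalar (the IPA-SRA step) and the known constant has been substituted for it (the IPA-CP step).

```cpp
#include <cassert>

// Hypothetical callee: 'factor' is passed by reference but only ever read.
static int scale(const int &factor, int x) { return factor * x; }

// Every call site passes the same constant by reference, so a clone
// specialized for that constant would be a candidate for IPA-CP.
int caller_a(int x) { static const int k = 3; return scale(k, x); }
int caller_b(int x) { static const int k = 3; return scale(k, x + 1); }

// Hand-written equivalent of the combined result: the reference parameter
// is gone and the constant 3 is used directly in the body.
static int scale_specialized(int x) { return 3 * x; }

// Checks that the specialized form computes the same values as the original.
void check_equivalence() {
    for (int x = -4; x <= 4; ++x) {
        assert(caller_a(x) == scale_specialized(x));
        assert(caller_b(x) == scale_specialized(x + 1));
    }
}
```

If the substitution step gives up because the original parameter no longer exists, the clone still exists but gains nothing from the constant, which matches the "clones created for no good purpose" situation the commit message describes.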
[Bug preprocessor/103415] [12 Regression] ICE in cpp_interpret_string_1, at libcpp/charset.c:1739
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103415 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #5 from Jakub Jelinek --- Created attachment 51873 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51873&action=edit gcc12-pr103415.patch Untested fix.
[Bug tree-optimization/103425] [12 Regression] 48% tramp3d regression between g:df1a0d526e2e4c75 and g:9e026da720091704 with -Ofast -march=native at Zen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103425 --- Comment #4 from Jan Hubicka --- In the meantime, other testers picked up the revision, and it seems that indeed only the benzen machine reports this (it is an AMD EPYC 7702). So it looks like a microarchitecture-specific issue.
[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429 Martin Liška changed: What|Removed |Added Last reconfirmed||2021-11-25 Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #3 from Martin Liška --- Looking into it.
[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103396 Andrew Stubbs changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #6 from Andrew Stubbs --- This problem should be fixed now.
[Bug target/103396] [12 Regression][GCN][BUILD] ICE RTL check: access of elt 4 of vector with last elt 3 in move_callee_saved_registers, at config/gcn/gcn.c:2821
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103396 --- Comment #5 from CVS Commits --- The master branch has been updated by Andrew Stubbs : https://gcc.gnu.org/g:58d50a5dd6344179eebaeb6fd2f895e59463cf74 commit r12-5525-g58d50a5dd6344179eebaeb6fd2f895e59463cf74 Author: Andrew Stubbs Date: Thu Nov 25 15:59:20 2021 + amdgcn: Fix ICE generating CFI [PR103396] gcc/ChangeLog: PR target/103396 * config/gcn/gcn.c (move_callee_saved_registers): Ensure that the number of spilled registers is counted correctly.
[Bug fortran/103412] [10/11/12 Regression] ICE: Invalid expression in gfc_element_size since r10-2083-g8dc63166e0b85954
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103412 --- Comment #3 from kargl at gcc dot gnu.org --- (In reply to Martin Liška from comment #2) > Started with r10-2083-g8dc63166e0b85954. No, it did not start with this commit. It was exposed by this commit.
[Bug fortran/103414] [10/11/12 Regression] [PDT] ICE in gfc_free_actual_arglist, at fortran/expr.c:547 since r10-2083-g8dc63166e0b85954
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103414 --- Comment #6 from kargl at gcc dot gnu.org --- (In reply to Martin Liška from comment #5) > Started with r10-2083-g8dc63166e0b85954. Well, no, it did not start with the above commit. At best, it was exposed by this commit.
[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429 --- Comment #2 from Edward Rosten --- It is doing if-to-switch, but only really with N=5, and only if force-inline is set. I think there are two problems. One is that you need to force-inline in order to trigger if-to-switch. The other is that the if-to-switch conversion, once triggered, only works well for exactly 5 conditions; otherwise it uses a mix of converted and unconverted conditions. It doesn't appear to be doing binary tree generation; it is more like a linear search.

Here's the asm for N=7:

run(int):
        ret
run_inline(int):
        test    edi, edi
        je      .L13
        cmp     edi, 1
        je      .L14
        cmp     edi, 6
        ja      .L3
        mov     edi, edi
        jmp     [QWORD PTR .L8[0+rdi*8]]
.L8:
        .quad   .L3
        .quad   .L3
        .quad   .L12
        .quad   .L11
        .quad   .L10
        .quad   .L9
        .quad   .L7
.L13:
        jmp     void f<0>()
.L7:
        jmp     void f<6>()
.L12:
        jmp     void f<2>()
.L11:
        jmp     void f<3>()
.L10:
        jmp     void f<4>()
.L9:
        jmp     void f<5>()
.L14:
        jmp     void f<1>()
.L3:
        ret

Note, it's essentially doing:

if(i==0)
    f<0>();
else if(i==1)
    f<1>();
else if(i > 6)
    return;
else switch(i){
    case 0:
    case 1:
        return;
    case 2: f<2>(); return;
    case 3: f<3>(); return;
    case 4: f<4>(); return;
    case 5: f<5>(); return;
    case 6: f<6>(); return;
}

It's not doing binary searches. For, e.g., N%5 == 1, the structure is more like:

if(i==0)
    f<0>();
else if(i > 5){
    if(i-5 > 4){
        if(i-11>4){
            if(i-16 > 4){
                // and so on, linearly
            }
            else switch(i-16){
                //...
            }
        }
        else switch(i-11){
            //...
        }
    }
    else switch(i-6){
        //...
    }
}
else switch(i){
    case 0: return;
    case 1: f<1>(); return;
    case 2: f<2>(); return;
    case 3: f<3>(); return;
    case 4: f<4>(); return;
    case 5: f<5>(); return;
}
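For comparison with the mixed if/switch structure described above, the following sketch shows the single dense switch that the N=7 chain is logically equivalent to. It is hand-written for illustration and is not compiler output. The definition of `f<I>()` that records its index in `last_called`, and the helper names `run_switch` and `called_by`, are assumptions added so the block can be compiled and checked; in the bug report `f` is only declared.

```cpp
#include <cassert>

// Hypothetical stand-in for the side effect of f<I>(): remember which
// specialization ran. The report only declares f, it does not define it.
int last_called = -1;

template<int I> void f() { last_called = I; }

// One dense switch over 0..6, equivalent to the seven-condition chain.
void run_switch(int i) {
    switch (i) {
        case 0: f<0>(); return;
        case 1: f<1>(); return;
        case 2: f<2>(); return;
        case 3: f<3>(); return;
        case 4: f<4>(); return;
        case 5: f<5>(); return;
        case 6: f<6>(); return;
        default: return;  // out-of-range values call nothing
    }
}

// Helper for checking: returns the index that was dispatched, or -1.
int called_by(int i) {
    last_called = -1;
    run_switch(i);
    return last_called;
}
```

A switch of this shape can be lowered to one bounds check plus one jump table, without the separate leading comparisons for 0 and 1 seen in the asm above.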
[Bug preprocessor/103415] [12 Regression] ICE in cpp_interpret_string_1, at libcpp/charset.c:1739
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103415 --- Comment #4 from Jakub Jelinek --- __VA_OPT__ has been supported for a few more years, my change just added support for stringification of __VA_OPT__...
[Bug c++/47256] "--sysroot" option is not passed to COLLECT_GCC_OPTIONS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47256 --- Comment #7 from Richard Purdie --- Thanks for the tip, we'll look into dropping it!
[Bug tree-optimization/103417] [12 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r12-5489
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103417 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Jakub Jelinek --- Fixed.
[Bug tree-optimization/103221] evrp removes |SIGN but does not propagate the ssa name
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103221 --- Comment #3 from Andrew Macleod --- And BTW, we do this optimization, just not completely in evrp. EVRP removes the extraneous | -128 since that is a range related action. Constant propagation handles the propagation of the copy into the PHI; I'm not sure we also need to do it in a VRP pass.
[Bug tree-optimization/103429] Optimization of Auto-generated condition chain is not giving good lookup tables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429 Richard Biener changed: What|Removed |Added Component|c++ |tree-optimization CC||marxin at gcc dot gnu.org --- Comment #1 from Richard Biener --- Sounds like if-to-switch / switch conversion could help. Note that GCC converts large switches into binary if trees in some cases as well.
[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto between g:264f061997c0a534 and g:3e09331f6aeaf595
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409 --- Comment #4 from Martin Liška --- I'm going to bisect that.
[Bug c++/103429] New: Optimization of Auto-generated condition chain is not giving good lookup tables.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103429 Bug ID: 103429 Summary: Optimization of Auto-generated condition chain is not giving good lookup tables. Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ed at edwardrosten dot com Target Milestone: ---

I've got some generated condition chains (using recursive templates) and am getting some odd/suboptimal optimization results. Code is provided below and with a godbolt link. In the first case (without a force inline), the compiler inlines the functions but does not perform condition chain optimization. In the second case (identical code but with force inline), it will optimize condition chains, but only with exactly 5 elements. Otherwise it will end up with an if-else structure indexing optimized 5-element condition chains, and an if-else chain for anything spare. It only attempts the optimization from gcc 11 onwards; I checked on trunk too.

Example: https://godbolt.org/z/c9xbPqq7r

Here's the code:

template<int I> void f();

constexpr int N=5;

template<int I=0>
static inline void f_dispatch(int i){
    if constexpr (I == N)
        return;
    else if(i == I)
        f<I>();
    else
        f_dispatch<I+1>(i);
}

template<int I=0>
__attribute__((always_inline)) static inline void f_dispatch_always_inline(int i){
    if constexpr (I == N)
        return;
    else if(i == I)
        f<I>();
    else
        f_dispatch_always_inline<I+1>(i);
}

void run(int i){
    f_dispatch<>(i);
}

void run_inline(int i){
    f_dispatch_always_inline<>(i);
}
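As a point of reference for what a "good lookup table" could look like at the source level, here is a sketch that builds the table explicitly instead of relying on the optimizer to derive it from the recursive condition chain. This is an alternative technique, not the reporter's code and not what GCC generates. It assumes C++17, and the names `dispatch_table`, `run_table`, `called_by_table` and `last_called`, as well as the body given to `f<I>()`, are hypothetical additions so the block can be compiled and checked.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

constexpr int N = 5;

// Hypothetical stand-in for the side effect of f<I>().
int last_called = -1;

template<int I> void f() { last_called = I; }

// Expands the index pack into an array of function pointers, one per
// specialization f<0> .. f<N-1>, then dispatches with one bounds check.
template<std::size_t... Is>
void dispatch_table(int i, std::index_sequence<Is...>) {
    using handler = void (*)();
    static constexpr std::array<handler, sizeof...(Is)> table{
        &f<static_cast<int>(Is)>...};
    if (i >= 0 && static_cast<std::size_t>(i) < table.size())
        table[static_cast<std::size_t>(i)]();
}

void run_table(int i) { dispatch_table(i, std::make_index_sequence<N>{}); }

// Helper for checking: returns the index that was dispatched, or -1.
int called_by_table(int i) {
    last_called = -1;
    run_table(i);
    return last_called;
}
```

The trade-off is that calls through the table are indirect, so the compiler may not be able to turn them into direct tail jumps as it does for the switch form. The explicit table is a workaround sketch rather than a replacement for the missed optimization.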
[Bug c++/100465] Overloading operator+= and including filesystem causes conflicting overload compilation error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100465 Jonathan Wakely changed: What|Removed |Added Last reconfirmed|2021-05-07 00:00:00 |2021-11-25 --- Comment #6 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #2) > Maybe another case of PR 51577 but I haven't looked into it yet. The testcase in comment 4 was fixed by the patch for that bug, r12-702. The original testcase using <filesystem> still fails though. Patrick, do you think this is just a dup of PR 51577? Do I need to reduce this again to something that still fails, or do we have a matching testcase already?
[Bug tree-optimization/103427] Alignment of C++ references and 'this' pointer not used by optimizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103427 --- Comment #10 from Jonathan Wakely --- int*& is a reference to a pointer, and is perfectly valid. You can't have a pointer to a reference (a reference isn't required to have any storage, so taking the address of a reference doesn't make sense). int&& is an rvalue reference, which is just a different type of reference. You can't have int & & though. Binding another reference to a reference actually binds to the underlying object, not the reference. So there are no pointers or references to references.
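The rules in the comment above can be checked with a few lines of standard C++. This is a minimal sketch; the function name `demo_references` and the alias `IntRef` are chosen here purely for illustration. The last part shows reference collapsing, which is how the language handles a "reference to a reference" that arises indirectly through an alias or template parameter: it collapses to a plain reference rather than forming a new kind of type.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

void demo_references() {
    int value = 1;
    int other = 2;

    int *ptr = &value;
    int *&ref_to_ptr = ptr;   // reference to a pointer: valid
    ref_to_ptr = &other;      // changes 'ptr' itself through the reference
    assert(ptr == &other);

    int &ref = value;
    int &ref2 = ref;          // binds to 'value', not to 'ref'
    assert(&ref2 == &value);
    assert(&ref == &value);   // address of a reference is the object's address

    int &&rref = std::move(value);  // rvalue reference, bound to 'value'
    rref = 7;
    assert(value == 7);
}

// Reference collapsing: formed through an alias, a reference to a
// reference collapses to an ordinary reference type.
using IntRef = int &;
static_assert(std::is_same<IntRef &, int &>::value, "& & collapses to &");
static_assert(std::is_same<IntRef &&, int &>::value, "& && collapses to &");

// Both of the following are ill-formed and are left commented out:
// int & &bad_ref = value;   // no reference to a reference
// int &*bad_ptr;            // no pointer to a reference
```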
[Bug tree-optimization/102648] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) since r12-2381-g704e8a825c78b9a8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102648 Andrew Macleod changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Andrew Macleod --- Should be fixed.
[Bug tree-optimization/102648] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0) since r12-2381-g704e8a825c78b9a8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102648 --- Comment #3 from CVS Commits --- The master branch has been updated by Andrew Macleod : https://gcc.gnu.org/g:1598bd47b2a4a5f12b5a987d16d82634644db4b6 commit r12-5524-g1598bd47b2a4a5f12b5a987d16d82634644db4b6 Author: Andrew MacLeod Date: Thu Nov 25 08:58:19 2021 -0500 Add the testcase for this PR to the testsuite. Various ranger-enabled patches like threading and VRP2 can do this now, so add the testcase for posterity. gcc/testsuite/ PR tree-optimization/102648 * gcc.dg/pr102648.c: New.
[Bug target/103421] -march=bogus12323123423452345 -march=skylake-avx512 is accepted as a command line option
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103421 --- Comment #2 from Richard Biener --- I think that makes sense in some way, not sure we want -march-for-check=bogus12323123423452345. Also consider -march=xyz -moption-not-valid-for-xyz -march=but-for-this
[Bug tree-optimization/103425] [12 Regression] 48% tramp3d regression between g:df1a0d526e2e4c75 and g:9e026da720091704 with -Ofast -march=native at Zen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103425 Richard Biener changed: What|Removed |Added Target||x86_64-*-* Summary|48% tramp3d regression |[12 Regression] 48% tramp3d |between g:df1a0d526e2e4c75 |regression between |and g:9e026da720091704 with |g:df1a0d526e2e4c75 and | -Ofast -march=native at|g:9e026da720091704 with |Zen |-Ofast -march=native at Zen Keywords||needs-bisection Target Milestone|--- |12.0 --- Comment #3 from Richard Biener --- Not visible on Haswell (for now).