[Bug fortran/100855] pow run time gfortran vs ifort
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100855 --- Comment #10 from Nadav Halahmi --- (In reply to Dominique d'Humieres from comment #9)
> I don't know if the test is coming from a real world problem. The modified
> test
>
> program power
> implicit none
>
> real :: sum, sum1, n, q
> integer :: i, j
> integer :: limit
> real :: start, finish
>
> sum = 0d0
> sum1 = 0d0
> limit = 1
> n = 2.0
> q = 0.5
> call CPU_TIME(start)
> do i=1, limit
> n = n*q
> sum1 = sum1 + (i ** (0.05 + n))
> end do
> do i=1, limit
> sum = sum + (i ** 0.05)
> end do
> sum = sum1 + (limit-1)*sum
> call CPU_TIME(finish)
> print *, sum, n, sum1
> print '("Time = ",f6.3," seconds.")',finish-start
> end program power
>
> yields
>
> 150945680. 0. 15095.7852
> Time = 0.000 seconds.

What did you try to show here?
[Bug c++/100929] New: gcc fails to optimize less to min for SIMD code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929 Bug ID: 100929 Summary: gcc fails to optimize less to min for SIMD code Product: gcc Version: og10 (devel/omp/gcc-10) Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: denis.yaroshevskij at gmail dot com Target Milestone: --- Stand alone float - x86 example: https://godbolt.org/z/vr3cjvY5G Using a library x86 float, int, aarch64: https://godbolt.org/z/zPP48vzrq less + blend or greater + blend should become min/max.
[Bug c/100920] bogus warnings with -Wscalar-storage-order
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100920 --- Comment #4 from CVS Commits --- The master branch has been updated by Eric Botcazou : https://gcc.gnu.org/g:a589877a0036fc2f66b7a957859940c53efdc7c9 commit r12-1242-ga589877a0036fc2f66b7a957859940c53efdc7c9 Author: Eric Botcazou Date: Sun Jun 6 11:37:45 2021 +0200 Fix thinko in new warning on type punning for storage order purposes In C, unlike in Ada, the storage order of arrays is that of their component type, so you need to look at it when deciding to warn. And the PR complains about a bogus warning on the assignment of a pointer returned by alloca or malloc, so this also fixes that. gcc/c PR c/100920 * c-decl.c (finish_struct): Fix thinko in previous change. * c-typeck.c (convert_for_assignment): Do not warn on pointer assignment and initialization for storage order purposes if the RHS is a call to a DECL_IS_MALLOC function. gcc/testsuite/ * gcc.dg/sso-14.c: New test.
[Bug c/100920] bogus warnings with -Wscalar-storage-order
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100920 Eric Botcazou changed: What|Removed |Added Target Milestone|--- |12.0 Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #5 from Eric Botcazou --- Thanks for reporting the problem.
[Bug libfortran/98301] random_init() is broken
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98301 --- Comment #13 from CVS Commits --- The releases/gcc-11 branch has been updated by Andre Vehreschild : https://gcc.gnu.org/g:002745ca3668fc5e87c22acc81caaeaaadf9c47a commit r11-8515-g002745ca3668fc5e87c22acc81caaeaaadf9c47a Author: Andre Vehreschild Date: Sun Jun 6 12:06:31 2021 +0200 PR fortran/98301 - random_init() is broken Correct implementation of random_init() when -fcoarray=lib is given. Backport from mainline. 2021-06-06 Andre Vehreschild Steve Kargl gcc/fortran/ChangeLog: PR fortran/98301 * trans-decl.c (gfc_build_builtin_function_decls): Move decl. * trans-intrinsic.c (conv_intrinsic_random_init): Use bool for lib-call of caf_random_init instead of logical (4-byte). * trans.h: Add tree var for random_init. libgfortran/ChangeLog: PR fortran/98301 * caf/libcaf.h (_gfortran_caf_random_init): New function. * caf/single.c (_gfortran_caf_random_init): New function. * gfortran.map: Added fndecl. * intrinsics/random_init.f90: Implement random_init.
[Bug target/100930] New: PPC: Missing builtins for P9 vextsb2w, vextsb2w, vextsb2d, vextsh2d, vextsw2d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100930 Bug ID: 100930 Summary: PPC: Missing builtins for P9 vextsb2w, vextsb2w, vextsb2d, vextsh2d, vextsw2d Product: gcc Version: 8.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jens.seifert at de dot ibm.com Target Milestone: --- Using the same names like xlC appreciated: vec_extsbd, vec_extsbw, vec_extshd, vec_extshw, vec_extswd
[Bug rtl-optimization/40772] generating rendundant moves from second byte of 32b/64b register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40772 Roger Sayle changed: What|Removed |Added Resolution|--- |FIXED CC||roger at nextmovesoftware dot com Target Milestone|--- |7.0 Status|UNCONFIRMED |RESOLVED --- Comment #5 from Roger Sayle --- This issue has been fixed since gcc 7; the compiler now stores the high-byte register ah/bh/dh etc directly to memory. The original tst2b.c testcase when compiled with -O3 -march=k8 -fno-tree-vectorize looks like:

test:
.LFB0:
        .cfi_startproc
        leal    1(%rdi), %edx
        movl    %edi, %eax
        movb    %ah, data(%rip)
        addl    $15, %eax
        movb    %dh, data+1(%rip)
        leal    2(%rdi), %edx
        movb    %ah, data+15(%rip)
        movb    %dh, data+2(%rip)
        leal    3(%rdi), %edx
        movb    %dh, data+3(%rip)
        leal    4(%rdi), %edx
        movb    %dh, data+4(%rip)
        leal    5(%rdi), %edx
        movb    %dh, data+5(%rip)
        leal    6(%rdi), %edx
        movb    %dh, data+6(%rip)
        leal    7(%rdi), %edx
        movb    %dh, data+7(%rip)
        leal    8(%rdi), %edx
        movb    %dh, data+8(%rip)
        leal    9(%rdi), %edx
        movb    %dh, data+9(%rip)
        leal    10(%rdi), %edx
        movb    %dh, data+10(%rip)
        leal    11(%rdi), %edx
        movb    %dh, data+11(%rip)
        leal    12(%rdi), %edx
        movb    %dh, data+12(%rip)
        leal    13(%rdi), %edx
        movb    %dh, data+13(%rip)
        leal    14(%rdi), %edx
        movb    %dh, data+14(%rip)
        ret
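The tst2b.c source is not quoted in this comment. As a rough illustration only, here is a C sketch of a loop that could give rise to stores of this shape. It assumes, purely from reading the assembly above, that byte i of a 16-byte `data` array receives bits 8..15 of `x + i`; the function name `store_second_bytes` and the loop form are hypothetical and not taken from the bug report.

```c
#include <stdint.h>

/* Hypothetical stand-in for the array written by the assembly above. */
unsigned char data[16];

/* For each i in 0..15, store the second byte (bits 8..15) of x + i.
   On x86 this is the byte held in ah/bh/ch/dh when the sum sits in
   eax/ebx/ecx/edx, which is why a direct "movb %dh, mem" is possible. */
void store_second_bytes(uint32_t x)
{
    for (uint32_t i = 0; i < 16; i++)
        data[i] = (unsigned char)((x + i) >> 8);
}
```

Compiling a function like this with the flags mentioned in the comment and inspecting the output with -S is one way to check whether the high-byte registers are stored directly.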
[Bug bootstrap/29482] libcpp/configure - no usable dependency style found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29482 Nicolas Boulenguez changed: What|Removed |Added CC||nicolas at debian dot org --- Comment #9 from Nicolas Boulenguez --- Hello. I had the failure with GCC-10.2.1, only when running `autoreconf -f -i . fixincludes gcc subdirs...` before `./configure`. For each subdir in turn, autoreconf checks if the subdirectory uses libtool or automake. If so, it installs depcomp in . (../ from the subdir), else removes ./depcomp (breaking the build of other subdirectories). Changing the order of autoreconf arguments so that the last one depends on automake fixed the problem for me. I am not sure if this is a bug, or where to report it, but documenting the work-around here may be useful to other GCC users.
[Bug fortran/100907] Bind(c): failure handling wide character
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100907 --- Comment #5 from Dominique d'Humieres --- > It seems that Mac OS doesn't have the full set of C11 standard headers... :-( Shouldn't the C11 standard headers be provided by GCC12? Nevertheless the test compiles with the new version of the new C companion. The same is true for 100910 and 100914.
[Bug driver/69471] "-march=native" unintentionally breaks further -march/-mtune flags
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69471 H.J. Lu changed: What|Removed |Added Target Milestone|9.5 |9.3 Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #20 from H.J. Lu --- Fixed in GCC 9.3 and above. GCC 8 branch is closed.
[Bug target/100931] New: [x86-64] Failure to optimize 2 32-bit stores converted to a 64-bit store into using movabs instead of loading from a constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100931 Bug ID: 100931 Summary: [x86-64] Failure to optimize 2 32-bit stores converted to a 64-bit store into using movabs instead of loading from a constant Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gabravier at gmail dot com Target Milestone: ---

void g(int *p)
{
    *p = 2;
    p[1] = 2;
}

void h(long long *p)
{
    *p = 0x200000002;
}

g compiles to this on GCC on plenty of architectures:

g(int*):
  mov rax, QWORD PTR .LC0[rip]
  mov QWORD PTR [rdi], rax
  ret
.LC0:
  .long 2
  .long 2

h is equivalent to g (notwithstanding aliasing) and instead compiles to this:

h(long long*):
  movabs rax, 8589934594
  mov QWORD PTR [rdi], rax
  ret

g has been compiled differently from h since GCC 10. I'm somewhat doubtful about filing this bug actually, I personally think that h will be faster and that g is simply a regression from GCC 9, but I can't really be sure there isn't some architecture-specific reasoning to use a separate constant, especially since this transformation seems to only occur on specific architectures (generic, core2, nehalem, westmere, sandybridge, ivybridge, haswell, broadwell, znver1, znver2 and znver3)
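The equivalence claimed between the two stores can be checked with a small C sketch. It assumes only that `int` stores are 32 bits wide and the wide store is 64 bits, as on x86-64; the helper names `store_pair`, `store_wide` and `stores_match` are made up for this illustration. `memcpy` is used instead of pointer casts so the sketch itself has no aliasing problem.

```c
#include <stdint.h>
#include <string.h>

/* Two adjacent 32-bit stores of the value 2, mirroring g. */
static void store_pair(unsigned char *buf)
{
    int32_t two = 2;
    memcpy(buf, &two, sizeof two);
    memcpy(buf + 4, &two, sizeof two);
}

/* One 64-bit store of 0x200000002 (decimal 8589934594, the movabs
   immediate), mirroring h. */
static void store_wide(unsigned char *buf)
{
    int64_t wide = 0x200000002LL;
    memcpy(buf, &wide, sizeof wide);
}

/* Returns 1 when both functions leave the same 8 bytes in memory. */
static int stores_match(void)
{
    unsigned char a[8], b[8];
    store_pair(a);
    store_wide(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Because both 32-bit halves hold the same value, the byte patterns agree regardless of endianness, so the difference between g and h is only in how the constant is materialized.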
[Bug c++/100929] gcc fails to optimize less to min for SIMD code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929 --- Comment #1 from Marc Glisse --- Please attach your testcases to the bug report. godbolt links are nice complements, but not considered sufficient here. We don't lower the comparison or the blend in GIMPLE (yet). I think Hongtao Liu is doing blends right now. I don't know if there would be issues for comparisons (with -ftrapping-math for instance?). If you write (x
[Bug target/100929] gcc fails to optimize less to min for SIMD code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929 Marc Glisse changed: What|Removed |Added Version|og10 (devel/omp/gcc-10) |11.1.0 Keywords||missed-optimization Component|c++ |target Severity|normal |enhancement Target||x86_64-*-*
[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 H.J. Lu changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #12 from H.J. Lu --- We should handle it in the whole Linux software stack: https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/8 not just in compiler.
[Bug rtl-optimization/95405] Unnecessary stores with std::optional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405 Gabriel Ravier changed: What|Removed |Added CC||gabravier at gmail dot com --- Comment #2 from Gabriel Ravier --- Welp, I've tried to convert this to a simplified form, but I can't seem to get the same output regardless of how close I get in terms of GIMPLE output. With this code:

struct opbeb {};
union opbs { opbeb empty_byte; long value; };
struct opb { opbs payload; bool engaged; };
struct op : public opb { };
struct ob { op payload; };
struct o { ob base; };

o foo();

long bar()
{
    struct o r = foo();
    if (__builtin_expect_with_probability((*(const ob *)&r).payload.engaged != 0, 1, .66))
        return (long &)*(long *)&r;
    else
        return 0;
}

I get this final GIMPLE (i.e. -fdump-tree-optimized):

;; Function bar (_Z3barv, funcdef_no=9255, decl_uid=109154, cgraph_uid=6606, symbol_order=6814)

Removing basic block 5
long int bar ()
{
  struct o r;
  bool _1;
  long int _4;
  long int _7;

  <bb 2> [local count: 1073741824]:
  r = foo ();
  _1 = MEM[(const struct ob *)&r].payload.D.109140.engaged;
  if (_1 != 0)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669601]:
  _7 = MEM[(long int &)&r];

  <bb 4> [local count: 1073741824]:
  # _4 = PHI <_7(3), 0(2)>
  r ={v} {CLOBBER};
  return _4;
}

Which seems to be almost exactly identical to the one I get from the real std::optional:

;; Function bar (_Z3barv, funcdef_no=6084, decl_uid=49565, cgraph_uid=5869, symbol_order=5916)

Removing basic block 5
long int bar ()
{
  struct optional r;
  long int _1;
  bool _4;
  long int _5;

  <bb 2> [local count: 1073741824]:
  r = foo ();
  _4 = MEM[(const struct _Optional_base *)&r]._M_payload.D.50442._M_engaged;
  if (_4 != 0)
    goto <bb 3>; [66.00%]
  else
    goto <bb 4>; [34.00%]

  <bb 3> [local count: 708669601]:
  _5 = MEM[(long int &)&r];

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <_5(3), 0(2)>
  r ={v} {CLOBBER};
  return _1;
}

Literally the only differences I can see are that variables are declared in a different order, and that some variable names are different.
Yet the assembly output for my version optimizes the store to memory away just fine, and the std::optional output still fails to optimize the store to memory. Is the (very minor) difference here this significant or is there something I can't see in the outputted GIMPLE that results in the differences ? I tried to delve into the RTL, though I failed to really understand what was going on (though I could see significant differences between what I wrote and the original example there). I've also checked the assembly, and as far as I can see, there is no functional difference between what I wrote and the original one, LLVM even produces the exact same assembly for both. I've also tried to rule out the difference in variable declaration placement and naming by rewriting what I wrote into GIMPLE and modifying it to correspond to the original example as well as possible, with this being my best effort: long int __GIMPLE (ssa,guessed_local(1073741824)) bar () { struct o r; long int _1; bool _4; long int _7; __BB(2,guessed_local(1073741824)): r = foo (); _4 = __MEM ((const struct ob *)&r).payload.base.engaged; if (_4 != _Literal (bool) 0) goto __BB3(guessed(88583700)); else goto __BB4(guessed(45634028)); __BB(3,guessed_local(708669601)): _7 = __MEM (&r); goto __BB4(precise(134217728)); __BB(4,guessed_local(1073741824)): _1 = __PHI (__BB3: _7, __BB2: 0l); r = _Literal (struct o) {}; return _1; } But it still gets optimized well, as expected, unlike the original, which is rather mind boggling to me, unless there really is a bunch of GIMPLE information that isn't part of the outputted form. PS: LLVM optimizes the original example and what I wrote perfectly fine to the same assembly code.
[Bug fortran/100907] Bind(c): failure handling wide character
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100907 --- Comment #6 from José Rui Faustino de Sousa --- > Shouldn't the C11 standard headers be provided by GCC12? > AFAIK gcc uses the system's libc. In Linux the default will be GNU libc "glibc"; in Mac OS the default libc will be BSD libc, which is missing some of the headers... Or so it says in GNU portability library "gnulib" documentation... Thank you very much. Best regards, José Rui
[Bug target/100909] [12 Regression] powerpc64le: Regression causing unexpected error with IBM long double
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100909 Martin Liška changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #2 from Martin Liška --- Mine, I've got a patch for it.
[Bug other/100932] New: autoconf error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100932 Bug ID: 100932 Summary: autoconf error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT Product: gcc Version: 11.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: nicolas at debian dot org Target Milestone: --- Hello. When I attempt to autoreconf(2.69) the gcc/ subdirectory of 10.2.1 or 11.1.0, I get: configure.ac:886: error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT configure.ac:1499: error: possibly undefined macro: GCC_AC_FUNC_MMAP_BLACKLIST There is a slight possibility that the error is caused by local patches (Debian experimental), but this trivial change fixes the issue: --- a/src/gcc/configure.ac +++ b/src/gcc/configure.ac @@ -25,6 +25,7 @@ AC_INIT AC_CONFIG_SRCDIR(tree.c) +AC_CONFIG_MACRO_DIRS(../config) AC_CONFIG_HEADER(auto-host.h:config.in) gcc_version=`cat $srcdir/BASE-VER` The documentation seems to recommend AC_CONFIG_MACRO_DIRS anyway.
[Bug c++/67829] Bogus "ambiguous template instantiation" error with partial specializations involving a template template parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67829 Patrick Palka changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org CC||ppalka at gcc dot gnu.org
[Bug other/100933] New: install cannot stat include-fixed/limits.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100933 Bug ID: 100933 Summary: install cannot stat include-fixed/limits.h Product: gcc Version: 10.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: nicolas at debian dot org Target Milestone: --- Hello. I have been bitten by the exact bug described at: https://gcc.gnu.org/legacy-ml/gcc/2013-04/msg00171.html The work-around described there worked for me : run 'make && make install' directly instead of via wrappers (dh_auto_build and dh_auto_install) that parse 'make -n' before normal operation. The issue seems difficult, but please at least provide a hint in the error message at install time. Without this post, I would probably never have found a work-around. Thanks.
[Bug middle-end/100593] [ELF] -fno-pic: Use GOT to take address of an external default visibility function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 --- Comment #13 from Fangrui Song --- (In reply to H.J. Lu from comment #12) > We should handle it in the whole Linux software stack: > > https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/8 > > not just in compiler. It is great that you have the desire to fix these fundamental issues :) I think a GNU_PROPERTY marker is over-engineering. See https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/8 for details. Many things (including this and PR98112) can be changed today. When -fno-direct-access-external-data/-fno-direct-access-external-function as the -fno-pic default becomes prevalent, make ld warn by default for R_*_COPY/canonical PLT entries. After a while (say one or two years), let glibc ld.so warn for R_*_COPY/canonical PLT entries.
[Bug c/100902] pointer attachment issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100902 --- Comment #2 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:7fa4db39b6bcd207bd2b52023ff6b155bd15 commit r12-1246-g7fa4db39b6bcd207bd2b52023ff6b155bd15 Author: Jakub Jelinek Date: Sun Jun 6 19:37:06 2021 +0200 openmp: Call c_omp_adjust_map_clauses even for combined target [PR100902] When looking at in_reduction support for target, I've noticed that c_omp_adjust_map_clauses is not called for the combined target case. The following patch fixes it. Unfortunately, there are other issues. One is (also mentioned in the PR) that currently the pointer attachment stuff seems to be clause ordering dependent (the standard says that clause ordering on the same construct does not matter), the baz and qux cases in the PR are rejected while when swapped it is accepted. Note, the order of clauses in GCC really is treated as insignificant initially and only later on the compiler can adjust the ordering (e.g. when we sort map clauses based on what they refer to etc.) and in particular, clauses from parsing is reverse of the order in user code, while c_omp_split_clauses performed for combined/composite constructs typically reverses that ordering, i.e. makes it follow the user code ordering. And another one is I'm slightly afraid c_omp_adjust_map_clauses might misbehave in templates, though haven't tried to verify it with testcases. When processing_template_decl, the non-dependent clauses will be handled usually the same as when not in a template, but dependent clauses aren't processed or only limited processing is done there, and rest is deferred till later. From quick skimming of c_omp_adjust_map_clauses, it seems it might not be very happy about non-processed map clauses that might still have the TREE_LIST representation of array sections, or might not have finalized decls or base decls etc. 
So, for this I wonder if cp_parser_omp_target (and other cp/parser.c callers of c_omp_adjust_map_clauses) shouldn't call it only if (!processing_template_decl) - perhaps you could add cp_omp_adjust_map_clauses wrapper that would be if (!processing_template_decl) c_omp_adjust_map_clauses (...); - and call c_omp_adjust_map_clauses from within pt.c after the clauses are tsubsted and finish_omp_clauses is called again. 2021-06-06 Jakub Jelinek PR c/100902 * c-parser.c (c_parser_omp_target): Call c_omp_adjust_map_clauses even when target is combined with other constructs. * parser.c (cp_parser_omp_target): Call c_omp_adjust_map_clauses even when target is combined with other constructs. * c-c++-common/gomp/pr100902-1.c: New test.
[Bug rtl-optimization/95405] Unnecessary stores with std::optional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405 --- Comment #3 from Marc Glisse --- For a self-contained version, see below. Notice how the extra constructor in _Optional_payload_base changes the generated code, or storing directly a _Optional_payload_base instead of _Optional_payload in optional

struct _Optional_payload_base
{
  long _M_value;
  bool _M_engaged = false;
  _Optional_payload_base() = default;
  ~_Optional_payload_base() = default;
  _Optional_payload_base(const _Optional_payload_base&) = default;
  _Optional_payload_base(_Optional_payload_base&&) = default;
  _Optional_payload_base(double,float);
};
struct _Optional_payload : _Optional_payload_base { };
struct optional
{
  _Optional_payload _M_payload;
};
optional foo();
long bar()
{
  auto r = foo();
  if (r._M_payload._M_engaged)
    return r._M_payload._M_value;
  else
    return 0L;
}
[Bug rtl-optimization/95405] Unnecessary stores with std::optional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405 --- Comment #4 from Gabriel Ravier --- Ah, I see. Didn't think there was a constructor involved and/or that GIMPLE would keep it implicit like this...
[Bug rtl-optimization/95405] Unnecessary stores with std::optional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95405 --- Comment #5 from Marc Glisse --- GIMPLE doesn't know about calling conventions, that's something that only "appears" during expansion to RTL. Still, I don't claim to understand what is going on here.
[Bug target/100929] gcc fails to optimize less to min for SIMD code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929 --- Comment #2 from Andrew Pinski --- Original x86_64 testcase:

#include <immintrin.h>

__m256 if_else(__m256 x, __m256 y)
{
    __m256 mask = _mm256_cmp_ps(y, x, _CMP_LT_OQ);
    return _mm256_blendv_ps(x, y, mask);
}

__m256 min(__m256 x, __m256 y)
{
    return _mm256_min_ps(x, y);
}

CUT - Note the other testcase is using eve which I have no idea what it is coming from.
[Bug target/100931] [x86-64] Failure to optimize 2 32-bit stores converted to a 64-bit store into using movabs instead of loading from a constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100931 --- Comment #1 from Andrew Pinski --- SLP is happening. This is just a cost model issue as -mtune=intel works.
[Bug bootstrap/100932] autoconf error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100932 Andrew Pinski changed: What|Removed |Added Component|other |bootstrap Keywords||build --- Comment #1 from Andrew Pinski --- I suspect most people already and normally do: autoconf -I../config
[Bug bootstrap/100933] install cannot stat include-fixed/limits.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100933 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WONTFIX Component|other |bootstrap --- Comment #1 from Andrew Pinski --- I doubt we are going to fix "make -n" as it is just a debugging tool of makefiles rather than actually something which should be used.
[Bug tree-optimization/100934] New: wrong code at -O3 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100934 Bug ID: 100934 Summary: wrong code at -O3 on x86_64-linux-gnu Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zhendong.su at inf dot ethz.ch Target Milestone: --- It seems to affect all versions since GCC 8.4 (but not GCC 8.3).

[583] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap --prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++ --disable-werror --enable-multilib --with-system-zlib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.0.0 20210606 (experimental) [master revision 28c62475050:a6bc26893ec:a589877a0036fc2f66b7a957859940c53efdc7c9] (GCC)
[584] %
[584] % gcctk -O2 small.c; ./a.out
[585] %
[585] % gcctk -O3 small.c
[586] % ./a.out
Segmentation fault
[587] %
[587] % cat small.c
int a, b, c, d, e;
int main() {
  int f = 0, g = 0;
  for (; f < 2; f++) {
    int h, i;
    for (h = 0; h < 2; h++) {
      b = e = g ? a % g : 0;
      c = d;
      for (i = 0; i < 1; i++)
        g = 0;
      for (; g < 2; g++)
        ;
    }
  }
  return 0;
}
[Bug tree-optimization/100923] wrong code at -O2 and above on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100923 Andrew Pinski changed: What|Removed |Added Summary|wrong code at -Os and above |wrong code at -O2 and above |on x86_64-linux-gnu |on x86_64-linux-gnu Ever confirmed|0 |1 Last reconfirmed||2021-06-06 Status|UNCONFIRMED |NEW Keywords||alias, wrong-code --- Comment #2 from Andrew Pinski ---
- working (-O2 -fno-strict-aliasing)
+ not working (-O2 -fstrict-aliasing)

-  l.1_3 = l;
-  e.2_5 = e;
-  f.3_6 = f;
-  *e.2_5 = f.3_6;
-  _7 = *l.1_3;
-  if (_7 != 0)
+  l.1_4 = l;
+  _5 = *l.1_4;
+  e.2_6 = e;
+  f.3_7 = f;
+  *e.2_6 = f.3_7;
+  if (_5 != 0)

So we swapped around the store to *e and the load from *l.
[Bug tree-optimization/100923] [9/10/11/12 Regression] wrong code at -O2 and above on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100923 Andrew Pinski changed: What|Removed |Added Summary|wrong code at -O2 and above |[9/10/11/12 Regression] |on x86_64-linux-gnu |wrong code at -O2 and above ||on x86_64-linux-gnu Target Milestone|--- |9.5
[Bug target/100930] PPC: Missing builtins for P9 vextsb2w, vextsb2w, vextsb2d, vextsh2d, vextsw2d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100930 Bill Schmidt changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2021-06-06 Ever confirmed|0 |1 CC||bergner at gcc dot gnu.org, ||segher at gcc dot gnu.org --- Comment #1 from Bill Schmidt --- Hi Jens, The old xlC names are nonstandard. The agreed-upon names between GCC and OpenXL are vec_signexti, vec_signextll, and vec_signextq, having result type vector signed int, vector signed long long, and vector signed __int128, respectively. vec_signextq is available only for P10. Unfortunately these aren't yet implemented (their absence was discovered not too long ago), so we still have work to do here. :( Confirmed.
[Bug tree-optimization/100923] [9/10/11/12 Regression] wrong code at -O2 and above on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100923 --- Comment #3 from Andrew Pinski --- So FRE thinks: *e.2_6 = f.3_7; Does not modify: _9 = *l.1_4;
[Bug tree-optimization/100934] [9/10/11/12 Regression] wrong code at -O3 during unrolling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100934 Andrew Pinski changed: What|Removed |Added Target||x86_64-linux-gnu Status|UNCONFIRMED |NEW Summary|wrong code at -O3 on|[9/10/11/12 Regression] |x86_64-linux-gnu|wrong code at -O3 during ||unrolling Keywords||wrong-code Last reconfirmed||2021-06-06 Known to fail||12.0 Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- The complete unroller is adding a __builtin_unreachable and then that becomes the only thing.
[Bug d/100935] New: d: T.alignof ignores explicit align(N) type alignment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100935 Bug ID: 100935 Summary: d: T.alignof ignores explicit align(N) type alignment Product: gcc Version: 9.4.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: d Assignee: ibuclaw at gdcproject dot org Reporter: ibuclaw at gdcproject dot org Target Milestone: --- T.alignof currently always returns the natural alignment of a type:

align(8) struct Aligned { int a; }
static assert(Aligned.alignof == 8); // fails, 4

align(1) struct Packed { int a; }
static assert(Packed.alignof == 1); // fails, 4
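For comparison only, the behaviour the report expects from `align(8)` has a close analogue in C11, where an explicit alignment request is reflected by `_Alignof`. The sketch below is not D and says nothing about how gdc implements `T.alignof`; the struct and function names are made up, and it assumes a platform where `int` has an alignment no greater than 8.

```c
#include <stddef.h>

/* No explicit alignment: the struct takes the natural alignment of int. */
struct Natural { int a; };

/* Explicit alignment request on the member raises the alignment of the
   whole struct to 8. */
struct Aligned { _Alignas(8) int a; };

size_t natural_alignment(void) { return _Alignof(struct Natural); }
size_t aligned_alignment(void) { return _Alignof(struct Aligned); }
```

Standard C has no counterpart to `align(1)`; lowering alignment needs a compiler extension, so the `Packed` case is left out of this sketch.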
[Bug target/100936] New: %p and %P modifiers should not emit segment overrides
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100936 Bug ID: 100936 Summary: %p and %P modifiers should not emit segment overrides Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ubizjak at gmail dot com Target Milestone: ---
Following testcase:

--cut here--
__seg_gs int var = 123;

static int *foo (void)
{
  int *addr;
  asm ("lea %p1, %0" : "=r"(addr) : "m"(var));
  return addr;
}

static int bar (int *addr)
{
  int val;
  asm ("mov %%gs:%1, %0" : "=r"(val) : "m"(*addr));
  return val;
}

int baz (void)
{
  int *addr = foo();
  int val = bar (addr);
  return val;
}
--cut here--

emits an assembler warning when compiled on an x86 target:

gcc -O2 -c lea.c
lea.c: Assembler messages:
lea.c:8: Warning: segment override on `lea' is ineffectual

$ objdump -d lea.o

lea.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <baz>:
   0: 65 48 8d 04 25 00 00    lea    %gs:0x0,%rax
   7: 00 00
   9: 65 8b 00                mov    %gs:(%rax),%eax
   c: c3                      retq

The problem is with the %p operand modifier, which should emit the raw symbol name:

   P -- if PIC, print an @PLT suffix. For -fno-plt, load function address from GOT.
   p -- print raw symbol name.

but it also emits the segment override. As shown in the above example, it is not possible to use LEA to load the symbol's address into a register.

A similar problem exists with the %P modifier when trying to CALL or JMP to an overridden symbol, e.g.:

call %gs:zzz
jmp %gs:zzz

call.s:1: Warning: skipping prefixes on `call'
call.s:2: Warning: skipping prefixes on `jmp'
[Bug target/100936] %p and %P modifiers should not emit segment overrides
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100936 --- Comment #1 from Uroš Bizjak --- Proposed patch:

--cut here--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 04649b42122..0773a4a9ba8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13531,7 +13531,7 @@ ix86_print_operand_punct_valid_p (unsigned char code)
 static void
 ix86_print_operand_address_as (FILE *file, rtx addr,
-                               addr_space_t as, bool no_rip)
+                               addr_space_t as, bool raw)
 {
   struct ix86_address parts;
   rtx base, index, disp;
@@ -13570,7 +13570,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
   else
     gcc_assert (ADDR_SPACE_GENERIC_P (parts.seg));
-  if (!ADDR_SPACE_GENERIC_P (as))
+  if (!ADDR_SPACE_GENERIC_P (as) && !raw)
     {
       if (ASSEMBLER_DIALECT == ASM_ATT)
         putc ('%', file);
@@ -13589,7 +13589,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr,
     }
   /* Use one byte shorter RIP relative addressing for 64bit mode. */
-  if (TARGET_64BIT && !base && !index && !no_rip)
+  if (TARGET_64BIT && !base && !index && !raw)
     {
       rtx symbol = disp;
--cut here--
[Bug target/100929] gcc fails to optimize less to min for SIMD code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929 --- Comment #3 from Denis Yaroshevskiy ---
> Please attach your testcases to the bug report.

Is what @Andrew Pinski copied enough? I can attach the same code as a file.

> I don't know if there would be issues for comparisons (with -ftrapping-math
> for instance?).

-ftrapping-math causes clang to stop doing this optimisation. I can see that clang does it, so I assume `nans` are OK without this flag. For ints this is for sure OK.

> Note the other testcase is using eve which I have no idea what it is coming
> from.

Using eve was just much easier than writing this with intrinsics. The point was:

  vpcmpgtd ymm2, ymm0, ymm1
  vpblendvb ymm0, ymm0, ymm1, ymm2

should become

  vpminsd ymm0, ymm1, ymm0

And on arm:

  cmgt v2.4s, v0.4s, v1.4s
  bit v0.16b, v1.16b, v2.16b

should become

  smin v0.4s, v1.4s, v0.4s

And

  fcmgt v2.4s, v0.4s, v1.4s
  bit v0.16b, v1.16b, v2.16b

should become

  fmin v0.4s, v1.4s, v0.4s

I don't really know how it is done in `gcc` - but all these examples look like the same issue. If it is very helpful to write all of them as intrinsics, I can.
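The integer case can be illustrated per lane in plain C. The following is a scalar sketch, not the SIMD code from the report: `select_less` imitates one lane of a compare followed by a bitwise blend, and `min_i32` is the minimum it should be equivalent to. Both names are hypothetical, and the final conversion back to `int32_t` relies on the modular behaviour that GCC and Clang document for out-of-range conversions.

```c
#include <stdint.h>

/* One lane of "compare, then blend": build an all-ones or all-zeros
   mask from the comparison and use it to pick bits from y or x. */
static int32_t select_less(int32_t x, int32_t y)
{
    uint32_t mask = (y < x) ? 0xFFFFFFFFu : 0u;
    uint32_t bits = ((uint32_t)x & ~mask) | ((uint32_t)y & mask);
    /* Implementation-defined conversion; modular on GCC and Clang. */
    return (int32_t)bits;
}

/* One lane of a signed minimum, the form the report asks for. */
static int32_t min_i32(int32_t x, int32_t y)
{
    return (y < x) ? y : x;
}
```

For integers the two functions agree on every input, which is why the comment calls the transformation "for sure OK" there; the floating-point case additionally has to consider NaNs and -ftrapping-math, as discussed above.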
[Bug driver/100937] New: configure: Add --enable-default-semantic-interposition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937 Bug ID: 100937 Summary: configure: Add --enable-default-semantic-interposition Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: driver Assignee: unassigned at gcc dot gnu.org Reporter: i at maskray dot me Target Milestone: --- Add a configure option --enable-default-semantic-interposition to customize -f(no-)semantic-interposition default. The suppression of interprocedural optimizations and inlining for such default visibility non-vague-linkage function definitions is the biggest difference between -fPIE/-fPIC. Distributions may want to enable default -fno-semantic-interposition to reclaim the lost performance from -fPIC (e.g. CPython is said to be 27% faster; Clang is 3% faster).
[Bug driver/100937] configure: Add --enable-default-semantic-interposition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Andrew Pinski ---
NO. This is wrong for many reasons. First it makes portability a pain.
[Bug driver/100937] configure: Add --enable-default-semantic-interposition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

Fangrui Song changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |---
             Status|RESOLVED                    |UNCONFIRMED

--- Comment #2 from Fangrui Song ---
How is it a portability problem?

clang -fpic has always been allowing interprocedural optimizations for non-vague-linkage function definitions. FreeBSD uses clang and software works with no problem.

For a vague-linkage function definition, a call site in the same translation unit may inline the callee. Whether -fno-semantic-interposition is enabled or disabled has no effect.

For a non-vague-linkage function definition, by default (-fsemantic-interposition) the -fpic mode does not allow a call site in the same translation unit to inline the callee or perform other interprocedural optimizations. -fno-semantic-interposition re-enables interprocedural optimizations.

If a caller inlines a callee, using LD_PRELOAD to interpose the callee will not affect the caller. But many other LD_PRELOAD uses still work. We consider the small LD_PRELOAD limitation a good trade-off for the speedup.
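The distinction drawn in this comment can be illustrated with a small translation unit. The function names below are hypothetical, and the comments only restate the behaviour described in the comment for a shared-object build with -fPIC; this is a sketch, not a test from the bug report.

```cpp
#include <cassert>

// Non-vague-linkage definition with default visibility. In a shared object
// the symbol can be interposed, for example by a library loaded through
// LD_PRELOAD.
int callee() { return 1; }

// Call site in the same translation unit. With -fPIC and the default
// -fsemantic-interposition, GCC does not inline callee() here. With
// -fno-semantic-interposition it may inline it, and interposing callee()
// then no longer affects caller().
int caller() { return callee() + 1; }

// Vague-linkage definition: a same-TU call site may inline it whichever
// way the option is set.
inline int vague_callee() { return 1; }
int vague_caller() { return vague_callee() + 1; }

// Hypothetical self-check; the results do not depend on the option as long
// as nothing interposes callee().
void check_callers() {
    assert(caller() == 2);
    assert(vague_caller() == 2);
}
```

Comparing the assembly for `caller` built with `-O2 -fPIC` against `-O2 -fPIC -fno-semantic-interposition` is one way to look at the difference being discussed.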
[Bug driver/100937] configure: Add --enable-default-semantic-interposition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Andrew Pinski ---
>clang -fpic has always been allowing interprocedural optimizations for
>non-vague-linkage function definitions. FreeBSD uses clang and software works
>with no problem.

That does not mean clang is correct here. clang breaks ELF assumptions and that is all I am going to say. If you want to break ELF fine, FreeBSD can break those. But there is still a portability issue between distros using different options like this.
[Bug driver/100937] configure: Add --enable-default-semantic-interposition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937 --- Comment #4 from Andrew Pinski --- Also your patch did not change the documentation of the option. Plus the documentation is clear that changing the default is most likely not wanted at all: https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Optimize-Options.html#index-fsemantic-interposition
[Bug libstdc++/100475] semiregular-box's constructor uses wrong list-initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100475

--- Comment #7 from 康桓瑋 ---
(In reply to CVS Commits from comment #6)
> The master branch has been updated by Patrick Palka:
>
> https://gcc.gnu.org/g:fe993b469c528230d9a01e1ae2208610f960dd9f
>
> commit r12-856-gfe993b469c528230d9a01e1ae2208610f960dd9f
> Author: Patrick Palka
> Date:   Tue May 18 00:28:44 2021 -0400
>
>     libstdc++: Fix up semiregular-box partial specialization [PR100475]
>
>     This makes the in-place constructor of our partial specialization of
>     __box for already-semiregular types perform
>     direct-non-list-initialization
>     (in accordance with the specification of the primary template), and
>     additionally makes the member function data() use std::__addressof.
>
>     libstdc++-v3/ChangeLog:
>
>             PR libstdc++/100475
>             * include/std/ranges (__box::__box): Use non-list-initialization
>             in member initializer list of in-place constructor of the
>             partial specialization for semiregular types.
>             (__box::operator->): Use std::__addressof.
>             * testsuite/std/ranges/adaptors/detail/semiregular_box.cc
>             (test02): New test.
>             * testsuite/std/ranges/single_view.cc (test04): New test.

I think that even list-initialization with a single parameter should be changed to direct-non-list-initialization to avoid bugs in some uncommon situations.

#include <ranges>

struct S {
  S() = default;
  S(std::initializer_list) = delete;
  S(const S&) {}
};

S obj;
auto l = std::initializer_list{{}, {}};
auto x = std::views::single(obj);
auto y = std::views::single(std::move(l));

https://godbolt.org/z/7nePj6Y57
[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #17 from CVS Commits --- The master branch has been updated by hongtao Liu : https://gcc.gnu.org/g:16465ceb06cc1f65cfca3c0eb2c1ee27ab03bdfd commit r12-1252-g16465ceb06cc1f65cfca3c0eb2c1ee27ab03bdfd Author: liuhongt Date: Tue Jun 1 09:00:57 2021 +0800 CALL_INSN may not be a real function call. Use "used" flag for CALL_INSN to indicate it's a fake call. If it's a fake call, it won't have its own function stack. gcc/ChangeLog PR target/82735 * df-scan.c (df_get_call_refs): When call_insn is a fake call, it won't use stack pointer reg. * final.c (leaf_function_p): When call_insn is a fake call, it won't affect caller as a leaf function. * reg-stack.c (callee_clobbers_any_stack_reg): New. (subst_stack_regs): When call_insn doesn't clobber any stack reg, don't clear the arguments. * rtl.c (shallow_copy_rtx): Don't clear flag used when orig is a insn. * shrink-wrap.c (requires_stack_frame_p): No need for stack frame for a fake call. * rtl.h (FAKE_CALL_P): New macro.
[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #18 from CVS Commits --- The master branch has been updated by hongtao Liu : https://gcc.gnu.org/g:9a90b311f22956addaf4f5f9bdb3592afd45083f commit r12-1253-g9a90b311f22956addaf4f5f9bdb3592afd45083f Author: liuhongt Date: Tue Jun 1 09:09:44 2021 +0800 Fix _mm256_zeroupper by representing the instructions as call_insns in which the call has a special vzeroupper ABI. When __builtin_ia32_vzeroupper is called explicitly, the corresponding vzeroupper pattern does not carry any CLOBBERS or SETs before LRA, which leads to incorrect optimization in pass_reload. In order to solve this problem, this patch refine instructions as call_insns in which the call has a special vzeroupper ABI. gcc/ChangeLog: PR target/82735 * config/i386/i386-expand.c (ix86_expand_builtin): Remove assignment of cfun->machine->has_explicit_vzeroupper. * config/i386/i386-features.c (ix86_add_reg_usage_to_vzerouppers): Delete. (ix86_add_reg_usage_to_vzeroupper): Ditto. (rest_of_handle_insert_vzeroupper): Remove ix86_add_reg_usage_to_vzerouppers, add df_analyze at the end of the function. (gate): Remove cfun->machine->has_explicit_vzeroupper. * config/i386/i386-protos.h (ix86_expand_avx_vzeroupper): Declared. * config/i386/i386.c (ix86_insn_callee_abi): New function. (ix86_initialize_callee_abi): Ditto. (ix86_expand_avx_vzeroupper): Ditto. (ix86_hard_regno_call_part_clobbered): Adjust for vzeroupper ABI. (TARGET_INSN_CALLEE_ABI): Define as ix86_insn_callee_abi. (ix86_emit_mode_set): Call ix86_expand_avx_vzeroupper directly. * config/i386/i386.h (struct GTY(()) machine_function): Delete has_explicit_vzeroupper. * config/i386/i386.md (enum unspec): New member UNSPEC_CALLEE_ABI. (ABI_DEFAULT,ABI_VZEROUPPER,ABI_UNKNOWN): New define_constants for insn callee abi index. * config/i386/predicates.md (vzeroupper_pattern): Adjust. * config/i386/sse.md (UNSPECV_VZEROUPPER): Deleted. (avx_vzeroupper): Call ix86_expand_avx_vzeroupper. 
(*avx_vzeroupper): Rename to .. (avx_vzeroupper_callee_abi): .. this, and adjust pattern as call_insn which has a special vzeroupper ABI. (*avx_vzeroupper_1): Deleted. gcc/testsuite/ChangeLog: PR target/82735 * gcc.target/i386/pr82735-1.c: New test. * gcc.target/i386/pr82735-2.c: New test. * gcc.target/i386/pr82735-3.c: New test. * gcc.target/i386/pr82735-4.c: New test. * gcc.target/i386/pr82735-5.c: New test.
[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #19 from Hongtao.liu --- Fixed in GCC12.
[Bug target/69199] Incorrect prototypes for AVX512 unaligned load/store builtin functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69199 --- Comment #2 from Hongtao.liu --- I can confirm it has already been fixed by r7-104
[Bug gcov-profile/100938] New: [GCOV] Coverage changes when a statement is divided in multiple lines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100938

Bug ID: 100938
Summary: [GCOV] Coverage changes when a statement is divided in multiple lines
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: gcov-profile
Assignee: unassigned at gcc dot gnu.org
Reporter: njuwy at smail dot nju.edu.cn
CC: marxin at gcc dot gnu.org
Target Milestone: ---

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure -enable-checking=release -enable-languages=c,c++ -disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC)

$ cat test.c
int fn5(int x){
return -x;
}
int fn2(int x,int y){
return x+3-y;
}
int fn6(int x,int y){
return x+y;
}
int fn7(int x){
return 2*x;
}

int main()
{
int t1,t2,t3,t4,t5=1;
int b=1,y;
y = fn5(b && fn2(t1=t2,fn6(fn7(t3) < t4,t5)));
y = fn5(b && fn2(t1=t2,
fn6(fn7(t3) < t4,t5)));
}

$ gcc -O0 --coverage test.c;./a.out;gcov test;cat test.c.gcov
File 'test.c'
Lines executed:100.00% of 14
Creating 'test.c.gcov'

        -:    0:Source:test.c
        -:    0:Graph:test.gcno
        -:    0:Data:test.gcda
        -:    0:Runs:1
        2:    1:int fn5(int x){
        2:    2:return -x;
        -:    3:}
        2:    4:int fn2(int x,int y){
        2:    5:return x+3-y;
        -:    6:}
        2:    7:int fn6(int x,int y){
        2:    8:return x+y;
        -:    9:}
        2:   10:int fn7(int x){
        2:   11:return 2*x;
        -:   12:}
        -:   13:
        1:   14:int main()
        -:   15:{
        1:   16:int t1,t2,t3,t4,t5=1;
        1:   17:int b=1,y;
       1*:   18: y = fn5(b && fn2(t1=t2,fn6(fn7(t3) < t4,t5)));
       2*:   19: y = fn5(b && fn2(t1=t2,
        1:   20:fn6(fn7(t3) < t4,t5)));
        -:   21:}

Line 18 and 19 should be executed the same number of times
[Bug target/100931] [x86-64] Failure to optimize 2 32-bit stores converted to a 64-bit store into using movabs instead of loading from a constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100931

Hongtao.liu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #2 from Hongtao.liu ---
Which options do you use? With -O2 -march=x86-64, gcc generates the same asm for g and h.

https://godbolt.org/z/Wx5eG39aG
[Bug target/100885] [12 Regression] ICE: in extract_constrain_insn, at recog.c:2671: insn does not satisfy its constraints: {sse4_1_zero_extendv8qiv8hi2}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100885 --- Comment #6 from CVS Commits --- The master branch has been updated by hongtao Liu : https://gcc.gnu.org/g:be5efe9c12cb852c788f74f8555e6ab8d755479b commit r12-1254-gbe5efe9c12cb852c788f74f8555e6ab8d755479b Author: liuhongt Date: Thu Jun 3 16:38:32 2021 +0800 Fix ICE of insn does not satisfy its constraints. evex encoding vpmovzxbx needs both AVX512BW and AVX512VL which means constraint "Yw" should be used instead of constraint "v". gcc/ChangeLog: PR target/100885 * config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3): Refine constraints. (v4siv4di2): Delete constraints for define_expand. gcc/testsuite/ChangeLog: PR target/100885 * g++.target/i386/pr100885.C: New test.
[Bug c/100939] New: Missing warning with misplaced attribute declaration in struct, enum, or union definition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100939

Bug ID: 100939
Summary: Missing warning with misplaced attribute declaration in struct, enum, or union definition
Product: gcc
Version: 11.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: johnfbennett at protonmail dot com
Target Milestone: ---

$ cat misplacedattribute.c
struct samplestruct {
	int member1;
	int member2;
};

int main(void) {
	struct __attribute__((__unused__)) samplestruct samplestruct;
	return 0;
}
$ gcc -Wall misplacedattribute.c
misplacedattribute.c: In function ‘main’:
misplacedattribute.c:7:50: warning: unused variable ‘samplestruct’ [-Wunused-variable]
    7 | struct __attribute__((__unused__)) samplestruct samplestruct;
      | ^~~~
$
[Bug target/100885] [12 Regression] ICE: in extract_constrain_insn, at recog.c:2671: insn does not satisfy its constraints: {sse4_1_zero_extendv8qiv8hi2}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100885 --- Comment #7 from CVS Commits --- The releases/gcc-11 branch has been updated by hongtao Liu : https://gcc.gnu.org/g:c064e787b10069e3de56bd3d0d1a34a1a09086ea commit r11-8517-gc064e787b10069e3de56bd3d0d1a34a1a09086ea Author: liuhongt Date: Thu Jun 3 16:38:32 2021 +0800 Fix ICE of insn does not satisfy its constraints. evex encoding vpmovzxbx needs both AVX512BW and AVX512VL which means constraint "Yw" should be used instead of constraint "v". gcc/ChangeLog: PR target/100885 * config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3): Refine constraints. (v4siv4di2): Delete constraints for define_expand. gcc/testsuite/ChangeLog: PR target/100885 * g++.target/i386/pr100885.C: New test.
[Bug libstdc++/100940] New: views::take and views::drop should not define _S_has_simple_extra_args
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100940

Bug ID: 100940
Summary: views::take and views::drop should not define _S_has_simple_extra_args
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: hewillk at gmail dot com
Target Milestone: ---

For views::take and views::drop, we need to perfectly forward the incoming arg in some uncommon situations:

#include <ranges>

struct Five { operator int() && { return 5; } } five;
extern int x[10];
auto r = x | std::views::take(five);

https://godbolt.org/z/MEsssWGEh
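Why the value category of the argument matters can be seen without the ranges machinery. The sketch below reuses the report's `Five` type. The helper `count_from` is hypothetical and is not libstdc++ code; it only shows that an rvalue-qualified conversion operator is usable when the argument is forwarded as an rvalue and is not usable once the argument is treated as an lvalue.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Same shape as the type in the report: convertible to int only from an rvalue.
struct Five {
    operator int() && { return 5; }
};

// The conversion is available for rvalues only.
static_assert(std::is_convertible<Five, int>::value,
              "an rvalue Five converts to int");
static_assert(!std::is_convertible<Five&, int>::value,
              "an lvalue Five does not convert to int");
static_assert(!std::is_convertible<const Five&, int>::value,
              "a const lvalue Five does not convert to int");

// Hypothetical helper: perfect forwarding keeps the caller's value category,
// so the &&-qualified conversion operator stays viable for rvalue arguments.
template <typename T>
int count_from(T&& n) {
    return static_cast<int>(std::forward<T>(n));
}

// Hypothetical self-check.
void check_count_from() {
    assert(count_from(Five{}) == 5);
    Five five;
    assert(count_from(std::move(five)) == 5);
}
```

An adaptor that stores the argument and later converts it from a const lvalue would reject `Five`, which appears to be the kind of situation the report has in mind.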
[Bug target/100885] [12 Regression] ICE: in extract_constrain_insn, at recog.c:2671: insn does not satisfy its constraints: {sse4_1_zero_extendv8qiv8hi2}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100885 --- Comment #8 from Hongtao.liu --- Fixed in trunk.
[Bug rtl-optimization/88770] Redundant load opt. or CSE pessimizes code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88770

--- Comment #3 from Hongtao.liu ---
Shouldn't pass_store_merging be a better place to handle such an optimization?
Currently store-merging only merges .a and .b, and fails to merge .c and .d.

202t.store-merging

void caller ()
{
  struct guu D.4030;
  struct guu D.4029;

  [local count: 1073741824]:
  MEM [(int *)&D.4029] = 21474836483;
  D.4029.c = 7.0e+0;
  D.4029.d = 9;
  test (D.4029);
  MEM [(int *)&D.4030] = 21474836483;
  D.4030.c = 7.0e+0;
  D.4030.d = 9;
  test (D.4030);
  D.4029 ={v} {CLOBBER};
  D.4030 ={v} {CLOBBER};
  return;
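The merged constant in the dump can be checked by hand. 21474836483 equals 5 * 2^32 + 3, which is what a single 64-bit little-endian store would hold if .a were 3 and .b were 5. Those field values are inferred from the arithmetic and are not stated in this comment, and the helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Combine two adjacent 32-bit values into the 64-bit constant that one
// little-endian store would write: the first field lands in the low half,
// the second field in the high half.
constexpr std::uint64_t merged_constant(std::uint32_t first, std::uint32_t second) {
    return (static_cast<std::uint64_t>(second) << 32) | first;
}

// Recover the two halves again.
constexpr std::uint32_t low_half(std::uint64_t v) {
    return static_cast<std::uint32_t>(v & 0xffffffffu);
}
constexpr std::uint32_t high_half(std::uint64_t v) {
    return static_cast<std::uint32_t>(v >> 32);
}

// Assumed field values (3 and 5) reproduce the constant seen in the dump.
static_assert(merged_constant(3, 5) == 21474836483ULL,
              "3 in the low half and 5 in the high half");

// Hypothetical self-check.
void check_merged_constant() {
    assert(low_half(21474836483ULL) == 3);
    assert(high_half(21474836483ULL) == 5);
}
```

The stores to .c and .d remain separate statements in the dump, which is the missed merge the comment points at.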
[Bug c++/54835] [C++11][DR 1518] Explicit default constructors not respected during copy-list-initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54835

TC changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rs2740 at gmail dot com

--- Comment #21 from TC ---
(In reply to David Friberg from comment #19)
>
> P0398R0 [1] describes the final resolution to CWG 1518, after which the
> following example is arguably well-formed:
>

It's not. Explicitness of a constructor is not considered when forming implicit conversion sequences from a braced-init-list, and therefore the assignment is ambiguous because {} can convert to either S or tag_t, even though the latter is ill-formed if actually used.
[Bug target/100941] New: wrong code with __builtin_shufflevector() with -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100941

Bug ID: 100941
Summary: wrong code with __builtin_shufflevector() with -mavx512f
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 50957
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50957&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc testcase.c -Wno-psabi
$ ./a.out
$ x86_64-pc-linux-gnu-gcc testcase.c -mavx512f
$ ./a.out
Aborted

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-1254-20210607112745-gbe5efe9c12c-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r12-1254-20210607112745-gbe5efe9c12c-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20210607 (experimental) (GCC)
[Bug target/100929] gcc fails to optimize less to min for SIMD code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100929

--- Comment #4 from Marc Glisse ---
(In reply to Denis Yaroshevskiy from comment #3)
> Is what @Andrew Pinski copied enough?

I think so (it is missing the command line), although one example with an integer type could also help in case floats turn out to have a different issue.

> -ftrapping-math causes clang to stop doing this optimisation.

Note that -ftrapping-math is on by default with gcc (PR 54192), but -fno-trapping-math wouldn't solve your problem, we are missing other things.
[Bug target/69199] Incorrect prototypes for AVX512 unaligned load/store builtin functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69199

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |7.0
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Andrew Pinski ---
Fixed so closing.