[Bug tree-optimization/82337] [5/6/7/8 Regression] ICE: SSA corruption at tree-ssa-coalesce.c:1010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82337 --- Comment #10 from Ivo Raisr --- (In reply to Bill Schmidt from comment #9) I confirm this fixes the problem also in the original full-blown source.
[Bug c++/66601] RFE: improve diagnostics for failure to deduce template parameter pack that is not in the last position in the parameter list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66601 Eric Gallager changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2017-09-28 CC||egallager at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Eric Gallager --- Confirmed.
[Bug tree-optimization/65461] -Warray-bounds warnings in the linux kernel (free_area_init_nodes)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65461 Eric Gallager changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2017-09-28 CC||egallager at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Eric Gallager --- Confirmed that I get the -Warray-bounds warning too.
[Bug middle-end/65041] Improve -Wclobbered
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65041 Eric Gallager changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2017-09-28 CC||egallager at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Eric Gallager --- Confirmed that gcc warns for a2 instead of fd.
[Bug target/71727] -O3 -mstrict-align produces code which assumes unaligned vector accesses work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71727 --- Comment #4 from Christophe Lyon --- Author: clyon Date: Wed Sep 27 23:52:58 2017 New Revision: 253242 URL: https://gcc.gnu.org/viewcvs?rev=253242&root=gcc&view=rev Log: [AArch64] PR71727 fix -mstrict-align 2017-09-27 Christophe Lyon PR target/71727 gcc/ * config/aarch64/aarch64.c (aarch64_builtin_support_vector_misalignment): Always return false when misalignment is unknown. gcc/testsuite/ * gcc.target/aarch64/pr71727-2.c: New test Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64.c trunk/gcc/testsuite/ChangeLog
[Bug c++/82347] New: Class Name Injection and Constructor Typenames
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82347 Bug ID: 82347 Summary: Class Name Injection and Constructor Typenames Product: gcc Version: 7.0.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ahuszagh at gmail dot com Target Milestone: --- G++ (including 7.0.1 and 6.3.0) allows the following code to compile without any warning, permitting a type alias of the constructor of a class. Source File --- a.cpp ``` #include <iostream> #include <typeinfo> #include <vector> int main() { using a = typename std::vector::vector; std::cout << typeid(a).name() << std::endl; return 0; } ``` G++ Information --- Using built-in specs. COLLECT_GCC=g++-7 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7-20170407-0ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 7.0.1 20170407 (experimental) [trunk revision 246759] (Ubuntu 7-20170407-0ubuntu2) Complete Command
g++-7 a.cpp Compiler Output --- N/A Preprocessed File - (Not applicable?) Description --- ISO C++ in [class.qual]/2 states that such a qualified name, where the nested name specifier names the class, refers to the constructor and not the class, so using `typename x::x` is not standards compliant. More information can be found in the StackOverflow post I made on the topic: https://stackoverflow.com/questions/46412754/class-name-injection-and-constructors
[Bug c++/82343] internal compiler error: Segmentation fault - template recurrency, SFINAE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82343 Mark changed: What|Removed |Added Attachment #42244|0 |1 is obsolete|| --- Comment #1 from Mark --- Created attachment 42251 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42251&action=edit (simplified) Preprocessed source code, generated by adding -save-temps I managed to narrow down the problem (see attachment). I was wrong about the function template recursion. The problem is associated with a parameter pack of size 0 or 1 in the SFINAE test. Confirmed with gcc: 8.0.0, 7.2.0, 7.1.0, 6.3.0, 6.2.0, 6.1.0, 5.4.0, 5.3.0, 5.2.0, 5.1.0, 4.9.3, 4.9.2, 4.9.1, 4.9.0, 4.8.5, 4.8.4, 4.8.3, 4.8.2, 4.8.1. Below version 4.8.1 (4.7.4 and below) everything is okay.
[Bug lto/82172] Destruction of basic_string in basic_stringbuf::overflow with _GLIBCXX_USE_CXX11_ABI=0, -flto, and C++17 mode results in invalid delete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82172 --- Comment #21 from Gubbins --- (In reply to Martin Liška from comment #20) > Your failure happens even w/o LTO, am I right? > But yes, the problem looks very similar to what happens for ld.bfd. You are right. Does anyone know how I would raise this with someone who can fix it on the Darwin side? Or could it be worked around by gcc?
[Bug fortran/81509] Wrong compilation error: iand/ieor/ior + boz + -std=f2008
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81509 --- Comment #6 from Steve Kargl --- On Wed, Sep 27, 2017 at 10:59:56PM +, dominiq at lps dot ens.fr wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81509 > > --- Comment #5 from Dominique d'Humieres --- > pr45513 and pr54072 could be duplicates. > I don't recall either of those PR's, and have no idea why I would have missed them. :-\ pr45513 should be covered by my patch. A portion of pr54072 is also covered, but pr54072 indicates a BOZ can be used with other intrinsic subprograms, e.g., TRANSFER. This isn't surprising as a boz is marked as a BT_INTEGER. That is, for gfc_expr x; a boz has x->ts.type = BT_INTEGER x->is_boz = 1 In hindsight, we probably should have introduced BT_BOZ and treated it as some opaque entity with helper functions. For example, gfc_boz2int(x,kind) would convert the BOZ in x to an INTEGER with kind type parameter 'kind'.
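The "opaque entity with helper functions" idea from the comment above can be illustrated with a small standalone sketch. This is not gfortran code: `Boz`, `parse_boz`, and `boz2int` are hypothetical names, only the prefix forms b'...', o'...', and z'...' are handled, and the Fortran kind is modeled simply as a width in bytes.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical opaque BOZ entity: it carries digits and a radix, but no
// integer type until a conversion helper is applied.
struct Boz {
    int radix;           // 2, 8, or 16
    std::string digits;  // digit characters without prefix or quotes
};

// Parse prefix-form literals such as b'1010', o'17', z'FF'.
std::optional<Boz> parse_boz(const std::string& lit) {
    if (lit.size() < 4) return std::nullopt;
    int radix = 0;
    switch (lit[0]) {
        case 'b': case 'B': radix = 2; break;
        case 'o': case 'O': radix = 8; break;
        case 'z': case 'Z': radix = 16; break;
        default: return std::nullopt;
    }
    const char quote = lit[1];
    if ((quote != '\'' && quote != '"') || lit.back() != quote)
        return std::nullopt;
    return Boz{radix, lit.substr(2, lit.size() - 3)};
}

// Convert to an unsigned value that fits in `kind` bytes (1, 2, 4, or 8).
// Returns nullopt for an invalid digit or a value that does not fit.
std::optional<std::uint64_t> boz2int(const Boz& b, int kind) {
    if (kind != 1 && kind != 2 && kind != 4 && kind != 8) return std::nullopt;
    const std::uint64_t max =
        kind == 8 ? UINT64_MAX : ((std::uint64_t{1} << (8 * kind)) - 1);
    std::uint64_t value = 0;
    for (char c : b.digits) {
        int d;
        if (c >= '0' && c <= '9') d = c - '0';
        else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
        else return std::nullopt;
        if (d >= b.radix) return std::nullopt;
        // Reject the digit if value * radix + d would exceed max.
        if (value > (max - d) / b.radix) return std::nullopt;
        value = value * b.radix + d;
    }
    return value;
}
```

The point of the design is that the literal stays typeless until a caller asks for a specific kind, so a range check happens at conversion time rather than at parse time.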
[Bug target/80210] ICE in in extract_insn, at recog.c:2311 on ppc64 for with __builtin_pow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80210 --- Comment #16 from Peter Bergner --- While investigating the new failure in Comment 15, I modified the test case slightly to move the #pragma to the beginning of the test case. I found I get another similar looking ICE, but which isn't the same as the bug reported in Comment 15: bergner@bns:~/gcc/BUGS/PR80210> /home/bergner/gcc/build/gcc-fsf-mainline-pr80210-64-base/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-pr80210-64-base/gcc -O2 -S no-sqrt.i no-sqrt.i: In function ‘foo’: no-sqrt.i:6:1: error: unrecognizable insn: } ^ (insn 6 3 7 2 (set (reg:DF 121 [ ]) (sqrt:DF (reg/v:DF 122 [ a ]))) "no-sqrt.i":5 -1 (nil)) during RTL pass: vregs no-sqrt.i:6:1: internal compiler error: in extract_insn, at recog.c:2304 0x101330c7 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /home/bergner/gcc/gcc-fsf-mainline-pr80210-base/gcc/rtl-error.c:108 0x1013310b _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) /home/bergner/gcc/gcc-fsf-mainline-pr80210-base/gcc/rtl-error.c:116 0x1085747b extract_insn(rtx_insn*) /home/bergner/gcc/gcc-fsf-mainline-pr80210-base/gcc/recog.c:2304 0x10555bff instantiate_virtual_regs_in_insn /home/bergner/gcc/gcc-fsf-mainline-pr80210-base/gcc/function.c:1591 0x10555bff instantiate_virtual_regs /home/bergner/gcc/gcc-fsf-mainline-pr80210-base/gcc/function.c:1959 0x10555bff execute /home/bergner/gcc/gcc-fsf-mainline-pr80210-base/gcc/function.c:2008 After debugging this, I have found that this is a problem saving and restoring the optab values, so basically the opposite problem than we had before. I have a patch that I am testing that fixes both new problems.
[Bug libstdc++/82346] [5.5 Regression] String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 --- Comment #7 from Jonathan Wakely --- The condition for std::to_string being declared in gcc-5 is: #if __cplusplus >= 201103L && defined(_GLIBCXX_USE_C99) So presumably _GLIBCXX_USE_C99 is false. If you're using glibc 2.26 you might have hit https://sourceware.org/bugzilla/show_bug.cgi?id=22146 and so need a glibc fix so that libstdc++ correctly detects C99 support.
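To see which half of the quoted condition fails on an affected system, a check along these lines may help. It is only a sketch: `to_string_status` is a made-up name, `_GLIBCXX_USE_C99` is an internal libstdc++ macro as used by gcc-5, and other libstdc++ releases or other standard libraries may use different macros, in which case the second branch says nothing useful.

```cpp
#include <string>  // including any standard header makes the library's config macros visible

// Mirrors the gcc-5 condition quoted above and reports which part is unmet.
// The function name is hypothetical.
const char* to_string_status() {
#if __cplusplus < 201103L
    return "not C++11: pass -std=c++11 (or later) to the compiler";
#elif !defined(_GLIBCXX_USE_C99)
    return "_GLIBCXX_USE_C99 is not defined: C99 support was not detected";
#else
    return "both parts of the condition hold";
#endif
}
```

Printing the returned string from a test program built with the same g++-5 command line shows whether the language mode or the C99 detection is responsible.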
[Bug target/80210] ICE in in extract_insn, at recog.c:2311 on ppc64 for with __builtin_pow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80210 Peter Bergner changed: What|Removed |Added Status|CLOSED |ASSIGNED CC||schwab at gcc dot gnu.org Resolution|FIXED |--- --- Comment #15 from Peter Bergner --- David and Andreas have reported that they are seeing ICEs on powerpc-aix and powerpc-linux (32-bit) with the test case added as part of the fix for this bug. I didn't see this on my BE builds, because my build scripts were using the --with-cpu=... configure command and using any -mcpu=... option will work around the bug. The bug isn't due to my earlier patch, since the ICE existed before my patch; we just didn't have the test case to notice before. The problem they are seeing is due to a mismatch between TARGET_DEFAULT, which contains MASK_PPC_GPOPT and the ISA flags for the default "powerpc64" cpu, which does not contain MASK_PPC_GPOPT and how rs6000_option_override_internal() decides which one to use. The failure scenario is: Early on, we call init_all_optabs() which sets up a table which describes which patterns that generate some HW insns are "valid". Before we call init_all_optabs(), rs6000_option_override_internal() gets called with global_init_p arg set to "true" and we basically set rs6000_isa_flags to TARGET_DEFAULT. This is because we do not have a -mcpu= value nor do we have an "implicit_cpu", which forces us to use TARGET_DEFAULT. With this, init_all_optabs() thinks we can generate a HW sqrt, so it enables generating its pattern. Later, after we've scanned the entire file, we go to expand our function into RTL and we reset our compiler options and we end up calling rs6000_option_override_internal() again, but with global_init_p arg now false and we encounter this code: struct cl_target_option *main_target_opt = ((global_init_p || target_option_default_node == NULL) ? NULL : TREE_TARGET_OPTION (target_option_default_node)); This ends up setting main_target_opt to a non-NULL value, then: ... 
else if (main_target_opt != NULL && main_target_opt->x_rs6000_cpu_index >= 0) { rs6000_cpu_index = cpu_index = main_target_opt->x_rs6000_cpu_index; have_cpu = true; } So now we act as if the user explicitly passed in a -mcpu= option, then: ... /* If we have a cpu, either through an explicit -mcpu= or if the compiler was configured with --with-cpu=, replace all of the ISA bits with those from the cpu, except for options that were explicitly set. If we don't have a cpu, do not override the target bits set in TARGET_DEFAULT. */ if (have_cpu) { rs6000_isa_flags &= ~set_masks; rs6000_isa_flags |= (processor_target_table[cpu_index].target_enable & set_masks); } else { /* If no -mcpu=, inherit any default options that were cleared via POWERPC_MASKS. Originally, TARGET_DEFAULT was used to initialize target_flags via the TARGET_DEFAULT_TARGET_FLAGS hook. When we switched to using rs6000_isa_flags, we need to do the initialization here. If there is a TARGET_DEFAULT, use that. Otherwise fall back to using -mcpu=powerpc, -mcpu=powerpc64, or -mcpu=powerpc64le defaults. */ HOST_WIDE_INT flags = ((TARGET_DEFAULT) ? TARGET_DEFAULT : processor_target_table[cpu_index].target_enable); rs6000_isa_flags |= (flags & ~rs6000_isa_flags_explicit); } So the first time through here with global_init_p == true, have_cpu is set to false and we get TARGET_DEFAULT. The next time we come here, global_init_p == false and we set have_cpu to true because main_target_opt is non-NULL and the cpu_index value is set to "powerpc64" (for -m64 compiles) or "powerpc" (for -m32 compiles). This causes us to now grab the ISA flags from: processor_target_table[cpu_index].target_enable ...instead of from TARGET_DEFAULT and neither "powerpc64" nor "powerpc" contain the MASK_PPC_GPOPT flag, which leads us to ICE because the optabs allows us to generate the HW sqrt pattern, but our ISA flags don't allow it. 
This doesn't affect LE builds, because it has a TARGET_DEFAULT value that matches the "powerpc64le" default masks. We also enforce passing a -mcpu=power8 option when the user doesn't explicitly use one, so again, not a problem. This also doesn't affect --target=powerpc-linux builds or --target=powerpc64-linux builds that default to 32-bit binaries, because we use a value of TARGET_DEFAULT == 0 (for both -m32 and -m64), so the first time through rs6000_option_override_internal(), we end up using processor_target_table[cpu_index].target_enable right from the beginning.
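The two-pass mismatch described in this comment can be modeled with a toy sketch. Nothing here is the real rs6000 code: the bit values, `TARGET_DEFAULT_FLAGS`, `POWERPC64_ENABLE`, and `override_flags` are simplified stand-ins for TARGET_DEFAULT, the "powerpc64" entry of processor_target_table, and rs6000_option_override_internal(), and explicit user options are ignored.

```cpp
#include <cstdint>

// Illustrative bit values only; not the real rs6000 masks.
constexpr std::uint64_t MASK_PPC_GPOPT = 1u << 0;  // "HW sqrt is available"
constexpr std::uint64_t MASK_OTHER     = 1u << 1;

// Stand-in for TARGET_DEFAULT on the affected configurations.
constexpr std::uint64_t TARGET_DEFAULT_FLAGS = MASK_PPC_GPOPT | MASK_OTHER;

// Stand-in for processor_target_table[cpu_index].target_enable ("powerpc64").
constexpr std::uint64_t POWERPC64_ENABLE = MASK_OTHER;

// Simplified flag selection: with a cpu, take the table entry; without one,
// fall back to TARGET_DEFAULT. Assumes the saved cpu index is valid (>= 0).
std::uint64_t override_flags(bool global_init_p, bool have_default_node) {
    // main_target_opt is non-NULL only on calls after global initialization.
    const bool main_target_opt = !global_init_p && have_default_node;
    const bool have_cpu = main_target_opt;  // behaves as if -mcpu= was given
    return have_cpu ? POWERPC64_ENABLE : TARGET_DEFAULT_FLAGS;
}

// True when the flags seen while the optabs were initialized allow sqrt but
// the flags chosen at RTL expansion time do not.
bool sqrt_mismatch() {
    const std::uint64_t at_init   = override_flags(true, true);
    const std::uint64_t at_expand = override_flags(false, true);
    return (at_init & MASK_PPC_GPOPT) != 0 && (at_expand & MASK_PPC_GPOPT) == 0;
}
```

In this model `sqrt_mismatch()` is true, which corresponds to the reported ICE: the sqrt pattern was enabled under one set of flags and later rejected under another. Making `TARGET_DEFAULT_FLAGS` equal to the table entry removes the mismatch, which matches the explanation of why LE builds are unaffected.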
[Bug fortran/81509] Wrong compilation error: iand/ieor/ior + boz + -std=f2008
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81509 --- Comment #5 from Dominique d'Humieres --- pr45513 and pr54072 could be duplicates.
[Bug libstdc++/82346] [5.5 Regression] String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WORKSFORME --- Comment #6 from Jonathan Wakely --- This must be an Ubuntu bug. It works fine here.
[Bug rtl-optimization/82338] valgrind error in inherit_in_ebb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82338 --- Comment #1 from David Binderman --- 12 hours reducing leads to this C++ code: extern "C" { void a(); void *memset(void *, int, unsigned long); } struct b { int c; int d; } e[5000], *f; int g; int h; int i; int j, k; void l(int); int m; int o; void load() { int n; memset(e, 0, sizeof(e)); for (; k;) ; a(); for (; m < o; m++) for (; i; n++) { e[j].c = g; if (f[h].d && n == i) l(-1); } }
[Bug lto/82302] LTO producing bad code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82302 --- Comment #9 from krzysio.kurek at wp dot pl --- I think I located the issue. It works fine on my machine, but I found an error using glslangValidator. Please try pulling and compiling again.
[Bug target/68924] No intrinsic for x86 `MOVQ m64, %xmm` in 32bit mode.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68924 --- Comment #3 from Peter Cordes --- (In reply to Marc Glisse from comment #2) > Does anything bad happen if you remove the #ifdef/#endif for > _mm_cvtsi64_si128? (2 files in the testsuite would need updating for a > proper patch) It's just a wrapper for _mm_cvtsi64_si128 (long long __A) { return _mm_set_epi64x (0, __A); } and _mm_set_epi64x is already available in 32-bit mode. I tried using _mm_set_epi64x(0, i) (https://godbolt.org/g/24AYPk), and got the expected results (same as with _mm_loadl_epi64()); __m128i movq_test(uint64_t *p) { return _mm_set_epi64x( 0, *p ); } movl 4(%esp), %eax vmovq (%eax), %xmm0 ret For the test where we shift before movq, it still uses 32-bit integer double-precision shifts, stores to the stack, then vmovq (instead of optimizing to vmovq / vpsllq) For the reverse, we get: long long extract(__m128i v) { return ((__v2di)v)[0]; } subl $28, %esp vmovq %xmm0, 8(%esp) movl 8(%esp), %eax movl 12(%esp), %edx addl $28, %esp ret MOVD / PEXTRD might be better, but gcc does handle it. It's all using syntax that's available in 32-bit mode, not a special built-in. I don't think it's helpful to disable the 64-bit integer intrinsics for 32-bit mode, even though they are no longer always single instructions. I guess it could be worse if someone used it without thinking, assuming it would be the same cost as MOVD, and didn't really need the full 64 bits. In that case, a compile-time error would prompt them to port more optimally to 32-bit. But it's not usually gcc's job to refuse to compile code that might be sub-optimal!
[Bug libstdc++/82346] [5.5 Regression] String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 --- Comment #5 from krzysio.kurek at wp dot pl --- $ g++-5 -std=c++11 main.cpp -o string -v Using built-in specs. COLLECT_GCC=g++-5 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.1-12ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 5.4.1 20170906 (Ubuntu 5.4.1-12ubuntu4) COLLECT_GCC_OPTIONS='-std=c++11' '-o' 'string' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /usr/lib/gcc/x86_64-linux-gnu/5/cc1plus -quiet -v -imultiarch x86_64-linux-gnu -D_GNU_SOURCE main.cpp -quiet -dumpbase main.cpp -mtune=generic -march=x86-64 -auxbase main -std=c++11 -version -fstack-protector-strong -Wformat -Wformat-security -o /tmp/ccUSBFkt.s GNU C++11 (Ubuntu 5.4.1-12ubuntu4) version 5.4.1 20170906 
(x86_64-linux-gnu) compiled by GNU C version 5.4.1 20170906, GMP version 6.1.2, MPFR version 3.1.6-rc1, MPC version 1.0.3 warning: MPFR header version 3.1.6-rc1 differs from library version 3.1.6. GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 ignoring duplicate directory "/usr/include/x86_64-linux-gnu/c++/5" ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu" ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/5/../../../../x86_64-linux-gnu/include" #include "..." search starts here: #include <...> search starts here: /usr/include/c++/5 /usr/include/x86_64-linux-gnu/c++/5 /usr/include/c++/5/backward /usr/lib/gcc/x86_64-linux-gnu/5/include /usr/local/include /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed /usr/include/x86_64-linux-gnu /usr/include End of search list. GNU C++11 (Ubuntu 5.4.1-12ubuntu4) version 5.4.1 20170906 (x86_64-linux-gnu) compiled by GNU C version 5.4.1 20170906, GMP version 6.1.2, MPFR version 3.1.6-rc1, MPC version 1.0.3 warning: MPFR header version 3.1.6-rc1 differs from library version 3.1.6. GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Compiler executable checksum: 4fa6e505be5bb1fff9039a541d8268ee main.cpp: In function ‘int main()’: main.cpp:6:25: error: ‘to_string’ is not a member of ‘std’ std::string perfect = std::to_string(1+2+4+7+14) + " is a perfect number"; ^
[Bug libstdc++/82346] [5.5 Regression] String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 Andrew Pinski changed: What|Removed |Added Component|c++ |libstdc++ Target Milestone|--- |5.5 Summary|String is not detected as a |[5.5 Regression] String is |part of std |not detected as a part of ||std --- Comment #3 from Andrew Pinski --- Can you provide the exact output of g++ then?
[Bug libstdc++/82346] [5.5 Regression] String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 --- Comment #4 from Andrew Pinski --- (In reply to Andrew Pinski from comment #3) > Can you provide the exact output of g++ then? Can you provide the exact output of g++ -v then? Sorry for the typo.
[Bug c++/82346] String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 krzysio.kurek at wp dot pl changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|INVALID |--- --- Comment #2 from krzysio.kurek at wp dot pl --- Oh yeah I could have mentioned, my bad. -std=c++11/-std=c++14 do not fix the issue. It only fails on 5.4.1, and works on 5.4.0.
[Bug c++/82346] String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- This compiles for me with GCC 7 without any -std=* options. GCC 6 and above default to C++14 while GCC 5 defaults to C++03 so you might need -std=c++11 or -std=gnu++11 to make std::to_string work correctly.
[Bug c++/82346] New: String is not detected as a part of std
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 Bug ID: 82346 Summary: String is not detected as a part of std Product: gcc Version: 5.4.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: krzysio.kurek at wp dot pl Target Milestone: --- Most basic code fails to compile. #include <iostream> #include <string> int main () { std::string perfect = std::to_string(1+2+4+7+14) + " is a perfect number"; std::cout << perfect << '\n'; return 0; }
[Bug target/82339] Inefficient movabs instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339 --- Comment #5 from Peter Cordes --- (In reply to Richard Biener from comment #2) > I always wondered if it is more efficient to have constant pools per function > in .text so we can do %rip relative loads with short displacement? There's no rel8 encoding for RIP-relative; it's always RIP+rel32, so this doesn't save code-size. (AMD64 hacked it in by repurposing one of the two redundant ways to encode a 32-bit absolute address with no base or index register; the ModRM machine-code encoding is otherwise the same between x86-32 and x86-64.) > I suppose the assembler could even optimize things if there's the desired > constant somewhere near in the code itself... (in case data loads from icache > do not occur too much of a penalty). There's no penalty for loads AFAIK, only stores to addresses near RIP are snooped and cause self-modifying-code machine clears. Code will often be hot in L2 cache as well as L1I, so an L1D miss could hit there. But L1dTLB is separate from L1iTLB, so you could TLB miss even when loading from the instruction you're running. (The L2TLB is usually a victim cache, IIRC, so a TLB miss that loaded the translation into the L1iTLB doesn't also put it into L2TLB.) > The assembler could also replace > .palign space before function start with (small) constant(s). This could be a win in some cases, if L1D pressure is low or there wasn't any locality with other constants anyway. If there could have been locality, you're just wasting space in L1D by having your data spread out across more cache lines. But in general on x86, it's probably not a good strategy. BTW, gcc could do a lot better with vector constants. e.g. set1_ps(1.0f) could compile to a vbroadcastss load (which is the same cost as a normal vmovaps). But instead it actually repeats the 1.0f in memory 8 times. 
That's useful if you want to use it as a memory operand, because before AVX512 you can't have broadcast memory operands to ALU instructions. But if it's only ever loaded ahead of a loop, a broadcast load or a PMOVZX load can save a lot of space. In a function with multiple vector constants, this is the difference between one vs. multiple cache lines for all its data. (vpbroadcastd/q, ss/sd, and 128-bit are handled in the load ports on Intel and AMD, but vector PMOVZX/SX with a memory operand is still a micro-fused load+ALU.) Still, it could easily be worth it for e.g. _mm256_set_epi32(1,2,3,4,5,6,7,8), storing that as .byte 1,2,3,4,5,6,7,8. The downside is lost opportunities for different functions to share the same constant like with string-literal deduplication. If one function wants the full constant in memory for use as a memory operand, it's probably better for all functions to use that copy. Except that putting all the constants for a given function into a couple cache lines is good for locality when it runs. If the full copy somewhere else isn't generally hot when a function that could use a broadcast or pmovzx/pmovsx load runs, it might be better for it to use a separate copy stored with the constants it does touch.
[Bug libfortran/66756] libgfortran: ThreadSanitizer: lock-order-inversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66756 --- Comment #12 from Thomas Koenig --- Correction... the patch does not work with a simple example such as program main !$OMP PARALLEL NUM_THREADS(4) print *,"Hello, world" !$OMP END PARALLEL end program main Some more digging to do...
[Bug fortran/81509] Wrong compilation error: iand/ieor/ior + boz + -std=f2008
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81509 --- Comment #4 from kargl at gcc dot gnu.org --- A patch has been submitted. See https://gcc.gnu.org/ml/fortran/2017-09/msg00124.html
[Bug tree-optimization/82337] [5/6/7/8 Regression] ICE: SSA corruption at tree-ssa-coalesce.c:1010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82337 --- Comment #9 from Bill Schmidt --- Revised and tested patch posted here: https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01836.html
[Bug testsuite/82324] Problem in new trunk test case gfortran.dg/promotion_4.f90
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82324 janus at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |8.0 --- Comment #7 from janus at gcc dot gnu.org --- I hope all failures should be fixed with r253214. If not, please reopen.
[Bug libfortran/66756] libgfortran: ThreadSanitizer: lock-order-inversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66756 --- Comment #11 from Thomas Koenig --- Created attachment 42250 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42250&action=edit Proposed patch This patch is an attempt at getting rid of the lock-order inversion. It seems to do the right thing, and survives both regression-testing and the thread sanitizer. It is not yet complete (comments are not adjusted). I would be grateful if somebody had a way to stress-test it.
[Bug target/82339] Inefficient movabs instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339 --- Comment #4 from Peter Cordes --- (In reply to Jakub Jelinek from comment #0) > At least on i7-5960X in the following testcase: > > baz is fastest as well as shortest. > So I think we should consider using movl $cst, %edx; shlq $shift, %rdx > instead of movabsq $(cst << shift), %rdx. > > Unfortunately I can't find in Agner Fog MOVABS and for MOV r64,i64 there is > too little information, so it is unclear on which CPUs it is beneficial. Agner uses Intel syntax, where imm64 doesn't have a special mnemonic. It's part of the mov r,i entry in the tables. But those tables are throughput for a flat sequence of the instruction repeated many times, not mixed with others where front-end effects can be different. Agner probably didn't actually test mov r64,imm64, because its throughput is different when tested in a long sequence (not in a small loop). According to http://users.atw.hu/instlatx64/GenuineIntel00506E3_Skylake2_InstLatX64.txt, a regular desktop Skylake has 0.64c throughput for mov r64, imm64, vs. 0.25 for mov r32, imm32. (They don't test mov r/m64, imm32, the 7-byte encoding for something like mov rax,-1). Skylake with up-to-date microcode (including all SKX CPUs) disables the loop buffer (LSD), and has to read uops from the uop cache every time even in short loops. Uop-cache effects could be a problem for instructions with a 64-bit immediate. Agner only did detailed testing for Sandybridge; it's likely that Skylake still mostly works the same (although the uop cache read bandwidth is higher). mov r64, imm64 takes 2 entries in the uop cache (because of the 64-bit immediate that's outside the signed 32-bit range), and takes 2 cycles to read from the uop cache, according to Agner's Table 9.1 in his microarch pdf. It can borrow space from another entry in the same uop cache line, but still takes extra cycles to read. 
See https://stackoverflow.com/questions/46433208/which-is-faster-imm64-or-m64-for-x86-64 for an SO question the other day about loading constants from memory vs. imm64. (Although I didn't have anything very wise to say there, just that it depends on surrounding code as always!) > Peter, any information on what the MOV r64,i64 latency/throughput on various > CPUs vs. MOV r32,i32; SHL r64,i8 is? When not bottlenecked on the front-end, mov r64,i64 is a single ALU uop with 1c latency. I think it's pretty much universal that it's the best choice when you bottleneck on anything else. Some loops *do* bottleneck on the front-end, though, especially without unrolling. But then it comes down to whether we have a uop-cache read bottleneck, or a decode bottleneck, or an issue bottleneck (4 fused-domain uops per clock renamed/issued). For issue/retire bandwidth mov/shl is 2 uops instead of 1. But for code that bottlenecks on reading the uop-cache, it's really hard to say if one is better in general. I think if the imm64 can borrow space in other uops in the cache line, it's better for uop-cache density than mov/shl. Unless the extra code-size means one fewer instruction fits into a uop cache line that wasn't nearly full (6 uops). Front-end stuff is *very* context-sensitive. :/ Calling a very short non-inline function from a tiny loop is probably making the uop-cache issues worse, and is probably favouring the mov/shift over the mov r64,imm64 approach more than you'd see as part of a larger contiguous block. I *think* mov r64,imm64 should still generally be preferred in most cases. Usually the issue queue (IDQ) between the uop cache and the issue/rename stage can absorb uop-cache read bubbles. A constant pool might be worth considering if code-size is getting huge (average instruction length much greater than 4). Normally of course you'd really want to hoist an imm64 out of a loop, if you have a spare register. 
When optimizing small loops, you can usually avoid front-end bottlenecks. It's a lot harder for medium-sized loops involving separate functions. I'm not confident this noinline case is very representative of real code. --- Note that in this special case, you can save another byte of code by using ror rax (implicit by-one encoding). Also worth considering for tune=sandybridge or later: xor eax,eax / bts rax, 63. 2B + 5B = 7B. BTS has 0.5c throughput, and xor-zeroing doesn't need an ALU on SnB-family (so it has zero latency; the BTS can execute right away even if it issues in the same cycle as xor-zeroing). BTS runs on the same ports as shifts (p0/p6 in HSW+, or p0/p5 in SnB/IvB). On older Intel, it has 1 per clock throughput for the reg,imm form. On AMD, it's 2 uops, with 1c throughput (0.5c on Ryzen), so it's not bad if used on AMD CPUs, but it doesn't look good for tune=generic. At -Os, you could consider or eax, -1; shl rax,63. (Also 7 bytes, and works for constants with multiple consecutive high-bits set). The false dependency on the old RAX value is often not a bottleneck, and gcc
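For reference, whether a given 64-bit constant can be built at all with the movl $cst, %edx; shlq $shift, %rdx sequence discussed in this thread can be checked with a sketch like the one below. `MovShl` and `as_mov_shl` are hypothetical names, not anything in GCC, and the check says nothing about whether the two-instruction form is faster, which the discussion above treats as context-dependent.

```cpp
#include <cstdint>
#include <optional>

// Operands for the two-instruction sequence.
struct MovShl {
    std::uint32_t imm32;  // for movl $imm32, %reg (zero-extends to 64 bits)
    unsigned shift;       // for shlq $shift, %reg
};

// Returns operands such that (uint64_t)imm32 << shift == v, or nullopt when
// v already fits in 32 bits or needs more than 32 significant bits.
std::optional<MovShl> as_mov_shl(std::uint64_t v) {
    if (v <= UINT32_MAX) return std::nullopt;  // a plain movl is enough
    unsigned shift = 0;
    // v is nonzero here, so stripping trailing zero bits terminates.
    while ((v & 1) == 0) {
        v >>= 1;
        ++shift;
    }
    if (v > UINT32_MAX) return std::nullopt;   // would still need an imm64
    return MovShl{static_cast<std::uint32_t>(v), shift};
}
```

For example, a constant with only bit 63 set yields imm32 = 1 and shift = 63, while an odd constant wider than 32 bits has no such decomposition.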
[Bug target/69493] Poor code generation for return of struct containing vectors on PPC64LE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69493 Peter Bergner changed: What|Removed |Added CC||bergner at gcc dot gnu.org --- Comment #6 from Peter Bergner --- A simpler test case that shows the same problem when compiling for POWER8. When compiling for POWER9, we get the code we want/expect: bergner@pike:~/gcc/BUGS/PR70053$ cat pr69493-2.c typedef struct { __vector double vx0; __vector double vx1; } vec_t; vec_t foo (__vector double a, __vector double b) { vec_t result; result.vx0 = a; result.vx1 = b; return result; } bergner@pike:~/gcc/BUGS/PR70053$ /home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc -S -O2 -mcpu=power8 pr69493-2.c bergner@pike:~/gcc/BUGS/PR70053$ cat pr69493-2.s ... foo: addi 8,1,-96 li 10,32 xxpermdi 34,34,34,2 xxpermdi 35,35,35,2 li 9,48 stxvd2x 34,8,10 stxvd2x 35,8,9 lxvd2x 34,8,10 lxvd2x 35,8,9 xxpermdi 34,34,34,2 xxpermdi 35,35,35,2 blr bergner@pike:~/gcc/BUGS/PR70053$ /home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-pr70053-debug/gcc -S -O2 -mcpu=power9 pr69493-2.c bergner@pike:~/gcc/BUGS/PR70053$ cat pr69493-2.s ... foo: blr
[Bug fortran/82258] [8 regression] allocate_zerosize_3.f fails since r251949
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82258 --- Comment #9 from Christophe Lyon --- I get: 1 2 1 0 -2 -3 -4 3 4 5 0 7 8 9
[Bug c++/82345] low performance (comparing to clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82345 Jonathan Wakely changed: What|Removed |Added Status|WAITING |NEW --- Comment #5 from Jonathan Wakely --- (In reply to Eugene from comment #3) > Created attachment 42249 [details] > source code Thanks, GCC does indeed perform worse for the version using boost::string_view.
[Bug c++/82345] low performance (comparing to clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82345 --- Comment #4 from Jonathan Wakely --- When I compare the performance of this similar program on a text file of 4 million lines I see gcc performs slightly better: #include <fstream> #include <string> #include <experimental/string_view> int main(int , char**argv) { std::ifstream in(argv[1]); std::string line; while (std::getline(in, line)) { auto pos = std::experimental::string_view(line).find("http"); } }
[Bug c++/82345] low performance (comparing to clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82345 --- Comment #3 from Eugene --- Created attachment 42249 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42249=edit source code
[Bug c++/82345] low performance (comparing to clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82345 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2017-09-27 Ever confirmed|0 |1 --- Comment #2 from Jonathan Wakely --- Please attach the source here, don't link to somewhere else. Compress it if needed, or better still, reduce it: https://gcc.gnu.org/wiki/A_guide_to_testcase_reduction https://gcc.gnu.org/bugs/
[Bug c++/82345] low performance (comparing to clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82345 --- Comment #1 from Eugene --- Source file https://yadi.sk/d/FqXH-4Y63NGeSw
[Bug c++/82345] New: low performance (comparing to clang)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82345 Bug ID: 82345 Summary: low performance (comparing to clang) Product: gcc Version: 7.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: claprix at yandex dot ru Target Milestone: --- String search is slow for this code. $ g++ -O2 -DNDEBUG -std=c++14 gcc_perf_buf.cc && time ./a.out shodan/huge01.txt real 0m0.470s user 0m0.367s sys 0m0.104s $ clang++ -O2 -DNDEBUG -std=c++14 gcc_perf_buf.cc && time ./a.out shodan/huge01.txt real 0m0.248s user 0m0.179s sys 0m0.069s $ gcc --version gcc (Ubuntu 7.2.0-7ubuntu1) 7.2.0 Copyright (C) 2017 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ lsb_release -d Description: Ubuntu Artful Aardvark (development branch)
[Bug c++/63392] poor error recovery with missing typename
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63392 Eric Gallager changed: What|Removed |Added Keywords||error-recovery Status|UNCONFIRMED |NEW Last reconfirmed||2017-09-27 CC||egallager at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Eric Gallager --- Confirmed.
[Bug target/82341] [8 regression] i386/pr80732.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82341 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||jakub at gcc dot gnu.org Resolution|--- |FIXED --- Comment #2 from Jakub Jelinek --- Can't reproduce this. I believe this test used to FAIL since r252976 till r253100, but shouldn't be broken anymore.
[Bug tree-optimization/82337] [5/6/7/8 Regression] ICE: SSA corruption at tree-ssa-coalesce.c:1010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82337 --- Comment #8 from Bill Schmidt --- Created attachment 42248 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42248=edit Proposed patch Here's what I'm testing -- looks like it fixes this particular case.
[Bug tree-optimization/82337] [5/6/7/8 Regression] ICE: SSA corruption at tree-ssa-coalesce.c:1010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82337 --- Comment #7 from Bill Schmidt --- I think we can do something simpler by just keeping these abnormal SSA names out of the basis chains in the table. Working on a patch.
[Bug target/82342] [8 regression] i386/pr82260-2.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82342 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2017-09-27 Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Jakub Jelinek --- Created attachment 42247 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42247=edit gcc8-pr82342.patch Untested fix. BMI2 sarx is not something the test intends to test, so disable it.
[Bug rtl-optimization/82344] New: [8 Regression] SPEC CPU2006 435.gromacs ~10% performance regression with trunk@250855
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82344 Bug ID: 82344 Summary: [8 Regression] SPEC CPU2006 435.gromacs ~10% performance regression with trunk@250855 Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: alexander.nesterovskiy at intel dot com Target Milestone: --- Created attachment 42246 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42246=edit r250854 vs r250855 generated code comparison
Compilation options that affect the regression: "-Ofast -march=core-avx2 -mfpmath=sse"
The regression appeared after r250855, though it looks like this commit is not guilty by itself but reveals something in other stages. Changes in the 123t.reassoc1 stage lead to slightly different code generation during the stages that follow it. The place of interest is in the "inl1130" subroutine (file "innerf.f"). It is part of a big loop with 9 similar expressions on 4-byte float variables:
---
y1 = 1.0/sqrt(x1)
y2 = 1.0/sqrt(x2)
y3 = 1.0/sqrt(x3)
y4 = 1.0/sqrt(x4)
y5 = 1.0/sqrt(x5)
y6 = 1.0/sqrt(x6)
y7 = 1.0/sqrt(x7)
y8 = 1.0/sqrt(x8)
y9 = 1.0/sqrt(x9)
---
When compiled with "-ffast-math", 1/sqrt is calculated with the "vrsqrtss" instruction followed by a Newton-Raphson step that uses four "vmulss", one "vaddss" and two constants. For example (part of the r250854 code):
---
vrsqrtss xmm12, xmm12, xmm7
vmulss xmm7, xmm12, xmm7
vmulss xmm0, xmm12, DWORD PTR .LC2[rip]
vmulss xmm8, xmm7, xmm12
vaddss xmm5, xmm8, DWORD PTR .LC1[rip]
vmulss xmm1, xmm5, xmm0
---
Input values (x1-x9) are mostly in xmm registers (x2 and x7 are in memory); output values (y1-y9) are in xmm registers. After r250855 the .LC2 constant goes into xmm7 and x7 also goes into an xmm register. This leads to a lack of temporary registers and, as a result, worse instruction interleaving. See the attached picture with part of the assembly listings, where corresponding y=1/sqrt parts are painted the same color.
As a result, these 9 lines of code execute about twice as slowly, which leads to a ~10% performance regression for the whole test. I've made two independent attempts to change the code in order to verify the above.
1. To be sure that we lose performance exactly in this part of the loop, I pasted ~60 assembly instructions from the previous revision into the new one (after proper renaming, of course). This restored performance.
2. To be sure that the problem is due to a lack of temporary registers, I moved the calculation of 1/sqrt for the last line into a function call:
---
... in other module to disable inlining:
function myrsqrt(x)
implicit none
real*4 x
real*4 myrsqrt
myrsqrt = 1.0/sqrt(x);
return
end function myrsqrt
...
y1 = 1.0/sqrt(x1)
y2 = 1.0/sqrt(x2)
y3 = 1.0/sqrt(x3)
y4 = 1.0/sqrt(x4)
y5 = 1.0/sqrt(x5)
y6 = 1.0/sqrt(x6)
y7 = 1.0/sqrt(x7)
y8 = 1.0/sqrt(x8)
y9 = myrsqrt(x9)
---
Even with call/ret overhead this also restored performance, since it freed some registers.
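The Newton-Raphson refinement described in this report can be sketched in C. This is only an illustration of the arithmetic, not GCC's generated code: the names rsqrt_refine and rsqrt_refine_check are made up, and the constants -3.0 and -0.5 are an assumption about what .LC1 and .LC2 contain, since the report does not show their values.

```c
#include <assert.h>

/* One Newton-Raphson refinement of an estimate y0 of 1/sqrt(x), written in
   the factored form y1 = (x*y0*y0 - 3) * (-0.5*y0), which needs four
   multiplies and one add, the same mix as the listing in the report.
   ASSUMPTION: -0.5f and -3.0f stand in for .LC2 and .LC1. */
static float rsqrt_refine(float x, float y0)
{
    float t = y0 * x;       /* x * y0                      */
    float h = y0 * -0.5f;   /* y0 * (assumed .LC2)         */
    float e = t * y0;       /* x * y0 * y0                 */
    float d = e + -3.0f;    /* x * y0 * y0 + (assumed .LC1) */
    return d * h;           /* refined estimate            */
}

/* Illustrative check: refine a rough guess (0.70) for 1/sqrt(2). */
static void rsqrt_refine_check(void)
{
    float y1 = rsqrt_refine(2.0f, 0.70f);
    float err = y1 - 0.70710678f;
    assert(err > -1e-3f && err < 1e-3f);
}
```

Each of the nine expressions needs its own temporaries for the intermediate products t, h, e and d, which is at least consistent with the register pressure the report describes.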
[Bug target/82012] [8 Regression] libitm build fails for s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82012 Andreas Krebbel changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #12 from Andreas Krebbel --- Commit in Comment #9
[Bug target/68924] No intrinsic for x86 `MOVQ m64, %xmm` in 32bit mode.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68924 --- Comment #2 from Marc Glisse --- Does anything bad happen if you remove the #ifdef/#endif for _mm_cvtsi64_si128? (2 files in the testsuite would need updating for a proper patch)
[Bug c++/82159] [6/7/8 Regression] ICE: in assign_temp, at function.c:961
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82159 --- Comment #4 from Jakub Jelinek --- Author: jakub Date: Wed Sep 27 14:19:57 2017 New Revision: 253230 URL: https://gcc.gnu.org/viewcvs?rev=253230=gcc=rev Log: PR c++/82159 * gimplify.c (gimplify_modify_expr): Don't optimize away zero sized lhs from calls if the lhs has addressable type. * g++.dg/opt/pr82159.C: New test. Added: trunk/gcc/testsuite/g++.dg/opt/pr82159.C Modified: trunk/gcc/ChangeLog trunk/gcc/gimplify.c trunk/gcc/testsuite/ChangeLog
[Bug c/82340] volatile ignored in compound literal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82340 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2017-09-27 Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #3 from Jakub Jelinek --- Created attachment 42245 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42245=edit gcc8-pr82340.patch Untested fix.
[Bug target/82339] Inefficient movabs instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Comment #3 from Alexander Monakov --- In addition to Agner Fog's manuals, the instlatx64 resource provides plenty of latency/throughput data: http://users.atw.hu/instlatx64/ The benchmark in comment 0 measures throughput (including call/return overhead, which seems a bit strange); latency-wise, movabs should be preferable. So I think this indicates that a "real fix" should try to evaluate whether a 64-bit immediate move starts a critical-ish dependency chain. If yes, we should be trying to reduce latency and should prefer movabs; if not, we may prefer the mov+shl combo that trades latency for overall throughput (i.e. assuming the additional latency can be hidden by compiler scheduling and CPU reordering).
[Bug c++/81398] Complaining about 'partial specialization of '...' after instantiation' in c++1z
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81398 d25fe0be@ changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #2 from d25fe0be@ --- Let's close it then. Sorry for the noise.
[Bug tree-optimization/82337] [5/6/7/8 Regression] ICE: SSA corruption at tree-ssa-coalesce.c:1010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82337 Bill Schmidt changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |wschmidt at gcc dot gnu.org --- Comment #6 from Bill Schmidt --- You bet. I'll try to have a look today.
[Bug c++/82343] New: internal compiler error: Segmentation fault - template recurrency, SFINAE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82343 Bug ID: 82343 Summary: internal compiler error: Segmentation fault - template recurrency, SFINAE Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: p1006680 at mvrht dot net Target Milestone: --- Created attachment 42244 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42244=edit Preprocessed source code, generated by adding -save-temps Segmentation fault while compiling code (see attachment). Bug is probably associated with function template recurrency. Tested on Wandbox (https://wandbox.org). Confirmed with gcc: 8.0, 7.2, 7.1. Exact version of GCC (gcc -v output): Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.2.0-1ubuntu1~16.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 7.2.0 (Ubuntu 7.2.0-1ubuntu1~16.04) System type 
(lsb_release -a output): No LSB modules are available. Distributor ID: Ubuntu Description:Ubuntu 16.04.3 LTS Release:16.04 Codename: xenial Complete command line that triggers the bug: g++ prog.cpp -std=gnu++1z Compiler output (exit code: 4): g++: internal compiler error: Segmentation fault signal terminated program cc1plus Please submit a full bug report, with preprocessed source if appropriate. See for instructions. Attachment: prog.ii - preprocessed source code, generated by adding -save-temps
[Bug target/82341] [8 regression] i386/pr80732.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82341 Richard Biener changed: What|Removed |Added Target Milestone|--- |8.0 --- Comment #1 from Richard Biener --- Confirmed.
[Bug target/82342] [8 regression] i386/pr82260-2.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82342 Richard Biener changed: What|Removed |Added Target Milestone|--- |8.0 --- Comment #1 from Richard Biener --- Confirmed.
[Bug middle-end/82095] [8 Regression] ICE in tree_nop_conversion at tree.c:11793 on ppc64le
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82095 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #4 from Jakub Jelinek --- Fixed.
[Bug c++/82115] [8 Regression] ICE on (valid) C++11 code: Segmentation fault signal terminated program cc1plus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82115 Jakub Jelinek changed: What|Removed |Added CC||jason at gcc dot gnu.org, ||nathan at gcc dot gnu.org --- Comment #4 from Jakub Jelinek --- Anyway, if value_dependent_expression_p needs to recurse on DECL_INITIAL that can contain arbitrary stuff, including the VAR_DECL with that DECL_INITIAL in it, we need to avoid the recursion. "it is a constant with literal type and is initialized with an expression that is value-dependent." is what applies here. So do we need some hash-map that will track VAR_DECLs on whose DECL_INITIAL we've already recursed? The problem is that value_dependent_expression_p calls type_dependent_expression_p and vice versa, so it is unclear when that hash-map should be saved/restored. Even if it is not possible to reach similar infinite recursion through both of those functions (i.e. when we could save/restore the hash-map in the "toplevel" value_dependent_expression_p call and use value_dependent_expression_p_1 recursing to itself from it), another question is how to do that efficiently: how common are VAR_DECLs on which we recurse on DECL_INITIAL? How many are there on average during one top-level value_dependent_expression_p call? E.g. if the common cases are 0 or 1 times, perhaps we could have next to the hash-map a single tree which we'd compare, and only create the hash-map when seeing another one.
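The idea floated at the end of this comment, a single remembered tree with a larger table created only on demand, can be sketched in plain C. Everything below is hypothetical: struct visited, visited_test_and_set, struct node and toy_value_dependent are illustration names, a growable array stands in for the hash-map, and none of it is GCC's actual data structure or the eventual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical visited set: one inline slot for the common 0-or-1 case,
   plus an overflow array that is only allocated when a second entry shows
   up.  (A real implementation would presumably use a hash-map here.) */
struct visited {
    const void *first;
    const void **more;
    size_t n_more, cap;
};

/* Returns true if decl was seen before; otherwise records it, returns false. */
static bool visited_test_and_set(struct visited *v, const void *decl)
{
    if (v->first == NULL) { v->first = decl; return false; }
    if (v->first == decl) return true;
    for (size_t i = 0; i < v->n_more; i++)
        if (v->more[i] == decl) return true;
    if (v->n_more == v->cap) {
        size_t cap = v->cap ? v->cap * 2 : 4;
        const void **p = realloc(v->more, cap * sizeof *p);
        if (p == NULL) abort();
        v->more = p;
        v->cap = cap;
    }
    v->more[v->n_more++] = decl;
    return false;
}

static void visited_release(struct visited *v)
{
    free(v->more);
    v->first = NULL;
    v->more = NULL;
    v->n_more = v->cap = 0;
}

/* Toy stand-in for a declaration whose initializer may refer back to it. */
struct node { const struct node *init; bool dependent; };

/* Toy recursion with the guard: a node already under examination cuts the cycle. */
static bool toy_value_dependent(struct visited *v, const struct node *n)
{
    if (n == NULL) return false;
    if (n->dependent) return true;
    if (visited_test_and_set(v, n)) return false;
    return toy_value_dependent(v, n->init);
}
```

In this sketch a self-referential node terminates after one lookup in the inline slot, and the overflow array is never allocated unless a second distinct node is visited during the same top-level call.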
[Bug target/82138] [8 Regression] Assembler messages: Error: can't resolve `.got2' {.got2 section} - `.LCF0' {.text.unlikely section}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82138 Jakub Jelinek changed: What|Removed |Added Priority|P3 |P4 CC||jakub at gcc dot gnu.org
[Bug c++/82148] [7/8 Regression] ICE in assign_temp, at function.c:968
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82148 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- The gimplifier is handed call to foo with argument of type Derived, the conversion has been omitted. As the call expects a type (in this case empty class) passed by value, but the argument is one that should be passed by invisible reference, this obviously crashes during expansion.
[Bug c/82340] volatile ignored in compound literal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82340 --- Comment #2 from Pascal Cuoq --- Richard: > I don't see how a volatile compound literal could make any sense or how > you'd observe the side-effect of multiple stores to it Well, I have the same question about volatile variables the address of which is not taken. But this is off-topic for this bug report, in which the volatile object's address is taken. > (IIRC compound literals are constant!?). The C11 standard invites the programmer to use the const qualifier if they want a constant compound literal, and gives an explicit example of a “modifiable” non-const compound literal: https://port70.net/~nsz/c/c11/n1570.html#6.5.2.5p12
[Bug target/82342] New: [8 regression] i386/pr82260-2.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82342 Bug ID: 82342 Summary: [8 regression] i386/pr82260-2.c fail Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: andrey.y.guskov at intel dot com Target Milestone: --- r253050 triggered these fails: FAIL: gcc.target/i386/pr82260-2.c scan-assembler \\mmovl\\t%esi, %ecx FAIL: gcc.target/i386/pr82260-2.c scan-assembler \\mmovb\\t%dl, %cl FAIL: gcc.target/i386/pr82260-2.c scan-assembler \\mmovb\\t%r8b, %cl Option set: -with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-shared --enable-host-shared --enable-clocale=gnu --enable-cloog-backend=isl --enable-languages=c,c++,fortran,jit,lto --with-arch=haswell --with-cpu=haswell
[Bug target/82341] New: [8 regression] i386/pr80732.c fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82341 Bug ID: 82341 Summary: [8 regression] i386/pr80732.c fail Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: andrey.y.guskov at intel dot com Target Milestone: --- r253037 triggered the following fail: FAIL: gcc.target/i386/pr80732.c (test for excess errors) Excess errors: gcc.target/i386/pr80732.c:46:8: warning: 'f2' 'ifunc' resolver should return a function pointer [-Wattributes] Option set: -with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-shared --enable-host-shared --enable-clocale=gnu --enable-cloog-backend=isl --enable-languages=c,c++,fortran,jit,lto --with-arch=haswell --with-cpu=haswell
[Bug other/82327] [7 Regression] ICE in equal_mem_array_ref_p, at tree-ssa-scopedtables.c:429 (i686-linux-gnu)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82327 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Can't reproduce, neither with current gcc-7-branch, nor with r253107 (both x86_64-linux with -m32 -march=i686 -mtune=generic -mno-sse), nor Sep 15th build of i686-linux compiler. Whether the compiler defaults to PIE or not should not matter given the explicit -fPIC. Can you reproduce it with vanilla gcc-7-branch?
[Bug middle-end/81657] [8 Regression] FAIL: gcc.dg/20050503-1.c scan-assembler-not call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81657 Andrey Guskov changed: What|Removed |Added CC||andrey.y.guskov at intel dot com --- Comment #2 from Andrey Guskov --- Also seeing this.
[Bug target/82339] Inefficient movabs instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339 Richard Biener changed: What|Removed |Added Target||x86_64-*-*, i?86-*-* --- Comment #2 from Richard Biener --- I always wondered if it is more efficient to have constant pools per function in .text so we can do %rip relative loads with short displacement? I suppose the assembler could even optimize things if there's the desired constant somewhere near in the code itself... (in case data loads from icache do not incur too much of a penalty). The assembler could also replace .palign space before function start with (small) constant(s). Nothing we can really do given x86 has no idea of instruction sizes.
[Bug libfortran/66756] libgfortran: ThreadSanitizer: lock-order-inversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66756 --- Comment #10 from Dominique d'Humieres --- > Could this still be fixed / filtered out in the ThreadSanitizer somehow? Should it be moved to the sanitizer component?
[Bug c/82340] volatile ignored in compound literal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82340 Richard Biener changed: What|Removed |Added Component|middle-end |c --- Comment #1 from Richard Biener --- Well. I don't see how a volatile compound literal could make any sense or how you'd observe the side-effect of multiple stores to it (IIRC compound literals are constant!?). So you need to come up with sth more clever to be convincing ;) Implementation-wise we end up with volatile char * p; volatile char D.1801[1]; [0.00%]: D.1801[0] = 1; p_6 = i_7 = 1; goto ; [0.00%] [0.00%]: *p_6 ={v} 4; i_10 = i_2 + 1; [0.00%]: # i_2 = PHIif (i_2 <= 9) goto ; [0.00%] else goto ; [0.00%] [0.00%]: _1 ={v} *p_6; _8 = (int) _1; which is mostly fine but the initialization of D.1801[0] is not a volatile access. And after optimization volatile char D.1801[1]; char _1; unsigned int ivtmp_2; int _6; unsigned int ivtmp_10; [10.00%]: [90.00%]: # ivtmp_2 = PHI MEM[(volatile char *)] ={v} 4; ivtmp_10 = ivtmp_2 + 4294967295; if (ivtmp_10 != 0) goto ; [88.89%] else goto ; [11.11%] [10.00%]: _1 ={v} MEM[(volatile char *)]; _6 = (int) _1; so GIMPLE does what you want but somewhere on RTL we are too clever in the end. D.1801 ends up being expanded as a register. Probably the variable itself is _not_ marked volatile but only the array member type.
[Bug libfortran/66756] libgfortran: ThreadSanitizer: lock-order-inversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66756 Thomas Koenig changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|INVALID |--- Severity|normal |enhancement --- Comment #9 from Thomas Koenig --- Well, if it is indeed a serious problem, we can re-open as an enhancement request. Unfortunately, I don't know if it is possible to shut up the thread sanitizer somehow. A possibility would be to lock the unit only after the global lock has been released, and possibly keep around the global lock for longer. If we still are in the process of opening the file in the original thread, then there should be no problem (at least I hope so...)
[Bug lto/82172] Destruction of basic_string in basic_stringbuf::overflow with _GLIBCXX_USE_CXX11_ABI=0, -flto, and C++17 mode results in invalid delete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82172 --- Comment #20 from Martin Liška --- (In reply to Gubbins from comment #19) > (In reply to Martin Liška from comment #18) > > Issue solved, ld.bfd is responsible. > > Unfortunately, the same test program also crashes when built and linked on > OSX. > > I tested with Sierra (OSX 10.12.5), gcc 7.2.0, compiling the original sample > here with: > > g++-7 -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17 -O1 test.cpp > > Result: > > > ./a.out > a.out(608,0x7fffbda8a3c0) malloc: *** error for object 0x10c68d080: pointer > being freed was not allocated > *** set a breakpoint in malloc_error_break to debug > Abort trap: 6 > > > I think this means the Darwin linker has a similar problem. Your failure happens even w/o LTO, am I right? But yes, the problem looks very similar to what happens for ld.bfd.
[Bug target/82339] Inefficient movabs instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339 --- Comment #1 from Jakub Jelinek --- Created attachment 42243 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42243=edit gcc8-pr82339.patch Patch for -Os where movl $cst, %eXX; shlq $shift, %rXX is 1 byte shorter than movabsq $(cst << shift), %rXX. For speed, there is yet another option of movq $cst, %rXX; shlq $shift, %rXX for constants like 0x12345670 which have a sequence of 1 bits, followed by at most 31 arbitrary bits and then the rest is all 0s, movabsq $12345670, %r8 is equivalent to movq $-249346713, %r8; shlq $20, %r8 which is longer, but perhaps faster (on which CPUs?).
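Whether a 64-bit constant fits the mov-plus-shift form discussed here can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions, not code from the patch: split_imm32_shift is a hypothetical helper that strips trailing zero bits and tests whether what remains is a sign-extended 32-bit immediate, as movq $imm32 would load it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: if c == sign_extend(imm32) << shift, store the
   immediate and the shift count and return 1; otherwise return 0.
   Only the largest shift (all trailing zero bits) is tried. */
static int split_imm32_shift(uint64_t c, int64_t *imm, unsigned *shift)
{
    unsigned tz = 0;
    uint64_t low, sext;

    if (c == 0)
        return 0;                 /* zero needs no shift sequence at all */
    while (((c >> tz) & 1) == 0)
        tz++;                     /* count trailing zero bits (at most 63) */

    low = (c >> tz) & UINT64_C(0xffffffff);
    /* Sign-extend the low 32 bits to 64 bits using unsigned arithmetic. */
    sext = (low ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
    if ((sext << tz) != c)
        return 0;

    *imm = (int64_t)low - ((low >> 31) ? INT64_C(4294967296) : 0);
    *shift = tz;
    return 1;
}

/* Illustrative check using the immediate and shift quoted in the comment. */
static void split_imm32_shift_check(void)
{
    int64_t imm;
    unsigned shift;
    assert(split_imm32_shift(UINT64_C(0xffff123456700000), &imm, &shift));
    assert(imm == -249346713 && shift == 20);
}
```

For the pair quoted in the comment, movq $-249346713 followed by shlq $20, this arithmetic gives 0xffff123456700000. When the immediate is non-negative, the zero-extending movl form loads the same value, which is the case the -Os patch targets.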
[Bug c/82340] New: volatile ignored in compound literal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82340 Bug ID: 82340 Summary: volatile ignored in compound literal Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pascal_cuoq at hotmail dot com Target Milestone: ---
Consider the function f below:
int f(void) {
  volatile char *p = (volatile char[1]){1};
  for (int i=1; i<10; i++) *p=4;
  return *p;
}
Volatile access is a visible side-effect, so one may expect the generated code for the function f to do “something” nine times, for some definition of “something”. In GCC 7.2 and in gcc.godbolt.org's current snapshot of “gcc (trunk)”, the function f is compiled to:
f:
  movl $4, %eax
  ret
Command: gcc -O3 -std=c11 -xc -pedantic -S t.c
Link: https://godbolt.org/g/4Ua1Ud
I would expect function f to be compiled to something that resembles the code produced for function g, or the code produced by Clang for f:
int g(void) {
  volatile char t[1] = {1};
  volatile char *p = t;
  for (int i=1; i<10; i++) *p=4;
  return *p;
}
g:
  movb $1, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movb $4, -1(%rsp)
  movsbl -1(%rsp), %eax
  ret
[Bug lto/82172] Destruction of basic_string in basic_stringbuf::overflow with _GLIBCXX_USE_CXX11_ABI=0, -flto, and C++17 mode results in invalid delete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82172 Gubbins changed: What|Removed |Added CC||dave.gittins at gmail dot com --- Comment #19 from Gubbins --- (In reply to Martin Liška from comment #18) > Issue solved, ld.bfd is responsible. Unfortunately, the same test program also crashes when built and linked on OSX. I tested with Sierra (OSX 10.12.5), gcc 7.2.0, compiling the original sample here with: g++-7 -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17 -O1 test.cpp Result: > ./a.out a.out(608,0x7fffbda8a3c0) malloc: *** error for object 0x10c68d080: pointer being freed was not allocated *** set a breakpoint in malloc_error_break to debug Abort trap: 6 I think this means the Darwin linker has a similar problem.
[Bug target/82339] New: Inefficient movabs instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339 Bug ID: 82339 Summary: Inefficient movabs instruction Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jakub at gcc dot gnu.org Target Milestone: ---
At least on i7-5960X in the following testcase:
__attribute__((noinline, noclone)) unsigned long long int
foo (int x)
{
  asm volatile ("" : : : "memory");
  return 1ULL << (63 - x);
}
__attribute__((noinline, noclone)) unsigned long long int
bar (int x)
{
  asm volatile ("" : : : "memory");
  return (1ULL << 63) >> x;
}
__attribute__((noinline, noclone)) unsigned long long int
baz (int x)
{
  unsigned long long int y = 1;
  asm volatile ("" : "+r" (y) : : "memory");
  return (y << 63) >> x;
}
int
main (int argc, const char **argv)
{
  int i;
  if (argc == 1)
    for (i = 0; i < 10; i++)
      asm volatile ("" : : "r" (foo (13)));
  else if (argc == 2)
    for (i = 0; i < 10; i++)
      asm volatile ("" : : "r" (bar (13)));
  else if (argc == 3)
    for (i = 0; i < 10; i++)
      asm volatile ("" : : "r" (baz (13)));
  return 0;
}
baz is fastest as well as shortest. So I think we should consider using movl $cst, %edx; shlq $shift, %rdx instead of movabsq $(cst << shift), %rdx. Unfortunately I can't find MOVABS in Agner Fog's manuals, and for MOV r64,i64 there is too little information, so it is unclear on which CPUs it is beneficial. For -Os, if the destination is a %rax to %rsp register, it is one byte shorter (5+4 vs 10); for %r8 to %r15 it is the same size. For speed optimization, the disadvantage is obviously that the shift clobbers the flags register. Peter, any information on what the MOV r64,i64 latency/throughput on various CPUs vs. MOV r32,i32; SHL r64,i8 is?
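The three test functions in this report compute the same value; only the instruction sequence the compiler picks differs. A small C check of that equivalence might look like the following. The names bit_via_sub, bit_via_shift and bit_forms_agree are made up for this sketch and are not part of the testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as foo in the report: shift 1 left by (63 - x). */
static uint64_t bit_via_sub(unsigned x)
{
    return UINT64_C(1) << (63 - x);
}

/* Same expression as bar and baz: build 1 << 63, then shift right by x. */
static uint64_t bit_via_shift(unsigned x)
{
    return (UINT64_C(1) << 63) >> x;
}

/* Illustrative check that both forms agree for every valid shift count. */
static void bit_forms_agree(void)
{
    for (unsigned x = 0; x < 64; x++)
        assert(bit_via_sub(x) == bit_via_shift(x));
}
```

For the argument 13 used in the benchmark, both forms leave only bit 50 set, so any timing difference between foo, bar and baz comes from how the 1 << 63 constant is materialized rather than from the result.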
[Bug c/65892] gcc fails to implement N685 aliasing of union members
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892 --- Comment #27 from rguenther at suse dot de --- On Wed, 27 Sep 2017, david at westcontrol dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892 > > David Brown changed: > >What|Removed |Added > > CC||david at westcontrol dot com > > --- Comment #26 from David Brown --- > (In reply to rguent...@suse.de from comment #24) > > On Wed, 2 Nov 2016, txr at alumni dot caltech.edu wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892 > > > > > > --- Comment #22 from Tim Rentsch --- > > > [responding to comments from rguent...@suse.de in Comment 20] > > > > > > > GCC already implements this if you specify -fno-strict-aliasing. > > > > > > The main point of my comments is that the ISO C standard requires > > > the behavior in this case (and similar cases) be defined and not > > > subject to any reordering. In other words the result must be the > > > same as an unoptimized version. If a -fstrict-aliasing gcc /does/ > > > transform the code so that the behavior is not the same as an > > > unoptimized version, then gcc is not a conforming implementation. > > > > GCC has various optimization options that make it a not strictly > > conforming implementation (-ffast-math for example), various > > GNU extensions to the language, etc. > > > > > Or is it your position that gcc is conforming only when operated > > > in the -fno-strict-aliasing mode? That position seems contrary to > > > the documented description of the -fstrict-aliasing option. > > > > Well, N685 is still disputed in this bug. I was just pointing out > > that GCC has a switch to make it conforming to your interpretation > > of the standard (and this switch is the default at -O0 and -O1). > > A key difference with non-conformance options like -ffast-math is that these > are not default options. A user must actively choose to use them. 
A user > should not need particular options in order to get correct object code from > their correct source code - or at least the user should get obvious error > messages when using default options but where their source code hits an oddity > in gcc (as they would get if they happened to use a gcc extension keyword like > "asm" as an identifier in conforming C code). What should not happen is for > the compiler to silently break good code unless the user has given specific > flags. > > I am not sure whether this particular case really is a bug or not. However, I > wonder if there has been too much emphasis on trying to understand exactly > what > the standards say. If the gcc developers here, who are amongst the most > knowledgeable C and C++ experts around, have trouble with the details - then > consider the position of the average C developer. Maybe it is better to try > see it from their viewpoint - would a programmer expect these accesses to > alias > or not? If it is likely that programmers would expect aliasing here, and see > that behaviour in other compilers, then the /useful/ default behaviour for gcc > would be to treat code in the way programmers expect - even with -O3. Then > have a "-fI-know-what-I-am-doing" flag for those that want to squeeze out the > last bit of performance. Unfortunately it's not the "last bit of performance", otherwise it would be indeed a no-brainer. People expect fast code from a compiler and do not want to enable dozens of -fIm-writing-reasonable-code. Some benchmarks even have rules as to how many options you are allowed to enable... Richard.
[Bug c/65892] gcc fails to implement N685 aliasing of union members
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

David Brown changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david at westcontrol dot com

--- Comment #26 from David Brown ---
(In reply to rguent...@suse.de from comment #24)
> On Wed, 2 Nov 2016, txr at alumni dot caltech.edu wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
> >
> > --- Comment #22 from Tim Rentsch ---
> > [responding to comments from rguent...@suse.de in Comment 20]
> >
> > > GCC already implements this if you specify -fno-strict-aliasing.
> >
> > The main point of my comments is that the ISO C standard requires
> > the behavior in this case (and similar cases) be defined and not
> > subject to any reordering. In other words the result must be the
> > same as an unoptimized version. If a -fstrict-aliasing gcc /does/
> > transform the code so that the behavior is not the same as an
> > unoptimized version, then gcc is not a conforming implementation.
>
> GCC has various optimization options that make it a not strictly
> conforming implementation (-ffast-math for example), various
> GNU extensions to the language, etc.
>
> > Or is it your position that gcc is conforming only when operated
> > in the -fno-strict-aliasing mode? That position seems contrary to
> > the documented description of the -fstrict-aliasing option.
>
> Well, N685 is still disputed in this bug. I was just pointing out
> that GCC has a switch to make it conforming to your interpretation
> of the standard (and this switch is the default at -O0 and -O1).

A key difference with non-conformance options like -ffast-math is that these
are not default options. A user must actively choose to use them. A user
should not need particular options in order to get correct object code from
their correct source code - or at least the user should get obvious error
messages when using default options but where their source code hits an oddity
in gcc (as they would get if they happened to use a gcc extension keyword like
"asm" as an identifier in conforming C code). What should not happen is for
the compiler to silently break good code unless the user has given specific
flags.

I am not sure whether this particular case really is a bug or not. However, I
wonder if there has been too much emphasis on trying to understand exactly what
the standards say. If the gcc developers here, who are amongst the most
knowledgeable C and C++ experts around, have trouble with the details - then
consider the position of the average C developer. Maybe it is better to try
see it from their viewpoint - would a programmer expect these accesses to alias
or not? If it is likely that programmers would expect aliasing here, and see
that behaviour in other compilers, then the /useful/ default behaviour for gcc
would be to treat code in the way programmers expect - even with -O3. Then
have a "-fI-know-what-I-am-doing" flag for those that want to squeeze out the
last bit of performance.
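For readers following the thread, the shape of code under discussion can be sketched roughly as below. This is an illustrative C example and not the test case from the bug report: the names struct s1, struct s2, union u, read_tag_via_union, store_then_load and self_check are all made up here. The first function goes through the union object itself, a form of access that the GCC documentation for -fstrict-aliasing describes as supported. The second takes two pointers to different struct types; under -fstrict-aliasing the compiler may assume they do not alias, and whether that assumption is permitted when both point into the same union is the point disputed in the comments.

```c
#include <assert.h>

/* Hypothetical types: both structs begin with an int, so they share a
   common initial sequence. */
struct s1 { int tag; };
struct s2 { int tag; double payload; };

union u {
    struct s1 a;
    struct s2 b;
};

/* Access through the union object: the union type is visible at both
   accesses, so the load sees the value just stored. */
int read_tag_via_union(union u *p)
{
    p->b.tag = 42;      /* store through member b */
    return p->a.tag;    /* load through member a */
}

/* Access through plain struct pointers: the union type is not visible
   here. With distinct objects this always returns 1. If pa and pb
   pointed into the same union, an unoptimized build would return 2,
   while a build using -fstrict-aliasing may assume no aliasing and
   return 1. */
int store_then_load(struct s1 *pa, struct s2 *pb)
{
    pa->tag = 1;
    pb->tag = 2;
    return pa->tag;
}

/* Self-check restricted to calls whose result is not in dispute. */
void self_check(void)
{
    union u v;
    struct s1 x = { 0 };
    struct s2 y = { 0, 0.0 };

    assert(read_tag_via_union(&v) == 42);
    assert(store_then_load(&x, &y) == 1);   /* distinct objects */
}
```

The contested call would be something like store_then_load(&v.a, &v.b). It is deliberately left out of self_check because its result can differ between -fstrict-aliasing and -fno-strict-aliasing builds, which is exactly the behaviour being argued over.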
[Bug tree-optimization/82337] [5/6/7/8 Regression] ICE: SSA corruption at tree-ssa-coalesce.c:1010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82337

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org

--- Comment #5 from Richard Biener ---
It's SLSR, so a workaround is -fno-tree-slsr.

It's a bit iffy to fix given one of the suitable points to fence off is
alloc_cand_and_find_basis where we should "reject"
SSA_NAME_OCCURS_IN_ABNORMAL_PHI (base) but this function isn't supposed to
"FAIL". I'm also not sure it covers the cases fully. Basically when doing
replacements SLSR may _never_ end up with a SSA_NAME_OCCURS_IN_ABNORMAL_PHI
SSA name in the replacement expression. Costing doesn't seem to apply to
unconditional candidates so fending it off there doesn't seem viable.

The easiest thing is to try fend off during the stmt walk like with the
following big hammer. Not sure if that's enough or we walk SSA def stmts
from other places.

Bill, can you take over with the hint below?

Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 253203)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -802,6 +802,8 @@ slsr_process_phi (gphi *phi, bool speed)
      definitions must be in the same position in the loop hierarchy
      as PHI. */
 
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
+    return;
   for (i = 0; i < gimple_phi_num_args (phi); i++)
     {
       slsr_cand_t arg_cand;
@@ -810,7 +812,8 @@ slsr_process_phi (gphi *phi, bool speed)
       gimple *arg_stmt = NULL;
       basic_block arg_bb = NULL;
 
-      if (TREE_CODE (arg) != SSA_NAME)
+      if (TREE_CODE (arg) != SSA_NAME
+          || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (arg))
         return;
 
       arg_cand = base_cand_from_table (arg);
@@ -1738,6 +1741,18 @@ find_candidates_dom_walker::before_dom_c
     {
       gimple *gs = gsi_stmt (gsi);
 
+      tree op;
+      ssa_op_iter iter;
+      bool abnormal_found = false;
+      FOR_EACH_SSA_TREE_OPERAND (op, gs, iter, SSA_OP_USE|SSA_OP_DEF)
+        if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
+          {
+            abnormal_found = true;
+            break;
+          }
+      if (abnormal_found)
+        continue;
+
       if (gimple_vuse (gs) && gimple_assign_single_p (gs))
         slsr_process_ref (gs);
[Bug c++/82338] New: valgrind error in inherit_in_ebb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82338

            Bug ID: 82338
           Summary: valgrind error in inherit_in_ebb
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

Created attachment 42242
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42242&action=edit
C++ source code

I built a recent trunk version of gcc with valgrind. I compiled the attached
C++ code with flags -O3 -fPIC -fno-strict-aliasing -c and got the following:

$ ~/gcc/results.253187.valgrind/bin/gcc -O3 -fPIC -fno-strict-aliasing -c bug386.cc
==17152== Conditional jump or move depends on uninitialised value(s)
==17152==    at 0x9DCE01: inherit_in_ebb (lra-constraints.c:6224)
==17152==    by 0x9DCE01: lra_inheritance() (lra-constraints.c:6474)
==17152==    by 0x9C8518: lra(_IO_FILE*) (lra.c:2430)
==17152==    by 0x986161: do_reload (ira.c:5440)

The bug seems to have been introduced sometime before revision 249539.

svn blame claims that lra-constraints.c:6224 is as follows:

192719 vmakarov && usage_insns[regno].calls_num == calls_num - 1

I'll have a go at reducing the code.
[Bug c++/82336] [5/6/7/8 Regression] GCC requires but does not emit defaulted constructors in certain cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82336

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
   Target Milestone|---                         |5.5
[Bug target/82333] [8 Regression] powerpc64le _Float128 ICE in as_a, at machmode.h:345
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82333

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|7.0                         |8.0
   Target Milestone|---                         |8.0
[Bug c++/82331] [7/8 Regression] ICE specializing template for function pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82331

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ice-on-valid-code
           Priority|P3                          |P2
      Known to work|                            |7.1.0
   Target Milestone|---                         |7.3
            Summary|ICE specializing            |[7/8 Regression] ICE
                   |template for function       |specializing template
                   |pointer                     |for function pointer
      Known to fail|                            |7.2.0
[Bug middle-end/82329] #pragma GCC target/optimize incurs high compilation time cost
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82329

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-09-27
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener ---
Confirmed.
[Bug lto/82172] Destruction of basic_string in basic_stringbuf::overflow with _GLIBCXX_USE_CXX11_ABI=0, -flto, and C++17 mode results in invalid delete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82172

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #18 from Martin Liška ---
Issue solved, ld.bfd is responsible. Gold properly marks the symbols as:

1322 aef281150a4b024f PREVAILING_DEF_IRONLY_EXP _ZNSs4_Rep20_S_empty_rep_storageE

I'm going to create binutils issue for that.
[Bug other/82327] [7 Regression] ICE in equal_mem_array_ref_p, at tree-ssa-scopedtables.c:429 (i686-linux-gnu)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82327

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug c++/82326] static_cast for vector extension not working?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82326 --- Comment #2 from Richard Biener --- Agreed. One could allow changes in signedness as extension.
[Bug tree-optimization/82321] [8 Regression] ICE in check_loop_closed_ssa_use, at tree-ssa-loop-manip.c:707
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82321

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Richard Biener ---
Fixed.
[Bug c/82323] circular ifunc attribute on a function definition silently accepted
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82323

Martin Liška changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org

--- Comment #4 from Martin Liška ---
Good, I'm taking that for stage4.
[Bug tree-optimization/82337] [5/6/7/8 Regression] ICE: SSA corruption at tree-ssa-coalesce.c:1010
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82337

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-09-27
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
            Summary|-O2: ICE: SSA corruption at |[5/6/7/8 Regression] ICE:
                   |tree-ssa-coalesce.c:1010    |SSA corruption at
                   |                            |tree-ssa-coalesce.c:1010
     Ever confirmed|0                           |1

--- Comment #4 from Martin Liška ---
Started with r214941.