[Bug target/70117] ppc long double isinf() is wrong?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117 --- Comment #7 from Ulrich Weigand --- Ah, OK. I didn't realize this value didn't fit into a 106-bit mantissa. I agree that it probably doesn't make sense to change the internal representation to allow larger mantissas. First of all, there's nothing really special about 107 bits; there can be IBM long double values that would require a much larger mantissa in the internal representation, since we can have many implicit zero bits. But more problematically, if we change the internal representation to a mantissa larger than 106 bits, there will be values in that internal format that cannot be represented directly in the target IBM long double format. In any case, I certainly agree that the is* routines for IBM long double should simply operate on the high double of the pair. I still think that it would be better for gnulib to use the same LDBL_MAX as GCC, which means gnulib should probably be changed to use the 106-bit value.
[Bug sanitizer/70135] New: -fsanitize=undefined causes static_assert to fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70135 Bug ID: 70135 Summary: -fsanitize=undefined causes static_assert to fail Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: trippels at gcc dot gnu.org CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org Target Milestone: --- Created attachment 37894 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37894&action=edit unreduced testcase markus@x4 build % g++ -O2 comparing.ii markus@x4 build % clang++ -std=c++14 -O2 comparing.ii markus@x4 build % clang++ -fsanitize=undefined -std=c++14 -O2 comparing.ii markus@x4 build % g++ -fsanitize=undefined -O2 comparing.ii ../example/comparing.cpp: In function ‘int main()’: ../example/comparing.cpp:26:5: error: static assertion failed static_assert(grouped == hana::make_tuple( ^
[Bug c++/70112] [lto] Segmentation fault in Libreoffice's program gengal.bin when build with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70112 Markus Trippelsdorf changed: What|Removed |Added CC||hubicka at ucw dot cz --- Comment #4 from Markus Trippelsdorf --- (In reply to Marek Behun from comment #2) Do you get these "invalid vptr" errors also when you compile without -flto? If yes, then you need to fix the bug they point to in Libreoffice. You can set a breakpoint at __ubsan::Diag::~Diag in gdb to get a backtrace. CCing Honza, who, I believe, builds Libreoffice regularly with -flto.
[Bug lto/69953] [5/6 Regression] Using lto causes gtkmm/gparted and gtkmm/inkscape compile to fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69953 --- Comment #16 from john.frankish at outlook dot com --- Any news on a possible patch?
[Bug c/7652] -Wswitch-break : Warn if a switch case falls through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7652 --- Comment #38 from Jonathan Wakely --- (In reply to Matthew Woehlke from comment #37) > [[fallthrough]] was approved for C++17. While the standard does not > normatively *require* a diagnostic, it's certainly expected that one be > issued. It's a shame that gcc is behind the curve here. It was approved less than a week ago. It's not 2017 yet. It will get implemented.
[Bug c++/53637] NRVO not applied where there are two different variables involved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53637 TC changed: What|Removed |Added CC||rs2740 at gmail dot com --- Comment #6 from TC --- (In reply to Thomas Braun from comment #4) > (I'm no gcc dev at all) > > In general gcc is much better in doing NRVO/URVO than other compilers > according to my analysis [1]. So maybe the competitors need to get better > first ;) > > [1]: http://www.byte-physics.de/cpp-copy-elision The three cases (L, P, R) where GCC is "better" are actually non-conforming.
[Bug rtl-optimization/70134] New: combine misses jump optimization on powerpc64le
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70134 Bug ID: 70134 Summary: combine misses jump optimization on powerpc64le Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amodra at gmail dot com Target Milestone: --- I'm not sure this is really a combine bug, but I noticed that for x86_64 and gfortran.dg/pr46755.f, combine is able to convert (insn 62 34 35 9 (set (reg/v:SI 94 [ qdw ]) (const_int 0 [0])) /src/gcc.git/gcc/testsuite/gfortran.dg/pr46755.f:16 -1 (nil)) (insn 35 62 36 9 (set (reg:CCZ 17 flags) (compare:CCZ (reg/v:SI 94 [ qdw ]) (const_int 0 [0]))) /src/gcc.git/gcc/testsuite/gfortran.dg/pr46755.f:16 3 {*cmpsi_ccno_1} (nil)) (jump_insn 36 35 37 9 (set (pc) (if_then_else (eq (reg:CCZ 17 flags) (const_int 0 [0])) (label_ref 30) (pc))) /src/gcc.git/gcc/testsuite/gfortran.dg/pr46755.f:16 635 {*jcc_1} (expr_list:REG_DEAD (reg:CCZ 17 flags) (int_list:REG_BR_PROB 5000 (nil))) -> 30) into an unconditional jump which then allows pruning of the fall-through block and a whole lot more optimization. For powerpc64le, the corresponding sequence is split over two blocks (thus not really a combine bug I guess). (insn 76 34 38 2 (set (reg/v:DI 162 [ qdw ]) (const_int 0 [0])) /src/gcc.git/gcc/testsuite/gfortran.dg/pr46755.f:16 -1 (nil)) (insn 38 76 11 2 (set (reg:CC 178) (compare:CC (reg/v:DI 162 [ qdw ]) (const_int 0 [0]))) /src/gcc.git/gcc/testsuite/gfortran.dg/pr46755.f:16 720 {*cmpdi_signed} (expr_list:REG_DEAD (reg/v:DI 162 [ qdw ]) (nil))) (jump_insn 39 36 40 9 (set (pc) (if_then_else (eq (reg:CC 178) (const_int 0 [0])) (label_ref 32) (pc))) /src/gcc.git/gcc/testsuite/gfortran.dg/pr46755.f:16 755 {*rs6000.md:11588} (int_list:REG_BR_PROB 5000 (nil)) -> 32) No pass seems to recognize that reg:CC 178 has a known value (from comparing zero with zero). 
If combine had this information it could convert insn 39 to an unconditional jump, the fallthrough block would be deleted and insn 38 and 76 seen to be dead.
[Bug c/70133] New: AArch64 -mtune=native generates improperly formatted -march parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70133 Bug ID: 70133 Summary: AArch64 -mtune=native generates improperly formatted -march parameters Product: gcc Version: 5.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: davidwillmore at gmail dot com Target Milestone: --- Created attachment 37893 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37893&action=edit compiler build info and execution host info On an ODROID C2 board (Amlogic S905 processor) which has Cortex-A53 cores with the following CPU Features (found in /proc/cpuinfo) "fp asimd crc32", -mtune=native expands to -march=armv8-a+fp+simd+nocrypto+crc+nolse which causes gcc to emit the error: Assembler messages: Error: must specify extensions to add before specifying those to remove Error: unrecognized option -march=armv8-a+fp+simd+nocrypto+crc+nolse If the same build is attempted with "-mtune=native" replaced by "-march=armv8-a+fp+simd+crc+nocrypto+nolse" the compile succeeds as expected.
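The assembler message above says that additions must precede removals in the -march string. As a minimal sketch of that ordering rule (not GCC's driver code; `build_march` and its parameters are hypothetical names, and it assumes every extension token starting with "no" is a removal), the tokens can be emitted in two passes:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from GCC: build an -march string in which
   every "+ext" addition precedes every "+noext" removal.  Assumes a token
   is a removal exactly when it starts with "no".  */
static void
build_march (char *out, size_t outsz, const char *base,
             const char *const *exts, size_t n)
{
  assert (out != NULL && outsz > 0);
  size_t used = (size_t) snprintf (out, outsz, "-march=%s", base);

  /* Pass 0 emits additions, pass 1 emits removals.  */
  for (int pass = 0; pass < 2; pass++)
    for (size_t i = 0; i < n; i++)
      {
        int is_removal = strncmp (exts[i], "no", 2) == 0;
        /* Skip tokens of the other kind, and stop appending once the
           buffer is full.  */
        if (is_removal != pass || used >= outsz)
          continue;
        used += (size_t) snprintf (out + used, outsz - used, "+%s", exts[i]);
      }
}
```

Fed the tokens in the order the report shows (fp, simd, nocrypto, crc, nolse) with base "armv8-a", this produces the string the reporter found to work, with crc moved ahead of nocrypto.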
[Bug target/70117] ppc long double isinf() is wrong?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117 --- Comment #6 from Alan Modra --- > Well, what I don't quite understand is that the gnulib value, which is > > 0x1.f7cp+1023 Sorry, I didn't look properly at the bug before commenting last night. For some reason I thought the gnulib value didn't have the needed zero bit in the mantissa. > likewise should round to the same double value, shouldn't it? Yes, it should, but gcc's LDBL_MAX is the largest 106 bit precision IBM extended double. The gnulib value has 107 bits of precision, 53 bits from each of the component doubles plus an implicit zero bit. As Joseph says, the problem is that gcc evaluates IBM extended double expressions to 106 bit precision. Widening to 107 bits of precision would almost certainly cause other problems.
[Bug target/70117] ppc long double isinf() is wrong?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117 --- Comment #5 from joseph at codesourcery dot com --- The issue is that GCC internally handles IBM long double as having a 106-bit mantissa. There is one value that is larger than can be represented with a 106-bit mantissa, while still being finite and satisfying the rule that the two halves, added in round-to-nearest, produce the top half. In my view, isinf, isnan and isfinite for IBM long double should all be special-cased in GCC to call the corresponding operation on the top half of the number (for both hard and soft float). This is correct in all cases and more efficient than anything that needs to look at both halves. It should also be a lot easier than fixing LDBL_MAX by allowing numbers with mantissas wider than 106 bits.
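The proposal above, classifying an IBM long double by looking only at its top half, can be sketched in portable C. This is an illustration under stated assumptions, not GCC's implementation: `struct ibm_ldbl` and the `dd_*` functions are hypothetical names that model the pair as two plain doubles.

```c
#include <math.h>

/* Illustrative model of an IBM long double: the value is hi + lo, where hi
   is the sum rounded to the nearest double.  The type name is hypothetical. */
struct ibm_ldbl { double hi; double lo; };

/* Each classification looks only at the high double, as proposed in the
   comment above; the low double is never inspected.  */
static int dd_isinf (struct ibm_ldbl x)    { return isinf (x.hi) != 0; }
static int dd_isnan (struct ibm_ldbl x)    { return isnan (x.hi) != 0; }
static int dd_isfinite (struct ibm_ldbl x) { return isfinite (x.hi) != 0; }
```

With this model, a pair whose high half is the largest finite double is reported as finite regardless of the low half, which is the behavior the thread asks for in the case of a value wider than 106 bits.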
[Bug c/70114] Incompatible implicit function declaration error when parameters actually match a prototype in another scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70114 --- Comment #1 from joseph at codesourcery dot com --- On Mon, 7 Mar 2016, tanzx940228 at hotmail dot com wrote: > 3.gcc.error.c > - > int main() { > { > int foo(float arg0, float arg1); > foo(3.0f, 4.0f); > } > { > foo(3.0f, 4.0f); // clang passes, gcc gets an error??? > } > } > > This is really weird! Clang gives me a warning saying that I'm using an > out-of-scope prototype. BUT gcc gives me an error, saying "incompatible > implicit declaration of function ‘foo’"! What happened? An unprototyped function type is never compatible with a function prototype with argument types subject to the default argument promotions, such as float. See C90 6.5.4.3 (reference to C90 since you're using implicit declarations).
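To illustrate the point about scope: the block-scope prototype in the report ends with its block, so the second call falls back to an implicit, unprototyped declaration, which cannot match parameters of type float. A minimal sketch of one way to avoid this, assuming the function can be declared at file scope (the body of `foo` and the wrapper `call_in_two_scopes` are made up for the example):

```c
/* File-scope prototype: visible in every block below, so no implicit
   declaration is ever created.  */
static int foo (float arg0, float arg1);

static int call_in_two_scopes (void)
{
  int total = 0;
  {
    total += foo (3.0f, 4.0f);
  }
  {
    total += foo (3.0f, 4.0f);   /* the prototype is still in scope here */
  }
  return total;
}

/* Hypothetical body, only so the example links and can be checked.  */
static int foo (float arg0, float arg1)
{
  return (int) (arg0 + arg1);
}
```

Because both calls see the same prototype, the arguments are passed as float in both blocks and no promotion to double takes place.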
[Bug target/70048] [6 Regression][AArch64] Inefficient local array addressing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048 --- Comment #12 from Wilco --- (In reply to Jiong Wang from comment #11) > (In reply to Richard Henderson from comment #10) > > Created attachment 37890 [details] > > second patch > > > > Still going through full testing, but I wanted to post this > > before the end of the day. > > > > This update includes a virt_or_elim_regno_p, as discussed in #c7/#c8. > > > > It also updates aarch64_legitimize_address to treat R0+R1+C as a special > > case of R0+(R1*S)+C. All of the arguments wrt scaling apply to unscaled > > indices as well. > > > > As a minor point, doing some of the expansion in a slightly different > > order results in less garbage rtl being generated in the process. > > Richard, > > I just recalled the reassociation of constant offset with vritual frame > pointer will increase register pressure, thus cause bad code generation > under some situations. For example, the testcase given at > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173#c8 > > void bar(int i) > { > char A[10]; > char B[10]; > char C[10]; > g(A); > g(B); > g(C); > f(A[i]); > f(B[i]); > f(C[i]); > return; > } > > Before your patch we are generating (-O2) > === > bar: > stp x29, x30, [sp, -80]! > add x29, sp, 0 > add x1, x29, 80 > str x19, [sp, 16] > mov w19, w0 > add x0, x29, 32 > add x19, x1, x19, sxtw > bl g > add x0, x29, 48 > bl g > add x0, x29, 64 > bl g > ldrbw0, [x19, -48] > bl f > ldrbw0, [x19, -32] > bl f > ldrbw0, [x19, -16] > bl f > ldr x19, [sp, 16] > ldp x29, x30, [sp], 80 > ret > > After your patch, we are generating: > === > bar: > stp x29, x30, [sp, -96]! 
> add x29, sp, 0 > stp x21, x22, [sp, 32] > add x22, x29, 48 > stp x19, x20, [sp, 16] > mov w19, w0 > mov x0, x22 > add x21, x29, 64 > add x20, x29, 80 > bl g > mov x0, x21 > bl g > mov x0, x20 > bl g > ldrbw0, [x22, w19, sxtw] > bl f > ldrbw0, [x21, w19, sxtw] > bl f > ldrbw0, [x20, w19, sxtw] > bl f > ldp x19, x20, [sp, 16] > ldp x21, x22, [sp, 32] > ldp x29, x30, [sp], 96 > ret > > We are using more callee saved registers, thus extra stp/ldp generated. > > But we do will benefit from reassociation constant offset with virtual > frame pointer if it's inside loop, because: > >* vfp + const_offset is loop invariant >* the virtual reg elimination on vfp will eventually generate one > extra instruction if it was not used with const_offset but another reg. > > Thus after this reassociation, rtl IVOPT can hoist it out of loop, and we > will save two instructions in the loop. > > A fix was proposed for loop-invariant.c to only do such reshuffling for > loop, see https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01253.html. That > patch finally stopped because the issue PR62173 was fixed on tree level, and > the pointer re-shuffling was considered to have hidding overflow risk though > will be very rare. I don't believe this is really worse - if we had say the same example with 3 pointers or 3 global arrays we should get the exact same code (and in fact generating the same canonicalized form for different bases and scales is essential). Once you've done that you can try optimizing accesses which differ by a *small* constant offset.
[Bug fortran/70058] Segmentation fault when open file with existing file and status = "UNKNOWN"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70058 Jerry DeLisle changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |INVALID --- Comment #9 from Jerry DeLisle --- (In reply to Paul from comment #8) > I downloaded the latest version I could find from > https://sourceforge.net/projects/mingw-w64/?source=typ_redirect. It is > version 5.3.0. The problem does not exist with version 5.3.0. My > apologies for not finding the latest gfortran version earlier. Glad to hear you have resolved.
[Bug c++/70112] [lto] Segmentation fault in Libreoffice's program gengal.bin when build with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70112 --- Comment #3 from Marek Behun --- Created attachment 37892 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37892&action=edit sanitizer-output.txt
[Bug target/70048] [6 Regression][AArch64] Inefficient local array addressing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048 --- Comment #11 from Jiong Wang --- (In reply to Richard Henderson from comment #10) > Created attachment 37890 [details] > second patch > > Still going through full testing, but I wanted to post this > before the end of the day. > > This update includes a virt_or_elim_regno_p, as discussed in #c7/#c8. > > It also updates aarch64_legitimize_address to treat R0+R1+C as a special > case of R0+(R1*S)+C. All of the arguments wrt scaling apply to unscaled > indices as well. > > As a minor point, doing some of the expansion in a slightly different > order results in less garbage rtl being generated in the process. Richard, I just recalled that reassociating the constant offset with the virtual frame pointer will increase register pressure and thus cause bad code generation in some situations. For example, the testcase given at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173#c8 void bar(int i) { char A[10]; char B[10]; char C[10]; g(A); g(B); g(C); f(A[i]); f(B[i]); f(C[i]); return; } Before your patch we are generating (-O2) === bar: stp x29, x30, [sp, -80]! add x29, sp, 0 add x1, x29, 80 str x19, [sp, 16] mov w19, w0 add x0, x29, 32 add x19, x1, x19, sxtw bl g add x0, x29, 48 bl g add x0, x29, 64 bl g ldrb w0, [x19, -48] bl f ldrb w0, [x19, -32] bl f ldrb w0, [x19, -16] bl f ldr x19, [sp, 16] ldp x29, x30, [sp], 80 ret After your patch, we are generating: === bar: stp x29, x30, [sp, -96]! add x29, sp, 0 stp x21, x22, [sp, 32] add x22, x29, 48 stp x19, x20, [sp, 16] mov w19, w0 mov x0, x22 add x21, x29, 64 add x20, x29, 80 bl g mov x0, x21 bl g mov x0, x20 bl g ldrb w0, [x22, w19, sxtw] bl f ldrb w0, [x21, w19, sxtw] bl f ldrb w0, [x20, w19, sxtw] bl f ldp x19, x20, [sp, 16] ldp x21, x22, [sp, 32] ldp x29, x30, [sp], 96 ret We are using more callee saved registers, thus extra stp/ldp generated. 
But we will benefit from reassociating the constant offset with the virtual frame pointer if it's inside a loop, because: * vfp + const_offset is loop invariant * the virtual reg elimination on vfp will eventually generate one extra instruction if it was not used with const_offset but with another reg. Thus after this reassociation, rtl IVOPT can hoist it out of the loop, and we will save two instructions in the loop. A fix was proposed for loop-invariant.c to only do such reshuffling for loops, see https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01253.html. That patch was eventually dropped because PR62173 was fixed at the tree level, and the pointer reshuffling was considered to carry a hidden overflow risk, though a very rare one.
[Bug c++/70112] [lto] Segmentation fault in Libreoffice's program gengal.bin when build with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70112 --- Comment #2 from Marek Behun --- (In reply to Markus Trippelsdorf from comment #1) Does not happen with -O0, does with -O1, -O2, -O3. When gengal.bin is compiled with -O0 -fsanitize=undefined, then it runs successfully, but the sanitizer prints some errors; I am attaching them in sanitizer-output.txt. With higher optimizations and -fsanitize=undefined the sanitizer prints the same, but then the program crashes. I therefore do not know if the segfault is caused by the same thing that the sanitizer complains about. How can I discover if those things are related? The issue does not go away when compiled with "-fno-lifetime-dse -fno-delete-null-pointer-checks". Any ideas how to GDB it?
[Bug driver/70132] New: ARM -mcpu=native can cause a double free abort.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70132 Bug ID: 70132 Summary: ARM -mcpu=native can cause a double free abort. Product: gcc Version: 4.9.2 Status: UNCONFIRMED Severity: normal Priority: P3 Component: driver Assignee: unassigned at gcc dot gnu.org Reporter: dariushardy1 at gmail dot com Target Milestone: --- When attempting to use cpu autodetection (either -mcpu=native or -march=native) on ARM where the cpu isn't known by the detection routine, e.g. Cortex-A53 in aarch32 mode, the detection function attempts to fclose() a file twice. gcc/config/arm/driver-arm.c 131 fclose (f); 132 133 if (val == NULL) 134 goto not_found; 135 136 return concat ("-m", argv[0], "=", val, NULL); 137 138 not_found: 139 { 140 unsigned int i; 141 unsigned int opt; 142 const char *search[] = {NULL, "arch"}; 143 144 if (f) 145 fclose (f); When the cpu identifier isn't known, val is NULL on reaching this part, so the file f ("/proc/cpuinfo") is closed on line 131 and then again on line 145, causing an abort. Setting f = NULL after the first fclose() should prevent it, but it isn't done. rpi3 is a Cortex-A53 running in aarch32 mode. Whilst the A53 is known and gcc will compile for it, the autodetect code doesn't have it listed. pi@rpi3:~ $ gcc -mcpu=native *** Error in `gcc': double free or corruption (top): 0x00f5abd0 *** Aborted Noticed in 4.9.2, but the code for 5.3.0 appears to still have this. pi@rpi3:~ $ gcc --version gcc (Raspbian 4.9.2-10) 4.9.2
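The control flow described in this report can be reproduced in a few lines. The following is a minimal sketch of the suggested remedy (clearing the pointer after the first fclose), not the actual driver-arm.c code; `find_key` and its parameters are hypothetical stand-ins for the detection routine.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the detection routine: returns 1 if some line
   of the file at PATH starts with KEY, else 0.  */
static int find_key (const char *path, const char *key)
{
  char buf[128];
  int found = 0;
  FILE *f = fopen (path, "r");

  if (f == NULL)
    goto not_found;

  while (fgets (buf, sizeof buf, f) != NULL)
    if (strncmp (buf, key, strlen (key)) == 0)
      {
        found = 1;
        break;
      }

  fclose (f);
  f = NULL;          /* the change suggested in the report */

  if (found)
    return 1;

not_found:
  if (f)
    fclose (f);      /* skipped when the file was already closed above */
  return 0;
}
```

With the pointer cleared, the shared not_found path can still close a file that is open on entry, but it no longer closes the same stream a second time when the key is simply missing.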
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 --- Comment #8 from Jakub Jelinek --- Well, it is a question whether it is a -fno-strict-aliasing or a -std=kernel thing. Anyway, IMHO what matters is the points-to stuff: Points-to sets ANYTHING = { ANYTHING } ESCAPED = { ESCAPED NONLOCAL __aeabi_uidiv __aeabi_idiv } NONLOCAL = { ESCAPED NONLOCAL } STOREDANYTHING = { } INTEGER = { ANYTHING } __aeabi_uidiv.0_1 = { __aeabi_uidiv } __aeabi_uidiv = { ESCAPED NONLOCAL } fn_addr_2 = { __aeabi_uidiv } fn_addr.1_3 = { __aeabi_uidiv } same as fn_addr_2 derefaddrtmp(12) = { NONLOCAL } _6 = { NONLOCAL __aeabi_uidiv } _7 = { NONLOCAL __aeabi_uidiv } same as _6 ... Flow-insensitive points-to information fn_addr.1_3, points-to vars: { } (escaped) _7, points-to non-local, points-to vars: { } (escaped) And the non-local is what makes the difference between the two during DSE, where ref_maybe_used_by_call_p_1 is called with the v7_coherent_kern_range call, and pt_solutions_intersect then differs between the two stores. The only thing that really surprises me is that in the Red Hat bugzilla bug report I've referenced in the URL there is a claim this works with gcc 5.x; that is a mystery to me. At least if I bisect this testcase on x86_64 with -Os -fno-strict-aliasing, the two stores started to be removed with r165641, so quite a while ago. IMHO the kernel people have to change their code even if there is agreement we should make -fno-strict-aliasing more forgiving in this case, because 4.[6-9]*/5 ought to be affected too.
[Bug target/59054] Powerpc -O0 -mcpu=power7 generates sub-optimal code to load 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59054 Michael Meissner changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from Michael Meissner --- This was fixed in 2013, but never closed.
[Bug target/70131] PowerPC ISA 2.07 is inefficient at doint (float)(int)x.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70131 Michael Meissner changed: What|Removed |Added CC||dje at gcc dot gnu.org, ||wschmidt at gcc dot gnu.org Severity|normal |enhancement
[Bug target/70131] New: PowerPC ISA 2.07 is inefficient at doint (float)(int)x.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70131 Bug ID: 70131 Summary: PowerPC ISA 2.07 is inefficient at doint (float)(int)x. Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: meissner at gcc dot gnu.org Target Milestone: --- The current GCC compiler does not use the vupkhsw instruction that was introduced in ISA 2.07 (power8) to sign extend the 32-bit integer to 64-bit integer. Instead it does a store and then a load to sign extend the value. The code: double foo (double x) { return (double)(int)x; } generates the following code for power8: fctiwz 1,1 addi 9,1,-16 stfiwx 1,0,9 ori 2,2,0 lfiwax 1,0,9 fcfid 1,1 It does this because It should generate something like: xscvdpsxws 33,1 vupkhsw 1,1 xxpermdi 33,33,33,2 xscvsxddp 1,33
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 --- Comment #7 from Richard Henderson --- (In reply to Andrew Pinski from comment #5) > I still say this is undefined even with -fno-strict-aliasing because > patching a function is undefined. Oh please. I think that's short-sighted. I don't see how this differs materially from e.g. const int x; void f(void) { *(int *)&x = 1; } and we don't delete that store. We need a mode in which it's possible to do things that aren't valid in "normal" C. We've more-or-less settled on -fn-s-a as an escape whereby treating all memory as a collection of bytes is valid.
[Bug tree-optimization/70130] [6 Regression] h264ref fails with verification error starting with r231674
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70130 Andrew Pinski changed: What|Removed |Added Keywords||wrong-code Target Milestone|--- |6.0 Summary|h264ref fails with |[6 Regression] h264ref |verification error starting |fails with verification |with r231674|error starting with r231674
[Bug tree-optimization/70130] New: h264ref fails with verification error starting with r231674
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70130 Bug ID: 70130 Summary: h264ref fails with verification error starting with r231674 Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: pthaugen at gcc dot gnu.org CC: dje at gcc dot gnu.org, rguenth at gcc dot gnu.org, wschmidt at gcc dot gnu.org Target Milestone: --- Host: powerpc64*-unknown-linux-gnu Target: powerpc64*-unknown-linux-gnu Build: powerpc64*-unknown-linux-gnu Created attachment 37891 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37891&action=edit .i containing loop CPU2006 benchmark 464.h264ref started failing with a verification error with r231674. I have been having a hard time tracking down the exact problem, but have isolated it down to a specific loop. The attached file is the .i from that specific loop from benchmark file block.c:intrapred_luma_16x16(). Compile cmd: gcc -c -m64 -O3 -mcpu=power7 -fsigned-char Loop_err.c Prior to r231674 the loop was not vectorized when specifying -mcpu=power7 and the benchmark ran fine. Note that the benchmark passes both before/after if built with -mcpu=power8. In this case the loop was already being vectorized prior to r231674, but with r231674 it looks like the loop is eliminated (peeled?). Also note that when I run the benchmark on one of our Power7 boxes, it passes if I link the .o files with the distro ld (2.20), but fails when using a newer version (2.22 or 2.25).
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 --- Comment #6 from Andrew Pinski --- Note since 4.5 is not out yet there looks to be another week before it comes out. The fix to the kernel can make it in.
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 --- Comment #5 from Andrew Pinski --- I still say this is undefined even with -fno-strict-aliasing because patching a function is undefined.
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 --- Comment #4 from Jeffrey A. Law --- Shouldn't the call to v7_coherent_kern_range prevent DSE from removing the store to fn_addr[0] and fn_addr[1]? It feels like the undefined nature of this code is leading to a lack of proper dataflow for the virtual operands. I haven't thrown it under a debugger, but that's how it seems at first glance to me.
[Bug target/70048] [6 Regression][AArch64] Inefficient local array addressing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048 Richard Henderson changed: What|Removed |Added Attachment #37886|0 |1 is obsolete|| --- Comment #10 from Richard Henderson --- Created attachment 37890 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37890&action=edit second patch Still going through full testing, but I wanted to post this before the end of the day. This update includes a virt_or_elim_regno_p, as discussed in #c7/#c8. It also updates aarch64_legitimize_address to treat R0+R1+C as a special case of R0+(R1*S)+C. All of the arguments wrt scaling apply to unscaled indices as well. As a minor point, doing some of the expansion in a slightly different order results in less garbage rtl being generated in the process.
[Bug lto/69650] [6 Regression] ICE in linemap_line_start, at libcpp/line-map.c:803
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69650 --- Comment #9 from Bernd Schmidt --- Hmm, seems to break Ada of all things...
[Bug libstdc++/70129] [6 Regression] stdlib.h: No such file or directory when using -isystem /usr/include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70129 Markus Trippelsdorf changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WONTFIX --- Comment #2 from Markus Trippelsdorf --- OK. Thanks.
[Bug libstdc++/70129] [6 Regression] stdlib.h: No such file or directory when using -isystem /usr/include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70129 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- Yeah, this is known, but I'm afraid there is nothing that can be done easily about it. Just don't do it. Well, in theory, we could slow everything down by adding yet another default include directory that would come after /usr/include in the default search scope, and would contain some fallback stdlib.h and math.h for these cases, but that would be too ugly. So, IMHO just the packages that use this should either know what they are doing and put the C++ STL system headers first, or don't use STL, or don't mess with -isystem for the default directories. The last one preferred.
[Bug middle-end/70127] [6 Regression] wrong code on x86_64-linux-gnu at -O3 in 32-bit and 64-bit modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70127 Markus Trippelsdorf changed: What|Removed |Added Keywords||wrong-code Status|UNCONFIRMED |NEW Last reconfirmed||2016-03-07 CC||hubicka at gcc dot gnu.org, ||trippels at gcc dot gnu.org Component|web |middle-end Summary|wrong code on |[6 Regression] wrong code |x86_64-linux-gnu at -O3 in |on x86_64-linux-gnu at -O3 |32-bit and 64-bit modes |in 32-bit and 64-bit modes Ever confirmed|0 |1 --- Comment #1 from Markus Trippelsdorf --- Started with r229494: commit 93182daeb14e34045324299014949ebfd7c160cb Author: hubicka Date: Wed Oct 28 16:35:15 2015 + * fold-const.c (operand_equal
[Bug libstdc++/70129] New: [6 Regression] stdlib.h: No such file or directory when using -isystem /usr/include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70129 Bug ID: 70129 Summary: [6 Regression] stdlib.h: No such file or directory when using -isystem /usr/include Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: trippels at gcc dot gnu.org Target Milestone: --- markus@x4 tmp % echo "#include <algorithm>" | g++ -x c++ -isystem /usr/include - In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.0.0/include/g++-v6/bits/stl_algo.h:59:0, from /usr/lib/gcc/x86_64-pc-linux-gnu/6.0.0/include/g++-v6/algorithm:62, from <stdin>:1: /usr/lib/gcc/x86_64-pc-linux-gnu/6.0.0/include/g++-v6/cstdlib:75:25: fatal error: stdlib.h: No such file or directory #include_next <stdlib.h> ^ compilation terminated. gcc-5 and older are fine.
[Bug target/70064] Wrong code with custom flags and quite big testcase @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70064 --- Comment #12 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #5) > So, shall we silently disable -mred-zone for -fpic/-fPIE in 32-bit code? > Or error out in that combination? > Or disable it only if we need PIC pointer? > What about other cases where one can have calls in leaf functions (say the > various -p/-mfentry cases, or are those considered non-leaf)? > Or is this simply a user error? FTR, the patch disables redzone when call to pc thunk is emitted in the current function. mcount and __fentry__ are called before redzone is accessed, so they are safe.
[Bug target/70064] Wrong code with custom flags and quite big testcase @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70064 Uroš Bizjak changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #11 from Uroš Bizjak --- Fixed.
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 --- Comment #3 from Jakub Jelinek --- *.mergephi1 has (just the first half, the second one is analogous): __aeabi_uidiv.0_1 = (long unsigned int) __aeabi_uidiv; fn_addr_2 = __aeabi_uidiv.0_1 & 4294967294; fn_addr.1_3 = (unsigned int *) fn_addr_2; *fn_addr.1_3 = 3878744336; _6 = fn_addr_2 + 4; _7 = (unsigned int *) _6; *_7 = 3778019102; _9 = fn_addr_2 + 8; v7_coherent_kern_range (fn_addr_2, _9); Now, dse1 details say: Deleted dead store '*fn_addr.1_3 = 3878744336; and indeed, that is what happened: __aeabi_uidiv.0_1 = (long unsigned int) __aeabi_uidiv; fn_addr_2 = __aeabi_uidiv.0_1 & 4294967294; fn_addr.1_3 = (unsigned int *) fn_addr_2; _6 = fn_addr_2 + 4; _7 = (unsigned int *) _6; *_7 = 3778019102; _9 = fn_addr_2 + 8; v7_coherent_kern_range (fn_addr_2, _9);
[Bug target/70064] Wrong code with custom flags and quite big testcase @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70064 --- Comment #10 from uros at gcc dot gnu.org --- Author: uros Date: Mon Mar 7 19:54:02 2016 New Revision: 234050 URL: https://gcc.gnu.org/viewcvs?rev=234050&root=gcc&view=rev Log: PR target/70064 * config/i386/i386.h (machine_function): Add pc_thunk_call_expanded flag. (ix86_pc_thunk_call_expanded): New define. * config/i386/i386.md (set_got, set_got_labelled): New expanders. (*set_got): Rename insn pattern from set_got. (*set_got_labelled): Rename inst pattern from set_got_labelled. * config/i386/i386.c (ix86_compute_frame_layout): Use ix86_pc_thunk_call_expanded to prevent red-zone. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.h trunk/gcc/config/i386/i386.md
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128 --- Comment #2 from Jeffrey A. Law --- Just so I'm clear on what's happening here. Precisely which stores are getting removed?
[Bug tree-optimization/70128] Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128
--- Comment #1 from Andrew Pinski ---
I think anyone patching functions needs to be considered special and undefined. In most cases they did not mean to do it. Just the kernel is special.

> just add an optimization barrier on fn_addr (like asm ("" : "+g" (fn_addr))).

I think that is the correct fix no matter what, as patching functions is not something you can normally do because the .text section is read-only.
[Bug tree-optimization/70128] New: Linux kernel div patching optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70128
Bug ID: 70128
Summary: Linux kernel div patching optimized away
Product: gcc
Version: 6.0
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1303147
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jakub at gcc dot gnu.org
CC: rth at gcc dot gnu.org
Target Milestone: ---
Target: arm*-linux-gnueabi

Linux kernel in arch/arm/kernel/setup.c contains questionable code, reduced into:

extern void v7_coherent_kern_range(unsigned long, unsigned long);

void patch_aeabi_idiv(void)
{
  extern void __aeabi_uidiv(void);
  extern void __aeabi_idiv(void);
  unsigned long fn_addr;

  fn_addr = ((unsigned long)&__aeabi_uidiv) & ~1;
  ((unsigned int *)fn_addr)[0] = 0xe730f110;
  ((unsigned int *)fn_addr)[1] = 0xe12fff1e;
  v7_coherent_kern_range(fn_addr,fn_addr + 8);

  fn_addr = ((unsigned long)&__aeabi_idiv) & ~1;
  ((unsigned int *)fn_addr)[0] = 0xe710f110;
  ((unsigned int *)fn_addr)[1] = 0xe12fff1e;
  v7_coherent_kern_range(fn_addr,fn_addr + 8);
}

where even when this is compiled with -fno-strict-aliasing -Os (and lots of other options), the ((unsigned int *)fn_addr)[0] stores are removed by tree DSE (supposedly points-to analysis figures out that fn_addr points to a FUNCTION_DECL and doesn't set pi->nonlocal, while for ((unsigned int *)fn_addr)[1] it is already set).

The question is, is -fno-strict-aliasing meant to also disable some points-to optimizations, or is the above considered invalid even with -fno-strict-aliasing?

Of course, the fix for the kernel is easy, just add an optimization barrier on fn_addr (like asm ("" : "+g" (fn_addr))).
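
The barrier suggested at the end of the report can be sketched in isolation. This is only an illustration, assuming a GCC-compatible compiler: hide_address and patch_two_words are hypothetical names that do not come from the kernel, and the target here is an ordinary writable buffer rather than function text.

```c
#include <stdint.h>

/* Hypothetical helper: an empty asm with a "+g" (read-write, general
   operand) constraint.  The compiler has to assume the asm may have
   changed the value, which is intended to stop it from reasoning about
   what the address originally pointed to.  */
static inline uintptr_t hide_address(uintptr_t addr)
{
    __asm__ ("" : "+g" (addr));
    return addr;
}

/* Hypothetical stand-in for the patching routine: store two 32-bit
   words through the address after it has passed the barrier.  */
void patch_two_words(void *target, uint32_t w0, uint32_t w1)
{
    uintptr_t fn_addr = hide_address((uintptr_t)target);

    ((uint32_t *)fn_addr)[0] = w0;
    ((uint32_t *)fn_addr)[1] = w1;
}
```

Whether this is the right fix for the kernel, as opposed to a change in GCC, is exactly what the report asks; the sketch only shows the shape of the workaround.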
[Bug c++/70126] VLA accepted in sizeof and typedef, allowing integer overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70126
--- Comment #2 from Martin Sebor ---
Yes, there are some significant differences between C99 VLAs and N3639. I don't know how common using sizeof with VLA types is in C++. I suspect not very. VLAs never did make it into C++ 14 (though I'm not sure if they will stay out of C++ 17 or whichever next version finally adopts a more recent version of the C standard), but they will very likely continue to be supported by G++ for C (and GCC) compatibility.

The problem is that the current G++ implementation is a hybrid of N3639 and C11, with the most treacherous elements included from each (G++ allows initialization which is disallowed by C, and as noted in bug 70075, for example, gets it wrong, and it allows applying sizeof to VLA typedefs which is disallowed by N3639, and as noted in this bug, gets that wrong as well).

I think a good way to resolve this bug would be in the same spirit as in my proposed patch for bug 69517: by continuing to accept VLA typedefs and VLA types in [runtime] sizeof expressions (perhaps with a warning noting that they are evaluated at runtime) for compatibility with C, and by having the sizeof expression throw an exception on overflow.
[Bug target/70048] [6 Regression][AArch64] Inefficient local array addressing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #9 from Richard Henderson ---
While I fully believe in CSE'ing "base + reg*scale" when talking about non-stack-based pointers, when it comes to stack-based data access I'm less certain about the proper approach.

All things work out "best" when there's no (or little) offset applied during register elimination. When this can be true, all of the rtl optimizations see the final address and can do the right thing.

This isn't easy to do for AArch64, however. So we need to accept that some amount of concession needs to be made so that it's not too difficult to turn reg + scale + c1 + c2 into a final address without extra steps.

We already special case the eliminable frame registers in aarch64_classify_address to allow arbitrary offset, and we're prepared to split to a proper offset during RA. It wouldn't be out of the question to allow "reg + scale + c" as well.

We can probably come up with some good heuristics for splitting into a number of cases based on the generalized "((reg + hi_c) + scale) + lo_c". But the patch we take for stage4 must be less than the full solution.
[Bug tree-optimization/70094] Missed optimization when passing a constant struct argument by value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70094
--- Comment #2 from Robert Obryk ---
Note that this optimization can be applied not only when the parameter is a compile-time constant. The following function can also be compiled so as not to touch the stack:
--snip--
struct foo { int a; int b; int c; };
void bar(foo);
void baz(int x) {
  foo f;
  f.a = x;
  f.b = 5;
  f.c = 1;
  bar(f);
}
--snip--

According to godbolt, clang compiles this to:
--snip--
baz(int):                               # @baz(int)
        movl    %edi, %eax
        movabsq $21474836480, %rdi      # imm = 0x500000000
        orq     %rax, %rdi
        movl    $1, %esi
        jmp     bar(foo)                # TAILCALL
--snip--
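
The constant in the clang output, 21474836480, is 5 shifted left by 32 bits: the first two int members travel together in one 64-bit register, with a in the low half and b in the high half. A minimal sketch of that packing, assuming the little-endian x86-64 calling convention seen in the listing (pack_a_b is a hypothetical name used only for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: build the 64-bit register image that carries the
   first two int members of the struct, a in bits 0..31 and b in bits
   32..63.  The casts through uint32_t keep a negative a from
   sign-extending into the upper half.  */
static inline uint64_t pack_a_b(int a, int b)
{
    return ((uint64_t)(uint32_t)b << 32) | (uint32_t)a;
}
```

With b fixed at 5, the high half is a compile-time constant and only the OR with x remains at run time, which matches the movabsq/orq pair above.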
[Bug web/70127] New: wrong code on x86_64-linux-gnu at -O3 in 32-bit and 64-bit modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70127
Bug ID: 70127
Summary: wrong code on x86_64-linux-gnu at -O3 in 32-bit and 64-bit modes
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: web
Assignee: unassigned at gcc dot gnu.org
Reporter: chengniansun at gmail dot com
Target Milestone: ---

The following code is miscompiled by the trunk, gcc-4.7 and gcc-4.8 at -O3 in both 32-bit and 64-bit modes on x86_64-linux-gnu.

$: gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto --prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 6.0.0 20160307 (experimental) [trunk revision 234022] (GCC)
$:
$: gcc-trunk -O3 small.c ; ./a.out
0
$: gcc-4.7 -O3 small.c ; ./a.out
0
$: gcc-4.8 -O3 small.c ; ./a.out
0
$: gcc-trunk -O0 small.c ; ./a.out
1
$: cat small.c
int printf(const char *, ...);
struct S0 {
  int f0;
  signed f1 : 2;
} a[1], c = {5, 1}, d;
short b;
int main() {
  for (; b <= 0; b++) {
    struct S0 e = {1, 1};
    d = e = a[0] = c;
  }
  printf("%d\n", a[0].f1);
  return 0;
}
$:
[Bug target/24998] [4.9/5/6 Regression] Build failure: undefined symbol __floatunsitf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24998 Jeffrey A. Law changed: What|Removed |Added Status|NEW |RESOLVED CC||law at redhat dot com Resolution|--- |FIXED --- Comment #51 from Jeffrey A. Law --- MEP was the last target that needed updating and it has been deprecated. Thus, I'm closing this BZ as resolved.
[Bug bootstrap/44756] [meta-bug] --enable-werror-always issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44756 Bug 44756 depends on bug 49401, which changed state. Bug 49401 Summary: Warning regression for 'uninitialized' variable on non-existant code path (in mep-pragma.c) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49401 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |WONTFIX
[Bug other/49401] Warning regression for 'uninitialized' variable on non-existant code path (in mep-pragma.c)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49401 Jeffrey A. Law changed: What|Removed |Added Status|NEW |RESOLVED CC||law at redhat dot com Resolution|--- |WONTFIX --- Comment #3 from Jeffrey A. Law --- MEP has been deprecated.
[Bug target/64402] mep-elf ICE in pre_and_rev_post_order_compute, at cfganal.c:1022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64402 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||law at redhat dot com Resolution|--- |WONTFIX --- Comment #5 from Jeffrey A. Law --- MEP has been deprecated.
[Bug c++/70126] VLA accepted in sizeof and typedef, allowing integer overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70126 --- Comment #1 from Florian Weimer --- There seems to be a fundamental incompatibility here with supporting the GNU CC VLA extension and this new (and apparently dead) C++ VLA specification. I wonder how much existing G++ code applies sizeof to a VLA type (as opposed to a VLA object, which seems to be allowed under the C++ proposal).
[Bug target/70048] [6 Regression][AArch64] Inefficient local array addressing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #8 from amker at gcc dot gnu.org ---
(In reply to Richard Henderson from comment #6)
> Created attachment 37886 [details]
> proposed patch
>
> I agree -- at minimum virtual and eliminable frame registers ought to be
> special-cased. If we separate the constants too far, we'll never be able
> to fold the constant plus the adjustment back together.
>
> If the statement in #c4 is taken at face value -- that r233136 was applied
> to simplify frame-based array accesses... Well, I simply don't believe
> that.
>
> I can see how the patch would aid reduction of access to members of a
> structure that are in an array which in turn is *not* on the stack. But
> for the average stack-based access I can't see except that it would hurt.

Hi Richard, my comment was about when to legitimize an address expression of the form "base + reg << scaling + offset". The gimple passes could be improved to catch CSE opportunities of "reg << scaling" so that "base + offset" can be moved out of the memory reference. That is beneficial since "base + offset" is likely invariant in loop context, especially when the base is sp/fp related. IMHO, this is the same transform as your proposed patch does. Or maybe I misunderstood something important?

Thanks.
[Bug testsuite/70009] test case libgomp.oacc-c-c++-common/vprop.c fails starting with its introduction in r233607
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70009 James Greenhalgh changed: What|Removed |Added Target|powerpc*-*-*, aarch64-*-* |powerpc*-*-*, aarch64-*-*, ||arm*-*-* Last reconfirmed|2016-02-29 00:00:00 |2016-3-7 CC||jgreenhalgh at gcc dot gnu.org --- Comment #5 from James Greenhalgh --- Also failing on arm/aarch64 (so good further evidence of signed vs. unsigned char). Forcing the macro to use signed types clears the error for me on arm-none-linux-gnueabihf (though I don't know if this is correct).
[Bug c++/70105] [6 regression] giant warning when building gcc-5 with gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70105 David Malcolm changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |dmalcolm at gcc dot gnu.org --- Comment #2 from David Malcolm --- Thanks; I can reproduce this locally (e.g. building tree-complex.o in r233492 on gcc-5-branch using gcc 6 20160212); am investigating.
[Bug c++/70126] New: VLA accepted in sizeof and typedef, allowing integer overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70126
Bug ID: 70126
Summary: VLA accepted in sizeof and typedef, allowing integer overflow
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: msebor at gcc dot gnu.org
Target Milestone: ---

G++ 4.9.3 added support for variable-length arrays specified in WG21 document N3639 (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3639.html). That document specifies, among other restrictions, that:

* The sizeof operator shall not be applied to [...] an array of runtime bound...
* A typedef-name shall not name an array of runtime bound.

G++ 4.9 fails to enforce these restrictions, making it possible for the definition of a VLA type to cause an integer overflow. The following test case (which is invalid, according to N3639) shows the problems.

$ cat v.c && /home/msebor/build/gcc-4.9.3/gcc/xg++ -B/home/msebor/build/gcc-4.9.3/gcc -Wall -Wextra -L /home/msebor/build/gcc-4.9.3/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs -std=c++11 -xc++ v.c && ./a.out
typedef __SIZE_TYPE__ size_t;

void __attribute__ ((noclone, noinline)) bar (size_t m)
{
  typedef int A [m];
  typedef A A2 [4];

  __builtin_printf ("sizeof (A) = %zu\nsizeof (A2) = %zu\n", sizeof (A), sizeof (A2));

  if (sizeof (A2) < sizeof (A))
    __builtin_abort ();
}

int main ()
{
  try {
    bar (__SIZE_MAX__ / sizeof (int));
    __builtin_trap ();
  }
  catch (...) {
    __builtin_printf ("exception caught\n");
  }
}
sizeof (A) = 18446744073709551612
sizeof (A2) = 18446744073709551600
Aborted (core dumped)
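
The second value in the output is the first multiplied by 4 and reduced modulo 2^64: (2^64 - 4) * 4 wraps to 2^64 - 16 = 18446744073709551600, which is smaller than sizeof (A). A minimal sketch of a size computation that reports the wrap instead, assuming a compiler that provides __builtin_mul_overflow (GCC 5 and later, Clang); checked_array_bytes is a hypothetical name and not something G++ emits:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute count * elem_size into *bytes.
   Returns true when the product fits in size_t, false when it would
   wrap around.  */
static inline bool checked_array_bytes(size_t count, size_t elem_size,
                                       size_t *bytes)
{
    return !__builtin_mul_overflow(count, elem_size, bytes);
}
```

A check of this kind is one way an implementation could detect the overflow at run time before deciding how to report it.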
[Bug target/70123] [6 Regression] Miscompilation of cfitsio testcase on s390x-linux starting with r222144
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70123
--- Comment #2 from Jakub Jelinek ---
Ah, the spilled value is pseudo 240, which holds
Dunno why instead of spilling it it couldn't be rematerialized.
Anyway, the value in r6 after the f9 call is needed, so all that seems to be wrong is the order of the instructions after the f9 call:
        stg     %r6,192(%r15)   ! The spill of
        lgf     %r4,252(%r15)
        larl    %r3,.LC6
        lghi    %r2,1
        lgr     %r6,%r10        ! %r10 holds g value, copy it into argument reg
        lghi    %r11,0
        brasl   %r14,bar@PLT    ! bar shouldn't clobber %r6
        stg     %r8,168(%r15)
        stg     %r9,160(%r15)
        lgr     %r4,%r10
        lghi    %r5,1
        lghi    %r3,3
        lg      %r2,280(%r15)
        brasl   %r14,f9@PLT     ! nor f9; and f9 takes it as g as its 3rd
        ! and 5th arg; so in %r4 and %r6, %r4 is call clobbered, %r6 and %r10 are
        ! call-saved
        lg      %r6,192(%r15)   ! this fills the value
        lgr     %r10,%r6        ! but, here we want the g value instead;
        ! That value is live in both %r6 before the above fill, or in %r10 as well.
        ! So either removing the lgr %r10,%r6 instruction, or moving it before
        ! lg %r6,192(%r15) instruction fixes the testcase.
[Bug target/70123] [6 Regression] Miscompilation of cfitsio testcase on s390x-linux starting with r222144
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70123
--- Comment #1 from Jakub Jelinek ---
The __builtin_strcat in the testcase should be __builtin_strcpy, it reproduces even with that and without it it has UB.

Anyway, what I see is that the f9 call loads the g variable into %r6 register (for argument passing, but on s390* the register is call-saved), we have:
  %r6 = (reg:DI 180 [g])
before the f9 call and
  (reg:DI 250 [g]) = (reg:DI 180 [g])
after the f9 call. Pseudo 250 is then used in the problematic loop as doloop counter. sched1 then moves these two insns before the bar call shortly before f9, next to each other. Then IRA performs:
New iteration of spill/restore move
      Changing RTL for loop 2 (header bb20)
  10 vs parent 10: Creating newreg=256 from oldreg=180
  13 vs parent 11: Creating newreg=257 from oldreg=240
  9 vs parent 13: Creating newreg=258 from oldreg=246
  8 vs parent 8: Creating newreg=259 from oldreg=247
  7 vs parent 7: Creating newreg=260 from oldreg=248

What we get out of LRA looks already wrong. There is an added spill of %r6 (which contains something unrelated to g), then after a few instructions correct %r6 = %r10 (%r10 holds the correct value of g at that point at least in the first iteration that I could verify) before the bar call, then the bar call itself (which shouldn't change %r6, because it is call-saved register), then the call to f9, and right after it LRA adds a fill from the above mentioned spill slot (so unrelated value) into %r6 and copy of that %r6 into %r10.
[Bug c++/16994] [meta-bug] VLA and C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16994 Bug 16994 depends on bug 21113, which changed state. Bug 21113 Summary: Jumps into VLA or VM scope not rejected for C++ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21113 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug c++/21113] Jumps into VLA or VM scope not rejected for C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21113
Martin Sebor changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
   Last reconfirmed|2012-01-05 00:00:00         |2016-3-7
                 CC|                            |msebor at gcc dot gnu.org
      Known to work|                            |4.9.3, 5.3.0, 6.0
         Resolution|---                         |FIXED
      Known to fail|                            |4.1.2, 4.5.3

--- Comment #4 from Martin Sebor ---
Fixed in r209124 a couple of years ago. Today's trunk rejects the test case with the following output.

$ cat x.c && /build/gcc-trunk-bootstrap/gcc/xg++ -B /build/gcc-trunk-bootstrap/gcc -S -Wall -Wextra -Wpedantic -o/dev/null -xc++ x.c
void f(int l) {
  goto label;
  int a[l];
 label:;
}

void g (int l) {
  switch (l) {
  case 1:;
    int a[l];
  default:;
  }
}
x.c: In function ‘void f(int)’:
x.c:3:10: warning: ISO C++ forbids variable length array ‘a’ [-Wvla]
   int a[l];
          ^
x.c:4:2: error: jump to label ‘label’ [-fpermissive]
  label:;
  ^
x.c:2:8: note: from here
   goto label;
        ^
x.c:3:10: note: crosses initialization of ‘sizetype ’
   int a[l];
          ^
x.c:3:7: note: crosses initialization of ‘int a [l]’
   int a[l];
       ^
x.c:3:7: warning: unused variable ‘a’ [-Wunused-variable]
x.c: In function ‘void g(int)’:
x.c:10:12: warning: ISO C++ forbids variable length array ‘a’ [-Wvla]
     int a[l];
            ^
x.c:11:3: error: jump to case label [-fpermissive]
   default:;
   ^~~
x.c:10:12: note: crosses initialization of ‘sizetype ’
     int a[l];
            ^
x.c:10:9: note: crosses initialization of ‘int a [l]’
     int a[l];
         ^
x.c:10:9: warning: unused variable ‘a’ [-Wunused-variable]
[Bug c++/70076] no exception for excess initializer elements in a multidimensional VLA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70076 Martin Sebor changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-03-07 Assignee|unassigned at gcc dot gnu.org |msebor at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Martin Sebor --- Patch for bug 69517 posted for review (below) includes a fix for this bug as well: https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00441.html
[Bug c++/70019] VLA size overflow not detected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70019 --- Comment #3 from Martin Sebor --- Patch for this bug and for bug 69517 posted for review: https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00441.html
[Bug c++/69517] [5/6 regression] SEGV on a VLA with excess initializer elements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69517 --- Comment #8 from Martin Sebor --- Patch posted for review: https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00441.html
[Bug rtl-optimization/19705] -fno-branch-count-reg doesn't prevent decrement and branch instructions on a count register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19705 Martin Sebor changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |msebor at gcc dot gnu.org --- Comment #8 from Martin Sebor --- Documentation clarified in r234039.
[Bug c++/66786] [5/6 Regression] ICE: Segmentation fault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66786
--- Comment #8 from Patrick Palka ---
Author: ppalka
Date: Mon Mar 7 17:09:53 2016
New Revision: 234038
URL: https://gcc.gnu.org/viewcvs?rev=234038&root=gcc&view=rev
Log:
Adjust fix for PR c++/66786

gcc/cp/ChangeLog:
    PR c++/66786
    * pt.c (get_template_info): Handle PARM_DECL.
    (template_class_depth): Check DECL_P instead of
    VAR_OR_FUNCTION_DECL_P.
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/pt.c
[Bug rtl-optimization/19705] -fno-branch-count-reg doesn't prevent decrement and branch instructions on a count register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19705
--- Comment #7 from Martin Sebor ---
Author: msebor
Date: Mon Mar 7 17:10:12 2016
New Revision: 234039
URL: https://gcc.gnu.org/viewcvs?rev=234039&root=gcc&view=rev
Log:
PR rtl-optimization/19705 - -fno-branch-count-reg doesn't prevent decrement and branch instructions on a count register

gcc/ChangeLog:
2016-03-07  Martin Sebor

    PR rtl-optimization/19705
    * doc/invoke.texi (Options That Control Optimization): Clarify
    -fno-branch-count-reg.
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/doc/invoke.texi
[Bug tree-optimization/69740] [5/6 Regression] gcc ICE at -O2 and above on valid code on x86_64-linux-gnu in "verify_loop_structure"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69740
--- Comment #12 from Jeffrey A. Law ---
Author: law
Date: Mon Mar 7 17:01:54 2016
New Revision: 234036
URL: https://gcc.gnu.org/viewcvs?rev=234036&root=gcc&view=rev
Log:
    PR tree-optimization/69740
    * cfghooks.c (remove_edge): Request loop fixups if we delete
    an edge that might turn an irreducible loop into a natural
    loop.
    * cfgloop.h (check_verify_loop_structure): Clear LOOPS_NEED_FIXUP.
    Move after definition of loops_state_clear.

    PR tree-optimization/69740
    * gcc.c-torture/compile/pr69740-1.c: New test.
    * gcc.c-torture/compile/pr69740-2.c: New test.
Added:
    trunk/gcc/testsuite/gcc.c-torture/compile/pr69740-1.c
    trunk/gcc/testsuite/gcc.c-torture/compile/pr69740-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfghooks.c
    trunk/gcc/cfgloop.h
    trunk/gcc/testsuite/ChangeLog
[Bug target/70120] [6 Regression][aarch64] -g causes Assembler messages: Error: unaligned opcodes detected in executable segment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70120
Zdenek Sojka changed:

           What    |Removed                     |Added
            Summary|[6 Regression][aarch64]     |[6 Regression][aarch64] -g
                   |-freorder-functions -g      |causes Assembler messages:
                   |causes Assembler messages:  |Error: unaligned opcodes
                   |Error: unaligned opcodes    |detected in executable
                   |detected in executable      |segment
                   |segment                     |

--- Comment #4 from Zdenek Sojka ---
I have another testcase that fails with -Og -fno-dce -fno-forward-propagate -ffunction-sections -fno-tree-dce -g1 -mcmodel=tiny... so probably not caused by just -freorder-functions.
[Bug rtl-optimization/69052] [6 Regression] Performance regression after r229402.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69052
--- Comment #17 from amker at gcc dot gnu.org ---
Author: amker
Date: Mon Mar 7 16:39:27 2016
New Revision: 234034
URL: https://gcc.gnu.org/viewcvs?rev=234034&root=gcc&view=rev
Log:
    PR rtl-optimization/69052
    * rtlanal.c (commutative_operand_precedence): Set higher precedence
    to CONST_WIDE_INT.
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/rtlanal.c
[Bug c++/70124] alignas error in constexpr function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70124 --- Comment #1 from Martin Sebor --- Bug 70125 tracks the problem with the missing context of the diagnostic.
[Bug c++/70125] New: attributes diagnostics missing essential context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70125
Bug ID: 70125
Summary: attributes diagnostics missing essential context
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: msebor at gcc dot gnu.org
Target Milestone: ---

Diagnostics for attributes are missing information that can be essential in determining the root cause of the problem. For example, the following test case causes a warning pointing out that the attribute vector_size is ignored. But there is no indication as to which of the two invocations of the function in which the attribute is used causes the diagnostic. To help debug these problems the diagnostic should include both the value of the vector_size attribute and, for constexpr functions, the call stack.

$ cat x.c && /build/gcc-trunk-bootstrap/gcc/xg++ -B /build/gcc-trunk-bootstrap/gcc -S -Wall -Wextra -Wpedantic -o/dev/null -xc++ x.c
constexpr int foo (unsigned N)
{
  typedef int V __attribute__ ((vector_size (N)));
  V v = 0;
  return v;
}

int i = foo (1);
int j = foo (16);
x.c: In function ‘constexpr int foo(unsigned int)’:
x.c:3:49: warning: ‘vector_size’ attribute ignored [-Wattributes]
   typedef int V __attribute__ ((vector_size (N)));
                                                 ^
[Bug lto/69650] [6 Regression] ICE in linemap_line_start, at libcpp/line-map.c:803
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69650
Bernd Schmidt changed:

           What    |Removed                     |Added
                 CC|                            |bernds at gcc dot gnu.org

--- Comment #8 from Bernd Schmidt ---
Created attachment 37889
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37889&action=edit
Candidate patch

Testing this.
[Bug c++/70124] New: alignas error in constexpr function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70124
Bug ID: 70124
Summary: alignas error in constexpr function
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: msebor at gcc dot gnu.org
Target Milestone: ---

Both attribute aligned and the alignas specifier are rejected with an error in a constexpr function complaining that the alignment isn't a constant expression. Attribute vector_size, on the other hand, is accepted in the same context.

Note also that the diagnostic doesn't mention the value of the (constant) argument or include the call site, making it difficult to determine which invocation of the function caused the error when there is more than one.

$ cat x.c && /build/gcc-trunk-bootstrap/gcc/xg++ -B /build/gcc-trunk-bootstrap/gcc -S -Wall -Wextra -Wpedantic -o/dev/null -xc++ x.c
constexpr int foo (unsigned N)
{
  typedef __attribute__ ((aligned (1 << N))) int I;
  I i = 0;
  return i;
}

int i = foo (__alignof__ (int));
x.c: In function ‘constexpr int foo(unsigned int)’:
x.c:3:50: error: requested alignment is not an integer constant
   typedef __attribute__ ((aligned (1 << N))) int I;
                                                  ^

$ cat x.c && /build/gcc-trunk-bootstrap/gcc/xg++ -B /build/gcc-trunk-bootstrap/gcc -S -Wall -Wextra -Wpedantic -o/dev/null -xc++ x.c
constexpr int foo (unsigned N)
{
  alignas (1 << N) int i = 0;
  return i;
}

int i = foo (__alignof__ (int));
x.c: In function ‘constexpr int foo(unsigned int)’:
x.c:3:17: error: ‘N’ is not a constant expression
   alignas (1 << N) int i = 0;
                 ^
x.c:3:17: error: ‘N’ is not a constant expression
[Bug rtl-optimization/69633] [6 Regression] Redundant move is generated after r228097
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69633 Bernd Schmidt changed: What|Removed |Added CC||bernds at gcc dot gnu.org --- Comment #2 from Bernd Schmidt --- Doesn't seem to happen over here. Can you still reproduce this with trunk? Please post exact arguments to cc1 if it does.
[Bug target/70123] [6 Regression] Miscompilation of cfitsio testcase on s390x-linux starting with r222144
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70123
Jakub Jelinek changed:

           What    |Removed                     |Added
           Priority|P3                          |P1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-03-07
                 CC|                            |krebbel at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org
   Target Milestone|---                         |6.0
            Summary|[6 Regression]              |[6 Regression]
                   |Miscompilation of           |Miscompilation of cfitsio
                   |                            |testcase on s390x-linux
                   |                            |starting with r222144
     Ever confirmed|0                           |1
[Bug target/70117] ppc long double isinf() is wrong?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117 --- Comment #4 from Ulrich Weigand --- (In reply to Alan Modra from comment #3) > > while with GCC, we get: > > > > high double: 7FEF > > low double: 7C8F FFFE > > Right. This is 0x1.f78p+1023 > > gnulib isn't correct here. As the comment says the high double must be the > value of the long double correctly rounded to double (to nearest since that > is the only mode supported for IBM extended double). Any long double value > higher than the above will round up the high double to inf. Well, what I don't quite understand is that the gnulib value, which is 0x1.f7cp+1023 likewise should round to the same double value, shouldn't it? I notice that if I actually attempt to use that value in C source code, the compiler does indeed round it to inf -- but I don't see why it actually should do so ...
[Bug target/70123] New: [6 Regression] Miscompilation of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70123 Bug ID: 70123 Summary: [6 Regression] Miscompilation of Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jakub at gcc dot gnu.org Target Milestone: --- The following testcase is miscompiled on s390x-linux with -O2 -m64 -march=z9-109 -mtune=z10 -fPIC starting with r222144. It works with -mno-lra, so suspect either a LRA bug or machine description bug. The effect is that the 186 for (h = 0; h < g; h++) 187 baz (" %2d", t[h]); loop iterates until [h] is unaccessible. But g should be at most 20, so h certainly shouldn't be almost 5000. -fsanitize=undefined,address and valgrind is clean on this on x86_64-linux. __attribute__ ((noinline, noclone)) int bar (int flag, const char *__restrict format, ...) { asm volatile ("" : : "r" (flag), "r" (format) : "memory"); return 0; } extern inline __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) int baz (const char *__restrict fmt, ...) 
{ return bar (1, fmt, __builtin_va_arg_pack ()); } __attribute__ ((noinline, noclone)) int f1 (void **a, const char *b, int *c) { *a = 0; *c = 0; asm volatile ("" : : "r" (), "r" (b), "r" () : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f2 (void *a, int b, int c, long d[], int *e) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d), "r" (e) : "memory"); return 1; } __attribute__ ((noinline, noclone)) int f3 (void *a, int *b) { asm volatile ("" : : "r" (a), "r" (b) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f4 (void *a, const char *b, int c, int d, double *e, int f, char **g, int *h) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g), "r" (h) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f5 (void *a, long long b, int c, char **d, char **e, char **f, const char *g, long long h, int *i) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g), "r" (h) : "memory"); asm volatile ("" : : "r" (i) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f6 (void *a, int b, int *c, int *d) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f7 (void *a, int b, long long c, long long d, long long e, double *f, int *g) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f8 (void *a, int b, long long c, long long d, long long e, char *f, const char **g, int *h, int *i) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g), "r" (h) : "memory"); asm volatile ("" : : "r" (i) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f9 (void *a, int b, long long c, long long d, long long e, char *f, int *g) { asm 
volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f10 (void *a, int b, long long c, long long d, long long e, unsigned char f, unsigned char *g, int *h, int *i) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g), "r" (h) : "memory"); asm volatile ("" : : "r" (i) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f11 (void *a, int b, long long c, long long d, long long e, long f, long *g, int *h, int *i) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g), "r" (h) : "memory"); asm volatile ("" : : "r" (i) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f12 (void *a, int b, long long c, long long d, long long e, float f, float *g, int *h, int *i) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f), "r" (g), "r" (h) : "memory"); asm volatile ("" : : "r" (i) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f13 (void *a, int b, long long c, long *d, long *e, int *f) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); asm volatile ("" : : "r" (e), "r" (f) : "memory"); return 0; } __attribute__ ((noinline, noclone)) int f14 (void *a, int b, int *c, int *d) { asm volatile ("" : : "r" (a), "r" (b), "r" (c), "r" (d) : "memory"); return 0; } volatile int a; int main () { int b, c, d = 0, e, f = 0; long g, h; int j = 0; long k, l; int m; unsigned char n[21]; long o[21]; float p[21]; double
[Bug target/70120] [6 Regression][aarch64] -freorder-functions -g causes Assembler messages: Error: unaligned opcodes detected in executable segment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70120

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |4.9.4, 5.3.1
            Summary|[aarch64]                   |[6 Regression][aarch64]
                   |-freorder-functions -g      |-freorder-functions -g
                   |causes Assembler messages:  |causes Assembler messages:
                   |Error: unaligned opcodes    |Error: unaligned opcodes
                   |detected in executable      |detected in executable
                   |segment                     |segment

--- Comment #3 from ktkachov at gcc dot gnu.org ---
Confirmed as well.
That makes this a regression for GCC 6.
[Bug tree-optimization/69666] [5 Regression] gcc ICE at -O2 and -O3 on valid code on x86_64-linux-gnu in "verify_gimple failed"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69666

--- Comment #14 from Martin Jambor ---
With the reverted patch re-applied, this should again be fixed everywhere (and the fix should not be causing any new issues).
[Bug target/70120] [aarch64] -mno-pc-relative-literal-loads -g causes Assembler messages: Error: unaligned opcodes detected in executable segment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70120

--- Comment #2 from Zdenek Sojka ---
Created attachment 37888
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37888&action=edit
another testcase

This testcase does not need -mno-pc-relative-literal-loads.

$ aarch64-unknown-linux-gnu-gcc -Og -freorder-functions -g3 -mcmodel=large testcase.c
/tmp/cc4Fczxu.s: Assembler messages:
/tmp/cc4Fczxu.s: Error: unaligned opcodes detected in executable segment
[Bug target/63503] [AArch64] A57 executes fused multiply-add poorly in some situations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503

Thomas Preud'homme changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
                 CC|                            |thopre01 at gcc dot gnu.org
         Resolution|---                         |FIXED

--- Comment #26 from Thomas Preud'homme ---
Fixed as of r222512.
[Bug c/7652] -Wswitch-break : Warn if a switch case falls through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7652

--- Comment #37 from Matthew Woehlke ---
> Essentially, this warning and the "intentional fallthrough" attribute exist for both clang and MSVC and will be enabled there; but GCC still doesn't have this feature.

[[fallthrough]] was approved for C++17. While the standard does not normatively *require* a diagnostic, it's certainly expected that one be issued. It's a shame that gcc is behind the curve here.
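
For readers unfamiliar with the attribute under discussion, here is a minimal sketch of how [[fallthrough]] marks an intentional fall-through in C++17. The function name `weight` and its case values are made up for illustration and do not come from the bug report.

```cpp
// Hypothetical example: `weight` and its case labels are illustrative only.
// Requires a C++17 compiler for the [[fallthrough]] attribute.
int weight(int level)
{
    int total = 0;
    switch (level) {
    case 2:
        total += 10;
        [[fallthrough]];  // intentional: level 2 also receives the level 1 amount
    case 1:
        total += 1;
        break;
    default:
        break;
    }
    return total;
}
```

A compiler that diagnoses implicit fall-through would be expected to stay quiet about `case 2` here, because the fall-through is annotated, and to complain if the attribute were removed.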
[Bug tree-optimization/70116] tail-merge merges ubsan internal fns with different location information
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70116

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |6.0

--- Comment #3 from vries at gcc dot gnu.org ---
Patch committed, no test-case known.

Marking resolved-fixed.
[Bug target/70048] [6 Regression][AArch64] Inefficient local array addressing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048

Jiong Wang changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jiwang at gcc dot gnu.org

--- Comment #7 from Jiong Wang ---
(In reply to Richard Henderson from comment #6)
> Created attachment 37886 [details]
> proposed patch
>
> I agree -- at minimum virtual and eliminable frame registers ought to be
> special-cased. If we separate the constants too far, we'll never be able
> to fold the constant plus the adjustment back together.
>
> If the statement in #c4 is taken at face value -- that r233136 was applied
> to simplify frame-based array accesses... Well, I simply don't believe
> that.
>
> I can see how the patch would aid reduction of access to members of a
> structure that are in an array which in turn is *not* on the stack. But
> for the average stack-based access I can't see except that it would hurt.

Richard,

There is a similar hunk in the aarch64 TARGET_LEGITIMATE_ADDRESS_P implementation (the "case PLUS" in aarch64_classify_address), which allows a virtual frame reference to carry any constant offset:

  if (! strict_p
      && REG_P (op0)
      && (op0 == virtual_stack_vars_rtx
          || op0 == frame_pointer_rtx
          || op0 == arg_pointer_rtx)

It looks to me as though your patch also handles the situations where op0 is virtual_stack_dynamic_rtx or virtual_outgoing_args_rtx. If offset elimination will happen on these two as well, then I think we should update the code in TARGET_LEGITIMATE_ADDRESS_P to use the same checks as your patch, so that the two places are consistent.
[Bug libstdc++/69879] Create a pointer to the default operator new and delete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69879

Gabriel Ibarra changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #37848|0                           |1
        is obsolete|                            |

--- Comment #5 from Gabriel Ibarra ---
Created attachment 37887
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37887&action=edit
[RFC] Added default functions for new and delete operators + unit test.

Hi,
I added a new section (GLIBCXX_3.4.23) and a test file for this issue.

Jonathan, I have a couple of questions:
1) When you say "it can't be applied until after GCC 6", do you mean that my changes won't be committed until then? If you commit this now, how would you handle me using 4.23 as a version number?
2) If you commit this now, is there any way to make the unit test that I added run only for GLIBCXX_3.4.23?
3) And finally, how should I deal with the sized delete? Since it depends on the __cpp_sized_deallocation macro, should I add a test that also depends on this macro?

Thanks,
Gabriel
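
To make question 3 concrete, one possible shape for code that depends on the __cpp_sized_deallocation feature-test macro is sketched below. This is only an illustration, not part of the attached patch: the helper name `release` is hypothetical, and the only library calls used are the standard global `operator new` and `operator delete`.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper (not from the patch): frees storage obtained from
// ::operator new, using the sized form only when the compiler advertises
// support for it through the feature-test macro.
inline bool release(void* p, std::size_t n) noexcept
{
#if defined(__cpp_sized_deallocation)
    ::operator delete(p, n);  // sized global deallocation function (C++14)
    return true;              // the sized form was used
#else
    (void) n;                 // size is unused without sized deallocation
    ::operator delete(p);
    return false;             // fell back to the unsized form
#endif
}
```

A test written this way compiles whether or not sized deallocation is enabled, and the return value records which path was actually exercised.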
[Bug tree-optimization/69666] [5 Regression] gcc ICE at -O2 and -O3 on valid code on x86_64-linux-gnu in "verify_gimple failed"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69666

--- Comment #13 from Martin Jambor ---
Author: jamborm
Date: Mon Mar 7 15:17:49 2016
New Revision: 234030

URL: https://gcc.gnu.org/viewcvs?rev=234030&root=gcc&view=rev
Log:
Fix PR 69666 and PR 69920

2016-03-07  Martin Jambor

        PR tree-optimization/69666
        PR middle-end/69920
        * tree-sra.c (sra_modify_assign): Do not attempt to create
        default_def replacements for unscalarizable regions.  Do not
        remove loads of uninitialized aggregates to SSA_NAMEs.

testsuite/
        * gcc.dg/torture/pr69932.c: New test.
        * gcc.dg/torture/pr69936.c: Likewise.

Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr69932.c
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr69936.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
    branches/gcc-5-branch/gcc/tree-sra.c
[Bug middle-end/69920] [6 Regression] FAIL: g++.dg/torture/pr42704.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920

--- Comment #16 from Martin Jambor ---
Author: jamborm
Date: Mon Mar 7 15:17:49 2016
New Revision: 234030

URL: https://gcc.gnu.org/viewcvs?rev=234030&root=gcc&view=rev
Log:
Fix PR 69666 and PR 69920

2016-03-07  Martin Jambor

        PR tree-optimization/69666
        PR middle-end/69920
        * tree-sra.c (sra_modify_assign): Do not attempt to create
        default_def replacements for unscalarizable regions.  Do not
        remove loads of uninitialized aggregates to SSA_NAMEs.

testsuite/
        * gcc.dg/torture/pr69932.c: New test.
        * gcc.dg/torture/pr69936.c: Likewise.

Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr69932.c
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr69936.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
    branches/gcc-5-branch/gcc/tree-sra.c
[Bug c/7652] -Wswitch-break : Warn if a switch case falls through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7652

Tom Tromey changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tromey at gcc dot gnu.org

--- Comment #36 from Tom Tromey ---
This came up for Mozilla today:

https://groups.google.com/forum/#!topic/mozilla.dev.platform/YT7SXFhyr_I

Essentially, this warning and the "intentional fallthrough" attribute exist for both clang and MSVC and will be enabled there; but GCC still doesn't have this feature.
[Bug target/70117] ppc long double isinf() is wrong?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70117

Alan Modra changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amodra at gmail dot com

--- Comment #3 from Alan Modra ---
> while with GCC, we get:
>
> high double: 7FEF
> low double: 7C8F FFFE

Right. This is 0x1.f78p+1023

gnulib isn't correct here. As the comment says, the high double must be the value of the long double correctly rounded to double (to nearest, since that is the only mode supported for IBM extended double). Any long double value higher than the above will round up the high double to inf.
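
The point that classification can be decided from the high double alone can be sketched as follows. This is a simplified model under stated assumptions, not libgcc or glibc code: the `DoubleDouble` struct and `dd_isinf` are hypothetical names, and the struct merely stands in for the pair-of-doubles layout of IBM extended double.

```cpp
#include <cmath>

// Hypothetical model of an IBM extended ("double-double") value: the value
// is hi + lo, where hi is the whole value rounded to the nearest double.
struct DoubleDouble {
    double hi;
    double lo;
};

// Because hi is the correctly rounded value, the pair represents an
// infinity exactly when hi is infinite; lo does not need to be inspected.
inline bool dd_isinf(const DoubleDouble& v)
{
    return std::isinf(v.hi);
}
```

Under this model, any pair whose high double is finite is finite, whatever the low double holds, which matches the argument that a larger value would already have rounded the high double up to inf.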
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #9 from alalaw01 at gcc dot gnu.org ---
In analyze_access_subtree (since r147980, "New implementation of SRA", 2009):

  else if (root->grp_write || TREE_CODE (root->base) == PARM_DECL)
    root->grp_unscalarized_data = 1; /* not covered and written to */

Adding a case for constant_decl_p alongside the PARM_DECL case fixes the ICE; AArch64 bootstrap in progress.
[Bug libgomp/70122] [openacc] Handle acc loop directive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70122

--- Comment #3 from vries at gcc dot gnu.org ---
(In reply to vries from comment #2)
> For now, marking missed-optimization/enhancement. There might also be
> correctness failures due to the lack of explicit handling, I'm not sure.

Which is why I'd like to have it in stage4, if possible.
[Bug libgomp/70122] [openacc] Handle acc loop directive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70122 --- Comment #2 from vries at gcc dot gnu.org --- For now, marking missed-optimization/enhancement. There might also be correctness failures due to the lack of explicit handling, I'm not sure.
[Bug libgomp/70122] [openacc] Handle acc loop directive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70122

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
[Bug libgomp/70122] [openacc] Handle acc loop directive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70122

vries at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization, patch

--- Comment #1 from vries at gcc dot gnu.org ---
patch: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01903.html
pinged: https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00452.html
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013

--- Comment #8 from Martin Jambor ---
That is what I suspected. Please have a look at why analyze_access_subtree (which has to set the grp_unscalarized_data flag) acts differently then.
[Bug libgomp/70122] New: [openacc] Handle acc loop directive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70122

            Bug ID: 70122
           Summary: [openacc] Handle acc loop directive
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

At the moment, in trunk, we don't explicitly handle the acc loop directive in a kernels region. As a consequence, such kernels regions are executed sequentially.
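
For context, the kind of source this report is about looks roughly like the sketch below: an acc loop directive nested inside a kernels region. The function `scale` and its parameters are hypothetical and are not taken from the report or the pending patch.

```cpp
// Hypothetical kernel: multiplies n floats in place by `factor`.
// When built without OpenACC support enabled, the pragmas are ignored and
// the loop simply runs sequentially on the host.
void scale(float* data, int n, float factor)
{
#pragma acc kernels copy(data[0:n])
    {
#pragma acc loop independent
        for (int i = 0; i < n; ++i)
            data[i] *= factor;
    }
}
```

The result is the same either way; what the report describes is that the loop inside the kernels region is not parallelized because the directive is not handled explicitly.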
[Bug rtl-optimization/69710] performance issue with SP Linpack with Autovectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69710

--- Comment #15 from Doug Gilmore ---
> I had a patch too, will send it for review in GCC7 if it's still needed.

Sorry, I got sidetracked last week and didn't make much progress. Please go ahead and submit if you have something you feel comfortable with; I'll assist in testing.

Thanks,
[Bug tree-optimization/70116] tail-merge merges ubsan internal fns with different location information
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70116

--- Comment #2 from vries at gcc dot gnu.org ---
Author: vries
Date: Mon Mar 7 14:50:13 2016
New Revision: 234029

URL: https://gcc.gnu.org/viewcvs?rev=234029&root=gcc&view=rev
Log:
Skip ubsan/asan internal fns with different location in tail-merge

2016-03-07  Tom de Vries

        PR tree-optimization/70116
        * tree-ssa-tail-merge.c (merge_stmts_p): New function, handling
        is_tm_ending stmts and ubsan/asan internal functions.
        (find_duplicate): Use it.  Don't test is_tm_ending here.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-tail-merge.c
[Bug c++/67364] [5/6 Regression] "accessing uninitialized member" error in constexpr context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67364

--- Comment #15 from Jason Merrill ---
Author: jason
Date: Mon Mar 7 14:43:14 2016
New Revision: 234028

URL: https://gcc.gnu.org/viewcvs?rev=234028&root=gcc&view=rev
Log:
        PR c++/67364
        * constexpr.c (cxx_eval_store_expression): Replace
        CONSTRUCTOR_ELTS in nested CONSTRUCTORs, too.

Added:
    branches/gcc-5-branch/gcc/testsuite/g++.dg/cpp0x/constexpr-aggr3.C
Modified:
    branches/gcc-5-branch/gcc/cp/ChangeLog
    branches/gcc-5-branch/gcc/cp/constexpr.c
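
As a rough illustration of the area this change touches, the sketch below performs stores into members of a nested aggregate during constant evaluation. It is an assumption-laden example, not the constexpr-aggr3.C testcase (whose contents are not shown in this message); the types `Inner` and `Outer` and the function `sum` are hypothetical, and the code needs C++14 relaxed constexpr.

```cpp
// Hypothetical types: an aggregate nested inside another aggregate.
struct Inner { int x; int y; };
struct Outer { Inner in; int z; };

// C++14 constexpr function that writes to members of the nested aggregate
// and then reads them back during constant evaluation.
constexpr int sum()
{
    Outer o{};      // value-initialize, then overwrite the members
    o.in.x = 1;
    o.in.y = 2;
    o.z = 3;
    return o.in.x + o.in.y + o.z;
}

// The stores above must be visible to the later reads at compile time.
static_assert(sum() == 6, "stores into nested members are visible");
```

Code of this general shape is valid, so a conforming compiler should accept the static_assert without reporting an access to an uninitialized member.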
[Bug lto/69607] undefined reference to MAIN__._omp_fn.0 in atomic_capture-1.f with -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69607

--- Comment #24 from vries at gcc dot gnu.org ---
(In reply to vries from comment #23)
> pinged patch: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01632.html

ping^2: https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00487.html
[Bug tree-optimization/70013] [6 Regression] packed structure tree-sra loses initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70013 --- Comment #7 from alalaw01 at gcc dot gnu.org --- *second* half, sorry. grp_to_be_replaced is here true, but grp_unscalarized_data is false, so handle_unscalarized_data_in_subtree sets sad->refreshed=UDH_LEFT and we build the access to the LHS. (Then, load_assign_lhs_subreplacements exits, and the caller sees UDH_LEFT and removes the original block move statement.) In contrast, on a similar testcase using a parameter rather than *.LC0, grp_unscalarized_data is true, handle_unscalarized_data_in_subtree sets sad->refreshed=UDH_RIGHT and we build an access to the RHS, which is OK; and leave the block move statement in place, hence correctness.
[Bug target/70083] [6 Regression] ICE: in assign_stack_local_1, at function.c:409 with -fschedule-insns -mavx512* @ i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70083 --- Comment #4 from Jakub Jelinek --- RA is not my area of expertise, so sure, go ahead (unless Vlad wants to have a look).
[Bug c++/70121] [5/6 Regression] spurious warning and crash when returning a reference from lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70121

Marc Glisse changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-03-07
            Summary|spurious warning and crash  |[5/6 Regression] spurious
                   |when returning a reference  |warning and crash when
                   |from lambda                 |returning a reference from
                   |                            |lambda
     Ever confirmed|0                           |1

--- Comment #1 from Marc Glisse ---
The wrong warning is old, but since gcc-5 we reuse that warning code to change the return to null, which makes the issue much worse.
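
For orientation, a valid lambda that returns a reference might look like the sketch below. This is not the testcase from the report, which is not included in this message; the function `bump` is hypothetical and only shows the general pattern in which the referenced object outlives the lambda.

```cpp
// Hypothetical example: the lambda returns a reference to the caller's
// object (captured by reference), not to a local or a temporary.
inline int bump(int& counter)
{
    auto ref = [&counter]() -> int& { return counter; };
    ref() += 1;      // modifies the caller's object through the reference
    return counter;
}
```

Nothing here returns the address of a local, so a warning about doing so would be spurious, and rewriting the returned reference to null would turn working code into a crash.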