[Bug sanitizer/109698] gcc/g++ build/link fails for libhwasan.so
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109698 Andrew Pinski changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |MOVED
[Bug sanitizer/109698] gcc/g++ build/link fails for libhwasan.so
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109698 --- Comment #7 from Nathan Ridge --- Based on some searching around for other users running into this error, this seems to be caused by an ld bug which was fixed in 2.32: https://sourceware.org/bugzilla/show_bug.cgi?id=24458
[Bug sanitizer/109698] gcc/g++ build/link fails for libhwasan.so
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109698 Nathan Ridge changed: What|Removed |Added CC||zeratul976 at hotmail dot com --- Comment #6 from Nathan Ridge --- I'm experiencing the same issue. I'm also on Debian 10 and using ld 2.31.1.
[Bug target/111064] 5-10% regression of parest on icelake between g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a (Aug 15 2023) and g:9ade70bb86c8744f4416a48bb69cf4705f00905a (Aug 16)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111064 --- Comment #6 from Hongtao.liu --- > > [liuhongt@intel gather_emulation]$ ./gather.out > ;./nogather_xmm.out;./nogather_ymm.out > elapsed time: 1.75997 seconds for gather with 3000 iterations > elapsed time: 2.42473 seconds for no_gather_xmm with 3000 iterations > elapsed time: 1.86436 seconds for no_gather_ymm with 3000 iterations > For 510.parest_r, enabling gather emulation for ymm can bring back 3% performance, but it is still not as good as the gather instruction because it is throughput bound.
[Bug c++/111222] ICE on basic_string_view and alias templates with missing template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111222 --- Comment #2 from Steven Xia --- interesting
[Bug c++/111222] ICE on basic_string_view and alias templates with missing template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111222 Andrew Pinski changed: What|Removed |Added Keywords||accepts-invalid Summary|[c++23] ICE on |ICE on basic_string_view |basic_string_view with |and alias templates with |missing template argument |missing template argument --- Comment #1 from Andrew Pinski --- Hmm, we have an accepts invalid here too. For C++20 we accept it but deduction for alias templates is not valid ...
[Bug tree-optimization/111221] Floating point handling a*1.0 vs. a+0.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111221 --- Comment #2 from Andrew Pinski --- That is GCC will remove additions of -0.0: double addneg0 (double a) { return a + -0.0; } Gets optimized to just `return a;`.
[Bug c++/111222] New: ICE on basic_string_view with missing template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111222 Bug ID: 111222 Summary: ICE on basic_string_view with missing template argument Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: stevenxia990430 at gmail dot com Target Milestone: --- The following invalid program reports an internal compiler error. Failed on gcc-trunk. To quickly reproduce: https://gcc.godbolt.org/z/To8EdvdhM ``` #include template using string_view = std::basic_string_view; string_view my_string = "12345"; ``` note after providing the valid declaration (i.e., string_view my_string = "12345";) the program compiles successfully
[Bug tree-optimization/111221] Floating point handling a*1.0 vs. a+0.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111221 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- -0.0 * 1.0 is still -0.0 While -0.0 + 0.0 is 0.0 rather than -0.0.
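The signed-zero behaviour described in this comment can be checked at run time. The following is a minimal sketch, assuming IEEE 754 doubles and the default round-to-nearest mode; the helper names (`mul_by_one`, `add_pos_zero`, `add_neg_zero`, `is_negative_zero`) are made up for illustration and are not part of GCC.

```c
#include <math.h>
#include <stdbool.h>

/* The volatile operands keep the compiler from folding the arithmetic,
   so the results come from the floating-point unit at run time. */
double mul_by_one(double a)   { volatile double one = 1.0;    return a * one; }
double add_pos_zero(double a) { volatile double zero = 0.0;   return a + zero; }
double add_neg_zero(double a) { volatile double nzero = -0.0; return a + nzero; }

/* True when x is a zero whose sign bit is set, i.e. x is -0.0. */
bool is_negative_zero(double x) { return x == 0.0 && signbit(x) != 0; }
```

With these helpers, `mul_by_one(-0.0)` and `add_neg_zero(-0.0)` keep the negative zero, while `add_pos_zero(-0.0)` yields +0.0. That is why `a * 1.0` and `a + -0.0` can be replaced by `a`, but `a + 0.0` cannot.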
[Bug tree-optimization/111221] New: Floating point handling a*1.0 vs. a+0.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111221 Bug ID: 111221 Summary: Floating point handling a*1.0 vs. a+0.0 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- I just noticed that gcc will optimize away multiplying a floating point number by 1.0, but will not do so for an addition of 0.0. Example, with -O3, double add0 (double a) { return a + 0.0; } double mul1 (double a) { return a * 1.0; } yields add0: .LFB0: .cfi_startproc pxor %xmm1, %xmm1 addsd %xmm1, %xmm0 ret vs. mul1: .LFB1: .cfi_startproc ret which seems inconsistent. If this is the result of a deliberate design decision, feel free to close as WONTFIX.
[Bug c++/109859] [12/13/14 Regression] ICE on concept mis-typed as template type parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109859 Andrew Pinski changed: What|Removed |Added CC||stevenxia990430 at gmail dot com --- Comment #3 from Andrew Pinski --- *** Bug 111220 has been marked as a duplicate of this bug. ***
[Bug c++/111220] ICE with std::integral in template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111220 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Andrew Pinski --- Dup of bug 109859. *** This bug has been marked as a duplicate of bug 109859 ***
[Bug c++/111220] New: ICE with std::integral in template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111220 Bug ID: 111220 Summary: ICE with std::integral in template Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: stevenxia990430 at gmail dot com Target Milestone: --- The following invalid program reports an internal compiler error. Failed on gcc-trunk. To quickly reproduce: https://gcc.godbolt.org/z/TsKjKbfrd ``` #include template ``` note that it requires --std=c++20 or c++23, tried on default and it errors without any crashes.
[Bug tree-optimization/110111] bool patterns that should produce a?b:c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110111 --- Comment #3 from Andrew Pinski --- f1: _6 = b_2(D) ^ c_3(D); _7 = a_1(D) & _6; _4 = c_3(D) ^ _7; Which was done due to: /* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */ (simplify (bit_ior:c (bit_and:cs @0 (bit_not @2)) (bit_and:cs @1 @2)) (bit_xor (bit_and (bit_xor @0 @1) @2) @0)) Note if we move this over to bitwise_inverted_equal_p (which we should), we will lose also: ``` bool f(int a, int b, int t) { bool x = a == 0; bool y = b == 1; bool m = t == 2; bool mp = !m; return (x & mp) | (y & m); } ``` Which is currently handled. We should check for `element_precision (type) == 1` too. So something like: (simplify (bit_ior (bit_and:c@and1 @0 @3) (bit_and:c@and2 @1 @2)) (with { bool wascmp; } (if (bitwise_inverted_equal_p (@0, @2, wascmp)) (switch /* For 1bit, wascmp can be true and we can just convert it into `m ? y : x` */ (if (INTEGRAL_TYPE_P (type) && element_precision (type) == 1) (cond @3 @0 @1)) (if (!wascmp && element_precision (type) != 1 && single_use (@and1) && single_use (@and2)) (bit_xor (bit_and (bit_xor @0 @1) @2) @0)) ) ) ) ) ) /* 1bit `((x ^ y) & m) ^ x` should just be convert into `m ? y : x` early */ (simplify (bit_xor:c (bit_and:c (bit_xor:c @0 @1) @2) @0) (if (INTEGRAL_TYPE_P (type) && element_precision (type) == 1) (cond @2 @0 @1)))
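The identity behind the existing pattern, `(x & ~m) | (y & m) == ((x ^ y) & m) ^ x`, and its 1-bit form `m ? y : x`, can be confirmed by brute force. This is only a sanity-check sketch in plain C, not the match.pd change itself; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counts 8-bit triples (x, y, m) for which the IOR form and the XOR form
   of the bit select disagree. The identity is bitwise, so 8 bits suffice. */
unsigned count_select_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            for (unsigned m = 0; m < 256; m++) {
                uint8_t ior_form = (uint8_t)((x & ~m) | (y & m));
                uint8_t xor_form = (uint8_t)(((x ^ y) & m) ^ x);
                if (ior_form != xor_form)
                    bad++;
            }
    return bad;
}

/* 1-bit case: the same select is simply `m ? y : x`. */
unsigned count_bool_select_mismatches(void)
{
    unsigned bad = 0;
    for (int x = 0; x <= 1; x++)
        for (int y = 0; y <= 1; y++)
            for (int m = 0; m <= 1; m++) {
                bool masked = (x & !m) | (y & m);
                bool selected = m ? y : x;
                if (masked != selected)
                    bad++;
            }
    return bad;
}
```

Both counters are expected to return 0, which is what makes it safe to pick whichever form is cheaper for a given type precision.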
[Bug middle-end/110983] -fpatchable-function-entry is missing in Option Summary page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110983 --- Comment #5 from Mao --- (In reply to Andrew Pinski from comment #4) > `make html` is the way to build the HTML web pages ... Thanks for the help. Yes, I have confirmed with the generated HTML as well. My patch can fix it.
[Bug middle-end/110983] -fpatchable-function-entry is missing in Option Summary page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110983 --- Comment #4 from Andrew Pinski --- (In reply to Mao from comment #3) > Created attachment 55810 [details] > invoke-doc-patch > > I think this can help fix the issue. > I am not sure how to build the HTML web pages. But I also checked the man > page. The fpatchable-function-entry is also missing in the manpage in the > option summary section. And my fix can solve this issue. `make html` is the way to build the HTML web pages ...
[Bug tree-optimization/111149] bool0 != bool1 should be expanded as bool0 ^ bool1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111149 Andrew Pinski changed: What|Removed |Added Component|middle-end |tree-optimization --- Comment #2 from Andrew Pinski --- I need to look into this again for the gimple level because I have noticed VRP changes bool != bool into bool ^ bool but we should be able to do it without VRP.
[Bug middle-end/110983] -fpatchable-function-entry is missing in Option Summary page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110983 Mao changed: What|Removed |Added CC||sray at live dot com --- Comment #3 from Mao --- Created attachment 55810 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55810&action=edit invoke-doc-patch I think this can help fix the issue. I am not sure how to build the HTML web pages. But I also checked the man page. The fpatchable-function-entry is also missing in the manpage in the option summary section. And my fix can solve this issue. I still need more time to fight with git send-email and my email provider before I can send this patch file to the gcc-patches mailing list...
[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 --- Comment #14 from Andrew Pinski --- I have a patch which is able to optimize this to: t1_3 = b_1(D) >= a_2(D); _6 = b_1(D) > a_2(D); _4 = t1_3 ^ _6; But then we need to handle some simplifications for ^. I will handle that next week or so ...
[Bug tree-optimization/95185] Failure to optimize specific kind of sign comparison check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185 --- Comment #8 from Andrew Pinski --- I have a patch which converts this into: _1 = x_4(D) < 0; _2 = y_5(D) <= 0; _3 = _1 ^ _2;
[Bug target/110943] RISC-V: vmv.v.x and vmv.s.x pattern combine error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110943 Lehua Ding changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #2 from Lehua Ding --- Fixed.
[Bug target/110943] RISC-V: vmv.v.x and vmv.s.x pattern combine error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110943 --- Comment #1 from CVS Commits --- The trunk branch has been updated by Lehua Ding : https://gcc.gnu.org/g:973eb0deb467c79cc21f265a710a81054cfd3e8c commit r14-3535-g973eb0deb467c79cc21f265a710a81054cfd3e8c Author: Lehua Ding Date: Tue Aug 29 09:54:22 2023 +0800 RISC-V: Fix error combine of pred_mov pattern This patch fixes PR110943, which produces wrong code because of an incorrect combine of some pred_mov patterns. Consider this code: ``` void foo9 (void *base, void *out, size_t vl) { int64_t scalar = *(int64_t*)(base + 100); vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1); *(vint64m2_t*)out = v; } ``` RTL before combine pass: ``` (insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ]) (if_then_else:RVVM2DI (unspec:RVVMF32BI [ (const_vector:RVVMF32BI repeat [ (const_int 1 [0x1]) ]) (const_int 1 [0x1]) (const_int 2 [0x2]) repeated x2 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (const_vector:RVVM2DI repeat [ (const_int 0 [0]) ]) (unspec:RVVM2DI [ (reg:SI 0 zero) ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di}) (insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128]) (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole}) ``` RTL after combine pass: ``` (insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128]) (if_then_else:RVVM2DI (unspec:RVVMF32BI [ (const_vector:RVVMF32BI repeat [ (const_int 1 [0x1]) ]) (const_int 1 [0x1]) (const_int 2 [0x2]) repeated x2 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (const_vector:RVVM2DI repeat [ (const_int 0 [0]) ]) (unspec:RVVM2DI [ (reg:SI 0 zero) ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di}) ``` This combine changes the semantics of insn 14. I split the @pred_mov pattern and restrict the condition of @pred_mov.
PR target/110943 gcc/ChangeLog: * config/riscv/predicates.md (vector_const_int_or_double_0_operand): New predicate. * config/riscv/riscv-vector-builtins.cc (function_expander::function_expander): force_reg mem target operand. * config/riscv/vector.md (@pred_mov): Wrapper. (*pred_mov): Remove imm -> reg pattern. (*pred_broadcast_imm): Add imm -> reg pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Adjust. * gcc.target/riscv/rvv/base/pr110943.c: New test.
[Bug fortran/111218] Conflict in BIND(C) INTERFACEs in two Modules leads to ICE.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111218 kargl at gcc dot gnu.org changed: What|Removed |Added CC||kargl at gcc dot gnu.org --- Comment #1 from kargl at gcc dot gnu.org --- > (tested with gcc version 14.0.0 20230828 (experimental) [master > r14-3528-gc3669bb677b] (GCC) No ICE with a 14.0.0 20230824 gfortran
[Bug tree-optimization/107880] bool tautology missed optimisation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107880 Andrew Pinski changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #4 from Andrew Pinski --- With a patch I have for PR 95185 we get: ``` _1 = b_2(D) == a_3(D); _10 = b_2(D) ^ a_3(D); _5 = _1 ^ _10; ``` Which is better than before. One more improvement would be: ``` bool a(bool x, bool y) { bool t = x == y; return t ^ x; } ``` Into: ``` bool a0(bool x, bool y) { bool t = (x ^ y); return t ^ x ^1; // ~y } ``` So the 2 which are needed still: /* (a == b) ^ a -> b^1 */ (simplify (bit_xor:c (eq:c zero_one_valued_p@0 zero_one_valued_p@1) @0) (bit_xor @1 { build_one_cst (type); }) /* (a == b) ^ (a^b) -> b^(b^1) or (b^b)^1 or rather 1 */ (simplify (bit_xor:c (eq:c zero_one_valued_p@0 zero_one_valued_p@1) (bit_xor:c @0 @1)) { build_one_cst (type); }) So mine.
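The two simplifications listed as still needed are small enough to verify over the whole truth table. A minimal sketch in C follows, with a hypothetical function name; it checks the boolean identities only and says nothing about how the match.pd patterns are actually implemented.

```c
#include <stdbool.h>

/* Walks all four (a, b) combinations and counts violations of
     (a == b) ^ a        ==  b ^ 1
     (a == b) ^ (a ^ b)  ==  1                                   */
unsigned count_eq_xor_mismatches(void)
{
    unsigned bad = 0;
    for (int i = 0; i < 4; i++) {
        bool a = i & 1;
        bool b = (i >> 1) & 1;
        bool eq = (a == b);
        if ((eq ^ a) != (b ^ 1))
            bad++;
        if ((eq ^ (a ^ b)) != 1)
            bad++;
    }
    return bad;
}
```

A return value of 0 means both rewrites hold for every pair of 0/1 inputs, matching the `zero_one_valued_p` restriction in the patterns above.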
[Bug c/111219] -Wformat-truncation false negative with %p modifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219 --- Comment #2 from Nick Desaulniers --- Ah ok that makes sense. Would it be possible to get that behavior documented on this page? https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wformat-truncation We can probably modify clang to match this behavior then. It's good to know that this was intentional, but too bad that Martin did the work to change this, but the kernel commit still disabled the diagnostic. Martin's GCC patch is dated: Date: Tue Nov 29 21:08:02 2016 Linus' kernel patch is dated: Date: Wed Jul 12 19:25:47 2017 -0700 (So this was changed in GCC BEFORE the kernel commit; perhaps Linus was using an older release at the time. Or perhaps there was something else Linus was witnessing).
[Bug c/111219] -Wformat-truncation false negative with %p modifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=78512 --- Comment #1 from Andrew Pinski --- From GCC itself: case 'p': /* The %p output is implementation-defined. It's possible to determine this format but due to extensions (especially those of the Linux kernel -- see bug 78512) the first %p in the format string disables any further processing. */ return false;
[Bug c/111219] New: -Wformat-truncation false negative with %p modifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219 Bug ID: 111219 Summary: -Wformat-truncation false negative with %p modifier Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: ndesaulniers at google dot com Target Milestone: --- I noticed that -Wformat-truncation was disabled in the linux kernel. commit bd664f6b3e37 ("disable new gcc-7.1.1 warnings for now") I was curious since I was unfamiliar with that flag. I filed a bug against clang to look into implementing something similar. https://github.com/llvm/llvm-project/issues/64871 They extended their existing -Wfortify-source flag instead (*sigh*), but we noticed now in the Linux kernel that `-Wfortify-source` is flagging a few cases where kernel devs have added custom format flags for pretty printing oft-used data structures, which is tripping up this warning, since these format specifiers are not part of the language standard. A recent kernel patch looks to re-enable -Wformat-truncation for W=1 kernel builds. Nathan noticed that GCC is not warning for the %p related flags, whereas clang is (with -Wfortify-source). I don't think GCC's current behavior is intentional? For example, consider the following code: ``` void foo (void *x) { char dst [1]; __builtin_snprintf(dst, sizeof(dst), "%p", x); } ``` Clang-18 (trunk, not yet released, after https://github.com/llvm/llvm-project/commit/0c9c9dd9a24f9d715d950fef0ac7aae01437af96) with -Wfortify-source will warn: ``` tmp.c:3:5: warning: 'snprintf' will always be truncated; specified size is 1, but format string expands to at least 4 [-Wfortify-source] 3 | __builtin_snprintf(dst, sizeof(dst), "%p", x); | ^ ``` GCC with -Wformat-truncation does not warn, but I think it should.
[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 Andrew Pinski changed: What|Removed |Added Depends on||95185 --- Comment #13 from Andrew Pinski --- But this depends on PR 95185 still. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185 [Bug 95185] Failure to optimize specific kind of sign comparison check
[Bug tree-optimization/107880] bool tautology missed optimisation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107880 Bug 107880 depends on bug 107881, which changed state. Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 What|Removed |Added Status|RESOLVED|ASSIGNED Resolution|DUPLICATE |---
[Bug tree-optimization/107887] (bool0 > bool1) | bool1 is not optimized to bool0 | bool1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887 Bug 107887 depends on bug 107881, which changed state. Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 What|Removed |Added Status|RESOLVED|ASSIGNED Resolution|DUPLICATE |---
[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 Andrew Pinski changed: What|Removed |Added Status|RESOLVED|ASSIGNED Resolution|DUPLICATE |--- --- Comment #12 from Andrew Pinski --- Actually reopen since it is not an exact dup. But still mine.
[Bug testsuite/111216] [14 regression] instructions counts for vector tests change after r14-3258-ge7a36e4715c716
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111216 Peter Bergner changed: What|Removed |Added CC||linkw at gcc dot gnu.org, ||meissner at gcc dot gnu.org --- Comment #2 from Peter Bergner --- The code change that led to this looks correct to me. Are we possibly just folding more than we used to (a good thing), and that is changing our numbers? What are the actual and expected counts? I'm sorry for repeating myself, but I really really dislike counting xxlor insns, since they're mostly used for register copies and the number of those can easily change with the phase of the moon, day of the week, etc. etc.
[Bug tree-optimization/95185] Failure to optimize specific kind of sign comparison check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185 Andrew Pinski changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #7 from Andrew Pinski --- (In reply to Andrew Pinski from comment #5) > > Something like: > Prefer ^ over == > ``` > (for cmp > (for cmpN > (for neeq > (simplify > (neeq:c (cmp @0 @1) @3 > (if (cmpN == inverseof(cmp, TREE_TYPE (type)) >(bit_xor (cmpN @0 @1) @3) > ) > ) > ))) > ``` Actually we can just do: ``` /* For CMP == b, prefer CMP` ^ b. */ (for neeq (ne eq) (for cmp (tcc_comparison) (simplify (neeq:c (cmp@0 @1 @2) @3) (bit_xor (bit_not! @0) @3) ) ) ) ``` Since we already have folding of (bit_not cmp) in another place.
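The rewrite proposed here relies on `cmp == b` being the same as `~cmp ^ b` when `b` is a boolean. A small sketch that checks this for one comparison follows. It assumes integer operands (for floating point the inverse of `<` is not `>=` because of NaNs) and covers only the `==` case; the function name is hypothetical.

```c
#include <stdbool.h>

/* For small integers x, y and a boolean b, counts cases where
   (x < y) == b differs from (x >= y) ^ b. */
unsigned count_cmp_eq_mismatches(void)
{
    unsigned bad = 0;
    for (int x = -2; x <= 2; x++)
        for (int y = -2; y <= 2; y++)
            for (int bi = 0; bi <= 1; bi++) {
                bool b = bi;
                bool eq_form = ((x < y) == b);
                bool xor_form = (x >= y) ^ b; /* x >= y is the inverse of x < y */
                if (eq_form != xor_form)
                    bad++;
            }
    return bad;
}
```

The counter is expected to stay at 0, so the `==` can be traded for an inverted comparison plus `^`, which is the form the later comments in PR 95185 and PR 107881 show in the GIMPLE dumps.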
[Bug bootstrap/111141] Compiling gcc-13.2.0 on Ubuntu 22.04.3 LTS, problem asm-generic/errno.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111141 --- Comment #4 from Andrew Pinski --- (In reply to etienne_lorrain from comment #3) > Unlike for ARM64 host compiling a native compiler, you need to say such > --disable-multilib for amd64 compiling a native compiler. Well aarch64 (arm64 [which is technically not a thing]) defaults to having only one multi-lib (LP64) while x86_64 (amd64 which is the non-canonical name for x86_64) defaults to having both 64 and 32bit multi-lib.
[Bug bootstrap/111141] Compiling gcc-13.2.0 on Ubuntu 22.04.3 LTS, problem asm-generic/errno.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111141 --- Comment #3 from etienne_lorrain at yahoo dot fr --- Just reporting that the problem does not appear when --disable-multilib is passed at the configure stage. Unlike for ARM64 host compiling a native compiler, you need to say such --disable-multilib for amd64 compiling a native compiler.
[Bug tree-optimization/107880] bool tautology missed optimisation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107880 Bug 107880 depends on bug 107881, which changed state. Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |DUPLICATE
[Bug tree-optimization/107887] (bool0 > bool1) | bool1 is not optimized to bool0 | bool1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107887 Bug 107887 depends on bug 107881, which changed state. Bug 107881 Summary: (a <= b) == (b >= a) should be optimized to (a == b) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |DUPLICATE
[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|ASSIGNED|RESOLVED --- Comment #11 from Andrew Pinski --- Basically a dup of bug 95185. *** This bug has been marked as a duplicate of bug 95185 ***
[Bug tree-optimization/95185] Failure to optimize specific kind of sign comparison check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95185 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org --- Comment #6 from Andrew Pinski --- *** Bug 107881 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/107881] (a <= b) == (b >= a) should be optimized to (a == b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107881 Andrew Pinski changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #10 from Andrew Pinski --- Mine, there was another bug where we had `cmp == b` and I mentioned that the way to improve that is to prefer ^ and `~cmp`.
[Bug tree-optimization/101676] ^ not changed to | if the non-zero don't overlap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101676 --- Comment #2 from Andrew Pinski --- (In reply to Richard Biener from comment #1) > why is | better than ^? Just to reply to this. The reasoning from simplify-rtx.cc: /* If we are XORing two things that have no bits in common, convert them into an IOR. This helps to detect rotation encoded using those methods and possibly other simplifications. */ Which was added with r0-24478-g79e8185c9ccfcb .
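The simplify-rtx.cc comment rests on the fact that XOR and IOR agree whenever the operands share no set bits, and a rotate built from two shifts is the usual case. The following is an illustrative sketch only; `rotl_xor`, `rotl_ior` and the counter are hypothetical helpers, not GCC code.

```c
#include <stdint.h>

/* Counts 8-bit pairs with no common bits for which a ^ b != a | b. */
unsigned count_disjoint_xor_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            if ((a & b) == 0 && (a ^ b) != (a | b))
                bad++;
    return bad;
}

/* Rotate-left written with ^: the two shifted halves never overlap. */
uint32_t rotl_xor(uint32_t v, unsigned n)
{
    n &= 31;
    if (n == 0)
        return v;
    return (v << n) ^ (v >> (32 - n));
}

/* The same rotate written with |, the form a rotate matcher usually expects. */
uint32_t rotl_ior(uint32_t v, unsigned n)
{
    n &= 31;
    if (n == 0)
        return v;
    return (v << n) | (v >> (32 - n));
}
```

Since both rotate variants compute the same value, canonicalizing the `^` form to `|` lets a single rotate-detection pattern catch both spellings.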
[Bug tree-optimization/111147] bitwise_inverted_equal_p can be used in the `(x | y) & (~x ^ y)` pattern to catch more
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111147 Andrew Pinski changed: What|Removed |Added URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2023-August/ ||628600.html Keywords||patch --- Comment #1 from Andrew Pinski --- Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628600.html
[Bug c++/111160] [11/12/13/14 Regression] ICE on assigning volatile through ternary operator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111160 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #2 from Marek Polacek --- Started with r6-4886: commit cda0a029f45d20f4535dcacf6c3194352c31e736 Author: Jason Merrill Date: Fri Nov 13 19:08:05 2015 -0500 Merge C++ delayed folding branch.
[Bug testsuite/111216] [14 regression] instructions counts for vector tests change after r14-3258-ge7a36e4715c716
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111216 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2023-08-28 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- test6_nor in fold-vec-logical-ors-char.c Trying 10, 9 -> 11: 10: r127:V16QI=const_vector // (-1) 9: r125:V16QI=r126:V16QI-r124:V16QI REG_DEAD r126:V16QI REG_DEAD r124:V16QI 11: r121:V16QI=r125:V16QI+r127:V16QI REG_DEAD r127:V16QI REG_DEAD r125:V16QI REG_EQUAL r125:V16QI+const_vector Failed to match this instruction: (set (reg:V16QI 121 [ ]) (plus:V16QI (not:V16QI (reg:V16QI 124)) (reg:V16QI 126 [ *foo_4(D) ]))) Successfully matched this instruction: (set (reg:V16QI 127) (not:V16QI (reg:V16QI 124))) Successfully matched this instruction: (set (reg:V16QI 121 [ ]) (plus:V16QI (reg:V16QI 127) (reg:V16QI 126 [ *foo_4(D) ]))) allowing combination of insns 9, 10 and 11 original costs 4 + 20 + 4 = 28 replacement costs 4 + 4 = 8 deferring deletion of insn with uid = 9. modifying insn i2 10: r127:V16QI=~r124:V16QI REG_DEAD r124:V16QI deferring rescan insn with uid = 10. modifying insn i3 11: r121:V16QI=r127:V16QI+r126:V16QI REG_DEAD r126:V16QI REG_DEAD r127:V16QI deferring rescan insn with uid = 11.
[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215 Andrew Pinski changed: What|Removed |Added Status|ASSIGNED|RESOLVED Target Milestone|--- |14.0 Resolution|--- |FIXED --- Comment #4 from Andrew Pinski --- Fixed. Filed PR 111217 as mentioned.
[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215 --- Comment #3 from CVS Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:b7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a commit r14-3529-gb7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a Author: Andrew Pinski Date: Mon Aug 28 19:27:41 2023 + Fix cond-bool-2.c on powerpc and other targets This adds `--param logical-op-non-short-circuit=1` to the testcase so it becomes a target independent testcase now. I filed PR 111217 as the variant of the testcase which fails independently of the param. Committed as obvious after testing to make sure it passes on powerpc now. gcc/testsuite/ChangeLog: PR testsuite/111215 * gcc.dg/tree-ssa/cond-bool-2.c: Add `--param logical-op-non-short-circuit=1` to the options.
[Bug tree-optimization/111217] variant of cond-bool-2.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111217 --- Comment #1 from CVS Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:b7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a commit r14-3529-gb7f9ee7fb89fc9c48f03970e8e6581c7bae58f5a Author: Andrew Pinski Date: Mon Aug 28 19:27:41 2023 + Fix cond-bool-2.c on powerpc and other targets This adds `--param logical-op-non-short-circuit=1` to the testcase so it becomes a target independent testcase now. I filed PR 111217 as the variant of the testcase which fails independently of the param. Committed as obvious after testing to make sure it passes on powerpc now. gcc/testsuite/ChangeLog: PR testsuite/111215 * gcc.dg/tree-ssa/cond-bool-2.c: Add `--param logical-op-non-short-circuit=1` to the options.
[Bug fortran/111218] New: Conflict in BIND(C) INTERFACEs in two Modules leads to ICE.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111218 Bug ID: 111218 Summary: Conflict in BIND(C) INTERFACEs in two Modules leads to ICE. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: toon at moene dot org Target Milestone: --- The following program: MODULE FIELD_2RM_UTIL_MODULE INTERFACE SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER() BIND(C,name="set_abor1_exception_handler") END SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER END INTERFACE END MODULE MODULE FIELD_3RM_UTIL_MODULE INTERFACE SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER() BIND(C,name="set_abor1_exception_handler") END SUBROUTINE SET_ABOR1_EXCEPTION_HANDLER END INTERFACE END MODULE MODULE FIELD_UTIL_MODULE USE FIELD_2RM_UTIL_MODULE USE FIELD_3RM_UTIL_MODULE IMPLICIT NONE END MODULE leads to the following internal compiler error: /home/toon/compilers/install/gcc/bin/gfortran -c -g a.f90 in gfc_format_decoder, at fortran/error.cc:1078 0x75917d gfc_format_decoder /home/toon/compilers/gcc/gcc/fortran/error.cc:1078 0x2153e1f pp_format(pretty_printer*, text_info*) /home/toon/compilers/gcc/gcc/pretty-print.cc:1475 0x21315be diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*) /home/toon/compilers/gcc/gcc/diagnostic.cc:1606 0x9b628e gfc_report_diagnostic /home/toon/compilers/gcc/gcc/fortran/error.cc:890 0x9b628e gfc_error_opt /home/toon/compilers/gcc/gcc/fortran/error.cc:1460 0x9b7470 gfc_error(char const*, ...) 
/home/toon/compilers/gcc/gcc/fortran/error.cc:1489 0xa6205b ambiguous_symbol /home/toon/compilers/gcc/gcc/fortran/symbol.cc:3167 0xa6ce9e gfc_find_sym_tree(char const*, gfc_namespace*, int, gfc_symtree**) /home/toon/compilers/gcc/gcc/fortran/symbol.cc:3240 0xa6cec1 gfc_find_symbol(char const*, gfc_namespace*, int, gfc_symbol**) /home/toon/compilers/gcc/gcc/fortran/symbol.cc:3291 0xb27a05 check_against_globals /home/toon/compilers/gcc/gcc/fortran/frontend-passes.cc:5842 0xa630e2 do_traverse_symtree /home/toon/compilers/gcc/gcc/fortran/symbol.cc:4190 0xb30231 gfc_check_externals(gfc_namespace*) /home/toon/compilers/gcc/gcc/fortran/frontend-passes.cc:5888 0xa293d8 gfc_parse_file() /home/toon/compilers/gcc/gcc/fortran/parse.cc:7195 0xa7aecf gfc_be_parse_file /home/toon/compilers/gcc/gcc/fortran/f95-lang.cc:229 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. (tested with gcc version 14.0.0 20230828 (experimental) [master r14-3528-gc3669bb677b] (GCC)
[Bug tree-optimization/111217] New: variant of cond-bool-2.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111217 Bug ID: 111217 Summary: variant of cond-bool-2.c fails Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Take: ``` static inline _Bool nand(_Bool a, _Bool b) { _Bool t = 0; if (a) { if (b) t = 1; } return !t; // return !(a && b); } _Bool f(int a, int b) { return nand(nand(b, nand(a, a)), nand(a, nand(b, b))); } ``` we get at ifcombine: [local count: 1073741824]: if (a_3(D) != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 536870912]: if (b_2(D) != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 536870912]: if (b_2(D) != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 536870912]: ... [local count: 536870912]: # iftmp.0_21 = PHI <1(3), 0(4)> So we could swap these ifs around slightly if (b_2(D) != 0) goto L1; else goto L2; L1: if (a_3(D) != 0) goto L3; else goto L4; L3: goto L4; L4: iftmp.0_21 = PHI <1(3), 0(4)> L1: goto bb5; And then it will be optimized.
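For reference, by De Morgan's laws the nested nand expression in this report reduces to the exclusive-or of the truth values of `a` and `b`. A minimal sketch that spells this out follows; `nand2`, `nand_chain` and `truth_xor` are hypothetical names chosen to avoid clashing with the testcase, and the sketch makes no claim about what the testsuite scans for.

```c
#include <stdbool.h>

/* Same semantics as the nand() helper in the report, written directly. */
static inline bool nand2(bool a, bool b) { return !(a && b); }

/* The expression from the report: four NANDs wired up as an XOR gate. */
bool nand_chain(int a, int b)
{
    return nand2(nand2(b, nand2(a, a)), nand2(a, nand2(b, b)));
}

/* What the chain is equivalent to: XOR of the truth values of a and b. */
bool truth_xor(int a, int b)
{
    return (a != 0) != (b != 0);
}
```

Comparing `nand_chain` with `truth_xor` on zero and nonzero inputs gives the same result each time, which is the fully optimized form one would hope the if-combination passes can reach.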
[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215 --- Comment #2 from Andrew Pinski --- So there might be two ways of fixing this: [local count: 1073741824]: if (a_3(D) != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 536870912]: if (b_2(D) != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 536870912]: if (b_2(D) != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 536870912]: ... [local count: 536870912]: # iftmp.0_21 = PHI <1(3), 0(4)> So we could swap this if around slightly if (b_2(D) != 0) goto L1; else goto L2; L1: if (a_3(D) != 0) goto L3; else goto L4; L3: goto L4; L4: iftmp.0_21 = PHI <1(3), 0(4)> L1: goto bb5; Implementing that will take some time. But --param=logical-op-non-short-circuit=1 is enough to fix the testcase so that is what I am going to use here. Will file another bug about the above case.
[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209 --- Comment #5 from cqwrteur --- (In reply to Jakub Jelinek from comment #4) > (In reply to cqwrteur from comment #3) > > (In reply to Jakub Jelinek from comment #1) > > > Just use __int128 addition if all you want is double-word addition (or > > > long > > > long for 32-bit arches)? > > > > Well, I've presented this merely as an illustrative example. The length can > > actually be arbitrary. > > No, it was working with all the other lengths. This might come across as unusual. I frequently engage in manipulations involving the carry flag. like this implementation for 128 bit division (for 32 bit machine and Microsoft compiler) auto shift = static_cast(::std::countl_zero(divisorhigh) - ::std::countl_zero(dividendhigh)); divisorhigh = ::fast_io::intrinsics::shiftleft(divisorlow,divisorhigh,shift); divisorlow <<= shift; quotientlow = 0; bool carry; do { carry=0; dividendlow=intrinsics::subc(dividendlow,divisorlow,carry,carry); dividendhigh=intrinsics::subc(dividendhigh,divisorhigh,carry,carry); constexpr T zero{}; T mask{zero-carry}; T templow{divisorlow&mask},temphigh{divisorhigh&mask}; carry=!carry; quotientlow=intrinsics::addc(quotientlow,quotientlow,carry,carry); carry=0; dividendlow=intrinsics::addc(dividendlow,templow,carry,carry); dividendhigh=intrinsics::addc(dividendhigh,temphigh,carry,carry); divisorlow = intrinsics::shiftright(divisorlow,divisorhigh,1u); divisorhigh >>= 1u; } while(shift--); return {quotientlow,0,dividendlow,dividendhigh};
[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209 --- Comment #4 from Jakub Jelinek --- (In reply to cqwrteur from comment #3) > (In reply to Jakub Jelinek from comment #1) > > Just use __int128 addition if all you want is double-word addition (or long > > long for 32-bit arches)? > > Well, I've presented this merely as an illustrative example. The length can > actually be arbitrary. No, it was working with all the other lengths.
[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209 --- Comment #3 from cqwrteur --- (In reply to Jakub Jelinek from comment #1) > Just use __int128 addition if all you want is double-word addition (or long > long for 32-bit arches)? Well, I've presented this merely as an illustrative example. The length can actually be arbitrary. I've directly taken the code from the GCC documentation, but it doesn't appear to perform as the document asserts. " Built-in Function: unsigned int __builtin_addc (unsigned int a, unsigned int b, unsigned int carry_in, unsigned int *carry_out) Built-in Function: unsigned long int __builtin_addcl (unsigned long int a, unsigned long int b, unsigned int carry_in, unsigned long int *carry_out) Built-in Function: unsigned long long int __builtin_addcll (unsigned long long int a, unsigned long long int b, unsigned long long int carry_in, unsigned long long int *carry_out) These built-in functions are equivalent to: ({ __typeof__ (a) s; \ __typeof__ (a) c1 = __builtin_add_overflow (a, b, &s); \ __typeof__ (a) c2 = __builtin_add_overflow (s, carry_in, &s); \ *(carry_out) = c1 | c2; \ s; }) i.e. they add 3 unsigned values, set what the last argument points to to 1 if any of the two additions overflowed (otherwise 0) and return the sum of those 3 unsigned values. Note, while all the first 3 arguments can have arbitrary values, better code will be emitted if one of them (preferrably the third one) has only values 0 or 1 (i.e. carry-in). " Additionally, it's advisable to steer clear of using __uint128_t in certain situations. This data type is not compatible with the Microsoft compiler and 32-bit machines. Moreover, the compiler does not effectively optimize the associated costs.
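As an illustration of the carry-chain pattern under discussion, here is a minimal sketch that spells out the documented expansion with __builtin_add_overflow (so it does not depend on __builtin_addc itself being available) and chains it over two limbs. The helper names addc_ul and add2 are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: add-with-carry written the way the quoted
   documentation describes the built-in's behaviour. */
static unsigned long addc_ul(unsigned long a, unsigned long b,
                             unsigned long carry_in,
                             unsigned long *carry_out)
{
    unsigned long s;
    unsigned long c1 = __builtin_add_overflow(a, b, &s);
    unsigned long c2 = __builtin_add_overflow(s, carry_in, &s);

    *carry_out = c1 | c2;   /* at most one of the two additions can carry */
    return s;
}

/* Hypothetical two-limb addition, least significant limb first.
   Stores x + y in r and returns the carry out of the top limb. */
unsigned long add2(const unsigned long x[2], const unsigned long y[2],
                   unsigned long r[2])
{
    unsigned long carry = 0;

    r[0] = addc_ul(x[0], y[0], carry, &carry);
    r[1] = addc_ul(x[1], y[1], carry, &carry);
    return carry;
}
```

The same loop shape extends to any number of limbs, which is the "arbitrary length" case the reporter mentions; whether the compiler turns it into an add/adc sequence is exactly what this bug is about.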
[Bug middle-end/111209] GCC fails to understand adc pattern what its document describes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209 Jakub Jelinek changed: What|Removed |Added Last reconfirmed||2023-08-28 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- Created attachment 55809 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55809&action=edit gcc14-pr111209.patch Anyway, here is a patch that makes it match, but it is getting ugly to avoid making it match prematurely and break other matching.
[Bug target/111209] GCC fails to understand adc pattern what its document describes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111209 --- Comment #1 from Jakub Jelinek --- Just use __int128 addition if all you want is double-word addition (or long long for 32-bit arches)?
[Bug testsuite/111215] New test case gcc.dg/tree-ssa/cond-bool-2.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org Keywords||testsuite-fail Component|other |testsuite Last reconfirmed||2023-08-28 Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- I think I know what the issue is with the testcase.
[Bug testsuite/111216] New: [14 regression] instructions counts for vector tests change after r14-3258-ge7a36e4715c716
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111216 Bug ID: 111216 Summary: [14 regression] instructions counts for vector tests change after r14-3258-ge7a36e4715c716 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: --- g:e7a36e4715c7162ccfd7cd32da985d629bbd9c61, r14-3258-ge7a36e4715c716 FAIL: gcc.target/powerpc/fold-vec-logical-ors-char.c scan-assembler-times \\mxxlnor\\M 1 FAIL: gcc.target/powerpc/fold-vec-logical-ors-char.c scan-assembler-times \\mxxlor\\M 7 FAIL: gcc.target/powerpc/fold-vec-logical-ors-int.c scan-assembler-times \\mxxlnor\\M 1 FAIL: gcc.target/powerpc/fold-vec-logical-ors-int.c scan-assembler-times \\mxxlor\\M 7 FAIL: gcc.target/powerpc/fold-vec-logical-ors-longlong.c scan-assembler-times \\mxxlnor\\M 3 FAIL: gcc.target/powerpc/fold-vec-logical-ors-longlong.c scan-assembler-times \\mxxlor\\M 9 FAIL: gcc.target/powerpc/fold-vec-logical-ors-short.c scan-assembler-times \\mxxlnor\\M 1 FAIL: gcc.target/powerpc/fold-vec-logical-ors-short.c scan-assembler-times \\mxxlor\\M 7 FAIL: gcc.target/powerpc/fold-vec-logical-other-char.c scan-assembler-times \\mxxlnand\\M 3 FAIL: gcc.target/powerpc/fold-vec-logical-other-int.c scan-assembler-times \\mxxlnand\\M 3 FAIL: gcc.target/powerpc/fold-vec-logical-other-longlong.c scan-assembler-times \\mxxlnand\\M 3 FAIL: gcc.target/powerpc/fold-vec-logical-other-short.c scan-assembler-times \\mxxlnand\\M 3 These are all just instruction count tests so the changes may not matter. commit e7a36e4715c7162ccfd7cd32da985d629bbd9c61 (HEAD) Author: Yanzhang Wang Date: Wed Aug 16 22:28:50 2023 -0600 [PATCH] RISC-V: Support simplify (-1-x) for vector.
[Bug target/111107] i686-w64-mingw32 does not realign stack when __attribute__((aligned)) or __attribute__((vector_size)) are used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111107 Gabriel Ivăncescu changed: What|Removed |Added CC||gabrielopcode at gmail dot com --- Comment #7 from Gabriel Ivăncescu --- So, to reiterate the summary of the problem:
1) The i686 Win32 ABI has a de-facto stack alignment of 4 bytes *only*. GCC may have set it to 16 bytes on Linux because it compiled the whole userland, but that's not the case on Windows; the caller can be MSVC compiled code (very likely on Windows) and MSVC only uses 4-byte alignment.
2) SSE is *not* the only thing that requires stack realignment. Sure, it does require it, but that's more a side effect of requiring larger-than-4 alignment in the first place. A variable (or its type) declared with __attribute__((aligned(...))) **should** also let GCC re-align the stack upon entry, if it's > 4 bytes and if it's actually used on the stack and spilled (or has its address taken).
There's no reason to special-case SSE at all. It's just the alignment of the variable or spilled vector that should matter, and GCC must know that the incoming stack is aligned only to 4 bytes on this platform. i686 PE targets should simply default to -mincoming-stack-boundary=2 -mpreferred-stack-boundary=2 (the latter to minimize realignments unless necessary), as that's basically MSVC's behavior, and as such the de-facto standard on this platform.
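A small runtime probe along these lines could be used to observe the behaviour described above. This is only a sketch: the function name aligned_local_is_aligned is hypothetical, and the expectation is that it returns 1 wherever the compiler honours the requested alignment of a stack variable, and may return 0 on a target where the incoming stack is only 4-byte aligned and no realignment is emitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe: reports whether a local declared with
   __attribute__((aligned(16))) really ends up 16-byte aligned.
   noinline keeps the buffer in this function's own stack frame. */
__attribute__((noinline)) int aligned_local_is_aligned(void)
{
    volatile char buf[32] __attribute__((aligned(16)));

    /* Going through a volatile object is meant to stop the compiler
       from folding the check based on the declared alignment alone. */
    volatile uintptr_t addr = (uintptr_t)(const volatile void *)buf;

    buf[0] = 1;
    return (addr % 16) == 0;
}
```

To mimic the scenario in the comment, such a probe would be called from code built with a 4-byte stack boundary (for example a caller compiled by a different compiler), which is an assumption about the test setup rather than something shown in the report.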
[Bug other/111215] New: New test case gcc.dg/tree-ssa/cond-bool-2.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111215 Bug ID: 111215 Summary: New test case gcc.dg/tree-ssa/cond-bool-2.c fails Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: ---
g:ddd64a6ec3b38e18aefb9fcba50c0d9297e5e711, r14-3432-gddd64a6ec3b38e
make -k check-gcc RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/cond-bool-2.c"
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-times optimized "ne_expr, " 2
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-not optimized "gimple_cond "
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-not optimized "gimple_phi "
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-times optimized "bit_xor_expr, " 1
FAIL: gcc.dg/tree-ssa/cond-bool-2.c scan-tree-dump-times optimized "gimple_assign " 3
# of expected passes 3
# of unexpected failures 5
commit ddd64a6ec3b38e18aefb9fcba50c0d9297e5e711 (HEAD)
Author: Andrew Pinski
Date: Tue Aug 22 18:41:56 2023 -0700
MATCH: remove negate for 1bit types
* gcc.dg/tree-ssa/cond-bool-2.c: New test.
[Bug fortran/102417] Wrong error message about character length with -std=f2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102417 anlauf at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Keywords|diagnostic |rejects-valid, wrong-code CC||anlauf at gcc dot gnu.org See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=107721 Ever confirmed|0 |1 Last reconfirmed||2023-08-28 --- Comment #2 from anlauf at gcc dot gnu.org --- It appears that we lose the typespec for nested ctors, so I guess this PR is related to pr107721. Slight variation of testcase: program p character:: x = 'a' character(4) :: y(2) y = [ character(4) :: x, 'b' ] y = [[character(4) :: x, 'b']] print *, y print *, len ([ character(4) :: x, 'b' ]) print *, len ([[character(4) :: x, 'b']]) end Compiling with -fdump-fortran-original, I see: code: ASSIGN p:y(FULL) (/ p:x , 'b ' /) ASSIGN p:y(FULL) (/ p:x , 'b' /) WRITE UNIT=6 FMT=-1 TRANSFER p:y(FULL) DT_END WRITE UNIT=6 FMT=-1 TRANSFER 4 DT_END WRITE UNIT=6 FMT=-1 TRANSFER 1 DT_END Clearly, the code for the lines with nested ctors is wrong.
[Bug target/111171] [14 Regression] ICE: in decompose, at rtl.h:2297 at -O1 on riscv64-unknown-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111171 --- Comment #2 from Zdenek Sojka --- (In reply to Xi Ruoyao from comment #1) > Can you try > https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627024.html? The patch * combine.cc (simplify_compare_const): Properly handle unsigned constants while narrowing comparison of memory and constants. fixes this ICE on several testcases
[Bug libgomp/111214] New: omp_get_num_procs: Improve documentation - especially for devices
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111214 Bug ID: 111214 Summary: omp_get_num_procs: Improve documentation - especially for devices Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: documentation Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Current wording: https://gcc.gnu.org/onlinedocs/libgomp/omp_005fget_005fnum_005fprocs.html "Returns the number of processors online on that device." (A) For the host, I wonder whether it should mention the affinity bits, which we have in Linux: if (gomp_places_list == NULL) ... && pthread_getaffinity_np (pthread_self (), gomp_get_cpuset_size, gomp_cpusetp) == 0) ... (B) We are completely silent for devices. The sentiment during today's OpenMP accel talk seemed to be that it should be the number of independent hardware threads, i.e. #warps (nvptx) and #wavefronts (amdgcn) in hardware (possibly: minus those removed via explicit num_threads settings). We currently have for accelerators: return gomp_icv (false)->nthreads_var where gomp_icv(false) is: struct gomp_task *task = gomp_thread ()->task; if (task) return &task->icv; /*... */ else return &gomp_global_icv; And on GCN we set gomp_global_icv.nthreads_var = 16 and for nvptx: gomp_global_icv.nthreads_var = ntids. Possibly, it should be omp_get_num_teams() * nthreads_var, or not. In any case, it needs to be documented.
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #18 from Thorsten Glaser --- I cannot, unfortunately. But I have found _another_ “mitigation”: varsub() is static and has only one caller: https://evolvis.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=alioth/mksh.git;a=blob;f=eval.c;h=cb959b1d1104229ead20a698ff2dc974b8da3b10;hb=35563a7897b98de2743233c5f3340a14bea6ebf2#l400 By making varsub… https://evolvis.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=alioth/mksh.git;a=blob;f=eval.c;h=cb959b1d1104229ead20a698ff2dc974b8da3b10;hb=35563a7897b98de2743233c5f3340a14bea6ebf2#l1238 … not static, the bug *also* goes away. (Probably because varsub is not inlined.) Now we see that… 399 sp = cstrchr(sp, '\0') + 1; 400 type = varsub(&x, varname, sp, &stype, &slen); … the varsub call is *directly* below the strchr/strlen line, *and* it gets passed the sp variable. (Inside varsub, the variable is also modified.) My suspicion here is that, somehow only triggerable on x32+dietlibc, something about the multiple modifications of sp (just before and within varsub) confuses GCC? And indeed. Adding -O2, -O1, -O0 to the GCC command line doesn’t help, but -fno-inline again does. As does adding an attribute to the function prototype: static int varsub(Expand *, const char *, const char *, unsigned int *, int *) __attribute__((noinline)); Could we somehow debug there further? I really don’t see a way to reproduce this on x32/glibc or amd64…
[Bug target/111171] [14 Regression] ICE: in decompose, at rtl.h:2297 at -O1 on riscv64-unknown-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111171 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #1 from Xi Ruoyao --- Can you try https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627024.html?
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #17 from Thorsten Glaser --- Hm, okay, I’ll try to find if I can trigger it in glibc/x32 then…
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #16 from Thorsten Glaser --- If I add -maddress-mode=long to the build of the expr.c file, then link it with the rest, it still fails. I’m not sure about reducing, and not sure about the cross-anything, but I *did* get it to fail on amd64 now! (Just differently.) HOWEVER, I’m not sure whether this is from x32/amd64 mismatch or from the bug, as the resulting pattern differs. The code flow is roughly: eval.c from line 1608 onwards opens a temporary file, dups it to stdout, calls funsub() from line 2147, and on return rewinds that file and restores stdout. This all is called from line 352 (where the jump to the subroutine is), but the strlen in question is on line 399 in a different codepath (where the stuff immediately following '${' is parsed). They only have the use of the variable 'sp' and the jumping past the first NUL in it in common (the funsub caller has 'sp = strnul(sp) + 1;' instead, but that’s just 'sp+strlen(sp)', and changing the 'sp = cstrchr(sp, '\0') + 1;' to that (which I did in upstream CVS HEAD now anyway) doesn’t “fix” the issue). In a Debian sid/amd64 chroot, with GCC 13.2.0-1 (as packaged in Debian), I did: gcc-13 -g -fno-lto -fno-asynchronous-unwind-tables -fno-strict-aliasing -fstack-protector-strong -malign-data=abi -fwrapv -I.
-D_FORTIFY_SOURCE=2 -DMKSH_BUILDMEAT -DMKSH_BUILDSH=1 -D_GNU_SOURCE -DSETUID_CAN_FAIL_WITH_EAGAIN=1 -DHAVE_STRING_POOLING=2 -DHAVE_ATTRIBUTE_BOUNDED=0 -DHAVE_ATTRIBUTE_FORMAT=1 -DHAVE_ATTRIBUTE_NORETURN=1 -DHAVE_ATTRIBUTE_UNUSED=1 -DHAVE_ATTRIBUTE_USED=1 -DHAVE_SYS_TIME_H=1 -DHAVE_TIME_H=1 -DHAVE_BOTH_TIME_H=1 -DHAVE_SYS_SELECT_H=1 -DHAVE_SELECT_TIME_H=1 -DHAVE_SYS_BSDTYPES_H=0 -DHAVE_SYS_FILE_H=1 -DHAVE_SYS_MKDEV_H=0 -DHAVE_SYS_MMAN_H=1 -DHAVE_SYS_PARAM_H=1 -DHAVE_SYS_PTEM_H=0 -DHAVE_SYS_RESOURCE_H=1 -DHAVE_SYS_SYSMACROS_H=1 -DHAVE_BSTRING_H=0 -DHAVE_GRP_H=1 -DHAVE_IO_H=0 -DHAVE_LIBGEN_H=1 -DHAVE_LIBUTIL_H=0 -DHAVE_PATHS_H=1 -DHAVE_STDINT_H=1 -DHAVE_STRINGS_H=1 -DHAVE_TERMIOS_H=1 -DHAVE_ULIMIT_H=1 -DHAVE_VALUES_H=1 -DHAVE_CAN_INTTYPES=1 -DHAVE_SIG_T=1 -DHAVE_STRERRORDESC_NP=1 -DHAVE_SYS_ERRLIST=1 -DHAVE_SIGABBREV_NP=1 -DHAVE_SYS_SIGNAME=0 -DHAVE_SIGDESCR_NP=1 -DHAVE_SYS_SIGLIST=1 -DHAVE_FLOCK=1 -DHAVE_LOCK_FCNTL=1 -DHAVE_RLIMIT=1 -DHAVE_RLIM_T=1 -DHAVE_GET_CURRENT_DIR_NAME=1 -DHAVE_GETRANDOM=0 -DHAVE_GETRUSAGE=1 -DHAVE_GETSID=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_KILLPG=1 -DHAVE_MEMMOVE=1 -DHAVE_MKNOD=0 -DHAVE_MMAP=1 -DHAVE_FTRUNCATE=1 -DHAVE_NICE=1 -DHAVE_RENAME=1 -DHAVE_REVOKE=0 -DHAVE_POSIX_UTF8_LOCALE=0 -DHAVE_SELECT=1 -DHAVE_SETRESUGID=1 -DHAVE_SETGROUPS=1 -DHAVE_SIGACTION=1 -DHAVE_STRERROR=0 -DHAVE_STRSIGNAL=0 -DHAVE_STRLCPY=0 -DHAVE_STRSTR=1 -DHAVE_FLOCK_DECL=1 -DHAVE_REVOKE_DECL=1 -DHAVE_SYS_ERRLIST_DECL=1 -DHAVE_SYS_SIGLIST_DECL=1 -DHAVE_ST_MTIMENSEC=0 -DHAVE_INTCONSTEXPR_RSIZE_MAX=0 -DHAVE_PERSISTENT_HISTORY=1 -DMKSH_BUILD_R=599 -c lalloc.c edit.c eval.c exec.c expr.c funcs.c histrap.c jobs.c lex.c main.c misc.c shf.c syn.c tree.c var.c ulimit.c strlcpy.c gcc-13 -g -fno-lto -fno-asynchronous-unwind-tables -fno-strict-aliasing -fstack-protector-strong -malign-data=abi -fwrapv -fno-lto -o mksh lalloc.o edit.o eval.o exec.o expr.o funcs.o histrap.o jobs.o lex.o main.o misc.o shf.o syn.o tree.o var.o ulimit.o strlcpy.o ./mksh -c 'x=q; x=${ echo a; typeset e=2; 
return 3; echo x$e;}; echo .$x.' gcc-13 -g -fno-lto -fno-asynchronous-unwind-tables -fno-strict-aliasing -fstack-protector-strong -malign-data=abi -fwrapv -I. -D_FORTIFY_SOURCE=2 -DMKSH_BUILDMEAT -DMKSH_BUILDSH=1 -D_GNU_SOURCE -DSETUID_CAN_FAIL_WITH_EAGAIN=1 -DHAVE_STRING_POOLING=2 -DHAVE_ATTRIBUTE_BOUNDED=0 -DHAVE_ATTRIBUTE_FORMAT=1 -DHAVE_ATTRIBUTE_NORETURN=1 -DHAVE_ATTRIBUTE_UNUSED=1 -DHAVE_ATTRIBUTE_USED=1 -DHAVE_SYS_TIME_H=1 -DHAVE_TIME_H=1 -DHAVE_BOTH_TIME_H=1 -DHAVE_SYS_SELECT_H=1 -DHAVE_SELECT_TIME_H=1 -DHAVE_SYS_BSDTYPES_H=0 -DHAVE_SYS_FILE_H=1 -DHAVE_SYS_MKDEV_H=0 -DHAVE_SYS_MMAN_H=1 -DHAVE_SYS_PARAM_H=1 -DHAVE_SYS_PTEM_H=0 -DHAVE_SYS_RESOURCE_H=1 -DHAVE_SYS_SYSMACROS_H=1 -DHAVE_BSTRING_H=0 -DHAVE_GRP_H=1 -DHAVE_IO_H=0 -DHAVE_LIBGEN_H=1 -DHAVE_LIBUTIL_H=0 -DHAVE_PATHS_H=1 -DHAVE_STDINT_H=1 -DHAVE_STRINGS_H=1 -DHAVE_TERMIOS_H=1 -DHAVE_ULIMIT_H=1 -DHAVE_VALUES_H=1 -DHAVE_CAN_INTTYPES=1 -DHAVE_SIG_T=1 -DHAVE_STRERRORDESC_NP=1 -DHAVE_SYS_ERRLIST=1 -DHAVE_SIGABBREV_NP=1 -DHAVE_SYS_SIGNAME=0 -DHAVE_SIGDESCR_NP=1 -DHAVE_SYS_SIGLIST=1 -DHAVE_FLOCK=1 -DHAVE_LOCK_FCNTL=1 -DHAVE_RLIMIT=1 -DHAVE_RLIM_T=1 -DHAVE_GET_CURRENT_DIR_NAME=1 -DHAVE_GETRANDOM=0 -DHAVE_GETRUSAGE=1 -DHAVE_GETSID=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_KILLPG=1 -DHAVE_MEMMOVE=1 -DHAVE_MKNOD=0 -DHAVE_MMAP=1 -DHAVE_FTRUNCATE=1 -DHAVE_NICE=1 -DHAVE_RENAME=1 -DHAVE_REVOKE=0 -DHAVE_POSIX_UTF8_LOCALE=0 -DHAVE_SELECT=1 -DHAVE_SETRESUGID=1 -DHAVE_SETGROUPS=1 -DHAVE_SIGACTION=1 -DHAVE_STRERROR=0 -DHAVE_STRSIGNAL=0 -DHAVE_STRLCPY=0 -DHAVE_STRSTR=1 -DHAVE_FLOCK_DECL=1 -DHAVE_REVOKE_DECL=1 -DHAVE_SYS_ERRLIST_DECL=1 -DHAVE_SYS_SIGLIST_DECL=1 -DHAVE_ST_MTIMENSEC=0 -DHAVE_INTCONSTEXPR
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #15 from H.J. Lu --- We need a testcase which can be reproduced with glibc since the bug may be in other parts of dietlibc.
[Bug c++/111173] G++ allows constinit functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111173 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |mpolacek at gcc dot gnu.org
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 Uroš Bizjak changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #14 from Uroš Bizjak --- (In reply to Thorsten Glaser from comment #13) > The interesting part is around the occurrence of… > > # eval.c:399: sp = cstrchr(sp, '\0') + 1; > > … in the .s files (it occurs thrice, the first is the beginning of the setup > part, the second and third surround the strlen call, so they’re all within a > bunch of lines). Unfortunately, a runtime bug requires a test that fails at runtime; the attached dumps are not that usable. The fact that the compiler fails for a not-so-common target makes things even harder. I think that the best way forward is to create a minimized standalone testcase (from comment #11 it looks like the issue is independent of dietlibc) that can be compiled with -mx32 in a kind of cross-compiler fashion. You can use -maddress-mode=long with -mx32 to create a .s assembly file that is compatible with x86_64, as far as stack handling is concerned. The resulting .s assembly can then be compiled and linked with a C wrapper, so a testcase that eventually fails on x86_64 can be produced. IOW, does the testcase fail when -maddress-mode=long is used?
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #13 from Thorsten Glaser --- The interesting part is around the occurrence of… # eval.c:399: sp = cstrchr(sp, '\0') + 1; … in the .s files (it occurs thrice, the first is the beginning of the setup part, the second and third surround the strlen call, so they’re all within a bunch of lines).
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #12 from Thorsten Glaser --- Created attachment 55808 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55808&action=edit tarball (.xz) with preprocessed and assembly output I’ve verified (back to unmodified source) that it is indeed only the file eval.c that’s at fault. I’ve compiled mksh with gcc-12, then built that one file with gcc-13, linked with gcc-12, and it failed. I’m attaching an xz-compressed tarball with preprocessed and assembly (both AT&T and Intel, both -fverbose-asm and not) of the file with exactly identical options between GCC 12 and 13, in the hope of that being helpful to hunt this down.
[Bug target/111212] [13/14 Regression] internal compiler error: in extract_insn, at recog.cc:2791
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111212 --- Comment #2 from Mathieu Malaterre --- reduced: % g++ -maltivec -mcpu=power8 -O2 -c testcase.i testcase.i:15:30: warning: '{anonymous}::m {anonymous}::n(a) [with f = short int]' used but never defined 15 | template m n(a); | ^ testcase.i: In function 'void f::o::b()': testcase.i:66:25: error: unrecognizable insn: 66 | void b() { bo(bj()); } | ^ (insn 14 10 15 2 (set (reg:DI 127) (ashift:DI (reg:DI 126) (const_int 56 [0x38]))) "testcase.i":61:8 -1 (nil)) during RTL pass: vregs testcase.i:66:25: internal compiler error: in extract_insn, at recog.cc:2791 0x10813967 internal_error(char const*, ...) ???:0 0x10813a77 fancy_abort(char const*, int, char const*) ???:0 0x1041cb67 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) ???:0 0x1041cba3 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) ???:0 0x10bc12b7 extract_insn(rtx_insn*) ???:0 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See for instructions. 
with: % cat testcase.i typedef int a; typedef short b; namespace c { template struct g { using h = e &; }; template using i = typename g::h; template struct k { using h = i; }; template class ad; template class ad { public: typename k::h operator[](a); }; } // namespace c namespace { template using m = c::ad; template m n(a); } // namespace #pragma GCC target "cpu=power10" namespace f { namespace o { template struct p { using f = q; }; namespace detail { template struct aq { using h = p; }; template struct at { static constexpr a au = 0; using h = typename aq::h; }; } // namespace detail template using ax = typename detail::at::h; template using az = typename ay::f; namespace detail { template struct be { static void bf(a, a) { ax d; bd()(f(), d); } }; } // namespace detail template class bg { public: template void operator()(f) { a bh; constexpr a bi = as; constexpr a bc{}; detail::be::bf(1, bh); } }; template class bj { public: template void operator()(f r) { bg()(r); } }; template void bl(bk bm) { bm(b()); } template void bn(bk bm) { bl(bm); } template void bo(bk bm) { bn(bm); } struct s { template void operator()(f, ay) { ay bp; using bq = az; a br; auto bs = n(br); bq bt[]{8, 5, 4, 4, 5, 4, 9, 8, 5}; for (a j;;) bs[j] = bt[j]; } }; void b() { bo(bj()); } } // namespace o } // namespace f
[Bug c/111059] [11/12/13/14 Regression] ICE: in gimplify_expr, at gimplify.cc:17253
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111059 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #4 from Jakub Jelinek --- void f() { (_Bool) (0 / 0); } ICEs too, so I think the problem is elsewhere.
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #11 from Thorsten Glaser --- OK, to summarise: When using the original code but providing a wrapper function (in a separate CU) for strchr, it works. When replacing the strchr with strlen (which GCC also does), it fails even without the presence of dietlibc’s strlen. (And yes, disassembly of main.o (where I added it) shows no call to dietlibc from xstrlen.) This doesn’t seem to be coupled to the name of the function (the wrapper functions are called cstrchr and xstrlen, so the compiler cannot make any assumptions about them).
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #10 from Thorsten Glaser --- oh no, wait, that was for strchr… the strlen one… but, yeah, that too: extern size_t xstrlen(const char *s); and changing the line again to… sp += xstrlen(sp) + 1; … and adding in another .c file: size_t xstrlen(const char *s) { register const char *cp = s; while (*cp++ != '\0') ; return (--cp - s); } And it still fails.
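For reference, the two ways of stepping past the first NUL that are being compared in this thread can be put side by side as below. xstrlen is taken from the comment above; skip_strchr and skip_xstrlen are hypothetical wrappers added here for illustration, and plain strchr stands in for mksh's cstrchr wrapper, which is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* xstrlen as posted in comment #10 (minus the register keyword). */
size_t xstrlen(const char *s)
{
    const char *cp = s;

    while (*cp++ != '\0')
        ;
    return (size_t)(--cp - s);
}

/* Hypothetical wrapper for the original form: sp = strchr(sp, '\0') + 1; */
const char *skip_strchr(const char *sp)
{
    return strchr(sp, '\0') + 1;
}

/* Hypothetical wrapper for the rewritten form: sp += xstrlen(sp) + 1; */
const char *skip_xstrlen(const char *sp)
{
    return sp + xstrlen(sp) + 1;
}
```

On a correctly working toolchain both wrappers should yield the same pointer for any NUL-terminated input, which is why a difference in behaviour points at the code generated around the call rather than at either helper.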
[Bug target/111165] [13 regression] builtin strchr miscompiles on Debian/x32 with dietlibc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111165 --- Comment #9 from Thorsten Glaser --- > Does providing your own (trivially correct) strlen implementation in a > separate CU also fix the issue? Even providing one that just calls dietlibc’s (in a separate CU) fixes the issue, so I’m very sure it’s not that, but probably some codegen surrounding the call.
[Bug tree-optimization/111211] No warning for iterator going out of scope for writing to array of inline-asm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211 Andrew Pinski changed: What|Removed |Added Keywords||inline-asm Summary|No warning for iterator |No warning for iterator |going out of scope |going out of scope for ||writing to array of ||inline-asm --- Comment #5 from Andrew Pinski --- Note this is only an issue with inline-asm really, and only if we write directly into the array. If we change the code slightly:
```
#include <stdint.h>

int foo2 (uint64_t ddr0_addr_access)
{
    uint64_t check[1] = {0};

    for (int k = 0; k < 8; k += 1)
    {
        int t;
        asm volatile ("nop" : "=r"(t) : "r"(ddr0_addr_access));
        check[k] = t;
    }
    return 0;
}
```
GCC does warn (though slightly different):
```
:11:18: warning: iteration 1 invokes undefined behavior [-Waggressive-loop-optimizations]
   11 |         check[k] = t;
      |                 ~^~~
:7:23: note: within this loop
    7 |     for (int k = 0; k < 8; k += 1)
      |                     ~~^~~
```
[Bug libstdc++/104167] Implement C++20 std::chrono::utc_clock, std::chrono::tzdb etc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104167 --- Comment #10 from Christophe Lyon --- (In reply to Jonathan Wakely from comment #9) > (In reply to Christophe Lyon from comment #8) > > On arm-eabi targets (thus, using newlib), we've noticed new errors: > > New since when? These files haven't changed in the last two weeks. The bisection pointed to the patch in comment #6.
[Bug tree-optimization/111146] Some patterns in match.pd are no longer needed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111146 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|--- |14.0 Status|ASSIGNED|RESOLVED --- Comment #2 from Andrew Pinski --- Fixed.
[Bug tree-optimization/111146] Some patterns in match.pd are no longer needed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111146 --- Comment #1 from CVS Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:cbde03abe5dbba13b992a3b610efe43aefc0e234 commit r14-3527-gcbde03abe5dbba13b992a3b610efe43aefc0e234 Author: Andrew Pinski Date: Sun Aug 27 17:04:04 2023 -0700 MATCH: Remove redundant pattern for `(x | y) & ~x` After r14-2885-gb9237226fdc938, this pattern becomes redundant as we match it using bitwise_inverted_equal_p. There is already a testcase (gcc.dg/nand.c) for this pattern and it still passes after the removal. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/111146 * match.pd (`(x | y) & ~x`, `(x & y) | ~x`): Remove redundant pattern.
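As a quick sanity check of the algebra behind the two expressions named in the ChangeLog, the sketch below verifies exhaustively over 8-bit values that `(x | y) & ~x` equals `y & ~x` and that `(x & y) | ~x` equals `y | ~x`. This only illustrates the Boolean identities; it makes no claim about how match.pd expresses or implements the simplification, and the function name check_identities is hypothetical.

```c
#include <assert.h>

/* Hypothetical check: returns 1 if both identities hold for every
   pair of 8-bit values, 0 otherwise. */
int check_identities(void)
{
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            unsigned nx = ~x & 0xffu;   /* 8-bit complement of x */

            /* (x | y) & ~x  ==  y & ~x */
            if (((x | y) & nx) != (y & nx))
                return 0;

            /* (x & y) | ~x  ==  y | ~x */
            if (((x & y) | nx) != (y | nx))
                return 0;
        }
    }
    return 1;
}
```

Because the identities are bitwise, holding for all 8-bit inputs implies they hold per bit and therefore for wider integer types as well.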
[Bug libstdc++/104167] Implement C++20 std::chrono::utc_clock, std::chrono::tzdb etc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104167 --- Comment #9 from Jonathan Wakely --- (In reply to Christophe Lyon from comment #8) > On arm-eabi targets (thus, using newlib), we've noticed new errors: New since when? These files haven't changed in the last two weeks.
[Bug c/111211] No warning for iterator going out of scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211 --- Comment #4 from Andrew Pinski --- (In reply to Lehua Ding from comment #3) > (In reply to Richard Biener from comment #2) > > We diagnose this after unrolling, so the difference is whether we unroll or > > not. > > But based on the assembly code it looks like both are unrolled. > > foo: > nop > nop > nop > nop > nop > nop > nop > xor eax, eax > ret > foo2: > nop > nop > nop > nop > nop > nop > nop > nop > xor eax, eax > ret They are unrolled at different times in the pipeline, and the warning happens before the second unrolling.
[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210 --- Comment #5 from Shaohua Li --- Thanks for all your comments!
[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210 --- Comment #4 from Alexander Monakov --- The testcase is small enough to notice the issue by inspection. Note that you get the "expected" answer with -fno-strict-aliasing, and as explained in https://gcc.gnu.org/bugs/ it is one of the things you should check when submitting a bug report: "Before reporting that GCC compiles your code incorrectly, compile it with gcc -Wall -Wextra and see whether this shows anything wrong with your code. Similarly, if compiling with -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations makes a difference, or if compiling with -fsanitize=undefined produces any run-time errors, then your code is probably not correct."
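The testcase from this bug is not reproduced in this thread, so the following is only a generic sketch of the kind of rewrite that tends to make code independent of -fno-strict-aliasing: copying the object representation with memcpy instead of reading it through a pointer of an unrelated type. The function name float_bits is hypothetical, and the example assumes a 32-bit float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example: read the bit pattern of a float without
   dereferencing a uint32_t pointer that aliases a float object.
   Writing *(uint32_t *)&f instead would violate the aliasing rules
   that -fstrict-aliasing lets the optimizer rely on. */
uint32_t float_bits(float f)
{
    uint32_t u;

    _Static_assert(sizeof u == sizeof f, "float is assumed to be 32 bits");
    memcpy(&u, &f, sizeof u);   /* well-defined byte-wise copy */
    return u;
}
```

If a program's output changes between the default options and -fno-strict-aliasing, looking for pointer casts of this kind is usually a reasonable first step before filing a report.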
[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66 Richard Biener changed: What|Removed |Added CC||guojiufu at gcc dot gnu.org, ||sayle at gcc dot gnu.org --- Comment #6 from Richard Biener --- Roger was working on TImode incoming(?) argument code generation, this is TImode outgoing argument code generation where we produce for 32bit parts 7: NOTE_INSN_BASIC_BLOCK 2 2: r84:SI=di:SI 3: r85:SI=si:SI 4: r86:SI=dx:SI 5: r87:SI=cx:SI 6: NOTE_INSN_FUNCTION_BEG 9: r88:DI=zero_extend(r84:SI) 10: r89:DI=r82:TI#0 11: r91:DI=0x 12: {r90:DI=r89:DI&r91:DI;clobber flags:CC;} 13: {r92:DI=r90:DI|r88:DI;clobber flags:CC;} 14: r82:TI=r82:TI&<0x,0>|zero_extend(r92:DI) 15: r93:DI=zero_extend(r85:SI) 16: {r94:DI=r93:DI<<0x20;clobber flags:CC;} 17: r95:DI=r82:TI#0 18: r96:DI=zero_extend(r95:DI#0) 19: {r97:DI=r96:DI|r94:DI;clobber flags:CC;} 20: r82:TI=r82:TI&<0x,0>|zero_extend(r97:DI) 21: r98:DI=zero_extend(r86:SI) 22: r99:DI=r82:TI#8 23: r101:DI=0x 24: {r100:DI=r99:DI&r101:DI;clobber flags:CC;} 25: {r102:DI=r100:DI|r98:DI;clobber flags:CC;} 26: r82:TI=r82:TI&<0,0x>|zero_extend(r102:DI)<<0x40 27: r103:DI=zero_extend(r87:SI) 28: {r104:DI=r103:DI<<0x20;clobber flags:CC;} 29: r105:DI=r82:TI#8 30: r106:DI=zero_extend(r105:DI#0) 31: {r107:DI=r106:DI|r104:DI;clobber flags:CC;} 32: r82:TI=r82:TI&<0,0x>|zero_extend(r107:DI)<<0x40 33: r108:DI=r82:TI#0 34: r109:DI=r82:TI#8 35: di:DI=r108:DI 36: si:DI=r109:DI 37: ax:DI=call [`do_smth_with_4_u32'] argc:0 and we fail to dissect "backwards" from the 33: r108:DI=r82:TI#0 34: r109:DI=r82:TI#8 subregs. Possibly one issue is that we re-use r82. The dual-use of r82 at the end also poses issues as combine tries to match things like (parallel [ (set (reg:DI 108 [ D.2865 ]) (subreg:DI (reg:TI 82 [ D.2865 ]) 0)) (set (reg:TI 82 [ D.2865 ]) (ior:TI (and:TI (reg:TI 82 [ D.2865 ]) (const_wide_int 0x0)) (ashift:TI (zero_extend:TI (reg:DI 107)) (const_int 64 [0x40] ]) but fails to "rename" r82 to split the parallel. 
At RTL expansion time we store to D.2865 where its DECL_RTL is r82:TI so we can hardly fix it there. Only a later pass could figure out that each of the insns fully defines the reg. Jiufu Guo is working to improve what we choose for DECL_RTL, but for incoming params / outgoing return. This is a case where we could, with -fno-tree-vectorize, improve DECL_RTL for an automatic var and choose not TImode but something like a (concat:TI reg:DI reg:DI).
[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66 --- Comment #5 from gnu_bugzilla_gcc at catelyn dot tech --- (In reply to Richard Biener from comment #4) > note the situation is difficult to rectify - ideally the vectorizer > would see that we require two 64bit register pieces but it doesn't - it sees > we store into memory. right, I figured that might have been what was going on, given some of the related issues, the vectorizer incorrectly calculating the cost beforehand > I'll note the non-vectorized code is also far from optimal. clang > produces the following which is faster by more of the delta that > the vectorized version is slower compared to the scalar GCC variant. I did notice that the GCC -Os and clang -O3 versions were different, didn't realize that it was by that much, interesting
[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #3 from Xi Ruoyao --- (In reply to Shaohua Li from comment #2) > (In reply to Alexander Monakov from comment #1) > > 'c' is called with 'd' pointing to 'long e[2]', so > > > > return *(int *)(d + 1); > > > > is an aliasing violation (dereferencing a pointer to an incompatible type). > > Thanks for the quick diagnosis. I tried to enable -Wall -Wextra -pedantic > but got no warning about the test case. Could you share how you diagnose > this issue? The red banner in the bug creation page says clearly: "Similarly, if compiling with -fno-strict-aliasing -fwrapv makes a difference, your code probably is not correct."
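The violation quoted above reads an `int` through a pointer into storage whose type is `long`. A common way to get the same bytes without breaking the aliasing rules is to copy them with `memcpy`. The following is a minimal sketch under assumptions: the original testcase is not shown in this thread, so the function name `read_int_via_memcpy` and the parameter type `const long *` are hypothetical stand-ins for the reported `c` and `d`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the reported function: instead of
 *     return *(int *)(d + 1);
 * copy the first sizeof(int) bytes of d[1] into an int object.
 * memcpy accesses the storage as bytes, which is always permitted. */
int read_int_via_memcpy(const long *d)
{
    int value;
    memcpy(&value, d + 1, sizeof value);
    return value;
}
```

With this form the result should not depend on -fno-strict-aliasing. Which bytes of the `long` end up in `value` still depends on the target's endianness and on `sizeof(long)`.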
[Bug libstdc++/104167] Implement C++20 std::chrono::utc_clock, std::chrono::tzdb etc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104167 Christophe Lyon changed: What|Removed |Added CC||clyon at gcc dot gnu.org --- Comment #8 from Christophe Lyon --- On arm-eabi targets (thus, using newlib), we've noticed new errors: FAIL: std/time/clock/gps/io.cc (test for excess errors) UNRESOLVED: std/time/clock/gps/io.cc compilation failed to produce executable FAIL: std/time/clock/tai/io.cc (test for excess errors) UNRESOLVED: std/time/clock/tai/io.cc compilation failed to produce executable The logs say: FAIL: std/time/clock/gps/io.cc (test for excess errors) Excess errors: ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function `std::filesystem::current_path(std::filesystem::__cxx11::path const&, std::error_code&)': /libstdc++-v3/src/c++17/fs_ops.cc:806:(.text._ZNSt10filesystem12current_pathERKNS_7__cxx114pathE+0x10): undefined reference to `chdir' ld: /libstdc++-v3/src/c++17/fs_ops.cc:806:(.text._ZNSt10filesystem12current_pathERKNS_7__cxx114pathERSt10error_code+0x6): undefined reference to `chdir' ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function `(anonymous namespace)::create_dir(std::filesystem::__cxx11::path const&, std::filesystem::perms, std::error_code&)': /libstdc++-v3/src/c++17/fs_ops.cc:583:(.text._ZN12_GLOBAL__N_110create_dirERKNSt10filesystem7__cxx114pathENS0_5permsERSt10error_code+0xa): undefined reference to `mkdir' ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function `std::filesystem::create_directory(std::filesystem::__cxx11::path const&, std::error_code&)': /libstdc++-v3/src/c++17/fs_ops.cc:583:(.text._ZNSt10filesystem16create_directoryERKNS_7__cxx114pathERSt10error_code+0xe): undefined reference to `mkdir' ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function `std::filesystem::permissions(std::filesystem::__cxx11::path const&, std::filesystem::perms, std::filesystem::perm_options, std::error_code&)': 
/libstdc++-v3/src/c++17/fs_ops.cc:1134:(.text._ZNSt10filesystem11permissionsERKNS_7__cxx114pathENS_5permsENS_12perm_optionsERSt10error_code+0x7c): undefined reference to `chmod' ld: /libstdc++-v3/src/c++17/fs_ops.cc:1134:(.text._ZNSt10filesystem11permissionsERKNS_7__cxx114pathENS_5permsENS_12perm_optionsERSt10error_code+0x9c): undefined reference to `chmod' ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function `std::filesystem::current_path[abi:cxx11](std::error_code&)': /libstdc++-v3/src/c++17/fs_ops.cc:750:(.text._ZNSt10filesystem12current_pathB5cxx11ERSt10error_code+0x22): undefined reference to `pathconf' ld: /libstdc++-v3/src/c++17/fs_ops.cc:769:(.text._ZNSt10filesystem12current_pathB5cxx11ERSt10error_code+0x54): undefined reference to `getcwd' ld: /arm-eabi/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o): in function `std::filesystem::do_copy_file(char const*, char const*, std::filesystem::copy_options_existing_file, stat*, stat*, std::error_code&)': /libstdc++-v3/src/c++17/../filesystem/ops-common.h:553:(.text._ZNSt10filesystem12do_copy_fileEPKcS1_NS_26copy_options_existing_fileEP4statS4_RSt10error_code+0x114): undefined reference to `chmod' collect2: error: ld returned 1 exit status BTW I noticed the same error messages for other tests (eg. std/time/clock/gps/1.cc)
[Bug c/111210] Wrong code at -Os on x86_64-linux-gnu since r12-4849-gf19791565d7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111210 --- Comment #2 from Shaohua Li --- (In reply to Alexander Monakov from comment #1) > 'c' is called with 'd' pointing to 'long e[2]', so > > return *(int *)(d + 1); > > is an aliasing violation (dereferencing a pointer to an incompatible type). Thanks for the quick diagnosis. I tried to enable -Wall -Wextra -pedantic but got no warning about the test case. Could you share how you diagnose this issue?
[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66 Richard Biener changed: What|Removed |Added Depends on||101926 --- Comment #4 from Richard Biener --- Your benchmark confirms the vectorized variant is slower, on a 7900X it's both the memory roundtrip and the gpr->xmm move causing it. perf shows |turn_into_struct(): 1 | movd %edi,%xmm1 3 | movd %esi,%xmm4 4 | movd %edx,%xmm0 95 | movd %ecx,%xmm3 6 | punpckldq %xmm4,%xmm1 2 | punpckldq %xmm3,%xmm0 1 | movdqa %xmm1,%xmm2 | punpcklqdq %xmm0,%xmm2 5 | movaps %xmm2,-0x18(%rsp) 63 | mov -0x18(%rsp),%rdi 70 | mov -0x10(%rsp),%rsi 47 | jmp 400630 note the situation is difficult to rectify - ideally the vectorizer would see that we require two 64bit register pieces but it doesn't - it sees we store into memory. I'll note the non-vectorized code is also far from optimal. clang produces the following which is faster by more of the delta that the vectorized version is slower compared to the scalar GCC variant. turn_into_struct: # @turn_into_struct .cfi_startproc # %bb.0: # kill: def $ecx killed $ecx def $rcx # kill: def $esi killed $esi def $rsi shlq $32, %rsi movl %edi, %edi orq %rsi, %rdi shlq $32, %rcx movl %edx, %esi orq %rcx, %rsi jmp do_smth_with_4_u32 # TAILCALL Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101926 [Bug 101926] [meta-bug] struct/complex/other argument passing and return should be improved
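The clang output above amounts to two shift-and-or sequences, one per 64-bit register. A minimal C sketch of that computation follows. The struct `packed128` and the function `pack_4_u32` are hypothetical names chosen for illustration; the reporter's actual struct layout and the signature of `turn_into_struct` are not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte result, viewed as the two 64-bit pieces that
 * would travel in %rdi and %rsi in the listing above. */
struct packed128 {
    uint64_t lo;
    uint64_t hi;
};

/* Packs four 32-bit values into two 64-bit words using only scalar
 * operations: zero-extend the even element, shift the odd element
 * left by 32, and OR them together. */
struct packed128 pack_4_u32(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    struct packed128 r;
    r.lo = (uint64_t)a | ((uint64_t)b << 32);
    r.hi = (uint64_t)c | ((uint64_t)d << 32);
    return r;
}
```

On a little-endian target this gives the same bit pattern as storing four consecutive `uint32_t` members, which is presumably why the scalar form can avoid the memory round trip seen in the perf output.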
[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57 --- Comment #7 from Martin Jambor --- (In reply to Jan Hubicka from comment #4) > So here ipa-modref declares the field dead, while ipa-prop determines its > value even if it is unused and makes it used later? This is what I wanted to ask about. Looking at the dumps, ipa-modref knows it is "killed." Is that enough or does it need to be also not read to be known to be useless? > > I think dead argument is probably better than optimizing out one store, so I > think ipa-prop, however question is how to detect this reliably. > > ipa-modref has update_signature which updates summaries after ipa-sra work, > so it is also place to erase the info about parameter being dead from the > summary. This is what I have been looking at last week and where I'd like to plug such mechanism in so that it is not even streamed from WPA.
[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57 --- Comment #6 from Martin Jambor --- (In reply to Richard Biener from comment #5) > I think if IPA modref declares the argument dead at the call site then IPA > CP/SRA cannot declare it known constant. It is declared "killed" by the function. I still need to figure out whether that is all I need or whether the fact that it is not read either is the combination I am after. But I agree that IPA-CP should refrain from propagating clearly unneeded info in that case. > > Now, I wonder why IPA CP/SRA does not replace the known constant parameter > with an automatic var like > > point.constprop.isra (double ISRA.1740, int & restrict ipoint, double & > restrict x, double & restrict y, double & restrict z, int & restrict istat) > { > ... > const int istat.local = 0; > istat = &istat.local; > > ? So if not all uses of 'istat' get resolved we avoid generating wrong > code. The expense is a constant pool entry (if not all uses are removed), > but I think that's OK. It would also work for aggregates. It would also > relieve IPA-CP modification phase from doing anything but trivial value > replacement (in case the arg isn't a pointer). I'm afraid I don't understand. Even in this particular case, istat is checked by the caller and the callee can assign to it also other values, not just the one which happens to be what it is initialized to by the caller - and in the original code it does when there is an error - those writes cannot be redirected to a local variable.
[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66 --- Comment #3 from gnu_bugzilla_gcc at catelyn dot tech --- (In reply to Richard Biener from comment #1) > Unless you can come up with an actual benchmark showing the vector code is > slower I'd say it's not. Given it's smaller it should win on the icache > side if not executed frequently as well. I'm not an expert in benchmarking C, so my benchmark may be incorrect. I compiled the same (attached preprocessed) file with -O2, -O3, and -Os into an object file, and compiled a benchmarking file into an object as well (to avoid variance caused by the benchmarking file being compiled with different optimization levels). I added a very simple implementation for `do_smth_with_4_u32` and ran the `turn_into_struct` function in a hot loop, with varying (pre-generated) input data, storing the result in an array. I timed this hot loop using `(float)clock()/CLOCKS_PER_SEC;` at the start and end, then added up the calculated results to ensure all three programs get the same result. On my machine (Ryzen 9 5900X) the -Os version takes ~.36s, while the -O2 and -O3 versions take ~.43 and ~.42 seconds. I tried both -O2 and -O3 to get a slightly better view of the typical variance between program runs; their times are very similar, but the -Os version is a decent amount faster (around 16%, which I'd assume is significant). I've added the preprocessed benchmark file as well, which I then compiled with -mtune=generic and -march=x86-64 to match the system-under-test.
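The measurement procedure described above (pre-generated inputs, a timed hot loop that stores each result, then a sum over the results as a cross-check) can be sketched roughly as follows. This is not the attached benchmark: `function_under_test` is a hypothetical stand-in for `turn_into_struct`, `run_benchmark` is a hypothetical name, and the input generator is an arbitrary linear congruential sequence chosen only to make runs repeatable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for the function being measured. In the real
 * setup it would live in a separately compiled object file. */
static uint32_t function_under_test(uint32_t a, uint32_t b,
                                    uint32_t c, uint32_t d)
{
    return a ^ b ^ c ^ d;
}

/* Times n calls over pre-generated inputs, writes the elapsed CPU time
 * in seconds to *elapsed_seconds, and returns the sum of all results
 * so that builds at different optimization levels can be compared.
 * Returns 0 with *elapsed_seconds == 0.0 if allocation fails. */
uint64_t run_benchmark(size_t n, double *elapsed_seconds)
{
    uint32_t *inputs = malloc(4 * n * sizeof *inputs);
    uint32_t *results = malloc(n * sizeof *results);
    uint64_t checksum = 0;

    *elapsed_seconds = 0.0;
    if (inputs == NULL || results == NULL) {
        free(inputs);
        free(results);
        return 0;
    }

    /* Generate the inputs before timing starts. */
    uint32_t state = 12345u;
    for (size_t i = 0; i < 4 * n; i++) {
        state = state * 1664525u + 1013904223u;
        inputs[i] = state;
    }

    clock_t start = clock();
    for (size_t i = 0; i < n; i++) {
        results[i] = function_under_test(inputs[4 * i], inputs[4 * i + 1],
                                         inputs[4 * i + 2], inputs[4 * i + 3]);
    }
    clock_t end = clock();
    *elapsed_seconds = (double)(end - start) / CLOCKS_PER_SEC;

    /* Summing the stored results keeps the loop's work observable. */
    for (size_t i = 0; i < n; i++)
        checksum += results[i];

    free(inputs);
    free(results);
    return checksum;
}
```

Since `clock()` reports processor time at a coarse granularity, a loop this cheap needs a large `n` (or repeated runs) before differences of the size reported above become distinguishable from noise.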
[Bug target/111166] gcc unnecessarily creates vector operations for packing 32 bit integers into struct (x86_64)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66 --- Comment #2 from gnu_bugzilla_gcc at catelyn dot tech --- Created attachment 55807 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55807&action=edit preprocessed file containing the benchmark code I used I compiled this code (although using includes for clock, CLOCKS_PER_SEC, time_t, printf, and ) to an object and linked it with the bug-triggering file (compiled with -Os, -O2 and -O3 to test all those options), to measure the speed of the generated implementations of the bug-triggering file
[Bug analyzer/111213] New: -Wanalyzer-out-of-bounds false negative with `return arr[9];`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111213 Bug ID: 111213 Summary: -Wanalyzer-out-of-bounds false negative with `return arr[9];` Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: analyzer Assignee: dmalcolm at gcc dot gnu.org Reporter: dale.mengli.ming at proton dot me Target Milestone: --- Hi, this case (https://godbolt.org/z/98PMz1KKz) contains an out-of-bound error (stmt: `return arr[9];`). At -O0, the analyzer can report this warning. However, at -O1, -O2, and -O3, the analyzer doesn't report that. After removing the `static` keyword (https://godbolt.org/z/qKohK3eeY), the analyzer can report this warning at -O1, -O2, and -O3.
[Bug bootstrap/100932] autoconf error: possibly undefined macro: GCC_AC_ENABLE_DECIMAL_FLOAT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100932 Nicolas Boulenguez changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Nicolas Boulenguez --- Quite ironically, given the only answer so far, somebody has investigated the same issue, duplicated the effort, and applied almost the same fix. https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=25861cf3a88a07c8dca3fb32d098c0ad756bbe38
[Bug c/111211] No warning for iterator going out of scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211 --- Comment #3 from Lehua Ding --- (In reply to Richard Biener from comment #2) > We diagnose this after unrolling, so the difference is whether we unroll or > not. But based on the assembly code it looks like both are unrolled. foo: nop nop nop nop nop nop nop xor eax, eax ret foo2: nop nop nop nop nop nop nop nop xor eax, eax ret
[Bug c/111211] No warning for iterator going out of scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111211 --- Comment #2 from Richard Biener --- We diagnose this after unrolling, so the difference is whether we unroll or not.
[Bug target/111212] [13/14 Regression] internal compiler error: in extract_insn, at recog.cc:2791
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111212 Richard Biener changed: What|Removed |Added Target Milestone|--- |13.3 Summary|internal compiler error: in extract_insn, at recog.cc:2791 |[13/14 Regression] internal compiler error: in extract_insn, at recog.cc:2791
[Bug target/111212] internal compiler error: in extract_insn, at recog.cc:2791
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111212 --- Comment #1 from Mathieu Malaterre --- Compilation line: % /usr/bin/c++ -freport-bug -DHWY_STATIC_DEFINE -DTOOLCHAIN_MISS_ASM_HWCAP_H -I/home/malat/highway -maltivec -mcpu=power8 -O2 -g -DNDEBUG -fPIE -fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" -fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor -fmath-errno -fno-exceptions -DHWY_IS_TEST=1 -DGTEST_HAS_PTHREAD=1 -MD -MT CMakeFiles/table_test.dir/hwy/tests/table_test.cc.o -MF CMakeFiles/table_test.dir/hwy/tests/table_test.cc.o.d -o CMakeFiles/table_test.dir/hwy/tests/table_test.cc.o -c /home/malat/highway/hwy/tests/table_test.cc