[Bug tree-optimization/106912] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032 since r13-1575-gcf3a120084e94614
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106912

--- Comment #3 from Richard Biener ---
OK, it's a late IPA pass doing the clones it seems. The scalar node got the 'const' stripped btw, but the call fntype still has it via the attributes. It loses 'const' by

Old value = 252968993
New value = 251920417
set_const_flag_1 (node=, set_const=false, looping=false, changed=0x7fffda5f)
    at /home/rguenther/src/trunk/gcc/cgraph.cc:2696
2696	  DECL_LOOPING_CONST_OR_PURE_P (node->decl) = false;
(gdb) bt
#0  set_const_flag_1 (node=, set_const=false, looping=false, changed=0x7fffda5f)
    at /home/rguenther/src/trunk/gcc/cgraph.cc:2696
#1  0x00da633f in cgraph_node::set_const_flag (this=, set_const=false, looping=false)
    at /home/rguenther/src/trunk/gcc/cgraph.cc:2789
#2  0x015e2910 in tree_profiling ()
    at /home/rguenther/src/trunk/gcc/tree-profile.cc:818
#3  0x015e2b9f in (anonymous namespace)::pass_ipa_tree_profile::execute (this=0x42a0c70)
    at /home/rguenther/src/trunk/gcc/tree-profile.cc:888

but the IL happily continues to treat the calls as 'const' because flags_from_decl_or_type on the call fntype has

849	  else if (TYPE_P (exp))
850	    {
851	      if (TYPE_READONLY (exp))
852	        flags |= ECF_CONST;

note that __attribute__((pure)) is not duplicated on the type and so the IPA profile effect will change the IL in fixup_cfg (), rewriting virtual operands there. making things consistent.
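The mismatch described in the comment above (the flag is cleared on the declaration while the function type recorded on the call still carries it) can be illustrated with a toy model. This is only a sketch and not GCC code: `toy_type`, `toy_decl`, `TOY_ECF_CONST` and the `toy_*` functions are hypothetical stand-ins, loosely modelled on the `TYPE_READONLY` check quoted from flags_from_decl_or_type.

```c
#include <stdbool.h>

/* Toy model only: none of these are GCC's real data structures. */
#define TOY_ECF_CONST 1

struct toy_type
{
  bool readonly;            /* stands in for TYPE_READONLY on the fntype */
};

struct toy_decl
{
  bool is_const;            /* stands in for the 'const' flag on the decl */
  struct toy_type *type;    /* function type shared with the call statements */
};

/* Flags as seen when querying the declaration.  */
static int
toy_flags_from_decl (const struct toy_decl *d)
{
  return d->is_const ? TOY_ECF_CONST : 0;
}

/* Flags as seen when querying only the function type of a call.  */
static int
toy_flags_from_type (const struct toy_type *t)
{
  return t->readonly ? TOY_ECF_CONST : 0;
}

/* Clears 'const' on the declaration but leaves the type untouched,
   which is the kind of partial update the comment describes.  */
static void
toy_clear_const_on_decl (struct toy_decl *d)
{
  d->is_const = false;
}
```

After `toy_clear_const_on_decl`, a query through the declaration reports no const flag while a query through the type still reports it, so two parts of a compiler consulting different sources would disagree about the same call.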
[Bug target/106910] roundss not vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106910

--- Comment #3 from Hongtao.liu ---
> The backend should modernize itself, get rid of the
> ix86_builtin_vectorized_function parts for those functions and instead rely
> on define_expands with vector modes.

Indeed, let me do it.
[Bug c/106920] -Warray-bound false positive regression with -O2 or -Os and constant address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106920

--- Comment #2 from Dominique Martinet ---
Thanks for the very fast reply!

Since you mentioned null pointers, I now see this warning doesn't happen if I try with a larger constant; I just had bad luck that imx-atf uses an address < 4k...?

I checked the first dozen issues from the meta-bug (from the start of the open bugs list up to and including 86613), but there are just too many, and I didn't see a workaround in the ones I did open.

I can see catching bad casts being useful, but for low-level hardware code, accessing register addresses directly is the norm -- I'm not too worried now that I've noticed the <4k "rule", but there really can't be any assumption made with hardware, as seen here... (And NXP isn't exactly great at working with external entities; I tried reaching out for another compile fix with little success... but that's off-topic.)

Well, good to understand the reason behind that warning at least.
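The observation above can be sketched with two variants of the reduced test case from the original report. This is only an illustration: `LOW_ADDR` is the address from the report, while `HIGH_ADDR`, `load_entry_low`, `load_entry_high` and `below_4k` are hypothetical names chosen here, and the 4k threshold is the reporter's own observation rather than documented behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

typedef void hab_rvt_entry_t (void);

#define LOW_ADDR  0x908UL        /* address used in the report */
#define HIGH_ADDR 0x10000908UL   /* hypothetical larger address */

/* Same access pattern as the reduced test case.  Per the report,
   gcc 12 at -O2 with -Warray-bounds warns about this load.  */
hab_rvt_entry_t *
load_entry_low (void)
{
  return (hab_rvt_entry_t *) (*(unsigned long *) LOW_ADDR);
}

/* Identical code with a larger constant.  The comment above suggests
   the warning does not trigger here.  */
hab_rvt_entry_t *
load_entry_high (void)
{
  return (hab_rvt_entry_t *) (*(unsigned long *) HIGH_ADDR);
}

/* The "<4k" rule of thumb from the comment, written out explicitly.  */
static bool
below_4k (uintptr_t addr)
{
  return addr < 4096;
}
```

Neither load function is meant to be called on a hosted system, since the addresses are not mapped there. Compiling both with `-O2 -Warray-bounds` is enough to compare the diagnostics.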
[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11/12 Regression] Program  |[11/12/13 Regression]
                   |compiled with -O3 -mfma     |Program compiled with -O3
                   |produces different result   |-mfma produces different
                   |                            |result
             Blocks|                            |53947

--- Comment #5 from Richard Biener ---
(In reply to Martin Liška from comment #4)
> Fixed on master with r13-1450-gd2a89809452e.
> Started with r11-4637-gf5e18dd9c7dacc96.

I believe both are unrelated. The fix possibly caused a missed optimization while the cause exposed some opportunity. More analysis is needed here.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug tree-optimization/106909] [13 Regression] error: control flow in the middle of basic block since r13-2541-g78ef801b7263606d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106909

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #5 from Richard Biener ---
 [local count: 1073741824]:
  _80 = SR.96_116(D);
  # DEBUG this => SR.96_116(D)
  # DEBUG firstElement => ptrCopy_79(D)
  # DEBUG elementCount => sizeCopy_83(D)
  # DEBUG capacity => sizeCopy_83(D)
  # DEBUG INLINE_ENTRY dispose
  # DEBUG firstElement => ptrCopy_79(D)
  # DEBUG elementCount => sizeCopy_83(D)
  # DEBUG capacity => sizeCopy_83(D)
  # DEBUG disposer => SR.96_116(D)
  # DEBUG INLINE_ENTRY dispose
  _81 = MEM[(const struct ArrayDisposer *)SR.96_116(D)]._vptr.ArrayDisposer;
  _82 = *_81;
  __builtin_unreachable ();
  # DEBUG firstElement => NULL
  # DEBUG elementCount => NULL
  # DEBUG capacity => NULL
  # DEBUG disposer => NULL
  # DEBUG this => NULL
  # DEBUG firstElement => NULL
  # DEBUG elementCount => NULL
  # DEBUG capacity => NULL

after some folding. I fear this is the general gimple_build_builtin_unreachable which is now generally used but esp. folding should _not_ mark the call as control altering but leave that to CFG fixup (CFG cleanup doesn't catch this since it only looks at the last stmt of BBs). I'm fixing up in the use.
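Why a cleanup that only inspects the last statement misses this case can be shown with a toy model of a basic block. This is a sketch under simplifying assumptions, not GCC's verifier or CFG cleanup: `toy_stmt`, `toy_last_stmt_alters_control` and `toy_find_mid_block_control_flow` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy statement: its text plus whether it is marked control altering.  */
struct toy_stmt
{
  const char *text;
  bool ctrl_altering;
};

/* Mimics a cleanup that only looks at the last statement of a block.  */
static bool
toy_last_stmt_alters_control (const struct toy_stmt *bb, size_t n)
{
  return n > 0 && bb[n - 1].ctrl_altering;
}

/* Mimics a verifier: returns the index of the first control-altering
   statement that is not the last one in the block, or -1 if none.  */
static long
toy_find_mid_block_control_flow (const struct toy_stmt *bb, size_t n)
{
  for (size_t i = 0; i + 1 < n; i++)
    if (bb[i].ctrl_altering)
      return (long) i;
  return -1;
}
```

For a block shaped like the dump above, where `__builtin_unreachable ();` is marked control altering but is followed by debug statements, the last-statement check sees nothing to clean up while the verifier-style scan flags the call.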
[Bug c++/106921] New: [11/12.1] -O1 and -fipa-icf -fpartial-inlining causes wrong code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106921

            Bug ID: 106921
           Summary: [11/12.1] -O1 and -fipa-icf -fpartial-inlining causes
                    wrong code
           Product: gcc
           Version: 11.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lutztonineubert at gmail dot com
  Target Milestone: ---

Short summary:

The following code returns 1 if compiled with -O2 (which is wrong) and returns 0 if compiled without optimization.

```
#include <array>
#include <cstddef>
#include <exception>

#define GCC_VERSION (__GNUC__ * 10000 \
                     + __GNUC_MINOR__ * 100 \
                     + __GNUC_PATCHLEVEL__)

static_assert(GCC_VERSION == 110300);

template <size_t Bits>
class bitset {
private:
  using word_t = size_t;
  static constexpr size_t bits_per_word = sizeof(word_t) * 8;
  static constexpr size_t number_of_words = (Bits / bits_per_word) + (((Bits % bits_per_word) == 0) ? 0 : 1);

public:
  bool all_first(size_t n) const {
    {
      if (n > Bits) {
#ifdef RETURN_INSTEAD_TERMINATE
        return false;
#else
        std::terminate();
#endif
      }
      size_t i = 0;
      for (; n > bits_per_word; n -= bits_per_word, i++) {
        if (words_[i] != ~word_t{0}) {
          return false;
        }
      }
      word_t last_word = words_[i];
      for (; n != 0; n--) {
        if ((last_word & 1) != 1) {
          return false;
        }
        last_word >>= 1;
      }
      return true;
    }
  }

  void fill() noexcept {
    for (auto& word : words_) {
      word = ~word_t{0};
    }
  }

private:
  std::array<word_t, number_of_words> words_{};
};

volatile int X = 0;

int main() {
  if (X == 1) {
    bitset<123> bitset;
    static_cast<void>(bitset.all_first(123));
  } else {
    bitset<256> bitset;
    bitset.fill();
    if (!bitset.all_first(255)) {
      return 1;
    }
  }
  return 0;
}
```

See: https://gcc.godbolt.org/z/bEexjrKP4

This issue does not exist in GCC 10 or GCC > 12.1. I couldn't test whether it works in GCC 11.3.1 (or the trunk of it).

Additional:
* I could also trigger the issue with -O1 -fipa-icf -fpartial-inlining
* If we do a return false instead of a std::terminate, no wrong code is generated.

I am sorry, but I couldn't reduce the code any further - it already took a lot of time to figure out that this is a compiler bug.
[Bug c/106920] -Warray-bound false positive regression with -O2 or -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106920

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |diagnostic
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-09-13
             Blocks|                            |56456
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener ---
Confirmed, that was an intended change to catch errors with accessing a subobject of an object at nullptr. There's some related duplicate where we discuss workarounds.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456
[Bug 56456] [meta-bug] bogus/missing -Warray-bounds
[Bug target/106919] [13 Regression] RTL check: expected code 'set' or 'clobber', have 'if_then_else' in s390_rtx_costs, at config/s390/s390.cc:3672 on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106919

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0
[Bug tree-optimization/106914] [13 Regression] ICE in operator[], at vec.h:889 since r13-2288-g61c4c989034548f4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106914

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.0
           Priority|P3                          |P1
[Bug rtl-optimization/106913] [13 Regression] ICE in dump_bb_info, at cfg.cc:796 since r13-2263-gf71abacfed170852
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106913

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|marxin at gcc dot gnu.org   |rguenth at gcc dot gnu.org
   Target Milestone|---                         |13.0

--- Comment #2 from Richard Biener ---
That looks like we fail to clear an auto_bb_flag but verification should also catch that earlier ... huh. Maybe we fail to verify that for ENTRY/EXIT. I have a patch.
[Bug c/106920] New: -Warray-bound false positive regression with -O2 or -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106920

            Bug ID: 106920
           Summary: -Warray-bound false positive regression with -O2 or
                    -Os
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: npfhrotynz-ptnqh.myvf at noclue dot notk.org
  Target Milestone: ---

Hello,

I think I've run into a false positive on this file:
https://source.codeaurora.org/external/imx/imx-atf/tree/plat/imx/imx8m/hab.c?h=lf_v2.6

I could trim it down to this

#include

typedef void hab_rvt_entry_t(void);

int main() {
        hab_rvt_entry_t *a;
        a = ((hab_rvt_entry_t *)(*(unsigned long *)(0x908)));
        a();
        return 0;
}

$ gcc -O2 -Warray-bounds -c t.c
t.c: In function ‘main’:
t.c:7:34: warning: array subscript 0 is outside array bounds of ‘long unsigned int[0]’ [-Warray-bounds]
    7 |         a = ((hab_rvt_entry_t *)(*(unsigned long *)(0x908)));
      |                                 ~^~

According to godbolt this passed on 11.3 and starts emitting the warning on 12.1 (it doesn't have 12.0) and still emits it on trunk.

Note the warning requires -O2, -O3 or -Os to be emitted.

The problem seems to be that it considers an arbitrary address cast to u64* to be a u64[0]? If so, that might be a problem for quite a few embedded products, as that is quite common when dealing with hardware registers. (And who doesn't love products that compile with -Werror for release builds...)

Thanks!
[Bug tree-optimization/106912] [13 Regression] ICE in vect_transform_loops, at tree-vectorizer.cc:1032 since r13-1575-gcf3a120084e94614
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106912

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |jakub at gcc dot gnu.org
             Status|NEW                           |ASSIGNED
           Priority|P3                            |P1
   Target Milestone|---                           |13.0
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener ---
Confirmed, we have

  # .MEM = VDEF <.MEM>
  vect__5.57_58 = foo.simdclone.0 (vect__4.56_57);

here. IIRC I filed a bugreport about simdclones not being const when the scalar version is, in this case it's possibly IPA pure const not updating the clones before materializing them!? That said, the not vectorized variant is just

  _5 = foo (_4);

and without -fprofile-generate the vectorized variant also keeps 'const'.

I will look at this again after Cauldron. Have to dig to where the simdclone is actually generated.
[Bug target/106910] roundss not vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106910

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947
                 CC|                            |crazylht at gmail dot com
             Target|x86_64                      |x86_64-*-*

--- Comment #2 from Richard Biener ---
Probably missing patterns for V2SFmode here. Hmm, we don't seem to have any vector mode patterns here but possibly rely on ix86_builtin_vectorized_function which indeed doesn't have any V2SFmode support.

The vectorizer would go the direct internal fn way for those, querying the floor optab but the x86 backend only has scalar modes supported for the rounding optabs.

The backend should modernize itself, get rid of the ix86_builtin_vectorized_function parts for those functions and instead rely on define_expands with vector modes.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug testsuite/106345] Some ppc64le tests fail with -mcpu=power9 -mtune=power9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345

Kewen Lin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #14 from Kewen Lin ---
Should be fixed everywhere.
[Bug testsuite/106345] Some ppc64le tests fail with -mcpu=power9 -mtune=power9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345

--- Comment #13 from CVS Commits ---
The releases/gcc-10 branch has been updated by Kewen Lin:

https://gcc.gnu.org/g:12d28957b613d8c9b74e7841d73945025a7f0ccb

commit r10-10982-g12d28957b613d8c9b74e7841d73945025a7f0ccb
Author: Kewen Lin
Date:   Tue Sep 6 20:37:57 2022 -0500

    rs6000/test: Fix empty TU in some cases of effective targets [PR106345]

    As the failure of test case gcc.target/powerpc/pr92398.p9-.c in PR106345
    shows, some test sources for some powerpc effective targets use empty
    translation unit wrongly. The test sources could go with options like
    "-ansi -pedantic-errors", then those effective target checkings will fail
    unexpectedly with the error messages like:

      error: ISO C forbids an empty translation unit [-Wpedantic]

    This patch is to fix empty TUs with one dummy function definition
    accordingly.

            PR testsuite/106345

    gcc/testsuite/ChangeLog:

            * lib/target-supports.exp (check_effective_target_has_arch_pwr5): Add
            a function definition to avoid pedwarn about empty translation unit.
            (check_effective_target_has_arch_pwr6): Likewise.
            (check_effective_target_has_arch_pwr7): Likewise.
            (check_effective_target_has_arch_pwr8): Likewise.
            (check_effective_target_has_arch_pwr9): Likewise.
            (check_effective_target_has_arch_ppc64): Likewise.
            (check_effective_target_ppc_float128): Likewise.
            (check_effective_target_ppc_float128_insns): Likewise.
            (check_effective_target_powerpc_vsx): Likewise.

    (cherry picked from commit 7a43e52a48b6403a99d3e8ab3105869b4b3c081e)
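The kind of fix the commit message describes can be sketched as a probe source that never preprocesses down to an empty translation unit. This is only an illustration and not the text from target-supports.exp: `TOY_REQUIRE_FEATURE`, `TOY_FEATURE_MACRO` and `dummy` are hypothetical names.

```c
/* Hypothetical effective-target probe.  If the feature check passes (or is
   not requested), everything above the function preprocesses to nothing,
   so without a definition the translation unit would be empty and
   "-ansi -pedantic-errors" would reject it.  */
#if defined (TOY_REQUIRE_FEATURE) && !defined (TOY_FEATURE_MACRO)
#error feature not available
#endif

/* One dummy definition keeps the translation unit non-empty.  */
int
dummy (void)
{
  return 0;
}
```

With the dummy definition in place, the probe only fails when the `#error` branch is taken, which is the condition the check is meant to detect.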