[Bug ipa/102059] Incorrect always_inline diagnostic in LTO mode with #pragma GCC target("cpu=power10")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059
Kewen Lin changed:
What    |Removed                  |Added
Assignee|linkw at gcc dot gnu.org |meissner at gcc dot gnu.org
URL     |                         |https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591496.html
--- Comment #36 from Kewen Lin ---
Mike had one patch [1] under review for the power8 fusion piece, moving this under his name. Thanks Mike!
[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591496.html
[Bug middle-end/104854] -Wstringop-overread should not warn for strnlen, strndup and strncmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104854 --- Comment #8 from Siddhesh Poyarekar --- (In reply to Martin Sebor from comment #7)
> Moving warnings into the analyzer and scaling it up to be able to run by
> default, during development, sounds like a good long-term plan. Until that
That's not quite what I'm suggesting here. I'm not 100% convinced that those are the right heuristics at all; the size argument for strnlen, strndup and strncmp is not intended to describe the size of the passed strings. It is only recommended security practice that the *n* variant functions be used instead of their unconstrained relatives to mitigate overflows. In fact, in more common cases the size argument (especially in the case of strnlen and strncmp) may describe a completely different buffer or some other application-specific property.
This is different from -Wformat-overflow, where there is a clear relationship between the buffer, the format string and the string representation of input numbers, and all we're tweaking is the optimism level of the warnings. So it is not just a question of levels of verbosity/paranoia. In that context, using size to describe the underlying buffer of the source makes sense only for a subset of uses, making this heuristic quite noisy.
So what I'm actually saying is: the heuristic is too noisy, but if we insist on keeping it, it makes sense as an analyzer warning where the user *chooses* to look for pessimistic scenarios and is more tolerant of noisy heuristics.
> happens, rather than gratuitously removing warnings that we've added over
> the years, just because they fall short of the ideal 100% efficacy (as has
> been known and documented), making them easier to control seems like a
> better approach.
It's not just a matter of efficacy here IMO. The heuristic for strnlen, strncmp and strndup overreads is too loose for it to be taken seriously.
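To make the argument above concrete, here is a minimal sketch of two common uses where the bound says nothing about the size of the source buffer. The helper names `has_prefix` and `capped_name_length` and the limit `MAX_NAME` are hypothetical and chosen only for illustration; they do not come from the bug report.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if ARG starts with PREFIX.
   The bound passed to strncmp is the length of PREFIX, not the size
   of the buffer ARG points to, which this function never learns.  */
int has_prefix (const char *arg, const char *prefix)
{
  size_t n = strlen (prefix);
  return strncmp (arg, prefix, n) == 0;
}

/* Hypothetical application limit, unrelated to how any buffer
   was allocated.  */
enum { MAX_NAME = 64 };

/* Hypothetical helper: length of FIELD, capped at the application
   limit.  FIELD may live in a buffer much smaller than MAX_NAME.  */
size_t capped_name_length (const char *field)
{
  return strnlen (field, MAX_NAME);
}
```

In both functions the bound is an application-level property, so a heuristic that reads it as "the size of the source" would, under these assumptions, have nothing reliable to compare against.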
[Bug target/104894] [11/12 Regression] ICE with -fno-plt -mcpu=power10 on PowerPC64 LE Linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104894 Alan Modra changed: What|Removed |Added CC||amodra at gmail dot com Last reconfirmed||2022-03-15 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #1 from Alan Modra --- Regressed with 95f17e26112d8a0 "rs6000: Enable more sibcalls when TOC is not preserved". Likely "gcc_assert (INTVAL (cookie) == 0);" in rs6000_sibcall_aix can just be deleted.
[Bug target/103743] PPC: Inefficient equality compare for large 64-bit constants having only 16-bit relevant bits in high part
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103743 Jiu Fu Guo changed: What|Removed |Added CC||guojiufu at gcc dot gnu.org --- Comment #3 from Jiu Fu Guo --- For "in == 0x8000LL", it would be also ok with: rotldi %r9,%r3,16 cmpldi %cr0,%r9,32768 And it would be similar for "in == 0x8000LL" (highest bit and low48bits are all 1) rotldi %r9,%r3,16 cmpdi %cr0,%r9,-32768
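The rotate-then-compare idea can be checked in portable C. This is only a sketch: the constants in the comment above appear truncated, so the values `0x8000000000000000` and `0x8000FFFFFFFFFFFF` used here are assumptions picked to match the description (only the top 16 bits relevant, and "highest bit and low 48 bits all 1"). The function names are hypothetical.

```c
#include <stdint.h>

/* Rotate a 64-bit value left by N bits (0 < N < 64).  */
static inline uint64_t rotl64 (uint64_t x, unsigned n)
{
  return (x << n) | (x >> (64 - n));
}

/* Direct comparison against the assumed constant, whose only set
   bit is in the top 16 bits.  */
int eq_high_direct (uint64_t in)
{
  return in == UINT64_C (0x8000000000000000);
}

/* Same test after rotating the top 16 bits into the low 16 bits,
   so the constant fits an unsigned 16-bit immediate (32768).  */
int eq_high_rotated (uint64_t in)
{
  return rotl64 (in, 16) == UINT64_C (32768);
}

/* Second assumed constant: top bit set and low 48 bits all ones.
   After the rotate it reads as -32768 in a signed compare.  */
int eq_ones_rotated (uint64_t in)
{
  return (int64_t) rotl64 (in, 16) == -32768;
}
```

Because a rotate is a bijection on 64-bit values, the rotated comparison accepts exactly the same inputs as the direct one.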
[Bug c++/69623] Invalid deduction of non-trailing template parameter pack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69623 --- Comment #7 from jim x --- In a simple way, the rule just requires that, for a function template, the template parameter that is declared after a template parameter pack should either appear in the parameter-declaration-clause before the template pack(deducible) or just have a default argument. auto f(auto..., auto a, auto...) { return a; } IIUC, this is just disallowed since all arguments would only match the first function parameter pack.
[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #41 from Hongtao.liu --- (In reply to Richard Biener from comment #22) > (In reply to Hongtao.liu from comment #21) > > Now we have SLP node available in vector cost hook, maybe we can do sth in > > cost model to prevent vectorization when node's definition from big-size > > parameter. > > Note we vectorize a load here for which we do not pass down an SLP node. > But of course there's the stmt-info one could look at - but the issue > is that for SLP that doesn't tell you which part of the variable is accessed. > Also even if we were to pass down the SLP node we do not know exactly how > it is going to vectorize - but sure, we could play with some heuristics Then, we can't get exact offset between load address and store address.
[Bug target/80640] Missing memory side effect with __atomic_thread_fence (2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80640 Andrew Pinski changed: What|Removed |Added CC||tomash.brechko at gmail dot com --- Comment #12 from Andrew Pinski --- *** Bug 55690 has been marked as a duplicate of this bug. ***
[Bug target/55690] On some targets thread_fence is not a compiler barrier when memmodel != MEMMODEL_SEQ_CST
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55690 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #3 from Andrew Pinski --- I went and found this is a dup of bug 80640 which had a testcase in it while this one didn't until recently so closing as a dup of bug 80640. *** This bug has been marked as a duplicate of bug 80640 ***
[Bug c++/104924] bad_variant_access When using iostream and variant as modules
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104924 --- Comment #2 from Lorenzo Gomez --- (In reply to Andrew Pinski from comment #1) > C++ modules in GCC 11 (and it looks like 12 but some bugs have been fixed > there) are still considered experimental and your mileage with this feature > will vary. > > Note I have not checked to see if this has been fixed on the trunk for GCC > 12. This is a good point! I will try with GCC 12 and check whether this issue has been fixed when I get time. Thanks!
[Bug target/80640] Missing memory side effect with __atomic_thread_fence (2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80640 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |8.0
[Bug c++/104924] bad_variant_access When using iostream and variant as modules
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104924 --- Comment #1 from Andrew Pinski --- C++ modules in GCC 11 (and it looks like 12 but some bugs have been fixed there) are still considered experimental and your mileage with this feature will vary. Note I have not checked to see if this has been fixed on the trunk for GCC 12.
[Bug c++/104924] New: bad_variant_access When using iostream and variant as modules
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104924 Bug ID: 104924 Summary: bad_variant_access When using iostream and variant as modules Product: gcc Version: 11.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: yunior.eury at gmail dot com Target Milestone: --- When compiling the following: ``` export module Chapter2; import ; import ; export { class Object { std::variant var; }; } ``` I get the following error in gcc: /usr/local/include/c++/11.2.1/variant:1285:9: internal compiler error: in build_op_delete_call, at cp/call.c:7144 1285 | class bad_variant_access : public exception If I change the order of the imports, then it compiles fine: ``` export module Chapter2; import ; import ; export { class Object { std::variant var; }; }``` g++ version: Using built-in specs. COLLECT_GCC=g++ COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.2.0-7ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic 
--enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 11.2.0 (Ubuntu 11.2.0-7ubuntu2) I compile the stl header files with the following commands: ``` g++ -std=c++20 -fmodules-ts -xc++-system-header iostream g++ -std=c++20 -fmodules-ts -xc++-system-header variant ``` This looks very similar to Bug 103256. Is this behavior expected? Thanks in advance. Lorenzo
[Bug middle-end/104077] bogus/missing -Wdangling-pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104077 Bug 104077 depends on bug 104436, which changed state. Bug 104436 Summary: [12 Regression] spurious -Wdangling-pointer assigning local address to a class passed by value https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104436 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug middle-end/104436] [12 Regression] spurious -Wdangling-pointer assigning local address to a class passed by value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104436 Martin Sebor changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #5 from Martin Sebor --- Fixed in r12-7650.
[Bug middle-end/104436] [12 Regression] spurious -Wdangling-pointer assigning local address to a class passed by value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104436 --- Comment #4 from CVS Commits --- The master branch has been updated by Martin Sebor : https://gcc.gnu.org/g:373a2dc2be0089ae59b61202a6023458aaaf63d8 commit r12-7650-g373a2dc2be0089ae59b61202a6023458aaaf63d8 Author: Martin Sebor Date: Mon Mar 14 18:23:08 2022 -0600 Avoid -Wdangling-pointer for by-transparent-reference arguments [PR104436]. This change avoids -Wdangling-pointer for by-value arguments transformed into by-transparent-reference. Resolves: PR middle-end/104436 - spurious -Wdangling-pointer assigning local address to a class passed by value gcc/ChangeLog: PR middle-end/104436 * gimple-ssa-warn-access.cc (pass_waccess::check_dangling_stores): Check for warning suppression. Avoid by-value arguments transformed into by-transparent-reference. gcc/testsuite/ChangeLog: PR middle-end/104436 * c-c++-common/Wdangling-pointer-8.c: New test. * g++.dg/warn/Wdangling-pointer-5.C: New test.
[Bug middle-end/103483] [12 regression] context-sensitive ranges change triggers stringop-overread
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103483 --- Comment #23 from Martin Sebor --- (In reply to Richard Biener from comment #22) Your question may have been rhetorical but to be explicit, the real difference is hidden in the implementation (which is why these warnings can sometimes seem inconsistent). GCC doesn't warn for the second test case (copied below) because it only considers the lower bound of len's range: int a[2]; void foo (unsigned len) { if (len == 1 || len == 20) __builtin_memset (a, 0, len); } But the warning would trigger if GCC decided it was profitable to split the memset call into two statements: int a[2]; void foo (unsigned len) { if (len == 1) a[0] = 0; else if (len == 20) __builtin_memset (a, 0, 20); } I suspect most users (though not all, otherwise this report would have never been raised) would consider a warning valid and helpful for the source code. But if instead of (len == 1 || len == 20) the condition were to be written in terms of a relational expression (like len <= N) where N were greater than or even equal to sizeof (a) + 1, I'd expect complaints about the warning being a false positive because GCC can't "know" that len == N necessarily holds.
[Bug c++/104008] [11/12 Regression] New g++ folly compile error since r11-7931-ga2531859bf5bf6cf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104008 Marek Polacek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |mpolacek at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug target/104923] MMA __builtin_mma_disassemble_acc test case ICEs in LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104923 Peter Bergner changed: What|Removed |Added Target Milestone|--- |12.0 CC||dje at gcc dot gnu.org, ||segher at gcc dot gnu.org
[Bug target/104923] MMA __builtin_mma_disassemble_acc test case ICEs in LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104923 Peter Bergner changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |bergner at gcc dot gnu.org Known to fail||11.0, 12.0 Known to work||10.0 Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Last reconfirmed||2022-03-14 --- Comment #1 from Peter Bergner --- We ICE due to having illegal addresses in the MEM passed to the __builtin_mma_disassemble_acc builtin: (insn 18 17 0 2 (set (mem:V16QI (plus:DI (and:DI (mult:DI (reg:DI 137) (const_int 32 [0x20])) (const_int 68719476704 [0xfffe0])) (reg:DI 136)) [0 *_4+0 S16 A128]) (unspec:V16QI [ (reg:XO 134 [ *acc_6(D) ]) (const_int 3 [0x3]) ] UNSPEC_MMA_EXTRACT)) "ice1.c":6:14 2153 {*mma_disassemble_acc} (expr_list:REG_DEAD (reg:DI 137) (expr_list:REG_DEAD (reg:DI 136) (expr_list:REG_DEAD (reg:XO 134 [ *acc_6(D) ]) (nil) The problem here is that the mma_disassemble_output_operand predicate is too lenient on the types of addresses it will accept. I have a patch which restricts the addresses accepted which fixes the ICE. I'm currently regression testing the patch.
[Bug target/104923] New: MMA __builtin_mma_disassemble_acc test case ICEs in LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104923
Bug ID: 104923
Summary: MMA __builtin_mma_disassemble_acc test case ICEs in LRA
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: bergner at gcc dot gnu.org
Target Milestone: ---

The following two test cases both ICE in LRA with current trunk and GCC 11. They compile fine with GCC 10:

bergner@rain6p1:~$ cat ice1.c
void
foo (__vector_quad *acc, __vector char *dst, unsigned int n)
{
  __vector char a[4];
  __builtin_mma_disassemble_acc(a, acc);
  dst[2 * n] = a[0];
}
bergner@rain6p1:~$ gcc -S -O2 -mcpu=power10 ice1.c
during RTL pass: reload
ice1.c: In function ‘foo’:
ice1.c:7:1: internal compiler error: in lra_set_insn_recog_data, at lra.cc:1010
    7 | }
      | ^
0x109a49fb lra_set_insn_recog_data(rtx_insn*)
        /home/bergner/gcc/gcc-fsf-mainline-base/gcc/lra.cc:1010
...

bergner@rain6p1:~$ cat ice2.c
void
foo (__vector_quad *acc, __vector char *dst, unsigned int n)
{
  __vector char a[4];
  __builtin_mma_disassemble_acc(a, acc);
  dst[3 * n] = a[0];
}
bergner@rain6p1:~$ gcc -S -O2 -mcpu=power10 ice2.c
during RTL pass: reload
ice2.c: In function ‘foo’:
ice2.c:7:1: internal compiler error: in decompose_normal_address, at rtlanal.cc:6716
    7 | }
      | ^
0x10be4b23 decompose_normal_address
        /home/bergner/gcc/gcc-fsf-mainline-base/gcc/rtlanal.cc:6716
...
[Bug fortran/97592] Incorrectly set pointer remapping with array pointer argument to CONTIGUOUS dummy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97592 --- Comment #3 from anlauf at gcc dot gnu.org --- It looks like argument association is confused here. (The F2018 reference is 15.5.2.3 and 15.5.2.4). The following patch appears to fix the testcase:

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 06713f24f95..c7fb4633ab1 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6854,7 +6854,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 INTENT_IN, fsym->attr.pointer);
 }
 else if (fsym && fsym->attr.contiguous
- && !gfc_is_simply_contiguous (e, false, true)
+ && gfc_is_not_contiguous (e)
 && gfc_expr_is_variable (e))
 {
 gfc_conv_subref_array_arg (&parmse, e, nodesc_arg,

but unfortunately regresses on gfortran.dg/bind-c-contiguous-3.f90 :-(
[Bug testsuite/103324] RFE: Add a `make quickcheck` or `make smoketest` Makefile target to allow only running a portion of the testsuite
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103324 --- Comment #7 from Jonathan Wakely --- (In reply to Eric Gallager from comment #5) > So, now I'm running the testsuite anyways for other reasons, and one more > thing to note is that using any sort of parallelism when running the > testsuite (which is pretty much a must these days) makes picking out the > ‘Running … .exp’ lines more difficult than necessary... Surely only if you try to get them while the tests are still running? After they finish, all the output is flattened out into the .log files.
[Bug target/55690] On some targets thread_fence is not a compiler barrier when memmodel != MEMMODEL_SEQ_CST
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55690 --- Comment #2 from Joseph --- Created attachment 52626 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52626&action=edit Reproducer I created a reproducer (see attached file or online: https://godbolt.org/z/n76K3Ejds). Note that the acquire fence does not prevent GCC 7 from loading l->b ahead of the loop. With GCC 8 and later l->b is loaded inside the loop (as it should be).
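For readers without access to the attachment, the shape of the problem can be sketched as follows. This is not the attached reproducer: the `struct slot` layout and the `publish`/`consume` names are hypothetical, and only the pattern (a relaxed load plus an acquire fence inside a loop, followed by a plain read of `l->b`) is taken from the description above.

```c
#include <stdatomic.h>

/* Hypothetical message slot; not the layout used in the attachment.  */
struct slot
{
  atomic_int ready;  /* set to 1 by the producer after writing b */
  int b;             /* plain, non-atomic payload */
};

/* Producer: write the payload, then publish it behind a release fence.  */
void publish (struct slot *l, int value)
{
  l->b = value;
  atomic_thread_fence (memory_order_release);
  atomic_store_explicit (&l->ready, 1, memory_order_relaxed);
}

/* Consumer: spin until the flag is set.  The acquire fence sits inside
   the loop, so the read of l->b has to stay after it; a compiler that
   does not treat the fence as a barrier could hoist that read above
   the loop and return a stale value.  */
int consume (struct slot *l)
{
  for (;;)
    {
      int r = atomic_load_explicit (&l->ready, memory_order_relaxed);
      atomic_thread_fence (memory_order_acquire);
      if (r)
        return l->b;
    }
}
```

Called from a single thread after `publish`, `consume` returns immediately; the ordering question only becomes visible when the two functions run on different threads.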
[Bug d/104911] [12 regression] Comparison failure in gcc/d/typesem.o etc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104911 --- Comment #3 from ro at CeBiTec dot Uni-Bielefeld.DE --- > --- Comment #2 from Iain Buclaw --- > That's interesting. I've just done a build of > 54ef95cc4d1f3f2cde7c1f13250f889ffb81ca75 (20220301) and I get the same > comparison failure. Weird: I've just run a reghunt to identify the culprit patch and it converged on commit 7e28750395889d16a9cba49cd5935ced7dc00ce8 Author: Iain Buclaw Date: Sun Mar 13 12:28:05 2022 +0100 d: Merge upstream dmd 02a3fafc6, druntime 26b58167, phobos 16cb085b5.
[Bug middle-end/104854] -Wstringop-overread should not warn for strnlen, strndup and strncmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104854 --- Comment #7 from Martin Sebor --- Moving warnings into the analyzer and scaling it up to be able to run by default, during development, sounds like a good long-term plan. Until that happens, rather than gratuitously removing warnings that we've added over the years, just because they fall short of the ideal 100% efficacy (as has been known and documented), making them easier to control seems like a better approach.
[Bug target/104890] [12 Regression] fails to build the 32bit libgcc on x86_64-linux-gnu (--enable-cet --with-arch-32=i686)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104890 H.J. Lu changed: What|Removed |Added Attachment #52620|0 |1 is obsolete|| --- Comment #11 from H.J. Lu --- Created attachment 52625 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52625&action=edit The v3 patch
[Bug tree-optimization/104922] bogus -Wformat-overflow=2 due to missing range for related variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104922 Martin Sebor changed: What|Removed |Added Keywords|lto |missed-optimization Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2022-03-14 --- Comment #1 from Martin Sebor --- Confirmed per bug 104746 comment 13.
[Bug tree-optimization/104922] bogus -Wformat-overflow=2 due to missing range for related variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104922 Bug 104922 depends on bug 104746, which changed state. Bug 104746 Summary: False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746 What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |INVALID
[Bug tree-optimization/104746] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746 Martin Sebor changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution|--- |INVALID --- Comment #14 from Martin Sebor --- Andrew, I agree with tracking the improvement we discussed. Because it's not directly related to the warning for which this bug was opened (the 4095 limit), and implementing it won't prevent this warning, I've raised pr104922 to track it. I'm also open to revisiting the design behind this instance of the warning (the 4095 limit), reconsidering whether it's still useful (it was motivated by some old sprintf implementations failing for large amounts of output -- see for example https://bugzilla.redhat.com/show_bug.cgi?id=441945), or perhaps exposing it under a target hook. But to avoid confusion I'd prefer to do that separately from this bug report, which doesn't illustrate a false positive or reflect a bug in the warning.
[Bug tree-optimization/104922] New: bogus -Wformat-overflow=2 due to missing range for related variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104922 Bug ID: 104922 Summary: bogus -Wformat-overflow=2 due to missing range for related variables Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: diagnostic, lto Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: msebor at gcc dot gnu.org CC: amacleod at redhat dot com, marxin at gcc dot gnu.org, msebor at gcc dot gnu.org, unassigned at gcc dot gnu.org Depends on: 104746 Target Milestone: --- +++ This bug was initially created as a clone of Bug #104746 +++ As mentioned in bug 104746 comment 5, the following test case triggers -Wformat-overflow (level 2) due to the constraint on i and j not being fully exposed to the warning (each directive on its own can produce at most two bytes/digits, but when one does produce two digits the other must produce exactly one, so the output must fit in four bytes). The same limitation affects string directives with strings of bounded lengths. As Andrew explains in bug 104746 comment 13, this can be improved in Ranger, and should be made use of to avoid the warning. char a[4]; void f (int i, int j) { if (i < 0 || j < 0 || i + j > 19) return; __builtin_sprintf (a, "%u%u", i, j); } a.c: In function ‘f’: a.c:8:26: warning: ‘%u’ directive writing between 1 and 10 bytes into a region of size 4 [-Wformat-overflow=] 8 | __builtin_sprintf (a, "%u%u", i, j); | ^~ a.c:8:25: note: using the range [0, 4294967295] for directive argument 8 | __builtin_sprintf (a, "%u%u", i, j); | ^~ a.c:8:25: note: using the range [0, 4294967295] for directive argument a.c:8:3: note: ‘__builtin_sprintf’ output between 3 and 21 bytes into a destination of size 4 8 | __builtin_sprintf (a, "%u%u", i, j); | ^~~ Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746 [Bug 104746] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c
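The claim that the output always fits in four bytes can be confirmed by brute force. The sketch below enumerates every pair allowed by the guard in `f` (`i >= 0`, `j >= 0`, `i + j <= 19`) and measures the formatted length; the function name `max_output_length` is hypothetical.

```c
#include <stdio.h>

/* Returns the largest number of characters "%u%u" produces for any
   i, j >= 0 with i + j <= 19, excluding the terminating NUL.  */
int max_output_length (void)
{
  int max = 0;
  for (int i = 0; i <= 19; i++)
    for (int j = 0; i + j <= 19; j++)
      {
        /* snprintf with a null buffer and size 0 only computes the
           length that would have been written.  */
        int n = snprintf (NULL, 0, "%u%u", (unsigned) i, (unsigned) j);
        if (n > max)
          max = n;
      }
  return max;
}
```

The maximum is three characters (for example `i = 10, j = 9`), so with the terminating NUL the output needs at most four bytes, which is exactly the size of `a`.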
[Bug c++/103524] [meta-bug] modules issue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524 Bug 103524 depends on bug 99541, which changed state. Bug 99541 Summary: ICE when reading a module https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99541 What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |INVALID
[Bug c++/99541] ICE when reading a module
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99541 Andrew Pinski changed: What|Removed |Added Resolution|--- |INVALID Status|WAITING |RESOLVED --- Comment #2 from Andrew Pinski --- No testcase for the last year so closing as invalid. If you provide a testcase, we will look into it.
[Bug c++/104855] -Wclass-memaccess is too broad with valid code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104855 Martin Sebor changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |INVALID CC||msebor at gcc dot gnu.org --- Comment #5 from Martin Sebor --- The documented purpose of the warning is to detect not just code that's undefined (such as overwriting const or reference members) but also code that violates encapsulation (and with it potentially also class invariants). Such code may be valid in the strict language sense but it's almost certainly not valid under the design of the class, as in the test case in comment #0. In the rare instances when such code is intentional and safe or where it cannot be changed, an explicit cast along with a suitable comment certainly seems like a preferable solution to avoid the warning than would be disabling it for the majority of uses where it is not intentional. Since any object pointer can be converted to void*, either implicitly (by passing it to a memcpy argument) or by a static_cast, using reinterpret_cast is not necessary to suppress the warning (nor would it be appropriate). If -Wclass-memaccess had the ability to analyze the class invariants splitting up -Wclass-memaccess into multiple levels as Jonathan suggests in comment #3 might perhaps be doable and useful, with level 1 triggering for the subset of code that's strictly undefined, and higher levels for the rest. But because the warning runs in the C++ front end where the actual implementation of the class is not readily analyzable, the separation would result in a bias heavily skewed toward the latter (i.e., lots of false negatives at level 1). The guidance for deciding whether or not a subset of warnings should be in -Wall is in the GCC manual: [-Wall] enables all the warnings about constructions that some users consider questionable, and that are easy to avoid (or modify to prevent the warning), even in conjunction with macros. 
As Richard notes in comment #1, this instance of -Wclass-memaccess (and in my experience the overwhelming majority of others) with the easy suppression fits this description.
[Bug middle-end/104854] -Wstringop-overread should not warn for strnlen, strndup and strncmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104854 --- Comment #6 from Siddhesh Poyarekar --- (In reply to Martin Sebor from comment #5) > It would be useful to separate these warnings into multiple levels: level 1 > for invalid code, and higher levels for suspicious (or pointless) code, > similarly to -Wformat-overflow. I think the analyzer is a great level for the higher level heuristics, with ME warnings only sticking to level 1. Adding levels within ME warnings seems unnecessary. ISTM that users tend to *expect* false positives (to some sane extent) when doing static analysis but are much less tolerant of those during usual builds.
[Bug tree-optimization/104746] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746 Andrew Macleod changed: What|Removed |Added Resolution|INVALID |--- Status|RESOLVED|REOPENED --- Comment #13 from Andrew Macleod --- Why can't we leave this open? There have been VRP bugs open since 2005. There was no intent to fix them back then, but they identify opportunities for improvement. I think this is a good placeholder for improvements in analyzing the parameters of the warning. We can make more effort when there is more than one parameter to see if the information can be further refined. _4 = i_6(D) + j_7(D); if (_4 > 19) goto ; [INV] else goto ; [INV] : __builtin_sprintf (&a, "%u%u", i_6(D), j_7(D)); i_6 and j_7 are both understood to be [0,19], and it wouldn't take much more work to find if there is a relationship between them, and refine the values. They appear together in one statement _4 = i_6 + j_7, and _4 is known to be [0,19] at the warning location. So with [0,19] = i_6 + j_7, we can ask what the range of i_6 or j_7 is if the other one is refined. Since [10,99] creates 2 characters of output, you can ask: if i_6 is [10,19], what's the range of j_7? It will give you [0,9], and then the warning doesn't need to trigger. Likewise for i_6 computed from j_7 = [10, 19]. This could be generalized using [100,999] for 3 character outputs, [1000, ] is 4 characters, etc. And doing a bit more work trying to analyze the parameters. Those bits are all there now. This could be in the next release. I'd even argue we should probably split this into 2 reports. The earlier warning on: char *path = malloc(strlen(dirname) + strlen(result) + 2); sprintf(path, "%s/%s", dirname, result); seems like a different opportunity in which we could track/associate strlen results with the string. 
ie result_17 = malloc (_5); sprintf (result_17, "%s", suffix_14); _6 = strlen (dirname_13); _7 = strlen (result_17); _8 = _6 + _7; _9 = _8 + 2; path_20 = malloc (_9); sprintf (path_20, "%s/%s", dirname_13, result_17); _6 is length of dirname_13, and _7 is length of result_17, Then we malloc an object that is _6 + _7 + 2 and copy those 2 strings into it. With the appropriate pointer/string/strlen/malloc associations it seems deterministically knowable that we can avoid that warning too. Any false positive we can potentially eliminate with some additional analysis is worth tracking and considering IMO. This PR is on the list of PRs I track for possible future improvements.
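The malloc/strlen/sprintf pattern discussed above can be written out as a small helper to show why the destination size is deterministically sufficient. This is a sketch under assumptions: `join_path` is a hypothetical name, and it uses `snprintf` with the computed size rather than the plain `sprintf` from the report.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper mirroring the pattern from the report: returns
   a newly allocated "DIRNAME/NAME" string, or NULL on allocation
   failure.  The caller frees the result.  */
char *join_path (const char *dirname, const char *name)
{
  size_t dir_len = strlen (dirname);
  size_t name_len = strlen (name);

  /* Two extra bytes: one for the '/' separator, one for the NUL.  */
  size_t size = dir_len + name_len + 2;

  char *path = malloc (size);
  if (path == NULL)
    return NULL;

  /* The output is exactly dir_len + 1 + name_len characters, which is
     size - 1, so it always fits.  */
  snprintf (path, size, "%s/%s", dirname, name);
  return path;
}
```

The allocation size is derived from the same two `strlen` results that bound the `%s` directives, which is the association the comment suggests tracking.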
[Bug target/104335] [12 regression] build failure if go is included in languages after r12-6747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335 seurer at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #12 from seurer at gcc dot gnu.org --- This no longer fails.
[Bug c++/104920] Unreliable results with memset-elt-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104920 Ali Kouhzadi changed: What|Removed |Added Resolution|--- |WORKSFORME Status|UNCONFIRMED |RESOLVED --- Comment #5 from Ali Kouhzadi --- Thanks Jakub for the explanation, it does make more sense now. Although I'm still not quite sure why the addition of the 2nd member function in bad_04.cpp triggers the warning, I guess that's a StackOverflow question.
[Bug tree-optimization/98335] [9/10/11/12 Regression] Poor code generation for partial struct initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98335 Roger Sayle changed: What|Removed |Added Target Milestone|9.5 |12.0 Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #10 from Roger Sayle --- This should now be fixed on mainline.
[Bug c/83840] missing -Wmemset-elt-size with address of array element
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83840 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Last reconfirmed||2022-03-14 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #1 from Jakub Jelinek --- Confirmed, just ran into this too. I bet we should special case ARRAY_REF with 0 index, for non-zero indexes the warning would be more complicated and less obvious.
[Bug middle-end/98420] Invalid simplification of x - x with -frounding-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98420 Roger Sayle changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |12.0 --- Comment #5 from Roger Sayle --- This should now be fixed on mainline.
[Bug c++/104920] Unreliable results with memset-elt-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104920 --- Comment #4 from Jakub Jelinek --- E.g. in the above #c0 testcase, you can see it in the -fdump-tree-gimple dump: memset (&arr, 0, 10); _1 = std::array::data (&arr2); memset (_1, 0, 10); For the arr case, the FE can see the array and the size, while for the other case it can only see the size to be constant (even that goes beyond what the language guarantees, as the size is passed outside of manifestly constant evaluated context, so nothing guarantees it is constant evaluated). But, if you e.g. add constexpr auto p = arr2.data(); to your #c0 testcase, you'll see it is not a constant expression: error: ‘(int*)(& arr2.std::array::_M_elems)’ is not a constant expression
[Bug d/104911] [12 regression] Comparison failure in gcc/d/typesem.o etc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104911 --- Comment #2 from Iain Buclaw --- That's interesting. I've just done a build of 54ef95cc4d1f3f2cde7c1f13250f889ffb81ca75 (20220301) and I get the same comparison failure.
[Bug other/61257] configure should check if sys/sdt.h is usable, not just checking the existance of the header
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61257 --- Comment #8 from Eric Gallager --- (In reply to Eric Gallager from comment #7) > Patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591704.html Oh, this one is also relevant, too: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591734.html
[Bug c++/104920] Unreliable results with memset-elt-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104920 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- There is nothing unreliable about the warning. If it sees a memset call with a base language array and a size that constant evaluates to the number of array elements and the array element has size > 1, it warns. The warning is implemented in the FEs, so it can't see through functions that would need to be inlined etc., it really requires the size to be a constant expression. In some of your testcases, that is the case even when you use std::array, in others it is not.
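To make the distinction in this comment concrete, here is a minimal sketch. It is not the reporter's attachment, and every helper name in it (count_nonzero, clear_builtin_array, clear_std_array, clear_with_element_count, demo) is made up for illustration. It shows the byte count the warning is steering towards (element count multiplied by element size) for a built-in array and for std::array, plus the mistaken form that goes undiagnosed in the std::array case.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t ARRAY_LEN = 10u;

// Counts the elements that are still non-zero after a clear.
template <typename It>
std::size_t count_nonzero(It first, It last) {
    std::size_t n = 0;
    for (; first != last; ++first)
        if (*first != 0) ++n;
    return n;
}

// Built-in array: sizeof yields the size of the whole object in bytes.
std::size_t clear_builtin_array() {
    int arr[ARRAY_LEN];
    for (int &e : arr) e = -1;
    std::memset(arr, 0, sizeof arr);  // byte count, not ARRAY_LEN
    return count_nonzero(arr, arr + ARRAY_LEN);
}

// std::array: data() is an ordinary function call as far as the front end
// is concerned, so the byte count has to be spelled out by hand.
std::size_t clear_std_array() {
    std::array<int, ARRAY_LEN> arr2;
    arr2.fill(-1);
    std::memset(arr2.data(), 0, arr2.size() * sizeof(arr2[0]));
    return count_nonzero(arr2.begin(), arr2.end());
}

// The mistake: the element count is passed where a byte count is expected,
// so only the first arr2.size() bytes are cleared.
std::size_t clear_with_element_count() {
    std::array<int, ARRAY_LEN> arr2;
    arr2.fill(-1);
    std::memset(arr2.data(), 0, arr2.size());
    return count_nonzero(arr2.begin(), arr2.end());
}

// Usage: the two correct forms clear everything, the mistaken one does not
// (assuming sizeof(int) > 1, as on the targets named in the report).
void demo() {
    assert(clear_builtin_array() == 0);
    assert(clear_std_array() == 0);
    assert(clear_with_element_count() > 0);
}
```

Writing the length as `size() * sizeof(element)` does not depend on whether the front end can see through the container, which is why it is the safer habit regardless of whether the warning fires.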
[Bug other/61257] configure should check if sys/sdt.h is usable, not just checking the existance of the header
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61257 Eric Gallager changed: What|Removed |Added URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2022-March/5 ||91704.html --- Comment #7 from Eric Gallager --- Patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591704.html
[Bug target/104921] New: aarch64: Assembler failure with vbfmlalbq_lane_f32 intrinsic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104921 Bug ID: 104921 Summary: aarch64: Assembler failure with vbfmlalbq_lane_f32 intrinsic Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acoplan at gcc dot gnu.org Target Milestone: ---

The following fails:

$ cat t.c
#include <arm_neon.h>
float32x4_t foo(float32x4_t x, bfloat16x8_t a, bfloat16x4_t b)
{
  asm("" ::: "v0", "v1", "v2", "v3", "v4", "v5");
  return vbfmlalbq_lane_f32 (x, a, b, 0);
}
$ ./aarch64-linux-gnu-gcc -c t.c -O2 -march=armv8.2-a+bf16
/tmp/ccwCbu7Y.s: Assembler messages:
/tmp/ccwCbu7Y.s:15: Error: register number out of range 0 to 15 at operand 3 -- `bfmlalb v0.4s,v7.8h,v16.h[0]'

It looks like the problem has existed since the intrinsic was added in GCC 10.
[Bug c++/104920] Unreliable results with memset-elt-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104920 --- Comment #2 from Ali Kouhzadi --- Thanks Andrew for the response. Examples bad_03.cpp and bad_04.cpp (attached) show a case where this works as expected on an STL array. I guess the point is that it's somewhat unreliable, and improvements would be greatly appreciated.
[Bug middle-end/104854] -Wstringop-overread should not warn for strnlen, strndup and strncmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104854 --- Comment #5 from Martin Sebor --- It would be useful to separate these warnings into multiple levels: level 1 for invalid code, and higher levels for suspicious (or pointless) code, similarly to -Wformat-overflow.
[Bug c++/104920] Unreliable results with memset-elt-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104920 Andrew Pinski changed: What|Removed |Added Keywords||diagnostic --- Comment #1 from Andrew Pinski --- Iirc this warning only works with the language array feature. So if you are using the library array, the compiler does not know the implementation details of it.
[Bug c++/104920] New: Unreliable results with memset-elt-size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104920 Bug ID: 104920 Summary: Unreliable results with memset-elt-size Product: gcc Version: og11 (devel/omp/gcc-11) Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: kouhzadi at rohumm dot com Target Milestone: --- Created attachment 52624 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52624&action=edit 4 examples where memset-elt-size fails, one where it's successful * GCC versions: 10.3.1 and 11.2 * System type: GNU/Linux (amd64, aarch64, aarch32) * Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.2.0-7ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2 * Command line: g++ -Wall -Wextra -O3 source.cpp * Compiler output: none * Expected compiler output: warning: ‘memset’ used with length equal to number of 
elements without multiplication by element size [-Wmemset-elt-size] * Sample code (more attached): #include #include constexpr std::size_t ARRAY_LEN = 10u; int main() { int arr[ARRAY_LEN]; std::memset(arr, 0, ARRAY_LEN); // OK, warning as expected std::array arr2; std::memset(arr2.data(), 0, arr2.size()); // No warning; note that both std::array::data() and std::array::size() are constexpr return 0; }
[Bug tree-optimization/104789] [12 Regression] -Wstringop-overflow false positive at -O3 for an unrolled loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104789 --- Comment #11 from Martin Sebor --- The direct store subset of -Wstringop-overflow that runs in the strlen pass (i.e., those handled in strlen_pass::handle_store) might be better handled in VRP and issued under -Warray-bounds. The challenge there is that unlike this -Wstringop-overflow subset, which respects subobject boundaries, -Warray-bounds intentionally considers complete objects (this was done to avoid false positives). So before moving all of strlen, a solution to consider is to merge these two sets of warnings while preserving the subobject sensitivity. Another (indirectly related) improvement is to also move -Warray-bounds out of VRP and into the access warning pass, and run it at the same time as most other warnings there (that would also enable -Warray-bounds at -O0, which might of course trigger some new false positives).
[Bug testsuite/103324] RFE: Add a `make quickcheck` or `make smoketest` Makefile target to allow only running a portion of the testsuite
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103324 --- Comment #6 from Eric Gallager --- (In reply to Eric Gallager from comment #5) > (In reply to Eric Gallager from comment #3) > > https://gcc.gnu.org/install/test.html says "To get a list of the possible > > *.exp files, pipe the output of ‘make check’ into a file and look at the > > ‘Running … .exp’ lines." ...has anyone stored their output from doing so > > recently? I don't really want to run the entire testsuite just to generate > > this list... > > So, now I'm running the testsuite anyways for other reasons, and one more > thing to note is that using any sort of parallelism when running the > testsuite (which is pretty much a must these days) makes picking out the > ‘Running … .exp’ lines more difficult than necessary... In any case, here's my current list of "Running … .exp ..." lines (sorted and uniq-ed): Running ../../../../libatomic/testsuite/libatomic.c/c.exp ... Running ../../../../libgomp/testsuite/libgomp.c++/c++.exp ... Running ../../../../libgomp/testsuite/libgomp.c/c.exp ... Running ../../../../libgomp/testsuite/libgomp.fortran/fortran.exp ... Running ../../../../libgomp/testsuite/libgomp.graphite/graphite.exp ... Running ../../../../libgomp/testsuite/libgomp.oacc-c++/c++.exp ... Running ../../../../libgomp/testsuite/libgomp.oacc-c/c.exp ... Running ../../../../libgomp/testsuite/libgomp.oacc-fortran/fortran.exp ... Running ../../../../libitm/testsuite/libitm.c++/c++.exp ... Running ../../../../libitm/testsuite/libitm.c/c.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/analyzer/analyzer.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/asan/asan.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/bprob/bprob.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/charset/charset.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/compat/compat.exp ... 
Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/compat/struct-layout-1.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/coroutines/coroutines.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/coroutines/torture/coro-torture.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/debug/debug.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/debug/dwarf2/dwarf2.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/dfp/dfp.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/dg.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/gcov/gcov.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/goacc-gomp/goacc-gomp.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/goacc/goacc.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/gomp/gomp.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/graphite/graphite.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/guality/guality.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/hwasan/hwasan.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/lto/lto.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/modules/modules.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/pch/pch.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/plugin/plugin.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/simulate-thread/simulate-thread.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/special/ecos.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/tls/tls.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/tm/tm.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/torture/dg-torture.exp ... 
Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/torture/stackalign/stackalign.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/tree-prof/tree-prof.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/tsan/tsan.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/ubsan/ubsan.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.dg/vect/vect.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.old-deja/old-deja.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.target/aarch64/aarch64.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.target/aarch64/sve/aarch64-sve.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle.exp ... Running /Users/ericgallager/gcc_newgit/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp ... Running /Users/e
[Bug tree-optimization/94566] conversion between std::strong_ordering and int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94566 --- Comment #11 from Jakub Jelinek --- (In reply to Oliver Schönrock from comment #10) > I agree the switch optimisation is better, but... > > shouldn't std::bit_cast prevent incorrect casting with different underlying > implementation? (ie if the size doesn't match, and the size could be deduced > with TMP) The size can be deduced, yes. What the bits actually mean can't be. > and "unordered value" doesn't apply to std::strong_ordering? Sure, but this PR isn't just about strong_ordering, same problem applies for partial_ordering. And actually not just those, but any case of some set of enumerators or macros where you don't know the values exactly and mapping them to or from a set of integer constants, ideally with 1:1 mapping but not guaranteed that way.
[Bug c++/104919] New: [modules] enum in constexpr function causes "failed to read compiled module cluster 1: Bad file data"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104919 Bug ID: 104919 Summary: [modules] enum in constexpr function causes "failed to read compiled module cluster 1: Bad file data" Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ensadc at mailnesia dot com Target Milestone: --- https://godbolt.org/z/d3sdeEz1r $ cat mod.cpp export module mod; export constexpr void f() { enum { a }; a; } $ cat example.cpp import mod; int main() { f(); } $ g++ -std=c++20 -fmodules-ts mod.cpp example.cpp In module imported at example.cpp:1:1: mod: In function ‘int main()’: mod: error: failed to read compiled module cluster 1: Bad file data mod: note: compiled module file is ‘gcm.cache/mod.gcm’ example.cpp:4:5: fatal error: failed to load binding ‘::f@mod’ 4 | f(); | ^ compilation terminated. It compiles fine without `constexpr`.
[Bug other/42540] c++ error message [vtable undefined] is unhelpful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42540 --- Comment #18 from Jonathan Wakely --- (In reply to Eyal Rozenberg from comment #16) > The compiler could store information in the compiled object listing the > virtual members for which no implementation was found, due to which reason > the vtable was not defined already. In this specific example, storing the > name of "A::B()" somewhere. In every file that includes the header defining A? Consider the case where you include the header in ten files, and define the virtual functions in one of them. Nine out of ten files do not contain a definition of the virtual functions, so they would each contain the same info naming every virtual function in the class. Then do that for every polymorphic class in every object file. This is a lot more info being written out, and most of it will never be used. You have nine files saying "this function is missing" and one not saying it. What exactly does the linker do with that information? Why would that be better than comment 7 here?
[Bug other/42540] c++ error message [vtable undefined] is unhelpful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42540 --- Comment #17 from Jonathan Wakely --- (In reply to Eyal Rozenberg from comment #16) > I'm not quite sure what a key function is, Then read the link I gave you in PR 104918 comment 1. > Not just learners. If you have a large class with many methods, whose > implementation is spread across many files, it can take quite a bit of time > to figure out which method implementation is missing. The first one. The key function is the first non-inline, non-pure virtual function. Read the wiki page. I didn't write that page and suggest you read it just for fun. The linker could easily say that, with no changes from GCC. That belongs in the binutils bugzilla though.
[Bug tree-optimization/94566] conversion between std::strong_ordering and int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94566 --- Comment #10 from Oliver Schönrock --- I agree the switch optimisation is better, but... shouldn't std::bit_cast prevent incorrect casting with different underlying implementation? (ie if the size doesn't match, and the size could be deduced with TMP) and "unordered value" doesn't apply to std::strong_ordering?
[Bug c++/104918] Pass information to let the linker tell the user which virtual members are missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104918 --- Comment #3 from Jonathan Wakely --- (In reply to Eyal Rozenberg from comment #2) > Why not store information in the compiled object saying which virtual items > are undefined? The vtable was missing because some virtual members were > purely-virtual, right? No, there are no pure virtual functions here. The vtable is missing because it will be emitted in the same object file as the key function, and the key function is not defined in the program. See the wiki link I gave.
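The key-function rule described in this comment can be sketched as follows. This is an illustration loosely modelled on the testcase from the original report, with two changes that are assumptions of this sketch: foo returns int so the result can be checked, and the helper call_through_base is hypothetical. B::foo is the first non-inline, non-pure virtual function of B, so it is B's key function, and the vtable for B is emitted in whichever translation unit defines it.

```cpp
#include <cassert>

struct A {
    virtual int foo() { return 1; }  // defined inline: A has no key function
};

struct B : A {
    int foo() override;  // declared only: this is B's key function
};

// Defining the key function is what causes the vtable for B to be emitted
// in this translation unit. Deleting this definition reproduces the
// "undefined reference to `vtable for B'" link error from the report.
int B::foo() { return 2; }

// Hypothetical helper: forces a virtual call through the base class.
int call_through_base(A &a) { return a.foo(); }

// Usage: both objects can be constructed and dispatched once the key
// function has a definition somewhere in the program.
void demo() {
    A a;
    B b;
    assert(call_through_base(a) == 1);
    assert(call_through_base(b) == 2);
}
```

So when the linker reports a missing vtable, the first thing to look for is the definition of the first virtual function in the class that is neither inline nor pure.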
[Bug tree-optimization/94566] conversion between std::strong_ordering and int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94566 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #9 from Jakub Jelinek --- That one isn't portable, relies on both the strong_ordering underlying implementation using an 8-bit integer member rather than something else, and also hardcodes the exact values where in C++ the -1 / 0 / 1 are exposition only and unordered value is -127 rather than what gcc uses (2). By writing a series of ifs or switch one achieves portability and we'd just like to get efficient code if the values the user chose match those used by the implementation.
[Bug target/104253] libgcc missing __floatdiif
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253 --- Comment #18 from Peter Bergner --- (In reply to Eric Botcazou from comment #17) > The test fails on VxWorks, where there is no 128-bit long double: > > cc1: warning: The '-mfloat128' option may not be fully supported > > so it looks like a DejaGNU selector is missing. It's a dg-run test case, but the dg-require only uses ppc_float128_sw. There is a ppc_float128_hw which is probably what Mike meant to use?
[Bug c++/97198] __is_constructible(int[], int) should return true
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97198 --- Comment #8 from Zhihao Yuan --- (In reply to Jonathan Wakely from comment #7) > (In reply to Zhihao Yuan from comment #5) > > Encountered this today. In case I cannot show up when discussing LWG3486, my > > use case is that C(in_place_type, a, b, c) should "just works." It's up > > to C how to deal with it. In my case, it's new T[]. > > I was going to add a note to the issue, but I don't know what to add. What > is C? Why wouldn't it work today? Why does std::is_constructible affect it? I meant to let C stand for some type C's constructor here. I wanted to express the following: in_place_type is often used by type-erasures to forward all information that is expression-equivalent to some form of initialization. Let's say I want to create an object of type C with C obj(in_place_type, a, b, c); where obj erases the type of a hypothetical object created in U x(a, b, c); Because type erasure means to erase the U in in_place_type, it doesn't matter if U != decltype(x). It works according to the literal interpretation of the standard today. But if we say std::is_constructible_v == false, extra efforts are needed to restore the meaning of in_place_type.
[Bug tree-optimization/94566] conversion between std::strong_ordering and int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94566 --- Comment #8 from Oliver Schönrock --- how about: #include #include #include int conv3(std::strong_ordering s){ return std::bit_cast(s); } std::strong_ordering conv4(int i){ return std::bit_cast(static_cast(i)); } conv3(std::strong_ordering): movsbl %dil, %eax ret conv4(int): movl%edi, %eax ret https://godbolt.org/z/szP5MGq4T
[Bug tree-optimization/104746] False positive for -Wformat-overflow=2 since r12-7033-g3c9f762ad02f398c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104746 Martin Sebor changed: What|Removed |Added Resolution|--- |INVALID Status|NEW |RESOLVED --- Comment #12 from Martin Sebor --- The documented purpose of -Wformat-overflow=2 is to point out potential problems, including those where an argument is not known to be sufficiently constrained. (Level 1 behaves close to what you expect.) Level 2 is not enabled in -Wall or -Wextra and must be explicitly enabled. Different design choices are of course possible, but since some projects are using it as is, it's useful as designed. This is not a bug.
[Bug middle-end/104880] [11 Regression] ICE in expand_expr_addr_expr_1, at expr.c:8231 since r11-165-geb72dc663e9070
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104880 --- Comment #4 from Dimitar Yordanov --- Thanks, works for me!
[Bug target/96882] Wrong assembly code generated with arm-none-eabi-gcc -flto -mfloat-abi=hard options
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96882 David Crocker changed: What|Removed |Added CC||dcrocker at eschertech dot com --- Comment #9 from David Crocker --- Is there any update on this? I need to turn on LTO to keep the code size of a large application within the flash memory space of the target ARM Cortex M4F processor; but by the sound of it, doing so will be unsafe.
[Bug c++/86426] g++ ICE at on valid code in tree_operand_check, at tree.h:3615
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86426 Patrick Palka changed: What|Removed |Added CC||dcb314 at hotmail dot com --- Comment #9 from Patrick Palka --- *** Bug 104837 has been marked as a duplicate of this bug. ***
[Bug c++/104837] tree check fail: in tree_operand_check, at tree.h:3948
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104837 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #2 from Patrick Palka --- dup of PR86426 I think *** This bug has been marked as a duplicate of bug 86426 ***
[Bug middle-end/104696] [OpenMP] component/array-ref/component (x.r[1].d) should use 'x' for GOMP_MAP_STRUCT (not yield 'x.r[1]' for nonptr 'x.r')
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104696 Tobias Burnus changed: What|Removed |Added Component|fortran |middle-end Summary|[OpenMP] Implicit mapping |[OpenMP] |breaks struct mapping |component/array-ref/compone ||nt (x.r[1].d) should use ||'x' for GOMP_MAP_STRUCT ||(not yield 'x.r[1]' for ||nonptr 'x.r')

--- Comment #5 from Tobias Burnus ---
(In reply to Tobias Burnus from comment #4)
> #pragma omp target map(tofrom: x.r[1].d)
> *.r[1].d = 3;

s/*/x/

The problem is that in gimplify.cc's gimplify_scan_omp_clauses:

  tree base = extract_base_bit_offset (OMP_CLAUSE_DECL (c), &base_ref,
                                       &bitpos1, &offset1, &tree_offset1);
  bool do_map_struct = (base == decl && !tree_offset1);

Here, 'base' == 'x' but 'decl' is 'x.r[1]' - while for 'x.q.d' it is 'x' (== base).

The comp refs are removed as follows. (It seems as if some additional ARRAY_REF checking is needed for non-pointer components.)

  else if (TREE_CODE (decl) == COMPONENT_REF
           && (OMP_CLAUSE_MAP_KIND (c)
               != GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION))
    {
      component_ref_p = true;
      while (TREE_CODE (decl) == COMPONENT_REF)
        decl = TREE_OPERAND (decl, 0);
      if (TREE_CODE (decl) == INDIRECT_REF
          && DECL_P (TREE_OPERAND (decl, 0))
          && (TREE_CODE (TREE_TYPE (TREE_OPERAND (decl, 0)))
              == REFERENCE_TYPE))
        decl = TREE_OPERAND (decl, 0);
    }

Probably the same for '!DECL_P (decl)' in the previous 'if' branch.
[Bug other/42540] c++ error message [vtable undefined] is unhelpful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42540 --- Comment #16 from Eyal Rozenberg --- Some comments following my recent dupe... (In reply to Andrew Pinski from comment #1) > I don't know if there is anything there could be done here since the linker > is what is producing the error. The compiler could store information in the compiled object listing the virtual members for which no implementation was found, due to which reason the vtable was not defined already. In this specific example, storing the name of "A::B()" somewhere. If that information is available, we could then petition linker authors to use it and print the missing virtual members in the error message. (In reply to Richard Earnshaw from comment #5) > As suggested, there's no bug in the compiler here Not passing sufficient information to the linker is a "bug", or at least - a missing feature. (In reply to Zhihao Yuan from comment #11) > 2. Add extra information to name the key function, and pass it to the linker, > generate an error message to match MSVC's quality. I'm not quite sure what a key function is, but it sounds like my suggestion is similar to this one. So, I support your suggestion (2.) > Calling this a "well-known issue" is irresponsible. The issue significantly > increases the bar to learners to consume and accept new paradigms in the > language. Not just learners. If you have a large class with many methods, whose implementation is spread across many files, it can take quite a bit of time to figure out which method implementation is missing.
[Bug target/104910] [10/11/12 Regression] ICE: internal consistency failure (error: invalid rtl sharing found in the insn)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104910 Richard Biener changed: What|Removed |Added Target Milestone|--- |10.4
[Bug c++/104918] Pass information to let the linker tell the user which virtual members are missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104918 --- Comment #2 from Eyal Rozenberg --- (In reply to Jonathan Wakely from comment #1) > I don't think there's anything for GCC to do here. Why not store information in the compiled object saying which virtual items are undefined? The vtable was missing because some virtual members were purely-virtual, right? > Maybe the linker should > print a note after a missing vtable error saying that the key function > needs to be defined, which is already suggested at PR 42540. Yes, that's what I'm saying... I'll comment on bug 42540.
[Bug c++/69623] Invalid deduction of non-trailing template parameter pack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69623 Ed Catmur changed: What|Removed |Added CC||ed at catmur dot uk --- Comment #6 from Ed Catmur --- There is a fairly well-known idiom for extracting a specific element of an argument pack: auto f(auto..., auto a, auto...) { return a; } template struct any { any(auto) {} }; template auto g(std::index_sequence, auto... a) { return f...>(a...); } template auto h(auto... a) { return g(std::make_index_sequence(), a...); } I believe the intent of http://eel.is/c++draft/temp#param-14.sentence-3 is that this idiom should work; per http://eel.is/c++draft/temp#param-example-7 (as in comment #5) a function template secondary template parameter pack is only disallowed if it is nondeducible. In the original description of this ticket (not comment #5, which I agree is disallowed) T is specifiable and U is deducible, so both are OK. > [...] A template parameter pack of a function template shall not be followed > by another template parameter unless that template parameter can be deduced > from the parameter-type-list ([dcl.fct]) of the function template or has a > default argument ([temp.deduct]). [...] Or is the intent that T must be explicitly specified, i.e. that f can be called as f<>() but not as f(), pace http://eel.is/c++draft/temp#arg.explicit-4.sentence-3 ? > If all of the template arguments can be deduced, they may all be omitted; in > this case, the empty template argument list <> itself may also be omitted.
[Bug target/104912] [12 Regression] 416.gamess regression after r12-7612-g69619acd8d9b58
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912 --- Comment #5 from Richard Biener --- Another thing is noticing the loop performs no vector loads/stores at all, all of them are strided. If we'd improve SLP analysis we could get equal (but VF==1) basic-block vectorization - but with the caveat of having to deal with the possible aliasing of XPQKL(MPQ,MKL) and XPQKL(MRS,MKL). Still, in a case where there's no aliasing, doing BB vectorization will eventually be a better solution. That said - an x86 backend specific thing could be to count the number of vector loads/stores as well as the number of strided loads/stores and apply the biasing based on that at finish_cost time, not on the individual case. We can also count the number of "other" stmts in the loop body so as to weight the ratio between them. For gamess it's 10 vector stmts vs. 6 strided loads + 2 strided stores. We could simply sum vector stmts (including vector loads and stores), subtract the "emulated scalar" ones (maybe weight the variably strided cases with a factor of two) and require the outcome to be > 0 to be worthwhile to vectorize. Eventually the finish_cost hook should get a bool result to indicate that independent of the cost of the scalar loop we do not want this vectorization (that's nicer than returning an arbitrarily high number for example).
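The counting heuristic floated in this comment is simple enough to write down as arithmetic. The sketch below is only an illustration of that proposal: LoopCounts, vectorization_bias and worth_vectorizing are hypothetical names, not GCC data structures or hooks. It also assumes that the "10 vector stmts" quoted for gamess do not already include the strided accesses, which the comment leaves open.

```cpp
#include <cassert>

// Hypothetical per-loop statement counts (not a GCC type).
struct LoopCounts {
    int vector_stmts;      // real vector statements, incl. vector loads/stores
    int strided_loads;     // loads emulated with scalar accesses
    int strided_stores;    // stores emulated with scalar accesses
    bool variable_stride;  // stride not known at compile time
};

// Vector statements minus the "emulated scalar" ones; variably strided
// accesses are weighted by a factor of two, as suggested in the comment.
int vectorization_bias(const LoopCounts &c) {
    const int weight = c.variable_stride ? 2 : 1;
    const int emulated = c.strided_loads + c.strided_stores;
    return c.vector_stmts - weight * emulated;
}

// The proposal: vectorize only if the outcome is > 0.
bool worth_vectorizing(const LoopCounts &c) {
    return vectorization_bias(c) > 0;
}

// Usage with the gamess numbers from the comment.
void demo() {
    const LoopCounts constant_stride{10, 6, 2, false};
    const LoopCounts variable_stride{10, 6, 2, true};
    assert(vectorization_bias(constant_stride) == 2);   // 10 - 8
    assert(worth_vectorizing(constant_stride));
    assert(vectorization_bias(variable_stride) == -6);  // 10 - 2 * 8
    assert(!worth_vectorizing(variable_stride));
}
```

Under these assumptions the gamess loop lands on either side of the threshold depending on whether the factor of two applies, which shows how sensitive the outcome is to that weighting choice.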
[Bug other/42540] c++ error message [vtable undefined] is unhelpful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42540 --- Comment #15 from Jonathan Wakely --- *** Bug 104918 has been marked as a duplicate of this bug. ***
[Bug c++/104918] Pass information to let the linker tell the user which virtual members are missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104918 Jonathan Wakely changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #1 from Jonathan Wakely --- The issue is that the vtable is missing, which is why the linker says so. If the vtable is present, it does tell you about the missing virtual: https://godbolt.org/z/zbfvTh74v /opt/compiler-explorer/gcc-11.2.0/bin/../lib/gcc/x86_64-linux-gnu/11.2.0/../../../../x86_64-linux-gnu/bin/ld: /tmp/ccI0XFwh.o:(.rodata._ZTV1B[_ZTV1B]+0x18): undefined reference to `B::foo()' I don't think there's anything for GCC to do here. Maybe the linker should print a note after a missing vtable error saying that the key function needs to be defined, which is already suggested at PR 42540. See also https://gcc.gnu.org/wiki/VerboseDiagnostics#missing_vtable *** This bug has been marked as a duplicate of bug 42540 ***
[Bug c++/104918] New: Pass information to let the linker tell the user which virtual members are missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104918 Bug ID: 104918 Summary: Pass information to let the linker tell the user which virtual members are missing Product: gcc Version: 11.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: eyalroz1 at gmx dot com Target Milestone: ---

Consider the following program:

```
struct A {
    virtual void foo() { }
};

struct B : A {
    void foo() override;
};

int main() {
    B b;
}
```

this compiles, but fails to link: https://godbolt.org/z/Mzx3c7354

```
:10: undefined reference to `vtable for B'
```

which is fine, but - I'm annoyed the linker doesn't tell me which virtual member is missing. That might be an issue with the linker, but - is foo even a symbol in the compiled code? I tried compiling this into an object file and using objdump (on my GNU/Linux Devuan Chimaera), and got:

a.o: file format elf64-x86-64

SYMBOL TABLE:
ldf *ABS* a.cpp
ld .text .text
ld .data .data
ld .bss .bss
ld .note.GNU-stack .note.GNU-stack
ld .eh_frame .eh_frame
ld .comment .comment
g F .text 0016 main
*UND* vtable for B

So, no foo... and no way for the linker to be able to tell me what's missing.

I claim that GCC should expose information via the symbol table (or otherwise?) that would let ld tell me which virtual member it's missing.
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324 since r12-1202-g9080a3bf232978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #10 from Jakub Jelinek --- Fixed.
[Bug tree-optimization/104917] New: No runtime alias test required for dependent reductions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104917

Bug ID: 104917
Summary: No runtime alias test required for dependent reductions
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

The testcase from PR87561 shows a case where we have two in-memory reductions that are possibly dependent but the runtime alias check isn't needed since we only possibly re-order the summations in the reduction. Small C testcase:

void foo (double *x, double *y, double * __restrict a, double * __restrict b)
{
  for (int i = 0; i < 1024; ++i)
    {
      x[i] += a[i];
      y[i] += b[i];
    }
}

here x[] and y[] are dependent but we can vectorize this just fine with -fassociative-math, eliding the runtime alias check.
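To illustrate the claim, here is a minimal C++ sketch, not GCC's vectorizer. The names `foo_scalar`, `foo_vf2` and `same_result` are hypothetical, and the two-lane loop is only a hand-written stand-in for vectorized code without an alias check. The inputs are small integers, so every double addition is exact and reordering the summations cannot change the stored values; with arbitrary doubles the results could differ by rounding, which is why -fassociative-math is required.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t N = 1024;

// Scalar reference: the loop from the testcase.
void foo_scalar(double *x, double *y, const double *a, const double *b) {
    for (std::size_t i = 0; i < N; ++i) {
        x[i] += a[i];
        y[i] += b[i];
    }
}

// Hand-written stand-in for a 2-lane vector loop with no alias check:
// both x lanes are loaded, updated and stored, then both y lanes.
void foo_vf2(double *x, double *y, const double *a, const double *b) {
    for (std::size_t i = 0; i < N; i += 2) {
        double x0 = x[i] + a[i], x1 = x[i + 1] + a[i + 1];
        x[i] = x0;
        x[i + 1] = x1;
        double y0 = y[i] + b[i], y1 = y[i + 1] + b[i + 1];
        y[i] = y0;
        y[i + 1] = y1;
    }
}

// Run both variants with y = x + offset (so x and y overlap) and
// report whether the final buffers are identical.
bool same_result(std::size_t offset) {
    std::vector<double> a(N), b(N);
    for (std::size_t i = 0; i < N; ++i) {
        a[i] = static_cast<double>(i % 7);
        b[i] = static_cast<double>(i % 5);
    }
    std::vector<double> buf1(N + offset, 1.0), buf2(N + offset, 1.0);
    foo_scalar(buf1.data(), buf1.data() + offset, a.data(), b.data());
    foo_vf2(buf2.data(), buf2.data() + offset, a.data(), b.data());
    return buf1 == buf2;
}
```

With an offset of 1, the scalar loop adds b[i] to an element before a[i+1], while the two-lane loop adds them in the opposite order; each element still receives the same set of addends.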
[Bug debug/104778] [12 Regression] ICE in simplify_subreg, at simplify-rtx.cc:7324 since r12-1202-g9080a3bf232978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104778 --- Comment #9 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:77eb0461abe61a85f69270048ad81b25b1cc95d6 commit r12-7644-g77eb0461abe61a85f69270048ad81b25b1cc95d6 Author: Jakub Jelinek Date: Mon Mar 14 14:49:09 2022 +0100 lra: Fix up debug_p handling in lra_substitute_pseudo [PR104778] The following testcase ICEs on powerpc-linux, because lra_substitute_pseudo substitutes (const_int 1) into a subreg operand. First a subreg of subreg of a reg appears in a debug insn (which surely is invalid outside of debug insns, but in debug insns we allow even what is normally invalid in RTL like subregs which the target doesn't like, because either dwarf2out is able to handle it, or we just throw away the location expression, making some var . lra_substitute_pseudo already has some code to deal with specifically SUBREG of REG with the REG being substituted for VOIDmode constant, but that doesn't cover this case, so the following patch extends lra_substitute_pseudo for debug_p mode to treat stuff like e.g. combiner's subst function to ensure we don't lose mode which is essential for the IL. 2022-03-14 Jakub Jelinek PR debug/104778 * lra.cc (lra_substitute_pseudo): For debug_p mode, simplify SUBREG, ZERO_EXTEND, SIGN_EXTEND, FLOAT or UNSIGNED_FLOAT if recursive call simplified the first operand into VOIDmode constant. * gcc.target/powerpc/pr104778.c: New test.
[Bug target/104912] [12 Regression] 416.gamess regression after r12-7612-g69619acd8d9b58
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912

--- Comment #4 from Richard Biener ---

I think for the case at hand no runtime alias checking is needed, since we have

      DO 30 MK=1,NOC
      DO 30 ML=1,MK
      MKL = MKL+1
      XPQKL(MPQ,MKL) = XPQKL(MPQ,MKL) +
     *      VAL1*(CO(MS,MK)*CO(MR,ML)+CO(MS,ML)*CO(MR,MK))
      XPQKL(MRS,MKL) = XPQKL(MRS,MKL) +
     *      VAL3*(CO(MQ,MK)*CO(MP,ML)+CO(MQ,ML)*CO(MP,MK))
   30 CONTINUE

so we're dealing with reductions which we can interleave (with -Ofast). Editing the source with !GCC$ ivdep reduces the vectorization penalty to 5% (we still need the niter/epilogue checks). It also shows that only fixing PR89755 isn't the solution we're looking for.

In the end the vectorization is unlikely to play out since V2DF is usually handled well by dual issue capabilities for DFmode arithmetic on modern archs. The only mitigation I can think of is realizing the outer inner loop niter is 0, 1, 2, .., NOC - 1 and thus the first outer iterations will have inner loop vectorization not profitable. But the question is what to do with this (not knowing the actual runtime values of NOC).

As PR87561 says "Note for 416.gamess it looks like NOC is just 5 but MPQ and MRS are so that there is no runtime aliasing between iterations most of the time (sometimes they are indeed equal). The cost model check skips the vector loop for MK == 2 and 3 and only will execute it for MK == 4 and 5. An alternative for this kind of loop nest would be to cost-model for MK % 2 == 0, thus requiring no epilogue loop."

In general applying no vectorization to these kinds of loops looks wrong. Versioning also the outer loop in addition to the inner loop in case the number of iterations evolves in the outer loop looks excessive (but would eventually help 416.gamess). Implementation-wise it's also non-trivial.
[Bug tree-optimization/91246] vectorization failure for a small loop to search array element
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91246 d_vampile changed: What|Removed |Added CC||d_vampile at 163 dot com --- Comment #6 from d_vampile --- (In reply to Jiangning Liu from comment #3) > Expect to vectorize the inner loop by generating the code below for x86, > > vpbroadcastd [mem], ymm0 > vpaddd [mem], ymm0, ymm1 > vpbroadcastd reg, ymm2 > vpcmpeqd ymm2, ymm1, k0 > kortestw k0, k0 > cmovne ... > > AArch64 should have vectorization instructions counterpart to implement the > same functionality. I see that on x86, the result of a vpcmpeqb comparison can be recorded with the vpmovmskb instruction. Is there a similar instruction on NEON for efficiently recording the result of a vectorized comparison? On x86, for example: .. vpcmpeqb %ymm0, %ymm1, %ymm0 vpmovmskb %ymm0, %ebx cmp 0x, %ebx ..
[Bug c/104913] [OpenMP] Bogus 'unused variable' with 'omp depobj'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104913 --- Comment #2 from Jakub Jelinek --- We also warn e.g. in void foo () { int x; #pragma omp task firstprivate(x) ; } case. To some extent, at least for data sharing and most other OpenMP clauses, the data sharing isn't really a kind of use; the variable there is still unused. The depend clause is borderline: it is neither a setter nor a use, and such a use is questionable; on the other hand, under the hood it takes the address of the variable and uses that for dependency purposes, so it is some kind of light use. Then again, e.g. the C FE build_external_ref function handles various other tasks, like -Wdeprecated or setting DECL_NONLOCAL, so perhaps we want to do it for all OpenMP clauses somewhere. Similarly, C++ mark_used can even instantiate stuff etc.
[Bug libstdc++/104875] libstdc++-v3/src/c++11/codecvt.cc:312:24: warning: left shift count >= width of type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104875 --- Comment #2 from CVS Commits --- The master branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:8f7b7c1495f92c72da154d32317943a2cc276ca8 commit r12-7643-g8f7b7c1495f92c72da154d32317943a2cc276ca8 Author: Jonathan Wakely Date: Fri Mar 11 14:52:38 2022 + libstdc++: Fix reading UTF-8 characters for 16-bit targets [PR104875] The current code in read_utf8_code_point assumes that integer promotion will create a 32-bit int, but that's not true for 16-bit targets like msp430 and avr. This changes the intermediate variables used for each octet from unsigned char to char32_t, so that (c << N) works correctly when N > 8. libstdc++-v3/ChangeLog: PR libstdc++/104875 * src/c++11/codecvt.cc (read_utf8_code_point): Use char32_t to hold octets that will be left-shifted.
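The idea behind the fix can be sketched as follows. This is not the libstdc++ code: `decode_utf8_3` is a hypothetical helper that handles only three-byte sequences and skips the overlong and surrogate checks a real decoder needs. It only shows why holding each octet in a `char32_t` matters: an `unsigned char` operand would be promoted to `int`, which is 16 bits wide on targets such as msp430 and avr, too narrow for the larger shift counts.

```cpp
// Decode one 3-byte UTF-8 sequence (U+0800..U+FFFF).
// Returns 0xFFFFFFFF if the bytes do not form such a sequence.
char32_t decode_utf8_3(const unsigned char *p) {
    // Keep each octet in a char32_t so that the shifts below are done in
    // a type of at least 32 bits, even where int is only 16 bits wide.
    char32_t c1 = p[0], c2 = p[1], c3 = p[2];
    if ((c1 & 0xF0) != 0xE0 || (c2 & 0xC0) != 0x80 || (c3 & 0xC0) != 0x80)
        return 0xFFFFFFFF;
    return ((c1 & 0x0F) << 12) | ((c2 & 0x3F) << 6) | (c3 & 0x3F);
}
```

For example, the bytes E2 82 AC decode to U+20AC.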
[Bug target/104916] [nvptx] Handle Independent Thread Scheduling for sm_70+ with -muniform-simt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916 --- Comment #1 from Tom de Vries --- We could try the same solution as for atomic: predicate ld/st to only execute in lane 0, and propagate ld result. Another solution might be to wrap each ld/st in two bar.warp.sync.
[Bug target/104916] New: [nvptx] Handle Independent Thread Scheduling for sm_70+ with -muniform-simt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104916

Bug ID: 104916
Summary: [nvptx] Handle Independent Thread Scheduling for sm_70+ with -muniform-simt
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---

The problem -muniform-simt is trying to address is to make sure that a register produced outside an OpenMP simd region is available when used in a lane inside an simd region.

The solution is to, outside an simd region, execute in all warp lanes, thus producing consistent values in result registers in each warp thread.

[ Note that this solution is: as-produced, asap. OpenACC has the same problem, but deals with it: as-needed, alap. ]

This approach doesn't work when executing in all warp lanes multiplies the side effects from 1 to 32 separate side effects, which is the case for, for instance, atomic insns. So atomic insns are rewritten to execute only in the master lane, and if there are any results, propagate those to the other threads in the warp.

[ And likewise for system calls malloc, free, vprintf. ]

[ The corresponding reorg pass nvptx_reorg_uniform_simt potentially rewrites all statements, be those inside or outside an simd region. But care is taken that the rewrite only has effect outside the simd region. ]

Now, take a non-atomic update: ld, add, store. The store has side effects; are those multiplied as well?

Now, pre-sm_70 we have the guarantee that warps execute in lockstep. So:
- the load will load the same value into the result register across the warp,
- the add will write the same value into the result register across the warp,
- the store will write the same value to the same memory location, 32 times, at once, having the result of a single store.

So, no side-effect multiplication (well, at least that's the observation).

Starting with sm_70, the threads in a warp are no longer guaranteed to execute in lockstep. Consequently, we can have the following execution trace:
- some threads load a value into the result register
- those threads do an add and write the result into the result register
- that result is stored
- the other threads arrive, and now load the now updated, thus different value into the result register
- the other threads do an add and write a different result into their result register
- the updated result is stored

So, we now have both the side effect multiplied and registers that are no longer in sync.
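The two execution traces can be modelled on the host with a small C++ sketch. This is a simulation only, not nvptx code: `WarpState`, `run_lockstep` and `run_split` are hypothetical names, the update is assumed to be an increment by 1, and the warp is assumed to diverge into exactly two groups at a lane index `split`.

```cpp
#include <array>

constexpr int kWarpSize = 32;

struct WarpState {
    int memory;                      // the single memory location being updated
    std::array<int, kWarpSize> reg;  // per-lane result register
};

// Pre-sm_70 model: every lane executes each instruction together.
WarpState run_lockstep(int initial) {
    WarpState s{initial, {}};
    for (int &r : s.reg) r = s.memory;  // ld in every lane
    for (int &r : s.reg) r += 1;        // add in every lane
    for (int r : s.reg) s.memory = r;   // st: 32 identical stores
    return s;
}

// sm_70+ model of the trace described above: lanes [0, split) run
// ld/add/st first, the remaining lanes run the same sequence afterwards.
WarpState run_split(int initial, int split) {
    WarpState s{initial, {}};
    auto run_group = [&s](int lo, int hi) {
        for (int l = lo; l < hi; ++l) s.reg[l] = s.memory;  // ld
        for (int l = lo; l < hi; ++l) s.reg[l] += 1;        // add
        for (int l = lo; l < hi; ++l) s.memory = s.reg[l];  // st
    };
    run_group(0, split);
    run_group(split, kWarpSize);
    return s;
}
```

Starting from 10, the lockstep model leaves 11 in memory and in every register. The split model leaves 12 in memory, with 11 in the registers of the first group and 12 in the rest, which is the multiplied side effect and the register divergence the report describes.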
[Bug c++/99331] [8/9/10 Regression] -Wconversion false-positive in immediate context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99331 Patrick Palka changed: What|Removed |Added CC||jens.maurer at gmx dot net --- Comment #11 from Patrick Palka --- *** Bug 80601 has been marked as a duplicate of this bug. ***
[Bug c++/80601] spurious -Wconversion warning with explicit class template arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80601 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #3 from Patrick Palka --- dup of PR99331 (which is fixed for GCC 10.4/11/12) *** This bug has been marked as a duplicate of bug 99331 ***
[Bug c/104913] [OpenMP] Bogus 'unused variable' with 'omp depobj'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104913 --- Comment #1 from Jakub Jelinek --- That isn't depobj related but depend clause related: void foo () { int x; #pragma omp task depend(inout: x) ; } warns as well.
[Bug target/104915] New: Miss optimization for vec_setv8hi_0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104915 Bug ID: 104915 Summary: Miss optimization for vec_setv8hi_0 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: crazylht at gmail dot com Target Milestone: --- #include __m128i foo (short* p) { return _mm_set_epi32 (0, 0, 0, (unsigned short) ((*(__m16_u *)p)[0])); } __m128i foo1 (short* p) { return _mm_set_epi16 (0, 0, 0, 0, 0, 0, 0, (*(__m16_u *)p)[0]); } under avx512fp16, foo could generate vmovw instead of movzx + vmovd, without avx512fp16 foo1 could generate movzx + movd instead of pxor + pinsrw.
[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 d_vampile changed: What|Removed |Added CC||d_vampile at 163 dot com --- Comment #48 from d_vampile --- (In reply to Jiu Fu Guo from comment #41) > (In reply to Wilco from comment #40) > > (In reply to Jiu Fu Guo from comment #39) > > > I’m thinking to draft a patch for this optimization. If any suggestions, > > > please point out, thanks. > > > > Which optimization to be precise? Besides unrolling I haven't seen a > > proposal for an optimization which is both safe and generally applicable. > > 1. For unroll, there are still branches in the loop. And then need careful > merge on those reading and comparison. Another thing about unroll would be > that, if we prefer to optimize this early in GIMPLE, we still not GIMPLE > unroll on it. > while (len != max) > { > if (p[len] != cur[len]) > break; ++len; > if (p[len] != cur[len]) > break; ++len; > if (p[len] != cur[len]) > break; ++len; > > } > > 2. Also thinking about if it makes sense to enhance GIMPLE vectorization > pass. In an aspect that using a vector to read and compare, also need to > handle/merge compares into vector compare and handle early exit carefully. > if (len + 8 < max && buffers not cross page) ///(p&4K) == (p+8)&4k? > 4k:pagesize > while (len != max) > { > vec a = xx p; > vec b = xx cur; > if (a != b) /// may not only for comparison > {;break;} > len += 8; > } > > 3. Introduce a new stand-alone pass to optimize reading/computing shorter > types into large(dword/vector) reading/computing. > > Thanks a lot for your comments/suggestions! Any progress or patches for the new pass mentioned in point 3? Or new ideas?
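Point 2 of the quoted proposal can be sketched by hand in C++. This is not a GCC transformation: `match_len_scalar` and `match_len_wide` are hypothetical names, and the wide variant assumes that both buffers are readable for the full `max` bytes. That assumption is exactly the safety question (reads crossing into an unmapped page) that the quoted comments raise, and the original byte loop does not need it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar reference: length of the common prefix of p and cur, capped at max.
std::size_t match_len_scalar(const unsigned char *p, const unsigned char *cur,
                             std::size_t max) {
    std::size_t len = 0;
    while (len != max && p[len] == cur[len])
        ++len;
    return len;
}

// Wide variant: compare 8 bytes per step while at least 8 remain,
// then locate the exact mismatch (or finish the tail) byte by byte.
std::size_t match_len_wide(const unsigned char *p, const unsigned char *cur,
                           std::size_t max) {
    std::size_t len = 0;
    while (max - len >= 8) {
        std::uint64_t a, b;
        std::memcpy(&a, p + len, 8);    // unaligned-safe 8-byte load
        std::memcpy(&b, cur + len, 8);
        if (a != b)
            break;                      // the mismatch is inside this block
        len += 8;
    }
    while (len != max && p[len] == cur[len])
        ++len;
    return len;
}
```

The early exit is kept by falling back to the byte loop for the block that differs, so both functions return the same length for any input that satisfies the readability assumption.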
[Bug target/104912] [12 Regression] 416.gamess regression after r12-7612-g69619acd8d9b58
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912

--- Comment #3 from Richard Biener ---

(In reply to Richard Biener from comment #2)
> PR87561 has a testcase and for it we pessimized strided loads & stores "a
> bit more" in r9-6581-g7d7d1ce83889ee and r9-6580-g0538ed1d3602ec

We're entering this CTOR cost pessimization with a cost of 4 now (down from 8), aka one sse_op, and multiply that by 3. I think it would be better to add TYPE_VECTOR_SUBPARTS times ->lea cost, though that would not help here obviously.

This cost pessimization is a (bad) workaround for the inability to handle PR89754 and PR89755.

Since we halved the CTOR cost we'd now need to apply that factor of two on top of the pessimization for strided loads/stores to recover. Since we only halved the CTOR case but not vec_to_scalar we get away with just doing that for load_vec_info_type.
[Bug target/104914] [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914 --- Comment #1 from Yangfl --- Original issue: https://github.com/matplotlib/matplotlib/issues/21789
[Bug target/104912] [12 Regression] 416.gamess regression after r12-7612-g69619acd8d9b58
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Last reconfirmed||2022-03-14 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #2 from Richard Biener --- PR87561 has a testcase and for it we pessimized strided loads & stores "a bit more" in r9-6581-g7d7d1ce83889ee and r9-6580-g0538ed1d3602ec
[Bug target/104912] [12 Regression] 416.gamess regression after r12-7612-g69619acd8d9b58
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104912

--- Comment #1 from Richard Biener ---

+mccas.fppized.f:3160:21: optimized: loop vectorized using 16 byte vectors
+mccas.fppized.f:3160:21: optimized: loop versioned for vectorization because of possible aliasing
+mccas.fppized.f:3195:21: optimized: loop vectorized using 16 byte vectors
+mccas.fppized.f:3195:21: optimized: loop versioned for vectorization because of possible aliasing
+mccas.fppized.f:3259:21: optimized: loop vectorized using 16 byte vectors
+mccas.fppized.f:3259:21: optimized: loop versioned for vectorization because of possible aliasing
+mccas.fppized.f:3304:21: optimized: loop vectorized using 16 byte vectors
+mccas.fppized.f:3304:21: optimized: loop versioned for vectorization because of possible aliasing
 mccas.fppized.f:2576:18: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:2524:17: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:3055:22: optimized: loop vectorized using 16 byte vectors
@@ -147,9 +155,11 @@
 mccas.fppized.f:1890:25: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:1859:20: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:1843:19: optimized: loop vectorized using 16 byte vectors
+mccas.fppized.f:1843:19: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:1737:17: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:1727:20: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:1714:19: optimized: loop vectorized using 16 byte vectors
+mccas.fppized.f:1714:19: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:884:24: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:904:33: optimized: basic block part vectorized using 16 byte vectors
 mccas.fppized.f:653:17: optimized: loop vectorized using 16 byte vectors
@@ -159,8 +169,11 @@
 mccas.fppized.f:1188:14: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:1188:14: optimized: loop versioned for vectorization because of possible aliasing
 mccas.fppized.f:522:72: optimized: basic block part vectorized using 16 byte vectors
+mccas.fppized.f:522:72: optimized: basic block part vectorized using 16 byte vectors
 mccas.fppized.f:2399:14: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:2399:14: optimized: loop versioned for vectorization because of possible aliasing
 mccas.fppized.f:2130:14: optimized: loop vectorized using 16 byte vectors
 mccas.fppized.f:2261:72: optimized: basic block part vectorized using 16 byte vectors
+mccas.fppized.f:2261:72: optimized: basic block part vectorized using 16 byte vectors
+mccas.fppized.f:2261:72: optimized: basic block part vectorized using 16 byte vectors

are the vectorization differences; the performance difference happens entirely in TWOTFF (lines 3209 and following).

+mccas.fppized.f:3304:21: optimized: loop vectorized using 16 byte vectors
+mccas.fppized.f:3304:21: optimized: loop versioned for vectorization because of possible aliasing

are the inner loops of

      DO 30 MK=1,NOC
      DO 30 ML=1,MK
      MKL = MKL+1
      XPQKL(MPQ,MKL) = XPQKL(MPQ,MKL) +
     *      VAL1*(CO(MS,MK)*CO(MR,ML)+CO(MS,ML)*CO(MR,MK))
      XPQKL(MRS,MKL) = XPQKL(MRS,MKL) +
     *      VAL3*(CO(MQ,MK)*CO(MP,ML)+CO(MQ,ML)*CO(MP,MK))
   30 CONTINUE

and the other similar copy. We are doing all strided loads and stores here but the vectorized code never executes; instead we just pay the overhead of the runtime alias test for each inner iteration (we'd ideally formulate it in a way including the outer iteration so we could version the outer loop instead). The runtime alias check is XPQKL(MPQ,MKL) vs. XPQKL(MRS,MKL) - an index check on MPQ should be invariant but I guess the situation is more complicated than that.

The cost model differences for this are

mccas.fppized.f:3304:21: note: Cost model analysis:
  Vector inside of loop cost: 552
  Vector prologue cost: 48
  Vector epilogue cost: 280
  Scalar iteration cost: 264
  Scalar outside cost: 8
  Vector outside cost: 328
  prologue iterations: 0
  epilogue iterations: 1
mccas.fppized.f:3304:21: missed: cost model: the vector iteration cost = 552 divided by the scalar iteration cost = 264 is greater or equal to the vectorization factor = 2.
mccas.fppized.f:3304:21: missed: not vectorized: vectorization not profitable.
mccas.fppized.f:3304:21: missed: not vectorized: vector version will never be profitable.
mccas.fppized.f:3304:21: missed: Loop costings may not be worthwhile.

vs.

mccas.fppized.f:3304:21: note: Cost model analysis:
  Vector inside of loop cost: 480
  Vector prologue cost: 48
  Vector epilogue cost: 280
  Scalar iteration cost: 264
  Scalar outside cost: 8
  Vector outside cost: 328
  prologue iterations: 0
  epilogue iterations: 1
  Calculated minimum iters for profitability: 4

where the V2DF vec_construct costs are reduced from 24 to
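The comparison named in the "missed: cost model" message can be checked by hand. The C++ sketch below is only a reading of that message, not GCC's source: `vector_iteration_can_pay_off` is a hypothetical name, and the division is rewritten as a multiplication so the check stays in integers.

```cpp
// One vector iteration replaces vf scalar iterations, so vectorizing can
// only pay off if vf scalar iterations cost more than one vector iteration.
// Equivalent to: NOT (vector_inside_cost / scalar_iteration_cost >= vf).
bool vector_iteration_can_pay_off(int vector_inside_cost,
                                  int scalar_iteration_cost, int vf) {
    return scalar_iteration_cost * vf > vector_inside_cost;
}
```

With the numbers from the two dumps and a vectorization factor of 2, the scalar side costs 264 * 2 = 528. Against a vector inside cost of 552 the check fails, matching "vector version will never be profitable"; against 480 it passes, which is where the second dump goes on to compute a minimum iteration count.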
[Bug c++/97198] __is_constructible(int[], int) should return true
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97198 --- Comment #7 from Jonathan Wakely --- (In reply to Zhihao Yuan from comment #5) > Encountered this today. In case I cannot show up when discussing LWG3486, my > use case is that C(in_place_type, a, b, c) should "just works." It's up > to C how to deal with it. In my case, it's new T[]. I was going to add a note to the issue, but I don't know what to add. What is C? Why wouldn't it work today? Why does std::is_constructible affect it?
[Bug target/104914] New: [MIPS] wrong comparison with scrabbled int value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104914

Bug ID: 104914
Summary: [MIPS] wrong comparison with scrabbled int value
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: mmyangfl at gmail dot com
Target Milestone: ---

GCC 12.0 (current git master, 80fcc4b) and 11 generate wrong instructions for this code. (older versions not tested)

$ mips64el-img-elf-gcc -mabi=64 -S -O1 -o - ~/a.c

#include
void test(const unsigned char *buf) {
    int val;
    ((unsigned char*)&val)[0] = *buf++;
    ((unsigned char*)&val)[1] = *buf++;
    ((unsigned char*)&val)[2] = *buf++;
    ((unsigned char*)&val)[3] = *buf++;
    if(val > 0)
        puts("a");
    else
        fputs("b", stderr);
}

int main() {
    test("\xff\xff\xff\xff");
}

// => "a"

Generated asm code in question:

test:
        .frame  $sp,16,$31      # vars= 0, regs= 1/0, args= 0, gp= 0
        .mask   0x8000,-8
        .fmask  0x,0
        .set    noreorder
        .set    nomacro
        daddiu  $sp,$sp,-16
        sd      $31,8($sp)
        lbu     $3,0($4)
        move    $2,$0
        dins    $2,$3,0,8
        lbu     $3,1($4)
        dins    $2,$3,8,8
        lbu     $3,2($4)
        dins    $2,$3,16,8
        lbu     $3,3($4)
        dins    $2,$3,24,8
        blezc   $2,.L2          // signed extending $2 missing!
        lui     $4,%highest(.LC0)
        lui     $2,%hi(.LC0)
        daddiu  $4,$4,%higher(.LC0)
        daddiu  $2,$2,%lo(.LC0)
        dsll    $4,$4,32
        daddu   $4,$4,$2
        balc    puts
        ld      $31,8($sp)
.L5:
        daddiu  $sp,$sp,16
        jrc     $31
.L2:
        ld      $2,%gp_rel(_impure_ptr)($28)
        ld      $5,24($2)
        li      $4,98           # 0x62
        balc    fputc
        b       .L5
        ld      $31,8($sp)

Below are my attempts to fix this bug:

-fdump-final-insns gives the following statement:

(jump_insn # 0 0 (set (pc)
        (if_then_else (le (reg:SI 2 $2 [orig:201 val ] [201])
                (const_int 0 [0]))
            (label_ref #)
            (pc))) "/home/ding/a.c":8:5# {*branch_ordersi}
     (expr_list:REG_DEAD (reg:SI 2 $2 [orig:201 val ] [201])
        (int_list:REG_BR_PROB 440234148 (nil)))
 -> 2)

After manually `icode != CODE_FOR_cbranchsi4` in gcc/gcc/optabs.cc:4501, combine pass still combines them back, but the machine description simply defines "cbranch4" for all cbranch family.

I wonder: since MIPS64 can't really do comparisons over partial registers, is this RTL valid?
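The effect of the missing sign extension can be reproduced on any host with a short C++ model. This is an illustration, not MIPS code: `load_bytes`, `positive_as_int32` and `positive_as_zero_extended64` are hypothetical names, and the last function assumes that the branch tests the full 64-bit register after the byte inserts have left its upper half zero.

```cpp
#include <cstdint>
#include <cstring>

// Assemble a 32-bit value from four bytes, as the testcase does through
// an unsigned char* view of `val`.
std::int32_t load_bytes(const unsigned char *buf) {
    std::int32_t val;
    std::memcpy(&val, buf, sizeof val);
    return val;
}

// What the source asks for: a signed 32-bit comparison.
bool positive_as_int32(std::int32_t val) {
    return val > 0;
}

// Model of the reported miscompile: the 32 bits sit zero-extended in a
// 64-bit register and the comparison looks at the whole register.
bool positive_as_zero_extended64(std::int32_t val) {
    std::int64_t reg =
        static_cast<std::int64_t>(static_cast<std::uint32_t>(val));
    return reg > 0;
}
```

For four 0xff bytes the value is -1, so the 32-bit comparison is false, while the zero-extended 64-bit view holds 0xffffffff and compares as positive, which corresponds to the wrong "a" output in the report.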
[Bug target/104910] [10/11/12 Regression] ICE: internal consistency failure (error: invalid rtl sharing found in the insn)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104910 Jakub Jelinek changed: What|Removed |Added Last reconfirmed||2022-03-14 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Jakub Jelinek --- Created attachment 52623 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52623&action=edit gcc12-pr104910.patch Untested fix.
[Bug c++/104905] untranslated word in diagnostic about compiled module
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104905 Jonathan Wakely changed: What|Removed |Added Last reconfirmed||2022-03-14 Status|UNCONFIRMED |NEW Ever confirmed|0 |1
[Bug c/104913] New: [OpenMP] Bogus 'unused variable' with 'omp depobj'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104913 Bug ID: 104913 Summary: [OpenMP] Bogus 'unused variable' with 'omp depobj' Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: diagnostic, openmp Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: burnus at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- The following C code shows the warning foo.c:6:7: warning: unused variable ‘x’ [-Wunused-variable] 6 | int x; | ^ but the variable is used in 'depend(inout: x)'. (No warning for the Fortran code, which is fine. Found when looking at https://github.com/SOLLVE/sollve_vv/pull/493 which then uses obj with omp_target_memcpy_async.) --- #include void foo () { int x; omp_depend_t obj; #pragma omp depobj(obj) depend(inout: x) } --- subroutine foo use omp_lib implicit none integer :: x integer(omp_depend_kind) obj !$omp depobj(obj) depend(inout: x) end
[Bug fortran/104888] diagnostics use non-idiomatic '%s'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104888 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment #6 from Martin Liška --- (In reply to Roland Illig from comment #0) > fortran/openmp.cc says: > > selector '%s' not allowed for context selector set '%s' at %C > > One year ago, the message contained the idiomatic %qs. Why was it changed in > ae3c4e521dd0b66db712639298cd08331d62f315? g:ae3c4e521dd0b66db712639298cd08331d62f315