[Bug c/100854] TS 18661-3 and backwards-incompatible setting of __FLT_EVAL_METHOD__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100854

--- Comment #3 from Hongtao.liu ---
I'm testing

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index a25d59fa77b..4dab4d60773 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8842,6 +8842,10 @@ excess_precision_mode_join (enum flt_eval_method x,
       || y == FLT_EVAL_METHOD_UNPREDICTABLE)
     return FLT_EVAL_METHOD_UNPREDICTABLE;
 
+  /* FLT_EVAL_METHOD only accepts negative values, 0, 1 or 2, but
+     FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 is 16.  */
+  if (x == y && x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
+    return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
   /* GCC only supports one interchange type right now, _Float16.  If we're
      evaluating _Float16 in 16-bit precision, then flt_eval_method will be
      FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16.  */
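For readers who want to see what the added test changes, here is a minimal standalone sketch of the join logic in C. It is not the GCC source: the name join_sketch and the numeric enum values are assumptions made for this example, and the final "take the larger" rule is a guess at the part of the function the patch does not show.

```c
/* Illustrative only.  Enum values are assumed; 16 for the _Float16
   method is the value reported in this bug.  */
enum flt_eval_method
{
  FLT_EVAL_METHOD_UNPREDICTABLE = -1,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT = 0,
  FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE = 1,
  FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE = 2,
  FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 = 16
};

enum flt_eval_method
join_sketch (enum flt_eval_method x, enum flt_eval_method y)
{
  /* Unpredictable on either side wins, as in the patch context.  */
  if (x == FLT_EVAL_METHOD_UNPREDICTABLE
      || y == FLT_EVAL_METHOD_UNPREDICTABLE)
    return FLT_EVAL_METHOD_UNPREDICTABLE;

  /* The proposed addition: never report 16, fall back to 0.  */
  if (x == y && x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
    return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;

  /* If only one side is the _Float16 method, the other side decides.  */
  if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
    return y;
  if (y == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
    return x;

  /* Assumption: otherwise the wider method is chosen.  */
  return x > y ? x : y;
}
```

With this shape the join can no longer yield 16, so a header that only knows -1, 0, 1 and 2 would keep working.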
Re: _Float16-related failures on x86_64-apple-darwin
GCC defines __FLT_EVAL_METHOD__ via

  builtin_define_with_int_value ("__FLT_EVAL_METHOD__",
                                 c_flt_eval_method (true));

and I guess we need to handle things like:

   /* GCC only supports one interchange type right now, _Float16.  If we're
      evaluating _Float16 in 16-bit precision, then flt_eval_method will be
      FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16.  */
+  if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
+      && x == y)
+    return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
   if (x == FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16)
     return y;

I'm testing the patch, but it still needs approval from the relevant maintainers.

On Fri, Dec 24, 2021 at 7:15 AM FX via Gcc wrote:
>
> > I’m not sure what the fix should be, either. We could use fixinclude to
> > make the darwin headers happy, but we don’t really have a macro to provide
> > the right value. Like a __FLT_EVAL_METHOD_OLDSTYLE__ macro.
> >
> > What should be the float_t and double_t types for FLT_EVAL_METHOD == 16?
> > float and double, if I understand right?
>
> This is one possibility, assuming I am right about the types:
>
> diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def
> index 46e3b8c993a..bea85ef7367 100644
> --- a/fixincludes/inclhack.def
> +++ b/fixincludes/inclhack.def
> @@ -1767,6 +1767,18 @@ fix = {
>      test_text = ""; /* Don't provide this for wrap fixes. */
>  };
>
> +/* The darwin headers don't accept __FLT_EVAL_METHOD__ == 16.
> +*/
> +fix = {
> +    hackname  = darwin_flt_eval_method;
> +    mach      = "*-*-darwin*";
> +    files     = math.h;
> +    select    = "^#if __FLT_EVAL_METHOD__ == 0$";
> +    c_fix     = format;
> +    c_fix_arg = "#if __FLT_EVAL_METHOD__ == 0 || __FLT_EVAL_METHOD__ == 16";
> +    test_text = "#if __FLT_EVAL_METHOD__ == 0";
> +};
> +
>  /*
>   *  Fix on Digital UNIX V4.0:
>   *  It contains a prototype for a DEC C internal asm() function,
>
>
> Sucks to have to fix headers… and we certainly can’t fix people’s code that
> may depend on __FLT_EVAL_METHOD__ have well-defined values. So not convinced
> this is the right approach.
>
> FX

--
BR,
Hongtao
[Bug tree-optimization/38943] Optimization removes trapping instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38943 Andrew Pinski changed: What|Removed |Added Last reconfirmed|2021-07-20 00:00:00 |2021-12-23 --- Comment #2 from Andrew Pinski --- I think the test is invalid as you can still remove trapping math if the result is not used; even with FENV access implemented.
[Bug testsuite/82390] gcc.dg/torture tests run with same optimization level
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82390

Andrew Pinski changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-12-24

--- Comment #6 from Andrew Pinski ---
Confirmed, this is the current list:

apinski@xeond:~/src/upstream-gcc/gcc/gcc/testsuite$ git grep dg-options gcc.dg/torture gcc.c-torture/ g++.dg/torture/ c-c++-common/torture/ |grep -- -O
g++.dg/torture/pr33134.C:/* { dg-options "-O2" } */
g++.dg/torture/pr36745.C:/* { dg-options "-O2 -fPIC -Wno-return-type" } */
g++.dg/torture/pr39259.C:// { dg-options "-O2" }
g++.dg/torture/pr48954.C:/* { dg-options "-O2 -flto -fno-early-inlining -fkeep-inline-functions" } */
g++.dg/torture/pr51903.C:// { dg-options "-O2 -fnon-call-exceptions -fno-guess-branch-probability" }
g++.dg/torture/pr58201_1.C:/* { dg-options "-O2" } */
g++.dg/torture/pr64988.C:// { dg-options "-O -fdeclone-ctor-dtor" }
g++.dg/torture/pr81462.C:// { dg-options "-O1 -fno-ipa-pure-const" }
g++.dg/torture/pr83718.C:/* { dg-options "-O2 -std=c++11" } */
gcc.c-torture/compile/asmgoto-4.c:/* { dg-options "-O0 -fdump-rtl-reload" } */
gcc.c-torture/compile/pr48641.c:/* { dg-options "-O -fno-tree-ccp -fno-tree-copy-prop" } */
gcc.c-torture/compile/pr69102.c:/* { dg-options "-Og -fPIC -fschedule-insns2 -fselective-scheduling2 -fno-tree-fre --param=max-sched-extend-regions-iters=10" } */
gcc.c-torture/compile/pr72749.c:/* { dg-options "-O2 -fsched2-use-superblocks" } */
gcc.c-torture/compile/pr83575.c:/* { dg-options "-O2 -funroll-loops -fno-tree-dominator-opts -fno-tree-loop-im -fno-code-hoisting -fno-tree-pre -fno-guess-branch-probability" } */
gcc.c-torture/compile/pr85401.c:/* { dg-options "-O2" } */
gcc.c-torture/execute/pr68381.c:/* { dg-options "-O -fexpensive-optimizations -fno-tree-bit-ccp" } */
gcc.c-torture/execute/pr68390.c:/* { dg-options "-O2" } */
gcc.c-torture/execute/pr68532.c:/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model" } */
gcc.dg/torture/builtin-sprintf.c: { dg-options "-O2 -Wall" }
gcc.dg/torture/cris-asm-mof-1.c:/* { dg-options "-O2 -march=v10" } */
gcc.dg/torture/pr36244.c:/* { dg-options "-O3 -ftree-parallelize-loops=4" } */
gcc.dg/torture/pr68906.c:/* { dg-options "-O3" } */
gcc.dg/torture/pr70935.c:/* { dg-options "-O3 -g" } */
gcc.dg/torture/pr77916.c:/* { dg-options "-O3 -Wno-int-conversion" } */
gcc.dg/torture/pr77937-1.c:/* { dg-options "-O3" } */
gcc.dg/torture/pr77937-2.c:/* { dg-options "-O3" } */
gcc.dg/torture/pr98289.c:/* { dg-options "-O2 -freorder-blocks-and-partition" } */

I do think the patch is wrong, as mentioned; rather, it should test whether one of the options was a -O* option.
[Bug target/60480] gcc 4.8.2 fails to do optimization on global register variables when compiling on x86_64 Linux.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60480

Andrew Pinski changed:

           What    |Removed                     |Added
      Known to work|                            |7.1.0
   Target Milestone|---                         |9.0
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Andrew Pinski ---
GCC 9 is:
        movq    %r14, -8(%r15)
        movq    %r14, %rax
        leaq    -8(%r15), %r14
        movq    %r14, %r15
        movq    %rax, %r14
        addq    $8, %r15

So fixed for GCC 9.
[Bug target/50135] Loop optimization.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50135 Andrew Pinski changed: What|Removed |Added Resolution|--- |WONTFIX Status|SUSPENDED |RESOLVED --- Comment #4 from Andrew Pinski --- As mentioned, the loop instruction is not very useful at all on any modern processor, so closing as won't fix.
[Bug lto/53777] [lto] lto does not propagate optimization flags from command lines given at "compilation time"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53777 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |12.0 Resolution|--- |FIXED Keywords||lto Status|WAITING |RESOLVED --- Comment #5 from Andrew Pinski --- I think the problem listed here is fully fixed on the trunk (there have been many improvements over time to get this fixed, some as recent as r12-5920 [PR 103515]).
[Bug tree-optimization/39094] loop_niter_by_eval should deal with [i_1]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39094 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66718 --- Comment #1 from Andrew Pinski --- Hmm, do we need this any more after gimple-laddress (PR66718) was added?
[Bug c/100854] TS 18661-3 and backwards-incompatible setting of __FLT_EVAL_METHOD__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100854 Francois-Xavier Coudert changed: What|Removed |Added CC||fxcoudert at gcc dot gnu.org --- Comment #2 from Francois-Xavier Coudert --- This affects x86_64-apple-darwin: https://gcc.gnu.org/pipermail/gcc/2021-December/237959.html The system header errors out on any value of __FLT_EVAL_METHOD__ that is not -1, 0, 1, or 2.
Re: [PATCH] driver: Improve option diagnostics [PR103465]
On 12/10/2021 3:04 AM, Martin Liška wrote:
> It happens that options are parsed and various diagnostics happen in
> finish_options. That's a proper place as the function is also called for
> optimize/target attributes (pragmas). However, it is possible that the
> target overwrites an option from the command line and so the diagnostics
> do not happen. That's fixed in the patch.
>
> - options are parsed and finish_options is called:
>
>   if (opts->x_flag_unwind_tables
>       && !targetm_common.unwind_tables_default
>       && opts->x_flag_reorder_blocks_and_partition
>       && (ui_except == UI_SJLJ || ui_except >= UI_TARGET))
>     {
>       if (opts_set->x_flag_reorder_blocks_and_partition)
>         inform (loc,
>                 "%<-freorder-blocks-and-partition%> does not support "
>                 "unwind info on this architecture");
>       opts->x_flag_reorder_blocks_and_partition = 0;
>       opts->x_flag_reorder_blocks = 1;
>     }
>
> It's not triggered because opts->x_flag_unwind_tables is false by default,
> but the option is overwritten in the target:
>
>   ...
>   if (TARGET_64BIT_P (opts->x_ix86_isa_flags))
>     {
>       if (opts->x_optimize >= 1)
>         SET_OPTION_IF_UNSET (opts, opts_set, flag_omit_frame_pointer,
>                              !USE_IX86_FRAME_POINTER);
>       if (opts->x_flag_asynchronous_unwind_tables
>           && TARGET_64BIT_MS_ABI)
>         SET_OPTION_IF_UNSET (opts, opts_set, flag_unwind_tables, 1);
>   ...
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> 	PR driver/103465
>
> gcc/ChangeLog:
>
> 	* opts.c (finish_options): Move part of diagnostics to ...
> 	(diagnose_options): ... here. Call the function from both
> 	finish_options and process_options.
> 	* opts.h (diagnose_options): Declare.
> 	* toplev.c (process_options): Call diagnose_options.

OK.
Jeff
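The ordering problem described in the patch mail can be modelled in a few lines of C. This is only a toy: struct opts_sketch and every function name below are invented for the illustration, and the real check involves more conditions than the two flags kept here.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for the option state.  */
struct opts_sketch
{
  bool unwind_tables;
  bool reorder_blocks_and_partition;
  int notes;   /* how many times the note would have been printed */
};

/* The quoted check, reduced to the two flags that matter here.  */
static void
diagnose_sketch (struct opts_sketch *o)
{
  if (o->unwind_tables && o->reorder_blocks_and_partition)
    {
      o->notes++;
      o->reorder_blocks_and_partition = false;
    }
}

/* Stand-in for the target code that enables unwind tables late.  */
static void
target_override_sketch (struct opts_sketch *o)
{
  o->unwind_tables = true;
}

/* Old order: the check runs only before the target override.  */
int
notes_old_order (void)
{
  struct opts_sketch o = { false, true, 0 };
  diagnose_sketch (&o);
  target_override_sketch (&o);
  return o.notes;
}

/* Patched order: the check runs again after the override.  */
int
notes_new_order (void)
{
  struct opts_sketch o = { false, true, 0 };
  diagnose_sketch (&o);
  target_override_sketch (&o);
  diagnose_sketch (&o);
  return o.notes;
}
```

In the old order the note is never counted because the flag is still false when the check runs; running the check a second time after the override is what the split into diagnose_options appears to achieve.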
[Bug fortran/102595] ICE in var_element, at fortran/decl.c:298 since r10-5607-gde89b5748d68b76b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102595

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
  Attachment #52053|0                           |1
        is obsolete|                            |

--- Comment #5 from kargl at gcc dot gnu.org ---
Created attachment 52054
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52054&action=edit
new patch

This is a better patch and now deals with the legal code

  program foo
     complex a
     data a%re, a%im /1., 2./
     print *, a%re, a%im
  end program foo

and the invalid code

  program foo
     complex a
     data a%re, a%re /1., 2./
     print *, a%re, a%im
  end program foo
Re: [PATCH] enable -Winvalid-memory-order for C++ [PR99612]
On 12/8/2021 9:49 AM, Martin Sebor via Gcc-patches wrote: Even with -Wno-system-headers enabled, the -Winvalid-memory-order code tries to make sure calls to atomic functions with invalid memory orders are diagnosed even though the C atomic functions are defined as macros in the system header. The warning triggers at all optimization levels, including -O0. Independently, the core diagnostic enhancements implemented earlier this year (the warning group control) enable warnings for functions defined in system headers that are inlined into user code. This was done for similar reason as above: because it's desirable to diagnose invalid calls made from user code to system functions (e.g., buffer overflows, invalid or mismatched deallocations, etc.) However, the C macro solution interferes with the code diagnostic changes and prevents the invalid memory model warnings from being issued for the same problems in C++. In addition, because C++ atomics are ordinary (inline) functions that call the underlying __atomic_xxx built-ins, the invalid memory orders can only be detected with both inlining and constant propagation enabled. The attached patch removes these limitations and enables -Winvalid-memory-order to trigger even for C++ std::atomic, (almost) just like it does in C, at all optimization levels including -O0. To make that possible I had to move -Winvalid-memory-order from builtins.c to a GIMPLE pass where it can use context-sensitive range info at -O0, instead of relying on constant propagation (only available at -O1 and above). Although the same approach could be used to emit better object code for C++ atomics at -O0 (i.e., use the right memory order instead of dropping to seq_cst), this patch doesn't do that.) In addition to enabling the warning I've also enhanced it to include the memory models involved in the diagnosed call (both the problem ones and the viable alternatives). Tested on x86_64-linux. 
Jonathan, I CC you for two reasons: a) because this solution is based on your (as well as my own) preference for handling C++ system headers, and because of our last week's discussion of the false positives in std::string resulting from the same choice there. I don't anticipate this change to lead to the same fallout because it's unlikely for GCC to synthesize invalid memory orders out of thin air; and b) because the current solution can only detect the problems in calls to atomic functions at -O0 that are declared with attribute always_inline. This includes member functions defined in the enclosing atomic class but not namespace-scope functions. To make the detection possible those would also have to be always_inline. If that's a change you'd like to see I can look into making it happen. Martin gcc-99612.diff PR middle-end/99612 - Remove "#pragma GCC system_header" from atomic file to warn on incorrect memory order gcc/ChangeLog: PR middle-end/99612 * builtins.c (get_memmodel): Move warning code to gimple-ssa-warn-access.cc. (expand_builtin_atomic_compare_exchange): Same. (expand_ifn_atomic_compare_exchange): Same. (expand_builtin_atomic_load): Same. (expand_builtin_atomic_store): Same. (expand_builtin_atomic_clear): Same. * doc/extend.texi (__atomic_exchange_n): Update valid memory models. * gimple-ssa-warn-access.cc (memmodel_to_uhwi): New function. (struct memmodel_pair): New struct. (memmodel_name): New function. (pass_waccess::maybe_warn_memmodel): New function. (pass_waccess::check_atomic_memmodel): New function. (pass_waccess::check_atomic_builtin): Handle memory model. * input.c (expansion_point_location_if_in_system_header): Return original location if expansion location is in a system header. gcc/testsuite/ChangeLog: PR middle-end/99612 * c-c++-common/pr83059.c: Adjust text of expected diagnostics. * gcc.dg/atomic-invalid-2.c: Same. * gcc.dg/atomic-invalid.c: Same. * c-c++-common/Winvalid-memory-model.c: New test. 
* g++.dg/warn/Winvalid-memory-model-2.C: New test. * g++.dg/warn/Winvalid-memory-model.C: New test. Probably larger than I would have liked for a stage3 submitted bugfix. But it looks reasonable and as you mentioned, I think the potential for fallout is relatively small. OK. jeff
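As a concrete reminder of what "invalid memory order" means for the built-ins discussed in this patch, here is a small C sketch. The __atomic_store_n and __atomic_load_n built-ins and the __ATOMIC_* macros are GCC's; the helper names are made up for this example, and the lists of accepted orders follow the GCC manual's description of those two built-ins.

```c
#include <stdbool.h>

/* Orders that make sense for an atomic store.  */
bool
valid_store_order (int order)
{
  return order == __ATOMIC_RELAXED
         || order == __ATOMIC_RELEASE
         || order == __ATOMIC_SEQ_CST;
}

/* Orders that make sense for an atomic load.  */
bool
valid_load_order (int order)
{
  return order == __ATOMIC_RELAXED
         || order == __ATOMIC_CONSUME
         || order == __ATOMIC_ACQUIRE
         || order == __ATOMIC_SEQ_CST;
}

/* A store with a valid order.  Writing __ATOMIC_ACQUIRE here instead
   is the kind of call the warning is meant to flag.  */
void
store_release (int *p, int v)
{
  __atomic_store_n (p, v, __ATOMIC_RELEASE);
}

/* A load with a valid order.  */
int
load_acquire (int *p)
{
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}
```

In C++ the same mistake would be spelled with std::atomic member functions, which is the case the patch sets out to diagnose without relying on inlining and constant propagation.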
Re: _Float16-related failures on x86_64-apple-darwin
> I’m not sure what the fix should be, either. We could use fixinclude to make
> the darwin headers happy, but we don’t really have a macro to provide the
> right value. Like a __FLT_EVAL_METHOD_OLDSTYLE__ macro.
>
> What should be the float_t and double_t types for FLT_EVAL_METHOD == 16?
> float and double, if I understand right?

This is one possibility, assuming I am right about the types:

diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def
index 46e3b8c993a..bea85ef7367 100644
--- a/fixincludes/inclhack.def
+++ b/fixincludes/inclhack.def
@@ -1767,6 +1767,18 @@ fix = {
     test_text = ""; /* Don't provide this for wrap fixes. */
 };
 
+/* The darwin headers don't accept __FLT_EVAL_METHOD__ == 16.
+*/
+fix = {
+    hackname  = darwin_flt_eval_method;
+    mach      = "*-*-darwin*";
+    files     = math.h;
+    select    = "^#if __FLT_EVAL_METHOD__ == 0$";
+    c_fix     = format;
+    c_fix_arg = "#if __FLT_EVAL_METHOD__ == 0 || __FLT_EVAL_METHOD__ == 16";
+    test_text = "#if __FLT_EVAL_METHOD__ == 0";
+};
+
 /*
  *  Fix on Digital UNIX V4.0:
  *  It contains a prototype for a DEC C internal asm() function,

Sucks to have to fix headers… and we certainly can’t fix people’s code that may depend on __FLT_EVAL_METHOD__ have well-defined values. So not convinced this is the right approach.

FX
Re: _Float16-related failures on x86_64-apple-darwin
Hi, > See https://gcc.gnu.org/bugzilla//show_bug.cgi?id=100854 . I found that, indeed, but what I struggle to see is: this behaviour of __FLT_EVAL_METHOD__ has been around for several years now, so why aren’t there more tests failing? I’m not sure what the fix should be, either. We could use fixinclude to make the darwin headers happy, but we don’t really have a macro to provide the right value. Like a __FLT_EVAL_METHOD_OLDSTYLE__ macro. What should be the float_t and double_t types for FLT_EVAL_METHOD == 16? float and double, if I understand right? FX
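To make the float_t/double_t question concrete, the mapping can be written as a small C table. The cases for -1, 0, 1 and 2 follow the darwin header quoted in this thread; treating 16 like 0 is only the guess voiced in the mail, not a settled answer, and the enum and function names are invented for the sketch.

```c
/* Which real type a typedef would name.  */
enum type_kind
{
  KIND_UNSUPPORTED,
  KIND_FLOAT,
  KIND_DOUBLE,
  KIND_LONG_DOUBLE
};

/* float_t for a given __FLT_EVAL_METHOD__ value.  */
enum type_kind
float_t_kind (int method)
{
  switch (method)
    {
    case 0:
    case 16:   /* assumption: _Float16 evaluation leaves float alone */
      return KIND_FLOAT;
    case 1:
      return KIND_DOUBLE;
    case 2:
    case -1:
      return KIND_LONG_DOUBLE;
    default:
      return KIND_UNSUPPORTED;
    }
}

/* double_t for a given __FLT_EVAL_METHOD__ value.  */
enum type_kind
double_t_kind (int method)
{
  switch (method)
    {
    case 0:
    case 1:
    case 16:   /* same assumption as above */
      return KIND_DOUBLE;
    case 2:
    case -1:
      return KIND_LONG_DOUBLE;
    default:
      return KIND_UNSUPPORTED;
    }
}
```

Without the two `case 16` labels the sketch behaves like the current darwin header, where 16 falls through to the unsupported branch.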
Re: [PATCH v2] tree-optimization/103759: Truncate unknown to sizetype on compare
On 12/17/2021 2:42 PM, Siddhesh Poyarekar wrote: Since all computations in tree-object-size are now done in sizetype and not HOST_WIDE_INT, comparisons after conversion to HOST_WIDE_INT would be incorrect. Instead, truncate unknown (object_size_type) to sizetype to compare with the computed size to evaluate if it is unknown. gcc/ChangeLog: PR tree-optimization/103759 * tree-object-size (unknown, initval): Change to arrays. Adjust all uses. (init_limits): Rename from init_offset_limit. Initialize UNKNOWN and INITVAL. Adjust all uses. OK jeff
gcc-9-20211223 is now available
Snapshot gcc-9-20211223 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/9-20211223/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 9 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-9 revision e08ac1a01b18076520c12ab0997dd59d6b0b7ad8

You'll find:

 gcc-9-20211223.tar.xz                Complete GCC

  SHA256=3471eb642fd65261af439eb32fbddf9ad1b3ca395305e392f391392f2a1a5b20
  SHA1=5b7d79c3d451206aced06b0dd6013c237debaf4b

Diffs from 9-20211216 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-9
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
[Bug middle-end/80929] [9/10/11/12 Regression] Division with constant no more optimized to mult highpart
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80929 Roger Sayle changed: What|Removed |Added Status|NEW |RESOLVED CC||roger at nextmovesoftware dot com Resolution|--- |FIXED Target Milestone|9.5 |11.3 --- Comment #17 from Roger Sayle --- According to godbolt, with the example from comment #6, this appears to have been fixed in gcc 11.1, but was still present (generated calls to __divmodhi4) back in gcc 10.3. Likewise the original example in comment #1 has been fixed since gcc 5.4, but was broken in gcc 4.6.4.
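For anyone unfamiliar with the transformation named in the bug title, here is a hand-written C sketch of division by a constant done as a multiply followed by taking the high part. The divisor 3, the 16-bit width and the function name are chosen for illustration only; they are not taken from the test cases in the bug.

```c
#include <stdint.h>

/* x / 3 for 16-bit unsigned x, without a division instruction.
   0xAAAB is ceil(2^17 / 3); the 32-bit product cannot overflow
   because 65535 * 0xAAAB is below 2^32.  */
uint16_t
div3_u16 (uint16_t x)
{
  return (uint16_t) (((uint32_t) x * 0xAAABu) >> 17);
}
```

When the compiler performs this rewrite itself, a target without a hardware divider no longer needs a library call such as the __divmodhi4 mentioned above.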
Re: _Float16-related failures on x86_64-apple-darwin
On Thu, Dec 23, 2021, 14:24 FX via Gcc wrote:

> Hi,
>
> Some recently introduced tests have been failing for several weeks on
> x86_64-apple-darwin.
>
> FAIL: gcc.target/i386/cond_op_maxmin__Float16-1.c
> FAIL: gcc.target/i386/pr102464-copysign-1.c
> FAIL: gcc.target/i386/pr102464-fma.c
> FAIL: gcc.target/i386/pr102464-maxmin.c
> FAIL: gcc.target/i386/pr102464-sqrtph.c
> FAIL: gcc.target/i386/pr102464-sqrtsh.c
> FAIL: gcc.target/i386/pr102464-vrndscaleph.c
>
> In all cases the symptom is the same: the include of <math.h> errors out
> with “Unsupported value of __FLT_EVAL_METHOD__”. It appears that the
> compile option -mavx512fp16 defines __FLT_EVAL_METHOD__ to have value 16,
> which is not understood by darwin:
>
> > /* Define float_t and double_t per C standard, ISO/IEC 9899:2011 7.12 2,
> > taking advantage of GCC's __FLT_EVAL_METHOD__ (which a compiler may
> > define anytime and GCC does) that shadows FLT_EVAL_METHOD (which a
> > compiler must define only in float.h).
> > */
> > #if __FLT_EVAL_METHOD__ == 0
> > typedef float float_t;
> > typedef double double_t;
> > #elif __FLT_EVAL_METHOD__ == 1
> > typedef double float_t;
> > typedef double double_t;
> > #elif __FLT_EVAL_METHOD__ == 2 || __FLT_EVAL_METHOD__ == -1
> > typedef long double float_t;
> > typedef long double double_t;
> > #else /* __FLT_EVAL_METHOD__ */
> > # error "Unsupported value of __FLT_EVAL_METHOD__."
> > #endif /* __FLT_EVAL_METHOD__ */
>
>
> Is the use of __FLT_EVAL_METHOD__ set to 16 supposed to be portable across
> all targets? Or is it linux-only, and should it be marked as such?
>

See https://gcc.gnu.org/bugzilla//show_bug.cgi?id=100854 .

> Thanks for any help you can give.
>
> FX
_Float16-related failures on x86_64-apple-darwin
Hi,

Some recently introduced tests have been failing for several weeks on x86_64-apple-darwin.

FAIL: gcc.target/i386/cond_op_maxmin__Float16-1.c
FAIL: gcc.target/i386/pr102464-copysign-1.c
FAIL: gcc.target/i386/pr102464-fma.c
FAIL: gcc.target/i386/pr102464-maxmin.c
FAIL: gcc.target/i386/pr102464-sqrtph.c
FAIL: gcc.target/i386/pr102464-sqrtsh.c
FAIL: gcc.target/i386/pr102464-vrndscaleph.c

In all cases the symptom is the same: the include of <math.h> errors out with “Unsupported value of __FLT_EVAL_METHOD__”. It appears that the compile option -mavx512fp16 defines __FLT_EVAL_METHOD__ to have value 16, which is not understood by darwin:

> /* Define float_t and double_t per C standard, ISO/IEC 9899:2011 7.12 2,
> taking advantage of GCC's __FLT_EVAL_METHOD__ (which a compiler may
> define anytime and GCC does) that shadows FLT_EVAL_METHOD (which a
> compiler must define only in float.h).
> */
> #if __FLT_EVAL_METHOD__ == 0
> typedef float float_t;
> typedef double double_t;
> #elif __FLT_EVAL_METHOD__ == 1
> typedef double float_t;
> typedef double double_t;
> #elif __FLT_EVAL_METHOD__ == 2 || __FLT_EVAL_METHOD__ == -1
> typedef long double float_t;
> typedef long double double_t;
> #else /* __FLT_EVAL_METHOD__ */
> # error "Unsupported value of __FLT_EVAL_METHOD__."
> #endif /* __FLT_EVAL_METHOD__ */

Is the use of __FLT_EVAL_METHOD__ set to 16 supposed to be portable across all targets? Or is it linux-only, and should it be marked as such?

Thanks for any help you can give.

FX
[Bug target/103785] [12 Regression] Ada bootstrap ICEs on i?86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103785 H.J. Lu changed: What|Removed |Added CC||ubizjak at gmail dot com --- Comment #6 from H.J. Lu --- A patch is posted at https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587326.html
[PATCH] i386: Require TARGET_64BIT for any_mul_highpart peephole
Restore i686 bootstrap by requiring TARGET_64BIT for any_mul_highpart peephole.

	PR bootstrap/103785
	* config/i386/i386.md: Require TARGET_64BIT for any_mul_highpart
	peephole.
---
 gcc/config/i386/i386.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 284b9507466..9d6786c5c2e 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8588,7 +8588,8 @@
 	   (any_mul_highpart:SWI48 (match_dup 2) (match_dup 0)))
 	      (clobber (match_dup 2))
 	      (clobber (reg:CC FLAGS_REG))])]
-  "REGNO (operands[0]) != REGNO (operands[2])
+  "TARGET_64BIT
+   && REGNO (operands[0]) != REGNO (operands[2])
    && REGNO (operands[0]) != REGNO (operands[3])
    && (REGNO (operands[0]) == REGNO (operands[4])
        || peep2_reg_dead_p (3, operands[0]))"
-- 
2.33.1
[Bug hsa/86948] Internal compiler error compiling brig.dg/test/gimple/mulhi.hsail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86948 Roger Sayle changed: What|Removed |Added CC||roger at nextmovesoftware dot com Ever confirmed|0 |1 Last reconfirmed||2021-12-23 Status|UNCONFIRMED |NEW --- Comment #7 from Roger Sayle --- A default expansion for MULT_HIGHPART_EXPR was proposed as part of https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551316.html I'll split off just that part of the patch and resubmit it for review.
[Bug tree-optimization/99620] Subtract with borrow (SBB) missed optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99620 --- Comment #5 from Andrew Pinski --- l_7 = a$l_12 - _1; k_8 = l_7 > a$l_12; vs: l_6 = a$l_11 - b$l_12; k_7 = a$l_11 < b$l_12;
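Read as C, the two GIMPLE forms in this comment seem to correspond to two ways of detecting a borrow from an unsigned subtraction. The sketch below spells both out; the function names are invented here, and the mapping from the dump to these functions is an interpretation rather than something stated in the bug.

```c
#include <stdbool.h>
#include <stdint.h>

/* First form: compare the wrapped difference against the minuend.  */
bool
borrow_from_result (uint32_t a, uint32_t b)
{
  uint32_t l = a - b;
  return l > a;
}

/* Second form: compare the operands directly.  */
bool
borrow_from_operands (uint32_t a, uint32_t b)
{
  return a < b;
}
```

Both functions return the same value for every input, since a - b wraps past a exactly when b is larger than a, so in principle either form could be matched to a subtract-with-borrow sequence.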
Many analyzer failures on non-Linux system (x86_64-apple-darwin)
Hi David, hi everyone,

I’m trying to understand how best to fix or silence the several failures in gcc.dg/analyzer that occur on x86_64-apple-darwin. Some of them, according to gcc-testresults, also occur on other non-Linux targets. See, for example, the test results at https://gcc.gnu.org/pipermail/gcc-testresults/2021-December/743901.html

## gcc.dg/analyzer/torture/asm-x86-linux-*.c

Are these supposed to be run only on Linux (as the name implies)? Four of them fail on x86_64-apple-darwin, because they use assembly that is not supported:

FAIL: gcc.dg/analyzer/torture/asm-x86-linux-cpuid-paravirt-1.c
FAIL: gcc.dg/analyzer/torture/asm-x86-linux-cpuid-paravirt-2.c
FAIL: gcc.dg/analyzer/torture/asm-x86-linux-rdmsr-paravirt.c
FAIL: gcc.dg/analyzer/torture/asm-x86-linux-wfx_get_ps_timeout-full.c

Should they be restricted to Linux targets? There is another one that has the same error as well, although it doesn’t have linux in the name:

FAIL: gcc.dg/analyzer/asm-x86-lp64-1.c

## Builtin-related failures

These four cases fail:

gcc.dg/analyzer/data-model-1.c
gcc.dg/analyzer/pr103526.c
gcc.dg/analyzer/taint-size-1.c
gcc.dg/analyzer/write-to-string-literal-1.c

but pass if the function calls (memset and memcpy) are replaced by the built-in variants (__builtin_memset and __builtin_memcpy). The reason is that a darwin system header does this:

#if __has_builtin(__builtin___memcpy_chk) || defined(__GNUC__)
#undef memcpy
/* void *memcpy(void *dst, const void *src, size_t n) */
#define memcpy(dest, ...) \
  __builtin___memcpy_chk (dest, __VA_ARGS__, __darwin_obsz0 (dest))
#endif

where __darwin_obsz0 is defined thusly:

#define __darwin_obsz0(object) __builtin_object_size (object, 0)

Does the analyzer not handle the _chk builtin variants? Should it?

I’m happy to investigate more, but I’m not sure what to do.

Best,
FX
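A reduced C example of what a memcpy call turns into after the darwin macro expansion may help when reproducing this on other hosts. __builtin___memcpy_chk and __builtin_object_size are real GCC built-ins; the macro and function names ending in _sketch are made up for this example.

```c
#include <stddef.h>

/* Same shape as the __darwin_obsz0 macro quoted above.  */
#define obsz0_sketch(object) __builtin_object_size (object, 0)

/* What a call to memcpy (dest, src, n) expands to under that macro.
   Here dest is a plain pointer parameter, so the object size is not
   known and the checked variant behaves like an ordinary memcpy.  */
void *
copy_checked_sketch (void *dest, const void *src, size_t n)
{
  return __builtin___memcpy_chk (dest, src, n, obsz0_sketch (dest));
}
```

Compiling one of the failing tests with its memcpy calls rewritten this way should show, on a Linux host too, whether the analyzer treats the _chk variant differently from __builtin_memcpy.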
[Bug target/99548] Help me! Lost the fight against the compiler.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99548 --- Comment #3 from Andrew Pinski --- #include #include #if defined(_MSC_VER) #include #elif defined(__x86_64__) || defined(__i386__) #include #endif using field_number = std::conditional_t=8,std::array,std::array>; namespace intrinsics { template #if __cpp_lib_concepts >= 202002L requires (std::unsigned_integral) #endif inline constexpr bool sub_borrow(bool borrow,T a,T b,T& out) noexcept { #if defined(_MSC_VER) || defined(__x86_64__) || defined(__i386__) #if __cpp_lib_is_constant_evaluated >= 201811L if(std::is_constant_evaluated()) return (out=a-b-borrow)>=a; else #endif { if constexpr(sizeof(T)==8) #if defined(__x86_64__) return _subborrow_u64(borrow,a,b, #if !defined(__INTEL_COMPILER ) &&(defined(__GNUC__) || defined(__clang__)) reinterpret_cast()); #else ); #endif #else return (out=a-b-borrow)>=a; #endif if constexpr(sizeof(T)==4) return _subborrow_u32(borrow,a,b,reinterpret_cast()); else if constexpr(sizeof(T)==2) return _subborrow_u16(borrow,a,b,reinterpret_cast()); else if constexpr(sizeof(T)==1) return _subborrow_u8(borrow,a,b,reinterpret_cast()); } #else return (out=a-b-borrow)>=a; #endif } } template #if __cpp_lib_concepts >= 202002L requires (std::unsigned_integral) #endif inline constexpr bool add_carry(bool carry,T a,T b,T& out) noexcept { #if defined(_MSC_VER) || defined(__x86_64__) || defined(__i386__) #if __cpp_lib_is_constant_evaluated >= 201811L if(std::is_constant_evaluated()) return (out=a+b+carry)<=a; else #endif { if constexpr(sizeof(T)==8) #if defined(__x86_64__) return _addcarry_u64(carry,a,b, #if !defined(__INTEL_COMPILER ) &&(defined(__GNUC__) || defined(__clang__)) reinterpret_cast()); #else ); #endif #else return (out=a+b+carry)<=a; #endif else if constexpr(sizeof(T)==4) return _addcarry_u32(carry,a,b,reinterpret_cast()); else if constexpr(sizeof(T)==2) return _addcarry_u16(carry,a,b,reinterpret_cast()); else if constexpr(sizeof(T)==1) return _addcarry_u8(carry,a,b,reinterpret_cast()); 
} #else return (out=a+b+carry)<=a; #endif } void my_asm_field_add( std::uint64_t* __restrict r, std::uint64_t const* __restrict x, std::uint64_t const* __restrict y) noexcept { std::uint64_t r0,r1,r2,r3; std::uint64_t rv; __asm__ __volatile__(R"(mov (%[x]),%[r0] add (%[y]),%[r0] mov 8(%[x]),%[r1] adc 8(%[y]),%[r1] mov 16(%[x]),%[r2] adc 16(%[y]),%[r2] mov 24(%[x]),%[r3] adc 24(%[y]),%[r3] sbb %[rv],%[rv] and $38,%[rv] add %[rv],%[r0] adc $0,%[r1] adc $0,%[r2] adc $0,%[r3] sbb %[rv],%[rv] and $38,%[rv] add %[rv],%[r0] mov %[r0],(%[res]) adc $0,%[r1] mov %[r1],8(%[res]) adc $0,%[r2] mov %[r2],16(%[res]) adc $0,%[r3] mov %[r3],24(%[res]))": [r0]"="(r0),[r1]"="(r1),[r2]"="(r2),[r3]"="(r3),[rv]"="(rv): [x]"r"(x),[y]"r"(y),[res]"r"(r):"memory","cc"); } void intrinsics_add(std::uint64_t* __restrict f, std::uint64_t const* __restrict x, std::uint64_t const* __restrict y) noexcept { using namespace intrinsics; using unsigned_type = field_number::value_type; constexpr unsigned_type zero{}; std::uint64_t f0,f1,f2,f3; bool carry{add_carry(false,x[0],y[0],f0)}; carry=add_carry(carry,x[1],y[1],f1); carry=add_carry(carry,x[2],y[2],f2); carry=add_carry(carry,x[3],y[3],f3); unsigned_type v=0; sub_borrow(carry,v,v,v); v&=static_cast(38); carry=add_carry(false,f0,v,f0); carry=add_carry(carry,f1,zero,f1); carry=add_carry(carry,f2,zero,f2); carry=add_carry(carry,f3,zero,f3); sub_borrow(carry,v,v,v); v&=static_cast(38); carry=add_carry(false,f0,v,f[0]); carry=add_carry(carry,f1,zero,f[1]); carry=add_carry(carry,f2,zero,f[2]); carry=add_carry(carry,f3,zero,f[3]); }
[Bug c++/99539] Varargs are allowed in requires expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99539 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2021-12-23 Status|UNCONFIRMED |NEW --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug rtl-optimization/99551] aarch64: csel is used for cold scalar computation which affects performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99551 --- Comment #2 from Andrew Pinski --- if-conversion succeeded through noce_try_cmove_arith Removing jump 8. deleting insn with uid = 8. deleting insn with uid = 11. deleting insn with uid = 10. deleting block 3 Merging block 4 into block 2... changing bb of uid 13 changing bb of uid 18 from 4 to 2 changing bb of uid 19 from 4 to 2 Merged blocks 2 and 4. Conversion succeeded on pass 1.
[Bug target/99228] blend/shuffle
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99228 --- Comment #6 from Andrew Pinski --- Hmm, the trunk no longer does the if conversion: complex_sgn(std::complex const&): .LFB2678: .cfi_startproc vmovsd xmm0, QWORD PTR [rdi] vxorpd xmm1, xmm1, xmm1 vcomisd xmm0, xmm1 jne .L8 vmovsd xmm0, QWORD PTR [rdi+8] vcomisd xmm0, xmm1 je .L13 .L8: vandpd xmm0, xmm0, XMMWORD PTR .LC1[rip] vorpd xmm0, xmm0, XMMWORD PTR .LC2[rip] ret
[Bug bootstrap/103820] [12 Regression] i686 failed to bootstrap with ada by r12-6077
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820

--- Comment #5 from H.J. Lu ---
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 284b9507466..9d6786c5c2e 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8588,7 +8588,8 @@ (define_peephole2
 	   (any_mul_highpart:SWI48 (match_dup 2) (match_dup 0)))
 	      (clobber (match_dup 2))
 	      (clobber (reg:CC FLAGS_REG))])]
-  "REGNO (operands[0]) != REGNO (operands[2])
+  "TARGET_64BIT
+   && REGNO (operands[0]) != REGNO (operands[2])
    && REGNO (operands[0]) != REGNO (operands[3])
    && (REGNO (operands[0]) == REGNO (operands[4])
        || peep2_reg_dead_p (3, operands[0]))"

is sufficient to restore bootstrap.
[Bug rtl-optimization/98977] [x86] Failure to optimize consecutive sub flags usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98977

Andrew Pinski changed:

           What    |Removed                     |Added
          Component|target                      |rtl-optimization
             Target|x86_64-*-* i?86-*-*         |x86_64-*-* i?86-*-*
                   |                            |aarch64*-*-*

--- Comment #2 from Andrew Pinski ---
Here is a testcase which shows the issue on other targets (aarch64) too:

#include <stdint.h>
#include <stdbool.h>

extern bool z, c;
uint32_t f(uint32_t dest, uint32_t src)
{
    uint32_t res = dest - src;
    z = !res;
    c = src > dest;
    return res;
}
[Bug target/98977] [x86] Failure to optimize consecutive sub flags usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98977

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-12-23
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
         Depends on|                            |3507

--- Comment #1 from Andrew Pinski ---
Confirmed, PR 3507 is part of it (maybe all of it) as shown by:

#include <stdbool.h>
#include <stdint.h>

extern bool z, c;
uint8_t f(uint8_t dest, uint8_t src)
{
    uint8_t res = dest - src;
    //z = !res;
    c = src > dest;
    return res;
}

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507
[Bug 3507] appalling optimisation with sub/cmp on multiple targets
[Bug bootstrap/103820] [12 Regression] i686 failed to bootstrap with ada by r12-6077
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820 --- Comment #4 from H.J. Lu --- (In reply to Roger Sayle from comment #3) > Thanks for investigating this HJ (I'm having difficulty configuring my You can bootstrap 32bit GCC on Linux/x86-64 if 32-bit libraries are available. > system to reproduce this). Is the TARGET_64BIT guard needed by both > peephole2s, or is one sufficient to restore bootstrap? Your fix/workaround I will double check. > (disabling the optimization on -m32) looks good, but I still can't figure > out what about this transformation is unsafe, and therefore perhaps latent > on -m64. Sorry again for the inconvenience.
[Bug bootstrap/103820] [12 Regression] i686 failed to bootstrap with ada by r12-6077
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820 --- Comment #3 from Roger Sayle --- Thanks for investigating this HJ (I'm having difficulty configuring my system to reproduce this). Is the TARGET_64BIT guard needed by both peephole2s, or is one sufficient to restore bootstrap? Your fix/workaround (disabling the optimization on -m32) looks good, but I still can't figure out what about this transformation is unsafe, and therefore perhaps latent on -m64. Sorry again for the inconvenience.
[Bug fortran/102595] ICE in var_element, at fortran/decl.c:298 since r10-5607-gde89b5748d68b76b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102595 --- Comment #4 from kargl at gcc dot gnu.org --- Comment on attachment 52053 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52053 patch This patch fixes the problem in the PR and addresses a few niggles I found as I poked gfortran. Some of those errors don't occur without this patch. program p complex, parameter :: x(0) = 2 ! complex :: x(0) = 2! { dg-error "more values than variables" } ! complex :: x(1) = 2! { dg-error "already is initialized" } ! complex :: x = 2 ! { dg-error "already is initialized" } ! complex :: x(1)! Works (as expected?) ! complex :: x(0)! { dg-error "more values than variables" } data x%re /3.0/ print *, x%re end
[Bug tree-optimization/103821] [12 Regression] huge compile time (jump threading) at -O3 for simple code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103821 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=103815 --- Comment #1 from Andrew Pinski --- I found this while looking into PR 103815.
[Bug tree-optimization/103821] [12 Regression] huge compile time (jump threading) at -O3 for simple code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103821 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |12.0
[Bug tree-optimization/103821] New: [12 Regression] huge compile time (jump threading) at -O3 for simple code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103821

            Bug ID: 103821
           Summary: [12 Regression] huge compile time (jump threading) at -O3 for simple code
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: compile-time-hog
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
#include <stdint.h>

uint16_t int_sqrt32(uint32_t x)
{
    uint16_t res=0;
    uint16_t add= 0x8000;
    do {
        uint16_t temp=res | add;
        uint32_t g2=temp*temp;
        if (x>=g2)
            res=temp;
        add>>=1;
    } while(add);
    return res;
}
- CUT -
Compile this at -O3 and GCC takes a long time:
 backwards jump threading : 20.12 ( 81%) 0.01 ( 50%) 20.12 ( 78%) 26M ( 79%)
[Bug fortran/102595] ICE in var_element, at fortran/decl.c:298 since r10-5607-gde89b5748d68b76b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102595 kargl at gcc dot gnu.org changed: What|Removed |Added CC||kargl at gcc dot gnu.org --- Comment #3 from kargl at gcc dot gnu.org --- Created attachment 52053 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52053=edit patch
[Bug target/103785] [12 Regression] Ada bootstrap ICEs on i?86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103785 Andrew Pinski changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #5 from Andrew Pinski --- *** Bug 103820 has been marked as a duplicate of this bug. ***
[Bug bootstrap/103820] [12 Regression] i686 failed to bootstrap with ada by r12-6077
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820 Andrew Pinski changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #2 from Andrew Pinski --- Dup of bug 103785. *** This bug has been marked as a duplicate of bug 103785 ***
[Bug bootstrap/103820] [12 Regression] i686 failed to bootstrap with ada by r12-6077
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-12-23

--- Comment #1 from H.J. Lu ---
This patch:

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 284b9507466..4eb217a93ee 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8588,7 +8588,8 @@ (define_peephole2
    (any_mul_highpart:SWI48 (match_dup 2) (match_dup 0)))
    (clobber (match_dup 2))
    (clobber (reg:CC FLAGS_REG))])]
-  "REGNO (operands[0]) != REGNO (operands[2])
+  "TARGET_64BIT
+   && REGNO (operands[0]) != REGNO (operands[2])
    && REGNO (operands[0]) != REGNO (operands[3])
    && (REGNO (operands[0]) == REGNO (operands[4])
        || peep2_reg_dead_p (3, operands[0]))"
@@ -8608,7 +8609,8 @@ (define_peephole2
    (any_mul_highpart:SI (match_dup 2) (match_dup 0
    (clobber (match_dup 2))
    (clobber (reg:CC FLAGS_REG))])]
-  "REGNO (operands[0]) != REGNO (operands[2])
+  "TARGET_64BIT
+   && REGNO (operands[0]) != REGNO (operands[2])
    && REGNO (operands[0]) != REGNO (operands[3])
    && (REGNO (operands[0]) == REGNO (operands[4])
        || peep2_reg_dead_p (3, operands[0]))"

lets bootstrap get past the point of failure.
[Bug bootstrap/103820] New: [12 Regression] i686 failed to bootstrap with ada by r12-6077
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103820 Bug ID: 103820 Summary: [12 Regression] i686 failed to bootstrap with ada by r12-6077 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: roger at nextmovesoftware dot com, ubizjak at gmail dot com Target Milestone: --- Target: i686 On Linux/i686, r12-6077 caused: make[5]: *** [/export/gnu/import/git/sources/gcc/gcc/ada/gcc-interface/Make-lang.in:167: ada/exp_cg.o] Error 1 +===GNAT BUG DETECTED==+ | 12.0.0 20211223 (experimental) (i686-linux) Storage_Error stack overflow or erroneous memory access| | Error detected at system.ads:106:30 | | Compiling /export/gnu/import/git/sources/gcc/gcc/ada/exp_ch11.adb| | Please submit a bug report; see https://gcc.gnu.org/bugs/ . | | Use a subject line meaningful to you and us to track the bug.| | Include the entire contents of this bug box in the report. | | Include the exact command that you entered. | | Also include sources listed below. | +==+ Please include these source files with error report Note that list may not be accurate in some cases, so please double check that the problem can still be reproduced with the set of files listed. Consider also -gnatd.n switch (see debug.adb).
[Bug ipa/103818] [12 Regression] ICE: in insert, at ipa-modref-tree.c:591
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103818 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2021-12-23 Keywords||ice-on-valid-code --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug ipa/103819] [10/11/12 Regression] ICE in redirect_callee, at cgraph.c:1389 with __attribute__((flatten)) and -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103819 Andrew Pinski changed: What|Removed |Added Keywords||ice-checking Summary|[12 Regression] ICE in |[10/11/12 Regression] ICE |redirect_callee, at |in redirect_callee, at |cgraph.c:1389 with |cgraph.c:1389 with |__attribute__((flatten))|__attribute__((flatten)) |and -O2 |and -O2 Known to fail||10.3.0, 11.1.0, 12.0 Ever confirmed|0 |1 Known to work||10.1.0, 10.2.0 Status|UNCONFIRMED |NEW Last reconfirmed||2021-12-23 --- Comment #1 from Andrew Pinski --- With -fchecking, GCC 10.3.0, 11.1.0 also ICE: :22:1: error: calls_comdat_local is set outside of a comdat group 22 | } | ^ :22:1: error: invalid calls_comdat_local flag _Z16value_to_numericv/5 (void value_to_numeric()) @0x7f780edb3ca8 Type: function definition analyzed Visibility: externally_visible public References: Referring: Availability: available Function flags: count:1073741824 (estimated locally) body calls_comdat_local Called by: Calls: _ZN8OptionalI10CompletionED2Ev/25 (inlined) (1073741824 (estimated locally),1.00 per call) :22: confused by earlier errors, bailing out
[Bug ipa/103819] [12 Regression] ICE in redirect_callee, at cgraph.c:1389 with __attribute__((flatten)) and -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103819 Andrew Pinski changed: What|Removed |Added Keywords||ice-on-valid-code Target Milestone|--- |12.0
[Bug c/103815] Misoptimization of a bounded do/while loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103815 --- Comment #1 from Andrew Pinski --- Note temp*temp is really ((int)temp)*((int)temp) due to integer promotion rules in C/C++.
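As a small sketch of the promotion rule mentioned above (not taken from the bug report): on a target where int is wider than 16 bits, both uint16_t operands are promoted to int, so the product has type int and can overflow for large inputs. The macro IS_INT and the functions product_is_int and square_u16 are hypothetical names used only for this illustration; casting one operand to uint32_t is one common way to force an unsigned multiply.

```c
#include <stdint.h>

/* Evaluates to 1 when the expression has type int, 0 otherwise (C11). */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* Reports whether uint16_t * uint16_t is computed in int.
   Assumes int is wider than 16 bits, as on the targets discussed. */
int product_is_int(void)
{
    uint16_t temp = 3;
    return IS_INT(temp * temp);
}

/* Converting one operand to uint32_t first makes the multiplication
   unsigned, so it wraps modulo 2^32 instead of overflowing a signed int. */
uint32_t square_u16(uint16_t temp)
{
    return (uint32_t)temp * temp;
}
```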
[Bug ipa/103819] New: [12 Regression] ICE in redirect_callee, at cgraph.c:1389 with __attribute__((flatten)) and -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103819 Bug ID: 103819 Summary: [12 Regression] ICE in redirect_callee, at cgraph.c:1389 with __attribute__((flatten)) and -O2 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: ipa Assignee: unassigned at gcc dot gnu.org Reporter: dani at danielbertalan dot dev CC: marxin at gcc dot gnu.org Target Milestone: --- Created attachment 52052 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52052=edit Test case Link to Compiler Explorer: https://godbolt.org/z/5bE9zsvfh The following code fails to compile with gcc 12 (commit 61e53698a08dc1d9a54d785218af687a6751c1b3): === template struct Optional { ~Optional() { if (m_has_value) value(); } T value(); void release_value() { m_has_value = false; } bool m_has_value; }; struct Completion { Optional m_target; }; struct ThrowCompletionOr { void release_error() { m_throw_completion.release_value(); } Optional m_throw_completion; } __trans_tmp_1; __attribute__((flatten)) void value_to_numeric() { auto _temporary_result(__trans_tmp_1); _temporary_result.release_error(); } === The issue is both present in Compiler Explorer's trunk compiler, and in my local build that has SerenityOS-specific patches applied. The error, as reported by the latter, is: $ g++ -O2 -v -save-temps -c repro.ii: Using built-in specs. 
COLLECT_GCC=../../Toolchain/Local/gcc-12/i686/bin/i686-pc-serenity-g++ Target: i686-pc-serenity Configured with: ../../Tarballs/gcc/configure --prefix=/home/dani/Projects/contributions/serenity/Toolchain/Local/gcc-12/i686 --target=i686-pc-serenity --with-sysroot=/home/dani/Projects/contributions/serenity/Build/i686/Root --disable-nls --with-newlib --enable-shared --enable-languages=c,c++ --enable-default-pie --enable-lto --enable-threads=posix Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.0.0 20211223 (experimental) (GCC) COLLECT_GCC_OPTIONS='-O2' '-v' '-save-temps' '-c' '-shared-libgcc' '-mtune=generic' '-march=pentiumpro' /home/dani/Projects/contributions/serenity/Toolchain/Local/gcc-12/i686/libexec/gcc/i686-pc-serenity/12.0.0/cc1plus -fpreprocessed repro.ii -ftls-model=initial-exec -fPIC -fno-semantic-interposition -quiet -dumpbase repro.ii -dumpbase-ext .ii -mtune=generic -march=pentiumpro -O2 -version -ftls-model=initial-exec -o repro.s GNU C++17 (GCC) version 12.0.0 20211223 (experimental) (i686-pc-serenity) compiled by GNU C version 11.1.0, GMP version 6.2.1, MPFR version 4.1.0-p13, MPC version 1.2.1, isl version none GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 GNU C++17 (GCC) version 12.0.0 20211223 (experimental) (i686-pc-serenity) compiled by GNU C version 11.1.0, GMP version 6.2.1, MPFR version 4.1.0-p13, MPC version 1.2.1, isl version none GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 Compiler executable checksum: 42faffcdcc480dea6a6ccc95e1bcf344 during IPA pass: inline repro.ii:21:1: internal compiler error: in redirect_callee, at cgraph.c:1389 21 | } | ^ 0x761a99 cgraph_edge::redirect_callee(cgraph_node*) ../../../Tarballs/gcc/gcc/cgraph.c:1389 0xf0a8af redirect_to_unreachable ../../../Tarballs/gcc/gcc/ipa-fnsummary.c:260 0xf0a8af edge_set_predicate ../../../Tarballs/gcc/gcc/ipa-fnsummary.c:284 0xf0ad3d remap_edge_summaries 
../../../Tarballs/gcc/gcc/ipa-fnsummary.c:4037 0xf0e246 ipa_merge_fn_summary_after_inlining(cgraph_edge*) ../../../Tarballs/gcc/gcc/ipa-fnsummary.c:4200 0xf23cd5 inline_call(cgraph_edge*, bool, vec*, int*, bool, bool*) ../../../Tarballs/gcc/gcc/ipa-inline-transform.c:504 0x1eb2477 flatten_function ../../../Tarballs/gcc/gcc/ipa-inline.c:2413 0x1eb69eb ipa_inline ../../../Tarballs/gcc/gcc/ipa-inline.c:2736 0x1eb69eb execute ../../../Tarballs/gcc/gcc/ipa-inline.c:3148
[PATCH] smuldi3_highpart.c: Replace long with long long for -mx32
* gcc.target/i386/smuldi3_highpart.c: Replace long with long long. --- gcc/testsuite/gcc.target/i386/smuldi3_highpart.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386/smuldi3_highpart.c b/gcc/testsuite/gcc.target/i386/smuldi3_highpart.c index 8bbd5f5cb8d..cd8ea41e019 100644 --- a/gcc/testsuite/gcc.target/i386/smuldi3_highpart.c +++ b/gcc/testsuite/gcc.target/i386/smuldi3_highpart.c @@ -2,7 +2,7 @@ /* { dg-options "-O2" } */ typedef int __attribute ((mode(TI))) ti_t; -long foo(long x) +long long foo(long long x) { return ((ti_t)x * 19065) >> 72; } -- 2.33.1
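A minimal sketch of the patched function, for reference. Under -mx32, long is 32 bits wide while long long stays 64 bits, which is presumably why the test needs long long to keep exercising a 64-bit highpart multiply; that reading is an assumption, since the patch itself gives no rationale. The _Static_assert is an addition for illustration and is not in the testcase. The 128-bit ti_t type is a GCC extension and needs a target that supports TImode.

```c
#include <limits.h>

/* 128-bit integer type, as in the testcase (GCC extension). */
typedef int __attribute ((mode(TI))) ti_t;

/* Illustration only: the operand type is expected to be 64 bits wide. */
_Static_assert(sizeof(long long) * CHAR_BIT == 64,
               "long long is expected to be 64 bits");

long long foo(long long x)
{
  /* Widen to 128 bits, multiply, and keep the bits from position 72 up. */
  return ((ti_t)x * 19065) >> 72;
}
```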
Re: [PATCH v7] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]
Sorry, sent an incomplete email. it was missing this part: On Thu, Nov 25, 2021 at 03:12:32PM -0600, Segher Boessenkool wrote: > > +;; int fegetround(void) > > +;; > > +;; This expansion for the C99 function only expands for compatible > > +;; target libcs. Because it needs to return one of FE_DOWNWARD, > > +;; FE_TONEAREST, FE_TOWARDZERO or FE_UPWARD with the values as defined > > +;; by the target libc, and since they are free to > > +;; choose the values and the expand needs to know then beforehand, > > +;; this expand only expands for target libcs that it can handle the > > +;; values is knows. > > +;; Because of these restriction, this only expands on the desired > > +;; case and fallback to a call to libc on any otherwise. > > +(define_expand "fegetroundsi" > > (This needs some wordsmithing.) How about something like this? It is just a light editing of the above: ;; This expansion for the C99 function only expands for compatible ;; target libcs, because it needs to return one of FE_DOWNWARD, ;; FE_TONEAREST, FE_TOWARDZERO or FE_UPWARD with the values as defined ;; by the target libc, and since the libc is free to choose the values ;; (and they may differ from the hardware) and the expander needs to ;; know then beforehand, this expanded only expands for target libcs ;; that it can handle the values is knows. ;; Because of these restriction, this only expands on the desired ;; case and fallback to a call to libc otherwise. o/ Raoni
Re: [PATCH v7] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]
Hi Segher,

On Thu, Nov 25, 2021 at 03:12:32PM -0600, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Nov 24, 2021 at 08:48:47PM -0300, Raoni Fassina Firmino wrote:
> > gcc/ChangeLog:
> > * builtins.c (expand_builtin_fegetround): New function.
> > (expand_builtin_feclear_feraise_except): New function.
> > (expand_builtin): Add cases for BUILT_IN_FEGETROUND,
> > BUILT_IN_FECLEAREXCEPT and BUILT_IN_FERAISEEXCEPT
>
> Something is missing here (maybe just a full stop?)

Yeap. Done.

> > * config/rs6000/rs6000.md (fegetroundsi): New pattern.
> > (feclearexceptsi): New Pattern.
> > (feraiseexceptsi): New Pattern.
> > * doc/extend.texi: Add a new introductory paragraph about the
> > new builtins.
>
> Pet peeve: please don't break lines early, we have only 72 columns per
> line and we have many long symbol names. Trying to make many lines very
> short only results in everything looking very irregular, which is harder
> to read.

Sure thing, it is my bad that I have shortcuts for 70 and 80 textwidth but not 72. In any case:

|* doc/extend.texi: Add a new introductory paragraph about the new
|builtins.

It would be 73 columns, or am I reading my text editor wrong? Also here:

|gcc/testsuite/ChangeLog:
|
|* gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-1.c: New test.
|* gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-2.c: New test.

It is 79 for now, but has the same 73 problem. I guess the correct formatting is:

|gcc/testsuite/ChangeLog:
|
|* gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-1.c:
|New test.
|* gcc.target/powerpc/builtin-feclearexcept-feraiseexcept-2.c:
|New test.

Is that right?

> > +;; int fegetround(void)
> > +;;
> > +;; This expansion for the C99 function only expands for compatible
> > +;; target libcs. Because it needs to return one of FE_DOWNWARD,
> > +;; FE_TONEAREST, FE_TOWARDZERO or FE_UPWARD with the values as defined
> > +;; by the target libc, and since they are free to
> > +;; choose the values and the expand needs to know then beforehand,
> > +;; this expand only expands for target libcs that it can handle the
> > +;; values is knows.
> > +;; Because of these restriction, this only expands on the desired
> > +;; case and fallback to a call to libc on any otherwise.
> > +(define_expand "fegetroundsi"
>
> (This needs some wordsmithing.)

> > +;; int feclearexcept(int excepts)
> > +;;
> > +;; This expansion for the C99 function only works when EXCEPTS is a
> > +;; constant known at compile time and specifies any one of
> > +;; FE_INEXACT, FE_DIVBYZERO, FE_UNDERFLOW and FE_OVERFLOW flags.
> > +;; It doesn't handle values out of range, and always returns 0.
>
> It FAILs the expansion if a parameter is bad? Is this comment out of
> date?

If the parameter is one that it cannot handle, including bogus values, it then FAILs and the libc function will handle it, including returning an error for wrong input. This part is verbatim from v5 (prior to the refactoring that then was undone).

> > +;; Note that FE_INVALID is unsupported because it maps to more than
> > +;; one bit of the FPSCR register.
>
> It could be implemented, now that you check for the libc used. It is a
> fixed part of the ABI :-)

Oh yeah, I can add it now or in a subsequent commit. Is that a hard requirement for the patch?

> > +;; The FE_* are defined in the targed libc, and since they are free to
> > +;; choose the values and the expand needs to know then beforehand,
>
> s/then/them/

Done.

> > +;; this expand only expands for target libcs that it can handle the
>
> (this expander)

Done.

> > +;; values is knows.
>
> s/is/it/

Done.

> > +/* This testcase ensures that the builtins expand with the matching arguments
> > + * or otherwise fallback gracefully to a function call, and don't ICE during
> > + * compilation.
> > + * "-fno-builtin" option is used to enable calls to libc implementation of the
> > + * gcc builtins tested when not using __builtin_ prefix. */
>
> Don't use leading * in comments, btw.

Done.

> This is a testcase so anything
> goes, but FYI :-)

Yeah, better keep the same style :-)

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtin-fegetround.c
>
> > +  int i, rounding, expected;
> > +  const int rm[] = {FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD};
> > +  for (i = 0; i < sizeof(rm); i++)
>
> That should be sizeof rm / sizeof rm[0] ? It accesses out of bounds
> as it is.

Done. Thanks for the catch, newbie mistake on my part.

> Maybe test more values? At least 0, but also combinations of these FE_
> bits, and maybe even FE_INVALID?

I don't get what you mean, like use some invalid values for fesetround()? I am using only expected values because fegetround() will only read what was previously set. I could set some invalid values and expect that it did not change the value
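The loop-bound fix discussed in the review can be sketched as follows. This is an illustration, not the code from the patch: ARRAY_LEN, count_rounding_modes and all_modes_distinct are hypothetical names, and the sketch only walks the array without calling fesetround() or fegetround(). The point is that sizeof(rm) is the array's size in bytes, so the element count must be obtained by dividing by the size of one element.

```c
#include <fenv.h>
#include <stddef.h>

/* Number of elements in a true array (not valid for pointers). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

static const int rm[] = {FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD};

/* Element count of rm; sizeof(rm) alone would give the byte count. */
size_t count_rounding_modes(void)
{
    return ARRAY_LEN(rm);
}

/* Walks rm with the corrected bound and checks that every rounding
   mode value is different from the ones before it. */
int all_modes_distinct(void)
{
    for (size_t i = 0; i < ARRAY_LEN(rm); i++)
        for (size_t j = 0; j < i; j++)
            if (rm[i] == rm[j])
                return 0;
    return 1;
}
```

With a bound of sizeof(rm) the loop index would run past the last element as soon as int is wider than one byte, which is the out-of-bounds access pointed out in the review.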
[Bug c/103818] New: ICE: in insert, at ipa-modref-tree.c:591
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103818 Bug ID: 103818 Summary: ICE: in insert, at ipa-modref-tree.c:591 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: k.even-mendoza at imperial dot ac.uk Target Milestone: --- The following code fails with -O1, -O2, -O3, and -Os in GCC-12, but works fine with -O0: struct a { int b[0] } c(struct a *d) { d->b[0] = d->b[-144115188075855873] + d->b[11] * d->b[2] + d->b[0] % d->b[1025] + d->b[5]; d->b[0] = d->b[144678138029277184] + d->b[0] & d->b[-3] * d->b[053] + d->b[7] ^ d->b[-9] + d->b[14] + d->b[9] % d->b[49] + d->b[024] + d->b[82] & d->b[4096]; } void main() {} It works fine with GCC-11. === The trace in GCC-12 Version is: /home/user42/data/gcc-csmith-1223/gcc-install/bin/gcc -O2 fuzzer-file-14593.c fuzzer-file-14593.c:3:1: warning: no semicolon at end of struct or union 3 | } c(struct a *d) { | ^ during GIMPLE pass: modref fuzzer-file-14593.c: In function ‘c’: fuzzer-file-14593.c:11:1: internal compiler error: in insert, at ipa-modref-tree.c:591 11 | void main() {} | ^~~~ 0x73439f modref_access_node::insert(vec*&, modref_access_node, unsigned long, bool) .././../gcc-source/gcc/ipa-modref-tree.c:591 0xcc201a modref_ref_node::insert_access(modref_access_node, unsigned long, bool) .././../gcc-source/gcc/ipa-modref-tree.h:194 0xcc201a modref_tree::insert(unsigned int, unsigned int, unsigned int, int, int, modref_access_node, bool) .././../gcc-source/gcc/ipa-modref-tree.h:445 0xcb303c modref_tree::insert(tree_node*, int, int, modref_access_node const&, bool) .././../gcc-source/gcc/ipa-modref-tree.h:472 0xcb303c record_access .././../gcc-source/gcc/ipa-modref.c:1076 0xcb3968 analyze_load .././../gcc-source/gcc/ipa-modref.c:1707 0xc09dd1 walk_stmt_load_store_addr_ops(gimple*, void*, bool (*)(gimple*, tree_node*, tree_node*, void*), bool (*)(gimple*, tree_node*, tree_node*, void*), bool (*)(gimple*, tree_node*, tree_node*, void*)) 
.././../gcc-source/gcc/gimple-walk.c:800 0xcbc7a1 analyze_stmt .././../gcc-source/gcc/ipa-modref.c:1788 0xcbc7a1 analyze .././../gcc-source/gcc/ipa-modref.c:1900 0xcbc7a1 analyze_function .././../gcc-source/gcc/ipa-modref.c:3219 0xcbec2a execute .././../gcc-source/gcc/ipa-modref.c:4186 Please submit a full bug report, === I tested it with gcc (GCC) 12.0.0 20211023 (experimental), gcc (GCC) 12.0.0 20211216 (experimental), and gcc (GCC) 12.0.0 20211223 (experimental) (current version: commit ef26c151c14a87177d46fd3d725e7f82e040e89f) checking the fix of bugs 102687 and 103073 there.
Re: [PATCH] docs: replace http:// with https://
On 12/22/2021 5:57 AM, Martin Liška wrote: I replaced and verified http:// links for various domains. Ready to be installed? Tahnks, Martin gcc/ada/ChangeLog: * doc/share/gnu_free_documentation_license.rst: Replace http:// with https. * gnat-style.texi: Likewise. * gnat_rm.texi: Likewise. * gnat_ugn.texi: Likewise. gcc/d/ChangeLog: * gdc.texi: Replace http:// with https. gcc/ChangeLog: * doc/contrib.texi: Replace http:// with https. * doc/contribute.texi: Likewise. * doc/extend.texi: Likewise. * doc/gccint.texi: Likewise. * doc/gnu.texi: Likewise. * doc/implement-c.texi: Likewise. * doc/implement-cxx.texi: Likewise. * doc/include/fdl.texi: Likewise. * doc/include/gpl_v3.texi: Likewise. * doc/install.texi: Likewise. * doc/invoke.texi: Likewise. * doc/passes.texi: Likewise. * doc/service.texi: Likewise. * doc/sourcebuild.texi: Likewise. * doc/standards.texi: Likewise. gcc/fortran/ChangeLog: * gfortran.texi: Replace http:// with https. * intrinsic.texi: Likewise. gcc/go/ChangeLog: * gccgo.texi: Replace http:// with https. gcc/jit/ChangeLog: * docs/_build/texinfo/libgccjit.texi: Replace http:// with https. * docs/cp/index.rst: Likewise. * docs/cp/intro/index.rst: Likewise. * docs/cp/intro/tutorial01.rst: Likewise. * docs/cp/intro/tutorial02.rst: Likewise. * docs/cp/intro/tutorial03.rst: Likewise. * docs/cp/intro/tutorial04.rst: Likewise. * docs/cp/topics/asm.rst: Likewise. * docs/cp/topics/compilation.rst: Likewise. * docs/cp/topics/contexts.rst: Likewise. * docs/cp/topics/expressions.rst: Likewise. * docs/cp/topics/functions.rst: Likewise. * docs/cp/topics/index.rst: Likewise. * docs/cp/topics/locations.rst: Likewise. * docs/cp/topics/objects.rst: Likewise. * docs/cp/topics/types.rst: Likewise. * docs/index.rst: Likewise. * docs/internals/index.rst: Likewise. * docs/intro/index.rst: Likewise. * docs/intro/tutorial01.rst: Likewise. * docs/intro/tutorial02.rst: Likewise. * docs/intro/tutorial03.rst: Likewise. * docs/intro/tutorial04.rst: Likewise. 
* docs/intro/tutorial05.rst: Likewise. * docs/topics/asm.rst: Likewise. * docs/topics/compatibility.rst: Likewise. * docs/topics/compilation.rst: Likewise. * docs/topics/contexts.rst: Likewise. * docs/topics/expressions.rst: Likewise. * docs/topics/function-pointers.rst: Likewise. * docs/topics/functions.rst: Likewise. * docs/topics/index.rst: Likewise. * docs/topics/locations.rst: Likewise. * docs/topics/objects.rst: Likewise. * docs/topics/performance.rst: Likewise. * docs/topics/types.rst: Likewise. OK . And I think this falls under the obvious rule. jeff
[Bug bootstrap/103817] Bootstrap broken on x86_64-apple-darwin21
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103817 Francois-Xavier Coudert changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #1 from Francois-Xavier Coudert --- Wait, I think I got the --with-gmp=/usr/local/opt wrong: it's in /usr/local, and it got picked up by default by some checks but not all, which explains the weird behavior.
[Bug bootstrap/103817] New: Bootstrap broken on x86_64-apple-darwin21
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103817 Bug ID: 103817 Summary: Bootstrap broken on x86_64-apple-darwin21 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: fxcoudert at gcc dot gnu.org Target Milestone: --- Bootstrap of gcc master at ef26c151c14a87177d46fd3d725e7f82e040e89f on x86_64-apple-darwin21.2.0 is broken with: In file included from ./bconfig.h:3, from ../../gcc/gcc/genmodes.c:20: ./auto-host.h:2667:16: error: declaration does not declare anything [-fpermissive] 2667 | #define rlim_t long |^~~~ In file included from ../../gcc/gcc/genmodes.c:21: ../../gcc/gcc/system.h:555:20: error: conflicting declaration of C function ‘const char* strsignal(int)’ 555 | extern const char *strsignal (int); |^ In file included from /Users/devel/ibin/prev-x86_64-apple-darwin21.2.0/libstdc++-v3/include/cstring:42, from ../../gcc/gcc/system.h:241, from ../../gcc/gcc/genmodes.c:21: /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk/usr/include/string.h:134:10: note: previous declaration ‘char* strsignal(int)’ 134 | char*strsignal(int __sig); | ^ make[3]: *** [build/genmodes.o] Error 1 make[2]: *** [all-stage2-gcc] Error 2 make[1]: *** [stage2-bubble] Error 2 make: *** [all] Error 2 The configure line is: configure --prefix=$HOME/irun --with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk --with-gmp=/usr/local/opt --enable-languages=all
[Bug tree-optimization/103816] [12 Regression] ICE: in vect_build_slp_tree_2, at tree-vect-slp.c:1748 since r12-1551-g3dfa4fe9f1a089b2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103816 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |NEW Known to work||11.2.0 Component|c |tree-optimization Target Milestone|--- |12.0 Known to fail||12.0 Last reconfirmed||2021-12-23 CC||marxin at gcc dot gnu.org, ||rguenth at gcc dot gnu.org Ever confirmed|0 |1 Summary|ICE: in |[12 Regression] ICE: in |vect_build_slp_tree_2, at |vect_build_slp_tree_2, at |tree-vect-slp.c:1748|tree-vect-slp.c:1748 since ||r12-1551-g3dfa4fe9f1a089b2 --- Comment #1 from Martin Liška --- Started with r12-1551-g3dfa4fe9f1a089b2.
Re: [PATCH v2] ix86: Don't use the 'm' constraint for x86_64_general_operand
On Mon, Dec 20, 2021 at 2:22 PM H.J. Lu wrote: > > On Mon, Dec 20, 2021 at 12:38 PM Jakub Jelinek wrote: > > > > On Mon, Dec 20, 2021 at 11:44:08AM -0800, H.J. Lu wrote: > > > The problem is in > > > > > > (define_memory_constraint "TARGET_MEM_CONSTRAINT" > > > "Matches any valid memory." > > > (and (match_code "mem") > > >(match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, > > > 0), > > > MEM_ADDR_SPACE (op))"))) > > > > > > define_register_constraint allows LRA to convert the operand to the form > > > '(mem (reg X))', where X is a base register. I am testing the v2 patch > > > with > > > > If you mean replacing an immediate with a MEM containing that immediate, > > isn't that often the right thing though? > > I mean, if the register pressure is high and options are either spill some > > register, set it to immediate, use it in one instruction and then fill the > > spilled register (i.e. 2 memory loads), compared to a MEM use on the > > arithmetic instruction one MEM seems cheaper to me. With -fPIC and the > > cst needing runtime relocation slightly less so of course. > > > > We will check the performance impact on SPEC CPU 2017. > Here is the v2 patch. Liwei, can you help collect SPEC CPU 2017 > impact of the enclosed patch? Thanks. We checked SPEC CPU 2017 performance with -O2 and -Ofast. There is no performance regression. OK for master? > > The code due to ivopts is trying to have something like > > size_t a = (size_t) _list; > > size_t b = 0xffa8 - a; > > size_t c = x + b; > > and for that cst - one needs actually 2 registers, one to hold the > > constant and one to hold the (%rip) based address. 
> > (insn 790 789 791 111 (set (reg:DI 292) > > (const_int -88 [0xffa8])) "dl-tunables.c":304:15 76 > > {*movdi_internal} > > (nil)) > > (insn 791 790 792 111 (set (reg:DI 293) > > (symbol_ref:DI ("tunable_list") [flags 0x2] > 0x7f3460aa9cf0 tunable_list>)) "dl-tunables.c":304:15 76 {*movdi_internal} > > (nil)) > > (insn 792 791 793 111 (parallel [ > > (set (reg:DI 291) > > (minus:DI (reg:DI 292) > > (reg:DI 293))) > > (clobber (reg:CC 17 flags)) > > ]) "dl-tunables.c":304:15 299 {*subdi_1} > > (nil)) > > (insn 793 792 794 111 (parallel [ > > (set (reg:DI 294) > > (plus:DI (reg:DI 291) > > (reg:DI 198 [ ivtmp.176 ]))) > > (clobber (reg:CC 17 flags)) > > ]) "dl-tunables.c":304:15 226 {*adddi_1} > > (nil)) > > It would be smarter to rewrite the above into a lea 88+tunable_list(%rip), > > %temp1 > > and use a subtraction instead of addition in the last insn above, or of > > course in the particular case even consider the following 2 instructions > > that do: > > (insn 794 793 795 111 (set (reg:DI 296) > > (symbol_ref:DI ("tunable_list") [flags 0x2] > 0x7f3460aa9cf0 tunable_list>)) "dl-tunables.c":304:15 76 {*movdi_internal} > > (nil)) > > (insn 795 794 796 111 (parallel [ > > (set (reg:DI 295 [ cur ]) > > (plus:DI (reg:DI 294) > > (reg:DI 296))) > > (clobber (reg:CC 17 flags)) > > ]) "dl-tunables.c":304:15 226 {*adddi_1} > > (nil)) > > and find out that _list - _list is 0 and we don't need it at > > all. Guess we don't figure that out due to the cast of one of those > > addresses to size_t and the other one used in POINTER_PLUS_EXPR as normal > > pointer. > > > > Jakub > > > > > -- > H.J. Thanks. -- H.J.
[Bug c/103816] New: ICE: in vect_build_slp_tree_2, at tree-vect-slp.c:1748
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103816

            Bug ID: 103816
           Summary: ICE: in vect_build_slp_tree_2, at tree-vect-slp.c:1748
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: k.even-mendoza at imperial dot ac.uk
  Target Milestone: ---

This program crashed with GCC-12 with -O2 and -O3:

typedef enum { e } f;
struct {
  f __attribute__((mode(__byte__))) a;
  f __attribute__((mode(__byte__))) b;
  f __attribute__((mode(__byte__))) c;
  f __attribute__((mode(__byte__))) d
} g[];
void main() { g[0].b = (g[0].b & g[4].b) * g[2305843009213693952].c; }

This code works fine with GCC-11.
I tried several versions of GCC-12: 20211023, 2028, and 20211216 on Ubuntu-18.

The trace looks like this:
===
fuzzer-file-54092.c:7:1: warning: no semicolon at end of struct or union
    7 | } g[];
      | ^
fuzzer-file-54092.c:7:3: warning: array ‘g’ assumed to have one element
    7 | } g[];
      |   ^
during GIMPLE pass: slp
fuzzer-file-54092.c: In function ‘main’:
fuzzer-file-54092.c:8:6: internal compiler error: in vect_build_slp_tree_2, at tree-vect-slp.c:1748
    8 | void main() { g[0].b = (g[0].b & g[4].b) * g[2305843009213693952].c; }
      |      ^~~~
0x7d2c1c vect_build_slp_tree_2
        .././../gcc-source/gcc/tree-vect-slp.c:1748
0x119b89d vect_build_slp_tree
        .././../gcc-source/gcc/tree-vect-slp.c:1549
0x119fe30 vect_build_slp_instance
        .././../gcc-source/gcc/tree-vect-slp.c:3025
0x11a5966 vect_analyze_slp(vec_info*, unsigned int)
        .././../gcc-source/gcc/tree-vect-slp.c:3388
0x11aa0ea vect_slp_analyze_bb_1
        .././../gcc-source/gcc/tree-vect-slp.c:5762
0x11aa0ea vect_slp_region
        .././../gcc-source/gcc/tree-vect-slp.c:5864
0x11aa0ea vect_slp_bbs
        .././../gcc-source/gcc/tree-vect-slp.c:6056
0x11ac131 vect_slp_function(function*)
        .././../gcc-source/gcc/tree-vect-slp.c:6144
0x11b36e2 execute
        .././../gcc-source/gcc/tree-vectorizer.c:1503
Please submit a full bug report,
[Bug c++/52830] ICE: "canonical types differ for identical types ..." when attempting SFINAE with member type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52830 Patrick Palka changed: What|Removed |Added Target Milestone|--- |12.0 Resolution|--- |FIXED CC||ppalka at gcc dot gnu.org Status|REOPENED|RESOLVED --- Comment #15 from Patrick Palka --- (In reply to Jason Merrill from comment #14) > This seems to be fixed on trunk. Looks like ever since r12-3766, which also removed the dg-ice from constexpr-52830.C, so I suppose we can close this PR then.
[Bug middle-end/57955] [9/10/11/12 Regression] Uniquization of constants reduces alignment of initializers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57955 --- Comment #26 from David Edelsohn --- As Bill mentioned, one can increase the alignment of a large constant, but there is no way for the hooks that set alignment to recognize that the constant will be assigned to variable with stricter alignment.
[Bug ada/79724] GNAT tools do not respect --program-suffix and --program-prefix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79724

--- Comment #4 from Francois-Xavier Coudert ---
The current situation is the result of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=864

Comment 20 by Dave Korn (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=864#c20)
is spot on:

> ... means that it only recognizes a suffix if there is also a prefix, i.e. it
> only works for cross-compilers. The documentation suggests this is deliberate:
> [...]
> but why? The native behaviour is wrong and it seems incorrect to me that
> it should have different semantics from the cross-compiler case.

As far as I can see, this comment from 2011 was never addressed, and the bug
was closed. This is what is biting us: this logic needs to be improved.

The simplest patch is to detect the suffix in all cases. As far as I can see,
one only needs to remove the test:

diff --git a/gcc/ada/osint.adb b/gcc/ada/osint.adb
index cf39128fb7b..9a578a62273 100644
--- a/gcc/ada/osint.adb
+++ b/gcc/ada/osint.adb
@@ -2286,9 +2286,7 @@ package body Osint is
          end if;
       end loop;

-      if End_Of_Prefix > 1 then
-         Start_Of_Suffix := End_Of_Prefix + Prog'Length + 1;
-      end if;
+      Start_Of_Suffix := End_Of_Prefix + Prog'Length + 1;

       --  Create the new program name

And with this patch I get the expected result:

$ gnatmake-11 hello.adb
gcc-11 -c hello.adb
gnatbind-11 -x hello.ali
gnatlink-11 hello.ali
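
To make the intended behaviour concrete, here is a small C sketch of the prefix/suffix handling discussed in this comment. It is an illustration only, not a translation of Osint.Program_Name: the function name `derive_tool_name` and its parameters are hypothetical, and it simply looks for the first occurrence of the driver's own name in the invoked name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given the name a driver was invoked as (for example
   "gnatmake-11"), the driver's plain name ("gnatmake") and the plain name
   of the tool to run ("gcc"), build the tool name with the same prefix and
   suffix ("gcc-11"). The suffix is kept whether or not a prefix is present.
   Returns 0 on success, -1 if `self` does not occur in `invoked` or the
   result does not fit in `out`. */
static int derive_tool_name(const char *invoked, const char *self,
                            const char *tool, char *out, size_t out_size)
{
    const char *hit = strstr(invoked, self);
    if (hit == NULL)
        return -1;

    size_t prefix_len = (size_t)(hit - invoked);  /* may be 0 for native tools */
    const char *suffix = hit + strlen(self);      /* may be the empty string */

    int n = snprintf(out, out_size, "%.*s%s%s",
                     (int)prefix_len, invoked, tool, suffix);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With these assumptions, "gnatmake-11" maps to "gcc-11" and "powerpc-elf-gnatmake" maps to "powerpc-elf-gcc", which matches the two cases named in the osint.adb comments.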
[Bug ada/79724] GNAT tools do not respect --program-suffix and --program-prefix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79724 Eric Gallager changed: What|Removed |Added CC||egallager at gcc dot gnu.org --- Comment #3 from Eric Gallager --- MacPorts also configures with --program-suffix so if we ever wanted to add ada support there, this would block that, too.
[Bug c/48110] "fast" and "g" should be aliases of "Ofast" and "Og" inside optimize attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48110 Eric Gallager changed: What|Removed |Added CC||egallager at gcc dot gnu.org, ||roger at nextmovesoftware dot com --- Comment #5 from Eric Gallager --- Hm, I wonder if this affects the new -Oz switch, too...
[Bug c/103815] New: Misoptimization of a bounded do/while loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103815

            Bug ID: 103815
           Summary: Misoptimization of a bounded do/while loop
           Product: gcc
           Version: og10 (devel/omp/gcc-10)
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matthias at urlichs dot de
  Target Milestone: ---

This code, which calculates an integer square root ..:

#include <stdint.h>

uint16_t int_sqrt32(uint32_t x)
{
    uint16_t res=0;
    uint16_t add= 0x8000;
    do {
        uint16_t temp=res | add;
        uint32_t g2=temp*temp;
        if (x>=g2)
            res=temp;
        add>>=1;
    } while(add);
    return res;
}

... should be compilable 1:1, since the right shift sets the condition flags
appropriately.

Unfortunately, GCC's optimizer notices that this is a 16-step loop, "helpfully"
invents a loop counter, and pessimizes the code to this sub-optimal result (ARM
Thumb output; x86 has essentially the same problem):

   0:   b500            push    {lr}
   2:   2110            movs    r1, #16
   4:   4686            mov     lr, r0
   6:   f44f 4200       mov.w   r2, #32768      ; 0x8000
   a:   2000            movs    r0, #0
   c:   ea40 0302       orr.w   r3, r0, r2
  10:   0852            lsrs    r2, r2, #1
  12:   b29b            uxth    r3, r3
  14:   fb03 fc03       mul.w   ip, r3, r3
  18:   45f4            cmp     ip, lr
  1a:   bf98            it      ls
  1c:   4618            movls   r0, r3
  1e:   3901            subs    r1, #1
  20:   d1f4            bne.n   c
  22:   f85d fb04       ldr.w   pc, [sp], #4
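
For readers who want to experiment with the routine from this report, here is a self-contained version. Two things are assumptions rather than part of the report: the header, which is taken to be <stdint.h> because the code uses uint16_t and uint32_t, and the explicit cast before the multiplication, added so the product is computed in unsigned 32-bit arithmetic even for temp == 0xFFFF (without the cast both operands promote to int, and the product can exceed INT_MAX on targets with 32-bit int).

```c
#include <stdint.h>

/* Bit-by-bit integer square root: try each result bit from the top down
   and keep it if the square of the candidate still fits under x. */
uint16_t int_sqrt32(uint32_t x)
{
    uint16_t res = 0;
    uint16_t add = 0x8000;

    do {
        uint16_t temp = res | add;
        uint32_t g2 = (uint32_t)temp * temp;  /* cast added for this sketch */
        if (x >= g2)
            res = temp;
        add >>= 1;
    } while (add);  /* the loop ends when the probe bit shifts out */

    return res;
}
```

The loop runs exactly 16 times, once per result bit, which is the trip count the optimizer discovers and turns into an explicit counter in the listing above.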
[Bug c++/92944] [concepts] redefinition error when using constrained structure template inside namespace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92944 Patrick Palka changed: What|Removed |Added CC||florin at iucha dot net --- Comment #3 from Patrick Palka --- *** Bug 103809 has been marked as a duplicate of this bug. ***
[Bug c++/103809] wrong reporting of (template) struct redefinition when doing a more constrained template outside of the namespace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103809 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #2 from Patrick Palka --- dup *** This bug has been marked as a duplicate of bug 92944 ***
Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization
On Tue, 21 Dec 2021 at 17:56, François Dumont via Libstdc++ < libstd...@gcc.gnu.org> wrote: > On 21/12/21 7:28 am, Daniel Krügler wrote: > > Am Di., 21. Dez. 2021 um 07:08 Uhr schrieb François Dumont via > > Libstdc++ : > >> Hi > >> > >> Is there a chance for this patch to be integrated for next gcc > >> release ? > >> > >> François > >> > > No counterargument for the acceptance, but: Shouldn't > > __small_size_threshold() be a noexcept function? > > > > - Daniel > > Could it enhance code generation ? I could make it depends on > _Hashtable_hash_traits<>::__small_size_threshold() noexcept > qualification if so. But I was hoping that the compiler to detect all > that itself. > > Otherwise no, it do not have to be noexcept as it is used to avoid > hasher invocation in some situations and hasher is not noexcept > constraint. At least I do not need to static_assert this. > > But why not make it noexcept? It just returns a constant integer. It can be noexcept.
[Bug target/103773] wrong code at -Oz due to sign extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103773

--- Comment #5 from CVS Commits ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:ef26c151c14a87177d46fd3d725e7f82e040e89f

commit r12-6106-gef26c151c14a87177d46fd3d725e7f82e040e89f
Author: Roger Sayle
Date:   Thu Dec 23 12:33:07 2021 +

    x86: PR target/103773: Fix wrong-code with -Oz from pop to memory.

    This is a fix to PR target/103773 where -Oz shouldn't use push/pop
    on x86 to shrink writing small integer constants to memory.
    Instead clang uses "andl $0, mem" for writing zero, and "orl $-1, mem"
    when writing -1 to memory when using -Oz. This patch implements this
    via peephole2 where we can confirm that it's OK to clobber the flags.

    2021-12-23  Roger Sayle
                Uroš Bizjak

    gcc/ChangeLog
            PR target/103773
            * config/i386/i386.md (*mov_and): New define_insn for
            writing a zero to memory using AND.
            (*mov_or): Extend to allow memory destination and HImode.
            (*movdi_internal): Remove -Oz push/pop optimization from here.
            (*movsi_internal): Likewise.
            (peephole2): Perform -Oz push/pop optimization here, only for
            register destinations, values other than zero, and in functions
            that don't use the red zone.
            (peephole2): With -Oz, convert writes of 0 or -1 to memory into
            their clobber forms, i.e. *mov_and and *mov_or resp.

    gcc/testsuite/ChangeLog
            PR target/103773
            * gcc.target/i386/pr103773-2.c: New test case.
            * gcc.target/i386/pr103773.c: New test case.
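
The commit relies on two identities: AND with 0 always yields 0, and OR with -1 always yields -1, whatever was in memory before. The C sketch below only demonstrates that semantic equivalence. The function names are hypothetical, and a compiler is free to emit a plain store for either of them, so this says nothing about the instructions GCC will actually choose.

```c
#include <stdint.h>

/* AND with 0 clears every bit, so the previous contents do not matter.
   This mirrors the effect of "andl $0, mem". */
static void store_zero_via_and(int32_t *p)
{
    *p &= 0;
}

/* OR with -1 (all bits set) sets every bit, so the result is always -1.
   This mirrors the effect of "orl $-1, mem". */
static void store_minus_one_via_or(int32_t *p)
{
    *p |= -1;
}
```

Both forms modify the flags register on x86, unlike a plain mov, which is why the peephole2 described above applies them only when clobbering the flags is known to be safe.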
Re: New ThreadSanitizer runtime (v3)
On Thu, 23 Dec 2021 at 13:10, Martin Liška wrote:
> >> On 11/22/21 20:01, Dmitry Vyukov wrote:
> >>> I've already reverted the change. So I will include a fix into the next
> >>> version.
> >>> Thanks for notifying.
> >>
> >> Hello.
> >>
> >> Am I correct that the patch set is installed again? Any near future plans
> >> for another revert of the patch? Do you think it's the right time to
> >> merge it to GCC?
> >
> > It has been re-landed at least twice since then :)
> > Well, I can't promise that it won't be reverted again. I obviously do
> > not want that. Hard to say. I need to send some follow up clean up
> > patches and I plan to wait till the end of the week (to avoid messy
> > multi commit reverts).
> > So if there are no deadlines to miss, I would suggest waiting until
> > the beginning of the next week.
>
> Hello.
>
> May I please ask about the status of TSANv3? Note we'll flip to stage4 stage
> in about 3 weeks and so I'm curious if we want to do one more merge from
> upstream?
>
> @Jakub: What do you think about it?

Hi Martin,

I last re-landed it 10 days ago:

b332134921b4 Mon, 13 Dec 2021 12:48:34 +0100 tsan: new runtime (v3)

So far it hasn't been reverted... A few more issues were fixed and it has now
survived testing in Chromium. I hope it won't be reverted anymore... but I
can't promise (it's not me who reverts it :))
I would say that if we wait a few more days, it's reasonably safe to assume
it sticks.
Re: New ThreadSanitizer runtime (v3)
On 11/30/21 05:17, Dmitry Vyukov wrote: On Mon, 29 Nov 2021 at 19:16, Martin Liška wrote: On 11/22/21 20:01, Dmitry Vyukov wrote: I've already reverted the change. So I will include a fix into the next version. Thanks for notifying. Hello. Am I correct that the patch set is installed again? Any near future plans for another revert of the patch? Do you think it's the right time to merge it to GCC? It has been re-landed at least twice since then :) Well, I can't promise that it won't be reverted again. I obviously do not want that. Hard to say. I need to send some follow up clean up patches and I plan to wait till the end of the week (to avoid messy multi commit reverts). So if there are no deadlines to miss, I would suggest waiting until the beginning of the next week. Hello. May I please ask about the status of TSANv3? Note we'll flip to stage4 stage in about 3 weeks and so I'm curious if we want to do one more merge from upstream? @Jakub: What do you think about it? Cheers, Martin
Re: [PATCH] libsanitizer: Fix setbuffer() interceptor (accept size not mode)
On 12/22/21 22:09, Azat Khuzhin via Gcc-patches wrote: So what is the right way here? - migrate all tests - write test only for setbuffer() - do not add any tests, since they are covered in llvm repo Hello. Yes, we don't automatically sync sanitizer tests when we merge from master. Historically, we have taken some upstream tests, but not all. Problem is that one needs to migrate (port) LIT markup to DejaGNU format so that it can be supported in the GCC test-suite. Cheers, Martin
[Bug c++/98662] checking ICE in friend_accessible_p since r227023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98662 Martin Liška changed: What|Removed |Added CC||jason at gcc dot gnu.org --- Comment #2 from Martin Liška --- Started with r6-2860-g7ac2c0bd17900b3c.
[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #17 from Uroš Bizjak --- (In reply to hubicka from comment #16) > > > > > > It could be done, but I was under impression that the sequence to load > > > 1.0f > > > into topmost elements nullifies the benefit of operation to divide two > > > > Sure, so perhaps we should somewhat increase the vectorization cost of > > V2SFmode > > division so that we would use it only if it is part of longer sequences? > > I wonder how the hardware implements it. If divps is of similar latency > as divss then I guess it is essentially always win to load 1.0 to the > upper part, since it is slow operation. On the other hand if divps is > about 4 times divss, then this may be harmful. > > Agner Fog seems to be listing divss and divps with same latencies. > For zen it is 10 cycles which should be enough to do the setup. OK, I'll prepare and test a formal patch.
[Bug middle-end/57955] [9/10/11/12 Regression] Uniquization of constants reduces alignment of initializers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57955 --- Comment #25 from Eric Botcazou --- > Before the gimplification change the initializer {1,} was promoted to a > static const and given an alignment of 128; due to this part of the code: > > if (align > DECL_ALIGN (new_tree)) > { > DECL_ALIGN (new_tree) = align; > DECL_USER_ALIGN (new_tree) = 1; > } > > But now it just uses DATA_ALIGNMENT (the code should be using > TARGET_CONSTANT_ALIGNMENT but does not right now, that was a proposal). Yes, exactly. The initializer is put into the "tree" constant pool, where its alignment is uniformly set to DATA_ALIGNMENT.
[Bug ada/79724] GNAT tools do not respect --program-suffix and --program-prefix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79724 --- Comment #2 from Francois-Xavier Coudert --- It's puzzling, because although I've never read any Ada code, I can see there is some logic in place that should deal with it. In make.adb there is: Gcc : String_Access := Program_Name ("gcc", "gnatmake"); Gnatbind : String_Access := Program_Name ("gnatbind", "gnatmake"); Gnatlink : String_Access := Program_Name ("gnatlink", "gnatmake"); and in osint.adb the function Program_Name has some logic to find prefix and suffix: -- Find the target prefix if any, for the cross compilation case. -- For instance in "powerpc-elf-gcc" the target prefix is -- "powerpc-elf-" -- Ditto for suffix, e.g. in "gcc-4.1", the suffix is "-4.1" and it uses them to return the program that is going to be used: -- Create the new program name return new String' (Name_Buffer (Start_Of_Prefix .. End_Of_Prefix) & Nam & Name_Buffer (Start_Of_Suffix .. Name_Len)); I tried to debug this by sticking some Ada.Text_IO.Put_Line calls, but either I'm doing it wrong, or it's never actually called.
[Bug ada/79724] GNAT tools do not respect --program-suffix and --program-prefix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79724

Francois-Xavier Coudert changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
                 CC|                            |charlet at adacore dot com,
                   |                            |derodat at adacore dot com,
                   |                            |fxcoudert at gcc dot gnu.org
            Summary|please respect calling gnat |GNAT tools do not respect
                   |tools configured with       |--program-suffix and
                   |--program-suffix and        |--program-prefix
                   |--program-prefix            |
   Last reconfirmed|                            |2021-12-23
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Francois-Xavier Coudert ---
Confirmed. This is blocking Ada integration into Homebrew:
https://github.com/Homebrew/homebrew-core/pull/77641

We configure GCC with:

configure --prefix=/tmp/irun --enable-languages=all --program-suffix=-11

This leads to the installation of suffixed gnat tools:

meau /tmp/irun $ ls bin/gnat*
bin/gnat-11      bin/gnatchop-11   bin/gnatkr-11    bin/gnatls-11    bin/gnatname-11
bin/gnatbind-11  bin/gnatclean-11  bin/gnatlink-11  bin/gnatmake-11  bin/gnatprep-11

But they're not usable:

meau /tmp/irun $ gnatmake-11 hello.adb
gcc -c hello.adb
clang: error: unknown argument: '-gnatea'
clang: error: unknown argument: '-gnatez'
gnatmake-11: "hello.adb" compilation error

because gnatmake-11 is calling the unsuffixed `gcc`, which does not support
Ada. (On macOS, this system compiler is a wrapper to clang.)

Even if I try to add some options to gnatmake, it will still somehow fail,
because the relevant options are not passed down to gnatlink:

meau /tmp/irun $ gnatmake-11 hello.adb --GCC=gcc-11 --GNATBIND=gnatbind-11 --GNATLINK=gnatlink-11
gcc-11 -c hello.adb
gnatbind-11 -x hello.ali
gnatlink-11 hello.ali
clang: error: unknown argument: '-gnatA'
clang: error: unknown argument: '-gnatWb'
clang: error: unknown argument: '-gnatiw'
clang: error: unknown argument: '-gnatws'
gnatmake-11: *** link failed.
[Bug tree-optimization/98552] Make more use of __builtin_undefined for assuring that variables do not change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98552 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2021-12-23 --- Comment #5 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/98598] Missed opportunity to optimize dependent loads in loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98598 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug c++/98662] checking ICE in friend_accessible_p since r227023
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98662 --- Comment #1 from Andrew Pinski --- Before GCC 6, we accepted the code.
[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #16 from hubicka at kam dot mff.cuni.cz --- > > > > It could be done, but I was under impression that the sequence to load 1.0f > > into topmost elements nullifies the benefit of operation to divide two > > Sure, so perhaps we should somewhat increase the vectorization cost of > V2SFmode > division so that we would use it only if it is part of longer sequences? I wonder how the hardware implements it. If divps is of similar latency as divss then I guess it is essentially always win to load 1.0 to the upper part, since it is slow operation. On the other hand if divps is about 4 times divss, then this may be harmful. Agner Fog seems to be listing divss and divps with same latencies. For zen it is 10 cycles which should be enough to do the setup.
Re: [PATCH take #3] PR target/103773: Fix wrong-code with -Oz from pop to memory.
On Thu, Dec 23, 2021 at 10:35 AM Roger Sayle wrote: > > Hi Uros, > > A huge thanks for the list of suggested improvements to the -Oz related > patches. > I've combined them altogether in the submission below, which makes sense now > that everything is implemented using peephole2. The implementation of > push/pop via peephole2 is exactly as you've suggested, also checking that the > immediate value isn't zero (the value -1 is still a size win over OR), and > extended > to include HImode (where it is a win), but not QImode (where it isn't). > > For writes to memory, I've extended *mov_or to allow memory destinations > and HImode, but I've introduced a new *mov_and for writing zero to > memory, > rather than complicate/overload *mov_xor (for example, it doesn't take > an > immediate). In this form, only a single peephole2 is needed, that adds a > clobber to > the instruction if the flags are dead. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check with no new failures, and the new testcase checked > both with and without -m32. Ok for mainline? > > > 2021-12-23 Roger Sayle > Uroš Bizjak > > gcc/ChangeLog > PR target/103773 > * config/i386/i386.md (*mov_and): New define_insn for > writing a zero to memory using AND. > (*mov_or): Extend to allow memory destination and HImode. > (*movdi_internal): Remove -Oz push/pop optimization from here. > (*movsi_internal): Likewise. > (peephole2): Perform -Oz push/pop optimization here, only for > register destinations, values other than zero, and in functions > that don't used the red zone. > (peephole2): With -Oz, convert writes of 0 or -1 to memory into > their clobber forms, i.e. *mov_and and *mov_or resp. > > gcc/testsuite/ChangeLog > PR target/103773 > * gcc.target/pr103773-2.c: New test case. > * gcc.target/pr103773.c: New test case. OK, but please add a small comment above new peephole2 patterns. Thanks, Uros.
[Bug c++/91008] error redeclaring the same type involving a non-type template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91008 Andrew Pinski changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |11.0 --- Comment #5 from Andrew Pinski --- (In reply to Martin Liška from comment #4) > Fixed with r11-8207-g89c863488bc8c731. That patch is fixing this testcase explictly.
[PATCH take #3] PR target/103773: Fix wrong-code with -Oz from pop to memory.
Hi Uros, A huge thanks for the list of suggested improvements to the -Oz related patches. I've combined them altogether in the submission below, which makes sense now that everything is implemented using peephole2. The implementation of push/pop via peephole2 is exactly as you've suggested, also checking that the immediate value isn't zero (the value -1 is still a size win over OR), and extended to include HImode (where it is a win), but not QImode (where it isn't). For writes to memory, I've extended *mov_or to allow memory destinations and HImode, but I've introduced a new *mov_and for writing zero to memory, rather than complicate/overload *mov_xor (for example, it doesn't take an immediate). In this form, only a single peephole2 is needed, that adds a clobber to the instruction if the flags are dead. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures, and the new testcase checked both with and without -m32. Ok for mainline? 2021-12-23 Roger Sayle Uroš Bizjak gcc/ChangeLog PR target/103773 * config/i386/i386.md (*mov_and): New define_insn for writing a zero to memory using AND. (*mov_or): Extend to allow memory destination and HImode. (*movdi_internal): Remove -Oz push/pop optimization from here. (*movsi_internal): Likewise. (peephole2): Perform -Oz push/pop optimization here, only for register destinations, values other than zero, and in functions that don't used the red zone. (peephole2): With -Oz, convert writes of 0 or -1 to memory into their clobber forms, i.e. *mov_and and *mov_or resp. gcc/testsuite/ChangeLog PR target/103773 * gcc.target/pr103773-2.c: New test case. * gcc.target/pr103773.c: New test case. Many thanks again for your help. Roger -- > -Original Message- > From: Uros Bizjak > Sent: 22 December 2021 15:24 > To: Roger Sayle > Subject: Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to > memory. 
> > On Wed, Dec 22, 2021 at 3:19 PM Uros Bizjak wrote: > > > > On Wed, Dec 22, 2021 at 2:57 PM Uros Bizjak wrote: > > > > > > On Wed, Dec 22, 2021 at 2:12 PM Roger Sayle > wrote: > > > > > > > > > > > > Hi Uros, > > > > I'm bootstrapping and regression testing your proposed patch now > > > > (including the removal/reversion of my pieces in *mov[sd]i2_internal). > > > > Many thanks for all of your help with this. > > > > > > Probably you want to avoid transformation of loads of 0 and -1, > > > which should still be implemented via xor %reg, %ref and or $-1, %eax. > > > > This constraint will result in optimal conversion approach: > > > > + "optimize_insn_for_size_p () && optimize_size > 1 > > + && operands[1] != const0_rtx && operands[1] != constm1_rtx > > + && IN_RANGE (INTVAL (operands[1]), -128, 127) > > + && !ix86_red_zone_used" > > I think we should also convert HImode and QImode initializations, the pattern > supports it by changing the mode iterator to SWI. > > Uros. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 58b1064..b709a3e 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2028,9 +2028,19 @@ (set_attr "mode" "SI") (set_attr "length_immediate" "0")]) +(define_insn "*mov_and" + [(set (match_operand:SWI248 0 "memory_operand" "=m") + (match_operand:SWI248 1 "const0_operand")) + (clobber (reg:CC FLAGS_REG))] + "reload_completed" + "and{}\t{%1, %0|%0, %1}" + [(set_attr "type" "alu1") + (set_attr "mode" "") + (set_attr "length_immediate" "1")]) + (define_insn "*mov_or" - [(set (match_operand:SWI48 0 "register_operand" "=r") - (match_operand:SWI48 1 "constm1_operand")) + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm") + (match_operand:SWI248 1 "constm1_operand")) (clobber (reg:CC FLAGS_REG))] "reload_completed" "or{}\t{%1, %0|%0, %1}" @@ -2218,14 +2228,7 @@ case TYPE_IMOV: gcc_assert (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[1])); if (get_attr_mode (insn) == MODE_SI) - { - if (optimize_size > 1 - && 
TARGET_64BIT - && CONST_INT_P (operands[1]) - && IN_RANGE (INTVAL (operands[1]), -128, 127)) - return "push{q}\t%1\n\tpop{q}\t%0"; - return "mov{l}\t{%k1, %k0|%k0, %k1}"; - } + return "mov{l}\t{%k1, %k0|%k0, %k1}"; else if (which_alternative == 4) return "movabs{q}\t{%1, %0|%0, %1}"; else if (ix86_use_lea_for_mov (insn, operands)) @@ -2443,14 +2446,6 @@ gcc_assert (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[1])); if (ix86_use_lea_for_mov (insn, operands)) return "lea{l}\t{%E1, %0|%0, %E1}"; - else if (optimize_size > 1 - && CONST_INT_P (operands[1]) - && IN_RANGE (INTVAL (operands[1]), -128, 127)) - { - if (TARGET_64BIT) - return
[Bug c++/98523] Bug with class static definition and non-type template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98523 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Known to fail||4.1.2 Last reconfirmed||2021-12-23 Status|UNCONFIRMED |NEW --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug c++/98450] Inconsistent Wunused-variable warning for std::array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98450 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Known to fail||12.0 Ever confirmed|0 |1 Last reconfirmed||2021-12-23 --- Comment #2 from Andrew Pinski --- Confirmed.
[Bug libstdc++/102221] Missed optimizations for algorithms over std::unique_ptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102221 Andrew Pinski changed: What|Removed |Added Keywords||alias Severity|normal |enhancement
[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #15 from Jakub Jelinek --- (In reply to Uroš Bizjak from comment #12) > (In reply to Jakub Jelinek from comment #10) > > At least on your short testcase clang doesn't use divps either. > > We do support mulv2sf3, addv2sf3 etc. but not divv2sf3 I bet because with > > TARGET_MMX_WITH_SSE it would divide by zero in the 3rd and 4th elts, > > but perhaps we could insert 1.0f, 1.0f into those elements of the divisor > > before using divps? > > It could be done, but I was under impression that the sequence to load 1.0f > into topmost elements nullifies the benefit of operation to divide two Sure, so perhaps we should somewhat increase the vectorization cost of V2SFmode division so that we would use it only if it is part of longer sequences?
[Bug c++/103524] [meta-bug] modules issue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524 Bug 103524 depends on bug 103814, which changed state. Bug 103814 Summary: Internal error while compiling concepts, exception and fstream modules. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103814 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug c++/67491] [meta-bug] concepts issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67491 Bug 67491 depends on bug 103814, which changed state. Bug 103814 Summary: Internal error while compiling concepts, exception and fstream modules. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103814 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug c++/99244] [modules] ICE in tsubst_copy, at cp/pt.c:16581
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99244 Andrew Pinski changed: What|Removed |Added CC||samuel.hangouet at gmail dot com --- Comment #4 from Andrew Pinski --- *** Bug 103814 has been marked as a duplicate of this bug. ***
[Bug c++/103814] Internal error while compiling concepts, exception and fstream modules.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103814 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Andrew Pinski --- Dup of bug 99244. *** This bug has been marked as a duplicate of bug 99244 ***
[PATCH V2] fixed testcase riscv/pr103302.c
From: LiaoShihua

Because riscv32 does not support __int128, skip the test when int128 is
not supported.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr103302.c: Skip if int128 is not supported.

---
 gcc/testsuite/gcc.target/riscv/pr103302.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/pr103302.c b/gcc/testsuite/gcc.target/riscv/pr103302.c
index 822c4087416..cfaa47c 100644
--- a/gcc/testsuite/gcc.target/riscv/pr103302.c
+++ b/gcc/testsuite/gcc.target/riscv/pr103302.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target int128 } } */
 /* { dg-options "-Og -fharden-compares -fno-tree-dce -fno-tree-fre " } */

 typedef unsigned char u8;
--
2.31.1.windows.1
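
The patch handles the missing type at the testsuite level with an effective-target selector. In ordinary C code the same condition can be checked at compile time. The sketch below assumes GCC's predefined macro __SIZEOF_INT128__, which is defined only when __int128 is available; the names `wide_t`, `HAVE_INT128` and `wide_bits` are hypothetical and exist only for this illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Pick the widest unsigned type the target offers. On targets without
   __int128 (riscv32, per the patch above) fall back to 64 bits. */
#ifdef __SIZEOF_INT128__
typedef unsigned __int128 wide_t;
#define HAVE_INT128 1
#else
typedef uint64_t wide_t;
#define HAVE_INT128 0
#endif

/* Number of bits in the chosen type. */
static unsigned wide_bits(void)
{
    return (unsigned)(sizeof(wide_t) * CHAR_BIT);
}
```

A test that genuinely needs 128-bit arithmetic, like pr103302.c, cannot fall back this way, so skipping it on such targets is the appropriate fix there.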
[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797

--- Comment #14 from Uroš Bizjak ---
(In reply to Uroš Bizjak from comment #13)
> Created attachment 52051 [details]
> Patch that implements v2sf division

This patch also enables vectorization of the testcase from Comment #7. Using
-ffast-math, it also generates vectorized reciprocal:

        movss   f(%rip), %xmm4
        movss   test+8(%rip), %xmm3
        movq    test(%rip), %xmm2
        mulss   %xmm4, %xmm3
        movaps  %xmm4, %xmm0
        shufps  $0xe0, %xmm0, %xmm0
        mulps   %xmm0, %xmm2
        movhps  .LC0(%rip), %xmm0
-->     rcpps   %xmm0, %xmm1
        sqrtss  %xmm3, %xmm3
        mulps   %xmm1, %xmm0
        sqrtps  %xmm2, %xmm2
        divss   %xmm4, %xmm3
        movaps  %xmm2, %xmm5
        mulps   %xmm1, %xmm0
        addps   %xmm1, %xmm1
        subps   %xmm0, %xmm1
        mulps   %xmm1, %xmm5
        movlps  %xmm5, test(%rip)
        movss   %xmm3, test+8(%rip)
        ret
[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797

--- Comment #13 from Uroš Bizjak ---
Created attachment 52051
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52051&action=edit
Patch that implements v2sf division

Please try the attached patch. For the following testcase:

--cut here--
float a[2], b[2], r[2];

void bar (void)
{
  int i;

  for (i = 0; i < 2; i++)
    r[i] = a[i] / b[i];
}
--cut here--

the compiler generates:

        movq    b(%rip), %xmm1
        movq    a(%rip), %xmm0
        movhps  .LC0(%rip), %xmm1
        divps   %xmm1, %xmm0
        movlps  %xmm0, r(%rip)
        ret
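
The idea behind the generated code (fill the unused upper lanes of the divisor with 1.0f so the four-lane division never divides by zero there) can be emulated in plain C. This is a sketch of the concept only, not of the patch: `div_v2sf` is a hypothetical name, and the four-element arrays merely stand in for an SSE register.

```c
/* Emulate a 4-lane division used for a 2-element vector. The two upper
   divisor lanes are set to 1.0f, mirroring the movhps .LC0 load above,
   so the unused lanes compute 0.0f / 1.0f instead of 0.0f / 0.0f. */
static void div_v2sf(const float a[2], const float b[2], float r[2])
{
    float num[4] = { a[0], a[1], 0.0f, 0.0f };
    float den[4] = { b[0], b[1], 1.0f, 1.0f };
    float quot[4];

    for (int i = 0; i < 4; i++)
        quot[i] = num[i] / den[i];  /* all four lanes, as divps would do */

    /* Only the two low lanes are stored back, as movlps does. */
    r[0] = quot[0];
    r[1] = quot[1];
}
```

The extra load for the upper lanes is the setup cost weighed against the division latency elsewhere in this thread.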
[Bug c++/103814] New: Internal error while compiling concepts, exception and fstream modules.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103814

            Bug ID: 103814
           Summary: Internal error while compiling concepts, exception and
                    fstream modules.
           Product: gcc
           Version: 11.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: samuel.hangouet at gmail dot com
  Target Milestone: ---

To reproduce, just type:

rm -rf ./gcm.cache
g++-11 -fmodules-ts -std=c++20 -x c++-system-header concepts
g++-11 -fmodules-ts -std=c++20 -x c++-system-header exception
g++-11 -fmodules-ts -std=c++20 -x c++-system-header fstream

Here is the resulting error message:

In file included from /usr/include/c++/11/bits/nested_exception.h:40,
                 from /usr/include/c++/11/exception:148,
of module /usr/include/c++/11/exception, imported at /usr/include/c++/11/ios:39,
        included from /usr/include/c++/11/istream:38,
                 from /usr/include/c++/11/fstream:38:
/usr/include/c++/11/bits/move.h: In instantiation of ‘constexpr std::_Require >, std::is_move_constructible<_Tp>, std::is_move_assignable<_Tp> > std::swap(_Tp&, _Tp&) [with _Tp = _IO_FILE*; std::_Require >, std::is_move_constructible<_Tp>, std::is_move_assignable<_Tp> > = void]’:
/usr/include/x86_64-linux-gnu/c++/11/bits/basic_file.h:79:11:   required from here
/usr/include/c++/11/bits/move.h:204:19: internal compiler error: in tsubst_copy, at cp/pt.c:16660
  204 |       _Tp __tmp = _GLIBCXX_MOVE(__a);
      |                   ^
0xe306b3 internal_error(char const*, ...)
        ???:0
0xe27039 fancy_abort(char const*, int, char const*)
        ???:0
0x10edb0c tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
        ???:0
0x10ee05d tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
        ???:0
0x10eefae tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool)
        ???:0
0x1178f37 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
        ???:0
0x117927d tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
        ???:0
0x1178ffe tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
        ???:0
0x1178f95 tsubst_expr(tree_node*, tree_node*, int, tree_node*, bool)
        ???:0
0x122150a instantiate_decl(tree_node*, bool, bool)
        ???:0
0xfa3850 instantiate_pending_templates(int)
        ???:0
0xf9f8f3 c_parse_final_cleanups()
        ???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See for instructions.

I'm using standard g++-11 shipped with ubuntu 20.04 :

$ g++-11 -v
Using built-in specs.
COLLECT_GCC=g++-11
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.1.0-1ubuntu1~20.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --disable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-2V7zgg/gcc-11-11.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-2V7zgg/gcc-11-11.1.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.1.0 (Ubuntu 11.1.0-1ubuntu1~20.04)
Re: Inconsistent segmentation fault in GCC
On Thu, 23 Dec 2021, 06:07 Alessandro Baretta via Gcc, wrote:
>
> How I might help diagnose and fix this bug? For instance, how does one
> run gcc from inside gdb? I know that gcc is just the driver and
> cc1plus is the actual compiler, so I presume I'd have to run cc1plus
> inside gdb, but as far as I know cc1plus cannot be called directly.
>

That's not true, it can be run directly. The full cc1plus command is shown
if you add -v to your GCC command.

https://gcc.gnu.org/wiki/DebuggingGCC has some more tips.
[Bug middle-end/103813] [11/12 Regression] Crash in decompose, at wide-int.h:984 fold-const since r11-5271-g4866b2f5db117f9e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103813

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org
            Summary|[11/12 Regression] Crash in |[11/12 Regression] Crash in
                   |decompose, at               |decompose, at
                   |wide-int.h:984 fold-const   |wide-int.h:984 fold-const
                   |                            |since
                   |                            |r11-5271-g4866b2f5db117f9e

--- Comment #4 from Martin Liška ---
Started with r11-5271-g4866b2f5db117f9e.
[Bug target/103808] [12 Regression] '-fcompare-debug' failure (length) w/ -O2 -ftrapv since r12-5944-ga7acb6dca941db2b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103808

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
            Summary|[12 Regression]             |[12 Regression]
                   |'-fcompare-debug' failure   |'-fcompare-debug' failure
                   |(length) w/ -O2 -ftrapv     |(length) w/ -O2 -ftrapv
                   |                            |since
                   |                            |r12-5944-ga7acb6dca941db2b
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org

--- Comment #4 from Martin Liška ---
Example from #c3 started to fail with r12-5944-ga7acb6dca941db2b.
[Bug ipa/103786] Suspicious code in verify_type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103786

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Martin Liška ---
Fixed.
[Bug ipa/103786] Suspicious code in verify_type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103786

--- Comment #2 from CVS Commits ---
The master branch has been updated by Martin Liska:

https://gcc.gnu.org/g:9ac0730c25b357b5fc75e18677cec27a546c1b64

commit r12-6104-g9ac0730c25b357b5fc75e18677cec27a546c1b64
Author: Feng Xue
Date:   Tue Dec 21 09:48:16 2021 +0100

    Fix typo in type verification.

            PR ipa/103786

    gcc/ChangeLog:

            * tree.c (verify_type): Fix typo.
[Bug tree-optimization/88842] missing optimization CSE, reassociation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88842

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement