[Bug target/100711] Miss optimization for pandn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #9 from Hongtao.liu ---
(In reply to jbeulich from comment #8)
> Since the commit doesn't really explain it (maybe it's obvious to others,
> but it isn't to me), may I ask why two splitters were introduced, yet then
> still not covering all possible modes? VI48_128 only covers two of the four
> possible SSE2 modes, while VI124_AVX2 leaves out all DI-element-size ones as
> well as all 512-bit ones. Shouldn't both be folded, using VI_AVX2 as the
> mode iterator?

We don't have a single instruction for V8HI/V16QImode broadcast without AVX2; that's why the first splitter only has VI48_128. And yes, for the second splitter, I think we should use VI_AVX2 to cover all modes.

> As an aside, it is also interesting that the 1st splitter uses TARGET_SSE
> without the corresponding testcase limiting itself to just SSE. When
> building that testcase with SSE2 turned off, foo() uses shufps and andnps as
> expected, but the splitter doesn't appear to come into play at all for
> bar(), when really it is only the broadcast that needs synthesizing, while
> andnps can be used regardless of mode.
[Bug fortran/87270] "FINAL" subroutine is called when compiled with "gfortran -O1", but not "gfortran -O0"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87270

Paul Thomas changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|WAITING                     |RESOLVED

--- Comment #7 from Paul Thomas ---
Hi Harald,

I don't know how I missed this in my final clean-up. Closing

Paul
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 --- Comment #10 from Andrew Pinski --- (In reply to Alexander Monakov from comment #8) > I think the following testcase indicates that GCC assumes that tail padding > is accessible: Well, aligned accesses are always accessible; `struct S` in this case is 4-byte aligned, after all.
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 --- Comment #9 from Martin Uecker --- Clang as well, but that would be only padding inside the first part without taking into account the extra element in the FAM. I am more concerned about programmers using the formula sizeof(.) + n * sizeof for memcpy etc. (and we have an example in the standard using this formula). Creating objects smaller than this seems a bit dangerous.
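To make the two size computations in this thread concrete, here is a minimal C sketch. It is not taken from the bug report. It uses the struct from the bug title and compares the `sizeof`-based formula with an `offsetof`-based minimum. The helper names `size_by_sizeof`, `size_by_offsetof`, and `make_s` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The struct from the bug title: two fixed members, then a flexible array. */
struct s { int a; char b; char t[]; };

/* Size from the commonly used formula: sizeof the struct plus n elements. */
static size_t size_by_sizeof(size_t n)
{
    return sizeof(struct s) + n * sizeof(char);
}

/* Smallest size that still covers the fixed members and n array elements. */
static size_t size_by_offsetof(size_t n)
{
    return offsetof(struct s, t) + n * sizeof(char);
}

/* Hypothetical helper: allocate with the sizeof-based formula, so code that
   later copies size_by_sizeof(n) bytes stays within the allocation. */
static struct s *make_s(int a, char b, const char *data, size_t n)
{
    assert(data != NULL || n == 0);
    struct s *p = malloc(size_by_sizeof(n));
    if (p == NULL)
        return NULL;
    p->a = a;
    p->b = b;
    if (n > 0)
        memcpy(p->t, data, n);
    return p;
}
```

On a typical ABI with a 4-byte `int`, `sizeof(struct s)` is 8 and `offsetof(struct s, t)` is 5, so the first formula yields 9 bytes for one array element, which matches the figure in the bug title. The exact values are implementation-defined.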
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #8 from Alexander Monakov ---
(In reply to jos...@codesourcery.com from comment #6)
> For the standard, dynamically allocated case, you should only need to
> allocate enough memory to contain the initial part of the struct and the
> array members being accessed - not any padding after that array. (There
> were wording problems before C99 TC2; see DR#282.)

I think the following testcase indicates that GCC assumes that tail padding is accessible:

struct S {
    int i;
    char c;
    char fam[];
};

void f(struct S *p, struct S *q)
{
    *p = *q;
}

f:
        movq    (%rsi), %rax
        movq    %rax, (%rdi)
        ret

Sorry for the tangential remark, but there seems to be a contradiction.
[Bug fortran/90504] Improved NORM2 algorithm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90504 --- Comment #2 from Janne Blomqvist --- (In reply to anlauf from comment #1) > (In reply to Janne Blomqvist from comment #0) > > Hanson, Hopkins, Remark on Algorithm 539: A Modern Fortran Reference > > Implementation for Carefully Computing the Euclidean Norm, > > https://dl.acm.org/citation.cfm?id=3134441 > > > > Above article tests different algorithms for NORM2 and tests performance and > > numerical accuracy. > > This article is behind a paywall. > > Is there a publicly available description? https://kar.kent.ac.uk/67205/1/remark.pdf (Found via the https://unpaywall.org/ browser extension)
[Bug tree-optimization/109959] `(a > 1) ? 0 : (a == 1)` is not optimized when spelled out at -O2+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109959 --- Comment #4 from Andrew Pinski --- Note the underlying issue with VRP is similar to PR 109959, but it is about a slightly different optimization.
[Bug fortran/109948] [13/14 Regression] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948 --- Comment #5 from Rimvydas (RJ) --- (In reply to anlauf from comment #4) > Can you check if this works for you? This patch avoids the issue in all other associate use cases (tried on the gcc-13 branch). However, it is a bit suspicious that using variable name abbreviations (to dig out arrays from deeply nested types) is enough to change how the internal gfc_array_ref is populated. The ICE was triggered only on patterns that first use indexed access through the abbreviated name (like k(1)) and then perform any operation involving the whole array.
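[Bug tree-optimization/109960] [10/11/12/13/14 Regression] missing combining of `(a&1) != 0 || (a&2)!=0` into `(a&3)!=0`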
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109960

--- Comment #4 from Andrew Pinski ---
I happened to notice this because I am working on a match patch that transforms `a ? 1 : b` into `a | b`.

In the case of stmt_can_terminate_bb_p, I noticed we had:

[local count: 330920071]:
_48 = MEM[(const struct gasm *)t_22(D)].D.129035.D.128905.D.128890.subcode;
_49 = _48 & 2;
if (_49 != 0)
  goto ; [34.00%]
else
  goto ; [66.00%]

[local count: 218407246]:
_50 = (bool) _48;

[local count: 940291388]:
# _13 = PHI <0(14), _50(32), _12(29), 0(11), 0(30), 1(2), 1(31), 0(25)>

And the patch to match would do:

[local count: 330920071]:
_48 = MEM[(const struct gasm *)t_22(D)].D.129035.D.128905.D.128890.subcode;
_49 = _48 & 2;
_50 = (bool) _48;
_127 = _49 != 0;
_44 = _50 | _127;

[local count: 940291388]:
# _13 = PHI <0(14), 0(25), _12(29), 0(11), 0(30), 1(2), _44(31)>

Which is definitely better than before, but I was like, isn't that the same as:

_49 = _48 & 3;
_44 = _49 != 0;
[Bug target/100106] [10 Regression] ICE in gen_movdi, at config/arm/arm.md:6187 since r10-2840-g70cdb21e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100106 --- Comment #10 from CVS Commits --- The master branch has been updated by Alexandre Oliva : https://gcc.gnu.org/g:d6b756447cd58bcca20e6892790582308b869817 commit r14-1187-gd6b756447cd58bcca20e6892790582308b869817 Author: Alexandre Oliva Date: Wed May 24 03:07:56 2023 -0300 [PR100106] Reject unaligned subregs when strict alignment is required The testcase for pr100106, compiled with optimization for 32-bit powerpc -mcpu=604 with -mstrict-align expands the initialization of a union from a float _Complex value into a load from an SCmode constant pool entry, aligned to 4 bytes, into a DImode pseudo, requiring 8-byte alignment. The patch that introduced the testcase modified simplify_subreg to avoid changing the MEM to outermode, but simplify_gen_subreg still creates a SUBREG or a MEM that would require stricter alignment than MEM's, and lra_constraints appears to get confused by that, repeatedly creating unsatisfiable reloads for the SUBREG until it exceeds the insn count. Avoiding the unaligned SUBREG, expand splits the DImode dest into SUBREGs and loads each SImode word of the constant pool with the proper alignment. for gcc/ChangeLog PR target/100106 * emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that requires stricter alignment than MEM's. for gcc/testsuite/ChangeLog PR target/100106 * gcc.target/powerpc/pr100106-sa.c: New.
[Bug target/109933] __atomic_test_and_set is broken for BIG ENDIAN riscv targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109933 --- Comment #9 from Rory Bolt --- Created attachment 55153 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55153&action=edit patch Tested fix for big endian, NOT tested on little endian
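[Bug tree-optimization/109960] [10/11/12/13/14 Regression] missing combining of `(a&1) != 0 || (a&2)!=0` into `(a&3)!=0`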
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109960

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|1                           |0
             Status|ASSIGNED                    |UNCONFIRMED
           Assignee|pinskia at gcc dot gnu.org  |unassigned at gcc dot gnu.org

--- Comment #3 from Andrew Pinski ---
Nope, not working. I even tried to figure out how to modify tree-ssa-reassoc.cc to teach it that `(bool)a` is the same as `(a & 1) != 0`, but I could not figure out how.
[Bug c++/109961] auto assigned from requires and lambda inside
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109961

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
            Summary|storage size of 'variable   |auto assigned from requires
                   |name' isn't known           |and lambda inside
           Keywords|                            |c++-lambda, rejects-valid
   Last reconfirmed|                            |2023-05-25
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski ---
Confirmed. Reduced all the way:
```
auto a = requires{
    []() {};
};
```
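[Bug tree-optimization/109960] [10/11/12/13/14 Regression] missing combining of `(a&1) != 0 || (a&2)!=0` into `(a&3)!=0`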
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109960

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-05-25
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski ---
Or maybe extend recognize_single_bit_test to recognize (bool)a != 0 is the same as a & 1 != 0. Let me try that.
[Bug c++/109961] New: storage size of 'variable name' isn't known
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109961

            Bug ID: 109961
           Summary: storage size of 'variable name' isn't known
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Darrell.Wright at gmail dot com
  Target Milestone: ---

The following valid code fails to compile in gcc-trunk on https://foo.godbolt.org/z/vGMGbv8oP

auto a = requires{
    []( int b ) consteval {
        if( b ) {
            throw b;
        }
    }( 0 );
};

With the following error:

:3:6: error: storage size of 'a' isn't known
    3 | auto a = requires{
      |      ^
Compiler returned: 1
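[Bug tree-optimization/109960] [10/11/12/13/14 Regression] missing combining of `(a&1) != 0 || (a&2)!=0` into `(a&3)!=0`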
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109960 --- Comment #1 from Andrew Pinski --- We could have a pattern that does: `(a & CST) != 0 ? 1 : (bool)a` -> `(a & (CST|1)) != 0` to fix this, I think.
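The proposed pattern can be sanity-checked at the source level with a small C sketch, given here only as an illustration and not as the actual match.pd change. One assumption matters: in the GIMPLE shown in this thread, `(bool)a` keeps just the low bit, so it is modelled as `(a & 1) != 0` rather than as C's `a != 0`. The function names `low_bit`, `before`, `after`, and `check_pattern` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the GIMPLE (bool) conversion discussed in this bug: only the
   low bit survives.  This differs from a C cast to bool, which tests a != 0. */
static bool low_bit(unsigned a)
{
    return (a & 1u) != 0;
}

/* Left-hand side of the pattern: (a & CST) != 0 ? 1 : (bool)a */
static bool before(unsigned a, unsigned cst)
{
    return (a & cst) != 0 ? true : low_bit(a);
}

/* Right-hand side of the pattern: (a & (CST|1)) != 0 */
static bool after(unsigned a, unsigned cst)
{
    return (a & (cst | 1u)) != 0;
}

/* Exhaustively compare both sides over small values of a and CST. */
void check_pattern(void)
{
    for (unsigned cst = 0; cst < 64; cst++)
        for (unsigned a = 0; a < 64; a++)
            assert(before(a, cst) == after(a, cst));
}
```

The two sides agree because the result is true exactly when `a` has a bit set in `CST` or has bit 0 set, which is the same as having any bit set in `CST|1`.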
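[Bug tree-optimization/109960] [10/11/12/13/14 Regression] missing combining of `(a&1) != 0 || (a&2)!=0` into `(a&3)!=0`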
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109960

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |8.5.0
      Known to fail|                            |9.1.0
   Target Milestone|---                         |10.5
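[Bug tree-optimization/109960] New: [10/11/12/13/14 Regression] missing combining of `(a&1) != 0 || (a&2)!=0` into `(a&3)!=0`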
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109960

            Bug ID: 109960
           Summary: [10/11/12/13/14 Regression] missing combining of
                    `(a&1) != 0 || (a&2)!=0` into `(a&3)!=0`
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take the following C++ code (reduced from stmt_can_terminate_bb_p):
```
static inline bool f1(unsigned *a)
{
  return (*a&1);
}
static inline bool f2(unsigned *a)
{
  return (*a&2);
}
bool f(int c, unsigned *a)
{
  if (c)
    return 0;
  return f2(a) || f1(a) ;
}
```

At -O1 we can produce:
```
        movl    $0, %eax
        testl   %edi, %edi
        jne     .L1
        testb   $3, (%rsi)
        setne   %al
.L1:
        ret
```

But at -O2 we get:

        xorl    %eax, %eax
        testl   %edi, %edi
        jne     .L1
        movl    (%rsi), %edx
        movl    %edx, %eax
        andl    $1, %eax
        andl    $2, %edx
        movl    $1, %edx
        cmovne  %edx, %eax
.L1:
        ret

Which is just so much worse. This started in GCC 9.
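As a quick check that the combination requested in this report is semantics-preserving, the following C sketch compares the reduced testcase against a hand-combined version. It is an illustration only: `f_combined` and `check_f` are hypothetical names, and the testcase is restated in C with explicit `!= 0` comparisons.

```c
#include <assert.h>
#include <stdbool.h>

static inline bool f1(const unsigned *a) { return (*a & 1) != 0; }
static inline bool f2(const unsigned *a) { return (*a & 2) != 0; }

/* Same shape as the reduced testcase: two single-bit tests joined by ||. */
bool f(int c, const unsigned *a)
{
    if (c)
        return false;
    return f2(a) || f1(a);
}

/* Hand-combined form: one mask covering both bits, one comparison. */
bool f_combined(int c, const unsigned *a)
{
    if (c)
        return false;
    return (*a & 3) != 0;
}

/* Compare both versions for every low-byte value and both values of c. */
void check_f(void)
{
    for (unsigned v = 0; v < 256; v++)
        for (int c = 0; c < 2; c++)
            assert(f(c, &v) == f_combined(c, &v));
}
```

The single-mask form corresponds to the `testb $3, (%rsi)` sequence in the -O1 output quoted above.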
[Bug target/109927] Bootstrap fails for m68k in stage2 compilation of gimple-match.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109927

--- Comment #18 from Stan Johnson ---
$ git clone git://gcc.gnu.org/git/gcc.git
$ cd gcc
$ git checkout master

I'm testing a manual bootstrap of "gcc version 14.0.0 20230524 (experimental) (GCC)" now, accessed via git as shown above. It will still take about 24 more hours for the bootstrap to finish (I'll send an update if it fails), but with gimple-match.cc (and generic-match.cc, which was not affected in my tests) split up, it looks like it will finish ok (it's currently in about the middle of stage 2 and has successfully compiled all the gimple-match-n.cc files).

Note that Gentoo's emerge of gcc-13 behaves a little differently than a manual bootstrap. I don't know why, since I think I'm using Gentoo's ./configure options in the manual bootstrap, but in Gentoo's emerge of gcc, they seem to run cc1plus and "as" simultaneously for each compilation, perhaps aggravating the memory issue for gimple-match.cc (or maybe not, since the problem is virtual memory exhausted, not swap space exhausted).

Anyway, it looks like the solution was already close. Does anyone know whether the change will be backported to gcc-12 or gcc-13 available from ftp.gnu.org/pub/gnu/gcc?

Thanks to all of the GNU developers who continue to make modern tools available for use on old hardware!
[Bug tree-optimization/109959] `(a > 1) ? 0 : (a == 1)` is not optimized when spelled out at -O2+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109959

--- Comment #3 from Andrew Pinski ---
Here is another related testcase; this one is the exact reduction from bitmap_single_bit_set_p:
```
_Bool f(unsigned a, int t)
{
  void g(void);
  if (t)
    return 0;
  g();
  if (a > 1)
    return 0;
  return a == 1;
}
```

This should be optimized down to:
```
_Bool f(unsigned a, int t)
{
  void g(void);
  if (t)
    return 0;
  g();
  return a == 1;
}
```
[Bug tree-optimization/109959] `(a > 1) ? 0 : (a == 1)` is not optimized when spelled out at -O2+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109959 --- Comment #2 from Andrew Pinski --- I should note I found this while looking at code generation of bitmap_single_bit_set_p after a match pattern addition.
[Bug tree-optimization/109959] `(a > 1) ? 0 : (a == 1)` is not optimized when spelled out at -O2+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109959

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|`(a > 1) ? 0 : (a == 1)` is |`(a > 1) ? 0 : (a == 1)` is
                   |not optimized when spelled  |not optimized when spelled
                   |out                         |out at -O2+

--- Comment #1 from Andrew Pinski ---
I should say this is at -O2. Part of the reason is that VRP changes `a == 1` to `(bool)a`, and then phiopt comes along and decides to factor out the conversion (phiopt did that even before my recent changes).

At -O1, it has actually been optimized during reassoc1 since GCC 7 (because the above is not done).
[Bug tree-optimization/109959] New: `(a > 1) ? 0 : (a == 1)` is not optimized when spelled out
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109959

            Bug ID: 109959
           Summary: `(a > 1) ? 0 : (a == 1)` is not optimized when spelled
                    out
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
_Bool f(unsigned a)
{
  if (a > 1)
    return 0;
  return a == 1;
}

_Bool f0(unsigned a)
{
  return (a > 1) ? 0 : (a == 1);
}
```
Both of these should just optimize to `return a == 1`; f0 currently is.
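The claimed simplification is easy to confirm by hand: whenever `a > 1` holds, `a == 1` is already false, so the early return never changes the result. The C sketch below checks this on a few sample values. It is an illustration only, and the names `f_spelled_out`, `f_ternary`, `f_simplified`, and `check_simplification` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* The spelled-out form from the report. */
static bool f_spelled_out(unsigned a)
{
    if (a > 1)
        return false;
    return a == 1;
}

/* The ternary form from the report. */
static bool f_ternary(unsigned a)
{
    return (a > 1) ? false : (a == 1);
}

/* The form both are expected to reduce to. */
static bool f_simplified(unsigned a)
{
    return a == 1;
}

/* Compare the three forms on boundary and representative values. */
void check_simplification(void)
{
    const unsigned samples[] = { 0u, 1u, 2u, 3u, 255u, UINT_MAX };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        unsigned a = samples[i];
        assert(f_spelled_out(a) == f_simplified(a));
        assert(f_ternary(a) == f_simplified(a));
    }
}
```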
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 --- Comment #7 from joseph at codesourcery dot com --- I suppose the question is how to interpret "the longest array (with the same element type) that would not make the structure larger than the object being accessed". The difficulty of interpreting "make the structure larger" in terms of including post-array padding in the replacement structure is that there might not be a definition of what that post-array padding should be given the offset of the array need not be the same as the offset with literal replacement in the struct definition.
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 --- Comment #6 from joseph at codesourcery dot com --- For the standard, dynamically allocated case, you should only need to allocate enough memory to contain the initial part of the struct and the array members being accessed - not any padding after that array. (There were wording problems before C99 TC2; see DR#282.)
[Bug tree-optimization/107986] [12/13/14 Regression] Bogus -Warray-bounds diagnostic with std::sort
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107986 --- Comment #9 from CVS Commits --- The master branch has been updated by Andrew Macleod : https://gcc.gnu.org/g:1cd5bc387c453126fdb4c9400096180484ecddee commit r14-1179-g1cd5bc387c453126fdb4c9400096180484ecddee Author: Andrew MacLeod Date: Wed May 24 09:52:26 2023 -0400 Gimple range PHI analyzer and testcases Provide a PHI analyzer framework to provide better initial values for PHI nodes which form groups with initial values and single statements which modify the PHI values in some predictable way. PR tree-optimization/107822 PR tree-optimization/107986 gcc/ * Makefile.in (OBJS): Add gimple-range-phi.o. * gimple-range-cache.h (ranger_cache::m_estimate): New phi_analyzer pointer member. * gimple-range-fold.cc (fold_using_range::range_of_phi): Use phi_analyzer if no loop info is available. * gimple-range-phi.cc: New file. * gimple-range-phi.h: New file. * tree-vrp.cc (execute_ranger_vrp): Utilize a phi_analyzer. gcc/testsuite/ * gcc.dg/pr107822.c: New. * gcc.dg/pr107986-1.c: New.
[Bug tree-optimization/107822] [13/14/14 Regression] Dead Code Elimination Regression at -Os (trunk vs. 12.2.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107822 --- Comment #6 from CVS Commits --- The master branch has been updated by Andrew Macleod : https://gcc.gnu.org/g:1cd5bc387c453126fdb4c9400096180484ecddee commit r14-1179-g1cd5bc387c453126fdb4c9400096180484ecddee Author: Andrew MacLeod Date: Wed May 24 09:52:26 2023 -0400 Gimple range PHI analyzer and testcases Provide a PHI analyzer framework to provide better initial values for PHI nodes which form groups with initial values and single statements which modify the PHI values in some predictable way. PR tree-optimization/107822 PR tree-optimization/107986 gcc/ * Makefile.in (OBJS): Add gimple-range-phi.o. * gimple-range-cache.h (ranger_cache::m_estimate): New phi_analyzer pointer member. * gimple-range-fold.cc (fold_using_range::range_of_phi): Use phi_analyzer if no loop info is available. * gimple-range-phi.cc: New file. * gimple-range-phi.h: New file. * tree-vrp.cc (execute_ranger_vrp): Utilize a phi_analyzer. gcc/testsuite/ * gcc.dg/pr107822.c: New. * gcc.dg/pr107986-1.c: New.
[Bug libstdc++/109947] std::expected monadic operations do not support move-only error types yet
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109947

Martin Seemann changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from Martin Seemann ---
Thanks for the clarification! Now I am convinced that it is not a bug in libstdc++ (although I still doubt that the side-effects were intended when the committee formulated the "Effects" for monadic operations, but that's not relevant here).

Marking as resolved and sorry for the noise.
[Bug fortran/90504] Improved NORM2 algorithm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90504 --- Comment #1 from anlauf at gcc dot gnu.org --- (In reply to Janne Blomqvist from comment #0) > Hanson, Hopkins, Remark on Algorithm 539: A Modern Fortran Reference > Implementation for Carefully Computing the Euclidean Norm, > https://dl.acm.org/citation.cfm?id=3134441 > > Above article tests different algorithms for NORM2 and tests performance and > numerical accuracy. This article is behind a paywall. Is there a publicly available description?
[Bug fortran/87270] "FINAL" subroutine is called when compiled with "gfortran -O1", but not "gfortran -O0"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87270 --- Comment #6 from anlauf at gcc dot gnu.org --- All current compilers seem to give the same, apparently correct result, even with different optimization level. So can we close this finally?
[Bug c++/109876] [10/11/12/13/14 Regression] initializer_list not usable in constant expressions in a template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876

--- Comment #9 from Jason Merrill ---
(In reply to Marek Polacek from comment #8)
> > Instead, we should probably treat num as value-dependent even though it
> > actually isn't.
>
> An attempt to implement that:
>
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -27969,6 +27969,12 @@ value_dependent_expression_p (tree expression)
>        else if (TYPE_REF_P (TREE_TYPE (expression)))
>         /* FIXME cp_finish_decl doesn't fold reference initializers.  */
>         return true;
> +      else if (DECL_DECLARED_CONSTEXPR_P (expression)
> +              && TREE_STATIC (expression)

I'd expect we could get a similar issue with non-static constexprs.

> +              && !DECL_NAMESPACE_SCOPE_P (expression)

This seems an unnecessary optimization?

> +              && DECL_INITIAL (expression)

Perhaps we also want to return true if DECL_INITIAL is null?

> +              && TREE_CODE (DECL_INITIAL (expression)) == IMPLICIT_CONV_EXPR)

Maybe !TREE_CONSTANT?
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 --- Comment #5 from Martin Uecker --- Clang bug: https://github.com/llvm/llvm-project/issues/62929
[Bug libstdc++/109947] std::expected monadic operations do not support move-only error types yet
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109947

--- Comment #4 from Jonathan Wakely ---
(In reply to Martin Seemann from comment #3)
> So it comes down to how to interpret the "Effects:" clause: Does "Equivalent
> to " mean that all restrictions of
> `value()` apply transitively or is it merely an implementation hint?

The former. The standard says:

Whenever the Effects element specifies that the semantics of some function F are Equivalent to some code sequence, then the various elements are interpreted as follows. If F’s semantics specifies any Constraints or Mandates elements, then those requirements are logically imposed prior to the equivalent-to semantics. Next, the semantics of the code sequence are determined by the Constraints, Mandates, Preconditions, Effects, Synchronization, Postconditions, Returns, Throws, Complexity, Remarks, and Error conditions specified for the function invocations contained in the code sequence. The value returned from F is specified by F’s Returns element, or if F has no Returns element, a non-void return from F is specified by the return statements (8.7.4) in the code sequence. If F’s semantics contains a Throws, Postconditions, or Complexity element, then that supersedes any occurrences of that element in the code sequence.

> (Strangely enough, in the "Effects:" clause of `value_or()&&` the expression
> `std::move(**this)` is used instead of `std::move(value())`. Maybe this is
> an oversight/inconsistency of the standard.)

Yes. The specs were written by different people at different times.
[Bug c++/109876] [10/11/12/13/14 Regression] initializer_list not usable in constant expressions in a template
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109876

--- Comment #8 from Marek Polacek ---
> Instead, we should probably treat num as value-dependent even though it
> actually isn't.

An attempt to implement that:

--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27969,6 +27969,12 @@ value_dependent_expression_p (tree expression)
       else if (TYPE_REF_P (TREE_TYPE (expression)))
        /* FIXME cp_finish_decl doesn't fold reference initializers.  */
        return true;
+      else if (DECL_DECLARED_CONSTEXPR_P (expression)
+              && TREE_STATIC (expression)
+              && !DECL_NAMESPACE_SCOPE_P (expression)
+              && DECL_INITIAL (expression)
+              && TREE_CODE (DECL_INITIAL (expression)) == IMPLICIT_CONV_EXPR)
+       return true;
       return false;

     case DYNAMIC_CAST_EXPR:
[Bug fortran/104350] ICE in gfc_array_dimen_size(): Bad dimension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104350 --- Comment #4 from CVS Commits --- The master branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:ec2e86274427a402d2de2199ba550f7295ea9b5f commit r14-1175-gec2e86274427a402d2de2199ba550f7295ea9b5f Author: Harald Anlauf Date: Wed May 24 21:04:43 2023 +0200 Fortran: reject bad DIM argument of SIZE intrinsic in simplification [PR104350] gcc/fortran/ChangeLog: PR fortran/104350 * simplify.cc (simplify_size): Reject DIM argument of intrinsic SIZE with error when out of valid range. gcc/testsuite/ChangeLog: PR fortran/104350 * gfortran.dg/size_dim_2.f90: New test.
[Bug fortran/103794] ICE in gfc_check_reshape, at fortran/check.c:4727
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103794 --- Comment #3 from CVS Commits --- The master branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:5fd5d8fb744fd9251d04e4b17d04f2340e6a283b commit r14-1174-g5fd5d8fb744fd9251d04e4b17d04f2340e6a283b Author: Harald Anlauf Date: Sun May 21 22:25:29 2023 +0200 Fortran: checking and simplification of RESHAPE intrinsic [PR103794] gcc/fortran/ChangeLog: PR fortran/103794 * check.cc (gfc_check_reshape): Expand constant arguments SHAPE and ORDER before checking. * gfortran.h (gfc_is_constant_array_expr): Add prototype. * iresolve.cc (gfc_resolve_reshape): Expand constant argument SHAPE. * simplify.cc (is_constant_array_expr): If array is determined to be constant, expand small array constructors if needed. (gfc_is_constant_array_expr): Wrapper for is_constant_array_expr. (gfc_simplify_reshape): Fix check for insufficient elements in SOURCE when no padding specified. gcc/testsuite/ChangeLog: PR fortran/103794 * gfortran.dg/reshape_10.f90: New test. * gfortran.dg/reshape_11.f90: New test.
[Bug libstdc++/109261] std::experimental::simd is not usable in several constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109261 --- Comment #13 from CVS Commits --- The releases/gcc-12 branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:2b502c3119c91fe3ba2313f0842a3bedd395bc91 commit r12-9651-g2b502c3119c91fe3ba2313f0842a3bedd395bc91 Author: Matthias Kretz Date: Wed May 24 12:50:46 2023 +0200 libstdc++: Fix SFINAE for __is_intrinsic_type on ARM On ARM NEON doesn't support double, so __is_intrinsic_type_v should say false (instead of being ill-formed). Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109261 * include/experimental/bits/simd.h (__intrinsic_type): Specialize __intrinsic_type and __intrinsic_type in any case, but provide the member type only with __aarch64__. (cherry picked from commit aa8b363171a95b8f867a74f29c75f9577e9087e1)
[Bug libstdc++/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949 --- Comment #10 from CVS Commits --- The releases/gcc-12 branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:ff7360dafe209b960535eaaa3efcfbaaa44daff9 commit r12-9652-gff7360dafe209b960535eaaa3efcfbaaa44daff9 Author: Matthias Kretz Date: Wed May 24 16:43:07 2023 +0200 libstdc++: Fix type of first argument to vec_cntm call Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109949 * include/experimental/bits/simd.h (__intrinsic_type): If __ALTIVEC__ is defined, map gnu::vector_size types to their corresponding __vector T types without losing unsignedness of integer types. Also prefer long long over long. * include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask object to the expected unsigned vector type. (cherry picked from commit efd2b55d8562c6e80cb7ee8b9b1f9418f0c00cd9)
[Bug libstdc++/109261] std::experimental::simd is not usable in several constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109261 --- Comment #12 from CVS Commits --- The releases/gcc-12 branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:8be71168f7bbafa04f592a7524432351ffea71ba commit r12-9650-g8be71168f7bbafa04f592a7524432351ffea71ba Author: Matthias Kretz Date: Tue May 23 23:48:49 2023 +0200 libstdc++: Add missing constexpr to simd_neon Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109261 * include/experimental/bits/simd_neon.h (_S_reduce): Add constexpr and make NEON implementation conditional on not __builtin_is_constant_evaluated. (cherry picked from commit b0a483b0a011f9cbc8b25053eae809c77dae2a12)
[Bug libstdc++/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949 --- Comment #9 from CVS Commits --- The master branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:efd2b55d8562c6e80cb7ee8b9b1f9418f0c00cd9 commit r14-1173-gefd2b55d8562c6e80cb7ee8b9b1f9418f0c00cd9 Author: Matthias Kretz Date: Wed May 24 16:43:07 2023 +0200 libstdc++: Fix type of first argument to vec_cntm call Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109949 * include/experimental/bits/simd.h (__intrinsic_type): If __ALTIVEC__ is defined, map gnu::vector_size types to their corresponding __vector T types without losing unsignedness of integer types. Also prefer long long over long. * include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask object to the expected unsigned vector type.
[Bug libstdc++/109947] std::expected monadic operations do not support move-only error types yet
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109947 --- Comment #3 from Martin Seemann --- Thanks for pointing me to the LWG issue. It makes sense that the error type must be copyable for the `value()` overloads due to potentially throwing a `bad_expected_access` with the error embedded. However, the monadic operations will never throw this exception. Consequently, the standard draft for the monadic operations (https://eel.is/c++draft/expected.object.monadic) does not contain any "Throws:" clause, nor is copyability of the error type included in the "Constraints:" clause. So it comes down to how to interpret the "Effects:" clause: Does "Equivalent to " mean that all restrictions of `value()` apply transitively or is it merely an implementation hint? (Strangely enough, in the "Effects:" clause of `value_or()&&` the expression `std::move(**this)` is used instead of `std::move(value())`. Maybe this is an oversight/inconsistency of the standard.)
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 --- Comment #4 from Martin Uecker --- The concern would be that a program relying on the size of an object being larger may then have out-of-bounds accesses. But rereading the standard, I am also not seeing that this is required. (For the extension nothing is required anyway, but it should be consistent with it.)
[Bug fortran/104350] ICE in gfc_array_dimen_size(): Bad dimension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104350

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |anlauf at gcc dot gnu.org

--- Comment #3 from anlauf at gcc dot gnu.org ---
Submitted: https://gcc.gnu.org/pipermail/fortran/2023-May/059322.html
[Bug rtl-optimization/101188] [AVR] Miscompilation and function pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

--- Comment #6 from Georg-Johann Lay ---
Created attachment 55152
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55152&action=edit
diff testcase by v4.9.2 vs v5.2.1

Code from v4.9.2 is correct, but from v5.2.1 is bogus:

--- fail1-4.9.2.sx 2023-05-24 17:20:46.508698338 +0200
+++ fail1-5.2.1.sx 2023-05-24 17:19:50.019976879 +0200
@@ -39,11 +39,11 @@
        adiw r24,1       ; 13 addhi3_clobber/1[length = 1]
        std Z+1,r25      ; 14 *movhi/4[length = 2]
        st Z,r24
-       adiw r30,2       ; 15 *addhi3/3 [length = 1]
-       movw r14,r16     ; 39 *movhi/1[length = 1]
-       ldi r24,68       ; 16 addhi3_clobber/3[length = 3]
-       add r14,r24
+       movw r14,r16     ; 38 *movhi/1[length = 1]
+       ldi r31,68       ; 15 addhi3_clobber/3[length = 3]
+       add r14,r31
        adc r15,__zero_reg__
+       adiw r30,2       ; 17 *addhi3/3 [length = 1]
        ld __tmp_reg__,Z+; 18 *movhi/3[length = 3]
        ld r31,Z
        mov r30,__tmp_reg__
[Bug c++/109958] [10/11/12/13/14 Regression] ICE: in build_ptrmem_type, at cp/decl.cc:11066 taking the address of bound static member function brought into derived class by using-declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109958 Marek Polacek changed: What|Removed |Added Keywords||ice-on-valid-code Priority|P3 |P2 Summary|ICE: in build_ptrmem_type, |[10/11/12/13/14 Regression] |at cp/decl.cc:11066 taking |ICE: in build_ptrmem_type, |the address of bound static |at cp/decl.cc:11066 taking |member function brought |the address of bound static |into derived class by |member function brought |using-declaration |into derived class by ||using-declaration Target Milestone|--- |10.5
[Bug c++/109958] ICE: in build_ptrmem_type, at cp/decl.cc:11066 taking the address of bound static member function brought into derived class by using-declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109958 Marek Polacek changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 CC||mpolacek at gcc dot gnu.org Last reconfirmed||2023-05-24 --- Comment #1 from Marek Polacek --- Confirmed. r0-115460-g57910f3a9a81e9: commit 57910f3a9a81e9ad122a814255197f6f24c6af08 Author: Jason Merrill Date: Sat Mar 3 19:53:30 2012 -0500 class.c (add_method): Always build an OVERLOAD for using-decls. * class.c (add_method): Always build an OVERLOAD for using-decls. * search.c (lookup_member): Handle getting an OVERLOAD for a single function. From-SVN: r184873
[Bug c++/109958] New: ICE: in build_ptrmem_type, at cp/decl.cc:11066 taking the address of bound static member function brought into derived class by using-declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109958 Bug ID: 109958 Summary: ICE: in build_ptrmem_type, at cp/decl.cc:11066 taking the address of bound static member function brought into derived class by using-declaration Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ed at catmur dot uk Target Milestone: --- struct B { static int f(); }; struct D : B { using B::f; }; void f(D d) { &d.f; } : In function 'void f(D)': :3:18: error: ISO C++ forbids taking the address of a bound member function to form a pointer to member function. Say '&D::f' [-fpermissive] 3 | void f(D d) { &d.f; } |~~^ :3:18: internal compiler error: in build_ptrmem_type, at cp/decl.cc:11066 3 | void f(D d) { &d.f; } | ^ 0x23a0cee internal_error(char const*, ...) ???:0 0xa95fae fancy_abort(char const*, int, char const*) ???:0 0xd31f7f build_x_unary_op(unsigned int, tree_code, cp_expr, tree_node*, int) ???:0 0xc7ab2f c_parse_file() ???:0 0xdb9519 c_common_parse_file() ???:0 This appears to have been broken somewhere between 4.7.4 and 4.8.1.
[Bug fortran/109948] [13/14 Regression] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948 --- Comment #4 from anlauf at gcc dot gnu.org --- The following patch fixes NULL pointer dereference with the reduced testcases: diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index 83e45f1b693..89c62b3eb1e 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -5640,7 +5643,7 @@ gfc_expression_rank (gfc_expr *e) if (ref->type != REF_ARRAY) continue; - if (ref->u.ar.type == AR_FULL) + if (ref->u.ar.type == AR_FULL && ref->u.ar.as) { rank = ref->u.ar.as->rank; break; Can you check if this works for you? Still needs regtesting.
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 --- Comment #3 from Pascal Cuoq --- @Andrew Pinski You don't even need to invoke the fact that this is an extension. GCC could reserve 17 bytes for each variable i of type “int”, and as long as “sizeof i” continued to evaluate to 4 (4 being the value of “sizeof(int)” for x86), no-one would be able to claim that GCC is not generating “correct” assembly code. This ticket is pointing out that the current behavior for initialized FAMs is suboptimal for programs that rely on the GCC extension, just like it would be suboptimal to reserve 17 bytes for each “int” variable for standard C programs (and I would open a ticket for it if I noticed such a behavior). It's not breaking anything and it may be inconvenient to change, and as a ticket that does not affect correctness, it can be ignored indefinitely. It's just a suggestion for smaller binaries that might also end up marginally faster as a result. @Martin Uecker Considering how casually phrased the description of FAMs was in C99 and remained in later standards (see https://stackoverflow.com/q/73497572/139746 for me trying to make sense of some of the relevant words), I doubt that the standard has anything to say about the compiler extension being discussed. But if you have convincing arguments, you could spend a few minutes filing a bug against Clang to tell them that they are making the binaries they generate too small and efficient.
[Bug jit/66594] jitted code should use -mtune=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66594 Joseph changed: What|Removed |Added CC||schuchart at icl dot utk.edu --- Comment #10 from Joseph --- The lack of target-specific optimizations is biting us quite a bit and manually specifying an architecture is not really an option, unless we duplicate the detection mechanism of GCC, which is not ideal. I am not familiar with the GCC code base and from the discussion below it's not clear what would be needed to advance this. If someone could provide some hints on what is missing and how/where it could be implemented we could probably take a stab at it. Would it be sufficient to add a macro to the header of the targets (as suggested here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66594#c6) that provide host_detect_local_cpu and ignore the ones that do not provide it? Or would it be better to hard-code calls for the architectures that provide them, like in the referenced patch but with architecture-specific pre-processor guards? We mostly care about i386 and arm/aarch64 but covering all available bases would be necessary, I guess.
[Bug tree-optimization/109957] New: Missing loop PHI optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109957 Bug ID: 109957 Summary: Missing loop PHI optimization Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- Take: ``` void foo(); int main() { _Bool c = 0; _Bool e = 1; int i; for (i = 0; i < 1; i++) { c |= (e!=0); e = 0; } if (c == 0) foo(); return 0; } ``` This should be just optimized to just `return 0`. The reason is once c is 1, it will always stay 1. But currently we don't notice that. Note this code is reduced from PR 108352 testcase after a phiopt improvement that provided the above form and ran into a testcase failure because of that.
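A minimal sketch of why the call is dead, assuming the loop from the report is kept unchanged and only the call to `foo()` is replaced by a return value. The name `loop_form` is hypothetical and not part of the report.

```c
/* Loop from the report; returns 1 exactly when foo() would have been called. */
int loop_form(void)
{
    _Bool c = 0;
    _Bool e = 1;
    int i;

    for (i = 0; i < 1; i++) {
        c |= (e != 0);   /* e is 1 on entry, so c becomes 1 and stays 1 */
        e = 0;
    }
    return c == 0;
}
```

Since `c` can only gain bits under `|=`, the condition `c == 0` is false after the first iteration, which is the fact the missed optimization would need to discover.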
[Bug fortran/109948] [13/14 Regression] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948 --- Comment #3 from anlauf at gcc dot gnu.org --- (In reply to Rimvydas (RJ) from comment #1) > More trivial testcase resulting in similar ICE. Yep, even smaller: subroutine foo(k_2d) implicit none integer :: k_2d(:) integer :: i associate(k=>k_2d) i = k(1) if (any(k==1)) i = 1 end associate end subroutine foo The associate is apparently one of the common components that is needed.
[Bug fortran/109948] [13/14 Regression] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948 anlauf at gcc dot gnu.org changed: What|Removed |Added Keywords||ice-on-valid-code Ever confirmed|0 |1 Status|UNCONFIRMED |NEW CC||anlauf at gcc dot gnu.org Summary|ICE(segfault) in|[13/14 Regression] |gfc_expression_rank() from |ICE(segfault) in |gfc_op_rank_conformable() |gfc_expression_rank() from ||gfc_op_rank_conformable() Last reconfirmed||2023-05-24 --- Comment #2 from anlauf at gcc dot gnu.org --- Confirmed. Further reduced: subroutine foo(y, x) implicit none real :: y(:) real :: x(:) associate(z=>y) where ( z < 0.0 ) x(:) = z(:) where ( z < 0.0 ) x(:) = z(:) end associate end subroutine foo
[Bug fortran/109948] ICE(segfault) in gfc_expression_rank() from gfc_op_rank_conformable()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109948 --- Comment #1 from Rimvydas (RJ) --- More trivial testcase resulting in similar ICE. $ cat test_associate2.f90 subroutine foo(grib) implicit none type b integer, allocatable :: k_2d(:) end type type(b) :: grib integer :: i associate(k=>grib%k_2d) i = k(1) if (any(k==1)) i = 1 end associate end subroutine foo
[Bug middle-end/109840] [14 Regression] internal compiler error: in expand_fn_using_insn, at internal-fn.cc:153 when building graphite2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109840 --- Comment #5 from CVS Commits --- The master branch has been updated by Roger Sayle : https://gcc.gnu.org/g:2738955004256c2e9753364d78a7be340323b74b commit r14-1171-g2738955004256c2e9753364d78a7be340323b74b Author: Roger Sayle Date: Wed May 24 17:32:20 2023 +0100 PR middle-end/109840: Preserve popcount/parity type in match.pd. PR middle-end/109840 is a regression introduced by my recent patch to fold popcount(bswap(x)) as popcount(x). When the bswap and the popcount have the same precision, everything works fine, but this optimization also allowed a zero-extension between the two. The oversight is that we need to be strict with type conversions, both to avoid accidentally changing the argument type to popcount, and also to reflect the effects of argument/return-value promotion in the call to bswap, so this zero extension needs to be preserved/explicit in the optimized form. Interestingly, match.pd should (in theory) be able to narrow calls to popcount and parity, removing a zero-extension from its argument, but that is an independent optimization, that needs to check IFN_ support. Many thanks to Andrew Pinski for his help/fixes with these transformations. 2023-05-24 Roger Sayle gcc/ChangeLog PR middle-end/109840 * match.pd : Preserve zero-extension when optimizing popcount((T)bswap(x)) and popcount((T)rotate(x,y)) as popcount((T)x), so the popcount's argument keeps the same type. : Likewise preserve extensions when simplifying parity((T)bswap(x)) and parity((T)rotate(x,y)) as parity((T)x), so that the parity's argument type is the same. gcc/testsuite/ChangeLog PR middle-end/109840 * gcc.dg/fold-parity-8.c: New test. * gcc.dg/fold-popcount-11.c: Likewise.
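The identity behind the fold can be illustrated at the source level. This is only a sketch using the GCC builtins `__builtin_popcount`, `__builtin_bswap32` and `__builtin_bswap16`; the function names are hypothetical, and the real transformation lives in match.pd, not in user code.

```c
#include <stdint.h>

/* Same precision throughout: popcount(bswap(x)) equals popcount(x). */
int popcount_bswap32(uint32_t x)
{
    return __builtin_popcount(__builtin_bswap32(x));
}

/* 16-bit bswap whose result is zero-extended before the popcount. */
int popcount_bswap16_zext(uint16_t x)
{
    return __builtin_popcount((unsigned int)__builtin_bswap16(x));
}

/* Simplified form: the zero-extension stays explicit, as popcount((T)x). */
int popcount_zext(uint16_t x)
{
    return __builtin_popcount((unsigned int)x);
}
```

Byte swapping only permutes bits, so the count is unchanged; the point of the fix is that the cast to the wider type has to survive the simplification so the popcount argument keeps its type.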
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 Martin Uecker changed: What|Removed |Added CC||muecker at gwdg dot de --- Comment #2 from Martin Uecker --- To me it seems that the C standard requires that the object has size sizeof(struct s) + n * sizeof(struct t) if you want to store n elements even when the array then starts at a smaller offset.
[Bug c/109956] GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 Andrew Pinski changed: What|Removed |Added Severity|normal |trivial --- Comment #1 from Andrew Pinski --- Considering this is an extension, I think GCC is still correct.
[Bug c/109956] New: GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3};
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109956 Bug ID: 109956 Summary: GCC reserves 9 bytes for struct s { int a; char b; char t[]; } x = {1, 2, 3}; Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pascal_cuoq at hotmail dot com Target Milestone: --- Static-lifetime variables of type “struct with FAM” (flexible array member) with an initializer for the FAM are a GCC extension. As of GCC 13.1 and Compiler Explorer “trunk”, targeting x86, the definition “struct s { int a; char b; char t[]; } x = {1, 2, 3};” reserves 9 bytes for x, and in fact, with various initializers, the trailing padding for variables of type “struct s” is always 3, as if the size to reserve for the variable was computed as “sizeof (struct s) + n * sizeof(element)”. Input file: struct s { int a; char b; char t[]; } x = {1, 2, 3}; Command: gcc -S fam_init.c Result (with Ubuntu 9.4.0-1ubuntu1~20.04.1 which exhibits the same behavior as the recent versions on Compiler Explorer): .align 8 .type x, @object .size x, 9 x: .long 1 .byte 2 .byte 3 .zero 3 Clang up to version 14 used to round up the size of the variable to a multiple of the alignment of the struct, but even this is not necessary. It is only necessary that the size reserved for a variable of type t is at least “sizeof(t)” bytes, and also to reserve enough space for the initializer. Clang 15 and later uses the optimal formula: max(sizeof (struct s), offsetof(struct s, t[n])) Compiler Explorer link: https://gcc.godbolt.org/z/5W7h4KWT1 This ticket is to suggest that GCC uses the same optimal formula as Clang 15 and later.
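The two size computations compared in the report can be written out as follows. This is a sketch, not GCC or Clang code: the function names are hypothetical, and `offsetof(struct s, t[n])` is spelled as `offsetof(struct s, t) + n` elements so that it stays portable for a run-time `n`.

```c
#include <stddef.h>

/* Same layout as the struct in the report. */
struct s { int a; char b; char t[]; };

/* Size as GCC appears to compute it: sizeof(struct s) plus n elements. */
size_t fam_size_padded(size_t n)
{
    return sizeof(struct s) + n * sizeof(char);
}

/* Size under max(sizeof(struct s), offsetof(struct s, t[n])). */
size_t fam_size_minimal(size_t n)
{
    size_t end_of_data = offsetof(struct s, t) + n * sizeof(char);
    return end_of_data > sizeof(struct s) ? end_of_data : sizeof(struct s);
}
```

On a typical x86 ABI, where `sizeof(struct s)` is 8 and `t` starts at offset 5, one initialized element gives 9 bytes with the first formula and 8 with the second.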
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 Jakub Jelinek changed: What|Removed |Added Attachment #55148|0 |1 is obsolete|| --- Comment #49 from Jakub Jelinek --- Created attachment 55151 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55151&action=edit gcc14-bitint-wip.patch Added a testcase with various operations with _BitInt(N) operands and tweaked c-typeck.cc/fold-const.cc to accept those.
[Bug libstdc++/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949 --- Comment #8 from Matthias Kretz (Vir) --- Created attachment 55150 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55150&action=edit proposed solution This patch allows unsigned intrinsic types and calls vec_cntm correctly.
[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #38 from Oleg Endo --- (In reply to Alexander Klepikov from comment #37) > > As far as I understand from GCC sources, function I patched > 'expand_ashiftrt' process only constant values of shift. As you can see > earlier, I added your and other examples to tests. OK, thanks for the additional test cases. It really looks like the way the constant shift is expanded (via ashrsi3_n insn) on SH1/SH2 is getting in the way. The tst insn is mainly formed by the combine pass, which relies on certain insn patterns and combinations thereof. See also sh.md, around line 530. You can look at the debug output with the -fdump-rtl-all option to see what's happening in the RTL passes. What your patch is doing is to make it not emit the ashrsi3_n insn for constant shifts altogether? I guess it will make code that actually needs those real shifts larger, as it will always emit the whole shift stitching sequence. That might be a good thing or not. > It looks like really > dynamic shifts translate to library calls. So the option name '-mdisable-dynshift-libcall' doesn't make sense. What it actually does is more like '-mdisable-constshift-libcall'. > > Should I test more exotic situations? If so, could you please help me with > really exotic or weired examples? Have you had a look at the existing test cases for this in gcc/testsuite/gcc.target/sh ?
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 --- Comment #48 from rguenther at suse dot de --- > On 24.05.2023 at 16:18, jakub at gcc dot gnu.org wrote: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 > > --- Comment #47 from Jakub Jelinek --- > But then the pass effectively has to do lifetime analysis of the _BitInt(N) > for > N > 128 etc. SSA_NAMEs and perform the partitioning of those SSA_NAMEs into > VAR_DECLs/PARM_DECLs/RESULT_DECLs, so that we don't blow away the local stack; > perhaps as you wrote with some local subgraphs turned into a loop which will > handle multiple operations together instead of just one operation per loop. > Or just use different VAR_DECLs but stick in clobbers where they will be dead > and hope out of ssa can merge those. > Anyway, more work than I hoped. > Though, perhaps it can be also done incrementally, with bare minimum first and > improvements later. Sure, this is just what I think users will expect. We don’t have the high level infrastructure to do this afterwards such as loop fusion and variable contraction (well, in theory graphite can do it but even there we lack actual transform bits).
[Bug target/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949 --- Comment #7 from Matthias Kretz (Vir) --- > You should backport to N-1 first [...] That was my intent. My workflow had not yet adapted to the existence of releases/gcc-13. Fixed. > never use -mpower9-vector and friends I use -mcpu in my dejagnu boards (and the equivalent for 'check-simd'). IIUC the -maltivec -mpower9-vector flags are added by check_vect_support_and_set_flags in lib/target-supports.exp. The problem was a branch that I apparently never tested (because the check-simd testsuite wants to compile *and* run). https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/experimental/bits/simd_ppc.h;h=eca1b34241bb4efdbbb6490550750d81aee248b3;hb=HEAD#l133 The `vec_cntm(__to_intrin(__kv), 1)` call uses an incorrect type for the first argument. The compiler message isn't very helpful, though. Patch coming up.
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 --- Comment #47 from Jakub Jelinek --- But then the pass effectively has to do lifetime analysis of the _BitInt(N) for N > 128 etc. SSA_NAMEs and perform the partitioning of those SSA_NAMEs into VAR_DECLs/PARM_DECLs/RESULT_DECLs, so that we don't blow away the local stack; perhaps as you wrote with some local subgraphs turned into a loop which will handle multiple operations together instead of just one operation per loop. Or just use different VAR_DECLs but stick in clobbers where they will be dead and hope out of ssa can merge those. Anyway, more work than I hoped. Though, perhaps it can be also done incrementally, with bare minimum first and improvements later.
[Bug target/109933] __atomic_test_and_set is broken for BIG ENDIAN riscv targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109933 --- Comment #8 from Rory Bolt --- So... The logic for this is simple: For little endian the shift amount is ((address & 3) * 8) For big endian the shift amount is ((3 - (address & 3)) * 8) Unfortunately I have ZERO experience modifying GCC, and the mechanism to determine if it is generating big endian code or little endian code is not obvious to me... So, working on this in my spare time, it will be a while before I can create a patch. That said, I do have a full big endian linux environment so I can test a patch (relatively quickly - it takes a while to build GCC ;-)) if someone beats me to this.
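The two formulas from the comment can be sketched as plain C. The function name and the `big_endian` parameter are hypothetical; inside GCC, back ends normally consult the `BYTES_BIG_ENDIAN` target macro for this decision, but where the RISC-V expander would use it is not something this sketch tries to answer.

```c
#include <stdint.h>

/* Bit offset of the byte at 'address' within its aligned 32-bit word. */
unsigned byte_shift_in_word(uintptr_t address, int big_endian)
{
    unsigned byte = (unsigned)(address & 3);

    /* Little endian: byte 0 is the least significant byte of the word.
       Big endian: byte 0 is the most significant one. */
    return big_endian ? (3 - byte) * 8 : byte * 8;
}
```

For an address ending in 1, this gives a shift of 8 on little endian and 16 on big endian.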
[Bug target/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949 --- Comment #6 from Segher Boessenkool --- (In reply to Matthias Kretz (Vir) from comment #4) > With -mcpu=power10 I see the issue. The problem has been there all the time > and only surfaced with this test. (It should also have shown on `make > check-simd` in libstdc++.) Yup, you should never use -mpower9-vector and friends. Such options are handy *during development* but are heavily problematic later; they should never have existed in mainline. What is the actual problem here? Or do you want to build up the suspense and only show it in the patch you will send :-)
[Bug tree-optimization/109695] [14 Regression] crash in gimple_ranger::range_of_expr since r14-377-gc92b8be9b52b7e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109695 Andrew Macleod changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #42 from Andrew Macleod --- I think we can close this now, I think everything we plan to do has been done.
[Bug target/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949 --- Comment #5 from Segher Boessenkool --- (In reply to Matthias Kretz (Vir) from comment #2) > Yes, I stopped my backporting efforts when I became aware that it's failing > on ARM. I'll get to PPC ASAP and then continue with the backports. You should backport to N-1 first, only then to N-2, etc. Sanity is nice :-) Next time :-)
[Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195 --- Comment #16 from CVS Commits --- The master branch has been updated by Kyrylo Tkachov : https://gcc.gnu.org/g:b30ab0dcf9db2ac6d81fb3743add1fbfa0d18f6e commit r14-1167-gb30ab0dcf9db2ac6d81fb3743add1fbfa0d18f6e Author: Kyrylo Tkachov Date: Wed May 24 14:52:34 2023 +0100 aarch64: PR target/99195 Annotate vector shift patterns for vec-concat-zero Continuing the series of straightforward annotations, this one handles the normal (not widening or narrowing) vector shifts. Tests included. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. gcc/ChangeLog: PR target/99195 * config/aarch64/aarch64-simd.md (aarch64_simd_lshr): Rename to... (aarch64_simd_lshr): ... This. (aarch64_simd_ashr): Rename to... (aarch64_simd_ashr): ... This. (aarch64_simd_imm_shl): Rename to... (aarch64_simd_imm_shl): ... This. (aarch64_simd_reg_sshl): Rename to... (aarch64_simd_reg_sshl): ... This. (aarch64_simd_reg_shl_unsigned): Rename to... (aarch64_simd_reg_shl_unsigned): ... This. (aarch64_simd_reg_shl_signed): Rename to... (aarch64_simd_reg_shl_signed): ... This. (vec_shr_): Rename to... (vec_shr_): ... This. (aarch64_shl): Rename to... (aarch64_shl): ... This. (aarch64_qshl): Rename to... (aarch64_qshl): ... This. gcc/testsuite/ChangeLog: PR target/99195 * gcc.target/aarch64/simd/pr99195_1.c: Add testing for shifts. * gcc.target/aarch64/simd/pr99195_6.c: Likewise. * gcc.target/aarch64/simd/pr99195_8.c: New test.
[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #37 from Alexander Klepikov --- > Can you also compile for little endian, and most of all, use -O2 > optimization level. Some optimizations are not done below -O2. Here's source file, I added functions with non-constant shifts $ cat f.c #define ADDR 0x #define P ((unsigned char *)ADDR) #define FLAG 0x40 #define S 7 unsigned char f_char_var(char v){ return (v & FLAG) == FLAG; } unsigned char f_unsigned_char_var(unsigned char v){ return (v & FLAG) == FLAG; } unsigned char f_symbol(void){ return (*P & FLAG) == FLAG; } unsigned char f_symbol_zero(void){ return (*P & FLAG) == 0; } unsigned char f_symbol_non_zero(void){ return (*P & FLAG) != 0; } unsigned int dyn_lshift (unsigned int x, unsigned int y) { return x << (y & 31); } unsigned int dyn_rshift (unsigned int x, unsigned int y) { return x >> (y & 31); } unsigned int really_dyn_lshift (unsigned int x, unsigned int y) { return x << y; } unsigned int really_dyn_rshift (unsigned int x, unsigned int y) { return x >> y; } With patch disabled, -O2 -mb: $ cat f.s .file "f.c" .text .text .align 1 .align 2 .global _f_char_var .type _f_char_var, @function _f_char_var: mov.l .L4,r1 sts.l pr,@-r15 jsr @r1 exts.b r4,r4 mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L5: .align 2 .L4: .long ___ashiftrt_r4_6 .size _f_char_var, .-_f_char_var .align 1 .align 2 .global _f_unsigned_char_var .type _f_unsigned_char_var, @function _f_unsigned_char_var: mov.l .L8,r1 sts.l pr,@-r15 jsr @r1 exts.b r4,r4 mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L9: .align 2 .L8: .long ___ashiftrt_r4_6 .size _f_unsigned_char_var, .-_f_unsigned_char_var .align 1 .align 2 .global _f_symbol .type _f_symbol, @function _f_symbol: mov.l .L12,r1 sts.l pr,@-r15 mov.b @r1,r4 mov.l .L13,r1 jsr @r1 nop mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L14: .align 2 .L12: .long -65536 .L13: .long ___ashiftrt_r4_6 .size _f_symbol, .-_f_symbol .align 1 .align 2 .global _f_symbol_zero .type _f_symbol_zero, @function 
_f_symbol_zero: mov.l .L16,r1 mov.b @r1,r0 tst #64,r0 rts movt r0 .L17: .align 2 .L16: .long -65536 .size _f_symbol_zero, .-_f_symbol_zero .align 1 .align 2 .global _f_symbol_non_zero .type _f_symbol_non_zero, @function _f_symbol_non_zero: mov.l .L20,r1 sts.l pr,@-r15 mov.b @r1,r4 mov.l .L21,r1 jsr @r1 nop mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L22: .align 2 .L20: .long -65536 .L21: .long ___ashiftrt_r4_6 .size _f_symbol_non_zero, .-_f_symbol_non_zero .align 1 .align 2 .global _dyn_lshift .type _dyn_lshift, @function _dyn_lshift: mov.l .L25,r1 sts.l pr,@-r15 jsr @r1 mov r5,r0 lds.l @r15+,pr rts nop .L26: .align 2 .L25: .long ___ashlsi3_r0 .size _dyn_lshift, .-_dyn_lshift .align 1 .align 2 .global _dyn_rshift .type _dyn_rshift, @function _dyn_rshift: mov.l .L29,r1 sts.l pr,@-r15 jsr @r1 mov r5,r0 lds.l @r15+,pr rts nop .L30: .align 2 .L29: .long ___lshrsi3_r0 .size _dyn_rshift, .-_dyn_rshift .align 1 .align 2 .global _really_dyn_lshift .type _really_dyn_lshift, @function _really_dyn_lshift: mov.l .L33,r1 sts.l pr,@-r15 jsr @r1 mov r5,r0 lds.l @r15+,pr rts nop .L34: .align 2 .L33: .long ___ashlsi3_r0 .size _really_dyn_lshift, .-_really_dyn_lshift .align 1 .align 2 .global _really_dyn_rshift .type _really_dyn_rshift, @function _really_dyn_rshift: mov.l .L37,r1 sts.l pr,@-r15 jsr @r1 mov r5,r0 lds.l @r15+,pr rts nop .L38: .align 2 .L37: .long ___lshrsi3_r0 .size _really_dyn_rshift, .-_really_dyn_rshift .ident "GCC: (GNU) 12.3.0" With patch disabled, -O2 -ml $ cat f.s .file "f.c" .text .little .text .align 1 .align 2
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 --- Comment #46 from rguenther at suse dot de --- On Wed, 24 May 2023, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 > > --- Comment #45 from Jakub Jelinek --- > Let's consider some simple testcase (where one doesn't really mix different > _BitInt sizes etc.). > _BitInt(512) > foo (_BitInt(512) a, _BitInt(512) b, _BitInt(512) c, _BitInt(512) d) > { > return (a + b) - (c + d); > } > With the patch, this now ICEs during expansion, because while we can handle > copying of even the larger _BitInt vars, we don't handle (nor plan to) +/- > etc. > during expansion for that, it would be in the earlier lowering pass. > If I'd emit straight line code here, I suppose I could use > BIT_FIELD_REFs/BIT_INSERT_EXPRs, but if I want loopy code, as you wrote > perhaps > ARRAY_REF on VCE could work fine for the input operands, but dunno what to use > for the > result of the operation, forcing it into a VAR_DECL I'm afraid will mean we > can't coalesce it much, the above would force the 2 + results and 1 - result > into VAR_DECLs. > Could we e.g. allow BIT_INSERT_EXPRs or have some new ref for this purpose to > update a single limb in a BITTYPE_INT SSA_NAME? I think for complex expressions that involve SSA temporaries the lowering pass has to be more complex as well and gather as much of the expression as possible so it can avoid _BitInt typed temporaries but instead create for (...) { limb_t tem1 = a[i] + b[i]; limb_t tem2 = c[i] + d[i]; limb_t tem3 = tem1 - tem2; res[i] = tem3; } but yes, for the result you want to force a VAR_DECL (I suppose DECL_RESULT for the above example will be one). I'd probably avoid rewriting user variables into SSA form and only have temporaries created by gimplifications in SSA form. You should be able to use DECL_NOT_GIMPLE_REG_P to force this and make sure update-address-taken leaves things this way unless, say, the user variable is only initialized by a constant?
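The fused loop in the comment is schematic and leaves out carry propagation. A possible C rendering with explicit carries is sketched below, assuming 64-bit limbs stored least significant first and a fixed count of 8 limbs for 512 bits; `limb_t`, `NLIMBS` and `fused_add_sub` are names chosen here for illustration, not anything from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define NLIMBS 8            /* 512 bits as 64-bit limbs (assumed layout) */

typedef uint64_t limb_t;

/* res = (a + b) - (c + d) modulo 2^(64 * NLIMBS), computed in one loop. */
void fused_add_sub(limb_t *res, const limb_t *a, const limb_t *b,
                   const limb_t *c, const limb_t *d)
{
    limb_t carry_ab = 0, carry_cd = 0, borrow = 0;

    for (size_t i = 0; i < NLIMBS; i++) {
        /* tem1 = a[i] + b[i] + carry from the previous limb */
        limb_t s1 = a[i] + b[i];
        limb_t tem1 = s1 + carry_ab;
        carry_ab = (s1 < a[i]) | (tem1 < s1);

        /* tem2 = c[i] + d[i] + carry from the previous limb */
        limb_t s2 = c[i] + d[i];
        limb_t tem2 = s2 + carry_cd;
        carry_cd = (s2 < c[i]) | (tem2 < s2);

        /* tem3 = tem1 - tem2 - borrow from the previous limb */
        limb_t s3 = tem1 - tem2;
        limb_t tem3 = s3 - borrow;
        borrow = (tem1 < tem2) | (s3 < borrow);

        res[i] = tem3;
    }
}
```

Each limb of the result depends only on the carries of lower limbs, which is why the three operations can share one loop without any full-width temporaries.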
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 --- Comment #45 from Jakub Jelinek --- Let's consider some simple testcase (where one doesn't really mix different _BitInt sizes etc.). _BitInt(512) foo (_BitInt(512) a, _BitInt(512) b, _BitInt(512) c, _BitInt(512) d) { return (a + b) - (c + d); } With the patch, this now ICEs during expansion, because while we can handle copying of even the larger _BitInt vars, we don't handle (nor plan to) +/- etc. during expansion for that, it would be in the earlier lowering pass. If I'd emit straight line code here, I suppose I could use BIT_FIELD_REFs/BIT_INSERT_EXPRs, but if I want loopy code, as you wrote perhaps ARRAY_REF on VCE could work fine for the input operands, but dunno what to use for the result of the operation, forcing it into a VAR_DECL I'm afraid will mean we can't coalesce it much, the above would force the 2 + results and 1 - result into VAR_DECLs. Could we e.g. allow BIT_INSERT_EXPRs or have some new ref for this purpose to update a single limb in a BITTYPE_INT SSA_NAME? Now, looking what we do right now, detailed expand dump before emergency dump shows: Partition map Partition 0 (_1 - 1 ) Partition 1 (_2 - 2 ) Partition 2 (_3 - 3 ) Partition 3 (a_4(D) - 4 ) Partition 4 (b_5(D) - 5 ) Partition 5 (c_6(D) - 6 ) Partition 6 (d_7(D) - 7 ) which I believe means it didn't actually coalesce anything at all. For the larger BITINT_TYPEs it will be very much desirable to coalesce as much as possible, given that none of the default def SSA_NAMEs are really use I'd think ideally we'd do a += b c += d result = a - c For at least multiplication/division and I assume conversions to/from floating point (and decimal), we'll need some library calls. One question is what ABI to use for them, whether to e.g. pass pointer to the limbs (and when -fbuilding-libgcc predefine macros on what mode is the limb mode, whether the limbs are ordered from least significant to most or vice versa, etc.) 
and in addition to that precision in bits for each argument and whether it is zero or sign extended from that, so that we could e.g. handle more efficiently _BitInt(16384) foo (unsigned _BitInt(2048) a, _BitInt(1024) b) { return (_BitInt(16384) a) * b; } by passing e.g. _mulwhatever (&res, 16384, &a, 2048, &b, -1024) where -1024 would mean 1024 bits sign extended, 2048 2048 bits zero extended, result is 16384 bits. And for GIMPLE a question is how to express it before expansion, whether we use some ifn that is then lowered.
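The signed-precision convention floated above (negative means sign extended, positive means zero extended) could be decoded along these lines. This is purely a sketch of the idea: the struct, both function names and the 64-bit limb width are assumptions, and no such libgcc interface is confirmed by the comment.

```c
#include <stddef.h>

/* Hypothetical decoded form of one precision argument. */
struct bitint_operand {
    unsigned precision;   /* number of value bits */
    int sign_extended;    /* nonzero if extended by sign, zero if by zeros */
};

/* A negative value means "sign extended from -prec bits". */
struct bitint_operand decode_precision(int prec)
{
    struct bitint_operand op;

    op.sign_extended = prec < 0;
    op.precision = (unsigned)(prec < 0 ? -prec : prec);
    return op;
}

/* Number of 64-bit limbs needed for 'precision' bits (limb width assumed). */
size_t limbs_for(unsigned precision)
{
    return ((size_t)precision + 63) / 64;
}
```

With this encoding, the call in the example would describe `b` as 1024 bits sign extended (16 limbs) and `a` as 2048 bits zero extended (32 limbs).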
[Bug libstdc++/109921] c++17/floating_from_chars.cc: compile error: ‘from_chars_strtod’ was not declared in this scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109921 --- Comment #1 from Jonathan Wakely --- The proposed change would result in ABI changes for some targets. I think the correct fix is something more like this: --- a/libstdc++-v3/src/c++17/floating_from_chars.cc +++ b/libstdc++-v3/src/c++17/floating_from_chars.cc @@ -64,7 +64,7 @@ // strtold for __ieee128 extern "C" __ieee128 __strtoieee128(const char*, char**); #elif __FLT128_MANT_DIG__ == 113 && __LDBL_MANT_DIG__ != 113 \ - && defined(__GLIBC_PREREQ) + && defined(__GLIBC_PREREQ) && defined(USE_STRTOD_FOR_FROM_CHARS) #define USE_STRTOF128_FOR_FROM_CHARS 1 extern "C" _Float128 __strtof128(const char*, char**) __asm ("strtof128") @@ -77,10 +77,6 @@ extern "C" _Float128 __strtof128(const char*, char**) #if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \ && __SIZE_WIDTH__ >= 32 # define USE_LIB_FAST_FLOAT 1 -# if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ -// No need to use strtold. -# undef USE_STRTOD_FOR_FROM_CHARS -# endif #endif #if USE_LIB_FAST_FLOAT @@ -1261,7 +1257,7 @@ from_chars_result from_chars(const char* first, const char* last, long double& value, chars_format fmt) noexcept { -#if ! USE_STRTOD_FOR_FROM_CHARS +#if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ || !defined USE_STRTOD_FOR_FROM_CHARS // Either long double is the same as double, or we can't use strtold. // In the latter case, this might give an incorrect result (e.g. values // out of range of double give an error, even if they fit in long double). @@ -1329,13 +1325,23 @@ _ZSt10from_charsPKcS0_RDF128_St12chars_format(const char* first, __ieee128& value, chars_format fmt) noexcept __attribute__((alias ("_ZSt10from_charsPKcS0_Ru9__ieee128St12chars_format"))); -#elif defined(USE_STRTOF128_FOR_FROM_CHARS) +#else from_chars_result from_chars(const char* first, const char* last, _Float128& value, chars_format fmt) noexcept { +#ifdef USE_STRTOF128_FOR_FROM_CHARS // fast_float doesn't support IEEE binary128 format, but we can use strtold. 
return from_chars_strtod(first, last, value, fmt); +#else + // Read a long double. This might give an incorrect result (e.g. values + // out of range of long double give an error, even if they fit in _Float128). + long double ldbl_val; + auto res = std::from_chars(first, last, ldbl_val, fmt); + if (res.ec == errc{}) + value = ldbl_val; + return res; +#endif } #endif We should not use strtof128 unless we can use strtod. We should not #undef USE_STRTOD_FOR_FROM_CHARS on line 82 just because we don't need it for long double, as we might still need it for _Float128.
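The caveat in the fallback branch (parse into a narrower type, then assign to the wider one) can be illustrated with a small C analogue. This is only a sketch: it uses strtod and a double-to-long-double widening instead of std::from_chars and _Float128, and parse_via_double is a made-up helper name, not anything in libstdc++.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libstdc++): parse TEXT as a double and
   widen the result to long double.  Returns 0 on success, -1 on error.  */
int
parse_via_double (const char *text, long double *out)
{
  char *end;
  double narrow;

  errno = 0;
  narrow = strtod (text, &end);
  if (end == text)
    return -1;          /* Nothing was parsed.  */
  if (errno == ERANGE)
    return -1;          /* Out of range for double, even if the wider
                           type could have represented the value.  */
  *out = narrow;        /* Widening double to long double is exact.  */
  return 0;
}
```

With IEEE binary64 double, an input such as "1e400" is rejected here even on targets where long double has a larger exponent range, which mirrors the "might give an incorrect result" comment in the proposed patch.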
[Bug target/109944] vector CTOR with byte elements and SSE2 has STLF fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109944 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from Richard Biener --- Summary is fixed now. Any other changes require actual benchmarking I think.
[Bug target/109944] vector CTOR with byte elements and SSE2 has STLF fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109944 --- Comment #6 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:affee7dcfa1ee272d43ac7cb68cf423dbd956fd8 commit r14-1166-gaffee7dcfa1ee272d43ac7cb68cf423dbd956fd8 Author: Richard Biener Date: Wed May 24 10:07:36 2023 +0200 target/109944 - avoid STLF fail for V16QImode CTOR expansion The following dispatches to V2DImode CTOR expansion instead of using sets of (subreg:DI (reg:V16QI 146) [08]) which causes LRA to spill DImode and reload V16QImode. The same applies for V8QImode or V4HImode construction from SImode parts which happens during 32bit libgcc build. PR target/109944 * config/i386/i386-expand.cc (ix86_expand_vector_init_general): Perform final vector composition using ix86_expand_vector_init_general instead of setting the highpart and lowpart which causes spilling. * gcc.target/i386/pr109944-1.c: New testcase. * gcc.target/i386/pr109944-2.c: Likewise.
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 --- Comment #44 from rguenther at suse dot de --- On Wed, 24 May 2023, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 > > Jakub Jelinek changed: > >What|Removed |Added > > Attachment #55141|0 |1 > is obsolete|| > > --- Comment #43 from Jakub Jelinek --- > Created attachment 55148 > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55148&action=edit > gcc14-bitint-wip.patch > > Another update. This version can emit _BitInt(N) values in non-automatic > variable initializers, handles passing/returning _BitInt(N) and for N <= 64 > (i.e. what fits into a single limb) from what I can see handling it in GIMPLE > passes and even expansion/RTL seems to work. > Now, as discussed earlier, for N > GET_MODE_PRECISION (limb_mode) I think we > want to lower it in some pass in between IPA and vectorization. For N which > fits into DImode if limb is 32-bit (currently no target does that as we have > just x86-64 support) or which fits into TImode for 64-bit if TImode is > supported, I guess we want to map arithmetics > to TImode arithmetics, for say 2-4x larger emit code for arithmetics (except > perhaps multiplication/division) inline as straight line code and for even > larger as loops. > In the last case, a question is if we could use e.g. TARGET_MEM_REF for the > variable offset in those loops on the vars even when they aren't > TREE_ADDRESSABLE (but would force them into memory during expansion). Note you should use TARGET_MEM_REF only when it describes the actual addressing mode you want to use. Otherwise just synthesize ARRAY_REFs like ARRAY_REF , index> with an appropriate VLA limb[] array type. I'd do the lowering right before pass_complete_unrolli and generally emit loopy form (another pass placement required in the -Og pipeline).
[Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955 --- Comment #2 from Richard Biener --- One thing I see is -(insn 11 10 15 2 (set (subreg:V16QI (reg:V2DI 83 [ ]) 0) -(unspec:V16QI [ -(reg:V16QI 92) -(reg:V16QI 91) -(lt:V16QI (reg:V16QI 90) -(const_vector:V16QI [ -(const_int 0 [0]) repeated x16 -])) -] UNSPEC_BLENDV)) "/space/rguenther/src/gcc/gcc/testsuite/gcc.target/i386/sse4_1-pr99908.c":22:10 discrim 1 7431 {*sse4_1_pblendvb_lt} (nil) vs +(insn 8 5 9 2 (set (reg:V16QI 89) +(const_vector:V16QI [ +(const_int -1 [0x]) repeated x16 +])) "/spc/abuild/rguenther/obj-gcc-g/gcc/include/smmintrin.h":181:20 1838 {movv16qi_internal} + (nil)) +(insn 9 8 11 2 (set (reg:V16QI 90) +(gt:V16QI (reg:V16QI 92) +(reg:V16QI 89))) "/spc/abuild/rguenther/obj-gcc-g/gcc/include/smmintrin.h":181:20 6749 {*sse2_gtv16qi3} (expr_list:REG_DEAD (reg:V16QI 92) +(expr_list:REG_DEAD (reg:V16QI 89) +(nil +(note 11 9 12 2 NOTE_INSN_DELETED) +(insn 12 11 16 2 (set (subreg:V16QI (reg:V2DI 84 [ ]) 0) +(unspec:V16QI [ +(reg:V16QI 93) +(reg:V16QI 94) +(reg:V16QI 90) +] UNSPEC_BLENDV)) "/space/rguenther/src/gcc/gcc/testsuite/gcc.target/i386/sse4_1-pr99908.c":22:10 discrim 1 7429 {sse4_1_pblendvb} + (expr_list:REG_DEAD (reg:V16QI 93) +(expr_list:REG_DEAD (reg:V16QI 90) +(expr_list:REG_DEAD (reg:V16QI 94) (nil) after the combiner which seems to be a missing simplification of (insn 8 5 9 2 (set (reg:V16QI 89) (const_vector:V16QI [ (const_int -1 [0x]) repeated x16 ])) (insn 9 8 11 2 (set (reg:V16QI 90) (gt:V16QI (reg:V16QI 92) (reg:V16QI 89))) to (lt:V16QI (reg:V16QI 90) (const_vector:V16QI [ (const_int 0 [0]) repeated x16 ]) Trying 8 -> 9: 8: r89:V16QI=const_vector 9: r90:V16QI=r92:V16QI>r89:V16QI REG_DEAD r92:V16QI REG_DEAD r89:V16QI Failed to match this instruction: (set (reg:V16QI 90) (gt:V16QI (reg:V16QI 92) (const_vector:V16QI [ (const_int -1 [0x]) repeated x16 ]))) Trying 8, 9 -> 12: 8: r89:V16QI=const_vector 9: r90:V16QI=r92:V16QI>r89:V16QI REG_DEAD r92:V16QI REG_DEAD r89:V16QI 12: 
r84:V2DI#0=unspec[r93:V16QI,r94:V16QI,r90:V16QI] 47 REG_DEAD r93:V16QI REG_DEAD r90:V16QI REG_DEAD r94:V16QI Failed to match this instruction: (set (subreg:V16QI (reg:V2DI 84 [ ]) 0) (unspec:V16QI [ (reg:V16QI 93) (reg:V16QI 94) (gt:V16QI (reg:V16QI 92) (const_vector:V16QI [ (const_int -1 [0x]) repeated x16 ])) ] UNSPEC_BLENDV)) not sure if the lt is a standalone thing. Maybe we just need a define-insn-and-split for _gt as well. All those seem to be somewhat tuned to the exact way RTL expansion works when the vcond patterns are there. Getting rid of vcond* (but not vcond_mask) would allow quite some simplification in middle-end code and the vectorizer.
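The relation between the two comparison forms in the dumps above can be checked on scalars. The following is a minimal sketch, assuming signed 8-bit elements as in V16QI: for such values, x > -1 is the logical negation of x < 0, so a selection keyed on one mask equals the selection keyed on the other with the two data operands swapped. The function names are illustrative only, and the scalar blend is a model, not the pblendvb pattern itself.

```c
#include <stdint.h>

/* Scalar model of a blend: pick B when MASK is true, otherwise A.  */
uint8_t
blend (uint8_t a, uint8_t b, int mask)
{
  return mask ? b : a;
}

/* Selection keyed on the (lt x 0) form of the mask.  */
uint8_t
select_lt_zero (uint8_t a, uint8_t b, int8_t x)
{
  return blend (a, b, x < 0);
}

/* Same selection keyed on the (gt x -1) form; the mask is inverted,
   so the data operands are swapped to compensate.  */
uint8_t
select_gt_minus_one (uint8_t a, uint8_t b, int8_t x)
{
  return blend (b, a, x > -1);
}
```

Whether the backend should canonicalize one form into the other, or grow a matching define_insn_and_split as suggested, is a separate question that this sketch does not answer.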
[Bug tree-optimization/109695] [14 Regression] crash in gimple_ranger::range_of_expr since r14-377-gc92b8be9b52b7e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109695 --- Comment #41 from CVS Commits --- The master branch has been updated by Andrew Macleod : https://gcc.gnu.org/g:257c2be7ff8dfdc610202a1e1f5a8a668b939bdb commit r14-1165-g257c2be7ff8dfdc610202a1e1f5a8a668b939bdb Author: Andrew MacLeod Date: Tue May 23 15:41:03 2023 -0400 Only update global value if it changes. Do not update and propagate a global value if it hasn't changed. PR tree-optimization/109695 * gimple-range-cache.cc (ranger_cache::get_global_range): Add changed param. * gimple-range-cache.h (ranger_cache::get_global_range): Ditto. * gimple-range.cc (gimple_ranger::range_of_stmt): Pass changed flag to set_global_range. (gimple_ranger::prefill_stmt_dependencies): Ditto.
[Bug tree-optimization/109695] [14 Regression] crash in gimple_ranger::range_of_expr since r14-377-gc92b8be9b52b7e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109695 --- Comment #40 from CVS Commits --- The master branch has been updated by Andrew Macleod : https://gcc.gnu.org/g:cfd6569e9c41181231a8427235d0c0a7ad9262e4 commit r14-1164-gcfd6569e9c41181231a8427235d0c0a7ad9262e4 Author: Andrew MacLeod Date: Tue May 23 15:20:56 2023 -0400 Use negative values to reflect always_current in the temporal cache. Instead of using 0, use negative timestamps to reflect always_current state. If the value doesn't change, keep the timestamp rather than creating a new one and invalidating any dependencies. PR tree-optimization/109695 * gimple-range-cache.cc (temporal_cache::temporal_value): Return a positive int. (temporal_cache::current_p): Check always_current method. (temporal_cache::set_always_current): Add param and set value appropriately. (temporal_cache::always_current_p): New. (ranger_cache::get_global_range): Adjust. (ranger_cache::set_global_range): set always current first.
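The idea in this commit message (encode "always current" in the sign of the timestamp instead of overwriting it with 0) can be modelled in a few lines of C. This is a simplified sketch and not the code in gimple-range-cache.cc: the struct, the fixed capacity, the single-dependency check and all model_* names are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_NAMES 16    /* Hypothetical fixed capacity for the model.  */

struct temporal_model
{
  int stamp[MAX_NAMES]; /* 0: never set; negative: always current.  */
  int clock;            /* Last timestamp handed out.  */
};

/* Magnitude of the stored stamp, so callers never see the sign.  */
int
model_value (const struct temporal_model *t, unsigned n)
{
  return abs (t->stamp[n]);
}

bool
model_always_current_p (const struct temporal_model *t, unsigned n)
{
  return t->stamp[n] < 0;
}

/* Give N a fresh, strictly newer timestamp.  */
void
model_set_timestamp (struct temporal_model *t, unsigned n)
{
  t->stamp[n] = ++t->clock;
}

/* Flip only the sign, so the underlying timestamp survives the toggle.
   In this model N must already have a nonzero stamp.  */
void
model_set_always_current (struct temporal_model *t, unsigned n, bool value)
{
  int magnitude = abs (t->stamp[n]);
  t->stamp[n] = value ? -magnitude : magnitude;
}

/* N is current if it is flagged always current or newer than DEP.  */
bool
model_current_p (const struct temporal_model *t, unsigned n, unsigned dep)
{
  if (model_always_current_p (t, n))
    return true;
  return model_value (t, n) > model_value (t, dep);
}
```

The point of the sign trick in this model is that clearing the flag restores the old ordering relative to dependencies, which a sentinel value of 0 could not do.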
[Bug tree-optimization/109695] [14 Regression] crash in gimple_ranger::range_of_expr since r14-377-gc92b8be9b52b7e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109695 --- Comment #39 from CVS Commits --- The master branch has been updated by Andrew Macleod : https://gcc.gnu.org/g:d8b058d3ca4ebbef5575105164417f125696f5ce commit r14-1163-gd8b058d3ca4ebbef5575105164417f125696f5ce Author: Andrew MacLeod Date: Tue May 23 15:11:44 2023 -0400 Choose better initial values for ranger. Instead of defaulting to VARYING, fold the stmt using just global ranges. PR tree-optimization/109695 * gimple-range-cache.cc (ranger_cache::get_global_range): Call fold_range with global query to choose an initial value.
[Bug fortran/109684] compiling failure: complaining about a final subroutine of a type being not PURE (while it is indeed PURE)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109684 --- Comment #8 from Neil Carlson --- We've been bitten by what looks to be the same bug in our large Fortran code: 245 | end module kuprat_mapper_type | 1 Error: Contained procedure ‘__final_integer_set_type_wavl_Integer_set’ at (1) of a PURE procedure must also be PURE This one really had me baffled. The kuprat_mapper type has no component (or component of component) of the integer_set type, nor any pure procedures. At most, some procedure associated with the kuprat_mapper type has a local integer_set variable. In any event, the integer_set type does have a final procedure and it is pure! What's more baffling is why this error occurred at this point; the integer_set module compiled without error as did many other module files that use it. Note that the code compiles fine with the oneAPI ifort and NAG compilers (and also with gfortran 12.2 and earlier). I haven't attempted yet to try and pare things down to a reportable reproducer, but if it would help I could try to do so.
[Bug libstdc++/109261] std::experimental::simd is not usable in several constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109261 --- Comment #11 from Christophe Lyon --- Thanks, trunk is now OK on both arm and aarch64.
[Bug target/109944] vector CTOR with byte elements and SSE2 has STLF fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109944 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Comment #5 from Alexander Monakov --- (In reply to Richard Biener from comment #3) > so we're building SImode elements in %xmm regs and then > unpack them - that's probably better than a series of > pinsrw due to dependences. For uarchs where gpr->xmm > moves are costly it might be better to do > > pxor %xmm0, %xmm0 > pinsrw $0, (%rsi), %xmm0 > pinsrw $1, 32(%rsi), %xmm0 > > though? I'm afraid that is impossible: pinsrw will attempt to load 2 bytes, but only 1 is guaranteed to be accessible (e.g. if the byte is at the end of a page).
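The over-read concern can be put in numbers with a small portable C sketch. It assumes a buffer whose last valid byte sits at offset 32, matching the 32(%rsi) operand quoted above; bytes_past_end and lane_from_byte are hypothetical helpers, not GCC or intrinsic functions.

```c
#include <stddef.h>
#include <stdint.h>

/* How many bytes beyond a LEN-byte buffer a WIDTH-byte load at OFFSET
   would touch.  Zero means the load stays in bounds.  */
size_t
bytes_past_end (size_t len, size_t offset, size_t width)
{
  size_t end = offset + width;
  return end > len ? end - len : 0;
}

/* Safe way to fill a 16-bit lane from one byte: load exactly one byte
   and zero-extend it.  */
uint16_t
lane_from_byte (const unsigned char *buf, size_t offset)
{
  return (uint16_t) buf[offset];
}
```

For a 33-byte buffer, a 1-byte load at offset 32 stays in bounds while a 2-byte load at the same offset touches one byte past the end, which is the case that can fault when the buffer ends at a page boundary.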
[Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955 --- Comment #1 from Richard Biener --- Created attachment 55149 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55149&action=edit patch I tested This is the patch I tested. I have not yet investigated any of the FAILs. Causes might be missing/differing vec_cmp or vcond_mask patterns or different behavior of the vectorizer or RTL expander.
[Bug target/109955] New: Should be possible to remove vcond{,u,eq} expanders
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955 Bug ID: 109955 Summary: Should be possible to remove vcond{,u,eq} expanders Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- It should be possible to remove all vcond, vcondu and vcondeq expanders and have the functionality be implemented via the vec_cmp and vcond_mask expanders. But when removing them a bootstrap & regtest reveals === g++ tests === Running target unix FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-not vpcmpgt[bdq] FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-times vblendvpd 4 FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-times vblendvps 4 FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-times vpblendvb 2 FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-not vpcmpgt[bdq] FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-times vblendvpd 4 FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-times vblendvps 4 FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-times vpblendvb 2 FAIL: g++.target/i386/avx512fp16-vcondmn-minmax.C -std=gnu++14 scan-assembler-times vmaxph 3 FAIL: g++.target/i386/avx512fp16-vcondmn-minmax.C -std=gnu++14 scan-assembler-times vminph 3 FAIL: g++.target/i386/pr100738-1.C -std=gnu++14 scan-assembler-not vpcmpeqd[ t] FAIL: g++.target/i386/pr100738-1.C -std=gnu++14 scan-assembler-not vpxor[ t] FAIL: g++.target/i386/pr100738-1.C -std=gnu++14 scan-assembler-times vblendvps[ t] 2 FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-not pcmpgt[bdq] FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-times blendvpd 4 FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-times blendvps 4 FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-times pblendvb 2 === gcc tests === Running target unix FAIL: gcc.dg/vect/pr109011-3.c -flto -ffat-lto-objects scan-tree-dump-times optimized " = .POPCOUNT 
(vect" 3 FAIL: gcc.dg/vect/pr109011-3.c scan-tree-dump-times optimized " = .POPCOUNT (vect" 3 FAIL: gcc.dg/vect/pr109011-5.c -flto -ffat-lto-objects scan-tree-dump-times optimized " = .POPCOUNT (vect" 3 FAIL: gcc.dg/vect/pr109011-5.c scan-tree-dump-times optimized " = .POPCOUNT (vect" 3 FAIL: gcc.target/i386/avx2-pr99908.c scan-assembler-not \\tvpcmpeq FAIL: gcc.target/i386/avx512bw-pr96891-1.c scan-assembler-not %k[0-7] FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-not %k[0-9] FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsb[\\t ] 2 FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsd[\\t ] 2 FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsq[\\t ] 2 FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsw[\\t ] 2 FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminub[\\t ] 2 FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminud[\\t ] 2 FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuq[\\t ] 2 FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuw[\\t ] 2 FAIL: gcc.target/i386/pr109011-b1.c scan-assembler-times vpopcntb[ \\t]+ 4 FAIL: gcc.target/i386/pr109011-w1.c scan-assembler-times vpopcntw[ \\t]+ 4 FAIL: gcc.target/i386/sse4_1-pr99908.c scan-assembler-not \\tpcmpeq
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #12 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #0) > I see this on power 9 fedora 37 (glibc-2.36) but not on power 8 centos 7.9 > (glibc-2.17). Also seen on power 9 rhel 9 (glibc-2.34-60.el9.ppc64le) Not reproduced on Fedora 38 (glibc-2.37-4.fc38.ppc64le) for power 8 or power 9.
[Bug target/109949] new test case experimental/simd/pr109261_constexpr_simd.cc in r12-9647-g3acbaf1b253215 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109949 --- Comment #4 from Matthias Kretz (Vir) --- With -mcpu=power10 I see the issue. The problem has been there all the time and only surfaced with this test. (It should also have shown on `make check-simd` in libstdc++.)
[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #36 from Oleg Endo --- (In reply to Alexander Klepikov from comment #35) > > As I understand, you meant the following (I added new functions at the end > of file): > > $ cat f.c > #define ADDR 0x > #define P ((unsigned char *)ADDR) > #define FLAG 0x40 > #define S 7 > Yes, that's what I meant, thanks. Can you also compile for little endian, and most of all, use -O2 optimization level. Some optimizations are not done below -O2. > > I choose that name because I wanted to disable dynamic shift instructions > for all CPUs. I did not hope that it will affect SH-2E code in such way. > > I can rewrite the patch so that it only affects CPUs that do not support > dynamic shifts and disables library call for dynamic shifts. I'll do it > anyway because I need it badly. How do you think, what name of option would > be better: '-mdisable-dynshift-libcall' or '-mhw-shift'? Or if you want, > please suggest another one. Thank you! '-mdisable-dynshift-libcall' would be more appropriate for what it tries to do, I think. Although that is a whole different issue ... but what is it going to do for real dynamic shifts on SH2? What kind of code is it supposed to emit for things like unsigned int dyn_shift (unsigned int x, unsigned int y) { return x << (y & 31); }
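As a point of reference for the question above, here is one conceivable lowering of a dynamic shift when neither a dynamic shift instruction nor a library call is available: a loop of single-bit shifts. This is only a sketch of the semantics in C; it is not what GCC emits for SH2, and dyn_shift_loop is an illustrative name.

```c
/* Shift X left by (Y & 31) using only one-bit shifts, as a target
   without dynamic shift instructions might have to do inline.  */
unsigned int
dyn_shift_loop (unsigned int x, unsigned int y)
{
  unsigned int count = y & 31;

  while (count-- != 0)
    x <<= 1;
  return x;
}
```

The loop runs up to 31 iterations, so the code-size and latency trade-off against a library call is part of what any such option would have to decide.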
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 Jakub Jelinek changed: What|Removed |Added Attachment #55141|0 |1 is obsolete|| --- Comment #43 from Jakub Jelinek --- Created attachment 55148 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55148&action=edit gcc14-bitint-wip.patch Another update. This version can emit _BitInt(N) values in non-automatic variable initializers, handles passing/returning _BitInt(N) and for N <= 64 (i.e. what fits into a single limb) from what I can see handling it in GIMPLE passes and even expansion/RTL seems to work. Now, as discussed earlier, for N > GET_MODE_PRECISION (limb_mode) I think we want to lower it in some pass in between IPA and vectorization. For N which fits into DImode if limb is 32-bit (currently no target does that as we have just x86-64 support) or which fits into TImode for 64-bit if TImode is supported, I guess we want to map arithmetics to TImode arithmetics, for say 2-4x larger emit code for arithmetics (except perhaps multiplication/division) inline as straight line code and for even larger as loops. In the last case, a question is if we could use e.g. TARGET_MEM_REF for the variable offset in those loops on the vars even when they aren't TREE_ADDRESSABLE (but would force them into memory during expansion).
[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263 --- Comment #35 from Alexander Klepikov --- (In reply to Oleg Endo from comment #34) > Bit-tests of char and unsigned char should be covered by the test-suite and > should work -- at least originally. However, what might be triggering this > problem is the '== FLAG' comparison. When I was working on this issue I > only used '== 0' or '!= 0' comparison. I can imagine that your test code > triggers some other middle end optimizations and hence we get this. Yes, I am sure that the problem is the '== FLAG' comparison. Before I reported that bug, I tried to bypass it and this macro does not produce shift instructions even on GCC 4.7: #define BIT_MASK_IS_SET_(VALUE, BITMASK)\ ({int _value = VALUE & BITMASK,\ _result;\ if (_value == BITMASK){\ _result = 1;\ }\ else {\ _result = 0;\ }\ _result;}) So this is definitely the comparison. > > Can you try to rewrite your test code to something like this? > > unsigned int f(char v){ > return (v & FLAG) != 0; > } > > ... and see if it generates the tst instruction as expected? 
> As I understand, you meant the following (I added new functions at the end of file): $ cat f.c #define ADDR 0x #define P ((unsigned char *)ADDR) #define FLAG 0x40 #define S 7 unsigned char f_char_var(char v){ return (v & FLAG) == FLAG; } unsigned char f_unsigned_char_var(unsigned char v){ return (v & FLAG) == FLAG; } unsigned char f_symbol(void){ return (*P & FLAG) == FLAG; } unsigned char f_symbol_zero(void){ return (*P & FLAG) == 0; } unsigned char f_symbol_non_zero(void){ return (*P & FLAG) != 0; } Compiler flags: -c -mrenesas -m2e -mb -O -fno-toplevel-reorder With patch disabled: $ cat f_clean.s .file "f.c" .text .text .align 1 .global _f_char_var .type _f_char_var, @function _f_char_var: sts.l pr,@-r15 mov.l .L3,r1 jsr @r1 exts.b r4,r4 mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L4: .align 2 .L3: .long ___ashiftrt_r4_6 .size _f_char_var, .-_f_char_var .align 1 .global _f_unsigned_char_var .type _f_unsigned_char_var, @function _f_unsigned_char_var: sts.l pr,@-r15 mov.l .L7,r1 jsr @r1 exts.b r4,r4 mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L8: .align 2 .L7: .long ___ashiftrt_r4_6 .size _f_unsigned_char_var, .-_f_unsigned_char_var .align 1 .global _f_symbol .type _f_symbol, @function _f_symbol: sts.l pr,@-r15 mov.l .L11,r1 mov.b @r1,r4 mov.l .L12,r1 jsr @r1 nop mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L13: .align 2 .L11: .long -65536 .L12: .long ___ashiftrt_r4_6 .size _f_symbol, .-_f_symbol .align 1 .global _f_symbol_zero .type _f_symbol_zero, @function _f_symbol_zero: mov.l .L15,r1 mov.b @r1,r0 tst #64,r0 rts movtr0 .L16: .align 2 .L15: .long -65536 .size _f_symbol_zero, .-_f_symbol_zero .align 1 .global _f_symbol_non_zero .type _f_symbol_non_zero, @function _f_symbol_non_zero: sts.l pr,@-r15 mov.l .L19,r1 mov.b @r1,r4 mov.l .L20,r1 jsr @r1 nop mov r4,r0 and #1,r0 lds.l @r15+,pr rts nop .L21: .align 2 .L19: .long -65536 .L20: .long ___ashiftrt_r4_6 .size _f_symbol_non_zero, .-_f_symbol_non_zero .ident "GCC: (GNU) 12.3.0" With patch enabled: $ cat f.s .file 
"f.c" .text .text .align 1 .global _f_char_var .type _f_char_var, @function _f_char_var: mov r4,r0 tst #64,r0 mov #-1,r0 rts negcr0,r0 .size _f_char_var, .-_f_char_var .align 1 .global _f_unsigned_char_var .type _f_unsigned_char_var, @function _f_unsigned_char_var: mov r4,r0 tst #64,r0 mov #-1,r0 rts negcr0,r0 .size _f_unsigned_char_var, .-_f_unsigned_char_var .align 1 .global _f_symbol .type _f_symbol, @function _f_symbol: mov.l .L4,r1 mov.b @r1,r0 tst #64,r0 mov #-1,r0 rts negcr0,r0 .L5: .align 2 .L4: .long -65536 .size _f_symbol, .-_f_symbol .align 1 .global _f_symbol_zero .type _f_symbol_zero, @function _f_symbol_zero: mov
[Bug middle-end/109849] suboptimal code for vector walking loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 --- Comment #13 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:5476de2618ffb77f3a52e59e2c9f10b018329689 commit r14-1161-g5476de2618ffb77f3a52e59e2c9f10b018329689 Author: Richard Biener Date: Wed May 24 12:36:28 2023 +0200 tree-optimization/109849 - fix fallout of PRE hoisting change The PR109849 fix made us no longer hoist some memory loads because of the expression set intersection. We can still avoid computing the union by simply taking the first set's expressions and leaving the pruning of expressions with values not suitable for hoisting to sorted_array_from_bitmap_set. PR tree-optimization/109849 * tree-ssa-pre.cc (do_hoist_insertion): Do not intersect expressions but take the first sets. * gcc.dg/tree-ssa/ssa-hoist-9.c: New testcase.
[Bug libstdc++/109921] c++17/floating_from_chars.cc: compile error: ‘from_chars_strtod’ was not declared in this scope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109921 Jonathan Wakely changed: What|Removed |Added Last reconfirmed||2023-05-24 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW
[Bug rtl-optimization/101188] [AVR] Miscompilation and function pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188 --- Comment #5 from Georg-Johann Lay --- It happens in postreload.cc::reload_cse_move2add() when (insn 45 16 17 2 (set (reg/f:HI 30 r30 [60]) (reg/v/f:HI 16 r16 [orig:51 self ] [51])) "fail1.c":29:9 101 {*movhi_split} (nil)) (insn 17 45 18 2 (parallel [ (set (reg/f:HI 30 r30 [60]) (plus:HI (reg/f:HI 30 r30 [60]) (const_int 66 [0x42]))) (clobber (scratch:QI)) ]) "fail1.c":29:9 175 {addhi3_clobber} (nil)) is transformed to: (insn 17 16 18 2 (set (reg/f:HI 30 r30 [60]) (plus:HI (reg/f:HI 30 r30 [60]) (const_int 2 [0x2]))) "fail1.c":29:9 165 {*addhi3_split} (nil)) The wrong setting of "success" is in postreload.cc:2028 as of the following, so the condition that leads to there is bogus. https://gcc.gnu.org/git/?p=gcc.git;a=blame;f=gcc/postreload.cc;h=fb392651e1b6a60e12bf3d36bc302bf9be8bc608;hb=03c7c418baa01f0642817bc9b44192d134102aa9#l2028
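The arithmetic behind the bad transformation can be spelled out in C. This is a sketch under the assumption that the registers behave as 16-bit unsigned values (HImode); the function names are illustrative. The original pair of insns computes r16 + 66, while the rewritten single insn adds 2 to whatever r30 held before, so the two agree only if r30 already contained r16 + 64.

```c
#include <stdint.h>

/* Original sequence: r30 = r16; r30 += 66.  */
uint16_t
original_sequence (uint16_t r16)
{
  uint16_t r30 = r16;

  r30 = (uint16_t) (r30 + 66);
  return r30;
}

/* Rewritten sequence: the move is gone and only r30 += 2 remains, so
   the result depends on the previous contents of r30.  */
uint16_t
rewritten_sequence (uint16_t r30_before)
{
  return (uint16_t) (r30_before + 2);
}
```

For r16 = 100 the original yields 166, and the rewritten form matches only when r30 was 164 beforehand; any other prior value of r30 produces a wrong address, consistent with the miscompilation described in this report.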
[Bug libstdc++/109261] std::experimental::simd is not usable in several constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109261 --- Comment #10 from CVS Commits --- The master branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:aa8b363171a95b8f867a74f29c75f9577e9087e1 commit r14-1160-gaa8b363171a95b8f867a74f29c75f9577e9087e1 Author: Matthias Kretz Date: Wed May 24 12:50:46 2023 +0200 libstdc++: Fix SFINAE for __is_intrinsic_type on ARM On ARM NEON doesn't support double, so __is_intrinsic_type_v should say false (instead of being ill-formed). Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109261 * include/experimental/bits/simd.h (__intrinsic_type): Specialize __intrinsic_type and __intrinsic_type in any case, but provide the member type only with __aarch64__.
[Bug libstdc++/109261] std::experimental::simd is not usable in several constant expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109261 --- Comment #9 from CVS Commits --- The master branch has been updated by Matthias Kretz : https://gcc.gnu.org/g:b0a483b0a011f9cbc8b25053eae809c77dae2a12 commit r14-1159-gb0a483b0a011f9cbc8b25053eae809c77dae2a12 Author: Matthias Kretz Date: Tue May 23 23:48:49 2023 +0200 libstdc++: Add missing constexpr to simd_neon Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/109261 * include/experimental/bits/simd_neon.h (_S_reduce): Add constexpr and make NEON implementation conditional on not __builtin_is_constant_evaluated.
[Bug rtl-optimization/109940] [14 Regression] ICE in decide_candidate_validity since g:53dddbfeb213ac4ec39f550aa81eaa4264375d2c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109940 --- Comment #7 from Peter Waller --- I can confirm that the original (not reduced) program no longer hits an ICE with ee2a8b373a88bae4c533aa68bed56bf01afea0e2 (but does with the parent commit). Thanks.
[Bug testsuite/109951] [14 Regression] libgomp, testsuite: non-native multilib c++ tests fail on Darwin.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109951 --- Comment #2 from Iain Sandoe --- OK, so the best bracket I've been able to get without doing surgery to make a branch with a backport for the bootstrap break: r14-803-g20ca33db817cec OK, r14-857-g30adfb85ff994c NOT OK. My analysis could well also be flawed: * perhaps the bug is actually that GXX_UNDER_TEST should not contain multilib-specific paths. * also maybe the include paths are not problematic - the issue might be limited to the -L ones.
[Bug modula2/109952] Inconsistent HIGH values with 'ARRAY OF CHAR'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109952 Gaius Mulley changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Gaius Mulley --- Closing as patch has been applied.
[Bug modula2/109952] Inconsistent HIGH values with 'ARRAY OF CHAR'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109952 --- Comment #2 from CVS Commits --- The master branch has been updated by Gaius Mulley : https://gcc.gnu.org/g:b4df098647b687ca4e43952ec4a198b2816732ba commit r14-1158-gb4df098647b687ca4e43952ec4a198b2816732ba Author: Gaius Mulley Date: Wed May 24 11:14:07 2023 +0100 PR modula2/109952 Inconsistent HIGH values with 'ARRAY OF CHAR' This patch fixes the case when a single character constant literal is passed as a string actual parameter to an ARRAY OF CHAR formal parameter. To be consistent a single character is promoted to a string and nul terminated (and its high value is 1). Previously a single character string would not be nul terminated and the high value was 0. The documentation now includes a section describing the expected behavior and included in this patch is some regression test code matching the table inside the documentation. gcc/ChangeLog: PR modula2/109952 * doc/gm2.texi (High procedure function): New node. (Using): New menu entry for High procedure function. gcc/m2/ChangeLog: PR modula2/109952 * Make-maintainer.in: Change header to include emacs file mode. * gm2-compiler/M2GenGCC.mod (BuildHighFromChar): Check whether operand is a constant string and is nul terminated then return one. * gm2-compiler/PCSymBuild.mod (WalkFunction): Add default return TRUE. Static analysis missing return path fix. * gm2-libs/IO.mod (Init): Rewrite to help static analysis. * target-independent/m2/gm2-libs.texi: Rebuild. gcc/testsuite/ChangeLog: PR modula2/109952 * gm2/pim/run/pass/hightests.mod: New test. Signed-off-by: Gaius Mulley
[Bug fortran/109684] compiling failure: complaining about a final subroutine of a type being not PURE (while it is indeed PURE)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109684 --- Comment #7 from Tomáš Trnka --- (In reply to Paul Thomas from comment #5) > Created attachment 55144 [details] > Fix for this PR > > Thanks for reporting this. The patch "fingered" in comment #4 is certainly > responsible for this regression. In particular, it is the first chunk in > resolve.cc that is the culprit. > > The attached patch feels to be a bit of sticking plaster on top of sticking > plaster and so I will go back to hunt down the root cause of these > namespace-less symbols. Thanks for the patch. It seems to mostly do the trick for our huge proprietary F2008 codebase, but some files ultimately fail to compile with the following error (not sure if related or a different bug): in gfc_format_decoder, at fortran/error.cc:1078 0xb01b5a gfc_format_decoder ../../gcc/fortran/error.cc:1078 0x1594c0c pp_format(pretty_printer*, text_info*) ../../gcc/pretty-print.cc:1475 0x10f0c5e diagnostic_report_diagnostic(diagnostic_context*, diagnostic_info*) ../../gcc/diagnostic.cc:1592 0x1789c5d gfc_report_diagnostic ../../gcc/fortran/error.cc:890 0x1789c5d gfc_warning ../../gcc/fortran/error.cc:923 0x1789da7 gfc_warning(int, char const*, ...) 
../../gcc/fortran/error.cc:954 0x1852c41 resolve_procedure_expression ../../gcc/fortran/resolve.cc:1957 0x1852c41 resolve_variable ../../gcc/fortran/resolve.cc:6030 0x1852c41 gfc_resolve_expr(gfc_expr*) ../../gcc/fortran/resolve.cc:7266 0x1806880 gfc_resolve_expr(gfc_expr*) ../../gcc/fortran/resolve.cc:7231 0x1806880 resolve_structure_cons ../../gcc/fortran/resolve.cc:1341 0x1858969 resolve_values ../../gcc/fortran/resolve.cc:12771 0x1869492 do_traverse_symtree ../../gcc/fortran/symbol.cc:4190 0x185b02f gfc_traverse_ns(gfc_namespace*, void (*)(gfc_symbol*)) ../../gcc/fortran/symbol.cc:4215 0x185b02f resolve_types ../../gcc/fortran/resolve.cc:17899 0x184cf93 gfc_resolve(gfc_namespace*) ../../gcc/fortran/resolve.cc:17996 0x184fb47 resolve_symbol ../../gcc/fortran/resolve.cc:16567 0x1869492 do_traverse_symtree ../../gcc/fortran/symbol.cc:4190 0x185aee0 gfc_traverse_ns(gfc_namespace*, void (*)(gfc_symbol*)) ../../gcc/fortran/symbol.cc:4215 0x185aee0 resolve_types ../../gcc/fortran/resolve.cc:17880 This seems to be the following assert: gcc_assert (loc->nextc - loc->lb->line >= 0); The backtrace I get from gdb is a little different (there's no resolve_structure_cons in it, for example; I guess that it might be due to LTO): #0 gfc_warning (opt=0, gmsgid=0x1e55748 "Non-RECURSIVE procedure %qs at %L is possibly calling itself recursively. 
Declare it RECURSIVE or use %<-frecursive%>") at ../../gcc/fortran/error.cc:950 #1 0x01852c42 in resolve_procedure_expression (expr=0x2aefc80) at ../../gcc/fortran/resolve.cc:1957 #2 resolve_variable (e=0x2aefc80) at ../../gcc/fortran/resolve.cc:6030 #3 gfc_resolve_expr (e=0x2aefc80) at ../../gcc/fortran/resolve.cc:7266 #4 0x01806881 in gfc_resolve_expr (e=0x2aefc80) at ../../gcc/fortran/resolve.cc:7231 #5 resolve_structure_cons (expr=, init=1) at ../../gcc/fortran/resolve.cc:1341 #6 0x0185896a in resolve_values (sym=0x2ad30c0) at ../../gcc/fortran/resolve.cc:12771 #7 0x01869493 in do_traverse_symtree (st=, st_func=0x0, sym_func=0x1858900 ) at ../../gcc/fortran/symbol.cc:4190 #8 0x0185b030 in gfc_traverse_ns (sym_func=0x1858900 , ns=0x3ae65e0) at ../../gcc/fortran/symbol.cc:4215 #9 resolve_types (ns=0x3ae65e0) at ../../gcc/fortran/resolve.cc:17899 #10 0x0184cf94 in gfc_resolve (ns=0x3ae65e0) at ../../gcc/fortran/resolve.cc:17996 #11 0x0184d022 in gfc_resolve (ns=) at ../../gcc/fortran/resolve.cc:17983 #12 0x0184fb48 in resolve_symbol (sym=) at ../../gcc/fortran/resolve.cc:16567 #13 0x01869493 in do_traverse_symtree (st=, st_func=0x0, sym_func=0x184d030 ) at ../../gcc/fortran/symbol.cc:4190 #14 0x0185aee1 in gfc_traverse_ns (sym_func=0x184d030 , ns=0x3697bb0) at ../../gcc/fortran/symbol.cc:4215 #15 resolve_types (ns=0x3697bb0) at ../../gcc/fortran/resolve.cc:17880 #16 0x0184cf94 in gfc_resolve (ns=0x3697bb0) at ../../gcc/fortran/resolve.cc:17996 #17 0x0184d022 in gfc_resolve (ns=) at ../../gcc/fortran/resolve.cc:17983 #18 0x0184fb48 in resolve_symbol (sym=) at ../../gcc/fortran/resolve.cc:16567 #19 0x01869493 in do_traverse_symtree (st=, st_func=0x0, sym_func=0x184d030 ) at ../../gcc/fortran/symbol.cc:4190 #20 0x0185aee1 in gfc_traverse_ns (sym_func=0x184d030 , ns=0x3238a50) at ../../gcc/fortran/symbol.cc:4215 #21 resolve_types (ns=0x3238a50) at ../../gcc/fortran/resolve.cc:17880 #22 0x0184cf94 in gfc_resolve (ns=0x3238a50) at ../../gcc/fortran/resolve.cc:17996 #23 
0x0184d022 in gfc_resolve (ns=) at ../../gcc/fortran/resolve.cc:17983 #24 0x0184fb48 in resol