Re: [C PATCH]: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-23 Thread Richard Biener
On Thu, 23 May 2024, Ian Lance Taylor wrote:

> On Thu, May 23, 2024 at 2:48 PM Martin Uecker  wrote:
> >
> > Am Donnerstag, dem 23.05.2024 um 14:30 -0700 schrieb Ian Lance Taylor:
> > > On Thu, May 23, 2024 at 2:00 PM Joseph Myers  wrote:
> > > >
> > > > On Tue, 21 May 2024, Martin Uecker wrote:
> > > > >
> > > > > C: allow aliasing of compatible types derived from enumeral types
> > > > > [PR115157]
> > > > >
> > > > > Aliasing of enumeral types with the underlying integer is now allowed
> > > > > by setting the aliasing set to zero.  But this does not allow aliasing
> > > > > of derived types which are compatible as required by ISO C.  Instead,
> > > > > initially set structural equality.  Then set TYPE_CANONICAL and update
> > > > > pointers and main variants when the type is completed (as done for
> > > > > structures and unions in C23).
> > > > >
> > > > > PR 115157
> > > > >
> > > > > gcc/c/
> > > > > * c-decl.cc (shadow_tag_warned,parse_xref_tag,start_enum,
> > > > > finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / TYPE_CANONICAL.
> > > > > * c-objc-common.cc (get_alias_set): Remove special case.
> > > > > (get_aka_type): Add special case.
> > > > >
> > > > > gcc/
> > > > > * godump.cc (go_output_typedef): Use TYPE_MAIN_VARIANT instead
> > > > > of TYPE_CANONICAL.
> > > > >
> > > > > gcc/testsuite/
> > > > > * gcc.dg/enum-alias-1.c: New test.
> > > > > * gcc.dg/enum-alias-2.c: New test.
> > > > > * gcc.dg/enum-alias-3.c: New test.
> > > >
> > > > OK, in the absence of objections on middle-end or Go grounds within the
> > > > next week.
> > >
> > > The godump.cc patch is
> > >
> > >        && (TYPE_CANONICAL (TREE_TYPE (decl)) == NULL_TREE
> > >            || !container->decls_seen.contains
> > > -              (TYPE_CANONICAL (TREE_TYPE (decl)))))
> > > +              (TYPE_MAIN_VARIANT (TREE_TYPE (decl)))))
> > >      {
> > >
> > > What is the problem you are seeing?
> >
> > Test failures in godump-1.c
> >
> > >
> > > This patch isn't right:
> > >
> > > 1) The code is saying if "X == NULL_TREE || !already_seen(X)".  This
> > > patch is changing the latter X but not the former.  They should be
> > > consistent.
> >
> > Maybe the X == NULL_TREE can be removed if we
> > add TYPE_MAIN_VARIANTs instead?
> 
> If TYPE_MAIN_VARIANT is never NULL_TREE, then I agree that the
> NULL_TREE test can be removed.

TYPE_MAIN_VARIANT is indeed never NULL_TREE.

Richard.
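
The consistency point raised in the review (test and look up the same key) can be sketched outside of GCC as follows. This is only an illustration: `Type`, `should_emit_mixed` and `should_emit` are hypothetical names, and the struct is a stand-in rather than the real tree node.

```cpp
#include <unordered_set>

// Hypothetical stand-in for a type node; not GCC's tree structure.
struct Type {
  const Type *canonical;     // may be null (structural equality)
  const Type *main_variant;  // assumed never null, as stated in the thread
};

using SeenSet = std::unordered_set<const Type *>;

// Mixed form: the null test uses one key, the membership test another.
// A type with a null canonical pointer is always emitted, even when its
// main variant was already seen.
bool should_emit_mixed(const SeenSet &seen, const Type &t) {
  return t.canonical == nullptr || seen.count(t.main_variant) == 0;
}

// Consistent form: a single key that is never null, so no null test.
bool should_emit(const SeenSet &seen, const Type &t) {
  return seen.count(t.main_variant) == 0;
}
```

With a main variant already recorded in the set, the mixed form still reports a variant with a null canonical pointer as unseen, while the consistent form does not.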

Re: [PATCH] RISC-V: Avoid splitting store dataref groups during SLP discovery

2024-05-23 Thread Richard Biener
On Thu, 23 May 2024, Richard Biener wrote:

> The following avoids splitting store dataref groups during SLP
> discovery but instead forces (eventually single-lane) consecutive
> lane SLP discovery for all lanes of the group, creating VEC_PERM
> SLP nodes merging them so the store will always cover the whole group.
> 
> With this for example
> 
> int x[1024], y[1024], z[1024], w[1024];
> void foo (void)
> {
>   for (int i = 0; i < 256; i++)
> {
>   x[4*i+0] = y[2*i+0];
>   x[4*i+1] = y[2*i+1];
>   x[4*i+2] = z[i];
>   x[4*i+3] = w[i];
> }
> }
> 
> which was previously using hybrid SLP can now be fully SLPed and
> SSE code generated looks better (but of course you never know,
> I didn't actually benchmark).  We of course need a VF of four here.
> 
> .L2:
> movdqa  z(%rax), %xmm0
> movdqa  w(%rax), %xmm4
> movdqa  y(%rax,%rax), %xmm2
> movdqa  y+16(%rax,%rax), %xmm1
> movdqa  %xmm0, %xmm3
> punpckhdq   %xmm4, %xmm0
> punpckldq   %xmm4, %xmm3
> movdqa  %xmm2, %xmm4
> shufps  $238, %xmm3, %xmm2
> movaps  %xmm2, x+16(,%rax,4)
> movdqa  %xmm1, %xmm2
> shufps  $68, %xmm3, %xmm4
> shufps  $68, %xmm0, %xmm2
> movaps  %xmm4, x(,%rax,4)
> shufps  $238, %xmm0, %xmm1
> movaps  %xmm2, x+32(,%rax,4)
> movaps  %xmm1, x+48(,%rax,4)
> addq    $16, %rax
> cmpq    $1024, %rax
> jne     .L2
> 
> The extra permute nodes merging distinct branches of the SLP
> tree might be unexpected for some code, esp. since
> SLP_TREE_REPRESENTATIVE cannot be meaningfully set and we
> cannot populate SLP_TREE_SCALAR_STMTS or SLP_TREE_SCALAR_OPS
> consistently as we can have a mix of both.
> 
> The patch keeps the sub-trees from consecutive lanes but that's
> in principle not necessary if we for example have an even/odd
> split which now would result in N single-lane sub-trees.  That's
> left for future improvements.
> 
> The interesting part is how VLA vector ISAs handle merging of
> two vectors that's not trivial even/odd merging.  The strategy
> of how to build the permute tree might need adjustments for that
> (in the end splitting each branch to single lanes and then doing
> even/odd merging would be the brute-force fallback).  Not sure
> how much we can or should rely on the SLP optimize pass to handle
> this.
> 
> The gcc.dg/vect/slp-12a.c case is interesting as we currently split
> the 8 store group into lanes 0-5 which we SLP with an unroll factor
> of two (on x86-64 with SSE) and the remaining two lanes are using
> interleaving vectorization with a final unroll factor of four.  Thus
> we're using hybrid SLP within a single store group.  After the change
> we discover the same 0-5 lane SLP part as well as two single-lane
> parts feeding the full store group.  But that results in a load
> permutation that isn't supported (I have WIP patches to rectify that).
> So we end up cancelling SLP and vectorizing the whole loop with
> interleaving which is IMO good and results in better code.
> 
> This is similar for gcc.target/i386/pr52252-atom.c where interleaving
> generates much better code than hybrid SLP.  I'm unsure how to update
> the testcase though.
> 
> gcc.dg/vect/slp-21.c runs into similar situations.  Note that when
> analyzing SLP operations we discard an instance we currently
> force the full loop to have no SLP because hybrid detection is
> broken.  It's probably not worth fixing this at this moment.
> 
> For gcc.dg/vect/pr97428.c we are not splitting the 16 store group
> into two but merge the two 8 lane loads into one before doing the
> store and thus have only a single SLP instance.  A similar situation
> happens in gcc.dg/vect/slp-11c.c but the branches feeding the
> single SLP store only have a single lane.  Likewise for
> gcc.dg/vect/vect-complex-5.c and gcc.dg/vect/vect-gather-2.c.
> 
> gcc.dg/vect/slp-cond-1.c has an additional SLP vectorization
> with a SLP store group of size two but two single-lane branches.
> 
> (merged with the testsuite changes, re-posted because the RISC-V
> CI ran on a tree w/o a fix, hopefully fixing all the reported
> ICEs)

This worked out so I pushed the change.  The gcc.dg/vect/pr97428.c
test is FAILing on RISC-V (it still gets 0 SLP), because of missed
load permutations.  I hope the followup reorg for the load side will
fix this.  It also FAILs gcc.target/riscv/rvv/autovec/struct/struct_vect-4.c
which does excessive assembly scanning on many functions - I'll leave
this for target maintainers to update - there's one or two functions
which we now expect to SLP.

Richard.

>   * tree-vect-slp.cc (vect_build_slp_instance): Do not split
>   store dataref groups on loop SLP discovery failure but create
>   a single SLP instance for the stores but branch to SLP sub-trees
>   and merge with a series of VEC_PERM nodes.
> 
>   * gcc.dg/vect/pr97428.c: Expect a single store SLP group.
>   
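
As a rough scalar model of the idea in the quoted description, the sketch below builds each four-lane store group from three sub-trees (one two-lane branch from `y`, single-lane branches from `z` and `w`) and merges them before one store that covers the whole group. It is not the vectorizer's algorithm; `store_scalar`, `store_merged` and `Buf` are names made up for this illustration.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 1024;
using Buf = std::array<int, N>;

// Reference: the loop from the message, written lane by lane.
void store_scalar(Buf &x, const Buf &y, const Buf &z, const Buf &w) {
  for (std::size_t i = 0; i < 256; ++i) {
    x[4 * i + 0] = y[2 * i + 0];
    x[4 * i + 1] = y[2 * i + 1];
    x[4 * i + 2] = z[i];
    x[4 * i + 3] = w[i];
  }
}

// Model: discover the sub-trees separately, merge them with two
// permute-like steps, then store the complete group at once.
void store_merged(Buf &x, const Buf &y, const Buf &z, const Buf &w) {
  for (std::size_t i = 0; i < 256; ++i) {
    const std::array<int, 2> lanes01 = {y[2 * i + 0], y[2 * i + 1]};
    const int lane2 = z[i];
    const int lane3 = w[i];
    // First merge: lanes 0-1 joined with lane 2.
    const std::array<int, 3> merged012 = {lanes01[0], lanes01[1], lane2};
    // Second merge: add lane 3 so the result covers the whole group.
    const std::array<int, 4> group = {merged012[0], merged012[1],
                                      merged012[2], lane3};
    for (std::size_t k = 0; k < 4; ++k)
      x[4 * i + k] = group[k];
  }
}
```

Both functions write the same contents to `x`; the second only makes the merge steps explicit.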

[PATCH V2] RISC-V: Fix missing boolean_expression in zmmul extension

2024-05-23 Thread Liao Shihua
Update v1->v2
Add testcase for this patch.

TARGET_ZMMUL is missing from the boolean expression in riscv_rtx_costs(),
which causes different instruction sequences when multiplying an integer
by a constant.
(https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1482)

int foo(int *ib) {
*ib = *ib * 33938;
return 0;
}

rv64im:
lw  a4,0(a1)
li  a5,32768
addiw   a5,a5,1170
mulw    a5,a5,a4
sw  a5,0(a1)
ret

rv64i_zmmul:
lw  a4,0(a1)
slliw   a5,a4,5
addw    a5,a5,a4
slliw   a5,a5,3
addw    a5,a5,a4
slliw   a5,a5,3
addw    a5,a5,a4
slliw   a5,a5,3
addw    a5,a5,a4
slliw   a5,a5,1
sw  a5,0(a1)
ret

Fixed.
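
For reference, the two sequences above compute the same value. The sketch below replays the slliw/addw chain step by step and compares it with a plain multiply; the function names are made up for this note, and unsigned 32-bit arithmetic is used so that wrap-around is well defined.

```cpp
#include <cstdint>

// Shift-add chain mirroring the slliw/addw sequence shown for rv64i_zmmul.
constexpr std::uint32_t mul_33938_shift_add(std::uint32_t a) {
  std::uint32_t t = (a << 5) + a;  // 33 * a
  t = (t << 3) + a;                // 265 * a
  t = (t << 3) + a;                // 2121 * a
  t = (t << 3) + a;                // 16969 * a
  return t << 1;                   // 33938 * a
}

// Direct form mirroring li 32768; addiw 1170; mulw.
constexpr std::uint32_t mul_33938_direct(std::uint32_t a) {
  return a * (32768u + 1170u);
}

static_assert(mul_33938_shift_add(1) == 33938u, "chain encodes 33938");
static_assert(mul_33938_shift_add(7) == mul_33938_direct(7),
              "both forms agree");
```

The chain needs ten instructions between the load and the store where the multiply form needs three, which is the difference the cost change is meant to remove when a multiplier is available.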

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_rtx_costs): Add TARGET_ZMMUL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zmmul-3.c: New test.

---
 gcc/config/riscv/riscv.cc| 2 +-
 gcc/testsuite/gcc.target/riscv/zmmul-3.c | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-3.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 85df5b7ab49..580ae007181 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3753,7 +3753,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
 case MULT:
   if (float_mode_p)
*total = tune_param->fp_mul[mode == DFmode];
-  else if (!TARGET_MUL)
+  else if (!(TARGET_MUL || TARGET_ZMMUL))
/* Estimate the cost of a library call.  */
*total = COSTS_N_INSNS (speed ? 32 : 6);
   else if (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD)
diff --git a/gcc/testsuite/gcc.target/riscv/zmmul-3.c 
b/gcc/testsuite/gcc.target/riscv/zmmul-3.c
new file mode 100644
index 000..ae9752462e4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zmmul-3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64iafdc_zmmul -mabi=lp64d" } */
+int foo1(int a)
+{
+return a * 99;
+}
+
+/* { dg-final { scan-assembler-times "mulw\t" 1 } } */
\ No newline at end of file
-- 
2.34.1



[to-be-committed][v2][RISC-V] Use bclri in constant synthesis

2024-05-23 Thread Jeff Law
Testing with Zbs enabled by default showed a minor logic error.  After 
the loop clearing things with bclri, we can only use the sequence if we 
were able to clear all the necessary bits.  If any bits are still on, 
then the bclr sequence turned out to not be profitable.


--

So this is conceptually similar to how we handled direct generation of
bseti for constant synthesis, but this time for bclr.

In the bclr case, we already have an expander for AND.  So we just
needed to adjust the predicate to accept another class of constant
operands (those with a single bit clear).

With that in place constant synthesis is adjusted so that it counts the
number of bits clear in the high 33 bits of a 64bit word.  If that
number is small relative to the current best cost, then we try to
generate the constant with a lui based sequence for the low half which
implicitly sets the upper 32 bits as well.  Then we bclri one or more of
those upper 33 bits.

So as an example, this code goes from 4 instructions down to 3:

> unsigned long foo_0xfffbf7ff(void) { return 
0xfffbf7ffUL; }




Note the use of 33 bits above.  That's meant to capture cases like this:


> unsigned long foo_0xfffd77ff(void) { return 
0xfffd77ffUL; }




We can use lui+addi+bclri+bclri to synthesize that in 4 instructions
instead of 5.
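
A minimal model of the idea, under stated assumptions: the mask below covers bits 31..63 (the range named in the ChangeLog), the base value stands in for whatever the lui/addi sequence would produce, and each loop iteration stands for one bclri. The cost comparison done by the real code is not modeled, and all names are hypothetical.

```cpp
#include <cstdint>

// Bits 31..63; an assumption taken from the ChangeLog wording.
constexpr std::uint64_t kUpperMask = 0xffffffff80000000ULL;

struct Synthesis {
  std::uint64_t value;  // constant produced by the modeled sequence
  int bclri_count;      // number of single-bit clears that were needed
};

// Count the clear bits among bits 31..63 of V.
constexpr int clear_upper_bits(std::uint64_t v) {
  int n = 0;
  for (int bit = 31; bit < 64; ++bit)
    if (((v >> bit) & 1u) == 0)
      ++n;
  return n;
}

// Start from the target with all upper bits set, then clear the bits
// that the target wants to be zero, one at a time.
constexpr Synthesis synthesize_with_bclri(std::uint64_t target) {
  std::uint64_t cur = target | kUpperMask;
  std::uint64_t to_clear = ~target & kUpperMask;
  int steps = 0;
  while (to_clear != 0) {
    const std::uint64_t lowest = to_clear & (~to_clear + 1);
    cur &= ~lowest;       // one bclri
    to_clear &= ~lowest;
    ++steps;
  }
  return {cur, steps};
}
```

For an illustrative value such as 0xfffffffbfffff7ff (chosen for this note, with only bit 34 clear in the upper range) the model needs a single clear; a value that also has bit 31 clear needs one more, which is why 33 bits are counted rather than 32.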




I'm including a handful of cases covering the two basic ideas above that
were found by the testing code.

And, no, we're not done yet.  I see at least one more notable idiom
missing before exploring zbkb's potential to improve things.

Tested in my tester and waiting on Rivos CI system before moving forward.
gcc/

* config/riscv/predicates.md (arith_operand_or_mode_mask): Renamed to..
(arith_or_mode_mask_or_zbs_operand): New predicate.
* config/riscv/riscv.md (and<mode>3): Update predicate for operand 2.
* config/riscv/riscv.cc (riscv_build_integer_1): Use bclri to clear
bits, particularly bits 31..63 when profitable to do so.

gcc/testsuite/

* gcc.target/riscv/synthesis-6.c: New test.

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 8948fbfc363..c1c693c7617 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -27,12 +27,6 @@ (define_predicate "arith_operand"
   (ior (match_operand 0 "const_arith_operand")
(match_operand 0 "register_operand")))
 
-(define_predicate "arith_operand_or_mode_mask"
-  (ior (match_operand 0 "arith_operand")
-   (and (match_code "const_int")
-(match_test "UINTVAL (op) == GET_MODE_MASK (HImode)
-|| UINTVAL (op) == GET_MODE_MASK (SImode)"
-
 (define_predicate "lui_operand"
   (and (match_code "const_int")
(match_test "LUI_OPERAND (INTVAL (op))")))
@@ -398,6 +392,14 @@ (define_predicate "not_single_bit_mask_operand"
   (and (match_code "const_int")
(match_test "SINGLE_BIT_MASK_OPERAND (~UINTVAL (op))")))
 
+(define_predicate "arith_or_mode_mask_or_zbs_operand"
+  (ior (match_operand 0 "arith_operand")
+   (and (match_test "TARGET_ZBS")
+   (match_operand 0 "not_single_bit_mask_operand"))
+   (and (match_code "const_int")
+(match_test "UINTVAL (op) == GET_MODE_MASK (HImode)
+|| UINTVAL (op) == GET_MODE_MASK (SImode)"
+
 (define_predicate "const_si_mask_operand"
   (and (match_code "const_int")
(match_test "(INTVAL (op) & (GET_MODE_BITSIZE (SImode) - 1))
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 85df5b7ab49..3b32b515fac 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -893,6 +893,40 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
  codes[1].use_uw = false;
  cost = 2;
}
+
+  /* If LUI/ADDI are going to set bits 32..63 and we need a small
+number of them cleared, we might be able to use bclri profitably. 
+
+Note we may allow clearing of bit 31 using bclri.  There's a class
+of constants with that bit clear where this helps.  */
+  else if (TARGET_64BIT
+  && TARGET_ZBS
+  && (32 - popcount_hwi (value & HOST_WIDE_INT_C 
(0x8000))) + 1 < cost)
+   {
+ /* Turn on all those upper bits and synthesize the result.  */
+ HOST_WIDE_INT nval = value | HOST_WIDE_INT_C (0x8000);
+ alt_cost = riscv_build_integer_1 (alt_codes, nval, mode);
+
+ /* Now iterate over the bits we want to clear until the cost is
+too high or we're done.  */
+ nval = value ^ HOST_WIDE_INT_C (-1);
+ nval &= HOST_WIDE_INT_C (~0x7fff);
+ while (nval && alt_cost < cost)
+   {
+ HOST_WIDE_INT bit = ctz_hwi (nval);
+ alt_codes[alt_cost].code = AND;
+ alt_codes[alt_cost].value = ~(1UL << bit);
+ alt_codes[alt_cost].use_uw = false;
+ 

Re: [PATCH 1/2] Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.

2024-05-23 Thread Hongtao Liu
CC for review.

On Tue, May 21, 2024 at 1:12 PM liuhongt  wrote:
>
> When the mask is ((1 << (prec - imm)) - 1), which clears the upper bits
> of A, the expression can be simplified to LSHIFTRT.
>
> i.e., simplify
> (and:v8hi
>   (ashiftrt:v8hi A 8)
>   (const_vector 0xff x8))
> to
> (lshiftrt:v8hi A 8)
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> OK for trunk?
>
> gcc/ChangeLog:
>
> PR target/114428
> * simplify-rtx.cc
> (simplify_context::simplify_binary_operation_1):
> Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for
> specific mask.
> ---
>  gcc/simplify-rtx.cc | 25 +
>  1 file changed, 25 insertions(+)
>
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index 53f54d1d392..6c91409200e 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -4021,6 +4021,31 @@ simplify_context::simplify_binary_operation_1 
> (rtx_code code,
> return tem;
> }
>
> +  /* (and:v4si
> +  (ashiftrt:v4si A 16)
> +  (const_vector: 0x x4))
> +is just (lshiftrt:v4si A 16).  */
> +  if (VECTOR_MODE_P (mode) && GET_CODE (op0) == ASHIFTRT
> + && (CONST_INT_P (XEXP (op0, 1))
> + || (GET_CODE (XEXP (op0, 1)) == CONST_VECTOR
> + && CONST_VECTOR_DUPLICATE_P (XEXP (op0, 1))))
> + && GET_CODE (op1) == CONST_VECTOR
> + && CONST_VECTOR_DUPLICATE_P (op1))
> +   {
> + unsigned HOST_WIDE_INT shift_count
> +   = (CONST_INT_P (XEXP (op0, 1))
> +  ? UINTVAL (XEXP (op0, 1))
> +  : UINTVAL (XVECEXP (XEXP (op0, 1), 0, 0)));
> + unsigned HOST_WIDE_INT inner_prec
> +   = GET_MODE_PRECISION (GET_MODE_INNER (mode));
> +
> + /* Avoid UD shift count.  */
> + if (shift_count < inner_prec
> + && (UINTVAL (XVECEXP (op1, 0, 0))
> + == (HOST_WIDE_INT_1U << (inner_prec - shift_count)) - 1))
> +   return simplify_gen_binary (LSHIFTRT, mode, XEXP (op0, 0), XEXP 
> (op0, 1));
> +   }
> +
>tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>if (tem)
> return tem;
> --
> 2.31.1
>


-- 
BR,
Hongtao
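
The identity behind the simplification can be checked exhaustively for one 16-bit lane. The sketch below is a standalone model, not the RTL code: the helper names are made up, and the arithmetic shift is emulated on unsigned values so that it does not depend on how the host compiler shifts negative integers.

```cpp
#include <cstdint>

// Arithmetic right shift of a 16-bit lane, 0 <= n < 16: shift logically,
// then replicate the sign bit into the vacated upper positions.
constexpr std::uint16_t ashiftrt16(std::uint16_t a, unsigned n) {
  return static_cast<std::uint16_t>(
      (a >> n) | ((a & 0x8000u) ? ~(0xffffu >> n) : 0u));
}

// Logical right shift of a 16-bit lane.
constexpr std::uint16_t lshiftrt16(std::uint16_t a, unsigned n) {
  return static_cast<std::uint16_t>(a >> n);
}

// The mask from the patch description: (1 << (prec - n)) - 1 with prec = 16.
constexpr std::uint16_t low_mask16(unsigned n) {
  return static_cast<std::uint16_t>((1u << (16 - n)) - 1u);
}

// Check (ashiftrt a n) & mask == (lshiftrt a n) for every lane value and
// every shift count below the precision.
bool identity_holds() {
  for (unsigned n = 0; n < 16; ++n)
    for (unsigned v = 0; v <= 0xffffu; ++v) {
      const auto a = static_cast<std::uint16_t>(v);
      if ((ashiftrt16(a, n) & low_mask16(n)) != lshiftrt16(a, n))
        return false;
    }
  return true;
}
```

The mask removes exactly the sign-fill bits that distinguish the arithmetic shift from the logical one, which is also why the patch requires the shift count to be below the element precision.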


Re: [PATCH 1/2] c++/modules: Fix treatment of unnamed types

2024-05-23 Thread Nathaniel Shead
On Thu, May 23, 2024 at 03:36:48PM -0400, Jason Merrill wrote:
> On 5/23/24 09:27, Nathaniel Shead wrote:
> > On Mon, May 20, 2024 at 06:00:09PM -0400, Jason Merrill wrote:
> > > On 5/17/24 02:14, Nathaniel Shead wrote:
> > > > On Tue, May 14, 2024 at 06:21:48PM -0400, Jason Merrill wrote:
> > > > > On 5/12/24 22:58, Nathaniel Shead wrote:
> > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > > > 
> > > > > OK.
> > > > > 
> > > > 
> > > > I realised as I was looking over this again that I might have spoken too
> > > > soon with the header unit example being supported. Doing the following:
> > > > 
> > > > // a.H
> > > > struct { int y; } s;
> > > > decltype(s) f(decltype(s));  // { dg-error "used but never defined" }
> > > > inline auto x = f({ 123 });
> > > > // b.C
> > > > struct {} unrelated;
> > > > import "a.H";
> > > > decltype(s) f(decltype(s) x) {
> > > >   return { 456 + x.y };
> > > > }
> > > > 
> > > > // c.C
> > > > import "linkage-3_a.H";
> > > > int main() { auto a = x.y; }
> > > > 
> > > > Actually does fail to link, because in 'c.C' we call 'f(.anon_0)' but
> > > > the definition 'b.C' is f(.anon_1).
> > > > 
> > > > I don't think this is fixable, so I don't think this direction is
> > > > workable.
> > > 
> > > Since namespace-scope anonymous types are TU-local, we don't need to 
> > > support
> > > that for proper modules, but it's not clear to me that we don't need to
> > > support it for header units.
> > > 
> > > OTOH, https://eel.is/c++draft/module#import-5.3 allows c.C to import a
> > > different header unit than b.C, in which case the type is different and x
> > > violates the odr.
> > > 
> > 
> > Right; I think at this stage I don't know how to support this for header
> > units (and even for module interface units it doesn't actually work;
> > more on this below), so I think saying that this is actually an ODR
> > violation is OK.
> > 
> > > > That said, I think that it might still be worth making header modules
> > > > satisfy 'module_has_cmi_p', since that is true to the name, and will be
> > > > useful in other places we currently use 'module_p ()': in which case we
> > > > could instead make all the callers in 'no_linkage_check' do
> > > > 'module_maybe_has_cmi_p () && !header_module_p ()'; something like the
> > > > following, perhaps?
> > > 
> > > If we need that condition, it should be its own predicate rather than
> > > expecting callers to do that combined check.
> > > 
> > > But it's not clear to me how this is different from a type in the GMF of a
> > > named module, which is exactly the maybe_has_cmi case; there we could 
> > > again
> > > see a different version of the type if another TU includes the header.
> > > 
> > > Jason
> > > 
> > 
> > This made me go back and double-check for named modules and it actually
> > does fail as well; the following sample ICEs, even:
> > 
> >export module M;
> >struct {} s;
> >int h(decltype(s));
> >int x = h(s);  // ICE in write_unnamed_type_name, cp/mangle.cc:1806
> > 
> > So I think maybe the way to go here is to just not treat unnamed types
> > as something that could possibly be accessed from a different TU, like
> > the below.  Then we don't need to do the special handling for header
> > units, since as you say, they're not materially different anyway.
> > Thoughts?
> 
> Sounds good.
> 
> > (And I've moved the original change to 'module_has_cmi_p' to a separate
> > patch given it's somewhat unrelated now.)
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk (and
> > maybe 14.2)?
> > 
> > -- >8 --
> > 
> > In r14-9530 we relaxed "depending on type with no-linkage" errors for
> > declarations that could actually be accessed from different TUs anyway.
> > However, this also enabled it for unnamed types, which never work.
> > 
> > In a normal module interface, an unnamed type is TU-local by
> > [basic.link] p15.2, and so cannot be exposed or the program is
> > ill-formed.  We don't yet implement this checking but we should assume
> > that we will later; currently supporting this actually causes ICEs when
> > attempting to create the mangled name in some situations.
> > 
> > For a header unit, by [module.import] p5.3 it is unspecified whether two
> > TUs importing a header unit providing such a declaration are importing
> > the same header unit.  In this case, we would require name mangling
> > changes to somehow allow the (anonymous) type exported by such a header
> > unit to correspond across different TUs in the presence of other
> > anonymous declarations, so for this patch just assume that this case
> > would be an ODR violation instead.
> > 
> > diff --git a/gcc/testsuite/g++.dg/modules/linkage-2.C 
> > b/gcc/testsuite/g++.dg/modules/linkage-2.C
> > index eb4d7b051af..f69bd7ff728 100644
> > --- a/gcc/testsuite/g++.dg/modules/linkage-2.C
> > +++ b/gcc/testsuite/g++.dg/modules/linkage-2.C
> > @@ -13,14 
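
A small language-level illustration of the underlying issue, independent of modules: two unnamed structs are distinct types even when their members match, so a function declared in terms of one cannot be matched by a definition written in terms of another. The variable names below are invented for this note; the `.anon_0` / `.anon_1` spelling in the thread suggests the compiler tells such types apart by position, which is an inference from the message rather than something verified here.

```cpp
#include <type_traits>

struct { int y; } s;  // first unnamed type in this TU
struct { int y; } t;  // second unnamed type, same layout

// Same members, but not the same type.
constexpr bool unnamed_types_are_distinct =
    !std::is_same<decltype(s), decltype(t)>::value;

static_assert(unnamed_types_are_distinct,
              "each unnamed struct is its own type");
```

If another translation unit declares an unrelated unnamed type first, the "same" declaration ends up in a different position, which matches the link failure described for `b.C` and `c.C`.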

Re: [PATCH] missing require target has_arch_ppc64 for pr106550.c

2024-05-23 Thread Jiufu Guo


Hi,
>>> ---
>>>  gcc/testsuite/gcc.target/powerpc/pr106550.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>> 
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c 
>>> b/gcc/testsuite/gcc.target/powerpc/pr106550.c
>>> index 74e395331ab..146514b3adf 100644
>>> --- a/gcc/testsuite/gcc.target/powerpc/pr106550.c
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c
>>> @@ -1,6 +1,7 @@
>>>  /* PR target/106550 */
>>>  /* { dg-options "-O2 -mdejagnu-cpu=power10" } */
>>>  /* { dg-require-effective-target power10_ok } */
>>
>> Nit: power10_ok can be dropped.
> Yep, thanks for catching this!
>
>>
>>> +/* { dg-require-effective-target has_arch_ppc64 } */
>> OK with the nits above tweaked, thanks.

Thanks, pushed as r15-790.

BR,
Jeff(Jiufu) Guo

>
> Thanks again.
>
> BR,
> Jeff(Jiufu) Guo
>
>>
>> BR,
>> Kewen


Re: [PATCH V3] report message for operator %a on unaddressible operand

2024-05-23 Thread Jiufu Guo


Hi,

Hans-Peter Nilsson  writes:

> On Mon, 20 May 2024, Jiufu Guo wrote:
>
>> Hi,
>> 
>> For PR96866, when printing asm code for the modifier "%a", an addressable
>> operand is required.  However, the constraint "X" allows any kind of
>> operand, even one whose address is hard to get directly, e.g. an extern
>> symbol whose address is in the TOC.
>> An error message is now reported to indicate the invalid asm operand.
>> 
>> Compared with the previous version, code comments and the message are updated.
>> 
>> Bootstrap pass on ppc64{,le}.
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff(Jiufu Guo)
>> 
>>  PR target/96866
>> 
>> gcc/ChangeLog:
>> 
>>  * config/rs6000/rs6000.cc (print_operand_address):
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/powerpc/pr96866-1.c: New test.
>>  * gcc.target/powerpc/pr96866-2.c: New test.
>
> The gcc/ChangeLog entry needs some text after that ":".
Oh, sorry for missing that.  Thanks for pointing this out.

BR.
Jeff(Jiufu) Guo.

>
> brgds, H-P


[PATCH v2] c++: mark TARGET_EXPRs for function arguments eliding [PR114707]

2024-05-23 Thread Marek Polacek
On Thu, May 23, 2024 at 04:04:13PM -0400, Jason Merrill wrote:
> On 5/23/24 10:41, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > Coming back to our discussion in
> > :
> > TARGET_EXPRs that initialize a function argument are not marked
> > TARGET_EXPR_ELIDING_P even though gimplify_arg drops such TARGET_EXPRs
> > on the floor.
> 
> But only if TREE_TYPE (TARGET_EXPR_INITIAL) is non-void; I think we should
> check that here too to be parallel.

Ah yes, definitely.
 
> Perhaps most/all affected TARGET_EXPRs will have been handled earlier in the
> function under the TREE_ADDRESSABLE check, but I wouldn't rely on that
> without an assert.

So like this or you want an assert somewhere too?  dg.exp passed.

-- >8 --
Coming back to our discussion in
:
TARGET_EXPRs that initialize a function argument are not marked
TARGET_EXPR_ELIDING_P even though gimplify_arg drops such TARGET_EXPRs
on the floor.  To work around it, I added a pset to
replace_placeholders_for_class_temp_r, but it would be best to just rely
on TARGET_EXPR_ELIDING_P.
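
At the source level, the situation being discussed corresponds to initializing a by-value parameter from a prvalue: the temporary is not materialized and then copied, the parameter object is constructed directly. The sketch below only demonstrates that observable behavior (guaranteed since C++17, and performed by GCC in earlier modes unless elision is disabled); it says nothing about how TARGET_EXPRs are represented, and `Tracked` and the two functions are names made up for the example.

```cpp
// Counts how objects of this type come into existence.
struct Tracked {
  static int constructed;
  static int moved_or_copied;
  Tracked() { ++constructed; }
  Tracked(const Tracked &) { ++moved_or_copied; }
  Tracked(Tracked &&) noexcept { ++moved_or_copied; }
};
int Tracked::constructed = 0;
int Tracked::moved_or_copied = 0;

// The parameter is initialized directly from the caller's prvalue.
void take_by_value(Tracked) {}

void call_with_prvalue() { take_by_value(Tracked{}); }
```

After one call, the default constructor has run once and neither the copy nor the move constructor has run.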

PR c++/114707

gcc/cp/ChangeLog:

* call.cc (convert_for_arg_passing): Call set_target_expr_eliding.
* typeck2.cc (replace_placeholders_for_class_temp_r): Don't use pset.
(digest_nsdmi_init): Call cp_walk_tree_without_duplicates instead of
cp_walk_tree.
---
 gcc/cp/call.cc|  6 ++
 gcc/cp/typeck2.cc | 20 
 2 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index ed68eb3c568..35c024f2c7c 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -9437,6 +9437,12 @@ convert_for_arg_passing (tree type, tree val, 
tsubst_flags_t complain)
   if (complain & tf_warning)
 warn_for_address_of_packed_member (type, val);
 
+  /* gimplify_arg elides TARGET_EXPRs that initialize a function argument.  */
+  if (TREE_CODE (val) == TARGET_EXPR)
+if (tree init = TARGET_EXPR_INITIAL (val))
+  if (!VOID_TYPE_P (TREE_TYPE (init)))
+   set_target_expr_eliding (val);
+
   return val;
 }
 
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 06bad4d3303..7782f38da43 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -1409,16 +1409,14 @@ digest_init_flags (tree type, tree init, int flags, 
tsubst_flags_t complain)
in the context of guaranteed copy elision).  */
 
 static tree
-replace_placeholders_for_class_temp_r (tree *tp, int *, void *data)
+replace_placeholders_for_class_temp_r (tree *tp, int *, void *)
 {
   tree t = *tp;
-  auto pset = static_cast<hash_set<tree> *>(data);
 
   /* We're looking for a TARGET_EXPR nested in the whole expression.  */
   if (TREE_CODE (t) == TARGET_EXPR
   /* That serves as temporary materialization, not an initializer.  */
-  && !TARGET_EXPR_ELIDING_P (t)
-  && !pset->add (t))
+  && !TARGET_EXPR_ELIDING_P (t))
 {
   tree init = TARGET_EXPR_INITIAL (t);
   while (TREE_CODE (init) == COMPOUND_EXPR)
@@ -1433,16 +1431,6 @@ replace_placeholders_for_class_temp_r (tree *tp, int *, 
void *data)
  gcc_checking_assert (!find_placeholders (init));
}
 }
-  /* TARGET_EXPRs initializing function arguments are not marked as eliding,
- even though gimplify_arg drops them on the floor.  Don't go replacing
- placeholders in them.  */
-  else if (TREE_CODE (t) == CALL_EXPR || TREE_CODE (t) == AGGR_INIT_EXPR)
-for (int i = 0; i < call_expr_nargs (t); ++i)
-  {
-   tree arg = get_nth_callarg (t, i);
-   if (TREE_CODE (arg) == TARGET_EXPR && !TARGET_EXPR_ELIDING_P (arg))
- pset->add (arg);
-  }
 
   return NULL_TREE;
 }
@@ -1490,8 +1478,8 @@ digest_nsdmi_init (tree decl, tree init, tsubst_flags_t 
complain)
  temporary materialization does not occur when initializing an object
  from a prvalue of the same type, therefore we must not replace the
  placeholder with a temporary object so that it can be elided.  */
-  hash_set<tree> pset;
-  cp_walk_tree (&init, replace_placeholders_for_class_temp_r, &pset, nullptr);
+  cp_walk_tree_without_duplicates (&init, replace_placeholders_for_class_temp_r,
+				   nullptr);
 
   return init;
 }

base-commit: ee492101c2e51b58e926307448d35f539aec0b2c
-- 
2.45.1



Re: [PATCH V3] report message for operator %a on unaddressible operand

2024-05-23 Thread Hans-Peter Nilsson
On Mon, 20 May 2024, Jiufu Guo wrote:

> Hi,
> 
> For PR96866, when printing asm code for the modifier "%a", an addressable
> operand is required.  However, the constraint "X" allows any kind of
> operand, even one whose address is hard to get directly, e.g. an extern
> symbol whose address is in the TOC.
> An error message is now reported to indicate the invalid asm operand.
> 
> Compared with the previous version, code comments and the message are updated.
> 
> Bootstrap pass on ppc64{,le}.
> Is this ok for trunk?
> 
> BR,
> Jeff(Jiufu Guo)
> 
>   PR target/96866
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (print_operand_address):
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr96866-1.c: New test.
>   * gcc.target/powerpc/pr96866-2.c: New test.

The gcc/ChangeLog entry needs some text after that ":".

brgds, H-P


[PATCH v2] c++/modules: Improve errors for bad module-directives [PR115200]

2024-05-23 Thread Nathaniel Shead
On Thu, May 23, 2024 at 05:11:39PM -0400, Jason Merrill wrote:
> On 5/23/24 10:54, Nathaniel Shead wrote:
> > Bootstrapped and regtested (so far just modules.exp and dg.exp) on
> > x86_64-pc-linux-gnu, OK for trunk if full regtest succeeds?
> > 
> > -- >8 --
> > 
> > This fixes an ICE when a module directive is not given at global scope.
> > Although not explicitly mentioned, it seems implied from [basic.link] p1
> > and [module.global.frag] that a module-declaration must appear at the
> > global scope after preprocessing.  Apart from this the patch also
> > slightly improves the errors given when accidentally using a module
> > control-line in other situations where it is not expected.
> 
> This could also come up with something like
> 
> int module;
> int i =
>   module; // error, unexpected module directive
> 
> Adding a line break seems like confusing advice for this problem; rather,
> they need to remove the line break before 'module'.  And possibly add it in
> somewhere else, but the problem is that 'module' is the first token on the
> line.  And if I put that in a namespace,
> 
> namespace A {
>   int module;
>   int i =
> module; // error, unexpected module directive
> }
> 
> the problem is the same, but we get a different diagnostic.
> 

True.

FWIW I just used the same wording as in 'cp_parser_import_declaration';
my understanding is it's because you can disambiguate by adding a
newline after the 'module' itself:

  int module;
  int i =
module
;

is OK.  But I'll update this message to be clearer.

> I think I'd leave the "must be at global scope" diagnostic to
> cp_parser_module_declaration, and assume that if we see a module keyword at
> function scope it wasn't intended to be a module directive.
> 

How about this then? Bootstrapped and regtested on x86_64-pc-linux-gnu.

-- >8 --

This fixes an ICE when a module directive is not given at global scope.
Although not explicitly mentioned, it seems implied from [basic.link] p1
and [module.global.frag] that a module-declaration must appear at the
global scope after preprocessing.  Apart from this the patch also
slightly improves the errors given when accidentally using a module
control-line in other situations where it is not expected.

PR c++/115200

gcc/cp/ChangeLog:

* parser.cc (cp_parser_error_1): Special-case unexpected module
directives for better diagnostics.
(cp_parser_module_declaration): Check that the module
declaration is at global scope.
(cp_parser_import_declaration): Sync error message with that in
cp_parser_error_1.

gcc/testsuite/ChangeLog:

* g++.dg/modules/mod-decl-1.C: Update error messages.
* g++.dg/modules/mod-decl-6.C: New test.
* g++.dg/modules/mod-decl-7.C: New test.
* g++.dg/modules/mod-decl-8.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/parser.cc  | 26 ---
 gcc/testsuite/g++.dg/modules/mod-decl-1.C |  8 ---
 gcc/testsuite/g++.dg/modules/mod-decl-6.C | 11 ++
 gcc/testsuite/g++.dg/modules/mod-decl-7.C | 11 ++
 gcc/testsuite/g++.dg/modules/mod-decl-8.C | 14 
 5 files changed, 64 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-6.C
 create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-7.C
 create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-8.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 476ddc0d63a..779625144db 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -3230,6 +3230,19 @@ cp_parser_error_1 (cp_parser* parser, const char* gmsgid,
   return;
 }
 
+  if (cp_token_is_module_directive (token))
+{
+  auto_diagnostic_group d;
+  error_at (token->location, "unexpected module directive");
+  if (token->keyword != RID__EXPORT)
+   inform (token->location, "perhaps insert a line break after"
+   " %qs, or other disambiguation, to prevent this being"
+   " considered a module control-line",
+   (token->keyword == RID__MODULE) ? "module" : "import");
+  cp_parser_skip_to_pragma_eol (parser, token);
+  return;
+}
+
   /* If this is actually a conflict marker, report it as such.  */
   if (token->type == CPP_LSHIFT
   || token->type == CPP_RSHIFT
@@ -15135,12 +15148,19 @@ cp_parser_module_declaration (cp_parser *parser, 
module_parse mp_state,
   parser->lexer->in_pragma = true;
   cp_token *token = cp_lexer_consume_token (parser->lexer);
 
+  tree scope = current_scope ();
   if (flag_header_unit)
 {
   error_at (token->location,
"module-declaration not permitted in header-unit");
   goto skip_eol;
 }
+  else if (scope != global_namespace)
+{
+  error_at (token->location, "module-declaration must be at global scope");
+  inform (DECL_SOURCE_LOCATION (scope), "scope opened here");
+  goto skip_eol;
+}
   else if (mp_state == MP_FIRST && !exporting
  

[PATCH] c++: extend -Wself-move for mem-init-list [PR109396]

2024-05-23 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
We already warn for:

  x = std::move (x);

which triggers:

  warning: moving 'x' of type 'int' to itself [-Wself-move]

but bug 109396 reports that this doesn't work for a member-initializer-list:

  X() : x(std::move (x))

so this patch amends that.
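
For illustration only (this is not part of the patch or its testsuite), here
is a minimal sketch of the two patterns involved.  The names Good, Bad and
make_good are made up for this example.  The pattern the extended warning is
meant to catch appears only in a comment, so that the snippet itself stays
free of undefined behaviour.

```cpp
#include <utility>

// Hypothetical example type, not taken from the patch.
struct Good
{
  int x;
  // Inside the initializer the parameter 'x' shadows the member, so this
  // moves from the parameter into the member; no self-move is involved.
  explicit Good (int x) : x (std::move (x)) { }
};

// The pattern from PR109396, kept as a comment because it would read the
// member before it has been initialized:
//
//   struct Bad
//   {
//     int x;
//     Bad () : x (std::move (x)) { }
//   };

// Small helper so the well-formed case can be exercised.
inline int
make_good (int v)
{
  return Good (v).x;
}
```

The assumption here is that the parameter-shadowing form should stay silent,
since the argument of std::move is then a parameter rather than a reference
to the member through the current object.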

PR c++/109396

gcc/cp/ChangeLog:

* cp-tree.h (maybe_warn_self_move): Declare.
* init.cc (perform_member_init): Call maybe_warn_self_move.
* typeck.cc (maybe_warn_self_move): No longer static.  Change the
return type to bool.  Also warn when called from
a member-initializer-list.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wself-move2.C: New test.
---
 gcc/cp/cp-tree.h|  1 +
 gcc/cp/init.cc  |  5 ++--
 gcc/cp/typeck.cc| 28 +--
 gcc/testsuite/g++.dg/warn/Wself-move2.C | 37 +
 4 files changed, 60 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wself-move2.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index ba9e848c177..ea3fa6f4aac 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8263,6 +8263,7 @@ extern cp_expr build_c_cast   
(location_t loc, tree type,
 cp_expr expr);
 extern tree cp_build_c_cast(location_t, tree, tree,
 tsubst_flags_t);
+extern bool maybe_warn_self_move   (location_t, tree, tree);
 extern cp_expr build_x_modify_expr (location_t, tree,
 enum tree_code, tree,
 tree, tsubst_flags_t);
diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 52396d87a8c..4a7ed7f5302 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -999,7 +999,7 @@ perform_member_init (tree member, tree init, hash_set 
)
   if (decl == error_mark_node)
 return;
 
-  if ((warn_init_self || warn_uninitialized)
+  if ((warn_init_self || warn_uninitialized || warn_self_move)
   && init
   && TREE_CODE (init) == TREE_LIST
   && TREE_CHAIN (init) == NULL_TREE)
@@ -1013,7 +1013,8 @@ perform_member_init (tree member, tree init, 
hash_set )
warning_at (DECL_SOURCE_LOCATION (current_function_decl),
OPT_Winit_self, "%qD is initialized with itself",
member);
-  else
+  else if (!maybe_warn_self_move (input_location, member,
+ TREE_VALUE (init)))
find_uninit_fields (, , decl);
 }
 
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index d7fa6e0dd96..e058ce18276 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -9355,27 +9355,27 @@ cp_build_c_cast (location_t loc, tree type, tree expr,
 
 /* Warn when a value is moved to itself with std::move.  LHS is the target,
RHS may be the std::move call, and LOC is the location of the whole
-   assignment.  */
+   assignment.  Return true if we warned.  */
 
-static void
+bool
 maybe_warn_self_move (location_t loc, tree lhs, tree rhs)
 {
   if (!warn_self_move)
-return;
+return false;
 
   /* C++98 doesn't know move.  */
   if (cxx_dialect < cxx11)
-return;
+return false;
 
   if (processing_template_decl)
-return;
+return false;
 
   if (!REFERENCE_REF_P (rhs)
   || TREE_CODE (TREE_OPERAND (rhs, 0)) != CALL_EXPR)
-return;
+return false;
   tree fn = TREE_OPERAND (rhs, 0);
   if (!is_std_move_p (fn))
-return;
+return false;
 
   /* Just a little helper to strip * and various NOPs.  */
   auto extract_op = [] (tree ) {
@@ -9393,13 +9393,23 @@ maybe_warn_self_move (location_t loc, tree lhs, tree 
rhs)
   tree type = TREE_TYPE (lhs);
   tree orig_lhs = lhs;
   extract_op (lhs);
-  if (cp_tree_equal (lhs, arg))
+  if (cp_tree_equal (lhs, arg)
+  /* Also warn in a member-initializer-list, as in : i(std::move(i)).  */
+  || (TREE_CODE (lhs) == FIELD_DECL
+ && TREE_CODE (arg) == COMPONENT_REF
+ && cp_tree_equal (TREE_OPERAND (arg, 0), current_class_ref)
+ && TREE_OPERAND (arg, 1) == lhs))
 {
   auto_diagnostic_group d;
   if (warning_at (loc, OPT_Wself_move,
  "moving %qE of type %qT to itself", orig_lhs, type))
-   inform (loc, "remove % call");
+   {
+ inform (loc, "remove % call");
+ return true;
+   }
 }
+
+  return false;
 }
 
 /* For use from the C common bits.  */
diff --git a/gcc/testsuite/g++.dg/warn/Wself-move2.C 
b/gcc/testsuite/g++.dg/warn/Wself-move2.C
new file mode 100644
index 000..0c0e1b9d5f9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wself-move2.C
@@ -0,0 +1,37 @@
+// PR c++/109396
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wall" }
+
+// Define std::move.
+namespace std {
+  template
+struct remove_reference
+{ typedef _Tp  

Re: [PATCH] rs6000: Adjust -fpatchable-function-entry* support for dual entry [PR112980]

2024-05-23 Thread Fangrui Song
On Wed, May 8, 2024 at 1:33 AM Kewen.Lin  wrote:
>
> Hi Richi,
>
> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> >> index c584664e168..58e48f7dc55 100644
> >> --- a/gcc/doc/invoke.texi
> >> +++ b/gcc/doc/invoke.texi
> >> @@ -18363,11 +18363,11 @@ If @code{N=0}, no pad location is recorded.
> >>  The NOP instructions are inserted at---and maybe before, depending on
> >>  @var{M}---the function entry address, even before the prologue.  On
> >>  PowerPC with the ELFv2 ABI, for a function with dual entry points,
> >> -the local entry point is this function entry address.
> >> +@var{M} NOP instructions are inserted before the global entry point and
> >> +@var{N} - @var{M} NOP instructions are inserted after the local entry
> >> +point, which means the NOP instructions may not be consecutive.
> >
> > Isn't it @var{M-1} NOP instructions before the global entry?  I suppose
>
> No, the existing documentation is a bit confusing, sigh ...
>
> > the existing
> >
> > "... with the function entry point before the @var{M}th NOP.
> > If @var{M} is omitted, it defaults to @code{0} so the
> > function entry points to the address just at the first NOP."
> >
> > wording is self-contradicting in a way since before the 0th NOP (default)
> > to me is the same as before the 1st NOP (M == 1).  So maybe that should
> > be _after_ the @var{M}th NOP instead which would be consistent with your
> > ELFv2 docs?  Maybe the sentence should be re-worded similar to your
> > ELFv2 one, specifying the number of NOPs before and after the entry point.
> >
>
> ... the current "with the function entry point before the Mth NOP."
> has the 0th NOP assumption, so the default (0th) NOP and 1st NOP (M == 1)
> are actually different, such as:
>
> -fpatchable-function-entry=3,0
>
> foo:
> nop
> nop
> nop
>
> -fpatchable-function-entry=3,1
>
> nop
> foo:
> nop
> nop
>
> Alan also had the similar concern on this wording before:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888#c8
>
> " Alan Modra 2022-08-12 03:00:29 UTC
> "
> "(In reply to Segher Boessenkool from comment #7)
> "> '-fpatchable-function-entry=N[,M]'
> ">  Generate N NOPs right at the beginning of each function, with the
> ">  function entry point before the Mth NOP.
> "
> " Bad doco.  Should be "after the Mth NOP" I think.
> " Or better written to avoid the concept of a 0th nop.
> " Default for M is zero, placing all nops after the function entry and
> " before normal function prologue code.
>
> BR,
> Kewen
>
> >> -The maximum value of @var{N} and @var{M} is 65535.  On PowerPC with the
> >> -ELFv2 ABI, for a function with dual entry points, the supported values
> >> -for @var{M} are 0, 2, 6 and 14.
> >> +The maximum value of @var{N} and @var{M} is 65535.
> >>  @end table
> >>


> So this patch is to change the current implementation by
> emitting the "before" NOPs before global entry point and
> the "after" NOPs after local entry point.  The new behavior

Thanks.  This looks good to me :)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888#c5
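
As a rough sketch of the counting convention discussed above (an
illustration only, not code from the patch): for
-fpatchable-function-entry=N,M, M NOPs go before the entry point and the
remaining N - M go after it.  The names nop_layout and split_nops are
hypothetical and only model the arithmetic.

```cpp
// Hypothetical helper type; it models the counts and emits no code.
struct nop_layout
{
  unsigned before; // NOPs placed before the (global) entry point
  unsigned after;  // NOPs placed after the (local) entry point
};

// Assumes m <= n, which the option needs in order to make sense.
inline nop_layout
split_nops (unsigned n, unsigned m)
{
  return nop_layout { m, n - m };
}
```

Under this reading, split_nops (3, 0) puts all three NOPs after the entry
label and split_nops (3, 1) puts one before and two after, which matches the
two listings quoted earlier in the thread.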


-- 
宋方睿


PING Re: PING Re: [PATCH RFA (cgraph)] c++: pragma target and static init [PR109753]

2024-05-23 Thread Jason Merrill

Ping

On 5/14/24 17:21, Jason Merrill wrote:

Ping

On 5/2/24 09:54, Jason Merrill wrote:

Tested x86_64-pc-linux-gnu, OK for trunk?  14.2?

This two-year-old thread seems relevant:
https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593410.html

-- 8< --

  #pragma target and optimize should also apply to implicitly-generated
  functions like static initialization functions and defaulted special
  member functions.

At least one of the create_same_body_alias/handle_optimize_attribute
changes is necessary to avoid regressing g++.dg/opt/pr105306.C;
maybe_clone_body creates a cgraph_node for the ~B alias before
handle_optimize_attribute, and the alias never goes through
finalize_function, so we need to adjust semantic_interposition
somewhere else.

PR c++/109753

gcc/ChangeLog:

* cgraph.cc (cgraph_node::create_same_body_alias): Set
semantic_interposition.

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_optimize_attribute): Set
cgraph_node::semantic_interposition.

gcc/cp/ChangeLog:

* decl.cc (start_preparsed_function): Call decl_attributes.

gcc/testsuite/ChangeLog:

* g++.dg/opt/always_inline1.C: New test.
---
  gcc/c-family/c-attribs.cc | 4 
  gcc/cgraph.cc | 2 ++
  gcc/cp/decl.cc    | 3 +++
  gcc/testsuite/g++.dg/opt/always_inline1.C | 8 
  4 files changed, 17 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/opt/always_inline1.C

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 04e39b41bdf..605469dd7dd 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -5971,6 +5971,10 @@ handle_optimize_attribute (tree *node, tree 
name, tree args,

    if (prev_target_node != target_node)
  DECL_FUNCTION_SPECIFIC_TARGET (*node) = target_node;
+  /* Also update the cgraph_node, if it's already built.  */
+  if (cgraph_node *cn = cgraph_node::get (*node))
+    cn->semantic_interposition = flag_semantic_interposition;
+
    /* Restore current options.  */
    cl_optimization_restore (_options, _options_set,
 _opts);
diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 473d8410bc9..f3bd2fa8ece 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -604,6 +604,8 @@ cgraph_node::create_same_body_alias (tree alias, 
tree decl)

    n = cgraph_node::create_alias (alias, decl);
    n->cpp_implicit_alias = true;
+  /* Aliases don't go through finalize_function.  */
+  n->semantic_interposition = opt_for_fn (decl, 
flag_semantic_interposition);

    if (symtab->cpp_implicit_aliases_done)
  n->resolve_alias (cgraph_node::get (decl));
    return n;
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 378311c0f04..4531d830462 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17796,6 +17796,9 @@ start_preparsed_function (tree decl1, tree 
attrs, int flags)

  doing_friend = true;
  }
+  /* Adjust for #pragma target/optimize.  */
+  decl_attributes (, NULL_TREE, 0);
+
    if (DECL_DECLARED_INLINE_P (decl1)
    && lookup_attribute ("noinline", attrs))
  warning_at (DECL_SOURCE_LOCATION (decl1), 0,
diff --git a/gcc/testsuite/g++.dg/opt/always_inline1.C 
b/gcc/testsuite/g++.dg/opt/always_inline1.C

new file mode 100644
index 000..a042a1cf0c6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/always_inline1.C
@@ -0,0 +1,8 @@
+// PR c++/109753
+// { dg-do compile { target x86_64-*-* } }
+
+#pragma GCC target("avx2")
+struct aa {
+    __attribute__((__always_inline__)) aa() {}
+};
+aa _M_impl;

base-commit: 2f15787f2e1a3afe2c2ad93d4eb0d3c1f73c8fbd






Re: [PATCH v2] libstdc++: Fix std::ranges::iota not included in numeric [PR108760]

2024-05-23 Thread Patrick Palka
On Fri, 17 May 2024, Michael Levine (BLOOMBERG/ 731 LEX) wrote:

> This is the revised version of my patch incorporating the provided feedback 
> from Patrick Palka and Jonathan Wakely.
> This patch fixes GCC Bug 108760: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108760
> I moved out_value_result to , moved std::ranges::iota 
> into , removed my new test, and moved and renamed the existing test.

Nice, thanks!  The incremental changes seem good, but could you send a
single squashed patch containing all the changes?  That's what we'll end
up pushing after all.

> 
> I built my local version of gcc using the following configuration: $ 
> ../gcc/configure --disable-bootstrap --prefix="$(pwd)/_pfx/" 
> --enable-languages=c,c++,lto
> I then ran $ make -jN
> and $ make -jN install
> 
> Using the locally installed version, the following code compiled: 
> https://godbolt.org/z/33EPeqd1b
> 
> I tested my changes by running: $ make check-c++ -jN -k
> I personally found it difficult to understand the results of running the 
> tests.
> 
> I ran this on the following OS:
> 
> Virtualization: wsl
> Operating System: Ubuntu 20.04.6 LTS
> Kernel: Linux 5.15.146.1-microsoft-standard-WSL2
> Architecture: x86-64
> 
> 
> 
> From: Michael Levine (BLOOMBERG/ 731 LEX) At: 04/17/24 14:24:24 UTC-4:00
> To: libstd...@gcc.gnu.org, gcc-patches@gcc.gnu.org
> Subject: [PATCH] libstdc++: Fix std::ranges::iota is not included in numeric 
> [PR108760]
> 
> This patch fixes GCC Bug 108760: 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108760
> Before this patch, using std::ranges::iota required including  
> when it should have been sufficient to only include .
> 
> When the patch is applied, the following code will compile: 
> https://godbolt.org/z/33EPeqd1b
> 
> I added a test case for this change as well.
> 
> I built my local version of gcc using the following configuration: $ 
> ../gcc/configure --disable-bootstrap --prefix="$(pwd)/_pfx/" 
> --enable-languages=c,c++,lto
> 
> and I tested my changes by running: $ make check-c++ -jN -k
> 
> I ran this on the following OS:
> 
> Virtualization: wsl
> Operating System: Ubuntu 20.04.6 LTS
> Kernel: Linux 5.15.146.1-microsoft-standard-WSL2
> Architecture: x86-64
> 
> 
> 
> 
> 



Re: [PATCH][14 backport] c++: Fix instantiation of imported temploid friends [PR114275]

2024-05-23 Thread Jason Merrill

On 5/13/24 07:56, Nathaniel Shead wrote:

@@ -11751,9 +11767,16 @@ tsubst_friend_class (tree friend_tmpl, tree args)
 if (tmpl != error_mark_node)
{
  /* The new TMPL is not an instantiation of anything, so we
-forget its origins.  We don't reset CLASSTYPE_TI_TEMPLATE
+forget its origins.  It is also not a specialization of
+anything.  We don't reset CLASSTYPE_TI_TEMPLATE
 for the new type because that is supposed to be the
 corresponding template decl, i.e., TMPL.  */
+ spec_entry elt;
+ elt.tmpl = friend_tmpl;
+ elt.args = CLASSTYPE_TI_ARGS (TREE_TYPE (tmpl));
+ elt.spec = TREE_TYPE (tmpl);
+ type_specializations->remove_elt ();


For GCC 14.2 let's guard this with if (modules_p ()); for GCC 15 it can be
unconditional.  OK.


I'm looking to backport this patch to GCC 14 now that it's been on trunk
some time.  Here's the patch I'm aiming to add (squashed with the
changes from r15-220-gec2365e07537e8) after cherrypicking the
prerequisite commit r15-58-g2faf040335f9b4; is this OK?

Or should I keep it as two separate commits to make the cherrypicking
more obvious? Not entirely sure on the etiquette around this.


It's OK to squash them, but it's typical to use -x (directly or via git 
gcc-backport) to mention where a branch change was cherry-picked from, 
and in this case it would make sense to edit in the second commit so 
it's clear the backport includes both.  OK that way.


Jason



Re: [PATCH] c++: alias CTAD and copy deduction guide [PR115198]

2024-05-23 Thread Jason Merrill

On 5/23/24 17:42, Patrick Palka wrote:

On Thu, 23 May 2024, Jason Merrill wrote:


On 5/23/24 14:06, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk/14?

-- >8 --

Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C but it should be __dguide_A with TREE_TYPE A
(equivalently C).  This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias CTAD.  This eventually needs to be done for inherited CTAD
too, but it's not clear what identifier to use there since it has to be
unique for each derived/base pair.  For

template struct A { ... };
template struct B : A { using A::A; }

at first glance it'd be reasonable to give inherited guides a name of
__dguide_B with TREE_TYPE A, but since that name is already
used by B's own guides its TREE_TYPE is already B.


Why can't it be the same __dguide_B with TREE_TYPE B?


Ah because copy_guide_p relies on TREE_TYPE in order to recognize a copy
deduction guide, and with that TREE_TYPE it would still incorrectly
return false for an inherited copy deduction guide, e.g.

   A(A) -> A

gets transformed into

   B(A) -> B

and A != B so copy_guide_p returns false.


Hmm, that seems correct; the transformed candidate is not the copy 
deduction guide for B.



But it just occurred to me that this TREE_TYPE clobbering of the
__dguide_foo identifier already happens if we have two class templates
with the same name in different namespaces, since the identifier
contains only the terminal name.  Maybe this suggests that we should
use a tree flag to track whether a guide is the copy deduction guide
instead of setting TREE_TYPE of DECL_NAME?


Good point.

Jason



Re: [PATCH][14 backport] c++: Fix instantiation of imported temploid friends [PR114275]

2024-05-23 Thread Patrick Palka
On Mon, 13 May 2024, Nathaniel Shead wrote:

> > > @@ -11751,9 +11767,16 @@ tsubst_friend_class (tree friend_tmpl, tree args)
> > > if (tmpl != error_mark_node)
> > >   {
> > > /* The new TMPL is not an instantiation of anything, so we
> > > -  forget its origins.  We don't reset CLASSTYPE_TI_TEMPLATE
> > > +  forget its origins.  It is also not a specialization of
> > > +  anything.  We don't reset CLASSTYPE_TI_TEMPLATE
> > >for the new type because that is supposed to be the
> > >corresponding template decl, i.e., TMPL.  */
> > > +   spec_entry elt;
> > > +   elt.tmpl = friend_tmpl;
> > > +   elt.args = CLASSTYPE_TI_ARGS (TREE_TYPE (tmpl));
> > > +   elt.spec = TREE_TYPE (tmpl);
> > > +   type_specializations->remove_elt ();
> > 
> > For GCC 14.2 let's guard this with if (modules_p ()); for GCC 15 it can be
> > unconditional.  OK.
> > 
> > Jason
> > 
> 
> I'm looking to backport this patch to GCC 14 now that it's been on trunk
> some time.  Here's the patch I'm aiming to add (squashed with the
> changes from r15-220-gec2365e07537e8) after cherrypicking the
> prerequisite commit r15-58-g2faf040335f9b4; is this OK?
> 
> Or should I keep it as two separate commits to make the cherrypicking
> more obvious? Not entirely sure on the etiquette around this.

Since the first patch "only" causes sporadic testsuite failures (and
doesn't e.g. break bootstrap or anything serious like that), I reckon
it'd be fine to keep them as separate commits?  Not sure either.

> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu on top of the
> releases/gcc-14 branch.
> 
> -- >8 --
> 
> This patch fixes a number of issues with the handling of temploid friend
> declarations.
> 
> The primary issue is that instantiations of friend declarations should
> attach the declaration to the same module as the befriending class, by
> [module.unit] p7.1 and [temp.friend] p2; this could be a different
> module from the current TU, and so needs special handling.
> 
> The other main issue here is that we can't assume that just because name
> lookup didn't find a definition for a hidden class template, that it
> doesn't exist at all: it could be a non-exported entity that we've
> nevertheless streamed in from an imported module.  We need to ensure
> that when instantiating template friend classes that we return the same
> TEMPLATE_DECL that we got from our imports, otherwise we will get later
> issues with 'duplicate_decls' (rightfully) complaining that they're
> different when trying to merge.
> 
> This doesn't appear necessary for function templates due to the existing
> name lookup handling already finding these hidden declarations.
> 
>   PR c++/105320
>   PR c++/114275
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (propagate_defining_module): Declare.
>   (remove_defining_module): Declare.
>   (lookup_imported_hidden_friend): Declare.
>   * decl.cc (duplicate_decls): Also check if hidden decls can be
>   redeclared in this module. Call remove_defining_module on
>   to-be-freed newdecl.
>   * module.cc (imported_temploid_friends): New.
>   (init_modules): Initialize it.
>   (trees_out::decl_value): Write it; don't consider imported
>   temploid friends as attached to a module.
>   (trees_in::decl_value): Read it for non-discarded decls.
>   (get_originating_module_decl): Follow the owning decl for an
>   imported temploid friend.
>   (propagate_defining_module): New.
>   (remove_defining_module): New.
>   * name-lookup.cc (get_mergeable_namespace_binding): New.
>   (lookup_imported_hidden_friend): New.
>   * pt.cc (tsubst_friend_function): Propagate defining module for
>   new friend functions.
>   (tsubst_friend_class): Lookup imported hidden friends.  Check
>   for valid module attachment of existing names.  Propagate
>   defining module for new classes.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/modules/tpl-friend-10_a.C: New test.
>   * g++.dg/modules/tpl-friend-10_b.C: New test.
>   * g++.dg/modules/tpl-friend-10_c.C: New test.
>   * g++.dg/modules/tpl-friend-10_d.C: New test.
>   * g++.dg/modules/tpl-friend-11_a.C: New test.
>   * g++.dg/modules/tpl-friend-11_b.C: New test.
>   * g++.dg/modules/tpl-friend-12_a.C: New test.
>   * g++.dg/modules/tpl-friend-12_b.C: New test.
>   * g++.dg/modules/tpl-friend-12_c.C: New test.
>   * g++.dg/modules/tpl-friend-12_d.C: New test.
>   * g++.dg/modules/tpl-friend-12_e.C: New test.
>   * g++.dg/modules/tpl-friend-12_f.C: New test.
>   * g++.dg/modules/tpl-friend-13_a.C: New test.
>   * g++.dg/modules/tpl-friend-13_b.C: New test.
>   * g++.dg/modules/tpl-friend-13_c.C: New test.
>   * g++.dg/modules/tpl-friend-13_d.C: New test.
>   * g++.dg/modules/tpl-friend-13_e.C: New test.
>   * g++.dg/modules/tpl-friend-13_f.C: New test.
>   * 

Re: [PATCH] [RFC] Target-independent store forwarding avoidance. [PR48696] Target-independent store forwarding avoidance.

2024-05-23 Thread Philipp Tomsich
On Thu, 23 May 2024 at 18:18, Andrew Pinski  wrote:
>
> On Thu, May 23, 2024 at 8:01 AM Manolis Tsamis  
> wrote:
> >
> > This pass detects cases of expensive store forwarding and tries to avoid 
> > them
> > by reordering the stores and using suitable bit insertion sequences.
> > For example it can transform this:
> >
> >  strb w2, [x1, 1]
> >  ldr x0, [x1]  # Epxensive store forwarding to larger load.

@Manolis: looks like a typo slipped through: Epxensive -> Expensive

> >
> > To:
> >
> >  ldr x0, [x1]
> >  strb w2, [x1]
> >  bfi x0, x2, 0, 8
> >
>
> Are you sure this is correct with respect to the C11/C++11 memory
> models? If not then the pass should be gated with
> flag_store_data_races.

This optimization (i.e., the reordering and usage of the
bfi-instruction) should always be safe and not violate the C++11
memory model, as we still perform the same stores (i.e., with the same
width).
Keeping the same stores around (and only reordering them relative to
the loads) ensures that only the bytes containing the adjacent bits
are overwritten.
This pass never tries to merge multiple stores (although later passes
may), but only reorders those relative to a (wider) load we are
forwarding into.
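
To make the argument above concrete, here is a small user-level model of the
two instruction orders (a sketch for illustration, not code from the pass).
The function names are hypothetical, the byte offset is assumed to be less
than 8, and the mask-and-or sequence stands in for the bfi instruction.  Both
versions perform the same one-byte store; only its position relative to the
wide load changes.

```cpp
#include <cstdint>
#include <cstring>

// Original order: narrow store, then a wider load that overlaps it.
inline std::uint64_t
store_then_load (unsigned char *p, unsigned off, std::uint8_t b)
{
  p[off] = b;                      // strb
  std::uint64_t v;
  std::memcpy (&v, p, sizeof v);   // ldr, has to wait for the narrow store
  return v;
}

// Runtime endianness probe, so the bit position below is right on
// both little-endian and big-endian hosts.
inline bool
is_little_endian ()
{
  const std::uint16_t one = 1;
  unsigned char first;
  std::memcpy (&first, &one, 1);
  return first == 1;
}

// Reordered: wide load first, the same narrow store, then a bit insert
// into the loaded value.
inline std::uint64_t
load_then_insert (unsigned char *p, unsigned off, std::uint8_t b)
{
  std::uint64_t v;
  std::memcpy (&v, p, sizeof v);   // ldr
  p[off] = b;                      // strb, same width as before
  const unsigned shift = 8 * (is_little_endian () ? off : 7 - off);
  v &= ~(std::uint64_t {0xff} << shift);   // clear the target byte
  v |= std::uint64_t {b} << shift;         // insert the stored byte
  return v;
}
```

Both functions leave memory in the same state and return the same value,
which is the single-threaded equivalence the transformation relies on.  The
model says nothing about alias sets or concurrent accesses raised elsewhere
in the thread.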

> Also stores like this start a new "alias set" (I can't remember the
> exact term here). So how do you represent the store's aliasing set? Do
> you change it? If not, are you sure that will do the right thing?
>
> You didn't document the new option or the new --param (invoke.texi);
> this is the bare minimum requirement.
> Note you should add documentation for the new pass in the internals
> manual (passes.texi) (note most folks forget to update this when
> adding a new pass).
>
> Thanks,
> Andrew
>
>
> > Assembly like this can appear with bitfields or type punning / unions.
> > On stress-ng when running the cpu-union microbenchmark the following 
> > speedups
> > have been observed.
> >
> >   Neoverse-N1:  +29.4%
> >   Intel Coffeelake: +13.1%
> >   AMD 5950X:        +17.5%
> >
> > PR rtl-optimization/48696
> >
> > gcc/ChangeLog:
> >
> > * Makefile.in: Add avoid-store-forwarding.o.
> > * common.opt: New option -favoid-store-forwarding.
> > * params.opt: New param store-forwarding-max-distance.
> > * passes.def: Schedule a new pass.
> > * tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
> > * avoid-store-forwarding.cc: New file.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/avoid-store-forwarding-1.c: New test.
> > * gcc.dg/avoid-store-forwarding-2.c: New test.
> > * gcc.dg/avoid-store-forwarding-3.c: New test.
> >
> > Signed-off-by: Manolis Tsamis 
> > ---
> >
> >  gcc/Makefile.in   |   1 +
> >  gcc/avoid-store-forwarding.cc | 554 ++
> >  gcc/common.opt|   4 +
> >  gcc/params.opt|   4 +
> >  gcc/passes.def|   1 +
> >  .../gcc.dg/avoid-store-forwarding-1.c |  46 ++
> >  .../gcc.dg/avoid-store-forwarding-2.c |  39 ++
> >  .../gcc.dg/avoid-store-forwarding-3.c |  31 +
> >  gcc/tree-pass.h   |   1 +
> >  9 files changed, 681 insertions(+)
> >  create mode 100644 gcc/avoid-store-forwarding.cc
> >  create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-2.c
> >  create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c
> >
> > diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> > index a7f15694c34..be969b1ca1d 100644
> > --- a/gcc/Makefile.in
> > +++ b/gcc/Makefile.in
> > @@ -1681,6 +1681,7 @@ OBJS = \
> > statistics.o \
> > stmt.o \
> > stor-layout.o \
> > +   avoid-store-forwarding.o \
> > store-motion.o \
> > streamer-hooks.o \
> > stringpool.o \
> > diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
> > new file mode 100644
> > index 000..d90627c4872
> > --- /dev/null
> > +++ b/gcc/avoid-store-forwarding.cc
> > @@ -0,0 +1,554 @@
> > +/* Avoid store forwarding optimization pass.
> > +   Copyright (C) 2024 Free Software Foundation, Inc.
> > +   Contributed by VRULL GmbH.
> > +
> > +   This file is part of GCC.
> > +
> > +   GCC is free software; you can redistribute it and/or modify it
> > +   under the terms of the GNU General Public License as published by
> > +   the Free Software Foundation; either version 3, or (at your option)
> > +   any later version.
> > +
> > +   GCC is distributed in the hope that it will be useful, but
> > +   WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   General Public License for more details.
> > +
> > +   You should have received a copy of the GNU General Public License
> > +   along 

Re: [PATCH v26 01/13] libstdc++: Optimize std::is_const compilation performance

2024-05-23 Thread Ken Matsui
On Thu, May 23, 2024 at 3:15 PM Patrick Palka  wrote:
>
> On Sat, 11 May 2024, Ken Matsui wrote:
>
> > This patch optimizes the compilation performance of std::is_const
> > by dispatching to the new __is_const built-in trait.
>
> This patch series LGTM

Thank you!

>
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * include/std/type_traits (is_const): Use __is_const built-in
> >   trait.
> >   (is_const_v): Likewise.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  libstdc++-v3/include/std/type_traits | 12 
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/libstdc++-v3/include/std/type_traits 
> > b/libstdc++-v3/include/std/type_traits
> > index b441bf9908f..8df0cf3ac3b 100644
> > --- a/libstdc++-v3/include/std/type_traits
> > +++ b/libstdc++-v3/include/std/type_traits
> > @@ -835,6 +835,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >// Type properties.
> >
> >/// is_const
> > +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
> > +  template
> > +struct is_const
> > +: public __bool_constant<__is_const(_Tp)>
> > +{ };
> > +#else
> >template
> >  struct is_const
> >  : public false_type { };
> > @@ -842,6 +848,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >template
> >  struct is_const<_Tp const>
> >  : public true_type { };
> > +#endif
> >
> >/// is_volatile
> >template
> > @@ -3331,10 +3338,15 @@ template 
> >inline constexpr bool is_member_pointer_v = 
> > is_member_pointer<_Tp>::value;
> >  #endif
> >
> > +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
> > +template 
> > +  inline constexpr bool is_const_v = __is_const(_Tp);
> > +#else
> >  template 
> >inline constexpr bool is_const_v = false;
> >  template 
> >inline constexpr bool is_const_v = true;
> > +#endif
> >
> >  #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
> >  template 
> > --
> > 2.44.0
> >
> >
>


Re: [PATCH v26 01/13] libstdc++: Optimize std::is_const compilation performance

2024-05-23 Thread Patrick Palka
On Sat, 11 May 2024, Ken Matsui wrote:

> This patch optimizes the compilation performance of std::is_const
> by dispatching to the new __is_const built-in trait.

This patch series LGTM

> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/type_traits (is_const): Use __is_const built-in
>   trait.
>   (is_const_v): Likewise.
> 
> Signed-off-by: Ken Matsui 
> ---
>  libstdc++-v3/include/std/type_traits | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/libstdc++-v3/include/std/type_traits 
> b/libstdc++-v3/include/std/type_traits
> index b441bf9908f..8df0cf3ac3b 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -835,6 +835,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>// Type properties.
>  
>/// is_const
> +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
> +  template
> +struct is_const
> +: public __bool_constant<__is_const(_Tp)>
> +{ };
> +#else
>template
>  struct is_const
>  : public false_type { };
> @@ -842,6 +848,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>template
>  struct is_const<_Tp const>
>  : public true_type { };
> +#endif
>  
>/// is_volatile
>template
> @@ -3331,10 +3338,15 @@ template 
>inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
>  #endif
>  
> +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
> +template 
> +  inline constexpr bool is_const_v = __is_const(_Tp);
> +#else
>  template 
>inline constexpr bool is_const_v = false;
>  template 
>inline constexpr bool is_const_v = true;
> +#endif
>  
>  #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
>  template 
> -- 
> 2.44.0
> 
> 
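
The dispatch pattern in the patch can be sketched outside libstdc++ roughly
as follows.  This is an illustration under assumptions, not the library's
code: the namespace demo, the macro DEMO_USE_BUILTIN_IS_CONST and the local
bool_constant are made up here, and __has_builtin is used in place of the
library-internal _GLIBCXX_USE_BUILTIN_TRAIT macro.

```cpp
namespace demo
{
  // Local stand-in for the library's __bool_constant.
  template<bool B>
    struct bool_constant { static constexpr bool value = B; };

  // Use the built-in trait only when the compiler reports it.
#if defined(__has_builtin)
# if __has_builtin(__is_const)
#  define DEMO_USE_BUILTIN_IS_CONST 1
# endif
#endif

#ifdef DEMO_USE_BUILTIN_IS_CONST
  // One primary template, answered directly by the compiler.
  template<typename T>
    struct is_const : bool_constant<__is_const(T)> { };
#else
  // Fallback: primary template plus a partial specialization.
  template<typename T>
    struct is_const : bool_constant<false> { };

  template<typename T>
    struct is_const<T const> : bool_constant<true> { };
#endif
}
```

Either branch should give the same answers, for example true for
int *const and false for const int *, since only top-level const counts.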



Re: [PATCH] libstdc++: Implement ranges::concat_view from P2542R7

2024-05-23 Thread Patrick Palka
On Mon, 29 Apr 2024, Jonathan Wakely wrote:

> On Mon, 22 Apr 2024 at 22:43, Patrick Palka wrote:
> >
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk?  More tests
> > are needed but I figured I'd submit this now for possible consideration into
> > GCC 14 since we're getting close to release..  All changes are confined to
> > C++26.
> 
> OK for trunk. Maybe we can backport it for 14.2 later, but not now.
> Sorry for the review being slow.

No worries, thanks a lot!  I pushed this now, though I realized I didn't
implement the latest/approved revision of the paper, R8 vs R7, which
notably changes the constraints on operator-(it, default_sentinel).
Since that seems to be the only significant change, I reckon I'll fix
that in a follow-up patch.
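
For readers who want to try the new view without breaking builds against older
libraries, here is a minimal sketch that keys off the feature-test macro the
patch defines (__cpp_lib_ranges_concat, value 202403).  The helper name
sum_both is made up for illustration and is not part of the patch; the fallback
branch is just two plain loops.

```cpp
#include <version>   // feature-test macros, including __cpp_lib_ranges_concat
#include <vector>
#if defined(__cpp_lib_ranges_concat) && __cpp_lib_ranges_concat >= 202403L
# include <ranges>
#endif

// Hypothetical helper: sum the elements of two vectors as one sequence.
inline int sum_both(const std::vector<int>& a, const std::vector<int>& b)
{
  int total = 0;
#if defined(__cpp_lib_ranges_concat) && __cpp_lib_ranges_concat >= 202403L
  // C++26 path: iterate a single concatenated view over both vectors.
  for (int x : std::views::concat(a, b))
    total += x;
#else
  // Fallback for libraries that do not provide views::concat.
  for (int x : a)
    total += x;
  for (int x : b)
    total += x;
#endif
  return total;
}
```

Either branch should give the same result, so callers need not care which one
was compiled.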

> 
> 
> >
> > -- >8 --
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/bits/version.def (ranges_concat): Define.
> > * include/bits/version.h: Regenerate.
> > * include/std/ranges (__detail::__concat_reference_t): Define
> > for C++26.
> > (__detail::__concat_value_t): Likewise.
> > (__detail::__concat_rvalue_reference_t): Likewise.
> > (__detail::__concat_indirectly_readable_impl): Likewise.
> > (__detail::__concat_indirectly_readable): Likewise.
> > (__detail::__concatable): Likewise.
> > (__detail::__all_but_last_common): Likewise.
> > (__detail::__concat_is_random_access): Likewise.
> > (__detail::__concat_is_bidirectional): Likewise.
> > (__detail::__last_is_common): Likewise.
> > (concat_view): Likewise.
> > (__detail::__concat_view_iter_cat): Likewise.
> > (concat_view::iterator): Likewise.
> > (views::__detail::__can_concat_view): Likewise.
> > (views::_Concat, views::concat): Likewise.
> > * testsuite/std/ranges/concat/1.cc: New test.
> > ---
> >  libstdc++-v3/include/bits/version.def |   8 +
> >  libstdc++-v3/include/bits/version.h   |  10 +
> >  libstdc++-v3/include/std/ranges   | 584 ++
> >  libstdc++-v3/testsuite/std/ranges/concat/1.cc |  61 ++
> >  4 files changed, 663 insertions(+)
> >  create mode 100644 libstdc++-v3/testsuite/std/ranges/concat/1.cc
> >
> > diff --git a/libstdc++-v3/include/bits/version.def 
> > b/libstdc++-v3/include/bits/version.def
> > index 5c0477fb61e..af13090c094 100644
> > --- a/libstdc++-v3/include/bits/version.def
> > +++ b/libstdc++-v3/include/bits/version.def
> > @@ -1796,6 +1796,14 @@ ftms = {
> >};
> >  };
> >
> > +ftms = {
> > +  name = ranges_concat;
> > +  values = {
> > +v = 202403;
> > +cxxmin = 26;
> > +  };
> > +};
> > +
> >  // Standard test specifications.
> >  stds[97] = ">= 199711L";
> >  stds[03] = ">= 199711L";
> > diff --git a/libstdc++-v3/include/bits/version.h 
> > b/libstdc++-v3/include/bits/version.h
> > index 65e708c73fb..1f27bfe050d 100644
> > --- a/libstdc++-v3/include/bits/version.h
> > +++ b/libstdc++-v3/include/bits/version.h
> > @@ -2003,4 +2003,14 @@
> >  #endif /* !defined(__cpp_lib_to_string) && 
> > defined(__glibcxx_want_to_string) */
> >  #undef __glibcxx_want_to_string
> >
> > +#if !defined(__cpp_lib_ranges_concat)
> > +# if (__cplusplus >  202302L)
> > +#  define __glibcxx_ranges_concat 202403L
> > +#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_ranges_concat)
> > +#   define __cpp_lib_ranges_concat 202403L
> > +#  endif
> > +# endif
> > +#endif /* !defined(__cpp_lib_ranges_concat) && 
> > defined(__glibcxx_want_ranges_concat) */
> > +#undef __glibcxx_want_ranges_concat
> > +
> >  #undef __glibcxx_want_all
> > diff --git a/libstdc++-v3/include/std/ranges 
> > b/libstdc++-v3/include/std/ranges
> > index afce818376b..28a39bf6f34 100644
> > --- a/libstdc++-v3/include/std/ranges
> > +++ b/libstdc++-v3/include/std/ranges
> > @@ -55,6 +55,7 @@
> >  #define __glibcxx_want_ranges_as_const
> >  #define __glibcxx_want_ranges_as_rvalue
> >  #define __glibcxx_want_ranges_cartesian_product
> > +#define __glibcxx_want_ranges_concat
> >  #define __glibcxx_want_ranges_chunk
> >  #define __glibcxx_want_ranges_chunk_by
> >  #define __glibcxx_want_ranges_enumerate
> > @@ -9514,6 +9515,589 @@ namespace __detail
> >  } // namespace ranges
> >  #endif // __cpp_lib_ranges_to_container
> >
> > +#if __cpp_lib_ranges_concat // C++ >= C++26
> > +namespace ranges
> > +{
> > +  namespace __detail
> > +  {
> > > +    template<typename... _Rs>
> > > +      using __concat_reference_t = common_reference_t<range_reference_t<_Rs>...>;
> > > +
> > > +    template<typename... _Rs>
> > > +      using __concat_value_t = common_type_t<range_value_t<_Rs>...>;
> > > +
> > > +    template<typename... _Rs>
> > > +      using __concat_rvalue_reference_t
> > > +       = common_reference_t<range_rvalue_reference_t<_Rs>...>;
> > > +
> > > +    template<typename _Ref, typename _RRef, typename _It>
> > > +      concept __concat_indirectly_readable_impl = requires (const _It __it) {
> > > +       { *__it } -> convertible_to<_Ref>;
> > > +       { ranges::iter_move(__it) } -> convertible_to<_RRef>;
> > > +      };
> > +
> > +template
> > +  concept __concat_indirectly_readable
> > +

Re: [C PATCH]: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-23 Thread Ian Lance Taylor
On Thu, May 23, 2024 at 2:48 PM Martin Uecker  wrote:
>
> Am Donnerstag, dem 23.05.2024 um 14:30 -0700 schrieb Ian Lance Taylor:
> > On Thu, May 23, 2024 at 2:00 PM Joseph Myers  wrote:
> > >
> > > On Tue, 21 May 2024, Martin Uecker wrote:
> > > >
> > > > C: allow aliasing of compatible types derived from enumeral types 
> > > > [PR115157]
> > > >
> > > > Aliasing of enumeral types with the underlying integer is now 
> > > > allowed
> > > > by setting the aliasing set to zero.  But this does not allow 
> > > > aliasing
> > > > of derived types which are compatible as required by ISO C.  
> > > > Instead,
> > > > initially set structural equality.  Then set TYPE_CANONICAL and 
> > > > update
> > > > pointers and main variants when the type is completed (as done for
> > > > structures and unions in C23).
> > > >
> > > > PR 115157
> > > >
> > > > gcc/c/
> > > > > * c-decl.cc (shadow_tag_warned,parse_xref_tag,start_enum,
> > > > > finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / 
> > > > > TYPE_CANONICAL.
> > > > > * c-objc-common.cc (get_alias_set): Remove special case.
> > > > (get_aka_type): Add special case.
> > > >
> > > > gcc/
> > > > * godump.cc (go_output_typedef): use TYPE_MAIN_VARIANT 
> > > > instead
> > > > of TYPE_CANONICAL.
> > > >
> > > > gcc/testsuite/
> > > > * gcc.dg/enum-alias-1.c: New test.
> > > > * gcc.dg/enum-alias-2.c: New test.
> > > > * gcc.dg/enum-alias-3.c: New test.
> > >
> > > OK, in the absence of objections on middle-end or Go grounds within the
> > > next week.
> >
> > The godump.cc patch is
> >
> >&& (TYPE_CANONICAL (TREE_TYPE (decl)) == NULL_TREE
> >   || !container->decls_seen.contains
> > -   (TYPE_CANONICAL (TREE_TYPE (decl)
> > +   (TYPE_MAIN_VARIANT (TREE_TYPE (decl)
> >  {
> >
> > What is the problem you are seeing?
>
> Test failures in godump-1.c
>
> >
> > This patch isn't right:
> >
> > 1) The code is saying if "X == NULL_TREE || !already_seen(X)".  This
> > patch is changing the latter X but not the former.  They should be
> > consistent.
>
> Maybe the X == NULL_TREE can be removed if we
> add TYPE_MAIN_VARIANTs instead?

If TYPE_MAIN_VARIANT is never NULL_TREE, then I agree that the
NULL_TREE test can be removed.

Ian


Re: [C PATCH]: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-23 Thread Martin Uecker
Am Donnerstag, dem 23.05.2024 um 14:30 -0700 schrieb Ian Lance Taylor:
> On Thu, May 23, 2024 at 2:00 PM Joseph Myers  wrote:
> > 
> > On Tue, 21 May 2024, Martin Uecker wrote:
> > > 
> > > C: allow aliasing of compatible types derived from enumeral types 
> > > [PR115157]
> > > 
> > > Aliasing of enumeral types with the underlying integer is now allowed
> > > by setting the aliasing set to zero.  But this does not allow aliasing
> > > of derived types which are compatible as required by ISO C.  Instead,
> > > initially set structural equality.  Then set TYPE_CANONICAL and update
> > > pointers and main variants when the type is completed (as done for
> > > structures and unions in C23).
> > > 
> > > PR 115157
> > > 
> > > gcc/c/
> > > > * c-decl.cc (shadow_tag_warned,parse_xref_tag,start_enum,
> > > > finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / 
> > > > TYPE_CANONICAL.
> > > > * c-objc-common.cc (get_alias_set): Remove special case.
> > > (get_aka_type): Add special case.
> > > 
> > > gcc/
> > > * godump.cc (go_output_typedef): use TYPE_MAIN_VARIANT instead
> > > of TYPE_CANONICAL.
> > > 
> > > gcc/testsuite/
> > > * gcc.dg/enum-alias-1.c: New test.
> > > * gcc.dg/enum-alias-2.c: New test.
> > > * gcc.dg/enum-alias-3.c: New test.
> > 
> > OK, in the absence of objections on middle-end or Go grounds within the
> > next week.
> 
> The godump.cc patch is
> 
>&& (TYPE_CANONICAL (TREE_TYPE (decl)) == NULL_TREE
>   || !container->decls_seen.contains
> -   (TYPE_CANONICAL (TREE_TYPE (decl)
> +   (TYPE_MAIN_VARIANT (TREE_TYPE (decl)
>  {
> 
> What is the problem you are seeing?

Test failures in godump-1.c

> 
> This patch isn't right:
> 
> 1) The code is saying if "X == NULL_TREE || !already_seen(X)".  This
> patch is changing the latter X but not the former.  They should be
> consistent.

Maybe the X == NULL_TREE can be removed if we
add TYPE_MAIN_VARIANTs instead?

> 
> 2) At the bottom of that conditional block is code that adds a value
> to container->decls_seen.  Today that code is adding TYPE_CANONICAL.
> If we change the condition to test TYPE_MAIN_VARIANT, then we need to
> add TYPE_MAIN_VARIANT to decls_seen.

Yes, obviously this is wrong. Thanks!

Martin
> 
> Hope that makes sense.
> 
> I don't know why the patch is required, but it's fine with those
> changes as long as the libgo tests continue to pass.


> 
> Ian



Re: [PATCH] c++: alias CTAD and copy deduction guide [PR115198]

2024-05-23 Thread Patrick Palka
On Thu, 23 May 2024, Jason Merrill wrote:

> On 5/23/24 14:06, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> > OK for trunk/14?
> > 
> > -- >8 --
> > 
> > Here we're neglecting to update DECL_NAME during the alias CTAD guide
> > transformation, which causes copy_guide_p to return false for the
> > transformed copy deduction guide since DECL_NAME is still __dguide_C
> > with TREE_TYPE C but it should be __dguide_A with TREE_TYPE A
> > (equivalently C).  This ultimately results in ambiguity during
> > overload resolution between the copy deduction guide vs copy ctor guide.
> > 
> > This patch makes us update DECL_NAME of a transformed guide accordingly
> > during alias CTAD.  This eventually needs to be done for inherited CTAD
> > too, but it's not clear what identifier to use there since it has to be
> > unique for each derived/base pair.  For
> > 
> >template struct A { ... };
> >template struct B : A { using A::A; }
> > 
> > at first glance it'd be reasonable to give inherited guides a name of
> > __dguide_B with TREE_TYPE A, but since that name is already
> > > used by B's own guides, its TREE_TYPE is already B.
> 
> Why can't it be the same __dguide_B with TREE_TYPE B?

Ah because copy_guide_p relies on TREE_TYPE in order to recognize a copy
deduction guide, and with that TREE_TYPE it would still incorrectly
return false for an inherited copy deduction guide, e.g.

  A(A) -> A

gets transformed into

  B(A) -> B

and A != B so copy_guide_p returns false.

But it just occurred to me that this TREE_TYPE clobbering of the
__dguide_foo identifier already happens if we have two class templates
with the same name in different namespaces, since the identifier
contains only the terminal name.  Maybe this suggests that we should
use a tree flag to track whether a guide is the copy deduction guide
instead of setting TREE_TYPE of DECL_NAME?
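
To make the clobbering concrete, here is a toy model, not GCC code: every type
and function name below is invented for illustration.  It assumes only what is
described above, namely that the guide identifier is built from the terminal
name and that one slot on the shared identifier records the class.

```cpp
#include <map>
#include <string>

// Toy stand-in for an interned identifier: one node per spelling, with a
// single slot that plays the role of TREE_TYPE (DECL_NAME).
struct Identifier {
  std::string spelling;
  std::string deduced_class;
};

// Interning table keyed by spelling only.
inline std::map<std::string, Identifier>& identifiers()
{
  static std::map<std::string, Identifier> table;
  return table;
}

// Build the guide name from the terminal name alone and record the class.
// A second class with the same terminal name overwrites the first setting.
inline Identifier& guide_name(const std::string& terminal_name,
                              const std::string& qualified_class)
{
  Identifier& id = identifiers()["__dguide_" + terminal_name];
  id.spelling = "__dguide_" + terminal_name;
  id.deduced_class = qualified_class;
  return id;
}

// A guide that carries its own flag does not depend on the shared slot.
struct Guide {
  const Identifier* name;
  bool is_copy_guide;
};
```

In this model, calling guide_name ("S", "N1::S") and then
guide_name ("S", "N2::S") returns the same node both times, and the second call
replaces the recorded class.  A per-guide flag like Guide::is_copy_guide is
unaffected by that, which is the appeal of the tree-flag idea.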

> 
> > PR c++/115198
> > 
> > gcc/cp/ChangeLog:
> > 
> > * pt.cc (alias_ctad_tweaks): Update DECL_NAME of a transformed
> > guide during alias CTAD.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/class-deduction-alias22.C: New test.
> > ---
> >   gcc/cp/pt.cc   |  9 -
> >   .../g++.dg/cpp2a/class-deduction-alias22.C | 14 ++
> >   2 files changed, 22 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
> > 
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index 0c4d96cf768..58873057abc 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -30304,13 +30304,14 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
> >any).  */
> >   enum { alias, inherited } ctad_kind;
> > -  tree atype, fullatparms, utype;
> > +  tree atype, fullatparms, utype, name;
> > if (TREE_CODE (tmpl) == TEMPLATE_DECL)
> >   {
> > ctad_kind = alias;
> > atype = TREE_TYPE (tmpl);
> > fullatparms = DECL_TEMPLATE_PARMS (tmpl);
> > utype = DECL_ORIGINAL_TYPE (DECL_TEMPLATE_RESULT (tmpl));
> > +  name = dguide_name (tmpl);
> >   }
> > else
> >   {
> > @@ -30318,6 +30319,10 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
> > atype = NULL_TREE;
> > fullatparms = TREE_PURPOSE (tmpl);
> > utype = TREE_VALUE (tmpl);
> > +  /* FIXME: What name should we give inherited guides?  It needs to be
> > +unique to the derived/base pair so that we don't clobber an earlier
> > +setting of TREE_TYPE.  */
> > +  name = NULL_TREE;
> >   }
> >   tsubst_flags_t complain = tf_warning_or_error;
> > @@ -30413,6 +30418,8 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
> > }
> >   if (g == error_mark_node)
> > continue;
> > + if (name)
> > +   DECL_NAME (g) = name;
> >   if (nfparms == 0)
> > {
> >   /* The targs are all non-dependent, so g isn't a template.  */
> > diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
> > b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
> > new file mode 100644
> > index 000..9c6c841166a
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
> > @@ -0,0 +1,14 @@
> > +// PR c++/115198
> > +// { dg-do compile { target c++20 } }
> > +
> > +template
> > +struct C {
> > +  C() = default;
> > +  C(const C&) = default;
> > +};
> > +
> > +template
> > +using A = C;
> > +
> > +C c;
> > +A a = c; // { dg-bogus "ambiguous" }
> 
> 



Re: [C PATCH]: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-23 Thread Ian Lance Taylor
On Thu, May 23, 2024 at 2:00 PM Joseph Myers  wrote:
>
> On Tue, 21 May 2024, Martin Uecker wrote:
> >
> > C: allow aliasing of compatible types derived from enumeral types 
> > [PR115157]
> >
> > Aliasing of enumeral types with the underlying integer is now allowed
> > by setting the aliasing set to zero.  But this does not allow aliasing
> > of derived types which are compatible as required by ISO C.  Instead,
> > initially set structural equality.  Then set TYPE_CANONICAL and update
> > pointers and main variants when the type is completed (as done for
> > structures and unions in C23).
> >
> > PR 115157
> >
> > gcc/c/
> > > * c-decl.cc (shadow_tag_warned,parse_xref_tag,start_enum,
> > > finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / TYPE_CANONICAL.
> > > * c-objc-common.cc (get_alias_set): Remove special case.
> > (get_aka_type): Add special case.
> >
> > gcc/
> > * godump.cc (go_output_typedef): use TYPE_MAIN_VARIANT instead
> > of TYPE_CANONICAL.
> >
> > gcc/testsuite/
> > * gcc.dg/enum-alias-1.c: New test.
> > * gcc.dg/enum-alias-2.c: New test.
> > * gcc.dg/enum-alias-3.c: New test.
>
> OK, in the absence of objections on middle-end or Go grounds within the
> next week.

The godump.cc patch is

   && (TYPE_CANONICAL (TREE_TYPE (decl)) == NULL_TREE
  || !container->decls_seen.contains
-   (TYPE_CANONICAL (TREE_TYPE (decl)
+   (TYPE_MAIN_VARIANT (TREE_TYPE (decl)
 {

What is the problem you are seeing?

This patch isn't right:

1) The code is saying if "X == NULL_TREE || !already_seen(X)".  This
patch is changing the latter X but not the former.  They should be
consistent.

2) At the bottom of that conditional block is code that adds a value
to container->decls_seen.  Today that code is adding TYPE_CANONICAL.
If we change the condition to test TYPE_MAIN_VARIANT, then we need to
add TYPE_MAIN_VARIANT to decls_seen.

Hope that makes sense.

I don't know why the patch is required, but it's fine with those
changes as long as the libgo tests continue to pass.
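
To spell out the consistency requirement in points 1 and 2, here is a small
stand-alone sketch.  It is not the godump.cc code: TypeNode, Container and
first_sighting are hypothetical names, and the sketch assumes the main variant
is never null, which is the condition under which the NULL_TREE test could be
dropped.

```cpp
#include <unordered_set>

// Hypothetical stand-in for a type node with two candidate lookup keys.
struct TypeNode {
  const TypeNode* canonical;     // may be null (structural equality)
  const TypeNode* main_variant;  // assumed never null in this sketch
};

struct Container {
  std::unordered_set<const TypeNode*> decls_seen;
};

// Returns true the first time a type's main variant is seen.  The same key
// is used for the lookup and for the insertion, so the two cannot disagree.
inline bool first_sighting(Container& c, const TypeNode& t)
{
  const TypeNode* key = t.main_variant;
  if (c.decls_seen.count(key) != 0)
    return false;
  c.decls_seen.insert(key);
  return true;
}
```

If the lookup used one key and the insertion another, a type whose two keys
differ would never be found in the set and would be emitted every time.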

Ian


Re: [PATCH] c++: alias CTAD and copy deduction guide [PR115198]

2024-05-23 Thread Jason Merrill

On 5/23/24 14:06, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk/14?

-- >8 --

Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C but it should be __dguide_A with TREE_TYPE A
(equivalently C).  This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias CTAD.  This eventually needs to be done for inherited CTAD
too, but it's not clear what identifier to use there since it has to be
unique for each derived/base pair.  For

   template struct A { ... };
   template struct B : A { using A::A; }

at first glance it'd be reasonable to give inherited guides a name of
__dguide_B with TREE_TYPE A, but since that name is already
used by B's own guides, its TREE_TYPE is already B.


Why can't it be the same __dguide_B with TREE_TYPE B?


PR c++/115198

gcc/cp/ChangeLog:

* pt.cc (alias_ctad_tweaks): Update DECL_NAME of a transformed
guide during alias CTAD.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias22.C: New test.
---
  gcc/cp/pt.cc   |  9 -
  .../g++.dg/cpp2a/class-deduction-alias22.C | 14 ++
  2 files changed, 22 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 0c4d96cf768..58873057abc 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -30304,13 +30304,14 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
   any).  */
  
enum { alias, inherited } ctad_kind;

-  tree atype, fullatparms, utype;
+  tree atype, fullatparms, utype, name;
if (TREE_CODE (tmpl) == TEMPLATE_DECL)
  {
ctad_kind = alias;
atype = TREE_TYPE (tmpl);
fullatparms = DECL_TEMPLATE_PARMS (tmpl);
utype = DECL_ORIGINAL_TYPE (DECL_TEMPLATE_RESULT (tmpl));
+  name = dguide_name (tmpl);
  }
else
  {
@@ -30318,6 +30319,10 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
atype = NULL_TREE;
fullatparms = TREE_PURPOSE (tmpl);
utype = TREE_VALUE (tmpl);
+  /* FIXME: What name should we give inherited guides?  It needs to be
+unique to the derived/base pair so that we don't clobber an earlier
+setting of TREE_TYPE.  */
+  name = NULL_TREE;
  }
  
tsubst_flags_t complain = tf_warning_or_error;

@@ -30413,6 +30418,8 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
}
  if (g == error_mark_node)
continue;
+ if (name)
+   DECL_NAME (g) = name;
  if (nfparms == 0)
{
  /* The targs are all non-dependent, so g isn't a template.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C 
b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
new file mode 100644
index 000..9c6c841166a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
@@ -0,0 +1,14 @@
+// PR c++/115198
+// { dg-do compile { target c++20 } }
+
+template
+struct C {
+  C() = default;
+  C(const C&) = default;
+};
+
+template
+using A = C;
+
+C c;
+A a = c; // { dg-bogus "ambiguous" }




Re: [PATCH] c++/modules: Improve errors for bad module-directives [PR115200]

2024-05-23 Thread Jason Merrill

On 5/23/24 10:54, Nathaniel Shead wrote:

Bootstrapped and regtested (so far just modules.exp and dg.exp) on
x86_64-pc-linux-gnu, OK for trunk if full regtest succeeds?

-- >8 --

This fixes an ICE when a module directive is not given at global scope.
Although not explicitly mentioned, it seems implied from [basic.link] p1
and [module.global.frag] that a module-declaration must appear at the
global scope after preprocessing.  Apart from this the patch also
slightly improves the errors given when accidentally using a module
control-line in other situations where it is not expected.


This could also come up with something like

int module;
int i =
  module; // error, unexpected module directive

Adding a line break seems like confusing advice for this problem; 
rather, they need to remove the line break before 'module'.  And 
possibly add it in somewhere else, but the problem is that 'module' is 
the first token on the line.  And if I put that in a namespace,


namespace A {
  int module;
  int i =
module; // error, unexpected module directive
}

the problem is the same, but we get a different diagnostic.

I think I'd leave the "must be at global scope" diagnostic to 
cp_parser_module_declaration, and assume that if we see a module keyword 
at function scope it wasn't intended to be a module directive.
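
For comparison, a minimal sketch of the form that avoids the problem: the same
declarations as in the namespace example, with an initializer added so the
result can be checked, and with 'module' kept off the start of a line.  This
only illustrates the "first token on the line" point; how a given compiler and
option set treat the line-broken form is not something this sketch shows.

```cpp
namespace A {
  int module = 42;
  // 'module' follows '=' on the same line, so it is not the first token on
  // its line and cannot be mistaken for the start of a module control-line.
  int i = module;
}
```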



PR c++/115200

gcc/cp/ChangeLog:

* parser.cc (cp_parser_error_1): Special-case unexpected module
directives for better diagnostics.
(cp_parser_module_declaration): Check that the module
declaration is at global scope.

gcc/testsuite/ChangeLog:

* g++.dg/modules/mod-decl-1.C: Update error messages.
* g++.dg/modules/mod-decl-6.C: New test.
* g++.dg/modules/mod-decl-7.C: New test.
* g++.dg/modules/mod-decl-8.C: New test.
* g++.dg/modules/mod-decl-8.h: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/parser.cc  | 32 +++
  gcc/testsuite/g++.dg/modules/mod-decl-1.C |  6 ++---
  gcc/testsuite/g++.dg/modules/mod-decl-6.C | 11 
  gcc/testsuite/g++.dg/modules/mod-decl-7.C | 12 +
  gcc/testsuite/g++.dg/modules/mod-decl-8.C |  9 +++
  gcc/testsuite/g++.dg/modules/mod-decl-8.h |  4 +++
  6 files changed, 71 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-6.C
  create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-7.C
  create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-8.C
  create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-8.h

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 476ddc0d63a..1c0543ba154 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -3230,6 +3230,31 @@ cp_parser_error_1 (cp_parser* parser, const char* gmsgid,
return;
  }
  
+  if (cp_token_is_module_directive (token))

+{
+  cp_token *next = (token->keyword == RID__EXPORT
+   ? cp_lexer_peek_nth_token (parser->lexer, 2) : token);
+
+  auto_diagnostic_group d;
+  error_at (token->location, "unexpected module directive");
+  tree scope = current_scope ();
+  if (next->keyword == RID__MODULE
+ && token->main_source_p
+ && scope != global_namespace)
+   {
+ /* Nicer error for unterminated scopes in GMF includes.  */
+ inform (token->location,
+ "module-declaration must be at global scope");
+ inform (location_of (scope), "scope opened here");
+   }
+  else
+   inform (token->location, "perhaps insert a line break, or other"
+   " disambiguation, to prevent this being considered a"
+   " module control-line");
+  cp_parser_skip_to_pragma_eol (parser, token);
+  return;
+}
+
/* If this is actually a conflict marker, report it as such.  */
if (token->type == CPP_LSHIFT
|| token->type == CPP_RSHIFT
@@ -15135,12 +15160,19 @@ cp_parser_module_declaration (cp_parser *parser, 
module_parse mp_state,
parser->lexer->in_pragma = true;
cp_token *token = cp_lexer_consume_token (parser->lexer);
  
+  tree scope = current_scope ();

if (flag_header_unit)
  {
error_at (token->location,
"module-declaration not permitted in header-unit");
goto skip_eol;
  }
+  else if (scope != global_namespace)
+{
+  error_at (token->location, "module-declaration must be at global scope");
+  inform (DECL_SOURCE_LOCATION (scope), "scope opened here");
+  goto skip_eol;
+}
else if (mp_state == MP_FIRST && !exporting
&& cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
  {
diff --git a/gcc/testsuite/g++.dg/modules/mod-decl-1.C 
b/gcc/testsuite/g++.dg/modules/mod-decl-1.C
index 23d34483dd7..84fa31c7024 100644
--- a/gcc/testsuite/g++.dg/modules/mod-decl-1.C
+++ b/gcc/testsuite/g++.dg/modules/mod-decl-1.C
@@ -10,17 +10,17 @@ module foo.second; // { dg-error "only permitted as" }
  
  namespace Foo

  {

Re: [C PATCH]: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-23 Thread Joseph Myers
On Tue, 21 May 2024, Martin Uecker wrote:
> 
> C: allow aliasing of compatible types derived from enumeral types 
> [PR115157]
> 
> Aliasing of enumeral types with the underlying integer is now allowed
> by setting the aliasing set to zero.  But this does not allow aliasing
> of derived types which are compatible as required by ISO C.  Instead,
> initially set structural equality.  Then set TYPE_CANONICAL and update
> pointers and main variants when the type is completed (as done for
> structures and unions in C23).
> 
> PR 115157
> 
> gcc/c/
> * c-decl.cc (shadow_tag_warned,parse_xref_tag,start_enum,
> finish_enum): Set SET_TYPE_STRUCTURAL_EQUALITY / TYPE_CANONICAL.
> * c-objc-common.cc (get_alias_set): Remove special case.
> (get_aka_type): Add special case.
> 
> gcc/
> * godump.cc (go_output_typedef): use TYPE_MAIN_VARIANT instead
> of TYPE_CANONICAL.
> 
> gcc/testsuite/
> * gcc.dg/enum-alias-1.c: New test.
> * gcc.dg/enum-alias-2.c: New test.
> * gcc.dg/enum-alias-3.c: New test.

OK, in the absence of objections on middle-end or Go grounds within the 
next week.

-- 
Joseph S. Myers
josmy...@redhat.com



[COMMITTED 11/12] - Make gori_map a shared component.

2024-05-23 Thread Andrew MacLeod
This patch moves the gori_map object out of the gori object and into 
the range query.  It is required by gori, and will be created 
simultaneously with gori.


The dependency data it manages has uses outside of GORI, and this makes 
it easier to access by the fold_using_range routines, and others.


Documentation is coming :-P,

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.
From e81eafd81d76cf4e8b03089a94857b4b52a66bc7 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 21 May 2024 14:20:52 -0400
Subject: [PATCH 11/12] Make gori_map a shared component.

Move gori_map dependency and import/export object into a range query and
construct it simultaneously with a gori object.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Use gori_ssa.
	(ranger_cache::dump): Likewise.
	(ranger_cache::get_global_range): Likewise.
	(ranger_cache::set_global_range): Likewise.
	(ranger_cache::register_inferred_value): Likewise.
	* gimple-range-edge.h (gimple_outgoing_range::map): Remove.
	* gimple-range-fold.cc (fold_using_range::range_of_range_op): Use
	gori_ssa.
	(fold_using_range::range_of_address): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	(fur_source::register_outgoing_edges): Likewise.
	* gimple-range-fold.h (fur_source::query): Make const.
	(gori_ssa): New.
	* gimple-range-gori.cc (gori_map::dump): Use 'this' pointer.
	(gori_compute::gori_compute): Construct with a gori_map.
	* gimple-range-gori.h (gori_compute:gori_compute): Change
	prototype.
	(gori_compute::map): Delete.
	(gori_compute::m_map): Change to a reference.
	(FOR_EACH_GORI_IMPORT_NAME): Change parameter gori to gorimap.
	(FOR_EACH_GORI_EXPORT_NAME): Likewise.
	* gimple-range-path.cc (path_range_query::compute_ranges_in_block):
	Use gori_ssa method.
	(path_range_query::compute_exit_dependencies): Likewise.
	* gimple-range.cc (gimple_ranger::range_of_stmt): Likewise.
	(gimple_ranger::register_transitive_inferred_ranges): Likewise.
	* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges):
	Likewise.
	* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
	* tree-vrp.cc (remove_unreachable::handle_early): Likewise.
	(remove_unreachable::remove_and_update_globals): Likewise.
	* value-query.cc (range_query::create_gori): Create gori map.
	(range_query::share_query): Copy gori map member.
	(range_query::range_query): Initialize gori_map member.
	* value-query.h (range_query::gori_ssa): New.
	(range_query::m_map): New.
---
 gcc/gimple-range-cache.cc  | 16 
 gcc/gimple-range-edge.h|  1 -
 gcc/gimple-range-fold.cc   | 22 +++---
 gcc/gimple-range-fold.h|  4 +++-
 gcc/gimple-range-gori.cc   |  9 +
 gcc/gimple-range-gori.h| 14 +++---
 gcc/gimple-range-path.cc   |  6 +++---
 gcc/gimple-range.cc|  6 +++---
 gcc/tree-ssa-dom.cc|  2 +-
 gcc/tree-ssa-threadedge.cc |  2 +-
 gcc/tree-vrp.cc|  8 
 gcc/value-query.cc |  6 +-
 gcc/value-query.h  |  2 ++
 13 files changed, 53 insertions(+), 45 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index e75cac66902..a511a2c3a4c 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -969,7 +969,7 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
 {
   basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
   if (bb)
-	gori ().map ()->exports (bb);
+	gori_ssa ()->exports (bb);
 }
   m_update = new update_list ();
 }
@@ -1000,7 +1000,7 @@ ranger_cache::dump (FILE *f)
 void
 ranger_cache::dump_bb (FILE *f, basic_block bb)
 {
-  gori ().map ()->dump (f, bb, false);
+  gori_ssa ()->dump (f, bb, false);
   m_on_entry.dump (f, bb);
   m_relation->dump (f, bb);
 }
@@ -1033,8 +1033,8 @@ ranger_cache::get_global_range (vrange &r, tree name, bool &current_p)
   current_p = false;
   if (had_global)
 current_p = r.singleton_p ()
-		|| m_temporal->current_p (name, gori ().map ()->depend1 (name),
-	  gori ().map ()->depend2 (name));
+		|| m_temporal->current_p (name, gori_ssa ()->depend1 (name),
+	  gori_ssa ()->depend2 (name));
   else
 {
   // If no global value has been set and value is VARYING, fold the stmt
@@ -1071,8 +1071,8 @@ ranger_cache::set_global_range (tree name, const vrange &r, bool changed)
   if (!changed)
 {
   // If there are dependencies, make sure this is not out of date.
-  if (!m_temporal->current_p (name, gori ().map ()->depend1 (name),
- gori ().map ()->depend2 (name)))
+  if (!m_temporal->current_p (name, gori_ssa ()->depend1 (name),
+ gori_ssa ()->depend2 (name)))
 	m_temporal->set_timestamp (name);
   return;
 }
@@ -1097,7 +1097,7 @@ ranger_cache::set_global_range (tree name, const vrange &r, bool changed)
 
   if (r.singleton_p ()
   || (POINTER_TYPE_P (TREE_TYPE (name)) && r.nonzero_p ()))
-gori ().map ()->set_range_invariant (name);
+gori_ssa ()->set_range_invariant (name);
   m_temporal->set_timestamp (name);
 }

[COMMITTED 06/12] tree-optimization/113879 - Add inferred ranges for range-ops based statements.

2024-05-23 Thread Andrew MacLeod
gimple_range_fold contains some shorthand fold_range routines for easy 
user consumption of the range-ops interface, but there are no equivalent 
routines for op1_range and op2_range.  This patch provides basic versions.

I have started range-op documentation, but it's very early days so not 
that useful yet: https://gcc.gnu.org/wiki/AndrewMacLeod/RangeOperator


Any range-op entry which has an op1_range or op2_range implemented can 
potentially also provide inferred ranges.  This is a step towards PR 
113879.  Default is currently OFF for performance reasons as it 
dramatically increases the number of inferred ranges beyond what the 
current engine is comfortable with, but the functionality will now be 
there to move towards fixing the PR.  It might be appropriate for -O3, 
but I'll hold off for the moment.
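
As a rough picture of what solving for an operand means, here is a toy interval
sketch for lhs = op1 + op2.  It is not the range-ops implementation: Interval
and the helper functions are invented for illustration, wrapping is ignored,
and "VARYING" is modelled as the full range of a 32-bit int.

```cpp
#include <algorithm>
#include <climits>

// Toy closed interval [lo, hi]; not GCC's vrange.
struct Interval { long long lo; long long hi; };

// Stand-in for VARYING: every value of a 32-bit int.
inline Interval varying() { return {INT_MIN, INT_MAX}; }

inline bool is_varying(Interval r)
{ return r.lo == INT_MIN && r.hi == INT_MAX; }

// Clamp a computed interval back into the 32-bit int range.
inline Interval clamp_to_int(Interval r)
{
  return {std::max<long long>(r.lo, INT_MIN),
          std::min<long long>(r.hi, INT_MAX)};
}

// Solve lhs = op1 + op2 for op1, given ranges for lhs and op2.
// Since op1 = lhs - op2, the bounds are [lhs.lo - op2.hi, lhs.hi - op2.lo].
inline Interval op1_range_plus(Interval lhs, Interval op2)
{
  return clamp_to_int({lhs.lo - op2.hi, lhs.hi - op2.lo});
}
```

With a known lhs the result can be narrower than the type, which is what makes
it useful as an inferred range.  With a VARYING lhs and a VARYING op2 the
result is VARYING again, the case the patch declines to record.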


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.







From 985581b05f32b62df15b60833a8a57544dbbd739 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Thu, 2 May 2024 12:23:18 -0400
Subject: [PATCH 06/12] Add inferred ranges for range-ops based statements.

Gimple_range_fold contains some shorthand fold_range routines for
easy user consumption of the range-ops interface, but there are no equivalent
routines for op1_range and op2_range.  This patch provides basic versions.

Any range-op entry which has an op1_range or op2_range implemented can
potentially also provide inferred ranges.  This is a step towards
PR 113879.  Default is currently OFF for performance reasons as it
dramatically increases the number of inferred ranges.

	PR tree-optimization/113879
	* gimple-range-fold.cc (op1_range): New.
	(op2_range): New.
	* gimple-range-fold.h (op1_range): New prototypes.
	(op2_range): New prototypes.
	* gimple-range-infer.cc (gimple_infer_range::add_range): Do not
	add an inferred range if it is VARYING.
	(gimple_infer_range::gimple_infer_range): Add inferred ranges
	for any range-op statements if requested.
	* gimple-range-infer.h (gimple_infer_range): Add parameter.
---
 gcc/gimple-range-fold.cc  | 71 +++
 gcc/gimple-range-fold.h   |  7 
 gcc/gimple-range-infer.cc | 41 +-
 gcc/gimple-range-infer.h  |  2 +-
 4 files changed, 119 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index 357a1beabd1..9e9c5960972 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -328,6 +328,77 @@ fold_range (vrange &r, gimple *s, edge on_edge, range_query *q)
   return f.fold_stmt (r, s, src);
 }
 
+// Calculate op1 on statement S with LHS into range R using range query Q
+// to resolve any other operands.
+
+bool
+op1_range (vrange &r, gimple *s, const vrange &lhs, range_query *q)
+{
+  gimple_range_op_handler handler (s);
+  if (!handler)
+    return false;
+
+  fur_stmt src (s, q);
+
+  tree op2_expr = handler.operand2 ();
+  if (!op2_expr)
+    return handler.calc_op1 (r, lhs);
+
+  Value_Range op2 (TREE_TYPE (op2_expr));
+  if (!src.get_operand (op2, op2_expr))
+    return false;
+
+  return handler.calc_op1 (r, lhs, op2);
+}
+
+// Calculate op1 on statement S into range R using range query Q.
+// LHS is set to VARYING in this case.
+
+bool
+op1_range (vrange &r, gimple *s, range_query *q)
+{
+  tree lhs_type = gimple_range_type (s);
+  if (!lhs_type)
+    return false;
+  Value_Range lhs_range;
+  lhs_range.set_varying (lhs_type);
+  return op1_range (r, s, lhs_range, q);
+}
+
+// Calculate op2 on statement S with LHS into range R using range query Q
+// to resolve any other operands.
+
+bool
+op2_range (vrange &r, gimple *s, const vrange &lhs, range_query *q)
+{
+
+  gimple_range_op_handler handler (s);
+  if (!handler)
+    return false;
+
+  fur_stmt src (s, q);
+
+  Value_Range op1 (TREE_TYPE (handler.operand1 ()));
+  if (!src.get_operand (op1, handler.operand1 ()))
+    return false;
+
+  return handler.calc_op2 (r, lhs, op1);
+}
+
+// Calculate op2 on statement S into range R using range query Q.
+// LHS is set to VARYING in this case.
+
+bool
+op2_range (vrange &r, gimple *s, range_query *q)
+{
+  tree lhs_type = gimple_range_type (s);
+  if (!lhs_type)
+    return false;
+  Value_Range lhs_range;
+  lhs_range.set_varying (lhs_type);
+  return op2_range (r, s, lhs_range, q);
+}
+
 // Provide a fur_source which can be used to determine any relations on
 // a statement.  It manages the callback from fold_using_ranges to determine
 // a relation_trio for a statement.
diff --git a/gcc/gimple-range-fold.h b/gcc/gimple-range-fold.h
index 1925fb899e3..d974b0192c8 100644
--- a/gcc/gimple-range-fold.h
+++ b/gcc/gimple-range-fold.h
@@ -43,6 +43,13 @@ bool fold_range (vrange , gimple *s, vrange , vrange ,
 bool fold_range (vrange , gimple *s, unsigned num_elements, vrange **vector,
 		 range_query *q = NULL);
 
+// Calculate op1 on stmt S.
+bool op1_range (vrange &, gimple *s, range_query *q = NULL);
+bool op1_range (vrange &, gimple *s, const vrange &lhs, range_query *q = NULL);
+// Calculate op2 on 

[COMMITTED 09/12] - Gori_compute inherits from gimple_outgoing_range.

2024-05-23 Thread Andrew MacLeod
This patch makes gimple_outgoing_range a base class for the GORI API, 
and provides basic routines for the SSA-NAME versions returning false.   
gori_compute now inherits from gimple_outgoing_range and no longer needs 
it as a private member. This makes far more sense as GORI is adding the 
ability to calculate SSA_NAMEs on edges in addition to the basic static 
edge ranges.  It also renames outgoing_edge_range_p to edge_range_p for 
consistency with the static edge range routine.
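
A rough sketch of the shape of this change, using hypothetical toy classes rather than the real GCC ones (ToyRange, ToyOutgoingRange and ToyGori are invented for illustration).  The base class answers the static TRUE/FALSE edge query itself, and supplies a name-based overload that simply returns false.  The derived class overrides that overload to give a real answer.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a range; not a GCC class.
struct ToyRange
{
  int lo = 0;
  int hi = 0;
};

// Base class: knows only the static edge ranges.  The name-based query
// defaults to "no answer".
class ToyOutgoingRange
{
public:
  virtual ~ToyOutgoingRange () = default;

  // Static edge query: the TRUE edge of a condition is [1, 1] and the
  // FALSE edge is [0, 0].
  bool edge_range_p (ToyRange &r, bool true_edge) const
  {
    r.lo = r.hi = true_edge ? 1 : 0;
    return true;
  }

  // Name-based query: the base class cannot compute this.
  virtual bool edge_range_p (ToyRange &, bool, const std::string &) const
  {
    return false;
  }
};

// Derived class: adds the ability to calculate a named value on an edge.
class ToyGori : public ToyOutgoingRange
{
public:
  using ToyOutgoingRange::edge_range_p;  // Keep the static overload visible.

  bool edge_range_p (ToyRange &r, bool true_edge,
		     const std::string &name) const override
  {
    // Pretend every edge tests "x > 10" with x known to be in [0, 100].
    if (name != "x")
      return false;
    if (true_edge)
      {
	r.lo = 11;
	r.hi = 100;
      }
    else
      {
	r.lo = 0;
	r.hi = 10;
      }
    return true;
  }
};
```

Code holding a reference to the base class can call the name-based routine unconditionally.  It gets false from a plain ToyOutgoingRange and a calculated range from a ToyGori, so no separate member object or null check is needed.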


The basic API for static edges (including switch ranges) is documented 
here. https://gcc.gnu.org/wiki/AndrewMacLeod/GimpleOutgoingRange


The more advanced GORI ssa-name processing engine has not been written 
yet.  It's on the to-do list :-)


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.


From 8feb69600dd696fb8a6e3b88b7d159ced5cb0eb9 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Thu, 9 May 2024 16:34:12 -0400
Subject: [PATCH 09/12] Gori_compute inherits from gimple_outgoing_range.

Make gimple_outgoing_range a base class for the GORI API, and provide
base routines returning false.   gori_compute inherits from
gimple_outgoing_range and no longer needs it as a private member.
Rename outgoing_edge_range_p to edge_range_p.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust
	m_gori constructor.
	(ranger_cache::edge_range): Use renamed edge_range_p name.
	(ranger_cache::range_from_dom): Likewise.
	* gimple-range-edge.h (gimple_outgoing_range::condexpr_adjust): New.
	(gimple_outgoing_range::has_edge_range_p): New.
	(gimple_outgoing_range::dump): New.
	(gimple_outgoing_range::compute_operand_range): New.
	(gimple_outgoing_range::map): New.
	* gimple-range-fold.cc (fur_source::register_outgoing_edges): Use
	renamed edge_range_p routine.
	* gimple-range-gori.cc (gori_compute::gori_compute): Adjust
	constructor.
	(gori_compute::~gori_compute): New.
	(gori_compute::edge_range_p): Rename from outgoing_edge_range_p
	and use inherited routine instead of member method.
	* gimple-range-gori.h (class gori_compute): Inherit from
	gimple_outgoing_range, adjust prototypes.
	(gori_compute::outgoing): Delete.
	* gimple-range-path.cc (path_range_query::compute_ranges_in_block): Use
	renamed edge_range_p routine.
	* tree-ssa-loop-unswitch.cc (evaluate_control_stmt_using_entry_checks):
	Likewise.
---
 gcc/gimple-range-cache.cc |  6 +++---
 gcc/gimple-range-edge.h   | 15 ++-
 gcc/gimple-range-fold.cc  |  4 ++--
 gcc/gimple-range-gori.cc  | 13 -
 gcc/gimple-range-gori.h   | 10 +-
 gcc/gimple-range-path.cc  |  4 ++--
 gcc/tree-ssa-loop-unswitch.cc |  4 ++--
 7 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index c52475852a9..40e4baa6289 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -950,7 +950,7 @@ update_list::pop ()
 // --
 
 ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
-		: m_gori (not_executable_flag)
+  : m_gori (not_executable_flag, param_vrp_switch_limit)
 {
   m_workback.create (0);
   m_workback.safe_grow_cleared (last_basic_block_for_fn (cfun));
@@ -1178,7 +1178,7 @@ ranger_cache::edge_range (vrange , edge e, tree name, enum rfd_mode mode)
   if ((e->flags & (EDGE_EH | EDGE_ABNORMAL)) == 0)
 infer_oracle ().maybe_adjust_range (r, name, e->src);
   Value_Range er (TREE_TYPE (name));
-  if (m_gori.outgoing_edge_range_p (er, e, name, *this))
+  if (m_gori.edge_range_p (er, e, name, *this))
 r.intersect (er);
   return true;
 }
@@ -1738,7 +1738,7 @@ ranger_cache::range_from_dom (vrange , tree name, basic_block start_bb,
 
   edge e = single_pred_edge (prev_bb);
   bb = e->src;
-  if (m_gori.outgoing_edge_range_p (er, e, name, *this))
+  if (m_gori.edge_range_p (er, e, name, *this))
 	{
 	  r.intersect (er);
 	  // If this is a normal edge, apply any inferred ranges.
diff --git a/gcc/gimple-range-edge.h b/gcc/gimple-range-edge.h
index ce8b04f6bad..be1f0c2cc15 100644
--- a/gcc/gimple-range-edge.h
+++ b/gcc/gimple-range-edge.h
@@ -48,9 +48,22 @@ class gimple_outgoing_range
 {
 public:
   gimple_outgoing_range (int max_sw_edges = 0);
-  ~gimple_outgoing_range ();
+  virtual ~gimple_outgoing_range ();
   gimple *edge_range_p (irange , edge e);
   void set_switch_limit (int max_sw_edges = INT_MAX);
+
+  virtual bool edge_range_p (vrange &, edge, tree, range_query &)
+{ return false; }
+  virtual bool condexpr_adjust (vrange &, vrange &, gimple *, tree, tree, tree,
+class fur_source &) { return false; }
+  virtual bool has_edge_range_p (tree, basic_block = NULL) { return false; }
+  virtual bool has_edge_range_p (tree, edge ) { return false; }
+  virtual void dump (FILE *) { }
+  virtual bool compute_operand_range (vrange &, gimple *, const vrange &, tree,
+  fur_source &,
+  class value_relation * = NULL)
+{ 

[COMMITTED 05/12] - Move infer_manager to a range_query oracle.

2024-05-23 Thread Andrew MacLeod
Turn the infer_manager class into an always available oracle accessible 
via a range_query object.   This will make it easier to share and query 
inferred range info between objects and also makes the information 
easily accessible to any pass that is interested. This again removes the 
need to check for a non-null object, and again makes for a slight 
performance improvement.
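
The "always available oracle" idea can be sketched as a null-object pattern.  This is a toy model only: ToyInferred, ToyInferOracle, ToyInferManager and ToyQuery are hypothetical names, and an integer stmt_id stands in for the originating statement.  The query object always points at some oracle.  By default that is a do-nothing one, so callers never test for NULL.

```cpp
#include <cassert>
#include <map>
#include <string>

// An inferred range tagged with the statement it came from.  The integer
// stmt_id is a hypothetical stand-in for a statement pointer.
struct ToyInferred
{
  int stmt_id;
  int lo;
  int hi;
};

// Default oracle: records nothing and never has an answer.
class ToyInferOracle
{
public:
  virtual ~ToyInferOracle () = default;
  virtual void add_range (const std::string &, const ToyInferred &) { }
  virtual bool has_range_p (const std::string &, ToyInferred &) const
  {
    return false;
  }
};

// Real manager: remembers the latest inferred range for each name.
class ToyInferManager : public ToyInferOracle
{
public:
  void add_range (const std::string &name, const ToyInferred &inf) override
  {
    m_ranges[name] = inf;
  }
  bool has_range_p (const std::string &name, ToyInferred &out) const override
  {
    auto it = m_ranges.find (name);
    if (it == m_ranges.end ())
      return false;
    out = it->second;
    return true;
  }
private:
  std::map<std::string, ToyInferred> m_ranges;
};

// Query object: infer_oracle () is always valid to call.
class ToyQuery
{
public:
  ToyQuery () : m_infer (&s_default) { }
  ToyQuery (const ToyQuery &) = delete;
  ToyQuery &operator= (const ToyQuery &) = delete;
  ~ToyQuery () { destroy_infer_oracle (); }

  ToyInferOracle &infer_oracle () const { return *m_infer; }

  void create_infer_oracle ()
  {
    destroy_infer_oracle ();
    m_infer = new ToyInferManager;
  }

  void destroy_infer_oracle ()
  {
    // Never delete the shared default object.
    if (m_infer != &s_default)
      delete m_infer;
    m_infer = &s_default;
  }

private:
  static ToyInferOracle s_default;
  ToyInferOracle *m_infer;
};

ToyInferOracle ToyQuery::s_default;
```

Before create_infer_oracle () is called, add_range is silently ignored and has_range_p returns false.  After it is called, the same calls record and return the range together with its stmt_id.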


Documentation on the inferred range manager can be found at: 
https://gcc.gnu.org/wiki/AndrewMacLeod/InferredRanges


It also associates each inferred range with its originating stmt, which 
was missing before (we only knew what block it came from). Future 
functionality will make use of the more specific information.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.
From 837ce8a2d75231b68f13da00d9be8d2fd404804e Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Fri, 17 May 2024 10:50:24 -0400
Subject: [PATCH 05/12] Move infer_manager to a range_query oracle.

Turn the infer_manager class into an always available oracle accessible via a
range_query object.  Also associate each inferred range with its
originating stmt.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Create an infer
	oracle instead of a local member.
	(ranger_cache::~ranger_cache): Destroy the oracle.
	(ranger_cache::edge_range): Use oracle.
	(ranger_cache::fill_block_cache): Likewise.
	(ranger_cache::range_from_dom): Likewise.
	(ranger_cache::apply_inferred_ranges): Likewise.
	* gimple-range-cache.h (ranger_cache::m_exit): Delete.
	* gimple-range-infer.cc (infer_oracle): New static object.
	(class infer_oracle): New.
	(non_null_wrapper::non_null_wrapper): New.
	(non_null_wrapper::add_nonzero): New.
	(non_null_wrapper::add_range): New.
	(non_null_loadstore): Use non_null_wrapper.
	(gimple_infer_range::gimple_infer_range): New alternate constructor.
	(exit_range::stmt): New.
	(infer_range_manager::has_range_p): Combine separate methods.
	(infer_range_manager::maybe_adjust_range): Adjust has_range_p call.
	(infer_range_manager::add_ranges): New.
	(infer_range_manager::add_range): Take stmt rather than BB.
	(infer_range_manager::add_nonzero): Adjust from BB to stmt.
	* gimple-range-infer.h (class gimple_infer_range): Adjust methods.
	(infer_range_oracle): New.
	(class infer_range_manager): Inherit from infer_range_oracle.
	Adjust methods.
	* gimple-range-path.cc (path_range_query::range_defined_in_block): Use
	oracle.
	(path_range_query::adjust_for_non_null_uses): Likewise.
	* gimple-range.cc (gimple_ranger::range_on_edge): Likewise.
	(gimple_ranger::register_transitive_inferred_ranges): Likewise.
	* value-query.cc (default_infer_oracle): New.
	(range_query::create_infer_oracle): New.
	(range_query::destroy_infer_oracle): New.
	(range_query::share_query): Copy infer pointer.
	(range_query::range_query): Initialize infer pointer.
	(range_query::~range_query): destroy infer object.
	* value-query.h (range_query::infer_oracle): New.
	(range_query::create_infer_oracle): New prototype.
	(range_query::destroy_infer_oracle): New prototype.
	(range_query::m_infer): New.
---
 gcc/gimple-range-cache.cc | 24 +--
 gcc/gimple-range-cache.h  |  1 -
 gcc/gimple-range-infer.cc | 90 +++
 gcc/gimple-range-infer.h  | 31 ++
 gcc/gimple-range-path.cc  |  4 +-
 gcc/gimple-range.cc   | 14 +++---
 gcc/value-query.cc| 20 +
 gcc/value-query.h |  5 +++
 8 files changed, 131 insertions(+), 58 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 55277ea8af1..34dc9c4a3ec 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -950,8 +950,7 @@ update_list::pop ()
 // --
 
 ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
-		: m_gori (not_executable_flag),
-		  m_exit (use_imm_uses)
+		: m_gori (not_executable_flag)
 {
   m_workback.create (0);
   m_workback.safe_grow_cleared (last_basic_block_for_fn (cfun));
@@ -960,6 +959,7 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
 
   // If DOM info is available, spawn an oracle as well.
   create_relation_oracle ();
+  create_infer_oracle (use_imm_uses);
 
   unsigned x, lim = last_basic_block_for_fn (cfun);
   // Calculate outgoing range info upfront.  This will fully populate the
@@ -977,6 +977,7 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
 ranger_cache::~ranger_cache ()
 {
   delete m_update;
+  destroy_infer_oracle ();
   destroy_relation_oracle ();
   delete m_temporal;
   m_workback.release ();
@@ -1175,7 +1176,7 @@ ranger_cache::edge_range (vrange , edge e, tree name, enum rfd_mode mode)
   exit_range (r, name, e->src, mode);
   // If this is not an abnormal edge, check for inferred ranges on exit.
   if ((e->flags & (EDGE_EH | EDGE_ABNORMAL)) == 0)
-m_exit.maybe_adjust_range (r, name, e->src);
+infer_oracle 

[COMMITTED 12/12] - Move condexpr_adjust into gimple-range-fold

2024-05-23 Thread Andrew MacLeod
Certain components of GORI were needed in order to process a COND_EXPR 
expression and calculate the 2 operands as if they were true and false edges
based on the condition.   With GORI available from the range_query 
object now, this can be moved into the fold_using_range code where it 
really belongs.
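
To illustrate what treating the two operands "as if they were true and false edges" buys, here is a small sketch on toy intervals.  It is an assumption-laden model, not the relocated GCC routine: Interval, intersect, hull and toy_cond_expr_range are hypothetical names, and only the statement shape lhs = (a > k) ? a : b is handled.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Toy closed interval [lo, hi]; empty when lo > hi.
struct Interval
{
  int lo;
  int hi;
  bool empty () const { return lo > hi; }
};

inline Interval
intersect (Interval a, Interval b)
{
  return {std::max (a.lo, b.lo), std::min (a.hi, b.hi)};
}

// Smallest single interval covering both inputs (a conservative union).
inline Interval
hull (Interval a, Interval b)
{
  return {std::min (a.lo, b.lo), std::max (a.hi, b.hi)};
}

// Range of lhs for the statement  lhs = (a > k) ? a : b.
// The first operand is only selected when a > k holds, so its range is
// first refined as if we were on the TRUE edge of that condition.
inline Interval
toy_cond_expr_range (Interval a, Interval b, int k)
{
  assert (k < INT_MAX);	// Keep k + 1 from overflowing.
  Interval a_true = intersect (a, Interval{k + 1, INT_MAX});
  if (a_true.empty ())
    return b;		// The TRUE arm can never be selected.
  return hull (a_true, b);
}
```

With a in [0, 10], b in [20, 30] and k = 5, the refined result is [6, 30] rather than the [0, 30] a plain union of the operand ranges would give.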


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.
From eb66da78b896ad5e7f6a315413ed68273c83662f Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 21 May 2024 12:41:49 -0400
Subject: [PATCH 12/12] Move condexpr_adjust into gimple-range-fold

Certain components of GORI were needed in order to process a COND_EXPR
expression and calculate the 2 operands as if they were true and false edges
based on the condition.   With GORI available from the range_query
object now, this can be moved into the fold_using_range code where it
really belongs.

	* gimple-range-edge.h (range_query::condexpr_adjust): Delete.
	* gimple-range-fold.cc (fold_using_range::range_of_range_op): Use
	gori_ssa routine.
	(fold_using_range::range_of_address): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	(fold_using_range::condexpr_adjust): Relocated from gori_compute.
	(fold_using_range::range_of_cond_expr): Use local condexpr_adjust.
	(fur_source::register_outgoing_edges): Use gori_ssa routine.
	* gimple-range-fold.h (gori_ssa): Rename from gori_bb.
	(fold_using_range::condexpr_adjust): Add prototype.
	* gimple-range-gori.cc (gori_compute::condexpr_adjust): Relocate.
	* gimple-range-gori.h (gori_compute::condexpr_adjust): Delete.
---
 gcc/gimple-range-edge.h  |   4 +-
 gcc/gimple-range-fold.cc | 130 ---
 gcc/gimple-range-fold.h  |   4 +-
 gcc/gimple-range-gori.cc | 103 ---
 gcc/gimple-range-gori.h  |   2 -
 5 files changed, 113 insertions(+), 130 deletions(-)

diff --git a/gcc/gimple-range-edge.h b/gcc/gimple-range-edge.h
index 0096c02faf4..0de1cca4294 100644
--- a/gcc/gimple-range-edge.h
+++ b/gcc/gimple-range-edge.h
@@ -54,13 +54,11 @@ public:
 
   virtual bool edge_range_p (vrange &, edge, tree, range_query &)
 { return false; }
-  virtual bool condexpr_adjust (vrange &, vrange &, gimple *, tree, tree, tree,
-class fur_source &) { return false; }
   virtual bool has_edge_range_p (tree, basic_block = NULL) { return false; }
   virtual bool has_edge_range_p (tree, edge ) { return false; }
   virtual void dump (FILE *) { }
   virtual bool compute_operand_range (vrange &, gimple *, const vrange &, tree,
-  fur_source &,
+  class fur_source &,
   class value_relation * = NULL)
 { return false; }
 private:
diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index a0ff7f2b98b..b3965b5ee50 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -745,8 +745,8 @@ fold_using_range::range_of_range_op (vrange ,
 	r.set_varying (type);
 	  if (lhs && gimple_range_ssa_p (op1))
 	{
-	  if (src.gori_bb ())
-		src.gori_bb ()->register_dependency (lhs, op1);
+	  if (src.gori_ssa ())
+		src.gori_ssa ()->register_dependency (lhs, op1);
 	  relation_kind rel;
 	  rel = handler.lhs_op1_relation (r, range1, range1);
 	  if (rel != VREL_VARYING)
@@ -772,10 +772,10 @@ fold_using_range::range_of_range_op (vrange ,
 	relation_fold_and_or (as_a  (r), s, src, range1, range2);
 	  if (lhs)
 	{
-	  if (src.gori_bb ())
+	  if (src.gori_ssa ())
 		{
-		  src.gori_bb ()->register_dependency (lhs, op1);
-		  src.gori_bb ()->register_dependency (lhs, op2);
+		  src.gori_ssa ()->register_dependency (lhs, op1);
+		  src.gori_ssa ()->register_dependency (lhs, op2);
 		}
 	  if (gimple_range_ssa_p (op1))
 		{
@@ -843,8 +843,8 @@ fold_using_range::range_of_address (prange , gimple *stmt, fur_source )
 {
   tree ssa = TREE_OPERAND (base, 0);
   tree lhs = gimple_get_lhs (stmt);
-  if (lhs && gimple_range_ssa_p (ssa) && src.gori_bb ())
-	src.gori_bb ()->register_dependency (lhs, ssa);
+  if (lhs && gimple_range_ssa_p (ssa) && src.gori_ssa ())
+	src.gori_ssa ()->register_dependency (lhs, ssa);
   src.get_operand (r, ssa);
   range_cast (r, TREE_TYPE (gimple_assign_rhs1 (stmt)));
 
@@ -950,8 +950,8 @@ fold_using_range::range_of_phi (vrange , gphi *phi, fur_source )
 	  else
 	r.union_ (arg_range);
 
-	  if (gimple_range_ssa_p (arg) && src.gori_bb ())
-	src.gori_bb ()->register_dependency (phi_def, arg);
+	  if (gimple_range_ssa_p (arg) && src.gori_ssa ())
+	src.gori_ssa ()->register_dependency (phi_def, arg);
 	}
 
   // Track if all arguments are the same.
@@ -1114,6 +1114,95 @@ fold_using_range::range_of_call (vrange , gcall *call, fur_source &)
   return true;
 }
 
+// Given COND ? OP1 : OP2 with ranges R1 for OP1 and R2 for OP2, Use gori
+// to further resolve R1 and R2 if there are any dependencies between
+// OP1 and COND or OP2 and COND.  All values are to be calculated using SRC
+// as the origination source location for 

[COMMITTED 10/12] - Make GORI a range_query component.

2024-05-23 Thread Andrew MacLeod
This patch moves the GORI component into the range_query object, and 
makes it generally available.  This makes it much easier to share 
between ranger, other range_queries, and the passes using them.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.
From 59a3a0ad763bc03ad5ab630a62fbc78ae50b486f Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Fri, 17 May 2024 14:27:12 -0400
Subject: [PATCH 10/12] Make GORI a range_query component.

This patch moves the GORI component into the range_query object, and
makes it generally available.  This makes it much easier to share
between ranger and the passes.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Create
	GORI via the range_query instead of a local member.
	(ranger_cache::dump_bb): Use gori from the range_query parent.
	(ranger_cache::get_global_range): Likewise.
	(ranger_cache::set_global_range): Likewise.
	(ranger_cache::edge_range): Likewise.
	(ranger_cache::block_range): Likewise.
	(ranger_cache::fill_block_cache): Likewise.
	(ranger_cache::range_from_dom): Likewise.
	(ranger_cache::register_inferred_value): Likewise.
	* gimple-range-cache.h (ranger_cache::m_gori): Delete.
	* gimple-range-fold.cc (fur_source::fur_source): Set m_depend_p.
	(fur_depend::fur_depend): Remove gori parameter.
	* gimple-range-fold.h (fur_source::gori): Adjust.
	(fur_source::m_gori): Delete.
	(fur_source::m_depend): New.
	(fur_depend::fur_depend): Adjust prototype.
	* gimple-range-path.cc (path_range_query::path_range_query): Share
	ranger oracles.
	(path_range_query::range_defined_in_block): Use oracle directly.
	(path_range_query::compute_ranges_in_block): Use new gori() method.
	(path_range_query::adjust_for_non_null_uses): Use oracle directly.
	(path_range_query::compute_exit_dependencies): Likewise.
	(jt_fur_source::jt_fur_source): No gori in the parameters.
	(path_range_query::range_of_stmt): Likewise.
	(path_range_query::compute_outgoing_relations): Likewise.
	* gimple-range.cc (gimple_ranger::fold_range_internal): Likewise.
	(gimple_ranger::range_of_stmt): Access gori via gori () method.
	(assume_query::range_of_expr): Create a gori object.
	(assume_query::~assume_query): Destroy a gori object.
	(assume_query::calculate_op): Remove old gori() accessor.
	* gimple-range.h (gimple_ranger::gori): Delete.
	(assume_query::~assume_query): New.
	(assume_query::m_gori): Delete.
	* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges): Use
	gori () method.
	* tree-ssa-threadedge.cc (compute_exit_dependencies): Likewise.
	* value-query.cc (default_gori): New.
	(range_query::create_gori): New.
	(range_query::destroy_gori): New.
	(range_query::share_oracles): Set m_gori.
	(range_query::range_query): Set m_gori to default.
	(range_query::~range_query): call destroy gori.
	* value-query.h (range_query): Adjust prototypes
	(range_query::m_gori): New.
---
 gcc/gimple-range-cache.cc  | 34 +-
 gcc/gimple-range-cache.h   |  1 -
 gcc/gimple-range-fold.cc   |  7 +++
 gcc/gimple-range-fold.h|  7 ---
 gcc/gimple-range-path.cc   | 28 ++--
 gcc/gimple-range.cc| 12 +---
 gcc/gimple-range.h |  3 +--
 gcc/tree-ssa-dom.cc|  3 +--
 gcc/tree-ssa-threadedge.cc |  4 +---
 gcc/value-query.cc | 20 
 gcc/value-query.h  |  5 +
 11 files changed, 75 insertions(+), 49 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 40e4baa6289..e75cac66902 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -950,7 +950,6 @@ update_list::pop ()
 // --
 
 ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
-  : m_gori (not_executable_flag, param_vrp_switch_limit)
 {
   m_workback.create (0);
   m_workback.safe_grow_cleared (last_basic_block_for_fn (cfun));
@@ -960,6 +959,7 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
   // If DOM info is available, spawn an oracle as well.
   create_relation_oracle ();
   create_infer_oracle (use_imm_uses);
+  create_gori (not_executable_flag, param_vrp_switch_limit);
 
   unsigned x, lim = last_basic_block_for_fn (cfun);
   // Calculate outgoing range info upfront.  This will fully populate the
@@ -969,7 +969,7 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
 {
   basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
   if (bb)
-	m_gori.map ()->exports (bb);
+	gori ().map ()->exports (bb);
 }
   m_update = new update_list ();
 }
@@ -1000,7 +1000,7 @@ ranger_cache::dump (FILE *f)
 void
 ranger_cache::dump_bb (FILE *f, basic_block bb)
 {
-  m_gori.map ()->dump (f, bb, false);
+  gori ().map ()->dump (f, bb, false);
   m_on_entry.dump (f, bb);
   m_relation->dump (f, bb);
 }
@@ -1033,8 +1033,8 @@ ranger_cache::get_global_range (vrange , tree name, bool _p)
   current_p = false;
   if (had_global)
 

[COMMITTED 03/12] - Rename relation oracle and API.

2024-05-23 Thread Andrew MacLeod

Could have been combined with the previous patch, but eh.

This changes the relation oracle accessed via a range_query to the name 
'relation', as there are more oracles coming, and this is more 
descriptive.  It also renames the registering and querying routines to 
have less redundant text.  relation->register_relation and 
relation->query_relation seem a bit texty.  They are now 
relation.record () and relation.query ().
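
As a rough picture of how the shorter names read at a call site, here is a toy oracle with record and query methods reached through a relation () accessor.  ToyRelation, ToyRelationOracle and ToyQuery are hypothetical.  The real oracle also takes a statement or edge and handles far more relation kinds.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// A tiny set of relation kinds; TOY_VARYING means "nothing known".
enum ToyRelation { TOY_VARYING, TOY_LT, TOY_GT, TOY_EQ };

class ToyRelationOracle
{
public:
  // Record that OP1 <K> OP2 holds.
  void record (ToyRelation k, const std::string &op1, const std::string &op2)
  {
    m_rels[std::make_pair (op1, op2)] = k;
  }

  // Query the relation between OP1 and OP2, also trying the swapped pair.
  ToyRelation query (const std::string &op1, const std::string &op2) const
  {
    auto it = m_rels.find (std::make_pair (op1, op2));
    if (it != m_rels.end ())
      return it->second;
    it = m_rels.find (std::make_pair (op2, op1));
    if (it == m_rels.end ())
      return TOY_VARYING;
    // The relation was recorded the other way round, so swap it.
    switch (it->second)
      {
      case TOY_LT:
	return TOY_GT;
      case TOY_GT:
	return TOY_LT;
      default:
	return it->second;
      }
  }

private:
  std::map<std::pair<std::string, std::string>, ToyRelation> m_rels;
};

// The query object exposes its oracle through relation ().
class ToyQuery
{
public:
  ToyRelationOracle &relation () { return m_relation; }
private:
  ToyRelationOracle m_relation;
};
```

A call then reads q.relation ().record (TOY_LT, "a", "b") followed by q.relation ().query ("b", "a"), which returns TOY_GT in this toy.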


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

From 3a5b702c4082950d614fe12a066609da23363246 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Fri, 17 May 2024 10:18:39 -0400
Subject: [PATCH 03/12] Rename relation oracle and API.

With more oracles incoming, rename the range_query oracle () method to
relation (), and remove the redundant 'relation' text from register and query
methods, resulting in calls that look like:
  relation ()->record (...)   and
  relation ()->query (...)

	* gimple-range-cache.cc (ranger_cache::dump_bb): Use m_relation.
	(ranger_cache::fill_block_cache): Likewise.
	* gimple-range-fold.cc (fur_stmt::get_phi_operand): Use new names.
	(fur_depend::register_relation): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	* gimple-range-path.cc (path_range_query::path_range_query): Likewise.
	(path_range_query::~path_range_query): Likewise.
	(path_range_query::compute_ranges): Likewise.
	(jt_fur_source::register_relation): Likewise.
	(jt_fur_source::query_relation): Likewise.
	(path_range_query::maybe_register_phi_relation): Likewise.
	* gimple-range-path.h (get_path_oracle): Likewise.
	* value-query.cc (range_query::create_relation_oracle): Likewise.
	(range_query::destroy_relation_oracle): Likewise.
	(range_query::share_oracles): Likewise.
	(range_query::range_query): Likewise.
	* value-query.h (value_query::relation): Rename from oracle.
	(m_relation): Rename from m_oracle.
	* value-relation.cc (relation_oracle::query): Rename from
	query_relation.
	(equiv_oracle::query): Likewise.
	(equiv_oracle::record): Rename from register_relation.
	(relation_oracle::record): Likewise.
	(dom_oracle::record): Likewise.
	(dom_oracle::query): Rename from query_relation.
	(path_oracle::record): Rename from register_relation.
	(path_oracle::query): Rename from query_relation.
	* value-relation.h (*::record): Rename from register_relation.
	(*::query): Rename from query_relation.
---
 gcc/gimple-range-cache.cc |  4 +--
 gcc/gimple-range-fold.cc  | 10 
 gcc/gimple-range-path.cc  | 18 +++---
 gcc/gimple-range-path.h   |  2 +-
 gcc/gimple-range.cc   |  4 +--
 gcc/value-query.cc| 16 ++--
 gcc/value-query.h |  4 +--
 gcc/value-relation.cc | 51 ++-
 gcc/value-relation.h  | 39 +-
 9 files changed, 69 insertions(+), 79 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 020069fd635..55277ea8af1 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1001,7 +1001,7 @@ ranger_cache::dump_bb (FILE *f, basic_block bb)
 {
   m_gori.gori_map::dump (f, bb, false);
   m_on_entry.dump (f, bb);
-  m_oracle->dump (f, bb);
+  m_relation->dump (f, bb);
 }
 
 // Get the global range for NAME, and return in R.  Return false if the
@@ -1439,7 +1439,7 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
   tree equiv_name;
   relation_kind rel;
   int prec = TYPE_PRECISION (type);
-  FOR_EACH_PARTIAL_AND_FULL_EQUIV (m_oracle, bb, name, equiv_name, rel)
+  FOR_EACH_PARTIAL_AND_FULL_EQUIV (m_relation, bb, name, equiv_name, rel)
 	{
 	  basic_block equiv_bb = gimple_bb (SSA_NAME_DEF_STMT (equiv_name));
 
diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index eeffdce0b97..357a1beabd1 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -178,7 +178,7 @@ fur_stmt::get_phi_operand (vrange , tree expr, edge e)
 relation_kind
 fur_stmt::query_relation (tree op1, tree op2)
 {
-  return m_query->oracle ().query_relation (m_stmt, op1, op2);
+  return m_query->relation ().query (m_stmt, op1, op2);
 }
 
 // Instantiate a stmt based fur_source with a GORI object.
@@ -196,7 +196,7 @@ fur_depend::fur_depend (gimple *s, gori_compute *gori, range_query *q)
 void
 fur_depend::register_relation (gimple *s, relation_kind k, tree op1, tree op2)
 {
-  m_query->oracle ().register_relation (s, k, op1, op2);
+  m_query->relation ().record (s, k, op1, op2);
 }
 
 // Register a relation on an edge if there is an oracle.
@@ -204,7 +204,7 @@ fur_depend::register_relation (gimple *s, relation_kind k, tree op1, tree op2)
 void
 fur_depend::register_relation (edge e, relation_kind k, tree op1, tree op2)
 {
-  m_query->oracle ().register_relation (e, k, op1, op2);
+  m_query->relation ().record (e, k, op1, op2);
 }
 
 // This version of fur_source will pick a range up from a list of ranges
@@ -854,7 +854,7 @@ fold_using_range::range_of_phi (vrange , gphi *phi, 

[COMMITTED 04/12] - Allow components to be shared among range-queries.

2024-05-23 Thread Andrew MacLeod
Ranger and ranger's cache are both range_query based, but they need to 
share some common components. The path ranger also needs to share the 
GORI component.   Up until now, they have simply copied pointers to 
share, but this patch provides a protected API to allow them to share 
without knowing what all components are involved.
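
The ownership rule behind this can be sketched with a toy query class.  ToyOracle and ToyQuery are hypothetical, and share_query is public here only to keep the sketch short (in the patch it is a protected interface).  A query that shares another's components marks itself as a shared copy and must not free them.

```cpp
#include <cassert>

// Counts live instances so the ownership rule can be observed.
struct ToyOracle
{
  static int live;
  ToyOracle () { ++live; }
  ~ToyOracle () { --live; }
};

int ToyOracle::live = 0;

class ToyQuery
{
public:
  ToyQuery () : m_relation (nullptr), m_shared_copy_p (false) { }
  ToyQuery (const ToyQuery &) = delete;
  ToyQuery &operator= (const ToyQuery &) = delete;

  ~ToyQuery ()
  {
    // Do not destroy anything if this is a shared copy.
    if (m_shared_copy_p)
      return;
    delete m_relation;
  }

  void create_relation_oracle ()
  {
    if (!m_relation)
      m_relation = new ToyOracle;
  }

  // Borrow Q's components.  Q must outlive this object.  This toy assumes
  // the borrower has not created an oracle of its own.
  void share_query (ToyQuery &q)
  {
    assert (!m_relation);
    m_relation = q.m_relation;
    m_shared_copy_p = true;
  }

  ToyOracle *relation () const { return m_relation; }

private:
  ToyOracle *m_relation;
  bool m_shared_copy_p;
};
```

When more components are added later, only share_query needs to learn about them.  The callers that share keep making the same single call.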


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.


From ef99d19569a1c5fafa5784c2c2f7855b6e62ffd8 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Fri, 17 May 2024 10:44:27 -0400
Subject: [PATCH 04/12] Allow components to be shared among range-queries.

Ranger and the ranger cache need to share components, this provides a
blessed way to do so.

	* gimple-range.cc (gimple_ranger::gimple_ranger): Share the
	components from ranger_cache.
	(gimple_ranger::~gimple_ranger): Don't clear pointer.
	* value-query.cc (range_query::share_query): New.
	(range_query::range_query): Clear shared component flag.
	(range_query::~range_query): Don't free shared component copies.
	* value-query.h (share_query): New prototype.
	(m_shared_copy_p): New member.
---
 gcc/gimple-range.cc |  4 +---
 gcc/value-query.cc  | 11 +++
 gcc/value-query.h   |  5 +
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 9664300a80b..4326976fc2a 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -44,7 +44,7 @@ gimple_ranger::gimple_ranger (bool use_imm_uses) :
 	current_bb (NULL)
 {
   // Share the oracle from the cache.
-  m_relation = &m_cache.relation ();
+  share_query (m_cache);
   if (dump_file && (param_ranger_debug & RANGER_DEBUG_TRACE))
 tracer.enable_trace ();
   m_stmt_list.create (0);
@@ -67,8 +67,6 @@ gimple_ranger::gimple_ranger (bool use_imm_uses) :
 
 gimple_ranger::~gimple_ranger ()
 {
-  // Restore the original oracle.
-  m_relation = NULL;
   m_stmt_list.release ();
 }
 
diff --git a/gcc/value-query.cc b/gcc/value-query.cc
index db64a95a284..adcc59cadbf 100644
--- a/gcc/value-query.cc
+++ b/gcc/value-query.cc
@@ -211,13 +211,24 @@ range_query::destroy_relation_oracle ()
 }
 }
 
+void
+range_query::share_query (range_query &q)
+{
+  m_relation = q.m_relation;
+  m_shared_copy_p = true;
+}
+
 range_query::range_query ()
 {
   m_relation = &default_relation_oracle;
+  m_shared_copy_p = false;
 }
 
 range_query::~range_query ()
 {
+  // Do not destroy anything if this is a shared copy.
+  if (m_shared_copy_p)
+    return;
   destroy_relation_oracle ();
 }
 
diff --git a/gcc/value-query.h b/gcc/value-query.h
index a8688a099fa..a5735902af0 100644
--- a/gcc/value-query.h
+++ b/gcc/value-query.h
@@ -88,6 +88,11 @@ protected:
 			 basic_block bbentry, basic_block bbexit);
   bool get_arith_expr_range (vrange , tree expr, gimple *stmt);
   relation_oracle *m_relation;
+  // When multiple related range queries wish to share oracles.
+  // This is an internal interface.
+  void share_query (range_query &q);
+  bool m_shared_copy_p;
+
 };
 
 // Global ranges for SSA names using SSA_NAME_RANGE_INFO.
-- 
2.41.0



[COMMITTED 07/12] - Default gimple_outgoing_range to not process switches.

2024-05-23 Thread Andrew MacLeod
This patch adjusts the way gimple_outgoing_range works.  This is the 
static edge calculator that provides ranges on edges for TRUE/FALSE 
edges, and also calculates the ranges on switch edges.   It was 
previously either a component of ranger or something that could be 
included if a pass wanted it.


This adjusts the way it works in preparation for being more tightly 
integrated into GORI.  It now always works for TRUE/FALSE 
edges, and uses a set_switch_limit routine to enable or disable switch 
processing.   Functionally there is little difference, but it will allow 
it to be the base for a GORI object now.
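
A minimal sketch of the new default behaviour, with a hypothetical ToyOutgoingRange class and a placeholder calculation in place of the real switch analysis.  A limit of zero disables switch processing, and the cache table is only allocated the first time a switch edge is actually processed.

```cpp
#include <cassert>
#include <map>

class ToyOutgoingRange
{
public:
  // Default limit of 0: switches are not processed, nothing is allocated.
  explicit ToyOutgoingRange (int max_sw_edges = 0)
    : m_edge_table (nullptr), m_max_edges (max_sw_edges) { }
  ToyOutgoingRange (const ToyOutgoingRange &) = delete;
  ToyOutgoingRange &operator= (const ToyOutgoingRange &) = delete;
  ~ToyOutgoingRange () { delete m_edge_table; }

  // Set a new switch limit; 0 disables switch processing again.
  void set_switch_limit (int max_sw_edges) { m_max_edges = max_sw_edges; }

  bool table_allocated () const { return m_edge_table != nullptr; }

  // Produce a value for switch edge EDGE_ID of a switch with N_EDGES
  // successors.  Returns false if switches are disabled or over the limit.
  bool switch_edge_range (int edge_id, int n_edges, int &value)
  {
    if (m_max_edges == 0 || n_edges > m_max_edges)
      return false;
    // Allocate the cache lazily, on first real use.
    if (!m_edge_table)
      m_edge_table = new std::map<int, int>;
    auto it = m_edge_table->find (edge_id);
    if (it == m_edge_table->end ())
      // Placeholder for the real per-edge calculation.
      it = m_edge_table->emplace (edge_id, edge_id * 10).first;
    value = it->second;
    return true;
  }

private:
  std::map<int, int> *m_edge_table;
  int m_max_edges;
};
```

A default-constructed object refuses switch edges and allocates nothing until set_switch_limit is called with a non-zero limit.  This keeps the TRUE/FALSE-only use cheap.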


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.
From 1ec8e2027a99a5ddca933a37b3cf5ef322208c5a Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Mon, 6 May 2024 12:04:24 -0400
Subject: [PATCH 07/12] Default gimple_outgoing_range to not process switches.

Change the default constructor to not process switches, add method to
enable/disable switch processing.

	* gimple-range-edge.cc (gimple_outgoing_range::gimple_outgoing_range):
	Do not allocate a range allocator at construction time.
	(gimple_outgoing_range::~gimple_outgoing_range): Delete allocator
	if one was allocated.
	(gimple_outgoing_range::set_switch_limit): New.
	(gimple_outgoing_range::switch_edge_range): Create an allocator if one
	does not exist.
	(gimple_outgoing_range::edge_range_p): Check for zero edges.
	* gimple-range-edge.h (class gimple_outgoing_range): Adjust prototypes.
---
 gcc/gimple-range-edge.cc | 23 +--
 gcc/gimple-range-edge.h  | 12 +++-
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/gcc/gimple-range-edge.cc b/gcc/gimple-range-edge.cc
index 3811a0995aa..0c75ad0519c 100644
--- a/gcc/gimple-range-edge.cc
+++ b/gcc/gimple-range-edge.cc
@@ -51,7 +51,6 @@ gimple_outgoing_range_stmt_p (basic_block bb)
   return NULL;
 }
 
-
 // Return a TRUE or FALSE range representing the edge value of a GCOND.
 
 void
@@ -64,22 +63,32 @@ gcond_edge_range (irange , edge e)
 r = range_false ();
 }
 
+// Construct a gimple_outgoing_range object.  No memory is allocated.
 
 gimple_outgoing_range::gimple_outgoing_range (int max_sw_edges)
 {
   m_edge_table = NULL;
+  m_range_allocator = NULL;
   m_max_edges = max_sw_edges;
-  m_range_allocator = new vrange_allocator;
 }
 
+// Destruct an edge object, disposing of any memory allocated.
 
 gimple_outgoing_range::~gimple_outgoing_range ()
 {
   if (m_edge_table)
 delete m_edge_table;
-  delete m_range_allocator;
+  if (m_range_allocator)
+delete m_range_allocator;
 }
 
+// Set a new switch limit.
+
+void
+gimple_outgoing_range::set_switch_limit (int max_sw_edges)
+{
+  m_max_edges = max_sw_edges;
+}
 
 // Get a range for a switch edge E from statement S and return it in R.
 // Use a cached value if it exists, or calculate it if not.
@@ -96,8 +105,10 @@ gimple_outgoing_range::switch_edge_range (irange , gswitch *sw, edge e)
   TYPE_PRECISION (TREE_TYPE (gimple_switch_index (sw
 return false;
 
-   if (!m_edge_table)
- m_edge_table = new hash_map (n_edges_for_fn (cfun));
+  if (!m_edge_table)
+m_edge_table = new hash_map (n_edges_for_fn (cfun));
+  if (!m_range_allocator)
+m_range_allocator = new vrange_allocator;
 
vrange_storage **val = m_edge_table->get (e);
if (!val)
@@ -202,7 +213,7 @@ gimple_outgoing_range::edge_range_p (irange , edge e)
 }
 
   // Only process switches if it within the size limit.
-  if (EDGE_COUNT (e->src->succs) > (unsigned)m_max_edges)
+  if (m_max_edges == 0 || (EDGE_COUNT (e->src->succs) > (unsigned)m_max_edges))
 return NULL;
 
   gcc_checking_assert (is_a (s));
diff --git a/gcc/gimple-range-edge.h b/gcc/gimple-range-edge.h
index 9ac0617f970..ce8b04f6bad 100644
--- a/gcc/gimple-range-edge.h
+++ b/gcc/gimple-range-edge.h
@@ -34,13 +34,23 @@ along with GCC; see the file COPYING3.  If not see
 // The API is simple, just ask for the range on the edge.
 // The return value is NULL for no range, or the branch statement which the
 // edge gets the range from, along with the range.
+//
+// THe switch_limit is the number of switch edges beyond which the switch
+// is ignored (ie, edge_range_p () will return NULL as if the sitch was not
+// there.  THis value can be adjusted any time via set_switch_limit ().
+// THe default is 0, no switches are precoessed until set_switch_limit () is
+// called, and then the default is INT_MAX.
+//
+// No memory is allocated until an edge for a switch is processed which also
+// falls under the edge limit criteria.
 
 class gimple_outgoing_range
 {
 public:
-  gimple_outgoing_range (int max_sw_edges = INT_MAX);
+  gimple_outgoing_range (int max_sw_edges = 0);
   ~gimple_outgoing_range ();
   gimple *edge_range_p (irange , edge e);
+  void set_switch_limit (int max_sw_edges = INT_MAX);
 private:
   void calc_switch_ranges (gswitch *sw);
   bool switch_edge_range (irange , gswitch *sw, edge e);
-- 
2.41.0



[PATCH] Use simple_dce_from_worklist in phiprop

2024-05-23 Thread Andrew Pinski
I noticed that phiprop leaves behind phi nodes that define
an SSA name which is unused. This just adds a bitmap to mark
those SSA names and then calls simple_dce_from_worklist at
the very end to remove those phi nodes and all of their
dependencies, if there were any. This might allow us to
optimize something earlier due to the removal of the phi
which was taking the address of the variables.
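
The worklist idea can be sketched on a toy IR. This is only an
illustration under simplifying assumptions (toy_def and
toy_dce_from_worklist are hypothetical, and real statements also
need a side-effect check before removal); it is not how
simple_dce_from_worklist is implemented:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy IR: definition I computes a value from the definitions listed in
// OPERANDS.  Hypothetical types, not GCC's.
struct toy_def
{
  std::vector<int> operands;
  int num_uses = 0;
  bool removed = false;
};

// Remove every definition on WORKLIST that has no uses, then revisit
// the operands it was keeping alive.  Returns the number removed.
int
toy_dce_from_worklist (std::vector<toy_def> &defs, std::set<int> worklist)
{
  int removed = 0;
  while (!worklist.empty ())
    {
      int v = *worklist.begin ();
      worklist.erase (worklist.begin ());
      toy_def &d = defs[v];
      if (d.removed || d.num_uses != 0)
        continue;
      d.removed = true;
      removed++;
      for (int op : d.operands)
        {
          // Each operand loses one use and becomes a removal candidate.
          assert (defs[op].num_uses > 0);
          defs[op].num_uses--;
          worklist.insert (op);
        }
    }
  return removed;
}
```

Seeding the worklist with only the names the pass knows it orphaned
keeps the cleanup proportional to the work done, rather than
rescanning the whole function.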

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* tree-ssa-phiprop.cc (phiprop_insert_phi): Add
dce_ssa_names argument. Add the phi's result to it.
(propagate_with_phi): Add dce_ssa_names argument.
Update call to phiprop_insert_phi.
(pass_phiprop::execute): Update call to propagate_with_phi.
Call simple_dce_from_worklist if there was a change.

Signed-off-by: Andrew Pinski 
---
 gcc/tree-ssa-phiprop.cc | 28 ++--
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/gcc/tree-ssa-phiprop.cc b/gcc/tree-ssa-phiprop.cc
index 041521ef106..2a1cdae46d2 100644
--- a/gcc/tree-ssa-phiprop.cc
+++ b/gcc/tree-ssa-phiprop.cc
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stor-layout.h"
 #include "tree-ssa-loop.h"
 #include "tree-cfg.h"
+#include "tree-ssa-dce.h"
 
 /* This pass propagates indirect loads through the PHI node for its
address to make the load source possibly non-addressable and to
@@ -132,12 +133,15 @@ phivn_valid_p (struct phiprop_d *phivn, tree name, 
basic_block bb)
 
 static tree
 phiprop_insert_phi (basic_block bb, gphi *phi, gimple *use_stmt,
-   struct phiprop_d *phivn, size_t n)
+   struct phiprop_d *phivn, size_t n,
+   bitmap dce_ssa_names)
 {
   tree res;
   gphi *new_phi = NULL;
   edge_iterator ei;
   edge e;
+  tree phi_result = PHI_RESULT (phi);
+  bitmap_set_bit (dce_ssa_names, SSA_NAME_VERSION (phi_result));
 
   gcc_assert (is_gimple_assign (use_stmt)
  && gimple_assign_rhs_code (use_stmt) == MEM_REF);
@@ -276,7 +280,7 @@ chk_uses (tree, tree *idx, void *data)
 
 static bool
 propagate_with_phi (basic_block bb, gphi *phi, struct phiprop_d *phivn,
-   size_t n)
+   size_t n, bitmap dce_ssa_names)
 {
   tree ptr = PHI_RESULT (phi);
   gimple *use_stmt;
@@ -420,9 +424,10 @@ propagate_with_phi (basic_block bb, gphi *phi, struct 
phiprop_d *phivn,
goto next;
}
 
- phiprop_insert_phi (bb, phi, use_stmt, phivn, n);
+ phiprop_insert_phi (bb, phi, use_stmt, phivn, n, dce_ssa_names);
 
- /* Remove old stmt.  The phi is taken care of by DCE.  */
+ /* Remove old stmt. The phi and all of maybe its depedencies
+will be removed later via simple_dce_from_worklist. */
  gsi = gsi_for_stmt (use_stmt);
  /* Unlinking the VDEF here is fine as we are sure that we process
 stmts in execution order due to aggregate copies having VDEFs
@@ -442,16 +447,15 @@ propagate_with_phi (basic_block bb, gphi *phi, struct 
phiprop_d *phivn,
 is the first load transformation.  */
   else if (!phi_inserted)
{
- res = phiprop_insert_phi (bb, phi, use_stmt, phivn, n);
+ res = phiprop_insert_phi (bb, phi, use_stmt, phivn, n, dce_ssa_names);
  type = TREE_TYPE (res);
 
  /* Remember the value we created for *ptr.  */
  phivn[SSA_NAME_VERSION (ptr)].value = res;
  phivn[SSA_NAME_VERSION (ptr)].vuse = vuse;
 
- /* Remove old stmt.  The phi is taken care of by DCE, if we
-want to delete it here we also have to delete all intermediate
-copies.  */
+ /* Remove old stmt.  The phi and all of maybe its depedencies
+will be removed later via simple_dce_from_worklist. */
  gsi = gsi_for_stmt (use_stmt);
  gsi_remove (, true);
 
@@ -514,6 +518,7 @@ pass_phiprop::execute (function *fun)
   gphi_iterator gsi;
   unsigned i;
   size_t n;
+  auto_bitmap dce_ssa_names;
 
   calculate_dominance_info (CDI_DOMINATORS);
 
@@ -531,11 +536,14 @@ pass_phiprop::execute (function *fun)
   if (bb_has_abnormal_pred (bb))
continue;
   for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next ())
-   did_something |= propagate_with_phi (bb, gsi.phi (), phivn, n);
+   did_something |= propagate_with_phi (bb, gsi.phi (), phivn, n, 
dce_ssa_names);
 }
 
   if (did_something)
-gsi_commit_edge_inserts ();
+{
+  gsi_commit_edge_inserts ();
+  simple_dce_from_worklist (dce_ssa_names);
+}
 
   free (phivn);
 
-- 
2.43.0



[COMMITTED 08/12] - Gori_compute no longer inherits from gori_map.

2024-05-23 Thread Andrew MacLeod
This patch moves the gori_compute object away from inheriting from a 
gori_map object and instead makes it a local member.  It is exported via 
map ().


The gori_map object contains all the SSA name dependencies and 
import/export name lists for blocks.  GORI was inheriting from this 
originally as a convenient way to share the data, but it doesn't really 
belong there.  It is really a component that is used by GORI rather than 
part of what it is.  This more accurately reflects the relationship.
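
A minimal sketch of the "has a" rather than "is a" arrangement, using 
hypothetical stand-in classes (dep_map and compute are not the GCC 
types, and only the map () accessor name is taken from the patch):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for gori_map: dependency bookkeeping only.
class dep_map
{
public:
  void set_depend (const std::string &name, const std::string &dep)
  {
    assert (!name.empty ());
    m_depend[name] = dep;
  }

  // Return the recorded dependency of NAME, or "" if there is none.
  std::string depend1 (const std::string &name) const
  {
    auto it = m_depend.find (name);
    return it == m_depend.end () ? std::string () : it->second;
  }

private:
  std::map<std::string, std::string> m_depend;
};

// Hypothetical stand-in for gori_compute.  It owns a dep_map rather
// than deriving from one, and exposes it through map ().
class compute
{
public:
  dep_map *map () { return &m_map; }

  bool has_dependency_p (const std::string &name)
  {
    return !m_map.depend1 (name).empty ();
  }

private:
  dep_map m_map;
};
```

Callers that previously wrote c.depend1 (name) would now write 
c.map ()->depend1 (name), which mirrors the mechanical changes in the 
diff below.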


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.


From 9b42fafa0ec385bbc86be1d9f1a86c140e1045c3 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Thu, 9 May 2024 14:14:31 -0400
Subject: [PATCH 08/12] Gori_compute no longer inherits from gori_map.

This patch moves the gori_compute object away from inheriting a
gori_map object and instead it as a local member.  Export it via map ().

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Access
	gori_map via member call.
	(ranger_cache::dump_bb): Likewise.
	(ranger_cache::get_global_range): Likewise.
	(ranger_cache::set_global_range): Likewise.
	(ranger_cache::register_inferred_value): Likewise.
	* gimple-range-fold.cc (fold_using_range::range_of_range_op): Likewise.
	(fold_using_range::range_of_address): Likewise.
	(fold_using_range::range_of_phi): Likewise.
	* gimple-range-gori.cc (gori_compute::compute_operand_range_switch):
	likewise.
	(gori_compute::compute_operand_range): Likewise.
	(gori_compute::compute_logical_operands): Likewise.
	(gori_compute::refine_using_relation): Likewise.
	(gori_compute::compute_operand1_and_operand2_range): Likewise.
	(gori_compute::may_recompute_p): Likewise.
	(gori_compute::has_edge_range_p): Likewise.
	(gori_compute::outgoing_edge_range_p): Likewise.
	(gori_compute::condexpr_adjust): Likewise.
	* gimple-range-gori.h (class gori_compute): Do not inherit from
	gori_map.
	(gori_compute::m_map): New.
	* gimple-range-path.cc (gimple-range-path.cc): Use gori_map member.
	(path_range_query::compute_exit_dependencies): Likewise.
	* gimple-range.cc (gimple_ranger::range_of_stmt): Likewise.
	(gimple_ranger::register_transitive_inferred_ranges): Likewise.
	* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges): Likewise.
	* tree-ssa-threadedge.cc
	  (hybrid_jt_simplifier::compute_exit_dependencies): Likewise.
	* tree-vrp.cc (remove_unreachable::handle_early): Likewise.
	(remove_unreachable::remove_and_update_globals): Likewise.
---
 gcc/gimple-range-cache.cc  | 16 ++---
 gcc/gimple-range-fold.cc   | 12 +-
 gcc/gimple-range-gori.cc   | 47 +++---
 gcc/gimple-range-gori.h|  4 +++-
 gcc/gimple-range-path.cc   |  6 ++---
 gcc/gimple-range.cc|  6 ++---
 gcc/tree-ssa-dom.cc|  2 +-
 gcc/tree-ssa-threadedge.cc |  2 +-
 gcc/tree-vrp.cc|  9 
 9 files changed, 54 insertions(+), 50 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 34dc9c4a3ec..c52475852a9 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -969,7 +969,7 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
 {
   basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
   if (bb)
-	m_gori.exports (bb);
+	m_gori.map ()->exports (bb);
 }
   m_update = new update_list ();
 }
@@ -1000,7 +1000,7 @@ ranger_cache::dump (FILE *f)
 void
 ranger_cache::dump_bb (FILE *f, basic_block bb)
 {
-  m_gori.gori_map::dump (f, bb, false);
+  m_gori.map ()->dump (f, bb, false);
   m_on_entry.dump (f, bb);
   m_relation->dump (f, bb);
 }
@@ -1033,8 +1033,8 @@ ranger_cache::get_global_range (vrange , tree name, bool _p)
   current_p = false;
   if (had_global)
 current_p = r.singleton_p ()
-		|| m_temporal->current_p (name, m_gori.depend1 (name),
-	  m_gori.depend2 (name));
+		|| m_temporal->current_p (name, m_gori.map ()->depend1 (name),
+	  m_gori.map ()->depend2 (name));
   else
 {
   // If no global value has been set and value is VARYING, fold the stmt
@@ -1071,8 +1071,8 @@ ranger_cache::set_global_range (tree name, const vrange , bool changed)
   if (!changed)
 {
   // If there are dependencies, make sure this is not out of date.
-  if (!m_temporal->current_p (name, m_gori.depend1 (name),
- m_gori.depend2 (name)))
+  if (!m_temporal->current_p (name, m_gori.map ()->depend1 (name),
+ m_gori.map ()->depend2 (name)))
 	m_temporal->set_timestamp (name);
   return;
 }
@@ -1097,7 +1097,7 @@ ranger_cache::set_global_range (tree name, const vrange , bool changed)
 
   if (r.singleton_p ()
   || (POINTER_TYPE_P (TREE_TYPE (name)) && r.nonzero_p ()))
-m_gori.set_range_invariant (name);
+m_gori.map ()->set_range_invariant (name);
   m_temporal->set_timestamp (name);
 }
 
@@ -1783,7 +1783,7 @@ ranger_cache::register_inferred_value (const vrange , tree name,
   m_on_entry.set_bb_range (name, bb, r);
   // If this range was invariant before, remove 

[COMMITTED 00/12] Cleanup some ranger components and make them available via range_query.

2024-05-23 Thread Andrew MacLeod
This set of 12 patches overhauls some structural component layouts in 
ranger and makes them available via a simple range_query API.


There are 3 main groups of patches.

 The first group overhauls the relation oracle a bit and makes 
accessing it via range_query more transparent.


 The second bunch incorporates the inferred range manager into an 
oracle also accessible via a range_query object.


The third and final group  reorganizes the GORI component and the 
dependency information it provides with the static edge calculator and 
makes this also accessible via a range_query.


This cleans up a number of things, and to go with this new cleanup comes 
some documentation on how they work (!!!).  Well, the GORI documentation 
is pending but the rest is there.  What has been written is currently 
available from the root page: 
https://gcc.gnu.org/wiki/AndrewMacLeod/Ranger3.0  In each individual 
patch I also mention the specific page for that component.


Over the remainder of the year I will be adding to this documentation 
until ranger is fully documented, including range-ops, internals, etc.


More details on specifics in each patch.

All patches combined result in a slight performance improvement of 0.4% 
in VRP,  0.5% in threading, and 0.07% total compilation time.


Andrew



[COMMITTED 02/12] - Move to an always available relation oracle.

2024-05-23 Thread Andrew MacLeod
This patch provides a basic oracle which doesn't do anything, but will 
still respond when queried.  This allows passes to avoid the NULL check 
for an oracle pointer before they do anything, and results in a slight 
speedup in VRP, and a slightly more significant 0.3% speedup in jump 
threading.


It also unifies the register and query names to not specify what is 
already apparent in the parameters.
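
The always-available oracle is essentially a null-object pattern.  A 
rough sketch under stated assumptions follows: oracle_base, 
simple_oracle, query and the relation_kind_t enum are hypothetical, and 
integers stand in for SSA names; only the create/destroy and 
register_relation/query_relation naming echoes the patch.

```cpp
#include <cassert>
#include <map>
#include <utility>

enum relation_kind_t { REL_VARYING, REL_EQ };

// The base class doubles as the do-nothing oracle: every entry point
// has a conservative default, so callers never need a NULL check.
class oracle_base
{
public:
  virtual ~oracle_base () = default;
  virtual void register_relation (int, int, relation_kind_t) {}
  virtual relation_kind_t query_relation (int, int) { return REL_VARYING; }
};

// One shared default instance that is never deleted.
static oracle_base default_oracle;

// A toy oracle that actually remembers relations.
class simple_oracle : public oracle_base
{
public:
  void register_relation (int a, int b, relation_kind_t k) override
  {
    m_rel[std::make_pair (a, b)] = k;
  }

  relation_kind_t query_relation (int a, int b) override
  {
    auto it = m_rel.find (std::make_pair (a, b));
    return it == m_rel.end () ? REL_VARYING : it->second;
  }

private:
  std::map<std::pair<int, int>, relation_kind_t> m_rel;
};

// Hypothetical stand-in for range_query: starts on the default oracle.
class query
{
public:
  query () : m_oracle (&default_oracle) {}
  ~query () { destroy_oracle (); }
  query (const query &) = delete;
  query &operator= (const query &) = delete;

  oracle_base &oracle () const
  {
    assert (m_oracle);
    return *m_oracle;
  }

  void create_oracle ()
  {
    destroy_oracle ();
    m_oracle = new simple_oracle;
  }

  // Delete any real oracle and fall back to the shared default.
  void destroy_oracle ()
  {
    if (m_oracle != &default_oracle)
      delete m_oracle;
    m_oracle = &default_oracle;
  }

private:
  oracle_base *m_oracle;
};
```

Because the register and query overloads are distinguished by their 
parameters in this style, a single name for each is enough, which is 
the naming unification mentioned above.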


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

From 2f80eb1feb3f92c7e9e57d4726ec52ca7d27ce92 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 30 Apr 2024 09:35:23 -0400
Subject: [PATCH 02/12] Move to an always available relation oracle.

This eliminates the need to check if the relation oracle pointer is NULL
before every call by providing a default oracle which does nothing.
REmove unused routines, and Unify register_relation method names.

	* gimple-range-cache.cc (ranger_cache::dump_bb): Remove check for
	NULL oracle pointer.
	(ranger_cache::fill_block_cache): Likewise.
	* gimple-range-fold.cc (fur_stmt::get_phi_operand): Likewise.
	(fur_depend::fur_depend): Likewise.
	(fur_depend::register_relation): Likewise, use qury_relation.
	(fold_using_range::range_of_phi): Likewise.
	(fold_using_range::relation_fold_and_or): Likewise.
	* gimple-range-fold.h (fur_source::m_oracle): Delete.  Oracle
	can be accessed dirctly via m_query now.
	* gimple-range-path.cc (path_range_query::path_range_query):
	Adjust for oracle reference pointer.
	(path_range_query::compute_ranges): Likewise.
	(jt_fur_source::jt_fur_source): Adjust for no m_oracle member.
	(jt_fur_source::register_relation): Do not check for NULL
	pointer.
	(jt_fur_source::query_relation): Likewise.
	* gimple-range.cc (gimple_ranger::gimple_ranger):  Adjust for
	reference pointer.
	* value_query.cc (default_relation_oracle): New.
	(range_query::create_relation_oracle): Relocate from header.
	Ensure not being added to global query.
	(range_query::destroy_relation_oracle): Relocate from header.
	(range_query::range_query): Initailize to default oracle.
	(ange_query::~range_query): Call destroy_relation_oracle.
	* value-query.h (class range_query): Adjust prototypes.
	(range_query::create_relation_oracle): Move to source file.
	(range_query::destroy_relation_oracle): Move to source file.
	* value-relation.cc (relation_oracle::validate_relation): Delete.
	(relation_oracle::register_stmt): Rename to register_relation.
	(relation_oracle::register_edge): Likewise.
	* value-relation.h (register_stmt): Rename to register_relation and
	provide default function in base class.
	(register_edge): Likewise.
	(relation_oracle::query_relation): Provide default in base class.
	(relation_oracle::dump): Likewise.
	(relation_oracle::equiv_set): Likewise.
	(default_relation_oracle): New extenal reference.
	(partial_equiv_set, add_partial_equiv): Move to protected.
	* value-relation.h (relation_oracle::validate_relation): Delete.
---
 gcc/gimple-range-cache.cc | 98 +++
 gcc/gimple-range-fold.cc  | 22 +++--
 gcc/gimple-range-fold.h   |  2 -
 gcc/gimple-range-path.cc  | 22 +++--
 gcc/gimple-range.cc   |  5 +-
 gcc/value-query.cc| 38 +--
 gcc/value-query.h | 31 ++---
 gcc/value-relation.cc | 66 ++
 gcc/value-relation.h  | 32 ++---
 9 files changed, 119 insertions(+), 197 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index cf17a6af9db..020069fd635 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1001,8 +1001,7 @@ ranger_cache::dump_bb (FILE *f, basic_block bb)
 {
   m_gori.gori_map::dump (f, bb, false);
   m_on_entry.dump (f, bb);
-  if (m_oracle)
-m_oracle->dump (f, bb);
+  m_oracle->dump (f, bb);
 }
 
 // Get the global range for NAME, and return in R.  Return false if the
@@ -1437,62 +1436,59 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
   // See if any equivalences can refine it.
   // PR 109462, like 108139 below, a one way equivalence introduced
   // by a PHI node can also be through the definition side.  Disallow it.
-  if (m_oracle)
+  tree equiv_name;
+  relation_kind rel;
+  int prec = TYPE_PRECISION (type);
+  FOR_EACH_PARTIAL_AND_FULL_EQUIV (m_oracle, bb, name, equiv_name, rel)
 	{
-	  tree equiv_name;
-	  relation_kind rel;
-	  int prec = TYPE_PRECISION (type);
-	  FOR_EACH_PARTIAL_AND_FULL_EQUIV (m_oracle, bb, name, equiv_name, rel)
-	{
-	  basic_block equiv_bb = gimple_bb (SSA_NAME_DEF_STMT (equiv_name));
+	  basic_block equiv_bb = gimple_bb (SSA_NAME_DEF_STMT (equiv_name));
 
-	  // Ignore partial equivs that are smaller than this object.
-	  if (rel != VREL_EQ && prec > pe_to_bits (rel))
-		continue;
+	  // Ignore partial equivs that are smaller than this object.
+	  if (rel != VREL_EQ && prec > pe_to_bits (rel))
+	continue;
 
-	  // Check if the equiv has 

[COMMITTED 01/12] - Move all relation queries into relation_oracle.

2024-05-23 Thread Andrew MacLeod
A range-query currently provides a couple of relation query routines, 
plus it also provides direct access to an oracle.   This patch moves 
those queries into the oracle where they should be, and adds the ability 
to create and destroy the basic dominance oracle ranger uses.  This is 
the usual oracle most passes would want, and this provides full access 
to it if ranger has been enabled.  It also allows passes which do not 
use ranger to turn on an oracle and work with it.


Full documentation  for relations and the oracle can be found at:   
https://gcc.gnu.org/wiki/AndrewMacLeod/Relations


Moving the queries into the oracle removes the need to check for a NULL 
pointer on every query, and results in speeding up VRP by about 0.7%


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

From b0cbffd5655b9fc108691c6b15e8eaed4ab9746a Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Mon, 29 Apr 2024 13:32:00 -0400
Subject: [PATCH 01/12] Move all relation queries into relation_oracle.

Move relation queries from range_query object into the relation oracle.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Call
	create_relation_oracle.
	(ranger_cache::~ranger_cache): Call destroy_relation_oracle.
	* gimple-range-fold.cc (fur_stmt::get_phi_operand): Check for
	relation oracle bnefore calling query_relation.
	(fold_using_range::range_of_phi): Likewise.
	* gimple-range-path.cc (path_range_query::path_range_query): Set
	relation oracle pointer to NULL after deleting it.
	* value-query.cc (range_query::~range_query): Ensure any
	relation oracle is destroyed.
	(range_query::query_relation): relocate to relation_oracle object.
	* value-query.h (class range_query): Adjust method proototypes.
	(range_query::create_relation_oracle): New.
	(range_query::destroy_relation_oracle): New.
	* value-relation.cc (relation_oracle::query_relation): Relocate
	from range query class.
	* value-relation.h (Call relation_oracle): New prototypes.
---
 gcc/gimple-range-cache.cc |  9 +++
 gcc/gimple-range-fold.cc  |  9 +--
 gcc/gimple-range-path.cc  |  1 +
 gcc/gimple-range.cc   |  1 +
 gcc/value-query.cc| 52 ++-
 gcc/value-query.h | 32 +++-
 gcc/value-relation.cc | 33 +
 gcc/value-relation.h  |  4 ++-
 8 files changed, 76 insertions(+), 65 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index bdd2832873a..cf17a6af9db 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -957,11 +957,9 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
   m_workback.safe_grow_cleared (last_basic_block_for_fn (cfun));
   m_workback.truncate (0);
   m_temporal = new temporal_cache;
+
   // If DOM info is available, spawn an oracle as well.
-  if (dom_info_available_p (CDI_DOMINATORS))
-  m_oracle = new dom_oracle ();
-else
-  m_oracle = NULL;
+  create_relation_oracle ();
 
   unsigned x, lim = last_basic_block_for_fn (cfun);
   // Calculate outgoing range info upfront.  This will fully populate the
@@ -979,8 +977,7 @@ ranger_cache::ranger_cache (int not_executable_flag, bool use_imm_uses)
 ranger_cache::~ranger_cache ()
 {
   delete m_update;
-  if (m_oracle)
-delete m_oracle;
+  destroy_relation_oracle ();
   delete m_temporal;
   m_workback.release ();
 }
diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index a9c8c4d03e6..41b6d350c40 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -178,7 +178,10 @@ fur_stmt::get_phi_operand (vrange , tree expr, edge e)
 relation_kind
 fur_stmt::query_relation (tree op1, tree op2)
 {
-  return m_query->query_relation (m_stmt, op1, op2);
+  relation_oracle *oracle = m_query->oracle ();
+  if (!oracle)
+return VREL_VARYING;
+  return oracle->query_relation (m_stmt, op1, op2);
 }
 
 // Instantiate a stmt based fur_source with a GORI object.
@@ -860,6 +863,7 @@ fold_using_range::range_of_phi (vrange , gphi *phi, fur_source )
   tree single_arg = NULL_TREE;
   bool seen_arg = false;
 
+  relation_oracle *oracle = src.query()->oracle ();
   // Start with an empty range, unioning in each argument's range.
   r.set_undefined ();
   for (x = 0; x < gimple_phi_num_args (phi); x++)
@@ -880,7 +884,8 @@ fold_using_range::range_of_phi (vrange , gphi *phi, fur_source )
 	  // Likewise, if the incoming PHI argument is equivalent to this
 	  // PHI definition, it provides no new info.  Accumulate these ranges
 	  // in case all arguments are equivalences.
-	  if (src.query ()->query_relation (e, arg, phi_def, false) == VREL_EQ)
+	  if (oracle
+	  && oracle->query_relation (e, arg, phi_def) == VREL_EQ)
 	equiv_range.union_(arg_range);
 	  else
 	r.union_ (arg_range);
diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index f1a12f76144..ef3db10470e 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -60,6 +60,7 @@ 

Re: [C PATCH, v2] Fix for redeclared enumerator initialized with different type [PR115109]

2024-05-23 Thread Joseph Myers
On Tue, 21 May 2024, Martin Uecker wrote:

> > The constraint violated is the general one "If an identifier has no 
> > linkage, there shall be no more than one declaration of the identifier (in 
> > a declarator or type specifier) with the same scope and in the same name 
> > space, except that: ... enumeration constants and tags may be redeclared 
> > as specified in 6.7.3.3 and 6.7.3.4, respectively." (where 6.7.3.3 says 
> > "Enumeration constants can be redefined in the same scope with the same 
> > value as part of a redeclaration of the same enumerated type." - as the 
> > redefinition is not with the same value, the "as specified in 6.7.3.3" is 
> > not satisfied and so the general constraint against redeclarations with no 
> > linkage applies).
> 
> This assumes that the value in question is the one of the initializer and not 
> the
> one after initialization (with no clear rules how this works in this case), 

There are no initializers here.  The constant-expression after '=' in the 
syntax for enumerator is not an initializer, and none of the rules for 
initializers apply to it.  (If the initializer with an out-of-range value 
also gets used in an expression, the constraint on constants would be 
violated, "Each constant shall have a type and the value of a constant 
shall be in the range of representable values for its type." - but the 
constraint on multiple declarations applies whether it's used or not.)

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] Fortran: improve attribute conflict checking [PR93635]

2024-05-23 Thread Mikael Morin

Le 23/05/2024 à 09:49, Mikael Morin a écrit :

Le 13/05/2024 à 09:25, Mikael Morin a écrit :

Le 10/05/2024 à 21:56, Harald Anlauf a écrit :

Am 10.05.24 um 21:48 schrieb Harald Anlauf:

Hi Mikael,

Am 10.05.24 um 11:45 schrieb Mikael Morin:

Le 09/05/2024 à 22:30, Harald Anlauf a écrit :

I'll stop here...


Thanks. Go figure, I have no problem reproducing today.
It's PR99798 (and there is even a patch for it).


this patch has rotted a bit: the type of gfc_release_symbol
has changed to bool, this can be fixed.

Unfortunately, applying the patch does not remove the ICEs here...


Oops, I take that back!  There was an error on my side applying the
patch; and now it does fix the ICEs after correcting that hiccup.

Now the PR99798 patch is ready to be pushed, but I won't be available 
for a few days.  We can finish our discussion on this topic afterwards.



Hello,

I'm coming back to this.
I think either one of Steve's patch or your variant in the PR is a 
better fix for the ICE as a first step; they seem less fragile at least.
Then we can look at a possible reordering of conflict checks as with the 
patch you originally submitted in this thread.



Replying to myself...
It's not a great plan if we want to avoid unnecessary churn in the 
testsuite.


[pushed] c++: deleting array temporary [PR115187]

2024-05-23 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Decaying the array temporary to a pointer and then deleting that crashes in
verify_gimple_stmt, because the TARGET_EXPR is first evaluated inside the
TRY_FINALLY_EXPR, but the cleanup point is outside.  Fixed by using
get_target_expr instead of save_expr.

I also adjust the stabilize_expr comment to prevent me from again thinking
it's a suitable replacement.

PR c++/115187

gcc/cp/ChangeLog:

* init.cc (build_delete): Use get_target_expr instead of save_expr.
* tree.cc (stabilize_expr): Update comment.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/array-prvalue3.C: New test.
---
 gcc/cp/init.cc  | 9 -
 gcc/cp/tree.cc  | 6 +-
 gcc/testsuite/g++.dg/cpp1z/array-prvalue3.C | 8 
 3 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/array-prvalue3.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 906e401974c..52396d87a8c 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -5228,9 +5228,13 @@ build_delete (location_t loc, tree otype, tree addr,
   addr = convert_force (build_pointer_type (type), addr, 0, complain);
 }
 
+  tree addr_expr = NULL_TREE;
   if (deleting)
 /* We will use ADDR multiple times so we must save it.  */
-addr = save_expr (addr);
+{
+  addr_expr = get_target_expr (addr);
+  addr = TARGET_EXPR_SLOT (addr_expr);
+}
 
   bool virtual_p = false;
   if (type_build_dtor_call (type))
@@ -5349,6 +5353,9 @@ build_delete (location_t loc, tree otype, tree addr,
   if (!integer_nonzerop (ifexp))
 expr = build3 (COND_EXPR, void_type_node, ifexp, expr, void_node);
 
+  if (addr_expr)
+expr = cp_build_compound_expr (addr_expr, expr, tf_none);
+
   protected_set_expr_location (expr, loc);
   return expr;
 }
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 4d87661b4ad..0485a618c6c 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -5924,7 +5924,11 @@ decl_storage_duration (tree decl)
*INITP) an expression that will perform the pre-evaluation.  The
value returned by this function is a side-effect free expression
equivalent to the pre-evaluated expression.  Callers must ensure
-   that *INITP is evaluated before EXP.  */
+   that *INITP is evaluated before EXP.
+
+   Note that if EXPR is a glvalue, the return value is a glvalue denoting the
+   same address; this function does not guard against modification of the
+   stored value like save_expr or get_target_expr do.  */
 
 tree
 stabilize_expr (tree exp, tree* initp)
diff --git a/gcc/testsuite/g++.dg/cpp1z/array-prvalue3.C 
b/gcc/testsuite/g++.dg/cpp1z/array-prvalue3.C
new file mode 100644
index 000..f264e46084a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/array-prvalue3.C
@@ -0,0 +1,8 @@
+// PR c++/115187
+// { dg-do compile { target c++17 } }
+
+void f() {
+  using T = int[2];
+  delete T{};  // { dg-warning "deleting array" }
+  // { dg-warning "unallocated object" "" { target *-*-* } .-1 }
+}

base-commit: 0b3b6a8df77b0ae15078402ea5fb933d6fccd585
-- 
2.44.0



Re: [PATCH] c++: mark TARGET_EXPRs for function arguments eliding [PR114707]

2024-05-23 Thread Jason Merrill

On 5/23/24 10:41, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Coming back to our discussion in
:
TARGET_EXPRs that initialize a function argument are not marked
TARGET_EXPR_ELIDING_P even though gimplify_arg drops such TARGET_EXPRs
on the floor.


But only if the TREE_TYPE of the TARGET_EXPR_INITIAL is non-void; I think we 
should check that here too to be parallel.


Perhaps most/all affected TARGET_EXPRs will have been handled earlier in 
the function under the TREE_ADDRESSABLE check, but I wouldn't rely on 
that without an assert.



To work around it, I added a pset to
replace_placeholders_for_class_temp_r, but it would be best to just rely
on TARGET_EXPR_ELIDING_P.

This patch changes the diagnostic we emit in constexpr-diag1.C: instead
of:

constexpr-diag1.C:20:21: error: temporary of non-literal type 'B' in a constant 
expression

we say:

constexpr-diag1.C:20:23: error: call to non-'constexpr' function 'B::B()'

PR c++/114707

gcc/cp/ChangeLog:

* call.cc (convert_for_arg_passing): Call set_target_expr_eliding.
* typeck2.cc (replace_placeholders_for_class_temp_r): Don't use pset.
(digest_nsdmi_init): Call cp_walk_tree_without_duplicates instead of
cp_walk_tree.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-diag1.C: Adjust dg-error.
---
  gcc/cp/call.cc   |  3 +++
  gcc/cp/typeck2.cc| 20 
  gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C |  2 +-
  3 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index ed68eb3c568..750ecf60fd9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -9437,6 +9437,9 @@ convert_for_arg_passing (tree type, tree val, 
tsubst_flags_t complain)
if (complain & tf_warning)
  warn_for_address_of_packed_member (type, val);
  
+  /* gimplify_arg elides TARGET_EXPRs that initialize a function argument.  */

+  set_target_expr_eliding (val);
+
return val;
  }
  
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc

index 06bad4d3303..7782f38da43 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -1409,16 +1409,14 @@ digest_init_flags (tree type, tree init, int flags, 
tsubst_flags_t complain)
 in the context of guaranteed copy elision).  */
  
  static tree

-replace_placeholders_for_class_temp_r (tree *tp, int *, void *data)
+replace_placeholders_for_class_temp_r (tree *tp, int *, void *)
  {
tree t = *tp;
-  auto pset = static_cast *>(data);
  
/* We're looking for a TARGET_EXPR nested in the whole expression.  */

if (TREE_CODE (t) == TARGET_EXPR
/* That serves as temporary materialization, not an initializer.  */
-  && !TARGET_EXPR_ELIDING_P (t)
-  && !pset->add (t))
+  && !TARGET_EXPR_ELIDING_P (t))
  {
tree init = TARGET_EXPR_INITIAL (t);
while (TREE_CODE (init) == COMPOUND_EXPR)
@@ -1433,16 +1431,6 @@ replace_placeholders_for_class_temp_r (tree *tp, int *, 
void *data)
  gcc_checking_assert (!find_placeholders (init));
}
  }
-  /* TARGET_EXPRs initializing function arguments are not marked as eliding,
- even though gimplify_arg drops them on the floor.  Don't go replacing
- placeholders in them.  */
-  else if (TREE_CODE (t) == CALL_EXPR || TREE_CODE (t) == AGGR_INIT_EXPR)
-for (int i = 0; i < call_expr_nargs (t); ++i)
-  {
-   tree arg = get_nth_callarg (t, i);
-   if (TREE_CODE (arg) == TARGET_EXPR && !TARGET_EXPR_ELIDING_P (arg))
- pset->add (arg);
-  }
  
return NULL_TREE;

  }
@@ -1490,8 +1478,8 @@ digest_nsdmi_init (tree decl, tree init, tsubst_flags_t 
complain)
   temporary materialization does not occur when initializing an object
   from a prvalue of the same type, therefore we must not replace the
   placeholder with a temporary object so that it can be elided.  */
-  hash_set pset;
-  cp_walk_tree (, replace_placeholders_for_class_temp_r, , nullptr);
+  cp_walk_tree_without_duplicates (, 
replace_placeholders_for_class_temp_r,
+  nullptr);
  
return init;

  }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C
index ccb8d81adca..da6fc2030bc 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C
@@ -17,4 +17,4 @@ constexpr int b = A().f();   // { dg-error "" }
  
  template 

  constexpr int f (T t) { return 42; }
-constexpr int x = f(B());   // { dg-error "non-literal" }
+constexpr int x = f(B());   // { dg-error "call to non-.constexpr." }

base-commit: 2b2476d4d18c92b8aba3567ebccd2100c2f7c258




Re: [PATCH] c++/modules: Ensure all partial specialisations are tracked [PR114947]

2024-05-23 Thread Jason Merrill

On 5/12/24 09:29, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

Constrained partial specialisations aren't all necessarily tracked on
the instantiation table.  The modules code uses a separate
'partial_specializations' table to track them instead to ensure that
they get walked and emitted when emitting a module, but currently this
does not always happen.

The attached testcase fails in two ways.  First, because the partial
specialisation is just a declaration (and not a definition),
'set_defining_module' never ends up getting called on it and so it never
gets added to the partial specialisation table.  We fix this by ensuring
that when partial specializations are created they always get added, and
so we never miss one. To prevent adding partial specialisations multiple
times we split this out as a new function.


Hmm, I wonder if it would make sense to move 
DECL_TEMPLATE_SPECIALIZATIONS from the template's DECL_SIZE into a hash 
table that the modules code could walk instead of managing its own 
separate table?


The patch is OK.


The second way it fails is that when exporting the primary interface for
a module with partitions, we also re-walk the specializations of all
imported partitions to merge them into a single BMI.  So this patch
ensures that after calling 'match_mergeable_specialization' we also
ensure that if the name came from a partition it gets added to the
specialization table so that a dependency is correctly created for it.

PR c++/114947

gcc/cp/ChangeLog:

* cp-tree.h (set_defining_module_for_partial_spec): Declare.
* module.cc (trees_in::decl_value): Track partial specs coming
from partitions.
(set_defining_module): Don't track partial specialisations here
anymore.
(set_defining_module_for_partial_spec): New function.
* pt.cc (process_partial_specialization): Call it.

gcc/testsuite/ChangeLog:

* g++.dg/modules/partial-4_a.C: New test.
* g++.dg/modules/partial-4_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/cp-tree.h   |  1 +
  gcc/cp/module.cc   | 22 ++
  gcc/cp/pt.cc   |  2 ++
  gcc/testsuite/g++.dg/modules/partial-4_a.C |  8 
  gcc/testsuite/g++.dg/modules/partial-4_b.C |  5 +
  5 files changed, 34 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/partial-4_a.C
  create mode 100644 gcc/testsuite/g++.dg/modules/partial-4_b.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index db098c32f2d..2580bf05fb2 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7418,6 +7418,7 @@ extern unsigned get_importing_module (tree, bool = false) 
ATTRIBUTE_PURE;
  /* Where current instance of the decl got declared/defined/instantiated.  */
  extern void set_instantiating_module (tree);
  extern void set_defining_module (tree);
+extern void set_defining_module_for_partial_spec (tree);
  extern void maybe_key_decl (tree ctx, tree decl);
  extern void propagate_defining_module (tree decl, tree orig);
  extern void remove_defining_module (tree decl);
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 520dd710549..3ca963cb3e9 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -8416,6 +8416,11 @@ trees_in::decl_value ()
  add_mergeable_specialization (!is_type, , decl, spec_flags);
}
  
+  /* When making a CMI from a partition we're going to need to walk partial

+specializations again, so make sure they're tracked.  */
+  if (state->is_partition () && (spec_flags & 2))
+   set_defining_module_for_partial_spec (inner);
+
if (NAMESPACE_SCOPE_P (decl)
  && (mk == MK_named || mk == MK_unique
  || mk == MK_enum || mk == MK_friend_spec)
@@ -19246,13 +19251,22 @@ set_defining_module (tree decl)
  vec_safe_push (class_members, decl);
}
}
-  else if (DECL_IMPLICIT_TYPEDEF_P (decl)
-  && CLASSTYPE_TEMPLATE_SPECIALIZATION (TREE_TYPE (decl)))
-   /* This is a partial or explicit specialization.  */
-   vec_safe_push (partial_specializations, decl);
  }
  }
  
+/* Also remember DECL if it's a newly declared class template partial

+   specialization, because these are not necessarily added to the
+   instantiation tables.  */
+
+void
+set_defining_module_for_partial_spec (tree decl)
+{
+  if (module_p ()
+  && DECL_IMPLICIT_TYPEDEF_P (decl)
+  && CLASSTYPE_TEMPLATE_SPECIALIZATION (TREE_TYPE (decl)))
+vec_safe_push (partial_specializations, decl);
+}
+
  void
  set_originating_module (tree decl, bool friend_p ATTRIBUTE_UNUSED)
  {
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1816bfd1f40..6d33bac90b0 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -5456,6 +5456,8 @@ process_partial_specialization (tree decl)
gcc_checking_assert (!TI_PARTIAL_INFO (tinfo));
TI_PARTIAL_INFO (tinfo) 

Re: [PATCH v1 2/6] Extract ix86 dllimport implementation to mingw

2024-05-23 Thread Uros Bizjak
On Thu, May 23, 2024 at 7:53 PM Evgeny Karpov
 wrote:
>
>
> Thursday, May 23, 2024 10:35 AM
> Uros Bizjak  wrote:
>
> > Richard Sandiford  wrote:
> > >
> > > > This looks good to me apart from a couple of very minor comments
> > > > below, but please get approval from the x86 maintainers as well.  In
> > > > particular, they might prefer to handle ix86_legitimize_pe_coff_symbol
> > > > in some other way.
> > >
> > > Jan and Uros, could you please review x86 refactoring for mingw part?
> >
> > Yes, perhaps legitimize_pe_coff_symbol should be handled similarly to how
> > machopic_legitimize_pic_address is handled, and just use "#if TARGET_PECOFF"
> > at call sites when calling functions from the new winnt-dll.h. This would
> > also allow us to remove the early check for !TARGET_PECOFF in
> > legitimize_pe_coff_symbol.
> >
> > Uros.
>
>
> The function legitimize_pe_coff_symbol is now part of mingw and will not be
> used for Linux targets.
> This is why ix86_legitimize_pe_coff_symbol has been introduced, to be
> available for all platforms.

There is no need for an ix86_legitimize_pe_coff_symbol.
legitimize_pe_coff_symbol is now defined in a header that is not included
by default, so the call sites should use #if TARGET_PECOFF to isolate its
use. Please see how "#if TARGET_MACHO" is used in config/i386/* files for a
similar issue. I think that TARGET_PECOFF should follow this example.
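
A minimal sketch of the call-site pattern described above, assuming nothing about the real i386 backend code: the helper only exists when the target macro is nonzero, and each caller guards its use with the same macro, so no always-available wrapper is needed. The function names ending in _sketch and the string-based "symbol" are hypothetical, and TARGET_PECOFF is defaulted to 0 here only so that the fragment compiles on its own.

```cpp
#include <cassert>
#include <cstring>

// Assumption for this sketch: the build defines TARGET_PECOFF to 0 or 1.
#ifndef TARGET_PECOFF
#define TARGET_PECOFF 0
#endif

#if TARGET_PECOFF
// Stands in for a function declared in a header that is only included
// for PE/COFF targets.  Returns nullptr when no rewrite applies.
static const char *
legitimize_pe_coff_symbol_sketch (const char *sym)
{
  return std::strncmp (sym, "__imp_", 6) == 0 ? nullptr : "__imp_rewritten";
}
#endif

// Generic caller: the PE/COFF-only call is isolated by the #if, so other
// targets never reference the helper.
const char *
legitimize_address_sketch (const char *sym)
{
  assert (sym != nullptr);
#if TARGET_PECOFF
  if (const char *tmp = legitimize_pe_coff_symbol_sketch (sym))
    return tmp;
#endif
  return sym;
}
```

With TARGET_PECOFF left at 0 the guarded call disappears entirely and the caller returns its argument unchanged.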

Uros.


Re: [PATCH 2/2] c++/modules: Remember that header units have CMIs

2024-05-23 Thread Jason Merrill

On 5/23/24 09:29, Nathaniel Shead wrote:

And here's that patch.  As far as I can tell there should be no visible
change anymore, so there aren't any testcases.

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?


OK.


-- >8 --

This appears to be an oversight in the definition of module_has_cmi_p.
This change will allow us to use the function directly in more places
in the future, so that additional work is done only when we know a
module CMI is being generated.
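
To make the effect of the one-line change concrete, here is a small standalone sketch of the bitmask predicate before and after. The flag names mirror those in the diff below, but the numeric values, the explicit kind parameter (instead of a global module_kind) and the has_cmi_old/has_cmi_new names are assumptions made only for this illustration.

```cpp
#include <cassert>

// Hypothetical bit assignments; only the names follow the patch.
enum module_kind_bits : unsigned
{
  MK_PURVIEW = 1u << 0,
  MK_INTERFACE = 1u << 1,
  MK_PARTITION = 1u << 2,
  MK_HEADER = 1u << 3
};

// Before the patch: header units are not reported as having a CMI.
constexpr bool
has_cmi_old (unsigned kind)
{
  return (kind & (MK_INTERFACE | MK_PARTITION)) != 0;
}

// After the patch: header units are included.
constexpr bool
has_cmi_new (unsigned kind)
{
  return (kind & (MK_INTERFACE | MK_PARTITION | MK_HEADER)) != 0;
}

// Small usage check: only the header-unit answer differs.
void
demo_has_cmi ()
{
  assert (!has_cmi_old (MK_HEADER) && has_cmi_new (MK_HEADER));
  assert (has_cmi_old (MK_INTERFACE) == has_cmi_new (MK_INTERFACE));
}
```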

gcc/cp/ChangeLog:

* cp-tree.h (module_has_cmi_p): Also include header units.
(module_maybe_has_cmi_p): Update comment.
* module.cc (set_defining_module): Only need to track
declarations for later exporting if the module may have a CMI.
* name-lookup.cc (pushdecl): Likewise.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/cp-tree.h  | 7 +++
  gcc/cp/module.cc  | 2 +-
  gcc/cp/name-lookup.cc | 2 +-
  3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index ba9e848c177..9472759d3c8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7381,7 +7381,7 @@ inline bool module_interface_p ()
  inline bool module_partition_p ()
  { return module_kind & MK_PARTITION; }
  inline bool module_has_cmi_p ()
-{ return module_kind & (MK_INTERFACE | MK_PARTITION); }
+{ return module_kind & (MK_INTERFACE | MK_PARTITION | MK_HEADER); }
  
  inline bool module_purview_p ()

  { return module_kind & MK_PURVIEW; }
@@ -7393,9 +7393,8 @@ inline bool named_module_purview_p ()
  inline bool named_module_attach_p ()
  { return named_module_p () && module_attach_p (); }
  
-/* We don't know if this TU will have a CMI while parsing the GMF,

-   so tentatively assume that it might, for the purpose of determining
-   whether no-linkage decls could be used by an importer.  */
+/* Like module_has_cmi_p, but tentatively assumes that this TU may have a
+   CMI if we haven't seen the module-declaration yet.  */
  inline bool module_maybe_has_cmi_p ()
  { return module_has_cmi_p () || (named_module_p () && !module_purview_p ()); }
  
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc

index 520dd710549..8639ed6f1a2 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -19216,7 +19216,7 @@ set_defining_module (tree decl)
gcc_checking_assert (!DECL_LANG_SPECIFIC (decl)
   || !DECL_MODULE_IMPORT_P (decl));
  
-  if (module_p ())

+  if (module_maybe_has_cmi_p ())
  {
/* We need to track all declarations within a module, not just those
 in the module purview, because we don't necessarily know yet if
diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 78f08acffaa..f1f8c19feb1 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -4103,7 +4103,7 @@ pushdecl (tree decl, bool hiding)
  
  	  if (level->kind == sk_namespace

  && TREE_PUBLIC (level->this_entity)
- && module_p ())
+ && module_maybe_has_cmi_p ())
maybe_record_mergeable_decl (slot, name, decl);
}
  }




Re: [PATCH 1/2] c++/modules: Fix treatment of unnamed types

2024-05-23 Thread Jason Merrill

On 5/23/24 09:27, Nathaniel Shead wrote:

On Mon, May 20, 2024 at 06:00:09PM -0400, Jason Merrill wrote:

On 5/17/24 02:14, Nathaniel Shead wrote:

On Tue, May 14, 2024 at 06:21:48PM -0400, Jason Merrill wrote:

On 5/12/24 22:58, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?


OK.



I realised as I was looking over this again that I might have spoken too
soon with the header unit example being supported. Doing the following:

// a.H
struct { int y; } s;
decltype(s) f(decltype(s));  // { dg-error "used but never defined" }
inline auto x = f({ 123 });
// b.C
struct {} unrelated;
import "a.H";
decltype(s) f(decltype(s) x) {
  return { 456 + x.y };
}

// c.C
import "linkage-3_a.H";
int main() { auto a = x.y; }

Actually does fail to link, because in 'c.C' we call 'f(.anon_0)' but
the definition in 'b.C' is f(.anon_1).

I don't think this is fixable, so I don't think this direction is
workable.


Since namespace-scope anonymous types are TU-local, we don't need to support
that for proper modules, but it's not clear to me that we don't need to
support it for header units.

OTOH, https://eel.is/c++draft/module#import-5.3 allows c.C to import a
different header unit than b.C, in which case the type is different and x
violates the odr.



Right; I think at this stage I don't know how to support this for header
units (and even for module interface units it doesn't actually work;
more on this below), so I think saying that this is actually an ODR
violation is OK.


That said, I think that it might still be worth making header modules
satisfy 'module_has_cmi_p', since that is true to the name, and will be
useful in other places we currently use 'module_p ()': in which case we
could instead make all the callers in 'no_linkage_check' do
'module_maybe_has_cmi_p () && !header_module_p ()'; something like the
following, perhaps?


If we need that condition, it should be its own predicate rather than
expecting callers to do that combined check.

But it's not clear to me how this is different from a type in the GMF of a
named module, which is exactly the maybe_has_cmi case; there we could again
see a different version of the type if another TU includes the header.

Jason



This made me go back and double-check for named modules and it actually
does fail as well; the following sample ICEs, even:

   export module M;
   struct {} s;
   int h(decltype(s));
   int x = h(s);  // ICE in write_unnamed_type_name, cp/mangle.cc:1806

So I think maybe the way to go here is to just not treat unnamed types
as something that could possibly be accessed from a different TU, like
the below.  Then we don't need to do the special handling for header
units, since as you say, they're not materially different anyway.
Thoughts?


Sounds good.


(And I've moved the original change to 'module_has_cmi_p' to a separate
patch given it's somewhat unrelated now.)

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk (and
maybe 14.2)?

-- >8 --

In r14-9530 we relaxed "depending on type with no-linkage" errors for
declarations that could actually be accessed from different TUs anyway.
However, this also enabled it for unnamed types, which never work.

In a normal module interface, an unnamed type is TU-local by
[basic.link] p15.2, and so cannot be exposed or the program is
ill-formed.  We don't yet implement this checking but we should assume
that we will later; currently supporting this actually causes ICEs when
attempting to create the mangled name in some situations.

For a header unit, by [module.import] p5.3 it is unspecified whether two
TUs importing a header unit providing such a declaration are importing
the same header unit.  In this case, we would require name mangling
changes to somehow allow the (anonymous) type exported by such a header
unit to correspond across different TUs in the presence of other
anonymous declarations, so for this patch just assume that this case
would be an ODR violation instead.

diff --git a/gcc/testsuite/g++.dg/modules/linkage-2.C 
b/gcc/testsuite/g++.dg/modules/linkage-2.C
index eb4d7b051af..f69bd7ff728 100644
--- a/gcc/testsuite/g++.dg/modules/linkage-2.C
+++ b/gcc/testsuite/g++.dg/modules/linkage-2.C
@@ -13,14 +13,15 @@ namespace {
  return A{};
}
decltype(f()) g();  // { dg-error "used but never defined" }
-
-  struct {} s;
-  decltype(s) h();  // { dg-error "used but never defined" }
  }
  
  export void use() {

g();
-  h();


The above changes seem undesirable; we should still give that error.


+// Additionally, unnamed types have no linkage but are also TU-local,
+// and thus cannot be in a module interface unit.
+// We don't yet implement this checking however.
+struct {} s;  // { dg-error "TU-local" "" { xfail *-*-* } }


The comment should clarify that the problem is the non-TU-local variable 
's' exposing the TU-local type.  If s is also TU-local (as the one in 
the anonymous 

[PATCH, v2] Fortran: improve attribute conflict checking [PR93635]

2024-05-23 Thread Harald Anlauf

Hi Mikael,

On 5/23/24 09:49, Mikael Morin wrote:

On 13/05/2024 at 09:25, Mikael Morin wrote:

On 10/05/2024 at 21:56, Harald Anlauf wrote:

On 10.05.24 at 21:48, Harald Anlauf wrote:

Hi Mikael,

On 10.05.24 at 11:45, Mikael Morin wrote:

On 09/05/2024 at 22:30, Harald Anlauf wrote:

I'll stop here...


Thanks. Go figure, I have no problem reproducing today.
It's PR99798 (and there is even a patch for it).


this patch has bit-rotted a little: the type of gfc_release_symbol
has changed to bool, but this can be fixed.

Unfortunately, applying the patch does not remove the ICEs here...


Oops, I take that back!  There was an error on my side applying the
patch; and now it does fix the ICEs after correcting that hiccup.

Now the PR99798 patch is ready to be pushed, but I won't be available 
for a few days.  We can finish our discussion on this topic afterwards.



Hello,

I'm coming back to this.
I think either Steve's patch or your variant in the PR is a 
better fix for the ICE as a first step; they seem less fragile at least.
Then we can look at a possible reordering of conflict checks, as with the 
patch you originally submitted in this thread.


like the attached variant?

Harald


Mikael

From 68d73e6e2efa692afff10ea16eafb88236cbe69c Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 23 May 2024 21:13:00 +0200
Subject: [PATCH] Fortran: improve attribute conflict checking [PR93635]

gcc/fortran/ChangeLog:

	PR fortran/93635
	* symbol.cc (conflict_std): Helper function for reporting attribute
	conflicts depending on the Fortran standard version.
	(conf_std): Helper macro for checking standard-dependent conflicts.
	(gfc_check_conflict): Use it.

gcc/testsuite/ChangeLog:

	PR fortran/93635
	* gfortran.dg/c-interop/c1255-2.f90: Adjust pattern.
	* gfortran.dg/pr87907.f90: Likewise.
	* gfortran.dg/pr93635.f90: New test.

Co-authored-by: Steven G. Kargl 
---
 gcc/fortran/symbol.cc | 63 +--
 .../gfortran.dg/c-interop/c1255-2.f90 |  4 +-
 gcc/testsuite/gfortran.dg/pr87907.f90 |  8 ++-
 gcc/testsuite/gfortran.dg/pr93635.f90 | 19 ++
 4 files changed, 54 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr93635.f90

diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
index 0a1646def67..5db3c887127 100644
--- a/gcc/fortran/symbol.cc
+++ b/gcc/fortran/symbol.cc
@@ -407,18 +407,36 @@ gfc_check_function_type (gfc_namespace *ns)
 
 / Symbol attribute stuff */
 
+/* Older standards produced conflicts for some attributes that are allowed
+   in newer standards.  Check for the conflict and issue an error depending
+   on the standard in play.  */
+
+static bool
+conflict_std (int standard, const char *a1, const char *a2, const char *name,
+	  locus *where)
+{
+  if (name == NULL)
+{
+  return gfc_notify_std (standard, "%s attribute conflicts "
+			 "with %s attribute at %L", a1, a2,
+			 where);
+}
+  else
+{
+  return gfc_notify_std (standard, "%s attribute conflicts "
+			 "with %s attribute in %qs at %L",
+			 a1, a2, name, where);
+}
+}
+
 /* This is a generic conflict-checker.  We do this to avoid having a
single conflict in two places.  */
 
 #define conf(a, b) if (attr->a && attr->b) { a1 = a; a2 = b; goto conflict; }
 #define conf2(a) if (attr->a) { a2 = a; goto conflict; }
-#define conf_std(a, b, std) if (attr->a && attr->b)\
-  {\
-a1 = a;\
-a2 = b;\
-standard = std;\
-goto conflict_std;\
-  }
+#define conf_std(a, b, std) if (attr->a && attr->b \
+&& !conflict_std (std, a, b, name, where)) \
+return false;
 
 bool
 gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where)
@@ -451,7 +469,6 @@ gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where)
 		"OACC DECLARE DEVICE_RESIDENT";
 
   const char *a1, *a2;
-  int standard;
 
   if (attr->artificial)
 return true;
@@ -460,20 +477,10 @@ gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where)
 where = _current_locus;
 
   if (attr->pointer && attr->intent != INTENT_UNKNOWN)
-{
-  a1 = pointer;
-  a2 = intent;
-  standard = GFC_STD_F2003;
-  goto conflict_std;
-}
+conf_std (pointer, intent, GFC_STD_F2003);
 
-  if (attr->in_namelist && (attr->allocatable || attr->pointer))
-{
-  a1 = in_namelist;
-  a2 = attr->allocatable ? allocatable : pointer;
-  standard = GFC_STD_F2003;
-  goto conflict_std;
-}
+  conf_std (in_namelist, allocatable, GFC_STD_F2003);
+  conf_std (in_namelist, pointer, GFC_STD_F2003);
 
   /* Check for attributes not allowed in a BLOCK DATA.  */
   if (gfc_current_state () == COMP_BLOCK_DATA)
@@ -922,20 +929,6 @@ conflict:
 	   a1, a2, 

[to-be-committed] [RISC-V] Use bclri in constant synthesis

2024-05-23 Thread Jeff Law
So this is conceptually similar to how we handled direct generation of 
bseti for constant synthesis, but this time for bclr.


In the bclr case, we already have an expander for AND.  So we just 
needed to adjust the predicate to accept another class of constant 
operands (those with a single bit clear).


With that in place, constant synthesis is adjusted so that it counts the 
number of bits clear in the high 33 bits of a 64-bit word.  If that 
number is small relative to the current best cost, then we try to 
generate the constant with a lui-based sequence for the low half, which 
implicitly sets the upper 32 bits as well.  Then we bclri one or more of 
those upper 33 bits.


So as an example, this code goes from 4 instructions down to 3:


unsigned long foo_0xfffbf7ff(void) { return 0xfffbf7ffUL; }




Note the use of 33 bits above.  That's meant to capture cases like this:



unsigned long foo_0xfffd77ff(void) { return 0xfffd77ffUL; }




We can use lui+addi+bclri+bclri to synthesize that in 4 instructions 
instead of 5.
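
The idea described above can be sketched outside the compiler as follows. This is only an illustration of the decomposition (force bits 31..63 on, then list the bits that must be cleared again); it does no cost comparison and is not the riscv_build_integer_1 code. BclrPlan, plan_bclr, apply_plan and the kHigh33 mask are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed mask covering bits 31..63, i.e. the "high 33 bits".
constexpr std::uint64_t kHigh33 = ~std::uint64_t{0x7fffffff};

struct BclrPlan
{
  std::uint64_t base;        // value with bits 31..63 forced on
  std::vector<int> cleared;  // bit numbers to clear afterwards, low to high
};

// Split VALUE into an "all upper bits set" constant plus the list of
// upper bits that a bclri-style step would have to clear again.
BclrPlan
plan_bclr (std::uint64_t value)
{
  BclrPlan plan{value | kHigh33, {}};
  for (int bit = 31; bit < 64; ++bit)
    if (((value >> bit) & 1) == 0)
      plan.cleared.push_back (bit);
  return plan;
}

// Replay the plan: start from the base and clear one bit per step.
std::uint64_t
apply_plan (const BclrPlan &plan)
{
  std::uint64_t v = plan.base;
  for (int bit : plan.cleared)
    {
      assert (bit >= 31 && bit < 64);
      v &= ~(std::uint64_t{1} << bit);
    }
  return v;
}
```

For a value such as 0xFFFFFFFBFFFFF7FF (an example chosen for this sketch, not taken from the testcases), the base has all upper bits set and the plan holds the single bit 34, so replaying the plan reproduces the original value with one clear step.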





I'm including a handful of cases covering the two basic ideas above that 
were found by the testing code.


And, no, we're not done yet.  I see at least one more notable idiom 
missing before exploring zbkb's potential to improve things.


Tested in my tester and waiting on Rivos CI system before moving forward.

jeff


gcc/

* config/riscv/predicates.md (arith_operand_or_mode_mask): Renamed to..
(arith_or_mode_mask_or_zbs_operand): New predicate.
* config/riscv/riscv.md (and3): Update predicate for operand 2.
* config/riscv/riscv.cc (riscv_build_integer_1): Use bclri to clear
bits, particularly bits 31..63 when profitable to do so.

gcc/testsuite/

* gcc.target/riscv/synthesis-6.c: New test.

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 8948fbfc363..c1c693c7617 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -27,12 +27,6 @@ (define_predicate "arith_operand"
   (ior (match_operand 0 "const_arith_operand")
(match_operand 0 "register_operand")))
 
-(define_predicate "arith_operand_or_mode_mask"
-  (ior (match_operand 0 "arith_operand")
-   (and (match_code "const_int")
-(match_test "UINTVAL (op) == GET_MODE_MASK (HImode)
-|| UINTVAL (op) == GET_MODE_MASK (SImode)"
-
 (define_predicate "lui_operand"
   (and (match_code "const_int")
(match_test "LUI_OPERAND (INTVAL (op))")))
@@ -398,6 +392,14 @@ (define_predicate "not_single_bit_mask_operand"
   (and (match_code "const_int")
(match_test "SINGLE_BIT_MASK_OPERAND (~UINTVAL (op))")))
 
+(define_predicate "arith_or_mode_mask_or_zbs_operand"
+  (ior (match_operand 0 "arith_operand")
+   (and (match_test "TARGET_ZBS")
+   (match_operand 0 "not_single_bit_mask_operand"))
+   (and (match_code "const_int")
+(match_test "UINTVAL (op) == GET_MODE_MASK (HImode)
+|| UINTVAL (op) == GET_MODE_MASK (SImode)"
+
 (define_predicate "const_si_mask_operand"
   (and (match_code "const_int")
(match_test "(INTVAL (op) & (GET_MODE_BITSIZE (SImode) - 1))
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 85df5b7ab49..3b32b515fac 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -893,6 +893,40 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
  codes[1].use_uw = false;
  cost = 2;
}
+
+  /* If LUI/ADDI are going to set bits 32..63 and we need a small
+number of them cleared, we might be able to use bclri profitably. 
+
+Note we may allow clearing of bit 31 using bclri.  There's a class
+of constants with that bit clear where this helps.  */
+  else if (TARGET_64BIT
+  && TARGET_ZBS
+  && (32 - popcount_hwi (value & HOST_WIDE_INT_C 
(0x8000))) + 1 < cost)
+   {
+ /* Turn on all those upper bits and synthesize the result.  */
+ HOST_WIDE_INT nval = value | HOST_WIDE_INT_C (0x8000);
+ alt_cost = riscv_build_integer_1 (alt_codes, nval, mode);
+
+ /* Now iterate over the bits we want to clear until the cost is
+too high or we're done.  */
+ nval = value ^ HOST_WIDE_INT_C (-1);
+ nval &= HOST_WIDE_INT_C (~0x7fff);
+ while (nval && alt_cost < cost)
+   {
+ HOST_WIDE_INT bit = ctz_hwi (nval);
+ alt_codes[alt_cost].code = AND;
+ alt_codes[alt_cost].value = ~(1UL << bit);
+ alt_codes[alt_cost].use_uw = false;
+ alt_cost++;
+ nval &= ~(1UL << bit);
+   }
+
+ if (alt_cost <= cost)
+   {
+ memcpy (codes, alt_codes, sizeof (alt_codes));
+ cost = alt_cost;
+   }
+   }
 }
 
   if (cost > 2 && 

Re: [PATCH 12/13] rs6000, remove __builtin_vsx_xvcmpeqsp built-in

2024-05-23 Thread Carl Love



On 5/13/24 22:37, Kewen.Lin wrote:
> Hi,
> 
> on 2024/4/20 05:18, Carl Love wrote:
>> rs6000, remove __builtin_vsx_xvcmpeqsp built-in
>>
>> The built-in __builtin_vsx_xvcmpeqsp is a duplicate of the overloaded
>> vec_cmpeq built-in.  The built-in is undocumented.  The built-in and
>> the test cases are removed.
>>
>> gcc/ChangeLog:
>>  * config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcmpeqsp):
>>  Remove built-in definition.
>>
> 
> Ah, you separated this __builtin_vsx_xvcmpeqsp from the one for
> __builtin_vsx_xvcmpeqsp_p, it's fine, please ignore the comments for
> considering this __builtin_vsx_xvcmpeqsp in my previous reply to 11/13.
> 
> 
>> gcc/testsuite/ChangeLog:
>>  * vsx-builtin-3.c (do_cmp): Remove test case for
>>  __builtin_vsx_xvcmpeqsp.
>> ---
>>  gcc/config/rs6000/rs6000-builtins.def| 3 ---
>>  gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c | 2 --
>>  2 files changed, 5 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
>> b/gcc/config/rs6000/rs6000-builtins.def
>> index 2f6149edd5f..19d05b8043a 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1613,9 +1613,6 @@
>>const signed int __builtin_vsx_xvcmpeqdp_p (signed int, vd, vd);
>>  XVCMPEQDP_P vector_eq_v2df_p {pred}
>>  
>> -  const vf __builtin_vsx_xvcmpeqsp (vf, vf);
>> -XVCMPEQSP vector_eqv4sf {}
>> -
>>const vd __builtin_vsx_xvcmpgedp (vd, vd);
>>  XVCMPGEDP vector_gev2df {}
>>  
>> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c 
>> b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
>> index 35ea31b2616..245893dc0e3 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
>> @@ -27,7 +27,6 @@
>>  /* { dg-final { scan-assembler "xvcmpeqdp" } } */
>>  /* { dg-final { scan-assembler "xvcmpgtdp" } } */
>>  /* { dg-final { scan-assembler "xvcmpgedp" } } */
>> -/* { dg-final { scan-assembler "xvcmpeqsp" } } */
>>  /* { dg-final { scan-assembler "xvcmpgtsp" } } */
>>  /* { dg-final { scan-assembler "xvcmpgesp" } } */
>>  /* { dg-final { scan-assembler "xxsldwi" } } */
>> @@ -112,7 +111,6 @@ int do_cmp (void)
>>d[i][0] = __builtin_vsx_xvcmpgtdp (d[i][1], d[i][2]); i++;
>>d[i][0] = __builtin_vsx_xvcmpgedp (d[i][1], d[i][2]); i++;
>>  
>> -  f[i][0] = __builtin_vsx_xvcmpeqsp (f[i][1], f[i][2]); i++;
>>f[i][0] = __builtin_vsx_xvcmpgtsp (f[i][1], f[i][2]); i++;
>>f[i][0] = __builtin_vsx_xvcmpgesp (f[i][1], f[i][2]); i++;
>>return i;
> 
> As the other in this patch series, I prefer to change it with
> vec_cmpeq here, OK for trunk with this tweaked (also keep the
> scan there), thanks!

When I went to change the test case I noticed that __builtin_vsx_xvcmpeqsp and 
vec_cmpeq both return a vector where the element is all ones if the comparison 
is true and all zeros if false.  However, the return type for 
__builtin_vsx_xvcmpeqsp is vector float but vec_cmpeq returns vector bool.

The PVIPR says the vec_cmpeq built-in returns a value where each bit in the 
vector element is a 1 if the comparison is equal and 0 otherwise.  However, the 
documented result is a vector bool int for the floating point comparison.  The 
return value for __builtin_vsx_xvcmpeqsp was vector float.  

So, the "bit values" returned are the same but not of the same type. So 
technically vec_cmpeq is not a drop-in replacement for __builtin_vsx_xvcmpeqsp.
Given that, perhaps we should not be removing __builtin_vsx_xvcmpeqsp?
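
The "same bits, different type" point can be illustrated with a plain C++ analog that does not use the AltiVec/VSX built-ins at all. In this sketch a four-lane comparison produces an integer mask per lane, and the same 16 bytes are then viewed as floats; Mask4, Float4, cmpeq_mask and mask_as_float are hypothetical names used only here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Mask4 = std::array<std::uint32_t, 4>;
using Float4 = std::array<float, 4>;

// Lane-wise equality: all ones where equal, all zeros otherwise.
Mask4
cmpeq_mask (const Float4 &a, const Float4 &b)
{
  Mask4 m{};
  for (std::size_t i = 0; i < m.size (); ++i)
    m[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
  return m;
}

// View the same bits as floats.  The bytes are unchanged; only the type
// differs (on IEEE-754 an all-ones lane reads as a NaN).
Float4
mask_as_float (const Mask4 &m)
{
  static_assert (sizeof (float) == sizeof (std::uint32_t),
		 "sketch assumes 32-bit float");
  Float4 f{};
  std::memcpy (f.data (), m.data (), sizeof (float) * f.size ());
  return f;
}

// Usage check: the two views are byte-for-byte identical.
void
demo_same_bits ()
{
  Float4 a{{1.0f, 2.0f, 3.0f, 4.0f}};
  Float4 b{{1.0f, 0.0f, 3.0f, 0.0f}};
  Mask4 m = cmpeq_mask (a, b);
  Float4 f = mask_as_float (m);
  assert (std::memcmp (f.data (), m.data (), sizeof (float) * 4) == 0);
}
```

A caller that stored the result in a float-typed array, as the test case does, therefore needs a differently typed destination once the result type becomes a boolean vector, even though no bit changes.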

The testcase has to be changed from:
 f[i][0] = __builtin_vsx_xvcmpeqsp (f[i][1], f[i][2]); i++;
to:
 bi[i][0] = vec_cmpeq (f[i][1], f[i][2]); i++;

I am thinking we should drop this patch from the series, i.e. don't remove 
__builtin_vsx_xvcmpeqsp.  Thoughts?

 Carl 
 

> 
> BR,
> Kewen
> 


[PATCH] c++: alias CTAD and copy deduction guide [PR115198]

2024-05-23 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk/14?

-- >8 --

Here we're neglecting to update DECL_NAME during the alias CTAD guide
transformation, which causes copy_guide_p to return false for the
transformed copy deduction guide since DECL_NAME is still __dguide_C
with TREE_TYPE C but it should be __dguide_A with TREE_TYPE A
(equivalently C).  This ultimately results in ambiguity during
overload resolution between the copy deduction guide vs copy ctor guide.

This patch makes us update DECL_NAME of a transformed guide accordingly
during alias CTAD.  This eventually needs to be done for inherited CTAD
too, but it's not clear what identifier to use there since it has to be
unique for each derived/base pair.  For

  template struct A { ... };
  template struct B : A { using A::A; }

at first glance it'd be reasonable to give inherited guides a name of
__dguide_B with TREE_TYPE A, but since that name is already
used for B's own guides its TREE_TYPE is already B.

PR c++/115198

gcc/cp/ChangeLog:

* pt.cc (alias_ctad_tweaks): Update DECL_NAME of a transformed
guide during alias CTAD.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-alias22.C: New test.
---
 gcc/cp/pt.cc   |  9 -
 .../g++.dg/cpp2a/class-deduction-alias22.C | 14 ++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 0c4d96cf768..58873057abc 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -30304,13 +30304,14 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
  any).  */
 
   enum { alias, inherited } ctad_kind;
-  tree atype, fullatparms, utype;
+  tree atype, fullatparms, utype, name;
   if (TREE_CODE (tmpl) == TEMPLATE_DECL)
 {
   ctad_kind = alias;
   atype = TREE_TYPE (tmpl);
   fullatparms = DECL_TEMPLATE_PARMS (tmpl);
   utype = DECL_ORIGINAL_TYPE (DECL_TEMPLATE_RESULT (tmpl));
+  name = dguide_name (tmpl);
 }
   else
 {
@@ -30318,6 +30319,10 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
   atype = NULL_TREE;
   fullatparms = TREE_PURPOSE (tmpl);
   utype = TREE_VALUE (tmpl);
+  /* FIXME: What name should we give inherited guides?  It needs to be
+unique to the derived/base pair so that we don't clobber an earlier
+setting of TREE_TYPE.  */
+  name = NULL_TREE;
 }
 
   tsubst_flags_t complain = tf_warning_or_error;
@@ -30413,6 +30418,8 @@ alias_ctad_tweaks (tree tmpl, tree uguides)
}
  if (g == error_mark_node)
continue;
+ if (name)
+   DECL_NAME (g) = name;
  if (nfparms == 0)
{
  /* The targs are all non-dependent, so g isn't a template.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
new file mode 100644
index 000..9c6c841166a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-alias22.C
@@ -0,0 +1,14 @@
+// PR c++/115198
+// { dg-do compile { target c++20 } }
+
+template
+struct C {
+  C() = default;
+  C(const C&) = default;
+};
+
+template
+using A = C;
+
+C c;
+A a = c; // { dg-bogus "ambiguous" }
-- 
2.45.1.216.g4365c6fcf9



Re: [PATCH v1 2/6] Extract ix86 dllimport implementation to mingw

2024-05-23 Thread Evgeny Karpov

Thursday, May 23, 2024 10:35 AM
Uros Bizjak  wrote:

> Richard Sandiford  wrote:
> >
> > > This looks good to me apart from a couple of very minor comments
> > > below, but please get approval from the x86 maintainers as well.  In
> > > particular, they might prefer to handle ix86_legitimize_pe_coff_symbol
> > > in some other way.
> >
> > Jan and Uros, could you please review x86 refactoring for mingw part?
>
> Yes, perhaps legitimize_pe_coff_symbol should be handled similar to how
> machopic_legitimize_pic_address is handled, and just use "#if TARGET_PECOFF"
> at call sites when calling functions from the new winnt-dll.h. This would also
> allow us to remove the early check for !TARGET_PECOFF in
> legitimize_pe_coff_symbol.
> 
> Uros.


The function legitimize_pe_coff_symbol is now part of mingw and will not be
used for Linux targets.  This is why ix86_legitimize_pe_coff_symbol has been
introduced: it is available on all platforms.

Regards,
Evgeny


Re: [PATCH] Avoid vector -Wfree-nonheap-object warnings

2024-05-23 Thread François Dumont


On 23/05/2024 15:31, Jonathan Wakely wrote:

On 23/05/24 06:55 +0200, François Dumont wrote:

As explained in this email:

https://gcc.gnu.org/pipermail/libstdc++/2024-April/058552.html

I ran into -Wfree-nonheap-object warnings because of my enhancements to the
algorithms.


So here is a patch to extend the usage of the _Guard type to other 
parts of vector.


Nice, that fixes the warning you were seeing?


Yes ! I indeed forgot to say so :-)




We recently got a bug report about -Wfree-nonheap-object in
std::vector, but that is coming from _M_realloc_append which already
uses the RAII guard :-(
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115016


Note that I also had to move the call to __uninitialized_copy_a before
assigning this->_M_impl._M_start to get rid of the -Wfree-nonheap-object
warning. But _M_realloc_append is already doing potentially throwing
operations before assigning this->_M_impl, so it must be something else.


Though it made me notice another occurrence of _Guard in this method. It is
now replaced too in this new patch.
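
For readers following along, the try/catch-to-guard change discussed in this
thread can be sketched outside libstdc++ roughly as follows. This is a
simplified illustration under stated assumptions, not the patch itself:
alloc_guard, counting_alloc, live_blocks and allocate_and_fill are
hypothetical names made up for the example; only std::allocator_traits is a
real library facility. The reset in release() uses a value-initialized
pointer() so that it would also work with allocator pointer types that cannot
be assigned from 0.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <stdexcept>

// Hypothetical bookkeeping: number of blocks currently allocated.
static int live_blocks = 0;

// Hypothetical minimal allocator that counts live allocations.
template<typename T>
struct counting_alloc
{
  typedef T value_type;

  counting_alloc() {}
  template<typename U> counting_alloc(const counting_alloc<U>&) {}

  T* allocate(std::size_t n)
  {
    T* p = static_cast<T*>(::operator new(n * sizeof(T)));
    ++live_blocks;
    return p;
  }

  void deallocate(T* p, std::size_t)
  {
    ::operator delete(p);
    --live_blocks;
  }
};

// Simplified stand-in for the guard: owns storage until release() is called.
template<typename Alloc>
struct alloc_guard
{
  typedef typename std::allocator_traits<Alloc>::pointer pointer;

  pointer storage;   // Storage to deallocate
  std::size_t len;
  Alloc& alloc;

  alloc_guard(pointer s, std::size_t l, Alloc& a)
  : storage(s), len(l), alloc(a)
  { }

  ~alloc_guard()
  {
    if (storage)
      std::allocator_traits<Alloc>::deallocate(alloc, storage, len);
  }

  pointer release()
  {
    pointer res = storage;
    storage = pointer();   // value-initialize rather than assign 0
    return res;
  }

private:
  // Non-copyable, declared but not defined (C++98 style).
  alloc_guard(const alloc_guard&);
  alloc_guard& operator=(const alloc_guard&);
};

// Allocate n ints and fill them; 'fail' simulates a throwing element copy.
inline int*
allocate_and_fill(counting_alloc<int>& a, std::size_t n, int value, bool fail)
{
  alloc_guard<counting_alloc<int> > guard(a.allocate(n), n, a);
  for (std::size_t i = 0; i < n; ++i)
    ::new (static_cast<void*>(guard.storage + i)) int(value);
  if (fail)
    throw std::runtime_error("simulated failure");
  return guard.release();   // success: the caller now owns the storage
}
```

On the failure path the guard's destructor returns the block to the
allocator during unwinding, which is what the removed __try/__catch blocks
did by hand; on success release() hands the pointer over and the destructor
does nothing.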


    libstdc++: Use RAII to replace try/catch blocks

    Move _Guard into the std::vector declaration and use it to guard all
    calls to vector _M_allocate.

    Doing so gives the compiler more visibility into what is done with the
    pointers, so it no longer raises the -Wfree-nonheap-object warning.

    libstdc++-v3/ChangeLog:

    * include/bits/vector.tcc (_Guard): Move all the nested
    duplicated class...
    * include/bits/stl_vector.h (_Guard_alloc): ...here.
    (_M_allocate_and_copy): Use latter.
    (_M_initialize_dispatch): Likewise and set _M_finish first
    from the result of __uninitialized_fill_n_a that can throw.
    (_M_range_initialize): Likewise.

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 31169711a48..4ea74e3339a 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1607,6 +1607,39 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  clear() _GLIBCXX_NOEXCEPT
  { _M_erase_at_end(this->_M_impl._M_start); }

+    private:
+  // RAII guard for allocated storage.
+  struct _Guard


If it's being defined at class scope instead of locally in a member
function, I think a better name would be good. Maybe _Ptr_guard or
_Dealloc_guard or something.

_Guard_alloc chosen.



+  {
+    pointer _M_storage;    // Storage to deallocate
+    size_type _M_len;
+    _Base& _M_vect;
+
+    _GLIBCXX20_CONSTEXPR
+    _Guard(pointer __s, size_type __l, _Base& __vect)
+    : _M_storage(__s), _M_len(__l), _M_vect(__vect)
+    { }
+
+    _GLIBCXX20_CONSTEXPR
+    ~_Guard()
+    {
+  if (_M_storage)
+    _M_vect._M_deallocate(_M_storage, _M_len);
+    }
+
+    _GLIBCXX20_CONSTEXPR
+    pointer
+    _M_release()
+    {
+  pointer __res = _M_storage;
+  _M_storage = 0;


I don't think the NullablePointer requirements include assigning 0,
only from nullptr, which isn't valid in C++98.

https://en.cppreference.com/w/cpp/named_req/NullablePointer

Please use _M_storage = pointer() instead.


I forgot about user fancy pointer, fixed.





+  return __res;
+    }
+
+  private:
+    _Guard(const _Guard&);
+  };
+
    protected:
  /**
   *  Memory expansion handler.  Uses the member allocation 
function to

@@ -1618,18 +1651,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
_M_allocate_and_copy(size_type __n,
 _ForwardIterator __first, _ForwardIterator __last)
{
-  pointer __result = this->_M_allocate(__n);
-  __try
-    {
-  std::__uninitialized_copy_a(__first, __last, __result,
-  _M_get_Tp_allocator());
-  return __result;
-    }
-  __catch(...)
-    {
-  _M_deallocate(__result, __n);
-  __throw_exception_again;
-    }
+  _Guard __guard(this->_M_allocate(__n), __n, *this);
+  std::__uninitialized_copy_a
+    (__first, __last, __guard._M_storage, _M_get_Tp_allocator());
+  return __guard._M_release();
}


@@ -1642,13 +1667,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  // 438. Ambiguity in the "do the right thing" clause
  template<typename _Integer>
void
-    _M_initialize_dispatch(_Integer __n, _Integer __value, __true_type)
+    _M_initialize_dispatch(_Integer __int_n, _Integer __value, __true_type)
{
-  this->_M_impl._M_start = _M_allocate(_S_check_init_len(
-    static_cast<size_type>(__n), _M_get_Tp_allocator()));
-  this->_M_impl._M_end_of_storage =
-    this->_M_impl._M_start + static_cast<size_type>(__n);
-  _M_fill_initialize(static_cast<size_type>(__n), __value);


Please fix the comment on _M_fill_initialize if you're removing the
use of it here.


Already done in this initial patch proposal, see below.




+  const size_type __n = static_cast<size_type>(__int_n);
+  _Guard __guard(_M_allocate(_S_check_init_len(
+    __n, _M_get_Tp_allocator())), __n, *this);


I think this would be easier to 

[c-family] Another small fix to implementation of -fdump-ada-spec

2024-05-23 Thread Eric Botcazou
This avoids generating invalid Ada code for functions with a multidimensional 
array parameter and also cleans things up left and right.

Tested on x86-64/Linux, applied on the mainline.


2024-05-23  Eric Botcazou  

* c-ada-spec.cc (check_type_name_conflict): Add guard.
(is_char_array): Simplify.
(dump_ada_array_type): Use strip_array_types.
(dump_ada_node) : Deal with anonymous array types.
(dump_nested_type): Use strip_array_types.
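
As a reading aid for the two clean-ups listed above, here is a toy model of
what "use strip_array_types" and the is_char_array simplification amount to.
It is a sketch under stated assumptions, not GCC code: toy_type, TOY_INTEGER,
TOY_ARRAY, strip_arrays and the two is_char_array_* functions are
hypothetical stand-ins for GCC's tree nodes and helpers. The point it tries
to show is that checking the immediate element type accepts exactly the
one-dimensional char arrays that the dimension-counting loop accepted.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a GCC type node.
enum toy_code { TOY_INTEGER, TOY_ARRAY };

struct toy_type
{
  toy_code code;
  const char* name;       // only meaningful for TOY_INTEGER
  const toy_type* elem;   // only meaningful for TOY_ARRAY
};

// Peel off every array layer and return the innermost element type.
inline const toy_type*
strip_arrays(const toy_type* t)
{
  while (t->code == TOY_ARRAY)
    t = t->elem;
  return t;
}

// Shape of the old check: count dimensions, then test the element type.
inline bool
is_char_array_loop(const toy_type* t)
{
  int num_dim = 0;
  while (t->code == TOY_ARRAY)
    {
      ++num_dim;
      t = t->elem;
    }
  return num_dim == 1
	 && t->code == TOY_INTEGER
	 && std::strcmp(t->name, "char") == 0;
}

// Shape of the new check: an array whose immediate element type is char.
inline bool
is_char_array_direct(const toy_type* t)
{
  return t->code == TOY_ARRAY
	 && t->elem->code == TOY_INTEGER
	 && std::strcmp(t->elem->name, "char") == 0;
}
```

A two-dimensional array fails the direct check because its immediate element
type is itself an array, so no explicit dimension count is needed.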

-- 
Eric Botcazou

diff --git a/gcc/c-family/c-ada-spec.cc b/gcc/c-family/c-ada-spec.cc
index 46fee30b6b9..8f0849bd427 100644
--- a/gcc/c-family/c-ada-spec.cc
+++ b/gcc/c-family/c-ada-spec.cc
@@ -1558,7 +1558,7 @@ check_type_name_conflict (pretty_printer *buffer, tree t)
   while (TREE_CODE (tmp) == POINTER_TYPE && !TYPE_NAME (tmp))
 tmp = TREE_TYPE (tmp);
 
-  if (TREE_CODE (tmp) != FUNCTION_TYPE)
+  if (TREE_CODE (tmp) != FUNCTION_TYPE && tmp != error_mark_node)
 {
   const char *s;
 
@@ -1788,17 +1788,9 @@ dump_sloc (pretty_printer *buffer, tree node)
 static bool
 is_char_array (tree t)
 {
-  int num_dim = 0;
-
-  while (TREE_CODE (t) == ARRAY_TYPE)
-{
-  num_dim++;
-  t = TREE_TYPE (t);
-}
-
-  return num_dim == 1
-	 && TREE_CODE (t) == INTEGER_TYPE
-	 && id_equal (DECL_NAME (TYPE_NAME (t)), "char");
+  return TREE_CODE (t) == ARRAY_TYPE
+	 && TREE_CODE (TREE_TYPE (t)) == INTEGER_TYPE
+	 && id_equal (DECL_NAME (TYPE_NAME (TREE_TYPE (t))), "char");
 }
 
 /* Dump in BUFFER an array type NODE in Ada syntax.  SPC is the indentation
@@ -1821,9 +1813,7 @@ dump_ada_array_type (pretty_printer *buffer, tree node, int spc)
   /* Print the component type.  */
   if (!char_array)
 {
-  tree tmp = node;
-  while (TREE_CODE (tmp) == ARRAY_TYPE)
-	tmp = TREE_TYPE (tmp);
+  tree tmp = strip_array_types (node);
 
   pp_string (buffer, " of ");
 
@@ -2350,6 +2340,11 @@ dump_ada_node (pretty_printer *buffer, tree node, tree type, int spc,
 		  && DECL_ORIGINAL_TYPE (DECL_CHAIN (stub)) == ref_type)
 		ref_type = TREE_TYPE (DECL_CHAIN (stub));
 
+		  /* If this is a pointer to an anonymous array type, then use
+		 the name of the component type.  */
+		  else if (!type_name && is_access)
+		ref_type = strip_array_types (ref_type);
+
 		  /* Generate "access " instead of "access "
 		 if the subtype comes from another file, because subtype
 		 declarations do not contribute to the limited view of a
@@ -2639,10 +2634,7 @@ dump_nested_type (pretty_printer *buffer, tree field, tree t, int spc)
   if (!bitmap_set_bit (dumped_anonymous_types, TYPE_UID (field_type)))
 	return;
 
-  /* Recurse on the element type if need be.  */
-  tmp = TREE_TYPE (field_type);
-  while (TREE_CODE (tmp) == ARRAY_TYPE)
-	tmp = TREE_TYPE (tmp);
+  tmp = strip_array_types (field_type);
   decl = get_underlying_decl (tmp);
   if (decl
 	  && !DECL_NAME (decl)


Re: [PATCH] [RFC] Target-independent store forwarding avoidance. [PR48696] Target-independent store forwarding avoidance.

2024-05-23 Thread Andrew Pinski
On Thu, May 23, 2024 at 8:01 AM Manolis Tsamis  wrote:
>
> This pass detects cases of expensive store forwarding and tries to avoid them
> by reordering the stores and using suitable bit insertion sequences.
> For example it can transform this:
>
>  strb    w2, [x1, 1]
>  ldr     x0, [x1]  # Expensive store forwarding to larger load.
>
> To:
>
>  ldr     x0, [x1]
>  strb    w2, [x1]
>  bfi     x0, x2, 0, 8
>

Are you sure this is correct with respect to the C11/C++11 memory
models? If not then the pass should be gated with
flag_store_data_races.
Also stores like this start a new "alias set" (I can't remember the
exact term here). So how do you represent the store's aliasing set? Do
you change it? If not, are you sure that will do the right thing?

You didn't document the new option or the new --param (invoke.texi);
this is the bare minimum requirement.
Note you should add documentation for the new pass in the internals
manual (passes.texi) (note most folks forget to update this when
adding a new pass).

Thanks,
Andrew


> Assembly like this can appear with bitfields or type punning / unions.
> On stress-ng when running the cpu-union microbenchmark the following speedups
> have been observed.
>
>   Neoverse-N1:      +29.4%
>   Intel Coffeelake: +13.1%
>   AMD 5950X:        +17.5%
>
> PR rtl-optimization/48696
>
> gcc/ChangeLog:
>
> * Makefile.in: Add avoid-store-forwarding.o.
> * common.opt: New option -favoid-store-forwarding.
> * params.opt: New param store-forwarding-max-distance.
> * passes.def: Schedule a new pass.
> * tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
> * avoid-store-forwarding.cc: New file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/avoid-store-forwarding-1.c: New test.
> * gcc.dg/avoid-store-forwarding-2.c: New test.
> * gcc.dg/avoid-store-forwarding-3.c: New test.
>
> Signed-off-by: Manolis Tsamis 
> ---
>
>  gcc/Makefile.in   |   1 +
>  gcc/avoid-store-forwarding.cc | 554 ++
>  gcc/common.opt|   4 +
>  gcc/params.opt|   4 +
>  gcc/passes.def|   1 +
>  .../gcc.dg/avoid-store-forwarding-1.c |  46 ++
>  .../gcc.dg/avoid-store-forwarding-2.c |  39 ++
>  .../gcc.dg/avoid-store-forwarding-3.c |  31 +
>  gcc/tree-pass.h   |   1 +
>  9 files changed, 681 insertions(+)
>  create mode 100644 gcc/avoid-store-forwarding.cc
>  create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index a7f15694c34..be969b1ca1d 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1681,6 +1681,7 @@ OBJS = \
> statistics.o \
> stmt.o \
> stor-layout.o \
> +   avoid-store-forwarding.o \
> store-motion.o \
> streamer-hooks.o \
> stringpool.o \
> diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
> new file mode 100644
> index 000..d90627c4872
> --- /dev/null
> +++ b/gcc/avoid-store-forwarding.cc
> @@ -0,0 +1,554 @@
> +/* Avoid store forwarding optimization pass.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   Contributed by VRULL GmbH.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but
> +   WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   .  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "backend.h"
> +#include "rtl.h"
> +#include "alias.h"
> +#include "rtlanal.h"
> +#include "tree-pass.h"
> +#include "cselib.h"
> +#include "predict.h"
> +#include "insn-config.h"
> +#include "expmed.h"
> +#include "recog.h"
> +#include "regset.h"
> +#include "df.h"
> +#include "expr.h"
> +#include "memmodel.h"
> +#include "emit-rtl.h"
> +#include "vec.h"
> +
> +/* This pass tries to detect and avoid cases of store forwarding.
> +   On many processors there is a large penalty when smaller stores are
> +   forwarded to larger loads.  The idea used to avoid the stall is to move
> +   the store after the load and in addition emit a bit insert sequence so
> +   the load register has the 

[PATCH] Add config file so b4 uses inbox.sourceware.org automatically

2024-05-23 Thread Jonathan Wakely
It looks like my patch[1] to make b4 figure this out automagically won't
be accepted, so this makes it work for GCC. A similar commit could be
done for each project hosted on sourceware.org if desired.

[1] 
https://lore.kernel.org/tools/20240523143752.385810-1-jwak...@redhat.com/T/#u

OK for trunk?

-- >8 --

This makes b4 use inbox.sourceware.org instead of the default host
lore.kernel.org, so that every b4 user doesn't have to configure this
themselves.

ChangeLog:

* .b4-config: New file.
---
 .b4-config | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 .b4-config

diff --git a/.b4-config b/.b4-config
new file mode 100644
index 000..d5ba8e08446
--- /dev/null
+++ b/.b4-config
@@ -0,0 +1,3 @@
+[b4]
+midmask = https://inbox.sourceware.org/%s
+linkmask = https://inbox.sourceware.org/%s
-- 
2.45.1



[PATCH] [RFC] Target-independent store forwarding avoidance. [PR48696] Target-independent store forwarding avoidance.

2024-05-23 Thread Manolis Tsamis
This pass detects cases of expensive store forwarding and tries to avoid them
by reordering the stores and using suitable bit insertion sequences.
For example it can transform this:

 strb    w2, [x1, 1]
 ldr     x0, [x1]  # Expensive store forwarding to larger load.

To:

 ldr     x0, [x1]
 strb    w2, [x1]
 bfi     x0, x2, 0, 8

Assembly like this can appear with bitfields or type punning / unions.
On stress-ng when running the cpu-union microbenchmark the following speedups
have been observed.

  Neoverse-N1:      +29.4%
  Intel Coffeelake: +13.1%
  AMD 5950X:        +17.5%
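
The reordering described above can be modelled in portable C++ to check the
underlying identity: storing a byte and then loading the enclosing 64-bit
word gives the same register value and the same memory contents as loading
first, storing afterwards, and inserting the byte into the loaded value. This
is a sketch under stated assumptions, not code from the pass:
store_then_load, load_then_insert and is_little_endian are hypothetical
names, the byte offset is a parameter rather than fixed, and the model is
single-threaded, so it says nothing about concurrent accesses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Runtime endianness probe, so the bit position of a byte is computed
// correctly on either kind of target.
inline bool
is_little_endian()
{
  const std::uint16_t one = 1;
  unsigned char first;
  std::memcpy(&first, &one, 1);
  return first == 1;
}

// Original order: narrow store followed by a wider load of the same bytes.
inline std::uint64_t
store_then_load(unsigned char* p, std::uint8_t v, std::size_t off)
{
  p[off] = v;
  std::uint64_t r;
  std::memcpy(&r, p, sizeof r);
  return r;
}

// Reordered: wide load first, then the store, then a bit insert that
// patches the loaded value so that it reflects the stored byte.
inline std::uint64_t
load_then_insert(unsigned char* p, std::uint8_t v, std::size_t off)
{
  std::uint64_t r;
  std::memcpy(&r, p, sizeof r);
  p[off] = v;
  const unsigned shift
    = 8u * static_cast<unsigned>(is_little_endian() ? off : 7 - off);
  const std::uint64_t mask = std::uint64_t(0xff) << shift;
  r = (r & ~mask) | (std::uint64_t(v) << shift);   // plays the role of bfi
  return r;
}
```

Both functions expect off in the range 0..7 and a buffer of at least eight
bytes.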

PR rtl-optimization/48696

gcc/ChangeLog:

* Makefile.in: Add avoid-store-forwarding.o.
* common.opt: New option -favoid-store-forwarding.
* params.opt: New param store-forwarding-max-distance.
* passes.def: Schedule a new pass.
* tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
* avoid-store-forwarding.cc: New file.

gcc/testsuite/ChangeLog:

* gcc.dg/avoid-store-forwarding-1.c: New test.
* gcc.dg/avoid-store-forwarding-2.c: New test.
* gcc.dg/avoid-store-forwarding-3.c: New test.

Signed-off-by: Manolis Tsamis 
---

 gcc/Makefile.in   |   1 +
 gcc/avoid-store-forwarding.cc | 554 ++
 gcc/common.opt|   4 +
 gcc/params.opt|   4 +
 gcc/passes.def|   1 +
 .../gcc.dg/avoid-store-forwarding-1.c |  46 ++
 .../gcc.dg/avoid-store-forwarding-2.c |  39 ++
 .../gcc.dg/avoid-store-forwarding-3.c |  31 +
 gcc/tree-pass.h   |   1 +
 9 files changed, 681 insertions(+)
 create mode 100644 gcc/avoid-store-forwarding.cc
 create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-1.c
 create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-2.c
 create mode 100644 gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a7f15694c34..be969b1ca1d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1681,6 +1681,7 @@ OBJS = \
statistics.o \
stmt.o \
stor-layout.o \
+   avoid-store-forwarding.o \
store-motion.o \
streamer-hooks.o \
stringpool.o \
diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
new file mode 100644
index 000..d90627c4872
--- /dev/null
+++ b/gcc/avoid-store-forwarding.cc
@@ -0,0 +1,554 @@
+/* Avoid store forwarding optimization pass.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   Contributed by VRULL GmbH.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "alias.h"
+#include "rtlanal.h"
+#include "tree-pass.h"
+#include "cselib.h"
+#include "predict.h"
+#include "insn-config.h"
+#include "expmed.h"
+#include "recog.h"
+#include "regset.h"
+#include "df.h"
+#include "expr.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
+#include "vec.h"
+
+/* This pass tries to detect and avoid cases of store forwarding.
+   On many processors there is a large penalty when smaller stores are
+   forwarded to larger loads.  The idea used to avoid the stall is to move
+   the store after the load and in addition emit a bit insert sequence so
+   the load register has the correct value.  For example the following:
+
+ strb    w2, [x1, 1]
+ ldr     x0, [x1]
+
+   Will be transformed to:
+
+ ldr     x0, [x1]
+ and     w2, w2, 255
+ strb    w2, [x1]
+ bfi     x0, x2, 0, 8
+*/
+
+namespace {
+
+const pass_data pass_data_avoid_store_forwarding =
+{
+  RTL_PASS, /* type.  */
+  "avoid_store_forwarding", /* name.  */
+  OPTGROUP_NONE, /* optinfo_flags.  */
+  TV_NONE, /* tv_id.  */
+  0, /* properties_required.  */
+  0, /* properties_provided.  */
+  0, /* properties_destroyed.  */
+  0, /* todo_flags_start.  */
+  TODO_df_finish /* todo_flags_finish.  */
+};
+
+class pass_rtl_avoid_store_forwarding : public rtl_opt_pass
+{
+public:
+  pass_rtl_avoid_store_forwarding (gcc::context *ctxt)
+: rtl_opt_pass (pass_data_avoid_store_forwarding, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+{
+  return 

Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-23 Thread Jeff Law




On 5/23/24 6:14 AM, Richard Biener wrote:

On Thu, May 23, 2024 at 1:08 PM Li, Pan2  wrote:


I tried to convert the PHI from Part-A to Part-B, i.e. PHI to
_2 = phi_cond ? _1 : 255.
And then we can do the matching on COND_EXPR in the underlying widen-mul pass.
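
For context, the source-level forms of unsigned saturating add that this
thread is about can be sketched as below for 8-bit operands. The function
names are hypothetical and chosen for this illustration;
__builtin_add_overflow is the GCC/Clang builtin named in the subject line.
The three functions are meant to be equivalent, with the select form
mirroring the "_2 = phi_cond ? _1 : 255" shape mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Branch form built on the overflow-checking builtin.
inline std::uint8_t
sat_add_branch(std::uint8_t x, std::uint8_t y)
{
  std::uint8_t sum;
  if (__builtin_add_overflow(x, y, &sum))
    return 255;
  return sum;
}

// Select form: pick the wrapped sum unless the addition overflowed.
inline std::uint8_t
sat_add_select(std::uint8_t x, std::uint8_t y)
{
  const std::uint8_t sum = static_cast<std::uint8_t>(x + y);
  const bool no_overflow = sum >= x;
  return no_overflow ? sum : static_cast<std::uint8_t>(255);
}

// Branchless form: (x + y) | -((x + y) < x).
inline std::uint8_t
sat_add_branchless(std::uint8_t x, std::uint8_t y)
{
  const std::uint8_t sum = static_cast<std::uint8_t>(x + y);
  return static_cast<std::uint8_t>(sum | -static_cast<int>(sum < x));
}
```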

Unfortunately, I hit an ICE from verify_gimple_phi in the sccopy1 pass =>
sat_add.c:66:1: internal compiler error: tree check: expected class ‘type’, 
have ‘exceptional’ (error_mark) in useless_type_conversion_p, at 
gimple-expr.cc:86


Likely you have released _2, more comments below on your previous mail.
You can be sure by calling debug_tree () on the SSA_NAME node in 
question.  If it reports "in-free-list", then that's definitive that the 
SSA_NAME was released back to the SSA_NAME manager.  If that SSA_NAME is 
still in the IL, then that's very bad.


jeff



[PATCH] c++/modules: Improve errors for bad module-directives [PR115200]

2024-05-23 Thread Nathaniel Shead
Bootstrapped and regtested (so far just modules.exp and dg.exp) on
x86_64-pc-linux-gnu, OK for trunk if full regtest succeeds?

-- >8 --

This fixes an ICE when a module directive is not given at global scope.
Although not explicitly mentioned, it seems implied from [basic.link] p1
and [module.global.frag] that a module-declaration must appear at the
global scope after preprocessing.  Apart from this the patch also
slightly improves the errors given when accidentally using a module
control-line in other situations where it is not expected.

PR c++/115200

gcc/cp/ChangeLog:

* parser.cc (cp_parser_error_1): Special-case unexpected module
directives for better diagnostics.
(cp_parser_module_declaration): Check that the module
declaration is at global scope.

gcc/testsuite/ChangeLog:

* g++.dg/modules/mod-decl-1.C: Update error messages.
* g++.dg/modules/mod-decl-6.C: New test.
* g++.dg/modules/mod-decl-7.C: New test.
* g++.dg/modules/mod-decl-8.C: New test.
* g++.dg/modules/mod-decl-8.h: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/parser.cc  | 32 +++
 gcc/testsuite/g++.dg/modules/mod-decl-1.C |  6 ++---
 gcc/testsuite/g++.dg/modules/mod-decl-6.C | 11 
 gcc/testsuite/g++.dg/modules/mod-decl-7.C | 12 +
 gcc/testsuite/g++.dg/modules/mod-decl-8.C |  9 +++
 gcc/testsuite/g++.dg/modules/mod-decl-8.h |  4 +++
 6 files changed, 71 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-6.C
 create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-7.C
 create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-8.C
 create mode 100644 gcc/testsuite/g++.dg/modules/mod-decl-8.h

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 476ddc0d63a..1c0543ba154 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -3230,6 +3230,31 @@ cp_parser_error_1 (cp_parser* parser, const char* gmsgid,
   return;
 }
 
+  if (cp_token_is_module_directive (token))
+{
+  cp_token *next = (token->keyword == RID__EXPORT
+   ? cp_lexer_peek_nth_token (parser->lexer, 2) : token);
+
+  auto_diagnostic_group d;
+  error_at (token->location, "unexpected module directive");
+  tree scope = current_scope ();
+  if (next->keyword == RID__MODULE
+ && token->main_source_p
+ && scope != global_namespace)
+   {
+ /* Nicer error for unterminated scopes in GMF includes.  */
+ inform (token->location,
+ "module-declaration must be at global scope");
+ inform (location_of (scope), "scope opened here");
+   }
+  else
+   inform (token->location, "perhaps insert a line break, or other"
+   " disambiguation, to prevent this being considered a"
+   " module control-line");
+  cp_parser_skip_to_pragma_eol (parser, token);
+  return;
+}
+
   /* If this is actually a conflict marker, report it as such.  */
   if (token->type == CPP_LSHIFT
   || token->type == CPP_RSHIFT
@@ -15135,12 +15160,19 @@ cp_parser_module_declaration (cp_parser *parser, module_parse mp_state,
   parser->lexer->in_pragma = true;
   cp_token *token = cp_lexer_consume_token (parser->lexer);
 
+  tree scope = current_scope ();
   if (flag_header_unit)
 {
   error_at (token->location,
"module-declaration not permitted in header-unit");
   goto skip_eol;
 }
+  else if (scope != global_namespace)
+{
+  error_at (token->location, "module-declaration must be at global scope");
+  inform (DECL_SOURCE_LOCATION (scope), "scope opened here");
+  goto skip_eol;
+}
   else if (mp_state == MP_FIRST && !exporting
   && cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
 {
diff --git a/gcc/testsuite/g++.dg/modules/mod-decl-1.C b/gcc/testsuite/g++.dg/modules/mod-decl-1.C
index 23d34483dd7..84fa31c7024 100644
--- a/gcc/testsuite/g++.dg/modules/mod-decl-1.C
+++ b/gcc/testsuite/g++.dg/modules/mod-decl-1.C
@@ -10,17 +10,17 @@ module foo.second; // { dg-error "only permitted as" }
 
 namespace Foo 
 {
-module third;  // { dg-error "only permitted as" }
+module third;  // { dg-error "must be at global scope" }
 }
 
 struct Baz
 {
-  module forth; // { dg-error "expected" }
+  module forth; // { dg-error "unexpected module directive" }
 };
 
 void Bink ()
 {
-  module fifth; // { dg-error "expected" }
+  module fifth; // { dg-error "unexpected module directive" }
 }
 
 module a.; // { dg-error "only permitted as" }
diff --git a/gcc/testsuite/g++.dg/modules/mod-decl-6.C b/gcc/testsuite/g++.dg/modules/mod-decl-6.C
new file mode 100644
index 000..5fb342455e5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/mod-decl-6.C
@@ -0,0 +1,11 @@
+// PR c++/115200
+// { dg-additional-options "-fmodules-ts -Wno-global-module" }
+// { dg-module-cmi !M }
+
+module;
+
+namespace ns {  // { dg-message "scope opened 

RE: [PATCH v4] Match: Add overloaded types_match to avoid code dup [NFC]

2024-05-23 Thread Li, Pan2
Committed, as it passed the test suites below. Thanks, Richard.

* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 regression test.

Pan

-Original Message-
From: Li, Pan2 
Sent: Thursday, May 23, 2024 8:06 PM
To: Richard Biener 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
tamar.christ...@arm.com; pins...@gmail.com
Subject: RE: [PATCH v4] Match: Add overloaded types_match to avoid code dup 
[NFC]

> the above three lines are redundant.
> OK with those removed.

Got it, I will commit it once testing with those lines removed shows no surprises.
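
A toy model may help show why the three normalization lines can be dropped:
presumably it is because the existing two-argument types_match already maps
a non-type operand to its type before comparing, so the three-argument
overload only needs to chain two calls. The sketch below is an assumption
labelled as such, not GCC code: toy_node and toy_types_match are
hypothetical stand-ins for tree and types_match, and comparing main_variant
ids stands in for the real type comparison.

```cpp
#include <cassert>

// Hypothetical node: either a type, or a value that points at its type.
struct toy_node
{
  bool is_type;
  const toy_node* type;   // the node's type when !is_type
  int main_variant;       // identity of the type when is_type
};

// Two-argument form: normalizes its operands, then compares the types.
inline bool
toy_types_match(const toy_node* t1, const toy_node* t2)
{
  t1 = t1->is_type ? t1 : t1->type;
  t2 = t2->is_type ? t2 : t2->type;
  return t1->main_variant == t2->main_variant;
}

// Three-argument form: no normalization of its own is needed, since each
// two-argument call already performs it.
inline bool
toy_types_match(const toy_node* t1, const toy_node* t2, const toy_node* t3)
{
  return toy_types_match(t1, t2) && toy_types_match(t2, t3);
}
```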

Pan

-Original Message-
From: Richard Biener  
Sent: Thursday, May 23, 2024 7:49 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
tamar.christ...@arm.com; pins...@gmail.com
Subject: Re: [PATCH v4] Match: Add overloaded types_match to avoid code dup 
[NFC]

On Thu, May 23, 2024 at 2:24 AM  wrote:
>
> From: Pan Li 
>
> There are several match patterns for the SAT related cases, and they
> duplicate the code that checks that dest, op_0 and op_1 have the same tree
> type, i.e. a ternary tree type match.  Thus, add an overloaded types_match
> function to do this and avoid the match code duplication.
>
> The below test suites are passed for this patch:
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 regression test.
>
> gcc/ChangeLog:
>
> * generic-match-head.cc (types_match): Add overloaded types_match
> for 3 types.
> * gimple-match-head.cc (types_match): Ditto.
> * match.pd: Leverage overloaded types_match.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/generic-match-head.cc | 14 ++
>  gcc/gimple-match-head.cc  | 14 ++
>  gcc/match.pd  | 30 ++
>  3 files changed, 38 insertions(+), 20 deletions(-)
>
> diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
> index 0d3f648fe8d..8d8ecfaeb1d 100644
> --- a/gcc/generic-match-head.cc
> +++ b/gcc/generic-match-head.cc
> @@ -59,6 +59,20 @@ types_match (tree t1, tree t2)
>return TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2);
>  }
>
> +/* Routine to determine if the types T1, T2 and T3 are effectively
> +   the same for GENERIC.  If T1, T2 or T3 is not a type, the test
> +   applies to their TREE_TYPE.  */
> +
> +static inline bool
> +types_match (tree t1, tree t2, tree t3)
> +{
> +  t1 = TYPE_P (t1) ? t1 : TREE_TYPE (t1);
> +  t2 = TYPE_P (t2) ? t2 : TREE_TYPE (t2);
> +  t3 = TYPE_P (t3) ? t3 : TREE_TYPE (t3);

the above three lines are redundant.

> +  return types_match (t1, t2) && types_match (t2, t3);
> +}
> +
>  /* Return if T has a single use.  For GENERIC, we assume this is
> always true.  */
>
> diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
> index 5f8a1a1ad8e..2b7f746ab13 100644
> --- a/gcc/gimple-match-head.cc
> +++ b/gcc/gimple-match-head.cc
> @@ -79,6 +79,20 @@ types_match (tree t1, tree t2)
>return types_compatible_p (t1, t2);
>  }
>
> +/* Routine to determine if the types T1, T2 and T3 are effectively
> +   the same for GIMPLE.  If T1, T2 or T3 is not a type, the test
> +   applies to their TREE_TYPE.  */
> +
> +static inline bool
> +types_match (tree t1, tree t2, tree t3)
> +{
> +  t1 = TYPE_P (t1) ? t1 : TREE_TYPE (t1);
> +  t2 = TYPE_P (t2) ? t2 : TREE_TYPE (t2);
> +  t3 = TYPE_P (t3) ? t3 : TREE_TYPE (t3);

likewise.

OK with those removed.

Richard.

> +  return types_match (t1, t2) && types_match (t2, t3);
> +}
> +
>  /* Return if T has a single use.  For GIMPLE, we also allow any
> non-SSA_NAME (ie constants) and zero uses to cope with uses
> that aren't linked up yet.  */
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 35e3d82b131..7081d76d56a 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3048,38 +3048,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* Unsigned Saturation Add */
>  (match (usadd_left_part_1 @0 @1)
>   (plus:c @0 @1)
> - (if (INTEGRAL_TYPE_P (type)
> -  && TYPE_UNSIGNED (TREE_TYPE (@0))
> -  && types_match (type, TREE_TYPE (@0))
> -  && types_match (type, TREE_TYPE (@1)
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +  && types_match (type, @0, @1
>
>  (match (usadd_left_part_2 @0 @1)
>   (realpart (IFN_ADD_OVERFLOW:c @0 @1))
> - (if (INTEGRAL_TYPE_P (type)
> -  && TYPE_UNSIGNED (TREE_TYPE (@0))
> -  && types_match (type, TREE_TYPE (@0))
> -  && types_match (type, TREE_TYPE (@1)
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +  && types_match (type, @0, @1
>
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (lt (plus:c @0 @1) @0)))
> - (if (INTEGRAL_TYPE_P (type)
> -  && TYPE_UNSIGNED (TREE_TYPE (@0))
> -  && types_match (type, TREE_TYPE (@0))
> -  && types_match (type, TREE_TYPE (@1)
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +  && types_match (type, @0, @1
>
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (gt @0 

[PATCH] c++: mark TARGET_EXPRs for function arguments eliding [PR114707]

2024-05-23 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Coming back to our discussion in
:
TARGET_EXPRs that initialize a function argument are not marked
TARGET_EXPR_ELIDING_P even though gimplify_arg drops such TARGET_EXPRs
on the floor.  To work around it, I added a pset to
replace_placeholders_for_class_temp_r, but it would be best to just rely
on TARGET_EXPR_ELIDING_P.
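
As background for the switch from cp_walk_tree to
cp_walk_tree_without_duplicates in this patch, here is a toy model of the
difference between the two styles of walk on a tree with a shared subtree.
It is a sketch under stated assumptions: toy_tree, plain_walk and
walk_without_duplicates are hypothetical names, and std::set stands in for
whatever set the real walker uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical tree node; 'visits' counts how often a walk reached it.
struct toy_tree
{
  std::vector<toy_tree*> kids;
  int visits;
  toy_tree() : visits(0) {}
};

// Plain walk: a node shared by two parents is visited once per path.
inline void
plain_walk(toy_tree* t)
{
  ++t->visits;
  for (std::size_t i = 0; i < t->kids.size(); ++i)
    plain_walk(t->kids[i]);
}

// Deduplicating walk: remember visited nodes and skip repeats.
inline void
walk_without_duplicates(toy_tree* t, std::set<toy_tree*>& seen)
{
  if (!seen.insert(t).second)
    return;   // already visited through another parent
  ++t->visits;
  for (std::size_t i = 0; i < t->kids.size(); ++i)
    walk_without_duplicates(t->kids[i], seen);
}
```

With the deduplicating walk the callback sees each node at most once, which
is the property the callback's own set provided before this change.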

This patch changes the diagnostic we emit in constexpr-diag1.C: instead
of:

constexpr-diag1.C:20:21: error: temporary of non-literal type 'B' in a constant 
expression

we say:

constexpr-diag1.C:20:23: error: call to non-'constexpr' function 'B::B()'

PR c++/114707

gcc/cp/ChangeLog:

* call.cc (convert_for_arg_passing): Call set_target_expr_eliding.
* typeck2.cc (replace_placeholders_for_class_temp_r): Don't use pset.
(digest_nsdmi_init): Call cp_walk_tree_without_duplicates instead of
cp_walk_tree.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-diag1.C: Adjust dg-error.
---
 gcc/cp/call.cc   |  3 +++
 gcc/cp/typeck2.cc| 20 
 gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C |  2 +-
 3 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index ed68eb3c568..750ecf60fd9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -9437,6 +9437,9 @@ convert_for_arg_passing (tree type, tree val, tsubst_flags_t complain)
   if (complain & tf_warning)
 warn_for_address_of_packed_member (type, val);
 
+  /* gimplify_arg elides TARGET_EXPRs that initialize a function argument.  */
+  set_target_expr_eliding (val);
+
   return val;
 }
 
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 06bad4d3303..7782f38da43 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -1409,16 +1409,14 @@ digest_init_flags (tree type, tree init, int flags, 
tsubst_flags_t complain)
in the context of guaranteed copy elision).  */
 
 static tree
-replace_placeholders_for_class_temp_r (tree *tp, int *, void *data)
+replace_placeholders_for_class_temp_r (tree *tp, int *, void *)
 {
   tree t = *tp;
-  auto pset = static_cast *>(data);
 
   /* We're looking for a TARGET_EXPR nested in the whole expression.  */
   if (TREE_CODE (t) == TARGET_EXPR
   /* That serves as temporary materialization, not an initializer.  */
-  && !TARGET_EXPR_ELIDING_P (t)
-  && !pset->add (t))
+  && !TARGET_EXPR_ELIDING_P (t))
 {
   tree init = TARGET_EXPR_INITIAL (t);
   while (TREE_CODE (init) == COMPOUND_EXPR)
@@ -1433,16 +1431,6 @@ replace_placeholders_for_class_temp_r (tree *tp, int *, 
void *data)
  gcc_checking_assert (!find_placeholders (init));
}
 }
-  /* TARGET_EXPRs initializing function arguments are not marked as eliding,
- even though gimplify_arg drops them on the floor.  Don't go replacing
- placeholders in them.  */
-  else if (TREE_CODE (t) == CALL_EXPR || TREE_CODE (t) == AGGR_INIT_EXPR)
-for (int i = 0; i < call_expr_nargs (t); ++i)
-  {
-   tree arg = get_nth_callarg (t, i);
-   if (TREE_CODE (arg) == TARGET_EXPR && !TARGET_EXPR_ELIDING_P (arg))
- pset->add (arg);
-  }
 
   return NULL_TREE;
 }
@@ -1490,8 +1478,8 @@ digest_nsdmi_init (tree decl, tree init, tsubst_flags_t 
complain)
  temporary materialization does not occur when initializing an object
  from a prvalue of the same type, therefore we must not replace the
  placeholder with a temporary object so that it can be elided.  */
-  hash_set pset;
-  cp_walk_tree (, replace_placeholders_for_class_temp_r, , nullptr);
+  cp_walk_tree_without_duplicates (, 
replace_placeholders_for_class_temp_r,
+  nullptr);
 
   return init;
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C
index ccb8d81adca..da6fc2030bc 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-diag1.C
@@ -17,4 +17,4 @@ constexpr int b = A().f(); // { dg-error "" }
 
 template 
 constexpr int f (T t) { return 42; }
-constexpr int x = f(B());   // { dg-error "non-literal" }
+constexpr int x = f(B());   // { dg-error "call to non-.constexpr." }

base-commit: 2b2476d4d18c92b8aba3567ebccd2100c2f7c258
-- 
2.45.1



[PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.
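
For readers unfamiliar with tail predication, the following is a scalar model
of the idea only, not the code this patch generates.  The function name and
the choice of four 32-bit lanes per 128-bit vector are assumptions made for
the sketch.  Each trip through the loop handles up to kLanes elements, and
the final trip simply disables the lanes that would run past the end, so no
scalar epilogue is needed.

```cpp
#include <cstddef>
#include <vector>

// Assumption for the sketch: four 32-bit lanes per vector.
const std::size_t kLanes = 4;

// Adds 1 to every element and returns the number of loop trips taken.
std::size_t add_one_tail_predicated(std::vector<int>& data) {
  std::size_t trips = 0;
  std::size_t remaining = data.size();
  std::size_t base = 0;
  while (remaining > 0) {
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      // A lane is active only while elements remain; inactive lanes
      // neither load nor store.
      bool active = lane < remaining;
      if (active)
        data[base + lane] += 1;
    }
    base += kLanes;
    remaining = remaining > kLanes ? remaining - kLanes : 0;
    ++trips;
  }
  return trips;
}
```

With ten elements the model takes three trips, and the last one has only two
active lanes.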

gcc/ChangeLog:

* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_attempt_dlstp_transform): New declaration.
* config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): Define targethook.
(TARGET_PREDICT_DOLOOP_P): Likewise.
(arm_target_bb_ok_for_lob): Adapt condition.
(arm_mve_get_vctp_lanes): New function.
(arm_dl_usage_type): New internal enum.
(arm_get_required_vpr_reg): New function.
(arm_get_required_vpr_reg_param): New function.
(arm_get_required_vpr_reg_ret_val): New function.
(arm_mve_get_loop_vctp): New function.
(arm_mve_insn_predicated_by): New function.
(arm_mve_across_lane_insn_p): New function.
(arm_mve_load_store_insn_p): New function.
(arm_mve_impl_pred_on_outputs_p): New function.
(arm_mve_impl_pred_on_inputs_p): New function.
(arm_last_vect_def_insn): New function.
(arm_mve_impl_predicated_p): New function.
(arm_mve_check_reg_origin_is_num_elems): New function.
(arm_mve_dlstp_check_inc_counter): New function.
(arm_mve_dlstp_check_dec_counter): New function.
(arm_mve_loop_valid_for_dlstp): New function.
(arm_predict_doloop_p): New function.
(arm_loop_unroll_adjust): New function.
(arm_emit_mve_unpredicated_insn_to_seq): New function.
(arm_attempt_dlstp_transform): New function.
* config/arm/arm.opt (mdlstp): New option.
* config/arm/iterators.md (dlstp_elemsize, letp_num_lanes,
letp_num_lanes_neg, letp_num_lanes_minus_1): New attributes.
(DLSTP, LETP): New iterators.
(predicated_doloop_end_internal): New pattern.
(dlstp_insn): New pattern.
* config/arm/thumb2.md (doloop_end): Adapt to support tail-predicated
loops.
(doloop_begin): Likewise.
* config/arm/types.md (mve_misc): New mve type to represent
predicated_loop_end insn sequences.
* config/arm/unspecs.md:
(DLSTP8, DLSTP16, DLSTP32, DLSTP64,
LETP8, LETP16, LETP32, LETP64): New unspecs for DLSTP and LETP.

gcc/testsuite/ChangeLog:

* gcc.target/arm/lob.h: Add new helpers.
* gcc.target/arm/lob1.c: Use new helpers.
* gcc.target/arm/lob6.c: Likewise.
* gcc.target/arm/dlstp-compile-asm-1.c: New test.
* gcc.target/arm/dlstp-compile-asm-2.c: New test.
* gcc.target/arm/dlstp-compile-asm-3.c: New test.
* gcc.target/arm/dlstp-int8x16.c: New test.
* gcc.target/arm/dlstp-int8x16-run.c: New test.
* gcc.target/arm/dlstp-int16x8.c: New test.
* gcc.target/arm/dlstp-int16x8-run.c: New test.
* gcc.target/arm/dlstp-int32x4.c: New test.
* gcc.target/arm/dlstp-int32x4-run.c: New test.
* gcc.target/arm/dlstp-int64x2.c: New test.
* gcc.target/arm/dlstp-int64x2-run.c: New test.
* gcc.target/arm/dlstp-invalid-asm.c: New test.

Co-authored-by: Stam Markianos-Wright 
---
 gcc/config/arm/arm-protos.h   |4 +-
 gcc/config/arm/arm.cc | 1249 -
 gcc/config/arm/arm.opt|3 +
 gcc/config/arm/iterators.md   |   15 +
 gcc/config/arm/mve.md |   50 +
 gcc/config/arm/thumb2.md  |  138 +-
 gcc/config/arm/types.md   |6 +-
 gcc/config/arm/unspecs.md |   14 +-
 gcc/testsuite/gcc.target/arm/lob.h|  128 +-
 gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
 gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
 .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
 .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
 .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
 .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
 .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
 .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
 .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
 .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
 23 files changed, 3321 insertions(+), 82 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8.c
 create mode 100644 

[PATCH 1/2] doloop: Add support for predicated vectorized loops

2024-05-23 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops.  Arm is currently the only target that
will make use of this feature.
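
One helper this patch adds, df_bb_regno_only_def_find, relies on a small
observation: if the first and the last definition of a register in a block
are the same reference, that definition is the only one.  The sketch below
shows the same idea on a plain vector.  Block, first_def, last_def and
only_def are hypothetical stand-ins, not the df API, and -1 plays the role of
NULL.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a basic block: the register number defined by
// each instruction, in program order.
typedef std::vector<unsigned> Block;

// Index of the first definition of REGNO, or -1 if there is none.
int first_def(const Block& bb, unsigned regno) {
  for (std::size_t i = 0; i < bb.size(); ++i)
    if (bb[i] == regno)
      return static_cast<int>(i);
  return -1;
}

// Index of the last definition of REGNO, or -1 if there is none.
int last_def(const Block& bb, unsigned regno) {
  for (std::size_t i = bb.size(); i-- > 0;)
    if (bb[i] == regno)
      return static_cast<int>(i);
  return -1;
}

// Index of the one and only definition of REGNO, or -1 if there is no
// definition or more than one.
int only_def(const Block& bb, unsigned regno) {
  int first = first_def(bb, regno);
  if (first < 0)
    return -1;
  return first == last_def(bb, regno) ? first : -1;
}
```

For example, only_def on the block {7, 3, 9} finds register 3 at index 1,
while on {3, 7, 3} it reports no unique definition.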

gcc/ChangeLog:

* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_regno_only_def_find): Declare new function.
* loop-doloop.cc (doloop_condition_get): Add support for detecting
predicated vectorized hardware loops.
(doloop_modify): Add support for GTU condition checks.
(doloop_optimize): Update costing computation to support alterations to
desc->niter_expr by the backend.

Co-authored-by: Stam Markianos-Wright 
---
 gcc/df-core.cc |  15 +
 gcc/df.h   |   1 +
 gcc/loop-doloop.cc | 164 +++--
 3 files changed, 113 insertions(+), 67 deletions(-)

diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index f0eb4c93957..b0e8a88d433 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -1964,6 +1964,21 @@ df_bb_regno_last_def_find (basic_block bb, unsigned int regno)
   return NULL;
 }
 
+/* Return the one and only def of REGNO within BB.  If there is no def or
+   there are multiple defs, return NULL.  */
+
+df_ref
+df_bb_regno_only_def_find (basic_block bb, unsigned int regno)
+{
+  df_ref temp = df_bb_regno_first_def_find (bb, regno);
+  if (!temp)
+return NULL;
+  else if (temp == df_bb_regno_last_def_find (bb, regno))
+return temp;
+  else
+return NULL;
+}
+
 /* Finds the reference corresponding to the definition of REG in INSN.
DF is the dataflow object.  */
 
diff --git a/gcc/df.h b/gcc/df.h
index 84e5aa8b524..c4e690b40cf 100644
--- a/gcc/df.h
+++ b/gcc/df.h
@@ -987,6 +987,7 @@ extern void df_check_cfg_clean (void);
 #endif
 extern df_ref df_bb_regno_first_def_find (basic_block, unsigned int);
 extern df_ref df_bb_regno_last_def_find (basic_block, unsigned int);
+extern df_ref df_bb_regno_only_def_find (basic_block, unsigned int);
 extern df_ref df_find_def (rtx_insn *, rtx);
 extern bool df_reg_defined (rtx_insn *, rtx);
 extern df_ref df_find_use (rtx_insn *, rtx);
diff --git a/gcc/loop-doloop.cc b/gcc/loop-doloop.cc
index 529e810e530..8953e1de960 100644
--- a/gcc/loop-doloop.cc
+++ b/gcc/loop-doloop.cc
@@ -85,10 +85,10 @@ doloop_condition_get (rtx_insn *doloop_pat)
  forms:
 
  1)  (parallel [(set (pc) (if_then_else (condition)
-	  			(label_ref (label))
-(pc)))
-	 (set (reg) (plus (reg) (const_int -1)))
-	 (additional clobbers and uses)])
+	(label_ref (label))
+	(pc)))
+		 (set (reg) (plus (reg) (const_int -1)))
+		 (additional clobbers and uses)])
 
  The branch must be the first entry of the parallel (also required
  by jump.cc), and the second entry of the parallel must be a set of
@@ -96,19 +96,33 @@ doloop_condition_get (rtx_insn *doloop_pat)
  the loop counter in an if_then_else too.
 
  2)  (set (reg) (plus (reg) (const_int -1))
- (set (pc) (if_then_else (reg != 0)
-	 (label_ref (label))
-			 (pc))).  
+	 (set (pc) (if_then_else (reg != 0)
+ (label_ref (label))
+ (pc))).
 
- Some targets (ARM) do the comparison before the branch, as in the
+ 3) Some targets (Arm) do the comparison before the branch, as in the
  following form:
 
- 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
-   (set (reg) (plus (reg) (const_int -1)))])
-(set (pc) (if_then_else (cc == NE)
-(label_ref (label))
-(pc))) */
-
+ (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
+		(set (reg) (plus (reg) (const_int -1)))])
+ (set (pc) (if_then_else (cc == NE)
+			 (label_ref (label))
+			 (pc)))
+
+  4) This form supports a construct that is used to represent a vectorized
+  do loop with predication, however we do not need to care about the
+  details of the predication here.
+  Arm uses this construct to support MVE tail predication.
+
+  (parallel
+   [(set (pc)
+	 (if_then_else (gtu (plus (reg) (const_int -n))
+(const_int n-1))
+			   (label_ref)
+			   (pc)))
+	(set (reg) (plus (reg) (const_int -n)))
+	(additional clobbers and uses)])
+ */
   pattern = PATTERN (doloop_pat);
 
   if (GET_CODE (pattern) != PARALLEL)
@@ -173,15 +187,17 @@ doloop_condition_get (rtx_insn *doloop_pat)
   if (! REG_P (reg))
 return 0;
 
-  /* Check if something = (plus (reg) (const_int -1)).
+  /* Check if something = (plus (reg) (const_int -n)).
  On IA-64, this decrement is wrapped in an if_then_else.  */
   inc_src = SET_SRC (inc);
   if (GET_CODE (inc_src) == IF_THEN_ELSE)
 inc_src = XEXP (inc_src, 1);
   if (GET_CODE (inc_src) != PLUS
-  || XEXP (inc_src, 0) != reg
-  || XEXP (inc_src, 1) != constm1_rtx)
+  || !rtx_equal_p (XEXP (inc_src, 0), reg)
+  || 

[PATCH 0/2] arm, doloop: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

Hi,

  We held these two patches back in stage 4 because they touched 
target-agnostic code, though I am quite confident they will not affect other 
targets. Now that stage 1 has reopened, I am reposting them. I rebased them, 
and they seem to apply cleanly on trunk.
  No changes from previously reviewed patches.

  OK for trunk?

Andre Vieira (2):
  doloop: Add support for predicated vectorized loops
  arm: Add support for MVE Tail-Predicated Low Overhead Loops

 gcc/config/arm/arm-protos.h   |4 +-
 gcc/config/arm/arm.cc | 1249 -
 gcc/config/arm/arm.opt|3 +
 gcc/config/arm/iterators.md   |   15 +
 gcc/config/arm/mve.md |   50 +
 gcc/config/arm/thumb2.md  |  138 +-
 gcc/config/arm/types.md   |6 +-
 gcc/config/arm/unspecs.md |   14 +-
 gcc/df-core.cc|   15 +
 gcc/df.h  |1 +
 gcc/loop-doloop.cc|  164 ++-
 gcc/testsuite/gcc.target/arm/lob.h|  128 +-
 gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
 gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
 .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
 .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
 .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
 .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
 .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
 .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
 .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
 .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
 26 files changed, 3434 insertions(+), 149 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int32x4-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int32x4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int64x2-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int64x2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int8x16-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int8x16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-invalid-asm.c

-- 
2.17.1


Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread Qing Zhao


> On May 23, 2024, at 10:13, David Malcolm  wrote:
> 
> On Thu, 2024-05-23 at 14:03 +, Qing Zhao wrote:
> 
> [...snip...]
> 
>> Is “location_adhoc_data” an available data structure in current GCC?
>> I just searched GCC source tree, cannot find it.
> 
> It's in libcpp/include/line-table.h; see the big comment at the top of
> that file.

Thanks a lot.

I found it in libcpp/include/line-map.h. reading it right now.

Qing
> 
> Dave
> 



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread David Malcolm
On Thu, 2024-05-23 at 14:03 +, Qing Zhao wrote:

[...snip...]

> Is “location_adhoc_data” an available data structure in current GCC?
> I just searched GCC source tree, cannot find it. 

It's in libcpp/include/line-table.h; see the big comment at the top of
that file.

Dave



Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread Qing Zhao


> On May 23, 2024, at 07:46, Richard Biener  wrote:
> 
> On Wed, May 22, 2024 at 8:53 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On May 22, 2024, at 03:38, Richard Biener  
>>> wrote:
>>> 
>>> On Tue, May 21, 2024 at 11:36 PM David Malcolm  wrote:
 
 On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> Thanks for the comments and suggestions.
> 
>> On May 15, 2024, at 10:00, David Malcolm 
>> wrote:
>> 
>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>> 
 -Warray-bounds is an important option to enable the Linux kernel to keep
 the array out-of-bound errors out of the source tree.
 
 However, due to the false positive warnings reported in PR109071
 (-Warray-bounds false positive warnings due to code duplication from
 jump threading), -Warray-bounds=1 cannot be turned on by default.
 
 Although it's impossible to eliminate all the false positive warnings
 from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
 documentation says "always out of bounds"), we should minimize the
 false positive warnings in -Warray-bounds=1.
 
 The root reason for the false positive warnings reported in PR109071 is:
 
 When the thread jump optimization tries to reduce the # of branches
 inside the routine, sometimes it needs to duplicate the code and
 split it into two conditional paths.  For example:
 
 The original code:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   warn ();
 *ptr = 0;
 *val = sg->vals[index];
 if (index >= 4)
   warn ();
 *ptr = *val;
 
 return;
 }
 
 With the thread jump, the above becomes:
 
 void sparx5_set (int * ptr, struct nums * sg, int index)
 {
 if (index >= 4)
   {
 warn ();
 *ptr = 0; // Code duplications since "warn" does return;
 *val = sg->vals[index];   // same this line.
   // In this path, since it's under the condition
   // "index >= 4", the compiler knows the value
   // of "index" is larger than 4, therefore the
   // out-of-bound warning.
 warn ();
   }
 else
   {
 *ptr = 0;
 *val = sg->vals[index];
   }
 *ptr = *val;
 return;
 }
 
 We can see, after the thread jump optimization, the # of branches inside
 the routine "sparx5_set" is reduced from 2 to 1, however, due to the
 code duplication (which is needed for the correctness of the code), we
 got a false positive out-of-bound warning.
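
The two shapes quoted above can be put into a compilable form as follows.
This is only a sketch under assumptions: the layout of struct nums, the extra
val parameter (the quoted code uses val without declaring it), and the
counting warn function are all made up for the example, and the second
function is a hand-written imitation of what jump threading produces, not
compiler output.

```cpp
// Assumed layout; the original message does not show struct nums.
struct nums { int vals[4]; };

static int warn_count = 0;
static void warn() { ++warn_count; }

// Shape of the original code: two separate "index >= 4" tests.
void set_original(int* ptr, int* val, const nums* sg, int index) {
  if (index >= 4)
    warn();
  *ptr = 0;
  *val = sg->vals[index];
  if (index >= 4)
    warn();
  *ptr = *val;
}

// Hand-written imitation of the threaded shape: one test, duplicated body.
void set_threaded(int* ptr, int* val, const nums* sg, int index) {
  if (index >= 4) {
    warn();
    *ptr = 0;
    // On this path "index >= 4" is known, so this duplicated access is
    // what the array bounds checker complains about.
    *val = sg->vals[index];
    warn();
  } else {
    *ptr = 0;
    *val = sg->vals[index];
  }
  *ptr = *val;
}
```

For in-range indices both functions store the same values and never call
warn; the difference is only in what the bounds checker can prove about the
duplicated access.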
 
 In order to eliminate such false positive out-of-bound warning,
 
 A. Add one more flag for GIMPLE: is_splitted.
 B. During the thread jump optimization, when the basic blocks are
  duplicated, mark all the STMTs inside the original and duplicated
  basic blocks as "is_splitted";
 C. Inside the array bound checker, add the following new heuristic:
 
 If
  1. the stmt is duplicated and splitted into two conditional paths;
 +  2. the warning level < 2;
 +  3. the current block is not dominating the exit block
 Then not report the warning.
 
 The false positive warnings are moved from -Warray-bounds=1 to
 -Warray-bounds=2 now.
 
 Bootstrapped and regression tested on both x86 and aarch64.  Adjusted
 -Warray-bounds-61.c due to the false positive warnings.
 
 Let me know if you have any comments and suggestions.
>>> 
>>> At the last Cauldron I talked with David Malcolm about these kind of
>>> issues and thought of instead of suppressing diagnostics to record
>>> how a block was duplicated.  For jump threading my idea was to record
>>> the condition that was proved true when entering the path and do this
>>> by recording the corresponding locations
> 
> Is recording the location for only the TRUE path enough?
> We might need to record the corresponding locations for both TRUE and
> FALSE paths since the VRP might be more accurate on both paths.
> Is recording only the location enough?
> Do we need to record the pointer to the original condition stmt?
 
 Just to be 

[PATCH] RISC-V: Avoid splitting store dataref groups during SLP discovery

2024-05-23 Thread Richard Biener
The following avoids splitting store dataref groups during SLP
discovery but instead forces (eventually single-lane) consecutive
lane SLP discovery for all lanes of the group, creating VEC_PERM
SLP nodes merging them so the store will always cover the whole group.

With this for example

int x[1024], y[1024], z[1024], w[1024];
void foo (void)
{
  for (int i = 0; i < 256; i++)
{
  x[4*i+0] = y[2*i+0];
  x[4*i+1] = y[2*i+1];
  x[4*i+2] = z[i];
  x[4*i+3] = w[i];
}
}

which was previously using hybrid SLP can now be fully SLPed and
SSE code generated looks better (but of course you never know,
I didn't actually benchmark).  We of course need a VF of four here.

.L2:
movdqa  z(%rax), %xmm0
movdqa  w(%rax), %xmm4
movdqa  y(%rax,%rax), %xmm2
movdqa  y+16(%rax,%rax), %xmm1
movdqa  %xmm0, %xmm3
punpckhdq   %xmm4, %xmm0
punpckldq   %xmm4, %xmm3
movdqa  %xmm2, %xmm4
shufps  $238, %xmm3, %xmm2
movaps  %xmm2, x+16(,%rax,4)
movdqa  %xmm1, %xmm2
shufps  $68, %xmm3, %xmm4
shufps  $68, %xmm0, %xmm2
movaps  %xmm4, x(,%rax,4)
shufps  $238, %xmm0, %xmm1
movaps  %xmm2, x+32(,%rax,4)
movaps  %xmm1, x+48(,%rax,4)
addq$16, %rax
cmpq$1024, %rax
jne .L2
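
To make the lane layout concrete, here is a scalar model of how the whole
four-lane store group is filled from three separate branches (two lanes from
y, one each from z and w) when four iterations are handled together.  It is
only a sketch: the function names and the separate x_scalar/x_merged arrays
are assumptions for the example, and it does not reflect the actual SLP data
structures.

```cpp
const int kIters = 256;  // scalar iterations, as in foo above
int x_scalar[1024], x_merged[1024], y[1024], z[1024], w[1024];

void fill_inputs() {
  for (int i = 0; i < 1024; ++i) {
    y[i] = i;
    z[i] = 10000 + i;
    w[i] = 20000 + i;
  }
}

// Scalar reference: the same store group as foo.
void store_scalar() {
  for (int i = 0; i < kIters; ++i) {
    x_scalar[4 * i + 0] = y[2 * i + 0];
    x_scalar[4 * i + 1] = y[2 * i + 1];
    x_scalar[4 * i + 2] = z[i];
    x_scalar[4 * i + 3] = w[i];
  }
}

// Model with a vectorization factor of four: each outer trip produces
// 16 consecutive elements of x by picking lanes from the three branches.
void store_merged() {
  const int vf = 4;
  for (int i = 0; i < kIters; i += vf) {
    for (int lane = 0; lane < 4 * vf; ++lane) {
      int iter = i + lane / 4;  // which scalar iteration this lane belongs to
      int slot = lane % 4;      // position inside the store group
      int value;
      if (slot < 2)
        value = y[2 * iter + slot];  // two-lane branch from y
      else if (slot == 2)
        value = z[iter];             // single-lane branch from z
      else
        value = w[iter];             // single-lane branch from w
      x_merged[4 * i + lane] = value;
    }
  }
}

bool same_result() {
  for (int i = 0; i < 4 * kIters; ++i)
    if (x_scalar[i] != x_merged[i])
      return false;
  return true;
}
```

After fill_inputs, store_scalar and store_merged, same_result should hold;
the lane selection in store_merged is the part that the VEC_PERM nodes stand
for in the model.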

The extra permute nodes merging distinct branches of the SLP
tree might be unexpected for some code, esp. since
SLP_TREE_REPRESENTATIVE cannot be meaningfully set and we
cannot populate SLP_TREE_SCALAR_STMTS or SLP_TREE_SCALAR_OPS
consistently as we can have a mix of both.

The patch keeps the sub-trees formed from consecutive lanes but that's
in principle not necessary if we for example have an even/odd
split which now would result in N single-lane sub-trees.  That's
left for future improvements.

The interesting part is how VLA vector ISAs handle merging of
two vectors that's not trivial even/odd merging.  The strategy
of how to build the permute tree might need adjustments for that
(in the end splitting each branch to single lanes and then doing
even/odd merging would be the brute-force fallback).  Not sure
how much we can or should rely on the SLP optimize pass to handle
this.

The gcc.dg/vect/slp-12a.c case is interesting as we currently split
the 8 store group into lanes 0-5 which we SLP with an unroll factor
of two (on x86-64 with SSE) and the remaining two lanes are using
interleaving vectorization with a final unroll factor of four.  Thus
we're using hybrid SLP within a single store group.  After the change
we discover the same 0-5 lane SLP part as well as two single-lane
parts feeding the full store group.  But that results in a load
permutation that isn't supported (I have WIP patches to rectify that).
So we end up cancelling SLP and vectorizing the whole loop with
interleaving which is IMO good and results in better code.

This is similar for gcc.target/i386/pr52252-atom.c where interleaving
generates much better code than hybrid SLP.  I'm unsure how to update
the testcase though.

gcc.dg/vect/slp-21.c runs into similar situations.  Note that when
we discard an instance while analyzing SLP operations we currently
force the full loop to have no SLP because hybrid detection is
broken.  It's probably not worth fixing this at this moment.

For gcc.dg/vect/pr97428.c we are not splitting the 16 store group
into two but merge the two 8 lane loads into one before doing the
store and thus have only a single SLP instance.  A similar situation
happens in gcc.dg/vect/slp-11c.c but the branches feeding the
single SLP store only have a single lane.  Likewise for
gcc.dg/vect/vect-complex-5.c and gcc.dg/vect/vect-gather-2.c.

gcc.dg/vect/slp-cond-1.c has an additional SLP vectorization
with a SLP store group of size two but two single-lane branches.

(merged with the testsuite changes, re-posted because the RISC-V
CI ran on a tree w/o a fix, hopefully fixing all the reported
ICEs)

* tree-vect-slp.cc (vect_build_slp_instance): Do not split
store dataref groups on loop SLP discovery failure but create
a single SLP instance for the stores but branch to SLP sub-trees
and merge with a series of VEC_PERM nodes.

* gcc.dg/vect/pr97428.c: Expect a single store SLP group.
* gcc.dg/vect/slp-11c.c: Likewise, if !vect_load_lanes.
* gcc.dg/vect/vect-complex-5.c: Likewise.
* gcc.dg/vect/slp-12a.c: Do not expect SLP.
* gcc.dg/vect/slp-21.c: Remove not important scanning for SLP.
* gcc.dg/vect/slp-cond-1.c: Expect one more SLP if !vect_load_lanes.
* gcc.dg/vect/vect-gather-2.c: Expect SLP to be used.
* gcc.target/i386/pr52252-atom.c: XFAIL test for palignr.
---
 gcc/testsuite/gcc.dg/vect/pr97428.c  |   2 +-
 gcc/testsuite/gcc.dg/vect/slp-11c.c  |   6 +-
 gcc/testsuite/gcc.dg/vect/slp-12a.c  |   6 +-
 gcc/testsuite/gcc.dg/vect/slp-21.c   |  18 +-
 

Re: [PATCH] wwwdocs: contribute.html: Update consensus on patch content.

2024-05-23 Thread Christophe Lyon
On Mon, 20 May 2024 at 15:23, Nick Clifton  wrote:
>
> Hi Christophe,
>
> > I have a follow-up one: I think the same applies to binutils, but I
> > don't think any maintainer / contributor expressed an opinion, and
> > IIUC patch policy for binutils is (lightly) documented at
> > https://sourceware.org/binutils/wiki/HowToContribute
> > Maybe Nick can update it?
>
> Done.

Thanks!

>
> > (I don't have such rights)
>
> Would you like them ?  It is easy enough to set up.
>
No need to bother :-)

Christophe

> Cheers
>Nick
>
>


[PATCH] tree-optimization/115197 - fix ICE w/ constant in LC PHI and loop distribution

2024-05-23 Thread Richard Biener
Forgot a check for an SSA name before trying to replace a PHI arg with
its current definition.
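
The shape of the fix can be sketched outside of GCC as follows.  PhiArg and
remap_arg are hypothetical stand-ins, not the GIMPLE API, and returning the
argument unchanged when no mapping exists is an assumption of the sketch.
The point is only that a constant argument has no current definition to look
up, so it must be left alone.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a PHI argument: an SSA name or a constant.
struct PhiArg {
  bool is_ssa_name;
  std::string name;  // meaningful only when is_ssa_name is true
  long constant;     // meaningful only when is_ssa_name is false
};

// Replace an SSA-name argument by its current definition; constants
// remain the same.
PhiArg remap_arg(const PhiArg& arg,
                 const std::map<std::string, std::string>& current_def) {
  if (!arg.is_ssa_name)
    return arg;
  std::map<std::string, std::string>::const_iterator it =
      current_def.find(arg.name);
  if (it == current_def.end())
    return arg;  // assumption: keep the name when nothing is recorded
  PhiArg renamed = arg;
  renamed.name = it->second;
  return renamed;
}
```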

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/115197
* tree-loop-distribution.cc (copy_loop_before): Constant PHI
args remain the same.

* gcc.dg/pr115197.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr115197.c | 14 ++
 gcc/tree-loop-distribution.cc   |  7 +--
 2 files changed, 19 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr115197.c

diff --git a/gcc/testsuite/gcc.dg/pr115197.c b/gcc/testsuite/gcc.dg/pr115197.c
new file mode 100644
index 000..00d674b3bd9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr115197.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fno-tree-scev-cprop -ftree-pre 
-ftree-loop-distribute-patterns" } */
+
+int a, b[2], c, d, e, f[2];
+int main() {
+  while (a)
+if (d) {
+  if (e)
+return 0;
+  for (; c; c++)
+f[c] = 0 < (b[c] = ~(f[c + 1] < a));
+}
+  return 0;
+}
diff --git a/gcc/tree-loop-distribution.cc b/gcc/tree-loop-distribution.cc
index 668dc420449..4d1ed234fcb 100644
--- a/gcc/tree-loop-distribution.cc
+++ b/gcc/tree-loop-distribution.cc
@@ -977,8 +977,11 @@ copy_loop_before (class loop *loop, bool 
redirect_lc_phi_defs)
  if (virtual_operand_p (gimple_phi_result (phi)))
continue;
  use_operand_p use_p = PHI_ARG_DEF_PTR_FROM_EDGE (phi, exit);
- tree new_def = get_current_def (USE_FROM_PTR (use_p));
- SET_USE (use_p, new_def);
+ if (TREE_CODE (USE_FROM_PTR (use_p)) == SSA_NAME)
+   {
+ tree new_def = get_current_def (USE_FROM_PTR (use_p));
+ SET_USE (use_p, new_def);
+   }
}
 }
 
-- 
2.35.3


Re: [PATCH] [testsuite] conditionalize dg-additional-sources on target and type

2024-05-23 Thread Christophe Lyon
Hi Alexandre,


On Thu, 23 May 2024 at 15:29, Alexandre Oliva  wrote:
>
> On Apr 30, 2024, Christophe Lyon  wrote:
>
> > On Tue, 30 Apr 2024 at 01:31, Alexandre Oliva  wrote:
> >> >> for  gcc/testsuite/ChangeLog
> >> >>
> >> >> * lib/target-supports.exp (check_vect_support_and_set_flags):
> >> >> Decay to link rather than compile.
> >>
> >> Alas, linking may fail because of an incompatible libc, as reported by
> >> Linaro with a link to their issue GNU-1206 (I'm not posting the link to
> >> the fully-Javascrippled Jira web page; it shows nothing useful, and I
> >> can't post feedback there) and to
> >> https://ci.linaro.org/job/tcwg_gnu_embed_check_gcc--master-thumb_m7_hard_eabi-build/10/artifact/artifacts/00-sumfiles/
> >> (where I could get useful information)
> >>
> >> I'm reverting the patch, and I'll see about some alternate approach
>
> > Indeed, that's another instance of the tricky multilibs configuration 
> > issues.
>
> > - we run the tests with
> > qemu/-mthumb/-march=armv7e-m+fp.dp/-mtune=cortex-m7/-mfloat-abi=hard/-mfpu=auto
> > which matches the GCC configuration flags,
> > but the vect.exp tests add -mfpu=neon -mfloat-abi=softfp -march=armv7-a
> > and link fails because the toolchain does not support softfp libs
>
> Hello, Christophe, thanks for the info.
>
> I came up with an entirely different approach:
>
>
> g++.dg/vect/pr95401.cc has dg-additional-sources, and that fails when
> check_vect_support_and_set_flags finds vector support lacking for
> execution tests: tests decay to compile tests, and additional sources
> are rejected by the compiler when compiling to a named output file.
>
> At first I considered using some effective target to conditionalize
> the additional sources.  There was no support for target-specific
> additional sources, so I added that.
>
> But then, I found that adding an effective target to check whether the
> test involves linking would just make for busy work in this case, and
> so I went ahead and adjusted the handling of additional sources to
> refrain from adding them on compile tests, reporting them as
> unsupported.
>
> That solves the problem without using the newly-added machinery for
> per-target additional sources, but I figured since I'd implemented it
> I might as well contribute it, since there might be other uses for it.

Thanks for improving this, LGTM at a quick glance, but I can't approve :-)

Christophe

>
> Regstrapped on x86_64-linux-gnu.  Also tested on ppc64-vx7r2 with
> gcc-13.  Ok to install?
>
>
> for  gcc/ChangeLog
>
> * doc/sourcebuild.texi (dg-additional-sources): Document
> newly-added support for target selectors, and implicit discard
> on non-linking tests that name the compiler output explicitly.
>
> for  gcc/testsuite/ChangeLog
>
> * lib/gcc-defs.exp (dg-additional-sources): Support target
> selectors.  Make it cumulative.
> (dg-additional-files-options): Take dest and type.  Note
> unsupported additional sources when not linking and naming the
> compiler output.  Adjust source dirname prepending to cope
> with leading blanks.
> * lib/g++.exp (g++_target_compile): Pass dest and type on to
> dg-additional-files-options.
> * lib/gcc.exp (gcc_target_compile): Likewise.
> * lib/gdc.exp (gdc_target_compile): Likewise.
> * lib/gfortran.exp (gfortran_target_compile): Likewise.
> * lib/go.exp (go_target_compile): Likewise.
> * lib/obj-c++.exp (obj-c++_target_compile): Likewise.
> * lib/objc.exp (objc_target_compile): Likewise.
> * lib/rust.exp (rust_target_compile): Likewise.
> * lib/profopt.exp (profopt-execute): Likewise-ish.
> ---
>  gcc/doc/sourcebuild.texi   |8 +++-
>  gcc/testsuite/lib/g++.exp  |2 +-
>  gcc/testsuite/lib/gcc-defs.exp |   35 ++-
>  gcc/testsuite/lib/gcc.exp  |2 +-
>  gcc/testsuite/lib/gdc.exp  |2 +-
>  gcc/testsuite/lib/gfortran.exp |2 +-
>  gcc/testsuite/lib/go.exp   |2 +-
>  gcc/testsuite/lib/obj-c++.exp  |2 +-
>  gcc/testsuite/lib/objc.exp |2 +-
>  gcc/testsuite/lib/profopt.exp  |2 +-
>  gcc/testsuite/lib/rust.exp |2 +-
>  11 files changed, 46 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 8e4e59ac44c74..e997dbec3334b 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1320,9 +1320,15 @@ to @var{var_value} before execution of the program 
> created by the test.
>  Specify additional files, other than source files, that must be copied
>  to the system where the compiler runs.
>
> -@item @{ dg-additional-sources "@var{filelist}" @}
> +@item @{ dg-additional-sources "@var{filelist}" [@{ target @var{selector} 
> @}] @}
>  Specify additional source files to appear in the compile line
>  following the main test file.
> +If the directive includes the optional @samp{@{ @var{selector} 

Re: [PATCH] c++/modules: Ensure all partial specialisations are tracked [PR114947]

2024-05-23 Thread Nathaniel Shead
On Sun, May 12, 2024 at 11:29:39PM +1000, Nathaniel Shead wrote:
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> Constrained partial specialisations aren't all necessarily tracked on
> the instantiation table.  The modules code uses a separate
> 'partial_specializations' table to track them instead to ensure that
> they get walked and emitted when emitting a module, but currently this
> does not always happen.
> 
> The attached testcase fails in two ways.  First, because the partial
> specialisation is just a declaration (and not a definition),
> 'set_defining_module' never ends up getting called on it and so it never
> gets added to the partial specialisation table.  We fix this by ensuring
> that when partial specializations are created they always get added, and
> so we never miss one. To prevent adding partial specialisations multiple
> times we split this out as a new function.
> 
> The second way it fails is that when exporting the primary interface for
> a module with partitions, we also re-walk the specializations of all
> imported partitions to merge them into a single BMI.  So this patch
> ensures that after calling 'match_mergeable_specialization' we also
> ensure that if the name came from a partition it gets added to the
> specialization table so that a dependency is correctly created for it.
> 
>   PR c++/114947
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (set_defining_module_for_partial_spec): Declare.
>   * module.cc (trees_in::decl_value): Track partial specs coming
>   from partitions.
>   (set_defining_module): Don't track partial specialisations here
>   anymore.
>   (set_defining_module_for_partial_spec): New function.
>   * pt.cc (process_partial_specialization): Call it.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/modules/partial-4_a.C: New test.
>   * g++.dg/modules/partial-4_b.C: New test.
> 
> Signed-off-by: Nathaniel Shead 
> ---
>  gcc/cp/cp-tree.h   |  1 +
>  gcc/cp/module.cc   | 22 ++
>  gcc/cp/pt.cc   |  2 ++
>  gcc/testsuite/g++.dg/modules/partial-4_a.C |  8 
>  gcc/testsuite/g++.dg/modules/partial-4_b.C |  5 +
>  5 files changed, 34 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/modules/partial-4_a.C
>  create mode 100644 gcc/testsuite/g++.dg/modules/partial-4_b.C
> 
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index db098c32f2d..2580bf05fb2 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -7418,6 +7418,7 @@ extern unsigned get_importing_module (tree, bool = 
> false) ATTRIBUTE_PURE;
>  /* Where current instance of the decl got declared/defined/instantiated.  */
>  extern void set_instantiating_module (tree);
>  extern void set_defining_module (tree);
> +extern void set_defining_module_for_partial_spec (tree);
>  extern void maybe_key_decl (tree ctx, tree decl);
>  extern void propagate_defining_module (tree decl, tree orig);
>  extern void remove_defining_module (tree decl);
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index 520dd710549..3ca963cb3e9 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -8416,6 +8416,11 @@ trees_in::decl_value ()
> add_mergeable_specialization (!is_type, , decl, spec_flags);
>   }
>  
> +  /* When making a CMI from a partition we're going to need to walk 
> partial
> +  specializations again, so make sure they're tracked.  */
> +  if (state->is_partition () && (spec_flags & 2))
> + set_defining_module_for_partial_spec (inner);
> +
>if (NAMESPACE_SCOPE_P (decl)
> && (mk == MK_named || mk == MK_unique
> || mk == MK_enum || mk == MK_friend_spec)
> @@ -19246,13 +19251,22 @@ set_defining_module (tree decl)
> vec_safe_push (class_members, decl);
>   }
>   }
> -  else if (DECL_IMPLICIT_TYPEDEF_P (decl)
> -&& CLASSTYPE_TEMPLATE_SPECIALIZATION (TREE_TYPE (decl)))
> - /* This is a partial or explicit specialization.  */
> - vec_safe_push (partial_specializations, decl);
>  }
>  }
>  
> +/* Also remember DECL if it's a newly declared class template partial
> +   specialization, because these are not necessarily added to the
> +   instantiation tables.  */
> +
> +void
> +set_defining_module_for_partial_spec (tree decl)
> +{
> +  if (module_p ()

...but this can be 'module_maybe_has_cmi_p' if my change to
'module_has_cmi_p' to include header units is merged first.

> +  && DECL_IMPLICIT_TYPEDEF_P (decl)
> +  && CLASSTYPE_TEMPLATE_SPECIALIZATION (TREE_TYPE (decl)))
> +vec_safe_push (partial_specializations, decl);
> +}
> +
>  void
>  set_originating_module (tree decl, bool friend_p ATTRIBUTE_UNUSED)
>  {
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 1816bfd1f40..6d33bac90b0 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -5456,6 +5456,8 @@ process_partial_specialization (tree 

Re: [PATCH] Avoid vector -Wfree-nonheap-object warnings

2024-05-23 Thread Jonathan Wakely

On 23/05/24 06:55 +0200, François Dumont wrote:

As explained in this email:

https://gcc.gnu.org/pipermail/libstdc++/2024-April/058552.html

I ran into -Wfree-nonheap-object warnings because of my enhancements to the algorithms.

So here is a patch to extend the usage of the _Guard type to other 
parts of vector.


Nice, that fixes the warning you were seeing?

We recently got a bug report about -Wfree-nonheap-object in
std::vector, but that is coming from _M_realloc_append which already
uses the RAII guard :-(
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115016
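
For readers following along, the RAII pattern under discussion can be
sketched roughly as below.  This is only an illustration, assuming
std::allocator<int> and made-up names (DeallocGuard, allocate_and_fill,
guard_frees); it is not the libstdc++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Counts how often a guard actually freed storage (illustration only).
static int guard_frees = 0;

// Hypothetical stand-in for the vector-internal guard: it owns a raw
// allocation until release() is called.
struct DeallocGuard
{
  int* storage;
  std::size_t len;
  std::allocator<int>& alloc;

  DeallocGuard(int* s, std::size_t l, std::allocator<int>& a)
  : storage(s), len(l), alloc(a) { }

  ~DeallocGuard()
  {
    if (storage)
      {
        alloc.deallocate(storage, len);
        ++guard_frees;
      }
  }

  // Hand ownership to the caller; the destructor becomes a no-op.
  int* release()
  {
    int* res = storage;
    storage = nullptr;
    return res;
  }

  DeallocGuard(const DeallocGuard&) = delete;
  DeallocGuard& operator=(const DeallocGuard&) = delete;
};

// Allocate n ints and fill them.  `fail` simulates an element
// constructor that throws.  No try/catch is needed: the guard frees
// the storage during stack unwinding.
int* allocate_and_fill(std::allocator<int>& alloc, std::size_t n,
                       int value, bool fail)
{
  DeallocGuard guard(alloc.allocate(n), n, alloc);
  if (fail)
    throw std::runtime_error("simulated constructor failure");
  std::uninitialized_fill_n(guard.storage, n, value);
  return guard.release();
}
```

On the success path the caller receives the storage and the guard does
nothing; on the throwing path the guard's destructor is the only place
that deallocates.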


    libstdc++: Use RAII to replace try/catch blocks

    Move _Guard into std::vector declaration and use it to guard all calls to
    vector _M_allocate.

    Doing so gives the compiler more visibility into what is done with the
    pointers, so it no longer raises the -Wfree-nonheap-object warning.

    libstdc++-v3/ChangeLog:

    * include/bits/vector.tcc (_Guard): Move...
    * include/bits/stl_vector.h: ...here.
    (_M_allocate_and_copy): Use latter.
    (_M_initialize_dispatch): Likewise and set _M_finish first from the
    result of __uninitialized_fill_n_a that can throw.
    (_M_range_initialize): Likewise.

Tested under Linux x86_64, ok to commit ?

François




diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index 31169711a48..4ea74e3339a 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -1607,6 +1607,39 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  clear() _GLIBCXX_NOEXCEPT
  { _M_erase_at_end(this->_M_impl._M_start); }

+private:
+  // RAII guard for allocated storage.
+  struct _Guard


If it's being defined at class scope instead of locally in a member
function, I think a better name would be good. Maybe _Ptr_guard or
_Dealloc_guard or something.


+  {
+   pointer _M_storage; // Storage to deallocate
+   size_type _M_len;
+   _Base& _M_vect;
+
+   _GLIBCXX20_CONSTEXPR
+   _Guard(pointer __s, size_type __l, _Base& __vect)
+   : _M_storage(__s), _M_len(__l), _M_vect(__vect)
+   { }
+
+   _GLIBCXX20_CONSTEXPR
+   ~_Guard()
+   {
+ if (_M_storage)
+   _M_vect._M_deallocate(_M_storage, _M_len);
+   }
+
+   _GLIBCXX20_CONSTEXPR
+   pointer
+   _M_release()
+   {
+ pointer __res = _M_storage;
+ _M_storage = 0;


I don't think the NullablePointer requirements include assigning 0,
only from nullptr, which isn't valid in C++98.

https://en.cppreference.com/w/cpp/named_req/NullablePointer

Please use _M_storage = pointer() instead.
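
A small sketch of why value-initialization is the portable spelling,
assuming a made-up fancy pointer type (FancyPtr) and helper
(release_storage) that exist only for this example.  It is written so
that it would also be acceptable to a C++98 compiler:

```cpp
#include <cassert>

// Hypothetical fancy pointer: default construction yields a null
// value, but there is no assignment from the integer literal 0.
template<typename T>
struct FancyPtr
{
  T* raw;
  FancyPtr() : raw() { }                 // value-initialized: null
  explicit FancyPtr(T* p) : raw(p) { }
  bool is_null() const { return raw == 0; }
};

// Generic "release": resets the stored pointer with Pointer(), which
// works for raw pointers and for the fancy pointer alike.
template<typename Pointer>
Pointer release_storage(Pointer& storage)
{
  Pointer res = storage;
  storage = Pointer();
  return res;
}
```

With `int*` the expression `Pointer()` is a null pointer; with
FancyPtr<int> it calls the default constructor, so the same line serves
both cases.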


+ return __res;
+   }
+
+  private:
+   _Guard(const _Guard&);
+  };
+
protected:
  /**
   *  Memory expansion handler.  Uses the member allocation function to
@@ -1618,18 +1651,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
_M_allocate_and_copy(size_type __n,
 _ForwardIterator __first, _ForwardIterator __last)
{
- pointer __result = this->_M_allocate(__n);
- __try
-   {
- std::__uninitialized_copy_a(__first, __last, __result,
- _M_get_Tp_allocator());
- return __result;
-   }
- __catch(...)
-   {
- _M_deallocate(__result, __n);
- __throw_exception_again;
-   }
+ _Guard __guard(this->_M_allocate(__n), __n, *this);
+ std::__uninitialized_copy_a
+   (__first, __last, __guard._M_storage, _M_get_Tp_allocator());
+ return __guard._M_release();
}


@@ -1642,13 +1667,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  // 438. Ambiguity in the "do the right thing" clause
  template<typename _Integer>
void
-   _M_initialize_dispatch(_Integer __n, _Integer __value, __true_type)
+   _M_initialize_dispatch(_Integer __int_n, _Integer __value, __true_type)
{
- this->_M_impl._M_start = _M_allocate(_S_check_init_len(
-   static_cast<size_type>(__n), _M_get_Tp_allocator()));
- this->_M_impl._M_end_of_storage =
-   this->_M_impl._M_start + static_cast<size_type>(__n);
- _M_fill_initialize(static_cast<size_type>(__n), __value);


Please fix the comment on _M_fill_initialize if you're removing the
use of it here.


+ const size_type __n = static_cast<size_type>(__int_n);
+ _Guard __guard(_M_allocate(_S_check_init_len(
+   __n, _M_get_Tp_allocator())), __n, *this);


I think this would be easier to read if the _S_check_init_len call was
done first, and maybe the allocation too, since we are going to need a
local __start later anyway. So maybe like this:

  template<typename _Integer>
void
_M_initialize_dispatch(_Integer __ni, _Integer __value, __true_type)
{
  const size_type __n = static_cast<size_type>(__ni);
  pointer __start = _M_allocate(_S_check_init_len(__n),

[PATCH 2/2] c++/modules: Remember that header units have CMIs

2024-05-23 Thread Nathaniel Shead
And here's that patch.  As far as I can tell there should be no visible
change anymore, so there aren't any testcases.

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

This appears to be an oversight in the definition of module_has_cmi_p.
This change will allow us, in the future, to use the function directly
in more places that only need to do additional work when generating a
module CMI, so that the work is done only when we know we need it.
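
To illustrate the intended behaviour of the predicates, here is a
standalone sketch.  The bit values, the struct name and the definition
of named_module_p are assumptions made for this example only; the other
predicates mirror the expressions in the diff.

```cpp
#include <cassert>

// Hypothetical bit values, chosen only for this sketch.
enum module_kind_bits : unsigned
{
  MK_NAMED     = 1u << 0,
  MK_HEADER    = 1u << 1,
  MK_INTERFACE = 1u << 2,
  MK_PARTITION = 1u << 3,
  MK_PURVIEW   = 1u << 4
};

struct module_state_sketch
{
  unsigned module_kind;

  // Assumed definition; not shown in the patch.
  bool named_module_p () const
  { return (module_kind & MK_NAMED) != 0; }

  bool module_purview_p () const
  { return (module_kind & MK_PURVIEW) != 0; }

  // With the patch applied, header units count as having a CMI.
  bool module_has_cmi_p () const
  { return (module_kind & (MK_INTERFACE | MK_PARTITION | MK_HEADER)) != 0; }

  // Tentatively true while still in the GMF of a named module.
  bool module_maybe_has_cmi_p () const
  { return module_has_cmi_p () || (named_module_p () && !module_purview_p ()); }
};
```

In this model a header unit satisfies both predicates, the GMF of a
named module satisfies only the "maybe" form, and a non-modular TU
satisfies neither.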

gcc/cp/ChangeLog:

* cp-tree.h (module_has_cmi_p): Also include header units.
(module_maybe_has_cmi_p): Update comment.
* module.cc (set_defining_module): Only need to track
declarations for later exporting if the module may have a CMI.
* name-lookup.cc (pushdecl): Likewise.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/cp-tree.h  | 7 +++
 gcc/cp/module.cc  | 2 +-
 gcc/cp/name-lookup.cc | 2 +-
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index ba9e848c177..9472759d3c8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7381,7 +7381,7 @@ inline bool module_interface_p ()
 inline bool module_partition_p ()
 { return module_kind & MK_PARTITION; }
 inline bool module_has_cmi_p ()
-{ return module_kind & (MK_INTERFACE | MK_PARTITION); }
+{ return module_kind & (MK_INTERFACE | MK_PARTITION | MK_HEADER); }
 
 inline bool module_purview_p ()
 { return module_kind & MK_PURVIEW; }
@@ -7393,9 +7393,8 @@ inline bool named_module_purview_p ()
 inline bool named_module_attach_p ()
 { return named_module_p () && module_attach_p (); }
 
-/* We don't know if this TU will have a CMI while parsing the GMF,
-   so tentatively assume that it might, for the purpose of determining
-   whether no-linkage decls could be used by an importer.  */
+/* Like module_has_cmi_p, but tentatively assumes that this TU may have a
+   CMI if we haven't seen the module-declaration yet.  */
 inline bool module_maybe_has_cmi_p ()
 { return module_has_cmi_p () || (named_module_p () && !module_purview_p ()); }
 
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 520dd710549..8639ed6f1a2 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -19216,7 +19216,7 @@ set_defining_module (tree decl)
   gcc_checking_assert (!DECL_LANG_SPECIFIC (decl)
   || !DECL_MODULE_IMPORT_P (decl));
 
-  if (module_p ())
+  if (module_maybe_has_cmi_p ())
 {
   /* We need to track all declarations within a module, not just those
 in the module purview, because we don't necessarily know yet if
diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 78f08acffaa..f1f8c19feb1 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -4103,7 +4103,7 @@ pushdecl (tree decl, bool hiding)
 
  if (level->kind == sk_namespace
  && TREE_PUBLIC (level->this_entity)
- && module_p ())
+ && module_maybe_has_cmi_p ())
maybe_record_mergeable_decl (slot, name, decl);
}
 }
-- 
2.43.2



[PATCH] [testsuite] conditionalize dg-additional-sources on target and type

2024-05-23 Thread Alexandre Oliva
On Apr 30, 2024, Christophe Lyon  wrote:

> On Tue, 30 Apr 2024 at 01:31, Alexandre Oliva  wrote:
>> >> for  gcc/testsuite/ChangeLog
>> >>
>> >> * lib/target-supports.exp (check_vect_support_and_set_flags):
>> >> Decay to link rather than compile.
>> 
>> Alas, linking may fail because of an incompatible libc, as reported by
>> Linaro with a link to their issue GNU-1206 (I'm not posting the link to
>> the fully-Javascrippled Jira web page; it shows nothing useful, and I
>> can't post feedback there) and to
>> https://ci.linaro.org/job/tcwg_gnu_embed_check_gcc--master-thumb_m7_hard_eabi-build/10/artifact/artifacts/00-sumfiles/
>> (where I could get useful information)
>> 
>> I'm reverting the patch, and I'll see about some alternate approach

> Indeed, that's another instance of the tricky multilibs configuration issues.

> - we run the tests with
> qemu/-mthumb/-march=armv7e-m+fp.dp/-mtune=cortex-m7/-mfloat-abi=hard/-mfpu=auto
> which matches the GCC configuration flags,
> but the vect.exp tests add -mfpu=neon -mfloat-abi=softfp -march=armv7-a
> and link fails because the toolchain does not support softfp libs

Hello, Christophe, thanks for the info.

I came up with an entirely different approach:


g++.dg/vect/pr95401.cc has dg-additional-sources, and that fails when
check_vect_support_and_set_flags finds vector support lacking for
execution tests: tests decay to compile tests, and additional sources
are rejected by the compiler when compiling to a named output file.

At first I considered using some effective target to conditionalize
the additional sources.  There was no support for target-specific
additional sources, so I added that.

But then, I found that adding an effective target to check whether the
test involves linking would just make for busy work in this case, and
so I went ahead and adjusted the handling of additional sources to
refrain from adding them on compile tests, reporting them as
unsupported.

That solves the problem without using the newly-added machinery for
per-target additional sources, but I figured since I'd implemented it
I might as well contribute it, since there might be other uses for it.

Regstrapped on x86_64-linux-gnu.  Also tested on ppc64-vx7r2 with
gcc-13.  Ok to install?


for  gcc/ChangeLog

* doc/sourcebuild.texi (dg-additional-sources): Document
newly-added support for target selectors, and implicit discard
on non-linking tests that name the compiler output explicitly.

for  gcc/testsuite/ChangeLog

* lib/gcc-defs.exp (dg-additional-sources): Support target
selectors.  Make it cumulative.
(dg-additional-files-options): Take dest and type.  Note
unsupported additional sources when not linking and naming the
compiler output.  Adjust source dirname prepending to cope
with leading blanks.
* lib/g++.exp (g++_target_compile): Pass dest and type on to
dg-additional-files-options.
* lib/gcc.exp (gcc_target_compile): Likewise.
* lib/gdc.exp (gdc_target_compile): Likewise.
* lib/gfortran.exp (gfortran_target_compile): Likewise.
* lib/go.exp (go_target_compile): Likewise.
* lib/obj-c++.exp (obj-c++_target_compile): Likewise.
* lib/objc.exp (objc_target_compile): Likewise.
* lib/rust.exp (rust_target_compile): Likewise.
* lib/profopt.exp (profopt-execute): Likewise-ish.
---
 gcc/doc/sourcebuild.texi   |8 +++-
 gcc/testsuite/lib/g++.exp  |2 +-
 gcc/testsuite/lib/gcc-defs.exp |   35 ++-
 gcc/testsuite/lib/gcc.exp  |2 +-
 gcc/testsuite/lib/gdc.exp  |2 +-
 gcc/testsuite/lib/gfortran.exp |2 +-
 gcc/testsuite/lib/go.exp   |2 +-
 gcc/testsuite/lib/obj-c++.exp  |2 +-
 gcc/testsuite/lib/objc.exp |2 +-
 gcc/testsuite/lib/profopt.exp  |2 +-
 gcc/testsuite/lib/rust.exp |2 +-
 11 files changed, 46 insertions(+), 15 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 8e4e59ac44c74..e997dbec3334b 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1320,9 +1320,15 @@ to @var{var_value} before execution of the program 
created by the test.
 Specify additional files, other than source files, that must be copied
 to the system where the compiler runs.
 
-@item @{ dg-additional-sources "@var{filelist}" @}
+@item @{ dg-additional-sources "@var{filelist}" [@{ target @var{selector} @}] 
@}
 Specify additional source files to appear in the compile line
 following the main test file.
+If the directive includes the optional @samp{@{ @var{selector} @}}
+then the additional sources are only added if the target system
+matches the @var{selector}.
+Additional sources are generally used only in @samp{link} and @samp{run}
+tests; they are reported as unsupported and discarded in other kinds of
+tests that direct the compiler to output to a single file.
 @end table
 
 @subsubsection Add checks at the end 

[PATCH 1/2] c++/modules: Fix treatment of unnamed types

2024-05-23 Thread Nathaniel Shead
On Mon, May 20, 2024 at 06:00:09PM -0400, Jason Merrill wrote:
> On 5/17/24 02:14, Nathaniel Shead wrote:
> > On Tue, May 14, 2024 at 06:21:48PM -0400, Jason Merrill wrote:
> > > On 5/12/24 22:58, Nathaniel Shead wrote:
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > 
> > > OK.
> > > 
> > 
> > I realised as I was looking over this again that I might have spoken too
> > soon with the header unit example being supported. Doing the following:
> > 
> >// a.H
> >struct { int y; } s;
> >decltype(s) f(decltype(s));  // { dg-error "used but never defined" }
> >inline auto x = f({ 123 });
> >// b.C
> >struct {} unrelated;
> >import "a.H";
> >decltype(s) f(decltype(s) x) {
> >  return { 456 + x.y };
> >}
> > 
> >// c.C
> > >import "a.H";
> >int main() { auto a = x.y; }
> > 
> > > Actually does fail to link, because in 'c.C' we call 'f(.anon_0)' but
> > > the definition in 'b.C' is 'f(.anon_1)'.
> > 
> > I don't think this is fixable, so I don't think this direction is
> > workable.
> 
> Since namespace-scope anonymous types are TU-local, we don't need to support
> that for proper modules, but it's not clear to me that we don't need to
> support it for header units.
> 
> OTOH, https://eel.is/c++draft/module#import-5.3 allows c.C to import a
> different header unit than b.C, in which case the type is different and x
> violates the odr.
> 

Right; I think at this stage I don't know how to support this for header
units (and even for module interface units it doesn't actually work;
more on this below), so I think saying that this is actually an ODR
violation is OK.
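
For contrast, the single-TU version of the same shape is unproblematic,
since the declaration, the definition and the caller all see the very
same unnamed type.  A minimal sketch (call_f is a made-up helper for
this example):

```cpp
#include <cassert>

// An unnamed class type at namespace scope, as in the a.H example.
struct { int y; } s;

// Declared and defined in the same translation unit, so there is only
// one unnamed type involved and no cross-TU name to agree on.
decltype(s) f(decltype(s) x)
{
  return { 456 + x.y };
}

// Hypothetical helper exercising the call from the example.
inline int call_f()
{
  return f({ 123 }).y;
}
```

The trouble described above only starts once another TU has to name
the same parameter type.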

> > That said, I think that it might still be worth making header modules
> > satisfy 'module_has_cmi_p', since that is true to the name, and will be
> > useful in other places we currently use 'module_p ()': in which case we
> > could instead make all the callers in 'no_linkage_check' do
> > 'module_maybe_has_cmi_p () && !header_module_p ()'; something like the
> > following, perhaps?
> 
> If we need that condition, it should be its own predicate rather than
> expecting callers to do that combined check.
> 
> But it's not clear to me how this is different from a type in the GMF of a
> named module, which is exactly the maybe_has_cmi case; there we could again
> see a different version of the type if another TU includes the header.
> 
> Jason
> 

This made me go back and double-check for named modules and it actually
does fail as well; the following sample ICEs, even:

  export module M;
  struct {} s;
  int h(decltype(s));
  int x = h(s);  // ICE in write_unnamed_type_name, cp/mangle.cc:1806

So I think maybe the way to go here is to just not treat unnamed types
as something that could possibly be accessed from a different TU, like
the below.  Then we don't need to do the special handling for header
units, since as you say, they're not materially different anyway.
Thoughts?

(And I've moved the original change to 'module_has_cmi_p' to a separate
patch given it's somewhat unrelated now.)

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk (and
maybe 14.2)?

-- >8 --

In r14-9530 we relaxed "depending on type with no-linkage" errors for
declarations that could actually be accessed from different TUs anyway.
However, this also enabled it for unnamed types, which never work.

In a normal module interface, an unnamed type is TU-local by
[basic.link] p15.2, and so cannot be exposed or the program is
ill-formed.  We don't yet implement this checking but we should assume
that we will later; currently supporting this actually causes ICEs when
attempting to create the mangled name in some situations.

For a header unit, by [module.import] p5.3 it is unspecified whether two
TUs importing a header unit providing such a declaration are importing
the same header unit.  In this case, we would require name mangling
changes to somehow allow the (anonymous) type exported by such a header
unit to correspond across different TUs in the presence of other
anonymous declarations, so for this patch just assume that this case
would be an ODR violation instead.

gcc/cp/ChangeLog:

* tree.cc (no_linkage_check): Anonymous types can't be accessed
in a different TU.

gcc/testsuite/ChangeLog:

* g++.dg/modules/linkage-1_a.C: Remove anonymous type test.
* g++.dg/modules/linkage-1_b.C: Likewise.
* g++.dg/modules/linkage-1_c.C: Likewise.
* g++.dg/modules/linkage-2.C: Add note about anonymous types.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/tree.cc | 10 +-
 gcc/testsuite/g++.dg/modules/linkage-1_a.C |  4 
 gcc/testsuite/g++.dg/modules/linkage-1_b.C |  1 -
 gcc/testsuite/g++.dg/modules/linkage-1_c.C |  1 -
 gcc/testsuite/g++.dg/modules/linkage-2.C   |  9 +
 5 files changed, 6 insertions(+), 19 deletions(-)

diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 9d37d255d8d..e2d0d3229c1 100644
--- 

Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-05-23 Thread Alexandre Oliva
On Apr 29, 2024, "Kewen.Lin"  wrote:

> I think you can still push the patch as the testing just exposes
> another issue.

ACK, thanks, I've just confirmed that the problem I reported on
ppc64el-linux-gnu didn't come up when testing on ppc64-vx7r2 with a
non-power8 emulated cpu, so I'm going to install it.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive


Re: [PATCH] RISC-V: Fix missing boolean_expression in zmmul extension

2024-05-23 Thread Kito Cheng
Could you add a testcase to make sure zmmul will generate mul instruction?

Liao Shihua  於 2024年5月23日 週四 18:48 寫道:

> The missing TARGET_ZMMUL check in riscv_rtx_costs() causes different
> instructions to be generated when multiplying an integer by a constant.
> ( https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1482 )
>
> int foo(int *ib) {
> *ib = *ib * 33938;
> return 0;
> }
>
> rv64im:
> lw      a4,0(a1)
> li      a5,32768
> addiw   a5,a5,1170
> mulw    a5,a5,a4
> sw      a5,0(a1)
> ret
>
> rv64i_zmmul:
> lw      a4,0(a1)
> slliw   a5,a4,5
> addw    a5,a5,a4
> slliw   a5,a5,3
> addw    a5,a5,a4
> slliw   a5,a5,3
> addw    a5,a5,a4
> slliw   a5,a5,3
> addw    a5,a5,a4
> slliw   a5,a5,1
> sw      a5,0(a1)
> ret
>
> Fixed.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_rtx_costs): Add TARGET_ZMMUL.
>
> ---
>  gcc/config/riscv/riscv.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 85df5b7ab49..580ae007181 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -3753,7 +3753,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int
> outer_code, int opno ATTRIBUTE_UN
>  case MULT:
>if (float_mode_p)
> *total = tune_param->fp_mul[mode == DFmode];
> -  else if (!TARGET_MUL)
> +  else if (!(TARGET_MUL || TARGET_ZMMUL))
> /* Estimate the cost of a library call.  */
> *total = COSTS_N_INSNS (speed ? 32 : 6);
>else if (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD)
> --
> 2.34.1
>
>
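
As a sanity check on the two sequences quoted above, both compute the
same product; only the cost differs.  A sketch in C++ that mirrors the
rv64i_zmmul shift/add chain step by step (the function names are made
up, and unsigned arithmetic is used so that wraparound is well defined):

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the slliw/addw chain: each step is "shift, then add x".
inline std::uint32_t mul_33938_shift_add(std::uint32_t x)
{
  std::uint32_t a = (x << 5) + x;   // 33 * x
  a = (a << 3) + x;                 // 265 * x
  a = (a << 3) + x;                 // 2121 * x
  a = (a << 3) + x;                 // 16969 * x
  return a << 1;                    // 33938 * x
}

// Mirrors the rv64im sequence: materialize 32768 + 1170, then multiply.
inline std::uint32_t mul_33938_direct(std::uint32_t x)
{
  return x * (32768u + 1170u);
}
```

With the cost fix applied, the expectation is that the single multiply
is chosen for zmmul as well.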


Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-23 Thread Richard Biener
On Thu, May 23, 2024 at 1:08 PM Li, Pan2  wrote:
>
> I tried converting the PHI from Part-A to Part-B, aka PHI to _2 = 
> phi_cond ? _1 : 255.
> Then we can do the matching on COND_EXPR in the underlying widen-mul pass.
>
> Unfortunately, I hit an ICE in verify_gimple_phi in the sccopy1 pass =>
> sat_add.c:66:1: internal compiler error: tree check: expected class ‘type’, 
> have ‘exceptional’ (error_mark) in useless_type_conversion_p, at 
> gimple-expr.cc:86

Likely you have released _2, more comments below on your previous mail.

> I will keep investigating whether this works or not.
>
> Part-A:
> uint8_t sat_add_u_1_uint8_t (uint8_t x, uint8_t y)
> {
>   unsigned char _1;
>   uint8_t _2;
>
>   <bb 2> :
>   _1 = x_3(D) + y_4(D);
>   if (_1 >= x_3(D))
>     goto <bb 3>; [INV]
>   else
>     goto <bb 4>; [INV]
>
>   <bb 3> :
>
>   <bb 4> :
>   # _2 = PHI <255(2), _1(3)>
>   return _2;
>
> }
>
> Part-B:
> uint8_t sat_add_u_1_uint8_t (uint8_t x, uint8_t y)
> {
>   unsigned char _1;
>   _Bool phi_cond_6;
>
>   <bb 2> :
>   _1 = x_3(D) + y_4(D);
>   phi_cond_6 = _1 >= x_3(D);
>   _2 = phi_cond_6 ? _1 : 255;
>   return _2;
>
> }
>
> -Original Message-
> From: Li, Pan2
> Sent: Thursday, May 23, 2024 12:17 PM
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> tamar.christ...@arm.com; pins...@gmail.com
> Subject: RE: [PATCH v2] Match: Support __builtin_add_overflow branch form for 
> unsigned SAT_ADD
>
> Thanks Richard for reviewing.
>
> > I'm not convinced we should match this during early if-conversion, should 
> > we?
> > The middle-end doesn't really know .SAT_ADD but some handling of
> > .ADD_OVERFLOW is present.
>
> I tried to do the branch (aka cond) match in the widen-mult pass, similar to
> the previous branchless form.
> Unfortunately, the branch has already been converted to a PHI by the time
> widen-mult runs, so v2 tries to bypass the PHI handling
> and converts the branch form to the branchless form.
>
> > But please add a comment before the new pattern, esp. since it's
> > non-obvious that this is an improvement.
>
> Sure thing.
>
> > I suspect you rely on this form being recognized as .SAT_ADD later but
> > what prevents us from breaking this?  Why not convert it to .SAT_ADD
> > immediately?  If this is because the ISEL pass (or the widen-mult pass)
> > cannot handle PHIs then I would suggest to split out enough parts of
> > tree-ssa-phiopt.cc to be able to query match.pd for COND_EXPRs.
>
> Yes, this is sort of redundant, we can also convert it to .SAT_ADD 
> immediately in match.pd before widen-mult.
>
> Sorry, I may be confused here: for a branch form like the one below, what
> transform should we perform in phiopt?
> gimple_simplify_phiopt mostly leverages the simplifications in match.pd, but
> we may hit the simplification in
> another early pass.
>
> Or we can leverage the branch version of the unsigned_integer_sat_add gimple
> match in phiopt and generate the gimple call .SAT_ADD
> in phiopt (much like what we do in widen-mult).
> Not sure if my understanding is correct or not, thanks again for the help.

The trick for widen-mult (or ISEL) would be to try to match the PHI
nodes in a similar way as to
gimple_simplify_phiopt calls op.resimplify.  The difficulty resides in
that the (match ...) generated
code gets the entry to the stmt root.  Either we'd teach genmatch to
recognize a PHI def
as a COND or we make (match ..) (additionally?) generate entry points
taking a gimple_match_op
so the toplevel COND trick works.  Note it's already a bit awkward
because we build a GENERIC
form of the condition and that's now invalid in the IL for a GIMPLE
COND_EXPR but still present
because of that phiopt trick.  There isn't a SSA def for the condition
in the IL (it's only part
of a GIMPLE_COND and that one doesn't define "CC flags").

That means possibly special-casing (match (..) (cond (cmp ...) ..)) in
genmatch to handle
PHI defs might be the easiest "trick" here.
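
For reference, the source-level forms being discussed can be written
side by side as below.  This is a sketch for uint8_t only; the function
names are made up, and __builtin_add_overflow is a GCC/Clang builtin
rather than standard C++.

```cpp
#include <cassert>
#include <cstdint>

// Branch form, as in the SAT_ADD_U_1 macro quoted in this thread.
inline std::uint8_t sat_add_branch(std::uint8_t x, std::uint8_t y)
{
  std::uint8_t sum = static_cast<std::uint8_t>(x + y);
  return sum >= x ? sum : static_cast<std::uint8_t>(-1);
}

// Branchless form: OR the wrapped sum with an all-ones mask on overflow.
inline std::uint8_t sat_add_branchless(std::uint8_t x, std::uint8_t y)
{
  std::uint8_t sum = static_cast<std::uint8_t>(x + y);
  std::uint8_t mask = static_cast<std::uint8_t>(-static_cast<int>(sum < x));
  return static_cast<std::uint8_t>(sum | mask);
}

// __builtin_add_overflow form (compiler builtin, see note above).
inline std::uint8_t sat_add_builtin(std::uint8_t x, std::uint8_t y)
{
  std::uint8_t ret;
  return __builtin_add_overflow(x, y, &ret)
         ? static_cast<std::uint8_t>(-1) : ret;
}

// Exhaustively compare the three forms over all uint8_t inputs.
inline bool all_forms_agree()
{
  for (int x = 0; x < 256; ++x)
    for (int y = 0; y < 256; ++y)
      {
        std::uint8_t a = static_cast<std::uint8_t>(x);
        std::uint8_t b = static_cast<std::uint8_t>(y);
        std::uint8_t r = sat_add_branch(a, b);
        if (r != sat_add_branchless(a, b) || r != sat_add_builtin(a, b))
          return false;
      }
  return true;
}
```

The branch forms are the ones that reach the later passes as a PHI
rather than as a COND_EXPR.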

Not sure what you did for the IL you quoted above.

Richard.

> #define SAT_ADD_U_1(T) \
> T sat_add_u_1_##T(T x, T y) \
> { \
>   return (T)(x + y) >= x ? (x + y) : -1; \
> }
>
> SAT_ADD_U_1(uint8_t);
>
> Pan
>
> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, May 22, 2024 9:14 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> tamar.christ...@arm.com; pins...@gmail.com
> Subject: Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for 
> unsigned SAT_ADD
>
> On Wed, May 22, 2024 at 3:17 AM  wrote:
> >
> > From: Pan Li 
> >
> > This patch would like to support the __builtin_add_overflow branch form for
> > unsigned SAT_ADD.  For example as below:
> >
> > uint64_t
> > sat_add (uint64_t x, uint64_t y)
> > {
> >   uint64_t ret;
> > >   return __builtin_add_overflow (x, y, &ret) ? -1 : ret;
> > }
> >
> > Different to the branchless version,  we leverage the simplify to
> > convert the branch version of SAT_ADD into branchless if and only
> > if the backend has supported the IFN_SAT_ADD.  Thus,  the backend has
> 

RE: [PATCH v4] Match: Add overloaded types_match to avoid code dup [NFC]

2024-05-23 Thread Li, Pan2
> the above three lines are redundant.
> OK with those removed.

Got it, will commit it with those lines removed once testing shows no surprises.

Pan

-Original Message-
From: Richard Biener  
Sent: Thursday, May 23, 2024 7:49 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
tamar.christ...@arm.com; pins...@gmail.com
Subject: Re: [PATCH v4] Match: Add overloaded types_match to avoid code dup 
[NFC]

On Thu, May 23, 2024 at 2:24 AM  wrote:
>
> From: Pan Li 
>
> There are several match patterns for SAT-related cases, and they contain
> duplicated code to check that dest, op_0 and op_1 have the same tree type,
> i.e. a ternary tree type match.  Thus, add an overloaded types_match function
> to do this and avoid duplicating the match code.
>
> The below test suites are passed for this patch:
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 regression test.
>
> gcc/ChangeLog:
>
> * generic-match-head.cc (types_match): Add overloaded types_match
> for 3 types.
> * gimple-match-head.cc (types_match): Ditto.
> * match.pd: Leverage overloaded types_match.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/generic-match-head.cc | 14 ++
>  gcc/gimple-match-head.cc  | 14 ++
>  gcc/match.pd  | 30 ++
>  3 files changed, 38 insertions(+), 20 deletions(-)
>
> diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
> index 0d3f648fe8d..8d8ecfaeb1d 100644
> --- a/gcc/generic-match-head.cc
> +++ b/gcc/generic-match-head.cc
> @@ -59,6 +59,20 @@ types_match (tree t1, tree t2)
>return TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2);
>  }
>
> +/* Routine to determine if the types T1, T2 and T3 are effectively
> +   the same for GENERIC.  If T1, T2 or T3 is not a type, the test
> +   applies to their TREE_TYPE.  */
> +
> +static inline bool
> +types_match (tree t1, tree t2, tree t3)
> +{
> +  t1 = TYPE_P (t1) ? t1 : TREE_TYPE (t1);
> +  t2 = TYPE_P (t2) ? t2 : TREE_TYPE (t2);
> +  t3 = TYPE_P (t3) ? t3 : TREE_TYPE (t3);

the above three lines are redundant.
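
A toy model of the point, assuming (as the comment above implies) that
the two-argument overload already normalizes its operands.  Node and
type_of are hypothetical stand-ins for tree and the TYPE_P/TREE_TYPE
dance; nothing here is GCC code.

```cpp
#include <cassert>

// Toy stand-in for GCC's `tree`: a node is either a type or an
// expression that carries a type.
struct Node
{
  bool is_type;
  const Node* type;   // null for type nodes
};

inline const Node* type_of(const Node* t)
{
  return t->is_type ? t : t->type;
}

// Two-argument form: normalizes both operands itself.
inline bool types_match(const Node* t1, const Node* t2)
{
  return type_of(t1) == type_of(t2);
}

// Three-argument form: no normalization needed here, because every
// operand passes through the two-argument overload anyway.
inline bool types_match(const Node* t1, const Node* t2, const Node* t3)
{
  return types_match(t1, t2) && types_match(t2, t3);
}
```

Mixing type nodes and expression nodes in one call therefore works
without any extra conversion in the three-argument overload.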

> +  return types_match (t1, t2) && types_match (t2, t3);
> +}
> +
>  /* Return if T has a single use.  For GENERIC, we assume this is
> always true.  */
>
> diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
> index 5f8a1a1ad8e..2b7f746ab13 100644
> --- a/gcc/gimple-match-head.cc
> +++ b/gcc/gimple-match-head.cc
> @@ -79,6 +79,20 @@ types_match (tree t1, tree t2)
>return types_compatible_p (t1, t2);
>  }
>
> +/* Routine to determine if the types T1, T2 and T3 are effectively
> +   the same for GIMPLE.  If T1, T2 or T3 is not a type, the test
> +   applies to their TREE_TYPE.  */
> +
> +static inline bool
> +types_match (tree t1, tree t2, tree t3)
> +{
> +  t1 = TYPE_P (t1) ? t1 : TREE_TYPE (t1);
> +  t2 = TYPE_P (t2) ? t2 : TREE_TYPE (t2);
> +  t3 = TYPE_P (t3) ? t3 : TREE_TYPE (t3);

likewise.

OK with those removed.

Richard.

> +  return types_match (t1, t2) && types_match (t2, t3);
> +}
> +
>  /* Return if T has a single use.  For GIMPLE, we also allow any
> non-SSA_NAME (ie constants) and zero uses to cope with uses
> that aren't linked up yet.  */
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 35e3d82b131..7081d76d56a 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3048,38 +3048,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* Unsigned Saturation Add */
>  (match (usadd_left_part_1 @0 @1)
>   (plus:c @0 @1)
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_left_part_2 @0 @1)
>   (realpart (IFN_ADD_OVERFLOW:c @0 @1))
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (lt (plus:c @0 @1) @0)))
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (gt @0 (plus:c @0 @1))))
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_right_part_2 @0 @1)
>   (negate (convert (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)))
> - (if (INTEGRAL_TYPE_P (type)
> -  && TYPE_UNSIGNED 

Re: [PATCH] .gitattributes: disable crlf translation

2024-05-23 Thread Richard Biener
On Thu, May 23, 2024 at 5:50 AM Peter Damianov  wrote:
>
> By default, git has the "autocrlf" """feature""" enabled. This causes the
> files to have CRLF line endings when checked out on windows, which in the
> case of configure, causes confusing errors like:
>
> ./gcc/configure: line 14: $'\r': command not found
> ./gcc/configure: line 29: syntax error near unexpected token `newline'
> '/gcc/configure: line 29: ` ;;
>
> when it is invoked.
>
> Any files damaged in this way can be fixed with:
> $ git config core.autocrlf false
> $ git reset
> $ git checkout .
>
> But, it's better to simply avoid this problem in the first place.
> This behavior is never helpful or desired for gcc.

For files added/edited on Windows does this then also strip the \r
(upon which action?)?  Otherwise I think this looks good but I'm not
a git expert.

Richard.

> Signed-off-by: Peter Damianov 
> ---
>  .gitattributes | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/.gitattributes b/.gitattributes
> index e75bfc595bf..1e116987c98 100644
> --- a/.gitattributes
> +++ b/.gitattributes
> @@ -8,3 +8,6 @@ ChangeLog whitespace=indent-with-non-tab,space-before-tab,trailing-space
>  # Use together with git config diff.md.xfuncname '^\(define.*$'
>  # which is run by contrib/gcc-git-customization.sh too.
>  *.md diff=md
> +
> +# Disable lf -> crlf translation on windows.
> +* -crlf
> --
> 2.39.2
>


Re: [V2 PATCH] Don't reduce estimated unrolled size for innermost loop at cunrolli.

2024-05-23 Thread Richard Biener
On Wed, May 22, 2024 at 7:07 AM liuhongt  wrote:
>
> >> Hard to find a default value satisfying all testcases.
> >> some require loop unroll with 7 insns increment, some don't want loop
> >> unroll w/ 5 insn increment.
> >> The original 2/3 reduction happened to meet all those testcases(or the
> >> testcases are constructed based on the old 2/3).
> >> Can we define the parameter as the size of the loop, below the size we
> >> still do the reduction, so the small loop can be unrolled?
>
> >Yeah, that's also a sensible possibility.  Does it work to have a parameter
> >for the unrolled body size?  Thus, amend the existing
> >--param max-completely-peeled-insns with a --param
> >max-completely-peeled-insns-nogrowth?
>
> Update V2:
> It's still hard to find a default value for loop body size. So I move the
> 2 / 3 reduction from estimated_unrolled_size to try_unroll_loop_completely.
> For the check of body size shrink, 2 / 3 reduction is added, so small loops
> can still be unrolled.
> For the check of comparison between body size and
> param_max_completely_peeled_insns, 2 / 3 is conditionally added for
> loop->inner || !cunrolli.
> Then the patch avoids gcc testsuite regressions, and also prevents big inner
> loops from being completely unrolled at cunrolli.
>
> --
>
> For the innermost loop, after completely loop unroll, it will most likely
> not be able to reduce the body size to 2/3. The current 2/3 reduction
> will make some of the larger loops completely unrolled during
> cunrolli, which will then result in them not being able to be
> vectorized. It also increases the register pressure. The patch moves
> the 2/3 reduction from estimated_unrolled_size to
> try_unroll_loop_completely.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR tree-optimization/112325
> * tree-ssa-loop-ivcanon.cc (estimated_unrolled_size): Move the
> 2 / 3 loop body size reduction to ..
> (try_unroll_loop_completely): .. here, add it for the check of
> body size shrink, and the check of comparison against
> param_max_completely_peeled_insns when
> (!cunrolli ||loop->inner).
> (canonicalize_loop_induction_variables): Add new parameter
> cunrolli and pass down.
> (tree_unroll_loops_completely_1): Ditto.
> (tree_unroll_loops_completely): Ditto.
> (canonicalize_induction_variables): Handle new parameter.
> (pass_complete_unrolli::execute): Ditto.
> (pass_complete_unroll::execute): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr112325.c: New test.
> * gcc.dg/vect/pr69783.c: Add extra option --param
> max-completely-peeled-insns=300.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr112325.c | 57 
>  gcc/testsuite/gcc.dg/vect/pr69783.c  |  2 +-
>  gcc/tree-ssa-loop-ivcanon.cc | 45 ++-
>  3 files changed, 83 insertions(+), 21 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr112325.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr112325.c b/gcc/testsuite/gcc.dg/tree-ssa/pr112325.c
> new file mode 100644
> index 00000000000..14208b3e7f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr112325.c
> @@ -0,0 +1,57 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cunrolli-details" } */
> +
> +typedef unsigned short ggml_fp16_t;
> +static float table_f32_f16[1 << 16];
> +
> +inline static float ggml_lookup_fp16_to_fp32(ggml_fp16_t f) {
> +unsigned short s;
> +__builtin_memcpy(&s, &f, sizeof(unsigned short));
> +return table_f32_f16[s];
> +}
> +
> +typedef struct {
> +ggml_fp16_t d;
> +ggml_fp16_t m;
> +unsigned char qh[4];
> +unsigned char qs[32 / 2];
> +} block_q5_1;
> +
> +typedef struct {
> +float d;
> +float s;
> +char qs[32];
> +} block_q8_1;
> +
> +void ggml_vec_dot_q5_1_q8_1(const int n, float * restrict s, const void * restrict vx, const void * restrict vy) {
> +const int qk = 32;
> +const int nb = n / qk;
> +
> +const block_q5_1 * restrict x = vx;
> +const block_q8_1 * restrict y = vy;
> +
> +float sumf = 0.0;
> +
> +for (int i = 0; i < nb; i++) {
> +unsigned qh;
> +__builtin_memcpy(&qh, x[i].qh, sizeof(qh));
> +
> +int sumi = 0;
> +
> +for (int j = 0; j < qk/2; ++j) {
> +const unsigned char xh_0 = ((qh >> (j + 0)) << 4) & 0x10;
> +const unsigned char xh_1 = ((qh >> (j + 12)) ) & 0x10;
> +
> +const int x0 = (x[i].qs[j] & 0xF) | xh_0;
> +const int x1 = (x[i].qs[j] >> 4) | xh_1;
> +
> +sumi += (x0 * y[i].qs[j]) + (x1 * y[i].qs[j + qk/2]);
> +}
> +
> +sumf += (ggml_lookup_fp16_to_fp32(x[i].d)*y[i].d)*sumi + ggml_lookup_fp16_to_fp32(x[i].m)*y[i].s;
> +}
> +
> +*s = sumf;
> +}
> +
> +/* { dg-final { scan-tree-dump {(?n)Not unrolling loop [1-9] \(--param 
> max-completely-peel-times limit 
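
A minimal sketch of the size check described in the V2 update above, under
stated assumptions: the function and parameter names are hypothetical and do
not come from tree-ssa-loop-ivcanon.cc, and the real size estimate is more
involved than a plain product.  It only illustrates that the optimistic 2 / 3
reduction is applied when the loop has an inner loop or the pass is not
cunrolli.

```c
#include <stdbool.h>

/* Hypothetical model of the comparison against
   param_max_completely_peeled_insns.  BODY_INSNS is the estimated size
   of one iteration, N_UNROLL the number of copies after complete
   unrolling, MAX_INSNS the parameter value.  */
bool
allow_complete_unroll (unsigned body_insns, unsigned n_unroll,
                       unsigned max_insns, bool cunrolli,
                       bool has_inner_loop)
{
  unsigned unrolled = body_insns * n_unroll;

  /* Assume later passes shrink the unrolled body to 2/3, except for an
     innermost loop during cunrolli, where that is unlikely.  */
  if (!cunrolli || has_inner_loop)
    unrolled = unrolled * 2 / 3;

  return unrolled <= max_insns;
}
```

With these assumptions, a 30-insn innermost body unrolled 10 times is rejected
at cunrolli for a limit of 200, but accepted by the later cunroll pass.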

Re: [PATCH v4] Match: Add overloaded types_match to avoid code dup [NFC]

2024-05-23 Thread Richard Biener
On Thu, May 23, 2024 at 2:24 AM  wrote:
>
> From: Pan Li 
>
> There are several match patterns for SAT related cases, and they carry
> duplicated code to check that the dest, op_0 and op_1 have the same tree
> type, aka a ternary tree type match.  Thus, add an overloaded types_match
> function to do this and avoid match code duplication.
>
> The below test suites are passed for this patch:
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 regression test.
>
> gcc/ChangeLog:
>
> * generic-match-head.cc (types_match): Add overloaded types_match
> for 3 types.
> * gimple-match-head.cc (types_match): Ditto.
> * match.pd: Leverage overloaded types_match.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/generic-match-head.cc | 14 ++
>  gcc/gimple-match-head.cc  | 14 ++
>  gcc/match.pd  | 30 ++
>  3 files changed, 38 insertions(+), 20 deletions(-)
>
> diff --git a/gcc/generic-match-head.cc b/gcc/generic-match-head.cc
> index 0d3f648fe8d..8d8ecfaeb1d 100644
> --- a/gcc/generic-match-head.cc
> +++ b/gcc/generic-match-head.cc
> @@ -59,6 +59,20 @@ types_match (tree t1, tree t2)
>return TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2);
>  }
>
> +/* Routine to determine if the types T1, T2 and T3 are effectively
> +   the same for GENERIC.  If T1, T2 or T3 is not a type, the test
> +   applies to their TREE_TYPE.  */
> +
> +static inline bool
> +types_match (tree t1, tree t2, tree t3)
> +{
> +  t1 = TYPE_P (t1) ? t1 : TREE_TYPE (t1);
> +  t2 = TYPE_P (t2) ? t2 : TREE_TYPE (t2);
> +  t3 = TYPE_P (t3) ? t3 : TREE_TYPE (t3);

the above three lines are redundant.

> +  return types_match (t1, t2) && types_match (t2, t3);
> +}
> +
>  /* Return if T has a single use.  For GENERIC, we assume this is
> always true.  */
>
> diff --git a/gcc/gimple-match-head.cc b/gcc/gimple-match-head.cc
> index 5f8a1a1ad8e..2b7f746ab13 100644
> --- a/gcc/gimple-match-head.cc
> +++ b/gcc/gimple-match-head.cc
> @@ -79,6 +79,20 @@ types_match (tree t1, tree t2)
>return types_compatible_p (t1, t2);
>  }
>
> +/* Routine to determine if the types T1, T2 and T3 are effectively
> +   the same for GIMPLE.  If T1, T2 or T3 is not a type, the test
> +   applies to their TREE_TYPE.  */
> +
> +static inline bool
> +types_match (tree t1, tree t2, tree t3)
> +{
> +  t1 = TYPE_P (t1) ? t1 : TREE_TYPE (t1);
> +  t2 = TYPE_P (t2) ? t2 : TREE_TYPE (t2);
> +  t3 = TYPE_P (t3) ? t3 : TREE_TYPE (t3);

likewise.

OK with those removed.

Richard.

> +  return types_match (t1, t2) && types_match (t2, t3);
> +}
> +
>  /* Return if T has a single use.  For GIMPLE, we also allow any
> non-SSA_NAME (ie constants) and zero uses to cope with uses
> that aren't linked up yet.  */
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 35e3d82b131..7081d76d56a 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3048,38 +3048,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* Unsigned Saturation Add */
>  (match (usadd_left_part_1 @0 @1)
>   (plus:c @0 @1)
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_left_part_2 @0 @1)
>   (realpart (IFN_ADD_OVERFLOW:c @0 @1))
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (lt (plus:c @0 @1) @0)))
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_right_part_1 @0 @1)
>   (negate (convert (gt @0 (plus:c @0 @1))))
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  (match (usadd_right_part_2 @0 @1)
>   (negate (convert (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)))
> - (if (INTEGRAL_TYPE_P (type)
> -      && TYPE_UNSIGNED (TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@0))
> -      && types_match (type, TREE_TYPE (@1)))))
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +      && types_match (type, @0, @1))))
>
>  /* We cannot merge or overload usadd_left_part_1 and usadd_left_part_2
> because the sub part of left_part_2 cannot work with right_part_1.
> --
> 2.34.1
>


Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread Richard Biener
On Wed, May 22, 2024 at 8:53 PM Qing Zhao  wrote:
>
>
>
> > On May 22, 2024, at 03:38, Richard Biener  
> > wrote:
> >
> > On Tue, May 21, 2024 at 11:36 PM David Malcolm  wrote:
> >>
> >> On Tue, 2024-05-21 at 15:13 +0000, Qing Zhao wrote:
> >>> Thanks for the comments and suggestions.
> >>>
>  On May 15, 2024, at 10:00, David Malcolm 
>  wrote:
> 
>  On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
> > On Mon, 13 May 2024, Qing Zhao wrote:
> >
> >> -Warray-bounds is an important option to enable linux kernel to
> >> keep
> >> the array out-of-bound errors out of the source tree.
> >>
> >> However, due to the false positive warnings reported in
> >> PR109071
> >> (-Warray-bounds false positive warnings due to code duplication
> >> from
> >> jump threading), -Warray-bounds=1 cannot be added on by
> >> default.
> >>
> >> Although it's impossible to eliminate all the false positive
> >> warnings
> >> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
> >> documentation says "always out of bounds"), we should minimize
> >> the
> >> false positive warnings in -Warray-bounds=1.
> >>
> >> The root reason for the false positive warnings reported in
> >> PR109071 is:
> >>
> >> When the thread jump optimization tries to reduce the # of
> >> branches
> >> inside the routine, sometimes it needs to duplicate the code
> >> and
> >> split into two conditional paths. for example:
> >>
> >> The original code:
> >>
> >> void sparx5_set (int * ptr, struct nums * sg, int index)
> >> {
> >>  if (index >= 4)
> >>warn ();
> >>  *ptr = 0;
> >>  *val = sg->vals[index];
> >>  if (index >= 4)
> >>warn ();
> >>  *ptr = *val;
> >>
> >>  return;
> >> }
> >>
> >> With the thread jump, the above becomes:
> >>
> >> void sparx5_set (int * ptr, struct nums * sg, int index)
> >> {
> >>  if (index >= 4)
> >>    {
> >>      warn ();
> >>      *ptr = 0;                // Code duplications since "warn" does return;
> >>      *val = sg->vals[index];  // same this line.
> >>                               // In this path, since it's under the condition
> >>                               // "index >= 4", the compiler knows the value
> >>                               // of "index" is larger than 4, therefore the
> >>                               // out-of-bound warning.
> >>      warn ();
> >>    }
> >>  else
> >>{
> >>  *ptr = 0;
> >>  *val = sg->vals[index];
> >>}
> >>  *ptr = *val;
> >>  return;
> >> }
> >>
> >> We can see, after the thread jump optimization, the # of
> >> branches
> >> inside
> >> the routine "sparx5_set" is reduced from 2 to 1, however,  due
> >> to
> >> the
> >> code duplication (which is needed for the correctness of the
> >> code),
> >> we
> >> got a false positive out-of-bound warning.
> >>
> >> In order to eliminate such false positive out-of-bound warning,
> >>
> >> A. Add one more flag for GIMPLE: is_splitted.
> >> B. During the thread jump optimization, when the basic blocks
> >> are
> >>   duplicated, mark all the STMTs inside the original and
> >> duplicated
> >>   basic blocks as "is_splitted";
> >> C. Inside the array bound checker, add the following new
> >> heuristic:
> >>
> >> If
> >>   1. the stmt is duplicated and splitted into two conditional
> >> paths;
> >> +  2. the warning level < 2;
> >> +  3. the current block is not dominating the exit block
> >> Then not report the warning.
> >>
> >> The false positive warnings are moved from -Warray-bounds=1 to
> >> -Warray-bounds=2 now.
> >>
> >> Bootstrapped and regression tested on both x86 and aarch64.
> >> adjusted
> >> -Warray-bounds-61.c due to the false positive warnings.
> >>
> >> Let me know if you have any comments and suggestions.
> >
> > At the last Cauldron I talked with David Malcolm about these kind
> > of
> > issues and thought of instead of suppressing diagnostics to
> > record
> > how a block was duplicated.  For jump threading my idea was to
> > record
> > the condition that was proved true when entering the path and do
> > this
> > by recording the corresponding locations
> >>>
> >>> Is only recording the location for the TRUE path  enough?
> >>> We might need to record the corresponding locations for both TRUE and
> >>> FALSE paths since the VRP might be more accurate on both paths.
> >>> Is recording only the location enough?
> >>> Do we need to record the pointer to the original condition stmt?
> >>
> >> Just to be clear: I don't plan to work on this myself (I have far too
> >> 
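
The sparx5_set example quoted in this thread does not compile as posted, since
"val" and "struct nums" are not declared.  A self-contained variant follows as
a sketch only: the four-element array, the extra "val" parameter and the
counting warn() are assumptions made to get something that builds, not the
kernel code from PR109071.

```c
/* Assumed layout: four valid elements, so index >= 4 is out of bounds.  */
struct nums { int vals[4]; };

static int warn_count;

/* Stand-in for the kernel's warn helper; note that it returns.  */
static void
warn (void)
{
  warn_count++;
}

void
sparx5_set (int *ptr, int *val, struct nums *sg, int index)
{
  if (index >= 4)
    warn ();
  *ptr = 0;
  /* After jump threading this access is duplicated onto the
     "index >= 4" path, which is where the warning is reported.  */
  *val = sg->vals[index];
  if (index >= 4)
    warn ();
  *ptr = *val;
}
```

Because warn() returns, the access on the "index >= 4" path is reachable, so
the duplicated statement is required for correctness even though the caller
may never pass such an index.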

[PATCH] tree-optimization/115199 - fix PTA constraint processing for LHS

2024-05-23 Thread Richard Biener
When processing a &ANYTHING = X constraint we treat it as *ANYTHING = X
during constraint processing but then end up recording it as
&ANYTHING = X anyway, breaking constraint graph building.  This is
because we only update the local copy of the LHS and not the constraint
itself.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

PR tree-optimization/115199
* tree-ssa-structalias.cc (process_constraint): Also
record &ANYTHING = X as *ANYTHING = X in the end.

* gcc.dg/torture/pr115199.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr115199.c | 24 
 gcc/tree-ssa-structalias.cc |  2 +-
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr115199.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr115199.c b/gcc/testsuite/gcc.dg/torture/pr115199.c
new file mode 100644
index 00000000000..981a7330b32
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr115199.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+
+struct b {
+  char *volatile c;
+};
+struct b * __attribute__((noipa))
+d()
+{
+  char *e;
+  struct b *b = __builtin_malloc(sizeof(b));
+  void *f = __builtin_malloc(1);
+
+  e = __builtin_memcpy(f, "z", 1);
+  b->c = e;
+  return b;
+}
+
+int main()
+{
+  struct b b = *d();
+  if (b.c[0] != 'z')
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/tree-ssa-structalias.cc b/gcc/tree-ssa-structalias.cc
index 0e9423a78ec..a39b36c146e 100644
--- a/gcc/tree-ssa-structalias.cc
+++ b/gcc/tree-ssa-structalias.cc
@@ -3104,7 +3104,7 @@ process_constraint (constraint_t t)
  it here by turning it into *ANYTHING.  */
   if (lhs.type == ADDRESSOF
   && lhs.var == anything_id)
-lhs.type = DEREF;
+t->lhs.type = lhs.type = DEREF;
 
   /* ADDRESSOF on the lhs is invalid.  */
   gcc_assert (lhs.type != ADDRESSOF);
-- 
2.35.3
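
The bug class fixed by this one-liner can be shown in isolation.  The sketch
below uses hypothetical, simplified types modelled on the names in the hunk
above (they are not the real tree-ssa-structalias.cc declarations): the
function works on a local copy of the LHS, so a change to the copy is lost
unless it is also written back to the constraint.

```c
/* Hypothetical, simplified stand-ins for the PTA constraint types.  */
enum expr_kind { SCALAR, DEREF, ADDRESSOF };

struct expr
{
  enum expr_kind type;
  unsigned var;
};

struct constraint
{
  struct expr lhs;
  struct expr rhs;
};

/* Before the fix: only the local copy becomes DEREF, so the constraint
   that gets recorded still says ADDRESSOF.  */
void
normalize_buggy (struct constraint *t, unsigned anything_id)
{
  struct expr lhs = t->lhs;
  if (lhs.type == ADDRESSOF && lhs.var == anything_id)
    lhs.type = DEREF;
  (void) lhs;
}

/* After the fix: update the constraint and the local copy together.  */
void
normalize_fixed (struct constraint *t, unsigned anything_id)
{
  struct expr lhs = t->lhs;
  if (lhs.type == ADDRESSOF && lhs.var == anything_id)
    t->lhs.type = lhs.type = DEREF;
  (void) lhs;
}
```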


[PATCH] tree-optimization/115138 - ptr-vs-ptr and FUNCTION_DECLs

2024-05-23 Thread Richard Biener
I failed to realize we do not represent FUNCTION_DECLs or LABEL_DECLs
in vars explicitly and thus have to compare pt.vars_contains_nonlocal.

Bootstrapped and tested with bootstrap-O3 and D to verify the
comparison fail is fixed.  I'm now doing a regular bootstrap and
regtest with the volatile fix and will push afterwards.

PR tree-optimization/115138
* tree-ssa-alias.cc (ptrs_compare_unequal): Make sure
pt.vars_contains_nonlocal differs since we do not represent
FUNCTION_DECLs or LABEL_DECLs in vars explicitly.

* gcc.dg/torture/pr115138.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr115138.c | 28 +
 gcc/tree-ssa-alias.cc   |  6 ++
 2 files changed, 34 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr115138.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr115138.c b/gcc/testsuite/gcc.dg/torture/pr115138.c
new file mode 100644
index 00000000000..6becaecbaff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr115138.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+
+int foo (int) {}
+int bar (int) {}
+
+typedef int (*pred)(int);
+
+int x, y;
+pred A () { if (x) return foo; else return bar; }
+pred B () { if (y) return foo; else return bar; }
+int __attribute__((noipa)) baz()
+{
+  pred a = A();
+  pred b = B();
+  if (a != b)
+return 42;
+  return 0;
+}
+
+int main()
+{
+  if (baz () != 0)
+__builtin_abort ();
+  y = 1;
+  if (baz () != 42)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index d64d6d02f4a..1a91d63a31e 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -501,6 +501,12 @@ ptrs_compare_unequal (tree ptr1, tree ptr2)
  || pi2->pt.vars_contains_interposable)
return false;
  if ((!pi1->pt.null || !pi2->pt.null)
+ /* ???  We do not represent FUNCTION_DECL and LABEL_DECL
+in pt.vars but only set pt.vars_contains_nonlocal.  This
+makes compares involving those and other nonlocals
+imprecise.  */
+ && (!pi1->pt.vars_contains_nonlocal
+ || !pi2->pt.vars_contains_nonlocal)
  && (!pt_solution_includes_const_pool (&pi1->pt)
  || !pt_solution_includes_const_pool (&pi2->pt)))
return !pt_solutions_intersect (&pi1->pt, &pi2->pt);
-- 
2.35.3


RE: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-23 Thread Li, Pan2
I tried to convert the PHI from Part-A to Part-B, i.e. turn the PHI into
_2 = phi_cond ? _1 : 255, so that we can do the matching on COND_EXPR in the
underlying widen-mul pass.

Unfortunately, I hit an ICE in verify_gimple_phi in the sccopy1 pass:
sat_add.c:66:1: internal compiler error: tree check: expected class ‘type’,
have ‘exceptional’ (error_mark) in useless_type_conversion_p, at
gimple-expr.cc:86

I will keep looking to see if this works or not.

Part-A:
uint8_t sat_add_u_1_uint8_t (uint8_t x, uint8_t y)
{
  unsigned char _1;
  uint8_t _2;

  <bb 2> :
  _1 = x_3(D) + y_4(D);
  if (_1 >= x_3(D))
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :

  <bb 4> :
  # _2 = PHI <255(2), _1(3)>
  return _2;

}

Part-B:
uint8_t sat_add_u_1_uint8_t (uint8_t x, uint8_t y)
{
  unsigned char _1;
  _Bool phi_cond_6;

  <bb 2> :
  _1 = x_3(D) + y_4(D);
  phi_cond_6 = _1 >= x_3(D);
  _2 = phi_cond_6 ? _1 : 255;
  return _2;

}

-----Original Message-----
From: Li, Pan2 
Sent: Thursday, May 23, 2024 12:17 PM
To: Richard Biener 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
tamar.christ...@arm.com; pins...@gmail.com
Subject: RE: [PATCH v2] Match: Support __builtin_add_overflow branch form for 
unsigned SAT_ADD

Thanks Richard for reviewing.

> I'm not convinced we should match this during early if-conversion, should we?
> The middle-end doesn't really know .SAT_ADD but some handling of
> .ADD_OVERFLOW is present.

I tried to do the branch (aka cond) match in the widen-mult pass, similar to
the previous branchless form.
Unfortunately, the branch has already been converted to a PHI by the time
widen-mult runs, so in v2 I tried to bypass the PHI handling and convert the
branch form to the branchless form.

> But please add a comment before the new pattern, esp. since it's
> non-obvious that this is an improvement.

Sure thing.

> I suspect you rely on this form being recognized as .SAT_ADD later but
> what prevents us from breaking this?  Why not convert it to .SAT_ADD
> immediately?  If this is because the ISEL pass (or the widen-mult pass)
> cannot handle PHIs then I would suggest to split out enough parts of
> tree-ssa-phiopt.cc to be able to query match.pd for COND_EXPRs.

Yes, this is sort of redundant; we can also convert it to .SAT_ADD immediately
in match.pd before widen-mult.

Sorry, I may be confused here: for a branch form like the one below, what
transform should we perform in phiopt?
gimple_simplify_phiopt mostly leverages the simplify patterns in match.pd, but
we may hit the simplify in another, earlier pass.

Or we can leverage the branch version of the unsigned_integer_sat_add gimple
match in phiopt and generate the gimple call .SAT_ADD there (mostly like what
we do in widen-mult).
I am not sure whether my understanding is correct, thanks again for the help.

#define SAT_ADD_U_1(T) \
T sat_add_u_1_##T(T x, T y) \
{ \
  return (T)(x + y) >= x ? (x + y) : -1; \
}

SAT_ADD_U_1(uint8_t);

Pan

-----Original Message-----
From: Richard Biener  
Sent: Wednesday, May 22, 2024 9:14 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
tamar.christ...@arm.com; pins...@gmail.com
Subject: Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for 
unsigned SAT_ADD

On Wed, May 22, 2024 at 3:17 AM  wrote:
>
> From: Pan Li 
>
> This patch would like to support the __builtin_add_overflow branch form for
> unsigned SAT_ADD.  For example as below:
>
> uint64_t
> sat_add (uint64_t x, uint64_t y)
> {
>   uint64_t ret;
>   return __builtin_add_overflow (x, y, &ret) ? -1 : ret;
> }
>
> Different to the branchless version,  we leverage the simplify to
> convert the branch version of SAT_ADD into branchless if and only
> if the backend has supported the IFN_SAT_ADD.  Thus,  the backend has
> the ability to choose branch or branchless implementation of .SAT_ADD.
> For example,  some target can take care of branches code more optimally.
>
> When the target implements the IFN_SAT_ADD for unsigned and before this
> patch:
>
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _1;
>   long unsigned int _2;
>   uint64_t _3;
>   __complex__ long unsigned int _6;
>
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _6 = .ADD_OVERFLOW (x_4(D), y_5(D));
>   _2 = IMAGPART_EXPR <_6>;
>   if (_2 != 0)
>     goto <bb 4>; [35.00%]
>   else
>     goto <bb 3>; [65.00%]
> ;;succ:   4
> ;;3
>
> ;;   basic block 3, loop depth 0
> ;;pred:   2
>   _1 = REALPART_EXPR <_6>;
> ;;succ:   4
>
> ;;   basic block 4, loop depth 0
> ;;pred:   3
> ;;2
>   # _3 = PHI <_1(3), 18446744073709551615(2)>
>   return _3;
> ;;succ:   EXIT
> }
>
> After this patch:
> uint64_t sat_add (uint64_t x, uint64_t y)
> {
>   long unsigned int _12;
>
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _12 = .SAT_ADD (x_4(D), y_5(D)); [tail call]
>   return _12;
> ;;succ:   EXIT
> }
>
> The below test suites are passed for this patch:
> * The x86 bootstrap test.
> * The x86 fully regression 
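
For reference, the two branch forms discussed in this thread can be written as
plain C.  This is only a sketch: the function names are hypothetical, and
__builtin_add_overflow is the GCC/Clang built-in, so the code assumes one of
those compilers.

```c
#include <stdint.h>

/* Branch form based on the overflow built-in, as in the patch
   description.  */
uint64_t
sat_add_overflow_form (uint64_t x, uint64_t y)
{
  uint64_t ret;
  return __builtin_add_overflow (x, y, &ret) ? UINT64_MAX : ret;
}

/* Branch form based on a comparison, as in the SAT_ADD_U_1 macro
   quoted earlier in the thread.  */
uint64_t
sat_add_compare_form (uint64_t x, uint64_t y)
{
  return (uint64_t) (x + y) >= x ? (x + y) : UINT64_MAX;
}
```

Both return the wrapped sum when no overflow occurs and the maximum value
otherwise, which is the behaviour .SAT_ADD is meant to capture.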

[PATCH] RISC-V: Fix missing boolean_expression in zmmul extension

2024-05-23 Thread Liao Shihua
The missing boolean_expression TARGET_ZMMUL in riscv_rtx_costs() causes
different instructions to be generated when multiplying an integer by a
constant.
( https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1482 )

int foo(int *ib) {
*ib = *ib * 33938;
return 0;
}

rv64im:
lw      a4,0(a1)
li      a5,32768
addiw   a5,a5,1170
mulw    a5,a5,a4
sw      a5,0(a1)
ret

rv64i_zmmul:
lw      a4,0(a1)
slliw   a5,a4,5
addw    a5,a5,a4
slliw   a5,a5,3
addw    a5,a5,a4
slliw   a5,a5,3
addw    a5,a5,a4
slliw   a5,a5,3
addw    a5,a5,a4
slliw   a5,a5,1
sw      a5,0(a1)
ret

Fixed.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_rtx_costs): Add TARGET_ZMMUL.

---
 gcc/config/riscv/riscv.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 85df5b7ab49..580ae007181 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3753,7 +3753,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
 case MULT:
   if (float_mode_p)
*total = tune_param->fp_mul[mode == DFmode];
-  else if (!TARGET_MUL)
+  else if (!(TARGET_MUL || TARGET_ZMMUL))
/* Estimate the cost of a library call.  */
*total = COSTS_N_INSNS (speed ? 32 : 6);
   else if (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD)
-- 
2.34.1
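
The shift-and-add sequence in the rv64i_zmmul listing above can be followed
step by step in C.  This is an illustration of that listing only; the function
name is hypothetical and unsigned arithmetic is used so that wrap-around is
well defined.

```c
#include <stdint.h>

/* Multiply by 33938 using only shifts and adds, mirroring the
   slliw/addw sequence in the rv64i_zmmul output.  */
uint32_t
mul_33938_shift_add (uint32_t x)
{
  uint32_t t = (x << 5) + x;   /* 33 * x     */
  t = (t << 3) + x;            /* 265 * x    */
  t = (t << 3) + x;            /* 2121 * x   */
  t = (t << 3) + x;            /* 16969 * x  */
  return t << 1;               /* 33938 * x  */
}
```

That is nine arithmetic instructions instead of li/addiw/mulw, which is what
the cost model picks when it prices the multiply like a library call.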



RE: [PATCH] [tree-optimization/110279] fix testcase pr110279-1.c

2024-05-23 Thread Di Zhao OS
> -----Original Message-----
> From: Jeff Law 
> Sent: Wednesday, May 22, 2024 11:14 PM
> To: Di Zhao OS ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] [tree-optimization/110279] fix testcase pr110279-1.c
> 
> 
> 
> On 5/22/24 5:46 AM, Di Zhao OS wrote:
> > The test case is for targets that support FMA. Previously
> > the "target" selector is missed in dg-final command.
> >
> > Tested on x86_64-pc-linux-gnu.
> >
> > Thanks
> > Di Zhao
> >
> > gcc/testsuite/ChangeLog:
> >
> >  * gcc.dg/pr110279-1.c: add target selector.
> Rather than list targets explicitly in the test, wouldn't it be better
> to have a common routine that could be used in other cases where we have
> a test that requires FMA?
> 
> So something similar to check_effective_target_scalar_all_fma?
> 
> 
> Jeff

Here is an updated version of the patch. Sorry I'm not very familiar
with the testsuite commands.

gcc/testsuite/ChangeLog:

* gcc.dg/pr110279-1.c: add target selector.

---
gcc/testsuite/gcc.dg/pr110279-1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr110279-1.c 
b/gcc/testsuite/gcc.dg/pr110279-1.c
index a8c7257b28d..c4f94ea5810 100644
--- a/gcc/testsuite/gcc.dg/pr110279-1.c
+++ b/gcc/testsuite/gcc.dg/pr110279-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { scalar_all_fma || { i?86-*-* x86_64-*-* } } } } */
 /* { dg-options "-Ofast --param avoid-fma-max-bits=512 --param tree-reassoc-width=4 -fdump-tree-widening_mul-details" } */
 /* { dg-additional-options "-mcpu=generic" { target aarch64*-*-* } } */
 /* { dg-additional-options "-mfma" { target i?86-*-* x86_64-*-* } } */
@@ -64,4 +64,4 @@ foo3 (data_e a, data_e b, data_e c, data_e d)
   return result;
 }
 
-/* { dg-final { scan-tree-dump-times "Generated FMA" 3 "widening_mul"} } */
\ No newline at end of file
+/* { dg-final { scan-tree-dump-times "Generated FMA" 3 "widening_mul" } } */
-- 
2.25.1




Re: [PATCH] invoke.texi: Clarify -march=lujiazui

2024-05-23 Thread mayshao-oc

Hi Jakub:

  I think the modified lujiazui description is what actually happens,
thanks.


BR
Mayshao



[This email is from an external sender. Beware of risks.]

Hi!

Yesterday I was searching which exact CPUs are affected by the PR114576
wrong-code issue and went from the PTA_* bitmasks in GCC, so arrived
at the goldmont, goldmont-plus, tremont and lujiazui CPUs (as -march=
cases which do enable -maes and don't enable -mavx).
But when double-checking that against the invoke.texi documentation,
that was true for the first 3, but lujiazui said it supported AVX.
I was really confused by that, until I found the
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604407.html
explanation.  So, seems the CPUs do have AVX and F16C but -march=lujiazui
doesn't enable those and even activelly attempts to filter those out from
the announced CPUID features, in glibc as well as e.g. in libgcc.

Thus, I think we should document what actually happens, otherwise
users could assume that
gcc -march=lujiazui predefines __AVX__ and __F16C__, which it doesn't.

Tested on x86_64, ok for trunk?

2024-04-11  Jakub Jelinek  

 * doc/invoke.texi (lujiazui): Clarify that while the CPUs do support
 AVX and F16C, -march=lujiazui actually doesn't enable those.

--- gcc/doc/invoke.texi.jj  2024-04-11 09:26:01.156865894 +0200
+++ gcc/doc/invoke.texi 2024-04-11 10:47:53.457582922 +0200
@@ -34696,8 +34696,10 @@ instruction set support.

  @item lujiazui
  ZHAOXIN lujiazui CPU with x86-64, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1,
-SSE4.2, AVX, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT, FSGSBASE, CX16,
-ABM, BMI, BMI2, F16C, FXSR, RDSEED instruction set support.
+SSE4.2, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT, FSGSBASE, CX16,
+ABM, BMI, BMI2, FXSR, RDSEED instruction set support.  While the CPUs
+do support AVX and F16C, these aren't enabled by @code{-march=lujiazui}
+for performance reasons.

  @item yongfeng
  ZHAOXIN yongfeng CPU with x86-64, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1,

 Jakub



Re: [PATCH v1 2/6] Extract ix86 dllimport implementation to mingw

2024-05-23 Thread Uros Bizjak
On Thu, May 23, 2024 at 10:35 AM Uros Bizjak  wrote:
>
> On Wed, May 22, 2024 at 4:32 PM Evgeny Karpov
>  wrote:
> >
> > Wednesday, May 22, 2024 1:06 PM
> > Richard Sandiford  wrote:
> >
> > > This looks good to me apart from a couple of very minor comments
> > > below, but please get approval from the x86 maintainers as well.  In
> > > particular, they might prefer to handle ix86_legitimize_pe_coff_symbol
> > > in some other way.
> >
> > Thanks, Richard, for the review!
> > The suggestions will be addressed in the next version.
> >
> > Jan and Uros, could you please review x86 refactoring for mingw part? 
> > Thanks.
>
> Yes, perhaps legitimize_pe_coff_symbol should be handled similarly to
> how machopic_legitimize_pic_address is handled, and just use "#if
> TARGET_PECOFF" at call sites when calling functions from the new
> winnt-dll.h. This would also allow us to remove the early check for
> !TARGET_PECOFF in legitimize_pe_coff_symbol.

Maybe you should look at how TARGET_MACHO is handled in config/i386/* files.

Uros.


Re: [PATCH 2/3] [APX CCMP] Adjust startegy for selecting ccmp candidates

2024-05-23 Thread Hongyu Wang
Gently ping for this :)
Hi Richard, is it OK to adopt the ccmp change? Or do you know who could
help review this part?
Thanks.

On Wed, May 15, 2024 at 16:25, Hongyu Wang wrote:
>
> CC'd Richard for ccmp part as previously it is added only for aarch64.
> The original logic will not be disturbed: if aarch64_gen_ccmp_first
> succeeds, aarch64_gen_ccmp_next will also succeed, since cmp/fcmp and
> ccmp/fccmp support all of GPI/GPF, and prepare_operand will fix up any
> input that cmp supports but ccmp does not, so ret/ret2 will both be
> valid when the costs are compared.
> Thanks in advance.
>
> On Wed, May 15, 2024 at 16:22, Hongyu Wang wrote:
> >
> > For general ccmp scenario, the tree sequence is like
> >
> > _1 = (a < b)
> > _2 = (c < d)
> > _3 = _1 & _2
> >
> > The current ccmp expansion tries both compare orders for _1 and _2,
> > computes the costs cost1/cost2 of comparing _1 first and _2 first,
> > and returns the sequence with the lower cost.
> >
> > For x86 ccmp, we don't support an FP compare as a ccmp operand, but we
> > do support an fp comi + int ccmp sequence. With the current cost
> > comparison model, fp comi + int ccmp can never be generated, since the
> > code doesn't check whether expand_ccmp_next returned a valid result and
> > the rtl cost of the empty ccmp sequence is always smaller.
> >
> > Check the expand_ccmp_next results ret and ret2, and return the valid
> > one before comparing costs.
> >
> > gcc/ChangeLog:
> >
> > * ccmp.cc (expand_ccmp_expr_1): Check ret and ret2 of
> > expand_ccmp_next, returns the valid one first before
> > comparing cost.
> > ---
> >  gcc/ccmp.cc | 12 +++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/ccmp.cc b/gcc/ccmp.cc
> > index 7cb525addf4..4b424220068 100644
> > --- a/gcc/ccmp.cc
> > +++ b/gcc/ccmp.cc
> > @@ -247,7 +247,17 @@ expand_ccmp_expr_1 (gimple *g, rtx_insn **prep_seq, 
> > rtx_insn **gen_seq)
> >   cost2 = seq_cost (prep_seq_2, speed_p);
> >   cost2 += seq_cost (gen_seq_2, speed_p);
> > }
> > - if (cost2 < cost1)
> > +
> > + /* For x86 target the ccmp does not support fp operands, but
> > +have fcomi insn that can produce eflags and then do int
> > +ccmp. So if one of the op is fp compare, ret1 or ret2 can
> > +fail, and the cost of the corresponding empty seq will
> > +always be smaller, then the NULL sequence will be returned.
> > +Add check for ret and ret2, returns the available one if
> > +the other is NULL.  */
> > + if ((!ret && ret2)
> > + || (!(ret && !ret2)
> > + && cost2 < cost1))
> > {
> >   *prep_seq = prep_seq_2;
> >   *gen_seq = gen_seq_2;
> > --
> > 2.31.1
> >
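
The selection rule in the quoted hunk can be restated as a small truth
function.  This is only a sketch for reading the condition, not GCC code:
use_second_sequence is a hypothetical name, and the booleans stand in for
"expand_ccmp_next returned a non-NULL result" for each compare order.

```c
#include <stdbool.h>

/* Return true when the second candidate sequence (prep_seq_2/gen_seq_2)
   should be used.  RET and RET2 model whether each order produced a
   valid result; COST1 and COST2 are the costs of the two sequences.
   Hypothetical helper, for illustration only.  */
bool
use_second_sequence (bool ret, bool ret2, int cost1, int cost2)
{
  /* Only the second order expanded: it is the only usable choice.  */
  if (!ret && ret2)
    return true;

  /* Only the first order expanded: never pick the empty second one,
     even though its cost looks smaller.  */
  if (ret && !ret2)
    return false;

  /* Both valid (or both invalid): fall back to the cost comparison.  */
  return cost2 < cost1;
}
```

This is equivalent to the patch's single expression
(!ret && ret2) || (!(ret && !ret2) && cost2 < cost1); spelling out the
three cases just makes it easier to see that a NULL result is never
preferred because of its low cost.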


Re: [PATCH v1 2/6] Extract ix86 dllimport implementation to mingw

2024-05-23 Thread Uros Bizjak
On Wed, May 22, 2024 at 4:32 PM Evgeny Karpov
 wrote:
>
> Wednesday, May 22, 2024 1:06 PM
> Richard Sandiford  wrote:
>
> > This looks good to me apart from a couple of very minor comments below, but
> > please get approval from the x86 maintainers as well.  In particular, they 
> > might
> > prefer to handle ix86_legitimize_pe_coff_symbol in some other way.
>
> Thanks, Richard, for the review!
> The suggestions will be addressed in the next version.
>
> Jan and Uros, could you please review x86 refactoring for mingw part? Thanks.

Yes, perhaps legitimize_pe_coff_symbol should be handled similarly to
how machopic_legitimize_pic_address is handled, and just use "#if
TARGET_PECOFF" at call sites when calling functions from the new
winnt-dll.h. This would also allow us to remove the early check for
!TARGET_PECOFF in legitimize_pe_coff_symbol.
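
A rough sketch of the call-site guard, under the assumption that
TARGET_PECOFF is always defined to 0 or 1 in the same way TARGET_MACHO is.
pe_coff_legitimize and legitimize_address_sketch are hypothetical
stand-ins, not the real functions or their signatures.

```c
/* Assumption: the target headers always define TARGET_PECOFF to 0 or 1.
   The fallback definition here only keeps the sketch self-contained.  */
#ifndef TARGET_PECOFF
#define TARGET_PECOFF 0
#endif

#if TARGET_PECOFF
/* Hypothetical stand-in for a function declared in the new winnt-dll.h.
   It no longer needs an early "if (!TARGET_PECOFF) return" check,
   because it is only compiled and called for PE/COFF targets.  */
static int
pe_coff_legitimize (int addr)
{
  return addr | 0x1000;
}
#endif

/* Hypothetical call site in target-independent i386 code.  */
int
legitimize_address_sketch (int addr)
{
#if TARGET_PECOFF
  addr = pe_coff_legitimize (addr);
#endif
  return addr;
}
```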

Uros.


[PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

This patch adds support for MVE Tail-Predicated Low Overhead Loops by using the
doloop functionality added to support predicated vectorized hardware loops.

gcc/ChangeLog:

* config/arm/arm-protos.h (arm_target_bb_ok_for_lob): Change
declaration to pass basic_block.
(arm_attempt_dlstp_transform): New declaration.
* config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): Define target hook.
(TARGET_PREDICT_DOLOOP_P): Likewise.
(arm_target_bb_ok_for_lob): Adapt condition.
(arm_mve_get_vctp_lanes): New function.
(arm_dl_usage_type): New internal enum.
(arm_get_required_vpr_reg): New function.
(arm_get_required_vpr_reg_param): New function.
(arm_get_required_vpr_reg_ret_val): New function.
(arm_mve_get_loop_vctp): New function.
(arm_mve_insn_predicated_by): New function.
(arm_mve_across_lane_insn_p): New function.
(arm_mve_load_store_insn_p): New function.
(arm_mve_impl_pred_on_outputs_p): New function.
(arm_mve_impl_pred_on_inputs_p): New function.
(arm_last_vect_def_insn): New function.
(arm_mve_impl_predicated_p): New function.
(arm_mve_check_reg_origin_is_num_elems): New function.
(arm_mve_dlstp_check_inc_counter): New function.
(arm_mve_dlstp_check_dec_counter): New function.
(arm_mve_loop_valid_for_dlstp): New function.
(arm_predict_doloop_p): New function.
(arm_loop_unroll_adjust): New function.
(arm_emit_mve_unpredicated_insn_to_seq): New function.
(arm_attempt_dlstp_transform): New function.
* config/arm/arm.opt (mdlstp): New option.
* config/arm/iterators.md (dlstp_elemsize, letp_num_lanes,
letp_num_lanes_neg, letp_num_lanes_minus_1): New attributes.
(DLSTP, LETP): New iterators.
(predicated_doloop_end_internal): New pattern.
(dlstp_insn): New pattern.
* config/arm/thumb2.md (doloop_end): Adapt to support tail-predicated
loops.
(doloop_begin): Likewise.
* config/arm/types.md (mve_misc): New mve type to represent
predicated_loop_end insn sequences.
* config/arm/unspecs.md:
(DLSTP8, DLSTP16, DLSTP32, DLSTP64,
LETP8, LETP16, LETP32, LETP64): New unspecs for DLSTP and LETP.

gcc/testsuite/ChangeLog:

* gcc.target/arm/lob.h: Add new helpers.
* gcc.target/arm/lob1.c: Use new helpers.
* gcc.target/arm/lob6.c: Likewise.
* gcc.target/arm/dlstp-compile-asm-1.c: New test.
* gcc.target/arm/dlstp-compile-asm-2.c: New test.
* gcc.target/arm/dlstp-compile-asm-3.c: New test.
* gcc.target/arm/dlstp-int8x16.c: New test.
* gcc.target/arm/dlstp-int8x16-run.c: New test.
* gcc.target/arm/dlstp-int16x8.c: New test.
* gcc.target/arm/dlstp-int16x8-run.c: New test.
* gcc.target/arm/dlstp-int32x4.c: New test.
* gcc.target/arm/dlstp-int32x4-run.c: New test.
* gcc.target/arm/dlstp-int64x2.c: New test.
* gcc.target/arm/dlstp-int64x2-run.c: New test.
* gcc.target/arm/dlstp-invalid-asm.c: New test.

Co-authored-by: Stam Markianos-Wright 
---
 gcc/config/arm/arm-protos.h   |4 +-
 gcc/config/arm/arm.cc | 1249 -
 gcc/config/arm/arm.opt|3 +
 gcc/config/arm/iterators.md   |   15 +
 gcc/config/arm/mve.md |   50 +
 gcc/config/arm/thumb2.md  |  138 +-
 gcc/config/arm/types.md   |6 +-
 gcc/config/arm/unspecs.md |   14 +-
 gcc/testsuite/gcc.target/arm/lob.h|  128 +-
 gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
 gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
 .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
 .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
 .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
 .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
 .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
 .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
 .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
 .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
 23 files changed, 3321 insertions(+), 82 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8.c
 create mode 100644 

[PATCH 1/2] doloop: Add support for predicated vectorized loops

2024-05-23 Thread Andre Vieira

This patch adds support in the target agnostic doloop pass for the detection of
predicated vectorized hardware loops.  Arm is currently the only target that
will make use of this feature.

gcc/ChangeLog:

* df-core.cc (df_bb_regno_only_def_find): New helper function.
* df.h (df_bb_regno_only_def_find): Declare new function.
* loop-doloop.cc (doloop_condition_get): Add support for detecting
predicated vectorized hardware loops.
(doloop_modify): Add support for GTU condition checks.
(doloop_optimize): Update costing computation to support alterations to
desc->niter_expr by the backend.

Co-authored-by: Stam Markianos-Wright 
---
 gcc/df-core.cc |  15 +
 gcc/df.h   |   1 +
 gcc/loop-doloop.cc | 164 +++--
 3 files changed, 113 insertions(+), 67 deletions(-)
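
For readers unfamiliar with the df helpers, the idea behind
df_bb_regno_only_def_find can be modelled outside GCC as follows.  This is
a simplified sketch: struct def, first_def, last_def and only_def are
hypothetical stand-ins for df_ref and the df_bb_regno_*_def_find
functions, and a basic block is reduced to an array of definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a df_ref: one definition of a register.  */
struct def
{
  unsigned int regno;
};

/* First definition of REGNO among the N entries of DEFS, or NULL.  */
static const struct def *
first_def (const struct def *defs, size_t n, unsigned int regno)
{
  for (size_t i = 0; i < n; i++)
    if (defs[i].regno == regno)
      return &defs[i];
  return NULL;
}

/* Last definition of REGNO among the N entries of DEFS, or NULL.  */
static const struct def *
last_def (const struct def *defs, size_t n, unsigned int regno)
{
  for (size_t i = n; i-- > 0;)
    if (defs[i].regno == regno)
      return &defs[i];
  return NULL;
}

/* Return the one and only definition of REGNO, or NULL if there is
   none or more than one.  The first and last definitions coincide
   exactly when there is a single definition.  */
const struct def *
only_def (const struct def *defs, size_t n, unsigned int regno)
{
  const struct def *first = first_def (defs, n, regno);
  if (!first)
    return NULL;
  return first == last_def (defs, n, regno) ? first : NULL;
}
```

Comparing the first and last definitions avoids a separate counting pass,
which is the same trick the helper in the diff below relies on.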

diff --git a/gcc/df-core.cc b/gcc/df-core.cc
index f0eb4c93957..b0e8a88d433 100644
--- a/gcc/df-core.cc
+++ b/gcc/df-core.cc
@@ -1964,6 +1964,21 @@ df_bb_regno_last_def_find (basic_block bb, unsigned int regno)
   return NULL;
 }
 
+/* Return the one and only def of REGNO within BB.  If there is no def or
+   there are multiple defs, return NULL.  */
+
+df_ref
+df_bb_regno_only_def_find (basic_block bb, unsigned int regno)
+{
+  df_ref temp = df_bb_regno_first_def_find (bb, regno);
+  if (!temp)
+return NULL;
+  else if (temp == df_bb_regno_last_def_find (bb, regno))
+return temp;
+  else
+return NULL;
+}
+
 /* Finds the reference corresponding to the definition of REG in INSN.
DF is the dataflow object.  */
 
diff --git a/gcc/df.h b/gcc/df.h
index 84e5aa8b524..c4e690b40cf 100644
--- a/gcc/df.h
+++ b/gcc/df.h
@@ -987,6 +987,7 @@ extern void df_check_cfg_clean (void);
 #endif
 extern df_ref df_bb_regno_first_def_find (basic_block, unsigned int);
 extern df_ref df_bb_regno_last_def_find (basic_block, unsigned int);
+extern df_ref df_bb_regno_only_def_find (basic_block, unsigned int);
 extern df_ref df_find_def (rtx_insn *, rtx);
 extern bool df_reg_defined (rtx_insn *, rtx);
 extern df_ref df_find_use (rtx_insn *, rtx);
diff --git a/gcc/loop-doloop.cc b/gcc/loop-doloop.cc
index 529e810e530..8953e1de960 100644
--- a/gcc/loop-doloop.cc
+++ b/gcc/loop-doloop.cc
@@ -85,10 +85,10 @@ doloop_condition_get (rtx_insn *doloop_pat)
  forms:
 
  1)  (parallel [(set (pc) (if_then_else (condition)
-	  			(label_ref (label))
-(pc)))
-	 (set (reg) (plus (reg) (const_int -1)))
-	 (additional clobbers and uses)])
+	(label_ref (label))
+	(pc)))
+		 (set (reg) (plus (reg) (const_int -1)))
+		 (additional clobbers and uses)])
 
  The branch must be the first entry of the parallel (also required
  by jump.cc), and the second entry of the parallel must be a set of
@@ -96,19 +96,33 @@ doloop_condition_get (rtx_insn *doloop_pat)
  the loop counter in an if_then_else too.
 
  2)  (set (reg) (plus (reg) (const_int -1))
- (set (pc) (if_then_else (reg != 0)
-	 (label_ref (label))
-			 (pc))).  
+	 (set (pc) (if_then_else (reg != 0)
+ (label_ref (label))
+ (pc))).
 
- Some targets (ARM) do the comparison before the branch, as in the
+ 3) Some targets (Arm) do the comparison before the branch, as in the
  following form:
 
- 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
-   (set (reg) (plus (reg) (const_int -1)))])
-(set (pc) (if_then_else (cc == NE)
-(label_ref (label))
-(pc))) */
-
+ (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
+		(set (reg) (plus (reg) (const_int -1)))])
+ (set (pc) (if_then_else (cc == NE)
+			 (label_ref (label))
+			 (pc)))
+
+  4) This form supports a construct that is used to represent a vectorized
+  do loop with predication, however we do not need to care about the
+  details of the predication here.
+  Arm uses this construct to support MVE tail predication.
+
+  (parallel
+   [(set (pc)
+	 (if_then_else (gtu (plus (reg) (const_int -n))
+(const_int n-1))
+			   (label_ref)
+			   (pc)))
+	(set (reg) (plus (reg) (const_int -n)))
+	(additional clobbers and uses)])
+ */
   pattern = PATTERN (doloop_pat);
 
   if (GET_CODE (pattern) != PARALLEL)
@@ -173,15 +187,17 @@ doloop_condition_get (rtx_insn *doloop_pat)
   if (! REG_P (reg))
 return 0;
 
-  /* Check if something = (plus (reg) (const_int -1)).
+  /* Check if something = (plus (reg) (const_int -n)).
  On IA-64, this decrement is wrapped in an if_then_else.  */
   inc_src = SET_SRC (inc);
   if (GET_CODE (inc_src) == IF_THEN_ELSE)
 inc_src = XEXP (inc_src, 1);
   if (GET_CODE (inc_src) != PLUS
-  || XEXP (inc_src, 0) != reg
-  || XEXP (inc_src, 1) != constm1_rtx)
+  || !rtx_equal_p (XEXP (inc_src, 0), reg)
+  || 

[PATCH 0/2] arm, doloop: Add support for MVE Tail-Predicated Low Overhead Loops

2024-05-23 Thread Andre Vieira

Hi,

  We held these two patches back in stage 4 because they touched 
target-agnostic code, though I am quite confident they will not affect other 
targets.  Now that stage 1 has reopened, I am reposting them.  I rebased them 
and they apply cleanly on trunk.

  OK for trunk?

Andre Vieira (2):
  doloop: Add support for predicated vectorized loops
  arm: Add support for MVE Tail-Predicated Low Overhead Loops

 gcc/config/arm/arm-protos.h   |4 +-
 gcc/config/arm/arm.cc | 1249 -
 gcc/config/arm/arm.opt|3 +
 gcc/config/arm/iterators.md   |   15 +
 gcc/config/arm/mve.md |   50 +
 gcc/config/arm/thumb2.md  |  138 +-
 gcc/config/arm/types.md   |6 +-
 gcc/config/arm/unspecs.md |   14 +-
 gcc/df-core.cc|   15 +
 gcc/df.h  |1 +
 gcc/loop-doloop.cc|  164 ++-
 gcc/testsuite/gcc.target/arm/lob.h|  128 +-
 gcc/testsuite/gcc.target/arm/lob1.c   |   23 +-
 gcc/testsuite/gcc.target/arm/lob6.c   |8 +-
 .../gcc.target/arm/mve/dlstp-compile-asm-1.c  |  146 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-2.c  |  749 ++
 .../gcc.target/arm/mve/dlstp-compile-asm-3.c  |   46 +
 .../gcc.target/arm/mve/dlstp-int16x8-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int16x8.c|   31 +
 .../gcc.target/arm/mve/dlstp-int32x4-run.c|   45 +
 .../gcc.target/arm/mve/dlstp-int32x4.c|   31 +
 .../gcc.target/arm/mve/dlstp-int64x2-run.c|   48 +
 .../gcc.target/arm/mve/dlstp-int64x2.c|   28 +
 .../gcc.target/arm/mve/dlstp-int8x16-run.c|   44 +
 .../gcc.target/arm/mve/dlstp-int8x16.c|   32 +
 .../gcc.target/arm/mve/dlstp-invalid-asm.c|  521 +++
 26 files changed, 3434 insertions(+), 149 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-compile-asm-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int16x8.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int32x4-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int32x4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int64x2-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int64x2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int8x16-run.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-int8x16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/dlstp-invalid-asm.c

-- 
2.17.1

