Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Sun, Jun 25, 2023 at 2:35 PM Hongtao Liu  wrote:
>
> On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich  wrote:
> >
> > On 25.06.2023 07:12, Hongtao Liu wrote:
> > > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches
> > >  wrote:
> > >>
> > >> ---
> > >> For the purpose here (and elsewhere) bcst_vector_operand() (really:
> > >> bcst_mem_operand()) isn't permissive enough: We'd want it to allow
> > >> 128-bit and 256-bit types as well irrespective of AVX512VL being
> > >> enabled. This would likely require a new predicate
> > >> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name
> > >> selection it will want considering that this is applicable to certain
> > >> non-calculational FP operations as well.)
> > > I think so.
> >
> > Any preference towards predicate and constraint naming?
> something like bcst_mem_operand_$suffiix, $suffix indicates the
> pattern may use zmm instruction for 128/256-bit operand.
> maybe just bcst_mem_operand_zmm?
For constraint, maybe we can reuse Br, relax Br to match bcst_mem_operand_zmm.
For those original patterns with bcst_mem_operand, it should be ok
since it's already guarded by the predicate, the constraint must be
valid.
>
> >
> > Plus I think there's a more general question behind this: A new
> > predicate / constraint pair is likely just one way of dealing
> > with the issue. Another would appear to be to remove the
> > restriction of 128- and 256-byte types when AVX512VL is not
> > enabled, but AVX512F is. While that would require touching a
> > lot of insn constraints, it looks as if lifting that restriction
> > would "merely" require much wider use of Yv where v is used
> > right now. But of course I may well be unaware of (some of) the
> > reasons why that restriction was put in place in the first place
> > (it can't really be the lack of suitable move insns, as those
> > can be synthesized by using e.g. vextract{32,64}x4).
> Also be careful of SIMD Floating-Point Exception if we use the zmm
> version for those arithmetic instructions, the upper bits need to be
> explicitly cleared for 128/256-bit operand.
> For pternlog or other logic instructions, it's ok since there's no
> SIMD Floating-Point Exception for such instructions.
>
> >
> > Jan
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich  wrote:
>
> On 25.06.2023 07:12, Hongtao Liu wrote:
> > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches
> >  wrote:
> >>
> >> ---
> >> For the purpose here (and elsewhere) bcst_vector_operand() (really:
> >> bcst_mem_operand()) isn't permissive enough: We'd want it to allow
> >> 128-bit and 256-bit types as well irrespective of AVX512VL being
> >> enabled. This would likely require a new predicate
> >> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name
> >> selection it will want considering that this is applicable to certain
> >> non-calculational FP operations as well.)
> > I think so.
>
> Any preference towards predicate and constraint naming?
something like bcst_mem_operand_$suffiix, $suffix indicates the
pattern may use zmm instruction for 128/256-bit operand.
maybe just bcst_mem_operand_zmm?

>
> Plus I think there's a more general question behind this: A new
> predicate / constraint pair is likely just one way of dealing
> with the issue. Another would appear to be to remove the
> restriction of 128- and 256-byte types when AVX512VL is not
> enabled, but AVX512F is. While that would require touching a
> lot of insn constraints, it looks as if lifting that restriction
> would "merely" require much wider use of Yv where v is used
> right now. But of course I may well be unaware of (some of) the
> reasons why that restriction was put in place in the first place
> (it can't really be the lack of suitable move insns, as those
> can be synthesized by using e.g. vextract{32,64}x4).
Also be careful of SIMD Floating-Point Exception if we use the zmm
version for those arithmetic instructions, the upper bits need to be
explicitly cleared for 128/256-bit operand.
For pternlog or other logic instructions, it's ok since there's no
SIMD Floating-Point Exception for such instructions.

>
> Jan



-- 
BR,
Hongtao


Re: [PATCH 4/5] x86: further PR target/100711-like splitting

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Sun, Jun 25, 2023 at 2:16 PM Jan Beulich  wrote:
>
> On 25.06.2023 07:06, Hongtao Liu wrote:
> > On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches
> >  wrote:
> >>
> >> With respective two-operand bitwise operations now expressable by a
> >> single VPTERNLOG, add splitters to also deal with ior and xor
> >> counterparts of the original and-only case. Note that the splitters need
> >> to be separate, as the placement of "not" differs in the final insns
> >> (*iornot3, *xnor3) which are intended to pick up one half of
> >> the result.
> >>
> >> gcc/
> >>
> >> * config/i386/sse.md: New splitters to simplify
> >> not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}.
> >>
> >> gcc/testsuite/
> >>
> >> * gcc.target/i386/pr100711-4.c: New test.
> >> * gcc.target/i386/pr100711-5.c: New test.
> >>
> >> --- a/gcc/config/i386/sse.md
> >> +++ b/gcc/config/i386/sse.md
> >> @@ -17366,6 +17366,36 @@
> >> (match_dup 2)))]
> >>"operands[3] = gen_reg_rtx (mode);")
> >>
> >> +(define_split
> >> +  [(set (match_operand:VI 0 "register_operand")
> >> +   (ior:VI
> >> + (vec_duplicate:VI
> >> +   (not:
> >> + (match_operand: 1 "nonimmediate_operand")))
> >> + (match_operand:VI 2 "vector_operand")))]
> >> +  " == 64 || TARGET_AVX512VL
> >> +   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
> >> +  [(set (match_dup 3)
> >> +   (vec_duplicate:VI (match_dup 1)))
> >> +   (set (match_dup 0)
> >> +   (ior:VI (not:VI (match_dup 3)) (match_dup 2)))]
> >> +  "operands[3] = gen_reg_rtx (mode);")
> >> +
> >> +(define_split
> >> +  [(set (match_operand:VI 0 "register_operand")
> >> +   (xor:VI
> >> + (vec_duplicate:VI
> >> +   (not:
> >> + (match_operand: 1 "nonimmediate_operand")))
> >> + (match_operand:VI 2 "vector_operand")))]
> >> +  " == 64 || TARGET_AVX512VL
> >> +   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
> >> +  [(set (match_dup 3)
> >> +   (vec_duplicate:VI (match_dup 1)))
> >> +   (set (match_dup 0)
> >> +   (not:VI (xor:VI (match_dup 3) (match_dup 2]
> >> +  "operands[3] = gen_reg_rtx (mode);")
> >> +
> > Can we merge this splitter(xor:not) into ior:not one with a code
> > iterator for xor,ior, They look the same except for the xor/ior.
>
> They're only almost the same: Note (ior (not )) vs (not (xor )) as
> the result of the splitting. The difference is necessary to fit
> with what patch 1 introduces (which in turn is the way it is to
> fit with what generic code transforms things to up front). (I had
> it the way you suggest initially, until I figured why one of the
> two would end up never being used.)
>
3597  /* Convert (XOR (NOT x) (NOT y)) to (XOR x y).
3598 Also convert (XOR (NOT x) y) to (NOT (XOR x y)), similarly for
3599 (NOT y).  */
3600  {
3601int num_negated = 0;
3602
3603if (GET_CODE (op0) == NOT)
3604  num_negated++, op0 = XEXP (op0, 0);
3605if (GET_CODE (op1) == NOT)
3606  num_negated++, op1 = XEXP (op1, 0);

It looks simplify_rtx plays the trick.

And it's documented.
8602@cindex @code{xor}, canonicalization of
 8603@item
 8604The only possible RTL expressions involving both bitwise exclusive-or
 8605and bitwise negation are @code{(xor:@var{m} @var{x} @var{y})}
 8606and @code{(not:@var{m} (xor:@var{m} @var{x} @var{y}))}.

Then the original patch LGTM.

> Jan



-- 
BR,
Hongtao


Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-24 Thread Jan Beulich via Gcc-patches
On 25.06.2023 07:12, Hongtao Liu wrote:
> On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches
>  wrote:
>>
>> ---
>> For the purpose here (and elsewhere) bcst_vector_operand() (really:
>> bcst_mem_operand()) isn't permissive enough: We'd want it to allow
>> 128-bit and 256-bit types as well irrespective of AVX512VL being
>> enabled. This would likely require a new predicate
>> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name
>> selection it will want considering that this is applicable to certain
>> non-calculational FP operations as well.)
> I think so.

Any preference towards predicate and constraint naming?

Plus I think there's a more general question behind this: A new
predicate / constraint pair is likely just one way of dealing
with the issue. Another would appear to be to remove the
restriction of 128- and 256-byte types when AVX512VL is not
enabled, but AVX512F is. While that would require touching a
lot of insn constraints, it looks as if lifting that restriction
would "merely" require much wider use of Yv where v is used
right now. But of course I may well be unaware of (some of) the
reasons why that restriction was put in place in the first place
(it can't really be the lack of suitable move insns, as those
can be synthesized by using e.g. vextract{32,64}x4).

Jan


Re: [PATCH 4/5] x86: further PR target/100711-like splitting

2023-06-24 Thread Jan Beulich via Gcc-patches
On 25.06.2023 07:06, Hongtao Liu wrote:
> On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches
>  wrote:
>>
>> With respective two-operand bitwise operations now expressable by a
>> single VPTERNLOG, add splitters to also deal with ior and xor
>> counterparts of the original and-only case. Note that the splitters need
>> to be separate, as the placement of "not" differs in the final insns
>> (*iornot3, *xnor3) which are intended to pick up one half of
>> the result.
>>
>> gcc/
>>
>> * config/i386/sse.md: New splitters to simplify
>> not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pr100711-4.c: New test.
>> * gcc.target/i386/pr100711-5.c: New test.
>>
>> --- a/gcc/config/i386/sse.md
>> +++ b/gcc/config/i386/sse.md
>> @@ -17366,6 +17366,36 @@
>> (match_dup 2)))]
>>"operands[3] = gen_reg_rtx (mode);")
>>
>> +(define_split
>> +  [(set (match_operand:VI 0 "register_operand")
>> +   (ior:VI
>> + (vec_duplicate:VI
>> +   (not:
>> + (match_operand: 1 "nonimmediate_operand")))
>> + (match_operand:VI 2 "vector_operand")))]
>> +  " == 64 || TARGET_AVX512VL
>> +   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
>> +  [(set (match_dup 3)
>> +   (vec_duplicate:VI (match_dup 1)))
>> +   (set (match_dup 0)
>> +   (ior:VI (not:VI (match_dup 3)) (match_dup 2)))]
>> +  "operands[3] = gen_reg_rtx (mode);")
>> +
>> +(define_split
>> +  [(set (match_operand:VI 0 "register_operand")
>> +   (xor:VI
>> + (vec_duplicate:VI
>> +   (not:
>> + (match_operand: 1 "nonimmediate_operand")))
>> + (match_operand:VI 2 "vector_operand")))]
>> +  " == 64 || TARGET_AVX512VL
>> +   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
>> +  [(set (match_dup 3)
>> +   (vec_duplicate:VI (match_dup 1)))
>> +   (set (match_dup 0)
>> +   (not:VI (xor:VI (match_dup 3) (match_dup 2]
>> +  "operands[3] = gen_reg_rtx (mode);")
>> +
> Can we merge this splitter(xor:not) into ior:not one with a code
> iterator for xor,ior, They look the same except for the xor/ior.

They're only almost the same: Note (ior (not )) vs (not (xor )) as
the result of the splitting. The difference is necessary to fit
with what patch 1 introduces (which in turn is the way it is to
fit with what generic code transforms things to up front). (I had
it the way you suggest initially, until I figured why one of the
two would end up never being used.)

Jan


RE: [PATCH] GIMPLE_FOLD: Apply LEN_MASK_{LOAD, STORE} into GIMPLE_FOLD

2023-06-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Saturday, June 24, 2023 10:45 PM
To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de; richard.sandif...@arm.com
Subject: Re: [PATCH] GIMPLE_FOLD: Apply LEN_MASK_{LOAD, STORE} into GIMPLE_FOLD



On 6/23/23 07:48, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong 
> 
> Hi, since we are going to have LEN_MASK_{LOAD,STORE} into loopVectorizer.
> 
> Currenly,
> 1. we can fold MASK_{LOAD,STORE} into MEM when mask is all ones.
> 2. we can fold LEN_{LOAD,STORE} into MEM when (len - bias) is VF.
> 
> Now, I think it makes sense that we can support
> 
> fold LEN_MASK_{LOAD,STORE} into MEM when both mask = all ones and (len - 
> bias) is VF.
>   
> gcc/ChangeLog:
> 
>  * gimple-fold.cc (arith_overflowed_p): Apply LEN_MASK_{LOAD,STORE}.
>  (gimple_fold_partial_load_store_mem_ref): Ditto.
>  (gimple_fold_partial_store): Ditto.
>  (gimple_fold_call): Ditto.
OK
jeff


Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-24 Thread Jan Beulich via Gcc-patches
On 25.06.2023 06:42, Hongtao Liu wrote:
> On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches
>  wrote:
>>
>> +(define_code_iterator andor [and ior])
>> +(define_code_attr nlogic [(and "nor") (ior "nand")])
>> +(define_code_attr ternlog_nlogic [(and "0x11") (ior "0x77")])
>> +
>> +(define_insn "*3"
>> +  [(set (match_operand:VI 0 "register_operand" "=v,v")
>> +   (andor:VI
>> + (not:VI (match_operand:VI 1 "bcst_vector_operand" "%v,v"))
>> + (not:VI (match_operand:VI 2 "bcst_vector_operand" "vBr,m"]
> I'm thinking of doing it in simplify_rtx or gimple match.pd to transform
> (and (not op1))  (not op2)) -> (not: (ior: op1 op2))

This wouldn't be a win (not + andn) -> (or + not), but what's
more important is ...

> (ior (not op1) (not op2)) -> (not : (and op1 op2))
> 
> Even w/o avx512f, the transformation should also benefit since it
> takes less logic operations 3 -> 2.(or 2 -> 2 for pandn).

... that these transformations (from the, as per the doc,
canonical representation of nand and nor) are already occurring
in common code, _if_ no suitable insn can be found. That was at
least the conclusion I drew from looking around a lot, supported
by the code that's generated prior to this change.

Jan


RE: [PATCH] New finish_compare_by_pieces target hook (for x86).

2023-06-24 Thread Roger Sayle


On Tue, 13 June 2023 12:02, Richard Biener wrote:
> On Mon, Jun 12, 2023 at 4:04 PM Roger Sayle 
> wrote:
> > The following simple test case, from PR 104610, shows that memcmp ()
> > == 0 can result in some bizarre code sequences on x86.
> >
> > int foo(char *a)
> > {
> > static const char t[] = "0123456789012345678901234567890";
> > return __builtin_memcmp(a, &t[0], sizeof(t)) == 0; }
> >
> > with -O2 currently contains both:
> > xorl%eax, %eax
> > xorl$1, %eax
> > and also
> > movl$1, %eax
> > xorl$1, %eax
> >
> > Changing the return type of foo to _Bool results in the equally
> > bizarre:
> > xorl%eax, %eax
> > testl   %eax, %eax
> > sete%al
> > and also
> > movl$1, %eax
> > testl   %eax, %eax
> > sete%al
> >
> > All these sequences set the result to a constant, but this
> > optimization opportunity only occurs very late during compilation, by
> > basic block duplication in the 322r.bbro pass, too late for CSE or
> > peephole2 to do anything about it.  The problem is that the idiom
> > expanded by compare_by_pieces for __builtin_memcmp_eq contains basic
> > blocks that can't easily be optimized by if-conversion due to the
> > multiple incoming edges on the fail block.
> >
> > In summary, compare_by_pieces generates code that looks like:
> >
> > if (x[0] != y[0]) goto fail_label;
> > if (x[1] != y[1]) goto fail_label;
> > ...
> > if (x[n] != y[n]) goto fail_label;
> > result = 1;
> > goto end_label;
> > fail_label:
> > result = 0;
> > end_label:
> >
> > In theory, the RTL if-conversion pass could be enhanced to tackle
> > arbitrarily complex if-then-else graphs, but the solution proposed
> > here is to allow suitable targets to perform if-conversion during
> > compare_by_pieces.  The x86, for example, can take advantage that all
> > of the above comparisons set and test the zero flag (ZF), which can
> > then be used in combination with sete.  Hence compare_by_pieces could
> > instead generate:
> >
> > if (x[0] != y[0]) goto fail_label;
> > if (x[1] != y[1]) goto fail_label;
> > ...
> > if (x[n] != y[n]) goto fail_label;
> > fail_label:
> > sete result
> >
> > which requires one less basic block, and the redundant conditional
> > branch to a label immediately after is cleaned up by GCC's existing
> > RTL optimizations.
> >
> > For the test case above, where -O2 -msse4 previously generated:
> >
> > foo:movdqu  (%rdi), %xmm0
> > pxor.LC0(%rip), %xmm0
> > ptest   %xmm0, %xmm0
> > je  .L5
> > .L2:movl$1, %eax
> > xorl$1, %eax
> > ret
> > .L5:movdqu  16(%rdi), %xmm0
> > pxor.LC1(%rip), %xmm0
> > ptest   %xmm0, %xmm0
> > jne .L2
> > xorl%eax, %eax
> > xorl$1, %eax
> > ret
> >
> > we now generate:
> >
> > foo:movdqu  (%rdi), %xmm0
> > pxor.LC0(%rip), %xmm0
> > ptest   %xmm0, %xmm0
> > jne .L2
> > movdqu  16(%rdi), %xmm0
> > pxor.LC1(%rip), %xmm0
> > ptest   %xmm0, %xmm0
> > .L2:sete%al
> > movzbl  %al, %eax
> > ret
> >
> > Using a target hook allows the large amount of intelligence already in
> > compare_by_pieces to be re-used by the i386 backend, but this can also
> > help other backends with condition flags where the equality result can
> > be materialized.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> 
> What's the guarantee that the zero flag is appropriately set on all edges 
> incoming
> now and forever?

Is there any reason why this target hook can't be removed (in future) should it 
stop
being useful?  It's completely optional and not required for the correct 
functioning
of the compiler.

> Does this require target specific knowledge on how do_compare_rtx_and_jump
> is emitting RTL?

Yes.  Each backend can decide how best to implement finish_compare_by_pieces
given its internal knowledge of how do_compare_rtx_and_jump works.  It's not
important to the middle-end how the underlying invariants are guaranteed, just
that they are and the backend produces correct code.  A backend may store flags
on the target label, or maintain state in cfun.  Future changes to the i386 
backend
might cause it to revert to the default finish_compare_by_pieces, or provide an
alternate implementation, but at the moment this patch improves the code that
GCC generates.  Very little (in software like GCC) is forever.

> Do you see matching this in ifcvt to be unreasonable?  I'm thinking of 
> "reducing"
> the incoming edges pairwise without actually looking at the ifcvt code.

There's nothing about the proposed patch that prevents or blocks improvements

Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches
 wrote:
>
> Following two-operand bitwise operations, add another splitter to also
> deal with not followed by broadcast all on its own, which can be
> expressed as simple embedded broadcast instead once a broadcast operand
> is actually permitted in the respective insn. While there also permit
> a broadcast operand in the corresponding expander.
The patch LGTM.
>
> gcc/
>
> * config/i386/sse.md: New splitters to simplify
> not;vec_duplicate as a singular vpternlog.
> (one_cmpl2): Allow broadcast for operand 1.
> (one_cmpl2): Likewise.
>
> gcc/testsuite/
>
> * gcc.target/i386/pr100711-6.c: New test.
> ---
> For the purpose here (and elsewhere) bcst_vector_operand() (really:
> bcst_mem_operand()) isn't permissive enough: We'd want it to allow
> 128-bit and 256-bit types as well irrespective of AVX512VL being
> enabled. This would likely require a new predicate
> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name
> selection it will want considering that this is applicable to certain
> non-calculational FP operations as well.)
I think so.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -17156,7 +17156,7 @@
>
>  (define_expand "one_cmpl2"
>[(set (match_operand:VI 0 "register_operand")
> -   (xor:VI (match_operand:VI 1 "vector_operand")
> +   (xor:VI (match_operand:VI 1 "bcst_vector_operand")
> (match_dup 2)))]
>"TARGET_SSE"
>  {
> @@ -17168,7 +17168,7 @@
>
>  (define_insn "one_cmpl2"
>[(set (match_operand:VI 0 "register_operand" "=v,v")
> -   (xor:VI (match_operand:VI 1 "nonimmediate_operand" "v,m")
> +   (xor:VI (match_operand:VI 1 "bcst_vector_operand" "vBr,m")
> (match_operand:VI 2 "vector_all_ones_operand" "BC,BC")))]
>"TARGET_AVX512F
> && (!
> @@ -17191,6 +17191,19 @@
>   (symbol_ref " == 64 || TARGET_AVX512VL")
>   (const_int 1)))])
>
> +(define_split
> +  [(set (match_operand:VI48_AVX512F 0 "register_operand")
> +   (vec_duplicate:VI48_AVX512F
> + (not:
> +   (match_operand: 1 "nonimmediate_operand"]
> +  " == 64 || TARGET_AVX512VL
> +   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
> +  [(set (match_dup 0)
> +   (xor:VI48_AVX512F
> + (vec_duplicate:VI48_AVX512F (match_dup 1))
> + (match_dup 2)))]
> +  "operands[2] = CONSTM1_RTX (mode);")
> +
>  (define_expand "_andnot3"
>[(set (match_operand:VI_AVX2 0 "register_operand")
> (and:VI_AVX2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr100711-6.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */
> +
> +typedef int v16si __attribute__ ((vector_size (64)));
> +typedef long long v8di __attribute__((vector_size (64)));
> +
> +v16si foo_v16si (const int *a)
> +{
> +return (__extension__ (v16si) {~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a,
> +  ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a});
> +}
> +
> +v8di foo_v8di (const long long *a)
> +{
> +return (__extension__ (v8di) {~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a});
> +}
> +
> +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x55, 
> \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}" 2 } } */
>


-- 
BR,
Hongtao


Re: [PATCH 4/5] x86: further PR target/100711-like splitting

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches
 wrote:
>
> With respective two-operand bitwise operations now expressable by a
> single VPTERNLOG, add splitters to also deal with ior and xor
> counterparts of the original and-only case. Note that the splitters need
> to be separate, as the placement of "not" differs in the final insns
> (*iornot3, *xnor3) which are intended to pick up one half of
> the result.
>
> gcc/
>
> * config/i386/sse.md: New splitters to simplify
> not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}.
>
> gcc/testsuite/
>
> * gcc.target/i386/pr100711-4.c: New test.
> * gcc.target/i386/pr100711-5.c: New test.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -17366,6 +17366,36 @@
> (match_dup 2)))]
>"operands[3] = gen_reg_rtx (mode);")
>
> +(define_split
> +  [(set (match_operand:VI 0 "register_operand")
> +   (ior:VI
> + (vec_duplicate:VI
> +   (not:
> + (match_operand: 1 "nonimmediate_operand")))
> + (match_operand:VI 2 "vector_operand")))]
> +  " == 64 || TARGET_AVX512VL
> +   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
> +  [(set (match_dup 3)
> +   (vec_duplicate:VI (match_dup 1)))
> +   (set (match_dup 0)
> +   (ior:VI (not:VI (match_dup 3)) (match_dup 2)))]
> +  "operands[3] = gen_reg_rtx (mode);")
> +
> +(define_split
> +  [(set (match_operand:VI 0 "register_operand")
> +   (xor:VI
> + (vec_duplicate:VI
> +   (not:
> + (match_operand: 1 "nonimmediate_operand")))
> + (match_operand:VI 2 "vector_operand")))]
> +  " == 64 || TARGET_AVX512VL
> +   || (TARGET_AVX512F && !TARGET_PREFER_AVX256)"
> +  [(set (match_dup 3)
> +   (vec_duplicate:VI (match_dup 1)))
> +   (set (match_dup 0)
> +   (not:VI (xor:VI (match_dup 3) (match_dup 2]
> +  "operands[3] = gen_reg_rtx (mode);")
> +
Can we merge this splitter(xor:not) into ior:not one with a code
iterator for xor,ior, They look the same except for the xor/ior.
No need to merge it into and:not case which have different guard conditions.
Others LGTM.
>  (define_insn "*andnot3_mask"
>[(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v")
> (vec_merge:VI48_AVX512VL
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr100711-4.c
> @@ -0,0 +1,42 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -mno-avx512vl -mprefer-vector-width=512 -O2" } */
> +
> +typedef char v64qi __attribute__ ((vector_size (64)));
> +typedef short v32hi __attribute__ ((vector_size (64)));
> +typedef int v16si __attribute__ ((vector_size (64)));
> +typedef long long v8di __attribute__((vector_size (64)));
> +
> +v64qi foo_v64qi (char a, v64qi b)
> +{
> +return (__extension__ (v64qi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +  ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +  ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +  ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +  ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
> +}
> +
> +v32hi foo_v32hi (short a, v32hi b)
> +{
> +return (__extension__ (v32hi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +  ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
> +}
> +
> +v16si foo_v16si (int a, v16si b)
> +{
> +return (__extension__ (v16si) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +  ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
> +}
> +
> +v8di foo_v8di (long long a, v8di b)
> +{
> +return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b;
> +}
> +
> +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 4 { 
> target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 2 { 
> target { ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xdd" 2 { 
> target { ia32 } } } } */
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr100711-5.c
> @@ -0,0 +1,40 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -mno-avx512vl -mprefer-vector-width=512 -O2" } */
> +
> +typedef char v64qi __attribute__ ((vector_size (64)));
> +typedef short v32hi __attribute__ ((vector_size (64)));
> +typedef int v16si __attribute__ ((vector_size (64)));
> +typedef long long v8di __attribute__((vector_size (64)));
> +
> +v64qi foo_v64qi (char a, v64qi b)
> +{
> +return (__extension__ (v64qi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +   ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a,
> +   ~a, ~a, ~a, ~a, ~a, ~a, 

Re: [PATCH 3/5] x86: allow memory operand for AVX2 splitter for PR target/100711

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches
 wrote:
>
> The intended broadcast (with AVX512) can very well be done right from
> memory.
Ok.
>
> gcc/
>
> * config/i386/sse.md: Permit non-immediate operand 1 in AVX2
> form of splitter for PR target/100711.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -17356,7 +17356,7 @@
> (and:VI_AVX2
>   (vec_duplicate:VI_AVX2
> (not:
> - (match_operand: 1 "register_operand")))
> + (match_operand: 1 "nonimmediate_operand")))
>   (match_operand:VI_AVX2 2 "vector_operand")))]
>"TARGET_AVX2"
>[(set (match_dup 3)
>


-- 
BR,
Hongtao


Re: [PATCH 2/5] x86: use VPTERNLOG also for certain andnot forms

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 21, 2023 at 2:27 PM Jan Beulich via Gcc-patches
 wrote:
>
> When it's the memory operand which is to be inverted, using VPANDN*
> requires a further load instruction. The same can be achieved by a
> single VPTERNLOG*. Add two new alternatives (for plain memory and
> embedded broadcast), adjusting the predicate for the first operand
> accordingly.
>
> Two pre-existing testcases actually end up being affected (improved) by
> the change, which is reflected in updated expectations there.
LGTM.
>
> gcc/
>
> PR target/93768
> * config/i386/sse.md (*andnot3): Add new alternatives
> for memory form operand 1.
>
> gcc/testsuite/
>
> PR target/93768
> * gcc.target/i386/avx512f-andn-di-zmm-2.c: New test.
> * gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expecations
> towards generated code.
> * gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit
> code.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -17210,11 +17210,13 @@
>"TARGET_AVX512F")
>
>  (define_insn "*andnot3"
> -  [(set (match_operand:VI 0 "register_operand" "=x,x,v")
> +  [(set (match_operand:VI 0 "register_operand" "=x,x,v,v,v")
> (and:VI
> - (not:VI (match_operand:VI 1 "vector_operand" "0,x,v"))
> - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr")))]
> -  "TARGET_SSE"
> + (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br"))
> + (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,v,v")))]
> +  "TARGET_SSE
> +   && (register_operand (operands[1], mode)
> +   || register_operand (operands[2], mode))"
>  {
>char buf[64];
>const char *ops;
> @@ -17281,6 +17283,15 @@
>  case 2:
>ops = "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
>break;
> +case 3:
> +case 4:
> +  tmp = "pternlog";
> +  ssesuffix = "";
> +  if (which_alternative != 4 || TARGET_AVX512VL)
> +   ops = "v%s%s\t{$0x44, %%1, %%2, %%0|%%0, %%2, %%1, $0x44}";
> +  else
> +   ops = "v%s%s\t{$0x44, %%g1, %%g2, %%g0|%%g0, %%g2, %%g1, $0x44}";
> +  break;
>  default:
>gcc_unreachable ();
>  }
> @@ -17289,7 +17300,7 @@
>output_asm_insn (buf, operands);
>return "";
>  }
> -  [(set_attr "isa" "noavx,avx,avx")
> +  [(set_attr "isa" "noavx,avx,avx,*,*")
> (set_attr "type" "sselog")
> (set (attr "prefix_data16")
>   (if_then_else
> @@ -17297,9 +17308,12 @@
> (eq_attr "mode" "TI"))
> (const_string "1")
> (const_string "*")))
> -   (set_attr "prefix" "orig,vex,evex")
> +   (set_attr "prefix" "orig,vex,evex,evex,evex")
> (set (attr "mode")
> -   (cond [(match_test "TARGET_AVX2")
> +   (cond [(and (eq_attr "alternative" "3,4")
> +   (match_test " < 64 && !TARGET_AVX512VL"))
> +(const_string "XI")
> +  (match_test "TARGET_AVX2")
>  (const_string "")
>(match_test "TARGET_AVX")
>  (if_then_else
> @@ -17310,7 +17324,15 @@
> (match_test "optimize_function_for_size_p (cfun)"))
>  (const_string "V4SF")
>   ]
> - (const_string "")))])
> + (const_string "")))
> +   (set (attr "enabled")
> +   (cond [(eq_attr "alternative" "3")
> +(symbol_ref " == 64 || TARGET_AVX512VL")
> +  (eq_attr "alternative" "4")
> +(symbol_ref " == 64 || TARGET_AVX512VL
> + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
> + ]
> + (const_string "*")))])
>
>  ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vpandn
>  (define_split
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */
> +/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\\\$0x44, 
> \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
> +/* { dg-final { scan-assembler-not "vpbroadcast" } } */
> +
> +#define type __m512i
> +#define vec 512
> +#define op andnot
> +#define suffix epi64
> +#define SCALAR long long
> +
> +#include "avx512-binop-2.h"
> --- a/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c
> +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c
> @@ -1,7 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-options "-mavx512f -O2" } */
> -/* { dg-final { scan-assembler-times "vpbroadcastd\[^\n\]*%zmm\[0-9\]+" 1 } 
> } */
> -/* { dg-final { scan-assembler-times "vpandnd\[^\n\]*%zmm\[0-9\]+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0x44, 
> \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
> +/* { dg-final { scan-assembler-not "vpbroadcast" } } */
>
>  #define type __m512i
>  #define vec 512
> --- a/gcc/testsuite/gcc.target/i386/pr100711-3.c
> +++ b/gc

Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches
 wrote:
>
> All combinations of and, ior, xor, and not involving two operands can be
> expressed that way in a single insn.
>
> gcc/
>
> PR target/93768
> * config/i386/i386.cc (ix86_rtx_costs): Further special-case
> bitwise vector operations.
> * config/i386/sse.md (*iornot3): New insn.
> (*xnor3): Likewise.
> (*3): Likewise.
> (andor): New code iterator.
> (nlogic): New code attribute.
> (ternlog_nlogic): Likewise.
>
> gcc/testsuite/
>
> PR target/93768
> gcc.target/i386/avx512-binop-not-1.h: New.
> gcc.target/i386/avx512-binop-not-2.h: New.
> gcc.target/i386/avx512f-orn-si-zmm-1.c: New test.
> gcc.target/i386/avx512f-orn-si-zmm-2.c: New test.
> ---
> The use of VI matches that in e.g. one_cmpl2 /
> one_cmpl2 and *andnot3, despite
> (here and there)
> - V64QI and V32HI being needlessly excluded when AVX512BW isn't enabled,
> - VTI not being covered,
> - vector modes more narrow than 16 bytes not being covered.
>
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -21178,6 +21178,32 @@ ix86_rtx_costs (rtx x, machine_mode mode
>return false;
>
>  case IOR:
> +  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
> +   {
> + /* (ior (not ...) ...) can be a single insn in AVX512.  */
> + if (GET_CODE (XEXP (x, 0)) == NOT && TARGET_AVX512F
> + && (GET_MODE_SIZE (mode) == 64
> + || (TARGET_AVX512VL
> + && (GET_MODE_SIZE (mode) == 32
> + || GET_MODE_SIZE (mode) == 16
> +   {
> + rtx right = GET_CODE (XEXP (x, 1)) != NOT
> + ? XEXP (x, 1) : XEXP (XEXP (x, 1), 0);
> +
> + *total = ix86_vec_cost (mode, cost->sse_op)
> +  + rtx_cost (XEXP (XEXP (x, 0), 0), mode,
> +  outer_code, opno, speed)
> +  + rtx_cost (right, mode, outer_code, opno, speed);
> + return true;
> +   }
> + *total = ix86_vec_cost (mode, cost->sse_op);
> +   }
> +  else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
> +   *total = cost->add * 2;
> +  else
> +   *total = cost->add;
> +  return false;
> +
>  case XOR:
>if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
> *total = ix86_vec_cost (mode, cost->sse_op);
> @@ -21198,11 +21224,20 @@ ix86_rtx_costs (rtx x, machine_mode mode
>   /* pandn is a single instruction.  */
>   if (GET_CODE (XEXP (x, 0)) == NOT)
> {
> + rtx right = XEXP (x, 1);
> +
> + /* (and (not ...) (not ...)) can be a single insn in AVX512.  */
> + if (GET_CODE (right) == NOT && TARGET_AVX512F
> + && (GET_MODE_SIZE (mode) == 64
> + || (TARGET_AVX512VL
> + && (GET_MODE_SIZE (mode) == 32
> + || GET_MODE_SIZE (mode) == 16
> +   right = XEXP (right, 0);
> +
>   *total = ix86_vec_cost (mode, cost->sse_op)
>+ rtx_cost (XEXP (XEXP (x, 0), 0), mode,
>outer_code, opno, speed)
> -  + rtx_cost (XEXP (x, 1), mode,
> -  outer_code, opno, speed);
> +  + rtx_cost (right, mode, outer_code, opno, speed);
>   return true;
> }
>   else if (GET_CODE (XEXP (x, 1)) == NOT)
> @@ -21260,8 +21295,25 @@ ix86_rtx_costs (rtx x, machine_mode mode
>
>  case NOT:
>if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
> -   // vnot is pxor -1.
> -   *total = ix86_vec_cost (mode, cost->sse_op) + 1;
> +   {
> + /* (not (xor ...)) can be a single insn in AVX512.  */
> + if (GET_CODE (XEXP (x, 0)) == XOR && TARGET_AVX512F
> + && (GET_MODE_SIZE (mode) == 64
> + || (TARGET_AVX512VL
> + && (GET_MODE_SIZE (mode) == 32
> + || GET_MODE_SIZE (mode) == 16
> +   {
> + *total = ix86_vec_cost (mode, cost->sse_op)
> +  + rtx_cost (XEXP (XEXP (x, 0), 0), mode,
> +  outer_code, opno, speed)
> +  + rtx_cost (XEXP (XEXP (x, 0), 1), mode,
> +  outer_code, opno, speed);
> + return true;
> +   }
> +
> + // vnot is pxor -1.
> + *total = ix86_vec_cost (mode, cost->sse_op) + 1;
> +   }
>else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
> *total = cost->add * 2;
>else
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -17616,6 +17616,98 @@
>operands[2] = force_reg (V1TImode, CONSTM1_RTX (V1TImode));
>  })
>
> +(define_insn "*iornot3"
> +  [(set (match_operand:VI 0

Re: [PATCH] RISC-V: force arg and target to reg rtx under -O0

2023-06-24 Thread juzhe.zh...@rivai.ai
Hi, Li.
Appreciate for catching this!

I think it's better:
-emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), src));
+emit_move_insn (gen_lowpart (e.vector_mode (), e.target), src);
do this to fix this issue.

Thanks.


juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2023-06-25 11:08
To: gcc-patches
CC: kito.cheng; palmer; juzhe.zhong; Li Xu
Subject: [PATCH] RISC-V: force arg and target to reg rtx under -O0
arg and target should be expanded to reg rtx during expand pass.
 
Consider this following case:
void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) {
  vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1);
}
 
Compilation fails with:
test.c: In function 'test_vlmul_ext_v_i8mf8_i8mf4':
test.c:5:1: error: unrecognizable insn:
5 | }
  | ^
(insn 30 29 0 2 (set (mem/c:VNx2QI (reg/f:DI 143) [0 x+0 S[2, 2] A32])
(mem/c:VNx2QI (reg/f:DI 148) [0 op1+0 S[2, 2] A16])) "test.c":4:18 -1
 (nil))
during RTL pass: vregs
test.c:5:1: internal compiler error: in extract_insn, at recog.cc:2791
0x7c61b8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
../.././riscv-gcc/gcc/rtl-error.cc:108
0x7c61d7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
../.././riscv-gcc/gcc/rtl-error.cc:116
0xed58a7 extract_insn(rtx_insn*)
../.././riscv-gcc/gcc/recog.cc:2791
0xb7f789 instantiate_virtual_regs_in_insn
../.././riscv-gcc/gcc/function.cc:1611
0xb7f789 instantiate_virtual_regs
../.././riscv-gcc/gcc/function.cc:1984
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins-bases.cc: force arg and target to 
reg rtx.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/vlmul_ext-2.c: New test.
---
gcc/config/riscv/riscv-vector-builtins-bases.cc   | 5 -
gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c | 8 
2 files changed, 12 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c
 
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index c6c53dc13a5..f135f7971fa 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1567,7 +1567,10 @@ public:
   {
 tree arg = CALL_EXPR_ARG (e.exp, 0);
 rtx src = expand_normal (arg);
-emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), src));
+if (MEM_P (e.target))
+  e.target = force_reg (GET_MODE (e.target), e.target);
+emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target),
+  MEM_P (src) ? force_reg (GET_MODE (src), src) : src));
 return e.target;
   }
};
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c
new file mode 100644
index 000..2b088b53546
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O0" } */
+
+#include "riscv_vector.h"
+
+void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) {
+  vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1);
+}
-- 
2.17.1
 
 


[PATCH] internal-fn: Fix bug of BIAS argument index

2023-06-24 Thread juzhe . zhong
From: Ju-Zhe Zhong 

When trying to enable LEN_MASK_{LOAD,STORE} in RISC-V port,
I found I made a mistake in case of argument index of BIAS.

This patch is an obvious fix,

Ok for trunk ?

gcc/ChangeLog:

* internal-fn.cc (expand_partial_store_optab_fn): Fix bug of BIAS 
argument index.

---
 gcc/internal-fn.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 1c2fd487e2a..9017176dc7a 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -2991,7 +2991,7 @@ expand_partial_store_optab_fn (internal_fn ifn, gcall 
*stmt, convert_optab optab
   maskt = gimple_call_arg (stmt, 3);
   mask = expand_normal (maskt);
   create_input_operand (&ops[3], mask, TYPE_MODE (TREE_TYPE (maskt)));
-  biast = gimple_call_arg (stmt, 4);
+  biast = gimple_call_arg (stmt, 5);
   bias = expand_normal (biast);
   create_input_operand (&ops[4], bias, QImode);
   icode = convert_optab_handler (optab, TYPE_MODE (type), GET_MODE (mask));
-- 
2.36.3



[PATCH] RISC-V: force arg and target to reg rtx under -O0

2023-06-24 Thread Li Xu
arg and target should be expanded to reg rtx during expand pass.

Consider this following case:
void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) {
  vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1);
}

Compilation fails with:
test.c: In function 'test_vlmul_ext_v_i8mf8_i8mf4':
test.c:5:1: error: unrecognizable insn:
5 | }
  | ^
(insn 30 29 0 2 (set (mem/c:VNx2QI (reg/f:DI 143) [0 x+0 S[2, 2] A32])
(mem/c:VNx2QI (reg/f:DI 148) [0 op1+0 S[2, 2] A16])) "test.c":4:18 -1
 (nil))
during RTL pass: vregs
test.c:5:1: internal compiler error: in extract_insn, at recog.cc:2791
0x7c61b8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
../.././riscv-gcc/gcc/rtl-error.cc:108
0x7c61d7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
../.././riscv-gcc/gcc/rtl-error.cc:116
0xed58a7 extract_insn(rtx_insn*)
../.././riscv-gcc/gcc/recog.cc:2791
0xb7f789 instantiate_virtual_regs_in_insn
../.././riscv-gcc/gcc/function.cc:1611
0xb7f789 instantiate_virtual_regs
../.././riscv-gcc/gcc/function.cc:1984

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: force arg and target to 
reg rtx.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vlmul_ext-2.c: New test.
---
 gcc/config/riscv/riscv-vector-builtins-bases.cc   | 5 -
 gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c | 8 
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index c6c53dc13a5..f135f7971fa 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1567,7 +1567,10 @@ public:
   {
 tree arg = CALL_EXPR_ARG (e.exp, 0);
 rtx src = expand_normal (arg);
-emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), src));
+if (MEM_P (e.target))
+  e.target = force_reg (GET_MODE (e.target), e.target);
+emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target),
+  MEM_P (src) ? force_reg (GET_MODE (src), src) : src));
 return e.target;
   }
 };
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c
new file mode 100644
index 000..2b088b53546
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O0" } */
+
+#include "riscv_vector.h"
+
+void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) {
+  vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1);
+}
-- 
2.17.1



Re: Re: [PATCH V1] RISC-V:Add float16 tuple type support

2023-06-24 Thread juzhe.zh...@rivai.ai
Such issue will be addressed by this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622440.html 
But still wait for Jakub's comments.



juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2023-06-23 18:25
To: shiyulong
CC: gcc-patches; palmer; kito.cheng; jim.wilson.gcc; juzhe.zhong; pan2.li; 
wuwei2016; jiawei; shihua; dje.gcc; mirimmad
Subject: Re: [PATCH V1] RISC-V:Add float16 tuple type support
../../gcc/lto-streamer-out.cc: In function 'void lto_output_init_mode_table()':
../../gcc/lto-streamer-out.cc:3177:10: error: 'void* memset(void*, int, 
size_t)' forming offset [256, 283] is out of the bounds [0, 256] of object 
'streamer_mode_table' with type 'unsigned char [256]' [-Werror=array-bounds=]
3177 |   memset (streamer_mode_table, '\0', MAX_MACHINE_MODE);
  |   ~~~^
In file included from ../../gcc/gimple-streamer.h:25,
 from ../../gcc/lto-streamer-out.cc:33:
../../gcc/tree-streamer.h:78:22: note: 'streamer_mode_table' declared here
   78 | extern unsigned char streamer_mode_table[1 << 8];
  |  ^~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:1180: lto-streamer-out.o] Error 1
 
-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
 


[PATCHv4, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-06-24 Thread HAO CHEN GUI via Gcc-patches
Hi,
  This patch adds a new insn for vector splat with small V2DI constants on P8.
If the value of constant is in RANGE (-16, 15) and not 0 or -1, it can be loaded
with vspltisw and vupkhsw on P8. It should be efficient than loading vector from
memory.

  Compared to last version, the main change is to remove the new constraint and
use a super constraint in the insn and set the check into insn condition.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen

ChangeLog
2023-06-25  Haochen Gui 

gcc/
PR target/104124
* config/rs6000/altivec.md (*altivec_vupkhs_direct): Rename
to...
(altivec_vupkhs_direct): ...this.
* config/rs6000/predicates.md (vspltisw_vupkhsw_constant_split): New
predicate to test if a constant can be loaded with vspltisw and
vupkhsw.
(easy_vector_constant): Call vspltisw_vupkhsw_constant_p to Check if
a vector constant can be synthesized with a vspltisw and a vupkhsw.
* config/rs6000/rs6000-protos.h (vspltisw_vupkhsw_constant_p): Declare.
* config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p): New function
to return true if OP mode is V2DI and can be synthesized with vupkhsw
and vspltisw.
* config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up
constants with vspltisw and vupkhsw.

gcc/testsuite/
PR target/104124
* gcc.target/powerpc/pr104124.c: New.

patch.diff
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 49b0c964f4d..2c932854c33 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2542,7 +2542,7 @@ (define_insn "altivec_vupkhs"
 }
   [(set_attr "type" "vecperm")])

-(define_insn "*altivec_vupkhs_direct"
+(define_insn "altivec_vupkhs_direct"
   [(set (match_operand:VP 0 "register_operand" "=v")
(unspec:VP [(match_operand: 1 "register_operand" "v")]
 UNSPEC_VUNPACK_HI_SIGN_DIRECT))]
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 52c65534e51..f62a4d9b506 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -694,6 +694,12 @@ (define_predicate "xxspltib_constant_split"
   return num_insns > 1;
 })

+;; Return true if the operand is a constant that can be loaded with a vspltisw
+;; instruction and then a vupkhsw instruction.
+
+(define_predicate "vspltisw_vupkhsw_constant_split"
+  (and (match_code "const_vector")
+   (match_test "vspltisw_vupkhsw_constant_p (op, mode)")))

 ;; Return 1 if the operand is constant that can loaded directly with a XXSPLTIB
 ;; instruction.
@@ -742,6 +748,11 @@ (define_predicate "easy_vector_constant"
   && xxspltib_constant_p (op, mode, &num_insns, &value))
return true;

+  /* V2DI constant within RANGE (-16, 15) can be synthesized with a
+vspltisw and a vupkhsw.  */
+  if (vspltisw_vupkhsw_constant_p (op, mode, &value))
+   return true;
+
   return easy_altivec_constant (op, mode);
 }

diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 1a4fc1df668..00cb2d82953 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, 
rtx, int, int, int,

 extern int easy_altivec_constant (rtx, machine_mode);
 extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
+extern bool vspltisw_vupkhsw_constant_p (rtx, machine_mode, int * = nullptr);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 3be5860dd9b..ae34a02b282 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -6638,6 +6638,36 @@ xxspltib_constant_p (rtx op,
   return true;
 }

+/* Return true if OP mode is V2DI and can be synthesized with ISA 2.07
+   instructions vupkhsw and vspltisw.
+
+   Return the constant that is being split via CONSTANT_PTR.  */
+
+bool
+vspltisw_vupkhsw_constant_p (rtx op, machine_mode mode, int *constant_ptr)
+{
+  HOST_WIDE_INT value;
+  rtx elt;
+
+  if (!TARGET_P8_VECTOR)
+return false;
+
+  if (mode != V2DImode)
+return false;
+
+  if (!const_vec_duplicate_p (op, &elt))
+return false;
+
+  value = INTVAL (elt);
+  if (value == 0 || value == 1
+  || !EASY_VECTOR_15 (value))
+return false;
+
+  if (constant_ptr)
+*constant_ptr = (int) value;
+  return true;
+}
+
 const char *
 output_vec_const_move (rtx *operands)
 {
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 7d845df5c2d..4919b073e50 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1174,6 +1174,30 @@ (define_insn_and_split "*xxspltib__split"
   [(set_attr "type" "vecperm")
(set_attr "length" "8")])

+(defi

Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-24 Thread Stefan O'Rear via Gcc-patches
On Sat, Jun 24, 2023, at 11:01 AM, Jeff Law via Gcc-patches wrote:
> On 6/21/23 02:14, Wang, Yanzhang wrote:
>> Hi Jeff, sorry for the late reply.
>> 
>>> The long branch handling is done at the assembler level.  So the clobbering
>>> of $ra isn't visible to the compiler.  Thus the compiler has to be
>>> extremely careful to not hold values in $ra because the assembler may
>>> clobber $ra.
>> 
>> If assembler will modify the $ra behavior, it seems the rules we defined in
>> the riscv.cc will be ignored. For example, the $ra saving generated by this
>> patch may be modified by the assmebler and all others depends on it will be
>> wrong. So implementing the long jump in the compiler is better.
> Basically correct.  The assembler potentially clobbers $ra.  That's why 
> in the long jump patches $ra becomes a fixed register -- the compiler 
> doesn't know when it's clobbered by the assembler.
>
> Even if this were done in the compiler, we'd still have to do something 
> special with $ra.  The point at which decisions about register 
> allocation and such are made is before the point where we know the final 
> positions of jumps/labels.  It's a classic problem in GCC's design.

Do you have a reference for more information on the long jump patches?

I'm particularly curious about why $ra was selected as the temporary instead
of $t1 like the tail pseudoinstruction uses.

-s


RE: [PATCH] SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias analysis

2023-06-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Saturday, June 24, 2023 10:09 PM
To: 钟居哲 ; gcc-patches 
Cc: rguenther ; richard.sandiford 
Subject: Re: [PATCH] SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias 
analysis



On 6/23/23 17:21, 钟居哲 wrote:
> Not sure since I saw MASK_STORE/LEN_STORE didn't compute size.
Yea, I think you're right.  We take the size from the LHS.  My mistake.

This is fine for the trunk.

jeff


Re: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-24 Thread Hongtao Liu via Gcc-patches
On Sun, Jun 25, 2023 at 9:17 AM Liu, Hongtao  wrote:
>
>
>
> > -Original Message-
> > From: Jan Beulich 
> > Sent: Wednesday, June 21, 2023 8:40 PM
> > To: Hongtao Liu 
> > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin ; Liu,
> > Hongtao 
> > Subject: Re: [PATCH v2] x86: make better use of VBROADCASTSS /
> > VPBROADCASTD
> >
> > On 21.06.2023 09:44, Jan Beulich wrote:
> > > On 21.06.2023 09:37, Hongtao Liu wrote:
> > >> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches
> > >>  wrote:
> > >>>
> > >>> Isn't prefix_extra use bogus here? What extra prefix does
> > >>> vbroadcastss
> > >> According to comments, yes, no extra prefix is needed.
> > >>
> > >> ;; There are also additional prefixes in 3DNOW, SSSE3.
> > >> ;; ssemuladd,sse4arg default to 0f24/0f25 and DREX byte, ;;
> > >> sseiadd1,ssecvt1 to 0f7a with no DREX byte.
> > >> ;; 3DNOW has 0f0f prefix, SSSE3 and SSE4_{1,2} 0f38/0f3a.
> > >
> > > Right, that's what triggered my question. I guess dropping these
> > > "prefix_extra" really wants to be a separate patch (or maybe even
> > > multiple, but it's hard to see how to split), dealing with all of the
> > > instances which likely have accumulated simply via copy-and-paste.
> >
> > Or wait - I'm altering those lines anyway, so I could as well drop them 
> > right
> > away (and slightly shrink patch size), if that's okay with you. Of course I
> > should then not forget to also mention this in the changelog entry.
> >
> Yes.
>Would you be okay for me to fold in that adjustment, or do you
>insist on a separate patch?
Also for this, no need for a separate patch.
> > Jan



-- 
BR,
Hongtao


RE: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-24 Thread Liu, Hongtao via Gcc-patches


> -Original Message-
> From: Jan Beulich 
> Sent: Wednesday, June 21, 2023 8:40 PM
> To: Hongtao Liu 
> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin ; Liu,
> Hongtao 
> Subject: Re: [PATCH v2] x86: make better use of VBROADCASTSS /
> VPBROADCASTD
> 
> On 21.06.2023 09:44, Jan Beulich wrote:
> > On 21.06.2023 09:37, Hongtao Liu wrote:
> >> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches
> >>  wrote:
> >>>
> >>> Isn't prefix_extra use bogus here? What extra prefix does
> >>> vbroadcastss
> >> According to comments, yes, no extra prefix is needed.
> >>
> >> ;; There are also additional prefixes in 3DNOW, SSSE3.
> >> ;; ssemuladd,sse4arg default to 0f24/0f25 and DREX byte, ;;
> >> sseiadd1,ssecvt1 to 0f7a with no DREX byte.
> >> ;; 3DNOW has 0f0f prefix, SSSE3 and SSE4_{1,2} 0f38/0f3a.
> >
> > Right, that's what triggered my question. I guess dropping these
> > "prefix_extra" really wants to be a separate patch (or maybe even
> > multiple, but it's hard to see how to split), dealing with all of the
> > instances which likely have accumulated simply via copy-and-paste.
> 
> Or wait - I'm altering those lines anyway, so I could as well drop them right
> away (and slightly shrink patch size), if that's okay with you. Of course I
> should then not forget to also mention this in the changelog entry.
> 
Yes.
> Jan


RE: [PATCH V1] RISC-V:Add float16 tuple type abi

2023-06-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Jeff Law  
Sent: Saturday, June 24, 2023 10:51 PM
To: juzhe.zh...@rivai.ai; yulong ; gcc-patches 

Cc: palmer ; Kito.cheng ; Li, Pan2 
; wuwei2016 ; jiawei 
; shihua ; dje.gcc ; 
pinskia ; Robin Dapp 
Subject: Re: [PATCH V1] RISC-V:Add float16 tuple type abi



On 6/21/23 01:46, juzhe.zh...@rivai.ai wrote:
> LGTM. Thanks.
OK from me as well.
jeff


Re: [Patch, fortran] PR49213 - [OOP] gfortran rejects structure constructor expression

2023-06-24 Thread Harald Anlauf via Gcc-patches

Hi Paul!

On 6/24/23 15:18, Paul Richard Thomas via Gcc-patches wrote:

I have included the adjustment to 'gfc_is_ptr_fcn' and eliminating the
extra blank line, introduced by my last patch. I played safe and went
exclusively for class functions with attr.class_pointer set on the
grounds that these have had all the accoutrements checked and built
(ie. class_ok). I am still not sure if this is necessary or not.


maybe it is my fault, but I find the version in the patch confusing:

@@ -816,7 +816,7 @@ bool
 gfc_is_ptr_fcn (gfc_expr *e)
 {
   return e != NULL && e->expr_type == EXPR_FUNCTION
- && (gfc_expr_attr (e).pointer
+ && ((e->ts.type != BT_CLASS && gfc_expr_attr (e).pointer)
  || (e->ts.type == BT_CLASS
  && CLASS_DATA (e)->attr.class_pointer));
 }

The caller 'gfc_is_ptr_fcn' has e->expr_type == EXPR_FUNCTION, so
gfc_expr_attr (e) boils down to:

  if (e->value.function.esym && e->value.function.esym->result)
{
  gfc_symbol *sym = e->value.function.esym->result;
  attr = sym->attr;
  if (sym->ts.type == BT_CLASS && sym->attr.class_ok)
{
  attr.dimension = CLASS_DATA (sym)->attr.dimension;
  attr.pointer = CLASS_DATA (sym)->attr.class_pointer;
  attr.allocatable = CLASS_DATA (sym)->attr.allocatable;
}
}
...
  else if (e->symtree)
attr = gfc_variable_attr (e, NULL);

So I thought this should already do what you want if you do

gfc_is_ptr_fcn (gfc_expr *e)
{
  return e != NULL && e->expr_type == EXPR_FUNCTION && gfc_expr_attr 
(e).pointer;

}

or what am I missing?  The additional checks in gfc_expr_attr are
there to avoid ICEs in case CLASS_DATA (sym) has issues, and we all
know Gerhard who showed that he is an expert in exploiting this.

To sum up, I'd prefer to use the safer form if it works.  If it
doesn't, I would expect a latent issue.

The rest of the code looked good to me, but I was suspicious about
the handling of CHARACTER.

Nasty as I am, I modified the testcase to use character(kind=4)
instead of kind=1 (see attached).  This either fails here (stop 10),
or if I activate the marked line

!cont = tContainer('hello!')   ! ### ICE! ###

I get an ICE.

Can you have another look?

Thanks,
Harald






OK for trunk?

Paul

Fortran: Enable class expressions in structure constructors [PR49213]

2023-06-24  Paul Thomas  

gcc/fortran
PR fortran/49213
* expr.cc (gfc_is_ptr_fcn): Guard pointer attribute to exclude
class expressions.
* resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow
associate names with pointer function targets to be used in
variable definition context.
* trans-decl.cc (get_symbol_decl): Remove extraneous line.
* trans-expr.cc (alloc_scalar_allocatable_subcomponent): Obtain
size of intrinsic and character expressions.
(gfc_trans_subcomponent_assign): Expand assignment to class
components to include intrinsic and character expressions.

gcc/testsuite/
PR fortran/49213
* gfortran.dg/pr49213.f90 : New test
! { dg-do run }
!
! Contributed by Neil Carlson  
!
program main
! character(2) :: c
  character(2,kind=4) :: c

  type :: S
integer :: n
  end type
  type(S) :: Sobj

  type, extends(S) :: S2
integer :: m
  end type
  type(S2) :: S2obj

  type :: T
class(S), allocatable :: x
  end type
  type(T) :: Tobj

  Sobj = S(1)
  Tobj = T(Sobj)

  S2obj = S2(1,2)
  Tobj = T(S2obj)! Failed here
  select type (x => Tobj%x)
type is (S2)
  if ((x%n .ne. 1) .or. (x%m .ne. 2)) stop 1
class default
  stop 2
  end select

  c = 4_"  "
  call pass_it (T(Sobj))
  if (c .ne. 4_"S ") stop 3
  call pass_it (T(S2obj))! and here
  if (c .ne. 4_"S2") stop 4

  call bar

contains

  subroutine pass_it (foo)
type(T), intent(in) :: foo
select type (x => foo%x)
  type is (S)
c = 4_"S "
if (x%n .ne. 1) stop 5
  type is (S2)
c = 4_"S2"
if ((x%n .ne. 1) .or. (x%m .ne. 2)) stop 6
  class default
stop 7
end select
  end subroutine

  subroutine bar
   ! Test from comment #29 of the PR - due to Janus Weil
type tContainer
  class(*), allocatable :: x
end type
integer, parameter :: i = 0
character(7,kind=4) :: chr = 4_"goodbye"
type(tContainer) :: cont

cont%x = i ! linker error: undefined reference to `__copy_INTEGER_4_.3804'

cont = tContainer(i+42) ! Failed here
select type (z => cont%x)
  type is (integer)
if (z .ne. 42) stop 8
  class default
stop 9
end select

!cont = tContainer('hello!')   ! ### ICE! ###
cont = tContainer(4_'hello!')
select type (z => cont%x)
  type is (character(*,kind=4))
if (z .ne. 4_'hello!') stop 10
  class default
stop 11
end select

cont = tContainer(chr)
select type (z => cont%x)
  type is (character(*,kind=4))
if (z .ne. 4_'goodbye') stop 12
  class default
 

[PATCH, part2, committed] Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360]

2023-06-24 Thread Harald Anlauf via Gcc-patches
Dear all,

the first part of the patch came with a testcase that also exercised
code for constant string arguments, which was not touched by that patch
but seems to have caused runtime failures on big-endian platforms
(e.g. Power-* BE) for all optimization levels, and on x86 / -m32
at -O1 and higher (not at -O0).

I did not see any issues on x86 / -m64 and any optimization level,
but could reproduce a problem with x86 / -m32 at -O1, which appears
to be related how arguments that are to be passed by value are
handled when there is a mismatch between the function prototype
and the passed argument.  The solution is to truncate too long
constant string arguments, fixed by the attached patch, pushed as:

https://gcc.gnu.org/g:3f97d10aa1ff5984d6fd657f246d3f251b254ff1

and see attached.

* * *

I found gcc-testresults quite helpful in checking whether my patch
caused trouble on architectures different from the one I'm working
on.  The value (pun intended) would have been even greater if
output of runtime failures would also be made available.
Many (Fortran) tests provide either a stop code, or some hopefully
helpful diagnostic output on stdout intended for locating errors
on platforms where one has no direct access to, or is less
familiar with.  Far better than a plain

FAIL: gfortran.dg/value_9.f90   -O1  execution test

* * *

Thanks,
Harald

From 3f97d10aa1ff5984d6fd657f246d3f251b254ff1 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sat, 24 Jun 2023 20:36:53 +0200
Subject: [PATCH] Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument
 [PR110360]

gcc/fortran/ChangeLog:

	PR fortran/110360
	* trans-expr.cc (gfc_conv_procedure_call): Truncate constant string
	argument of length > 1 passed to scalar CHARACTER(1),VALUE dummy.
---
 gcc/fortran/trans-expr.cc | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index c92fccd0be2..63e3cf9681e 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -6395,20 +6395,25 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,

 		/* ABI: actual arguments to CHARACTER(len=1),VALUE
 		   dummy arguments are actually passed by value.
-		   The BIND(C) case is handled elsewhere.
-		   TODO: truncate constant strings to length 1.  */
+		   Constant strings are truncated to length 1.
+		   The BIND(C) case is handled elsewhere.  */
 		if (fsym->ts.type == BT_CHARACTER
 			&& !fsym->ts.is_c_interop
 			&& fsym->ts.u.cl->length->expr_type == EXPR_CONSTANT
 			&& fsym->ts.u.cl->length->ts.type == BT_INTEGER
 			&& (mpz_cmp_ui
-			(fsym->ts.u.cl->length->value.integer, 1) == 0)
-			&& e->expr_type != EXPR_CONSTANT)
+			(fsym->ts.u.cl->length->value.integer, 1) == 0))
 		  {
-			parmse.expr = gfc_string_to_single_character
-			  (build_int_cst (gfc_charlen_type_node, 1),
-			   parmse.expr,
-			   e->ts.kind);
+			if (e->expr_type != EXPR_CONSTANT)
+			  parmse.expr = gfc_string_to_single_character
+			(build_int_cst (gfc_charlen_type_node, 1),
+			 parmse.expr,
+			 e->ts.kind);
+			else if (e->value.character.length > 1)
+			  {
+			e->value.character.length = 1;
+			gfc_conv_expr (&parmse, e);
+			  }
 		  }

 		if (fsym->attr.optional
--
2.35.3



[x86_PATCH] New *ashl_doubleword_highpart define_insn_and_split.

2023-06-24 Thread Roger Sayle

This patch contains a pair of (related) optimizations in i386.md that
allow us to generate better code for the example below (this is a step
towards fixing a bugzilla PR, but I've forgotten the number).

__int128 foo64(__int128 x, long long y)
{
  __int128 t = (__int128)y << 64;
  return x ^ t;
}

The hidden issue is that the RTL currently seen by reload contains
the sign extension of y from DImode to TImode, even though this is
dead (not required) for left shifts by more than WORD_SIZE bits.

(insn 11 8 12 2 (parallel [
(set (reg:TI 0 ax [orig:91 y ] [91])
(sign_extend:TI (reg:DI 1 dx [97])))
(clobber (reg:CC 17 flags))
(clobber (scratch:DI))
]) {extendditi2}

What makes this particularly undesirable is that the sign-extension
pattern above requires an additional DImode scratch register, indicated
by the clobber, which unnecessarily increases register pressure.

The proposed solution is to add a define_insn_and_split for such
left shifts (of sign or zero extensions) that only have a non-zero
highpart, where the extension is redundant and eliminated, that can
be split after reload, without scratch registers or early clobbers.

This (late split) exposes a second optimization opportunity where
setting the lowpart to zero can sometimes be combined/simplified with
the following instruction during peephole2.

For the test case above, we previously generated with -O2:

foo64:  xorl%eax, %eax
xorq%rsi, %rdx
xorq%rdi, %rax
ret

with this patch, we now generate:

foo64:  movq%rdi, %rax
xorq%rsi, %rdx
ret

Likewise for the related -m32 test case, we go from:

foo32:  movl12(%esp), %eax
movl%eax, %edx
xorl%eax, %eax
xorl8(%esp), %edx
xorl4(%esp), %eax
ret

to the improved:

foo32:  movl12(%esp), %edx
movl4(%esp), %eax
xorl8(%esp), %edx
ret


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-24  Roger Sayle  

gcc/ChangeLog
* config/i386/i386.md (peephole2): Simplify zeroing a register
followed by an IOR, XOR or PLUS operation on it, into a move.
(*ashl3_doubleword_highpart): New define_insn_and_split to
eliminate (and hide from reload) unnecessary word to doubleword
extensions that are followed by left shifts by sufficient large
(but valid) bit counts.

gcc/testsuite/ChangeLog
* gcc.target/i386/ashldi3-1.c: New 32-bit test case.
* gcc.target/i386/ashlti3-2.c: New 64-bit test case.


Thanks again,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 95a6653c..7664dff 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12206,6 +12206,18 @@
(set_attr "type" "alu")
(set_attr "mode" "QI")])
 
+;; Peephole2 rega = 0; rega op= regb into rega = regb.
+(define_peephole2
+  [(parallel [(set (match_operand:SWI 0 "general_reg_operand")
+  (const_int 0))
+ (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 0)
+  (any_or_plus:SWI (match_dup 0)
+   (match_operand:SWI 1 "")))
+ (clobber (reg:CC FLAGS_REG))])]
+  ""
+  [(set (match_dup 0) (match_dup 1))])
+   
 ;; Split DST = (HI<<32)|LO early to minimize register usage.
 (define_insn_and_split "*concat3_1"
   [(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
@@ -13365,6 +13377,28 @@
   [(const_int 0)]
   "ix86_split_ashl (operands, operands[3], mode); DONE;")
 
+(define_insn_and_split "*ashl3_doubleword_highpart"
+  [(set (match_operand: 0 "register_operand" "=r")
+   (ashift:
+ (any_extend: (match_operand:DWIH 1 "nonimmediate_operand" "rm"))
+ (match_operand:QI 2 "const_int_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "INTVAL (operands[2]) >=  * BITS_PER_UNIT
+   && INTVAL (operands[2]) <  * BITS_PER_UNIT * 2"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  split_double_mode (mode, &operands[0], 1, &operands[0], &operands[3]);
+  int bits = INTVAL (operands[2]) - ( * BITS_PER_UNIT);
+  if (!rtx_equal_p (operands[3], operands[1]))
+emit_move_insn (operands[3], operands[1]);
+  if (bits > 0)
+emit_insn (gen_ashl3 (operands[3], operands[3], GEN_INT (bits)));
+  ix86_expand_clear (operands[0]);
+  DONE;
+})
+
 (define_insn "x86_64_shld"
   [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m")
 (ior:DI (ashift:DI (match_dup 0)
diff --git a/gcc/testsuite/gcc.target/i386/ashldi3-1.c 
b/gcc/testsuite/gcc.target/i386/ashldi3-1.c
new file mode 100644
index 000..b61d63b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/ashldi3-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2" } */
+
+long long foo(long long x, int y)
+{
+  lon

PR82943 - Suggested patch to fix

2023-06-24 Thread Alexander Westbrooks via Gcc-patches
Hello,

I am new to the GFortran community. Over the past two weeks I created a
patch that should fix PR82943 for GFortran. I have attached it to this
email. The patch allows the code below to compile successfully. I am
working on creating test cases next, but I am new to the process so it may
take me some time. After I make test cases, do I email them to you as well?
Do I need to make a pull-request on github in order to get the patch
reviewed?

Thank you,

Alexander Westbrooks

module testmod

public :: foo

type, public :: tough_lvl_0(a, b)
integer, kind :: a = 1
integer, len :: b
contains
procedure :: foo
end type

type, public, EXTENDS(tough_lvl_0) :: tough_lvl_1 (c)
integer, len :: c
contains
procedure :: bar
end type

type, public, EXTENDS(tough_lvl_1) :: tough_lvl_2 (d)
integer, len :: d
contains
procedure :: foobar
end type

contains
subroutine foo(this)
class(tough_lvl_0(1,*)), intent(inout) :: this
end subroutine

subroutine bar(this)
class(tough_lvl_1(1,*,*)), intent(inout) :: this
end subroutine

subroutine foobar(this)
class(tough_lvl_2(1,*,*,*)), intent(inout) :: this
end subroutine

end module

PROGRAM testprogram
USE testmod

TYPE(tough_lvl_0(1,5)) :: test_pdt_0
TYPE(tough_lvl_1(1,5,6))   :: test_pdt_1
TYPE(tough_lvl_2(1,5,6,7)) :: test_pdt_2

CALL test_pdt_0%foo()

CALL test_pdt_1%foo()
CALL test_pdt_1%bar()

CALL test_pdt_2%foo()
CALL test_pdt_2%bar()
CALL test_pdt_2%foobar()


END PROGRAM testprogram


0001-bug-patch-PR82943.patch
Description: Binary data


[x86_64 PATCH] Handle SUBREG conversions in TImode STV (for ptest).

2023-06-24 Thread Roger Sayle

This patch teaches i386's STV pass how to handle SUBREG conversions,
i.e. that a TImode SUBREG can be transformed into a V1TImode SUBREG,
without worrying about other DEFs and USEs.

A motivating example where this is useful is

typedef long long __m128i __attribute__ ((__vector_size__ (16)));
int foo (__m128i x, __m128i y) {
  return (__int128)x == (__int128)y;
}

where with -O2 -msse4 we can now scalar-to-vector transform:

(insn 7 4 8 2 (set (reg:CCZ 17 flags)
(compare:CCZ (subreg:TI (reg/v:V2DI 86 [ x ]) 0)
(subreg:TI (reg/v:V2DI 87 [ y ]) 0))) {*cmpti_doubleword}

into

(insn 17 4 7 2 (set (reg:V1TI 91)
(xor:V1TI (subreg:V1TI (reg/v:V2DI 86 [ x ]) 0)
(subreg:V1TI (reg/v:V2DI 87 [ y ]) 0)))
 (nil))
(insn 7 17 8 2 (set (reg:CCZ 17 flags)
(unspec:CCZ [
(reg:V1TI 91) repeated x2
] UNSPEC_PTEST)) {*sse4_1_ptestv1ti}
 (expr_list:REG_DEAD (reg/v:V2DI 87 [ y ])
(expr_list:REG_DEAD (reg/v:V2DI 86 [ x ])
(nil

with the dramatic effect that the assembly output before:

foo:movaps  %xmm0, -40(%rsp)
movq-32(%rsp), %rdx
movq%xmm0, %rax
movq%xmm1, %rsi
movaps  %xmm1, -24(%rsp)
movq-16(%rsp), %rcx
xorq%rsi, %rax
xorq%rcx, %rdx
orq %rdx, %rax
sete%al
movzbl  %al, %eax
ret

now becomes

foo:pxor%xmm1, %xmm0
xorl%eax, %eax
ptest   %xmm0, %xmm0
sete%al
ret

i.e. a 128-bit vector doesn't need to be transferred to the
scalar unit to be tested for equality.  The new test case includes
additional related examples that show similar improvements.

Previously we explicitly checked *cmpti_doubleword operands to be
either immediate constants, or a TImode REG or a TImode MEM.  By
enhancing this to allow a TImode SUBREG, we now handle everything
that would match the general_operand predicate, making this part
of STV more like other RTL passes (lra/reload).  The big change is
that unlike a regular DF USE, a SUBREG USE doesn't require us to
analyze and convert the rest of the DEF-USE chain.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-24  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-features.cc (scalar_chain:add_insn): Don't
call analyze_register_chain if the USE is a SUBREG.
(timode_scalar_chain::convert_op): Call gen_lowpart to convert
TImode SUBREGs to V1TImode SUBREGs.
(convertible_comparison_p): We can now handle all general_operands
of *cmp_doubleword.
(timode_remove_non_convertible_regs): We only need to check TImode
uses that aren't TImode SUBREGs of registers in other modes.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-ptest-7.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4a3b07a..6e9ba54 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -449,7 +449,8 @@ scalar_chain::add_insn (bitmap candidates, unsigned int 
insn_uid,
 return true;
 
   for (ref = DF_INSN_UID_USES (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
-if (!DF_REF_REG_MEM_P (ref))
+if (DF_REF_TYPE (ref) == DF_REF_REG_USE
+   && !SUBREG_P (DF_REF_REG (ref)))
   if (!analyze_register_chain (candidates, ref, disallowed))
return false;
 
@@ -1621,7 +1622,8 @@ timode_scalar_chain::convert_op (rtx *op, rtx_insn *insn)
   else
 {
   gcc_assert (SUBREG_P (*op));
-  gcc_assert (GET_MODE (*op) == vmode);
+  if (GET_MODE (*op) != V1TImode)
+   *op = gen_lowpart (V1TImode, *op);
 }
 }
 
@@ -1912,12 +1914,8 @@ convertible_comparison_p (rtx_insn *insn, enum 
machine_mode mode)
   rtx op2 = XEXP (src, 1);
 
   /* *cmp_doubleword.  */
-  if ((CONST_SCALAR_INT_P (op1)
-   || ((REG_P (op1) || MEM_P (op1))
-  && GET_MODE (op1) == mode))
-  && (CONST_SCALAR_INT_P (op2)
- || ((REG_P (op2) || MEM_P (op2))
- && GET_MODE (op2) == mode)))
+  if (general_operand (op1, mode)
+  && general_operand (op2, mode))
 return true;
 
   /* *testti_doubleword.  */
@@ -2244,8 +2242,9 @@ timode_remove_non_convertible_regs (bitmap candidates)
   DF_REF_REGNO (ref));
 
FOR_EACH_INSN_USE (ref, insn)
- if (!DF_REF_REG_MEM_P (ref)
- && GET_MODE (DF_REF_REG (ref)) == TImode)
+ if (DF_REF_TYPE (ref) == DF_REF_REG_USE
+ && GET_MODE (DF_REF_REG (ref)) == TImode
+ && !SUBREG_P (DF_REF_REG (ref)))
timode_check_non_convertible_regs (candidates, regs,
   DF_REF_REGNO (ref));
   }
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-ptest-7.c 
b

Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/22/23 07:03, Robin Dapp wrote:

Hi,

when working on FP widening/narrowing I realized the Zvfhmin handling
is not ideal right now:  We use the "enabled" insn attribute to disable
instructions not available with Zvfhmin (but only with Zvfh).

However, "enabled == 0" only disables insn alternatives, in our case all
of them when the mode is a HFmode.  The insn itself remains available
(e.g. for combine to match) and we end up with an insn without alternatives
that reload cannot handle --> ICE.

The proper solution is to disable the instruction for the respective
mode altogether.  This patch achieves this by splitting the VF as well
as VWEXTF iterators into variants with TARGET_ZVFH and
TARGET_VECTOR_ELEN_FP_16 (which is true when either TARGET_ZVFH or
TARGET_ZVFHMIN are true).  Also, VWCONVERTI, VHF and VHF_LMUL1 need
adjustments.

Regards
  Robin

gcc/ChangeLog:

* config/riscv/autovec.md: VF_AUTO -> VF.
* config/riscv/vector-iterators.md: Introduce VF_ZVFHMIN,
VWEXTF_ZVFHMIN and use TARGET_ZVFH in VWCONVERTI, VHF and
VHF_LMUL1.
* config/riscv/vector.md: Use new iterators.

OK for the trunk.  Thanks for walking everyone through the issues here.

jeff


Re: [PATCH v7 0/6] c++, libstdc++: get std::is_object to dispatch to new built-in traits

2023-06-24 Thread Ken Matsui via Gcc-patches
On Tue, Jun 20, 2023 at 8:32 AM Patrick Palka  wrote:
>
> On Thu, 15 Jun 2023, Ken Matsui via Libstdc++ wrote:
>
> > Hi,
> >
> > For those curious about the performance improvements of this patch, I
> > conducted a benchmark that instantiates 256k specializations of
> > is_object_v based on Patrick's code. You can find the benchmark code
> > at this link:
> >
> > https://github.com/ken-matsui/gcc-benches/blob/main/is_object_benchmark.cc
> >
> > On my computer, using the gcc HEAD of this patch for a release build,
> > the patch with -DUSE_BUILTIN took 64% less time and used 44-47% less
> > memory compared to not using it.
>
> That's more like it :D  Though the benchmark should also invoke the
> trait on non-object types too, e.g. Instantiator& or Instantiator(int).

Here is the updated benchmark:

https://github.com/ken-matsui/gcc-benches/blob/main/is_object.md#sat-jun-24-080110-am-pdt-2023

Time: -74.7544%
Peak Memory Usage: -62.5913%
Total Memory Usage: -64.2708%

> >
> > Sincerely,
> > Ken Matsui
> >
> > On Mon, Jun 12, 2023 at 3:49 PM Ken Matsui  
> > wrote:
> > >
> > > Hi,
> > >
> > > This patch series gets std::is_object to dispatch to built-in traits and
> > > implements the following built-in traits, on which std::object depends.
> > >
> > > * __is_reference
> > > * __is_function
> > > * __is_void
> > >
> > > std::is_object was depending on them with disjunction and negation.
> > >
> > > __not_<__or_, is_reference<_Tp>, is_void<_Tp>>>::type
> > >
> > > Therefore, this patch uses them directly instead of implementing an 
> > > additional
> > > built-in trait __is_object, which makes the compiler slightly bigger and
> > > slower.
> > >
> > > __bool_constant > > __is_void(_Tp))>
> > >
> > > This would instantiate only __bool_constant and 
> > > __bool_constant,
> > > which can be mostly shared. That is, the purpose of built-in traits is
> > > considered as achieved.
> > >
> > > Changes in v7
> > >
> > > * Removed an unnecessary new line.
> > >
> > > Ken Matsui (6):
> > >   c++: implement __is_reference built-in trait
> > >   libstdc++: use new built-in trait __is_reference for std::is_reference
> > >   c++: implement __is_function built-in trait
> > >   libstdc++: use new built-in trait __is_function for std::is_function
> > >   c++, libstdc++: implement __is_void built-in trait
> > >   libstdc++: make std::is_object dispatch to new built-in traits
> > >
> > >  gcc/cp/constraint.cc  |  9 +++
> > >  gcc/cp/cp-trait.def   |  3 +
> > >  gcc/cp/semantics.cc   | 12 
> > >  gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  9 +++
> > >  gcc/testsuite/g++.dg/ext/is_function.C| 58 +++
> > >  gcc/testsuite/g++.dg/ext/is_reference.C   | 34 +++
> > >  gcc/testsuite/g++.dg/ext/is_void.C| 35 +++
> > >  gcc/testsuite/g++.dg/tm/pr46567.C |  6 +-
> > >  libstdc++-v3/include/bits/cpp_type_traits.h   | 15 -
> > >  libstdc++-v3/include/debug/helper_functions.h |  5 +-
> > >  libstdc++-v3/include/std/type_traits  | 51 
> > >  11 files changed, 216 insertions(+), 21 deletions(-)
> > >  create mode 100644 gcc/testsuite/g++.dg/ext/is_function.C
> > >  create mode 100644 gcc/testsuite/g++.dg/ext/is_reference.C
> > >  create mode 100644 gcc/testsuite/g++.dg/ext/is_void.C
> > >
> > > --
> > > 2.41.0
> > >
> >
> >


Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/21/23 02:14, Wang, Yanzhang wrote:

Hi Jeff, sorry for the late reply.


The long branch handling is done at the assembler level.  So the clobbering
of $ra isn't visible to the compiler.  Thus the compiler has to be
extremely careful to not hold values in $ra because the assembler may
clobber $ra.


If assembler will modify the $ra behavior, it seems the rules we defined in
the riscv.cc will be ignored. For example, the $ra saving generated by this
patch may be modified by the assmebler and all others depends on it will be
wrong. So implementing the long jump in the compiler is better.
Basically correct.  The assembler potentially clobbers $ra.  That's why 
in the long jump patches $ra becomes a fixed register -- the compiler 
doesn't know when it's clobbered by the assembler.


Even if this were done in the compiler, we'd still have to do something 
special with $ra.  The point at which decisions about register 
allocation and such are made is before the point where we know the final 
positions of jumps/labels.  It's a classic problem in GCC's design.




If you're not going to use dwarf, then my recommendation is to ensure that
the data you need is *always* available in the stack at known
offsets.   That will mean your code isn't optimized as well.  It means
hand written assembly code has to follow the conventions, you can't link
against libraries that do not follow those conventions, etc etc.  But
that's the price you pay for not using dwarf (or presumably ORC/SFRAME
which I haven't studied in detail).


Yes. That's right. All the libraries need to follow the same logic. But as
you said, this is the price if we choose this solution. And fortunately,
this will only be used in special scenarios.
The key point is you want the location of the return pointer to be 
consistent in every function and you want to know that every function 
has a frame pointer.


Otherwise you end up having to either consult on-the-side tables (at 
which point you might as well look at ORC/SFRAME) or disassembling code 
in the executable to deduce where to find fp, ra, etc (which is a path 
to madness).


Thus for the usage scenario you're looking at, I would recommend always 
having a frame pointer, every function, no matter how trivial and that 
$ra always be saved into a suitable slot relative to the frame pointer, 
again, no matter how trivial the function.




And Jeff, do you have any other comments about this patch? Should we add
some descriptions somewhere in the doc?
We may need to adjust the documentation a bit since I think I'm 
suggesting slight changes in the behavior of existing -m options.


I'd like to see an updated patch before commenting further on 
implementation details.


jeff


Re: [PATCH V1] RISC-V:Add float16 tuple type abi

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/21/23 01:46, juzhe.zh...@rivai.ai wrote:

LGTM. Thanks.

OK from me as well.
jeff


Re: [PATCH v2 1/2] c++: implement __is_volatile built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
Here is the benchmark result for is_volatile:

https://github.com/ken-matsui/gcc-benches/blob/main/is_volatile.md#sat-jun-24-074036-am-pdt-2023

Time: -2.42335%
Peak Memory Usage: -1.07651%
Total Memory Usage: -1.62369%

On Sat, Jun 24, 2023 at 7:24 AM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::is_volatile.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_volatile.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_VOLATILE.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_volatile.
> * g++.dg/ext/is_volatile.C: New test.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc |  3 +++
>  gcc/cp/cp-trait.def  |  1 +
>  gcc/cp/semantics.cc  |  4 
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
>  gcc/testsuite/g++.dg/ext/is_volatile.C   | 19 +++
>  5 files changed, 30 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..e971d67ee25 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_UNION:
>inform (loc, "  %qT is not a union", t1);
>break;
> +case CPTK_IS_VOLATILE:
> +  inform (loc, "  %qT is not a volatile type", t1);
> +  break;
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..414b1065a11 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..10934d01504 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_ENUM:
>return type_code1 == ENUMERAL_TYPE;
>
> +case CPTK_IS_VOLATILE:
> +  return CP_TYPE_VOLATILE_P (type1);
> +
>  case CPTK_IS_FINAL:
>return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
>
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +case CPTK_IS_VOLATILE:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..7ad640f141b 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_volatile)
> +# error "__has_builtin (__is_volatile) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_volatile.C 
> b/gcc/testsuite/g++.dg/ext/is_volatile.C
> new file mode 100644
> index 000..004e397e5e7
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_volatile.C
> @@ -0,0 +1,19 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +
> +// Positive tests.
> +SA(__is_volatile(volatile int));
> +SA(__is_volatile(const volatile int));
> +SA(__is_volatile(vClassType));
> +SA(__is_volatile(cvClassType));
> +
> +// Negative tests.
> +SA(!__is_volatile(int));
> +SA(!__is_volatile(const int));
> +SA(!__is_volatile(ClassType));
> +SA(!__is_volatile(cClassType));
> --
> 2.41.0
>


Re: [PATCH] GIMPLE_FOLD: Apply LEN_MASK_{LOAD, STORE} into GIMPLE_FOLD

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/23/23 07:48, juzhe.zh...@rivai.ai wrote:

From: Ju-Zhe Zhong 

Hi, since we are going to have LEN_MASK_{LOAD,STORE} into loopVectorizer.

Currenly,
1. we can fold MASK_{LOAD,STORE} into MEM when mask is all ones.
2. we can fold LEN_{LOAD,STORE} into MEM when (len - bias) is VF.

Now, I think it makes sense that we can support

fold LEN_MASK_{LOAD,STORE} into MEM when both mask = all ones and (len - bias) 
is VF.
  
gcc/ChangeLog:


 * gimple-fold.cc (arith_overflowed_p): Apply LEN_MASK_{LOAD,STORE}.
 (gimple_fold_partial_load_store_mem_ref): Ditto.
 (gimple_fold_partial_store): Ditto.
 (gimple_fold_call): Ditto.

OK
jeff


RE: [PATCH] RISC-V: Refactor the integer ternary autovec pattern

2023-06-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Saturday, June 24, 2023 10:04 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; pal...@rivosinc.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Refactor the integer ternary autovec pattern



On 6/21/23 16:38, Juzhe-Zhong wrote:
> Long time ago, I encounter ICE when trying to set clobber register as Pmode
> and I forgot the reason.
> 
> So, I clobber SI scratch and PUT_MODE to make it Pmode after reload which
> makes patterns look unreasonable.
> 
> According to Jeff's comments, I tried it again, it works now when we try to
> set clobber register as Pmode and the patterns look more reasonable now.
> 
> The tests are all passed, Ok for trunk.
> 
> gcc/ChangeLog:
> 
>  * config/riscv/autovec.md (*fma): set clobber to Pmode in 
> expand stage.
>  (*fma): Ditto.
>  (*fnma): Ditto.
>  (*fnma): Ditto.
OK
jeff


RE: [PATCH V3] RISC-V: Support RVV floating-point auto-vectorization

2023-06-24 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Saturday, June 24, 2023 10:06 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; pal...@rivosinc.com; rdapp@gmail.com
Subject: Re: [PATCH V3] RISC-V: Support RVV floating-point auto-vectorization



On 6/21/23 09:53, Juzhe-Zhong wrote:
> This patch adds RVV floating-point auto-vectorization.
> Also, fix attribute bug of floating-point ternary operations in vector.md.
> 
> gcc/ChangeLog:
> 
>  * config/riscv/autovec.md (fma4): New pattern.
>  (*fma): Ditto.
>  (fnma4): Ditto.
>  (*fnma): Ditto.
>  (fms4): Ditto.
>  (*fms): Ditto.
>  (fnms4): Ditto.
>  (*fnms): Ditto.
>  * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New 
> function.
>  * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto.
>  * config/riscv/vector.md: Fix attribute bug.
OK.  Thanks for digging into that clobber issue.

Jeff



Re: [PATCH v2] RISC-V: Implement autovec copysign.

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/21/23 08:24, 钟居哲 wrote:

LGTM.

Likewise.  OK for the trunk.
jeff


Re: [PATCH][RFC] middle-end/110237 - wrong MEM_ATTRs for partial loads/stores

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/22/23 00:39, Richard Biener wrote:




I suspect there's no way to specify the desired semantics?  OTOH
code that looks at the MEM operand only and not the insn (which
should have some UNSPEC wrapped) needs to be conservative, so maybe
the alias code shouldn't assume that a (mem:V16SI ..) actually
performs an access of the size of V16SI at the specified location?
I'm not aware of a way to express the semantics fully right now.  We'd 
need some way to indicate that the MEM is a partial and pass along the 
actual length.


We could do both through MEM_ATTRS with some work. For example we could 
declare that for vector modes full semantic information is carried in 
the MEM_ATTRS rather than by the mode itself.  So it falls into a space 
between how we currently think of something like V16SI and BLK.  The 
mode specifies a maximum size and how to interpret the elements.  But 
actual size and perhaps mask info would be found in MEM_ATTRS.


jeff


[PATCH v2 2/2] libstdc++: use new built-in trait __is_volatile

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __is_volatile.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_volatile): Use __is_volatile built-in
trait.
(is_volatile_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 13 +
 1 file changed, 13 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..db74b884b35 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -773,6 +773,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public true_type { };
 
   /// is_volatile
+#if __has_builtin(__is_volatile)
+  template
+struct is_volatile
+: public __bool_constant<__is_volatile(_Tp)>
+{ };
+#else
   template
 struct is_volatile
 : public false_type { };
@@ -780,6 +786,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_volatile<_Tp volatile>
 : public true_type { };
+#endif
 
   /// is_trivial
   template
@@ -3214,10 +3221,16 @@ template 
   inline constexpr bool is_const_v = false;
 template 
   inline constexpr bool is_const_v = true;
+
+#if __has_builtin(__is_volatile)
+template 
+  inline constexpr bool is_volatile_v = __is_volatile(_Tp);
+#else
 template 
   inline constexpr bool is_volatile_v = false;
 template 
   inline constexpr bool is_volatile_v = true;
+#endif
 
 template 
   inline constexpr bool is_trivial_v = __is_trivial(_Tp);
-- 
2.41.0



Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/22/23 07:42, Jan Hubicka wrote:



On 6/22/23 00:31, Richard Biener wrote:

I think there's a difference in that __builtin_trap () is observable
while __builtin_unreachable () is not and reaching __builtin_unreachable
() invokes undefined behavior while reaching __builtin_trap () does not.

So the isolation code marking the trapping code volatile should be
enough and the trap () is just there to end the basic block
(and maybe be on the safe side to really trap).

Agreed WRT observability -- but that's not really the point of the trap and
if we wanted we could change that behavior.

The trap is there to halt execution immediately rather than letting it keep
running.  That was a design decision from a security standpoint. If we've
detected that we're executing undefined behavior, stop rather than
potentially letting a malicious actor turn a bug into an exploit.


Also as discussed some time ago, the volatile loads between traps has
effect of turning previously pure/const functions into non-const which
is somewhat sad, so it is still on my todo list to change it this stage1
to something more careful.   We discussed internal functions trap_store
and trap_load which will expand to load/store + trap but will make it
clear that side effect does not count to modref.
It's been a long time since I looked at this code -- isn't it the case 
that we already must have had a load/store and that all we've done is 
change its form (to enable more DCE) and added the volatile marker?


Meaning that it couldn't have been pure/cost before, could it?  Or is it 
the case that you want to not have the erroneous path be the sole reason 
to spoil pure/const detection -- does that happen often in practice?


jeff


[PATCH v2 1/2] c++: implement __is_volatile built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_volatile.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_volatile.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_VOLATILE.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_volatile.
* g++.dg/ext/is_volatile.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_volatile.C   | 19 +++
 5 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..e971d67ee25 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_VOLATILE:
+  inform (loc, "  %qT is not a volatile type", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..414b1065a11 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..10934d01504 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_ENUM:
   return type_code1 == ENUMERAL_TYPE;
 
+case CPTK_IS_VOLATILE:
+  return CP_TYPE_VOLATILE_P (type1);
+
 case CPTK_IS_FINAL:
   return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_VOLATILE:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..7ad640f141b 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_volatile)
+# error "__has_builtin (__is_volatile) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_volatile.C 
b/gcc/testsuite/g++.dg/ext/is_volatile.C
new file mode 100644
index 000..004e397e5e7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_volatile.C
@@ -0,0 +1,19 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+// Positive tests.
+SA(__is_volatile(volatile int));
+SA(__is_volatile(const volatile int));
+SA(__is_volatile(vClassType));
+SA(__is_volatile(cvClassType));
+
+// Negative tests.
+SA(!__is_volatile(int));
+SA(!__is_volatile(const int));
+SA(!__is_volatile(ClassType));
+SA(!__is_volatile(cClassType));
-- 
2.41.0



Re: [PATCH v2 1/2] c++: implement __is_array built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
Here is the benchmark result for is_array:

https://github.com/ken-matsui/gcc-benches/blob/main/is_array.md#sat-jun-24-070630-am-pdt-2023

Time: -15.511%
Peak Memory Usage: +0.173923%
Total Memory Usage: -6.2037%

On Sat, Jun 24, 2023 at 6:54 AM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::is_array.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_array.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARRAY.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_array.
> * g++.dg/ext/is_array.C: New test.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc |  3 +++
>  gcc/cp/cp-trait.def  |  1 +
>  gcc/cp/semantics.cc  |  4 
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
>  gcc/testsuite/g++.dg/ext/is_array.C  | 28 
>  5 files changed, 39 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_array.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..7cec7eba591 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_UNION:
>inform (loc, "  %qT is not a union", t1);
>break;
> +case CPTK_IS_ARRAY:
> +  inform (loc, "  %qT is not an array", t1);
> +  break;
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..f68c7f2e8ec 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_ARRAY, "__is_array", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..22f2700ec0b 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_UNION:
>return type_code1 == UNION_TYPE;
>
> +case CPTK_IS_ARRAY:
> +  return type_code1 == ARRAY_TYPE;
> +
>  case CPTK_IS_ASSIGNABLE:
>return is_xible (MODIFY_EXPR, type1, type2);
>
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +case CPTK_IS_ARRAY:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..56485ae62be 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_array)
> +# error "__has_builtin (__is_array) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_array.C 
> b/gcc/testsuite/g++.dg/ext/is_array.C
> new file mode 100644
> index 000..facfed5c7cb
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_array.C
> @@ -0,0 +1,28 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +#define SA_TEST_CATEGORY(TRAIT, X, expect) \
> +  SA(TRAIT(X) == expect);  \
> +  SA(TRAIT(const X) == expect);\
> +  SA(TRAIT(volatile X) == expect); \
> +  SA(TRAIT(const volatile X) == expect)
> +
> +SA_TEST_CATEGORY(__is_array, int[2], true);
> +SA_TEST_CATEGORY(__is_array, int[], true);
> +SA_TEST_CATEGORY(__is_array, int[2][3], true);
> +SA_TEST_CATEGORY(__is_array, int[][3], true);
> +SA_TEST_CATEGORY(__is_array, float*[2], true);
> +SA_TEST_CATEGORY(__is_array, float*[], true);
> +SA_TEST_CATEGORY(__is_array, float*[2][3], true);
> +SA_TEST_CATEGORY(__is_array, float*[][3], true);
> +SA_TEST_CATEGORY(__is_array, ClassType[2], true);
> +SA_TEST_CATEGORY(__is_array, ClassType[], true);
> +SA_TEST_CATEGORY(__is_array, ClassType[2][3], true);
> +SA_TEST_CATEGORY(__is_array, ClassType[][3], true);
> +
> +// Sanity check.
> +SA_TEST_CATEGORY(__is_array, ClassType, false);
> --
> 2.41.0
>


Re: [PATCH] SSA ALIAS: Apply LEN_MASK_STORE to 'ref_maybe_used_by_call_p_1'

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/23/23 17:20, 钟居哲 wrote:

Not sure since I saw MASK_STORE/LEN_STORE didn't compute size.
Also OK after a re-review on my part.  The code sets the size to -1 
after doing the ao_ref_init_from_ptr_and_size, meaning it's not a known 
size.


jeff



Re: [PATCH] SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias analysis

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/23/23 17:21, 钟居哲 wrote:

Not sure since I saw MASK_STORE/LEN_STORE didn't compute size.

Yea, I think you're right.  We take the size from the LHS.  My mistake.

This is fine for the trunk.

jeff


Re: [PATCH V3] RISC-V: Support RVV floating-point auto-vectorization

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/21/23 09:53, Juzhe-Zhong wrote:

This patch adds RVV floating-point auto-vectorization.
Also, fix attribute bug of floating-point ternary operations in vector.md.

gcc/ChangeLog:

 * config/riscv/autovec.md (fma4): New pattern.
 (*fma): Ditto.
 (fnma4): Ditto.
 (*fnma): Ditto.
 (fms4): Ditto.
 (*fms): Ditto.
 (fnms4): Ditto.
 (*fnms): Ditto.
 * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New 
function.
 * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto.
 * config/riscv/vector.md: Fix attribute bug.

OK.  Thanks for digging into that clobber issue.

Jeff



Re: [PATCH] RISC-V: Refactor the integer ternary autovec pattern

2023-06-24 Thread Jeff Law via Gcc-patches




On 6/21/23 16:38, Juzhe-Zhong wrote:

Long time ago, I encounter ICE when trying to set clobber register as Pmode
and I forgot the reason.

So, I clobber SI scratch and PUT_MODE to make it Pmode after reload which
makes patterns look unreasonable.

According to Jeff's comments, I tried it again, it works now when we try to
set clobber register as Pmode and the patterns look more reasonable now.

The tests are all passed, Ok for trunk.

gcc/ChangeLog:

 * config/riscv/autovec.md (*fma): set clobber to Pmode in expand 
stage.
 (*fma): Ditto.
 (*fnma): Ditto.
 (*fnma): Ditto.

OK
jeff


Re: [PATCH v2 1/3] c++: Track lifetimes in constant evaluation [PR70331, PR96630, PR98675]

2023-06-24 Thread Nathaniel Shead via Gcc-patches
On Fri, Jun 23, 2023 at 12:43:21PM -0400, Patrick Palka wrote:
> On Wed, 29 Mar 2023, Nathaniel Shead via Gcc-patches wrote:
> 
> > This adds rudimentary lifetime tracking in C++ constexpr contexts,
> > allowing the compiler to report errors with using values after their
> > backing has gone out of scope. We don't yet handle other ways of ending
> > lifetimes (e.g. explicit destructor calls).
> 
> Awesome!
> 
> > 
> > PR c++/96630
> > PR c++/98675
> > PR c++/70331
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constexpr.cc (constexpr_global_ctx::put_value): Mark value as
> > in lifetime.
> > (constexpr_global_ctx::remove_value): Mark value as expired.
> > (cxx_eval_call_expression): Remove comment that is no longer
> > applicable.
> > (non_const_var_error): Add check for expired values.
> > (cxx_eval_constant_expression): Add checks for expired values. Forget
> > local variables at end of bind expressions. Forget temporaries at end
> > of cleanup points.
> > * cp-tree.h (struct lang_decl_base): New flag to track expired values
> > in constant evaluation.
> > (DECL_EXPIRED_P): Access the new flag.
> > (SET_DECL_EXPIRED_P): Modify the new flag.
> > * module.cc (trees_out::lang_decl_bools): Write out the new flag.
> > (trees_in::lang_decl_bools): Read in the new flag.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp0x/constexpr-ice20.C: Update error raised by test.
> > * g++.dg/cpp1y/constexpr-lifetime1.C: New test.
> > * g++.dg/cpp1y/constexpr-lifetime2.C: New test.
> > * g++.dg/cpp1y/constexpr-lifetime3.C: New test.
> > * g++.dg/cpp1y/constexpr-lifetime4.C: New test.
> > * g++.dg/cpp1y/constexpr-lifetime5.C: New test.
> > 
> > Signed-off-by: Nathaniel Shead 
> > ---
> >  gcc/cp/constexpr.cc   | 69 +++
> >  gcc/cp/cp-tree.h  | 10 ++-
> >  gcc/cp/module.cc  |  2 +
> >  gcc/testsuite/g++.dg/cpp0x/constexpr-ice20.C  |  2 +-
> >  .../g++.dg/cpp1y/constexpr-lifetime1.C| 13 
> >  .../g++.dg/cpp1y/constexpr-lifetime2.C| 20 ++
> >  .../g++.dg/cpp1y/constexpr-lifetime3.C| 13 
> >  .../g++.dg/cpp1y/constexpr-lifetime4.C| 11 +++
> >  .../g++.dg/cpp1y/constexpr-lifetime5.C| 11 +++
> >  9 files changed, 137 insertions(+), 14 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime1.C
> >  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime2.C
> >  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime3.C
> >  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime4.C
> >  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime5.C
> > 
> > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> > index 3de60cfd0f8..bdbc12144a7 100644
> > --- a/gcc/cp/constexpr.cc
> > +++ b/gcc/cp/constexpr.cc
> > @@ -1185,10 +1185,22 @@ public:
> >void put_value (tree t, tree v)
> >{
> >  bool already_in_map = values.put (t, v);
> > +if (!already_in_map && DECL_P (t))
> > +  {
> > +   if (!DECL_LANG_SPECIFIC (t))
> > + retrofit_lang_decl (t);
> > +   if (DECL_LANG_SPECIFIC (t))
> > + SET_DECL_EXPIRED_P (t, false);
> > +  }
> 
> Since this new flag would only be used only during constexpr evaluation,
> could we instead use an on-the-side hash_set in constexpr_global_ctx for
> tracking expired-ness?  That way we won't have to allocate a
> DECL_LANG_SPECIFIC structure for decls that lack it, and won't have to
> worry about the flag in other parts of the compiler.

I've tried this but I haven't been able to get it to work well. The main
issue I'm running into is the caching of function calls in constant
evaluation. For example, consider the following:

constexpr const double& test() {
  const double& local = 3.0;
  return local;
}

constexpr int foo(const double&) { return 5; }

constexpr int a = foo(test());
static_assert(test() == 3.0);

When constant-evaluating 'a', we evaluate 'test()'. It returns a value
that ends its lifetime immediately, so we mark this in 'ctx->global' as
expired. However, 'foo()' never actually evaluates this expired value,
so the initialisation of 'a' succeeds.

However, then when the static assertion attempts to constant evaluate a
second time, the result of 'test' has already been cached, and we just
get directly handed a value. This is a new constant evaluation, so
'ctx->global' has been reset, and because we just got the result of the
cached function we don't actually know whether this is expired or not
anymore, and so this compiles without any error in case it was valid.

I haven't yet been able to come up with a good way of avoiding this
issue without complicating the caching of call expressions overly much.
I suppose I could add an extra field to 'constexpr_call' to track if the
value has already been expired (which would solve this particular ca

[PATCH v2 2/2] libstdc++: use new built-in trait __is_array

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __is_array.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_array): Use __is_array built-in trait.
(is_array_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..f2a3a327e7d 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -503,6 +503,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 
   /// is_array
+#if __has_builtin(__is_array)
+  template
+struct is_array
+: public __bool_constant<__is_array(_Tp)>
+{ };
+#else
   template
 struct is_array
 : public false_type { };
@@ -514,6 +520,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_array<_Tp[]>
 : public true_type { };
+#endif
 
   template
 struct __is_pointer_helper
@@ -3161,12 +3168,17 @@ template 
 template 
   inline constexpr bool is_floating_point_v = is_floating_point<_Tp>::value;
 
+#if __has_builtin(__is_array)
+template 
+  inline constexpr bool is_array_v = __is_array(_Tp);
+#else
 template 
   inline constexpr bool is_array_v = false;
 template 
   inline constexpr bool is_array_v<_Tp[]> = true;
 template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
+#endif
 
 template 
   inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
-- 
2.41.0



[PATCH v2 1/2] c++: implement __is_array built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_array.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_array.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARRAY.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_array.
* g++.dg/ext/is_array.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_array.C  | 28 
 5 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_array.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..7cec7eba591 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_ARRAY:
+  inform (loc, "  %qT is not an array", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..f68c7f2e8ec 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_ARRAY, "__is_array", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..22f2700ec0b 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_ARRAY:
+  return type_code1 == ARRAY_TYPE;
+
 case CPTK_IS_ASSIGNABLE:
   return is_xible (MODIFY_EXPR, type1, type2);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_ARRAY:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..56485ae62be 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_array)
+# error "__has_builtin (__is_array) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_array.C 
b/gcc/testsuite/g++.dg/ext/is_array.C
new file mode 100644
index 000..facfed5c7cb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_array.C
@@ -0,0 +1,28 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, X, expect) \
+  SA(TRAIT(X) == expect);  \
+  SA(TRAIT(const X) == expect);\
+  SA(TRAIT(volatile X) == expect); \
+  SA(TRAIT(const volatile X) == expect)
+
+SA_TEST_CATEGORY(__is_array, int[2], true);
+SA_TEST_CATEGORY(__is_array, int[], true);
+SA_TEST_CATEGORY(__is_array, int[2][3], true);
+SA_TEST_CATEGORY(__is_array, int[][3], true);
+SA_TEST_CATEGORY(__is_array, float*[2], true);
+SA_TEST_CATEGORY(__is_array, float*[], true);
+SA_TEST_CATEGORY(__is_array, float*[2][3], true);
+SA_TEST_CATEGORY(__is_array, float*[][3], true);
+SA_TEST_CATEGORY(__is_array, ClassType[2], true);
+SA_TEST_CATEGORY(__is_array, ClassType[], true);
+SA_TEST_CATEGORY(__is_array, ClassType[2][3], true);
+SA_TEST_CATEGORY(__is_array, ClassType[][3], true);
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_array, ClassType, false);
-- 
2.41.0



Re: [PATCH v2 1/2] c++: implement __is_const built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
Here is the benchmark result for is_const.

https://github.com/ken-matsui/gcc-benches/blob/main/is_const.md#sat-jun-24-044815-am-pdt-2023

Time: -2.86467%
Peak Memory Usage: -1.0654%
Total Memory Usage: -1.62369%

On Sat, Jun 24, 2023 at 6:41 AM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::is_const.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_const.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
> * g++.dg/ext/is_const.C: New test.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc |  3 +++
>  gcc/cp/cp-trait.def  |  1 +
>  gcc/cp/semantics.cc  |  4 
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
>  gcc/testsuite/g++.dg/ext/is_const.C  | 19 +++
>  5 files changed, 30 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..ff4ae831def 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_UNION:
>inform (loc, "  %qT is not a union", t1);
>break;
> +case CPTK_IS_CONST:
> +  inform (loc, "  %qT is not a const type", t1);
> +  break;
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..b40b475b86d 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..011ba8e46e1 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_ENUM:
>return type_code1 == ENUMERAL_TYPE;
>
> +case CPTK_IS_CONST:
> +  return CP_TYPE_CONST_P (type1);
> +
>  case CPTK_IS_FINAL:
>return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
>
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +case CPTK_IS_CONST:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..965309a333a 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_const)
> +# error "__has_builtin (__is_const) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_const.C 
> b/gcc/testsuite/g++.dg/ext/is_const.C
> new file mode 100644
> index 000..8f2d7c2fce9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_const.C
> @@ -0,0 +1,19 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +
> +// Positive tests.
> +SA(__is_const(const int));
> +SA(__is_const(const volatile int));
> +SA(__is_const(cClassType));
> +SA(__is_const(cvClassType));
> +
> +// Negative tests.
> +SA(!__is_const(int));
> +SA(!__is_const(volatile int));
> +SA(!__is_const(ClassType));
> +SA(!__is_const(vClassType));
> --
> 2.41.0
>


[PATCH v2 2/2] libstdc++: use new built-in trait __is_const

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __is_const.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_const): Use __is_const built-in trait.
(is_const_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..3a46eca5377 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -764,6 +764,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Type properties.
 
   /// is_const
+#if __has_builtin(__is_const)
+  template
+struct is_const
+: public __bool_constant<__is_const(_Tp)>
+{ };
+#else
   template
 struct is_const
 : public false_type { };
@@ -771,6 +777,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_const<_Tp const>
 : public true_type { };
+#endif
 
   /// is_volatile
   template
@@ -3210,10 +3217,17 @@ template 
   inline constexpr bool is_compound_v = is_compound<_Tp>::value;
 template 
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
+
+#if __has_builtin(__is_const)
+template 
+  inline constexpr bool is_const_v = __is_const(_Tp);
+#else
 template 
   inline constexpr bool is_const_v = false;
 template 
   inline constexpr bool is_const_v = true;
+#endif
+
 template 
   inline constexpr bool is_volatile_v = false;
 template 
-- 
2.41.0



[PATCH v2 1/2] c++: implement __is_const built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_const.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_const.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
* g++.dg/ext/is_const.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_const.C  | 19 +++
 5 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..ff4ae831def 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_CONST:
+  inform (loc, "  %qT is not a const type", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..b40b475b86d 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..011ba8e46e1 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_ENUM:
   return type_code1 == ENUMERAL_TYPE;
 
+case CPTK_IS_CONST:
+  return CP_TYPE_CONST_P (type1);
+
 case CPTK_IS_FINAL:
   return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_CONST:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..965309a333a 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_const)
+# error "__has_builtin (__is_const) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_const.C 
b/gcc/testsuite/g++.dg/ext/is_const.C
new file mode 100644
index 000..8f2d7c2fce9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_const.C
@@ -0,0 +1,19 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+// Positive tests.
+SA(__is_const(const int));
+SA(__is_const(const volatile int));
+SA(__is_const(cClassType));
+SA(__is_const(cvClassType));
+
+// Negative tests.
+SA(!__is_const(int));
+SA(!__is_const(volatile int));
+SA(!__is_const(ClassType));
+SA(!__is_const(vClassType));
-- 
2.41.0



Re: [PATCH] c++: Report invalid id-expression in decltype [PR100482]

2023-06-24 Thread Nathaniel Shead via Gcc-patches
On Fri, Jun 23, 2023 at 12:15:32PM -0400, Patrick Palka wrote:
> On Sun, 30 Apr 2023, Nathaniel Shead via Gcc-patches wrote:
> 
> > This patch ensures that any errors raised by finish_id_expression when
> > parsing a decltype expression are properly reported, rather than
> > potentially going ignored and causing invalid code to be accepted.
> > 
> > We can also now remove the separate check for templates without args as
> > this is also checked for in finish_id_expression.
> > 
> > PR 100482
> > 
> > gcc/cp/ChangeLog:
> > 
> > * parser.cc (cp_parser_decltype_expr): Report errors raised by
> > finish_id_expression.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/pr100482.C: New test.
> 
> LGTM.  Some minor comments about the new testcase below:
> 
> > 
> > Signed-off-by: Nathaniel Shead 
> > ---
> >  gcc/cp/parser.cc| 22 +++---
> >  gcc/testsuite/g++.dg/pr100482.C | 11 +++
> >  2 files changed, 22 insertions(+), 11 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/pr100482.C
> > 
> > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> > index e5f032f2330..20ebcdc3cfd 100644
> > --- a/gcc/cp/parser.cc
> > +++ b/gcc/cp/parser.cc
> > @@ -16508,10 +16508,6 @@ cp_parser_decltype_expr (cp_parser *parser,
> > expr = cp_parser_lookup_name_simple (parser, expr,
> >  id_expr_start_token->location);
> >  
> > -  if (expr && TREE_CODE (expr) == TEMPLATE_DECL)
> > -   /* A template without args is not a complete id-expression.  */
> > -   expr = error_mark_node;
> > -
> >if (expr
> >&& expr != error_mark_node
> >&& TREE_CODE (expr) != TYPE_DECL
> > @@ -16532,13 +16528,17 @@ cp_parser_decltype_expr (cp_parser *parser,
> > &error_msg,
> >id_expr_start_token->location));
> >  
> > -  if (expr == error_mark_node)
> > -/* We found an id-expression, but it was something that we
> > -   should not have found. This is an error, not something
> > -   we can recover from, so note that we found an
> > -   id-expression and we'll recover as gracefully as
> > -   possible.  */
> > -id_expression_or_member_access_p = true;
> > + if (error_msg)
> > +   {
> > + /* We found an id-expression, but it was something that we
> > +should not have found. This is an error, not something
> > +we can recover from, so report the error we found and
> > +we'll recover as gracefully as possible.  */
> > + cp_parser_parse_definitely (parser);
> > + cp_parser_error (parser, error_msg);
> > + id_expression_or_member_access_p = true;
> > + return error_mark_node;
> > +   }
> >  }
> >  
> >if (expr
> > diff --git a/gcc/testsuite/g++.dg/pr100482.C 
> > b/gcc/testsuite/g++.dg/pr100482.C
> > new file mode 100644
> > index 000..dcf6722fda5
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pr100482.C
> 
> We generally prefer to organize tests according to the language dialect
> they apply to and the langugae construct that they're primarily testing.
> In this case we could name the test e.g.
> 
>   gcc/testsuite/g++.dg/cpp0x/decltype-100482.C
> 
> > @@ -0,0 +1,11 @@
> > +// { dg-do compile { target c++10 } }
> 
> We also usually mention the PR number in the test as a comment:
> 
> // PR c++/100482
> 
> One benefit of doing so is that the git alias 'git gcc-commit-mklog'
> (https://gcc.gnu.org/gitwrite.html#vendor) will then automatically
> include the PR number in the commit message template.
> 
> > +
> > +namespace N {}
> > +decltype(std) x;   // { dg-error "expected primary-expression" }
> > +
> > +struct S {};
> > +decltype(S) y;  // { dg-error "argument to .decltype. must be an 
> > expression" }
> > +
> > +template 
> > +struct U {};
> > +decltype(U) z;  // { dg-error "missing template arguments" }
> > -- 
> > 2.40.0
> > 
> > 
> 

Thanks for the comments. I've fixed the test, does this look OK?

-- 8< --

This patch ensures that any errors raised by finish_id_expression when
parsing a decltype expression are properly reported, rather than
potentially going ignored and causing invalid code to be accepted.

We can also now remove the separate check for templates without args as
this is also checked for in finish_id_expression.

PR c++/100482

gcc/cp/ChangeLog:

* parser.cc (cp_parser_decltype_expr): Report errors raised by
finish_id_expression.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype-100482.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/parser.cc | 22 ++--
 gcc/testsuite/g++.dg/cpp0x/decltype-100482.C | 12 +++
 2 files changed, 23 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-100482.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index dd3665

Re: [PATCH] c++: Fix ICE with parameter pack of decltype(auto) [PR103497]

2023-06-24 Thread Nathaniel Shead via Gcc-patches
On Fri, Jun 23, 2023 at 11:59:51AM -0400, Patrick Palka wrote:
> Hi,
> 
> On Sat, 22 Apr 2023, Nathaniel Shead via Gcc-patches wrote:
> 
> > Bootstrapped and tested on x86_64-pc-linux-gnu.
> > 
> > -- 8< --
> > 
> > This patch raises an error early when the decltype(auto) specifier is
> > used as a parameter of a function. This prevents any issues with an
> > unexpected tree type later on when performing the call.
> 
> Thanks very much for the patch!  Some minor comments below.
> 
> > 
> > PR 103497
> 
> We should include the bug component name when referring to the PR in the
> commit message (i.e. PR c++/103497) so that upon pushing the patch the
> post-commit hook automatically adds a comment to the PR reffering to the
> commit.  I could be wrong but AFAIK the hook only performs this when the
> component name is included.

Thanks for the review! Fixed.

> > 
> > gcc/cp/ChangeLog:
> > 
> > * parser.cc (cp_parser_simple_type_specifier): Add check for
> > decltype(auto) as function parameter.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/pr103497.C: New test.
> > 
> > Signed-off-by: Nathaniel Shead 
> > ---
> >  gcc/cp/parser.cc| 10 ++
> >  gcc/testsuite/g++.dg/pr103497.C |  7 +++
> >  2 files changed, 17 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/pr103497.C
> > 
> > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> > index e5f032f2330..1415e07e152 100644
> > --- a/gcc/cp/parser.cc
> > +++ b/gcc/cp/parser.cc
> > @@ -19884,6 +19884,16 @@ cp_parser_simple_type_specifier (cp_parser* parser,
> >&& cp_lexer_peek_nth_token (parser->lexer, 2)->type != CPP_SCOPE)
> >  {
> >type = saved_checks_value (token->u.tree_check_value);
> > +  /* Within a function parameter declaration, decltype(auto) is always 
> > an
> > +error.  */
> > +  if (parser->auto_is_implicit_function_template_parm_p
> > + && TREE_CODE (type) == TEMPLATE_TYPE_PARM
> 
> We could check is_auto (type) here instead, to avoid any confusion with
> checking AUTO_IS_DECLTYPE for a non-auto TEMPLATE_TYPE_PARM.
> 
> > + && AUTO_IS_DECLTYPE (type))
> > +   {
> > + error_at (token->location,
> > +   "cannot declare a parameter with %");
> > + type = error_mark_node;
> > +   }
> >if (decl_specs)
> > {
> >   cp_parser_set_decl_spec_type (decl_specs, type,
> > diff --git a/gcc/testsuite/g++.dg/pr103497.C 
> > b/gcc/testsuite/g++.dg/pr103497.C
> > new file mode 100644
> > index 000..bcd421c2907
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pr103497.C
> > @@ -0,0 +1,7 @@
> > +// { dg-do compile { target c++14 } }
> > +
> > +void foo(decltype(auto)... args);  // { dg-error "parameter with 
> > .decltype.auto..|no parameter packs" }
> 
> I noticed for
> 
>   void foo(decltype(auto) arg);
> 
> we already issue an identical error from grokdeclarator.  Perhaps we could
> instead extend the error handling there to detect decltype(auto)... as well,
> rather than adding new error handling in cp_parser_simple_type_specifier?

Ah thanks, I didn't notice this; this simplifies the change a fair bit.
How about this patch instead?

Regtested on x86_64-pc-linux-gnu.

-- 8< --

This patch ensures that checks for usages of 'auto' in function
parameters also consider parameter packs, since 'type_uses_auto' does
not seem to consider this case.

PR c++/103497

gcc/cp/ChangeLog:

* decl.cc (grokdeclarator): Check for decltype(auto) in
parameter pack.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto-103497.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/decl.cc| 3 +++
 gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C | 8 
 2 files changed, 11 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 60f107d50c4..aaf691fce68 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -14044,6 +14044,9 @@ grokdeclarator (const cp_declarator *declarator,
error ("cannot use %<::%> in parameter declaration");
 
   tree auto_node = type_uses_auto (type);
+  if (!auto_node && parameter_pack_p)
+   auto_node = type_uses_auto (PACK_EXPANSION_PATTERN (type));
+
   if (auto_node && !(cxx_dialect >= cxx17 && template_parm_flag))
{
  if (cxx_dialect >= cxx14)
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C 
b/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C
new file mode 100644
index 000..cedd661710c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C
@@ -0,0 +1,8 @@
+// PR c++/103497
+// { dg-do compile { target c++14 } }
+
+void foo(decltype(auto)... args);  // { dg-error "cannot declare a parameter 
with .decltype.auto.." }
+
+int main() {
+  foo();
+}
-- 
2.41.0



[Patch, fortran] PR49213 - [OOP] gfortran rejects structure constructor expression

2023-06-24 Thread Paul Richard Thomas via Gcc-patches
Hi All,

I was looking through Neil Carlson's collection of gfortran bugs and
was shocked to find this rather fundamental PR. At 12 years old, it is
certainly a "golden oldie"!

The patch is rather straightforward and seems to do the job of
admitting derived, intrinsic and character expressions to allocatable
class components in structure constructors.

I have included the adjustment to 'gfc_is_ptr_fcn' and eliminating the
extra blank line, introduced by my last patch. I played safe and went
exclusively for class functions with attr.class_pointer set on the
grounds that these have had all the accoutrements checked and built
(ie. class_ok). I am still not sure if this is necessary or not.

OK for trunk?

Paul

Fortran: Enable class expressions in structure constructors [PR49213]

2023-06-24  Paul Thomas  

gcc/fortran
PR fortran/49213
* expr.cc (gfc_is_ptr_fcn): Guard pointer attribute to exclude
class expressions.
* resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow
associate names with pointer function targets to be used in
variable definition context.
* trans-decl.cc (get_symbol_decl): Remove extraneous line.
* trans-expr.cc (alloc_scalar_allocatable_subcomponent): Obtain
size of intrinsic and character expressions.
(gfc_trans_subcomponent_assign): Expand assignment to class
components to include intrinsic and character expressions.

gcc/testsuite/
PR fortran/49213
* gfortran.dg/pr49213.f90 : New test
diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc
index c960dfeabd9..92061d69781 100644
--- a/gcc/fortran/expr.cc
+++ b/gcc/fortran/expr.cc
@@ -816,7 +816,7 @@ bool
 gfc_is_ptr_fcn (gfc_expr *e)
 {
   return e != NULL && e->expr_type == EXPR_FUNCTION
-	  && (gfc_expr_attr (e).pointer
+	  && ((e->ts.type != BT_CLASS && gfc_expr_attr (e).pointer)
 		  || (e->ts.type == BT_CLASS
 		  && CLASS_DATA (e)->attr.class_pointer));
 }
diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 82e6ac53aa1..217d69d4e0b 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1350,6 +1350,9 @@ resolve_structure_cons (gfc_expr *expr, int init)
 	  && CLASS_DATA (comp)->as)
  	rank = CLASS_DATA (comp)->as->rank;
 
+  if (comp->ts.type == BT_CLASS && cons->expr->ts.type == BT_DERIVED)
+	  gfc_find_derived_vtab (cons->expr->ts.u.derived);
+
   if (cons->expr->expr_type != EXPR_NULL && rank != cons->expr->rank
 	  && (comp->attr.allocatable || cons->expr->rank))
 	{
@@ -1381,7 +1384,7 @@ resolve_structure_cons (gfc_expr *expr, int init)
 			 gfc_basic_typename (comp->ts.type));
 	  t = false;
 	}
-	  else
+	  else if (!UNLIMITED_POLY (comp))
 	{
 	  bool t2 = gfc_convert_type (cons->expr, &comp->ts, 1);
 	  if (t)
diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 18589e17843..b0fd25e92a3 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -1915,7 +1915,6 @@ gfc_get_symbol_decl (gfc_symbol * sym)
 	gcc_assert (!sym->value || sym->value->expr_type == EXPR_NULL);
 }
 
-
   gfc_finish_var_decl (decl, sym);
 
   if (sym->ts.type == BT_CHARACTER)
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 3c209bcde97..5a1ff0c1d21 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -8781,6 +8781,7 @@ alloc_scalar_allocatable_subcomponent (stmtblock_t *block, tree comp,
   tree size;
   tree size_in_bytes;
   tree lhs_cl_size = NULL_TREE;
+  gfc_se se;
 
   if (!comp)
 return;
@@ -8815,16 +8816,26 @@ alloc_scalar_allocatable_subcomponent (stmtblock_t *block, tree comp,
 }
   else if (cm->ts.type == BT_CLASS)
 {
-  gcc_assert (expr2->ts.type == BT_CLASS || expr2->ts.type == BT_DERIVED);
-  if (expr2->ts.type == BT_DERIVED)
+  if (expr2->ts.type != BT_CLASS)
 	{
-	  tmp = gfc_get_symbol_decl (expr2->ts.u.derived);
-	  size = TYPE_SIZE_UNIT (tmp);
+	  if (expr2->ts.type == BT_CHARACTER)
+	{
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr (&se, expr2);
+	  size = fold_convert (size_type_node, se.string_length);
+	}
+	  else
+	{
+	  if (expr2->ts.type == BT_DERIVED)
+		tmp = gfc_get_symbol_decl (expr2->ts.u.derived);
+	  else
+		tmp = gfc_typenode_for_spec (&expr2->ts);
+	  size = TYPE_SIZE_UNIT (tmp);
+	}
 	}
   else
 	{
 	  gfc_expr *e2vtab;
-	  gfc_se se;
 	  e2vtab = gfc_find_and_cut_at_last_class_ref (expr2);
 	  gfc_add_vptr_component (e2vtab);
 	  gfc_add_size_component (e2vtab);
@@ -8975,6 +8986,7 @@ gfc_trans_subcomponent_assign (tree dest, gfc_component * cm,
 {
   gfc_init_se (&se, NULL);
   gfc_conv_expr (&se, expr);
+  tree size;
 
   /* Take care about non-array allocatable components here.  The alloc_*
 	 routine below is motivated by the alloc_scalar_allocatable_for_
@@ -8990,7 +9002,7 @@ gfc_trans_subcomponent_assign (tree dest, gfc_component * cm,
 	  && expr->symtree->n.sym->attr.dummy)
 	se.expr = build_fold_indirect_ref_loc (input_location, se.expr);
 
-  if (cm->ts.type =

RE: [PATCH V2] LOOP IVOPTS: Apply LEN_MASK_{LOAD,STORE}

2023-06-24 Thread Li, Pan2 via Gcc-patches
Committed as passed bootstrap and regression test, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Saturday, June 24, 2023 9:29 AM
To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de; richard.sandif...@arm.com
Subject: Re: [PATCH V2] LOOP IVOPTS: Apply LEN_MASK_{LOAD,STORE}



On 6/23/23 17:41, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong 
> 
> Hi, Jeff. I fix format as you suggested.
> Ok for trunk ?
> 
> gcc/ChangeLog:
> 
>  * tree-ssa-loop-ivopts.cc (get_mem_type_for_internal_fn): Apply 
> LEN_MASK_{LOAD,STORE}.
Yes.  Sorry I wasn't explicit that it was OK with the formatting change.

Jeff


Re: [PATCH 1/2] c++: implement __is_const built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
Sure. Thank you.

On Sat, Jun 24, 2023 at 5:14 AM Xi Ruoyao  wrote:

> Please use [PATCH v3 1/2] next time, now it's not easy to find the
> latest version of the series (I'm not sure if the number "3" is
> correct).
>
> On Sat, 2023-06-24 at 03:38 -0700, Ken Matsui via Gcc-patches wrote:
> > This patch implements built-in trait for std::is_const.
> >
> > gcc/cp/ChangeLog:
> >
> > * cp-trait.def: Define __is_const.
> > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
> > * semantics.cc (trait_expr_value): Likewise.
> > (finish_trait_expr): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
> > * g++.dg/ext/is_const.C: New test.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/cp/constraint.cc |  3 +++
> >  gcc/cp/cp-trait.def  |  1 +
> >  gcc/cp/semantics.cc  |  4 
> >  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
> >  gcc/testsuite/g++.dg/ext/is_const.C  | 19 +++
> >  5 files changed, 30 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C
> >
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 8cf0f2d0974..ff4ae831def 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
> >  case CPTK_IS_UNION:
> >inform (loc, "  %qT is not a union", t1);
> >break;
> > +case CPTK_IS_CONST:
> > +  inform (loc, "  %qT is not a const type", t1);
> > +  break;
> >  case CPTK_IS_AGGREGATE:
> >inform (loc, "  %qT is not an aggregate", t1);
> >break;
> > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> > index 8b7fece0cc8..b40b475b86d 100644
> > --- a/gcc/cp/cp-trait.def
> > +++ b/gcc/cp/cp-trait.def
> > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE,
> > "__is_trivially_assignable", 2)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE,
> > "__is_trivially_constructible", -1)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
> >  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> > +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
> >  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY,
> > "__reference_constructs_from_temporary", 2)
> >  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY,
> > "__reference_converts_from_temporary", 2)
> >  /* FIXME Added space to avoid direct usage in GCC 13.  */
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 8fb47fd179e..011ba8e46e1 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree
> > type1, tree type2)
> >  case CPTK_IS_ENUM:
> >return type_code1 == ENUMERAL_TYPE;
> >
> > +case CPTK_IS_CONST:
> > +  return CP_TYPE_CONST_P (type1);
> > +
> >  case CPTK_IS_FINAL:
> >return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
> >
> > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc,
> > cp_trait_kind kind, tree type1, tree type2)
> >  case CPTK_IS_ENUM:
> >  case CPTK_IS_UNION:
> >  case CPTK_IS_SAME:
> > +case CPTK_IS_CONST:
> >break;
> >
> >  case CPTK_IS_LAYOUT_COMPATIBLE:
> > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > index f343e153e56..965309a333a 100644
> > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > @@ -146,3 +146,6 @@
> >  #if !__has_builtin (__remove_cvref)
> >  # error "__has_builtin (__remove_cvref) failed"
> >  #endif
> > +#if !__has_builtin (__is_const)
> > +# error "__has_builtin (__is_const) failed"
> > +#endif
> > diff --git a/gcc/testsuite/g++.dg/ext/is_const.C
> > b/gcc/testsuite/g++.dg/ext/is_const.C
> > new file mode 100644
> > index 000..8f2d7c2fce9
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/ext/is_const.C
> > @@ -0,0 +1,19 @@
> > +// { dg-do compile { target c++11 } }
> > +
> > +#include 
> > +
> > +using namespace __gnu_test;
> > +
> > +#define SA(X) static_assert((X),#X)
> > +
> > +// Positive tests.
> > +SA(__is_const(const int));
> > +SA(__is_const(const volatile int));
> > +SA(__is_const(cClassType));
> > +SA(__is_const(cvClassType));
> > +
> > +// Negative tests.
> > +SA(!__is_const(int));
> > +SA(!__is_const(volatile int));
> > +SA(!__is_const(ClassType));
> > +SA(!__is_const(vClassType));
>
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University
>
-- 
Sincerely,
Ken Matsui


Re: [PATCH 1/2] c++: implement __is_const built-in trait

2023-06-24 Thread Xi Ruoyao via Gcc-patches
Please use [PATCH v3 1/2] next time, now it's not easy to find the
latest version of the series (I'm not sure if the number "3" is
correct).

On Sat, 2023-06-24 at 03:38 -0700, Ken Matsui via Gcc-patches wrote:
> This patch implements built-in trait for std::is_const.
> 
> gcc/cp/ChangeLog:
> 
> * cp-trait.def: Define __is_const.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
> * g++.dg/ext/is_const.C: New test.
> 
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc |  3 +++
>  gcc/cp/cp-trait.def  |  1 +
>  gcc/cp/semantics.cc  |  4 
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
>  gcc/testsuite/g++.dg/ext/is_const.C  | 19 +++
>  5 files changed, 30 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C
> 
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..ff4ae831def 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_UNION:
>    inform (loc, "  %qT is not a union", t1);
>    break;
> +    case CPTK_IS_CONST:
> +  inform (loc, "  %qT is not a const type", t1);
> +  break;
>  case CPTK_IS_AGGREGATE:
>    inform (loc, "  %qT is not an aggregate", t1);
>    break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..b40b475b86d 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE,
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE,
> "__is_trivially_constructible", -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY,
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY,
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..011ba8e46e1 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree
> type1, tree type2)
>  case CPTK_IS_ENUM:
>    return type_code1 == ENUMERAL_TYPE;
>  
> +    case CPTK_IS_CONST:
> +  return CP_TYPE_CONST_P (type1);
> +
>  case CPTK_IS_FINAL:
>    return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
>  
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc,
> cp_trait_kind kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +    case CPTK_IS_CONST:
>    break;
>  
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..965309a333a 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_const)
> +# error "__has_builtin (__is_const) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_const.C
> b/gcc/testsuite/g++.dg/ext/is_const.C
> new file mode 100644
> index 000..8f2d7c2fce9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_const.C
> @@ -0,0 +1,19 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +
> +// Positive tests.
> +SA(__is_const(const int));
> +SA(__is_const(const volatile int));
> +SA(__is_const(cClassType));
> +SA(__is_const(cvClassType));
> +
> +// Negative tests.
> +SA(!__is_const(int));
> +SA(!__is_const(volatile int));
> +SA(!__is_const(ClassType));
> +SA(!__is_const(vClassType));

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH 2/2] libstdc++: use new built-in trait __is_const

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __is_const.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_const): Use __is_const built-in trait.
(is_const_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..3a46eca5377 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -764,6 +764,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Type properties.
 
   /// is_const
+#if __has_builtin(__is_const)
+  template
+struct is_const
+: public __bool_constant<__is_const(_Tp)>
+{ };
+#else
   template
 struct is_const
 : public false_type { };
@@ -771,6 +777,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_const<_Tp const>
 : public true_type { };
+#endif
 
   /// is_volatile
   template
@@ -3210,10 +3217,17 @@ template 
   inline constexpr bool is_compound_v = is_compound<_Tp>::value;
 template 
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
+
+#if __has_builtin(__is_const)
+template 
+  inline constexpr bool is_const_v = __is_const(_Tp);
+#else
 template 
   inline constexpr bool is_const_v = false;
 template 
   inline constexpr bool is_const_v = true;
+#endif
+
 template 
   inline constexpr bool is_volatile_v = false;
 template 
-- 
2.41.0



[PATCH 1/2] c++: implement __is_const built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_const.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_const.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
* g++.dg/ext/is_const.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_const.C  | 19 +++
 5 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..ff4ae831def 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_CONST:
+  inform (loc, "  %qT is not a const type", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..b40b475b86d 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..011ba8e46e1 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_ENUM:
   return type_code1 == ENUMERAL_TYPE;
 
+case CPTK_IS_CONST:
+  return CP_TYPE_CONST_P (type1);
+
 case CPTK_IS_FINAL:
   return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_CONST:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..965309a333a 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_const)
+# error "__has_builtin (__is_const) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_const.C 
b/gcc/testsuite/g++.dg/ext/is_const.C
new file mode 100644
index 000..8f2d7c2fce9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_const.C
@@ -0,0 +1,19 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+// Positive tests.
+SA(__is_const(const int));
+SA(__is_const(const volatile int));
+SA(__is_const(cClassType));
+SA(__is_const(cvClassType));
+
+// Negative tests.
+SA(!__is_const(int));
+SA(!__is_const(volatile int));
+SA(!__is_const(ClassType));
+SA(!__is_const(vClassType));
-- 
2.41.0



[PATCH 2/2] libstdc++: use new built-in trait __remove_pointer

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __remove_pointer.

libstdc++-v3/ChangeLog:

* include/std/type_traits (remove_pointer): Use __remove_pointer 
built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..81497e2f3e1 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2023,6 +2023,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Pointer modifications.
 
+  /// remove_pointer
+#if __has_builtin(__remove_pointer)
+  template
+struct remove_pointer
+{ using type = __remove_pointer(_Tp); };
+#else
   template
 struct __remove_pointer_helper
 { using type = _Tp; };
@@ -2031,11 +2037,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __remove_pointer_helper<_Tp, _Up*>
 { using type = _Up; };
 
-  /// remove_pointer
   template
 struct remove_pointer
 : public __remove_pointer_helper<_Tp, __remove_cv_t<_Tp>>
 { };
+#endif
 
   template
 struct __add_pointer_helper
-- 
2.41.0



[PATCH 1/2] c++: implement __remove_pointer built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::remove_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __remove_pointer.
* semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __remove_pointer.
* g++.dg/ext/remove_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  5 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
 gcc/testsuite/g++.dg/ext/remove_pointer.C | 51 +++
 4 files changed, 60 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/remove_pointer.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..07823e55579 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -90,6 +90,7 @@ DEFTRAIT_EXPR (IS_DEDUCIBLE, "__is_deducible ", 2)
 DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
 DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1)
 DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1)
+DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1)
 DEFTRAIT_TYPE (UNDERLYING_TYPE,  "__underlying_type", 1)
 DEFTRAIT_TYPE (TYPE_PACK_ELEMENT, "__type_pack_element", -1)
 
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..a3b283ce938 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12374,6 +12374,11 @@ finish_trait_type (cp_trait_kind kind, tree type1, 
tree type2,
type1 = TREE_TYPE (type1);
   return cv_unqualified (type1);
 
+case CPTK_REMOVE_POINTER:
+  if (TYPE_PTR_P (type1))
+type1 = TREE_TYPE (type1);
+  return type1;
+
 case CPTK_TYPE_PACK_ELEMENT:
   return finish_type_pack_element (type1, type2, complain);
 
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..e21e0a95509 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__remove_pointer)
+# error "__has_builtin (__remove_pointer) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/remove_pointer.C 
b/gcc/testsuite/g++.dg/ext/remove_pointer.C
new file mode 100644
index 000..7b13db93950
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/remove_pointer.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+SA(__is_same(__remove_pointer(int), int));
+SA(__is_same(__remove_pointer(int*), int));
+SA(__is_same(__remove_pointer(int**), int*));
+
+SA(__is_same(__remove_pointer(const int*), const int));
+SA(__is_same(__remove_pointer(const int**), const int*));
+SA(__is_same(__remove_pointer(int* const), int));
+SA(__is_same(__remove_pointer(int** const), int*));
+SA(__is_same(__remove_pointer(int* const* const), int* const));
+
+SA(__is_same(__remove_pointer(volatile int*), volatile int));
+SA(__is_same(__remove_pointer(volatile int**), volatile int*));
+SA(__is_same(__remove_pointer(int* volatile), int));
+SA(__is_same(__remove_pointer(int** volatile), int*));
+SA(__is_same(__remove_pointer(int* volatile* volatile), int* volatile));
+
+SA(__is_same(__remove_pointer(const volatile int*), const volatile int));
+SA(__is_same(__remove_pointer(const volatile int**), const volatile int*));
+SA(__is_same(__remove_pointer(const int* volatile), const int));
+SA(__is_same(__remove_pointer(volatile int* const), volatile int));
+SA(__is_same(__remove_pointer(int* const volatile), int));
+SA(__is_same(__remove_pointer(const int** volatile), const int*));
+SA(__is_same(__remove_pointer(volatile int** const), volatile int*));
+SA(__is_same(__remove_pointer(int** const volatile), int*));
+SA(__is_same(__remove_pointer(int* const* const volatile), int* const));
+SA(__is_same(__remove_pointer(int* volatile* const volatile), int* volatile));
+SA(__is_same(__remove_pointer(int* const volatile* const volatile), int* const 
volatile));
+
+SA(__is_same(__remove_pointer(int&), int&));
+SA(__is_same(__remove_pointer(const int&), const int&));
+SA(__is_same(__remove_pointer(volatile int&), volatile int&));
+SA(__is_same(__remove_pointer(const volatile int&), const volatile int&));
+
+SA(__is_same(__remove_pointer(int&&), int&&));
+SA(__is_same(__remove_pointer(const int&&), const int&&));
+SA(__is_same(__remove_pointer(volatile int&&), volatile int&&));
+SA(__is_same(__remove_pointer(const volatile int&&), const volatile int&&));
+
+SA(__is_same(__remove_pointer(int[3]), int[3]));
+SA(__is_same(__remove_pointer(const int[3]), const int[3]));
+SA(__is_same(__remove_pointer(volatile int[3]), volatile int[3]));
+SA(__is_same(__remove_pointer(const volatile int[3]), const volatile int[3]));
+
+SA(__is_same(__remove_pointer(int(int)), int(int)));
+SA(__is_same(__remove_pointer(int(*const)(int)), int(int)));
+SA(__is_same(__remove_pointer(int(*volatile)(int))

Re: [PATCH 1/2] c++: implement __remove_pointer built-in trait

2023-06-24 Thread Ken Matsui via Gcc-patches
On Tue, Jun 20, 2023 at 8:22 AM Patrick Palka  wrote:
>
> On Sat, 17 Jun 2023, Ken Matsui via Gcc-patches wrote:
>
> > Hi,
> >
> > I conducted a benchmark for remove_pointer as well as is_object. Just
> > like the is_object benchmark, here is the benchmark code:
> >
> > https://github.com/ken-matsui/gcc-benches/blob/main/remove_pointer_benchmark.cc
> >
> > On my computer, using the gcc HEAD of this patch for a release build,
> > the patch with -DUSE_BUILTIN took 8.7% less time and used 4.3-4.9%
> > less memory on average compared to not using it. Although the
> > performance improvement was not as significant as with is_object, the
> > benchmark demonstrated that the compilation was consistently more
> > efficient.
>
> Thanks for the benchmark.  The improvement is lesser than I expected,
> but that might be because the benchmark is "biased":
>
>   template 
>   struct Instantiator : Instantiator {
>   static_assert(!std::is_pointer_v>);
>   };
>
> This only invokes remove_pointer_t on the non-pointer type Instantiator,
> and so the benchmark doesn't factor in the performance of the trait when
> invoked on pointer types, and traits typically will have different
> performance characteristics depending on the kind of type it's given.
>
> To more holistically assess the real-world performance of the trait the
> benchmark should also consider pointer types and maybe also cv-qualified
> types (given that the original implementation is in terms of
> __remove_cv_t and thus performance of the original implementation may be
> sensitive to cv-qualification).  So we should probably uniformly
> benchmark these classes of types, via doing e.g.:
>
>   static_assert(!std::is_pointer_v>);
>   static_assert(!std::is_pointer_v>);
>   static_assert(!std::is_pointer_v>);
>   static_assert(!std::is_pointer_v>);
>
> (We could consider other kinds of types too, e.g. reference types and
> integral types, but it seems clear based on the implementations being
> benchmarked that performance won't be sensitive to reference-ness
> or integral-ness.)

Thank you for your review! I totally agree with your opinion that the
benchmark should have been exhaustive. I conducted a benchmark for
this new change:

https://github.com/ken-matsui/gcc-benches/blob/main/remove_pointer.md#tue-jun-20-030313-pm-pdt-2023

Time: -27.918%
Peak Memory Usage: -19.0755%
Total Memory Usage: -20.0199%

> >
> > Sincerely,
> > Ken Matsui
> >
> > On Thu, Jun 15, 2023 at 5:22 AM Ken Matsui  
> > wrote:
> > >
> > > This patch implements built-in trait for std::remove_pointer.
> > >
> > > gcc/cp/ChangeLog:
> > >
> > > * cp-trait.def: Define __remove_pointer.
> > > * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * g++.dg/ext/has-builtin-1.C: Test existence of __remove_pointer.
> > > * g++.dg/ext/remove_pointer.C: New test.
> > >
> > > Signed-off-by: Ken Matsui 
> > > ---
> > >  gcc/cp/cp-trait.def   |  1 +
> > >  gcc/cp/semantics.cc   |  4 ++
> > >  gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
> > >  gcc/testsuite/g++.dg/ext/remove_pointer.C | 51 +++
> > >  4 files changed, 59 insertions(+)
> > >  create mode 100644 gcc/testsuite/g++.dg/ext/remove_pointer.C
> > >
> > > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> > > index 8b7fece0cc8..07823e55579 100644
> > > --- a/gcc/cp/cp-trait.def
> > > +++ b/gcc/cp/cp-trait.def
> > > @@ -90,6 +90,7 @@ DEFTRAIT_EXPR (IS_DEDUCIBLE, "__is_deducible ", 2)
> > >  DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
> > >  DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1)
> > >  DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1)
> > > +DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1)
> > >  DEFTRAIT_TYPE (UNDERLYING_TYPE,  "__underlying_type", 1)
> > >  DEFTRAIT_TYPE (TYPE_PACK_ELEMENT, "__type_pack_element", -1)
> > >
> > > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > > index 8fb47fd179e..885c7a6fb64 100644
> > > --- a/gcc/cp/semantics.cc
> > > +++ b/gcc/cp/semantics.cc
> > > @@ -12373,6 +12373,10 @@ finish_trait_type (cp_trait_kind kind, tree 
> > > type1, tree type2,
> > >if (TYPE_REF_P (type1))
> > > type1 = TREE_TYPE (type1);
> > >return cv_unqualified (type1);
> > > +case CPTK_REMOVE_POINTER:
> > > +  if (TYPE_PTR_P (type1))
> > > +type1 = TREE_TYPE (type1);
> > > +  return type1;
>
> Maybe add a newline before the 'case' to visually separate it from the
> previous 'case'?  LGTM otherwise, thanks!

Will do! Thank you!

> > >
> > >  case CPTK_TYPE_PACK_ELEMENT:
> > >return finish_type_pack_element (type1, type2, complain);
> > > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> > > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > > index f343e153e56..e21e0a95509 100644
> > > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > > @@ -1