Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting
On Sun, Jun 25, 2023 at 2:35 PM Hongtao Liu wrote: > > On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich wrote: > > > > On 25.06.2023 07:12, Hongtao Liu wrote: > > > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches > > > wrote: > > >> > > >> --- > > >> For the purpose here (and elsewhere) bcst_vector_operand() (really: > > >> bcst_mem_operand()) isn't permissive enough: We'd want it to allow > > >> 128-bit and 256-bit types as well irrespective of AVX512VL being > > >> enabled. This would likely require a new predicate > > >> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name > > >> selection it will want considering that this is applicable to certain > > >> non-calculational FP operations as well.) > > > I think so. > > > > Any preference towards predicate and constraint naming? > something like bcst_mem_operand_$suffiix, $suffix indicates the > pattern may use zmm instruction for 128/256-bit operand. > maybe just bcst_mem_operand_zmm? For constraint, maybe we can reuse Br, relax Br to match bcst_mem_operand_zmm. For those original patterns with bcst_mem_operand, it should be ok since it's already guarded by the predicate, the constraint must be valid. > > > > > Plus I think there's a more general question behind this: A new > > predicate / constraint pair is likely just one way of dealing > > with the issue. Another would appear to be to remove the > > restriction of 128- and 256-byte types when AVX512VL is not > > enabled, but AVX512F is. While that would require touching a > > lot of insn constraints, it looks as if lifting that restriction > > would "merely" require much wider use of Yv where v is used > > right now. But of course I may well be unaware of (some of) the > > reasons why that restriction was put in place in the first place > > (it can't really be the lack of suitable move insns, as those > > can be synthesized by using e.g. vextract{32,64}x4). > Also be careful of SIMD Floating-Point Exception if we use the zmm > version for those arithmetic instructions, the upper bits need to be > explicitly cleared for 128/256-bit operand. > For pternlog or other logic instructions, it's ok since there's no > SIMD Floating-Point Exception for such instructions. > > > > > Jan > > > > -- > BR, > Hongtao -- BR, Hongtao
Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting
On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich wrote: > > On 25.06.2023 07:12, Hongtao Liu wrote: > > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches > > wrote: > >> > >> --- > >> For the purpose here (and elsewhere) bcst_vector_operand() (really: > >> bcst_mem_operand()) isn't permissive enough: We'd want it to allow > >> 128-bit and 256-bit types as well irrespective of AVX512VL being > >> enabled. This would likely require a new predicate > >> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name > >> selection it will want considering that this is applicable to certain > >> non-calculational FP operations as well.) > > I think so. > > Any preference towards predicate and constraint naming? something like bcst_mem_operand_$suffiix, $suffix indicates the pattern may use zmm instruction for 128/256-bit operand. maybe just bcst_mem_operand_zmm? > > Plus I think there's a more general question behind this: A new > predicate / constraint pair is likely just one way of dealing > with the issue. Another would appear to be to remove the > restriction of 128- and 256-byte types when AVX512VL is not > enabled, but AVX512F is. While that would require touching a > lot of insn constraints, it looks as if lifting that restriction > would "merely" require much wider use of Yv where v is used > right now. But of course I may well be unaware of (some of) the > reasons why that restriction was put in place in the first place > (it can't really be the lack of suitable move insns, as those > can be synthesized by using e.g. vextract{32,64}x4). Also be careful of SIMD Floating-Point Exception if we use the zmm version for those arithmetic instructions, the upper bits need to be explicitly cleared for 128/256-bit operand. For pternlog or other logic instructions, it's ok since there's no SIMD Floating-Point Exception for such instructions. > > Jan -- BR, Hongtao
Re: [PATCH 4/5] x86: further PR target/100711-like splitting
On Sun, Jun 25, 2023 at 2:16 PM Jan Beulich wrote: > > On 25.06.2023 07:06, Hongtao Liu wrote: > > On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches > > wrote: > >> > >> With respective two-operand bitwise operations now expressable by a > >> single VPTERNLOG, add splitters to also deal with ior and xor > >> counterparts of the original and-only case. Note that the splitters need > >> to be separate, as the placement of "not" differs in the final insns > >> (*iornot3, *xnor3) which are intended to pick up one half of > >> the result. > >> > >> gcc/ > >> > >> * config/i386/sse.md: New splitters to simplify > >> not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}. > >> > >> gcc/testsuite/ > >> > >> * gcc.target/i386/pr100711-4.c: New test. > >> * gcc.target/i386/pr100711-5.c: New test. > >> > >> --- a/gcc/config/i386/sse.md > >> +++ b/gcc/config/i386/sse.md > >> @@ -17366,6 +17366,36 @@ > >> (match_dup 2)))] > >>"operands[3] = gen_reg_rtx (mode);") > >> > >> +(define_split > >> + [(set (match_operand:VI 0 "register_operand") > >> + (ior:VI > >> + (vec_duplicate:VI > >> + (not: > >> + (match_operand: 1 "nonimmediate_operand"))) > >> + (match_operand:VI 2 "vector_operand")))] > >> + " == 64 || TARGET_AVX512VL > >> + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)" > >> + [(set (match_dup 3) > >> + (vec_duplicate:VI (match_dup 1))) > >> + (set (match_dup 0) > >> + (ior:VI (not:VI (match_dup 3)) (match_dup 2)))] > >> + "operands[3] = gen_reg_rtx (mode);") > >> + > >> +(define_split > >> + [(set (match_operand:VI 0 "register_operand") > >> + (xor:VI > >> + (vec_duplicate:VI > >> + (not: > >> + (match_operand: 1 "nonimmediate_operand"))) > >> + (match_operand:VI 2 "vector_operand")))] > >> + " == 64 || TARGET_AVX512VL > >> + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)" > >> + [(set (match_dup 3) > >> + (vec_duplicate:VI (match_dup 1))) > >> + (set (match_dup 0) > >> + (not:VI (xor:VI (match_dup 3) (match_dup 2] > >> + "operands[3] = gen_reg_rtx (mode);") > >> + > > Can we merge this splitter(xor:not) into ior:not one with a code > > iterator for xor,ior, They look the same except for the xor/ior. > > They're only almost the same: Note (ior (not )) vs (not (xor )) as > the result of the splitting. The difference is necessary to fit > with what patch 1 introduces (which in turn is the way it is to > fit with what generic code transforms things to up front). (I had > it the way you suggest initially, until I figured why one of the > two would end up never being used.) > 3597 /* Convert (XOR (NOT x) (NOT y)) to (XOR x y). 3598 Also convert (XOR (NOT x) y) to (NOT (XOR x y)), similarly for 3599 (NOT y). */ 3600 { 3601int num_negated = 0; 3602 3603if (GET_CODE (op0) == NOT) 3604 num_negated++, op0 = XEXP (op0, 0); 3605if (GET_CODE (op1) == NOT) 3606 num_negated++, op1 = XEXP (op1, 0); It looks simplify_rtx plays the trick. And it's documented. 8602@cindex @code{xor}, canonicalization of 8603@item 8604The only possible RTL expressions involving both bitwise exclusive-or 8605and bitwise negation are @code{(xor:@var{m} @var{x} @var{y})} 8606and @code{(not:@var{m} (xor:@var{m} @var{x} @var{y}))}. Then the original patch LGTM. > Jan -- BR, Hongtao
Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting
On 25.06.2023 07:12, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches > wrote: >> >> --- >> For the purpose here (and elsewhere) bcst_vector_operand() (really: >> bcst_mem_operand()) isn't permissive enough: We'd want it to allow >> 128-bit and 256-bit types as well irrespective of AVX512VL being >> enabled. This would likely require a new predicate >> (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name >> selection it will want considering that this is applicable to certain >> non-calculational FP operations as well.) > I think so. Any preference towards predicate and constraint naming? Plus I think there's a more general question behind this: A new predicate / constraint pair is likely just one way of dealing with the issue. Another would appear to be to remove the restriction of 128- and 256-byte types when AVX512VL is not enabled, but AVX512F is. While that would require touching a lot of insn constraints, it looks as if lifting that restriction would "merely" require much wider use of Yv where v is used right now. But of course I may well be unaware of (some of) the reasons why that restriction was put in place in the first place (it can't really be the lack of suitable move insns, as those can be synthesized by using e.g. vextract{32,64}x4). Jan
Re: [PATCH 4/5] x86: further PR target/100711-like splitting
On 25.06.2023 07:06, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches > wrote: >> >> With respective two-operand bitwise operations now expressable by a >> single VPTERNLOG, add splitters to also deal with ior and xor >> counterparts of the original and-only case. Note that the splitters need >> to be separate, as the placement of "not" differs in the final insns >> (*iornot3, *xnor3) which are intended to pick up one half of >> the result. >> >> gcc/ >> >> * config/i386/sse.md: New splitters to simplify >> not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}. >> >> gcc/testsuite/ >> >> * gcc.target/i386/pr100711-4.c: New test. >> * gcc.target/i386/pr100711-5.c: New test. >> >> --- a/gcc/config/i386/sse.md >> +++ b/gcc/config/i386/sse.md >> @@ -17366,6 +17366,36 @@ >> (match_dup 2)))] >>"operands[3] = gen_reg_rtx (mode);") >> >> +(define_split >> + [(set (match_operand:VI 0 "register_operand") >> + (ior:VI >> + (vec_duplicate:VI >> + (not: >> + (match_operand: 1 "nonimmediate_operand"))) >> + (match_operand:VI 2 "vector_operand")))] >> + " == 64 || TARGET_AVX512VL >> + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)" >> + [(set (match_dup 3) >> + (vec_duplicate:VI (match_dup 1))) >> + (set (match_dup 0) >> + (ior:VI (not:VI (match_dup 3)) (match_dup 2)))] >> + "operands[3] = gen_reg_rtx (mode);") >> + >> +(define_split >> + [(set (match_operand:VI 0 "register_operand") >> + (xor:VI >> + (vec_duplicate:VI >> + (not: >> + (match_operand: 1 "nonimmediate_operand"))) >> + (match_operand:VI 2 "vector_operand")))] >> + " == 64 || TARGET_AVX512VL >> + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)" >> + [(set (match_dup 3) >> + (vec_duplicate:VI (match_dup 1))) >> + (set (match_dup 0) >> + (not:VI (xor:VI (match_dup 3) (match_dup 2] >> + "operands[3] = gen_reg_rtx (mode);") >> + > Can we merge this splitter(xor:not) into ior:not one with a code > iterator for xor,ior, They look the same except for the xor/ior. They're only almost the same: Note (ior (not )) vs (not (xor )) as the result of the splitting. The difference is necessary to fit with what patch 1 introduces (which in turn is the way it is to fit with what generic code transforms things to up front). (I had it the way you suggest initially, until I figured why one of the two would end up never being used.) Jan
RE: [PATCH] GIMPLE_FOLD: Apply LEN_MASK_{LOAD, STORE} into GIMPLE_FOLD
Committed, thanks Jeff. Pan -Original Message- From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches Sent: Saturday, June 24, 2023 10:45 PM To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org Cc: rguent...@suse.de; richard.sandif...@arm.com Subject: Re: [PATCH] GIMPLE_FOLD: Apply LEN_MASK_{LOAD, STORE} into GIMPLE_FOLD On 6/23/23 07:48, juzhe.zh...@rivai.ai wrote: > From: Ju-Zhe Zhong > > Hi, since we are going to have LEN_MASK_{LOAD,STORE} into loopVectorizer. > > Currenly, > 1. we can fold MASK_{LOAD,STORE} into MEM when mask is all ones. > 2. we can fold LEN_{LOAD,STORE} into MEM when (len - bias) is VF. > > Now, I think it makes sense that we can support > > fold LEN_MASK_{LOAD,STORE} into MEM when both mask = all ones and (len - > bias) is VF. > > gcc/ChangeLog: > > * gimple-fold.cc (arith_overflowed_p): Apply LEN_MASK_{LOAD,STORE}. > (gimple_fold_partial_load_store_mem_ref): Ditto. > (gimple_fold_partial_store): Ditto. > (gimple_fold_call): Ditto. OK jeff
Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations
On 25.06.2023 06:42, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches > wrote: >> >> +(define_code_iterator andor [and ior]) >> +(define_code_attr nlogic [(and "nor") (ior "nand")]) >> +(define_code_attr ternlog_nlogic [(and "0x11") (ior "0x77")]) >> + >> +(define_insn "*3" >> + [(set (match_operand:VI 0 "register_operand" "=v,v") >> + (andor:VI >> + (not:VI (match_operand:VI 1 "bcst_vector_operand" "%v,v")) >> + (not:VI (match_operand:VI 2 "bcst_vector_operand" "vBr,m"] > I'm thinking of doing it in simplify_rtx or gimple match.pd to transform > (and (not op1)) (not op2)) -> (not: (ior: op1 op2)) This wouldn't be a win (not + andn) -> (or + not), but what's more important is ... > (ior (not op1) (not op2)) -> (not : (and op1 op2)) > > Even w/o avx512f, the transformation should also benefit since it > takes less logic operations 3 -> 2.(or 2 -> 2 for pandn). ... that these transformations (from the, as per the doc, canonical representation of nand and nor) are already occurring in common code, _if_ no suitable insn can be found. That was at least the conclusion I drew from looking around a lot, supported by the code that's generated prior to this change. Jan
RE: [PATCH] New finish_compare_by_pieces target hook (for x86).
On Tue, 13 June 2023 12:02, Richard Biener wrote: > On Mon, Jun 12, 2023 at 4:04 PM Roger Sayle > wrote: > > The following simple test case, from PR 104610, shows that memcmp () > > == 0 can result in some bizarre code sequences on x86. > > > > int foo(char *a) > > { > > static const char t[] = "0123456789012345678901234567890"; > > return __builtin_memcmp(a, &t[0], sizeof(t)) == 0; } > > > > with -O2 currently contains both: > > xorl%eax, %eax > > xorl$1, %eax > > and also > > movl$1, %eax > > xorl$1, %eax > > > > Changing the return type of foo to _Bool results in the equally > > bizarre: > > xorl%eax, %eax > > testl %eax, %eax > > sete%al > > and also > > movl$1, %eax > > testl %eax, %eax > > sete%al > > > > All these sequences set the result to a constant, but this > > optimization opportunity only occurs very late during compilation, by > > basic block duplication in the 322r.bbro pass, too late for CSE or > > peephole2 to do anything about it. The problem is that the idiom > > expanded by compare_by_pieces for __builtin_memcmp_eq contains basic > > blocks that can't easily be optimized by if-conversion due to the > > multiple incoming edges on the fail block. > > > > In summary, compare_by_pieces generates code that looks like: > > > > if (x[0] != y[0]) goto fail_label; > > if (x[1] != y[1]) goto fail_label; > > ... > > if (x[n] != y[n]) goto fail_label; > > result = 1; > > goto end_label; > > fail_label: > > result = 0; > > end_label: > > > > In theory, the RTL if-conversion pass could be enhanced to tackle > > arbitrarily complex if-then-else graphs, but the solution proposed > > here is to allow suitable targets to perform if-conversion during > > compare_by_pieces. The x86, for example, can take advantage that all > > of the above comparisons set and test the zero flag (ZF), which can > > then be used in combination with sete. Hence compare_by_pieces could > > instead generate: > > > > if (x[0] != y[0]) goto fail_label; > > if (x[1] != y[1]) goto fail_label; > > ... > > if (x[n] != y[n]) goto fail_label; > > fail_label: > > sete result > > > > which requires one less basic block, and the redundant conditional > > branch to a label immediately after is cleaned up by GCC's existing > > RTL optimizations. > > > > For the test case above, where -O2 -msse4 previously generated: > > > > foo:movdqu (%rdi), %xmm0 > > pxor.LC0(%rip), %xmm0 > > ptest %xmm0, %xmm0 > > je .L5 > > .L2:movl$1, %eax > > xorl$1, %eax > > ret > > .L5:movdqu 16(%rdi), %xmm0 > > pxor.LC1(%rip), %xmm0 > > ptest %xmm0, %xmm0 > > jne .L2 > > xorl%eax, %eax > > xorl$1, %eax > > ret > > > > we now generate: > > > > foo:movdqu (%rdi), %xmm0 > > pxor.LC0(%rip), %xmm0 > > ptest %xmm0, %xmm0 > > jne .L2 > > movdqu 16(%rdi), %xmm0 > > pxor.LC1(%rip), %xmm0 > > ptest %xmm0, %xmm0 > > .L2:sete%al > > movzbl %al, %eax > > ret > > > > Using a target hook allows the large amount of intelligence already in > > compare_by_pieces to be re-used by the i386 backend, but this can also > > help other backends with condition flags where the equality result can > > be materialized. > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > and make -k check, both with and without --target_board=unix{-m32} > > with no new failures. Ok for mainline? > > What's the guarantee that the zero flag is appropriately set on all edges > incoming > now and forever? Is there any reason why this target hook can't be removed (in future) should it stop being useful? It's completely optional and not required for the correct functioning of the compiler. > Does this require target specific knowledge on how do_compare_rtx_and_jump > is emitting RTL? Yes. Each backend can decide how best to implement finish_compare_by_pieces given its internal knowledge of how do_compare_rtx_and_jump works. It's not important to the middle-end how the underlying invariants are guaranteed, just that they are and the backend produces correct code. A backend may store flags on the target label, or maintain state in cfun. Future changes to the i386 backend might cause it to revert to the default finish_compare_by_pieces, or provide an alternate implementation, but at the moment this patch improves the code that GCC generates. Very little (in software like GCC) is forever. > Do you see matching this in ifcvt to be unreasonable? I'm thinking of > "reducing" > the incoming edges pairwise without actually looking at the ifcvt code. There's nothing about the proposed patch that prevents or blocks improvements
Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting
On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches wrote: > > Following two-operand bitwise operations, add another splitter to also > deal with not followed by broadcast all on its own, which can be > expressed as simple embedded broadcast instead once a broadcast operand > is actually permitted in the respective insn. While there also permit > a broadcast operand in the corresponding expander. The patch LGTM. > > gcc/ > > * config/i386/sse.md: New splitters to simplify > not;vec_duplicate as a singular vpternlog. > (one_cmpl2): Allow broadcast for operand 1. > (one_cmpl2): Likewise. > > gcc/testsuite/ > > * gcc.target/i386/pr100711-6.c: New test. > --- > For the purpose here (and elsewhere) bcst_vector_operand() (really: > bcst_mem_operand()) isn't permissive enough: We'd want it to allow > 128-bit and 256-bit types as well irrespective of AVX512VL being > enabled. This would likely require a new predicate > (bcst_intvec_operand()?) and a new constraint (BR? Bi?). (Yet for name > selection it will want considering that this is applicable to certain > non-calculational FP operations as well.) I think so. > > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17156,7 +17156,7 @@ > > (define_expand "one_cmpl2" >[(set (match_operand:VI 0 "register_operand") > - (xor:VI (match_operand:VI 1 "vector_operand") > + (xor:VI (match_operand:VI 1 "bcst_vector_operand") > (match_dup 2)))] >"TARGET_SSE" > { > @@ -17168,7 +17168,7 @@ > > (define_insn "one_cmpl2" >[(set (match_operand:VI 0 "register_operand" "=v,v") > - (xor:VI (match_operand:VI 1 "nonimmediate_operand" "v,m") > + (xor:VI (match_operand:VI 1 "bcst_vector_operand" "vBr,m") > (match_operand:VI 2 "vector_all_ones_operand" "BC,BC")))] >"TARGET_AVX512F > && (! > @@ -17191,6 +17191,19 @@ > (symbol_ref " == 64 || TARGET_AVX512VL") > (const_int 1)))]) > > +(define_split > + [(set (match_operand:VI48_AVX512F 0 "register_operand") > + (vec_duplicate:VI48_AVX512F > + (not: > + (match_operand: 1 "nonimmediate_operand"] > + " == 64 || TARGET_AVX512VL > + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)" > + [(set (match_dup 0) > + (xor:VI48_AVX512F > + (vec_duplicate:VI48_AVX512F (match_dup 1)) > + (match_dup 2)))] > + "operands[2] = CONSTM1_RTX (mode);") > + > (define_expand "_andnot3" >[(set (match_operand:VI_AVX2 0 "register_operand") > (and:VI_AVX2 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr100711-6.c > @@ -0,0 +1,18 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */ > + > +typedef int v16si __attribute__ ((vector_size (64))); > +typedef long long v8di __attribute__((vector_size (64))); > + > +v16si foo_v16si (const int *a) > +{ > +return (__extension__ (v16si) {~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, > + ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a}); > +} > + > +v8di foo_v8di (const long long *a) > +{ > +return (__extension__ (v8di) {~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a, ~*a}); > +} > + > +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x55, > \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}" 2 } } */ > -- BR, Hongtao
Re: [PATCH 4/5] x86: further PR target/100711-like splitting
On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches wrote: > > With respective two-operand bitwise operations now expressable by a > single VPTERNLOG, add splitters to also deal with ior and xor > counterparts of the original and-only case. Note that the splitters need > to be separate, as the placement of "not" differs in the final insns > (*iornot3, *xnor3) which are intended to pick up one half of > the result. > > gcc/ > > * config/i386/sse.md: New splitters to simplify > not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}. > > gcc/testsuite/ > > * gcc.target/i386/pr100711-4.c: New test. > * gcc.target/i386/pr100711-5.c: New test. > > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17366,6 +17366,36 @@ > (match_dup 2)))] >"operands[3] = gen_reg_rtx (mode);") > > +(define_split > + [(set (match_operand:VI 0 "register_operand") > + (ior:VI > + (vec_duplicate:VI > + (not: > + (match_operand: 1 "nonimmediate_operand"))) > + (match_operand:VI 2 "vector_operand")))] > + " == 64 || TARGET_AVX512VL > + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)" > + [(set (match_dup 3) > + (vec_duplicate:VI (match_dup 1))) > + (set (match_dup 0) > + (ior:VI (not:VI (match_dup 3)) (match_dup 2)))] > + "operands[3] = gen_reg_rtx (mode);") > + > +(define_split > + [(set (match_operand:VI 0 "register_operand") > + (xor:VI > + (vec_duplicate:VI > + (not: > + (match_operand: 1 "nonimmediate_operand"))) > + (match_operand:VI 2 "vector_operand")))] > + " == 64 || TARGET_AVX512VL > + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)" > + [(set (match_dup 3) > + (vec_duplicate:VI (match_dup 1))) > + (set (match_dup 0) > + (not:VI (xor:VI (match_dup 3) (match_dup 2] > + "operands[3] = gen_reg_rtx (mode);") > + Can we merge this splitter(xor:not) into ior:not one with a code iterator for xor,ior, They look the same except for the xor/ior. No need to merge it into and:not case which have different guard conditions. Others LGTM. > (define_insn "*andnot3_mask" >[(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v") > (vec_merge:VI48_AVX512VL > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr100711-4.c > @@ -0,0 +1,42 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512bw -mno-avx512vl -mprefer-vector-width=512 -O2" } */ > + > +typedef char v64qi __attribute__ ((vector_size (64))); > +typedef short v32hi __attribute__ ((vector_size (64))); > +typedef int v16si __attribute__ ((vector_size (64))); > +typedef long long v8di __attribute__((vector_size (64))); > + > +v64qi foo_v64qi (char a, v64qi b) > +{ > +return (__extension__ (v64qi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b; > +} > + > +v32hi foo_v32hi (short a, v32hi b) > +{ > +return (__extension__ (v32hi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b; > +} > + > +v16si foo_v16si (int a, v16si b) > +{ > +return (__extension__ (v16si) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b; > +} > + > +v8di foo_v8di (long long a, v8di b) > +{ > +return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) | b; > +} > + > +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 4 { > target { ! ia32 } } } } */ > +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xbb" 2 { > target { ia32 } } } } */ > +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0xdd" 2 { > target { ia32 } } } } */ > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr100711-5.c > @@ -0,0 +1,40 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512bw -mno-avx512vl -mprefer-vector-width=512 -O2" } */ > + > +typedef char v64qi __attribute__ ((vector_size (64))); > +typedef short v32hi __attribute__ ((vector_size (64))); > +typedef int v16si __attribute__ ((vector_size (64))); > +typedef long long v8di __attribute__((vector_size (64))); > + > +v64qi foo_v64qi (char a, v64qi b) > +{ > +return (__extension__ (v64qi) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a, > + ~a, ~a, ~a, ~a, ~a, ~a,
Re: [PATCH 3/5] x86: allow memory operand for AVX2 splitter for PR target/100711
On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches wrote: > > The intended broadcast (with AVX512) can very well be done right from > memory. Ok. > > gcc/ > > * config/i386/sse.md: Permit non-immediate operand 1 in AVX2 > form of splitter for PR target/100711. > > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17356,7 +17356,7 @@ > (and:VI_AVX2 > (vec_duplicate:VI_AVX2 > (not: > - (match_operand: 1 "register_operand"))) > + (match_operand: 1 "nonimmediate_operand"))) > (match_operand:VI_AVX2 2 "vector_operand")))] >"TARGET_AVX2" >[(set (match_dup 3) > -- BR, Hongtao
Re: [PATCH 2/5] x86: use VPTERNLOG also for certain andnot forms
On Wed, Jun 21, 2023 at 2:27 PM Jan Beulich via Gcc-patches wrote: > > When it's the memory operand which is to be inverted, using VPANDN* > requires a further load instruction. The same can be achieved by a > single VPTERNLOG*. Add two new alternatives (for plain memory and > embedded broadcast), adjusting the predicate for the first operand > accordingly. > > Two pre-existing testcases actually end up being affected (improved) by > the change, which is reflected in updated expectations there. LGTM. > > gcc/ > > PR target/93768 > * config/i386/sse.md (*andnot3): Add new alternatives > for memory form operand 1. > > gcc/testsuite/ > > PR target/93768 > * gcc.target/i386/avx512f-andn-di-zmm-2.c: New test. > * gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expecations > towards generated code. > * gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit > code. > > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17210,11 +17210,13 @@ >"TARGET_AVX512F") > > (define_insn "*andnot3" > - [(set (match_operand:VI 0 "register_operand" "=x,x,v") > + [(set (match_operand:VI 0 "register_operand" "=x,x,v,v,v") > (and:VI > - (not:VI (match_operand:VI 1 "vector_operand" "0,x,v")) > - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr")))] > - "TARGET_SSE" > + (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br")) > + (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,v,v")))] > + "TARGET_SSE > + && (register_operand (operands[1], mode) > + || register_operand (operands[2], mode))" > { >char buf[64]; >const char *ops; > @@ -17281,6 +17283,15 @@ > case 2: >ops = "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}"; >break; > +case 3: > +case 4: > + tmp = "pternlog"; > + ssesuffix = ""; > + if (which_alternative != 4 || TARGET_AVX512VL) > + ops = "v%s%s\t{$0x44, %%1, %%2, %%0|%%0, %%2, %%1, $0x44}"; > + else > + ops = "v%s%s\t{$0x44, %%g1, %%g2, %%g0|%%g0, %%g2, %%g1, $0x44}"; > + break; > default: >gcc_unreachable (); > } > @@ -17289,7 +17300,7 @@ >output_asm_insn (buf, operands); >return ""; > } > - [(set_attr "isa" "noavx,avx,avx") > + [(set_attr "isa" "noavx,avx,avx,*,*") > (set_attr "type" "sselog") > (set (attr "prefix_data16") > (if_then_else > @@ -17297,9 +17308,12 @@ > (eq_attr "mode" "TI")) > (const_string "1") > (const_string "*"))) > - (set_attr "prefix" "orig,vex,evex") > + (set_attr "prefix" "orig,vex,evex,evex,evex") > (set (attr "mode") > - (cond [(match_test "TARGET_AVX2") > + (cond [(and (eq_attr "alternative" "3,4") > + (match_test " < 64 && !TARGET_AVX512VL")) > +(const_string "XI") > + (match_test "TARGET_AVX2") > (const_string "") >(match_test "TARGET_AVX") > (if_then_else > @@ -17310,7 +17324,15 @@ > (match_test "optimize_function_for_size_p (cfun)")) > (const_string "V4SF") > ] > - (const_string "")))]) > + (const_string ""))) > + (set (attr "enabled") > + (cond [(eq_attr "alternative" "3") > +(symbol_ref " == 64 || TARGET_AVX512VL") > + (eq_attr "alternative" "4") > +(symbol_ref " == 64 || TARGET_AVX512VL > + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)") > + ] > + (const_string "*")))]) > > ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vpandn > (define_split > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */ > +/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\\\$0x44, > \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ > +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ > + > +#define type __m512i > +#define vec 512 > +#define op andnot > +#define suffix epi64 > +#define SCALAR long long > + > +#include "avx512-binop-2.h" > --- a/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c > +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c > @@ -1,7 +1,7 @@ > /* { dg-do compile } */ > /* { dg-options "-mavx512f -O2" } */ > -/* { dg-final { scan-assembler-times "vpbroadcastd\[^\n\]*%zmm\[0-9\]+" 1 } > } */ > -/* { dg-final { scan-assembler-times "vpandnd\[^\n\]*%zmm\[0-9\]+" 1 } } */ > +/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0x44, > \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ > +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ > > #define type __m512i > #define vec 512 > --- a/gcc/testsuite/gcc.target/i386/pr100711-3.c > +++ b/gc
Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations
On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches wrote: > > All combinations of and, ior, xor, and not involving two operands can be > expressed that way in a single insn. > > gcc/ > > PR target/93768 > * config/i386/i386.cc (ix86_rtx_costs): Further special-case > bitwise vector operations. > * config/i386/sse.md (*iornot3): New insn. > (*xnor3): Likewise. > (*3): Likewise. > (andor): New code iterator. > (nlogic): New code attribute. > (ternlog_nlogic): Likewise. > > gcc/testsuite/ > > PR target/93768 > gcc.target/i386/avx512-binop-not-1.h: New. > gcc.target/i386/avx512-binop-not-2.h: New. > gcc.target/i386/avx512f-orn-si-zmm-1.c: New test. > gcc.target/i386/avx512f-orn-si-zmm-2.c: New test. > --- > The use of VI matches that in e.g. one_cmpl2 / > one_cmpl2 and *andnot3, despite > (here and there) > - V64QI and V32HI being needlessly excluded when AVX512BW isn't enabled, > - VTI not being covered, > - vector modes more narrow than 16 bytes not being covered. > > --- a/gcc/config/i386/i386.cc > +++ b/gcc/config/i386/i386.cc > @@ -21178,6 +21178,32 @@ ix86_rtx_costs (rtx x, machine_mode mode >return false; > > case IOR: > + if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) > + { > + /* (ior (not ...) ...) can be a single insn in AVX512. */ > + if (GET_CODE (XEXP (x, 0)) == NOT && TARGET_AVX512F > + && (GET_MODE_SIZE (mode) == 64 > + || (TARGET_AVX512VL > + && (GET_MODE_SIZE (mode) == 32 > + || GET_MODE_SIZE (mode) == 16 > + { > + rtx right = GET_CODE (XEXP (x, 1)) != NOT > + ? XEXP (x, 1) : XEXP (XEXP (x, 1), 0); > + > + *total = ix86_vec_cost (mode, cost->sse_op) > + + rtx_cost (XEXP (XEXP (x, 0), 0), mode, > + outer_code, opno, speed) > + + rtx_cost (right, mode, outer_code, opno, speed); > + return true; > + } > + *total = ix86_vec_cost (mode, cost->sse_op); > + } > + else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) > + *total = cost->add * 2; > + else > + *total = cost->add; > + return false; > + > case XOR: >if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) > *total = ix86_vec_cost (mode, cost->sse_op); > @@ -21198,11 +21224,20 @@ ix86_rtx_costs (rtx x, machine_mode mode > /* pandn is a single instruction. */ > if (GET_CODE (XEXP (x, 0)) == NOT) > { > + rtx right = XEXP (x, 1); > + > + /* (and (not ...) (not ...)) can be a single insn in AVX512. */ > + if (GET_CODE (right) == NOT && TARGET_AVX512F > + && (GET_MODE_SIZE (mode) == 64 > + || (TARGET_AVX512VL > + && (GET_MODE_SIZE (mode) == 32 > + || GET_MODE_SIZE (mode) == 16 > + right = XEXP (right, 0); > + > *total = ix86_vec_cost (mode, cost->sse_op) >+ rtx_cost (XEXP (XEXP (x, 0), 0), mode, >outer_code, opno, speed) > - + rtx_cost (XEXP (x, 1), mode, > - outer_code, opno, speed); > + + rtx_cost (right, mode, outer_code, opno, speed); > return true; > } > else if (GET_CODE (XEXP (x, 1)) == NOT) > @@ -21260,8 +21295,25 @@ ix86_rtx_costs (rtx x, machine_mode mode > > case NOT: >if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) > - // vnot is pxor -1. > - *total = ix86_vec_cost (mode, cost->sse_op) + 1; > + { > + /* (not (xor ...)) can be a single insn in AVX512. */ > + if (GET_CODE (XEXP (x, 0)) == XOR && TARGET_AVX512F > + && (GET_MODE_SIZE (mode) == 64 > + || (TARGET_AVX512VL > + && (GET_MODE_SIZE (mode) == 32 > + || GET_MODE_SIZE (mode) == 16 > + { > + *total = ix86_vec_cost (mode, cost->sse_op) > + + rtx_cost (XEXP (XEXP (x, 0), 0), mode, > + outer_code, opno, speed) > + + rtx_cost (XEXP (XEXP (x, 0), 1), mode, > + outer_code, opno, speed); > + return true; > + } > + > + // vnot is pxor -1. > + *total = ix86_vec_cost (mode, cost->sse_op) + 1; > + } >else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) > *total = cost->add * 2; >else > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17616,6 +17616,98 @@ >operands[2] = force_reg (V1TImode, CONSTM1_RTX (V1TImode)); > }) > > +(define_insn "*iornot3" > + [(set (match_operand:VI 0
Re: [PATCH] RISC-V: force arg and target to reg rtx under -O0
Hi, Li. Appreciate for catching this! I think it's better: -emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), src)); +emit_move_insn (gen_lowpart (e.vector_mode (), e.target), src); do this to fix this issue. Thanks. juzhe.zh...@rivai.ai From: Li Xu Date: 2023-06-25 11:08 To: gcc-patches CC: kito.cheng; palmer; juzhe.zhong; Li Xu Subject: [PATCH] RISC-V: force arg and target to reg rtx under -O0 arg and target should be expanded to reg rtx during expand pass. Consider this following case: void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) { vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1); } Compilation fails with: test.c: In function 'test_vlmul_ext_v_i8mf8_i8mf4': test.c:5:1: error: unrecognizable insn: 5 | } | ^ (insn 30 29 0 2 (set (mem/c:VNx2QI (reg/f:DI 143) [0 x+0 S[2, 2] A32]) (mem/c:VNx2QI (reg/f:DI 148) [0 op1+0 S[2, 2] A16])) "test.c":4:18 -1 (nil)) during RTL pass: vregs test.c:5:1: internal compiler error: in extract_insn, at recog.cc:2791 0x7c61b8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:108 0x7c61d7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:116 0xed58a7 extract_insn(rtx_insn*) ../.././riscv-gcc/gcc/recog.cc:2791 0xb7f789 instantiate_virtual_regs_in_insn ../.././riscv-gcc/gcc/function.cc:1611 0xb7f789 instantiate_virtual_regs ../.././riscv-gcc/gcc/function.cc:1984 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: force arg and target to reg rtx. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vlmul_ext-2.c: New test. --- gcc/config/riscv/riscv-vector-builtins-bases.cc | 5 - gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c | 8 2 files changed, 12 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc index c6c53dc13a5..f135f7971fa 100644 --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc @@ -1567,7 +1567,10 @@ public: { tree arg = CALL_EXPR_ARG (e.exp, 0); rtx src = expand_normal (arg); -emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), src)); +if (MEM_P (e.target)) + e.target = force_reg (GET_MODE (e.target), e.target); +emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), + MEM_P (src) ? force_reg (GET_MODE (src), src) : src)); return e.target; } }; diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c new file mode 100644 index 000..2b088b53546 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O0" } */ + +#include "riscv_vector.h" + +void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) { + vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1); +} -- 2.17.1
[PATCH] internal-fn: Fix bug of BIAS argument index
From: Ju-Zhe Zhong When trying to enable LEN_MASK_{LOAD,STORE} in RISC-V port, I found I made a mistake in case of argument index of BIAS. This patch is an obvious fix, Ok for trunk ? gcc/ChangeLog: * internal-fn.cc (expand_partial_store_optab_fn): Fix bug of BIAS argument index. --- gcc/internal-fn.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 1c2fd487e2a..9017176dc7a 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -2991,7 +2991,7 @@ expand_partial_store_optab_fn (internal_fn ifn, gcall *stmt, convert_optab optab maskt = gimple_call_arg (stmt, 3); mask = expand_normal (maskt); create_input_operand (&ops[3], mask, TYPE_MODE (TREE_TYPE (maskt))); - biast = gimple_call_arg (stmt, 4); + biast = gimple_call_arg (stmt, 5); bias = expand_normal (biast); create_input_operand (&ops[4], bias, QImode); icode = convert_optab_handler (optab, TYPE_MODE (type), GET_MODE (mask)); -- 2.36.3
[PATCH] RISC-V: force arg and target to reg rtx under -O0
arg and target should be expanded to reg rtx during expand pass. Consider this following case: void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) { vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1); } Compilation fails with: test.c: In function 'test_vlmul_ext_v_i8mf8_i8mf4': test.c:5:1: error: unrecognizable insn: 5 | } | ^ (insn 30 29 0 2 (set (mem/c:VNx2QI (reg/f:DI 143) [0 x+0 S[2, 2] A32]) (mem/c:VNx2QI (reg/f:DI 148) [0 op1+0 S[2, 2] A16])) "test.c":4:18 -1 (nil)) during RTL pass: vregs test.c:5:1: internal compiler error: in extract_insn, at recog.cc:2791 0x7c61b8 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:108 0x7c61d7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) ../.././riscv-gcc/gcc/rtl-error.cc:116 0xed58a7 extract_insn(rtx_insn*) ../.././riscv-gcc/gcc/recog.cc:2791 0xb7f789 instantiate_virtual_regs_in_insn ../.././riscv-gcc/gcc/function.cc:1611 0xb7f789 instantiate_virtual_regs ../.././riscv-gcc/gcc/function.cc:1984 gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: force arg and target to reg rtx. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/vlmul_ext-2.c: New test. --- gcc/config/riscv/riscv-vector-builtins-bases.cc | 5 - gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c | 8 2 files changed, 12 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc index c6c53dc13a5..f135f7971fa 100644 --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc @@ -1567,7 +1567,10 @@ public: { tree arg = CALL_EXPR_ARG (e.exp, 0); rtx src = expand_normal (arg); -emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), src)); +if (MEM_P (e.target)) + e.target = force_reg (GET_MODE (e.target), e.target); +emit_insn (gen_rtx_SET (gen_lowpart (e.vector_mode (), e.target), + MEM_P (src) ? force_reg (GET_MODE (src), src) : src)); return e.target; } }; diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c new file mode 100644 index 000..2b088b53546 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vlmul_ext-2.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O0" } */ + +#include "riscv_vector.h" + +void test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) { + vint8mf4_t res = __riscv_vlmul_ext_v_i8mf8_i8mf4(op1); +} -- 2.17.1
Re: Re: [PATCH V1] RISC-V:Add float16 tuple type support
Such issue will be addressed by this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622440.html But still wait for Jakub's comments. juzhe.zh...@rivai.ai From: Andreas Schwab Date: 2023-06-23 18:25 To: shiyulong CC: gcc-patches; palmer; kito.cheng; jim.wilson.gcc; juzhe.zhong; pan2.li; wuwei2016; jiawei; shihua; dje.gcc; mirimmad Subject: Re: [PATCH V1] RISC-V:Add float16 tuple type support ../../gcc/lto-streamer-out.cc: In function 'void lto_output_init_mode_table()': ../../gcc/lto-streamer-out.cc:3177:10: error: 'void* memset(void*, int, size_t)' forming offset [256, 283] is out of the bounds [0, 256] of object 'streamer_mode_table' with type 'unsigned char [256]' [-Werror=array-bounds=] 3177 | memset (streamer_mode_table, '\0', MAX_MACHINE_MODE); | ~~~^ In file included from ../../gcc/gimple-streamer.h:25, from ../../gcc/lto-streamer-out.cc:33: ../../gcc/tree-streamer.h:78:22: note: 'streamer_mode_table' declared here 78 | extern unsigned char streamer_mode_table[1 << 8]; | ^~~ cc1plus: all warnings being treated as errors make[3]: *** [Makefile:1180: lto-streamer-out.o] Error 1 -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
[PATCHv4, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]
Hi, This patch adds a new insn for vector splat with small V2DI constants on P8. If the value of constant is in RANGE (-16, 15) and not 0 or -1, it can be loaded with vspltisw and vupkhsw on P8. It should be efficient than loading vector from memory. Compared to last version, the main change is to remove the new constraint and use a super constraint in the insn and set the check into insn condition. Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Thanks Gui Haochen ChangeLog 2023-06-25 Haochen Gui gcc/ PR target/104124 * config/rs6000/altivec.md (*altivec_vupkhs_direct): Rename to... (altivec_vupkhs_direct): ...this. * config/rs6000/predicates.md (vspltisw_vupkhsw_constant_split): New predicate to test if a constant can be loaded with vspltisw and vupkhsw. (easy_vector_constant): Call vspltisw_vupkhsw_constant_p to Check if a vector constant can be synthesized with a vspltisw and a vupkhsw. * config/rs6000/rs6000-protos.h (vspltisw_vupkhsw_constant_p): Declare. * config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p): New function to return true if OP mode is V2DI and can be synthesized with vupkhsw and vspltisw. * config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up constants with vspltisw and vupkhsw. gcc/testsuite/ PR target/104124 * gcc.target/powerpc/pr104124.c: New. patch.diff diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index 49b0c964f4d..2c932854c33 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -2542,7 +2542,7 @@ (define_insn "altivec_vupkhs" } [(set_attr "type" "vecperm")]) -(define_insn "*altivec_vupkhs_direct" +(define_insn "altivec_vupkhs_direct" [(set (match_operand:VP 0 "register_operand" "=v") (unspec:VP [(match_operand: 1 "register_operand" "v")] UNSPEC_VUNPACK_HI_SIGN_DIRECT))] diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index 52c65534e51..f62a4d9b506 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -694,6 +694,12 @@ (define_predicate "xxspltib_constant_split" return num_insns > 1; }) +;; Return true if the operand is a constant that can be loaded with a vspltisw +;; instruction and then a vupkhsw instruction. + +(define_predicate "vspltisw_vupkhsw_constant_split" + (and (match_code "const_vector") + (match_test "vspltisw_vupkhsw_constant_p (op, mode)"))) ;; Return 1 if the operand is constant that can loaded directly with a XXSPLTIB ;; instruction. @@ -742,6 +748,11 @@ (define_predicate "easy_vector_constant" && xxspltib_constant_p (op, mode, &num_insns, &value)) return true; + /* V2DI constant within RANGE (-16, 15) can be synthesized with a +vspltisw and a vupkhsw. */ + if (vspltisw_vupkhsw_constant_p (op, mode, &value)) + return true; + return easy_altivec_constant (op, mode); } diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 1a4fc1df668..00cb2d82953 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int, extern int easy_altivec_constant (rtx, machine_mode); extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); +extern bool vspltisw_vupkhsw_constant_p (rtx, machine_mode, int * = nullptr); extern int vspltis_shifted (rtx); extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); extern bool macho_lo_sum_memory_operand (rtx, machine_mode); diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 3be5860dd9b..ae34a02b282 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -6638,6 +6638,36 @@ xxspltib_constant_p (rtx op, return true; } +/* Return true if OP mode is V2DI and can be synthesized with ISA 2.07 + instructions vupkhsw and vspltisw. + + Return the constant that is being split via CONSTANT_PTR. */ + +bool +vspltisw_vupkhsw_constant_p (rtx op, machine_mode mode, int *constant_ptr) +{ + HOST_WIDE_INT value; + rtx elt; + + if (!TARGET_P8_VECTOR) +return false; + + if (mode != V2DImode) +return false; + + if (!const_vec_duplicate_p (op, &elt)) +return false; + + value = INTVAL (elt); + if (value == 0 || value == 1 + || !EASY_VECTOR_15 (value)) +return false; + + if (constant_ptr) +*constant_ptr = (int) value; + return true; +} + const char * output_vec_const_move (rtx *operands) { diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 7d845df5c2d..4919b073e50 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -1174,6 +1174,30 @@ (define_insn_and_split "*xxspltib__split" [(set_attr "type" "vecperm") (set_attr "length" "8")]) +(defi
Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
On Sat, Jun 24, 2023, at 11:01 AM, Jeff Law via Gcc-patches wrote: > On 6/21/23 02:14, Wang, Yanzhang wrote: >> Hi Jeff, sorry for the late reply. >> >>> The long branch handling is done at the assembler level. So the clobbering >>> of $ra isn't visible to the compiler. Thus the compiler has to be >>> extremely careful to not hold values in $ra because the assembler may >>> clobber $ra. >> >> If assembler will modify the $ra behavior, it seems the rules we defined in >> the riscv.cc will be ignored. For example, the $ra saving generated by this >> patch may be modified by the assmebler and all others depends on it will be >> wrong. So implementing the long jump in the compiler is better. > Basically correct. The assembler potentially clobbers $ra. That's why > in the long jump patches $ra becomes a fixed register -- the compiler > doesn't know when it's clobbered by the assembler. > > Even if this were done in the compiler, we'd still have to do something > special with $ra. The point at which decisions about register > allocation and such are made is before the point where we know the final > positions of jumps/labels. It's a classic problem in GCC's design. Do you have a reference for more information on the long jump patches? I'm particularly curious about why $ra was selected as the temporary instead of $t1 like the tail pseudoinstruction uses. -s
RE: [PATCH] SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias analysis
Committed, thanks Jeff. Pan -Original Message- From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches Sent: Saturday, June 24, 2023 10:09 PM To: 钟居哲 ; gcc-patches Cc: rguenther ; richard.sandiford Subject: Re: [PATCH] SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias analysis On 6/23/23 17:21, 钟居哲 wrote: > Not sure since I saw MASK_STORE/LEN_STORE didn't compute size. Yea, I think you're right. We take the size from the LHS. My mistake. This is fine for the trunk. jeff
Re: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD
On Sun, Jun 25, 2023 at 9:17 AM Liu, Hongtao wrote: > > > > > -Original Message- > > From: Jan Beulich > > Sent: Wednesday, June 21, 2023 8:40 PM > > To: Hongtao Liu > > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin ; Liu, > > Hongtao > > Subject: Re: [PATCH v2] x86: make better use of VBROADCASTSS / > > VPBROADCASTD > > > > On 21.06.2023 09:44, Jan Beulich wrote: > > > On 21.06.2023 09:37, Hongtao Liu wrote: > > >> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches > > >> wrote: > > >>> > > >>> Isn't prefix_extra use bogus here? What extra prefix does > > >>> vbroadcastss > > >> According to comments, yes, no extra prefix is needed. > > >> > > >> ;; There are also additional prefixes in 3DNOW, SSSE3. > > >> ;; ssemuladd,sse4arg default to 0f24/0f25 and DREX byte, ;; > > >> sseiadd1,ssecvt1 to 0f7a with no DREX byte. > > >> ;; 3DNOW has 0f0f prefix, SSSE3 and SSE4_{1,2} 0f38/0f3a. > > > > > > Right, that's what triggered my question. I guess dropping these > > > "prefix_extra" really wants to be a separate patch (or maybe even > > > multiple, but it's hard to see how to split), dealing with all of the > > > instances which likely have accumulated simply via copy-and-paste. > > > > Or wait - I'm altering those lines anyway, so I could as well drop them > > right > > away (and slightly shrink patch size), if that's okay with you. Of course I > > should then not forget to also mention this in the changelog entry. > > > Yes. >Would you be okay for me to fold in that adjustment, or do you >insist on a separate patch? Also for this, no need for a separate patch. > > Jan -- BR, Hongtao
RE: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD
> -Original Message- > From: Jan Beulich > Sent: Wednesday, June 21, 2023 8:40 PM > To: Hongtao Liu > Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin ; Liu, > Hongtao > Subject: Re: [PATCH v2] x86: make better use of VBROADCASTSS / > VPBROADCASTD > > On 21.06.2023 09:44, Jan Beulich wrote: > > On 21.06.2023 09:37, Hongtao Liu wrote: > >> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches > >> wrote: > >>> > >>> Isn't prefix_extra use bogus here? What extra prefix does > >>> vbroadcastss > >> According to comments, yes, no extra prefix is needed. > >> > >> ;; There are also additional prefixes in 3DNOW, SSSE3. > >> ;; ssemuladd,sse4arg default to 0f24/0f25 and DREX byte, ;; > >> sseiadd1,ssecvt1 to 0f7a with no DREX byte. > >> ;; 3DNOW has 0f0f prefix, SSSE3 and SSE4_{1,2} 0f38/0f3a. > > > > Right, that's what triggered my question. I guess dropping these > > "prefix_extra" really wants to be a separate patch (or maybe even > > multiple, but it's hard to see how to split), dealing with all of the > > instances which likely have accumulated simply via copy-and-paste. > > Or wait - I'm altering those lines anyway, so I could as well drop them right > away (and slightly shrink patch size), if that's okay with you. Of course I > should then not forget to also mention this in the changelog entry. > Yes. > Jan
RE: [PATCH V1] RISC-V:Add float16 tuple type abi
Committed, thanks Jeff. Pan -Original Message- From: Jeff Law Sent: Saturday, June 24, 2023 10:51 PM To: juzhe.zh...@rivai.ai; yulong ; gcc-patches Cc: palmer ; Kito.cheng ; Li, Pan2 ; wuwei2016 ; jiawei ; shihua ; dje.gcc ; pinskia ; Robin Dapp Subject: Re: [PATCH V1] RISC-V:Add float16 tuple type abi On 6/21/23 01:46, juzhe.zh...@rivai.ai wrote: > LGTM. Thanks. OK from me as well. jeff
Re: [Patch, fortran] PR49213 - [OOP] gfortran rejects structure constructor expression
Hi Paul! On 6/24/23 15:18, Paul Richard Thomas via Gcc-patches wrote: I have included the adjustment to 'gfc_is_ptr_fcn' and eliminating the extra blank line, introduced by my last patch. I played safe and went exclusively for class functions with attr.class_pointer set on the grounds that these have had all the accoutrements checked and built (ie. class_ok). I am still not sure if this is necessary or not. maybe it is my fault, but I find the version in the patch confusing: @@ -816,7 +816,7 @@ bool gfc_is_ptr_fcn (gfc_expr *e) { return e != NULL && e->expr_type == EXPR_FUNCTION - && (gfc_expr_attr (e).pointer + && ((e->ts.type != BT_CLASS && gfc_expr_attr (e).pointer) || (e->ts.type == BT_CLASS && CLASS_DATA (e)->attr.class_pointer)); } The caller 'gfc_is_ptr_fcn' has e->expr_type == EXPR_FUNCTION, so gfc_expr_attr (e) boils down to: if (e->value.function.esym && e->value.function.esym->result) { gfc_symbol *sym = e->value.function.esym->result; attr = sym->attr; if (sym->ts.type == BT_CLASS && sym->attr.class_ok) { attr.dimension = CLASS_DATA (sym)->attr.dimension; attr.pointer = CLASS_DATA (sym)->attr.class_pointer; attr.allocatable = CLASS_DATA (sym)->attr.allocatable; } } ... else if (e->symtree) attr = gfc_variable_attr (e, NULL); So I thought this should already do what you want if you do gfc_is_ptr_fcn (gfc_expr *e) { return e != NULL && e->expr_type == EXPR_FUNCTION && gfc_expr_attr (e).pointer; } or what am I missing? The additional checks in gfc_expr_attr are there to avoid ICEs in case CLASS_DATA (sym) has issues, and we all know Gerhard who showed that he is an expert in exploiting this. To sum up, I'd prefer to use the safer form if it works. If it doesn't, I would expect a latent issue. The rest of the code looked good to me, but I was suspicious about the handling of CHARACTER. Nasty as I am, I modified the testcase to use character(kind=4) instead of kind=1 (see attached). This either fails here (stop 10), or if I activate the marked line !cont = tContainer('hello!') ! ### ICE! ### I get an ICE. Can you have another look? Thanks, Harald OK for trunk? Paul Fortran: Enable class expressions in structure constructors [PR49213] 2023-06-24 Paul Thomas gcc/fortran PR fortran/49213 * expr.cc (gfc_is_ptr_fcn): Guard pointer attribute to exclude class expressions. * resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow associate names with pointer function targets to be used in variable definition context. * trans-decl.cc (get_symbol_decl): Remove extraneous line. * trans-expr.cc (alloc_scalar_allocatable_subcomponent): Obtain size of intrinsic and character expressions. (gfc_trans_subcomponent_assign): Expand assignment to class components to include intrinsic and character expressions. gcc/testsuite/ PR fortran/49213 * gfortran.dg/pr49213.f90 : New test ! { dg-do run } ! ! Contributed by Neil Carlson ! program main ! character(2) :: c character(2,kind=4) :: c type :: S integer :: n end type type(S) :: Sobj type, extends(S) :: S2 integer :: m end type type(S2) :: S2obj type :: T class(S), allocatable :: x end type type(T) :: Tobj Sobj = S(1) Tobj = T(Sobj) S2obj = S2(1,2) Tobj = T(S2obj)! Failed here select type (x => Tobj%x) type is (S2) if ((x%n .ne. 1) .or. (x%m .ne. 2)) stop 1 class default stop 2 end select c = 4_" " call pass_it (T(Sobj)) if (c .ne. 4_"S ") stop 3 call pass_it (T(S2obj))! and here if (c .ne. 4_"S2") stop 4 call bar contains subroutine pass_it (foo) type(T), intent(in) :: foo select type (x => foo%x) type is (S) c = 4_"S " if (x%n .ne. 1) stop 5 type is (S2) c = 4_"S2" if ((x%n .ne. 1) .or. (x%m .ne. 2)) stop 6 class default stop 7 end select end subroutine subroutine bar ! Test from comment #29 of the PR - due to Janus Weil type tContainer class(*), allocatable :: x end type integer, parameter :: i = 0 character(7,kind=4) :: chr = 4_"goodbye" type(tContainer) :: cont cont%x = i ! linker error: undefined reference to `__copy_INTEGER_4_.3804' cont = tContainer(i+42) ! Failed here select type (z => cont%x) type is (integer) if (z .ne. 42) stop 8 class default stop 9 end select !cont = tContainer('hello!') ! ### ICE! ### cont = tContainer(4_'hello!') select type (z => cont%x) type is (character(*,kind=4)) if (z .ne. 4_'hello!') stop 10 class default stop 11 end select cont = tContainer(chr) select type (z => cont%x) type is (character(*,kind=4)) if (z .ne. 4_'goodbye') stop 12 class default
[PATCH, part2, committed] Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360]
Dear all, the first part of the patch came with a testcase that also exercised code for constant string arguments, which was not touched by that patch but seems to have caused runtime failures on big-endian platforms (e.g. Power-* BE) for all optimization levels, and on x86 / -m32 at -O1 and higher (not at -O0). I did not see any issues on x86 / -m64 and any optimization level, but could reproduce a problem with x86 / -m32 at -O1, which appears to be related how arguments that are to be passed by value are handled when there is a mismatch between the function prototype and the passed argument. The solution is to truncate too long constant string arguments, fixed by the attached patch, pushed as: https://gcc.gnu.org/g:3f97d10aa1ff5984d6fd657f246d3f251b254ff1 and see attached. * * * I found gcc-testresults quite helpful in checking whether my patch caused trouble on architectures different from the one I'm working on. The value (pun intended) would have been even greater if output of runtime failures would also be made available. Many (Fortran) tests provide either a stop code, or some hopefully helpful diagnostic output on stdout intended for locating errors on platforms where one has no direct access to, or is less familiar with. Far better than a plain FAIL: gfortran.dg/value_9.f90 -O1 execution test * * * Thanks, Harald From 3f97d10aa1ff5984d6fd657f246d3f251b254ff1 Mon Sep 17 00:00:00 2001 From: Harald Anlauf Date: Sat, 24 Jun 2023 20:36:53 +0200 Subject: [PATCH] Fortran: ABI for scalar CHARACTER(LEN=1),VALUE dummy argument [PR110360] gcc/fortran/ChangeLog: PR fortran/110360 * trans-expr.cc (gfc_conv_procedure_call): Truncate constant string argument of length > 1 passed to scalar CHARACTER(1),VALUE dummy. --- gcc/fortran/trans-expr.cc | 21 + 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index c92fccd0be2..63e3cf9681e 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -6395,20 +6395,25 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, /* ABI: actual arguments to CHARACTER(len=1),VALUE dummy arguments are actually passed by value. - The BIND(C) case is handled elsewhere. - TODO: truncate constant strings to length 1. */ + Constant strings are truncated to length 1. + The BIND(C) case is handled elsewhere. */ if (fsym->ts.type == BT_CHARACTER && !fsym->ts.is_c_interop && fsym->ts.u.cl->length->expr_type == EXPR_CONSTANT && fsym->ts.u.cl->length->ts.type == BT_INTEGER && (mpz_cmp_ui - (fsym->ts.u.cl->length->value.integer, 1) == 0) - && e->expr_type != EXPR_CONSTANT) + (fsym->ts.u.cl->length->value.integer, 1) == 0)) { - parmse.expr = gfc_string_to_single_character - (build_int_cst (gfc_charlen_type_node, 1), - parmse.expr, - e->ts.kind); + if (e->expr_type != EXPR_CONSTANT) + parmse.expr = gfc_string_to_single_character + (build_int_cst (gfc_charlen_type_node, 1), + parmse.expr, + e->ts.kind); + else if (e->value.character.length > 1) + { + e->value.character.length = 1; + gfc_conv_expr (&parmse, e); + } } if (fsym->attr.optional -- 2.35.3
[x86_PATCH] New *ashl_doubleword_highpart define_insn_and_split.
This patch contains a pair of (related) optimizations in i386.md that allow us to generate better code for the example below (this is a step towards fixing a bugzilla PR, but I've forgotten the number). __int128 foo64(__int128 x, long long y) { __int128 t = (__int128)y << 64; return x ^ t; } The hidden issue is that the RTL currently seen by reload contains the sign extension of y from DImode to TImode, even though this is dead (not required) for left shifts by more than WORD_SIZE bits. (insn 11 8 12 2 (parallel [ (set (reg:TI 0 ax [orig:91 y ] [91]) (sign_extend:TI (reg:DI 1 dx [97]))) (clobber (reg:CC 17 flags)) (clobber (scratch:DI)) ]) {extendditi2} What makes this particularly undesirable is that the sign-extension pattern above requires an additional DImode scratch register, indicated by the clobber, which unnecessarily increases register pressure. The proposed solution is to add a define_insn_and_split for such left shifts (of sign or zero extensions) that only have a non-zero highpart, where the extension is redundant and eliminated, that can be split after reload, without scratch registers or early clobbers. This (late split) exposes a second optimization opportunity where setting the lowpart to zero can sometimes be combined/simplified with the following instruction during peephole2. For the test case above, we previously generated with -O2: foo64: xorl%eax, %eax xorq%rsi, %rdx xorq%rdi, %rax ret with this patch, we now generate: foo64: movq%rdi, %rax xorq%rsi, %rdx ret Likewise for the related -m32 test case, we go from: foo32: movl12(%esp), %eax movl%eax, %edx xorl%eax, %eax xorl8(%esp), %edx xorl4(%esp), %eax ret to the improved: foo32: movl12(%esp), %edx movl4(%esp), %eax xorl8(%esp), %edx ret This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-06-24 Roger Sayle gcc/ChangeLog * config/i386/i386.md (peephole2): Simplify zeroing a register followed by an IOR, XOR or PLUS operation on it, into a move. (*ashl3_doubleword_highpart): New define_insn_and_split to eliminate (and hide from reload) unnecessary word to doubleword extensions that are followed by left shifts by sufficient large (but valid) bit counts. gcc/testsuite/ChangeLog * gcc.target/i386/ashldi3-1.c: New 32-bit test case. * gcc.target/i386/ashlti3-2.c: New 64-bit test case. Thanks again, Roger -- diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 95a6653c..7664dff 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -12206,6 +12206,18 @@ (set_attr "type" "alu") (set_attr "mode" "QI")]) +;; Peephole2 rega = 0; rega op= regb into rega = regb. +(define_peephole2 + [(parallel [(set (match_operand:SWI 0 "general_reg_operand") + (const_int 0)) + (clobber (reg:CC FLAGS_REG))]) + (parallel [(set (match_dup 0) + (any_or_plus:SWI (match_dup 0) + (match_operand:SWI 1 ""))) + (clobber (reg:CC FLAGS_REG))])] + "" + [(set (match_dup 0) (match_dup 1))]) + ;; Split DST = (HI<<32)|LO early to minimize register usage. (define_insn_and_split "*concat3_1" [(set (match_operand: 0 "nonimmediate_operand" "=ro,r") @@ -13365,6 +13377,28 @@ [(const_int 0)] "ix86_split_ashl (operands, operands[3], mode); DONE;") +(define_insn_and_split "*ashl3_doubleword_highpart" + [(set (match_operand: 0 "register_operand" "=r") + (ashift: + (any_extend: (match_operand:DWIH 1 "nonimmediate_operand" "rm")) + (match_operand:QI 2 "const_int_operand"))) + (clobber (reg:CC FLAGS_REG))] + "INTVAL (operands[2]) >= * BITS_PER_UNIT + && INTVAL (operands[2]) < * BITS_PER_UNIT * 2" + "#" + "&& reload_completed" + [(const_int 0)] +{ + split_double_mode (mode, &operands[0], 1, &operands[0], &operands[3]); + int bits = INTVAL (operands[2]) - ( * BITS_PER_UNIT); + if (!rtx_equal_p (operands[3], operands[1])) +emit_move_insn (operands[3], operands[1]); + if (bits > 0) +emit_insn (gen_ashl3 (operands[3], operands[3], GEN_INT (bits))); + ix86_expand_clear (operands[0]); + DONE; +}) + (define_insn "x86_64_shld" [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m") (ior:DI (ashift:DI (match_dup 0) diff --git a/gcc/testsuite/gcc.target/i386/ashldi3-1.c b/gcc/testsuite/gcc.target/i386/ashldi3-1.c new file mode 100644 index 000..b61d63b --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/ashldi3-1.c @@ -0,0 +1,16 @@ +/* { dg-do compile { target ia32 } } */ +/* { dg-options "-O2" } */ + +long long foo(long long x, int y) +{ + lon
PR82943 - Suggested patch to fix
Hello, I am new to the GFortran community. Over the past two weeks I created a patch that should fix PR82943 for GFortran. I have attached it to this email. The patch allows the code below to compile successfully. I am working on creating test cases next, but I am new to the process so it may take me some time. After I make test cases, do I email them to you as well? Do I need to make a pull-request on github in order to get the patch reviewed? Thank you, Alexander Westbrooks module testmod public :: foo type, public :: tough_lvl_0(a, b) integer, kind :: a = 1 integer, len :: b contains procedure :: foo end type type, public, EXTENDS(tough_lvl_0) :: tough_lvl_1 (c) integer, len :: c contains procedure :: bar end type type, public, EXTENDS(tough_lvl_1) :: tough_lvl_2 (d) integer, len :: d contains procedure :: foobar end type contains subroutine foo(this) class(tough_lvl_0(1,*)), intent(inout) :: this end subroutine subroutine bar(this) class(tough_lvl_1(1,*,*)), intent(inout) :: this end subroutine subroutine foobar(this) class(tough_lvl_2(1,*,*,*)), intent(inout) :: this end subroutine end module PROGRAM testprogram USE testmod TYPE(tough_lvl_0(1,5)) :: test_pdt_0 TYPE(tough_lvl_1(1,5,6)) :: test_pdt_1 TYPE(tough_lvl_2(1,5,6,7)) :: test_pdt_2 CALL test_pdt_0%foo() CALL test_pdt_1%foo() CALL test_pdt_1%bar() CALL test_pdt_2%foo() CALL test_pdt_2%bar() CALL test_pdt_2%foobar() END PROGRAM testprogram 0001-bug-patch-PR82943.patch Description: Binary data
[x86_64 PATCH] Handle SUBREG conversions in TImode STV (for ptest).
This patch teaches i386's STV pass how to handle SUBREG conversions, i.e. that a TImode SUBREG can be transformed into a V1TImode SUBREG, without worrying about other DEFs and USEs. A motivating example where this is useful is typedef long long __m128i __attribute__ ((__vector_size__ (16))); int foo (__m128i x, __m128i y) { return (__int128)x == (__int128)y; } where with -O2 -msse4 we can now scalar-to-vector transform: (insn 7 4 8 2 (set (reg:CCZ 17 flags) (compare:CCZ (subreg:TI (reg/v:V2DI 86 [ x ]) 0) (subreg:TI (reg/v:V2DI 87 [ y ]) 0))) {*cmpti_doubleword} into (insn 17 4 7 2 (set (reg:V1TI 91) (xor:V1TI (subreg:V1TI (reg/v:V2DI 86 [ x ]) 0) (subreg:V1TI (reg/v:V2DI 87 [ y ]) 0))) (nil)) (insn 7 17 8 2 (set (reg:CCZ 17 flags) (unspec:CCZ [ (reg:V1TI 91) repeated x2 ] UNSPEC_PTEST)) {*sse4_1_ptestv1ti} (expr_list:REG_DEAD (reg/v:V2DI 87 [ y ]) (expr_list:REG_DEAD (reg/v:V2DI 86 [ x ]) (nil with the dramatic effect that the assembly output before: foo:movaps %xmm0, -40(%rsp) movq-32(%rsp), %rdx movq%xmm0, %rax movq%xmm1, %rsi movaps %xmm1, -24(%rsp) movq-16(%rsp), %rcx xorq%rsi, %rax xorq%rcx, %rdx orq %rdx, %rax sete%al movzbl %al, %eax ret now becomes foo:pxor%xmm1, %xmm0 xorl%eax, %eax ptest %xmm0, %xmm0 sete%al ret i.e. a 128-bit vector doesn't need to be transferred to the scalar unit to be tested for equality. The new test case includes additional related examples that show similar improvements. Previously we explicitly checked *cmpti_doubleword operands to be either immediate constants, or a TImode REG or a TImode MEM. By enhancing this to allow a TImode SUBREG, we now handle everything that would match the general_operand predicate, making this part of STV more like other RTL passes (lra/reload). The big change is that unlike a regular DF USE, a SUBREG USE doesn't require us to analyze and convert the rest of the DEF-USE chain. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-06-24 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (scalar_chain:add_insn): Don't call analyze_register_chain if the USE is a SUBREG. (timode_scalar_chain::convert_op): Call gen_lowpart to convert TImode SUBREGs to V1TImode SUBREGs. (convertible_comparison_p): We can now handle all general_operands of *cmp_doubleword. (timode_remove_non_convertible_regs): We only need to check TImode uses that aren't TImode SUBREGs of registers in other modes. gcc/testsuite/ChangeLog * gcc.target/i386/sse4_1-ptest-7.c: New test case. Thanks in advance, Roger -- diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc index 4a3b07a..6e9ba54 100644 --- a/gcc/config/i386/i386-features.cc +++ b/gcc/config/i386/i386-features.cc @@ -449,7 +449,8 @@ scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid, return true; for (ref = DF_INSN_UID_USES (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref)) -if (!DF_REF_REG_MEM_P (ref)) +if (DF_REF_TYPE (ref) == DF_REF_REG_USE + && !SUBREG_P (DF_REF_REG (ref))) if (!analyze_register_chain (candidates, ref, disallowed)) return false; @@ -1621,7 +1622,8 @@ timode_scalar_chain::convert_op (rtx *op, rtx_insn *insn) else { gcc_assert (SUBREG_P (*op)); - gcc_assert (GET_MODE (*op) == vmode); + if (GET_MODE (*op) != V1TImode) + *op = gen_lowpart (V1TImode, *op); } } @@ -1912,12 +1914,8 @@ convertible_comparison_p (rtx_insn *insn, enum machine_mode mode) rtx op2 = XEXP (src, 1); /* *cmp_doubleword. */ - if ((CONST_SCALAR_INT_P (op1) - || ((REG_P (op1) || MEM_P (op1)) - && GET_MODE (op1) == mode)) - && (CONST_SCALAR_INT_P (op2) - || ((REG_P (op2) || MEM_P (op2)) - && GET_MODE (op2) == mode))) + if (general_operand (op1, mode) + && general_operand (op2, mode)) return true; /* *testti_doubleword. */ @@ -2244,8 +2242,9 @@ timode_remove_non_convertible_regs (bitmap candidates) DF_REF_REGNO (ref)); FOR_EACH_INSN_USE (ref, insn) - if (!DF_REF_REG_MEM_P (ref) - && GET_MODE (DF_REF_REG (ref)) == TImode) + if (DF_REF_TYPE (ref) == DF_REF_REG_USE + && GET_MODE (DF_REF_REG (ref)) == TImode + && !SUBREG_P (DF_REF_REG (ref))) timode_check_non_convertible_regs (candidates, regs, DF_REF_REGNO (ref)); } diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-ptest-7.c b
Re: [PATCH] RISC-V: Split VF iterators for Zvfh(min).
On 6/22/23 07:03, Robin Dapp wrote: Hi, when working on FP widening/narrowing I realized the Zvfhmin handling is not ideal right now: We use the "enabled" insn attribute to disable instructions not available with Zvfhmin (but only with Zvfh). However, "enabled == 0" only disables insn alternatives, in our case all of them when the mode is a HFmode. The insn itself remains available (e.g. for combine to match) and we end up with an insn without alternatives that reload cannot handle --> ICE. The proper solution is to disable the instruction for the respective mode altogether. This patch achieves this by splitting the VF as well as VWEXTF iterators into variants with TARGET_ZVFH and TARGET_VECTOR_ELEN_FP_16 (which is true when either TARGET_ZVFH or TARGET_ZVFHMIN are true). Also, VWCONVERTI, VHF and VHF_LMUL1 need adjustments. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md: VF_AUTO -> VF. * config/riscv/vector-iterators.md: Introduce VF_ZVFHMIN, VWEXTF_ZVFHMIN and use TARGET_ZVFH in VWCONVERTI, VHF and VHF_LMUL1. * config/riscv/vector.md: Use new iterators. OK for the trunk. Thanks for walking everyone through the issues here. jeff
Re: [PATCH v7 0/6] c++, libstdc++: get std::is_object to dispatch to new built-in traits
On Tue, Jun 20, 2023 at 8:32 AM Patrick Palka wrote: > > On Thu, 15 Jun 2023, Ken Matsui via Libstdc++ wrote: > > > Hi, > > > > For those curious about the performance improvements of this patch, I > > conducted a benchmark that instantiates 256k specializations of > > is_object_v based on Patrick's code. You can find the benchmark code > > at this link: > > > > https://github.com/ken-matsui/gcc-benches/blob/main/is_object_benchmark.cc > > > > On my computer, using the gcc HEAD of this patch for a release build, > > the patch with -DUSE_BUILTIN took 64% less time and used 44-47% less > > memory compared to not using it. > > That's more like it :D Though the benchmark should also invoke the > trait on non-object types too, e.g. Instantiator& or Instantiator(int). Here is the updated benchmark: https://github.com/ken-matsui/gcc-benches/blob/main/is_object.md#sat-jun-24-080110-am-pdt-2023 Time: -74.7544% Peak Memory Usage: -62.5913% Total Memory Usage: -64.2708% > > > > Sincerely, > > Ken Matsui > > > > On Mon, Jun 12, 2023 at 3:49 PM Ken Matsui > > wrote: > > > > > > Hi, > > > > > > This patch series gets std::is_object to dispatch to built-in traits and > > > implements the following built-in traits, on which std::object depends. > > > > > > * __is_reference > > > * __is_function > > > * __is_void > > > > > > std::is_object was depending on them with disjunction and negation. > > > > > > __not_<__or_, is_reference<_Tp>, is_void<_Tp>>>::type > > > > > > Therefore, this patch uses them directly instead of implementing an > > > additional > > > built-in trait __is_object, which makes the compiler slightly bigger and > > > slower. > > > > > > __bool_constant > > __is_void(_Tp))> > > > > > > This would instantiate only __bool_constant and > > > __bool_constant, > > > which can be mostly shared. That is, the purpose of built-in traits is > > > considered as achieved. > > > > > > Changes in v7 > > > > > > * Removed an unnecessary new line. > > > > > > Ken Matsui (6): > > > c++: implement __is_reference built-in trait > > > libstdc++: use new built-in trait __is_reference for std::is_reference > > > c++: implement __is_function built-in trait > > > libstdc++: use new built-in trait __is_function for std::is_function > > > c++, libstdc++: implement __is_void built-in trait > > > libstdc++: make std::is_object dispatch to new built-in traits > > > > > > gcc/cp/constraint.cc | 9 +++ > > > gcc/cp/cp-trait.def | 3 + > > > gcc/cp/semantics.cc | 12 > > > gcc/testsuite/g++.dg/ext/has-builtin-1.C | 9 +++ > > > gcc/testsuite/g++.dg/ext/is_function.C| 58 +++ > > > gcc/testsuite/g++.dg/ext/is_reference.C | 34 +++ > > > gcc/testsuite/g++.dg/ext/is_void.C| 35 +++ > > > gcc/testsuite/g++.dg/tm/pr46567.C | 6 +- > > > libstdc++-v3/include/bits/cpp_type_traits.h | 15 - > > > libstdc++-v3/include/debug/helper_functions.h | 5 +- > > > libstdc++-v3/include/std/type_traits | 51 > > > 11 files changed, 216 insertions(+), 21 deletions(-) > > > create mode 100644 gcc/testsuite/g++.dg/ext/is_function.C > > > create mode 100644 gcc/testsuite/g++.dg/ext/is_reference.C > > > create mode 100644 gcc/testsuite/g++.dg/ext/is_void.C > > > > > > -- > > > 2.41.0 > > > > > > >
Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.
On 6/21/23 02:14, Wang, Yanzhang wrote: Hi Jeff, sorry for the late reply. The long branch handling is done at the assembler level. So the clobbering of $ra isn't visible to the compiler. Thus the compiler has to be extremely careful to not hold values in $ra because the assembler may clobber $ra. If assembler will modify the $ra behavior, it seems the rules we defined in the riscv.cc will be ignored. For example, the $ra saving generated by this patch may be modified by the assmebler and all others depends on it will be wrong. So implementing the long jump in the compiler is better. Basically correct. The assembler potentially clobbers $ra. That's why in the long jump patches $ra becomes a fixed register -- the compiler doesn't know when it's clobbered by the assembler. Even if this were done in the compiler, we'd still have to do something special with $ra. The point at which decisions about register allocation and such are made is before the point where we know the final positions of jumps/labels. It's a classic problem in GCC's design. If you're not going to use dwarf, then my recommendation is to ensure that the data you need is *always* available in the stack at known offsets. That will mean your code isn't optimized as well. It means hand written assembly code has to follow the conventions, you can't link against libraries that do not follow those conventions, etc etc. But that's the price you pay for not using dwarf (or presumably ORC/SFRAME which I haven't studied in detail). Yes. That's right. All the libraries need to follow the same logic. But as you said, this is the price if we choose this solution. And fortunately, this will only be used in special scenarios. The key point is you want the location of the return pointer to be consistent in every function and you want to know that every function has a frame pointer. Otherwise you end up having to either consult on-the-side tables (at which point you might as well look at ORC/SFRAME) or disassembling code in the executable to deduce where to find fp, ra, etc (which is a path to madness). Thus for the usage scenario you're looking at, I would recommend always having a frame pointer, every function, no matter how trivial and that $ra always be saved into a suitable slot relative to the frame pointer, again, no matter how trivial the function. And Jeff, do you have any other comments about this patch? Should we add some descriptions somewhere in the doc? We may need to adjust the documentation a bit since I think I'm suggesting slight changes in the behavior of existing -m options. I'd like to see an updated patch before commenting further on implementation details. jeff
Re: [PATCH V1] RISC-V:Add float16 tuple type abi
On 6/21/23 01:46, juzhe.zh...@rivai.ai wrote: LGTM. Thanks. OK from me as well. jeff
Re: [PATCH v2 1/2] c++: implement __is_volatile built-in trait
Here is the benchmark result for is_volatile: https://github.com/ken-matsui/gcc-benches/blob/main/is_volatile.md#sat-jun-24-074036-am-pdt-2023 Time: -2.42335% Peak Memory Usage: -1.07651% Total Memory Usage: -1.62369% On Sat, Jun 24, 2023 at 7:24 AM Ken Matsui wrote: > > This patch implements built-in trait for std::is_volatile. > > gcc/cp/ChangeLog: > > * cp-trait.def: Define __is_volatile. > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_VOLATILE. > * semantics.cc (trait_expr_value): Likewise. > (finish_trait_expr): Likewise. > > gcc/testsuite/ChangeLog: > > * g++.dg/ext/has-builtin-1.C: Test existence of __is_volatile. > * g++.dg/ext/is_volatile.C: New test. > > Signed-off-by: Ken Matsui > --- > gcc/cp/constraint.cc | 3 +++ > gcc/cp/cp-trait.def | 1 + > gcc/cp/semantics.cc | 4 > gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ > gcc/testsuite/g++.dg/ext/is_volatile.C | 19 +++ > 5 files changed, 30 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc > index 8cf0f2d0974..e971d67ee25 100644 > --- a/gcc/cp/constraint.cc > +++ b/gcc/cp/constraint.cc > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) > case CPTK_IS_UNION: >inform (loc, " %qT is not a union", t1); >break; > +case CPTK_IS_VOLATILE: > + inform (loc, " %qT is not a volatile type", t1); > + break; > case CPTK_IS_AGGREGATE: >inform (loc, " %qT is not an aggregate", t1); >break; > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def > index 8b7fece0cc8..414b1065a11 100644 > --- a/gcc/cp/cp-trait.def > +++ b/gcc/cp/cp-trait.def > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, > "__is_trivially_assignable", 2) > DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", > -1) > DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) > DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) > +DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1) > DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, > "__reference_constructs_from_temporary", 2) > DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, > "__reference_converts_from_temporary", 2) > /* FIXME Added space to avoid direct usage in GCC 13. */ > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc > index 8fb47fd179e..10934d01504 100644 > --- a/gcc/cp/semantics.cc > +++ b/gcc/cp/semantics.cc > @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, > tree type2) > case CPTK_IS_ENUM: >return type_code1 == ENUMERAL_TYPE; > > +case CPTK_IS_VOLATILE: > + return CP_TYPE_VOLATILE_P (type1); > + > case CPTK_IS_FINAL: >return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); > > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind > kind, tree type1, tree type2) > case CPTK_IS_ENUM: > case CPTK_IS_UNION: > case CPTK_IS_SAME: > +case CPTK_IS_VOLATILE: >break; > > case CPTK_IS_LAYOUT_COMPATIBLE: > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > index f343e153e56..7ad640f141b 100644 > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > @@ -146,3 +146,6 @@ > #if !__has_builtin (__remove_cvref) > # error "__has_builtin (__remove_cvref) failed" > #endif > +#if !__has_builtin (__is_volatile) > +# error "__has_builtin (__is_volatile) failed" > +#endif > diff --git a/gcc/testsuite/g++.dg/ext/is_volatile.C > b/gcc/testsuite/g++.dg/ext/is_volatile.C > new file mode 100644 > index 000..004e397e5e7 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/ext/is_volatile.C > @@ -0,0 +1,19 @@ > +// { dg-do compile { target c++11 } } > + > +#include > + > +using namespace __gnu_test; > + > +#define SA(X) static_assert((X),#X) > + > +// Positive tests. > +SA(__is_volatile(volatile int)); > +SA(__is_volatile(const volatile int)); > +SA(__is_volatile(vClassType)); > +SA(__is_volatile(cvClassType)); > + > +// Negative tests. > +SA(!__is_volatile(int)); > +SA(!__is_volatile(const int)); > +SA(!__is_volatile(ClassType)); > +SA(!__is_volatile(cClassType)); > -- > 2.41.0 >
Re: [PATCH] GIMPLE_FOLD: Apply LEN_MASK_{LOAD, STORE} into GIMPLE_FOLD
On 6/23/23 07:48, juzhe.zh...@rivai.ai wrote: From: Ju-Zhe Zhong Hi, since we are going to have LEN_MASK_{LOAD,STORE} into loopVectorizer. Currenly, 1. we can fold MASK_{LOAD,STORE} into MEM when mask is all ones. 2. we can fold LEN_{LOAD,STORE} into MEM when (len - bias) is VF. Now, I think it makes sense that we can support fold LEN_MASK_{LOAD,STORE} into MEM when both mask = all ones and (len - bias) is VF. gcc/ChangeLog: * gimple-fold.cc (arith_overflowed_p): Apply LEN_MASK_{LOAD,STORE}. (gimple_fold_partial_load_store_mem_ref): Ditto. (gimple_fold_partial_store): Ditto. (gimple_fold_call): Ditto. OK jeff
RE: [PATCH] RISC-V: Refactor the integer ternary autovec pattern
Committed, thanks Jeff. Pan -Original Message- From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches Sent: Saturday, June 24, 2023 10:04 PM To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org Cc: kito.ch...@sifive.com; pal...@rivosinc.com; rdapp@gmail.com Subject: Re: [PATCH] RISC-V: Refactor the integer ternary autovec pattern On 6/21/23 16:38, Juzhe-Zhong wrote: > Long time ago, I encounter ICE when trying to set clobber register as Pmode > and I forgot the reason. > > So, I clobber SI scratch and PUT_MODE to make it Pmode after reload which > makes patterns look unreasonable. > > According to Jeff's comments, I tried it again, it works now when we try to > set clobber register as Pmode and the patterns look more reasonable now. > > The tests are all passed, Ok for trunk. > > gcc/ChangeLog: > > * config/riscv/autovec.md (*fma): set clobber to Pmode in > expand stage. > (*fma): Ditto. > (*fnma): Ditto. > (*fnma): Ditto. OK jeff
RE: [PATCH V3] RISC-V: Support RVV floating-point auto-vectorization
Committed, thanks Jeff. Pan -Original Message- From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches Sent: Saturday, June 24, 2023 10:06 PM To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org Cc: kito.ch...@sifive.com; pal...@rivosinc.com; rdapp@gmail.com Subject: Re: [PATCH V3] RISC-V: Support RVV floating-point auto-vectorization On 6/21/23 09:53, Juzhe-Zhong wrote: > This patch adds RVV floating-point auto-vectorization. > Also, fix attribute bug of floating-point ternary operations in vector.md. > > gcc/ChangeLog: > > * config/riscv/autovec.md (fma4): New pattern. > (*fma): Ditto. > (fnma4): Ditto. > (*fnma): Ditto. > (fms4): Ditto. > (*fms): Ditto. > (fnms4): Ditto. > (*fnms): Ditto. > * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New > function. > * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto. > * config/riscv/vector.md: Fix attribute bug. OK. Thanks for digging into that clobber issue. Jeff
Re: [PATCH v2] RISC-V: Implement autovec copysign.
On 6/21/23 08:24, 钟居哲 wrote: LGTM. Likewise. OK for the trunk. jeff
Re: [PATCH][RFC] middle-end/110237 - wrong MEM_ATTRs for partial loads/stores
On 6/22/23 00:39, Richard Biener wrote: I suspect there's no way to specify the desired semantics? OTOH code that looks at the MEM operand only and not the insn (which should have some UNSPEC wrapped) needs to be conservative, so maybe the alias code shouldn't assume that a (mem:V16SI ..) actually performs an access of the size of V16SI at the specified location? I'm not aware of a way to express the semantics fully right now. We'd need some way to indicate that the MEM is a partial and pass along the actual length. We could do both through MEM_ATTRS with some work. For example we could declare that for vector modes full semantic information is carried in the MEM_ATTRS rather than by the mode itself. So it falls into a space between how we currently think of something like V16SI and BLK. The mode specifies a maximum size and how to interpret the elements. But actual size and perhaps mask info would be found in MEM_ATTRS. jeff
[PATCH v2 2/2] libstdc++: use new built-in trait __is_volatile
This patch lets libstdc++ use new built-in trait __is_volatile. libstdc++-v3/ChangeLog: * include/std/type_traits (is_volatile): Use __is_volatile built-in trait. (is_volatile_v): Likewise. Signed-off-by: Ken Matsui --- libstdc++-v3/include/std/type_traits | 13 + 1 file changed, 13 insertions(+) diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits index 0e7a9c9c7f3..db74b884b35 100644 --- a/libstdc++-v3/include/std/type_traits +++ b/libstdc++-v3/include/std/type_traits @@ -773,6 +773,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION : public true_type { }; /// is_volatile +#if __has_builtin(__is_volatile) + template +struct is_volatile +: public __bool_constant<__is_volatile(_Tp)> +{ }; +#else template struct is_volatile : public false_type { }; @@ -780,6 +786,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template struct is_volatile<_Tp volatile> : public true_type { }; +#endif /// is_trivial template @@ -3214,10 +3221,16 @@ template inline constexpr bool is_const_v = false; template inline constexpr bool is_const_v = true; + +#if __has_builtin(__is_volatile) +template + inline constexpr bool is_volatile_v = __is_volatile(_Tp); +#else template inline constexpr bool is_volatile_v = false; template inline constexpr bool is_volatile_v = true; +#endif template inline constexpr bool is_trivial_v = __is_trivial(_Tp); -- 2.41.0
Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()
On 6/22/23 07:42, Jan Hubicka wrote: On 6/22/23 00:31, Richard Biener wrote: I think there's a difference in that __builtin_trap () is observable while __builtin_unreachable () is not and reaching __builtin_unreachable () invokes undefined behavior while reaching __builtin_trap () does not. So the isolation code marking the trapping code volatile should be enough and the trap () is just there to end the basic block (and maybe be on the safe side to really trap). Agreed WRT observability -- but that's not really the point of the trap and if we wanted we could change that behavior. The trap is there to halt execution immediately rather than letting it keep running. That was a design decision from a security standpoint. If we've detected that we're executing undefined behavior, stop rather than potentially letting a malicious actor turn a bug into an exploit. Also as discussed some time ago, the volatile loads between traps has effect of turning previously pure/const functions into non-const which is somewhat sad, so it is still on my todo list to change it this stage1 to something more careful. We discussed internal functions trap_store and trap_load which will expand to load/store + trap but will make it clear that side effect does not count to modref. It's been a long time since I looked at this code -- isn't it the case that we already must have had a load/store and that all we've done is change its form (to enable more DCE) and added the volatile marker? Meaning that it couldn't have been pure/cost before, could it? Or is it the case that you want to not have the erroneous path be the sole reason to spoil pure/const detection -- does that happen often in practice? jeff
[PATCH v2 1/2] c++: implement __is_volatile built-in trait
This patch implements built-in trait for std::is_volatile. gcc/cp/ChangeLog: * cp-trait.def: Define __is_volatile. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_VOLATILE. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __is_volatile. * g++.dg/ext/is_volatile.C: New test. Signed-off-by: Ken Matsui --- gcc/cp/constraint.cc | 3 +++ gcc/cp/cp-trait.def | 1 + gcc/cp/semantics.cc | 4 gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ gcc/testsuite/g++.dg/ext/is_volatile.C | 19 +++ 5 files changed, 30 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index 8cf0f2d0974..e971d67ee25 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) case CPTK_IS_UNION: inform (loc, " %qT is not a union", t1); break; +case CPTK_IS_VOLATILE: + inform (loc, " %qT is not a volatile type", t1); + break; case CPTK_IS_AGGREGATE: inform (loc, " %qT is not an aggregate", t1); break; diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def index 8b7fece0cc8..414b1065a11 100644 --- a/gcc/cp/cp-trait.def +++ b/gcc/cp/cp-trait.def @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, "__is_trivially_assignable", 2) DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1) DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) +DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1) DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, "__reference_constructs_from_temporary", 2) DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, "__reference_converts_from_temporary", 2) /* FIXME Added space to avoid direct usage in GCC 13. */ diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 8fb47fd179e..10934d01504 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_ENUM: return type_code1 == ENUMERAL_TYPE; +case CPTK_IS_VOLATILE: + return CP_TYPE_VOLATILE_P (type1); + case CPTK_IS_FINAL: return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_ENUM: case CPTK_IS_UNION: case CPTK_IS_SAME: +case CPTK_IS_VOLATILE: break; case CPTK_IS_LAYOUT_COMPATIBLE: diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C index f343e153e56..7ad640f141b 100644 --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C @@ -146,3 +146,6 @@ #if !__has_builtin (__remove_cvref) # error "__has_builtin (__remove_cvref) failed" #endif +#if !__has_builtin (__is_volatile) +# error "__has_builtin (__is_volatile) failed" +#endif diff --git a/gcc/testsuite/g++.dg/ext/is_volatile.C b/gcc/testsuite/g++.dg/ext/is_volatile.C new file mode 100644 index 000..004e397e5e7 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/is_volatile.C @@ -0,0 +1,19 @@ +// { dg-do compile { target c++11 } } + +#include + +using namespace __gnu_test; + +#define SA(X) static_assert((X),#X) + +// Positive tests. +SA(__is_volatile(volatile int)); +SA(__is_volatile(const volatile int)); +SA(__is_volatile(vClassType)); +SA(__is_volatile(cvClassType)); + +// Negative tests. +SA(!__is_volatile(int)); +SA(!__is_volatile(const int)); +SA(!__is_volatile(ClassType)); +SA(!__is_volatile(cClassType)); -- 2.41.0
Re: [PATCH v2 1/2] c++: implement __is_array built-in trait
Here is the benchmark result for is_array: https://github.com/ken-matsui/gcc-benches/blob/main/is_array.md#sat-jun-24-070630-am-pdt-2023 Time: -15.511% Peak Memory Usage: +0.173923% Total Memory Usage: -6.2037% On Sat, Jun 24, 2023 at 6:54 AM Ken Matsui wrote: > > This patch implements built-in trait for std::is_array. > > gcc/cp/ChangeLog: > > * cp-trait.def: Define __is_array. > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARRAY. > * semantics.cc (trait_expr_value): Likewise. > (finish_trait_expr): Likewise. > > gcc/testsuite/ChangeLog: > > * g++.dg/ext/has-builtin-1.C: Test existence of __is_array. > * g++.dg/ext/is_array.C: New test. > > Signed-off-by: Ken Matsui > --- > gcc/cp/constraint.cc | 3 +++ > gcc/cp/cp-trait.def | 1 + > gcc/cp/semantics.cc | 4 > gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ > gcc/testsuite/g++.dg/ext/is_array.C | 28 > 5 files changed, 39 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/ext/is_array.C > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc > index 8cf0f2d0974..7cec7eba591 100644 > --- a/gcc/cp/constraint.cc > +++ b/gcc/cp/constraint.cc > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) > case CPTK_IS_UNION: >inform (loc, " %qT is not a union", t1); >break; > +case CPTK_IS_ARRAY: > + inform (loc, " %qT is not an array", t1); > + break; > case CPTK_IS_AGGREGATE: >inform (loc, " %qT is not an aggregate", t1); >break; > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def > index 8b7fece0cc8..f68c7f2e8ec 100644 > --- a/gcc/cp/cp-trait.def > +++ b/gcc/cp/cp-trait.def > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, > "__is_trivially_assignable", 2) > DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", > -1) > DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) > DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) > +DEFTRAIT_EXPR (IS_ARRAY, "__is_array", 1) > DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, > "__reference_constructs_from_temporary", 2) > DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, > "__reference_converts_from_temporary", 2) > /* FIXME Added space to avoid direct usage in GCC 13. */ > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc > index 8fb47fd179e..22f2700ec0b 100644 > --- a/gcc/cp/semantics.cc > +++ b/gcc/cp/semantics.cc > @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, > tree type2) > case CPTK_IS_UNION: >return type_code1 == UNION_TYPE; > > +case CPTK_IS_ARRAY: > + return type_code1 == ARRAY_TYPE; > + > case CPTK_IS_ASSIGNABLE: >return is_xible (MODIFY_EXPR, type1, type2); > > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind > kind, tree type1, tree type2) > case CPTK_IS_ENUM: > case CPTK_IS_UNION: > case CPTK_IS_SAME: > +case CPTK_IS_ARRAY: >break; > > case CPTK_IS_LAYOUT_COMPATIBLE: > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > index f343e153e56..56485ae62be 100644 > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > @@ -146,3 +146,6 @@ > #if !__has_builtin (__remove_cvref) > # error "__has_builtin (__remove_cvref) failed" > #endif > +#if !__has_builtin (__is_array) > +# error "__has_builtin (__is_array) failed" > +#endif > diff --git a/gcc/testsuite/g++.dg/ext/is_array.C > b/gcc/testsuite/g++.dg/ext/is_array.C > new file mode 100644 > index 000..facfed5c7cb > --- /dev/null > +++ b/gcc/testsuite/g++.dg/ext/is_array.C > @@ -0,0 +1,28 @@ > +// { dg-do compile { target c++11 } } > + > +#include > + > +using namespace __gnu_test; > + > +#define SA(X) static_assert((X),#X) > +#define SA_TEST_CATEGORY(TRAIT, X, expect) \ > + SA(TRAIT(X) == expect); \ > + SA(TRAIT(const X) == expect);\ > + SA(TRAIT(volatile X) == expect); \ > + SA(TRAIT(const volatile X) == expect) > + > +SA_TEST_CATEGORY(__is_array, int[2], true); > +SA_TEST_CATEGORY(__is_array, int[], true); > +SA_TEST_CATEGORY(__is_array, int[2][3], true); > +SA_TEST_CATEGORY(__is_array, int[][3], true); > +SA_TEST_CATEGORY(__is_array, float*[2], true); > +SA_TEST_CATEGORY(__is_array, float*[], true); > +SA_TEST_CATEGORY(__is_array, float*[2][3], true); > +SA_TEST_CATEGORY(__is_array, float*[][3], true); > +SA_TEST_CATEGORY(__is_array, ClassType[2], true); > +SA_TEST_CATEGORY(__is_array, ClassType[], true); > +SA_TEST_CATEGORY(__is_array, ClassType[2][3], true); > +SA_TEST_CATEGORY(__is_array, ClassType[][3], true); > + > +// Sanity check. > +SA_TEST_CATEGORY(__is_array, ClassType, false); > -- > 2.41.0 >
Re: [PATCH] SSA ALIAS: Apply LEN_MASK_STORE to 'ref_maybe_used_by_call_p_1'
On 6/23/23 17:20, 钟居哲 wrote: Not sure since I saw MASK_STORE/LEN_STORE didn't compute size. Also OK after a re-review on my part. The code sets the size to -1 after doing the ao_ref_init_from_ptr_and_size, meaning it's not a known size. jeff
Re: [PATCH] SSA ALIAS: Apply LEN_MASK_{LOAD, STORE} into SSA alias analysis
On 6/23/23 17:21, 钟居哲 wrote: Not sure since I saw MASK_STORE/LEN_STORE didn't compute size. Yea, I think you're right. We take the size from the LHS. My mistake. This is fine for the trunk. jeff
Re: [PATCH V3] RISC-V: Support RVV floating-point auto-vectorization
On 6/21/23 09:53, Juzhe-Zhong wrote: This patch adds RVV floating-point auto-vectorization. Also, fix attribute bug of floating-point ternary operations in vector.md. gcc/ChangeLog: * config/riscv/autovec.md (fma4): New pattern. (*fma): Ditto. (fnma4): Ditto. (*fnma): Ditto. (fms4): Ditto. (*fms): Ditto. (fnms4): Ditto. (*fnms): Ditto. * config/riscv/riscv-protos.h (emit_vlmax_fp_ternary_insn): New function. * config/riscv/riscv-v.cc (emit_vlmax_fp_ternary_insn): Ditto. * config/riscv/vector.md: Fix attribute bug. OK. Thanks for digging into that clobber issue. Jeff
Re: [PATCH] RISC-V: Refactor the integer ternary autovec pattern
On 6/21/23 16:38, Juzhe-Zhong wrote: Long time ago, I encounter ICE when trying to set clobber register as Pmode and I forgot the reason. So, I clobber SI scratch and PUT_MODE to make it Pmode after reload which makes patterns look unreasonable. According to Jeff's comments, I tried it again, it works now when we try to set clobber register as Pmode and the patterns look more reasonable now. The tests are all passed, Ok for trunk. gcc/ChangeLog: * config/riscv/autovec.md (*fma): set clobber to Pmode in expand stage. (*fma): Ditto. (*fnma): Ditto. (*fnma): Ditto. OK jeff
Re: [PATCH v2 1/3] c++: Track lifetimes in constant evaluation [PR70331, PR96630, PR98675]
On Fri, Jun 23, 2023 at 12:43:21PM -0400, Patrick Palka wrote: > On Wed, 29 Mar 2023, Nathaniel Shead via Gcc-patches wrote: > > > This adds rudimentary lifetime tracking in C++ constexpr contexts, > > allowing the compiler to report errors with using values after their > > backing has gone out of scope. We don't yet handle other ways of ending > > lifetimes (e.g. explicit destructor calls). > > Awesome! > > > > > PR c++/96630 > > PR c++/98675 > > PR c++/70331 > > > > gcc/cp/ChangeLog: > > > > * constexpr.cc (constexpr_global_ctx::put_value): Mark value as > > in lifetime. > > (constexpr_global_ctx::remove_value): Mark value as expired. > > (cxx_eval_call_expression): Remove comment that is no longer > > applicable. > > (non_const_var_error): Add check for expired values. > > (cxx_eval_constant_expression): Add checks for expired values. Forget > > local variables at end of bind expressions. Forget temporaries at end > > of cleanup points. > > * cp-tree.h (struct lang_decl_base): New flag to track expired values > > in constant evaluation. > > (DECL_EXPIRED_P): Access the new flag. > > (SET_DECL_EXPIRED_P): Modify the new flag. > > * module.cc (trees_out::lang_decl_bools): Write out the new flag. > > (trees_in::lang_decl_bools): Read in the new flag. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/cpp0x/constexpr-ice20.C: Update error raised by test. > > * g++.dg/cpp1y/constexpr-lifetime1.C: New test. > > * g++.dg/cpp1y/constexpr-lifetime2.C: New test. > > * g++.dg/cpp1y/constexpr-lifetime3.C: New test. > > * g++.dg/cpp1y/constexpr-lifetime4.C: New test. > > * g++.dg/cpp1y/constexpr-lifetime5.C: New test. > > > > Signed-off-by: Nathaniel Shead > > --- > > gcc/cp/constexpr.cc | 69 +++ > > gcc/cp/cp-tree.h | 10 ++- > > gcc/cp/module.cc | 2 + > > gcc/testsuite/g++.dg/cpp0x/constexpr-ice20.C | 2 +- > > .../g++.dg/cpp1y/constexpr-lifetime1.C| 13 > > .../g++.dg/cpp1y/constexpr-lifetime2.C| 20 ++ > > .../g++.dg/cpp1y/constexpr-lifetime3.C| 13 > > .../g++.dg/cpp1y/constexpr-lifetime4.C| 11 +++ > > .../g++.dg/cpp1y/constexpr-lifetime5.C| 11 +++ > > 9 files changed, 137 insertions(+), 14 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime1.C > > create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime2.C > > create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime3.C > > create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime4.C > > create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-lifetime5.C > > > > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc > > index 3de60cfd0f8..bdbc12144a7 100644 > > --- a/gcc/cp/constexpr.cc > > +++ b/gcc/cp/constexpr.cc > > @@ -1185,10 +1185,22 @@ public: > >void put_value (tree t, tree v) > >{ > > bool already_in_map = values.put (t, v); > > +if (!already_in_map && DECL_P (t)) > > + { > > + if (!DECL_LANG_SPECIFIC (t)) > > + retrofit_lang_decl (t); > > + if (DECL_LANG_SPECIFIC (t)) > > + SET_DECL_EXPIRED_P (t, false); > > + } > > Since this new flag would only be used only during constexpr evaluation, > could we instead use an on-the-side hash_set in constexpr_global_ctx for > tracking expired-ness? That way we won't have to allocate a > DECL_LANG_SPECIFIC structure for decls that lack it, and won't have to > worry about the flag in other parts of the compiler. I've tried this but I haven't been able to get it to work well. The main issue I'm running into is the caching of function calls in constant evaluation. For example, consider the following: constexpr const double& test() { const double& local = 3.0; return local; } constexpr int foo(const double&) { return 5; } constexpr int a = foo(test()); static_assert(test() == 3.0); When constant-evaluating 'a', we evaluate 'test()'. It returns a value that ends its lifetime immediately, so we mark this in 'ctx->global' as expired. However, 'foo()' never actually evaluates this expired value, so the initialisation of 'a' succeeds. However, then when the static assertion attempts to constant evaluate a second time, the result of 'test' has already been cached, and we just get directly handed a value. This is a new constant evaluation, so 'ctx->global' has been reset, and because we just got the result of the cached function we don't actually know whether this is expired or not anymore, and so this compiles without any error in case it was valid. I haven't yet been able to come up with a good way of avoiding this issue without complicating the caching of call expressions overly much. I suppose I could add an extra field to 'constexpr_call' to track if the value has already been expired (which would solve this particular ca
[PATCH v2 2/2] libstdc++: use new built-in trait __is_array
This patch lets libstdc++ use new built-in trait __is_array. libstdc++-v3/ChangeLog: * include/std/type_traits (is_array): Use __is_array built-in trait. (is_array_v): Likewise. Signed-off-by: Ken Matsui --- libstdc++-v3/include/std/type_traits | 12 1 file changed, 12 insertions(+) diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits index 0e7a9c9c7f3..f2a3a327e7d 100644 --- a/libstdc++-v3/include/std/type_traits +++ b/libstdc++-v3/include/std/type_traits @@ -503,6 +503,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { }; /// is_array +#if __has_builtin(__is_array) + template +struct is_array +: public __bool_constant<__is_array(_Tp)> +{ }; +#else template struct is_array : public false_type { }; @@ -514,6 +520,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template struct is_array<_Tp[]> : public true_type { }; +#endif template struct __is_pointer_helper @@ -3161,12 +3168,17 @@ template template inline constexpr bool is_floating_point_v = is_floating_point<_Tp>::value; +#if __has_builtin(__is_array) +template + inline constexpr bool is_array_v = __is_array(_Tp); +#else template inline constexpr bool is_array_v = false; template inline constexpr bool is_array_v<_Tp[]> = true; template inline constexpr bool is_array_v<_Tp[_Num]> = true; +#endif template inline constexpr bool is_pointer_v = is_pointer<_Tp>::value; -- 2.41.0
[PATCH v2 1/2] c++: implement __is_array built-in trait
This patch implements built-in trait for std::is_array. gcc/cp/ChangeLog: * cp-trait.def: Define __is_array. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARRAY. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __is_array. * g++.dg/ext/is_array.C: New test. Signed-off-by: Ken Matsui --- gcc/cp/constraint.cc | 3 +++ gcc/cp/cp-trait.def | 1 + gcc/cp/semantics.cc | 4 gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ gcc/testsuite/g++.dg/ext/is_array.C | 28 5 files changed, 39 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ext/is_array.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index 8cf0f2d0974..7cec7eba591 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) case CPTK_IS_UNION: inform (loc, " %qT is not a union", t1); break; +case CPTK_IS_ARRAY: + inform (loc, " %qT is not an array", t1); + break; case CPTK_IS_AGGREGATE: inform (loc, " %qT is not an aggregate", t1); break; diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def index 8b7fece0cc8..f68c7f2e8ec 100644 --- a/gcc/cp/cp-trait.def +++ b/gcc/cp/cp-trait.def @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, "__is_trivially_assignable", 2) DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1) DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) +DEFTRAIT_EXPR (IS_ARRAY, "__is_array", 1) DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, "__reference_constructs_from_temporary", 2) DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, "__reference_converts_from_temporary", 2) /* FIXME Added space to avoid direct usage in GCC 13. */ diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 8fb47fd179e..22f2700ec0b 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_UNION: return type_code1 == UNION_TYPE; +case CPTK_IS_ARRAY: + return type_code1 == ARRAY_TYPE; + case CPTK_IS_ASSIGNABLE: return is_xible (MODIFY_EXPR, type1, type2); @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_ENUM: case CPTK_IS_UNION: case CPTK_IS_SAME: +case CPTK_IS_ARRAY: break; case CPTK_IS_LAYOUT_COMPATIBLE: diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C index f343e153e56..56485ae62be 100644 --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C @@ -146,3 +146,6 @@ #if !__has_builtin (__remove_cvref) # error "__has_builtin (__remove_cvref) failed" #endif +#if !__has_builtin (__is_array) +# error "__has_builtin (__is_array) failed" +#endif diff --git a/gcc/testsuite/g++.dg/ext/is_array.C b/gcc/testsuite/g++.dg/ext/is_array.C new file mode 100644 index 000..facfed5c7cb --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/is_array.C @@ -0,0 +1,28 @@ +// { dg-do compile { target c++11 } } + +#include + +using namespace __gnu_test; + +#define SA(X) static_assert((X),#X) +#define SA_TEST_CATEGORY(TRAIT, X, expect) \ + SA(TRAIT(X) == expect); \ + SA(TRAIT(const X) == expect);\ + SA(TRAIT(volatile X) == expect); \ + SA(TRAIT(const volatile X) == expect) + +SA_TEST_CATEGORY(__is_array, int[2], true); +SA_TEST_CATEGORY(__is_array, int[], true); +SA_TEST_CATEGORY(__is_array, int[2][3], true); +SA_TEST_CATEGORY(__is_array, int[][3], true); +SA_TEST_CATEGORY(__is_array, float*[2], true); +SA_TEST_CATEGORY(__is_array, float*[], true); +SA_TEST_CATEGORY(__is_array, float*[2][3], true); +SA_TEST_CATEGORY(__is_array, float*[][3], true); +SA_TEST_CATEGORY(__is_array, ClassType[2], true); +SA_TEST_CATEGORY(__is_array, ClassType[], true); +SA_TEST_CATEGORY(__is_array, ClassType[2][3], true); +SA_TEST_CATEGORY(__is_array, ClassType[][3], true); + +// Sanity check. +SA_TEST_CATEGORY(__is_array, ClassType, false); -- 2.41.0
Re: [PATCH v2 1/2] c++: implement __is_const built-in trait
Here is the benchmark result for is_const. https://github.com/ken-matsui/gcc-benches/blob/main/is_const.md#sat-jun-24-044815-am-pdt-2023 Time: -2.86467% Peak Memory Usage: -1.0654% Total Memory Usage: -1.62369% On Sat, Jun 24, 2023 at 6:41 AM Ken Matsui wrote: > > This patch implements built-in trait for std::is_const. > > gcc/cp/ChangeLog: > > * cp-trait.def: Define __is_const. > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST. > * semantics.cc (trait_expr_value): Likewise. > (finish_trait_expr): Likewise. > > gcc/testsuite/ChangeLog: > > * g++.dg/ext/has-builtin-1.C: Test existence of __is_const. > * g++.dg/ext/is_const.C: New test. > > Signed-off-by: Ken Matsui > --- > gcc/cp/constraint.cc | 3 +++ > gcc/cp/cp-trait.def | 1 + > gcc/cp/semantics.cc | 4 > gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ > gcc/testsuite/g++.dg/ext/is_const.C | 19 +++ > 5 files changed, 30 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc > index 8cf0f2d0974..ff4ae831def 100644 > --- a/gcc/cp/constraint.cc > +++ b/gcc/cp/constraint.cc > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) > case CPTK_IS_UNION: >inform (loc, " %qT is not a union", t1); >break; > +case CPTK_IS_CONST: > + inform (loc, " %qT is not a const type", t1); > + break; > case CPTK_IS_AGGREGATE: >inform (loc, " %qT is not an aggregate", t1); >break; > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def > index 8b7fece0cc8..b40b475b86d 100644 > --- a/gcc/cp/cp-trait.def > +++ b/gcc/cp/cp-trait.def > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, > "__is_trivially_assignable", 2) > DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", > -1) > DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) > DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) > +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1) > DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, > "__reference_constructs_from_temporary", 2) > DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, > "__reference_converts_from_temporary", 2) > /* FIXME Added space to avoid direct usage in GCC 13. */ > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc > index 8fb47fd179e..011ba8e46e1 100644 > --- a/gcc/cp/semantics.cc > +++ b/gcc/cp/semantics.cc > @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, > tree type2) > case CPTK_IS_ENUM: >return type_code1 == ENUMERAL_TYPE; > > +case CPTK_IS_CONST: > + return CP_TYPE_CONST_P (type1); > + > case CPTK_IS_FINAL: >return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); > > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind > kind, tree type1, tree type2) > case CPTK_IS_ENUM: > case CPTK_IS_UNION: > case CPTK_IS_SAME: > +case CPTK_IS_CONST: >break; > > case CPTK_IS_LAYOUT_COMPATIBLE: > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > index f343e153e56..965309a333a 100644 > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > @@ -146,3 +146,6 @@ > #if !__has_builtin (__remove_cvref) > # error "__has_builtin (__remove_cvref) failed" > #endif > +#if !__has_builtin (__is_const) > +# error "__has_builtin (__is_const) failed" > +#endif > diff --git a/gcc/testsuite/g++.dg/ext/is_const.C > b/gcc/testsuite/g++.dg/ext/is_const.C > new file mode 100644 > index 000..8f2d7c2fce9 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/ext/is_const.C > @@ -0,0 +1,19 @@ > +// { dg-do compile { target c++11 } } > + > +#include > + > +using namespace __gnu_test; > + > +#define SA(X) static_assert((X),#X) > + > +// Positive tests. > +SA(__is_const(const int)); > +SA(__is_const(const volatile int)); > +SA(__is_const(cClassType)); > +SA(__is_const(cvClassType)); > + > +// Negative tests. > +SA(!__is_const(int)); > +SA(!__is_const(volatile int)); > +SA(!__is_const(ClassType)); > +SA(!__is_const(vClassType)); > -- > 2.41.0 >
[PATCH v2 2/2] libstdc++: use new built-in trait __is_const
This patch lets libstdc++ use new built-in trait __is_const. libstdc++-v3/ChangeLog: * include/std/type_traits (is_const): Use __is_const built-in trait. (is_const_v): Likewise. Signed-off-by: Ken Matsui --- libstdc++-v3/include/std/type_traits | 14 ++ 1 file changed, 14 insertions(+) diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits index 0e7a9c9c7f3..3a46eca5377 100644 --- a/libstdc++-v3/include/std/type_traits +++ b/libstdc++-v3/include/std/type_traits @@ -764,6 +764,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Type properties. /// is_const +#if __has_builtin(__is_const) + template +struct is_const +: public __bool_constant<__is_const(_Tp)> +{ }; +#else template struct is_const : public false_type { }; @@ -771,6 +777,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template struct is_const<_Tp const> : public true_type { }; +#endif /// is_volatile template @@ -3210,10 +3217,17 @@ template inline constexpr bool is_compound_v = is_compound<_Tp>::value; template inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value; + +#if __has_builtin(__is_const) +template + inline constexpr bool is_const_v = __is_const(_Tp); +#else template inline constexpr bool is_const_v = false; template inline constexpr bool is_const_v = true; +#endif + template inline constexpr bool is_volatile_v = false; template -- 2.41.0
[PATCH v2 1/2] c++: implement __is_const built-in trait
This patch implements built-in trait for std::is_const. gcc/cp/ChangeLog: * cp-trait.def: Define __is_const. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __is_const. * g++.dg/ext/is_const.C: New test. Signed-off-by: Ken Matsui --- gcc/cp/constraint.cc | 3 +++ gcc/cp/cp-trait.def | 1 + gcc/cp/semantics.cc | 4 gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ gcc/testsuite/g++.dg/ext/is_const.C | 19 +++ 5 files changed, 30 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index 8cf0f2d0974..ff4ae831def 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) case CPTK_IS_UNION: inform (loc, " %qT is not a union", t1); break; +case CPTK_IS_CONST: + inform (loc, " %qT is not a const type", t1); + break; case CPTK_IS_AGGREGATE: inform (loc, " %qT is not an aggregate", t1); break; diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def index 8b7fece0cc8..b40b475b86d 100644 --- a/gcc/cp/cp-trait.def +++ b/gcc/cp/cp-trait.def @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, "__is_trivially_assignable", 2) DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1) DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1) DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, "__reference_constructs_from_temporary", 2) DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, "__reference_converts_from_temporary", 2) /* FIXME Added space to avoid direct usage in GCC 13. */ diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 8fb47fd179e..011ba8e46e1 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_ENUM: return type_code1 == ENUMERAL_TYPE; +case CPTK_IS_CONST: + return CP_TYPE_CONST_P (type1); + case CPTK_IS_FINAL: return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_ENUM: case CPTK_IS_UNION: case CPTK_IS_SAME: +case CPTK_IS_CONST: break; case CPTK_IS_LAYOUT_COMPATIBLE: diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C index f343e153e56..965309a333a 100644 --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C @@ -146,3 +146,6 @@ #if !__has_builtin (__remove_cvref) # error "__has_builtin (__remove_cvref) failed" #endif +#if !__has_builtin (__is_const) +# error "__has_builtin (__is_const) failed" +#endif diff --git a/gcc/testsuite/g++.dg/ext/is_const.C b/gcc/testsuite/g++.dg/ext/is_const.C new file mode 100644 index 000..8f2d7c2fce9 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/is_const.C @@ -0,0 +1,19 @@ +// { dg-do compile { target c++11 } } + +#include + +using namespace __gnu_test; + +#define SA(X) static_assert((X),#X) + +// Positive tests. +SA(__is_const(const int)); +SA(__is_const(const volatile int)); +SA(__is_const(cClassType)); +SA(__is_const(cvClassType)); + +// Negative tests. +SA(!__is_const(int)); +SA(!__is_const(volatile int)); +SA(!__is_const(ClassType)); +SA(!__is_const(vClassType)); -- 2.41.0
Re: [PATCH] c++: Report invalid id-expression in decltype [PR100482]
On Fri, Jun 23, 2023 at 12:15:32PM -0400, Patrick Palka wrote: > On Sun, 30 Apr 2023, Nathaniel Shead via Gcc-patches wrote: > > > This patch ensures that any errors raised by finish_id_expression when > > parsing a decltype expression are properly reported, rather than > > potentially going ignored and causing invalid code to be accepted. > > > > We can also now remove the separate check for templates without args as > > this is also checked for in finish_id_expression. > > > > PR 100482 > > > > gcc/cp/ChangeLog: > > > > * parser.cc (cp_parser_decltype_expr): Report errors raised by > > finish_id_expression. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/pr100482.C: New test. > > LGTM. Some minor comments about the new testcase below: > > > > > Signed-off-by: Nathaniel Shead > > --- > > gcc/cp/parser.cc| 22 +++--- > > gcc/testsuite/g++.dg/pr100482.C | 11 +++ > > 2 files changed, 22 insertions(+), 11 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/pr100482.C > > > > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc > > index e5f032f2330..20ebcdc3cfd 100644 > > --- a/gcc/cp/parser.cc > > +++ b/gcc/cp/parser.cc > > @@ -16508,10 +16508,6 @@ cp_parser_decltype_expr (cp_parser *parser, > > expr = cp_parser_lookup_name_simple (parser, expr, > > id_expr_start_token->location); > > > > - if (expr && TREE_CODE (expr) == TEMPLATE_DECL) > > - /* A template without args is not a complete id-expression. */ > > - expr = error_mark_node; > > - > >if (expr > >&& expr != error_mark_node > >&& TREE_CODE (expr) != TYPE_DECL > > @@ -16532,13 +16528,17 @@ cp_parser_decltype_expr (cp_parser *parser, > > &error_msg, > >id_expr_start_token->location)); > > > > - if (expr == error_mark_node) > > -/* We found an id-expression, but it was something that we > > - should not have found. This is an error, not something > > - we can recover from, so note that we found an > > - id-expression and we'll recover as gracefully as > > - possible. */ > > -id_expression_or_member_access_p = true; > > + if (error_msg) > > + { > > + /* We found an id-expression, but it was something that we > > +should not have found. This is an error, not something > > +we can recover from, so report the error we found and > > +we'll recover as gracefully as possible. */ > > + cp_parser_parse_definitely (parser); > > + cp_parser_error (parser, error_msg); > > + id_expression_or_member_access_p = true; > > + return error_mark_node; > > + } > > } > > > >if (expr > > diff --git a/gcc/testsuite/g++.dg/pr100482.C > > b/gcc/testsuite/g++.dg/pr100482.C > > new file mode 100644 > > index 000..dcf6722fda5 > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/pr100482.C > > We generally prefer to organize tests according to the language dialect > they apply to and the langugae construct that they're primarily testing. > In this case we could name the test e.g. > > gcc/testsuite/g++.dg/cpp0x/decltype-100482.C > > > @@ -0,0 +1,11 @@ > > +// { dg-do compile { target c++10 } } > > We also usually mention the PR number in the test as a comment: > > // PR c++/100482 > > One benefit of doing so is that the git alias 'git gcc-commit-mklog' > (https://gcc.gnu.org/gitwrite.html#vendor) will then automatically > include the PR number in the commit message template. > > > + > > +namespace N {} > > +decltype(std) x; // { dg-error "expected primary-expression" } > > + > > +struct S {}; > > +decltype(S) y; // { dg-error "argument to .decltype. must be an > > expression" } > > + > > +template > > +struct U {}; > > +decltype(U) z; // { dg-error "missing template arguments" } > > -- > > 2.40.0 > > > > > Thanks for the comments. I've fixed the test, does this look OK? -- 8< -- This patch ensures that any errors raised by finish_id_expression when parsing a decltype expression are properly reported, rather than potentially going ignored and causing invalid code to be accepted. We can also now remove the separate check for templates without args as this is also checked for in finish_id_expression. PR c++/100482 gcc/cp/ChangeLog: * parser.cc (cp_parser_decltype_expr): Report errors raised by finish_id_expression. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/decltype-100482.C: New test. Signed-off-by: Nathaniel Shead --- gcc/cp/parser.cc | 22 ++-- gcc/testsuite/g++.dg/cpp0x/decltype-100482.C | 12 +++ 2 files changed, 23 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-100482.C diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index dd3665
Re: [PATCH] c++: Fix ICE with parameter pack of decltype(auto) [PR103497]
On Fri, Jun 23, 2023 at 11:59:51AM -0400, Patrick Palka wrote: > Hi, > > On Sat, 22 Apr 2023, Nathaniel Shead via Gcc-patches wrote: > > > Bootstrapped and tested on x86_64-pc-linux-gnu. > > > > -- 8< -- > > > > This patch raises an error early when the decltype(auto) specifier is > > used as a parameter of a function. This prevents any issues with an > > unexpected tree type later on when performing the call. > > Thanks very much for the patch! Some minor comments below. > > > > > PR 103497 > > We should include the bug component name when referring to the PR in the > commit message (i.e. PR c++/103497) so that upon pushing the patch the > post-commit hook automatically adds a comment to the PR reffering to the > commit. I could be wrong but AFAIK the hook only performs this when the > component name is included. Thanks for the review! Fixed. > > > > gcc/cp/ChangeLog: > > > > * parser.cc (cp_parser_simple_type_specifier): Add check for > > decltype(auto) as function parameter. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/pr103497.C: New test. > > > > Signed-off-by: Nathaniel Shead > > --- > > gcc/cp/parser.cc| 10 ++ > > gcc/testsuite/g++.dg/pr103497.C | 7 +++ > > 2 files changed, 17 insertions(+) > > create mode 100644 gcc/testsuite/g++.dg/pr103497.C > > > > diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc > > index e5f032f2330..1415e07e152 100644 > > --- a/gcc/cp/parser.cc > > +++ b/gcc/cp/parser.cc > > @@ -19884,6 +19884,16 @@ cp_parser_simple_type_specifier (cp_parser* parser, > >&& cp_lexer_peek_nth_token (parser->lexer, 2)->type != CPP_SCOPE) > > { > >type = saved_checks_value (token->u.tree_check_value); > > + /* Within a function parameter declaration, decltype(auto) is always > > an > > +error. */ > > + if (parser->auto_is_implicit_function_template_parm_p > > + && TREE_CODE (type) == TEMPLATE_TYPE_PARM > > We could check is_auto (type) here instead, to avoid any confusion with > checking AUTO_IS_DECLTYPE for a non-auto TEMPLATE_TYPE_PARM. > > > + && AUTO_IS_DECLTYPE (type)) > > + { > > + error_at (token->location, > > + "cannot declare a parameter with %"); > > + type = error_mark_node; > > + } > >if (decl_specs) > > { > > cp_parser_set_decl_spec_type (decl_specs, type, > > diff --git a/gcc/testsuite/g++.dg/pr103497.C > > b/gcc/testsuite/g++.dg/pr103497.C > > new file mode 100644 > > index 000..bcd421c2907 > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/pr103497.C > > @@ -0,0 +1,7 @@ > > +// { dg-do compile { target c++14 } } > > + > > +void foo(decltype(auto)... args); // { dg-error "parameter with > > .decltype.auto..|no parameter packs" } > > I noticed for > > void foo(decltype(auto) arg); > > we already issue an identical error from grokdeclarator. Perhaps we could > instead extend the error handling there to detect decltype(auto)... as well, > rather than adding new error handling in cp_parser_simple_type_specifier? Ah thanks, I didn't notice this; this simplifies the change a fair bit. How about this patch instead? Regtested on x86_64-pc-linux-gnu. -- 8< -- This patch ensures that checks for usages of 'auto' in function parameters also consider parameter packs, since 'type_uses_auto' does not seem to consider this case. PR c++/103497 gcc/cp/ChangeLog: * decl.cc (grokdeclarator): Check for decltype(auto) in parameter pack. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/decltype-auto-103497.C: New test. Signed-off-by: Nathaniel Shead --- gcc/cp/decl.cc| 3 +++ gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C | 8 2 files changed, 11 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 60f107d50c4..aaf691fce68 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -14044,6 +14044,9 @@ grokdeclarator (const cp_declarator *declarator, error ("cannot use %<::%> in parameter declaration"); tree auto_node = type_uses_auto (type); + if (!auto_node && parameter_pack_p) + auto_node = type_uses_auto (PACK_EXPANSION_PATTERN (type)); + if (auto_node && !(cxx_dialect >= cxx17 && template_parm_flag)) { if (cxx_dialect >= cxx14) diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C b/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C new file mode 100644 index 000..cedd661710c --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto-103497.C @@ -0,0 +1,8 @@ +// PR c++/103497 +// { dg-do compile { target c++14 } } + +void foo(decltype(auto)... args); // { dg-error "cannot declare a parameter with .decltype.auto.." } + +int main() { + foo(); +} -- 2.41.0
[Patch, fortran] PR49213 - [OOP] gfortran rejects structure constructor expression
Hi All, I was looking through Neil Carlson's collection of gfortran bugs and was shocked to find this rather fundamental PR. At 12 years old, it is certainly a "golden oldie"! The patch is rather straightforward and seems to do the job of admitting derived, intrinsic and character expressions to allocatable class components in structure constructors. I have included the adjustment to 'gfc_is_ptr_fcn' and eliminating the extra blank line, introduced by my last patch. I played safe and went exclusively for class functions with attr.class_pointer set on the grounds that these have had all the accoutrements checked and built (ie. class_ok). I am still not sure if this is necessary or not. OK for trunk? Paul Fortran: Enable class expressions in structure constructors [PR49213] 2023-06-24 Paul Thomas gcc/fortran PR fortran/49213 * expr.cc (gfc_is_ptr_fcn): Guard pointer attribute to exclude class expressions. * resolve.cc (resolve_assoc_var): Call gfc_is_ptr_fcn to allow associate names with pointer function targets to be used in variable definition context. * trans-decl.cc (get_symbol_decl): Remove extraneous line. * trans-expr.cc (alloc_scalar_allocatable_subcomponent): Obtain size of intrinsic and character expressions. (gfc_trans_subcomponent_assign): Expand assignment to class components to include intrinsic and character expressions. gcc/testsuite/ PR fortran/49213 * gfortran.dg/pr49213.f90 : New test diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc index c960dfeabd9..92061d69781 100644 --- a/gcc/fortran/expr.cc +++ b/gcc/fortran/expr.cc @@ -816,7 +816,7 @@ bool gfc_is_ptr_fcn (gfc_expr *e) { return e != NULL && e->expr_type == EXPR_FUNCTION - && (gfc_expr_attr (e).pointer + && ((e->ts.type != BT_CLASS && gfc_expr_attr (e).pointer) || (e->ts.type == BT_CLASS && CLASS_DATA (e)->attr.class_pointer)); } diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc index 82e6ac53aa1..217d69d4e0b 100644 --- a/gcc/fortran/resolve.cc +++ b/gcc/fortran/resolve.cc @@ -1350,6 +1350,9 @@ resolve_structure_cons (gfc_expr *expr, int init) && CLASS_DATA (comp)->as) rank = CLASS_DATA (comp)->as->rank; + if (comp->ts.type == BT_CLASS && cons->expr->ts.type == BT_DERIVED) + gfc_find_derived_vtab (cons->expr->ts.u.derived); + if (cons->expr->expr_type != EXPR_NULL && rank != cons->expr->rank && (comp->attr.allocatable || cons->expr->rank)) { @@ -1381,7 +1384,7 @@ resolve_structure_cons (gfc_expr *expr, int init) gfc_basic_typename (comp->ts.type)); t = false; } - else + else if (!UNLIMITED_POLY (comp)) { bool t2 = gfc_convert_type (cons->expr, &comp->ts, 1); if (t) diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc index 18589e17843..b0fd25e92a3 100644 --- a/gcc/fortran/trans-decl.cc +++ b/gcc/fortran/trans-decl.cc @@ -1915,7 +1915,6 @@ gfc_get_symbol_decl (gfc_symbol * sym) gcc_assert (!sym->value || sym->value->expr_type == EXPR_NULL); } - gfc_finish_var_decl (decl, sym); if (sym->ts.type == BT_CHARACTER) diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 3c209bcde97..5a1ff0c1d21 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -8781,6 +8781,7 @@ alloc_scalar_allocatable_subcomponent (stmtblock_t *block, tree comp, tree size; tree size_in_bytes; tree lhs_cl_size = NULL_TREE; + gfc_se se; if (!comp) return; @@ -8815,16 +8816,26 @@ alloc_scalar_allocatable_subcomponent (stmtblock_t *block, tree comp, } else if (cm->ts.type == BT_CLASS) { - gcc_assert (expr2->ts.type == BT_CLASS || expr2->ts.type == BT_DERIVED); - if (expr2->ts.type == BT_DERIVED) + if (expr2->ts.type != BT_CLASS) { - tmp = gfc_get_symbol_decl (expr2->ts.u.derived); - size = TYPE_SIZE_UNIT (tmp); + if (expr2->ts.type == BT_CHARACTER) + { + gfc_init_se (&se, NULL); + gfc_conv_expr (&se, expr2); + size = fold_convert (size_type_node, se.string_length); + } + else + { + if (expr2->ts.type == BT_DERIVED) + tmp = gfc_get_symbol_decl (expr2->ts.u.derived); + else + tmp = gfc_typenode_for_spec (&expr2->ts); + size = TYPE_SIZE_UNIT (tmp); + } } else { gfc_expr *e2vtab; - gfc_se se; e2vtab = gfc_find_and_cut_at_last_class_ref (expr2); gfc_add_vptr_component (e2vtab); gfc_add_size_component (e2vtab); @@ -8975,6 +8986,7 @@ gfc_trans_subcomponent_assign (tree dest, gfc_component * cm, { gfc_init_se (&se, NULL); gfc_conv_expr (&se, expr); + tree size; /* Take care about non-array allocatable components here. The alloc_* routine below is motivated by the alloc_scalar_allocatable_for_ @@ -8990,7 +9002,7 @@ gfc_trans_subcomponent_assign (tree dest, gfc_component * cm, && expr->symtree->n.sym->attr.dummy) se.expr = build_fold_indirect_ref_loc (input_location, se.expr); - if (cm->ts.type =
RE: [PATCH V2] LOOP IVOPTS: Apply LEN_MASK_{LOAD,STORE}
Committed as passed bootstrap and regression test, thanks Jeff. Pan -Original Message- From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches Sent: Saturday, June 24, 2023 9:29 AM To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org Cc: rguent...@suse.de; richard.sandif...@arm.com Subject: Re: [PATCH V2] LOOP IVOPTS: Apply LEN_MASK_{LOAD,STORE} On 6/23/23 17:41, juzhe.zh...@rivai.ai wrote: > From: Ju-Zhe Zhong > > Hi, Jeff. I fix format as you suggested. > Ok for trunk ? > > gcc/ChangeLog: > > * tree-ssa-loop-ivopts.cc (get_mem_type_for_internal_fn): Apply > LEN_MASK_{LOAD,STORE}. Yes. Sorry I wasn't explicit that it was OK with the formatting change. Jeff
Re: [PATCH 1/2] c++: implement __is_const built-in trait
Sure. Thank you. On Sat, Jun 24, 2023 at 5:14 AM Xi Ruoyao wrote: > Please use [PATCH v3 1/2] next time, now it's not easy to find the > latest version of the series (I'm not sure if the number "3" is > correct). > > On Sat, 2023-06-24 at 03:38 -0700, Ken Matsui via Gcc-patches wrote: > > This patch implements built-in trait for std::is_const. > > > > gcc/cp/ChangeLog: > > > > * cp-trait.def: Define __is_const. > > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST. > > * semantics.cc (trait_expr_value): Likewise. > > (finish_trait_expr): Likewise. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/ext/has-builtin-1.C: Test existence of __is_const. > > * g++.dg/ext/is_const.C: New test. > > > > Signed-off-by: Ken Matsui > > --- > > gcc/cp/constraint.cc | 3 +++ > > gcc/cp/cp-trait.def | 1 + > > gcc/cp/semantics.cc | 4 > > gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ > > gcc/testsuite/g++.dg/ext/is_const.C | 19 +++ > > 5 files changed, 30 insertions(+) > > create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C > > > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc > > index 8cf0f2d0974..ff4ae831def 100644 > > --- a/gcc/cp/constraint.cc > > +++ b/gcc/cp/constraint.cc > > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) > > case CPTK_IS_UNION: > >inform (loc, " %qT is not a union", t1); > >break; > > +case CPTK_IS_CONST: > > + inform (loc, " %qT is not a const type", t1); > > + break; > > case CPTK_IS_AGGREGATE: > >inform (loc, " %qT is not an aggregate", t1); > >break; > > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def > > index 8b7fece0cc8..b40b475b86d 100644 > > --- a/gcc/cp/cp-trait.def > > +++ b/gcc/cp/cp-trait.def > > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, > > "__is_trivially_assignable", 2) > > DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, > > "__is_trivially_constructible", -1) > > DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) > > DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) > > +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1) > > DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, > > "__reference_constructs_from_temporary", 2) > > DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, > > "__reference_converts_from_temporary", 2) > > /* FIXME Added space to avoid direct usage in GCC 13. */ > > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc > > index 8fb47fd179e..011ba8e46e1 100644 > > --- a/gcc/cp/semantics.cc > > +++ b/gcc/cp/semantics.cc > > @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree > > type1, tree type2) > > case CPTK_IS_ENUM: > >return type_code1 == ENUMERAL_TYPE; > > > > +case CPTK_IS_CONST: > > + return CP_TYPE_CONST_P (type1); > > + > > case CPTK_IS_FINAL: > >return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); > > > > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, > > cp_trait_kind kind, tree type1, tree type2) > > case CPTK_IS_ENUM: > > case CPTK_IS_UNION: > > case CPTK_IS_SAME: > > +case CPTK_IS_CONST: > >break; > > > > case CPTK_IS_LAYOUT_COMPATIBLE: > > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > index f343e153e56..965309a333a 100644 > > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > @@ -146,3 +146,6 @@ > > #if !__has_builtin (__remove_cvref) > > # error "__has_builtin (__remove_cvref) failed" > > #endif > > +#if !__has_builtin (__is_const) > > +# error "__has_builtin (__is_const) failed" > > +#endif > > diff --git a/gcc/testsuite/g++.dg/ext/is_const.C > > b/gcc/testsuite/g++.dg/ext/is_const.C > > new file mode 100644 > > index 000..8f2d7c2fce9 > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/ext/is_const.C > > @@ -0,0 +1,19 @@ > > +// { dg-do compile { target c++11 } } > > + > > +#include > > + > > +using namespace __gnu_test; > > + > > +#define SA(X) static_assert((X),#X) > > + > > +// Positive tests. > > +SA(__is_const(const int)); > > +SA(__is_const(const volatile int)); > > +SA(__is_const(cClassType)); > > +SA(__is_const(cvClassType)); > > + > > +// Negative tests. > > +SA(!__is_const(int)); > > +SA(!__is_const(volatile int)); > > +SA(!__is_const(ClassType)); > > +SA(!__is_const(vClassType)); > > -- > Xi Ruoyao > School of Aerospace Science and Technology, Xidian University > -- Sincerely, Ken Matsui
Re: [PATCH 1/2] c++: implement __is_const built-in trait
Please use [PATCH v3 1/2] next time, now it's not easy to find the latest version of the series (I'm not sure if the number "3" is correct). On Sat, 2023-06-24 at 03:38 -0700, Ken Matsui via Gcc-patches wrote: > This patch implements built-in trait for std::is_const. > > gcc/cp/ChangeLog: > > * cp-trait.def: Define __is_const. > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST. > * semantics.cc (trait_expr_value): Likewise. > (finish_trait_expr): Likewise. > > gcc/testsuite/ChangeLog: > > * g++.dg/ext/has-builtin-1.C: Test existence of __is_const. > * g++.dg/ext/is_const.C: New test. > > Signed-off-by: Ken Matsui > --- > gcc/cp/constraint.cc | 3 +++ > gcc/cp/cp-trait.def | 1 + > gcc/cp/semantics.cc | 4 > gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ > gcc/testsuite/g++.dg/ext/is_const.C | 19 +++ > 5 files changed, 30 insertions(+) > create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc > index 8cf0f2d0974..ff4ae831def 100644 > --- a/gcc/cp/constraint.cc > +++ b/gcc/cp/constraint.cc > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) > case CPTK_IS_UNION: > inform (loc, " %qT is not a union", t1); > break; > + case CPTK_IS_CONST: > + inform (loc, " %qT is not a const type", t1); > + break; > case CPTK_IS_AGGREGATE: > inform (loc, " %qT is not an aggregate", t1); > break; > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def > index 8b7fece0cc8..b40b475b86d 100644 > --- a/gcc/cp/cp-trait.def > +++ b/gcc/cp/cp-trait.def > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, > "__is_trivially_assignable", 2) > DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, > "__is_trivially_constructible", -1) > DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) > DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) > +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1) > DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, > "__reference_constructs_from_temporary", 2) > DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, > "__reference_converts_from_temporary", 2) > /* FIXME Added space to avoid direct usage in GCC 13. */ > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc > index 8fb47fd179e..011ba8e46e1 100644 > --- a/gcc/cp/semantics.cc > +++ b/gcc/cp/semantics.cc > @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree > type1, tree type2) > case CPTK_IS_ENUM: > return type_code1 == ENUMERAL_TYPE; > > + case CPTK_IS_CONST: > + return CP_TYPE_CONST_P (type1); > + > case CPTK_IS_FINAL: > return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); > > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, > cp_trait_kind kind, tree type1, tree type2) > case CPTK_IS_ENUM: > case CPTK_IS_UNION: > case CPTK_IS_SAME: > + case CPTK_IS_CONST: > break; > > case CPTK_IS_LAYOUT_COMPATIBLE: > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > index f343e153e56..965309a333a 100644 > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > @@ -146,3 +146,6 @@ > #if !__has_builtin (__remove_cvref) > # error "__has_builtin (__remove_cvref) failed" > #endif > +#if !__has_builtin (__is_const) > +# error "__has_builtin (__is_const) failed" > +#endif > diff --git a/gcc/testsuite/g++.dg/ext/is_const.C > b/gcc/testsuite/g++.dg/ext/is_const.C > new file mode 100644 > index 000..8f2d7c2fce9 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/ext/is_const.C > @@ -0,0 +1,19 @@ > +// { dg-do compile { target c++11 } } > + > +#include > + > +using namespace __gnu_test; > + > +#define SA(X) static_assert((X),#X) > + > +// Positive tests. > +SA(__is_const(const int)); > +SA(__is_const(const volatile int)); > +SA(__is_const(cClassType)); > +SA(__is_const(cvClassType)); > + > +// Negative tests. > +SA(!__is_const(int)); > +SA(!__is_const(volatile int)); > +SA(!__is_const(ClassType)); > +SA(!__is_const(vClassType)); -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
[PATCH 2/2] libstdc++: use new built-in trait __is_const
This patch lets libstdc++ use new built-in trait __is_const. libstdc++-v3/ChangeLog: * include/std/type_traits (is_const): Use __is_const built-in trait. (is_const_v): Likewise. Signed-off-by: Ken Matsui --- libstdc++-v3/include/std/type_traits | 14 ++ 1 file changed, 14 insertions(+) diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits index 0e7a9c9c7f3..3a46eca5377 100644 --- a/libstdc++-v3/include/std/type_traits +++ b/libstdc++-v3/include/std/type_traits @@ -764,6 +764,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Type properties. /// is_const +#if __has_builtin(__is_const) + template +struct is_const +: public __bool_constant<__is_const(_Tp)> +{ }; +#else template struct is_const : public false_type { }; @@ -771,6 +777,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template struct is_const<_Tp const> : public true_type { }; +#endif /// is_volatile template @@ -3210,10 +3217,17 @@ template inline constexpr bool is_compound_v = is_compound<_Tp>::value; template inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value; + +#if __has_builtin(__is_const) +template + inline constexpr bool is_const_v = __is_const(_Tp); +#else template inline constexpr bool is_const_v = false; template inline constexpr bool is_const_v = true; +#endif + template inline constexpr bool is_volatile_v = false; template -- 2.41.0
[PATCH 1/2] c++: implement __is_const built-in trait
This patch implements built-in trait for std::is_const. gcc/cp/ChangeLog: * cp-trait.def: Define __is_const. * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST. * semantics.cc (trait_expr_value): Likewise. (finish_trait_expr): Likewise. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __is_const. * g++.dg/ext/is_const.C: New test. Signed-off-by: Ken Matsui --- gcc/cp/constraint.cc | 3 +++ gcc/cp/cp-trait.def | 1 + gcc/cp/semantics.cc | 4 gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 +++ gcc/testsuite/g++.dg/ext/is_const.C | 19 +++ 5 files changed, 30 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc index 8cf0f2d0974..ff4ae831def 100644 --- a/gcc/cp/constraint.cc +++ b/gcc/cp/constraint.cc @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args) case CPTK_IS_UNION: inform (loc, " %qT is not a union", t1); break; +case CPTK_IS_CONST: + inform (loc, " %qT is not a const type", t1); + break; case CPTK_IS_AGGREGATE: inform (loc, " %qT is not an aggregate", t1); break; diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def index 8b7fece0cc8..b40b475b86d 100644 --- a/gcc/cp/cp-trait.def +++ b/gcc/cp/cp-trait.def @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, "__is_trivially_assignable", 2) DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1) DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1) DEFTRAIT_EXPR (IS_UNION, "__is_union", 1) +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1) DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, "__reference_constructs_from_temporary", 2) DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, "__reference_converts_from_temporary", 2) /* FIXME Added space to avoid direct usage in GCC 13. */ diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 8fb47fd179e..011ba8e46e1 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -12079,6 +12079,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_ENUM: return type_code1 == ENUMERAL_TYPE; +case CPTK_IS_CONST: + return CP_TYPE_CONST_P (type1); + case CPTK_IS_FINAL: return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1); @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2) case CPTK_IS_ENUM: case CPTK_IS_UNION: case CPTK_IS_SAME: +case CPTK_IS_CONST: break; case CPTK_IS_LAYOUT_COMPATIBLE: diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C index f343e153e56..965309a333a 100644 --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C @@ -146,3 +146,6 @@ #if !__has_builtin (__remove_cvref) # error "__has_builtin (__remove_cvref) failed" #endif +#if !__has_builtin (__is_const) +# error "__has_builtin (__is_const) failed" +#endif diff --git a/gcc/testsuite/g++.dg/ext/is_const.C b/gcc/testsuite/g++.dg/ext/is_const.C new file mode 100644 index 000..8f2d7c2fce9 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/is_const.C @@ -0,0 +1,19 @@ +// { dg-do compile { target c++11 } } + +#include + +using namespace __gnu_test; + +#define SA(X) static_assert((X),#X) + +// Positive tests. +SA(__is_const(const int)); +SA(__is_const(const volatile int)); +SA(__is_const(cClassType)); +SA(__is_const(cvClassType)); + +// Negative tests. +SA(!__is_const(int)); +SA(!__is_const(volatile int)); +SA(!__is_const(ClassType)); +SA(!__is_const(vClassType)); -- 2.41.0
[PATCH 2/2] libstdc++: use new built-in trait __remove_pointer
This patch lets libstdc++ use new built-in trait __remove_pointer. libstdc++-v3/ChangeLog: * include/std/type_traits (remove_pointer): Use __remove_pointer built-in trait. Signed-off-by: Ken Matsui --- libstdc++-v3/include/std/type_traits | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits index 0e7a9c9c7f3..81497e2f3e1 100644 --- a/libstdc++-v3/include/std/type_traits +++ b/libstdc++-v3/include/std/type_traits @@ -2023,6 +2023,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Pointer modifications. + /// remove_pointer +#if __has_builtin(__remove_pointer) + template +struct remove_pointer +{ using type = __remove_pointer(_Tp); }; +#else template struct __remove_pointer_helper { using type = _Tp; }; @@ -2031,11 +2037,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION struct __remove_pointer_helper<_Tp, _Up*> { using type = _Up; }; - /// remove_pointer template struct remove_pointer : public __remove_pointer_helper<_Tp, __remove_cv_t<_Tp>> { }; +#endif template struct __add_pointer_helper -- 2.41.0
[PATCH 1/2] c++: implement __remove_pointer built-in trait
This patch implements built-in trait for std::remove_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __remove_pointer. * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __remove_pointer. * g++.dg/ext/remove_pointer.C: New test. Signed-off-by: Ken Matsui --- gcc/cp/cp-trait.def | 1 + gcc/cp/semantics.cc | 5 +++ gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 ++ gcc/testsuite/g++.dg/ext/remove_pointer.C | 51 +++ 4 files changed, 60 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ext/remove_pointer.C diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def index 8b7fece0cc8..07823e55579 100644 --- a/gcc/cp/cp-trait.def +++ b/gcc/cp/cp-trait.def @@ -90,6 +90,7 @@ DEFTRAIT_EXPR (IS_DEDUCIBLE, "__is_deducible ", 2) DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1) DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1) DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1) +DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1) DEFTRAIT_TYPE (UNDERLYING_TYPE, "__underlying_type", 1) DEFTRAIT_TYPE (TYPE_PACK_ELEMENT, "__type_pack_element", -1) diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index 8fb47fd179e..a3b283ce938 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -12374,6 +12374,11 @@ finish_trait_type (cp_trait_kind kind, tree type1, tree type2, type1 = TREE_TYPE (type1); return cv_unqualified (type1); +case CPTK_REMOVE_POINTER: + if (TYPE_PTR_P (type1)) +type1 = TREE_TYPE (type1); + return type1; + case CPTK_TYPE_PACK_ELEMENT: return finish_type_pack_element (type1, type2, complain); diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C index f343e153e56..e21e0a95509 100644 --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C @@ -146,3 +146,6 @@ #if !__has_builtin (__remove_cvref) # error "__has_builtin (__remove_cvref) failed" #endif +#if !__has_builtin (__remove_pointer) +# error "__has_builtin (__remove_pointer) failed" +#endif diff --git a/gcc/testsuite/g++.dg/ext/remove_pointer.C b/gcc/testsuite/g++.dg/ext/remove_pointer.C new file mode 100644 index 000..7b13db93950 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/remove_pointer.C @@ -0,0 +1,51 @@ +// { dg-do compile { target c++11 } } + +#define SA(X) static_assert((X),#X) + +SA(__is_same(__remove_pointer(int), int)); +SA(__is_same(__remove_pointer(int*), int)); +SA(__is_same(__remove_pointer(int**), int*)); + +SA(__is_same(__remove_pointer(const int*), const int)); +SA(__is_same(__remove_pointer(const int**), const int*)); +SA(__is_same(__remove_pointer(int* const), int)); +SA(__is_same(__remove_pointer(int** const), int*)); +SA(__is_same(__remove_pointer(int* const* const), int* const)); + +SA(__is_same(__remove_pointer(volatile int*), volatile int)); +SA(__is_same(__remove_pointer(volatile int**), volatile int*)); +SA(__is_same(__remove_pointer(int* volatile), int)); +SA(__is_same(__remove_pointer(int** volatile), int*)); +SA(__is_same(__remove_pointer(int* volatile* volatile), int* volatile)); + +SA(__is_same(__remove_pointer(const volatile int*), const volatile int)); +SA(__is_same(__remove_pointer(const volatile int**), const volatile int*)); +SA(__is_same(__remove_pointer(const int* volatile), const int)); +SA(__is_same(__remove_pointer(volatile int* const), volatile int)); +SA(__is_same(__remove_pointer(int* const volatile), int)); +SA(__is_same(__remove_pointer(const int** volatile), const int*)); +SA(__is_same(__remove_pointer(volatile int** const), volatile int*)); +SA(__is_same(__remove_pointer(int** const volatile), int*)); +SA(__is_same(__remove_pointer(int* const* const volatile), int* const)); +SA(__is_same(__remove_pointer(int* volatile* const volatile), int* volatile)); +SA(__is_same(__remove_pointer(int* const volatile* const volatile), int* const volatile)); + +SA(__is_same(__remove_pointer(int&), int&)); +SA(__is_same(__remove_pointer(const int&), const int&)); +SA(__is_same(__remove_pointer(volatile int&), volatile int&)); +SA(__is_same(__remove_pointer(const volatile int&), const volatile int&)); + +SA(__is_same(__remove_pointer(int&&), int&&)); +SA(__is_same(__remove_pointer(const int&&), const int&&)); +SA(__is_same(__remove_pointer(volatile int&&), volatile int&&)); +SA(__is_same(__remove_pointer(const volatile int&&), const volatile int&&)); + +SA(__is_same(__remove_pointer(int[3]), int[3])); +SA(__is_same(__remove_pointer(const int[3]), const int[3])); +SA(__is_same(__remove_pointer(volatile int[3]), volatile int[3])); +SA(__is_same(__remove_pointer(const volatile int[3]), const volatile int[3])); + +SA(__is_same(__remove_pointer(int(int)), int(int))); +SA(__is_same(__remove_pointer(int(*const)(int)), int(int))); +SA(__is_same(__remove_pointer(int(*volatile)(int))
Re: [PATCH 1/2] c++: implement __remove_pointer built-in trait
On Tue, Jun 20, 2023 at 8:22 AM Patrick Palka wrote: > > On Sat, 17 Jun 2023, Ken Matsui via Gcc-patches wrote: > > > Hi, > > > > I conducted a benchmark for remove_pointer as well as is_object. Just > > like the is_object benchmark, here is the benchmark code: > > > > https://github.com/ken-matsui/gcc-benches/blob/main/remove_pointer_benchmark.cc > > > > On my computer, using the gcc HEAD of this patch for a release build, > > the patch with -DUSE_BUILTIN took 8.7% less time and used 4.3-4.9% > > less memory on average compared to not using it. Although the > > performance improvement was not as significant as with is_object, the > > benchmark demonstrated that the compilation was consistently more > > efficient. > > Thanks for the benchmark. The improvement is lesser than I expected, > but that might be because the benchmark is "biased": > > template > struct Instantiator : Instantiator { > static_assert(!std::is_pointer_v>); > }; > > This only invokes remove_pointer_t on the non-pointer type Instantiator, > and so the benchmark doesn't factor in the performance of the trait when > invoked on pointer types, and traits typically will have different > performance characteristics depending on the kind of type it's given. > > To more holistically assess the real-world performance of the trait the > benchmark should also consider pointer types and maybe also cv-qualified > types (given that the original implementation is in terms of > __remove_cv_t and thus performance of the original implementation may be > sensitive to cv-qualification). So we should probably uniformly > benchmark these classes of types, via doing e.g.: > > static_assert(!std::is_pointer_v>); > static_assert(!std::is_pointer_v>); > static_assert(!std::is_pointer_v>); > static_assert(!std::is_pointer_v>); > > (We could consider other kinds of types too, e.g. reference types and > integral types, but it seems clear based on the implementations being > benchmarked that performance won't be sensitive to reference-ness > or integral-ness.) Thank you for your review! I totally agree with your opinion that the benchmark should have been exhaustive. I conducted a benchmark for this new change: https://github.com/ken-matsui/gcc-benches/blob/main/remove_pointer.md#tue-jun-20-030313-pm-pdt-2023 Time: -27.918% Peak Memory Usage: -19.0755% Total Memory Usage: -20.0199% > > > > Sincerely, > > Ken Matsui > > > > On Thu, Jun 15, 2023 at 5:22 AM Ken Matsui > > wrote: > > > > > > This patch implements built-in trait for std::remove_pointer. > > > > > > gcc/cp/ChangeLog: > > > > > > * cp-trait.def: Define __remove_pointer. > > > * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * g++.dg/ext/has-builtin-1.C: Test existence of __remove_pointer. > > > * g++.dg/ext/remove_pointer.C: New test. > > > > > > Signed-off-by: Ken Matsui > > > --- > > > gcc/cp/cp-trait.def | 1 + > > > gcc/cp/semantics.cc | 4 ++ > > > gcc/testsuite/g++.dg/ext/has-builtin-1.C | 3 ++ > > > gcc/testsuite/g++.dg/ext/remove_pointer.C | 51 +++ > > > 4 files changed, 59 insertions(+) > > > create mode 100644 gcc/testsuite/g++.dg/ext/remove_pointer.C > > > > > > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def > > > index 8b7fece0cc8..07823e55579 100644 > > > --- a/gcc/cp/cp-trait.def > > > +++ b/gcc/cp/cp-trait.def > > > @@ -90,6 +90,7 @@ DEFTRAIT_EXPR (IS_DEDUCIBLE, "__is_deducible ", 2) > > > DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1) > > > DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1) > > > DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1) > > > +DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1) > > > DEFTRAIT_TYPE (UNDERLYING_TYPE, "__underlying_type", 1) > > > DEFTRAIT_TYPE (TYPE_PACK_ELEMENT, "__type_pack_element", -1) > > > > > > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc > > > index 8fb47fd179e..885c7a6fb64 100644 > > > --- a/gcc/cp/semantics.cc > > > +++ b/gcc/cp/semantics.cc > > > @@ -12373,6 +12373,10 @@ finish_trait_type (cp_trait_kind kind, tree > > > type1, tree type2, > > >if (TYPE_REF_P (type1)) > > > type1 = TREE_TYPE (type1); > > >return cv_unqualified (type1); > > > +case CPTK_REMOVE_POINTER: > > > + if (TYPE_PTR_P (type1)) > > > +type1 = TREE_TYPE (type1); > > > + return type1; > > Maybe add a newline before the 'case' to visually separate it from the > previous 'case'? LGTM otherwise, thanks! Will do! Thank you! > > > > > > case CPTK_TYPE_PACK_ELEMENT: > > >return finish_type_pack_element (type1, type2, complain); > > > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > > index f343e153e56..e21e0a95509 100644 > > > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C > > > @@ -1