> -----Original Message-----
> From: Richard Sandiford <[email protected]>
> Sent: Wednesday, March 5, 2025 11:27 AM
> To: Tamar Christina <[email protected]>
> Cc: [email protected]; nd <[email protected]>; Richard Earnshaw <[email protected]>; [email protected]
> Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last extractions [PR118464]
>
> Tamar Christina <[email protected]> writes:
> >> > diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
> >> > index e975286a01904bec0b283b7ba4afde6f0fd60bf1..6c0be3c1a51449274720175b5e6e7d7535928de6 100644
> >> > --- a/gcc/config/aarch64/aarch64-sve.md
> >> > +++ b/gcc/config/aarch64/aarch64-sve.md
> >> > @@ -3107,7 +3107,7 @@ (define_insn "@extract_<last_op>_<mode>"
> >> >    [(set (match_operand:<VEL> 0 "register_operand")
> >> > 	(unspec:<VEL>
> >> > 	  [(match_operand:<VPRED> 1 "register_operand")
> >> > -	   (match_operand:SVE_FULL 2 "register_operand")]
> >> > +	   (match_operand:SVE_ALL 2 "register_operand")]
> >> > 	  LAST))]
> >> >    "TARGET_SVE"
> >> >    {@ [ cons: =0 , 1 , 2 ]
> >>
> >> It looks like this will use (say):
> >>
> >>   lasta b<n>, pg, z<m>.b
> >>
> >> for VNx4QI, is that right?  I don't think that's safe, since the .b form
> >> treats all bits of the pg input as significant, whereas only one in every
> >> four bits of pg is defined for VNx4BI (the predicate associated with
> >> VNx4QI).
> >>
> >> I think converting these patterns to partial vectors means operating
> >> on containers rather than elements.  E.g. the VNx4QI case should use
> >> .s rather than .b.  That should just be a case of changing vwcore to
> >> vccore and Vetype to Vctype, but I haven't looked too deeply.
> >
> > Ah I see, so for partial types, the values are not expected to be packed
> > in the lower part of the vector, but instead are "padded"?
>
> Right.
>
> > That explains some of the other patterns I was confused about.
> >
> > Any ideas how to test these?  It's hard to control what modes the
> > vectorizer picks..
>
> Yeah, agreed.  I think it'd be difficult to trigger it reliably from the
> vectoriser given its current limited use of the ifns.
>
> A gimple frontend test might work though, with a predicate/mask
> generated from (say) 16-bit elements, then bitcast to a predicate/mask
> for 32-bit elements and used as an input to an explicit ifn on 32-bit
> elements.  If the 16-bit predicate contains 0, 1 for some even-aligned
> pair, after the last 1, 0 aligned pair, then the code would likely have
> picked the wrong element.
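To make the container point above concrete, here is a small hand-written
illustration (mine, not part of the patch or the testsuite); it uses the
SVE ACLE intrinsics, and the flags and values are just examples:

/* Illustration only: container-based extraction for partial modes.
   Compile with e.g. -O2 -march=armv8-a+sve.  */
#include <arm_sve.h>
#include <stdio.h>

int
main (void)
{
  /* For a partial mode such as VNx4QI, each data element sits in the
     low part of a 32-bit container, and its VNx4BI predicate defines
     only one bit per container; the other three bits are undefined.
     An element-granular "lasta b0, p0, z0.b" would scan every byte
     lane and so could pick a padding byte via an undefined bit; the
     container-granular .s form only looks at the defined bits.  */
  svbool_t pg = svwhilelt_b32 (0, 2);   /* 32-bit lanes 0 and 1 active */
  svint32_t z = svindex_s32 (10, 10);   /* 10, 20, 30, 40, ...         */

  /* LASTB over 32-bit containers: the last active lane is lane 1.  */
  printf ("%d\n", svlastb_s32 (pg, z)); /* prints 20 */
  return 0;
}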
*insert chaos emoji* I realized the early-break vect testcases in the
testsuite already need this; that's how I noticed the problem in the first
place.  It's therefore enough to run those with SVE enabled (there are
tests in there that depend on partial vectors anyway).
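For reference, the shape of loop involved is roughly the following (a
sketch of mine, not one of the actual testcases): a conditional reduction
that keeps the last matching index, which the vectorizer implements via
FOLD_EXTRACT_LAST and which can use a partial mode when partial vectors
are enabled:

/* Sketch only; the flags are an assumption, e.g.
   -O3 -march=armv8-a+sve --param vect-partial-vector-usage=1.  */
int
last_match (short *x, int n)
{
  int last = -1;
  for (int i = 0; i < n; i++)
    if (x[i] == 42)
      last = i;   /* conditional reduction -> FOLD_EXTRACT_LAST */
  return last;
}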
So...
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
	PR tree-optimization/118464
	PR tree-optimization/116855
	* config/aarch64/aarch64-sve.md (@extract_<last_op>_<mode>,
	@fold_extract_<last_op>_<mode>,
	@aarch64_fold_extract_vector_<last_op>_<mode>): Change SVE_FULL to
	SVE_ALL and use the container-based Vctype and vccore attributes.
-- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index a93bc463a909ea28460cc7877275fce16e05f7e6..205eeec2e35544de848e0dbb48e3f5ae59391a88 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -3107,12 +3107,12 @@ (define_insn "@extract_<last_op>_<mode>"
   [(set (match_operand:<VEL> 0 "register_operand")
 	(unspec:<VEL>
 	  [(match_operand:<VPRED> 1 "register_operand")
-	   (match_operand:SVE_FULL 2 "register_operand")]
+	   (match_operand:SVE_ALL 2 "register_operand")]
 	  LAST))]
   "TARGET_SVE"
   {@ [ cons: =0 , 1   , 2 ]
-     [ ?r       , Upl , w ] last<ab>\t%<vwcore>0, %1, %2.<Vetype>
-     [ w        , Upl , w ] last<ab>\t%<Vetype>0, %1, %2.<Vetype>
+     [ ?r       , Upl , w ] last<ab>\t%<vccore>0, %1, %2.<Vctype>
+     [ w        , Upl , w ] last<ab>\t%<Vctype>0, %1, %2.<Vctype>
   }
 )
@@ -8899,26 +8899,26 @@ (define_insn "@fold_extract_<last_op>_<mode>"
 	(unspec:<VEL>
 	  [(match_operand:<VEL> 1 "register_operand")
 	   (match_operand:<VPRED> 2 "register_operand")
-	   (match_operand:SVE_FULL 3 "register_operand")]
+	   (match_operand:SVE_ALL 3 "register_operand")]
 	  CLAST))]
   "TARGET_SVE"
   {@ [ cons: =0 , 1 , 2   , 3 ]
-     [ ?r       , 0 , Upl , w ] clast<ab>\t%<vwcore>0, %2, %<vwcore>0, %3.<Vetype>
-     [ w        , 0 , Upl , w ] clast<ab>\t%<Vetype>0, %2, %<Vetype>0, %3.<Vetype>
+     [ ?r       , 0 , Upl , w ] clast<ab>\t%<vccore>0, %2, %<vccore>0, %3.<Vctype>
+     [ w        , 0 , Upl , w ] clast<ab>\t%<Vctype>0, %2, %<Vctype>0, %3.<Vctype>
   }
 )
 
 (define_insn "@aarch64_fold_extract_vector_<last_op>_<mode>"
-  [(set (match_operand:SVE_FULL 0 "register_operand")
-	(unspec:SVE_FULL
-	  [(match_operand:SVE_FULL 1 "register_operand")
+  [(set (match_operand:SVE_ALL 0 "register_operand")
+	(unspec:SVE_ALL
+	  [(match_operand:SVE_ALL 1 "register_operand")
 	   (match_operand:<VPRED> 2 "register_operand")
-	   (match_operand:SVE_FULL 3 "register_operand")]
+	   (match_operand:SVE_ALL 3 "register_operand")]
 	  CLAST))]
   "TARGET_SVE"
   {@ [ cons: =0 , 1 , 2   , 3 ]
-     [ w        , 0 , Upl , w ] clast<ab>\t%0.<Vetype>, %2, %0.<Vetype>, %3.<Vetype>
-     [ ?&w      , w , Upl , w ] movprfx\t%0, %1\;clast<ab>\t%0.<Vetype>, %2, %0.<Vetype>, %3.<Vetype>
+     [ w        , 0 , Upl , w ] clast<ab>\t%0.<Vctype>, %2, %0.<Vctype>, %3.<Vctype>
+     [ ?&w      , w , Upl , w ] movprfx\t%0, %1\;clast<ab>\t%0.<Vctype>, %2, %0.<Vctype>, %3.<Vctype>
   }
 )