On Thu, Oct 15, 2020 at 1:37 AM Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
>
> Hi!
>
> On Wed, Oct 14, 2020 at 01:43:45PM +0800, Hongtao Liu wrote:
> > On Wed, Oct 14, 2020 at 4:01 AM Segher Boessenkool
> > <seg...@kernel.crashing.org> wrote:
> > > On Tue, Oct 13, 2020 at 04:40:53PM +0800, Hongtao Liu wrote:
> > > >   For rtx like
> > > >   (vec_select:V2SI (subreg:V4SI (inner:V2SI) 0)
> > > >                    (parallel [(const_int 0) (const_int 1)]))
> > > >  it could be simplified as inner.
> > >
> > > You could even simplify any vec_select of a subreg of X to just a
> > > vec_select of X, by changing the selection vector a bit (well, only do
> >
> > Yes, when SUBREG_BYTE of trueop0 is not 0, we need to add offset to 
> > selection.
>
> Exactly.
>
> > > this if that is a constant vector, I suppose).  Not just for paradoxical
> > > subregs either, just for *all* subregs.
> >
> > Yes, and only when X has the same inner mode and more elements.
>
> No, for *all*.  The mode of the first argument of vec_select does not
> have to equal its result mode.
>
> Any (constant indices) vec_select of a subreg can be written as just a
> vec_select.
>
> > +       /* Simplify vec_select of a subreg of X to just a vec_select of X
> > +          when available.  */
>
> What does "when available" mean here?
>

When X has same component mode as vec_select, i'll add this to comments.

> > +       int l2;
> > +       if (GET_CODE (trueop0) == SUBREG
> > +           && (GET_MODE_INNER (mode)
> > +               == GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0))))
>
> Don't use unnecessary parens here please, it makes it harder to read
> (there are quite enough parens already :-) )
>

Yes.

> > +           gcc_assert (can_div_trunc_p (SUBREG_BYTE (trueop0),
> > +                                        GET_MODE_SIZE (GET_MODE_INNER 
> > (mode)),
> > +                                        &subreg_offset));
>
> Why is this needed?

I only found this interface for poly_uint64 division to get subreg_offset.

>
> > +           bool success = true;
> > +           for (int i = 0;i != l1; i++)
>
> (space after ; )
>
> > +             {
> > +               rtx j = XVECEXP (trueop1, 0, i);
>
> (i and j and k ususally are integers, not rtx)
>
> > +               if (!CONST_INT_P (j)
> > +                   || known_ge (UINTVAL (j), l2 - subreg_offset))
> > +                 {
> > +                   success = false;
> > +                   break;
> > +                 }
> > +             }
>
> You don't have to test if the input RTL is valid.  You can assume it is.
>

This test is for something like (vec_select:v2di (subreg:v4di
(reg:v2di) 0)(parallel [ (const_int 2) (const_int 3)])).
const_int 2 here is out of range. Are you meaning the upper rtx wouldn't exist?

> > +           if (success)
> > +             {
> > +               rtx par = trueop1;
> > +               if (subreg_offset)
> > +                 {
> > +                   rtvec vec = rtvec_alloc (l1);
> > +                   for (int i = 0; i < l1; i++)
> > +                     RTVEC_ELT (vec, i)
> > +                       = GEN_INT (INTVAL (XVECEXP (trueop1, 0, i)
> > +                                          + subreg_offset));
> > +                   par = gen_rtx_PARALLEL (VOIDmode, vec);
> > +                 }
> > +               return gen_rtx_VEC_SELECT (mode, XEXP (trueop0, 0), par);
> > +             }
> > +         }
>
> subreg_offset will differ in meaning if big-endian; is this correct
Yes.
> there, do all the stars align so this code works out fine there as well?
>

i found it's a bit tricky to adjust selection index for target
BYTES_BIG_ENDIA != WORDS_BIG_ENDIAN.
Especially for component mode smaller than word, Any interface to handle this?

> Looks fine otherwise, thanks :-)
>
>
> Segher



-- 
BR,
Hongtao

Reply via email to