> -----Original Message-----
> From: H.J. Lu <[email protected]>
> Sent: Tuesday, May 6, 2025 2:16 PM
> To: Liu, Hongtao <[email protected]>
> Cc: GCC Patches <[email protected]>; Uros Bizjak
> <[email protected]>
> Subject: Re: [PATCH] x86: Skip if the mode size is smaller than its natural
> size
>
> On Tue, May 6, 2025 at 10:54 AM Liu, Hongtao <[email protected]>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: H.J. Lu <[email protected]>
> > > Sent: Thursday, May 1, 2025 6:39 AM
> > > To: GCC Patches <[email protected]>; Uros Bizjak
> > > <[email protected]>; Liu, Hongtao <[email protected]>
> > > Subject: [PATCH] x86: Skip if the mode size is smaller than its
> > > natural size
> > >
> > > When generating a SUBREG from V16QI to V2HF, validate_subreg fails
> > > since the V2HF size (4 bytes) is smaller than its natural size (word
> > > size).
> > > Update remove_redundant_vector_load to skip if the mode size is
> > > smaller than its natural size.
> > I think we can also handle it in replace_vector_const by inserting an
> > extra move with
> >   (set (reg:v4qi) (subreg:v4qi (v16qi const0_rtx) 0))
> > and then using a subreg with the same vector size (v2hf<->v4qi):
> >   (set (reg:v2hf) (subreg:v2hf (reg:v4qi) 0))
>
> What is the advantage of this approach? My patch uses a single instruction to
> write 4 bytes of 0s and 1s. Your suggestion needs at least one more
> instruction.
I'm not asking to do this for all cases, just for the cases where the subreg
is invalid.

@@ -3334,8 +3334,11 @@ replace_vector_const (machine_mode vector_mode, rtx vector_const,
   machine_mode mode = GET_MODE (dest);
   rtx replace;
+  if (!validate_subreg (mode, vector_mode, vector_const, 0))
+    /* Insert an extra move to avoid invalid subreg.  */
+    .........
   /* Replace the source operand with VECTOR_CONST.  */
-  if (SUBREG_P (dest) || mode == vector_mode)
+  else if (SUBREG_P (dest) || mode == vector_mode)
     replace = vector_const;
   else
     replace = gen_rtx_SUBREG (mode, vector_const, 0);

For a valid subreg, no extra instruction is needed.
I think RA can eliminate the extra move, so the optimization is not limited
to the "mode size is smaller than its natural size" case.
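
To illustrate why the two-step form should yield the same value as the direct
subreg, here is a toy model in plain C++. It is only a sketch of the bit-level
semantics: the V16QI/V4QI/V2HF types and the helper names are stand-ins made up
for this example, not GCC types or functions, and it does not model
validate_subreg or the RTL passes at all.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Toy stand-ins for the vector modes (assumption: not GCC's machine modes).
using V16QI = std::array<std::uint8_t, 16>;
using V4QI  = std::array<std::uint8_t, 4>;
using V2HF  = std::array<std::uint16_t, 2>;  // raw bits of two half floats

static_assert (sizeof (V4QI) == sizeof (V2HF), "same-size reinterpretation");

// Step 1: models (subreg:v4qi (v16qi const) 0), the low 4 bytes with the
// QI element type kept.
V4QI lowpart_v4qi (const V16QI &v)
{
  V4QI out;
  std::memcpy (out.data (), v.data (), sizeof out);
  return out;
}

// Step 2: models (subreg:v2hf (reg:v4qi) 0), a same-size bit reinterpretation.
V2HF bitcast_v2hf (const V4QI &v)
{
  V2HF out;
  std::memcpy (out.data (), v.data (), sizeof out);
  return out;
}

// The two-step sequence with the extra move.
V2HF two_step_v2hf (const V16QI &v)
{
  return bitcast_v2hf (lowpart_v4qi (v));
}

// What the direct (subreg:v2hf (v16qi const) 0) would denote if it were valid.
V2HF direct_v2hf (const V16QI &v)
{
  V2HF out;
  std::memcpy (out.data (), v.data (), sizeof out);
  return out;
}
```

For the all-zeros and all-ones constants, two_step_v2hf and direct_v2hf return
the same bits, so the extra move changes only how the value is expressed, not
the value itself.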
>
> > I think this can also pass validate_subreg.
> > >
> > > gcc/
> > >
> > > PR target/120036
> > > * config/i386/i386-features.cc (remove_redundant_vector_load):
> > > Also skip if the mode size is smaller than its natural size.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/120036
> > > * g++.target/i386/pr120036.C: New test.
> > >
> > > --
> > > H.J.
>
>
>
> --
> H.J.