On 5/5/2022 2:26 AM, Richard Biener via Gcc-patches wrote:
On Thu, May 5, 2022 at 7:04 AM liuhongt <hongtao....@intel.com> wrote:
Optimize

   _1 = *srcp_3(D);
   _4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>;
   _5 = BIT_FIELD_REF <_4, 128, 0>;

to

   _1 = *srcp_3(D);
   _5 = BIT_FIELD_REF <_1, 128, 128>;

The latter will finally be optimized to

   _5 = BIT_FIELD_REF <*srcp_3(D), 128, 128>;
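
(One way GIMPLE like this can arise from generic vector code; the function
below is only an illustration, not necessarily the testcase from the patch.)

   typedef int v8si __attribute__ ((vector_size (32)));
   typedef int v4si __attribute__ ((vector_size (16)));

   v4si
   high_half (v8si *srcp)
   {
     v8si mask = { 4, 5, 6, 7, 4, 5, 6, 7 };
     v8si perm = __builtin_shuffle (*srcp, mask);
     v4si lo;
     /* The low 128 bits of the permuted vector are really the high
        128 bits of the original load.  */
     __builtin_memcpy (&lo, &perm, sizeof (lo));
     return lo;
   }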

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
Hmm, tree-ssa-forwprop.cc:simplify_bitfield_ref should already
handle this in the

   if (code == VEC_PERM_EXPR
       && constant_multiple_p (bit_field_offset (op), size, &idx))
     {

part of the code - maybe that needs to be enhanced to cover
a contiguous stride in the VEC_PERM_EXPR.  I see
we have

   size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
   if (maybe_ne (bit_field_size (op), size))
     return false;

where it will currently bail, so adjust that to check for a
constant multiple.  I also think we should only handle the
case where the new bit_field_offset alignment is not
worse than the original one.
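
A rough sketch of that relaxation (untested, written against the current
simplify_bitfield_ref; the variable nelts_op and the lane walk below are
only illustrative, not the actual patch) could look like:

   unsigned HOST_WIDE_INT nelts_op;
   size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
   /* Allow the BIT_FIELD_REF to cover a whole number of elements
      rather than exactly one.  */
   if (!constant_multiple_p (bit_field_size (op), size, &nelts_op)
       || nelts_op == 0)
     return false;
   ...
   if (code == VEC_PERM_EXPR
       && constant_multiple_p (bit_field_offset (op), size, &idx))
     {
       m = gimple_assign_rhs3 (def_stmt);
       if (TREE_CODE (m) != VECTOR_CST
           || !VECTOR_CST_NELTS (m).is_constant (&nelts))
         return false;
       /* First selected lane; the remaining nelts_op - 1 lanes must
          follow consecutively so the extracted bits are contiguous
          in one of the two inputs.  */
       unsigned HOST_WIDE_INT first
         = TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx)) % (2 * nelts);
       for (unsigned HOST_WIDE_INT i = 1; i < nelts_op; ++i)
         if (TREE_INT_CST_LOW (VECTOR_CST_ELT (m, idx + i)) % (2 * nelts)
             != first + i)
           return false;
       /* ... additionally punt if the run straddles the two inputs or
          if the new offset first * size is less aligned than the
          original bit_field_offset, then rewrite the ref onto the
          selected input as in the existing single-element case.  */
     }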

That said, I'd prefer if you integrate this transform with
simplify_bitfield_ref.
I've got a hack here that tries to do something similar, but it's trying to catch the case where a CONSTRUCTOR feeds the BIT_FIELD_REF.  It walks the CONSTRUCTOR elements to see if an element has the right offset/size to satisfy the BIT_FIELD_REF.  For x264 we're often able to eliminate the VEC_PERM_EXPR entirely and just forward operands into the BIT_FIELD_REF.
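
Roughly, the walk looks like this (just a sketch of the idea, not the actual
hack; the helper calls and variable names are illustrative):

   /* When the BIT_FIELD_REF's source is a vector CONSTRUCTOR, look for
      the element whose position and width match the referenced bit
      range and forward it directly.  */
   if (code == CONSTRUCTOR)
     {
       tree ctor = gimple_assign_rhs1 (def_stmt);
       poly_uint64 pos = 0;
       unsigned HOST_WIDE_INT ix;
       tree val;
       FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (ctor), ix, val)
         {
           poly_uint64 elt_size
             = tree_to_poly_uint64 (TYPE_SIZE (TREE_TYPE (val)));
           if (known_eq (pos, bit_field_offset (op))
               && known_eq (elt_size, bit_field_size (op)))
             {
               /* The element covers exactly the referenced bits, so it
                  can replace the BIT_FIELD_REF outright.  */
               gimple_assign_set_rhs_from_tree (gsi, val);
               update_stmt (gsi_stmt (*gsi));
               return true;
             }
           pos += elt_size;
         }
     }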

I was leaning towards moving those bits into match.pd before submitting, but if you'd prefer them in tree-ssa-forwprop, that's even easier.

Jeff

