Hongtao Liu <[email protected]> writes:
>> > + && (GET_MODE_NUNITS (mode)).is_constant (&l1)
>> > + && (GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0))))
>> > + .is_constant (&l2)
>> > + && known_le (l1, l2)
>>
>> I'm not sure the last two &&s are really the important condition.
>> I think we should drop them for the suggestion below.
>>
>
> Changed, assume gcc also support something like (vec_select:v4di
> (reg:v2di) (parallel [ (const_int 0) (const_int 1) (const_int 1)
> (const_int 0)]))
> as long as the range of selection guaranteed by
> || maybe_ge (UINTVAL (idx) + subreg_offset, nunits))
Yeah, that vec_select looks OK.
>>
>> > + if (!CONST_INT_P (idx))
>>
>> Here I think we should check:
>>
>> || maybe_ge (UINTVAL (idx) + subreg_offset, nunits))
>>
>> where:
>>
>> poly_uint64 nunits
>> = GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0)))).
>>
>
> Changed.
>
>> This makes sure that all indices are in range. In particular, it's
>> valid for the SUBREG_REG to be narrower than mode, for appropriate
>> vec_select indices
>>
>
> Yes, that's what paradoxical subreg means.
But I was comparing the mode of the vec_select with the mode of the
SUBREG_REG (rather than the mode of trueop0 with the mode of the
SUBREG_REG, which is what matters for paradoxical subregs).
> + /* Simplify vec_select of a subreg of X to just a vec_select of X
> + when X has same component mode as vec_select. */
> + unsigned HOST_WIDE_INT subreg_offset = 0;
> + if (GET_CODE (trueop0) == SUBREG
> + && GET_MODE_INNER (mode)
> + == GET_MODE_INNER (GET_MODE (SUBREG_REG (trueop0)))
> + && (GET_MODE_NUNITS (mode)).is_constant (&l1)
Unnecessary brackets around “GET_MODE_NUNITS (mode)”.
> + && constant_multiple_p (SUBREG_BYTE (trueop0),
> + GET_MODE_UNIT_BITSIZE (mode),
> + &subreg_offset))
Sorry, my bad, this should be:
&& constant_multiple_p (subreg_memory_offset (trueop0),
GET_MODE_UNIT_BITSIZE (mode),
&subreg_offset))
> + {
> + gcc_assert (XVECLEN (trueop1, 0) == l1);
> + bool success = true;
> + poly_uint64 nunits
> + = GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0)));
> + for (int i = 0; i != l1; i++)
> + {
> + rtx idx = XVECEXP (trueop1, 0, i);
> + if (!CONST_INT_P (idx)
> + || maybe_ge (UINTVAL (idx) + subreg_offset, nunits))
> + {
> + success = false;
> + break;
> + }
> + }
> + if (success)
> + {
> + rtx par = trueop1;
> + if (subreg_offset)
> + {
> + rtvec vec = rtvec_alloc (l1);
> + for (int i = 0; i < l1; i++)
> + RTVEC_ELT (vec, i)
> + = GEN_INT (INTVAL (XVECEXP (trueop1, 0, i)
> + + subreg_offset));
This is applying subreg_offset to the pointer rather than the INTVAL.
It should be:
= GEN_INT (UINTVAL (XVECEXP (trueop1, 0, i))
+ subreg_offset);
OK with those changes, thanks.
Richard