Hongtao Liu via Gcc-patches <[email protected]> writes:
> + /* Simplify vec_select of a subreg of X to just a vec_select of X
> + when X has same component mode as vec_select. */
> + int l2;
> + if (GET_CODE (trueop0) == SUBREG
> + && GET_MODE_INNER (mode)
> + == GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
Better to use SUBREG_REG here and below.
> + && (GET_MODE_NUNITS (GET_MODE (trueop0))).is_constant (&l0)
> + && (GET_MODE_NUNITS (mode)).is_constant (&l1)
> + && (GET_MODE_NUNITS (GET_MODE (XEXP (trueop0, 0))))
> + .is_constant (&l2)
> + && known_le (l1, l2))
> + {
> + unsigned HOST_WIDE_INT subreg_offset = 0;
> + gcc_assert (known_eq (XVECLEN (trueop1, 0), l1));
> + gcc_assert (can_div_trunc_p (exact_div (subreg_lsb (trueop0),
> + BITS_PER_UNIT),
> + GET_MODE_SIZE (GET_MODE_INNER (mode)),
> + &subreg_offset));
can_div_trunc_p discards the remainder, whereas it looks like here
you want an exact multiple.
I don't think it's absolutely guaranteed that the “if” condition makes
the division by GET_MODE_SIZE exact. E.g. in principle you could have
a subreg of a vector of TIs in which the subreg offset is misaligned by
a DI offset.
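To make the difference concrete, here is a standalone sketch with plain integers. It is not GCC code: div_trunc and is_exact_multiple are made-up stand-ins for a truncating division and an exact-multiple test, and the 16-byte TI and 8-byte DI sizes are the usual ones, assumed here for illustration.

```cpp
// Standalone model with plain integers; not GCC code.
// All offsets and sizes are in bytes.

// Hypothetical stand-in for a truncating division: it always yields a
// quotient and silently drops any remainder.
constexpr unsigned div_trunc (unsigned offset, unsigned unit_size)
{
  return offset / unit_size;
}

// Hypothetical stand-in for an exact-multiple test: true only when
// OFFSET is a whole number of units.
constexpr bool is_exact_multiple (unsigned offset, unsigned unit_size)
{
  return offset % unit_size == 0;
}

// Assumed sizes: a TI element is 16 bytes, a DI is 8 bytes.
constexpr unsigned ti_size = 16;
constexpr unsigned di_size = 8;

// An offset of two whole TI elements: both functions agree on lane 2.
static_assert (is_exact_multiple (2 * ti_size, ti_size), "aligned offset");
static_assert (div_trunc (2 * ti_size, ti_size) == 2, "lane 2");

// An offset misaligned by one DI: truncation reports lane 0 even though
// the subreg does not start on a lane boundary.
static_assert (div_trunc (di_size, ti_size) == 0, "remainder dropped");
static_assert (!is_exact_multiple (di_size, ti_size), "offset rejected");
```

The static_asserts are checked at compile time, so compiling the file is enough to confirm the behaviour.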
I'm not sure the subreg_lsb conversion is correct though. On big-endian
targets, lane numbering follows memory layout, just like subreg byte
offsets do. So ISTM that using SUBREG_BYTE (as per the earlier patch)
was correct.
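As a rough illustration of why the two differ on big-endian targets, here is a simplified standalone model. It is not GCC code: model_subreg_lsb, lane_from_byte and lane_from_lsb are hypothetical helpers, and the big-endian formula assumes a non-paradoxical subreg on a target where byte and word endianness agree.

```cpp
// Simplified standalone model; not GCC code.  Sizes and offsets are in
// bytes unless a name says "bits".

constexpr unsigned bits_per_unit = 8;

// Hypothetical model of the bit position of a subreg's least
// significant bit within the inner value.  On little-endian it grows
// with the byte offset; on big-endian it is counted from the other end.
constexpr unsigned model_subreg_lsb (bool big_endian, unsigned inner_size,
                                     unsigned outer_size, unsigned byte_offset)
{
  return (big_endian ? inner_size - (byte_offset + outer_size)
                     : byte_offset) * bits_per_unit;
}

// Lane index derived directly from the byte offset (memory order).
constexpr unsigned lane_from_byte (unsigned byte_offset, unsigned unit_size)
{
  return byte_offset / unit_size;
}

// Lane index derived from the lsb position instead.
constexpr unsigned lane_from_lsb (unsigned lsb_bits, unsigned unit_size)
{
  return lsb_bits / (unit_size * bits_per_unit);
}

// Example: a V2SI subreg at byte offset 0 of a V4SI value
// (16-byte inner value, 8-byte outer value, 4-byte elements).

// Little-endian: both derivations give lane 0.
static_assert (lane_from_byte (0, 4) == 0, "memory-order lane");
static_assert (lane_from_lsb (model_subreg_lsb (false, 16, 8, 0), 4) == 0,
               "lsb-derived lane matches on little-endian");

// Big-endian: the lsb-derived lane is 2, although the subreg starts at
// memory-order lane 0.
static_assert (lane_from_lsb (model_subreg_lsb (true, 16, 8, 0), 4) == 2,
               "lsb-derived lane differs on big-endian");
```

In this model the byte offset gives the memory-order lane on both endiannesses, whereas the lsb-based value only does so on little-endian.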
In summary, I think the “if” condition should include something like:
constant_multiple_p (SUBREG_BYTE (trueop0),
GET_MODE_UNIT_SIZE (mode),
&subreg_offset)
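For illustration only, here is a standalone sketch of how such a lane offset might then be applied to a selector index. It is not GCC code: remap_lane is a hypothetical helper, the "add the offset to each index" step is my assumption about how the simplification would use subreg_offset, and the subreg byte offset and the element size are both taken to be in bytes.

```cpp
// Standalone model with plain integers; not GCC code.

// Hypothetical helper: map lane LANE of a subreg starting at byte
// SUBREG_BYTE to the corresponding lane of the inner vector.  Returns
// -1 when the subreg does not start on an element boundary, in which
// case the simplification would not apply.
constexpr int remap_lane (unsigned lane, unsigned subreg_byte,
                          unsigned unit_size)
{
  return subreg_byte % unit_size != 0
         ? -1
         : static_cast<int> (lane + subreg_byte / unit_size);
}

// A V2SI subreg at byte 8 of a V4SI value: lane 1 of the subreg is
// lane 3 of the inner vector.
static_assert (remap_lane (1, 8, 4) == 3, "offset of two lanes");

// A subreg at byte 0 leaves lane numbers unchanged.
static_assert (remap_lane (1, 0, 4) == 1, "no offset");

// A byte offset of 6 is not a multiple of the 4-byte element size.
static_assert (remap_lane (0, 6, 4) == -1, "misaligned offset rejected");
```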
Thanks,
Richard