https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119368
Hongtao Liu <liuhongt at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |liuhongt at gcc dot gnu.org
--- Comment #3 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> So it seems sse.md expects combiner to be able to simplify the vec_select of
> mem into shorter mem, while combiner doesn't do that?
There is related code in simplify-rtx.cc:
      /* If we select a low-part subreg, return that.  */
      if (vec_series_lowpart_p (mode, GET_MODE (trueop0), trueop1))
        {
          rtx new_rtx = lowpart_subreg (mode, trueop0, GET_MODE (trueop0));
          if (new_rtx != NULL_RTX)
            return new_rtx;
        }
but vec_series_lowpart_p relies on targetm.can_change_mode_class (op_mode,
result_mode, ALL_REGS), which returns false for x86:
/* Return true if, for all OP of mode OP_MODE:

     (vec_select:RESULT_MODE OP SEL)

   is equivalent to the lowpart RESULT_MODE of OP.  */

bool
vec_series_lowpart_p (machine_mode result_mode, machine_mode op_mode, rtx sel)
{
  int nunits;
  if (GET_MODE_NUNITS (op_mode).is_constant (&nunits)
      && targetm.can_change_mode_class (op_mode, result_mode, ALL_REGS))
    {
      int offset = BYTES_BIG_ENDIAN ? nunits - XVECLEN (sel, 0) : 0;
      return rtvec_series_p (XVEC (sel, 0), offset);
    }
  return false;
}

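To make the selector test concrete, here is a small standalone model of the
series check, not GCC code. It assumes rtvec_series_p accepts exactly the
selectors { START, START+1, ... }; the names toy_series_p and
toy_lowpart_selector_p are made up for illustration, and the target hook is
deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for rtvec_series_p (assumption: the real function checks that
// the selector elements are START, START+1, START+2, ...).
static bool toy_series_p(const std::vector<int>& sel, int start) {
    for (std::size_t i = 0; i < sel.size(); ++i)
        if (sel[i] != start + static_cast<int>(i))
            return false;
    return true;
}

// Toy model of the selector part of vec_series_lowpart_p: the lowpart starts
// at element 0 on little-endian and at nunits - len on big-endian.
// The can_change_mode_class test is intentionally omitted here.
static bool toy_lowpart_selector_p(int nunits, const std::vector<int>& sel,
                                   bool big_endian) {
    // Precondition: the selector cannot be longer than the source vector.
    assert(nunits >= static_cast<int>(sel.size()));
    int offset = big_endian ? nunits - static_cast<int>(sel.size()) : 0;
    return toy_series_p(sel, offset);
}
```

With an 8-element source, the selector {0, 1, 2, 3} counts as the lowpart on a
little-endian target, while {4, 5, 6, 7} counts only on a big-endian one.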
I once tried to enable it for x86 (always using subreg instead of vec_select),
but it regressed lots of testcases, some of which need backend pattern changes
and some of which need middle-end adjustments.
But for this case, I think the targetm.can_change_mode_class (op_mode,
result_mode, ALL_REGS) check is not needed, since the operand is in memory.
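As a rough illustration of that idea (a standalone toy, not a proposed patch),
the mode-class check could be skipped when the operand is a MEM. Everything
here is hypothetical: OperandKind, toy_can_change_mode_class and
toy_lowpart_ok do not exist in GCC, and the stub returns false only to mimic
the x86 behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operand classification; real RTL would test MEM_P (op).
enum class OperandKind { Reg, Mem };

// Hypothetical stub for targetm.can_change_mode_class (..., ALL_REGS).
// Returns false to mimic what the comment reports for x86.
static bool toy_can_change_mode_class() { return false; }

// True if SEL is OFFSET, OFFSET+1, ... (toy model of the series check).
static bool toy_is_series(const std::vector<int>& sel, int offset) {
    for (std::size_t i = 0; i < sel.size(); ++i)
        if (sel[i] != offset + static_cast<int>(i))
            return false;
    return true;
}

// Sketch: only register operands need the mode-class check; a memory
// operand can be narrowed without changing any register's mode.
static bool toy_lowpart_ok(OperandKind kind, int nunits,
                           const std::vector<int>& sel, bool big_endian) {
    // Precondition: the selector cannot be longer than the source vector.
    assert(nunits >= static_cast<int>(sel.size()));
    if (kind != OperandKind::Mem && !toy_can_change_mode_class())
        return false;
    int offset = big_endian ? nunits - static_cast<int>(sel.size()) : 0;
    return toy_is_series(sel, offset);
}
```

In this model a register operand is still rejected, while a memory operand
with a lowpart selector is accepted; an actual change in simplify-rtx.cc may
need additional conditions.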