On 06/12/2019 18:21, Richard Sandiford wrote:
Andrew Stubbs <andrew_stu...@mentor.com> writes:
Hi all,
This patch re-enables V64QImode and V64HImode for GCN.
GCC does not make these easy to work with because there is (was?) an
assumption that vector registers do not have excess bits, and therefore
that there is no need to worry about truncating or extending smaller
types when vectorized. This is not true on GCN, where each vector lane
is always at least 32 bits wide, so we only really implement loading and
storing these vector modes (for now).
FWIW, partial SVE modes work the same way, and this is supposed to be
supported now. E.g. SVE's VNx4QI is a vector of QIs stored in SI
containers; in other words, it's a VNx4SI in which only the low 8 bits
of each SI are used.
sext_optab, zext_optab and trunc_optab now support vector modes,
so e.g. extendv64qiv64si2 provides sign extension from V64QI to V64SI.
At the moment, in-register truncations like truncv64siv64qi2 have to
be provided as patterns, even though they're no-ops for the target
machine, since they're not no-ops in rtl terms.
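As a rough illustration of the container layout described above, and of
why an in-register truncation can be a no-op for the machine while still
being a distinct operation in rtl terms, here is a small C model. It is
only a sketch, not GCC code: the 4-lane width, the vreg type and all the
function names are hypothetical and chosen for brevity (GCN vectors have
64 lanes).

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4 /* hypothetical lane count, kept small for illustration */

/* One 32-bit container per lane; a QI element uses only the low 8 bits. */
typedef struct { uint32_t lane[LANES]; } vreg;

/* Model of QI -> SI sign extension: copy bit 7 into the upper 24 bits.
   Done in unsigned arithmetic so the result is fully defined in C. */
static vreg extend_qi_to_si(vreg v)
{
    for (int i = 0; i < LANES; i++)
        v.lane[i] = ((v.lane[i] & 0xffu) ^ 0x80u) - 0x80u;
    return v;
}

/* Model of SI -> QI truncation: the register bits are left untouched.
   Only the interpretation changes: the upper 24 bits of each container
   stop being meaningful. */
static vreg trunc_si_to_qi(vreg v)
{
    return v;
}

/* Small self-check of the model. */
void check_container_model(void)
{
    vreg v = { { 0xDEADBEF0u, 0x00000005u, 0x0000007fu, 0x00000080u } };

    /* Truncation changes no bits in the register... */
    vreg t = trunc_si_to_qi(v);
    assert(t.lane[0] == 0xDEADBEF0u);

    /* ...but extending the truncated value back does not recover the
       original SI value, so the two modes are not interchangeable. */
    vreg e = extend_qi_to_si(t);
    assert(e.lane[0] == 0xFFFFFFF0u);
    assert(e.lane[1] == 0x00000005u);
    assert(e.lane[2] == 0x0000007fu);
    assert(e.lane[3] == 0xFFFFFF80u);
}
```

In this model the truncation costs nothing at run time, yet a later
extension still has real work to do, which is the sense in which the two
operations differ even though the register contents are shared.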
And the main snag is rtl, because this isn't the way GCC expects vector
registers to be laid out. It looks like you already handle that in
TARGET_CAN_CHANGE_MODE_CLASS and TARGET_SECONDARY_RELOAD though.
For SVE, partial vector loads are actually extending loads and partial
vector stores are truncating stores. Maybe it's the same for amdgcn.
If so, there's a benefit to providing both native movv64qis
and V64QI->V64SI extending loads, i.e. a combine pattern that fuses
movv64qi with a sign_extend or zero_extend.
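To make the extending-load and truncating-store idea concrete, here is a
second C model. Again this is only a sketch under stated assumptions: the
4-lane width, the vreg type and the function names are hypothetical, and
the plain byte load is assumed to zero-extend, which is one possible
behaviour rather than a statement about either target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4 /* hypothetical lane count, kept small for illustration */

/* One 32-bit container per lane. */
typedef struct { uint32_t lane[LANES]; } vreg;

/* Plain byte load, assumed here to zero-extend into each container. */
static vreg load_qi_zext(const uint8_t *mem)
{
    vreg v;
    for (int i = 0; i < LANES; i++)
        v.lane[i] = mem[i];
    return v;
}

/* Fused load: read a byte and sign-extend it in one step. */
static vreg load_qi_sext(const uint8_t *mem)
{
    vreg v;
    for (int i = 0; i < LANES; i++)
        v.lane[i] = ((uint32_t)mem[i] ^ 0x80u) - 0x80u;
    return v;
}

/* Separate in-register sign extension of the low 8 bits of each lane. */
static vreg sext_in_reg(vreg v)
{
    for (int i = 0; i < LANES; i++)
        v.lane[i] = ((v.lane[i] & 0xffu) ^ 0x80u) - 0x80u;
    return v;
}

/* Truncating store: only the low 8 bits of each container reach memory. */
static void store_qi_trunc(uint8_t *mem, vreg v)
{
    for (int i = 0; i < LANES; i++)
        mem[i] = (uint8_t)(v.lane[i] & 0xffu);
}

/* Small self-check of the model. */
void check_load_store_model(void)
{
    const uint8_t src[LANES] = { 0xF0, 0x05, 0x7f, 0x80 };
    uint8_t dst[LANES];

    /* The fused load matches a plain load followed by an extension. */
    vreg fused = load_qi_sext(src);
    vreg split = sext_in_reg(load_qi_zext(src));
    assert(memcmp(&fused, &split, sizeof fused) == 0);

    /* A truncating store discards the extension bits again. */
    store_qi_trunc(dst, fused);
    assert(memcmp(src, dst, sizeof src) == 0);
}
```

The point of the fused form is that it yields the same register contents
as the two-step sequence while needing only one operation.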
(Probably none of that is news, sorry, just saying in case.)
Thanks, Richard.
That it's now supposed to work is news to me; good news! :-)
GCN has both unsigned and signed subword loads, so we should be able to
have both independent and combined loads.
How does the middle end know that QImode and HImode should be extended
before use? Is there a hook for that?
I suppose I need to go read what you changed in the internals documentation.
Andrew