https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 33763
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33763&action=edit
gcc5-pr63594-wip2.patch

Updated WIP patch, which attempts to generate better code using inter-unit
moves, but also has memory as an alternative, so it allows RA to choose what
is best.  This still generates imperfect code for V2DI/V4DI loads from GPRs
without -mavx512f (but e.g. vec_concatv2di uses the Yi constraint).
Also, for AVX512-{F,BW,VL}, I'm surprised that the broadcasts from GPRs are done
as different instructions from broadcasts from memory or a vector reg; I would
have thought that would have been done using a single insn with alternatives.
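
For readers following along, here is a minimal sketch of the kind of source
that exercises V2DI/V4DI broadcasts from a GPR.  It is not the PR's testcase:
the function names are hypothetical and it assumes GCC's generic vector
extension.  Results are stored through a pointer so the file also compiles
without AVX enabled.

```c
/* Illustrative only: hypothetical helpers, not taken from the PR.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef long long v4di __attribute__ ((vector_size (32)));

/* Broadcast a scalar that arrives in a GPR into every V2DI element.  */
void
broadcast_v2di (v2di *out, long long x)
{
  *out = (v2di) { x, x };
}

/* Same for V4DI; how this is expanded depends on the enabled ISA.  */
void
broadcast_v4di (v4di *out, long long x)
{
  *out = (v4di) { x, x, x, x };
}
```

Comparing the assembly at e.g. -O2 -mavx2 and -O2 -mavx512f is one way to see
whether the value goes through an inter-unit move or through memory.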
