https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115069

--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #8)
> A better patch:

The real issue is that the following permutation (truncation):

+      for (i = 0; i < d.nelt; ++i)
+       d.perm[i] = i * 2;
+
+      ok = ix86_expand_vec_perm_const_1 (&d);

results in slow code involving VPERMQ. Ideally, ix86_expand_vec_perm_const_1
should emit faster code for truncation, because this would benefit other code
as well.
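
For reference, here is a minimal scalar sketch of why the permutation
d.perm[i] = i * 2 amounts to a truncation. It is not GCC code: model_perm,
even_perm_is_truncation and NELT are hypothetical names for illustration.
It assumes the usual two-operand convention (indices below nelt select from
the first operand, the rest from the second) and a little-endian target
such as x86.

```c
#include <stdint.h>
#include <string.h>

#define NELT 8  /* illustrative element count, not taken from the patch */

/* Hypothetical scalar model of a two-operand constant permutation:
   index k selects op0[k] for k < nelt and op1[k - nelt] otherwise.  */
static void
model_perm (const uint16_t *op0, const uint16_t *op1,
            const unsigned *perm, uint16_t *out, unsigned nelt)
{
  for (unsigned i = 0; i < nelt; ++i)
    out[i] = perm[i] < nelt ? op0[perm[i]] : op1[perm[i] - nelt];
}

/* Returns 1 if selecting the even 16-bit elements gives the same result
   as truncating each 32-bit element to 16 bits (little-endian assumed).  */
int
even_perm_is_truncation (void)
{
  uint32_t wide[NELT];
  uint16_t halves[2 * NELT], permuted[NELT];
  unsigned perm[NELT];

  for (unsigned i = 0; i < NELT; ++i)
    {
      wide[i] = 0xABCD0000u + i * 0x10001u;  /* distinct high and low halves */
      perm[i] = i * 2;                       /* same indices as in the patch */
    }

  /* View the NELT 32-bit values as two operands of NELT 16-bit elements.  */
  memcpy (halves, wide, sizeof wide);
  model_perm (halves, halves + NELT, perm, permuted, NELT);

  for (unsigned i = 0; i < NELT; ++i)
    if (permuted[i] != (uint16_t) wide[i])
      return 0;
  return 1;
}
```

Calling even_perm_is_truncation () from a test driver should return 1 on
x86. On a big-endian target the even elements would be the high halves
instead, so the check would fail.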
