https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103905

--- Comment #6 from Uroš Bizjak <ubizjak at gmail dot com> ---
@Jakub: It looks like the problem is in expand_vec_perm_pshufb, where the
permutation vector is recalculated for partial vectors:

  if (vmode == V4QImode
      || vmode == V8QImode)
    {
      rtx m128 = GEN_INT (-128);

      /* Remap elements from the second operand, as we have to
         account for inactive top elements from the first operand.  */
      if (!d->one_operand_p)
        {
          int sz = GET_MODE_SIZE (vmode);

          for (i = 0; i < nelt; ++i)
            {
              int ival = INTVAL (rperm[i]);
              if (ival >= sz)
                ival += 16-sz;
              rperm[i] = GEN_INT (ival);
            }
        }

      /* V4QI/V8QI is emulated with V16QI instruction, fill inactive
         elements in the top positions with zeros.  */
      for (i = nelt; i < 16; ++i)
        rperm[i] = m128;

      vpmode = V16QImode;
    }

I must admit I only eyeballed the generated code, so perhaps there lies the
dragon.
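
To make the remapping easier to check by hand, here is a minimal standalone
sketch of the same index arithmetic on plain ints. It is not the GCC code:
remap_partial_selector and SEL_ZERO are hypothetical names, and it assumes the
two-operand selector numbers the first 16-byte register as 0..15 and the
second from 16 upwards, which is what the "ival += 16-sz" adjustment implies.

```c
#include <assert.h>

/* Stand-in for GEN_INT (-128): a selector byte with bit 7 set,
   which the quoted comment says fills the element with zero.  */
#define SEL_ZERO (-128)

/* Hypothetical helper mirroring the quoted remap loop on plain ints.
   perm: nelt selector indices for an sz-byte partial vector
         (0..sz-1 pick from the first operand, sz..2*sz-1 from the second).
   out:  16 selector entries for the full 16-byte instruction.  */
static void
remap_partial_selector (const int *perm, int nelt, int sz,
                        int one_operand_p, int out[16])
{
  int i;

  assert (nelt <= 16 && sz <= 16);

  for (i = 0; i < nelt; ++i)
    {
      int ival = perm[i];
      /* Skip the 16 - sz inactive top bytes of the first operand
         (assumption: the second operand starts at index 16).  */
      if (!one_operand_p && ival >= sz)
        ival += 16 - sz;
      out[i] = ival;
    }

  /* Inactive top elements of the result are zeroed.  */
  for (i = nelt; i < 16; ++i)
    out[i] = SEL_ZERO;
}
```

For a V8QI-sized interleave {0, 8, 1, 9, 2, 10, 3, 11} with sz = 8, this gives
{0, 16, 1, 17, 2, 18, 3, 19} followed by eight -128 entries, so any mismatch
with the generated code should show up in the first eight selector bytes.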
