https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92645

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com,
                   |                            |uros at gcc dot gnu.org

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
So it's probably better but not great yet.  It would help tremendously to look
at PR92658; for this testcase we specifically need truncv16hiv16qi2, which
can be implemented with pshufb (there may also be better options).  I tried
the following (the upper half of the selector is just "garbage"):

(define_expand "truncv16hiv16qi2"
  [(set (subreg:V32QI (match_operand:V16QI 0 "register_operand") 0)
        (vec_select:V32QI
          (subreg:V32QI (match_operand:V16HI 1 "register_operand") 0)
          (parallel [(const_int 0) (const_int 2)
                     (const_int 4) (const_int 6)
                     (const_int 8) (const_int 10)
                     (const_int 12) (const_int 14)
                     (const_int 16) (const_int 18)
                     (const_int 20) (const_int 22)
                     (const_int 24) (const_int 26)
                     (const_int 28) (const_int 30)
                     (const_int 0) (const_int 2)
                     (const_int 4) (const_int 6)
                     (const_int 8) (const_int 10)
                     (const_int 12) (const_int 14)
                     (const_int 16) (const_int 18)
                     (const_int 20) (const_int 22)
                     (const_int 24) (const_int 26)
                     (const_int 28) (const_int 30)
                     ])))]
  "TARGET_AVX2") 

but that isn't recognized, possibly because of the outer subreg; who
knows.
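
For reference, here is a scalar C sketch of what the selector above is meant
to compute. It is not GCC code: the function names are made up for
illustration, and the x86 little-endian byte layout is written out explicitly.
It only checks that picking the even-numbered bytes of the V32QI view gives
the truncated V16HI elements in the low 16 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model, not GCC code: apply a 32-entry byte selector
   to the V32QI view of a V16HI value (x86 little-endian layout).  */
static void select_v32qi(const uint16_t in[16], const uint8_t sel[32],
                         uint8_t out[32])
{
    uint8_t bytes[32];
    for (int i = 0; i < 16; i++) {
        bytes[2 * i] = (uint8_t)(in[i] & 0xff);   /* low byte comes first */
        bytes[2 * i + 1] = (uint8_t)(in[i] >> 8); /* then the high byte */
    }
    for (int i = 0; i < 32; i++)
        out[i] = bytes[sel[i]];
}

/* Build the selector from the expander: even indices 0..30, twice.  */
static void trunc_selector(uint8_t sel[32])
{
    for (int i = 0; i < 32; i++)
        sel[i] = (uint8_t)(2 * (i % 16));
}

/* Returns 1 if the low 16 selected bytes equal the truncated elements.  */
static int selector_truncates(const uint16_t in[16])
{
    uint8_t sel[32], out[32];
    trunc_selector(sel);
    select_v32qi(in, sel, out);
    for (int i = 0; i < 16; i++)
        if (out[i] != (uint8_t)in[i])
            return 0;
    return 1;
}

/* Small self-check on an arbitrary input pattern.  */
static void check_selector(void)
{
    uint16_t in[16];
    for (int i = 0; i < 16; i++)
        in[i] = (uint16_t)(0x1234 * (i + 1));
    assert(selector_truncates(in));
}
```

The upper 16 output bytes just repeat the lower 16, which matches the
"garbage" upper half of the selector.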

(define_insn "truncv16hiv16qi2"
 [(set (match_operand:V16QI 0 "register_operand" "=x,v")
       (truncate:V16QI
        (match_operand:V16HI 1 "register_operand" "x,v")))]
 "TARGET_AVX2"
 "@
  pshufb\t{%1, %0|%0, %1}
  vpshufb\t{%1, %0|%0, %1}")

"works" but is of course wrong (the constant mask somehow needs to be in a
register).  The rest of the backend also doesn't know about truncate of
vectors, so representing this as shuffles is probably better.
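
To make the "constant mask" point concrete, here is a scalar C sketch of a
shuffle-based truncation. The model follows the documented behaviour of
256-bit vpshufb, which shuffles each 128-bit lane independently and zeroes a
byte when bit 7 of its mask byte is set. Everything else is an assumption for
illustration: the helper names are hypothetical, and the final step stands in
for whatever cross-lane permute (vpermq or similar) an expander would emit.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of 256-bit vpshufb: each 128-bit lane is shuffled on its
   own.  A mask byte with bit 7 set yields zero; otherwise its low 4 bits
   index into the same lane of the source.  */
static void model_vpshufb256(const uint8_t src[32], const uint8_t mask[32],
                             uint8_t dst[32])
{
    for (int lane = 0; lane < 2; lane++)
        for (int i = 0; i < 16; i++) {
            uint8_t m = mask[16 * lane + i];
            dst[16 * lane + i] = (m & 0x80) ? 0 : src[16 * lane + (m & 0x0f)];
        }
}

/* Hypothetical helper: truncate 16 x u16 to 16 x u8 using the model.  */
static void model_trunc_v16hi(const uint16_t in[16], uint8_t out[16])
{
    uint8_t src[32], mask[32], tmp[32];
    for (int i = 0; i < 16; i++) {
        src[2 * i] = (uint8_t)(in[i] & 0xff);   /* little-endian layout */
        src[2 * i + 1] = (uint8_t)(in[i] >> 8);
    }
    /* Constant mask: even bytes go to the low half of each lane, the
       high half of each lane is zeroed.  */
    for (int i = 0; i < 32; i++)
        mask[i] = (i % 16 < 8) ? (uint8_t)(2 * (i % 16)) : 0x80;
    model_vpshufb256(src, mask, tmp);
    /* Cross-lane step: gather quadword 0 and quadword 2 of the result.  */
    for (int i = 0; i < 8; i++) {
        out[i] = tmp[i];
        out[8 + i] = tmp[16 + i];
    }
}

/* Small self-check on an arbitrary input pattern.  */
static void check_model_trunc(void)
{
    uint16_t in[16];
    uint8_t out[16];
    for (int i = 0; i < 16; i++)
        in[i] = (uint16_t)(0x0101 * i + 0x8000);
    model_trunc_v16hi(in, out);
    for (int i = 0; i < 16; i++)
        assert(out[i] == (uint8_t)in[i]);
}
```

In this model a single in-lane shuffle leaves the two result halves in
different lanes, so the mask load and a lane-crossing permute both need to be
expressed somewhere.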

I also wonder how to macroize all this - probably via some
helpers in i386-expand.c I guess.

But I can also work with the pack/unpack tree codes for now.
