On 9/2/20 3:34 AM, Hongtao Liu via Gcc-patches wrote:
> Hi:
>   Add define_peephole2 to eliminate potential redundant conversion
> from mask to vector.
>   Bootstrap is ok, regression test is ok for i386/x86-64 backend.
>   Ok for trunk?
>
> gcc/ChangeLog:
>         PR target/96891
>         * config/i386/sse.md (VI_128_256): New mode iterator.
>         (define_peephole2): Lower avx512 vector compare to avx version
>         when dest is vector.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/avx512bw-pr96891-1.c: New test.
>         * gcc.target/i386/avx512f-pr96891-1.c: New test.
>         * gcc.target/i386/avx512f-pr96891-2.c: New test.

Aren't these the two insns in question:

(insn 7 4 8 2 (set (reg:QI 86)
        (unspec:QI [
                (reg:V8SF 90)
                (reg:V8SF 89)
                (const_int 2 [0x2])
            ] UNSPEC_PCMP)) "j.c":4:14 1911 {avx512vl_cmpv8sf3}
     (expr_list:REG_DEAD (reg:V8SF 90)
        (expr_list:REG_DEAD (reg:V8SF 89)
            (nil))))
(note 8 7 9 2 NOTE_INSN_DELETED)
(insn 9 8 14 2 (set (reg:V8SI 82 [ _2 ])
        (vec_merge:V8SI (const_vector:V8SI [
                    (const_int -1 [0xffffffffffffffff]) repeated x8
                ])
            (const_vector:V8SI [
                    (const_int 0 [0]) repeated x8
                ])
            (reg:QI 86))) "j.c":4:14 2705 {*avx512vl_cvtmask2dv8si}
     (expr_list:REG_DEAD (reg:QI 86)
        (nil)))

Note there's a data dependency between them.  insn 7 feeds insn 9.  When
there's a data dependency, combiner patterns are usually the better
choice than peepholes.  I think you'd be looking to match something like
this (from the .combine dump):

(set (reg:V8SI 82 [ _2 ])
    (vec_merge:V8SI (const_vector:V8SI [
                (const_int -1 [0xffffffffffffffff]) repeated x8
            ])
        (const_vector:V8SI [
                (const_int 0 [0]) repeated x8
            ])
        (unspec:QI [
                (reg:V8SF 90)
                (reg:V8SF 89)
                (const_int 2 [0x2])
            ] UNSPEC_PCMP)))

Jeff
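
For readers without the testcase at hand, here is a minimal C sketch of the source-level operation the two insns describe: a V8SF compare whose result is wanted as a V8SI of all-ones/all-zeros lanes. The actual j.c is not shown in this thread, so the function names and the choice of "<=" are assumptions (predicate 2 in the compare is taken here to mean "less than or equal"). The sketch uses GNU C vector extensions, and whether a given GCC version and flag set emits exactly the insns above is not something it guarantees.

```c
/* Hypothetical reproducer sketch; the real j.c from the thread is not shown.  */
#include <assert.h>

typedef float v8sf __attribute__ ((vector_size (32)));
typedef int   v8si __attribute__ ((vector_size (32)));

/* GNU C vector comparison: each result lane is -1 where the comparison
   holds and 0 where it does not, which is the same value the
   vec_merge of the -1 and 0 const_vectors produces per mask bit.  */
v8si
cmp_le (v8sf a, v8sf b)
{
  return a <= b;
}

/* Check every lane against the equivalent scalar comparison.  */
void
check_cmp_le (v8sf a, v8sf b)
{
  v8si r = cmp_le (a, b);
  for (int i = 0; i < 8; i++)
    assert (r[i] == (a[i] <= b[i] ? -1 : 0));
}
```

Since the destination is already a vector, the mask register in between is only a temporary, which is the redundancy the proposed pattern is meant to remove.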