On Tue, Nov 17, 2020 at 8:05 AM Jeff Law <l...@redhat.com> wrote:
>
>
> On 9/2/20 3:34 AM, Hongtao Liu via Gcc-patches wrote:
> > Hi:
> >   Add define_peephole2 to eliminate potential redundant conversion
> > from mask to vector.
> >   Bootstrap is ok, regression test is ok for i386/x86-64 backend.
> >   Ok for trunk?
> >
> > gcc/ChangeLog:
> >         PR target/96891
> >         * config/i386/sse.md (VI_128_256): New mode iterator.
> >         (define_peephole2): Lower avx512 vector compare to avx version
> >         when dest is vector.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/avx512bw-pr96891-1.c: New test.
> >         * gcc.target/i386/avx512f-pr96891-1.c: New test.
> >         * gcc.target/i386/avx512f-pr96891-2.c: New test.
>
> Aren't these the two insns in question:
>
>
> (insn 7 4 8 2 (set (reg:QI 86)
>         (unspec:QI [
>                 (reg:V8SF 90)
>                 (reg:V8SF 89)
>                 (const_int 2 [0x2])
>             ] UNSPEC_PCMP)) "j.c":4:14 1911 {avx512vl_cmpv8sf3}
>      (expr_list:REG_DEAD (reg:V8SF 90)
>         (expr_list:REG_DEAD (reg:V8SF 89)
>             (nil))))
> (note 8 7 9 2 NOTE_INSN_DELETED)
> (insn 9 8 14 2 (set (reg:V8SI 82 [ _2 ])
>         (vec_merge:V8SI (const_vector:V8SI [
>                     (const_int -1 [0xffffffffffffffff]) repeated x8
>                 ])
>             (const_vector:V8SI [
>                     (const_int 0 [0]) repeated x8
>                 ])
>             (reg:QI 86))) "j.c":4:14 2705 {*avx512vl_cvtmask2dv8si}
>      (expr_list:REG_DEAD (reg:QI 86)
>         (nil)))
>
>
> Note there's a data dependency between them.  insn 7 feeds insn 9.  When
> there's a data dependency, combiner patterns are usually the better
> choice than peepholes.  I think you'd be looking to match something
> like this (from the .combine dump):
>
> (set (reg:V8SI 82 [ _2 ])
>     (vec_merge:V8SI (const_vector:V8SI [
>                 (const_int -1 [0xffffffffffffffff]) repeated x8
>             ])
>         (const_vector:V8SI [
>                 (const_int 0 [0]) repeated x8
>             ])
>         (unspec:QI [
>                 (reg:V8SF 90)
>                 (reg:V8SF 89)
>                 (const_int 2 [0x2])
>             ] UNSPEC_PCMP)))
>
>
> Jeff
>

Yes, as discussed in [1], it may be better to refactor the avx512 integer
mask to use VnBImode.  Then unspec_pcmp could be dropped and simplify_rtx
could handle vector comparisons more effectively.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521#c4

--
BR,
Hongtao
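
For readers following the thread, here is a minimal scalar C model of the two insns quoted above and of the combined form. It is only a sketch of the semantics, not code from the patch or from GCC: all function names are hypothetical, and it assumes that predicate (const_int 2) denotes an ordered "less than or equal" compare.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 8 };  /* V8SF / V8SI: eight 32-bit lanes */

/* Models insn 7: compare each lane (a <= b) and pack the
   per-lane results into an 8-bit mask (the QI register).  */
static uint8_t cmp_le_mask(const float *a, const float *b)
{
    uint8_t mask = 0;
    for (int i = 0; i < LANES; i++)
        if (a[i] <= b[i])
            mask |= (uint8_t)(1u << i);
    return mask;
}

/* Models insn 9: vec_merge of an all-ones vector and a zero
   vector, selected lane by lane by the mask bits.  */
static void mask_to_vector(uint8_t mask, int32_t *dst)
{
    for (int i = 0; i < LANES; i++)
        dst[i] = ((mask >> i) & 1) ? -1 : 0;
}

/* Models the combined form: the compare writes -1/0 lanes
   directly, with no intermediate mask register.  */
static void cmp_le_vector(const float *a, const float *b, int32_t *dst)
{
    for (int i = 0; i < LANES; i++)
        dst[i] = (a[i] <= b[i]) ? -1 : 0;
}

/* Checks that the two-step sequence and the combined form agree
   lane by lane for the given inputs.  */
static void check_equivalence(const float *a, const float *b)
{
    int32_t two_step[LANES], combined[LANES];
    mask_to_vector(cmp_le_mask(a, b), two_step);
    cmp_le_vector(a, b, combined);
    for (int i = 0; i < LANES; i++)
        assert(two_step[i] == combined[i]);
}
```

Because the mask is consumed only by the conversion back to a vector, the two-step sequence and the direct compare produce the same lanes, which is why the round trip through the mask register is redundant when the destination is a vector.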