On Tue, Nov 17, 2020 at 8:05 AM Jeff Law <l...@redhat.com> wrote:
>
>
> On 9/2/20 3:34 AM, Hongtao Liu via Gcc-patches wrote:
> > Hi:
> >   Add define_peephole2 to eliminate a potentially redundant conversion
> > from mask to vector.
> >   Bootstrap is ok, regression test is ok for i386/x86-64 backend.
> >   Ok for trunk?
> >
> > gcc/ChangeLog:
> >         PR target/96891
> >         * config/i386/sse.md (VI_128_256): New mode iterator.
> >         (define_peephole2): Lower avx512 vector compare to avx version
> >         when dest is vector.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/avx512bw-pr96891-1.c: New test.
> >         * gcc.target/i386/avx512f-pr96891-1.c: New test.
> >         * gcc.target/i386/avx512f-pr96891-2.c: New test.
>
> Aren't these the two insns in question:
>
>
> (insn 7 4 8 2 (set (reg:QI 86)
>         (unspec:QI [
>                 (reg:V8SF 90)
>                 (reg:V8SF 89)
>                 (const_int 2 [0x2])
>             ] UNSPEC_PCMP)) "j.c":4:14 1911 {avx512vl_cmpv8sf3}
>      (expr_list:REG_DEAD (reg:V8SF 90)
>         (expr_list:REG_DEAD (reg:V8SF 89)
>             (nil))))
> (note 8 7 9 2 NOTE_INSN_DELETED)
> (insn 9 8 14 2 (set (reg:V8SI 82 [ _2 ])
>         (vec_merge:V8SI (const_vector:V8SI [
>                     (const_int -1 [0xffffffffffffffff]) repeated x8
>                 ])
>             (const_vector:V8SI [
>                     (const_int 0 [0]) repeated x8
>                 ])
>             (reg:QI 86))) "j.c":4:14 2705 {*avx512vl_cvtmask2dv8si}
>      (expr_list:REG_DEAD (reg:QI 86)
>         (nil)))
>
>
> Note there's a data dependency between them.  insn 7 feeds insn 9.  When
> there's a data dependency, combiner patterns are usually the better
> choice than peepholes.  I think you'd be looking to match something
> like this (from the .combine dump):
>
> (set (reg:V8SI 82 [ _2 ])
>     (vec_merge:V8SI (const_vector:V8SI [
>                 (const_int -1 [0xffffffffffffffff]) repeated x8
>             ])
>         (const_vector:V8SI [
>                 (const_int 0 [0]) repeated x8
>             ])
>         (unspec:QI [
>                 (reg:V8SF 90)
>                 (reg:V8SF 89)
>                 (const_int 2 [0x2])
>             ] UNSPEC_PCMP)))
>
>
> Jeff
>

Yes. As discussed in [1], it may be better to refactor the AVX512 integer
masks to use VnBImode. Then UNSPEC_PCMP could be dropped and simplify_rtx
could handle vector comparisons more effectively.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521#c4
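
For readers following along, here is a rough scalar model in C of why the
two insns above are redundant when the destination is a vector. It is only
a sketch, not GCC code: the function names are made up for illustration,
and it assumes that predicate 2 means "less than or equal" and that the
first operand is the left-hand side of the comparison.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8

/* Models insn 7 (hypothetical helper): compare two 8-lane float vectors
   with predicate 2, assumed here to be "a <= b", producing one mask bit
   per lane. */
uint8_t cmp_le_mask(const float a[LANES], const float b[LANES])
{
    assert(a && b);
    uint8_t mask = 0;
    for (int i = 0; i < LANES; i++)
        if (a[i] <= b[i])
            mask |= (uint8_t)(1u << i);
    return mask;
}

/* Models insn 9 (hypothetical helper): vec_merge of an all-ones vector
   and an all-zeros vector, selected lane by lane from the mask. */
void mask_to_vector(uint8_t mask, int32_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = ((mask >> i) & 1) ? -1 : 0;
}

/* Models the fused form: compare straight into a vector of -1/0 lanes,
   which is what an AVX-style compare produces. */
void cmp_le_vector(const float a[LANES], const float b[LANES],
                   int32_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = (a[i] <= b[i]) ? -1 : 0;
}

/* Returns 1 when the two-step path and the fused path agree on every
   lane, 0 otherwise. */
int paths_agree(const float a[LANES], const float b[LANES])
{
    int32_t two_step[LANES], fused[LANES];
    mask_to_vector(cmp_le_mask(a, b), two_step);
    cmp_le_vector(a, b, fused);
    for (int i = 0; i < LANES; i++)
        if (two_step[i] != fused[i])
            return 0;
    return 1;
}
```

Both paths yield the same lanes, so the mask register is only a detour in
this case. Whether that is best expressed as a peephole, a combiner
pattern, or via VnBImode is the question discussed above.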
-- 
BR,
Hongtao
