On 11/16/20 8:10 PM, Hongtao Liu wrote:
> On Tue, Nov 17, 2020 at 8:05 AM Jeff Law <l...@redhat.com> wrote:
>>
>> On 9/2/20 3:34 AM, Hongtao Liu via Gcc-patches wrote:
>>> Hi:
>>>   Add define_peephole2 to eliminate a potentially redundant conversion
>>> from mask to vector.
>>>   Bootstrap is ok, regression test is ok for i386/x86-64 backend.
>>>   Ok for trunk?
>>>
>>> gcc/ChangeLog:
>>>         PR target/96891
>>>         * config/i386/sse.md (VI_128_256): New mode iterator.
>>>         (define_peephole2): Lower avx512 vector compare to avx version
>>>         when dest is vector.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>         * gcc.target/i386/avx512bw-pr96891-1.c: New test.
>>>         * gcc.target/i386/avx512f-pr96891-1.c: New test.
>>>         * gcc.target/i386/avx512f-pr96891-2.c: New test.
>> Aren't these the two insns in question:
>>
>>
>> (insn 7 4 8 2 (set (reg:QI 86)
>>         (unspec:QI [
>>                 (reg:V8SF 90)
>>                 (reg:V8SF 89)
>>                 (const_int 2 [0x2])
>>             ] UNSPEC_PCMP)) "j.c":4:14 1911 {avx512vl_cmpv8sf3}
>>      (expr_list:REG_DEAD (reg:V8SF 90)
>>         (expr_list:REG_DEAD (reg:V8SF 89)
>>             (nil))))
>> (note 8 7 9 2 NOTE_INSN_DELETED)
>> (insn 9 8 14 2 (set (reg:V8SI 82 [ _2 ])
>>         (vec_merge:V8SI (const_vector:V8SI [
>>                     (const_int -1 [0xffffffffffffffff]) repeated x8
>>                 ])
>>             (const_vector:V8SI [
>>                     (const_int 0 [0]) repeated x8
>>                 ])
>>             (reg:QI 86))) "j.c":4:14 2705 {*avx512vl_cvtmask2dv8si}
>>      (expr_list:REG_DEAD (reg:QI 86)
>>         (nil)))
>>
>>
>> Note there's a data dependency between them.  insn 7 feeds insn 9.  When
>> there's a data dependency, combiner patterns are usually the better
>> choice than peepholes.  I think you'd be looking to match something
>> like this (from the .combine dump):
>>
>> (set (reg:V8SI 82 [ _2 ])
>>     (vec_merge:V8SI (const_vector:V8SI [
>>                 (const_int -1 [0xffffffffffffffff]) repeated x8
>>             ])
>>         (const_vector:V8SI [
>>                 (const_int 0 [0]) repeated x8
>>             ])
>>         (unspec:QI [
>>                 (reg:V8SF 90)
>>                 (reg:V8SF 89)
>>                 (const_int 2 [0x2])
>>             ] UNSPEC_PCMP)))
>>
>>
>> Jeff
>>
> Yes, as discussed in [1], maybe it's better to refactor avx512 integer
> mask with VnBImode. Then unspec_pcmp could be dropped and simplify_rtx
> could handle vector comparison more effectively.
>
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521#c4
Thanks for the pointer.   I didn't realize this patch was essentially
abandoned.

Jeff
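
For readers following the RTL quoted above, here is a scalar C model of what
the two insns compute and of the fused form a combiner pattern would match.
It is only a sketch under stated assumptions, not code from the patch or from
GCC: the function names are hypothetical, and comparison predicate 2 is
assumed to mean "less than or equal" (ordered), as in the vcmpps immediate
encoding.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8   /* V8SF / V8SI: eight 32-bit lanes */

/* Models insn 7: compare each lane and set one mask bit per lane.
   Assumption: predicate 2 is "a <= b" (false if either operand is NaN). */
static uint8_t cmp_le_mask(const float a[LANES], const float b[LANES])
{
    uint8_t mask = 0;
    for (int i = 0; i < LANES; i++)
        if (a[i] <= b[i])
            mask |= (uint8_t)(1u << i);
    return mask;
}

/* Models insn 9: vec_merge of all-ones and all-zeros vectors under the
   mask, i.e. each mask bit is widened to -1 or 0 in its lane. */
static void mask_to_vector(uint8_t mask, int32_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = ((mask >> i) & 1) ? -1 : 0;
}

/* Models the fused form: the compare writes -1/0 per lane directly,
   with no intermediate mask register. */
static void cmp_le_vector(const float a[LANES], const float b[LANES],
                          int32_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = (a[i] <= b[i]) ? -1 : 0;
}

/* Checks that the two-step path and the fused path agree lane by lane. */
static void check_equivalence(const float a[LANES], const float b[LANES])
{
    int32_t two_step[LANES], fused[LANES];
    mask_to_vector(cmp_le_mask(a, b), two_step);
    cmp_le_vector(a, b, fused);
    for (int i = 0; i < LANES; i++)
        assert(two_step[i] == fused[i]);
}
```

Because the mask (reg:QI 86) dies in insn 9, the two-step path and the fused
path are interchangeable here, which is why the data dependency makes this a
natural candidate for combine rather than a peephole.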