On 5/9/19 10:54 PM, Hongtao Liu wrote:
> On Fri, May 10, 2019 at 3:55 AM Jeff Law <l...@redhat.com> wrote:
>>
>> On 5/6/19 11:38 PM, Hongtao Liu wrote:
>>> Hi Uros and GCC:
>>>   This patch is to fix ix86_expand_sse_comi_round whose implementation
>>> was not correct.
>>>   New implentation aligns with _mm_cmp_round_s[sd]_mask.
>>>
>>> Bootstrap and regression tests for x86 is fine.
>>> Ok for trunk?
>>>
>>>
>>> ChangeLog:
>>> gcc/
>>>    * config/i386/i386-expand.c (ix86_expand_sse_comi_round):
>>>    Modified, original implementation isn't correct.
>>>
>>> gcc/testsuite
>>>    * gcc.target/i386/avx512f-vcomisd-2.c: New runtime tests.
>>>    * gcc.target/i386/avx512f-vcomisd-2.c: Likewise.
>> So you'll have to bear with me, I'm not really familiar with this code,
>> but in the absence of a maintainer I'll try to work through it.
>>
>>
>>>
>>> -- BR, Hongtao
>>>
>>>
>>> 0001-Fix-ix86_expand_sse_comi_round.patch
>>>
>>> Index: gcc/ChangeLog
>>> ===================================================================
>>> --- gcc/ChangeLog     (revision 270933)
>>> +++ gcc/ChangeLog     (working copy)
>>> @@ -1,3 +1,11 @@
>>> +2019-05-06  H.J. Lu  <hongjiu...@intel.com>
>>> +         Hongtao Liu  <hongtao....@intel.com>
>>> +
>>> +     PR Target/89750
>>> +     PR Target/86444
>>> +     * config/i386/i386-expand.c (ix86_expand_sse_comi_round):
>>> +     Modified, original implementation isn't correct.
>>> +
>>>  2019-05-06  Segher Boessenkool  <seg...@kernel.crashing.org>
>>>
>>>       * config/rs6000/rs6000.md (FIRST_ALTIVEC_REGNO, LAST_ALTIVEC_REGNO)
>>> Index: gcc/config/i386/i386-expand.c
>>> ===================================================================
>>> --- gcc/config/i386/i386-expand.c     (revision 270933)
>>> +++ gcc/config/i386/i386-expand.c     (working copy)
>>> @@ -9853,18 +9853,24 @@
>>>    const struct insn_data_d *insn_p = &insn_data[icode];
>>>    machine_mode mode0 = insn_p->operand[0].mode;
>>>    machine_mode mode1 = insn_p->operand[1].mode;
>>> -  enum rtx_code comparison = UNEQ;
>>> -  bool need_ucomi = false;
>>>
>>>    /* See avxintrin.h for values.  */
>>> -  enum rtx_code comi_comparisons[32] =
>>> +  static const enum rtx_code comparisons[32] =
>> So I assume the comment refers to the _CMP_* #defines in avxintrin.h?
>>
>   Yes.
>>
>>>      {
>>> -      UNEQ, GT, GE, UNORDERED, LTGT, UNLE, UNLT, ORDERED, UNEQ, UNLT,
>>> -      UNLE, LT, LTGT, GE, GT, LT, UNEQ, GT, GE, UNORDERED, LTGT, UNLE,
>>> -      UNLT, ORDERED, UNEQ, UNLT, UNLE, LT, LTGT, GE, GT, LT
>>> +      EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED,
>>> +      EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED,
>>> +      EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED,
>>> +      EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED
>>>      };
>>
>> For CMP_EQ_UQ aren't we looking for an unordered comparison, so UNEQ
>> seems right, but you're using EQ.  Can you double-check this?  If it's
>> wrong, then please make sure we cover this case with a test.
>>
> Avx512f-vcomis[sd]-2.c covers all 32 compare predicates.
> UNEQ and EQ behave differently when either operand is NAN, besides
> they're the same.
> Since NAN operands are handled separtely, so EQ/UNEQ makes no
> difference, That why this passes cover tests.
> I'll correct it.
Ah.  Thanks.  FWIW my approach was to walk through the _CMP_* defines in
avxintrin.h and map that to what I thought the comparison ought to be.
Then I reviewed my result against your patch.  I got a couple wrong, but
could easily see my mistake.  The only one I couldn't reconcile was the
CMP_EQ_UQ.  Knowing the NaNs are handled separately makes it clear.
Thanks gain.



>>
>>
>>
>>> @@ -9932,11 +10021,37 @@
>>>      }
>>>
>>>    emit_insn (pat);
>>> +
>>> +  /* XXX: Set CCFPmode and check a different CCmode.  Does it work
>>> +     correctly?  */
>>> +  if (GET_MODE (set_dst) != mode)
>>> +    set_dst = gen_rtx_REG (mode, REGNO (set_dst));
>> This looks worrisome, even without the cryptic comment.  I don't think
>> you can just blindly change the mode like that.  Unless you happen to
>> know that the only things you test in the new mode were set in precisely
>> the same way as the old mode.
>>
> Modified as:
> +  /* NB: Set CCFPmode and check a different CCmode.  */
> +  if (GET_MODE (set_dst) != mode)
> +    set_dst = gen_rtx_REG (mode, FLAGS_REG);
That might actually be worse.  The mode carries semantic information
about where to find the various condition codes within the flags
register and which condition codes are valid.  The register number
determines which (of possibly many) flags registers we are querying.

Thus if the mode of SET_DEST is not the same as MODE, then there is a
mismatch between the point where the condition codes were set and where
we want to use them.

That can only be safe is the condition codes set in the new mode are a
strict subset of the condition codes set in the old mode and they're
found in the same place within the same flags register.

Maybe a simple example would help.  Consider a fairly standard target
with arithmetic insns that set ZNV and logicals that set ZN and clobber V.

Arithmetic insns will have a set of the flags register with one mode
(CC_ZNV for example) and logicals would use a different mode for the
flags register (CC_ZN for example).

Now consider if the architecture has memory bit operations.  Ie, you can
query the state of a bit in memory.  Unfortunately the result is stored
in the C bit of the condition code register (rather than in the Z bit
like you might hope).  Yes, this comes from a real world example.

During compare elimination, the compiler will try to eliminate a compare
insn and instead use the condition codes set by a prior insn such as an
arithmetic, logical, bit test, etc.  The mode of the flags register will
indicate which bits are valid and which bits can be tested.  So for our
hypothetical architecture we might have modes CC_ZNV, CC_ZN, CC_Z_IN_C
for arithmetic, logical and bit testing.

If we are using the result of a prior bit test to eliminate a compare,
the mode of the flags register on the bit test would be CC_Z_N_C as
would the mode of the conditional branch.

We can not blindly change the mode to CC_ZNV because the bit we want is
not in the ZNV  bits of the flag register, it's actually in the C bit.

Things might be even more complicated if there's a separate register for
floating point condition codes or multiple integer condition code registers.

Hopefully that makes things clear in terms of how the mode of the flags
register is used as well as the register number.  You'd have to dig into
the x86 port's use of the flags register to know what changes are safe
and which aren't -- and that's the analysis I'd need to help move this
along.

Jeff

Reply via email to