"Bin.Cheng" <amker.ch...@gmail.com> writes:
> On Fri, Jul 28, 2017 at 12:55 PM, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>> Bin Cheng <bin.ch...@arm.com> writes:
>>> Hi,
>>> This simple patch fixes the ICE by adding LTGT in
>>> vec_cmp<mode><v_cmp_result> pattern.
>>> I also modified the original test case into a compilation one since
>>> -fno-wrapping-math
>>> should not be used in general.
>>> Bootstrap and test on AArch64, test result check for x86_64.  Is it OK?
>>> I would also need to
>>> backport it to gcc-7-branch.
>>>
>>> Thanks,
>>> bin
>>> 2017-07-27  Bin Cheng  <bin.ch...@arm.com>
>>>
>>>       PR target/81228
>>>       * config/aarch64/aarch64-simd.md (vec_cmp<mode><v_cmp_result>): Add
>>>       LTGT.
>>>
>>> gcc/testsuite/ChangeLog
>>> 2017-07-27  Bin Cheng  <bin.ch...@arm.com>
>>>
>>>       PR target/81228
>>>       * gcc.dg/pr81228.c: New.
>>>
>>> diff --git a/gcc/config/aarch64/aarch64-simd.md 
>>> b/gcc/config/aarch64/aarch64-simd.md
>>> index 011fcec0..9cd67a2 100644
>>> --- a/gcc/config/aarch64/aarch64-simd.md
>>> +++ b/gcc/config/aarch64/aarch64-simd.md
>>> @@ -2524,6 +2524,7 @@
>>>      case EQ:
>>>        comparison = gen_aarch64_cmeq<mode>;
>>>        break;
>>> +    case LTGT:
>>>      case UNEQ:
>>>      case ORDERED:
>>>      case UNORDERED:
>>> @@ -2571,6 +2572,7 @@
>>>        emit_insn (comparison (operands[0], operands[2], operands[3]));
>>>        break;
>>>
>>> +    case LTGT:
>>>      case UNEQ:
>>>        /* We first check (a > b ||  b > a) which is !UNEQ, inverting
>>>        this result will then give us (a == b || a UNORDERED b).  */
>>> @@ -2578,7 +2580,8 @@
>>>                                        operands[2], operands[3]));
>>>        emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2]));
>>>        emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp));
>>> -      emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
>>> +      if (code == UNEQ)
>>> +     emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
>>>        break;
>>
>> AFAIK this is still a grey area, but I think (ltgt x y) is supposed to
>> be a trapping operation, i.e. it's closer to (ior (lt x y) (gt x y))
>> than (not (uneq x y)).  See e.g. the handling in may_trap_p_1, where
>> LTGT is handled like LT and GT rather than like UNEQ.
>>
>> See also: https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00583.html
> Thanks for pointing me to this, I don't know anything about floating point 
> here.
> As for the change, the code now looks like:
>
>     case LTGT:
>     case UNEQ:
>       /* We first check (a > b ||  b > a) which is !UNEQ, inverting
>      this result will then give us (a == b || a UNORDERED b).  */
>       emit_insn (gen_aarch64_cmgt<mode> (operands[0],
>                      operands[2], operands[3]));
>       emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2]));
>       emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp));
>       if (code == UNEQ)
>     emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0]));
>       break;
>
> So (a > b || b > a) is generated for LTGT which you suggested?

Ah, yeah, I was just going off LTGT being treated as !UNEQ, but...

> Here we invert the result for UNEQ though.

...it looks like it might be the UNEQ code that's wrong.  E.g. this
test fails at -O3 and passes at -O for me:

#define _GNU_SOURCE
#include <fenv.h>

double x[16], y[16];
int res[16];

int
main (void)
{
  for (int i = 0; i < 16; ++i)
    {
      x[i] = __builtin_nan ("");
      y[i] = i;
    }
  asm volatile ("" ::: "memory");
  feclearexcept (FE_ALL_EXCEPT);
  for (int i = 0; i < 16; ++i)
    res[i] = __builtin_islessgreater (x[i], y[i]);
  asm volatile ("" ::: "memory");
  return fetestexcept (FE_ALL_EXCEPT) != 0;
}

(asm volatiles just added for paranoia, in case stuff gets optimised
away otherwise.)

But I suppose that's no reason to hold up your patch. :-)  Maybe it'd
be worth having a comment though?

Thanks,
Richard

Reply via email to