"Bin.Cheng" <amker.ch...@gmail.com> writes: > On Fri, Jul 28, 2017 at 12:55 PM, Richard Sandiford > <richard.sandif...@linaro.org> wrote: >> Bin Cheng <bin.ch...@arm.com> writes: >>> Hi, >>> This simple patch fixes the ICE by adding LTGT in >>> vec_cmp<mode><v_cmp_result> pattern. >>> I also modified the original test case into a compilation one since >>> -fno-wrapping-math >>> should not be used in general. >>> Bootstrap and test on AArch64, test result check for x86_64. Is it OK? >>> I would also need to >>> backport it to gcc-7-branch. >>> >>> Thanks, >>> bin >>> 2017-07-27 Bin Cheng <bin.ch...@arm.com> >>> >>> PR target/81228 >>> * config/aarch64/aarch64-simd.md (vec_cmp<mode><v_cmp_result>): Add >>> LTGT. >>> >>> gcc/testsuite/ChangeLog >>> 2017-07-27 Bin Cheng <bin.ch...@arm.com> >>> >>> PR target/81228 >>> * gcc.dg/pr81228.c: New. >>> >>> diff --git a/gcc/config/aarch64/aarch64-simd.md >>> b/gcc/config/aarch64/aarch64-simd.md >>> index 011fcec0..9cd67a2 100644 >>> --- a/gcc/config/aarch64/aarch64-simd.md >>> +++ b/gcc/config/aarch64/aarch64-simd.md >>> @@ -2524,6 +2524,7 @@ >>> case EQ: >>> comparison = gen_aarch64_cmeq<mode>; >>> break; >>> + case LTGT: >>> case UNEQ: >>> case ORDERED: >>> case UNORDERED: >>> @@ -2571,6 +2572,7 @@ >>> emit_insn (comparison (operands[0], operands[2], operands[3])); >>> break; >>> >>> + case LTGT: >>> case UNEQ: >>> /* We first check (a > b || b > a) which is !UNEQ, inverting >>> this result will then give us (a == b || a UNORDERED b). */ >>> @@ -2578,7 +2580,8 @@ >>> operands[2], operands[3])); >>> emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2])); >>> emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp)); >>> - emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0])); >>> + if (code == UNEQ) >>> + emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0])); >>> break; >> >> AFAIK this is still a grey area, but I think (ltgt x y) is supposed to >> be a trapping operation, i.e. it's closer to (ior (lt x y) (gt x y)) >> than (not (uneq x y)). See e.g. the handling in may_trap_p_1, where >> LTGT is handled like LT and GT rather than like UNEQ. >> >> See also: https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00583.html > Thanks for pointing me to this, I don't know anything about floating point > here. > As for the change, the code now looks like: > > case LTGT: > case UNEQ: > /* We first check (a > b || b > a) which is !UNEQ, inverting > this result will then give us (a == b || a UNORDERED b). */ > emit_insn (gen_aarch64_cmgt<mode> (operands[0], > operands[2], operands[3])); > emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2])); > emit_insn (gen_ior<v_cmp_result>3 (operands[0], operands[0], tmp)); > if (code == UNEQ) > emit_insn (gen_one_cmpl<v_cmp_result>2 (operands[0], operands[0])); > break; > > So (a > b || b > a) is generated for LTGT which you suggested?
Ah, yeah, I was just going off LTGT being treated as !UNEQ, but... > Here we invert the result for UNEQ though. ...it looks like it might be the UNEQ code that's wrong. E.g. this test fails at -O3 and passes at -O for me: #define _GNU_SOURCE #include <fenv.h> double x[16], y[16]; int res[16]; int main (void) { for (int i = 0; i < 16; ++i) { x[i] = __builtin_nan (""); y[i] = i; } asm volatile ("" ::: "memory"); feclearexcept (FE_ALL_EXCEPT); for (int i = 0; i < 16; ++i) res[i] = __builtin_islessgreater (x[i], y[i]); asm volatile ("" ::: "memory"); return fetestexcept (FE_ALL_EXCEPT) != 0; } (asm volatiles just added for paranoia, in case stuff gets optimised away otherwise.) But I suppose that's no reason to hold up your patch. :-) Maybe it'd be worth having a comment though? Thanks, Richard