https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113034
Bug ID: 113034
Summary: Miscompilation of __m128 ne comparison on LoongArch
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: c at jia dot je
Target Milestone: ---

Compile and run the following code:

```
#include <lsxintrin.h>
#include <stdio.h>

__m128i unord_vec(__m128 a, __m128 b) { return (a != a) | (b != b); }

int unord_float(float a, float b) { return (a != a) | (b != b); }

int main() {
  float nan = 0.0 / 0.0; // nan
  __m128 nan_vec = {nan, nan};
  int res_float = unord_float(nan, nan);
  __m128i res_vec = unord_vec(nan_vec, nan_vec);
  printf("%d %ld %ld\n", res_float, res_vec[0], res_vec[1]);
  return 0;
}
```

Compile command: `gcc-14 -mlsx test.c -O -o test`. GCC version is 14.0.0 202231203 snapshot.

Both functions perform an `unordered` comparison of two floats: the result is true when either operand is NaN, because `x != x` holds only for NaN.

The expected output:

```
1 1 1
```

Actual output:

```
1 0 0
```

Reading the assembly, `unord_vec` is wrongly implemented with `vfcmp.cne.s`:

```
unord_vec:
.LFB538 = .
	.cfi_startproc
	vinsgr2vr.d	$vr0,$r4,0
	vinsgr2vr.d	$vr0,$r5,1
	vinsgr2vr.d	$vr1,$r6,0
	vinsgr2vr.d	$vr1,$r7,1
	vfcmp.cne.s	$vr0,$vr0,$vr0
	vfcmp.cne.s	$vr1,$vr1,$vr1
	vor.v	$vr0,$vr0,$vr1
	vpickve2gr.du	$r4,$vr0,0
	vpickve2gr.du	$r5,$vr0,1
	jr	$r1
	.cfi_endproc
```

Whereas `unord_float` is correctly implemented with `fcmp.cune.s`:

```
unord_float:
.LFB539 = .
	.cfi_startproc
	addi.w	$r4,$r0,1	# 0x1
	fcmp.cune.s	$fcc0,$f0,$f0
	bcnez	$fcc0,.L3
	or	$r4,$r0,$r0
.L3:
	addi.w	$r12,$r0,1	# 0x1
	fcmp.cune.s	$fcc1,$f1,$f1
	bcnez	$fcc1,.L4
	or	$r12,$r0,$r0
.L4:
	or	$r4,$r4,$r12
	andi	$r4,$r4,1
	jr	$r1
	.cfi_endproc
```

So the vector and scalar code disagree in the `unordered` case: the scalar path uses the "unordered or not equal" condition (`cune`), while the vector path uses plain "not equal" (`cne`), which is false for NaN operands.

Besides, these functions could be optimized to use `vfcmp.cun.s` and `fcmp.cun.s` directly.
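
For checking the intended semantics on a target without LSX, the following is a minimal sketch using GCC's generic vector extensions instead of `__m128` from `lsxintrin.h`. The names `v4sf`, `v4si`, `unord_vec_generic`, `unord_scalar_ref`, `unord_lane` and `lanes_match_reference` are hypothetical and not part of the original report. It assumes GCC or Clang, no `-ffast-math`, and it compares each vector lane against the standard `isunordered` macro. Note that under the generic vector extension a true lane is an all-ones mask (`-1` for a 32-bit lane), not `1`.

```c
#include <math.h>

/* Generic GCC/Clang vector types (assumption: stand-ins for the LSX types). */
typedef float v4sf __attribute__((vector_size(16)));
typedef int   v4si __attribute__((vector_size(16)));

/* Lane-wise unordered test, written the same way as in the report:
   a lane is -1 (all bits set) if a or b is NaN in that lane, else 0. */
v4si unord_vec_generic(v4sf a, v4sf b)
{
    return (a != a) | (b != b);
}

/* Scalar reference: 1 if either operand is NaN, else 0. */
int unord_scalar_ref(float a, float b)
{
    return isunordered(a, b) ? 1 : 0;
}

/* Extract lane i (0..3) of the vector result as a plain int. */
int unord_lane(v4sf a, v4sf b, int i)
{
    v4si r = unord_vec_generic(a, b);
    return r[i];
}

/* Returns 1 if every lane of the vector result agrees with the scalar
   reference (mapped to the -1/0 mask convention), else 0. */
int lanes_match_reference(v4sf a, v4sf b)
{
    v4si r = unord_vec_generic(a, b);
    for (int i = 0; i < 4; i++) {
        int expected = unord_scalar_ref(a[i], b[i]) ? -1 : 0;
        if (r[i] != expected)
            return 0;
    }
    return 1;
}
```

A correct backend should make `lanes_match_reference` return 1 for any inputs, including vectors that mix NaN and ordinary values. Selecting a plain "not equal" compare for `a != a`, as in the `vfcmp.cne.s` listing, would make it return 0 whenever a lane holds NaN.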