I've backported this to GCC8 too since it had the same failures:

The testcase for PR62178 has been failing for a while due to the pass
conditions being too tight, resulting in failures with -mcmodel=tiny:

        ldr     q2, [x0], 124
        ld1r    {v1.4s}, [x1], 4
        cmp     x0, x2
        mla     v0.4s, v2.4s, v1.4s
        bne     .L7

-mcmodel=small generates the slightly different:

        ldr     q1, [x0], 124
        ldr     s2, [x1, 4]!
        cmp     x0, x2
        mla     v0.4s, v1.4s, v2.s[0]
        bne     .L7

This is because the combine pass merges the DUP instruction with either
the load (giving LD1R) or the MLA (giving the by-element form) - we can't
force it to prefer one over the other.  However, the generated vector loop
is fast either way, since it still uses MLA and has no separate DUP.  So
relax the conditions slightly and check we still generate MLA and there is
no DUP or FMOV.
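
As a rough illustration of what the relaxed checks accept, here is a small
Python sketch (not part of the patch, and not how DejaGnu runs the test).  It
assumes the scan-assembler patterns translate directly into Python regexes and
that the compiler separates mnemonic and operands with a tab; the helper name
passes() and the WITH_DUP sequence are made up for the example.

```python
import re

# The two loop bodies quoted above, with a tab after each mnemonic
# (assumption: this is how the compiler output is actually laid out).
TINY = "\n".join([
    "\tldr\tq2, [x0], 124",
    "\tld1r\t{v1.4s}, [x1], 4",
    "\tcmp\tx0, x2",
    "\tmla\tv0.4s, v2.4s, v1.4s",
    "\tbne\t.L7",
])
SMALL = "\n".join([
    "\tldr\tq1, [x0], 124",
    "\tldr\ts2, [x1, 4]!",
    "\tcmp\tx0, x2",
    "\tmla\tv0.4s, v1.4s, v2.s[0]",
    "\tbne\t.L7",
])
# Hypothetical bad sequence: the DUP was not merged into anything.
WITH_DUP = "\n".join([
    "\tldr\tq1, [x0], 124",
    "\tldr\tw3, [x1, 4]!",
    "\tdup\tv2.4s, w3",
    "\tmla\tv0.4s, v1.4s, v2.4s",
])

# Python equivalents of the scan-assembler / scan-assembler-not patterns.
MUST_MATCH = [
    r"ldr\tq[0-9]+, \[x[0-9]+\], [0-9]+",
    r"mla\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+",
]
MUST_NOT_MATCH = [r"dup", r"fmov"]


def passes(asm):
    """Return True if asm satisfies every positive and negative check."""
    return (all(re.search(p, asm) for p in MUST_MATCH)
            and not any(re.search(p, asm) for p in MUST_NOT_MATCH))


for name, asm in [("tiny", TINY), ("small", SMALL), ("with_dup", WITH_DUP)]:
    print(name, passes(asm))
```

The MLA pattern stops after the register number of the last operand, so it
accepts both the vector form (v1.4s) and the by-element form (v2.s[0]).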

The testcase now passes - committed as obvious.

ChangeLog
2019-01-09  Wilco Dijkstra  <wdijk...@arm.com>  

    testsuite/
        * gcc.target/aarch64/pr62178.c: Relax scan-assembler checks.

--- gcc/testsuite/gcc.target/aarch64/pr62178.c  (revision 266178)
+++ gcc/testsuite/gcc.target/aarch64/pr62178.c  (working copy)
@@ -18,5 +18,5 @@
 
 /* { dg-final { scan-assembler "ldr\\tq\[0-9\]+, \\\[x\[0-9\]+\\\], \[0-9\]+" } } */
 /* { dg-final { scan-assembler "mla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+" } } */
-/* { dg-final { scan-assembler-not { dup } } } */
-/* { dg-final { scan-assembler-not { fmov } } } */
+/* { dg-final { scan-assembler-not {dup} } } */
+/* { dg-final { scan-assembler-not {fmov} } } */
    
