Ping.

Thanks,
Kyrill

On 08/05/17 12:00, Kyrill Tkachov wrote:
Ping.

Thanks,
Kyrill

On 24/04/17 10:38, Kyrill Tkachov wrote:
Pinging this back into context so that I don't forget about it...

https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00376.html

Thanks,
Kyrill

On 08/03/17 16:35, Kyrill Tkachov wrote:
Hi all,

For the testcase in this patch, where the value of x is zero, we currently
generate:
foo:
        mov     w1, 4
.L2:
        ldaxr   w2, [x0]
        cmp     w2, 0
        bne     .L3
        stxr    w3, w1, [x0]
        cbnz    w3, .L2
.L3:
        cset    w0, eq
        ret

We currently cannot merge the cmp and b.ne inside the loop into a cbnz
because we need the condition flags set for the return value of the
function (i.e. the cset at the end).
But if we re-jig the sequence in that case we can generate a tighter loop:
foo:
        mov     w1, 4
.L2:
        ldaxr   w2, [x0]
        cbnz    w2, .L3
        stxr    w3, w1, [x0]
        cbnz    w3, .L2
.L3:
        cmp     w2, 0
        cset    w0, eq
        ret

So we add an explicit compare after the loop, and inside the loop we use
the fact that we're comparing against zero to emit a CBNZ. This means we
may do the comparison twice (once inside the CBNZ, once at the CMP at the
end), but there is now less code inside the loop.
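
For illustration, a function of roughly the following shape is the kind of
source that leads to the sequences above. This is a sketch reconstructed
from the assembly (desired value 4, acquire load, plain store), not
necessarily the exact testcase in the patch; the parameter name a is an
assumption.

```c
/* Sketch only: a strong compare-exchange whose expected value is zero.
   Stores 4 into *a if *a currently holds 0 and returns 1; otherwise
   leaves *a untouched and returns 0.  */
int
foo (int *a)
{
  int x = 0;  /* expected value; zero is the case discussed here */
  return __atomic_compare_exchange_n (a, &x, 4, 0 /* strong, not weak */,
                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}
```

The return value is what forces the condition flags to be live after the
loop, hence the cset at the end of both sequences.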

I've seen this sequence appear in glibc locking code, so maybe it's worth
adding the extra bit of complexity to the compare-exchange splitter to
catch this case.
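
As a rough idea of why locking code hits this pattern, here is a minimal
try-lock sketch. It is not glibc's actual code; the names try_lock and
unlock and the 0 = free / 1 = held encoding are assumptions made up for
the example.

```c
/* Hypothetical lock word: 0 means free, 1 means held.  */

/* Try to take the lock once.  Returns 1 if we acquired it, 0 if it was
   already held.  The expected value is zero, so this is a strong
   compare-exchange against zero.  */
int
try_lock (int *lock)
{
  int expected = 0;
  return __atomic_compare_exchange_n (lock, &expected, 1, 0 /* strong */,
                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Release the lock by storing zero with release semantics.  */
void
unlock (int *lock)
{
  __atomic_store_n (lock, 0, __ATOMIC_RELEASE);
}
```

Callers typically branch on the result of try_lock, which is why the
success flag has to be materialised after the loop.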

Bootstrapped and tested on aarch64-none-linux-gnu. Previous iterations of
the patch, where I had gotten some logic wrong, miscompiled libgomp and
led to timeouts in its testsuite, but this version passes everything
cleanly.

Ok for GCC 8? (I know it's early, but might as well get it out in case
someone wants to try it out)

Thanks,
Kyrill

2017-03-08  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
    Emit CBNZ inside loop when doing a strong exchange and comparing
    against zero.  Generate the CC flags after the loop.

2017-03-08  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

    * gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: New test.


