On 30 September 2011 18:01, H.J. Lu <hjl.to...@gmail.com> wrote:
> You may want to take a look at:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50583
>
> ARM may have the same problem.

OK - although to be honest this patch only stretches the same
structures to 64-bit; any major changes in semantics are a separate
issue. Thanks for pointing it out, though.

Hmm - I think what's produced is correct; however the manual
description is inconsistent:

     These builtins perform the operation suggested by the name, and
     returns the value that had previously been in memory.  That is,

          { tmp = *ptr; *ptr OP= value; return tmp; }

The ARM code (see below) does a single load inside a loop with a guarded
store.  This guarantees that the value returned is the value that had
'previously been in memory' directly prior to the atomic operation.
However, it does mean the code doesn't perform the pair of accesses
implied by 'tmp = *ptr; *ptr OP= value'.
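
For illustration, the single-load-plus-guarded-store structure can be
sketched in C.  This is only a rough analogue, not what GCC emits: it
assumes a compare-and-swap (the real __sync_bool_compare_and_swap
builtin) as a stand-in for the ldrex/strex pair, and the names
sketch_fetch_and_add and sketch_self_check are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a CAS-based analogue of the ldrex/strex loop.
   Each attempt does exactly one load; the store only takes effect if
   *ptr still holds the value that was loaded. */
static uint32_t sketch_fetch_and_add(volatile uint32_t *ptr, uint32_t value)
{
    uint32_t old;
    do {
        old = *ptr;                       /* the single load */
    } while (!__sync_bool_compare_and_swap(ptr, old, old + value));
    return old;                           /* value held just before the store */
}

/* Single-threaded check of the returned value and the stored value. */
static void sketch_self_check(void)
{
    uint32_t x = 40;
    assert(sketch_fetch_and_add(&x, 2) == 40);
    assert(x == 42);
}
```

The returned value comes from the same load that fed the successful
store, so no second read of *ptr is needed to satisfy the manual's
wording.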

On ARM, for fetch_and_add we get the following (this is pre-my-patch
and 32-bit; my patch doesn't change the structure except for the
position of that last label):

        mov     r3, r0
        dmb     sy
        .LSYT6:
        ldrex   r0, [r3]
        add     r2, r0, r1
        strex   r0, r2, [r3]
        teq     r0, #0
        bne     .LSYT6
        sub     r0, r2, r1
        dmb     sy

That seems the correct semantics to me - if not, what am I missing? Was
the intention of the example really to cause two loads - if so, why?

For add_and_fetch we get:

        dmb     sy
        .LSYT6:
        ldrex   r0, [r3]
        add     r0, r0, r1
        strex   r2, r0, [r3]
        teq     r2, #0
        bne     .LSYT6
        dmb     sy

i.e. the value returned is always the value that goes into the guarded
store, and is hence always the value that's stored.
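
As a quick single-threaded illustration of the two return conventions,
the following sketch calls the GCC builtins __sync_fetch_and_add and
__sync_add_and_fetch directly.  The function name
check_sync_return_values and the starting values are made up for the
example, and it says nothing about behaviour under contention.

```c
#include <assert.h>

/* Hypothetical check: compares what each builtin returns with what
   ends up in memory, for one thread only. */
static void check_sync_return_values(void)
{
    int a = 10;
    int b = 10;

    int old_val = __sync_fetch_and_add(&a, 5);  /* value before the add */
    int new_val = __sync_add_and_fetch(&b, 5);  /* value after the add  */

    assert(old_val == 10 && a == 15);
    assert(new_val == 15 && b == 15);
}
```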

Dave
