I was hoping to see the actual inlined assembly for the code. Here is what gcc
output:
LFB3:
pushq %rbp
LCFI0:
movq %rsp, %rbp
LCFI1:
movl $0, -4(%rbp)
leaq -4(%rbp), %rcx
.align 4,0x90
L2:
movl -4(%rbp), %eax
leal 1(%rax), %edx
lock;cmpxchgl %edx,(%rcx)
sete %al
testb %al, %al
je L2
movl %edx, %eax
leave
ret
As you can see, there is no explicit call: opal_atomic_cmpset_32 really is
inlined. I think the problem is that you didn't specify the -O3 flag on your
command line.
OK, now that the assembly code is here, I can tell you what I was looking for.
The PGI compiler generated two warnings: one about oldval being initialized but
not used, and the second one about the cc being ignored. However, if we suppose
that the assembly code generated by PGI is correct, then there are two things we
should have in the assembly output:
1. Initialization of %eax shortly before the cmpxchgl, but inside the internal
loop (in my example, two lines before). **This is the place where the oldval is
supposed to be used**
2. An internal loop exit condition based on the condition code register (the
sete instruction in my code). **This is the place where the cc is important**
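
To make the two points concrete, here is a minimal sketch of a compare-and-set
written with GCC-style inline assembly, plus the loop from my first mail built
on top of it. The names sketch_cmpset_32 and sketch_add_32 are hypothetical and
this is not the actual Open MPI source. The operand constraints are my own
choice, and the non-x86 fallback to a GCC builtin is only there so the sketch
compiles elsewhere.

```c
#include <stdint.h>

/* Hypothetical stand-in for opal_atomic_cmpset_32 (not the Open MPI code).
 * Returns 1 if *addr equalled oldval and was replaced by newval, else 0. */
static inline int sketch_cmpset_32(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned char ret;
    __asm__ __volatile__(
        "lock; cmpxchgl %3, %2\n\t" /* compares %eax (oldval) with *addr */
        "sete %0"                   /* copy the zero flag into ret */
        : "=q"(ret), "+a"(oldval), "+m"(*addr) /* "a" pins oldval to %eax */
        : "r"(newval)
        : "memory", "cc");          /* "cc": the condition codes are changed */
    return (int)ret;
#else
    /* Assumption: fall back to the GCC builtin on other architectures. */
    return __sync_bool_compare_and_swap(addr, oldval, newval);
#endif
}

/* The retry loop under discussion: reload oldval on every iteration. */
static inline int32_t sketch_add_32(volatile int32_t *addr, int32_t delta)
{
    int32_t oldval;

    do {
        oldval = *addr;
    } while (0 == sketch_cmpset_32(addr, oldval, oldval + delta));
    return oldval + delta;
}
```

In this sketch the "a" constraint is what forces the load into %eax inside the
loop (point 1), and the sete that follows cmpxchgl is what turns the condition
codes into the loop exit test (point 2).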
george.
On Jun 8, 2010, at 12:28 , Jeff Squyres wrote:
> What exactly do you need? Your first mail said:
>
>>>> Can you send the assembly instructions generated by the PGI compiler for
>>>> the following code:
>>>>
>>>> int32_t oldval;
>>>>
>>>> do {
>>>> oldval = *addr;
>>>> } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval + delta));
>>>> return (oldval + delta);
>
>
> Which is what I sent...?
>
>
> On Jun 8, 2010, at 8:22 AM, George Bosilca wrote:
>
>> The inline was ignored, and the code for the opal_atomic_cmpset_32 is not in
>> there ...
>
> --
> Jeff Squyres
> [email protected]
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/