Right, that was a pretty intense discussion. However, I don't think we talked
about replacing the = by a +. The difference is that = means write and + means
read&write. Here is the assembly output from gcc -O3:
with output "=m" (*v) | with output "+m" (*v)
and input "ir" (i), "m" (*v) | and input "r" (i)
|
_opal_atomic_add_32: | _opal_atomic_add_32:
LFB5: | LFB5:
pushq %rbp | pushq %rbp
LCFI3: | LCFI3:
movq %rsp, %rbp | movq %rsp, %rbp
LCFI4: | LCFI4:
movq %rdi, -8(%rbp) | movq %rdi, -8(%rbp)
movl %esi, -12(%rbp) | movl %esi, -12(%rbp)
movq -8(%rbp), %rcx | movq -8(%rbp), %rcx
movl -12(%rbp), %edx | movl -12(%rbp), %edx
movq -8(%rbp), %rax | movq -8(%rbp), %rax
lock;addl %edx,(%rcx) | lock;addl %edx,(%rcx)
movq -8(%rbp), %rax | movq -8(%rbp), %rax
movl (%rax), %eax | movl (%rax), %eax
leave | leave
ret | ret
It generates multiple loads as %ras is updated before the lock. Useless!
Now, if we put on the output "=m"(*v) and skip the (*)v in the input arguments
the code looks like this:
LFB7:
pushq %rbp
LCFI0:
movq %rsp, %rbp
LCFI1:
movl $0, -4(%rbp)
leaq -4(%rbp), %rax
movl $1, %edx
lock
addl %edx,(%rax)
movl -4(%rbp), %eax
leave
ret
Which is a LOT better. Not perfect as it still generate a load after the locked
addl, but this is because we wanted to return the (*v).
Thus the code should look at least like this
static inline int32_t opal_atomic_add_32(volatile int32_t* v, int i)
{
__asm__ __volatile__(
SMPLOCK "addl %1,%0"
:"=m" (*v)
:"r" (i));
return (*v); /* should be an atomic operation */
}
Now, if what we want back from this function is the __REAL__ result of the
atomic addition, then the code is wrong. Well, mostly wrong under heavy usage
(i.e. multiple threads doing atomics on the same variable).
Here is the opal_atomic_add_32 returning the correct result. This is similar to
the atomic intrinsic called add_and_fetch.
static inline int32_t opal_atomic_add_32(volatile int32_t* v, int i)
{
int ret = i;
__asm__ __volatile__(
SMPLOCK "xaddl %1,%0"
:"=m" (*v), "+r" (ret)
);
return ret+i;
}
george.
On Aug 25, 2010, at 10:58 , Rolf vandeVaart wrote:
> With respect to the warnings with atomic.h, we have been down this road
> before.
> Here is the ticket with the background information.
>
> https://svn.open-mpi.org/trac/ompi/ticket/1500
>
> Eventually, we decided to just live with the warnings. However, I will take
> a look at George's two suggestions.
>
> Rolf
>
>
>
> On 08/24/10 21:28, George Bosilca wrote:
>> On Aug 24, 2010, at 20:40 , Paul H. Hargrove wrote:
>>
>>
>>
>>> "../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 170:
>>> warning: impossible constraint for "%1" asm operand
>>>
>>>
>>
>> __asm__ __volatile__(
>> SMPLOCK "addl %1,%0"
>> :"=m" (*v)
>> :"ir" (i), "m" (*v));
>>
>> The problem seems to come from the "ir". Based on a Sun blog about the gcc
>> style asm inlining support (
>> http://blogs.sun.com/x86be/entry/gcc_style_asm_inlining_support
>> ) it appears that i (any size integer immediate constraint) and r (any
>> registers in rax, rbx, rcx, rdx, rbp, rsi, rdi, rsp, r8 - r15). As we don't
>> only apply our atomics on immediate I think we should drop the "i".
>>
>>
>>
>>> "../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 170:
>>> warning: parameter in inline asm statement unused: %2
>>>
>>>
>>
>> This one is more trickier. Because of the %2 I suspect that the second (*v)
>> on the inputs is not matched to the first (*v) on the outputs. While this
>> might be significantly bad under some circumstances, in this case I think it
>> can be safely ignored.
>>
>> However I would like to try the following asm code instead with the SUN
>> compiler:
>>
>> __asm__ __volatile__(
>> SMPLOCK "addl %1,%0"
>> :"+m" (*v)
>> :"r" (i));
>>
>> Thanks,
>> george.
>>
>>
>>
>>
>>> "../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 187:
>>> warning: impossible constraint for "%1" asm operand
>>> "../../../openmpi-1.5rc5/opal/include/opal/sys/ia32/atomic.h", line 187:
>>> warning: parameter in inline asm statement unused: %2
>>>
>>> ../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 145: Warning
>>> (Anachronism): Formal argument read_conversion_fn of type extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to
>>> MPI_Register_datarep(char*, extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
>>> "../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 146: Warning
>>> (Anachronism): Formal argument write_conversion_fn of type extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to
>>> MPI_Register_datarep(char*, extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
>>> "../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 147: Warning
>>> (Anachronism): Formal argument dtype_file_extent_fn of type extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*) in call to MPI_Register_datarep(char*,
>>> extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern
>>> "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(ompi_datatype_t*,int*,void*).
>>> "../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 172: Warning
>>> (Anachronism): Formal argument write_conversion_fn of type extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to
>>> MPI_Register_datarep(char*, extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
>>> "../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 173: Warning
>>> (Anachronism): Formal argument dtype_file_extent_fn of type extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*) in call to MPI_Register_datarep(char*,
>>> extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern
>>> "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(ompi_datatype_t*,int*,void*).
>>> "../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 197: Warning
>>> (Anachronism): Formal argument read_conversion_fn of type extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*) in call to
>>> MPI_Register_datarep(char*, extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(void*,ompi_datatype_t*,int,void*,long long,void*).
>>> "../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 199: Warning
>>> (Anachronism): Formal argument dtype_file_extent_fn of type extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*) in call to MPI_Register_datarep(char*,
>>> extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern
>>> "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(ompi_datatype_t*,int*,void*).
>>> "../../../../openmpi-1.5rc5/ompi/mpi/cxx/file.cc", line 224: Warning
>>> (Anachronism): Formal argument dtype_file_extent_fn of type extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*) in call to MPI_Register_datarep(char*,
>>> extern "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern
>>> "C" int(*)(void*,ompi_datatype_t*,int,void*,long long,void*), extern "C"
>>> int(*)(ompi_datatype_t*,int*,void*), void*) is being passed
>>> int(*)(ompi_datatype_t*,int*,void*).
>>> [Though line numbers differ very slightly]
>>>
>>>
>>> NEW, not seen with 1.4.3rc1, warnings:
>>>
>>> Many instances of the following:
>>>
>>> "../../../../openmpi-1.5rc5/orte/mca/ess/ess.h", line 61: warning:
>>> attribute "noreturn" may not be applied to variable, ignored
>>> "../../../../openmpi-1.5rc5/orte/mca/errmgr/errmgr.h", line 138: warning:
>>> attribute "noreturn" may not be applied to variable, ignored
>>> [Due to applying __opal_attribute_noreturn__ to a function pointer]
>>>
>>> Single instances of the following:
>>>
>>> "../../../../../openmpi-1.5rc5/opal/mca/crs/none/crs_none_module.c", line
>>> 136: warning: statement not reached
>>> "../../../../openmpi-1.5rc5/orte/mca/plm/base/plm_base_rsh_support.c", line
>>> 462: warning: implicit function declaration: rindex
>>> "../../../../openmpi-1.5rc5/orte/mca/plm/base/plm_base_rsh_support.c", line
>>> 462: warning: improper pointer/integer combination: op "="
>>> "../../../../openmpi-1.5rc5/orte/mca/plm/base/plm_base_rsh_support.c", line
>>> 565: warning: improper pointer/integer combination: op "="
>>> "../../../../../openmpi-1.5rc5/orte/mca/rmcast/tcp/rmcast_tcp.c", line 982:
>>> warning: assignment type mismatch:
>>> "../../../../../openmpi-1.5rc5/orte/mca/rmcast/tcp/rmcast_tcp.c", line
>>> 1023: warning: assignment type mismatch:
>>> "../../../../../openmpi-1.5rc5/orte/mca/rmcast/udp/rmcast_udp.c", line 877:
>>> warning: assignment type mismatch:
>>> "../../../../../openmpi-1.5rc5/orte/mca/rmcast/udp/rmcast_udp.c", line 918:
>>> warning: assignment type mismatch:
>>> "../../../../openmpi-1.5rc5/orte/tools/orte-ps/orte-ps.c", line 288:
>>> warning: initializer does not fit or is out of range: 0xfffffffe
>>> "../../../../openmpi-1.5rc5/orte/tools/orte-ps/orte-ps.c", line 289:
>>> warning: initializer does not fit or is out of range: 0xfffffffe
>>> "
>>> "../../../../../openmpi-1.5rc5/ompi/mca/common/sm/common_sm_mmap.c", line
>>> 110: warning: improper pointer/integer combination: arg #3
>>> "../../../../../openmpi-1.5rc5/ompi/mca/common/sm/common_sm_mmap.c", line
>>> 135: warning: improper pointer/integer combination: arg #3
>>> "../../../../../openmpi-1.5rc5/ompi/mca/common/sm/common_sm_mmap.c", line
>>> 201: warning: assignment type mismatch:
>>> pointer to char "=" pointer to int
>>> "../../../../../openmpi-1.5rc5/ompi/mca/common/sm/common_sm_mmap.c", line
>>> 207: warning: assignment type mismatch:
>>> pointer to char "=" pointer to int
>>> "../../../../../openmpi-1.5rc5/ompi/mca/common/sm/common_sm_mmap.c", line
>>> 280: warning: argument #1 is incompatible with prototype:
>>> prototype: pointer to char : "/usr/include/sys/mman.h", line 238
>>> argument : pointer to struct mca_common_sm_mmap_t {struct
>>> opal_list_item_t {..} map_item, pointer to struct
>>> mca_common_sm_file_header_t {..} map_seg, pointer to unsigned char
>>> map_addr, pointer to unsigned char data_addr, unsigned int map_size,
>>> array[1025] of char map_path}
>>>
>>>
>>> -Paul
>>>
>>> --
>>> Paul H. Hargrove
>>> [email protected]
>>>
>>> Future Technologies Group
>>> HPC Research Department Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>>
>>> _______________________________________________
>>> devel mailing list
>>>
>>> [email protected]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>>
>>
>>
>> _______________________________________________
>> devel mailing list
>>
>> [email protected]
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>>
>
> _______________________________________________
> devel mailing list
> [email protected]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel