Re: [OMPI devel] Master broken for ILP32

2016-05-09 Thread Hjelm, Nathan Thomas
We have chosen to use the __sync builtins by default on master. There was an 
RFC on it a while ago. Is there a good reason to go back to the inline assembly 
by default, or is this just surprising?



From: devel on behalf of Paul Hargrove
Sent: Monday, May 09, 2016 11:12:16 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] Master broken for ILP32

Regarding "distro":
This was happening, for instance, on OpenBSD and NetBSD (32-bit kernels on 
64-bit h/w) when testing your PR1643.
However, it sounds like Nathan knows how/where to fix this.

HOWEVER, that is not the only issue here.
Why is master picking the BUILTIN (__sync) atomics (as shown in the 
configure output quoted below), while v2.x (same system and same config args) 
uses a .s file:
*** Assembler
checking dependency style of gcc -std=gnu99... gcc3
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking for fgrep... /usr/bin/grep -F
checking if .proc/endp is needed... no
checking directive for setting text section... .text
checking directive for exporting symbols... .globl
checking for objdump... objdump
checking if .note.GNU-stack is needed... no
checking suffix for labels... :
checking prefix for global symbol labels...
checking prefix for lsym labels... .L
checking prefix for function in .type... @
checking if .size is needed... yes
checking if .align directive takes logarithmic value... no
checking if processor supports x86_64 16-byte compare-and-exchange... no
checking if gcc -std=gnu99 supports GCC inline assembly... yes
checking if gcc -std=gnu99 supports DEC inline assembly... no
checking if gcc -std=gnu99 supports XLC inline assembly... no
checking for assembly format... default-.text-.globl-:--.L-@-1-0-1-1-0
checking for assembly architecture... IA32
checking for builtin atomics... BUILTIN_NO
checking for perl... perl
checking for pre-built assembly file... yes (atomic-ia32-linux-nongas.s)
checking for atomic assembly filename... atomic-ia32-linux-nongas.s


-Paul

On Mon, May 9, 2016 at 1:22 AM, Gilles Gouaillardet wrote:

Paul,


on which distro are you running ?

are you compiling on a 64 bit distro to generate a 32 bit library ?


it seems we are currently only testing an atomic on a long (32 bits on a 32-bit 
arch) and then incorrectly assuming it also works on 64 bits (!)
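
As a rough illustration of checking each width separately rather than inferring 64-bit support from a test on long, here is a minimal sketch. It assumes GCC or Clang, which predefine the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_N macros when a native compare-and-swap exists for N bytes. The function names are hypothetical and this is not the check Open MPI's configure performs.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the compiler reports a native
 * compare-and-swap for operands of the given size in bytes, 0 otherwise.
 * Compilers that do not predefine these macros always report 0 here. */
static int example_native_sync_cas(size_t nbytes)
{
    switch (nbytes) {
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
    case 4: return 1;
#endif
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8)
    case 8: return 1;
#endif
    default: return 0;
    }
}

/* Hypothetical helper: reports long and int64_t independently, so a
 * positive result for a 32-bit long says nothing about 64-bit operands. */
static void example_report_sync_support(void)
{
    printf("sizeof(long) = %zu\n", sizeof(long));
    printf("native CAS on long:    %d\n", example_native_sync_cas(sizeof(long)));
    printf("native CAS on int64_t: %d\n", example_native_sync_cas(sizeof(int64_t)));
}
```

On an ILP32 target whose baseline CPU lacks an 8-byte compare-and-swap, the second line would be expected to print 0; that is the case in which the compiler may emit a call to an out-of-line helper such as __sync_add_and_fetch_8.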


Cheers,


Gilles

On 5/9/2016 3:59 PM, Paul Hargrove wrote:
Perhaps this is already known.
Several builds I've tried recently from the ompi (not ompi-release) repo are 
failing on 32-bit platforms with
   ../../../opal/.libs/libopen-pal.so: undefined reference to 
`__sync_add_and_fetch_8'

This is impacting PRs that I am being asked to test (e.g. 1643).

Note that I did *not* configure with --enable-builtin-atomics, yet configure 
seems to show them being selected anyway:
*** Assembler
checking dependency style of gcc -std=gnu99... gcc3
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking for fgrep... /usr/bin/grep -F
checking for __sync builtin atomics... yes
checking for processor support of __sync builtin atomic compare-and-swap on 
128-bit values... no
checking for __sync builtin atomic compare-and-swap on 128-bit values with 
-mcx16 flag... no
checking if .proc/endp is needed... no
checking directive for setting text section... .text
checking directive for exporting symbols... .globl
checking for objdump... objdump
checking if .note.GNU-stack is needed... no
checking suffix for labels... :
checking prefix for global symbol labels...
checking prefix for lsym labels... .L
checking prefix for function in .type... @
checking if .size is needed... yes
checking if .align directive takes logarithmic value... no
checking if processor supports x86_64 16-byte compare-and-exchange... no
checking for assembly architecture... IA32
checking for builtin atomics... BUILTIN_SYNC
checking for atomic assembly filename... none

-Paul

--
Paul H. Hargrove   
phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



___
devel mailing list
de...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2016/05/18941.php



--
Paul H. Hargrove   

Re: [OMPI devel] Master broken for ILP32

2016-05-09 Thread Hjelm, Nathan Thomas
Never mind. Looks like there are two different macros for 64-bit and one is 
wrong in this case. Fix incoming.



From: devel on behalf of Gilles Gouaillardet
Sent: Monday, May 09, 2016 2:22:24 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] Master broken for ILP32


Re: [OMPI devel] Master broken for ILP32

2016-05-09 Thread Hjelm, Nathan Thomas
This really isn't a problem with the atomics code. We have a macro to indicate 
whether 64-bit is really supported. Something in opal is using 64-bit atomics 
without checking if they are supported. With sync atomics we get a link error 
but with the others it is a compile error. I fixed a similar problem in vader 
but it looks like there are more places that need to be fixed.
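
A minimal sketch of the kind of guard described above, assuming GCC-style __sync builtins. EXAMPLE_HAVE_ATOMIC_MATH_64 and the example_counter64 names are hypothetical stand-ins, not the actual OPAL macro or API. When 64-bit atomics are not natively available, the sketch falls back to a small spin lock around a plain 64-bit value instead of calling a builtin that may not link.

```c
#include <stdint.h>

/* Hypothetical feature macro: set from the compiler's own report of a
 * native 8-byte compare-and-swap (predefined by GCC and Clang). */
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_8)
#define EXAMPLE_HAVE_ATOMIC_MATH_64 1
#else
#define EXAMPLE_HAVE_ATOMIC_MATH_64 0
#endif

typedef struct {
    int64_t value;
#if !EXAMPLE_HAVE_ATOMIC_MATH_64
    volatile int lock;          /* 0 = free, 1 = held */
#endif
} example_counter64_t;

/* Adds delta and returns the new value. Callers never touch a 64-bit
 * builtin directly, so the check lives in exactly one place. */
static int64_t example_counter64_add(example_counter64_t *c, int64_t delta)
{
#if EXAMPLE_HAVE_ATOMIC_MATH_64
    return __sync_add_and_fetch(&c->value, delta);
#else
    int64_t result;
    while (__sync_lock_test_and_set(&c->lock, 1)) {
        /* spin until the lock is released */
    }
    result = (c->value += delta);
    __sync_lock_release(&c->lock);
    return result;
#endif
}
```

Code that uses 64-bit atomics without such a check is what turns into an undefined reference at link time with the builtins, or a compile error with the other implementations.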



From: devel on behalf of Gilles Gouaillardet
Sent: Monday, May 09, 2016 2:22:24 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] Master broken for ILP32


Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-509-g38d6627

2014-12-15 Thread Hjelm, Nathan Thomas
It will take about 5 mins to either fix or determine if more work is needed.



From: devel on behalf of Howard Pritchard
Sent: Monday, December 15, 2014 10:05:24 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master 
updated. dev-509-g38d6627

I'd prefer Paul's suggestion to disable xpmem for sgi/uv for 1.8.X
Is anyone actually supporting this?

Howard

2014-12-15 8:56 GMT-07:00 Nathan Hjelm:

Not yet. I am still trying to pinpoint the problem. From what I can tell
the SGI version of XPMEM should be nearly identical to the Cray
version. I should have this figured out this week. If I don't get it
fixed by Wed I will open a pull request to remove the check for
sn/xpmem.h.

-Nathan

On Fri, Dec 12, 2014 at 07:50:11PM -0800, Ralph Castain wrote:
> Nathan - does this need to come to 1.8.4? Or do you want to go with Paul’s 
> suggested fix?
>
> > On Dec 12, 2014, at 8:09 AM, 
> > git...@crest.iu.edu wrote:
> >
> > This is an automated email from the git hooks/post-receive script. It was
> > generated because a ref change was pushed to the repository containing
> > the project "open-mpi/ompi".
> >
> > The branch, master has been updated
> >   via  38d66272c51fd531181d9dc282a7260f40270f64 (commit)
> >  from  f4aecdbfd22a74feadab5566d2d595b65be4a8cb (commit)
> >
> > Those revisions listed above that are new to this repository have
> > not appeared on any other notification email; so we list those
> > revisions in full, below.
> >
> > - Log -
> > https://github.com/open-mpi/ompi/commit/38d66272c51fd531181d9dc282a7260f40270f64
> >
> > commit 38d66272c51fd531181d9dc282a7260f40270f64
> > Author: Nathan Hjelm
> > Date:   Fri Dec 12 09:09:01 2014 -0700
> >
> >btl/vader: fix compile on SGI UV
> >
> > diff --git a/opal/mca/btl/vader/btl_vader_component.c 
> > b/opal/mca/btl/vader/btl_vader_component.c
> > index 7061612..aabf03d 100644
> > --- a/opal/mca/btl/vader/btl_vader_component.c
> > +++ b/opal/mca/btl/vader/btl_vader_component.c
> > @@ -354,9 +354,8 @@ static void mca_btl_vader_check_single_copy (void)
> > #if OPAL_BTL_VADER_HAVE_XPMEM
> > if (MCA_BTL_VADER_XPMEM == 
> > mca_btl_vader_component.single_copy_mechanism) {
> > /* try to create an xpmem segment for the entire address space */
> > -mca_btl_vader_component.my_seg_id = xpmem_make (0, 
> > VADER_MAX_ADDRESS, XPMEM_PERMIT_MODE, (void *)0666);
> > -
> > -if (-1 == mca_btl_vader_component.my_seg_id) {
> > +rc = mca_btl_vader_xpmem_init ();
> > +if (OPAL_SUCCESS != rc) {
> > if (MCA_BTL_VADER_XPMEM == initial_mechanism) {
> > opal_show_help("help-btl-vader.txt", "xpmem-make-failed",
> >true, opal_process_info.nodename, errno,
> > @@ -364,11 +363,7 @@ static void mca_btl_vader_check_single_copy (void)
> > }
> >
> > mca_btl_vader_select_next_single_copy_mechanism ();
> > -} else {
> > -mca_btl_vader.super.btl_get = mca_btl_vader_get_xpmem;
> > -mca_btl_vader.super.btl_put = mca_btl_vader_get_xpmem;
> > }
> > -
> > }
> > #endif
> >
> > diff --git a/opal/mca/btl/vader/btl_vader_xpmem.c 
> > b/opal/mca/btl/vader/btl_vader_xpmem.c
> > index 7e362ea..4bb9a3b 100644
> > --- a/opal/mca/btl/vader/btl_vader_xpmem.c
> > +++ b/opal/mca/btl/vader/btl_vader_xpmem.c
> > @@ -19,6 +19,19 @@
> >
> > #if OPAL_BTL_VADER_HAVE_XPMEM
> >
> > +int mca_btl_vader_xpmem_init (void)
> > +{
> > +mca_btl_vader_component.my_seg_id = xpmem_make (0, VADER_MAX_ADDRESS, 
> > XPMEM_PERMIT_MODE, (void *)0666);
> > +if (-1 == mca_btl_vader_component.my_seg_id) {
> > +return OPAL_ERR_NOT_AVAILABLE;
> > +}
> > +
> > +mca_btl_vader.super.btl_get = mca_btl_vader_get_xpmem;
> > +mca_btl_vader.super.btl_put = mca_btl_vader_get_xpmem;
> > +
> > +return OPAL_SUCCESS;
> > +}
> > +
> > /* look up the remote pointer in the peer rcache and attach if
> >  * necessary */
> > mca_mpool_base_registration_t *vader_get_registation (struct 
> > mca_btl_base_endpoint_t *ep, void *rem_ptr,
> > diff --git a/opal/mca/btl/vader/btl_vader_xpmem.h 
> > b/opal/mca/btl/vader/btl_vader_xpmem.h
> > index 1be188a..e040e26 100644
> > --- a/opal/mca/btl/vader/btl_vader_xpmem.h
> > +++ b/opal/mca/btl/vader/btl_vader_xpmem.h
> > @@ -22,6 +22,7 @@
> >   #include 
> >
> >   typedef int64_t xpmem_segid_t;
> > +  typedef int64_t xpmem_apid_t;
> > #endif
> >
> > /* look up the remote pointer in the peer rcache and attach if
> > @@ -30,6 +31,8 @@
> > /* largest address we can attach to using xpmem */
> > #define VADER_MAX_ADDRESS ((uintptr_t)0x7000ul)
> >
> > +int mca_btl_vader_xpmem_init (void);
> > +
> > mca_mpool_base_registration_t *vader_get_registation 

Re: [OMPI devel] Fwd: [OMPI commits] Git: open-mpi/ompi branch master updated. dev-327-gccaecf0

2014-11-19 Thread Hjelm, Nathan Thomas
Yes. usnic, yoda, and smcuda need to be updated for the new interface. I will 
fix the warnings in openib.



From: devel on behalf of Ralph Castain
Sent: Wednesday, November 19, 2014 3:15:07 PM
To: Open MPI Developers
Subject: [OMPI devel] Fwd: [OMPI commits] Git: open-mpi/ompi branch master  
updated. dev-327-gccaecf0

Was this commit intended to happen? It broke the trunk:

btl_openib.c:119:9: warning: initialization from incompatible pointer type 
[enabled by default]
 .btl_atomic_fop = mca_btl_openib_atomic_fop,
 ^
btl_openib.c:119:9: warning: (near initialization for 
'mca_btl_openib_module.super.btl_atomic_fop') [enabled by default]
btl_openib.c:120:9: warning: initialization from incompatible pointer type 
[enabled by default]
 .btl_atomic_cswap = mca_btl_openib_atomic_cswap,
 ^
btl_openib.c:120:9: warning: (near initialization for 
'mca_btl_openib_module.super.btl_atomic_cswap') [enabled by default]
btl_openib.c: In function 'mca_btl_openib_prepare_src':
btl_openib.c:1456:9: warning: variable 'rc' set but not used 
[-Wunused-but-set-variable]
 int rc;
 ^
btl_openib.c:1450:30: warning: variable 'openib_btl' set but not used 
[-Wunused-but-set-variable]
 mca_btl_openib_module_t *openib_btl;
  ^
btl_openib_component.c: In function 'init_one_device':
btl_openib_component.c:2047:54: warning: comparison between 'enum ' 
and 'mca_base_var_source_t' [-Wenum-compare]
 else if (BTL_OPENIB_RQ_SOURCE_DEVICE_INI ==
  ^
btl_usnic_frag.c: In function 'recv_seg_constructor':
btl_usnic_frag.c:144:17: error: 'mca_btl_base_descriptor_t' has no member named 
'des_remote'
 seg->rs_desc.des_remote = NULL;
 ^
btl_usnic_frag.c:145:17: error: 'mca_btl_base_descriptor_t' has no member named 
'des_remote_count'
 seg->rs_desc.des_remote_count = 0;
 ^
btl_usnic_frag.c: In function 'send_frag_constructor':
btl_usnic_frag.c:168:9: error: 'mca_btl_base_descriptor_t' has no member named 
'des_remote'
 desc->des_remote = frag->sf_base.uf_remote_seg;
 ^
btl_usnic_frag.c:169:9: error: 'mca_btl_base_descriptor_t' has no member named 
'des_remote_count'
 desc->des_remote_count = 0;
 ^
make[2]: *** [btl_usnic_frag.lo] Error 1
make[2]: *** Waiting for unfinished jobs
btl_usnic_module.c: In function 'usnic_put':
btl_usnic_module.c:1107:56: error: 'struct mca_btl_base_descriptor_t' has no 
member named 'des_remote'
 frag->sf_base.uf_remote_seg[0].seg_addr.pval = 
desc->des_remote->seg_addr.pval;
^
btl_usnic_module.c: At top level:
btl_usnic_module.c:2325:9: error: unknown field 'btl_seg_size' specified in 
initializer
 .btl_seg_size = sizeof(mca_btl_base_segment_t), /* seg size */
 ^
btl_usnic_module.c:2332:9: warning: initialization from incompatible pointer 
type [enabled by default]
 .btl_prepare_src = usnic_prepare_src,
 ^
btl_usnic_module.c:2332:9: warning: (near initialization for 
'opal_btl_usnic_module_template.super.btl_prepare_src') [enabled by default]
btl_usnic_module.c:2333:9: error: unknown field 'btl_prepare_dst' specified in 
initializer
 .btl_prepare_dst = usnic_prepare_dst,
 ^
btl_usnic_module.c:2333:9: warning: initialization from incompatible pointer 
type [enabled by default]
btl_usnic_module.c:2333:9: warning: (near initialization for 
'opal_btl_usnic_module_template.super.btl_send') [enabled by default]
btl_usnic_module.c:2335:9: warning: initialization from incompatible pointer 
type [enabled by default]
 .btl_put = usnic_put,
 ^
btl_usnic_module.c:2335:9: warning: (near initialization for 
'opal_btl_usnic_module_template.super.btl_put') [enabled by default]
make[2]: *** [btl_usnic_module.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1



Begin forwarded message:

To: ompi-comm...@open-mpi.org
List-Post: devel@lists.open-mpi.org
Date: November 19, 2014 at 2:01:45 PM PST
From: git...@crest.iu.edu
Subject: [OMPI commits] Git: open-mpi/ompi branch master updated. 
dev-327-gccaecf0
Reply-To: de...@open-mpi.org

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "open-mpi/ompi".

The branch, master has been updated
  via  ccaecf0fd6c862877e6a1e2643f95fa956c87769 (commit)
  via  5a0a48c3c45a9ce7033684958d7ba8d2a4712ab9 (commit)
  via  2b579610f2d7e5bf9e0defb6871c5b0e1b9cc778 (commit)
  via  2a382c2ec1747ae6bab66fccd27a42b2193b058f (commit)
  via  1a5349ec790d9d36039206eea08dad84390f380c (commit)
  via  8f1a44e60e0d06d43222a4b805a770b6f5e88f45 (commit)
  via  

Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old value

2014-08-08 Thread Hjelm, Nathan Thomas
I will try to take a look this week and see what I can do.

-Nathan

From: devel [devel-boun...@open-mpi.org] on behalf of George Bosilca 
[bosi...@icl.utk.edu]
Sent: Thursday, August 07, 2014 10:37 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] RFC: add atomic compare-and-swap that returns old 
value

Paul's tests identified a small issue with the previous patch (a real 
corner case for ARM v5). The patch below fixes all known issues.

Btw, there is still room for volunteers for the .asm work.

  George.



On Tue, Aug 5, 2014 at 2:23 PM, George Bosilca wrote:
Thanks to Paul's help, all the inlined atomics have been tested. The new patch is 
attached below. However, this only fixes the inline atomics; those generated 
from the *.asm files have not been updated. Any volunteers?

  George.



On Aug 1, 2014, at 18:09 , Paul Hargrove wrote:

I have confirmed that George's latest version works on both SPARC ABIs.

ARMv7 and three MIPS ABIs still pending...

-Paul


On Fri, Aug 1, 2014 at 9:40 AM, George Bosilca wrote:
Another version of the atomic patch. Paul has tested it on a bunch of 
platforms. At this point we have confirmation from all architectures except 
SPARC (v8+ and v9).

  George.



On Jul 31, 2014, at 19:13 , George Bosilca wrote:

> All,
>
> Here is the patch that changes the meaning of the atomics to make them always 
> return the previous value (similar to sync_fetch_and_<*>). I tested this with 
> the following atomics: OS X, gcc style intrinsics and AMD64.
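
To make the "return the previous value" convention concrete, here is a minimal sketch, assuming GCC-style __sync builtins. example_fetch_add_32 is a hypothetical name and is not taken from the patch. It builds a fetch-and-add on top of a boolean compare-and-swap and hands back the value seen before the update.

```c
#include <stdint.h>

/* Hypothetical helper: atomically adds delta to *addr and returns the
 * value that was stored there before the addition. */
static int32_t example_fetch_add_32(volatile int32_t *addr, int32_t delta)
{
    int32_t old;
    do {
        old = *addr;    /* snapshot the current value */
        /* retry if another thread changed *addr after the snapshot */
    } while (!__sync_bool_compare_and_swap(addr, old, old + delta));
    return old;
}
```

With this convention the caller can still compute the new value as old + delta, so nothing is lost compared with returning the updated value.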
>
> I did not change the base assembly files used when GCC style assembly 
> operations are not supported. If someone feels like fixing them, feel free.
>
> Paul, I know you have a pretty diverse range of computers. Can you try to 
> compile and run a “make check” with the following patch?
>
>  George.
>
> 
>
> On Jul 30, 2014, at 15:21 , Nathan Hjelm wrote:
>
>>
>> That is what I would prefer. I was trying to not disturb things too
>> much :). Please bring the changes over!
>>
>> -Nathan
>>
>> On Wed, Jul 30, 2014 at 03:18:44PM -0400, George Bosilca wrote:
>>>  Why do you want to add new versions? This will lead to having two, almost
>>>  identical, sets of atomics that are conceptually equivalent but different
>>>  in terms of code. And we will have to maintain both!
>>>  I did a similar change in a fork of OPAL in another project but instead of
>>>  adding another flavor of atomics, I completely replaced the available ones
>>>  with a set returning the old value. I can bring the code over.
>>>George.
>>>
>>>  On Tue, Jul 29, 2014 at 5:29 PM, Paul Hargrove wrote:
>>>
>>>On Tue, Jul 29, 2014 at 2:10 PM, Nathan Hjelm wrote:
>>>
>>>  Is there a reason why the
>>>  current implementations of opal atomics (add, cmpset) do not return
>>>  the
>>>  old value?
>>>
>>>Because some CPUs don't implement such an atomic instruction?
>>>
>>>On any CPU one *can* certainly synthesize the desired operation with an
>>>added read before the compare-and-swap to return a value that was
>>>present at some time before a failed cmpset.  That is almost certainly
>>>sufficient for your purposes.  However, the added load makes it
>>>(marginally) more expensive on some CPUs that only have the native
>>>equivalent of gcc's __sync_bool_compare_and_swap().
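
A minimal sketch of the synthesis described above, assuming only gcc's __sync_bool_compare_and_swap(). example_cas_fetch_32 is a hypothetical name, not an OPAL function. This variant goes slightly beyond a single added read: if the read matches but the swap still fails, it retries, so a return value equal to the expected value always means the swap happened.

```c
#include <stdint.h>

/* Hypothetical helper: compare-and-swap that returns the old value.
 * On success the return value equals `expected`; on failure it is a
 * value that was present in *addr at some time before the failed swap. */
static int32_t example_cas_fetch_32(volatile int32_t *addr,
                                    int32_t expected, int32_t desired)
{
    for (;;) {
        int32_t seen = *addr;   /* the added read */
        if (seen != expected) {
            return seen;        /* the swap would fail; report what was seen */
        }
        if (__sync_bool_compare_and_swap(addr, expected, desired)) {
            return expected;    /* the swap happened */
        }
        /* *addr changed between the read and the swap: try again */
    }
}
```

The extra load is the (marginal) cost mentioned above for CPUs that only offer the boolean form natively.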