Paul,

I think the ref_count should always be manipulated with atomic operations,
otherwise we can't use it for internal, thread-safe purposes. That being
said, the issue at hand seems a little different. The difference in the
generated code between opal_atomic_add and OPAL_THREAD_ADD32 is that the
macro explicitly calls opal_atomic_add32, while the generic opal_atomic_add
has a switch inside (to select between atomic operations on different
types). For the error you mention to happen, our configure script must have
detected support for 8-byte atomic operations, thus setting
OPAL_HAVE_ATOMIC_ADD_64 to 1.
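
To illustrate what I mean by "a switch inside", here is a minimal sketch of
a size-dispatched atomic add. It is not the actual OPAL code: every sketch_*
name and SKETCH_HAVE_ATOMIC_ADD_64 are hypothetical stand-ins, and it assumes
a GCC-compatible compiler that provides the __sync builtins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the configure result (not the real OPAL macro).
 * 0 means "no 8-byte atomic support detected". */
#ifndef SKETCH_HAVE_ATOMIC_ADD_64
#define SKETCH_HAVE_ATOMIC_ADD_64 0
#endif

/* Explicit 32-bit add: only ever references the 4-byte builtin. */
static inline int32_t sketch_atomic_add_32(volatile int32_t *addr, int32_t delta)
{
    return __sync_add_and_fetch(addr, delta);
}

#if SKETCH_HAVE_ATOMIC_ADD_64
/* Explicit 64-bit add: compiled only when 8-byte support was "detected". */
static inline int64_t sketch_atomic_add_64(volatile int64_t *addr, int64_t delta)
{
    return __sync_add_and_fetch(addr, delta);
}
#endif

/* Generic add: selects the operation from the size of the target. */
static inline void sketch_atomic_add_xx(volatile void *addr, int32_t value,
                                        size_t length)
{
    switch (length) {
    case 4:
        sketch_atomic_add_32((volatile int32_t *) addr, value);
        break;
#if SKETCH_HAVE_ATOMIC_ADD_64
    case 8:
        /* This branch is what drags in the 8-byte helper, even when the
         * caller only ever passes a 32-bit counter. */
        sketch_atomic_add_64((volatile int64_t *) addr, (int64_t) value);
        break;
#endif
    default:
        abort(); /* unsupported operand size */
    }
}

/* Generic front end: the operand size is taken from the pointed-to type. */
#define sketch_atomic_add(ADDR, VALUE)                              \
    sketch_atomic_add_xx((volatile void *) (ADDR), (int32_t) (VALUE), \
                         sizeof(*(ADDR)))
```

In this sketch, when SKETCH_HAVE_ATOMIC_ADD_64 is 1 the 8-byte case is
compiled into every expansion of the generic macro, so a 32-bit counter can
still leave a reference to an 8-byte helper unless the compiler folds the
switch away. Calling sketch_atomic_add_32 directly never has that problem.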

Can you take a look at the 64-bit atomic detection in config.log and post
the corresponding output here?

Thanks,
  George.



On Tue, May 10, 2016 at 1:38 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> I am currently working with the v2.x branch, rather than tarballs.
>
> While attempting to build on AIX (which is ILP32 by default), I
> encountered an unexpected undefined reference to __sync_add_and_fetch_8()
> from opal/class/opal_list.h.
>
> I found that when debugging is enabled (as in almost every build I try)
> there is the following code:
> #if OPAL_ENABLE_DEBUG
>         /* Spot check: ensure this item is only on the list that we
>            just insertted it into */
>
>         (void)opal_atomic_add( &(item->opal_list_item_refcount), 1 );
>         assert(1 == item->opal_list_item_refcount);
>         item->opal_list_item_belong_to = list;
> #endif
>
> I am not sure why (and it may be an AIX-specific issue), but that
> "opal_atomic_add()" is attempting a 64-bit add.
> That is a problem, given that 'opal_list_item_refcount' is 32-bits!
>
> Noting that all other accesses to this field are OPAL_THREAD_ADD32(), I
> suggest the following (with a bonus spell-check at no additional charge):
>
> --- opal/class/opal_list.c~     2016-05-10 10:20:19.000000000 -0700
> +++ opal/class/opal_list.c      2016-05-10 10:29:14.000000000 -0700
> @@ -142,9 +142,9 @@
>
>   #if OPAL_ENABLE_DEBUG
>           /* Spot check: ensure this item is only on the list that we
> -            just insertted it into */
> +            just inserted it into */
>
> -         (void)opal_atomic_add( &(item->opal_list_item_refcount), 1 );
> +         (void)OPAL_THREAD_ADD32( &(item->opal_list_item_refcount), 1 );
>           assert(1 == item->opal_list_item_refcount);
>           item->opal_list_item_belong_to = list;
>   #endif
>
>
> Source inspection shows the same mixing of opal_atomic_add() vs
> OPAL_THREAD_ADD32() in master.
>
> -Paul
>
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department               Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/05/18952.php
>
