Re: [OMPI devel] OMPI devel] trunk compilation errors in jenkins

2014-08-06 Thread Ralph Castain
Are we maybe approaching this from the wrong direction? I ask because we had to 
do some gyrations in the pmix framework to work around the difference in naming 
schemes between OPAL and the rest of the code base, and now we have more 
gyrations here.

Given that the MPI and RTE layers both rely on the structured form of the name, 
what about if we just mimic that down in OPAL? I think we could perhaps do this 
in a way that still allows someone to overlay it with a 64-bit unstructured 
identifier if they want, but that would put the extra work on their side. In 
other words, we make it easy to work with the other parts of our own code base, 
acknowledging that those wanting to do something else may have to do some extra 
work.

I ask because every resource manager out there assigns each process a jobid and 
vpid in some form of integer format. So we have to absorb that information in 
{jobid, vpid} format regardless of what we may want to do internally. What we 
now have to do is immediately convert that into the unstructured form for OPAL 
(where we take it in via PMI), then convert it back to structured form when 
passing it up to ORTE so it can be handed to OMPI, and then convert it back to 
unstructured form every time either OMPI or ORTE accesses the OPAL layer.

Seems awfully convoluted and error-prone. Simplifying things for ourselves 
might make more sense.
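
To make the idea concrete, here is a minimal sketch (not code from the OMPI tree) of a structured name that is converted field by field, with an optional opaque 64-bit view layered on top for anyone who wants one. The names sketch_name_t, sketch_name_hton, sketch_name_ntoh and sketch_name_as_u64 are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl, ntohl */

/* Hypothetical structured process name; not the actual OPAL definition. */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} sketch_name_t;

/* Convert each named field to network byte order. Because the fields are
 * addressed by name, the same code is correct on any host endianness. */
static inline sketch_name_t sketch_name_hton(sketch_name_t n)
{
    sketch_name_t out;
    out.jobid = htonl(n.jobid);
    out.vpid  = htonl(n.vpid);
    return out;
}

/* Convert each named field back to host byte order. */
static inline sketch_name_t sketch_name_ntoh(sketch_name_t n)
{
    sketch_name_t out;
    out.jobid = ntohl(n.jobid);
    out.vpid  = ntohl(n.vpid);
    return out;
}

/* Opaque 64-bit view for callers that do not care about the fields.
 * The bytes are copied rather than cast to avoid aliasing problems. */
static inline uint64_t sketch_name_as_u64(sketch_name_t n)
{
    uint64_t v;
    assert(sizeof n == sizeof v);
    memcpy(&v, &n, sizeof v);
    return v;
}
```

With this shape, the {jobid, vpid} pair handed over by a resource manager could be stored as-is, and only code that insists on an unstructured identifier would pay for the extra copy.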


On Aug 6, 2014, at 1:21 PM, George Bosilca  wrote:

> Gilles,
> 
> This looks right. It is really unfortunate that we have to change the 
> definition of orte_process_name_t for big endian architectures, but I don't 
> think there is a way around it.
> 
> Regarding your patch I have two comments:
> 1. There is a flagrant lack of comments ... especially on the ORTE side
> 2. at the OPAL level we are really implementing a htonll, and I really think 
> we should stick to the POSIX prototype (aka. returning the changed value 
> instead of doing things in place).
> 
>   George.
> 
> 
> 
> On Wed, Aug 6, 2014 at 7:02 AM, Gilles Gouaillardet 
>  wrote:
> Ralph and George,
> 
> here is attached a patch that fixes the heterogeneous support without the 
> abstraction violation.
> 
> Cheers,
> 
> Gilles
> 
> 
> On 2014/08/06 9:40, Gilles Gouaillardet wrote:
>> hummm
>> 
>> i intentionally did not swap the two 32 bits (!)
>> 
>> from the top level, what we have is :
>> 
>> typedef struct {
>>     union {
>>         uint64_t opal;
>>         struct {
>>             uint32_t jobid;
>>             uint32_t vpid;
>>         } orte;
>>     };
>> } meta_process_name_t;
>> 
>> OPAL is agnostic about jobid and vpid.
>> jobid and vpid are set in ORTE/MPI and OPAL is used only
>> to transport the 64 bits
>> /* opal_process_name_t and orte_process_name_t are often cast into each
>> other */
>> at ORTE/MPI level, jobid and vpid are set individually
>> /* e.g. we do *not* do something like opal = jobid | (vpid<<32) */
>> this is why everything works fine on homogeneous clusters regardless
>> of endianness.
>> 
>> now in a heterogeneous cluster, things get a bit trickier ...
>> 
>> i was initially unhappy with my commit and i think i found out why :
>> this is an abstraction violation !
>> the two 32 bits are not swapped by OPAL because this is what is expected by
>> the ORTE/OMPI.
>> 
>> now i'd like to suggest the following lightweight approach :
>> 
>> at OPAL, use #if protected htonll/ntohll
>> (i.e. swap the two 32 bits)
>> 
>> do the trick at the ORTE level :
>> 
>> simply replace
>> 
>> struct orte_process_name_t {
>> orte_jobid_t jobid;
>> orte_vpid_t vpid;
>> };
>> 
>> with
>> 
>> #if OPAL_ENABLE_HETEROGENEOUS_SUPPORT && !defined(WORDS_BIGENDIAN)
>> struct orte_process_name_t {
>> orte_vpid_t vpid;
>> orte_jobid_t jobid;
>> };
>> #else
>> struct orte_process_name_t {
>> orte_jobid_t jobid;
>> orte_vpid_t vpid;
>> };
>> #endif
>> 
>> 
>> so we keep OPAL agnostic about how the uint64_t is really used at the upper
>> level.
>> another option is to make OPAL aware of jobid and vpid but this is a bit
>> more heavyweight imho.
>> 
>> i'll try this today and make sure it works.
>> 
>> any thoughts ?
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> 
>> On Wed, Aug 6, 2014 at 8:17 AM, Ralph Castain  wrote:
>> 
>>> Ah yes, so it is - sorry I missed that last test :-/
>>> 
>>> On Aug 5, 2014, at 10:50 AM, George Bosilca  wrote:
>>> 
>>> The code committed by Gilles is correctly protected for big endian (
>>> https://svn.open-mpi.org/trac/ompi/changeset/32425). I was merely
>>> pointing out that I think he should also swap the 2 32 bits in his
>>> implementation.
>>> 
>>>   George.
>>> 
>>> 
>>> 
>>> On Tue, Aug 5, 2014 at 1:32 PM, Ralph Castain  wrote:
>>> 
>>>> On Aug 5, 2014, at 10:23 AM, George Bosilca  wrote:
>>>> 
>>>> On Tue, Aug 5, 2014 at 1:15 PM, Ralph Castain  wrote:
>>>> 
>>>>> Hmmm...wouldn't that then require that you 

[hwloc-devel] Create success (hwloc git dev-181-g135efd0)

2014-08-06 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success.

Snapshot:   hwloc dev-181-g135efd0
Start time: Wed Aug  6 21:01:01 EDT 2014
End time:   Wed Aug  6 21:02:31 EDT 2014

Your friendly daemon,
Cyrador


Re: [OMPI devel] OMPI devel] trunk compilation errors in jenkins

2014-08-06 Thread George Bosilca
Gilles,

This looks right. It is really unfortunate that we have to change the
definition of orte_process_name_t for big endian architectures, but I don't
think there is a way around it.

Regarding your patch I have two comments:
1. There is a flagrant lack of comments ... especially on the ORTE side
2. at the OPAL level we are really implementing a htonll, and I really
think we should stick to the POSIX prototype (aka. returning the changed
value instead of doing things in place).
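
As an illustration of point 2, a value-returning 64-bit conversion could look roughly like the sketch below. It is not the code from the patch. The names sketch_hton64, sketch_ntoh64 and SKETCH_BIG_ENDIAN are hypothetical (chosen so they cannot clash with platforms that already provide an htonll macro), WORDS_BIGENDIAN is the Autoconf macro used elsewhere in this thread, and the __BYTE_ORDER__ fallback is an assumption that only holds for GCC-compatible compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl */

/* Decide the host byte order at compile time. */
#if defined(WORDS_BIGENDIAN) || \
    (defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

/* Return v converted to network (big endian) byte order.
 * The argument is left untouched, mirroring the htonl() prototype. */
static inline uint64_t sketch_hton64(uint64_t v)
{
#if SKETCH_BIG_ENDIAN
    return v;                               /* already in network order */
#else
    uint32_t lo = (uint32_t)(v & 0xffffffffu);
    uint32_t hi = (uint32_t)(v >> 32);
    /* Byte-swap each half and exchange the two halves. */
    return ((uint64_t)htonl(lo) << 32) | (uint64_t)htonl(hi);
#endif
}

/* The inverse is the same operation. */
static inline uint64_t sketch_ntoh64(uint64_t v)
{
    return sketch_hton64(v);
}

/* Small self-check: a round trip must give back the original value. */
static inline void sketch_hton64_selfcheck(void)
{
    uint64_t v = 0x0102030405060708ULL;
    assert(sketch_ntoh64(sketch_hton64(v)) == v);
}
```

Returning the converted value keeps call sites in the familiar form x = sketch_hton64(x), the same way htonl() is used for 32-bit fields.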

  George.



On Wed, Aug 6, 2014 at 7:02 AM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Ralph and George,
>
> here is attached a patch that fixes the heterogeneous support without the
> abstraction violation.
>
> Cheers,
>
> Gilles
>
>
> On 2014/08/06 9:40, Gilles Gouaillardet wrote:
>
> hummm
>
> i intentionally did not swap the two 32 bits (!)
>
> from the top level, what we have is :
>
> typedef struct {
>     union {
>         uint64_t opal;
>         struct {
>             uint32_t jobid;
>             uint32_t vpid;
>         } orte;
>     };
> } meta_process_name_t;
>
> OPAL is agnostic about jobid and vpid.
> jobid and vpid are set in ORTE/MPI and OPAL is used only
> to transport the 64 bits
> /* opal_process_name_t and orte_process_name_t are often cast into each
> other */
> at ORTE/MPI level, jobid and vpid are set individually
> /* e.g. we do *not* do something like opal = jobid | (vpid<<32) */
> this is why everything works fine on homogeneous clusters regardless
> of endianness.
>
> now in a heterogeneous cluster, things get a bit trickier ...
>
> i was initially unhappy with my commit and i think i found out why :
> this is an abstraction violation !
> the two 32 bits are not swapped by OPAL because this is what is expected by
> the ORTE/OMPI.
>
> now i'd like to suggest the following lightweight approach :
>
> at OPAL, use #if protected htonll/ntohll
> (i.e. swap the two 32 bits)
>
> do the trick at the ORTE level :
>
> simply replace
>
> struct orte_process_name_t {
> orte_jobid_t jobid;
> orte_vpid_t vpid;
> };
>
> with
>
> #if OPAL_ENABLE_HETEROGENEOUS_SUPPORT && !defined(WORDS_BIGENDIAN)
> struct orte_process_name_t {
> orte_vpid_t vpid;
> orte_jobid_t jobid;
> };
> #else
> struct orte_process_name_t {
> orte_jobid_t jobid;
> orte_vpid_t vpid;
> };
> #endif
>
>
> so we keep OPAL agnostic about how the uint64_t is really used at the upper
> level.
> another option is to make OPAL aware of jobid and vpid but this is a bit
> more heavyweight imho.
>
> i'll try this today and make sure it works.
>
> any thoughts ?
>
> Cheers,
>
> Gilles
>
>
> On Wed, Aug 6, 2014 at 8:17 AM, Ralph Castain  
>  wrote:
>
>
>  Ah yes, so it is - sorry I missed that last test :-/
>
> On Aug 5, 2014, at 10:50 AM, George Bosilca  
>  wrote:
>
> The code committed by Gilles is correctly protected for big endian 
> (https://svn.open-mpi.org/trac/ompi/changeset/32425). I was merely
> pointing out that I think he should also swap the 2 32 bits in his
> implementation.
>
>   George.
>
>
>
> On Tue, Aug 5, 2014 at 1:32 PM, Ralph Castain  
>  wrote:
>
>
>  On Aug 5, 2014, at 10:23 AM, George Bosilca  
>  wrote:
>
> On Tue, Aug 5, 2014 at 1:15 PM, Ralph Castain  
>  wrote:
>
>
>  Hmmm...wouldn't that then require that you know (a) the other side is
> little endian, and (b) that you are on a big endian? Otherwise, you wind up
> with the same issue in reverse, yes?
>
>
>  This is similar to the 32 bits ntohl that we are using in other parts of
> the project. Any  little endian participant will do the conversion, while
> every big endian participant will use an empty macro instead.
>
>
>
>  In the ORTE methods, we explicitly set the fields (e.g., jobid =
> ntohl(remote-jobid)) to get around this problem. I missed that he did it by
> location instead of named fields - perhaps we should do that instead?
>
>
>  As soon as we impose the ORTE naming scheme at the OPAL level (aka. the
> notion of jobid and vpid) this approach will become possible.
>
>
> Not proposing that at all so long as the other method will work without
> knowing the other side's endianness. Sounds like your approach should work
> fine as long as Gilles adds a #if so big endian defines the macro away
>
>
>   George.
>
>
>
>
>
> On Aug 5, 2014, at 10:06 AM, George Bosilca  
>  wrote:
>
> Technically speaking, converting a 64-bit value to a big endian
> representation requires swapping the two 32-bit parts. So the correct
> approach would have been:
> uint64_t htonll(uint64_t v)
> {
>     return ((uint64_t)ntohl((uint32_t)v) << 32) | (uint64_t)ntohl((uint32_t)(v >> 32));
> }
>
>   George.
>
>
>
> On Tue, Aug 5, 2014 at 5:52 AM, Ralph Castain  
>  wrote:
>
>
>  FWIW: that's exactly how we do it in ORTE
>
> On 

Re: [OMPI devel] OMPI devel] trunk compilation errors in jenkins

2014-08-06 Thread Gilles Gouaillardet
Ralph and George,

here is attached a patch that fixes the heterogeneous support without
the abstraction violation.

Cheers,

Gilles
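
For readers without the attachment, the sketch below illustrates the idea discussed in the quoted message; it is not the attached patch. It shows why declaring vpid before jobid on a little endian host lets a plain 64-bit byte swap produce the same wire bytes as converting jobid and vpid one by one. All sketch_* names are hypothetical, the OPAL_ENABLE_HETEROGENEOUS_SUPPORT condition is left out for brevity, and the __BYTE_ORDER__ fallback is an assumption that only holds for GCC-compatible compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl */

#if defined(WORDS_BIGENDIAN) || \
    (defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

/* Structured name as seen by the upper layers. The field order depends on
 * the host byte order so that the opaque 64-bit value is always
 * (jobid << 32) | vpid. */
typedef struct {
#if SKETCH_BIG_ENDIAN
    uint32_t jobid;
    uint32_t vpid;
#else
    uint32_t vpid;
    uint32_t jobid;
#endif
} sketch_orte_name_t;

/* 64-bit host/network conversion; a no-op on big endian hosts. */
static inline uint64_t sketch_swap64(uint64_t v)
{
#if SKETCH_BIG_ENDIAN
    return v;
#else
    return ((uint64_t)htonl((uint32_t)(v & 0xffffffffu)) << 32)
         | (uint64_t)htonl((uint32_t)(v >> 32));
#endif
}

/* Lower layer: treats the name as 64 opaque bits and never looks inside. */
static inline void sketch_pack(const sketch_orte_name_t *name,
                               unsigned char wire[8])
{
    uint64_t opaque;
    assert(sizeof *name == sizeof opaque);
    memcpy(&opaque, name, sizeof opaque);
    opaque = sketch_swap64(opaque);
    memcpy(wire, &opaque, sizeof opaque);
}

static inline void sketch_unpack(const unsigned char wire[8],
                                 sketch_orte_name_t *name)
{
    uint64_t opaque;
    memcpy(&opaque, wire, sizeof opaque);
    opaque = sketch_swap64(opaque);
    memcpy(name, &opaque, sizeof opaque);
}
```

On either kind of host the wire buffer ends up holding jobid in network order followed by vpid in network order, so a little endian sender and a big endian receiver agree without the lower layer knowing what the two halves mean.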

On 2014/08/06 9:40, Gilles Gouaillardet wrote:
> hummm
>
> i intentionally did not swap the two 32 bits (!)
>
> from the top level, what we have is :
>
> typedef struct {
>     union {
>         uint64_t opal;
>         struct {
>             uint32_t jobid;
>             uint32_t vpid;
>         } orte;
>     };
> } meta_process_name_t;
>
> OPAL is agnostic about jobid and vpid.
> jobid and vpid are set in ORTE/MPI and OPAL is used only
> to transport the 64 bits
> /* opal_process_name_t and orte_process_name_t are often cast into each
> other */
> at ORTE/MPI level, jobid and vpid are set individually
> /* e.g. we do *not* do something like opal = jobid | (vpid<<32) */
> this is why everything works fine on homogeneous clusters regardless
> of endianness.
>
> now in a heterogeneous cluster, things get a bit trickier ...
>
> i was initially unhappy with my commit and i think i found out why :
> this is an abstraction violation !
> the two 32 bits are not swapped by OPAL because this is what is expected by
> the ORTE/OMPI.
>
> now i'd like to suggest the following lightweight approach :
>
> at OPAL, use #if protected htonll/ntohll
> (i.e. swap the two 32 bits)
>
> do the trick at the ORTE level :
>
> simply replace
>
> struct orte_process_name_t {
> orte_jobid_t jobid;
> orte_vpid_t vpid;
> };
>
> with
>
> #if OPAL_ENABLE_HETEROGENEOUS_SUPPORT && !defined(WORDS_BIGENDIAN)
> struct orte_process_name_t {
> orte_vpid_t vpid;
> orte_jobid_t jobid;
> };
> #else
> struct orte_process_name_t {
> orte_jobid_t jobid;
> orte_vpid_t vpid;
> };
> #endif
>
>
> so we keep OPAL agnostic about how the uint64_t is really used at the upper
> level.
> another option is to make OPAL aware of jobid and vpid but this is a bit
> more heavyweight imho.
>
> i'll try this today and make sure it works.
>
> any thoughts ?
>
> Cheers,
>
> Gilles
>
>
> On Wed, Aug 6, 2014 at 8:17 AM, Ralph Castain  wrote:
>
>> Ah yes, so it is - sorry I missed that last test :-/
>>
>> On Aug 5, 2014, at 10:50 AM, George Bosilca  wrote:
>>
>> The code committed by Gilles is correctly protected for big endian (
>> https://svn.open-mpi.org/trac/ompi/changeset/32425). I was merely
>> pointing out that I think he should also swap the 2 32 bits in his
>> implementation.
>>
>>   George.
>>
>>
>>
>> On Tue, Aug 5, 2014 at 1:32 PM, Ralph Castain  wrote:
>>
>>> On Aug 5, 2014, at 10:23 AM, George Bosilca  wrote:
>>>
>>> On Tue, Aug 5, 2014 at 1:15 PM, Ralph Castain  wrote:
>>>
>>>> Hmmm...wouldn't that then require that you know (a) the other side is
>>>> little endian, and (b) that you are on a big endian? Otherwise, you wind up
>>>> with the same issue in reverse, yes?

>>> This is similar to the 32 bits ntohl that we are using in other parts of
>>> the project. Any  little endian participant will do the conversion, while
>>> every big endian participant will use an empty macro instead.
>>>
>>>
>>>> In the ORTE methods, we explicitly set the fields (e.g., jobid =
>>>> ntohl(remote-jobid)) to get around this problem. I missed that he did it by
>>>> location instead of named fields - perhaps we should do that instead?

>>> As soon as we impose the ORTE naming scheme at the OPAL level (aka. the
>>> notion of jobid and vpid) this approach will become possible.
>>>
>>>
>>> Not proposing that at all so long as the other method will work without
>>> knowing the other side's endianness. Sounds like your approach should work
>>> fine as long as Gilles adds a #if so big endian defines the macro away
>>>
>>>
>>>   George.
>>>
>>>
>>>

>>>> On Aug 5, 2014, at 10:06 AM, George Bosilca  wrote:
>>>>
>>>> Technically speaking, converting a 64-bit value to a big endian
>>>> representation requires swapping the two 32-bit parts. So the correct
>>>> approach would have been:
>>>> uint64_t htonll(uint64_t v)
>>>> {
>>>>     return ((uint64_t)ntohl((uint32_t)v) << 32) | (uint64_t)ntohl((uint32_t)(v >> 32));
>>>> }
>>>>
>>>>   George.
>>>>
>>>>
>>>>
>>>> On Tue, Aug 5, 2014 at 5:52 AM, Ralph Castain  wrote:

>>>>> FWIW: that's exactly how we do it in ORTE
>>>>>
>>>>> On Aug 4, 2014, at 10:25 PM, Gilles Gouaillardet <
>>>>> gilles.gouaillar...@iferc.org> wrote:
>>>>>
>>>>> George,
>>>>>
>>>>> i confirm there was a problem when running on a heterogeneous cluster,
>>>>> this is now fixed in r32425.
>>>>>
>>>>> i am not convinced i chose the most elegant way to achieve the desired
>>>>> result ...
>>>>> could you please double check this commit ?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Gilles
>>>>>
>>>>> On 2014/08/02 0:14, George Bosilca wrote:
>>>>>
>>>>> Gilles,
>>>>>
>>>>> The design of the BTL move was to let the opal_process_name_t be agnostic 
>>>>> to what is stored inside, and all accesses should be done through the 

Re: [OMPI devel] [1.8.2rc3] static linking fails on linux when not building ROMIO

2014-08-06 Thread Gilles Gouaillardet
Paul,

i missed a step indeed :
opal is required by rte, that is in turn required by mpi

the attached patch does the job (tested on a solaris10/x86_64 vm with
gnu compilers)

Cheers,

Gilles

On 2014/08/06 4:40, Paul Hargrove wrote:
> Gilles,
>
> I have not tested your patch.
> I've only read it.
>
> It looks like it could work, except that libopen-rte.a depends on libsocket
> and libnsl on Solaris.
> So, one probably needs to add $LIBS to the ORTE wrapper libs as well.
>
> Additionally, if your approach is the correct one, then I think one can fold:
>
> OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS],
> [$wrapper_extra_libs])
> OPAL_WRAPPER_EXTRA_LIBS="$OPAL_WRAPPER_EXTRA_LIBS
> $with_wrapper_libs"
> +   OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS], [$LIBS])
> +   OPAL_WRAPPER_EXTRA_LIBS="$OPAL_WRAPPER_EXTRA_LIBS
> $with_wrapper_libs"
>
> into just
>
> -OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS],
> [$wrapper_extra_libs])
> +   OPAL_FLAGS_APPEND_UNIQ([OPAL_WRAPPER_EXTRA_LIBS],
> [$wrapper_extra_libs $LIBS])
>
> which merges two calls to OPAL_FLAGS_APPEND_UNIQ and avoids double-adding
> of the user's $with_wrapper_libs.
> And of course the same 1-line change would apply for the OMPI and
> eventually ORTE variables too.
>
> I'd like to wait until Jeff has had a chance to look this over before I
> devote time to testing.
> Since I've determined already that 1.6.5 did not have the problem while
> 1.7.x does, the possibility exists that some smaller change might exist to
> restore whatever was lost between the v1.6 and v1.7 branches.
>
> -Paul
>
>
> On Tue, Aug 5, 2014 at 1:33 AM, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
>
>>  Here is a patch that has been minimally tested.
>>
>> this is likely overkill (at least when dynamic libraries can be used),
>> but it does the job so far ...
>>
>> Cheers,
>>
>> Gilles
>>
>> On 2014/08/05 16:56, Gilles Gouaillardet wrote:
>>
>> from libopen-pal.la :
>> dependency_libs=' -lrdmacm -libverbs -lscif -lnuma -ldl -lrt -lnsl
>> -lutil -lm'
>>
>>
>> i confirm mpicc fails linking
>>
>> but FWIW, using libtool does work (!)
>>
>> could the bug come from the mpicc (and other) wrappers ?
>>
>> Gilles
>>
>> $ gcc -g -O0 -o hw /csc/home1/gouaillardet/hw.c
>> -I/tmp/install/ompi.noromio/include -pthread -L/usr/lib64 -Wl,-rpath
>> -Wl,/usr/lib64 -Wl,-rpath -Wl,/tmp/install/ompi.noromio/lib
>> -Wl,--enable-new-dtags -L/tmp/install/ompi.noromio/lib -lmpi -lopen-rte
>> -lopen-pal -lm -lnuma -libverbs -lscif -lrdmacm -ldl -llustreapi
>>
>> $ /tmp/install/ompi.noromio/bin/mpicc -g -O0 -o hw -show ~/hw.c
>> gcc -g -O0 -o hw /csc/home1/gouaillardet/hw.c
>> -I/tmp/install/ompi.noromio/include -pthread -L/usr/lib64 -Wl,-rpath
>> -Wl,/usr/lib64 -Wl,-rpath -Wl,/tmp/install/ompi.noromio/lib
>> -Wl,--enable-new-dtags -L/tmp/install/ompi.noromio/lib -lmpi -lopen-rte
>> -lopen-pal -lm -lnuma -libverbs -lscif -lrdmacm -ldl -llustreapi
>> [gouaillardet@soleil build]$ /tmp/install/ompi.noromio/bin/mpicc -g -O0
>> -o hw ~/hw.c
>> /tmp/install/ompi.noromio/lib/libmpi.a(fbtl_posix_ipwritev.o): In
>> function `mca_fbtl_posix_ipwritev':
>> fbtl_posix_ipwritev.c:(.text+0x17b): undefined reference to `aio_write'
>> fbtl_posix_ipwritev.c:(.text+0x237): undefined reference to `aio_write'
>> fbtl_posix_ipwritev.c:(.text+0x3f4): undefined reference to `aio_write'
>> fbtl_posix_ipwritev.c:(.text+0x48e): undefined reference to `aio_write'
>> /tmp/install/ompi.noromio/lib/libopen-pal.a(opal_pty.o): In function
>> `opal_openpty':
>> opal_pty.c:(.text+0x1): undefined reference to `openpty'
>> /tmp/install/ompi.noromio/lib/libopen-pal.a(event.o): In function
>> `event_add_internal':
>> event.c:(.text+0x288d): undefined reference to `clock_gettime'
>>
>> $ /bin/sh ./static/libtool --silent --tag=CC   --mode=compile gcc
>> -std=gnu99 -I/tmp/install/ompi.noromio/include -c ~/hw.c
>> $ /bin/sh ./static/libtool --silent --tag=CC   --mode=link gcc
>> -std=gnu99 -o hw hw.o -L/tmp/install/ompi.noromio/lib -lmpi
>> $ ldd hw
>> linux-vdso.so.1 =>  (0x7fff7530d000)
>> librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x7f0ed541e000)
>> libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x7f0ed521)
>> libscif.so.0 => /usr/lib64/libscif.so.0 (0x003b9c60)
>> libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x003ba560)
>> libdl.so.2 => /lib64/libdl.so.2 (0x003b9be0)
>> librt.so.1 => /lib64/librt.so.1 (0x003b9ca0)
>> libnsl.so.1 => /lib64/libnsl.so.1 (0x003bae20)
>> libutil.so.1 => /lib64/libutil.so.1 (0x003bac60)
>> libm.so.6 => /lib64/libm.so.6 (0x003b9ba0)
>> libpthread.so.0 => /lib64/libpthread.so.0 (0x003b9c20)
>> libc.so.6 => /lib64/libc.so.6 (0x003b9b60)
>> /lib64/ld-linux-x86-64.so.2 (0x003b9b20)
>>
>>
>>
>>
>> On 2014/08/05 7:56, Ralph Castain wrote:
>>
>>  My thought was to post initially as a blocker, pending a discussion