Re: [OMPI users] Redusing libmpi.so size....

2016-11-01 Thread Gilles Gouaillardet

Did you strip the libraries already?
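For reference, stripping the installed libraries is one command per file. A sketch (the cross-toolchain prefix is taken from the configure line quoted later in this thread, and the install path is the poster's; both are shown only as examples -- use the strip that matches your target, not the host's /usr/bin/strip):

```shell
# Strip debug info and unneeded symbols from the cross-built library.
arm-openwrt-linux-muslgnueabi-strip --strip-unneeded \
    /home/nmahesh/Workspace/ARM_MPI/openmpi/lib/libmpi.so.12.0.3

# Compare sizes before and after:
#   ls -l /home/nmahesh/Workspace/ARM_MPI/openmpi/lib/libmpi.so*
```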


The script will show the list of frameworks and components used by MPI 
hello world.


From that, you can deduce a list of components that are not required, 
exclude them via the configure command line, and rebuild a trimmed Open MPI.
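As a sketch, excluding components at configure time looks like this (the component list below is purely illustrative -- your own list must come from what the script actually reports for your app):

```shell
# Rebuild Open MPI without components the hello-world trace did not show.
# --enable-mca-no-build takes a comma-separated framework[-component] list.
./configure --host=arm-openwrt-linux-muslgnueabi \
    --prefix=/home/nmahesh/Workspace/ARM_MPI/openmpi \
    --enable-mca-no-build=crs,snapc,filem,vprotocol \
    --disable-mpi-fortran --disable-static
make && make install
```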


Note this is pretty painful and incomplete. For example, the ompi/io 
components are not explicitly required by MPI hello world, but they are 
required if your app uses MPI-IO (e.g., MPI_File_xxx).

Some more components might be dynamically required by a real-world MPI app.


May I ask why you are focusing on reducing the lib size?

Reducing the lib size by excluding (allegedly) useless components is a 
long and painful process, and you might end up having to debug 
new problems on your own...

As far as I am concerned, if a few MB of libs is too big (filesystem? 
memory?), I do not see how a real-world application can even run on 
your ARM node.



Cheers,


Gilles

On 11/2/2016 12:49 PM, Mahesh Nanavalla wrote:

Hi George,
Thanks for the reply.

Using the above script, how can I reduce the *libmpi.so* size?



On Tue, Nov 1, 2016 at 11:27 PM, George Bosilca wrote:


Let's try to coerce OMPI to dump all modules that are still loaded
after MPI_Init. We are still having a superset of the needed
modules, but at least everything unnecessary in your particular
environment has been trimmed as during a normal OMPI run.

George.

PS: It's a shell script that needs ag to run. You need to provide
the OMPI source directory. You will get a C file (named tmp.c) in
the current directory that contains the code necessary to dump all
active modules. You will have to fiddle with the compile line to
get it to work, as you will need to specify both source and build
header files directories. For the sake of completeness here is my
compile line

mpicc -o tmp -g tmp.c -I. -I../debug/opal/include
-I../debug/ompi/include -Iompi/include -Iopal/include
-Iopal/mca/event/libevent2022/libevent -Iorte/include
-I../debug/opal/mca/hwloc/hwloc1113/hwloc/include
-Iopal/mca/hwloc/hwloc1113/hwloc/include -Ioshmem/include
-I../debug/ -lopen-rte -l open-pal




Re: [OMPI users] Redusing libmpi.so size....

2016-11-01 Thread Mahesh Nanavalla
Hi George,
Thanks for the reply.

Using the above script, how can I reduce the *libmpi.so* size?



On Tue, Nov 1, 2016 at 11:27 PM, George Bosilca  wrote:

> Let's try to coerce OMPI to dump all modules that are still loaded after
> MPI_Init. We are still having a superset of the needed modules, but at
> least everything unnecessary in your particular environment has been
> trimmed as during a normal OMPI run.
>
> George.
>
> PS: It's a shell script that needs ag to run. You need to provide the OMPI
> source directory. You will get a C file (named tmp.c) in the current
> directory that contains the code necessary to dump all active modules. You
> will have to fiddle with the compile line to get it to work, as you will
> need to specify both source and build header files directories. For the
> sake of completeness here is my compile line
>
> mpicc -o tmp -g tmp.c -I. -I../debug/opal/include -I../debug/ompi/include
> -Iompi/include -Iopal/include -Iopal/mca/event/libevent2022/libevent
> -Iorte/include -I../debug/opal/mca/hwloc/hwloc1113/hwloc/include
> -Iopal/mca/hwloc/hwloc1113/hwloc/include -Ioshmem/include -I../debug/
> -lopen-rte -l open-pal

Re: [OMPI users] Slurm binding not propagated to MPI jobs

2016-11-01 Thread r...@open-mpi.org
Ah crumby!! We already solved this on master, but it cannot be backported to 
the 1.10 series without considerable pain. For some reason, the support for it 
has been removed from the 2.x series as well. I’ll try to resolve that issue 
and get the support reinstated there (probably not until 2.1).

Can you manage until then? I think the v2 RM’s are thinking Dec/Jan for 2.1.
Ralph


> On Nov 1, 2016, at 11:38 AM, Riebs, Andy  wrote:
> 
> To close the thread here… I got the following information:
>  
> Looking at SLURM_CPU_BIND is the right idea, but there are quite a few more 
> options. It misses map_cpu, rank, plus the NUMA-based options:
> rank_ldom, map_ldom, and mask_ldom. See the srun man pages for documentation.

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Nathan Hjelm

UDCM does not require IPoIB. It should be working for you. Can you build Open 
MPI with --enable-debug, run with -mca btl_base_verbose 100, and create a 
gist with the output?
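Roughly, that debugging workflow is (a sketch; the install prefix, process count, and binary name are placeholders):

```shell
# Rebuild with debugging support, then capture verbose BTL output.
./configure --prefix=$HOME/ompi-debug --enable-debug
make install

$HOME/ompi-debug/bin/mpirun -np 2 --mca btl_base_verbose 100 ./a.out 2>&1 | tee btl.log
# then paste btl.log into a gist (e.g., https://gist.github.com)
```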

-Nathan

On Nov 01, 2016, at 07:50 AM, Sergei Hrushev  wrote:


I haven't worked with InfiniBand for years, but I do believe that yes: you need 
IPoIB enabled on your IB devices to get the RDMA CM support to work.


Yes, I also saw that RDMA CM requires IP, but in my case Open MPI reports that 
UDCM can't be used either.
Does it also require IPoIB?

Is it possible to read more about UDCM somewhere?


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Slurm binding not propagated to MPI jobs

2016-11-01 Thread Riebs, Andy
To close the thread here… I got the following information:


Looking at SLURM_CPU_BIND is the right idea, but there are quite a few more 
options. It misses map_cpu, rank, plus the NUMA-based options:

rank_ldom, map_ldom, and mask_ldom. See the srun man pages for documentation.


From: Riebs, Andy
Sent: Thursday, October 27, 2016 1:53 PM
To: users@lists.open-mpi.org
Subject: Re: [OMPI users] Slurm binding not propagated to MPI jobs


Hi Ralph,

I haven't played around in this code, so I'll flip the question over to the 
Slurm list, and report back here when I learn anything.
Cheers
Andy
On 10/27/2016 01:44 PM, r...@open-mpi.org wrote:
Sigh - of course it wouldn’t be simple :-(

All right, let’s suppose we look for SLURM_CPU_BIND:

* if it includes the word “none”, then we know the user specified that they 
don’t want us to bind

* if it includes the word mask_cpu, then we have to check the value of that 
option.

* If it is all F’s, then they didn’t specify a binding and we should do our 
thing.

* If it is anything else, then we assume they _did_ specify a binding, and we 
leave it alone

Would that make sense? Is there anything else that could be in that envar which 
would trip us up?
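The checks proposed above can be rendered as code. A minimal sketch in Python (the function name and the "all F's" test are illustrative, not Open MPI source):

```python
def ompi_should_bind(slurm_cpu_bind):
    """Decide whether Open MPI should apply its own binding, given the
    value of SLURM_CPU_BIND (e.g. "quiet,none" or "quiet,mask_cpu:0xF0,0x0F")."""
    if not slurm_cpu_bind:
        return True   # envar absent: srun did not bind, so do our thing
    if "none" in slurm_cpu_bind:
        return False  # user explicitly asked for no binding: leave it alone
    if "mask_cpu" in slurm_cpu_bind:
        masks = slurm_cpu_bind.split("mask_cpu:", 1)[1].split(",")
        # all-F masks mean no real binding was specified: bind ourselves
        if all(set(m.replace("0x", "").upper()) <= {"F"} for m in masks if m):
            return True
        return False  # a real mask: the user did specify a binding
    return False      # any other explicit setting: assume the user chose it
```

This mirrors the four bullets: missing envar and all-F masks mean "bind as usual"; "none" or a concrete mask means "leave it alone".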


On Oct 27, 2016, at 10:37 AM, Andy Riebs wrote:

Yes, they still exist:
$ srun --ntasks-per-node=2 -N1 env | grep BIND | sort -u
SLURM_CPU_BIND_LIST=0x
SLURM_CPU_BIND=quiet,mask_cpu:0x
SLURM_CPU_BIND_TYPE=mask_cpu:
SLURM_CPU_BIND_VERBOSE=quiet
Here are the relevant Slurm configuration options that could conceivably change 
the behavior from system to system:
SelectType  = select/cons_res
SelectTypeParameters= CR_CPU

On 10/27/2016 01:17 PM, r...@open-mpi.org wrote:
And if there is no --cpu_bind on the cmd line? Do these not exist?

On Oct 27, 2016, at 10:14 AM, Andy Riebs wrote:

Hi Ralph,

I think I've found the magic keys...

$ srun --ntasks-per-node=2 -N1 --cpu_bind=none env | grep BIND
SLURM_CPU_BIND_VERBOSE=quiet
SLURM_CPU_BIND_TYPE=none
SLURM_CPU_BIND_LIST=
SLURM_CPU_BIND=quiet,none
SLURM_CPU_BIND_VERBOSE=quiet
SLURM_CPU_BIND_TYPE=none
SLURM_CPU_BIND_LIST=
SLURM_CPU_BIND=quiet,none
$ srun --ntasks-per-node=2 -N1 --cpu_bind=core env | grep BIND
SLURM_CPU_BIND_VERBOSE=quiet
SLURM_CPU_BIND_TYPE=mask_cpu:
SLURM_CPU_BIND_LIST=0x,0x
SLURM_CPU_BIND=quiet,mask_cpu:0x,0x
SLURM_CPU_BIND_VERBOSE=quiet
SLURM_CPU_BIND_TYPE=mask_cpu:
SLURM_CPU_BIND_LIST=0x,0x
SLURM_CPU_BIND=quiet,mask_cpu:0x,0x

Andy

On 10/27/2016 11:57 AM, r...@open-mpi.org wrote:

Hey Andy

Is there a SLURM envar that would tell us the binding option from the srun cmd 
line? We automatically bind when direct launched due to user complaints of poor 
performance if we don’t. If the user specifies a binding option, then we 
detect that we were already bound and don’t do it.

However, if the user specifies that they not be bound, then we think they 
simply didn’t specify anything - and that isn’t the case. If we 
can see something that tells us “they explicitly said not to do it”, 
then we can avoid the situation.

Ralph


On Oct 27, 2016, at 8:48 AM, Andy Riebs wrote:

Hi All,

We are running Open MPI version 1.10.2, built with support for Slurm version 
16.05.0. When a user specifies "--cpu_bind=none", MPI tries to bind by core, 
which segv's if there are more processes than cores.

The user reports:

What I found is that

% srun --ntasks-per-node=8 --cpu_bind=none  \
env SHMEM_SYMMETRIC_HEAP_SIZE=1024M bin/all2all.shmem.exe 0

will have the problem, but:

% srun --ntasks-per-node=8 --cpu_bind=none  \
env SHMEM_SYMMETRIC_HEAP_SIZE=1024M ./bindit.sh bin/all2all.shmem.exe 0

Will run as expected and print out the usage message because I didn’t 
provide the right arguments to the code.

So, it appears that the binding has something to do with the issue. My binding 
script is as follows:

% cat bindit.sh
#!/bin/bash

#echo SLURM_LOCALID=$SLURM_LOCALID

stride=1

if [ ! -z "$SLURM_LOCALID" ]; then
  let bindCPU=$SLURM_LOCALID*$stride
  exec numactl --membind=0 --physcpubind=$bindCPU $*
fi

$*

%


--
Andy Riebs
andy.ri...@hpe.com
Hewlett-Packard Enterprise
High Performance Computing Software Engineering
+1 404 648 9024
My opinions are not necessarily those of HPE
   May the source be with you!


Re: [OMPI users] Redusing libmpi.so size....

2016-11-01 Thread George Bosilca
Let's try to coerce OMPI to dump all modules that are still loaded after
MPI_Init. We are still having a superset of the needed modules, but at
least everything unnecessary in your particular environment has been
trimmed as during a normal OMPI run.

George.

PS: It's a shell script that needs ag to run. You need to provide the OMPI
source directory. You will get a C file (named tmp.c) in the current
directory that contains the code necessary to dump all active modules. You
will have to fiddle with the compile line to get it to work, as you will
need to specify both source and build header files directories. For the
sake of completeness here is my compile line

mpicc -o tmp -g tmp.c -I. -I../debug/opal/include -I../debug/ompi/include
-Iompi/include -Iopal/include -Iopal/mca/event/libevent2022/libevent
-Iorte/include -I../debug/opal/mca/hwloc/hwloc1113/hwloc/include
-Iopal/mca/hwloc/hwloc1113/hwloc/include -Ioshmem/include -I../debug/
-lopen-rte -l open-pal



On Tue, Nov 1, 2016 at 7:12 AM, Jeff Squyres (jsquyres) 
wrote:

> Run ompi_info; it will tell you all the plugins that are installed.
>
> > On Nov 1, 2016, at 2:13 AM, Mahesh Nanavalla <
> mahesh.nanavalla...@gmail.com> wrote:
> >
> > Hi Jeff Squyres,
> >
> > Thank you for your reply...
> >
> > My problem is i want to reduce library size by removing unwanted
> plugin's.
> >
> > Here libmpi.so.12.0.3 size is 2.4MB.
> >
> > How can i know what are the pluggin's included to build the
> libmpi.so.12.0.3 and how can remove.
> >
> > Thanks,
> > Mahesh N
> >
> > On Fri, Oct 28, 2016 at 7:09 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > On Oct 28, 2016, at 8:12 AM, Mahesh Nanavalla <
> mahesh.nanavalla...@gmail.com> wrote:
> > >
> > > i have configured as below for arm
> > >
> > > ./configure --enable-orterun-prefix-by-default
> --prefix="/home/nmahesh/Workspace/ARM_MPI/openmpi" 
> CC=arm-openwrt-linux-muslgnueabi-gcc
> CXX=arm-openwrt-linux-muslgnueabi-g++ --host=arm-openwrt-linux-muslgnueabi
> --enable-script-wrapper-compilers --disable-mpi-fortran --enable-dlopen
> --enable-shared --disable-vt --disable-java --disable-libompitrace
> --disable-static
> >
> > Note that there is a tradeoff here: --enable-dlopen will reduce the size
> of libmpi.so by splitting out all the plugins into separate DSOs (dynamic
> shared objects -- i.e., individual .so plugin files).  But note that some
> of plugins are quite small in terms of code.  I mention this because when
> you dlopen a DSO, it will load in DSOs in units of pages.  So even if a DSO
> only has 1KB of code, it will use a whole page's worth of bytes in your running
> process (e.g., 4KB -- or whatever the page size is on your system).
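The page-granularity effect described above is easy to quantify. A small sketch (the helper is illustrative; 4 KiB is just the common page size):

```python
import mmap

def loaded_size(code_bytes, page_size=mmap.PAGESIZE):
    """Bytes a DSO's code actually occupies once mapped: mmap() works in
    whole pages, so even 1 KB of code costs at least one full page."""
    pages = -(-code_bytes // page_size)  # ceiling division
    return pages * page_size
```

With 4 KiB pages, a plugin holding only 1 KB of code still occupies a full 4 KiB page once dlopen'd, which is why many tiny DSOs can waste noticeable memory.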
> >
> > On the other hand, if you --disable-dlopen, then all of Open MPI's
> plugins are slurped into libmpi.so (and friends).  Meaning: no DSOs, no
> dlopen, no page-boundary-loading behavior.  This allows the compiler/linker
> to pack in all the plugins into memory more efficiently (because they'll be
> compiled as part of libmpi.so, and all the code is packed in there -- just
> like any other library).  Your total memory usage in the process may be
> smaller.
> >
> > Sidenote: if you run more than one MPI process per node, then libmpi.so
> (and friends) will be shared between processes.  You're assumedly running
> in an embedded environment, so I don't know if this factor matters (i.e., I
> don't know if you'll run with ppn>1), but I thought I'd mention it anyway.
> >
> > On the other hand (that's your third hand, for those at home
> counting...), you may not want to include *all* the plugins.  I.e., there
> may be a bunch of plugins that you're not actually using, and therefore if
> they are compiled in as part of libmpi.so (and friends), they're consuming
> space that you don't want/need.  So the dlopen mechanism might actually be
> better -- because Open MPI may dlopen a plugin at run time, determine that
> it won't be used, and then dlclose it (i.e., release the memory that would
> have been used for it).
> >
> > On the other (fourth!) hand, you can actually tell Open MPI to *not*
> build specific plugins with the --enable-dso-no-build=LIST configure
> option.  I.e., if you know exactly what plugins you want to use, you can
> negate the ones that you *don't* want to use on the configure line, use
> --disable-static and --disable-dlopen, and you'll likely use the least
> amount of memory.  This is admittedly a bit clunky, but Open MPI's
> configure process was (obviously) not optimized for this use case -- it's
> much more optimized to the "build everything possible, and figure out which
> to use at run time" use case.
> >
> > If you really want to hit rock bottom on MPI process size in your
> embedded environment, you can do some experimentation to figure out exactly
> which components you need.  You can use repeated runs with "mpirun --mca
> ABC_base_verbose 100 ...", where "ABC" is each of Open MPI's framework
> names ("framework" = collection of plugins 

Re: [OMPI users] mpi4py+OpenMPI: Qs about submitting bugs and examples

2016-11-01 Thread Jason Maldonis
Thanks for the responses, and it's great to know that you found the OMPI
bug in 2.x for dynamic process management. I tried some of the other MPI
libraries for my project as well, but Open MPI seemed to be by far the best
in terms of being bug-free for my code!


Lisandro, I will subscribe and check out the mailing list, thanks for the
link.

At the moment my examples need to be cleaned up quite a bit. I will put the
examples here, and I
can let you know when they are much better (and better organized).

The files testmpi.f90 and spawn_multiple_loop.py that I wrote were very
helpful once I learned how to write them. The fortran program uses split
communicators to run multiple executables at once on different data (MPMD),
and the spawn_multiple_loop.py file uses mpi4py to call those executables
simultaneously and collect the results.

At the moment I am thinking I'll split this up into 2-3 examples that build
on each other to explain how it works. I definitely need to clean them up
first though, and I'll let you know when they are better.

Thanks,
Jason


Jason Maldonis
Research Assistant of Professor Paul Voyles
Materials Science Grad Student
University of Wisconsin, Madison
1509 University Ave, Rm 202
Madison, WI 53706
maldo...@wisc.edu

On Tue, Nov 1, 2016 at 10:59 AM, Lisandro Dalcin  wrote:

>
> On 31 October 2016 at 20:39, Jason Maldonis  wrote:
>
>> I may also submit bugs to mpi4py, but I don't yet know exactly where the
>> bugs are originating from.  Do any of you know if github is the correct
>> place to submit bugs for mpi4py?
>>
>>
> https://bitbucket.org/mpi4py/mpi4py/issues
>
> You can also write to the mailing list in Google Groups
> https://groups.google.com/forum/#!forum/mpi4py
>
> If you are not sure your issue is mpi4py's fault, I think it is better to ask
> on the mailing list.
>
>
>
>> I have also learned some cool things that are not well documented on the
>> web, and I'd like to provide nice examples or something similar. Can I
>> contribute examples to either mpi4py or OpenMPI?
>>
>
> Indeed, mpi4py documentation is poor. Maybe we could start by adding your
> examples in the wiki https://bitbucket.org/mpi4py/mpi4py/wiki/Home . Do
> you have them online in some repo to take a look?
>
>

Re: [OMPI users] mpi4py+OpenMPI: Qs about submitting bugs and examples

2016-11-01 Thread Lisandro Dalcin
On 31 October 2016 at 20:39, Jason Maldonis  wrote:

> I may also submit bugs to mpi4py, but I don't yet know exactly where the
> bugs are originating from.  Do any of you know if github is the correct
> place to submit bugs for mpi4py?
>
>
https://bitbucket.org/mpi4py/mpi4py/issues

You can also write to the mailing list in Google Groups
https://groups.google.com/forum/#!forum/mpi4py

If you are not sure your issue is mpi4py's fault, I think it is better to ask
on the mailing list.



> I have also learned some cool things that are not well documented on the
> web, and I'd like to provide nice examples or something similar. Can I
> contribute examples to either mpi4py or OpenMPI?
>

Indeed, mpi4py documentation is poor. Maybe we could start by adding your
examples in the wiki https://bitbucket.org/mpi4py/mpi4py/wiki/Home . Do you
have them online in some repo to take a look?


-- 
Lisandro Dalcin

Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
>
> I actually just filed a Github issue to ask this exact question:
>
> https://github.com/open-mpi/ompi/issues/2326
>
>
Good idea, thanks!

Re: [OMPI users] MCA compilation later

2016-11-01 Thread Sean Ahern
That's useful. Thank you.

It sounds like, as long as the component exists for OpenMPI already, it's
just a matter of compiling OpenMPI on a machine that has the headers and
libraries (with appropriate configure flags), and grabbing the individual
component from there.

-Sean

--
Sean Ahern
Computational Engineering International
919-363-0883

On Tue, Nov 1, 2016 at 12:45 AM, r...@open-mpi.org  wrote:

> Here’s a link on how to create components:
>
> https://github.com/open-mpi/ompi/wiki/devel-CreateComponent
>
> and if you want to create a completely new framework:
>
> https://github.com/open-mpi/ompi/wiki/devel-CreateFramework
>
> If you want to distribute a proprietary plugin, you first develop and
> build it within the OMPI code base on your own machines. Then, just take
> the dll for your plugin from the /lib/openmpi directory and
> distribute that “blob”.
>
> I’ll correct my comment: you need the headers and the libraries. You just
> don’t need the hardware, though it means you cannot test those features.
>
>
> On Oct 31, 2016, at 6:19 AM, Sean Ahern  wrote:
>
> Thanks. That's what I expected and hoped. But is there a pointer about how
> to get started? If I've got an existing OpenMPI build, what's the process
> to get a new MCA plugin built with a new set of header files?
>
> (I'm a bit surprised only header files are necessary. Shouldn't the plugin
> require at least runtime linking with a low-level transport library?)
>
> -Sean
>
> --
> Sean Ahern
> Computational Engineering International
> 919-363-0883
>
> On Fri, Oct 28, 2016 at 3:40 PM, r...@open-mpi.org 
> wrote:
>
>> You don’t need any of the hardware - you just need the headers. Things
>> like libfabric and libibverbs are all publicly available, and so you can
>> build all that support even if you cannot run it on your machine.
>>
>> Once your customer installs the binary, the various plugins will check
>> for their required library and hardware and disqualify themselves if it
>> isn’t found.
>>
>> On Oct 28, 2016, at 12:33 PM, Sean Ahern  wrote:
>>
>> There's been discussion on the OpenMPI list recently about static linking
>> of OpenMPI with all of the desired MCAs in it. I've got the opposite
>> question. I'd like to add MCAs later on to an already-compiled version of
>> OpenMPI and am not quite sure how to do it.
>>
>> Let me summarize. We've got a commercial code that we deploy on customer
>> machines in binary form. We're working to integrate OpenMPI into the
>> installer, and things seem to be progressing well. (Note: because we're a
>> commercial code, making the customer compile something doesn't work for us
>> like it can for open source or research codes.)
>>
>> Now, we want to take advantage of OpenMPI's ability to find MCAs at
>> runtime, pointing to the various plugins that might apply to a deployed
>> system. I've configured and compiled OpenMPI on one of our build machines,
>> one that doesn't have any special interconnect hardware or software
>> installed. We take this compiled version of OpenMPI and use it on all of
>> our machines. (Yes, I've read Building FAQ #39
>>  about
>> relocating OpenMPI. Useful, that.) I'd like to take our pre-compiled
>> version of OpenMPI and add MCA libraries to it, giving OpenMPI the ability
>> to communicate via transport mechanisms that weren't available on the
>> original build machine. Things like InfiniBand, OmniPath, or one of Cray's
>> interconnects.
>>
>> How would I go about doing this? And what are the limitations?
>>
>> I'm guessing that I need to go configure and compile the same version of
>> OpenMPI on a machine that has the desired interconnect installation
>> (headers and libraries), then go grab the corresponding
>> lib/openmpi/mca_*{la,so} files. Take those files and drop them in our
>> pre-built OpenMPI from our build machine in the same relative plugin
>> location (lib/openmpi). If I stick with the same compiler (gcc, in this
>> case), I'm hoping that symbols will all resolve themselves at runtime. (I
>> probably will have to do some LD_LIBRARY_PATH games to be sure to find the
>> appropriate underlying libraries unless OpenMPI's process for building MCAs
>> links them in statically somehow.)
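
The copy-the-plugins workflow described in the paragraph above can be sketched as a small script. This is a sketch only: the paths are hypothetical, and the donor tree must come from the exact same Open MPI version and compiler as the deployed build.

```shell
#!/bin/sh
# Sketch: graft MCA plugin DSOs built on an interconnect-equipped machine
# into a pre-built Open MPI tree.  All paths here are hypothetical.
set -eu

copy_plugins() {
    src=$1; dst=$2
    mkdir -p "$dst"
    for f in "$src"/mca_*.so; do
        [ -e "$f" ] || continue   # no plugins found: nothing to copy
        cp "$f" "$dst/"
    done
}

# Demo with throwaway directories standing in for the real trees.
SRC=$(mktemp -d)                   # stand-in for donor lib/openmpi
DST=$(mktemp -d)/lib/openmpi       # stand-in for deployed lib/openmpi
touch "$SRC/mca_btl_openib.so" "$SRC/mca_btl_self.so"
copy_plugins "$SRC" "$DST"
ls "$DST"    # lists the grafted plugin files
```

Whether the grafted plugins resolve their symbols at runtime still depends on the underlying transport libraries being present on the target, as discussed above.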
>>
>> Am I even on the right track here? (The various system-level FAQs (here
>> , here
>> , and especially here
>> ) seem to suggest that
>> I am.)
>>
>> Our first test platform will be getting OpenMPI via IB working on our
>> cluster, where we have IB (and TCP/IP) functional and not OpenMPI. This
>> will be a great stand-in for a customer that has an IB cluster and wants to
>> just run our binary installation.
>>
>> Thanks.
>>
>> -Sean
>>
>> --
>> Sean Ahern
>> Computational Engineering 

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Jeff Squyres (jsquyres)
I actually just filed a Github issue to ask this exact question:

https://github.com/open-mpi/ompi/issues/2326


> On Nov 1, 2016, at 9:49 AM, Sergei Hrushev  wrote:
> 
> 
> I haven't worked with InfiniBand for years, but I do believe that yes: you 
> need IPoIB enabled on your IB devices to get the RDMA CM support to work.
> 
> 
> Yes, I also saw that RDMA CM requires IP, but in my case Open MPI reports that 
> UD CM can't be used either.
> Does it also require IPoIB?
> 
> Is it possible to read more about UD CM somewhere?
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
>
>
> I haven't worked with InfiniBand for years, but I do believe that yes: you
> need IPoIB enabled on your IB devices to get the RDMA CM support to work.
>
>
Yes, I also saw that RDMA CM requires IP, but in my case Open MPI reports
that UD CM can't be used either.
Does it also require IPoIB?

Is it possible to read more about UD CM somewhere?

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Jeff Squyres (jsquyres)
On Nov 1, 2016, at 2:40 AM, Sergei Hrushev  wrote:
> 
> Yes, I tried to get this info already.
> And I saw in log that rdmacm wants IP address on port.
> So my question in the topic's first message was:
> Is it enough for Open MPI to have RDMA only, or should IPoIB also be
> installed?

Sorry; I joined the thread late.

I haven't worked with InfiniBand for years, but I do believe that yes: you need 
IPoIB enabled on your IB devices to get the RDMA CM support to work.
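
One quick way to check that prerequisite is to look for an IPv4 address on an IPoIB interface (ib0, ib1, ...). A minimal sketch, parsing `ip -o -4 addr show` output; the sample addresses below are made up:

```shell
#!/bin/sh
# Sketch: does any IPoIB interface (ibN) carry an IPv4 address?
# rdmacm needs one; without it the openib BTL skips the rdmacm CPC.
set -eu

has_ipoib_addr() {
    # stdin: output of `ip -o -4 addr show` (one interface per line,
    # field 2 is the interface name)
    awk '$2 ~ /^ib[0-9]+$/ { found = 1 } END { exit !found }'
}

# Demo on canned output; on a real node you would run:
#   ip -o -4 addr show | has_ipoib_addr && echo "IPoIB up"
sample='2: eth0    inet 10.0.0.5/24 brd 10.0.0.255 scope global eth0
3: ib0    inet 192.168.10.5/24 brd 192.168.10.255 scope global ib0'
if printf '%s\n' "$sample" | has_ipoib_addr; then
    echo "IPoIB address present"
else
    echo "no IPoIB address: rdmacm will be skipped"
fi
```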

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
Hi John !

I'm experimenting now with a head node and single compute node, all the
rest of cluster is switched off.

can you run :
>
> ibhosts
>

# ibhosts
Ca  : 0x7cfe900300bddec0 ports 1 "MT25408 ConnectX Mellanox
Technologies"
Ca  : 0xe41d2d030050caf0 ports 1 "MT25408 ConnectX Mellanox
Technologies"


>
> ibstat
>

# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 1
Firmware version: 2.35.5100
Hardware version: 0
Node GUID: 0xe41d2d030050caf0
System image GUID: 0xe41d2d030050caf3
Port 1:
State: Active
Physical state: LinkUp
Rate: 56
Base lid: 1
LMC: 0
SM lid: 3
Capability mask: 0x0251486a
Port GUID: 0xe41d2d030050caf1
Link layer: InfiniBand


>
> ibdiagnet
>
# ibdiagnet
# cat ibdiagnet.log
-W- Topology file is not specified.
Reports regarding cluster links will use direct routes.
-I- Using port 1 as the local port.
-I- Discovering ... 3 nodes (1 Switches & 2 CA-s) discovered.


-I---
-I- Bad Guids/LIDs Info
-I---
-I- No bad Guids were found

-I---
-I- Links With Logical State = INIT
-I---
-I- No bad Links (with logical state = INIT) were found

-I---
-I- General Device Info
-I---

-I---
-I- PM Counters Info
-I---
-I- No illegal PM counters values were found

-I---
-I- Fabric Partitions Report (see ibdiagnet.pkey for a full hosts list)
-I---
-I-PKey:0x7fff Hosts:2 full:2 limited:0

-I---
-I- IPoIB Subnets Check
-I---
-I- Subnet: IPv4 PKey:0x7fff QKey:0x0b1b MTU:2048Byte rate:10Gbps
SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

-I---
-I- Bad Links Info
-I- No bad link were found
-I---

-I- Done. Run time was 2 seconds.


>
> Lord help me for being so naive, but do you have a subnet manager running?
>

It seems so, yes (I even have a standby):

# service --status-all | grep opensm
 [ + ]  opensm

# cat ibdiagnet.sm

ibdiagnet fabric SM report

  SM - master
MT25408/P1 lid=0x0003 guid=0x7cfe900300bddec1 dev=4099 priority:0

  SM - standby
The Local Device : MT25408/P1 lid=0x0001 guid=0xe41d2d030050caf1
dev=4099 priority:0

Best regards,
Sergei.

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread John Hearns via users
Sergei,
can you run :

ibhosts

ibstat

ibdiagnet


Lord help me for being so naive, but do you have a subnet manager running?



On 1 November 2016 at 06:40, Sergei Hrushev  wrote:

> Hi Jeff !
>
> What does "ompi_info | grep openib" show?
>>
>>
> $ ompi_info | grep openib
>  MCA btl: openib (MCA v2.0.0, API v2.0.0, Component
> v1.10.2)
>
> Additionally, Mellanox provides alternate support through their MXM
>> libraries, if you want to try that.
>>
>
> Yes, I know.
> But we already have a hybrid cluster with OpenMPI, OpenMP, CUDA, Torque,
> and many other libraries installed, and because it works perfectly over the
> Ethernet interconnect, my idea was to add InfiniBand support with a minimum
> of changes. Mainly because we already have some custom-written software
> for OpenMPI.
>
>
>> If that shows that you have the openib BTL plugin loaded, try running
>> with "mpirun --mca btl_base_verbose 100 ..."  That will provide additional
>> output about why / why not each point-to-point plugin is chosen.
>>
>>
> Yes, I tried to get this info already.
> And I saw in log that rdmacm wants IP address on port.
> So my question in the topic's first message was:
>
> Is it enough for Open MPI to have RDMA only, or should IPoIB also be
> installed?
>
> The mpirun output is:
>
> [node1:02674] mca: base: components_register: registering btl components
> [node1:02674] mca: base: components_register: found loaded component openib
> [node1:02674] mca: base: components_register: component openib register
> function successful
> [node1:02674] mca: base: components_register: found loaded component sm
> [node1:02674] mca: base: components_register: component sm register
> function successful
> [node1:02674] mca: base: components_register: found loaded component self
> [node1:02674] mca: base: components_register: component self register
> function successful
> [node1:02674] mca: base: components_open: opening btl components
> [node1:02674] mca: base: components_open: found loaded component openib
> [node1:02674] mca: base: components_open: component openib open function
> successful
> [node1:02674] mca: base: components_open: found loaded component sm
> [node1:02674] mca: base: components_open: component sm open function
> successful
> [node1:02674] mca: base: components_open: found loaded component self
> [node1:02674] mca: base: components_open: component self open function
> successful
> [node1:02674] select: initializing btl component openib
> [node1:02674] openib BTL: rdmacm IP address not found on port
> [node1:02674] openib BTL: rdmacm CPC unavailable for use on mlx4_0:1;
> skipped
> [node1:02674] select: init of component openib returned failure
> [node1:02674] mca: base: close: component openib closed
> [node1:02674] mca: base: close: unloading component openib
> [node1:02674] select: initializing btl component sm
> [node1:02674] select: init of component sm returned failure
> [node1:02674] mca: base: close: component sm closed
> [node1:02674] mca: base: close: unloading component sm
> [node1:02674] select: initializing btl component self
> [node1:02674] select: init of component self returned success
> [node1:02674] mca: bml: Using self btl to [[16642,1],0] on node node1
> [node1:02674] mca: base: close: component self closed
> [node1:02674] mca: base: close: unloading component self
>
> Best regards,
> Sergei.
>
>
>

Re: [OMPI users] Redusing libmpi.so size....

2016-11-01 Thread Jeff Squyres (jsquyres)
Run ompi_info; it will tell you all the plugins that are installed.
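
For a quick inventory, the `ompi_info` output can be reduced to framework/component pairs, since its lines look like "MCA btl: openib (MCA v2.0.0, ...)". A small sketch that parses a canned sample; on a real installation you would pipe `ompi_info` itself:

```shell
#!/bin/sh
# Sketch: turn ompi_info output into "framework component" pairs.
set -eu

list_components() {
    # stdin: ompi_info output
    sed -n 's/^ *MCA \([a-z0-9_]*\): \([a-z0-9_]*\).*/\1 \2/p'
}

# Canned sample; real use:  ompi_info | list_components
sample=' MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2)
 MCA btl: self (MCA v2.0.0, API v2.0.0, Component v1.10.2)
 MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.2)'
printf '%s\n' "$sample" | list_components   # one "framework component" per line
```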

> On Nov 1, 2016, at 2:13 AM, Mahesh Nanavalla  
> wrote:
> 
> Hi Jeff Squyres,
> 
> Thank you for your reply...
> 
> My problem is that I want to reduce the library size by removing unwanted plugins.
> 
> Here libmpi.so.12.0.3 is 2.4 MB.
> 
> How can I find out which plugins were included when building libmpi.so.12.0.3, 
> and how can I remove them?
> 
> Thanks,
> Mahesh N
> 
> On Fri, Oct 28, 2016 at 7:09 PM, Jeff Squyres (jsquyres)  
> wrote:
> On Oct 28, 2016, at 8:12 AM, Mahesh Nanavalla  
> wrote:
> >
> > i have configured as below for arm
> >
> > ./configure --enable-orterun-prefix-by-default  
> > --prefix="/home/nmahesh/Workspace/ARM_MPI/openmpi" 
> > CC=arm-openwrt-linux-muslgnueabi-gcc CXX=arm-openwrt-linux-muslgnueabi-g++ 
> > --host=arm-openwrt-linux-muslgnueabi --enable-script-wrapper-compilers 
> > --disable-mpi-fortran --enable-dlopen --enable-shared --disable-vt 
> > --disable-java --disable-libompitrace --disable-static
> 
> Note that there is a tradeoff here: --enable-dlopen will reduce the size of 
> libmpi.so by splitting out all the plugins into separate DSOs (dynamic shared 
> objects -- i.e., individual .so plugin files).  But note that some of the plugins 
> are quite small in terms of code.  I mention this because when you dlopen a 
> DSO, it will load in DSOs in units of pages.  So even if a DSO only has 1KB 
> of code, it will use a full page of bytes in your running process (e.g., 4KB 
> -- or whatever the page size is on your system).
> 
> On the other hand, if you --disable-dlopen, then all of Open MPI's plugins 
> are slurped into libmpi.so (and friends).  Meaning: no DSOs, no dlopen, no 
> page-boundary-loading behavior.  This allows the compiler/linker to pack in 
> all the plugins into memory more efficiently (because they'll be compiled as 
> part of libmpi.so, and all the code is packed in there -- just like any other 
> library).  Your total memory usage in the process may be smaller.
> 
> Sidenote: if you run more than one MPI process per node, then libmpi.so (and 
> friends) will be shared between processes.  You're assumedly running in an 
> embedded environment, so I don't know if this factor matters (i.e., I don't 
> know if you'll run with ppn>1), but I thought I'd mention it anyway.
> 
> On the other hand (that's your third hand, for those at home counting...), 
> you may not want to include *all* the plugins.  I.e., there may be a bunch of 
> plugins that you're not actually using, and therefore if they are compiled in 
> as part of libmpi.so (and friends), they're consuming space that you don't 
> want/need.  So the dlopen mechanism might actually be better -- because Open 
> MPI may dlopen a plugin at run time, determine that it won't be used, and 
> then dlclose it (i.e., release the memory that would have been used for it).
> 
> On the other (fourth!) hand, you can actually tell Open MPI to *not* build 
> specific plugins with the --enable-dso-no-build=LIST configure option.  I.e., 
> if you know exactly what plugins you want to use, you can negate the ones 
> that you *don't* want to use on the configure line, use --disable-static and 
> --disable-dlopen, and you'll likely use the least amount of memory.  This is 
> admittedly a bit clunky, but Open MPI's configure process was (obviously) not 
> optimized for this use case -- it's much more optimized to the "build 
> everything possible, and figure out which to use at run time" use case.
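
Concretely, a trimmed configure invocation along those lines might look like the following. The excluded components are purely illustrative, not a recommendation; pick yours from the verbose-run output described below.

```shell
# Hypothetical trimmed cross-build: --enable-mca-no-build takes a
# comma-separated list of <framework>-<component> pairs to skip.
./configure \
    --prefix=/home/nmahesh/Workspace/ARM_MPI/openmpi \
    --host=arm-openwrt-linux-muslgnueabi \
    CC=arm-openwrt-linux-muslgnueabi-gcc \
    CXX=arm-openwrt-linux-muslgnueabi-g++ \
    --disable-mpi-fortran --disable-static --disable-dlopen \
    --enable-mca-no-build=btl-usnic,io-romio
make && make install
```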
> 
> If you really want to hit rock bottom on MPI process size in your embedded 
> environment, you can do some experimentation to figure out exactly which 
> components you need.  You can use repeated runs with "mpirun --mca 
> ABC_base_verbose 100 ...", where "ABC" is each of Open MPI's framework names 
> ("framework" = collection of plugins of the same type).  This verbose output 
> will show you exactly which components are opened, which ones are used, and 
> which ones are discarded.  You can build up a list of all the discarded 
> components and --enable-mca-no-build them.
> 
> > While I am running using mpirun,
> > I am getting the following error:
> > root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 1 
> > /usr/bin/openmpiWiFiBulb
> > --
> > Sorry!  You were supposed to get help about:
> > opal_init:startup:internal-failure
> > But I couldn't open the help file:
> > 
> > /home/nmahesh/Workspace/ARM_MPI/openmpi/share/openmpi/help-opal-runtime.txt:
> >  No such file or directory.  Sorry!
> 
> So this is really two errors:
> 
> 1. The help message file is not being found.
> 2. Something is obviously going wrong during opal_init() (which is one of 
> Open MPI's startup functions).
> 
> For #1, when I do a default build of Open MPI 1.10.3, that file *is* 
> installed.  Are you trimming the installation tree, 

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Sergei Hrushev
Hi Jeff !

What does "ompi_info | grep openib" show?
>
>
$ ompi_info | grep openib
 MCA btl: openib (MCA v2.0.0, API v2.0.0, Component v1.10.2)

Additionally, Mellanox provides alternate support through their MXM
> libraries, if you want to try that.
>

Yes, I know.
But we already have a hybrid cluster with OpenMPI, OpenMP, CUDA, Torque, and
many other libraries installed, and because it works perfectly over the
Ethernet interconnect, my idea was to add InfiniBand support with a minimum
of changes. Mainly because we already have some custom-written software for
OpenMPI.


> If that shows that you have the openib BTL plugin loaded, try running with
> "mpirun --mca btl_base_verbose 100 ..."  That will provide additional
> output about why / why not each point-to-point plugin is chosen.
>
>
Yes, I tried to get this info already.
And I saw in log that rdmacm wants IP address on port.
So my question in the topic's first message was:

Is it enough for Open MPI to have RDMA only, or should IPoIB also be
installed?

The mpirun output is:

[node1:02674] mca: base: components_register: registering btl components
[node1:02674] mca: base: components_register: found loaded component openib
[node1:02674] mca: base: components_register: component openib register
function successful
[node1:02674] mca: base: components_register: found loaded component sm
[node1:02674] mca: base: components_register: component sm register
function successful
[node1:02674] mca: base: components_register: found loaded component self
[node1:02674] mca: base: components_register: component self register
function successful
[node1:02674] mca: base: components_open: opening btl components
[node1:02674] mca: base: components_open: found loaded component openib
[node1:02674] mca: base: components_open: component openib open function
successful
[node1:02674] mca: base: components_open: found loaded component sm
[node1:02674] mca: base: components_open: component sm open function
successful
[node1:02674] mca: base: components_open: found loaded component self
[node1:02674] mca: base: components_open: component self open function
successful
[node1:02674] select: initializing btl component openib
[node1:02674] openib BTL: rdmacm IP address not found on port
[node1:02674] openib BTL: rdmacm CPC unavailable for use on mlx4_0:1;
skipped
[node1:02674] select: init of component openib returned failure
[node1:02674] mca: base: close: component openib closed
[node1:02674] mca: base: close: unloading component openib
[node1:02674] select: initializing btl component sm
[node1:02674] select: init of component sm returned failure
[node1:02674] mca: base: close: component sm closed
[node1:02674] mca: base: close: unloading component sm
[node1:02674] select: initializing btl component self
[node1:02674] select: init of component self returned success
[node1:02674] mca: bml: Using self btl to [[16642,1],0] on node node1
[node1:02674] mca: base: close: component self closed
[node1:02674] mca: base: close: unloading component self

Best regards,
Sergei.

Re: [OMPI users] Redusing libmpi.so size....

2016-11-01 Thread Mahesh Nanavalla
Hi all,

Thank you for your reply...

My problem is that I want to reduce the library size by removing unwanted plugins.

Here libmpi.so.12.0.3 is 2.4 MB.

How can I find out which plugins were included when building
libmpi.so.12.0.3, and how can I remove them?

Thanks,
Mahesh N

On Tue, Nov 1, 2016 at 11:43 AM, Mahesh Nanavalla <
mahesh.nanavalla...@gmail.com> wrote:

> Hi Jeff Squyres,
>
> Thank you for your reply...
>
> My problem is that I want to reduce the library size by removing unwanted
> plugins.
>
> Here libmpi.so.12.0.3 is 2.4 MB.
>
> How can I find out which plugins were included when building
> libmpi.so.12.0.3, and how can I remove them?
>
> Thanks,
> Mahesh N
>
> On Fri, Oct 28, 2016 at 7:09 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> On Oct 28, 2016, at 8:12 AM, Mahesh Nanavalla <
>> mahesh.nanavalla...@gmail.com> wrote:
>> >
>> > i have configured as below for arm
>> >
>> > ./configure --enable-orterun-prefix-by-default
>> --prefix="/home/nmahesh/Workspace/ARM_MPI/openmpi"
>> CC=arm-openwrt-linux-muslgnueabi-gcc CXX=arm-openwrt-linux-muslgnueabi-g++
>> --host=arm-openwrt-linux-muslgnueabi --enable-script-wrapper-compilers
>> --disable-mpi-fortran --enable-dlopen --enable-shared --disable-vt
>> --disable-java --disable-libompitrace --disable-static
>>
>> Note that there is a tradeoff here: --enable-dlopen will reduce the size
>> of libmpi.so by splitting out all the plugins into separate DSOs (dynamic
>> shared objects -- i.e., individual .so plugin files).  But note that some
>> of the plugins are quite small in terms of code.  I mention this because when
>> you dlopen a DSO, it will load in DSOs in units of pages.  So even if a DSO
>> only has 1KB of code, it will use a full page of bytes in your running
>> process (e.g., 4KB -- or whatever the page size is on your system).
>>
>> On the other hand, if you --disable-dlopen, then all of Open MPI's
>> plugins are slurped into libmpi.so (and friends).  Meaning: no DSOs, no
>> dlopen, no page-boundary-loading behavior.  This allows the compiler/linker
>> to pack in all the plugins into memory more efficiently (because they'll be
>> compiled as part of libmpi.so, and all the code is packed in there -- just
>> like any other library).  Your total memory usage in the process may be
>> smaller.
>>
>> Sidenote: if you run more than one MPI process per node, then libmpi.so
>> (and friends) will be shared between processes.  You're assumedly running
>> in an embedded environment, so I don't know if this factor matters (i.e., I
>> don't know if you'll run with ppn>1), but I thought I'd mention it anyway.
>>
>> On the other hand (that's your third hand, for those at home
>> counting...), you may not want to include *all* the plugins.  I.e., there
>> may be a bunch of plugins that you're not actually using, and therefore if
>> they are compiled in as part of libmpi.so (and friends), they're consuming
>> space that you don't want/need.  So the dlopen mechanism might actually be
>> better -- because Open MPI may dlopen a plugin at run time, determine that
>> it won't be used, and then dlclose it (i.e., release the memory that would
>> have been used for it).
>>
>> On the other (fourth!) hand, you can actually tell Open MPI to *not*
>> build specific plugins with the --enable-dso-no-build=LIST configure
>> option.  I.e., if you know exactly what plugins you want to use, you can
>> negate the ones that you *don't* want to use on the configure line, use
>> --disable-static and --disable-dlopen, and you'll likely use the least
>> amount of memory.  This is admittedly a bit clunky, but Open MPI's
>> configure process was (obviously) not optimized for this use case -- it's
>> much more optimized to the "build everything possible, and figure out which
>> to use at run time" use case.
>>
>> If you really want to hit rock bottom on MPI process size in your
>> embedded environment, you can do some experimentation to figure out exactly
>> which components you need.  You can use repeated runs with "mpirun --mca
>> ABC_base_verbose 100 ...", where "ABC" is each of Open MPI's framework
>> names ("framework" = collection of plugins of the same type).  This verbose
>> output will show you exactly which components are opened, which ones are
>> used, and which ones are discarded.  You can build up a list of all the
>> discarded components and --enable-mca-no-build them.
>>
>> > While I am running using mpirun,
>> > I am getting the following error:
>> > root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 1
>> /usr/bin/openmpiWiFiBulb
>> > 
>> --
>> > Sorry!  You were supposed to get help about:
>> > opal_init:startup:internal-failure
>> > But I couldn't open the help file:
>> > 
>> > /home/nmahesh/Workspace/ARM_MPI/openmpi/share/openmpi/help-opal-runtime.txt:
>> No such file or directory.  Sorry!
>>
>> So this is really two errors:
>>
>> 1. The help message file is not being found.
>> 2. 

Re: [OMPI users] Redusing libmpi.so size....

2016-11-01 Thread Mahesh Nanavalla
Hi Jeff Squyres,

Thank you for your reply...

My problem is that I want to reduce the library size by removing unwanted plugins.

Here libmpi.so.12.0.3 is 2.4 MB.

How can I find out which plugins were included when building
libmpi.so.12.0.3, and how can I remove them?

Thanks,
Mahesh N

On Fri, Oct 28, 2016 at 7:09 PM, Jeff Squyres (jsquyres)  wrote:

> On Oct 28, 2016, at 8:12 AM, Mahesh Nanavalla <
> mahesh.nanavalla...@gmail.com> wrote:
> >
> > i have configured as below for arm
> >
> > ./configure --enable-orterun-prefix-by-default  
> > --prefix="/home/nmahesh/Workspace/ARM_MPI/openmpi"
> CC=arm-openwrt-linux-muslgnueabi-gcc CXX=arm-openwrt-linux-muslgnueabi-g++
> --host=arm-openwrt-linux-muslgnueabi --enable-script-wrapper-compilers
> --disable-mpi-fortran --enable-dlopen --enable-shared --disable-vt
> --disable-java --disable-libompitrace --disable-static
>
> Note that there is a tradeoff here: --enable-dlopen will reduce the size
> of libmpi.so by splitting out all the plugins into separate DSOs (dynamic
> shared objects -- i.e., individual .so plugin files).  But note that some
> of the plugins are quite small in terms of code.  I mention this because when
> you dlopen a DSO, it will load in DSOs in units of pages.  So even if a DSO
> only has 1KB of code, it will use a full page of bytes in your running
> process (e.g., 4KB -- or whatever the page size is on your system).
>
> On the other hand, if you --disable-dlopen, then all of Open MPI's plugins
> are slurped into libmpi.so (and friends).  Meaning: no DSOs, no dlopen, no
> page-boundary-loading behavior.  This allows the compiler/linker to pack in
> all the plugins into memory more efficiently (because they'll be compiled
> as part of libmpi.so, and all the code is packed in there -- just like any
> other library).  Your total memory usage in the process may be smaller.
>
> Sidenote: if you run more than one MPI process per node, then libmpi.so
> (and friends) will be shared between processes.  You're assumedly running
> in an embedded environment, so I don't know if this factor matters (i.e., I
> don't know if you'll run with ppn>1), but I thought I'd mention it anyway.
>
> On the other hand (that's your third hand, for those at home counting...),
> you may not want to include *all* the plugins.  I.e., there may be a bunch
> of plugins that you're not actually using, and therefore if they are
> compiled in as part of libmpi.so (and friends), they're consuming space
> that you don't want/need.  So the dlopen mechanism might actually be better
> -- because Open MPI may dlopen a plugin at run time, determine that it
> won't be used, and then dlclose it (i.e., release the memory that would
> have been used for it).
>
> On the other (fourth!) hand, you can actually tell Open MPI to *not* build
> specific plugins with the --enable-dso-no-build=LIST configure option.
> I.e., if you know exactly what plugins you want to use, you can negate the
> ones that you *don't* want to use on the configure line, use
> --disable-static and --disable-dlopen, and you'll likely use the least
> amount of memory.  This is admittedly a bit clunky, but Open MPI's
> configure process was (obviously) not optimized for this use case -- it's
> much more optimized to the "build everything possible, and figure out which
> to use at run time" use case.
>
> If you really want to hit rock bottom on MPI process size in your embedded
> environment, you can do some experimentation to figure out exactly which
> components you need.  You can use repeated runs with "mpirun --mca
> ABC_base_verbose 100 ...", where "ABC" is each of Open MPI's framework
> names ("framework" = collection of plugins of the same type).  This verbose
> output will show you exactly which components are opened, which ones are
> used, and which ones are discarded.  You can build up a list of all the
> discarded components and --enable-mca-no-build them.
>
> > While I am running using mpirun,
> > I am getting the following error:
> > root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 1
> /usr/bin/openmpiWiFiBulb
> > 
> --
> > Sorry!  You were supposed to get help about:
> > opal_init:startup:internal-failure
> > But I couldn't open the help file:
> > 
> > /home/nmahesh/Workspace/ARM_MPI/openmpi/share/openmpi/help-opal-runtime.txt:
> No such file or directory.  Sorry!
>
> So this is really two errors:
>
> 1. The help message file is not being found.
> 2. Something is obviously going wrong during opal_init() (which is one of
> Open MPI's startup functions).
>
> For #1, when I do a default build of Open MPI 1.10.3, that file *is*
> installed.  Are you trimming the installation tree, perchance?  If so, if
> you can put at least that one file back in its installation location (it's
> in the Open MPI source tarball), it might reveal more information on
> exactly what is failing.
>
> Additionally, I wonder if shared memory is