Re: [OMPI users] mpirun works with cmd line call , but not with app context file arg

2016-10-16 Thread Gilles Gouaillardet
Out of curiosity, why do you specify both --hostfile and -H ?
Do you observe the same behavior without --hostfile ~/.mpihosts ?

Also, do you have at least 4 cores on both A.lan and B.lan ?

Cheers,

Gilles

On Sunday, October 16, 2016, MM  wrote:

> Hi,
>
> openmpi 1.10.3
>
> this call:
>
> mpirun --hostfile ~/.mpihosts -H localhost -np 1 prog1 : -H A.lan -np
> 4 prog2 : -H B.lan -np 4 prog2
>
> works, yet this one:
>
> mpirun --hostfile ~/.mpihosts --app ~/.mpiapp
>
> doesn't.  where ~/.mpiapp
>
> -H localhost -np 1 prog1
> -H A.lan -np 4 prog2
> -H B.lan -np 4 prog2
>
> it says
>
> "There are not enough slots available in the system to satisfy the 4 slots
> that were requested by the application:
>   prog2
> Either request fewer slots for your application, or make more slots
> available
> for use".
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] communications groups

2016-10-17 Thread Gilles Gouaillardet
Rick,

So you have three types of tasks
- 1 dispatcher
- several sensors
- several proxies

If proxies do not communicate with each other, and if sensors do not
communicate with each other, then you could end up with 3 inter
communicators
sensorComm: dispatcher in the left group and sensors in the right group
proxyComm: dispatcher in the left group and proxies in the right group
controlComm: sensors in the left group and proxies in the right group

Does that fit your needs ?
If yes, then keep in mind sensorComm is MPI_COMM_NULL on the proxy tasks,
proxyComm is MPI_COMM_NULL on the sensor tasks, and controlComm is
MPI_COMM_NULL on the dispatcher.

Cheers,

Gilles

On Monday, October 17, 2016, Marlborough, Rick <rmarlboro...@aaccorp.com>
wrote:

> Designation: Non-Export Controlled Content
>
> Gilles;
>
> My scenario involves a Dispatcher of rank 0, and several
> sensors and proxy objects. The Dispatcher triggers activity and gathers
> results. The proxies get triggered first. They send data to the sensors,
> and the sensors indicate to the dispatcher that they are done. I am trying
> to create 2 comm groups. One for the sensors and one for the proxies. The
> dispatcher will use the 2 comm groups to coordinate activity. I tried
> adding the dispatcher to the sensorList comm group, but I get an error
> saying “invalid task”.
>
>
>
> Rick
>
>
>
> *From:* users [mailto:users-boun...@lists.open-mpi.org
> <javascript:_e(%7B%7D,'cvml','users-boun...@lists.open-mpi.org');>] *On
> Behalf Of *Gilles Gouaillardet
> *Sent:* Monday, October 17, 2016 9:30 AM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] communications groups
>
>
>
> Rick,
>
>
>
> I re-read the MPI standard and was unable to figure out if sensorgroup is
> MPI_GROUP_EMPTY or a group with task 1 on tasks except task 1
>
> (A group that does not contain the current task makes little sense to me,
> but I do not see any reason why this group have to be MPI_GROUP_EMPTY)
>
>
>
> Regardless, sensorComm will be MPI_COMM_NULL except on task 1, so
> MPI_Barrier will fail.
>
>
>
> Cheers,
>
>
>
> Gilles
>
> On Monday, October 17, 2016, Marlborough, Rick <rmarlboro...@aaccorp.com
> <javascript:_e(%7B%7D,'cvml','rmarlboro...@aaccorp.com');>> wrote:
>
> Designation: Non-Export Controlled Content
>
> George;
>
> Thanks for your response. Your second sentence is a little
> confusing. If my world group is P0,P1, visible on both processes, why
> wouldn’t the sensorList contain P1 on both processes?
>
>
>
> Rick
>
>
>
>
>
> *From:* users [mailto:users-boun...@lists.open-mpi.org] *On Behalf Of *George
> Bosilca
> *Sent:* Friday, October 14, 2016 5:44 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] communications groups
>
>
>
> Rick,
>
>
>
> Let's assume that you have started 2 processes, and that your sensorList
> is {1}. The worldgroup will then be {P0, P1}, which trimmed via the
> sensorList will give the sensorgroup {MPI_GROUP_EMPTY} on P0 and the
> sensorgroup {P1} on P1. As a result on P0 you will create a
> MPI_COMM_NULL communicator, while on P1 you will have a valid communicator
> sensorComm (which will only contain P1). You cannot do a Barrier on an
> MPI_COMM_NULL communicator, which might explain the "invalid communicator"
> error you are getting.
>
>
>
> George.
>
>
>
>
>
> On Fri, Oct 14, 2016 at 5:33 PM, Marlborough, Rick <
> rmarlboro...@aaccorp.com> wrote:
>
> Designation: Non-Export Controlled Content
>
> Folks;
>
> I have the following code setup. The sensorList is an
> array of ints of size 1. The value it contains is 1. My comm world size is
> 5. The call to MPI_Barrier fails every time with error “invalid
> communicator”. This code is pretty much copied out of a text book. I must
> be doing something wrong. I just don’t see it. Can anyone else spot my
> error? I am using v2.01 on red hat 6.5.
>
>
>
> Thanks
>
> Rick
>
>
>
>
>
> MPI_Comm_group(MPI_COMM_WORLD, );
>
> MPI_Group_incl(worldgroup, 1, sensorList, );
>
> MPI_Comm_create(MPI_COMM_WORLD, sensorgroup, );
>
> MPI_Barrier(sensorComm);
>
> 3.1.1001
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
> 3.1.1001
>
> 3.1.1001
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] communications groups

2016-10-17 Thread Gilles Gouaillardet
Rick,

In my understanding, sensorgroup is a group with only task 1
Consequently, sensorComm is
- similar to MPI_COMM_SELF on task 1
- MPI_COMM_NULL on other tasks, and hence the barrier fails

I suggest you double check sensorgroup is never MPI_GROUP_EMPTY
and add a test not to call MPI_Barrier on MPI_COMM_NULL

Makes sense ?

Cheers,

Gilles

On Saturday, October 15, 2016, Marlborough, Rick 
wrote:

> Designation: Non-Export Controlled Content
>
> Folks;
>
> I have the following code setup. The sensorList is an
> array of ints of size 1. The value it contains is 1. My comm world size is
> 5. The call to MPI_Barrier fails every time with error “invalid
> communicator”. This code is pretty much copied out of a text book. I must
> be doing something wrong. I just don’t see it. Can anyone else spot my
> error? I am using v2.01 on red hat 6.5.
>
>
>
> Thanks
>
> Rick
>
>
>
>
>
> MPI_Comm_group(MPI_COMM_WORLD, );
>
> MPI_Group_incl(worldgroup, 1, sensorList, );
>
> MPI_Comm_create(MPI_COMM_WORLD, sensorgroup, );
>
> MPI_Barrier(sensorComm);
>
> 3.1.1001
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] communications groups

2016-10-17 Thread Gilles Gouaillardet
Rick,

I re-read the MPI standard and was unable to figure out if sensorgroup is
MPI_GROUP_EMPTY or a group with task 1 on tasks except task 1
(A group that does not contain the current task makes little sense to me,
but I do not see any reason why this group have to be MPI_GROUP_EMPTY)

Regardless, sensorComm will be MPI_COMM_NULL except on task 1, so
MPI_Barrier will fail.

Cheers,

Gilles

On Monday, October 17, 2016, Marlborough, Rick 
wrote:

> Designation: Non-Export Controlled Content
>
> George;
>
> Thanks for your response. Your second sentence is a little
> confusing. If my world group is P0,P1, visible on both processes, why
> wouldn’t the sensorList contain P1 on both processes?
>
>
>
> Rick
>
>
>
>
>
> *From:* users [mailto:users-boun...@lists.open-mpi.org
> ] *On
> Behalf Of *George Bosilca
> *Sent:* Friday, October 14, 2016 5:44 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] communications groups
>
>
>
> Rick,
>
>
>
> Let's assume that you have started 2 processes, and that your sensorList
> is {1}. The worldgroup will then be {P0, P1}, which trimmed via the
> sensorList will give the sensorgroup {MPI_GROUP_EMPTY} on P0 and the
> sensorgroup {P1} on P1. As a result on P0 you will create a
> MPI_COMM_NULL communicator, while on P1 you will have a valid communicator
> sensorComm (which will only contain P1). You cannot do a Barrier on an
> MPI_COMM_NULL communicator, which might explain the "invalid communicator"
> error you are getting.
>
>
>
> George.
>
>
>
>
>
> On Fri, Oct 14, 2016 at 5:33 PM, Marlborough, Rick <
> rmarlboro...@aaccorp.com
> > wrote:
>
> Designation: Non-Export Controlled Content
>
> Folks;
>
> I have the following code setup. The sensorList is an
> array of ints of size 1. The value it contains is 1. My comm world size is
> 5. The call to MPI_Barrier fails every time with error “invalid
> communicator”. This code is pretty much copied out of a text book. I must
> be doing something wrong. I just don’t see it. Can anyone else spot my
> error? I am using v2.01 on red hat 6.5.
>
>
>
> Thanks
>
> Rick
>
>
>
>
>
> MPI_Comm_group(MPI_COMM_WORLD, );
>
> MPI_Group_incl(worldgroup, 1, sensorList, );
>
> MPI_Comm_create(MPI_COMM_WORLD, sensorgroup, );
>
> MPI_Barrier(sensorComm);
>
> 3.1.1001
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
> 3.1.1001
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-08 Thread Gilles Gouaillardet
Christof,


There is something really odd with this stack trace.
count is zero, and some pointers do not point to valid addresses (!)

in OpenMPI, MPI_Allreduce(...,count=0,...) is a no-op, so that suggests that
the stack has been corrupted inside MPI_Allreduce(), or that you are not
using the library you think you use
pmap  will show you which lib is used

btw, this was not started with
mpirun --mca coll ^tuned ...
right ?

just to make it clear ...
a task from your program bluntly issues a fortran STOP, and this is kind of
a feature.
the *only* issue is mpirun does not kill the other MPI tasks and mpirun
never completes.
did i get it right ?

Cheers,

Gilles

On Thursday, December 8, 2016, Christof Koehler <
christof.koeh...@bccms.uni-bremen.de> wrote:

> Hello everybody,
>
> I tried it with the nightly and the direct 2.0.2 branch from git which
> according to the log should contain that patch
>
> commit d0b97d7a408b87425ca53523de369da405358ba2
> Merge: ac8c019 b9420bb
> Author: Jeff Squyres <jsquy...@users.noreply.github.com <javascript:;>>
> Date:   Wed Dec 7 18:24:46 2016 -0500
> Merge pull request #2528 from rhc54/cmr20x/signals
>
> Unfortunately it changes nothing. The root rank stops and all other
> ranks (and mpirun) just stay, the remaining ranks at 100 % CPU waiting
> apparently in that allreduce. The stack trace looks a bit more
> interesting (git is always debug build ?), so I include it at the very
> bottom just in case.
>
> Off-list Gilles Gouaillardet suggested to set breakpoints at exit,
> __exit etc. to try to catch signals. Would that be useful ? I need a
> moment to figure out how to do this, but I can definitively try.
>
> Some remark: During "make install" from the git repo I see a
>
> WARNING!  Common symbols found:
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_2complex
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_2double_complex
>   mpi-f08-types.o: 0004 C
> ompi_f08_mpi_2double_precision
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_2integer
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_2real
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_aint
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_band
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_bor
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_bxor
>   mpi-f08-types.o: 0004 C ompi_f08_mpi_byte
>
> I have never noticed this before.
>
>
> Best Regards
>
> Christof
>
> Thread 1 (Thread 0x2af84cde4840 (LWP 11219)):
> #0  0x2af84e4c669d in poll () from /lib64/libc.so.6
> #1  0x2af850517496 in poll_dispatch () from /cluster/mpi/openmpi/2.0.2/
> intel2016/lib/libopen-pal.so.20
> #2  0x2af85050ffa5 in opal_libevent2022_event_base_loop () from
> /cluster/mpi/openmpi/2.0.2/intel2016/lib/libopen-pal.so.20
> #3  0x2af85049fa1f in opal_progress () at runtime/opal_progress.c:207
> #4  0x2af84e02f7f7 in ompi_request_default_wait_all (count=233618144,
> requests=0x2, statuses=0x0) at ../opal/threads/wait_sync.h:80
> #5  0x2af84e0758a7 in ompi_coll_base_allreduce_intra_recursivedoubling
> (sbuf=0xdecbae0,
> rbuf=0x2, count=0, dtype=0x, op=0x0, comm=0x1,
> module=0xdee69e0) at base/coll_base_allreduce.c:225
> #6  0x2af84e07b747 in ompi_coll_tuned_allreduce_intra_dec_fixed
> (sbuf=0xdecbae0, rbuf=0x2, count=0, dtype=0x, op=0x0,
> comm=0x1, module=0x1) at coll_tuned_decision_fixed.c:66
> #7  0x2af84e03e832 in PMPI_Allreduce (sendbuf=0xdecbae0, recvbuf=0x2,
> count=0, datatype=0x, op=0x0, comm=0x1) at pallreduce.c:107
> #8  0x2af84ddaac90 in ompi_allreduce_f (sendbuf=0xdecbae0 "\005",
> recvbuf=0x2 , count=0x0,
> datatype=0x, op=0x0, comm=0x1, ierr=0x7ffdf3cffe9c) at
> pallreduce_f.c:87
> #9  0x0045ecc6 in m_sum_i_ ()
> #10 0x00e172c9 in mlwf_mp_mlwf_wannier90_ ()
> #11 0x004325ff in vamp () at main.F:2640
> #12 0x0040de1e in main ()
> #13 0x2af84e3fbb15 in __libc_start_main () from /lib64/libc.so.6
> #14 0x0040dd29 in _start ()
>
> On Wed, Dec 07, 2016 at 09:47:48AM -0800, r...@open-mpi.org <javascript:;>
> wrote:
> > Hi Christof
> >
> > Sorry if I missed this, but it sounds like you are saying that one of
> your procs abnormally terminates, and we are failing to kill the remaining
> job? Is that correct?
> >
> > If so, I just did some work that might relate to that problem that is
> pending in PR #2528: https://github.com/open-mpi/ompi/pull/2528 <
> https://github.com/open-mpi/ompi/pull/2528>
> >
> > Would you be able to try th

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-09 Thread Gilles Gouaillardet
Folks,

the problem is indeed pretty trivial to reproduce

i opened https://github.com/open-mpi/ompi/issues/2550 (and included a
reproducer)


Cheers,

Gilles

On Fri, Dec 9, 2016 at 5:15 AM, Noam Bernstein
<noam.bernst...@nrl.navy.mil> wrote:
> On Dec 8, 2016, at 6:05 AM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>
> Christof,
>
>
> There is something really odd with this stack trace.
> count is zero, and some pointers do not point to valid addresses (!)
>
> in OpenMPI, MPI_Allreduce(...,count=0,...) is a no-op, so that suggests that
> the stack has been corrupted inside MPI_Allreduce(), or that you are not
> using the library you think you use
> pmap  will show you which lib is used
>
> btw, this was not started with
> mpirun --mca coll ^tuned ...
> right ?
>
> just to make it clear ...
> a task from your program bluntly issues a fortran STOP, and this is kind of
> a feature.
> the *only* issue is mpirun does not kill the other MPI tasks and mpirun
> never completes.
> did i get it right ?
>
>
> I just ran across very similar behavior in VASP (which we just switched over
> to openmpi 2.0.1), also in a allreduce + STOP combination (some nodes call
> one, others call the other), and I discovered several interesting things.
>
> The most important is that when MPI is active, the preprocessor converts
> (via a #define in symbol.inc) fortran STOP into calls to m_exit() (defined
> in mpi.F), which is a wrapper around mpi_finalize.  So in my case some
> processes in the communicator call mpi_finalize, others call mpi_allreduce.
> I’m not really surprised this hangs, because I think the correct thing to
> replace STOP with is mpi_abort, not mpi_finalize.  If you know where the
> STOP is called, you can check the preprocessed equivalent file (.f90 instead
> of .F), and see if it’s actually been replaced with a call to m_exit.  I’m
> planning to test whether replacing m_exit with m_stop in symbol.inc gives
> more sensible behavior, i.e. program termination when the original source
> file executes a STOP.
>
> I’m assuming that a mix of mpi_allreduce and mpi_finalize is really expected
> to hang, but just in case that’s surprising, here are my stack traces:
>
>
> hung in collective:
>
> (gdb) where
>
> #0  0x2b8d5a095ec6 in opal_progress () from
> /usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libopen-pal.so.20
> #1  0x2b8d59b3a36d in ompi_request_default_wait_all () from
> /usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi.so.20
> #2  0x2b8d59b8107c in ompi_coll_base_allreduce_intra_recursivedoubling
> () from /usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi.so.20
> #3  0x2b8d59b495ac in PMPI_Allreduce () from
> /usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi.so.20
> #4  0x2b8d598e4027 in pmpi_allreduce__ () from
> /usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi_mpifh.so.20
> #5  0x00414077 in m_sum_i (comm=..., ivec=warning: Range for type
> (null) has invalid bounds 1..-12884901892
> warning: Range for type (null) has invalid bounds 1..-12884901892
> warning: Range for type (null) has invalid bounds 1..-12884901892
> warning: Range for type (null) has invalid bounds 1..-12884901892
> warning: Range for type (null) has invalid bounds 1..-12884901892
> warning: Range for type (null) has invalid bounds 1..-12884901892
> warning: Range for type (null) has invalid bounds 1..-12884901892
> ..., n=2) at mpi.F:989
> #6  0x00daac54 in full_kpoints::set_indpw_full (grid=..., wdes=...,
> kpoints_f=...) at mkpoints_full.F:1099
> #7  0x01441654 in set_indpw_fock (t_info=..., p=warning: Range for
> type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> ..., wdes=..., grid=..., latt_cur=..., lmdim=Cannot access memory at address
> 0x1
> ) at fock.F:1669
> #8  fock::setup_fock (t_info=..., p=warning: Range for type (null) has
> invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> warning: Range for type (null) has invalid bounds 1..-1
> ..., wdes=..., grid=..., latt_cur=..., lmdim=Cannot access memory at address
> 0x1
> ) at fock.F:1413
&g

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread Gilles Gouaillardet
Christoph,

can you please try again with

mpirun --mca btl tcp,self --mca pml ob1 ...

that will help figuring out whether pml/cm and/or mtl/psm2 is involved or not.


if that causes a crash, then can you please try

mpirun --mca btl tcp,self --mca pml ob1 --mca coll ^tuned ...

that will help figuring out whether coll/tuned is involved or not

coll/tuned is known not to correctly handle collectives with different
but matching signatures
(e.g. some tasks invoke the collective with one vector of N elements,
and some other invoke
the same collective with N elements)


if everything fails, can you describe of MPI_Allreduce is invoked ?
/* number of tasks, datatype, number of elements */



Cheers,

Gilles

On Wed, Dec 7, 2016 at 7:38 PM, Christof Koehler
 wrote:
> Hello everybody,
>
> I am observing a deadlock in allreduce with openmpi 2.0.1 on a Single
> node. A stack tracke (pstack) of one rank is below showing the program (vasp
> 5.3.5) and the two psm2 progress threads. However:
>
> In fact, the vasp input is not ok and it should abort at the point where
> it hangs. It does when using mvapich 2.2. With openmpi 2.0.1 it just
> deadlocks in some allreduce operation. Originally it was started with 20
> ranks, when it hangs there are only 19 left. From the PIDs I would
> assume it is the master rank which is missing. So, this looks like a
> failure to terminate.
>
> With 1.10 I get a clean
> --
> mpiexec noticed that process rank 0 with PID 18789 on node node109
> exited on signal 11 (Segmentation fault).
> --
>
> Any ideas what to try ? Of course in this situation it may well be the
> program. Still, with the observed difference between 2.0.1 and 1.10 (and
> mvapich) this might be interesting to someone.
>
> Best Regards
>
> Christof
>
>
> Thread 3 (Thread 0x2ad362577700 (LWP 4629)):
> #0  0x2ad35b1562c3 in epoll_wait () from /lib64/libc.so.6
> #1  0x2ad35d114f42 in epoll_dispatch () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #2  0x2ad35d116751 in opal_libevent2022_event_base_loop () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #3  0x2ad35d16e996 in progress_engine () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #4  0x2ad359efbdc5 in start_thread () from /lib64/libpthread.so.0
> #5  0x2ad35b155ced in clone () from /lib64/libc.so.6
> Thread 2 (Thread 0x2ad362778700 (LWP 4640)):
> #0  0x2ad35b14b69d in poll () from /lib64/libc.so.6 #1  
> 0x2ad35d11dc42 in poll_dispatch () from
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #2  0x2ad35d116751 in opal_libevent2022_event_base_loop () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #3  0x2ad35d0c61d1 in progress_engine () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #4  0x2ad359efbdc5 in start_thread () from /lib64/libpthread.so.0
> #5  0x2ad35b155ced in clone () from /lib64/libc.so.6
> Thread 1 (Thread 0x2ad35978d040 (LWP 4609)):
> #0  0x2ad35b14b69d in poll () from /lib64/libc.so.6
> #1  0x2ad35d11dc42 in poll_dispatch () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #2  0x2ad35d116751 in opal_libevent2022_event_base_loop () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #3  0x2ad35d0c28cf in opal_progress () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> #4  0x2ad35adce8d8 in ompi_request_wait_completion () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> #5  0x2ad35adce838 in mca_pml_cm_recv () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> #6  0x2ad35ad4da42 in ompi_coll_base_allreduce_intra_recursivedoubling () 
> from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> #7  0x2ad35ad52906 in ompi_coll_tuned_allreduce_intra_dec_fixed () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> #8  0x2ad35ad1f0f4 in PMPI_Allreduce () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> #9  0x2ad35aa99c38 in pmpi_allreduce__ () from 
> /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi_mpifh.so.20
> #10 0x0045f8c6 in m_sum_i_ ()
> #11 0x00e1ce69 in mlwf_mp_mlwf_wannier90_ ()
> #12 0x004331ff in vamp () at main.F:2640
> #13 0x0040ea1e in main ()
> #14 0x2ad35b080b15 in __libc_start_main () from /lib64/libc.so.6
> #15 0x0040e929 in _start ()
>
>
> --
> Dr. rer. nat. Christof Köhler   email: c.koeh...@bccms.uni-bremen.de
> Universitaet Bremen/ BCCMS  phone:  +49-(0)421-218-62334
> Am Fallturm 1/ TAB/ Raum 3.12   fax: +49-(0)421-218-62770
> 28359 Bremen
>
> PGP: http://www.bccms.uni-bremen.de/cms/people/c_koehler/
>
> ___
> users mailing list
> 

Re: [OMPI users] epoll add error with OpenMPI 2.0.1 and SGE

2016-12-17 Thread Gilles Gouaillardet
Dave,

thanks for the info

for what it's worth, it is generally a bad idea to --with-xxx=/usr
since you might inadvertently use some other external components.

in your case, --with-libevent=external is what you need if you want to
use an external libevent library installed in /usr

i guess the same comment would apply with /usr/local too

btw, which distro are you using ? is your distro's libevent up to date ?
we might want to add a FAQ entry with a known to be broken libevent


Cheers,

Gilles

On Sun, Dec 18, 2016 at 10:52 AM, Dave Turner  wrote:
>
>I've solved this problem by omitting --with-libevent=/usr from
> the configuration to force it to use the internal version.  I thought
> I had tried this before posting but evidently did something wrong.
>
>   Dave
>
> On Tue, Dec 13, 2016 at 9:57 PM,  wrote:
>>
>> Send users mailing list submissions to
>> users@lists.open-mpi.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> or, via email, send a message with subject or body 'help' to
>> users-requ...@lists.open-mpi.org
>>
>> You can reach the person managing the list at
>> users-ow...@lists.open-mpi.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of users digest..."
>>
>>
>> Today's Topics:
>>
>>1. epoll add error with OpenMPI 2.0.1 and SGE (Dave Turner)
>>
>>
>> --
>>
>> Message: 1
>> Date: Tue, 13 Dec 2016 21:57:40 -0600
>> From: Dave Turner 
>> To: users@lists.open-mpi.org
>> Subject: [OMPI users] epoll add error with OpenMPI 2.0.1 and SGE
>> Message-ID:
>>
>> 
>> Content-Type: text/plain; charset="utf-8"
>>
>> [warn] Epoll ADD(4) on fd 1 failed.  Old events were 0; read change was 0
>> (none); write change was 1 (add): Operation not permitted
>>
>> Gentoo with compiled OpenMPI 2.0.1 and SGE
>> ompi_info --all  file attached
>>
>> We recently did a maintenance upgrade to our cluster including
>> moving to OpenMPI 2.0.1.  Fortran programs now give the
>> epoll add error above at the start of a run and the stdout file
>> freezes until the end of the run when all info is dumped.
>>
>> I've read about this problem and it seems to be a file lock
>> issue where OpenMPI and SGE are both trying to lock the
>> same output file.  We have not seen this problem with
>> previous versions of OpenMPI.
>>
>> We've tried compiling OpenMPI with and without
>> specifying  --with-libevent=/usr, and I've tried compiling
>> with --disable-event-epoll and using -mca opal_event_include poll.
>> Both of these were suggestions from a few years back but
>> neither affects the problem.  I've also tried redirecting the output
>> manually as:
>>
>> mpirun -np 4 ./app > file.out
>>
>> This just locks file.out instead with all the output again being
>> dumped at the end of the run.
>>
>> We also do not have this issue with 1.10.4 installed.
>>
>>  Any suggestions?  Has anyone else run into this problem?
>>
>> Dave Turner
>> --
>> Work: davetur...@ksu.edu (785) 532-7791
>>  2219 Engineering Hall, Manhattan KS  66506
>> Home:drdavetur...@gmail.com
>>   cell: (785) 770-5929
>> -- next part --
>> An HTML attachment was scrubbed...
>> URL:
>> 
>> -- next part --
>> A non-text attachment was scrubbed...
>> Name: ompi_info.2.0.1.all
>> Type: application/octet-stream
>> Size: 202298 bytes
>> Desc: not available
>> URL:
>> 
>>
>> --
>>
>> Subject: Digest Footer
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>> --
>>
>> End of users Digest, Vol 3675, Issue 2
>> **
>
>
>
>
> --
> Work: davetur...@ksu.edu (785) 532-7791
>  2219 Engineering Hall, Manhattan KS  66506
> Home:drdavetur...@gmail.com
>   cell: (785) 770-5929
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-11 Thread Gilles Gouaillardet

Christof,


Ralph fixed the issue,

meanwhile, the patch can be manually downloaded at 
https://patch-diff.githubusercontent.com/raw/open-mpi/ompi/pull/2552.patch



Cheers,


Gilles



On 12/9/2016 5:39 PM, Christof Koehler wrote:

Hello,

our case is. The libwannier.a is a "third party"
library which is built seperately and the just linked in. So the vasp
preprocessor never touches it. As far as I can see no preprocessing of
the f90 source is involved in the libwannier build process.

I finally managed to set a breakpoint at the program exit of the root
rank:

(gdb) bt
#0  0x2b7ccd2e4220 in _exit () from /lib64/libc.so.6
#1  0x2b7ccd25ee2b in __run_exit_handlers () from /lib64/libc.so.6
#2  0x2b7ccd25eeb5 in exit () from /lib64/libc.so.6
#3  0x0407298d in for_stop_core ()
#4  0x012fad41 in w90_io_mp_io_error_ ()
#5  0x01302147 in w90_parameters_mp_param_read_ ()
#6  0x012f49c6 in wannier_setup_ ()
#7  0x00e166a8 in mlwf_mp_mlwf_wannier90_ ()
#8  0x004319ff in vamp () at main.F:2640
#9  0x0040d21e in main ()
#10 0x2b7ccd247b15 in __libc_start_main () from /lib64/libc.so.6
#11 0x0040d129 in _start ()

So for_stop_core is called apparently ? Of course it is below the main()
process of vasp, so additional things might happen which are not
visible. Is SIGCHILD (as observed when catching signals in mpirun) the
signal expectd after a for_stop_core ?

Thank you very much for investigating this !

Cheers

Christof

On Thu, Dec 08, 2016 at 03:15:47PM -0500, Noam Bernstein wrote:

On Dec 8, 2016, at 6:05 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> 
wrote:

Christof,


There is something really odd with this stack trace.
count is zero, and some pointers do not point to valid addresses (!)

in OpenMPI, MPI_Allreduce(...,count=0,...) is a no-op, so that suggests that
the stack has been corrupted inside MPI_Allreduce(), or that you are not using 
the library you think you use
pmap  will show you which lib is used

btw, this was not started with
mpirun --mca coll ^tuned ...
right ?

just to make it clear ...
a task from your program bluntly issues a fortran STOP, and this is kind of a 
feature.
the *only* issue is mpirun does not kill the other MPI tasks and mpirun never 
completes.
did i get it right ?

I just ran across very similar behavior in VASP (which we just switched over to 
openmpi 2.0.1), also in a allreduce + STOP combination (some nodes call one, 
others call the other), and I discovered several interesting things.

The most important is that when MPI is active, the preprocessor converts (via a 
#define in symbol.inc) fortran STOP into calls to m_exit() (defined in mpi.F), 
which is a wrapper around mpi_finalize.  So in my case some processes in the 
communicator call mpi_finalize, others call mpi_allreduce.  I’m not really 
surprised this hangs, because I think the correct thing to replace STOP with is 
mpi_abort, not mpi_finalize.  If you know where the STOP is called, you can 
check the preprocessed equivalent file (.f90 instead of .F), and see if it’s 
actually been replaced with a call to m_exit.  I’m planning to test whether 
replacing m_exit with m_stop in symbol.inc gives more sensible behavior, i.e. 
program termination when the original source file executes a STOP.

I’m assuming that a mix of mpi_allreduce and mpi_finalize is really expected to 
hang, but just in case that’s surprising, here are my stack traces:


hung in collective:

(gdb) where
#0  0x2b8d5a095ec6 in opal_progress () from 
/usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libopen-pal.so.20
#1  0x2b8d59b3a36d in ompi_request_default_wait_all () from 
/usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi.so.20
#2  0x2b8d59b8107c in ompi_coll_base_allreduce_intra_recursivedoubling () 
from /usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi.so.20
#3  0x2b8d59b495ac in PMPI_Allreduce () from 
/usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi.so.20
#4  0x2b8d598e4027 in pmpi_allreduce__ () from 
/usr/local/openmpi/2.0.1/x86_64/ib/intel/12.1.6/lib/libmpi_mpifh.so.20
#5  0x00414077 in m_sum_i (comm=..., ivec=warning: Range for type 
(null) has invalid bounds 1..-12884901892
warning: Range for type (null) has invalid bounds 1..-12884901892
warning: Range for type (null) has invalid bounds 1..-12884901892
warning: Range for type (null) has invalid bounds 1..-12884901892
warning: Range for type (null) has invalid bounds 1..-12884901892
warning: Range for type (null) has invalid bounds 1..-12884901892
warning: Range for type (null) has invalid bounds 1..-12884901892
..., n=2) at mpi.F:989
#6  0x00daac54 in full_kpoints::set_indpw_full (grid=..., wdes=..., 
kpoints_f=...) at mkpoints_full.F:1099
#7  0x01441654 in set_indpw_fock (t_info=..., p=warning: Range for type 
(null) has invalid bounds 1..-1
warning: Range for type (null) has invalid bounds 1..-1
warning: Rang

Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
Siegmar,

I was able to reproduce the issue on my vm
(No need for a real heterogeneous cluster here)

I will keep digging tomorrow.
Note that if you specify an incorrect slot list, MPI_Comm_spawn fails with a 
very unfriendly error message.
Right now, the 4th spawn'ed task crashes, so this is a different issue

Cheers,

Gilles

r...@open-mpi.org wrote:
>I think there is some relevant discussion here: 
>https://github.com/open-mpi/ompi/issues/1569
>
>
>It looks like Gilles had (at least at one point) a fix for master when 
>enable-heterogeneous, but I don’t know if that was committed.
>
>
>On Jan 9, 2017, at 8:23 AM, Howard Pritchard  wrote:
>
>
>HI Siegmar,
>
>
>You have some config parameters I wasn't trying that may have some impact.
>
>I'll give a try with these parameters.
>
>
>This should be enough info for now,
>
>
>Thanks,
>
>
>Howard
>
>
>
>2017-01-09 0:59 GMT-07:00 Siegmar Gross :
>
>Hi Howard,
>
>I use the following commands to build and install the package.
>${SYSTEM_ENV} is "Linux" and ${MACHINE_ENV} is "x86_64" for my
>Linux machine.
>
>mkdir openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
>cd openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
>
>../openmpi-2.0.2rc3/configure \
>  --prefix=/usr/local/openmpi-2.0.2_64_cc \
>  --libdir=/usr/local/openmpi-2.0.2_64_cc/lib64 \
>  --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \
>  --with-jdk-headers=/usr/local/jdk1.8.0_66/include \
>  JAVA_HOME=/usr/local/jdk1.8.0_66 \
>  LDFLAGS="-m64 -mt -Wl,-z -Wl,noexecstack" CC="cc" CXX="CC" FC="f95" \
>  CFLAGS="-m64 -mt" CXXFLAGS="-m64" FCFLAGS="-m64" \
>  CPP="cpp" CXXCPP="cpp" \
>  --enable-mpi-cxx \
>  --enable-mpi-cxx-bindings \
>  --enable-cxx-exceptions \
>  --enable-mpi-java \
>  --enable-heterogeneous \
>  --enable-mpi-thread-multiple \
>  --with-hwloc=internal \
>  --without-verbs \
>  --with-wrapper-cflags="-m64 -mt" \
>  --with-wrapper-cxxflags="-m64" \
>  --with-wrapper-fcflags="-m64" \
>  --with-wrapper-ldflags="-mt" \
>  --enable-debug \
>  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>
>make |& tee log.make.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>rm -r /usr/local/openmpi-2.0.2_64_cc.old
>mv /usr/local/openmpi-2.0.2_64_cc /usr/local/openmpi-2.0.2_64_cc.old
>make install |& tee log.make-install.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>make check |& tee log.make-check.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>
>
>I get a different error if I run the program with gdb.
>
>loki spawn 118 gdb /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec
>GNU gdb (GDB; SUSE Linux Enterprise 12) 7.11.1
>Copyright (C) 2016 Free Software Foundation, Inc.
>License GPLv3+: GNU GPL version 3 or later 
>This is free software: you are free to change and redistribute it.
>There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>and "show warranty" for details.
>This GDB was configured as "x86_64-suse-linux".
>Type "show configuration" for configuration details.
>For bug reporting instructions, please see:
>.
>Find the GDB manual and other documentation resources online at:
>.
>For help, type "help".
>Type "apropos word" to search for commands related to "word"...
>Reading symbols from /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec...done.
>(gdb) r -np 1 --host loki --slot-list 0:0-5,1:0-5 spawn_master
>Starting program: /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec -np 1 --host loki 
>--slot-list 0:0-5,1:0-5 spawn_master
>Missing separate debuginfos, use: zypper install 
>glibc-debuginfo-2.24-2.3.x86_64
>[Thread debugging using libthread_db enabled]
>Using host libthread_db library "/lib64/libthread_db.so.1".
>[New Thread 0x73b97700 (LWP 13582)]
>[New Thread 0x718a4700 (LWP 13583)]
>[New Thread 0x710a3700 (LWP 13584)]
>[New Thread 0x7fffebbba700 (LWP 13585)]
>Detaching after fork from child process 13586.
>
>Parent process 0 running on loki
>  I create 4 slave processes
>
>Detaching after fork from child process 13589.
>Detaching after fork from child process 13590.
>Detaching after fork from child process 13591.
>[loki:13586] OPAL ERROR: Timeout in file 
>../../../../openmpi-2.0.2rc3/opal/mca/pmix/base/pmix_base_fns.c at line 193
>[loki:13586] *** An error occurred in MPI_Comm_spawn
>[loki:13586] *** reported by process [2873294849,0]
>[loki:13586] *** on communicator MPI_COMM_WORLD
>[loki:13586] *** MPI_ERR_UNKNOWN: unknown error
>[loki:13586] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
>abort,
>[loki:13586] ***    and potentially your MPI job)
>[Thread 0x7fffebbba700 (LWP 13585) exited]
>[Thread 0x710a3700 (LWP 13584) exited]
>[Thread 0x718a4700 (LWP 13583) exited]
>[Thread 0x73b97700 (LWP 13582) exited]
>[Inferior 1 (process 13567) exited with code 016]
>Missing separate debuginfos, use: zypper install 
>libpciaccess0-debuginfo-0.13.2-5.1.x86_64 libudev1-debuginfo-210-116.3.3.x86_64
>(gdb) bt
>No stack.
>(gdb)
>
>Do 

Re: [OMPI users] how to specify OSHMEM component from Mellanox in configure

2017-01-11 Thread Gilles Gouaillardet
Juan,

Open MPI has its own implementation of OpenSHMEM.
The Mellanox software is very likely yet an other implementation of
OpenSHMEM.

So you can consider these as independent libraries

Cheers,

Gilles

On Wednesday, January 11, 2017, Juan A. Cordero Varelaq <
bioinformatica-i...@us.es> wrote:

> Hi,
>
> I have an directory in /opt/openshmem which contains software from
> mellanox. Should I specify during OMPI installation (at the configure step)
> where that directory is located? If yes, how? I have been looking for a
> "--with-oshmem=" but haven't found anything, as for instance other
> Mellanox Software such as fca ("--with-fca=/opt/fca").
>
> Thanks in advance
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
Siegmar,

Your slot list is correct.
An invalid slot list for your node would be 0:1-7,1:0-7

/* and since the test requires only 5 tasks, that could even work with such
an invalid list.
My vm is single socket with 4 cores, so a 0:0-4 slot list results in an
unfriendly pmix error */

Bottom line, your test is correct, and there is a bug in v2.0.x that I will
investigate from tomorrow

Cheers,

Gilles

On Wednesday, January 11, 2017, Siegmar Gross <
siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi Gilles,
>
> thank you very much for your help. What does incorrect slot list
> mean? My machine has two 6-core processors so that I specified
> "--slot-list 0:0-5,1:0-5". Does incorrect mean that it isn't
> allowed to specify more slots than available, to specify fewer
> slots than available, or to specify more slots than needed for
> the processes?
>
>
> Kind regards
>
> Siegmar
>
> Am 11.01.2017 um 10:04 schrieb Gilles Gouaillardet:
>
>> Siegmar,
>>
>> I was able to reproduce the issue on my vm
>> (No need for a real heterogeneous cluster here)
>>
>> I will keep digging tomorrow.
>> Note that if you specify an incorrect slot list, MPI_Comm_spawn fails
>> with a very unfriendly error message.
>> Right now, the 4th spawn'ed task crashes, so this is a different issue
>>
>> Cheers,
>>
>> Gilles
>>
>> r...@open-mpi.org wrote:
>> I think there is some relevant discussion here:
>> https://github.com/open-mpi/ompi/issues/1569
>>
>> It looks like Gilles had (at least at one point) a fix for master when
>> enable-heterogeneous, but I don’t know if that was committed.
>>
>> On Jan 9, 2017, at 8:23 AM, Howard Pritchard <hpprit...@gmail.com
>>> <mailto:hpprit...@gmail.com>> wrote:
>>>
>>> HI Siegmar,
>>>
>>> You have some config parameters I wasn't trying that may have some
>>> impact.
>>> I'll give a try with these parameters.
>>>
>>> This should be enough info for now,
>>>
>>> Thanks,
>>>
>>> Howard
>>>
>>>
>>> 2017-01-09 0:59 GMT-07:00 Siegmar Gross <siegmar.gr...@informatik.hs-
>>> fulda.de <mailto:siegmar.gr...@informatik.hs-fulda.de>>:
>>>
>>> Hi Howard,
>>>
>>> I use the following commands to build and install the package.
>>> ${SYSTEM_ENV} is "Linux" and ${MACHINE_ENV} is "x86_64" for my
>>> Linux machine.
>>>
>>> mkdir openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
>>> cd openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
>>>
>>> ../openmpi-2.0.2rc3/configure \
>>>   --prefix=/usr/local/openmpi-2.0.2_64_cc \
>>>   --libdir=/usr/local/openmpi-2.0.2_64_cc/lib64 \
>>>   --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \
>>>   --with-jdk-headers=/usr/local/jdk1.8.0_66/include \
>>>   JAVA_HOME=/usr/local/jdk1.8.0_66 \
>>>   LDFLAGS="-m64 -mt -Wl,-z -Wl,noexecstack" CC="cc" CXX="CC"
>>> FC="f95" \
>>>   CFLAGS="-m64 -mt" CXXFLAGS="-m64" FCFLAGS="-m64" \
>>>   CPP="cpp" CXXCPP="cpp" \
>>>   --enable-mpi-cxx \
>>>   --enable-mpi-cxx-bindings \
>>>   --enable-cxx-exceptions \
>>>   --enable-mpi-java \
>>>   --enable-heterogeneous \
>>>   --enable-mpi-thread-multiple \
>>>   --with-hwloc=internal \
>>>   --without-verbs \
>>>   --with-wrapper-cflags="-m64 -mt" \
>>>   --with-wrapper-cxxflags="-m64" \
>>>   --with-wrapper-fcflags="-m64" \
>>>   --with-wrapper-ldflags="-mt" \
>>>   --enable-debug \
>>>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>>>
>>> make |& tee log.make.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>>> rm -r /usr/local/openmpi-2.0.2_64_cc.old
>>> mv /usr/local/openmpi-2.0.2_64_cc /usr/local/openmpi-2.0.2_64_cc.old
>>> make install |& tee log.make-install.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>>> make check |& tee log.make-check.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>>>
>>>
>>> I get a different error if I run the program with gdb.
>>>
>>> loki spawn 118 gdb /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec
>>> GNU gdb (GDB; SUSE Linux Enterprise 12) 7.11.1
>>> Copyright (C) 2016 Free Software Foundation, Inc.
>>>

Re: [OMPI users] mca_based_component warnings when running simple example code

2017-01-06 Thread Gilles Gouaillardet
Hi,

it looks like you installed Open MPI 2.0.1 at the same location than
previous Open MPI 1.10, but you did not uninstall v1.10.
the faulty modules have very been likely removed from 2.0.1, hence the error.
you can simply remove the openmpi plugins directory and reinstall openmpi

rm -rf /usr/local/lib/openmpi
make install

Cheers,

Gilles

On Fri, Jan 6, 2017 at 6:30 PM, Solal Amouyal  wrote:
> FYI: It's my first time posting here and I'm quit a beginner in MPI.
>
> I recently updated my gfortran (from 4.7 to 6.1.0) and OpenMPI (from 1.10 to
> 2.0.1) compilers. Since then, I've been getting warnings at the beginning of
> the simulation (one set of warnings per processor).
>
> I reduced my code to the most MPI program and the warnings persist.
>
> program main
> use mpi_f08
> implicit none
> integer :: ierror
>
> call mpi_init(ierror)
> call mpi_finalize(ierror)
> end program main
>
> I compile my code with mpif90 main.f90, and run it either directly - ./a.out
> - or with mpirun: mpirun -np 1 ./a.out. The output is the same:
>
> [username:79762] mca_base_component_repository_open: unable to open
> mca_grpcomm_bad: dlopen(/usr/local/lib/openmpi/mca_grpcomm_bad.so, 9):
> Symbol not found: _orte_grpcomm_base_modex
>   Referenced from: /usr/local/lib/openmpi/mca_grpcomm_bad.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_grpcomm_bad.so (ignored)
> [username:79761] mca_base_component_repository_open: unable to open
> mca_grpcomm_bad: dlopen(/usr/local/lib/openmpi/mca_grpcomm_bad.so, 9):
> Symbol not found: _orte_grpcomm_base_modex
>   Referenced from: /usr/local/lib/openmpi/mca_grpcomm_bad.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_grpcomm_bad.so (ignored)
> [username:79761] mca_base_component_repository_open: unable to open
> mca_pml_bfo: dlopen(/usr/local/lib/openmpi/mca_pml_bfo.so, 9): Symbol not
> found: _ompi_free_list_item_t_class
>   Referenced from: /usr/local/lib/openmpi/mca_pml_bfo.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_pml_bfo.so (ignored)
> [username:79761] mca_base_component_repository_open: coll
> "/usr/local/lib/openmpi/mca_coll_hierarch" uses an MCA interface that is not
> recognized (component MCA v2.0.0 != supported MCA v2.1.0) -- ignored
> [username:79761] mca_base_component_repository_open: unable to open
> mca_coll_ml: dlopen(/usr/local/lib/openmpi/mca_coll_ml.so, 9): Symbol not
> found: _mca_bcol_base_components_in_use
>   Referenced from: /usr/local/lib/openmpi/mca_coll_ml.so
>   Expected in: flat namespace
>  in /usr/local/lib/openmpi/mca_coll_ml.so (ignored)
>
> I ran and attached the output of ompi_info --all.
>
> I can see in one of the warnings that there's a MCA version mismatch between
> v2.0.0 and 2.0.1. I don't know if it might be related but I made sure to
> upgrade my OpenMPI after gfortran. I am using OSX10.11.4
>
> Thank you,
>
> _
>
> Solal
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] mpirun with ssh tunneling

2016-12-25 Thread Gilles Gouaillardet

Adam,


there are several things here


with an up-to-date master, you can specify an alternate ssh port via a 
hostfile


see https://github.com/open-mpi/ompi/issues/2224


Open MPI requires more than just ssh.

- remote nodes (orted) need to call back mpirun (oob/tcp)

- nodes (MPI tasks) need to be able to connect to each other (btl/tcp)


regarding oob/tcp, your mpirun command line will basically do under the hood
ssh docker2 orted  

then each task will use a port for btl/tcp, and tasks might directly 
connect to each other with the docker IP and this port.


by default, these two ports are dynamic, but you can use static port 
(range) via MCA parameter
mpirun --mca oob_tcp_static_ipv4_ports xxx --mca oob_btl_tcp_port_min_v4 
yyy --mca btl_tcp_port_range_v4 zzz



that does not change the fact that ssh tunneling works with host 
addresses, and Open MPI will (internally) use docker addresses.



i'd rather suggest you try to
- enable IP connectivity between your containers (eventually running on 
different hosts)
- assuming you need (some) network isolation, then use static ports, and 
update your firewall to allow full TCP/IP connectivity on these ports

  and port 22 (ssh).

you can also refer to https://github.com/open-mpi/ompi/issues/1511
yet an other way to use docker was discussed here.

last but not least, if you want to use containers but you are not tied 
to docker, you can consider http://singularity.lbl.gov/
(as far as Open MPI is concerned,native support is expected for Open MPI 
2.1)



Cheers,

Gilles

On 12/26/2016 6:11 AM, Adam Sylvester wrote:
I'm trying to use OpenMPI 1.10.4 to communicate between two Docker 
containers running on two different physical machines.  Docker doesn't 
have much to do with my question (unless someone has a suggestion for 
a better way to do what I'm trying to :o) )... each Docker container 
is running an OpenSSH server which shows up as 172.17.0.1 on the 
physical hosts:


$ ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:8E:07:05:A0
  inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
  inet6 addr: fe80::42:8eff:fe07:5a0/64 Scope:Link

The Docker container's ssh port is published on the physical host as 
port 32768.


The Docker container has a user 'mpirun' which I have public/private 
ssh keys set up for.


Let's call the physical hosts host1 and host2; each host is running a 
Docker container I'll refer to as docker1 and docker2 respectively.  
So, this means I can...

1. ssh From host1 into docker1:
ssh mpirun@172.17.0.1  -i ssh/id_rsa -p 32768

2. Set up an ssh tunnel from inside docker1, through host2, into 
docker2, on local port 4334 (ec2-user is the login to host2)
ssh -f -N -q -o "TCPKeepAlive yes" -o "ServerAliveInterval 60" -L 
4334:172.17.0.1:32768  -l ec2-user host2


3. Update my ~/.ssh/config file to name this host 'docker2':
StrictHostKeyChecking no
Host docker2
  HostName 127.0.0.1
  Port 4334
  User mpirun

4. I can now do 'ssh docker2' and ssh into it without issues.

Here's where I get stuck.  I'd read that OpenMPI's mpirun didn't 
support ssh'ing on a non-standard port, so I thought I could just do 
step 3 above and then list the hosts when I run mpirun from docker1:


mpirun --prefix /usr/local -n 2 -H localhost,docker2 
/home/mpirun/mpi_hello_world


However, I get:
[3524ae84a26b:00197] [[55635,0],1] tcp_peer_send_blocking: send() to 
socket 9 failed: Broken pipe (32)

--
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp 
(--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location 
to use.


*  compilation of the orted with dynamic libraries when static are 
required

  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).
--

I'm guessing that something's going wrong when docker2 tries to 
communicate back to docker1.  However, I'm not sure what additional 
tunneling to set up to support this.  My understanding of ssh tunnels 
is relatively basic... I can of course create a tunnel on docker2 back 
to docker1 but I don't know how ssh/mpi will "find" it.  I've read a 
bit about reverse 

Re: [OMPI users] "Warning :: opal_list_remove_item" with openmpi-2.1.0rc4

2017-03-22 Thread Gilles Gouaillardet
Roland,

the easiest way is to use an external hwloc that is configured with
--disable-nvml

an other option is to hack the embedded hwloc configure.m4 and pass
--disable-nvml to the embedded hwloc configure. note this requires you run
autogen.sh and you hence needs recent autotools.

i guess Open MPI 1.8 embeds an older hwloc that is not aware of nvml, hence
the lack of warning.

Cheers,

Gilles

On Wednesday, March 22, 2017, Roland Fehrenbacher  wrote:

> > "SJ" == Sylvain Jeaugey > writes:
>
> SJ> If you installed CUDA libraries and includes in /usr, then it's
> SJ> not surprising hwloc finds them even without defining CFLAGS.
>
> Well, that's the place where distribution packages install to :)
> I don't think a build system should misbehave, if libraries are installed
> in default places.
>
> SJ> I'm just saying I think you won't get the error message if Open
> SJ> MPI finds CUDA but hwloc does not.
>
> OK, so I think I need to ask the original question again: Is there a way
> to suppress these warnings with a "normal" build? I guess the answer
> must be yes, since 1.8.x didn't have this problem. The real question
> then would be how ...
>
> Thanks,
>
> Roland
>
> SJ> On 03/21/2017 11:05 AM, Roland Fehrenbacher wrote:
> >>> "SJ" == Sylvain Jeaugey >
> writes:
> >> Hi Silvain,
> >>
> >> I get the "NVIDIA : ..." run-time error messages just by
> >> compiling with "--with-cuda=/usr":
> >>
> >> ./configure --prefix=${prefix} \ --mandir=${prefix}/share/man \
> >> --infodir=${prefix}/share/info \
> >> --sysconfdir=/etc/openmpi/${VERSION} --with-devel-headers \
> >> --disable-memchecker \ --disable-vt \ --with-tm --with-slurm
> >> --with-pmi --with-sge \ --with-cuda=/usr \
> >> --with-io-romio-flags='--with-file-system=nfs+lustre' \
> >> --with-cma --without-valgrind \ --enable-openib-connectx-xrc \
> >> --enable-orterun-prefix-by-default \ --disable-java
> >>
> >> Roland
> >>
> SJ> Hi Siegmar, I think this "NVIDIA : ..." error message comes from
> SJ> the fact that you add CUDA includes in the C*FLAGS. If you just
> SJ> use --with-cuda, Open MPI will compile with CUDA support, but
> SJ> hwloc will not find CUDA and that will be fine. However, setting
> SJ> CUDA in CFLAGS will make hwloc find CUDA, compile CUDA support
> SJ> (which is not needed) and then NVML will show this error message
> SJ> when not run on a machine with CUDA devices.
> >>
> SJ> I guess gcc picks the environment variable, while cc does not
> SJ> hence the different behavior. So again, there is no need to add
> SJ> all those CUDA includes, --with-cuda is enough.
> >>
> SJ> About the opal_list_remove_item, we'll try to reproduce the
> SJ> issue and see where it comes from.
> >>
> SJ> Sylvain
> >>
> SJ> On 03/21/2017 12:38 AM, Siegmar Gross wrote:
> >> >> Hi,
> >> >>
> >> >> I have installed openmpi-2.1.0rc4 on my "SUSE Linux Enterprise
> >> >> Server
> >> >> 12.2 (x86_64)" with Sun C 5.14 and gcc-6.3.0. Sometimes I get
> >> >>  once
> >> >> more a warning about a missing item for one of my small
> >> >> programs (it doesn't matter if I use my cc or gcc version). My
> >> >> gcc version also displays the message "NVIDIA: no NVIDIA
> >> >> devices found" for the server without NVIDIA devices (I don't
> >> >> get the message for my cc version).  I used the following
> >> >> commands to build the package (${SYSTEM_ENV} is Linux and
> >> >> ${MACHINE_ENV} is x86_64).
> >> >>
> >> >>
> >> >> mkdir openmpi-2.1.0rc4-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc cd
> >> >> openmpi-2.1.0rc4-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
> >> >>
> >> >> ../openmpi-2.1.0rc4/configure \
> >> >> --prefix=/usr/local/openmpi-2.1.0_64_cc \
> >> >> --libdir=/usr/local/openmpi-2.1.0_64_cc/lib64 \
> >> >> --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \
> >> >> --with-jdk-headers=/usr/local/jdk1.8.0_66/include \
> >> >> JAVA_HOME=/usr/local/jdk1.8.0_66 \ LDFLAGS="-m64 -mt -Wl,-z
> >> >> -Wl,noexecstack -L/usr/local/lib64 -L/usr/local/cuda/ lib64" \
> >> >> CC="cc" CXX="CC" FC="f95" \ CFLAGS="-m64 -mt
> >> >> -I/usr/local/include -I/usr/local/cuda/include" \
> >> >> CXXFLAGS="-m64 -I/usr/local/include -I/usr/local/cuda/include"
> >> >> \ FCFLAGS="-m64" \ CPP="cpp -I/usr/local/include
> >> >> -I/usr/local/cuda/include" \ CXXCPP="cpp -I/usr/local/include
> >> >> -I/usr/local/cuda/include" \ --enable-mpi-cxx \
> >> >> --enable-cxx-exceptions \ --enable-mpi-java \
> >> >> --with-cuda=/usr/local/cuda \
> >> >> --with-valgrind=/usr/local/valgrind \
> >> >> --enable-mpi-thread-multiple \ --with-hwloc=internal \
> >> >> --without-verbs \ --with-wrapper-cflags="-m64 -mt" \
> >> >> 

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-23 Thread Gilles Gouaillardet
Can you please try
mpirun --mca btl tcp,self ...
And if it works
mpirun --mca btl openib,self ...

Then can you try
mpirun --mca coll ^tuned --mca btl tcp,self ...

That will help figuring out whether the error is in the pml or the coll
framework/module

Cheers,

Gilles

On Thursday, March 23, 2017, Götz Waschk  wrote:

> Hi Howard,
>
> I have attached my config.log file for version 2.1.0. I have based it
> on the OpenHPC package. Unfortunately, it still crashes with disabling
> the vader btl with this command line:
> mpirun --mca btl "^vader" IMB-MPI1
>
>
> [pax11-10:44753] *** Process received signal ***
> [pax11-10:44753] Signal: Bus error (7)
> [pax11-10:44753] Signal code: Non-existant physical address (2)
> [pax11-10:44753] Failing at address: 0x2b3989e27a00
> [pax11-10:44753] [ 0] /usr/lib64/libpthread.so.0(+0xf370)[0x2b3976f44370]
> [pax11-10:44753] [ 1]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_btl_sm.
> so(+0x559a)[0x2b398545259a]
> [pax11-10:44753] [ 2]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libopen-pal.so.20(
> opal_free_list_grow_st+0x1df)[0x2b39777bb78f]
> [pax11-10:44753] [ 3]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_btl_sm.
> so(mca_btl_sm_sendi+0x272)[0x2b3985450562]
> [pax11-10:44753] [ 4]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.
> so(+0x8a3f)[0x2b3985d78a3f]
> [pax11-10:44753] [ 5]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.
> so(mca_pml_ob1_send+0x4a7)[0x2b3985d79ad7]
> [pax11-10:44753] [ 6]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_
> coll_base_sendrecv_nonzero_actual+0x110)[0x2b3976cda620]
> [pax11-10:44753] [ 7]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_
> coll_base_allreduce_intra_ring+0x860)[0x2b3976cdb8f0]
> [pax11-10:44753] [ 8]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(PMPI_
> Allreduce+0x17b)[0x2b3976ca36ab]
> [pax11-10:44753] [ 9] IMB-MPI1[0x40b2ff]
> [pax11-10:44753] [10] IMB-MPI1[0x402646]
> [pax11-10:44753] [11]
> /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b3977172b35]
> [pax11-10:44753] [12] IMB-MPI1[0x401f79]
> [pax11-10:44753] *** End of error message ***
> [pax11-10:44752] *** Process received signal ***
> [pax11-10:44752] Signal: Bus error (7)
> [pax11-10:44752] Signal code: Non-existant physical address (2)
> [pax11-10:44752] Failing at address: 0x2ab0d270d3e8
> [pax11-10:44752] [ 0] /usr/lib64/libpthread.so.0(+0xf370)[0x2ab0bf7ec370]
> [pax11-10:44752] [ 1]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_
> allocator_bucket.so(mca_allocator_bucket_alloc_align+0x89)[0x2ab0c2eed1c9]
> [pax11-10:44752] [ 2]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmca_common_sm.so.
> 20(+0x1495)[0x2ab0cde8d495]
> [pax11-10:44752] [ 3]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libopen-pal.so.20(
> opal_free_list_grow_st+0x277)[0x2ab0c0063827]
> [pax11-10:44752] [ 4]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_btl_sm.
> so(mca_btl_sm_sendi+0x272)[0x2ab0cdc87562]
> [pax11-10:44752] [ 5]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.
> so(+0x8a3f)[0x2ab0ce630a3f]
> [pax11-10:44752] [ 6]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.
> so(mca_pml_ob1_send+0x4a7)[0x2ab0ce631ad7]
> [pax11-10:44752] [ 7]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_
> coll_base_sendrecv_nonzero_actual+0x110)[0x2ab0bf582620]
> [pax11-10:44752] [ 8]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_
> coll_base_allreduce_intra_ring+0x860)[0x2ab0bf5838f0]
> [pax11-10:44752] [ 9]
> /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(PMPI_
> Allreduce+0x17b)[0x2ab0bf54b6ab]
> [pax11-10:44752] [10] IMB-MPI1[0x40b2ff]
> [pax11-10:44752] [11] IMB-MPI1[0x402646]
> [pax11-10:44752] [12]
> /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x2ab0bfa1ab35]
> [pax11-10:44752] [13] IMB-MPI1[0x401f79]
> [pax11-10:44752] *** End of error message ***
> --
> mpirun noticed that process rank 340 with PID 44753 on node pax11-10
> exited on signal 7 (Bus error).
> --
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Help with Open MPI 2.1.0 and PGI 16.10: Configure and C++

2017-03-23 Thread Gilles Gouaillardet
Matt,

a C++ compiler is required to configure Open MPI.
That being said, C++ compiler is only used if you build the C++ bindings
(That were removed from MPI-3)
And unless you plan to use the mpic++ wrapper (with or without the C++
bindings),
a valid C++ compiler is not required at all.
/* configure still requires one, and that could be improved */

My point is you should not worry too much about configure messages related
to C++,
and you should instead focus on the Fortran issue.

Cheers,

Gilles

On Thursday, March 23, 2017, Matt Thompson  wrote:

> All, I'm hoping one of you knows what I might be doing wrong here.  I'm
> trying to use Open MPI 2.1.0 for PGI 16.10 (Community Edition) on macOS.
> Now, I built it a la:
>
> http://www.pgroup.com/userforum/viewtopic.php?p=21105#21105
>
> and found that it built, but the resulting mpifort, etc were just not
> good. Couldn't even do Hello World.
>
> So, I thought I'd start from the beginning. I tried running:
>
> configure --disable-wrapper-rpath CC=pgcc CXX=pgc++ FC=pgfortran
> --prefix=/Users/mathomp4/installed/Compiler/pgi-16.10/openmpi/2.1.0
> but when I did I saw this:
>
> *** C++ compiler and preprocessor
> checking whether we are using the GNU C++ compiler... yes
> checking whether pgc++ accepts -g... yes
> checking dependency style of pgc++... none
> checking how to run the C++ preprocessor... pgc++ -E
> checking for the C++ compiler vendor... gnu
>
> Well, that's not the right vendor. So, I took a look at configure and I
> saw that at least some detection for PGI was a la:
>
>   pgCC* | pgcpp*)
> # Portland Group C++ compiler
> case `$CC -V` in
> *pgCC\ [1-5].* | *pgcpp\ [1-5].*)
>
>   pgCC* | pgcpp*)
> # Portland Group C++ compiler
> lt_prog_compiler_wl_CXX='-Wl,'
> lt_prog_compiler_pic_CXX='-fpic'
> lt_prog_compiler_static_CXX='-Bstatic'
> ;;
>
> Ah. PGI 16.9+ now use pgc++ to do C++ compiling, not pgcpp. So, I hacked
> configure so that references to pgCC (nonexistent on macOS) are gone and
> all pgcpp became pgc++, but:
>
> *** C++ compiler and preprocessor
> checking whether we are using the GNU C++ compiler... yes
> checking whether pgc++ accepts -g... yes
> checking dependency style of pgc++... none
> checking how to run the C++ preprocessor... pgc++ -E
> checking for the C++ compiler vendor... gnu
>
> Well, at this point, I think I'm stopping until I get help. Will this
> chunk of configure always return gnu for PGI? I know the C part returns
> 'portland group':
>
> *** C compiler and preprocessor
> checking for gcc... (cached) pgcc
> checking whether we are using the GNU C compiler... (cached) no
> checking whether pgcc accepts -g... (cached) yes
> checking for pgcc option to accept ISO C89... (cached) none needed
> checking whether pgcc understands -c and -o together... (cached) yes
> checking for pgcc option to accept ISO C99... none needed
> checking for the C compiler vendor... portland group
>
> so I thought the C++ section would as well. I also tried passing in
> --enable-mpi-cxx, but that did nothing.
>
> Is this just a red herring? My real concern is with pgfortran/mpifort, but
> I thought I'd start with this. If this is okay, I'll move on and detail the
> fortran issues I'm having.
>
> Matt
> --
> Matt Thompson
>
> Man Among Men
> Fulcrum of History
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] migrating to the MPI_F08 module

2017-03-22 Thread Gilles Gouaillardet
Tom,

what if you use
type(mpi_datatype) :: mpiint

Cheers,

Gilles

On Thursday, March 23, 2017, Tom Rosmond  wrote:

>
> Hello;
>
> I am converting some fortran 90/95 programs from the 'mpif.h' include file
> to the 'mpi_f08' model and have encountered a problem. Here is a simple
> test program that demonstrates it:
>
> 
> __-
>
>   program testf08
> !
>   use mpi_f08 !! F008
>
>   implicit none
>
>   type(mpi_comm) :: mpicomm   ! F08
> ! type(mpi_xxx) :: mpiint ! F08
>   integer :: mpiint
>   integer myrank,nproc,ierr
>
>   mpicomm= mpi_comm_world
>   mpiint= mpi_integer
>
>   call MPI_Init(ierr)
>   call MPI_Comm_rank(mpicomm,myrank,ierr)
>   call MPI_Comm_size(mpicomm,nproc,ierr)
> !
>   call mpi_finalize(ierr)
>   end
>
> 
> ___
>
> Compilation fails with this error:
>
> testf08.f90(13): error #6303: The assignment operation or the binary
> expression operation is invalid for the data types of the two operands.
>  [MPI_INTEGER]
>   mpiint= mpi_integer
>
> Clearly I need something like the commented line with the 'xxx' string as
> an analog to the typing of 'mpicomm' above it.  So, what is needed in place
> of 'xxx'?  The naive choice 'type(mpi_integer)' isn't correct.
>
> I am using OPENMPI 2.0.1 with Intel IFORT 12.0.3.174.
>
>
> T. Rosmond
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-29 Thread Gilles Gouaillardet

Hi,


yes, please open an issue on github, and post your configure and mpirun 
command lines.


ideally, could you try the latest v1.10.6 and v2.1.0 ?


if you can reproduce the issue with a smaller number of MPI tasks, that 
would be great too



Cheers,


Gilles


On 3/28/2017 11:19 PM, Götz Waschk wrote:

Hi everyone,

so how do I proceed with this problem, do you need more information?
Should I open a bug report on github?

Regards, Götz Waschk
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Failed to create a queue pair (QP) error

2017-03-25 Thread Gilles Gouaillardet
Iirc, there used to be a bug in Open MPI leading to such a false positive,
but I cannot remember the details.
I recommend you use at least the latest 1.10 (which is really a 1.8 + a few
more features and several bug fixes)
An other option is to simply +1 a mtt parameter and see if it helps

Cheers,

Gilles

On Sunday, March 26, 2017, Ilchenko Evgeniy  wrote:

> Hi all!
>
> I install older version openmpi 1.8 and get other error. For command
>
> mpirun -np 1 prog
>
> I get next output:
>
> --
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory. This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered. You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
> http://www.open-mpi.org/faq/?category=openfabrics#ib-..
>
> Local host: node107
> Registerable memory: 32768 MiB
> Total memory: 65459 MiB
>
> Your MPI job will continue, but may be behave poorly and/or hang.
> --
> hello from 0
> hello from 1
> [node107:48993] 1 more process has sent help message help-mpi- btl-openib.txt 
> / reg mem limit low
> [node107:48993] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
> help / error messages
>
> Other installed soft (Intel MPI library) work fine, without any errors and
> using all 64GB memory.
>
> For OpenMPI I don't use any PBS manager (Torque, slurm, etc.), I work on
> single node. I get to the node by command
>
> ssh node107
>
> For command
>
> cat /etc/security/limits.conf
>
> I get next output:
>
> ...
> * soft rss  200
> * soft stack200
> * hard stackunlimited
> * soft data unlimited
> * hard data unlimited
> * soft memlock unlimited
> * hard memlock unlimited
> * soft nproc   1
> * hard nproc   1
> * soft nofile   1
> * hard nofile   1
> * hard cpu unlimited
> * soft cpu unlimited
> ...
>
> For command
>
> cat /sys/module/mlx4_core/parameters/log_num_mtt
>
> I get output:
>
> 0
>
> Command:
>
> cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
>
> output:
>
> 3
>
> Command:
>
> getconf PAGESIZE
>
> output:
>
> 4096
>
> With this params and by formula
>
> max_reg_mem = (2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE
>
> max_reg_mem = 32768 bytes, nor 32GB, how specified in openmpi warning.
>
> I think that the cause of errors for different versions (1.8 and 2.1 ) is
> the same...
>
> What is the reason for this?
>
> What programs or settings may restrict memory for openmpi?
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] tuning sm/vader for large messages

2017-03-20 Thread Gilles Gouaillardet

Joshua,


George previously explained you are limited by the size of your level X 
cache.


that means that you might get optimal performance for a given message 
size, let's say


when everything fits in the L2 cache.

when you increase the message size, L2 cache is too small, and you have 
to move to the L3 cache,


which is obviously slower, and hence the drop in performance.


so send/recv a same small message twice might be faster than send/recv 
one twice larger message ...


just because of the cache size


Cheers,


Gilles


On 3/21/2017 9:11 AM, Joshua Mora wrote:

I don't want to push it up.
I just want to sustain the same bandwidth sending at that optimal size. I'd
like to see a constant bw from that size and above , not a significant drop
when I  cross a msg size.

-- Original Message --
Received: 05:11 PM CDT, 03/20/2017
From: George Bosilca 
To: Joshua Mora  Cc:
Open MPI Users 
Subject: Re: [OMPI users] tuning sm/vader for large messages


On Mon, Mar 20, 2017 at 12:45 PM, Joshua Mora  wrote:


If at certain x msg size you achieve X performance (MB/s) and at 2x msg
size
or higher you achieve Y performance, being Y significantly lower than X,
is it
possible to have a parameter that chops messages internally to x size in
order
to sustain X performance rather than let it choke ?


Unfortunately not. After a certain message size you hit the hardware memory
bandwidth limit, and no pipeline can help. To push it up you will need to
have a single copy instead of 2, but vader should do this by default as
long as KNEM or CMA are available on the machine.

   George.




sort of flow control to
avoid congestion ?
If that is possible, what would be that parameter for vader ?

Other than source code, is there any detailed documentation/studies of
vader
related parameters to improve the bandwidth at large message size ? I did
see
some documentation for sm, but not for vader.

Thanks,
Joshua


-- Original Message --
Received: 03:06 PM CDT, 03/17/2017
From: George Bosilca
To: Joshua Mora
Cc: Open MPI Users
Subject: Re: [OMPI users] tuning sm/vader for large messages















On Fri, Mar 17, 2017 at 3:33 PM, Joshua Mora

wrote:

Thanks for the quick reply.
This test is between 2 cores that are on different CPUs. Say data has

to

traverse coherent fabric (eg. QPI,UPI, cHT).
It has to go to main memory independently of cache size. Wrong

assumption
?

Depends on the usage pattern. Some benchmarks have options to

clean/flush

the cache before each round of tests.



Can data be evicted from cache and put into cache of second core on
different
CPU without placing it first in main memory ?


It would depend on the memory coherency protocol. Usually it gets

marked

as

shared, and as a result it might not need to be pushed into main memory
right away.



I am more thinking that there is a parameter that splits large

messages

in

smaller ones at 64k or 128k ?


Pipelining is not the answer to all situations. Once your messages are
larger than the caches, you already built memory pressure (by getting
outside the cache size) so the pipelining is bound by the memory

bandwidth.




This seems (wrong assumption ?) like the kind of parameter I would

need

for

large messages on a NIC. Coalescing data / large MTU,...


Sure, but there are hard limits imposed by the hardware, especially

with

regards to intranode communications. Once you saturate the memory bus,

you

hit a pretty hard limit.

   George.




Joshua








-- Original Message --
Received: 02:15 PM CDT, 03/17/2017
From: George Bosilca
To: Open MPI Users

Subject: Re: [OMPI users] tuning sm/vader for large messages


















Joshua,

In shared memory the bandwidth depends on many parameters,

including

the

process placement and the size of the different cache levels. In

your

particular case I guess after 128k you are outside the L2 cache

(1/2

of

the

cache in fact) and the bandwidth will drop as the data need to be

flushed

to main memory.

   George.



On Fri, Mar 17, 2017 at 1:47 PM, Joshua Mora

wrote:

Hello,
I am trying to get the max bw for shared memory communications

using

osu_[bw,bibw,mbw_mr] benchmarks.
I am observing a peak at ~64k/128K msg size and then drops

instead

of

sustaining it.
What parameters or linux config do I need to add to default

openmpi

settings
to get this improved ?
I am already using vader and knem.

See below one way bandwidth with peak at 64k.

# Size  Bandwidth (MB/s)
1   1.02
2   2.13
4   4.03
8   8.48
16 11.90
32 23.29
64 47.33
12888.08
256   136.77
512   245.06
1024  263.79
2048  405.49
4096 1040.46
8192 1964.81
16384

Re: [OMPI users] How to specify the use of RDMA?

2017-03-20 Thread Gilles Gouaillardet

Rodrigo,


i do not understand what you mean by "deactivate my IB interfaces"


the hostfile is only used in the wire-up phase

(to keep things simple, mpirun does

ssh  orted

under the hood, and  is coming from your hostfile.


so bottom line

mpirun --mca btl openib,self,sm -hostfile hosts_eth ...  (With IB 
interfaces down)

mpirun --mca btl openib,self,sm -hostfile hosts_ib0 ...

are expected to have the same performance


since you have some Infiniband hardware, there are two options

- you built Open MPI with MXM support, in this case you do not use the 
btl/openib, but pml/cm and mtl/mxm


  if you want to force the btl/openib, you have to

mpirun --mca pml ob1 --mca btl openib,self,sm ...

- you did not build Open MPI with MXM support, in this case, btl/openib 
is used for inter node communications,


and btl/sm is used for intra node communications.


if you want the performance numbers for tcp over ethernet, your command 
line is


mpirun --mca btl tcp,self,sm --mca pml ob1 --mca btl_tcp_if_include eth0 
-hostfile hosts_eth ...



Cheers,


Gilles


On 3/21/2017 2:07 AM, Rodrigo Escobar wrote:
Thanks Guilles for the quick reply. I think I am confused about what 
the openib BTL specifies.
What am I doing when I run with the openib BTL but specify my eth 
interface (...and deactivate my IB interfaces)?

Is not openib only for IB interfaces?
Am I using RDMA here?

These two commands give the same performance:
mpirun --mca btl openib,self,sm -hostfile hosts_eth ...  (With IB 
interfaces down)

mpirun --mca btl openib,self,sm -hostfile hosts_ib0 ...

Regards,
Rodrigo

On Mon, Mar 20, 2017 at 8:29 AM, Gilles Gouaillardet 
<gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> 
wrote:


You will get similar results with hosts_ib and hosts_eth

If you want to use tcp over ethernet, you have to
mpirun --mca btl tcp,self,sm --mca btl_tcp_if_include eth0 ...
If you want to use tcp over ib, then
mpirun --mca btl tcp,self,sm --mca btl_tcp_if_include ib0 ...

Keep in mind that IMB calls MPI_Init_thread(MPI_THREAD_MULTIPLE)
this is not only unnecessary here, but it also has an impact on
performances (with older versions, Open MPI felt back on IPoIB,
with v2.1rc the impact should be minimal)

If you simply
mpirun --mca btl tcp,self,sm ...
then Open MPI will multiplex messages on both ethernet and IPoIB

Cheers,

Gilles


Rodrigo Escobar <rodave...@gmail.com <mailto:rodave...@gmail.com>>
wrote:
Hi,
I have trying to run the Intel IMB benchmarks to compare the
performance of Infiniband (IB) vs Ethernet. However, I am not
seeing any difference in performance even for communication
intensive benchmarks, such as alltoallv.

Each one of my machines has one ethernet interface and an
infiniband interface. I use the following command to run the
alltoallv benchmark:
mpirun --mca btl self,openib,sm  -hostfile hosts_ib  IMB-MPI1
alltoallv

The hosts_ib file contains the IP addresses of the infiniband
interfaces, but the performance is the same when I deactivate the
IB interfaces and use my hosts_eth file which has the IP addresses
of the ethernet interfaces. Am I missing something? What is really
happening when I specify the openib btl if I am using the ethernet
network?

Thanks

___
users mailing list
users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
<https://rfd.newmexicoconsortium.org/mailman/listinfo/users>




___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] How to specify the use of RDMA?

2017-03-20 Thread Gilles Gouaillardet
You will get similar results with hosts_ib and hosts_eth

If you want to use tcp over ethernet, you have to
mpirun --mca btl tcp,self,sm --mca btl_tcp_if_include eth0 ...
If you want to use tcp over ib, then
mpirun --mca btl tcp,self,sm --mca btl_tcp_if_include ib0 ...

Keep in mind that IMB calls MPI_Init_thread(MPI_THREAD_MULTIPLE)
this is not only unnecessary here, but it also has an impact on performances 
(with older versions, Open MPI felt back on IPoIB,
with v2.1rc the impact should be minimal)

If you simply 
mpirun --mca btl tcp,self,sm ...
then Open MPI will multiplex messages on both ethernet and IPoIB

Cheers,

Gilles

Rodrigo Escobar  wrote:
>Hi, 
>
>I have trying to run the Intel IMB benchmarks to compare the performance of 
>Infiniband (IB) vs Ethernet. However, I am not seeing any difference in 
>performance even for communication intensive benchmarks, such as alltoallv. 
>
>
>Each one of my machines has one ethernet interface and an infiniband 
>interface. I use the following command to run the alltoallv benchmark:
>
>mpirun --mca btl self,openib,sm  -hostfile hosts_ib  IMB-MPI1 alltoallv 
>
>
>The hosts_ib file contains the IP addresses of the infiniband interfaces, but 
>the performance is the same when I deactivate the IB interfaces and use my 
>hosts_eth file which has the IP addresses of the ethernet interfaces. Am I 
>missing something? What is really happening when I specify the openib btl if I 
>am using the ethernet network?
>
>
>Thanks
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Install openmpi.2.0.2 with certain option

2017-04-04 Thread Gilles Gouaillardet

Note that might not be enough if hwloc detects nvml.

unfortunatly, there are only workarounds available for this :

1) edit opal/mca/hwloc/hwloc*/configure.m4 and add

enable_nvml=no

for example after enable_xml=yes

note you need recent autotools, and re-run autogen.pl --force

2) build Open MPI with an external hwloc that was configure'd with 
--disable-nvml



Cheers,


Gilles


On 4/4/2017 10:12 PM, r...@open-mpi.org wrote:

--without-cuda --without-slurm

should do the trick


On Apr 4, 2017, at 4:49 AM, Andrey Shtyrov via users  
wrote:

Dear openmpi communite,

I am need to install openmpi.2.0.2 on sistem with slurm, and cuda, without 
support it.

I have tried write ".configure  ... (--without-cuda or 
--enable-mca-no-build=cuda)"

but it havent solve my proble. What about switching off support of slurm,
i dont know what paramter would be wrotten.
What will advise about this proble?

And i will be glad if abbreviation of FOO would be decrypted.

Thank you for your help,
Shtyrov


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users



___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Compiler error with PGI: pgcc-Error-Unknown switch: -pthread

2017-04-03 Thread Gilles Gouaillardet
Hi,

The -pthread flag is likely pulled by libtool from the slurm libmpi.la
and/or libslurm.la
Workarounds are
- rebuild slurm with PGI
- remove the .la files (*.so and/or *.a are enough)
- wrap the PGI compiler to ignore the -pthread option

Hope this helps

Gilles

On Monday, April 3, 2017, Prentice Bisbal  wrote:

> Greeting Open MPI users! After being off this list for several years, I'm
> back! And I need help:
>
> I'm trying to compile OpenMPI 1.10.3 with the PGI compilers, version 17.3.
> I'm using the following configure options:
>
> ./configure \
>   --prefix=/usr/pppl/pgi/17.3-pkgs/openmpi-1.10.3 \
>   --disable-silent-rules \
>   --enable-shared \
>   --enable-static \
>   --enable-mpi-thread-multiple \
>   --with-pmi=/usr/pppl/slurm/15.08.8 \
>   --with-hwloc \
>   --with-verbs \
>   --with-slurm \
>   --with-psm \
>   CC=pgcc \
>   CFLAGS="-tp x64 -fast" \
>   CXX=pgc++ \
>   CXXFLAGS="-tp x64 -fast" \
>   FC=pgfortran \
>   FCFLAGS="-tp x64 -fast" \
>   2>&1 | tee configure.log
>
> Which leads to this error  from libtool during make:
>
> pgcc-Error-Unknown switch: -pthread
>
> I've searched the archives, which ultimately lead to this work around from
> 2009:
>
> https://www.open-mpi.org/community/lists/users/2009/04/8724.php
>
> Interestingly, I participated in the discussion that lead to that
> workaround, stating that I had no problem compiling Open MPI with PGI v9.
> I'm assuming the problem now is that I'm specifying
> --enable-mpi-thread-multiple, which I'm doing because a user requested that
> feature.
>
> It's been exactly 8 years and 2 days since that workaround was posted to
> the list. Please tell me a better way of dealing with this issue than
> writing a 'fakepgf90' script. Any suggestions?
>
>
> --
> Prentice
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-12 Thread Gilles Gouaillardet
That should be a two steps tango
- Open MPI bind a MPI task to a socket
- the OpenMP runtime bind OpenMP threads to cores (or hyper threads) inside
the socket assigned by Open MPI

which compiler are you using ?
do you set some environment variables to direct OpenMP to bind threads ?

Also, how do you measure the hyperthread a given OpenMP thread is on ?
is it the hyperthread used at a given time ? If yes, then the thread might
migrate unless it was pinned by the OpenMP runtime.

If you are not sure, please post the source of your program so we can have
a look

Last but not least, as long as OpenMP threads are pinned to distinct cores,
you should not worry about them migrating between hyperthreads from the
same core.

Cheers,

Gilles

On Wednesday, April 12, 2017, Heinz-Ado Arnolds 
wrote:

> Dear rhc,
>
> to make it more clear what I try to achieve, I collected some examples for
> several combinations of command line options. Would be great if you find
> time to look to these below. The most promise one is example "4".
>
> I'd like to have 4 MPI jobs starting 1 OpenMP job each with 10 threads,
> running on 2 nodes, each having 2 sockets, with 10 cores & 10 hwthreads.
> Only 10 cores (no hwthreads) should be used on each socket.
>
> 4 MPI -> 1 OpenMP with 10 thread (i.e. 4x10 threads)
> 2 nodes, 2 sockets each, 10 cores & 10 hwthreads each
>
> 1. mpirun -np 4 --map-by ppr:2:node --mca plm_rsh_agent "qrsh"
> -report-bindings ./myid
>
>Machines  :
>pascal-2-05...DE 20
>pascal-1-03...DE 20
>
>[pascal-2-05:28817] MCW rank 0 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
>[pascal-2-05:28817] MCW rank 1 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>[pascal-1-03:19256] MCW rank 2 bound to socket 0[core 0[hwt 0-1]],
> socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt
> 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core
> 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket
> 0[core 9[hwt 0-1]]: [BB/BB/BB/BB/BB/BB/BB/BB/BB/
> BB][../../../../../../../../../..]
>[pascal-1-03:19256] MCW rank 3 bound to socket 1[core 10[hwt 0-1]],
> socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core
> 13[hwt 0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]],
> socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core
> 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../..
> ][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>MPI Instance 0001 of 0004 is on pascal-2-05, Cpus_allowed_list:
> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0001(pid
> 28833), 018, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0002(pid
> 28833), 014, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0003(pid
> 28833), 028, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0004(pid
> 28833), 012, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0005(pid
> 28833), 030, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0006(pid
> 28833), 016, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0007(pid
> 28833), 038, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0008(pid
> 28833), 034, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0009(pid
> 28833), 020, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0010(pid
> 28833), 022, Cpus_allowed_list:0,2,4,6,8,10,12,14,16,18,20,
> 22,24,26,28,30,32,34,36,38
>MPI Instance 0002 of 0004 is on pascal-2-05, Cpus_allowed_list:
> 

Re: [OMPI users] Failed to create a queue pair (QP) error

2017-04-08 Thread Gilles Gouaillardet
What happens is mpirun does under the hood
 orted
And your remote_exec does not propagate LD_LIBRARY_PATH
one option is to configure your remote_exec to do so, but I'd rather suggest
you re-configure ompi with --enable-orterun-prefix-by-default
If your remote_exec is ssh (if you are not running under a supported batch
manager), then
ssh node188 ldd $path_to_openmpi_bin/orted
should show zero unresolved libraries

Cheers,

Gilles

On Sunday, April 9, 2017, Ilchenko Evgeniy  wrote:

> Hi!
>
> Problem with random segfault for java-programs solved by adding mca
> options:
>
> $path_to_openmpi_bin/mpirun -np 1  -mca btl self,sm,openib
>  $path_to_java_bin/java randomTest
>
> Thanks to Eshsou Hashba and Michael Kalugin!
>
>
> But i get other problems!
>
> If I start mpirun from manager-node (without ssh-login to calculation node)
>
> $path_to_openmpi_bin/mpirun  -np 2 -host node188,node189 -mca btl
> self,sm,openib   $path_to_java_bin/java randomTest
>
> I get next error:
>
>
> $openmpi1.10_folder/bin/orted: error while loading shared libraries:
> libimf.so: cannot open shared object file: No such file or directory
> --
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp
> (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to
> use.
>
> *  compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
> --
>
> If I throw LD_LIBRARY_PATH (that contain path to  libimf.so) via -x option
> to mpirun:
>
> $path_to_openmpi_bin/mpirun  -x LD_LIBRARY_PATH -np 2 -host
> node188,node189 -mca btl self,sm,openib   $path_to_java_bin/java randomTest
>
> then I get same error (orted: error while loading shared libraries:
> libimf.so: cannot open shared object file: No such file or directory).
>
> How I can throw lib path for spawned mpi processes and orted?
> I don't have root-privileges on this cluster.
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Build Failed - OpenMPI 1.10.6 / Ubuntu 16.04 / Oracle Studio 12.5

2017-04-08 Thread Gilles Gouaillardet
So it seems
OPAL_HAVE_POSIX_THREADS
is not defined, and that should never happen !

Can you please compress and post (or upload into gist or similar) your
- config.log
- opal/include/opal_config.h

Cheers,

Gilles

On Sunday, April 9, 2017, Travis W. Drayna  wrote:

> Gilles,
>
> Thank you for the quick response. Your suggestion fixed the first issue.
> The 'man clock_gettime' command brings up the man pages on both systems.
>
> I added '#include ' to the beginning of 
> opal/mca/timer/linux/timer_linux_component.c
> and the build now makes it past the first error.
>
>
> The new failure point is on both test systems is the following:
>
> Making all in mca/osc/sm
> make[2]: Entering directory '/home/xxx-admin/INSTALL/
> openmpi-1.10.6/BUILD_SUN/ompi/mca/osc/sm'
>   CC   osc_sm_comm.lo
>   CC   osc_sm_component.lo
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line
> 136"../../../../../opal/include/opal/sys/amd64/atomic.h", : warning:
> parameter in inline asm statement unused: %3
> line 136: warning: parameter in inline asm statement unused: %3
> "../../../../../opal/include/opal/sys/amd64/atomic.h",
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line 182: warning:
> line 182: warning: parameter in inline asm statement unused: %2
> parameter in inline asm statement unused: %2
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line 203: warning:
> parameter in inline asm statement unused: %2
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line 203: warning:
> parameter in inline asm statement unused: %2
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line 224: warning:
> parameter in inline asm statement unused: %2
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line 224: warning:
> parameter in inline asm statement unused: %2
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line 245: warning:
> parameter in inline asm statement unused: %2
> "../../../../../opal/include/opal/sys/amd64/atomic.h", line 245: warning:
> parameter in inline asm statement unused: %2
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 332: undefined
> struct/union member: my_sense
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 358: undefined
> struct/union member: mtx
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 358: warning:
> argument #1 is incompatible with prototype:
> prototype: pointer to union  {struct __pthread_mutex_s {..} __data,
> array[40] of char __size, long __align} : "/usr/include/pthread.h", line 749
> argument : pointer to int
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 364: improper
> member use: cond
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 364: warning:
> argument #1 is incompatible with prototype:
> prototype: pointer to union  {struct  {..} __data, array[48] of char
> __size, long long __align} : "/usr/include/pthread.h", line 968
> argument : pointer to struct opal_condition_t {struct opal_object_t
> {..} super, volatile int c_waiting, volatile int c_signaled}
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 370: undefined
> struct/union member: sense
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 370: improper
> member use: my_sense
> "../../../../../ompi/mca/osc/sm/osc_sm_component.c", line 371: improper
> member use: count
> cc: acomp failed for ../../../../../ompi/mca/osc/sm/osc_sm_component.c
> Makefile:1754: recipe for target 'osc_sm_component.lo' failed
> make[2]: *** [osc_sm_component.lo] Error 1
> make[2]: *** Waiting for unfinished jobs
> make[2]: Leaving directory '/home/xxx-admin/INSTALL/
> openmpi-1.10.6/BUILD_SUN/ompi/mca/osc/sm'
> Makefile:3261: recipe for target 'all-recursive' failed
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory '/home/xxx-admin/INSTALL/
> openmpi-1.10.6/BUILD_SUN/ompi'
> Makefile:1777: recipe for target 'all-recursive' failed
> make: *** [all-recursive] Error 1
>
>
> Thanks again,
> Travis
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Cannot run mpirun on ubuntu 16.10

2017-04-20 Thread Gilles Gouaillardet
John,

can you run
free
before the first command and make sure you have all the physical and
available memory you expect  ?

then, after a failed
mpirun -np 1 ./helloWorld
can you run
dmesg
and look for messages from the OOM killer ?
that would indicate you are running out of memory.

maybe some unexpected things happen during the upgrade
(new services are now enabled, process limits were changed, new kernel is
way more memory hungry, swap was disabled, memory oversubscription
parameters were updated, or some of your memory is no more detected)

Cheers,

Gilles


On Friday, April 21, 2017, john jomo via users 
wrote:

> Hi guys,
>
> I updated my ubuntu version to ubuntu 16.10 and cannot execute programs
> with mpirun anymore :(.
> I did not encounter this problem with ubuntu 16.04 and would really
> appreciate some help.
>
> I am running ubuntu 16.10 with the kernel version 4.8.0-46-generic on a
> Lenovo y50-70 laptop with the current version of openmpi available via
> apt-get libopenmpi1.10 openmpi-dev openmpi-bin
>
> my code compiles and links perfectly, I only get trouble when trying to
> launch with mpirun. I can even run a simple hello world mpi code.
>
> *running: mpirun -np 2 ./helloWord:*
> When I run mpirun -np 2 ./progname my program does not even launch. The
> terminal remains blank and I cannot even kill the processes in htop with
> kill PID or kill -9 PID. Rebooting or shutting down ubuntu via the shutdown
> button on the ubuntu desktop has no effect and I have to manually shutdown
> by holing down the power button.
>
> *running mpirun -np 1 hostname:*
> It works on the first run but subsequent runs are killed immediately. The
> output looks something like this
>
> $ mpirun -np 1 hostname
> Killed
>
>
> Has anyone experienced something like this before?
>
> Thanks in advance
>
> John.
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] fatal error for openmpi-master-201704200300-ded63c with SuSE Linux and gcc-6.3.0

2017-04-20 Thread Gilles Gouaillardet

The PR simply disables nvml in hwloc is CUDA is disabled in Open MPI.

it also add cuda directory to CPPFLAGS, so there should be no need to 
manually add -I/usr/local/cuda/include to CPPFLAGS.



Siegmar,

could you please post your config.log

also, is there a nvml.h file in /usr/local/cuda/include

last but not least, can you please run

make V=1

and post the output related to the compilation of topology-nvml.lo


Thanks and regards,


Gilles


On 4/21/2017 3:07 AM, r...@open-mpi.org wrote:
This is a known issue due to something in the NVIDIA library and it’s 
interactions with hwloc. Your tarball tag indicates you should have 
the attempted fix in it, so likely that wasn’t adequate. See 
https://github.com/open-mpi/ompi/pull/3283 for the discussion



On Apr 20, 2017, at 8:11 AM, Siegmar Gross 
> wrote:


Hi,

I tried to install openmpi-master-201704200300-ded63c on my "SUSE Linux
Enterprise Server 12.2 (x86_64)" with Sun C 5.14 and gcc-6.3.0.
Unfortunately, "make" breaks with the following error for gcc. I've had
no problems with cc.


loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 136 grep 
topology log.make.Linux.x86_64.64_gcc

 CC   topology.lo
 CC   topology-noos.lo
 CC   topology-synthetic.lo
 CC   topology-custom.lo
 CC   topology-xml.lo
 CC   topology-xml-nolibxml.lo
 CC   topology-pci.lo
 CC   topology-nvml.lo
../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-nvml.c:14:18: 
fatal error: nvml.h: No such file or directory

Makefile:2181: recipe for target 'topology-nvml.lo' failed
make[4]: *** [topology-nvml.lo] Error 1
loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 137





loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 137 grep 
topology 
../openmpi-master-201704200300-ded63c5-Linux.x86_64.64_cc/log.make.Linux.x86_64.64_cc

 CC   topology.lo
 CC   topology-noos.lo
 CC   topology-synthetic.lo
"../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-synthetic.c", 
line 851: warning: initializer will be sign-extended: -1

 CC   topology-custom.lo
"../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-custom.c", 
line 88: warning: initializer will be sign-extended: -1

 CC   topology-xml.lo
"../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-xml.c", 
line 1815: warning: initializer will be sign-extended: -1

 CC   topology-xml-nolibxml.lo
 CC   topology-pci.lo
 CC   topology-nvml.lo
 CC   topology-linux.lo
"../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-linux.c", 
line 2919: warning: initializer will be sign-extended: -1
"../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-linux.c", 
line 2919: warning: initializer will be sign-extended: -1
"../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-linux.c", 
line 2919: warning: initializer will be sign-extended: -1

 CC   topology-hardwired.lo
 CC   topology-x86.lo
"../../../../../../../openmpi-master-201704200300-ded63c5/opal/mca/hwloc/hwloc1116/hwloc/src/topology-x86.c", 
line 122: warning: initializer will be sign-extended: -1

loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 138




I used the following commands to configure the package.

loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 145 head 
-7 config.log |tail -1
 $ ../openmpi-master-201704200300-ded63c5/configure 
--prefix=/usr/local/openmpi-master_64_gcc 
--libdir=/usr/local/openmpi-master_64_gcc/lib64 
--with-jdk-bindir=/usr/local/jdk1.8.0_66/bin 
--with-jdk-headers=/usr/local/jdk1.8.0_66/include 
JAVA_HOME=/usr/local/jdk1.8.0_66 LDFLAGS=-m64 CC=gcc CXX=g++ 
FC=gfortran CFLAGS=-m64 CXXFLAGS=-m64 FCFLAGS=-m64 CPP=cpp CXXCPP=cpp 
--enable-mpi-cxx --enable-cxx-exceptions --enable-mpi-java 
--with-cuda=/usr/local/cuda --with-valgrind=/usr/local/valgrind 
--with-hwloc=internal --without-verbs --with-wrapper-cflags=-std=c11 
-m64 --with-wrapper-cxxflags=-m64 --with-wrapper-fcflags=-m64 
--enable-debug

loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 146




loki openmpi-master-201704200300-ded63c5-Linux.x86_64.64_gcc 146 head 
-7 
../openmpi-master-201704200300-ded63c5-Linux.x86_64.64_cc/config.log 
| tail -1
 $ ../openmpi-master-201704200300-ded63c5/configure 
--prefix=/usr/local/openmpi-master_64_cc 
--libdir=/usr/local/openmpi-master_64_cc/lib64 
--with-jdk-bindir=/usr/local/jdk1.8.0_66/bin 
--with-jdk-headers=/usr/local/jdk1.8.0_66/include 
JAVA_HOME=/usr/local/jdk1.8.0_66 LDFLAGS=-m64 -mt -Wl,-z 
-Wl,noexecstack -L/usr/local/lib64 -L/usr/local/cuda/lib64 CC=cc 
CXX=CC FC=f95 CFLAGS=-m64 -mt -I/usr/local/include 

Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-13 Thread Gilles Gouaillardet

Heinz-Ado,


it seems the OpenMP runtime did *not* bind the OMP threads at all as 
requested,


and the root cause could be the OMP_PROC_BIND environment variable was 
not propagated


can you try

mpirun -x OMP_PROC_BIND ...

and see if it helps ?


Cheers,


On 4/13/2017 12:23 AM, Heinz-Ado Arnolds wrote:

Dear Gilles,

thanks for your answer.

- compiler: gcc-6.3.0
- OpenMP environment vars: OMP_PROC_BIND=true, GOMP_CPU_AFFINITY not set
- hyperthread a given OpenMP thread is on: it's printed in the output below as a 3-digit 
number after the first ",", read by sched_getcpu() in the OpenMP test code
- the migration between cores/hyperthreads should be prevented by 
OMP_PROC_BIND=true
- I didn't find a migration, but the similar use of one core/hyperthread by two OpenMP threads in 
example "4"/"MPI Instance 0002": 011/031 are both on core #11.

Are there any hints how to cleanly transfer the OpenMPI binding to the OpenMP 
tasks?

Thanks and kind regards,

Ado

On 12.04.2017 15:40, Gilles Gouaillardet wrote:

That should be a two steps tango
- Open MPI bind a MPI task to a socket
- the OpenMP runtime bind OpenMP threads to cores (or hyper threads) inside the 
socket assigned by Open MPI

which compiler are you using ?
do you set some environment variables to direct OpenMP to bind threads ?

Also, how do you measure the hyperthread a given OpenMP thread is on ?
is it the hyperthread used at a given time ? If yes, then the thread might 
migrate unless it was pinned by the OpenMP runtime.

If you are not sure, please post the source of your program so we can have a 
look

Last but not least, as long as OpenMP threads are pinned to distinct cores, you 
should not worry about them migrating between hyperthreads from the same core.

Cheers,

Gilles

On Wednesday, April 12, 2017, Heinz-Ado Arnolds <arno...@mpa-garching.mpg.de 
<mailto:arno...@mpa-garching.mpg.de>> wrote:

 Dear rhc,

 to make it more clear what I try to achieve, I collected some examples for several 
combinations of command line options. Would be great if you find time to look to these 
below. The most promise one is example "4".

 I'd like to have 4 MPI jobs starting 1 OpenMP job each with 10 threads, 
running on 2 nodes, each having 2 sockets, with 10 cores & 10 hwthreads. Only 
10 cores (no hwthreads) should be used on each socket.

 4 MPI -> 1 OpenMP with 10 thread (i.e. 4x10 threads)
 2 nodes, 2 sockets each, 10 cores & 10 hwthreads each

 1. mpirun -np 4 --map-by ppr:2:node --mca plm_rsh_agent "qrsh" 
-report-bindings ./myid

Machines  :
pascal-2-05...DE 20
pascal-1-03...DE 20

[pascal-2-05:28817] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], 
socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 
0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 
6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 
0[core 9[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
[pascal-2-05:28817] MCW rank 1 bound to socket 1[core 10[hwt 0-1]], 
socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 
0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], socket 1[core 
16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt 0-1]], socket 
1[core 19[hwt 0-1]]: 
[../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
[pascal-1-03:19256] MCW rank 2 bound to socket 0[core 0[hwt 0-1]], 
socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 
0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]], socket 0[core 
6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 
0[core 9[hwt 0-1]]: 
[BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
[pascal-1-03:19256] MCW rank 3 bound to socket 1[core 10[hwt 0-1]], 
socket 1[core 11[hwt 0-1]], socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 
0-1]], socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]], socket 1[core 
16[hwt 0-1]], socket 1[core 17[hwt 0-1]], socket 1[core 18[hwt 0-1]], socket 
1[core 19[hwt 0-1]]: 
[../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
MPI Instance 0001 of 0004 is on pascal-2-05, Cpus_allowed_list:  
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0001(pid 
28833), 018, Cpus_allowed_list:
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0002(pid 
28833), 014, Cpus_allowed_list:
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
MPI Instance 0001 of 0004 is on pascal-2-05: MP thread  #0003(pid 
28833), 028, Cpus_allowed_list:
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
MPI Instance 0001 of 000

Re: [OMPI users] Run-time issues with openmpi-2.0.2 and gcc

2017-04-13 Thread Gilles Gouaillardet
Vincent,

Can you try a small program such as examples/ring_c.c ?
Does your app do MPI_Comm_spawn and friends ?
Can you post your mpirun command line ? Are you using a batch manager ?

This error message is typical of unresolved libraries.
(E.g. "ssh host ldd orted" fails to resolve some libs because
LD_LIBRARY_PATH is not propagated)
We usually recommend to configure with --enable-mpirun-prefix-by-default.
That being said, that does not match your claim app worked for about 5
minutes

Cheers,

Gilles

On Thursday, April 13, 2017, Vincent Drach 
wrote:

>
> Dear mailing list,
>
> We are experimenting run time failure  on a small cluster with
> openmpi-2.0.2 and gcc 6.3 and gcc 5.4.
> The job start normally and lots of communications are performed. After
> 5-10 minutes the connection to the hosts is closed and
> the following error message is reported:
> --
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp
> (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to
> use.
>
> *  compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
>
>
>
> The issue does not seem to be due to the infiniband configuration, because
> the job also crash when using tcp protocol.
>
> Do you have any clue of what could be the issue ?
>
>
> Thanks a lot,
>
> Vincent
>
> --
> 
>
> This email and any files with it are confidential and intended solely for
> the use of the recipient to whom it is addressed. If you are not the
> intended recipient then copying, distribution or other use of the
> information contained is strictly prohibited and you should not rely on it.
> If you have received this email in error please let the sender know
> immediately and delete it from your system(s). Internet emails are not
> necessarily secure. While we take every care, Plymouth University accepts
> no responsibility for viruses and it is your responsibility to scan emails
> and their attachments. Plymouth University does not accept responsibility
> for any changes made after it was sent. Nothing in this email or its
> attachments constitutes an order for goods or services unless accompanied
> by an official order form.
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] openmpi-2.0.2

2017-04-19 Thread Gilles Gouaillardet

Jim,


can you please post your configure command line and test output on both 
systems ?




fwiw, Open MPI strictly sticks to the (current) MPI standard regarding 
MPI_DATATYPE_NULL


(see 
http://lists.mpi-forum.org/pipermail/mpi-forum/2016-January/006417.html)



there have been some attempts to deviate from the MPI standard

(e.g. implement what the standard "should" be versus what the standard 
says)


and they were all crushed at a very early stage in Open MPI.



Cheers,


Gilles


On 4/20/2017 2:53 AM, Jim Edwards wrote:

Hi,

I have openmpi-2.0.2 builds on two different machines and I have a 
test code which works on one machine and does not on the other 
machine.  I'm struggling to understand why and I hope that by posting 
here someone may have some insight.


The test is using mpi derived data types and mpi_alltoallw on 4 
tasks.  On the machine that fails it appears to ignore the 
displacement in the derived datatype defined on task 0 and just send 
0-3 to all tasks.The failing machine is built against gcc 5.4.0, 
the working machine has both intel 16.0.3 and gcc 6.3.0 builds.


#include "mpi.h"

#include 


int main(int argc, char *argv[])

{

int rank, size;

MPI_Datatype type[4], type2[4];

int displacement[1];

int sbuffer[16];

int rbuffer[4];

MPI_Status status;

int scnts[4], sdispls[4], rcnts[4], rdispls[4];

MPI_Init(, );

MPI_Comm_size(MPI_COMM_WORLD, );

if (size < 4)

{

printf("Please run with 4 processes.\n");

MPI_Finalize();

return 1;

}

MPI_Comm_rank(MPI_COMM_WORLD, );


/* task 0 has sbuffer of size 16 and we are going to send 4 values 
to each of tasks 0-3, offsetting in each


   case so that the expected result is

task[0] 0-3

task[1] 4-7

task[2] 8-11

task[3] 12-15

*/



for( int i=0; i

Re: [OMPI users] "No objects of the specified type were found on at least one node"

2017-03-09 Thread Gilles Gouaillardet

Angel,


i suggest you get an xml topo with

hwloc --of xml

on both your "exotic" POWER platform and a more standard and recent one.

then you can manually edit the xml topology and add the missing objects.


finally, you can pass this to Open MPI like this


mpirun --mca hwloc_base_topo_file mytopo.xml ...


Cheers,


Gilles



On 3/10/2017 12:19 AM, Brice Goglin wrote:
Ok, that's a very old kernel on a very old POWER processor, it's 
expected that hwloc doesn't get much topology information, and it's 
then expected that OpenMPI cannot apply most binding policies.


Brice



Le 09/03/2017 16:12, Angel de Vicente a écrit :
Can this help? If you think any other information could be relevant, 
let me know.


Cheers,
Ángel

cat /proc/cpuinfo
processor   : 0
cpu : PPC970MP, altivec supported
clock   : 2297.70MHz
revision: 1.1 (pvr 0044 0101)

[4 processors]

timebase: 14318000
machine : CHRP IBM,8844-Z0C

uname -a
Linux login1 2.6.16.60-perfctr-0.42.4-ppc64 #1 SMP Fri Aug 21 
15:25:15 CEST 2009 ppc64 ppc64 ppc64 GNU/Linux


lsb_release -a
Distributor ID: SUSE LINUX
Description:SUSE Linux Enterprise Server 10 (ppc)
Release:10


On 9 March 2017 at 15:04, Brice Goglin > wrote:


What's this machine made of? (processor, etc)
What kernel are you running ?

Getting no "socket" or "package" at all is quite rare these days.

Brice




Le 09/03/2017 15:28, Angel de Vicente a écrit :
> Hi again,
>
> thanks for your help. I installed the latest OpenMPI (2.0.2).
>
> lstopo output:
>
> ,
> | lstopo --version
> | lstopo 1.11.2
> |
> | lstopo
> | Machine (7861MB)
> |   L2 L#0 (1024KB) + L1d L#0 (32KB) + L1i L#0 (64KB) + Core
L#0 + PU L#0
> |   (P#0)
> |   L2 L#1 (1024KB) + L1d L#1 (32KB) + L1i L#1 (64KB) + Core
L#1 + PU L#1
> |   (P#1)
> |   L2 L#2 (1024KB) + L1d L#2 (32KB) + L1i L#2 (64KB) + Core
L#2 + PU L#2
> |   (P#2)
> |   L2 L#3 (1024KB) + L1d L#3 (32KB) + L1i L#3 (64KB) + Core
L#3 + PU L#3
> |   (P#3)
> |   HostBridge L#0
> | PCIBridge
> |   PCI 1014:028c
> | Block L#0 "sda"
> |   PCI 14c1:8043
> | Net L#1 "myri0"
> | PCIBridge
> |   PCI 14e4:166b
> | Net L#2 "eth0"
> |   PCI 14e4:166b
> | Net L#3 "eth1"
> | PCIBridge
> |   PCI 1002:515e
> `
>
> I started with GCC 6.3.0, compiled OpenMPI 2.0.2 with it, and
then HDF5
> 1.10.0-patch1 with it. Our code then compiles OK with it, and
it runs OK
> without "mpirun":
>
> ,
> | ./mancha3D
> | __   ___
> |/'\_/`\ /\ \  /'__`\ /\  _ `\
> |   /\  \ __  ___ ___\ \ \___ __  /\_\L\ \\
\ \/\ \
> |   \ \ \__\ \  /'__`\  /' _ `\  /'___\ \  _ `\ 
/'__`\\/_/_\_<_\ \ \ \ \

> |\ \ \_/\ \/\ \L\.\_/\ \/\ \/\ \__/\ \ \ \ \/\ \L\.\_/\ \L\
\\ \ \_\ \
> | \ \_\\ \_\ \__/.\_\ \_\ \_\ \\\ \_\ \_\ \__/.\_\
\/ \ \/
> |  \/_/ \/_/\/__/\/_/\/_/\/_/\//
\/_/\/_/\/__/\/_/\/___/   \/___/
> |
> |  ./mancha3D should be given the name of a control file as
argument.
> `
>
>
>
>
> But it complains as before when run with mpirun
>
> ,
> | mpirun --map-by socket --bind-to socket -np 1 ./mancha3D
> |
--
> | No objects of the specified type were found on at least one node:
> |
> |   Type: Package
> |   Node: login1
> |
> | The map cannot be done as specified.
> |
--
> `
>
>
> If I submit it directly with srun, then the code runs, but not in
> parallel, and two individual copies of the code are started:
>
> ,
> | srun -n 2 ./mancha3D
> | __   ___
> |/'\_/`\ /\ \  /'__`\ /\  _ `\
> |   /\  \ __  ___ ___\ \ \___ __  /\_\L\ \\
\ \/\ \
> |   \ \ \__\ \  /'__`\  /' _ `\  /'___\ \  _ `\ 
/'__`\\/_/_\_<_\ \ \ \ \

> |\ \ \_/\ \/\ \L\.\_/\ \/\ \/\ \__/\ \ \ \ \/\ \L\.\_/\ \L\
\\ \ \_\ \
> | \ \_\\ \_\ \__/.\_\ \_\ \_\ \\\ \_\ \_\ \__/.\_\
\/ \ \/
> |  \/_/ \/_/\/__/\/_/\/_/\/_/\//
\/_/\/_/\/__/\/_/\/___/   \/___/
> |
> |  should be given the name of a control file as argument.
> | __   ___
> |/'\_/`\ /\ \  /'__`\ /\  _ `\
> |   /\  \ __  ___ ___\ \ \___ __  /\_\L\ 

Re: [OMPI users] "No objects of the specified type were found on at least one node"?

2017-03-09 Thread Gilles Gouaillardet
which version of ompi are you running ?

this error can occur on systems with no NUMA object (e.g. single socket
with hwloc < 2)
as a workaround, you can
mpirun --map-by socket ...

iirc, this has been fixed

Cheers,

Gilles

On Thursday, March 9, 2017, Angel de Vicente  wrote:

> Hi,
>
> I'm trying to get OpenMPI running in a new machine, and I came accross
> an error message that I hadn't seen before.
>
> ,
> | can@login1:> mpirun -np 1 ./code config.txt
> | 
> --
> | No objects of the specified type were found on at least one node:
> |
> |   Type: Package
> |   Node: login1
> |
> | The map cannot be done as specified.
> | 
> --
> `
>
> Some details: in this machine we have gcc_6.0.3, and with it I installed
> OpenMPI (v. 2.0.1). The compilation of OpenMPI went without (obvious)
> errors, and I managed to compile my code without problems (if instead
> of "mpirun -np 1 ./code" I just run the code directly there are no
> issues).
>
> But if I try to use mpirun in the login node of the cluster I get this
> message. If I submit the job to the scheduler (the cluster uses slurm) I
> get the same messsage, but the Node information is obviously different,
> giving the name of one of the compute nodes.
>
> Any pointers as to what can be going on? Many thanks,
> --
> Ángel de Vicente
> http://www.iac.es/galeria/angelv/
> 
> -
> ADVERTENCIA: Sobre la privacidad y cumplimiento de la Ley de Protección de
> Datos, acceda a http://www.iac.es/disclaimer.php
> WARNING: For more information on privacy and fulfilment of the Law
> concerning the Protection of Data, consult http://www.iac.es/disclaimer.
> php?lang=en
>
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] "No objects of the specified type were found on at least one node"

2017-03-09 Thread Gilles Gouaillardet
Can you run
lstopo
in your machine, and post the output ?

can you also try
mpirun --map-by socket --bind-to socket ...
and see if it helps ?

Cheers,

Gilles



On Thursday, March 9, 2017, Angel de Vicente <ang...@iac.es> wrote:

> Hi,
>
> Gilles Gouaillardet <gilles.gouaillar...@gmail.com <javascript:;>> writes:
> > which version of ompi are you running ?
>
> 2.0.1
>
> > this error can occur on systems with no NUMA object (e.g. single
> > socket with hwloc < 2)
> > as a workaround, you can
> > mpirun --map-by socket ...
>
> with --map-by socket I get exactly the same issue (both in the login and
> the compute node)
>
> I will upgrade to 2.0.2 and see if this changes something.
>
> Thanks,
> --
> Ángel de Vicente
> http://www.iac.es/galeria/angelv/
> 
> -
> ADVERTENCIA: Sobre la privacidad y cumplimiento de la Ley de Protección de
> Datos, acceda a http://www.iac.es/disclaimer.php
> WARNING: For more information on privacy and fulfilment of the Law
> concerning the Protection of Data, consult http://www.iac.es/disclaimer.
> php?lang=en
>
> ___
> users mailing list
> users@lists.open-mpi.org <javascript:;>
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI Segfault when psm is enabled?

2017-03-11 Thread Gilles Gouaillardet
PSM is the infinipath driver, so unless you have some infinipath hardware,
you can safely disable it

Cheers,

Gilles

On Sunday, March 12, 2017, Saliya Ekanayake  wrote:

> Hi,
>
> I've been trying to resolve a segfault that kept occurring with OpenMPI
> Java binding. I found this thread in OpenMPI mailing list (
> https://www.mail-archive.com/users@lists.open-mpi.org/msg27524.html) that
> suggested to disable psm as a solution.
>
> It worked, but I would like to know what this module is and is there a
> disadvantage in terms of performance by disabling it?
>
> Thank you,
> Saliya
>
> --
> Saliya Ekanayake, Ph.D
> Applied Computer Scientist
> Network Dynamics and Simulation Science Laboratory (NDSSL)
> Virginia Tech, Blacksburg
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] users Digest, Vol 3729, Issue 2

2017-03-02 Thread Gilles Gouaillardet

Hi,


there is likely something wrong in Open MPI (i will follow up in the 
devel ML)



meanwhile, you can

mpirun --mca opal_set_max_sys_limits core:unlimited ...


Cheers,


Gilles


On 3/3/2017 1:01 PM, gzzh...@buaa.edu.cn wrote:

Hi Jeff:
Thanks for your suggestions.
1. I have execute the command " find / -name core*" in each node, 
no coredump file was found. and no coredump files in the /home, /core 
and pwd file path as well.
2. I have changed core_pattern follow your advice, still no 
expected coredump file.

3. I didn't use any resource scheduler,  but just ssh.
At last I tried to add setrlimit(2) in my code, it worked! I got the 
coredump file which I want. But I don't know why.
If I don't want to modify my code, how can I to setup the config or 
something to achieve coredump?

Here is the result of "ulimit -a"
--
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256511
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
--
Regards!


Eric

*From:* users-request 
*Date:* 2017-03-03 03:00
*To:* users 
*Subject:* users Digest, Vol 3729, Issue 2
Send users mailing list submissions to
users@lists.open-mpi.org
To subscribe or unsubscribe via the World Wide Web, visit
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
or, via email, send a message with subject or body 'help' to
users-requ...@lists.open-mpi.org
You can reach the person managing the list at
users-ow...@lists.open-mpi.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of users digest..."
Today's Topics:
   1. coredump about MPI (gzzh...@buaa.edu.cn)
   2. Re: coredump about MPI (Jeff Squyres (jsquyres))
--
Message: 1
Date: Thu, 2 Mar 2017 22:19:51 +0800
From: "gzzh...@buaa.edu.cn" 
To: users 
Subject: [OMPI users] coredump about MPI
Message-ID: <2017030222195035326...@buaa.edu.cn>
Content-Type: text/plain; charset="us-ascii"
hi developers and users:
I have a question about the coredump of MPI programs. I have
two nodes, when the program was runned on the single node
respectively,
It can get the corefile correctly(In order to make a coredump,
there is a divide-by-zero operation in this program).
But when I runned the program on two nodes, if the illegle
operation happened in the node which is different from the node
used to execute
this "mpirun" command, there is no coredump file.
I have checked "ulimit -c" and so on,but still can not figure out.
thanks a lot for your help and best regards!
-
Eric
-- next part --
An HTML attachment was scrubbed...
URL:


--
Message: 2
Date: Thu, 2 Mar 2017 15:34:56 +
From: "Jeff Squyres (jsquyres)" 
To: "Open MPI User's List" 
Subject: Re: [OMPI users] coredump about MPI
Message-ID: 
Content-Type: text/plain; charset="us-ascii"
A few suggestions:
1. Look for the core files in directories where you might not expect:
   - your $HOME (particularly if your $HOME is not a networked
filesystem)
   - in /cores
   - in the pwd where the executable was launched on that machine
2. If multiple processes will be writing core files in the same
directory, make sure that they don't write to the same filename
(you'll likely end up with a single corrupt corefile).  For
example, on Linux, you can (as root) "echo "core.%e-%t-%p"
>/proc/sys/kernel/core_pattern" to get a unique corefile for each
process and host (this is what I use on my development cluster).
3. If you are launching via a resource scheduler (e.g., SLURM,
Torque, etc.), the scheduler may be resetting the corefile limit
back down to zero before launching your job. If this is what is
happening, it may be a little tricky to override this because the
scheduler will likely do it *on 

Re: [OMPI users] Using mpicc to cross-compile MPI applications

2017-03-02 Thread Gilles Gouaillardet

Graham,


you can configure Open MPI with '--enable-script-wrapper-compilers'

that will make wrappers as scripts instead of binaries.


Cheers,


Gilles


On 3/3/2017 10:23 AM, Graham Holland wrote:

Hello,

I am using OpenMPI version 1.10.2 on an arm development board and have
successfully cross-compiled the OpenMPI libraries and am able to use
them with simple MPI applications.

My question is, what is the best approach for cross-compiling more
complex MPI applications which use mpicc to determine compiler flags?
The mpicc that is built is an ARM executable and obviously cannot
be run on my host.

An example of such an application is MPPTEST, which uses mpicc from
within its Makefile: ftp://ftp.mcs.anl.gov/pub/mpi/tools/perftest.tar.gz

I have read in the FAQ about overriding the compiler wrapper default
flags using environment variables, so one possibility is to use my
host mpicc with these set appropriately.

I am wondering however if there is better way?

Thank you,

Graham


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users



___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Questions about integration with resource distribution systems

2017-07-31 Thread Gilles Gouaillardet

Dave,


unless you are doing direct launch (for example, use 'srun' instead of 
'mpirun' under SLURM),


this is the way Open MPI is working : mpirun will use whatever the 
resource manager provides


in order to spawn the remote orted (tm with PBS, qrsh with SGE, srun 
with SLURM, ...).



then mpirun/orted will fork the MPI tasks.


direct launch provides tightest integration, but it requires some 
capabilities (a PMI(x) server)


are provided by the resource manager.


hopefully the resource manager will report memory consumption and so on 
of the spawned process


(e.g. orted) but also its children (e.g. the MPI tasks)


back to SGE, and if i understand correctly, memory is requested per task 
on the qsub command line.


i am not sure what is done then ... this requirement is either ignored, 
or the requirement is set per orted.


(and once again, i do not know if the limit is only for the orted 
process, or its children too)



Bottom line, unless SGE natively provides PMI(x) capabilities, the 
current "tight integration" is imho the best we can do




Cheers,


Gilles




On 7/28/2017 12:50 AM, Dave Love wrote:

"r...@open-mpi.org"  writes:


Oh no, that's not right. Mpirun launches daemons using qrsh and those
daemons spawn the app's procs. SGE has no visibility of the app at all

Oh no, that's not right.

The whole point of tight integration with remote startup using qrsh is
to report resource usage and provide control over the job.  I'm somewhat
familiar with this.
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users



___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] -host vs -hostfile

2017-08-03 Thread Gilles Gouaillardet
Mahmood,

you might want to have a look at OpenHPC (which comes with a recent Open MPI)

Cheers,

Gilles

On Thu, Aug 3, 2017 at 9:48 PM, Mahmood Naderan  wrote:
> Well, it seems that the default Rocks-openmpi dominates the systems. So, at
> the moment, I stick with that which is 1.6.5 and uses -machinefile.
> I will later debug to see why 2.0.1 doesn't work.
>
> Thanks.
>
> Regards,
> Mahmood
>
>
>
> On Tue, Aug 1, 2017 at 12:30 AM, Gus Correa  wrote:
>>
>> Maybe something is wrong with the Torque installation?
>> Or perhaps with the Open MPI + Torque integration?
>>
>> 1) Make sure your Open MPI was configured and compiled with the
>> Torque "tm" library of your Torque installation.
>> In other words:
>>
>> configure --with-tm=/path/to/your/Torque/tm_library ...
>>
>> 2) Check if your $TORQUE/server_priv/nodes file has all the nodes
>> in your cluster.  If not, edit the file and add the missing nodes.
>> Then restart the Torque server (service pbs_server restart).
>>
>> 3) Run "pbsnodes" to see if all nodes are listed.
>>
>> 4) Run "hostname" with mpirun in a short Torque script:
>>
>> #PBS -l nodes=4:ppn=1
>> ...
>> mpirun hostname
>>
>> The output should show all four nodes.
>>
>> Good luck!
>> Gus Correa
>>
>> On 07/31/2017 02:41 PM, Mahmood Naderan wrote:
>>>
>>> Well it is confusing!! As you can see, I added four nodes to the host
>>> file (the same nodes are used by PBS). The --map-by ppr:1:node works well.
>>> However, the PBS directive doesn't work
>>>
>>> mahmood@cluster:mpitest$ /share/apps/computer/openmpi-2.0.1/bin/mpirun
>>> -hostfile hosts --map-by ppr:1:node a.out
>>>
>>> 
>>> * hwloc 1.11.2 has encountered what looks like an error from the
>>> operating system.
>>> *
>>> * Package (P#1 cpuset 0x) intersects with NUMANode (P#1 cpuset
>>> 0xff00) without inclusion!
>>> * Error occurred in topology.c line 1048
>>> *
>>> * The following FAQ entry in the hwloc documentation may help:
>>> *   What should I do when hwloc reports "operating system" warnings?
>>> * Otherwise please report this error message to the hwloc user's mailing
>>> list,
>>> * along with the output+tarball generated by the hwloc-gather-topology
>>> script.
>>>
>>> 
>>> Hello world from processor cluster.hpc.org , rank
>>> 0 out of 4 processors
>>> Hello world from processor compute-0-0.local, rank 1 out of 4 processors
>>> Hello world from processor compute-0-1.local, rank 2 out of 4 processors
>>> Hello world from processor compute-0-2.local, rank 3 out of 4 processors
>>> mahmood@cluster:mpitest$ cat mmt.sh
>>> #!/bin/bash
>>> #PBS -V
>>> #PBS -q default
>>> #PBS -j oe
>>> #PBS -l  nodes=4:ppn=1
>>> #PBS -N job1
>>> #PBS -o .
>>> cd $PBS_O_WORKDIR
>>> /share/apps/computer/openmpi-2.0.1/bin/mpirun a.out
>>> mahmood@cluster:mpitest$ qsub mmt.sh
>>> 6428.cluster.hpc.org 
>>>
>>> mahmood@cluster:mpitest$ cat job1.o6428
>>> Hello world from processor compute-0-1.local, rank 0 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 2 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 3 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 4 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 5 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 6 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 8 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 9 out of 32 processors
>>> Hello world from processor compute-0-1.local, rank 12 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 15 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 16 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 18 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 19 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 20 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 21 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 22 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 24 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 26 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 27 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 28 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 29 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 30 out of 32
>>> processors
>>> Hello world from processor compute-0-1.local, rank 31 out of 32
>>> processors
>>> Hello world 

Re: [OMPI users] Enforce TCP with mpirun

2017-08-16 Thread Gilles Gouaillardet
My bad, i forgot btl/tcp is using all the interfaces by default (eth0 *and* ib0)


is eth0 available on all your nodes or just the node running mpirun ?

you can try to use a subnet instead of an interface name
mpirun --mca btl_tcp_if_include 172.24.44.0/23 ...

if you are still facing some issues, you can
mpirun ... --mca btl_base_verbose 100 ...
in order to collect (lot of) logs


Cheers,

Gilles


On Wed, Aug 16, 2017 at 10:54 PM, Maksym Planeta
<mplan...@os.inf.tu-dresden.de> wrote:
> Dear Gilles,
>
> thank you for quick response.
>
> pml/cm doesn't work at all
>
> When I use "--mca pml ob1" I still see traffic in /usr/sbin/perfquery, but 
> the program starts running a lot slower. E. g. ib.C.64 benchmarks runs 33 
> seconds in contrast to less than 1.
>
> I also see many if following warning messages when I use PML ob1:
>
> [:25220] mca_base_component_repository_open: unable to open 
> mca_coll_hcoll: libmxm.so.2: cannot open shared object file: No such file or 
> directory (ignored)
>
> But at least I see traffic on ib0 interface. So I basically achieved the goal 
> from the original mail.
>
> Nevertheless I wanted to try avoid IB completely, so I added "--mca 
> btl_tcp_if_exclude ib0". But no program cannot run and fails with these kinds 
> of messages:
>
>
> [][[42657,1],23][btl_tcp_endpoint.c:649:mca_btl_tcp_endpoint_recv_connect_ack]
>  received unexpected process identifier [[42657,1],1]
>
> I see the same message when I also use this flag "--mca btl_tcp_if_include 
> lo,ib0"
>
> Here is full command:
>
> $ mpirun --mca pml ob1 --mca btl self,tcp --mca btl_tcp_if_include lo,ib0  
> -np 64 bin/is.C.64
>
> I also tried to use "--mca btl_tcp_if_include lo,eth0", but MPI started 
> complaining with this message:
>
> [][[40797,1],58][btl_tcp_component.c:706:mca_btl_tcp_component_create_instances]
>  invalid interface "eth0"
>
> I do have this interface:
>
> $ ip addr show eth0
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 
> 1000
> link/ether 08:00:38:3c:4e:65 brd ff:ff:ff:ff:ff:ff
> inet 172.24.44.190/23 brd 172.24.45.255 scope global eth0
> inet6 fe80::a00:38ff:fe3c:4e65/64 scope link
>valid_lft forever preferred_lft forever
>
>
> Can I tell somehow MPI to use eth0, and not use ib0?
>
> On 08/16/2017 02:34 PM, Gilles Gouaillardet wrote:
>> Hi,
>>
>> you can try
>> mpirun --mca pml ob1 --mca btl tcp,self ...
>>
>>
>> pml/cm has a higher priority than pml/ob1, so if you have a mtl that
>> fits your network (such as mtl/mxm),
>> then pml/ob1 will be ignored, and the list of allowed/excluded btl
>> become insignificant.
>>
>> Cheers,
>>
>> Gilles
>>
>> On Wed, Aug 16, 2017 at 8:57 PM, Maksym Planeta
>> <mplan...@os.inf.tu-dresden.de> wrote:
>>> Hello,
>>>
>>> I work with an Infiniband cluster, but I want to force OpenMPI to use
>>> specific network interface.
>>>
>>> I tried to do this for example using mpirun as follows:
>>>
>>> mpirun --map-by node --mca btl self,tcp  -np 16 bin/is.C.16
>>>
>>> But counters returned by /usr/sbin/perfquery still keep showing that
>>> transmission happens over ibverbs.
>>>
>>> There is an iboib module on a machine, but counters there (ip -s link)
>>> indicate that the ib0 interface is not used.
>>>
>>> Could you help me to figure out how to properly tell OpenMPI not to use
>>> ibverbs?
>>>
>>> I tried with Open MPI 2.1.0 and 1.10.2, but saw no difference in behavior.
>>>
>>>
>>> --
>>> Regards,
>>> Maksym Planeta
>>>
>>>
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/users
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
>>
>
> --
> Regards,
> Maksym Planeta
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] MPI running in Unikernels

2017-08-11 Thread Gilles Gouaillardet
Keith,

MPI is running on both shared memory (e.g. one single node) and
distributed memory (e.g. several independent nodes).
here is what happens when you
mpirun -np  a.out

1. an orted process is remotely spawned to each node
2. mpirun and orted fork a.out

unless a batch manager is used, remote spawn is implemented by SSH

note it is possible to
- use a custom SSH-like command
- use a custom command instead of the orted command
- use a wrapper when fork'ing a.out

last but not least, an other option is direct run, but that requires
some support from the resource manager (e.g. a PMI(x) server)
for example, with SLURM, you can
srun a.out
and then slurm will remotely spawn a.out on all the nodes.


i am pretty sure Open MPI provides enough flexibility so with a
minimum a creativity, you can run a MPI app on unikernel.
if a ssh daemon runs in each unikernel, that should be straightforward.
if you want to run one orted and several a.out per unikernel, a bit of
creativity is needed (e.g. scripting and wrapping)
if you want to run a single a.out per unikernel, that is a bit
trickier since you have to somehow implement a PMIx server within each
unikernel


Cheers,

Gilles

On Sat, Aug 12, 2017 at 12:11 AM, Keith Collister  wrote:
> Hi,
>
> I'm currently looking into whether it's possible to run MPI applications
> within unikernels.
>
> The idea is to have multiple unikernels as virtual compute nodes in the
> cluster, with physical nodes hosting the virtual nodes. As I understand it,
> in a normal cluster mpirun would be invoked on a physical node and all the
> compute nodes would be processes on the same machine. In contrast, with
> unikernels the compute nodes would need to effectively run in virtual
> machines.
>
> I thought I was making progress when I found the "--ompi-server" switch for
> mpirun: I figured I could spawn an OMPI server instance on a host, then just
> invoke mpirun telling it to start the unikernel (in an emulator (QEMU),
> instead of an application), passing the unikernel the uri of the OMPI
> server. In my mind, the unikernels would be able to connect to the server
> happily and all would be smooth.
>
> In reality, it seems like mpirun doesn't "pass" anything to the application
> it runs (I expected it to pass configuration options via the command line).
> This implies all the configuration is stored somewhere on the host or as
> environment variables, which would make it muuuch harder to configure the
> unikernel. I couldn't find much documentation on this part of the process
> though (how mpirun configures the application), so I figured I'd ask the
> experts.
>
> Is this sort of thing possible? Is the MPI ecosystem tied too tightly to
> virtual nodes being run with mpirun to make it infeasible to run insulated
> virtual nodes like this? Is there some command-line switch I've missed that
> would make my life a lot easier?
>
>
> Any advice/ideas/discussion would be much appreciated,
> Keith
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Q: Basic invoking of InfiniBand with OpenMPI

2017-07-13 Thread Gilles Gouaillardet

Boris,


Open MPI should automatically detect the infiniband hardware, and use 
openib (and *not* tcp) for inter node communications


and a shared memory optimized btl (e.g. sm or vader) for intra node 
communications.



note if you "-mca btl openib,self", you tell Open MPI to use the openib 
btl between any tasks,


including tasks running on the same node (which is less efficient than 
using sm or vader)



at first, i suggest you make sure infiniband is up and running on all 
your nodes.


(just run ibstat, at least one port should be listed, state should be 
Active, and all nodes should have the same SM lid)



then try to run two tasks on two nodes.


if this does not work, you can

mpirun --mca btl_base_verbose 100 ...

and post the logs so we can investigate from there.


Cheers,


Gilles



On 7/14/2017 6:43 AM, Boris M. Vulovic wrote:


I would like to know how to invoke InfiniBand hardware on CentOS 6x 
cluster with OpenMPI (static libs.) for running my C++ code. This is 
how I compile and run:


/usr/local/open-mpi/1.10.7/bin/mpic++ -L/usr/local/open-mpi/1.10.7/lib 
-Bstatic main.cpp -o DoWork


usr/local/open-mpi/1.10.7/bin/mpiexec -mca btl tcp,self --hostfile 
hostfile5 -host node01,node02,node03,node04,node05 -n 200 DoWork


Here, "*-mca btl tcp,self*" reveals that *TCP* is used, and the 
cluster has InfiniBand.


What should be changed in compiling and running commands for 
InfiniBand to be invoked? If I just replace "*-mca btl tcp,self*" with 
"*-mca btl openib,self*" then I get plenty of errors with relevant one 
saying:


/At least one pair of MPI processes are unable to reach each other for 
MPI communications. This means that no Open MPI device has indicated 
that it can be used to communicate between these processes. This is an 
error; Open MPI requires that all MPI processes be able to reach each 
other. This error can sometimes be the result of forgetting to specify 
the "self" BTL./


Thanks very much!!!


*Boris *




___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] weird issue with output redirection to a file when using different compilers

2017-07-13 Thread Gilles Gouaillardet
Fabricio,

the fortran runtime might (or not) use buffering for I/O.
as a consequence, data might be written immediatly to disk, or at a later time
(e.g. the file is closed, the buffer is full or the buffer is flushed)

you might want to manually flush the file, or there might be an option
not to use
buffering when opening a file (sorry, i do not know them off hands)

Cheers,

Gilles

On Fri, Jul 14, 2017 at 8:49 AM, fabricio  wrote:
> Hello there
>
> I'm facing a weird issue in a centos 7.3 x86 machine when remotely running a
> program [https://www.myroms.org] from host1 to host2 and host3.
>
> When the program was compiled with intel ifort 17.0, output redirection
> happens immediately and the file is constantly updated.
> If the program is compiled with gnu gfortran 5.3.0, the file is not written
> (zero bytes) until the end of the execution, as if some kind of buffering
> was happening.
> Openmpi version is the same, 1.10.7, and configure options are as below:
>
> intel:
> --prefix /intel/path
> --enable-shared
> --enable-static
> --enable-orterun-prefix-by-default
> --with-slurm
> --with-psm
> --with-verbs=yes
> --with-threads=posix
> --with-hwloc=/hwloc/path
> --disable-java
> --disable-vt
>
> gnu:
> --prefix /gnu/path
> --enable-shared
> --enable-static
> --enable-orterun-prefix-by-default
> --with-sge
> --with-tm
> --with-slurm
> --with-valgrind
> --with-psm
> --with-verbs=yes
> --with-threads=posix
> --with-hwloc=/hwloc/path
> --with-libevent=/libevent/path
> --with-psm2
> --disable-java
> --disable-vt
>
>
> Does anything strikes as odd? Am I fumbling something?
>
>
> TIA,
> Fabricio
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Q: Basic invoking of InfiniBand with OpenMPI

2017-07-17 Thread Gilles Gouaillardet
Node GUID: 0x248a0703005abb31
> System image GUID: 0x248a0703005abb30
> Port 1:
> State: Down
> Physical state: Disabled
> Rate: 100
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x3c01
> Port GUID: 0x268a07fffe5abb31
> Link layer: Ethernet
> -bash-4.1$
> %%%%%%
>
> On Fri, Jul 14, 2017 at 12:37 AM, John Hearns via users
> <users@lists.open-mpi.org> wrote:
>>
>> ABoris, as Gilles says - first do som elower level checkouts of your
>> Infiniband network.
>> I suggest running:
>> ibdiagnet
>> ibhosts
>> and then as Gilles says 'ibstat' on each node
>>
>>
>>
>> On 14 July 2017 at 03:58, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>
>>> Boris,
>>>
>>>
>>> Open MPI should automatically detect the infiniband hardware, and use
>>> openib (and *not* tcp) for inter node communications
>>>
>>> and a shared memory optimized btl (e.g. sm or vader) for intra node
>>> communications.
>>>
>>>
>>> note if you "-mca btl openib,self", you tell Open MPI to use the openib
>>> btl between any tasks,
>>>
>>> including tasks running on the same node (which is less efficient than
>>> using sm or vader)
>>>
>>>
>>> at first, i suggest you make sure infiniband is up and running on all
>>> your nodes.
>>>
>>> (just run ibstat, at least one port should be listed, state should be
>>> Active, and all nodes should have the same SM lid)
>>>
>>>
>>> then try to run two tasks on two nodes.
>>>
>>>
>>> if this does not work, you can
>>>
>>> mpirun --mca btl_base_verbose 100 ...
>>>
>>> and post the logs so we can investigate from there.
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>>
>>> On 7/14/2017 6:43 AM, Boris M. Vulovic wrote:
>>>>
>>>>
>>>> I would like to know how to invoke InfiniBand hardware on CentOS 6x
>>>> cluster with OpenMPI (static libs.) for running my C++ code. This is how I
>>>> compile and run:
>>>>
>>>> /usr/local/open-mpi/1.10.7/bin/mpic++ -L/usr/local/open-mpi/1.10.7/lib
>>>> -Bstatic main.cpp -o DoWork
>>>>
>>>> usr/local/open-mpi/1.10.7/bin/mpiexec -mca btl tcp,self --hostfile
>>>> hostfile5 -host node01,node02,node03,node04,node05 -n 200 DoWork
>>>>
>>>> Here, "*-mca btl tcp,self*" reveals that *TCP* is used, and the cluster
>>>> has InfiniBand.
>>>>
>>>> What should be changed in compiling and running commands for InfiniBand
>>>> to be invoked? If I just replace "*-mca btl tcp,self*" with "*-mca btl
>>>> openib,self*" then I get plenty of errors with relevant one saying:
>>>>
>>>> /At least one pair of MPI processes are unable to reach each other for
>>>> MPI communications. This means that no Open MPI device has indicated that 
>>>> it
>>>> can be used to communicate between these processes. This is an error; Open
>>>> MPI requires that all MPI processes be able to reach each other. This error
>>>> can sometimes be the result of forgetting to specify the "self" BTL./
>>>>
>>>> Thanks very much!!!
>>>>
>>>>
>>>> *Boris *
>>>>
>>>>
>>>>
>>>>
>>>> ___
>>>> users mailing list
>>>> users@lists.open-mpi.org
>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>
>>>
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>>
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
>
> --
>
> Boris M. Vulovic
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Network performance over TCP

2017-07-09 Thread Gilles Gouaillardet
Adam,

at first, you need to change the default send and receive socket buffers :
mpirun --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 ...
/* note this will be the default from Open MPI 2.1.2 */

hopefully, that will be enough to greatly improve the bandwidth for
large messages.


generally speaking, i recommend you use the latest (e.g. Open MPI
2.1.1) available version

how many interfaces can be used to communicate between hosts ?
if there is more than one (for example a slow and a fast one), you'd
rather only use the fast one.
for example, if eth0 is the fast interface, that can be achieved with
mpirun --mca btl_tcp_if_include eth0 ...

also, you might be able to achieve better results by using more than
one socket on the fast interface.
for example, if you want to use 4 sockets per interface
mpirun --mca btl_tcp_links 4 ...



Cheers,

Gilles

On Sun, Jul 9, 2017 at 10:10 PM, Adam Sylvester  wrote:
> I am using Open MPI 2.1.0 on RHEL 7.  My application has one unavoidable
> pinch point where a large amount of data needs to be transferred (about 8 GB
> of data needs to be both sent to and received all other ranks), and I'm
> seeing worse performance than I would expect; this step has a major impact
> on my overall runtime.  In the real application, I am using MPI_Alltoall()
> for this step, but for the purpose of a simple benchmark, I simplified to
> simply do a single MPI_Send() / MPI_Recv() between two ranks of a 2 GB
> buffer.
>
> I'm running this in AWS with instances that have 10 Gbps connectivity in the
> same availability zone (according to tracepath, there are no hops between
> them) and MTU set to 8801 bytes.  Doing a non-MPI benchmark of sending data
> directly over TCP between these two instances, I reliably get around 4 Gbps.
> Between these same two instances with MPI_Send() / MPI_Recv(), I reliably
> get around 2.4 Gbps.  This seems like a major performance degradation for a
> single MPI operation.
>
> I compiled Open MPI 2.1.0 with gcc 4.9.1 and default settings.  I'm
> connecting between instances via ssh and using I assume TCP for the actual
> network transfer (I'm not setting any special command-line or programmatic
> settings).  The actual command I'm running is:
> mpirun -N 1 --bind-to none --hostfile hosts.txt my_app
>
> Any advice on other things to test or compilation and/or runtime flags to
> set would be much appreciated!
> -Adam
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] configure in version 2.1.1 doesn't use some necessary LDFLAGS

2017-07-10 Thread Gilles Gouaillardet

Hi Petr,


thanks for the report.


could you please configure Open MPI with the previously working command line

and compress and post the generated config.log ?


Cheers,


Gilles


On 7/11/2017 12:52 AM, Petr Hanousek wrote:

Dear developers,
I am using for a long time the proved configure command:
./configure --with-verbs=/software/ofed-1.5.4
--prefix=/software/openmpi/2.1.1/gcc --enable-mpi-thread-multiple
--enable-shared --enable-mpi-cxx --enable-mpi-fortran

After switching our planning environment from Torque to PBSPro (because
it's free now) I now had to add "--with-tm=/usr" and
LDFLAGS="-L/usr/lib/x86_64-linux-gnu -lutil -lpbs -lcrypto" to be able
to configure the software successfully. Compilation then works fine. The
system I am using is Debian 8 stable in current state. Could you please
adjust the configure checks to reflect the situation?

Thnak you, Petr




___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Network performance over TCP

2017-07-09 Thread Gilles Gouaillardet

Adam,


Thanks for letting us know your performance issue has been resolved.


yes, https://www.open-mpi.org/faq/?category=tcp is the best place to 
look for this kind of information.


i will add a reference to these parameters. i will also ask folks at AWS 
if they have additional/other recommendations.



note you have a few options before 2.1.2 (or 3.0.0) is released :


- update your system wide config file (/.../etc/openmpi-mca-params.conf) 
or user config file


  ($HOME/.openmpi/mca-params.conf) and add the following lines

btl_tcp_sndbuf = 0

btl_tcp_rcvbuf = 0


- add the following environment variable to your environment

export OMPI_MCA_btl_tcp_sndbuf=0

export OMPI_MCA_btl_tcp_rcvbuf=0


- use Open MPI 2.0.3


- last but not least, you can manually download and apply the patch 
available at


https://github.com/open-mpi/ompi/commit/b64fedf4f652cadc9bfc7c4693f9c1ef01dfb69f.patch


Cheers,

Gilles

On 7/9/2017 11:04 PM, Adam Sylvester wrote:

Gilles,

Thanks for the fast response!

The --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 flags you 
recommended made a huge difference - this got me up to 5.7 Gb/s! I 
wasn't aware of these flags... with a little Googling, is 
https://www.open-mpi.org/faq/?category=tcp the best place to look for 
this kind of information and any other tweaks I may want to try (or if 
there's a better FAQ out there, please let me know)?
There is only eth0 on my machines so nothing to tweak there (though 
good to know for the future). I also didn't see any improvement by 
specifying more sockets per instance. But, your initial suggestion had 
a major impact.
In general I try to stay relatively up to date with my Open MPI 
version; I'll be extra motivated to upgrade to 2.1.2 so that I don't 
have to remember to set these --mca flags on the command line. :o)

-Adam

On Sun, Jul 9, 2017 at 9:26 AM, Gilles Gouaillardet 
<gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> 
wrote:


Adam,

at first, you need to change the default send and receive socket
buffers :
mpirun --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 ...
/* note this will be the default from Open MPI 2.1.2 */

hopefully, that will be enough to greatly improve the bandwidth for
large messages.


generally speaking, i recommend you use the latest (e.g. Open MPI
2.1.1) available version

how many interfaces can be used to communicate between hosts ?
if there is more than one (for example a slow and a fast one), you'd
rather only use the fast one.
for example, if eth0 is the fast interface, that can be achieved with
mpirun --mca btl_tcp_if_include eth0 ...

also, you might be able to achieve better results by using more than
one socket on the fast interface.
for example, if you want to use 4 sockets per interface
mpirun --mca btl_tcp_links 4 ...



Cheers,

Gilles

On Sun, Jul 9, 2017 at 10:10 PM, Adam Sylvester <op8...@gmail.com
<mailto:op8...@gmail.com>> wrote:
> I am using Open MPI 2.1.0 on RHEL 7.  My application has one
unavoidable
> pinch point where a large amount of data needs to be transferred
(about 8 GB
> of data needs to be both sent to and received all other ranks),
and I'm
> seeing worse performance than I would expect; this step has a
major impact
> on my overall runtime.  In the real application, I am using
MPI_Alltoall()
> for this step, but for the purpose of a simple benchmark, I
simplified to
> simply do a single MPI_Send() / MPI_Recv() between two ranks of
a 2 GB
> buffer.
>
> I'm running this in AWS with instances that have 10 Gbps
connectivity in the
> same availability zone (according to tracepath, there are no
hops between
> them) and MTU set to 8801 bytes.  Doing a non-MPI benchmark of
sending data
> directly over TCP between these two instances, I reliably get
around 4 Gbps.
> Between these same two instances with MPI_Send() / MPI_Recv(), I
reliably
> get around 2.4 Gbps.  This seems like a major performance
degradation for a
> single MPI operation.
>
> I compiled Open MPI 2.1.0 with gcc 4.9.1 and default settings.  I'm
> connecting between instances via ssh and using I assume TCP for
the actual
> network transfer (I'm not setting any special command-line or
programmatic
> settings).  The actual command I'm running is:
> mpirun -N 1 --bind-to none --hostfile hosts.txt my_app
>
> Any advice on other things to test or compilation and/or runtime
flags to
> set would be much appreciated!
> -Adam
>
> ___
> users mailing list
> users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>
>

Re: [OMPI users] NUMA interaction with Open MPI

2017-07-16 Thread Gilles Gouaillardet
Adam,

keep in mind that by default, recent Open MPI bind MPI tasks
- to cores if -np 2
- to NUMA domain otherwise (which is a socket in most cases, unless
you are running on a Xeon Phi)

so unless you specifically asked mpirun to do a binding consistent
with your needs, you might simply try to ask no binding at all
mpirun --bind-to none ...

i am not sure whether you can direclty ask Open MPI to do the memory
binding you expect from the command line.
anyway, as far as i am concerned,
mpirun --bind-to none numactl --interleave=all ...
should do what you expect

if you want to be sure, you can simply
mpirun --bind-to none numactl --interleave=all grep Mems_allowed_list
/proc/self/status
and that should give you an hint

Cheers,

Gilles


On Mon, Jul 17, 2017 at 4:19 AM, Adam Sylvester  wrote:
> I'll start with my question upfront: Is there a way to do the equivalent of
> telling mpirun to do 'numactl --interleave=all' on the processes that it
> runs?  Or if I want to control the memory placement of my applications run
> through MPI will I need to use libnuma for this?  I tried doing "mpirun
>  numactl --interleave=all ".  I
> don't know how to explicitly verify if this ran the numactl command on each
> host or not but based on the performance I'm seeing, it doesn't seem like it
> did (or something else is causing my poor performance).
>
> More details: For the particular image I'm benchmarking with, I have a
> multi-threaded application which requires 60 GB of RAM to run if it's run on
> one machine.  It allocates one large ping/pong buffer upfront and uses this
> to avoid copies when updating the image at each step.  I'm running in AWS
> and comparing performance on an r3.8xlarge (16 CPUs, 244 GB RAM, 10 Gbps)
> vs. an x1.32xlarge (64 CPUs, 2 TB RAM, 20 Gbps).  Running on a single X1, my
> application runs ~3x faster than the R3; using numactl --interleave=all has
> a significant positive effect on its performance,  I assume because the
> various threads that are running are accessing memory spread out across the
> nodes rather than most of them having slow access to it.  So far so good.
>
> My application also supports distributing across machines via MPI.  When
> doing this, the memory requirement scales linearly with the number of
> machines; there are three pinch points that involve large (GBs of data)
> all-to-all communication.  For the slowest of these three, I've pipelined
> this step and use MPI_Ialltoallv() to hide as much of the latency as I can.
> When run on R3 instances, overall runtime scales very well as machines are
> added.  Still so far so good.
>
> My problems start with the X1 instances.  I do get scaling as I add more
> machines, but it is significantly worse than with the R3s.  This isn't just
> a matter of there being more CPUs and the MPI communication time dominating.
> The actual time spent in the MPI all-to-all communication is significantly
> longer than on the R3s for the same number of machines, despite the network
> bandwidth being twice as high (in a post from a few days ago some folks
> helped me with MPI settings to improve the network communication speed -
> from toy benchmark MPI tests I know I'm getting faster communication on the
> X1s than on the R3s, so this feels likely to be an issue with NUMA, though
> I'd be interested in any other thoughts.
>
> I looked at https://www.open-mpi.org/doc/current/man1/mpirun.1.php but this
> didn't seem to have what I was looking for.  I want MPI to let my
> application use all CPUs on the system (I'm the only one running on it)... I
> just want to control the memory placement.
>
> Thanks for the help.
> -Adam
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] How to configure the message size in openMPI (over RDMA)?

2017-07-18 Thread Gilles Gouaillardet

Hi,


i cannot comment for the openib specific part.

the coll/tuned collective module is very likely to split messages in 
order to use a more efficient


algorithm. an other way to put it is you probably do not want to use 
large messages.



but if this is really what you want, then one option is to disable 
coll/tuned,


mpirun --mca coll ^tuned ...

an other option is to force the the algorithm used by coll/tuned

mpirun --mca coll_tuned_use_dynamic_rules 1 --mca 
coll_tuned_allreduce_algorithm  ...


where  is a number between 1 and 6 (see ompi_info --all for the 
definition)



Cheers,


Gilles


On 7/19/2017 6:10 AM, Juncheng Gu wrote:

Hi,

I am trying to setup openMPI over RDMA cross machines.
I call MPI_AllReduce() with a 240MB data buffer.
But, it seems openMPI chunks data into small fragments (1MB ~ 15MB), 
and then sends them out through RDMA.


Which mca parameters can affect the message size in openMPI?
How to configure "mca" to let openMPI use large message size 
(fragment) in data transmission?

For example, is there any way we can set the mini message size of openMPI?

I have followed the instructions from 
https://www.open-mpi.org/faq/?category=openfabrics#ofed-and-ompi. But 
there is no improvement in message size.


I am using " --mca btl self,sm,openib ".

Best,
Juncheng


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Message reception not getting pipelined with TCP

2017-07-20 Thread Gilles Gouaillardet

Sam,


this example is using 8 MB size messages

if you are fine with using more memory, and your application should not 
generate too much unexpected messages, then you can bump the eager_limit

for example

mpirun --mca btl_tcp_eager_limit $((8*1024*1024+128)) ...

worked for me


George,

in master, i thought

mpirun --mca btl_tcp_progress_thread 1 ...

would help but it did not.
did i misunderstand the purpose of the TCP progress thread ?

Cheers,

Gilles

On 7/21/2017 9:05 AM, George Bosilca wrote:

Sam,

Open MPI aggregates messages only when network constraints prevent the 
messages from being timely delivered. In this particular case I think 
that our delayed business card exchange and connection setup is 
delaying the delivery of the first batch of messages (and the BTL will 
aggregate them while waiting for the connection to be correctly setup).


Can you reproduce the same behavior after the first batch of messages ?

Assuming the times showed on the left of your messages are correct, 
the first MPI seems to deliver the entire set of messages 
significantly faster than the second.


  George.





On Thu, Jul 20, 2017 at 5:42 PM, Samuel Thibault 
> wrote:


Hello,

We are getting a strong performance issue, which is due to a missing
pipelining behavior from OpenMPI when running over TCP. I have
attached
a test case. Basically what it does is

if (myrank == 0) {
for (i = 0; i < N; i++)
MPI_Isend(...);
} else {
for (i = 0; i < N; i++)
MPI_Irecv(...);
}
for (i = 0; i < N; i++)
MPI_Wait(...);

with corresponding printfs. And the result is:

0.182620: Isend 0 begin
0.182761: Isend 0 end
0.182766: Isend 1 begin
0.182782: Isend 1 end
...
0.183911: Isend 49 begin
0.183915: Isend 49 end
0.199028: Irecv 0 begin
0.199068: Irecv 0 end
0.199070: Irecv 1 begin
0.199072: Irecv 1 end
...
0.199187: Irecv 49 begin
0.199188: Irecv 49 end
0.233948: Isend 0 done!
0.269895: Isend 1 done!
...
1.982475: Isend 49 done!
1.984065: Irecv 0 done!
1.984078: Irecv 1 done!
...
1.984131: Irecv 49 done!

i.e. almost two seconds happen between the start of the
application and
the first Irecv completes, and then all Irecv complete immediately
too,
i.e. it seems the communications were grouped altogether.

This is really bad, because in our real use case, we trigger
computations after each MPI_Wait calls, and we use several messages so
as to pipeline things: the first computation can start as soon as one
message gets received, thus overlapped with further receptions.

This problem is only with openmpi on TCP, I'm not getting this
behavior
with openmpi on IB, and I'm not getting it either with mpich or
madmpi:

0.182168: Isend 0 begin
0.182235: Isend 0 end
0.182237: Isend 1 begin
0.182242: Isend 1 end
...
0.182842: Isend 49 begin
0.182844: Isend 49 end
0.200505: Irecv 0 begin
0.200564: Irecv 0 end
0.200567: Irecv 1 begin
0.200569: Irecv 1 end
...
0.201233: Irecv 49 begin
0.201234: Irecv 49 end
0.269511: Isend 0 done!
0.273154: Irecv 0 done!
0.341054: Isend 1 done!
0.344507: Irecv 1 done!
...
3.767726: Isend 49 done!
3.770637: Irecv 49 done!

There we do have pipelined reception.

Is there a way to get the second, pipelined behavior with openmpi on
TCP?

Samuel

___
users mailing list
users@lists.open-mpi.org 
https://rfd.newmexicoconsortium.org/mailman/listinfo/users





___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Message reception not getting pipelined with TCP

2017-07-21 Thread Gilles Gouaillardet
Thanks George for the explanation,

with the default eager size, the first message is received *after* the
last message is sent, regardless the progress thread is used or not.
an other way to put it is that MPI_Isend() (and probably MPI_Irecv()
too) do not involve any progression,
so i naively thought the progress thread would have helped here.

just to be 100% sure, could you please confirm this is the intended
behavior and not a bug ?

Cheers,

Gilles

On Sat, Jul 22, 2017 at 5:00 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>
>
> On Thu, Jul 20, 2017 at 8:57 PM, Gilles Gouaillardet <gil...@rist.or.jp>
> wrote:
>>
>> Sam,
>>
>>
>> this example is using 8 MB size messages
>>
>> if you are fine with using more memory, and your application should not
>> generate too much unexpected messages, then you can bump the eager_limit
>> for example
>>
>> mpirun --mca btl_tcp_eager_limit $((8*1024*1024+128)) ...
>>
>> worked for me
>
>
> Ah, interesting. If forcing a very large eager then the problem might be
> coming from the pipelining algorithm. Not a good solution in general, but
> handy to see what's going on. As many sends are available, the pipelining
> might be overwhelmed and interleave fragments from different requests. Let
> me dig a little bit here, I think I know exactly what is going on.
>
>>
>> George,
>>
>> in master, i thought
>>
>> mpirun --mca btl_tcp_progress_thread 1 ...
>>
>> would help but it did not.
>> did i misunderstand the purpose of the TCP progress thread ?
>
>
> Gilles,
>
> In this example most of the time is spent in an MPI_* function (mainly the
> MPI_Wait), so the progress thread has little opportunity to help. The role
> of the progress thread is to make sure communications are progressed when
> the application is not into an MPI call.
>
>   George.
>
>
>
>>
>>
>> Cheers,
>>
>> Gilles
>>
>> On 7/21/2017 9:05 AM, George Bosilca wrote:
>>>
>>> Sam,
>>>
>>> Open MPI aggregates messages only when network constraints prevent the
>>> messages from being timely delivered. In this particular case I think that
>>> our delayed business card exchange and connection setup is delaying the
>>> delivery of the first batch of messages (and the BTL will aggregate them
>>> while waiting for the connection to be correctly setup).
>>>
>>> Can you reproduce the same behavior after the first batch of messages ?
>>>
>>> Assuming the times showed on the left of your messages are correct, the
>>> first MPI seems to deliver the entire set of messages significantly faster
>>> than the second.
>>>
>>>   George.
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Jul 20, 2017 at 5:42 PM, Samuel Thibault
>>> <samuel.thiba...@labri.fr <mailto:samuel.thiba...@labri.fr>> wrote:
>>>
>>> Hello,
>>>
>>> We are getting a strong performance issue, which is due to a missing
>>> pipelining behavior from OpenMPI when running over TCP. I have
>>> attached
>>> a test case. Basically what it does is
>>>
>>> if (myrank == 0) {
>>> for (i = 0; i < N; i++)
>>> MPI_Isend(...);
>>> } else {
>>> for (i = 0; i < N; i++)
>>> MPI_Irecv(...);
>>> }
>>> for (i = 0; i < N; i++)
>>> MPI_Wait(...);
>>>
>>> with corresponding printfs. And the result is:
>>>
>>> 0.182620: Isend 0 begin
>>> 0.182761: Isend 0 end
>>> 0.182766: Isend 1 begin
>>> 0.182782: Isend 1 end
>>> ...
>>> 0.183911: Isend 49 begin
>>> 0.183915: Isend 49 end
>>> 0.199028: Irecv 0 begin
>>> 0.199068: Irecv 0 end
>>> 0.199070: Irecv 1 begin
>>> 0.199072: Irecv 1 end
>>> ...
>>> 0.199187: Irecv 49 begin
>>> 0.199188: Irecv 49 end
>>> 0.233948: Isend 0 done!
>>> 0.269895: Isend 1 done!
>>> ...
>>> 1.982475: Isend 49 done!
>>> 1.984065: Irecv 0 done!
>>> 1.984078: Irecv 1 done!
>>> ...
>>> 1.984131: Irecv 49 done!
>>>
>>> i.e. almost two seconds happen between the start of the
>>> application and
>>> the first Irecv completes, and then all Irecv complete immediately
>>> too,
>&g

Re: [OMPI users] MPI_IN_PLACE

2017-07-27 Thread Gilles Gouaillardet

Volker,


since you are only using

include 'mpif.h'


a workaround is you edit your /.../share/openmpi/mpifort-wrapper-data.txt
and simply remove '-lmpi_usempif08 -lmpi_usempi_ignore_tkr'


Cheers,

Gilles

On 7/27/2017 3:28 PM, Volker Blum wrote:

Thanks!

If you wish, please also keep me posted.

Best wishes
Volker


On Jul 27, 2017, at 7:50 AM, Gilles Gouaillardet 
<gilles.gouaillar...@gmail.com> wrote:

Thanks Jeff for your offer, i will contact you off-list later


i tried a gcc+gfortran and gcc+ifort on both linux and OS X
so far, only gcc+ifort on OS X is failing
i will try icc+ifort on OS X from now

short story, MPI_IN_PLACE is not recognized as such by the ompi
fortran wrapper, and i do not know why.

the attached program can be used to evidence the issue.


Cheers,

Gilles

On Thu, Jul 27, 2017 at 2:15 PM, Volker Blum <volker.b...@duke.edu> wrote:

Thanks! That’s great. Sounds like the exact combination I have here.

Thanks also to George. Sorry that the test did not trigger on a more standard 
platform - that would have simplified things.

Best wishes
Volker


On Jul 27, 2017, at 3:56 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

Folks,


I am able to reproduce the issue on OS X (Sierra) with stock gcc (aka clang) 
and ifort 17.0.4


i will investigate this from now


Cheers,

Gilles

On 7/27/2017 9:28 AM, George Bosilca wrote:

Volker,

Unfortunately, I can't replicate with icc. I tried on a x86_64 box with Intel 
compiler chain 17.0.4 20170411 to no avail. I also tested the 3.0.0-rc1 tarball 
and the current master, and you test completes without errors on all cases.

Once you figure out an environment where you can consistently replicate the 
issue, I would suggest to attach to the processes and:
- make sure the MPI_IN_PLACE as seen through the Fortran layer matches what the 
C layer expects
- what is the collective algorithm used by Open MPI

I have a "Fortran 101" level question. When you pass an array a(:) as argument, 
what exactly gets passed via the Fortran interface to the corresponding C function ?

George.

On Wed, Jul 26, 2017 at 1:55 PM, Volker Blum <volker.b...@duke.edu 
<mailto:volker.b...@duke.edu>> wrote:

   Thanks! Yes, trying with Intel 2017 would be very nice.


On Jul 26, 2017, at 6:12 PM, George Bosilca <bosi...@icl.utk.edu

   <mailto:bosi...@icl.utk.edu>> wrote:

No, I don't have (or used where they were available) the Intel

   compiler. I used clang and gfortran. I can try on a Linux box with
   the Intel 2017 compilers.

  George.



On Wed, Jul 26, 2017 at 11:59 AM, Volker Blum

   <volker.b...@duke.edu <mailto:volker.b...@duke.edu>> wrote:

Did you use Intel Fortran 2017 as well?

(I’m asking because I did see the same issue with a combination

   of an earlier Intel Fortran 2017 version and OpenMPI on an
   Intel/Infiniband Linux HPC machine … but not Intel Fortran 2016 on
   the same machine. Perhaps I can revive my access to that
   combination somehow.)

Best wishes
Volker


On Jul 26, 2017, at 5:55 PM, George Bosilca

   <bosi...@icl.utk.edu <mailto:bosi...@icl.utk.edu>> wrote:

I thought that maybe the underlying allreduce algorithm fails

   to support MPI_IN_PLACE correctly, but I can't replicate on any
   machine (including OSX) with any number of processes.

  George.



On Wed, Jul 26, 2017 at 10:59 AM, Volker Blum

   <volker.b...@duke.edu <mailto:volker.b...@duke.edu>> wrote:

Thanks!

I tried ‘use mpi’, which compiles fine.

Same result as with ‘include mpif.h', in that the output is

* MPI_IN_PLACE does not appear to work as intended.
* Checking whether MPI_ALLREDUCE works at all.
* Without MPI_IN_PLACE, MPI_ALLREDUCE appears to work.

Hm. Any other thoughts?

Thanks again!
Best wishes
Volker


On Jul 26, 2017, at 4:06 PM, Gilles Gouaillardet

   <gilles.gouaillar...@gmail.com
   <mailto:gilles.gouaillar...@gmail.com>> wrote:

Volker,

With mpi_f08, you have to declare

Type(MPI_Comm) :: mpi_comm_global

(I am afk and not 100% sure of the syntax)

A simpler option is to

use mpi

Cheers,

Gilles

Volker Blum <volker.b...@duke.edu

   <mailto:volker.b...@duke.edu>> wrote:

Hi Gilles,

Thank you very much for the response!

Unfortunately, I don’t have access to a different system

   with the issue right now. As I said, it’s not new; it just keeps
   creeping up unexpectedly again on different platforms. What
   puzzles me is that I’ve encountered the same problem with low but
   reasonable frequency over a period of now over five years.

We can’t require F’08 in our application, unfortunately,

   since this standard is too new. Since we maintain a large
   application that has to run on a broad range of platforms, Fortran
   2008 would not work for many of our users. In a few years, this
   will be different, but not yet.

On gfortran: In our own tests, unfortunately, Intel Fortran

   consistently produced much faster executabl

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
Thanks Jeff for your offer, i will contact you off-list later


i tried a gcc+gfortran and gcc+ifort on both linux and OS X
so far, only gcc+ifort on OS X is failing
i will try icc+ifort on OS X from now

short story, MPI_IN_PLACE is not recognized as such by the ompi
fortran wrapper, and i do not know why.

the attached program can be used to evidence the issue.


Cheers,

Gilles

On Thu, Jul 27, 2017 at 2:15 PM, Volker Blum <volker.b...@duke.edu> wrote:
> Thanks! That’s great. Sounds like the exact combination I have here.
>
> Thanks also to George. Sorry that the test did not trigger on a more standard 
> platform - that would have simplified things.
>
> Best wishes
> Volker
>
>> On Jul 27, 2017, at 3:56 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>
>> Folks,
>>
>>
>> I am able to reproduce the issue on OS X (Sierra) with stock gcc (aka clang) 
>> and ifort 17.0.4
>>
>>
>> i will investigate this from now
>>
>>
>> Cheers,
>>
>> Gilles
>>
>> On 7/27/2017 9:28 AM, George Bosilca wrote:
>>> Volker,
>>>
>>> Unfortunately, I can't replicate with icc. I tried on a x86_64 box with 
>>> Intel compiler chain 17.0.4 20170411 to no avail. I also tested the 
>>> 3.0.0-rc1 tarball and the current master, and you test completes without 
>>> errors on all cases.
>>>
>>> Once you figure out an environment where you can consistently replicate the 
>>> issue, I would suggest to attach to the processes and:
>>> - make sure the MPI_IN_PLACE as seen through the Fortran layer matches what 
>>> the C layer expects
>>> - what is the collective algorithm used by Open MPI
>>>
>>> I have a "Fortran 101" level question. When you pass an array a(:) as 
>>> argument, what exactly gets passed via the Fortran interface to the 
>>> corresponding C function ?
>>>
>>>  George.
>>>
>>> On Wed, Jul 26, 2017 at 1:55 PM, Volker Blum <volker.b...@duke.edu 
>>> <mailto:volker.b...@duke.edu>> wrote:
>>>
>>>Thanks! Yes, trying with Intel 2017 would be very nice.
>>>
>>>> On Jul 26, 2017, at 6:12 PM, George Bosilca <bosi...@icl.utk.edu
>>><mailto:bosi...@icl.utk.edu>> wrote:
>>>>
>>>> No, I don't have (or used where they were available) the Intel
>>>compiler. I used clang and gfortran. I can try on a Linux box with
>>>the Intel 2017 compilers.
>>>>
>>>>   George.
>>>>
>>>>
>>>>
>>>> On Wed, Jul 26, 2017 at 11:59 AM, Volker Blum
>>><volker.b...@duke.edu <mailto:volker.b...@duke.edu>> wrote:
>>>> Did you use Intel Fortran 2017 as well?
>>>>
>>>> (I’m asking because I did see the same issue with a combination
>>>of an earlier Intel Fortran 2017 version and OpenMPI on an
>>>Intel/Infiniband Linux HPC machine … but not Intel Fortran 2016 on
>>>the same machine. Perhaps I can revive my access to that
>>>combination somehow.)
>>>>
>>>> Best wishes
>>>> Volker
>>>>
>>>> > On Jul 26, 2017, at 5:55 PM, George Bosilca
>>><bosi...@icl.utk.edu <mailto:bosi...@icl.utk.edu>> wrote:
>>>> >
>>>> > I thought that maybe the underlying allreduce algorithm fails
>>>to support MPI_IN_PLACE correctly, but I can't replicate on any
>>>machine (including OSX) with any number of processes.
>>>> >
>>>> >   George.
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Jul 26, 2017 at 10:59 AM, Volker Blum
>>><volker.b...@duke.edu <mailto:volker.b...@duke.edu>> wrote:
>>>> > Thanks!
>>>> >
>>>> > I tried ‘use mpi’, which compiles fine.
>>>> >
>>>> > Same result as with ‘include mpif.h', in that the output is
>>>> >
>>>> >  * MPI_IN_PLACE does not appear to work as intended.
>>>> >  * Checking whether MPI_ALLREDUCE works at all.
>>>> >  * Without MPI_IN_PLACE, MPI_ALLREDUCE appears to work.
>>>> >
>>>> > Hm. Any other thoughts?
>>>> >
>>>> > Thanks again!
>>>> > Best wishes
>>>> > Volker
>>>

Re: [OMPI users] NUMA interaction with Open MPI

2017-07-27 Thread Gilles Gouaillardet

Dave,


On 7/28/2017 12:54 AM, Dave Love wrote:

Gilles Gouaillardet <gilles.gouaillar...@gmail.com> writes:


Adam,

keep in mind that by default, recent Open MPI bind MPI tasks
- to cores if -np 2
- to NUMA domain otherwise

Not according to ompi_info from the latest release; it says socket.

thanks, i will double check that.
i made a simple test on KNL in SNC4 mode (1 socket, 4 numa nodes) and 
with 4 mpi tasks, the binding is per NUMA domain.
that suggests the ompi_info output is bogus, and i will double check 
which released versions are impacted too

(which is a socket in most cases, unless
you are running on a Xeon Phi)

[There have been multiple nodes/socket on x86 since Magny Cours, and
it's also relevant for POWER.  That's a reason things had to switch to
hwloc from whatever the predecessor was called.]


so unless you specifically asked mpirun to do a binding consistent
with your needs, you might simply try to ask no binding at all
mpirun --bind-to none ...

Why would you want to turn off core binding?  The resource manager is
likely to supply a binding anyhow if incomplete nodes are allocated.
unless i overlooked it, the initial post did not mention any resource 
manager.
to me, the behavior (better performance with interleaved memory) 
suggests that
the application will perform better if a single MPI tasks has its 
threads using all the
available sockets. so unless such a binding is requested (either via 
mpirun or the
resource manager), i suggested no binding at all could lead to better 
performance than

a default binding to socket/numa.

Cheers,

Gilles

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
Volker,

thanks, i will have a look at it

meanwhile, if you can reproduce this issue on a more mainstream
platform (e.g. linux + gfortran) please let me know.

since you are using ifort, Open MPI was built with Fortran 2008
bindings, so you can replace
include 'mpif.h'
with
use mpi_f08
and who knows, that might solve your issue


Cheers,

Gilles

On Wed, Jul 26, 2017 at 5:22 PM, Volker Blum <volker.b...@duke.edu> wrote:
> Dear Gilles,
>
> Thank you very much for the fast answer.
>
> Darn. I feared it might not occur on all platforms, since my former Macbook
> (with an older OpenMPI version) no longer exhibited the problem, a different
> Linux/Intel Machine did last December, etc.
>
> On this specific machine, the configure line is
>
> ./configure CC=gcc FC=ifort F77=ifort
>
> ifort version 17.0.4
>
> blum:/Users/blum/software/openmpi-3.0.0rc1> gcc -v
> Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr
> --with-gxx-include-dir=/usr/include/c++/4.2.1
> Apple LLVM version 8.1.0 (clang-802.0.42)
> Target: x86_64-apple-darwin16.6.0
> Thread model: posix
> InstalledDir:
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
>
> The full test program is appended.
>
> Compilation:
>
> mpif90 check_mpi_in_place.f90
>
> blum:/Users/blum/codes/fhi-aims/openmpi_test> which mpif90
> /usr/local/openmpi-3.0.0rc1/bin/mpif90
>
> blum:/Users/blum/codes/fhi-aims/openmpi_test> which mpirun
> /usr/local/openmpi-3.0.0rc1/bin/mpirun
>
> blum:/Users/blum/codes/fhi-aims/openmpi_test> mpirun -np 2 a.out
>  * MPI_IN_PLACE does not appear to work as intended.
>  * Checking whether MPI_ALLREDUCE works at all.
>  * Without MPI_IN_PLACE, MPI_ALLREDUCE appears to work.
>
> blum:/Users/blum/codes/fhi-aims/openmpi_test> mpirun -np 1 a.out
>  * MPI_IN_PLACE does not appear to work as intended.
>  * Checking whether MPI_ALLREDUCE works at all.
>  * Without MPI_IN_PLACE, MPI_ALLREDUCE appears to work.
>
> Hopefully, no trivial mistakes in the testcase. I just spent a few days
> tracing this issue through a fairly large code, which is where the issue
> originally arose (and leads to wrong numbers).
>
> Best wishes
> Volker
>
>
>
>
>> On Jul 26, 2017, at 9:46 AM, Gilles Gouaillardet
>> <gilles.gouaillar...@gmail.com> wrote:
>>
>> Volker,
>>
>> i was unable to reproduce this issue on linux
>>
>> can you please post your full configure command line, your gnu
>> compiler version and the full test program ?
>>
>> also, how many mpi tasks are you running ?
>>
>> Cheers,
>>
>> Gilles
>>
>> On Wed, Jul 26, 2017 at 4:25 PM, Volker Blum <volker.b...@duke.edu> wrote:
>>> Hi,
>>>
>>> I tried openmpi-3.0.0rc1.tar.gz using Intel Fortran 2017 and gcc on a
>>> current MacOS system. For this version, it seems to me that MPI_IN_PLACE
>>> returns incorrect results (while other MPI implementations, including some
>>> past OpenMPI versions, work fine).
>>>
>>> This can be seen with a simple Fortran example code, shown below. In the
>>> test, the values of all entries of an array “test_data” should be 1.0d0 if
>>> the behavior were as intended. However, the version of OpenMPI I have
>>> returns 0.d0 instead.
>>>
>>> I’ve seen this behavior on some other compute platforms too, in the past,
>>> so it wasn’t new to me. Still, I thought that this time, I’d ask. Any
>>> thoughts?
>>>
>>> Thank you,
>>> Best wishes
>>> Volker
>>>
>>>! size of test data array
>>>integer :: n_data
>>>
>>>! array that contains test data for MPI_IN_PLACE
>>>real*8, allocatable :: test_data(:)
>>>
>>>integer :: mpierr
>>>
>>>n_data = 10
>>>
>>>allocate(test_data(n_data),stat=mpierr)
>>>
>>>! seed test data array for allreduce call below
>>>if (myid.eq.0) then
>>>   test_data(:) = 1.d0
>>>else
>>>   test_data(:) = 0.d0
>>>end if
>>>
>>>! Sum the test_data array over all MPI tasks
>>>call MPI_ALLREDUCE(MPI_IN_PLACE, &
>>> test_data(:), &
>>> n_data, &
>>> MPI_DOUBLE_PRECISION, &
>>> MPI_SUM, &
>>> mpi_comm_global, &
>>> mpierr )
>>>
>>>! The value of all entries of test_data should now be 1.d0 on all MPI
>>> tasks.
>>>! If that is not the case, th

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
Volker,

i was unable to reproduce this issue on linux

can you please post your full configure command line, your gnu
compiler version and the full test program ?

also, how many mpi tasks are you running ?

Cheers,

Gilles

On Wed, Jul 26, 2017 at 4:25 PM, Volker Blum  wrote:
> Hi,
>
> I tried openmpi-3.0.0rc1.tar.gz using Intel Fortran 2017 and gcc on a current 
> MacOS system. For this version, it seems to me that MPI_IN_PLACE returns 
> incorrect results (while other MPI implementations, including some past 
> OpenMPI versions, work fine).
>
> This can be seen with a simple Fortran example code, shown below. In the 
> test, the values of all entries of an array “test_data” should be 1.0d0 if 
> the behavior were as intended. However, the version of OpenMPI I have returns 
> 0.d0 instead.
>
> I’ve seen this behavior on some other compute platforms too, in the past, so 
> it wasn’t new to me. Still, I thought that this time, I’d ask. Any thoughts?
>
> Thank you,
> Best wishes
> Volker
>
> ! size of test data array
> integer :: n_data
>
> ! array that contains test data for MPI_IN_PLACE
> real*8, allocatable :: test_data(:)
>
> integer :: mpierr
>
> n_data = 10
>
> allocate(test_data(n_data),stat=mpierr)
>
> ! seed test data array for allreduce call below
> if (myid.eq.0) then
>test_data(:) = 1.d0
> else
>test_data(:) = 0.d0
> end if
>
> ! Sum the test_data array over all MPI tasks
> call MPI_ALLREDUCE(MPI_IN_PLACE, &
>  test_data(:), &
>  n_data, &
>  MPI_DOUBLE_PRECISION, &
>  MPI_SUM, &
>  mpi_comm_global, &
>  mpierr )
>
> ! The value of all entries of test_data should now be 1.d0 on all MPI 
> tasks.
> ! If that is not the case, then the MPI_IN_PLACE flag may be broken.
>
>
>
>
>
>
> Volker Blum
> Associate Professor
> Ab Initio Materials Simulations
> Duke University, MEMS Department
> 144 Hudson Hall, Box 90300, Duke University, Durham, NC 27708, USA
>
> volker.b...@duke.edu
> https://aims.pratt.duke.edu
> +1 (919) 660 5279
> Twitter: Aimsduke
>
> Office:  Hudson Hall
>
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
Volker,

With mpi_f08, you have to declare

Type(MPI_Comm) :: mpi_comm_global

(I am afk and not 100% sure of the syntax)

A simpler option is to

use mpi

Cheers,

Gilles

Volker Blum <volker.b...@duke.edu> wrote:
>Hi Gilles,
>
>Thank you very much for the response!
>
>Unfortunately, I don’t have access to a different system with the issue right 
>now. As I said, it’s not new; it just keeps creeping up unexpectedly again on 
>different platforms. What puzzles me is that I’ve encountered the same problem 
>with low but reasonable frequency over a period of now over five years.
>
>We can’t require F’08 in our application, unfortunately, since this standard 
>is too new. Since we maintain a large application that has to run on a broad 
>range of platforms, Fortran 2008 would not work for many of our users. In a 
>few years, this will be different, but not yet.
>
>On gfortran: In our own tests, unfortunately, Intel Fortran consistently 
>produced much faster executable code in the past. The latter observation may 
>also change someday, but for us, the performance difference was an important 
>constraint.
>
>I did suspect mpif.h, too. Not sure how to best test this hypothesis, however. 
>
>Just replacing 
>
>> include 'mpif.h'
>> with
>> use mpi_f08
>
>did not work, for me. 
>
>This produces a number of compilation errors:
>
>blum:/Users/blum/codes/fhi-aims/openmpi_test> mpif90 check_mpi_in_place_08.f90 
>-o check_mpi_in_place_08.x
>check_mpi_in_place_08.f90(55): error #6303: The assignment operation or the 
>binary expression operation is invalid for the data types of the two operands. 
>  [MPI_COMM_WORLD]
>mpi_comm_global = MPI_COMM_WORLD
>--^
>check_mpi_in_place_08.f90(57): error #6285: There is no matching specific 
>subroutine for this generic subroutine call.   [MPI_COMM_SIZE]
>call MPI_COMM_SIZE(mpi_comm_global, n_tasks, mpierr)
>-^
>check_mpi_in_place_08.f90(58): error #6285: There is no matching specific 
>subroutine for this generic subroutine call.   [MPI_COMM_RANK]
>call MPI_COMM_RANK(mpi_comm_global, myid, mpierr)
>-^
>check_mpi_in_place_08.f90(75): error #6285: There is no matching specific 
>subroutine for this generic subroutine call.   [MPI_ALLREDUCE]
>call MPI_ALLREDUCE(MPI_IN_PLACE, &
>-^
>check_mpi_in_place_08.f90(94): error #6285: There is no matching specific 
>subroutine for this generic subroutine call.   [MPI_ALLREDUCE]
>call MPI_ALLREDUCE(check_success, aux_check_success, 1, MPI_LOGICAL, &
>-^
>check_mpi_in_place_08.f90(119): error #6285: There is no matching specific 
>subroutine for this generic subroutine call.   [MPI_ALLREDUCE]
>   call MPI_ALLREDUCE(test_data(:), &
>^
>check_mpi_in_place_08.f90(140): error #6285: There is no matching specific 
>subroutine for this generic subroutine call.   [MPI_ALLREDUCE]
>   call MPI_ALLREDUCE(check_conventional_mpi, aux_check_success, 1, 
> MPI_LOGICAL, &
>--------^
>compilation aborted for check_mpi_in_place_08.f90 (code 1)
>
>This is an interesting result, however … what might I be missing? Another use 
>statement?
>
>Best wishes
>Volker
>
>> On Jul 26, 2017, at 2:53 PM, Gilles Gouaillardet 
>> <gilles.gouaillar...@gmail.com> wrote:
>> 
>> Volker,
>> 
>> thanks, i will have a look at it
>> 
>> meanwhile, if you can reproduce this issue on a more mainstream
>> platform (e.g. linux + gfortran) please let me know.
>> 
>> since you are using ifort, Open MPI was built with Fortran 2008
>> bindings, so you can replace
>> include 'mpif.h'
>> with
>> use mpi_f08
>> and who knows, that might solve your issue
>> 
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Wed, Jul 26, 2017 at 5:22 PM, Volker Blum <volker.b...@duke.edu> wrote:
>>> Dear Gilles,
>>> 
>>> Thank you very much for the fast answer.
>>> 
>>> Darn. I feared it might not occur on all platforms, since my former Macbook
>>> (with an older OpenMPI version) no longer exhibited the problem, a different
>>> Linux/Intel Machine did last December, etc.
>>> 
>>> On this specific machine, the configure line is
>>> 
>>> ./configure CC=gcc FC=ifort F77=ifort
>>> 
>>> ifort version 17.0.4
>>> 
>>> blum:/Users/blum/software/openmpi-3.0.0rc1> gcc -v
>>> Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr
>>> --with-gxx-include-dir=/usr/include/c++/4.2.1
>>> Apple LLVM version 8.1.0 (clang-802.0.42)
>>> Target: x86_64-apple-darwin16.6

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet

Folks,


I am able to reproduce the issue on OS X (Sierra) with stock gcc (aka 
clang) and ifort 17.0.4



i will investigate this from now


Cheers,

Gilles

On 7/27/2017 9:28 AM, George Bosilca wrote:

Volker,

Unfortunately, I can't replicate with icc. I tried on a x86_64 box 
with Intel compiler chain 17.0.4 20170411 to no avail. I also tested 
the 3.0.0-rc1 tarball and the current master, and you test completes 
without errors on all cases.


Once you figure out an environment where you can consistently 
replicate the issue, I would suggest to attach to the processes and:
- make sure the MPI_IN_PLACE as seen through the Fortran layer matches 
what the C layer expects

- what is the collective algorithm used by Open MPI

I have a "Fortran 101" level question. When you pass an array a(:) as 
argument, what exactly gets passed via the Fortran interface to the 
corresponding C function ?


  George.

On Wed, Jul 26, 2017 at 1:55 PM, Volker Blum <volker.b...@duke.edu 
<mailto:volker.b...@duke.edu>> wrote:


Thanks! Yes, trying with Intel 2017 would be very nice.

> On Jul 26, 2017, at 6:12 PM, George Bosilca <bosi...@icl.utk.edu
<mailto:bosi...@icl.utk.edu>> wrote:
>
> No, I don't have (or used where they were available) the Intel
compiler. I used clang and gfortran. I can try on a Linux box with
the Intel 2017 compilers.
>
>   George.
>
>
>
> On Wed, Jul 26, 2017 at 11:59 AM, Volker Blum
<volker.b...@duke.edu <mailto:volker.b...@duke.edu>> wrote:
> Did you use Intel Fortran 2017 as well?
>
> (I’m asking because I did see the same issue with a combination
of an earlier Intel Fortran 2017 version and OpenMPI on an
Intel/Infiniband Linux HPC machine … but not Intel Fortran 2016 on
the same machine. Perhaps I can revive my access to that
combination somehow.)
>
> Best wishes
> Volker
>
> > On Jul 26, 2017, at 5:55 PM, George Bosilca
<bosi...@icl.utk.edu <mailto:bosi...@icl.utk.edu>> wrote:
> >
> > I thought that maybe the underlying allreduce algorithm fails
to support MPI_IN_PLACE correctly, but I can't replicate on any
machine (including OSX) with any number of processes.
> >
> >   George.
> >
> >
> >
> > On Wed, Jul 26, 2017 at 10:59 AM, Volker Blum
<volker.b...@duke.edu <mailto:volker.b...@duke.edu>> wrote:
> > Thanks!
> >
> > I tried ‘use mpi’, which compiles fine.
> >
> > Same result as with ‘include mpif.h', in that the output is
> >
> >  * MPI_IN_PLACE does not appear to work as intended.
> >  * Checking whether MPI_ALLREDUCE works at all.
    > >  * Without MPI_IN_PLACE, MPI_ALLREDUCE appears to work.
> >
> > Hm. Any other thoughts?
> >
> > Thanks again!
> > Best wishes
> > Volker
> >
> > > On Jul 26, 2017, at 4:06 PM, Gilles Gouaillardet
<gilles.gouaillar...@gmail.com
<mailto:gilles.gouaillar...@gmail.com>> wrote:
> > >
> > > Volker,
> > >
> > > With mpi_f08, you have to declare
> > >
> > > Type(MPI_Comm) :: mpi_comm_global
> > >
> > > (I am afk and not 100% sure of the syntax)
> > >
> > > A simpler option is to
> > >
> > > use mpi
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > Volker Blum <volker.b...@duke.edu
<mailto:volker.b...@duke.edu>> wrote:
> > >> Hi Gilles,
> > >>
> > >> Thank you very much for the response!
> > >>
> > >> Unfortunately, I don’t have access to a different system
with the issue right now. As I said, it’s not new; it just keeps
creeping up unexpectedly again on different platforms. What
puzzles me is that I’ve encountered the same problem with low but
reasonable frequency over a period of now over five years.
> > >>
> > >> We can’t require F’08 in our application, unfortunately,
since this standard is too new. Since we maintain a large
application that has to run on a broad range of platforms, Fortran
2008 would not work for many of our users. In a few years, this
will be different, but not yet.
> > >>
> > >> On gfortran: In our own tests, unfortunately, Intel Fortran
consistently produced much faster executable code in the past. The
latter observation may also change someday, but for us, the
performance difference was an important const

Re: [OMPI users] Open MPI in a Infiniband dual-rail configuration issues

2017-07-19 Thread Gilles Gouaillardet
Ludovic,

what happens here is that by default, a MPI task will only use the
closest IB device.
since tasks are bound to a socket, that means that tasks on socket 0
will only use mlx4_0, and tasks on socket 1 will only use mlx4_1.
because these are on independent subnets, that also means that tasks
on socket 0 cannot communicate with tasks on socket 1 via the openib
btl.

so you have to explicitly direct Open MPI to use all the IB interfaces

mpirun --mca btl_openib_ignore_locality 1 ...

i do not think that will perform optimally though :-(
for this type of settings, i'd rather suggest all IB ports are on the
same subnet


Cheers,

Gilles

On Wed, Jul 19, 2017 at 9:20 PM, Ludovic Raess  wrote:
> Hi,
>
> We have an issue on our 32 nodes Linux cluster regarding the usage of Open
> MPI in a Infiniband dual-rail configuration.
>
> Node config:
> - Supermicro dual socket Xeon E5 v3 6 cores CPUs
> - 4 Titan X GPUs
> - 2 IB Connect X FDR single port HCA (mlx4_0 and mlx4_1)
> - Centos 6.6, OFED 3.1, openmpi 2.0.0, gcc 5.4, cuda 7
>
> IB dual rail configuration: two independent IB switches (36 ports), each of
> the two single port IB HCA is connected to its own IB subnet.
>
> The nodes are additionally connected via Ethernet for admin.
>
> 
>
> Consider the node topology below as being valid for every of the 32 nodes
> from the cluster:
>
> At the PCIe root complex level, each CPU manages two GPUs and a single IB
> card :
> CPU0 |CPU1
> mlx4_0   |mlx4_1
> GPU0 |GPU2
> GPU1 |GPU3
>
> MPI ranks are bounded to a socket via a rankfile and are distributed on the
> 2 sockets of each node :
> rank 0=node01 slot=0:2
> rank 1=node01 slot=1:2
> rank 2=node02 slot=0:2
> ...
> rank n=nodeNN slot=0,1:2
>
>
> case 1: with a single IB HCA used (any one of the two), all ranks can
> communicate with each other via
> openib only, and this independently of their relative socket
> binding. The use of tcp btl can be
> explicitly disabled as there is no tcp traffic.
>
> "mpirun -rf rankfile --mca btl_openib_if_include mlx4_0 --mca btl
> self,openib a.out"
>
> case 2: in some rare cases, the topology of our MPI job is such that
> processes on socket 0 communicate only with
> other processes on socket 0 and the same is true for processes on
> socket 1. In this context, the two IB rails
> are effectively used in parallel and all ranks communicate as needed
> via openib only, no tcp traffic.
>
> "mpirun -rf rankfile --mca btl_openib_if_include mlx4_0,mlx4_1 --mca
> btl self,openib a.out"
>
> case 3: most of the time we have "cross socket" communications between ranks
> on different nodes.
> In this context Open MPI reverts to using tcp when communications
> involve even and odd sockets,
> and it slows down our jobs.
>
> mpirun -rf rankfile --mca btl_openib_if_include mlx4_0,mlx4_1 a.out
> [node01.octopoda:16129] MCW rank 0 bound to socket 0[core 2[hwt 0]]:
> [././B/././.][./././././.]
> [node02.octopoda:12061] MCW rank 1 bound to socket 1[core 10[hwt 0]]:
> [./././././.][././././B/.]
> [node02.octopoda:12062] [rank=1] openib: skipping device mlx4_0; it is too
> far away
> [node01.octopoda:16130] [rank=0] openib: skipping device mlx4_1; it is too
> far away
> [node02.octopoda:12062] [rank=1] openib: using port mlx4_1:1
> [node01.octopoda:16130] [rank=0] openib: using port mlx4_0:1
> [node02.octopoda:12062] mca: bml: Using self btl to [[11337,1],1] on node
> node02
> [node01.octopoda:16130] mca: bml: Using self btl to [[11337,1],0] on node
> node01
> [node02.octopoda:12062] mca: bml: Using tcp btl to [[11337,1],0] on node
> node01
> [node02.octopoda:12062] mca: bml: Using tcp btl to [[11337,1],0] on node
> node01
> [node02.octopoda:12062] mca: bml: Using tcp btl to [[11337,1],0] on node
> node01
> [node01.octopoda:16130] mca: bml: Using tcp btl to [[11337,1],1] on node
> node02
> [node01.octopoda:16130] mca: bml: Using tcp btl to [[11337,1],1] on node
> node02
> [node01.octopoda:16130] mca: bml: Using tcp btl to [[11337,1],1] on node
> node02
>
>
> trying to force using the two IB HCA and to disable the use of tcp
> btl results in the following error
>
> mpirun -rf rankfile --mca btl_openib_if_include mlx4_0,mlx4_1 --mca btl
> self,openib a.out
> [node02.octopoda:11818] MCW rank 1 bound to socket 1[core 10[hwt 0]]:
> [./././././.][././././B/.]
> [node01.octopoda:15886] MCW rank 0 bound to socket 0[core 2[hwt 0]]:
> [././B/././.][./././././.]
> [node01.octopoda:15887] [rank=0] openib: skipping device mlx4_1; it is too
> far away
> [node02.octopoda:11819] [rank=1] openib: skipping device mlx4_0; it is too
> far away
> [node01.octopoda:15887] [rank=0] openib: using port mlx4_0:1
> [node02.octopoda:11819] [rank=1] openib: using port mlx4_1:1
> [node02.octopoda:11819] mca: bml: Using self btl to [[25017,1],1] on node
> node02
> [node01.octopoda:15887] mca: bml: 

Re: [OMPI users] PMIx + OpenMPI

2017-08-06 Thread Gilles Gouaillardet
Charles,

did you build Open MPI with the external PMIx ?
iirc, Open MPI 2.0.x does not support cross version PMIx

Cheers,

Gilles

On Sun, Aug 6, 2017 at 7:59 PM, Charles A Taylor  wrote:
>
>> On Aug 6, 2017, at 6:53 AM, Charles A Taylor  wrote:
>>
>>
>> Anyone successfully using PMIx with OpenMPI and SLURM?  I have,
>>
>> 1. Installed an “external” version (1.1.5) of PMIx.
>> 2. Patched SLURM 15.08.13 with the SchedMD-provided PMIx patch (results in 
>> an mpi_pmix plugin along the lines of mpi_pmi2).
>> 3. Built OpenMPI 2.0.1 (tried 2.0.3 as well).
>>
>> However, when attempting to launch MPI apps (LAMMPS in this case), I get
>>
>>[c9a-s2.ufhpc:08914] PMIX ERROR: UNREACHABLE in file 
>> src/client/pmix_client.c at line 199
>>
> I should have mentioned that I’m launching with
>
>srun —mpi=pmix …
>
> If I launch with
>
>   srun —mpi=pmi2 ...
>
> the app starts and runs without issue.
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] question about run-time of a small program

2017-07-31 Thread Gilles Gouaillardet

Siegmar,


a noticeable difference is hello_1 does *not* sleep, whereas 
hello_2_slave *does*


simply comment out the sleep(...) line, and performances will be identical


Cheers,

Gilles

On 7/31/2017 9:16 PM, Siegmar Gross wrote:

Hi,

I have two versions of a small program. In the first one the process 
with rank 0
calls the function "master()" and all other ranks call the function 
"slave()" and in the second one I have two programs: one for the 
master task and another
one for the slave task. The run-time for the second version is much 
bigger than
the one for the first version. Any ideas why the version with two 
separate

programs takes that long?

loki tmp 108 mpicc -o hello_1_mpi hello_1_mpi.c
loki tmp 109 mpicc -o hello_2_mpi hello_2_mpi.c
loki tmp 110 mpicc -o hello_2_slave_mpi hello_2_slave_mpi.c
loki tmp 111 /usr/bin/time -p mpiexec -np 3 hello_1_mpi
Process 0 of 3 running on loki
Process 1 of 3 running on loki
Process 2 of 3 running on loki
...
real 0.14
user 0.00
sys 0.00
loki tmp 112 /usr/bin/time -p mpiexec -np 1 hello_2_mpi : \
  -np 2 hello_2_slave_mpi
Process 0 of 3 running on loki
Process 1 of 3 running on loki
Process 2 of 3 running on loki
...
real 23.15
user 0.00
sys 0.00
loki tmp 113 ompi_info | grep "Open MPI repo revision"
  Open MPI repo revision: v3.0.0rc2
loki tmp 114


Thank you very much for any answer in advance.


Kind regards

Siegmar


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Setting LD_LIBRARY_PATH for orted

2017-08-22 Thread Gilles Gouaillardet
Not really,

you have the option of using an orte_launch_agent i described in a
previous email

Cheers,

Gilles

On Wed, Aug 23, 2017 at 2:43 AM, Jackson, Gary L.
<gary.jack...@jhuapl.edu> wrote:
> Yup. It looks like I’m stuck with .bashrc.
>
> Thank you all for the suggestions.
>
> --
> Gary Jackson, Ph.D.
> Johns Hopkins University Applied Physics Laboratory
>
> On 8/22/17, 1:07 PM, "users on behalf of r...@open-mpi.org" 
> <users-boun...@lists.open-mpi.org on behalf of r...@open-mpi.org> wrote:
>
> I’m afraid not - that only applies the variable to the application, not 
> the daemons.
>
> Truly, your only real option is to put something in your .bashrc since 
> you cannot modify the configure.
>
> Or, if you are running in a managed environment, you can ask to have your 
> resource manager forward your environment to the allocated nodes.
>
> > On Aug 22, 2017, at 9:10 AM, Bennet Fauber <ben...@umich.edu> wrote:
> >
> > Would
> >
> >$ mpirun -x LD_LIBRARY_PATH ...
> >
> > work here?  I think from the man page for mpirun that should request
> > that it would would export the currently set value of LD_LIBRARY_PATH
> > to the remote nodes prior to executing the command there.
> >
> > -- bennet
> >
> >
> >
> > On Tue, Aug 22, 2017 at 11:55 AM, Jackson, Gary L.
> > <gary.jack...@jhuapl.edu> wrote:
> >> I’m using a build of OpenMPI provided by a third party.
> >>
> >> --
> >> Gary Jackson, Ph.D.
> >> Johns Hopkins University Applied Physics Laboratory
> >>
> >> On 8/21/17, 8:04 PM, "users on behalf of Gilles Gouaillardet" 
> <users-boun...@lists.open-mpi.org on behalf of gil...@rist.or.jp> wrote:
> >>
> >>Gary,
> >>
> >>
> >>one option (as mentioned in the error message) is to configure Open 
> MPI
> >>with --enable-orterun-prefix-by-default.
> >>
> >>this will force the build process to use rpath, so you do not have 
> to
> >>set LD_LIBRARY_PATH
> >>
> >>this is the easiest option, but cannot be used if you plan to 
> relocate
> >>the Open MPI installation directory.
> >>
> >>
> >>an other option is to use a wrapper for orted.
> >>
> >>mpirun --mca orte_launch_agent /.../myorted ...
> >>
> >>where myorted is a script that looks like
> >>
> >>#!/bin/sh
> >>
> >>export LD_LIBRARY_PATH=...
> >>
> >>exec /.../bin/orted "$@"
> >>
> >>
> >>you can make this setting system-wide by adding the following line 
> to
> >>/.../etc/openmpi-mca-params.conf
> >>
> >>orte_launch_agent = /.../myorted
> >>
> >>
> >>Cheers,
> >>
> >>
> >>Gilles
> >>
> >>
> >>On 8/22/2017 1:06 AM, Jackson, Gary L. wrote:
> >>>
> >>> I’m using a binary distribution of OpenMPI 1.10.2. As linked, it
> >>> requires certain shared libraries outside of OpenMPI for orted itself
> >>> to start. So, passing in LD_LIBRARY_PATH with the “-x” flag to mpirun
> >>> doesn’t do anything:
> >>>
> >>> $ mpirun –hostfile ${HOSTFILE} -N 1 -n 2 -x LD_LIBRARY_PATH hostname
> >>>
> >>> /path/to/orted: error while loading shared libraries: LIBRARY.so:
> >>> cannot open shared object file: No such file or directory
> >>>
> >>> 
> --
> >>>
> >>> ORTE was unable to reliably start one or more daemons.
> >>>
> >>> This usually is caused by:
> >>>
> >>> * not finding the required libraries and/or binaries on
> >>>
> >>> one or more nodes. Please check your PATH and LD_LIBRARY_PATH
> >>>
> >>> settings, or configure OMPI with --enable-orterun-prefix-by-default
> >>>
> >>> * lack of authority to execute on one or more specified nodes.
> >>>
> >>> Please verify your allocation and authorities.
> >>>
&g

Re: [OMPI users] MPI_Init() failure

2017-05-17 Thread Gilles Gouaillardet

Folks,


for the records, this was investigated off-list

- the root cause was bad permissions on the /.../lib/openmpi directory 
(no components could be found)


- then it was found tm support was not built-in, so mpirun did not 
behave as expected under torque/pbs



Cheers,


Gilles


On 5/15/2017 6:03 PM, Ioannis Botsis wrote:

Hi

I am trying to run the following simple demo to a cluster of two nodes

-- 


#include 
#include 

int main(int argc, char** argv) {
MPI_Init(NULL, NULL);

int world_size;
MPI_Comm_size(MPI_COMM_WORLD, _size);

int world_rank;
MPI_Comm_rank(MPI_COMM_WORLD, _rank);

char processor_name[MPI_MAX_PROCESSOR_NAME];
int name_len;
MPI_Get_processor_name(processor_name, _len);

printf("Hello world from processor %s, rank %d"   " out of %d 
processors\n",  processor_name, world_rank, world_size);


MPI_Finalize();
}
- 



i get always the message

 


It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
-- 



any hint?

Ioannis Botsis



___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users



___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] (no subject)

2017-05-16 Thread Gilles Gouaillardet

Thanks for all the information,


what i meant by

mpirun --mca shmem_base_verbose 100 ...

is really you modify your mpirun command line (or your torque script if 
applicable) and add


--mca shmem_base_verbose 100

right after mpirun


Cheers,


Gilles


On 5/16/2017 3:59 AM, Ioannis Botsis wrote:

Hi Gilles

Thank you for your prompt response.

Here is some information about the system

Ubuntu 16.04 server
Linux-4.4.0-75-generic-x86_64-with-Ubuntu-16.04-xenial

On HP PROLIANT DL320R05 Generation 5, 4GB RAM,  4x120GB raid-1 HDD,  2
ethernet ports 10/100/1000
HP StorageWorks 70 Modular Smart Array with 14x120GB HDD (RAID-5)

44 HP Proliant BL465c server blade, double AMD Opteron Model 2218(2.6GHz,
2MB, 95W), 4 GB RAM, 2 NC370i Multifunction Gigabit Servers Adapters, 120GB

User's area is shared with the nodes.

ssh and torque 6.0.2 services works fine

Torque and openmpi 2.1.0 are installed from tarball.   configure
--prefix=/storage/exp_soft/tuc  is used for the deployment of openmpi 2.1.0.
After make and make install binaries, lib and include files of openmpi2.1.0
are located under /storage/exp_soft/tuc .

/storage is a shared file system for all the nodes of the cluster

$PATH:
 /storage/exp_soft/tuc/bin
 /storage/exp_soft/tuc/sbin
 /storage/exp_soft/tuc/torque/bin
 /storage/exp_soft/tuc/torque/sbin
 /usr/local/sbin
 /usr/local/bin
 /usr/sbin
 /usr/bin
 /sbin
 /bin
 /snap/bin


LD_LIBRARY_PATH=/storage/exp_soft/tuc/lib

C_INCLUDE_PATH=/storage/exp_soft/tuc/include

I use also jupyterhub (with cluster tab enabled) as a user interface to the
cluster. After the installation of python and some dependencies mpich
and openmpi are also installed in the system directories.



--
mpirun --allow-run-as-root --mca shmem_base_verbose 100 ...

[se01.grid.tuc.gr:19607] mca: base: components_register: registering
framework shmem components
[se01.grid.tuc.gr:19607] mca: base: components_register: found loaded
component sysv
[se01.grid.tuc.gr:19607] mca: base: components_register: component sysv
register function successful
[se01.grid.tuc.gr:19607] mca: base: components_register: found loaded
component posix
[se01.grid.tuc.gr:19607] mca: base: components_register: component posix
register function successful
[se01.grid.tuc.gr:19607] mca: base: components_register: found loaded
component mmap
[se01.grid.tuc.gr:19607] mca: base: components_register: component mmap
register function successful
[se01.grid.tuc.gr:19607] mca: base: components_open: opening shmem
components
[se01.grid.tuc.gr:19607] mca: base: components_open: found loaded component
sysv
[se01.grid.tuc.gr:19607] mca: base: components_open: component sysv open
function successful
[se01.grid.tuc.gr:19607] mca: base: components_open: found loaded component
posix
[se01.grid.tuc.gr:19607] mca: base: components_open: component posix open
function successful
[se01.grid.tuc.gr:19607] mca: base: components_open: found loaded component
mmap
[se01.grid.tuc.gr:19607] mca: base: components_open: component mmap open
function successful
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: Auto-selecting shmem
components
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Querying
component (run-time) [sysv]
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Query of
component [sysv] set priority to 30
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Querying
component (run-time) [posix]
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Query of
component [posix] set priority to 40
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Querying
component (run-time) [mmap]
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Query of
component [mmap] set priority to 50
[se01.grid.tuc.gr:19607] shmem: base: runtime_query: (shmem) Selected
component [mmap]
[se01.grid.tuc.gr:19607] mca: base: close: unloading component sysv
[se01.grid.tuc.gr:19607] mca: base: close: unloading component posix
[se01.grid.tuc.gr:19607] shmem: base: best_runnable_component_name:
Searching for best runnable component.
[se01.grid.tuc.gr:19607] shmem: base: best_runnable_component_name: Found
best runnable component: (mmap).
--
mpirun was unable to find the specified executable file, and therefore
did not launch the job.  This error was first reported for process
rank 0; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command
   line parameter option (remember that mpirun interprets the first
   unrecognized command line token as the executable).

Node:   se01
Executable: ...

Re: [OMPI users] mpi_scatterv problem in fortran

2017-05-15 Thread Gilles Gouaillardet

Hi,


if you run this under a debugger and look at how MPI_Scatterv is 
invoked, you will find that


- sendcounts = {1, 1, 1}
- resizedtype has size 32
- recvcount*sizeof(MPI_INTEGER) = 32 on task 0, but 16 on task 1 and 2

=> too much data is sent to tasks 1 and 2, hence the error.

in this case (3 MPI tasks), my best bet is
- sendcounts should be {2, 1, 1}
- resizedtype should be a row (e.g. 16 bytes)
- recvcount is fine (e.g. 8,4,4)

since the program fails on both MPICH and OpenMPI, the error is most 
likely in the program itself



Best regards,

Gilles


On 5/15/2017 3:30 PM, Siva Srinivas Kolukula wrote:


I want to scatter matrix from root to other processors using scatterv. 
I am creating a communicator topology using /mpi_cart_create/. As an 
example I have the below code in fortran:


|PROGRAM SendRecv USE mpi IMPLICIT none integer, PARAMETER :: m = 4, n 
= 4 integer, DIMENSION(m,n) :: a, b,h integer :: i,j,count 
integer,allocatable, dimension(:,:):: loc ! local piece of global 2d 
array INTEGER :: istatus(MPI_STATUS_SIZE),ierr integer, dimension(2) 
:: sizes, subsizes, starts INTEGER :: ista,iend,jsta,jend,ilen,jlen 
INTEGER :: iprocs, jprocs, nprocs integer,allocatable,dimension(:):: 
rcounts, displs INTEGER :: rcounts0,displs0 integer, PARAMETER :: ROOT 
= 0 integer :: dims(2),coords(2) logical :: periods(2) data 
periods/2*.false./ integer :: status(MPI_STATUS_SIZE) integer :: 
comm2d,source,myrank integer :: newtype, resizedtype integer :: 
comsize,charsize integer(kind=MPI_ADDRESS_KIND) :: extent, begin CALL 
MPI_INIT(ierr) CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr) CALL 
MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) ! Get a new communicator 
for a decomposition of the domain. dims(1) = 0 dims(2) = 0 CALL 
MPI_DIMS_CREATE(nprocs,2,dims,ierr) if (myrank.EQ.Root) then print 
*,nprocs,'processors have been arranged 
into',dims(1),'X',dims(2),'grid' endif CALL 
MPI_CART_CREATE(MPI_COMM_WORLD,2,dims,periods,.true., & comm2d,ierr) ! 
Get my position in this communicator CALL 
MPI_COMM_RANK(comm2d,myrank,ierr) ! Get the decomposition CALL 
fnd2ddecomp(comm2d,m,n,ista,iend,jsta,jend) ! print 
*,ista,jsta,iend,jend ilen = iend - ista + 1 jlen = jend - jsta + 1 
CALL MPI_Cart_get(comm2d,2,dims,periods,coords,ierr) iprocs = dims(1) 
jprocs = dims(2) ! define the global matrix if (myrank==ROOT) then 
count = 0 do j = 1,n do i = 1,m a(i,j) = count count = count+1 enddo 
enddo print *, 'global matrix is: ' do 90 i=1,m do 80 j = 1,n 
write(*,70)a(i,j) 70 format(2x,I5,$) 80 continue print *, ' ' 90 
continue endif call MPI_Barrier(MPI_COMM_WORLD, ierr) starts = [0,0] 
sizes = [m, n] subsizes = [ilen, jlen] call 
MPI_Type_create_subarray(2, sizes, subsizes, starts, & 
MPI_ORDER_FORTRAN, MPI_INTEGER, & newtype, ierr) call 
MPI_Type_size(MPI_INTEGER, charsize, ierr) begin = 0 extent = charsize 
call MPI_Type_create_resized(newtype, begin, extent, resizedtype, 
ierr) call MPI_Type_commit(resizedtype, ierr) ! get counts and 
displacmeents allocate(rcounts(nprocs),displs(nprocs)) rcounts0 = 1 
displs0 = (ista-1) + (jsta-1)*m CALL 
MPI_Allgather(rcounts0,1,MPI_INT,rcounts,1,MPI_INT,MPI_COMM_WORLD,IERR) 
CALL 
MPI_Allgather(displs0,1,MPI_INT,displs,1,MPI_INT,MPI_COMM_WORLD,IERR) 
CALL MPI_Barrier(MPI_COMM_WORLD, ierr) ! scatter data 
allocate(loc(ilen,jlen)) call 
MPI_Scatterv(a,rcounts,displs,resizedtype, & 
loc,ilen*jlen,MPI_INTEGER, & ROOT,MPI_COMM_WORLD,ierr) ! print each 
processor matrix do source = 0,nprocs-1 if (myrank.eq.source) then 
print *,'myrank:',source do i=1,ilen do j = 1,jlen 
write(*,701)loc(i,j) 701 format(2x,I5,$) enddo print *, ' ' enddo 
endif call MPI_Barrier(MPI_COMM_WORLD, ierr) enddo call 
MPI_Type_free(newtype,ierr) call MPI_Type_free(resizedtype,ierr) 
deallocate(rcounts,displs) deallocate(loc) CALL MPI_FINALIZE(ierr) 
contains subroutine fnd2ddecomp(comm2d,m,n,ista,iend,jsta,jend) 
integer comm2d integer m,n,ista,jsta,iend,jend integer 
dims(2),coords(2),ierr logical periods(2) ! Get (i,j) position of a 
processor from Cartesian topology. CALL 
MPI_Cart_get(comm2d,2,dims,periods,coords,ierr) ! Decomposition in 
first (ie. X) direction CALL 
MPE_DECOMP1D(m,dims(1),coords(1),ista,iend) ! Decomposition in second 
(ie. Y) direction CALL MPE_DECOMP1D(n,dims(2),coords(2),jsta,jend) end 
subroutine fnd2ddecomp SUBROUTINE MPE_DECOMP1D(n,numprocs,myid,s,e) 
integer n,numprocs,myid,s,e,nlocal,deficit nlocal = n / numprocs s = 
myid * nlocal + 1 deficit = mod(n,numprocs) s = s + min(myid,deficit) 
! Give one more slice to processors if (myid .lt. deficit) then nlocal 
= nlocal + 1 endif e = s + nlocal - 1 if (e .gt. n .or. myid .eq. 
numprocs-1) e = n end subroutine MPE_DECOMP1D END program SendRecv |


I am generating a 4x4 matrix, and using scatterv I am sending the 
blocks of matrices to other processors. Code works fine for 4,2 and 16 
processors. But throws a error for three processors. What 
modifications I have to do make it work for any number of given 
processors.


Global 

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet

Gabriele,


can you

mpirun --mca btl_base_verbose 100 -np 2 ...


so we can figure out why nor sm nor vader is used ?


Cheers,


Gilles


On 5/19/2017 4:23 PM, Gabriele Fatigati wrote:

Oh no, by using two procs:


findActiveDevices Error
We found no active IB device ports
findActiveDevices Error
We found no active IB device ports
--
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[12380,1],0]) is on host: openpower
  Process 2 ([[12380,1],1]) is on host: openpower
  BTLs attempted: self

Your MPI job is now going to abort; sorry.
--
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***and potentially your MPI job)
--
MPI_INIT has failed because at least one MPI process is unreachable
from another.  This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used.  Your MPI job will now abort.

You may wish to try to narrow down the problem;
 * Check the output of ompi_info to see which BTL/MTL plugins are
   available.
 * Run your application with MPI_THREAD_SINGLE.
 * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
   if using MTL-based communications) to see exactly which
   communication plugins were considered and/or discarded.
--
[openpower:88867] 1 more process has sent help message 
help-mca-bml-r2.txt / unreachable proc
[openpower:88867] Set MCA parameter "orte_base_help_aggregate" to 0 to 
see all help / error messages
[openpower:88867] 1 more process has sent help message 
help-mpi-runtime.txt / mpi_init:startup:pml-add-procs-fail






2017-05-19 9:22 GMT+02:00 Gabriele Fatigati <g.fatig...@cineca.it 
<mailto:g.fatig...@cineca.it>>:


Hi GIlles,

using your command with one MPI procs I get:

findActiveDevices Error
We found no active IB device ports
Hello world from rank 0  out of 1 processors

So it seems to work apart the error message.


2017-05-19 9:10 GMT+02:00 Gilles Gouaillardet <gil...@rist.or.jp
<mailto:gil...@rist.or.jp>>:

Gabriele,


so it seems pml/pami assumes there is an infiniband card
available (!)

i guess IBM folks will comment on that shortly.


meanwhile, you do not need pami since you are running on a
single node

mpirun --mca pml ^pami ...

should do the trick

(if it does not work, can run and post the logs)

mpirun --mca pml ^pami --mca pml_base_verbose 100 ...


Cheers,


Gilles


On 5/19/2017 4:01 PM, Gabriele Fatigati wrote:

Hi John,
Infiniband is not used, there is a single node on this
machine.

2017-05-19 8:50 GMT+02:00 John Hearns via users
<users@lists.open-mpi.org
<mailto:users@lists.open-mpi.org>
<mailto:users@lists.open-mpi.org
<mailto:users@lists.open-mpi.org>>>:

Gabriele,   pleae run  'ibv_devinfo'
It looks to me like you may have the physical
interface cards in
these systems, but you do not have the correct drivers or
libraries loaded.

I have had similar messages when using Infiniband on
x86 systems -
which did not have libibverbs installed.


On 19 May 2017 at 08:41, Gabriele Fatigati
<g.fatig...@cineca.it <mailto:g.fatig...@cineca.it>
<mailto:g.fatig...@cineca.it
<mailto:g.fatig...@cineca.it>>> wrote:

Hi Gilles, using your command:

[openpower:88536] mca: base: components_register:
registering
framework pml components
[openpower:88536] mca: base: components_register:
found loaded
component pami
[openpower:88536] mca: base: components_register:
component
pami register function successful
[openpow

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet

Gabriele,


i am sorry, i really meant

mpirun --mca pml ^pami --mca btl_base_verbose 100 ...


Cheers,

Gilles

On 5/19/2017 4:28 PM, Gabriele Fatigati wrote:

Using:

mpirun --mca pml ^pami --mca pml_base_verbose 100  -n 2  ./prova_mpi

I attach the output

2017-05-19 9:16 GMT+02:00 John Hearns via users 
<users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>>:


Gabriele,
as Gilles says if you are running within a single host system, you
don not need the pami layer.
Usually you would use the btls  sm,selfthough I guess 'vader'
is the more up to date choice

On 19 May 2017 at 09:10, Gilles Gouaillardet <gil...@rist.or.jp
<mailto:gil...@rist.or.jp>> wrote:

Gabriele,


so it seems pml/pami assumes there is an infiniband card
available (!)

i guess IBM folks will comment on that shortly.


meanwhile, you do not need pami since you are running on a
single node

mpirun --mca pml ^pami ...

should do the trick

(if it does not work, can run and post the logs)

mpirun --mca pml ^pami --mca pml_base_verbose 100 ...


Cheers,


Gilles


On 5/19/2017 4:01 PM, Gabriele Fatigati wrote:

Hi John,
Infiniband is not used, there is a single node on this
machine.

2017-05-19 8:50 GMT+02:00 John Hearns via users
<users@lists.open-mpi.org
<mailto:users@lists.open-mpi.org>
<mailto:users@lists.open-mpi.org
<mailto:users@lists.open-mpi.org>>>:

Gabriele,   pleae run  'ibv_devinfo'
It looks to me like you may have the physical
interface cards in
these systems, but you do not have the correct drivers or
libraries loaded.

I have had similar messages when using Infiniband on
x86 systems -
which did not have libibverbs installed.


On 19 May 2017 at 08:41, Gabriele Fatigati
<g.fatig...@cineca.it <mailto:g.fatig...@cineca.it>
<mailto:g.fatig...@cineca.it
<mailto:g.fatig...@cineca.it>>> wrote:

Hi Gilles, using your command:

[openpower:88536] mca: base: components_register:
registering
framework pml components
[openpower:88536] mca: base: components_register:
found loaded
component pami
[openpower:88536] mca: base: components_register:
component
pami register function successful
[openpower:88536] mca: base: components_open:
opening pml
components
[openpower:88536] mca: base: components_open:
found loaded
component pami
[openpower:88536] mca: base: components_open:
component pami
open function successful
[openpower:88536] select: initializing pml
component pami
findActiveDevices Error
We found no active IB device ports
[openpower:88536] select: init returned failure
for component pami
[openpower:88536] PML pami cannot be selected
   
--

No components were able to be opened in the pml
framework.

This typically means that either no components of
this type were
installed, or none of the installed componnets can
be loaded.
Sometimes this means that shared libraries
required by these
components are unable to be found/loaded.

  Host:  openpower
  Framework: pml
   
------



2017-05-19 7:03 GMT+02:00 Gilles Gouaillardet
<gil...@rist.or.jp <mailto:gil...@rist.or.jp>
<mailto:gil...@rist.or.jp <mailto:gil...@rist.or.jp>>>:

Gabriele,


pml/pami is here, at least according to ompi_info


can you update your mpirun command like this

mpirun --mca pml_base_verbose 100 ..


and post the output ?


Cheers,

Gilles

On 5/18/2017 10:41 PM, Gabriele Fatigati wrote:

    Hi Gilles, attached the requested info

2017-05-1

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-18 Thread Gilles Gouaillardet
Gabriele,

can you
ompi_info --all | grep pml

also, make sure there is nothing in your environment pointing to an other
Open MPI install
for example
ldd a.out
should only point to IBM libraries

Cheers,

Gilles

On Thursday, May 18, 2017, Gabriele Fatigati  wrote:

> Dear OpenMPI users and developers, I'm using IBM Spectrum MPI 10.1.0 based
> on OpenMPI, so I hope there are some MPI expert can help me to solve the
> problem.
>
> When I run a simple Hello World MPI program, I get the follow error
> message:
>
> A requested component was not found, or was unable to be opened.  This
> means that this component is either not installed or is unable to be
> used on your system (e.g., sometimes this means that shared libraries
> that the component requires are unable to be found/loaded).  Note that
> Open MPI stopped checking at the first component that it did not find.
>
> Host:  openpower
> Framework: pml
> Component: pami
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   mca_pml_base_open() failed
>   --> Returned "Not found" (-13) instead of "Success" (0)
> --
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***and potentially your MPI job)
>
> My sysadmin used official IBM Spectrum packages to install MPI, so It's
> quite strange that there are some components missing (pami). Any help?
> Thanks
>
> --
> Ing. Gabriele Fatigati
>
> HPC specialist
>
> SuperComputing Applications and Innovation Department
>
> Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
>
> www.cineca.itTel:   +39 051 6171722
>
> g.fatigati [AT] cineca.it
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-05-23 Thread Gilles Gouaillardet

Tim,


On 5/18/2017 2:44 PM, Tim Jim wrote:


In summary, I have attempted to install OpenMPI on Ubuntu 16.04 to the 
following prefix: /opt/openmpi-openmpi-2.1.0. I have also manually 
added the following to my .bashrc:

export PATH="/opt/openmpi/openmpi-2.1.0/bin:$PATH"
MPI_DIR=/opt/openmpi/openmpi-2.1.0
export LD_LIBRARY_PATH=$MPI_DIR/lib:$LD_LIBRARY_PATH

I later became aware that Ubuntu may handle the LD_LIBRARY_PATH 
differently and instead added a new file containing the library 
path /opt/openmpi/openmpi-2.1.0/lib to 
/etc/ld.so.conf.d/openmpi-2-1-0.conf, in the style of everything else 
in that directory.



about that specific issue, from a linux point of view, you have two options
- have LD_LIBRARY_PATH set in your environment (manually on a per shell 
basis, via .bashrc on a per user basis, via /etc/profile.d/ompi.sh on a 
node basis)
- system wide via /etc/ld.so.conf.d/openmpi-2-1-0.conf (note you must 
run 'ldconfig' after this file is created/updated)


Open MPI gives you an other option (that i always use and usually 
recommends) :

configure with --enable-mpirun-prefix-by-default
as long as you do not plan to relocate the openmpi install directory, 
you can use this option and you will not need to worry about LD_LIBRARY_PATH
any more (and if you have several install of openmpi, they will live 
together in harmony)


fwiw, on HPC clusters, sysadmin usually install 'modules' or 'lmod', 
which is a user friendly way to set your environment with what you need.



Cheers,

Gilles
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Build problem

2017-05-24 Thread Gilles Gouaillardet

Andy,


it looks like some MPI libraries are being mixed in your environment


from the test/datatype directory, what if you

ldd .libs/lt-external32

does it resolve the the libmpi.so you expect ?


Cheers,


Gilles


On 5/25/2017 11:02 AM, Andy Riebs wrote:

Hi,

I'm trying to build OMPI on RHEL 7.2 with MOFED on an x86_64 system, 
and I'm seeing


=
   Open MPI gitclone: test/datatype/test-suite.log
=

# TOTAL: 9
# PASS:  8
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: external32



/data/swstack/packages/shmem-mellanox/openmpi-gitclone/test/datatype/.libs/lt-external32:
symbol lookup error:

/data/swstack/packages/shmem-mellanox/openmpi-gitclone/test/datatype/.libs/lt-external32:
undefined symbol: ompi_datatype_pack_external_size
FAIL external32 (exit status: 127)

I'm probably missing an obvious library or package, but 
libc++-devel.i686 and glibc-devel.i686 didn't cover this for me.


Alex, I'd like to buy a clue, please?

Andy
--
Andy Riebs
andy.ri...@hpe.com
Hewlett-Packard Enterprise
High Performance Computing Software Engineering
+1 404 648 9024
My opinions are not necessarily those of HPE
 May the source be with you!


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Hello world Runtime error: Primary job terminated normally, but 1 process returned a non-zero exit code.

2017-05-22 Thread Gilles Gouaillardet
Hi,

what if you
mpirun -np 4 ./test

Cheers,

Gilles

On Monday, May 22, 2017, Pranav Sumanth  wrote:

> Hello All,
>
> I'm able to successfully compile my code when I execute the make command.
> However, when I run the code as:
>
> mpirun -np 4 test
>
> The error generated is:
>
> ---
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> ---
> --
> mpirun detected that one or more processes exited with non-zero status, thus 
> causing
> the job to be terminated. The first process to do so was:
>
>   Process name: [[63067,1],2]
>   Exit code:1
> --
>
> I have no multiple mpi installations so I don't expect there to be a
> problem.
>
> I've been having trouble with my Hello World OpenMPI program. My main file
> is :
>
> #include 
> #include "mpi.h"
>
> using namespace std;
>
> int main(int argc, const char * argv[]) {
>
>
> MPI_Init(NULL, NULL);
>
> int size, rank;
>
> MPI_Comm_size(MPI_COMM_WORLD, );
> MPI_Comm_rank(MPI_COMM_WORLD, );
>
> cout << "The number of spawned processes are " << size << "And this is 
> the process " << rank;
>
> MPI_Finalize();
>
>
> return 0;
>
> }
>
> My makefile is:
>
> # Compiler
> CXX = mpic++
>
> # Compiler flags
> CFLAGS = -Wall -lm
>
> # Header and Library Paths
> INCLUDE = -I/usr/local/include -I/usr/local/lib -I..
> LIBRARY_INCLUDE = -L/usr/local/lib
> LIBRARIES = -l mpi
>
> # the build target executable
> TARGET = test
>
> all: $(TARGET)
>
> $(TARGET): main.cpp
> $(CXX) $(CFLAGS) -o $(TARGET) main.cpp $(INCLUDE) $(LIBRARY_INCLUDE) 
> $(LIBRARIES)
>
>
> clean:
> rm $(TARGET)
>
> The output of: mpic++ --version is:
>
> Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr 
> --with-gxx-include-dir=/usr/include/c++/4.2.1
> Apple LLVM version 8.1.0 (clang-802.0.42)
> Target: x86_64-apple-darwin16.5.0
> Thread model: posix
> InstalledDir: 
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
>
> And that for mpirun --version is:
>
> mpirun (Open MPI) 2.1.1
>
> Report bugs to http://www.open-mpi.org/community/help/
>
> What could be causing the issue?
>
> Thanks,
> Best Regards,
> Pranav
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Gilles Gouaillardet

Allan,


- on which node is mpirun invoked ?

- are you running from a batch manager ?

- is there any firewall running on your nodes ?

- how many interfaces are part of bond0 ?


the error is likely occuring when wiring-up mpirun/orted

what if you

mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 true


then (if the previous command worked)

mpirun -np 12 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 true


and finally (if both previous commands worked)

mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 ring



Cheers,

Gilles


On 5/19/2017 3:07 PM, Allan Overstreet wrote:
I experiencing many different errors with openmpi version 2.1.1. I 
have had a suspicion that this might be related to the way the servers 
were connected and configured. Regardless below is a diagram of how 
the server are configured.


__  _
   [__]|=|
   /::/|_|
   HOST: smd
   Dual 1Gb Ethernet Bonded
   .-> Bond0 IP: 192.168.1.200
   |   Infiniband Card: MHQH29B-XTR <.
   |   Ib0 IP: 10.1.0.1  |
   |   OS: Ubuntu Mate   |
   |   __ _ |
   | [__]|=||
   | /::/|_||
   |   HOST: sm1 |
   |   Dual 1Gb Ethernet Bonded  |
   |-> Bond0 IP: 192.168.1.196   |
   |   Infiniband Card: QLOGIC QLE7340 <-|
   |   Ib0 IP: 10.1.0.2  |
   |   OS: Centos 7 Minimal  |
   |   __ _ |
   | [__]|=||
   |-. /::/|_||
   | | HOST: sm2 |
   | | Dual 1Gb Ethernet Bonded  |
   | '---> Bond0 IP: 192.168.1.199   |
   __  Infiniband Card: QLOGIC QLE7340 __
  [_|||_°] Ib0 IP: 10.1.0.3 [_|||_°]
  [_|||_°] OS: Centos 7 Minimal [_|||_°]
  [_|||_°] __ _ [_|||_°]
   Gb Ethernet Switch [__]|=| Voltaire 4036 QDR Switch
   | /::/|_| |
   |   HOST: sm3  |
   |   Dual 1Gb Ethernet Bonded   |
   |-> Bond0 IP: 192.168.1.203|
   |   Infiniband Card: QLOGIC QLE7340 <--|
   |   Ib0 IP: 10.1.0.4   |
   |   OS: Centos 7 Minimal   |
   |  __ _   |
   | [__]|=|  |
   | /::/|_|  |
   |   HOST: sm4  |
   |   Dual 1Gb Ethernet Bonded   |
   |-> Bond0 IP: 192.168.1.204|
   |   Infiniband Card: QLOGIC QLE7340 <--|
   |   Ib0 IP: 10.1.0.5   |
   |   OS: Centos 7 Minimal   |
   | __ _|
   | [__]|=|   |
   | /::/|_|   |
   |   HOST: dl580|
   |   Dual 1Gb Ethernet Bonded   |
   '-> Bond0 IP: 192.168.1.201|
   Infiniband Card: QLOGIC QLE7340 <--'
   Ib0 IP: 10.1.0.6
   OS: Centos 7 Minimal

I have ensured that the Infiniband adapters can ping each other and 
every node can passwordless ssh into every other node. Every node has 
the same /etc/hosts file,


cat /etc/hosts

127.0.0.1localhost
192.168.1.200smd
192.168.1.196sm1
192.168.1.199sm2
192.168.1.203sm3
192.168.1.204sm4
192.168.1.201dl580

10.1.0.1smd-ib
10.1.0.2sm1-ib
10.1.0.3sm2-ib
10.1.0.4sm3-ib
10.1.0.5sm4-ib
10.1.0.6dl580-ib

I have been using a simple ring test program to test openmpi. The code 
for this program is attached.


The hostfile used in all the commands is,

cat ./nodes

smd slots=2
sm1 

Re: [OMPI users] Problems with IPoIB and Openib

2017-05-28 Thread Gilles Gouaillardet

Allan,


the "No route to host" error indicates there is something going wrong 
with IPoIB on your cluster


(and Open MPI is not involved whatsoever in that)

on sm3 and sm4, you can run

/sbin/ifconfig

brctl show

iptables -L

iptables -t nat -L

we might be able to figure out what is going wrong from that.


if there is no mca_btl_openib.so component, it is likely the infiniband 
headers are not available on the node you compiled Open MPI.


i guess if you configure Open MPI with

--with-verbs

it will abort if the headers are not found.

in this case, simply install them and rebuild Open MPI.

if you are unsure about that part, please compress and post your 
config.log so we can have a look at it



Cheers,


gilles


On 5/29/2017 1:03 PM, Allan Overstreet wrote:

Gilles,

I was able to ping sm4 from sm3 and sm3 from sm4. However running 
netcat from sm4 and sm5 using the commands.


[allan@sm4 ~]$ nc -l 1234

and

[allan@sm3 ~]$ echo hello | nc 10.1.0.5 1234
Ncat: No route to host.

Testing this on other nodes,

[allan@sm2 ~]$ nc -l 1234

and

[allan@sm1 ~]$ echo hello | nc 10.1.0.3 1234
Ncat: No route to host.

These nodes do not have firewalls installed, so I am confused why this 
traffic isn't getting through.


I am compiling openmpi from source and the shared library 
/home/allan/software/openmpi/install/lib/openmpi/mca_btl_openib.so 
doesn't exist.



On 05/27/2017 11:25 AM, gil...@rist.or.jp wrote:

Allan,

about IPoIB, the error message (no route to host) is very puzzling.
did you double check IPoIB is ok between all nodes ?
this error message suggests IPoIB is not working between sm3 and sm4,
this could be caused by the subnet manager, or a firewall.
ping is the first tool you should use to test that, then you can use nc
(netcat).
for example, on sm4
nc -l 1234
on sm3
echo hello | nc 10.1.0.5 1234
(expected result: "hello" should be displayed on sm4)

about openib, you first need to double check the btl/openib was built.
assuming you did not configure with --disable-dlopen, you should have a
mca_btl_openib.so
file in /.../lib/openmpi. it should be accessible by the user, and
ldd /.../lib/openmpi/mca_btl_openib.so
should not have any unresolved dependencies on *all* your nodes

Cheers,

Gilles

- Original Message -

I have been having some issues with using openmpi with tcp over IPoIB
and openib. The problems arise when I run a program that uses basic
collective communication. The two programs that I have been using are
attached.

*** IPoIB ***

The mpirun command I am using to run mpi over IPoIB is,
mpirun --mca oob_tcp_if_include 192.168.1.0/24 --mca btl_tcp_include
10.1.0.0/24 --mca pml ob1 --mca btl tcp,sm,vader,self -hostfile nodes
-np 8 ./avg 8000

This program will appear to run on the nodes, but will sit at 100% CPU
and use no memory. On the host node an error will be printed,

[sm1][[58411,1],0][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.3 failed: No route to host (113)

Using another program,

mpirun --mca oob_tcp_if_include 192.168.1.0/24 --mca btl_tcp_if_

include

10.1.0.0/24 --mca pml ob1 --mca btl tcp,sm,vader,self -hostfile nodes
-np 8 ./congrad 800
Produces the following result. This program will also run on the nodes
sm1, sm2, sm3, and sm4 at 100% and use no memory.
[sm3][[61383,1],4][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.5 failed: No route to host (113)
[sm4][[61383,1],6][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.4 failed: No route to host (113)
[sm2][[61383,1],3][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.2 failed: No route to host (113)
[sm3][[61383,1],5][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.5 failed: No route to host (113)
[sm4][[61383,1],7][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.4 failed: No route to host (113)
[sm2][[61383,1],2][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.2 failed: No route to host (113)
[sm1][[61383,1],0][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.3 failed: No route to host (113)
[sm1][[61383,1],1][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_

complete_connect]

connect() to 10.1.0.3 failed: No route to host (113)

*** openib ***

Running the congrad program over openib will produce the result,
mpirun --mca btl self,sm,openib --mca mtl ^psm --mca btl_tcp_if_

include

10.1.0.0/24 -hostfile nodes -np 8 ./avg 800
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now

abort,

***and potentially your MPI job)
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now

abort,

***and potentially your MPI job)

Re: [OMPI users] Problems with IPoIB and Openib

2017-05-29 Thread Gilles Gouaillardet

Alan,


note you do not have to use the *-ib hostnames in your host_file

these are only used for SSH, so since oob/tcp is running on your 
ethernet network, i guess you really want to use sm3 and sm4 host names.



did you also run the same netcat test but in the other direction ?
do you run 'mpirun' on sm3 ?

here are a few tests you can perform
- run tcp over ethernet
  mpirun --mca btl_tcp_if_include 192.168.1.0/24 ...
- run all 4 tasks on sm3 (host_file contains one line "sm3 slots=4") 
with tcp (e.g. --mca btl tcp,self)

- run with verbose oob and tcp
  mpirun --mca btl_base_verbose 100 --mca oob_base_verbose 100 ...

when your app hangs, you can manually run
pstack 
on the 4 MPI tasks
so we can get an idea of where they are stuck

Cheers,

Gilles

On 5/30/2017 3:49 AM, Allan Overstreet wrote:

Gilles,

OpenMPI is now working using openib on nodes sm3 and sm4! However I am 
still having some trouble getting openmpi to work over IPoIB. Using 
the command,


mpirun --mca oob_tcp_if_include 192.168.1.0/24 --mca btl_tcp_include 
10.1.0.0/24 --mca pml ob1 --mca btl tcp,sm,vader,self -hostfile 
host_file -np 4 ./congrad 400


with the hostfile,

[allan@sm3 proj]$ cat host_file
sm3-ib slots=2
sm4-ib slots=2

Will cause the command to hang.

I ran your netcat test again on sm3 and sm4,

[allan@sm3 proj]$ echo hello | nc 10.1.0.5 1234

[allan@sm4 ~]$ nc -l 1234
hello
[allan@sm4 ~]$

Thanks,
Allan

On 05/29/2017 02:14 AM, Gilles Gouaillardet wrote:

Allan,


a firewall is running on your nodes as evidenced by the iptables 
outputs.


if you do not need it, then you can simply disable it.


otherwise, you can run

iptables -I INPUT -i ib0 -j ACCEPT

iptables -I OUTPUT -o ib0 -j ACCEPT

on all your nodes and that might help

- note this allows *all* traffic on IPoIB

- some other rules in the 'nat' table might block some traffic


Cheers,


Gilles


On 5/29/2017 3:05 PM, Allan Overstreet wrote:

** ifconfig **

[allan@sm3 ~]$ ifconfig
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 192.168.1.203  netmask 255.255.255.0  broadcast 
192.168.1.255
inet6 fe80::225:90ff:fe51:aaad  prefixlen 64  scopeid 
0x20

ether 00:25:90:51:aa:ad  txqueuelen 1000  (Ethernet)
RX packets 7987  bytes 7158426 (6.8 MiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 4310  bytes 368291 (359.6 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0

enp4s0f0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether 00:25:90:51:aa:ad  txqueuelen 1000  (Ethernet)
RX packets 3970  bytes 3576526 (3.4 MiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 2154  bytes 183276 (178.9 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
device memory 0xfbde-fbdf

enp4s0f1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether 00:25:90:51:aa:ad  txqueuelen 1000  (Ethernet)
RX packets 4017  bytes 3581900 (3.4 MiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 2159  bytes 185665 (181.3 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0
device memory 0xfbd6-fbd7

ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 2044
inet 10.1.0.4  netmask 255.255.255.0  broadcast 10.1.0.255
inet6 fe80::211:7500:79:90f6  prefixlen 64  scopeid 0x20
Infiniband hardware address can be incorrect! Please read BUGS 
section in ifconfig(8).
infiniband 
80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 
txqueuelen 256  (InfiniBand)

RX packets 923  bytes 73596 (71.8 KiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 842  bytes 72724 (71.0 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
inet 127.0.0.1  netmask 255.0.0.0
inet6 ::1  prefixlen 128  scopeid 0x10
loop  txqueuelen 1  (Local Loopback)
RX packets 80  bytes 7082 (6.9 KiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 80  bytes 7082 (6.9 KiB)
TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0

[allan@sm4 openmpi]$ ifconfig
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 192.168.1.204  netmask 255.255.255.0  broadcast 
192.168.1.255
inet6 fe80::225:90ff:fe27:9fe3  prefixlen 64  scopeid 
0x20

ether 00:25:90:27:9f:e3  txqueuelen 1000  (Ethernet)
RX packets 20815  bytes 8291279 (7.9 MiB)
RX errors 0  dropped 0  overruns 0  frame 0
TX packets 17168  bytes 2261794 (2.1 MiB)
TX errors 0  dropped 0 overruns 0  carrier 0 collisions 0

enp4s0f0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether 00:25:90:27:9f:e3  txqueuelen 1000  (Ethernet)
RX packets 10365  bytes 4157996 (3.9 MiB)

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread Gilles Gouaillardet

Ralph,


the issue Siegmar initially reported was

loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi


per what you wrote, this should be equivalent to

loki hello_1 111 mpiexec -np 3 --host loki:2,exin:1 hello_1_mpi

and this is what i initially wanted to double check (but i made a typo 
in my reply)



anyway, the logs Siegmar posted indicate the two commands produce the 
same output


--
There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  hello_1_mpi

Either request fewer slots for your application, or make more slots 
available

for use.
--


to me, this is incorrect since the command line made 3 available slots.
also, i am unable to reproduce any of these issues :-(



Siegmar,

can you please post your configure command line, and try these commands 
from loki


mpiexec -np 3 --host loki:2,exin --mca plm_base_verbose 5 hostname
mpiexec -np 1 --host exin --mca plm_base_verbose 5 hostname
mpiexec -np 1 --host exin ldd ./hello_1_mpi

if Open MPI is not installed on a shared filesystem (NFS for example), 
please also double check

both install were built from the same source and with the same options


Cheers,

Gilles
On 5/30/2017 10:20 PM, r...@open-mpi.org wrote:

This behavior is as-expected. When you specify "-host foo,bar”, you have told 
us to assign one slot to each of those nodes. Thus, running 3 procs exceeds the 
number of slots you assigned.

You can tell it to set the #slots to the #cores it discovers on the node by 
using “-host foo:*,bar:*”

I cannot replicate your behavior of "-np 3 -host foo:2,bar:3” running more than 
3 procs



On May 30, 2017, at 5:24 AM, Siegmar Gross 
 wrote:

Hi Gilles,


what if you ?
mpiexec --host loki:1,exin:1 -np 3 hello_1_mpi

I need as many slots as processes so that I use "-np 2".
"mpiexec --host loki,exin -np 2 hello_1_mpi" works as well. The command
breaks, if I use at least "-np 3" and distribute the processes across at
least two machines.

loki hello_1 118 mpiexec --host loki:1,exin:1 -np 2 hello_1_mpi
Process 0 of 2 running on loki
Process 1 of 2 running on exin
Now 1 slave tasks are sending greetings.
Greetings from task 1:
  message type:3
  msg length:  131 characters
  message:
hostname:  exin
operating system:  Linux
release:   4.4.49-92.11-default
processor: x86_64
loki hello_1 119




are loki and exin different ? (os, sockets, core)

Yes, loki is a real machine and exin is a virtual one. "exin" uses a newer
kernel.

loki fd1026 108 uname -a
Linux loki 4.4.38-93-default #1 SMP Wed Dec 14 12:59:43 UTC 2016 (2d3e9d4) 
x86_64 x86_64 x86_64 GNU/Linux

loki fd1026 109 ssh exin uname -a
Linux exin 4.4.49-92.11-default #1 SMP Fri Feb 17 08:29:30 UTC 2017 (8f9478a) 
x86_64 x86_64 x86_64 GNU/Linux
loki fd1026 110

The number of sockets and cores is identical, but the processor types are
different as you can see at the end of my previous email. "loki" uses two
"Intel(R) Xeon(R) CPU E5-2620 v3" processors and "exin" two "Intel Core
Processor (Haswell, no TSX)" from QEMU. I can provide a pdf file with both
topologies (89 K) if you are interested in the output from lstopo. I've
added some runs. Most interesting in my opinion are the last two
"mpiexec --host exin:2,loki:3 -np 3 hello_1_mpi" and
"mpiexec -np 3 --host exin:2,loki:3 hello_1_mpi".
Why does mpiexec create five processes although I've asked for only three
processes? Why do I have to break the program with  for the first
of the above commands?



loki hello_1 110 mpiexec --host loki:2,exin:1 -np 3 hello_1_mpi
--
There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  hello_1_mpi

Either request fewer slots for your application, or make more slots available
for use.
--



loki hello_1 111 mpiexec --host exin:3 -np 3 hello_1_mpi
Process 0 of 3 running on exin
Process 1 of 3 running on exin
Process 2 of 3 running on exin
...



loki hello_1 115 mpiexec --host exin:2,loki:3 -np 3 hello_1_mpi
Process 1 of 3 running on loki
Process 0 of 3 running on loki
Process 2 of 3 running on loki
...

Process 0 of 3 running on exin
Process 1 of 3 running on exin
[exin][[52173,1],1][../../../../../openmpi-v3.x-201705250239-d5200ea/opal/mca/btl/tcp/btl_tcp_endpoint.c:794:mca_btl_tcp_endpoint_complete_connect]
 connect() to 193.xxx.xxx.xxx failed: Connection refused (111)

^Cloki hello_1 116




loki hello_1 116 mpiexec -np 3 --host exin:2,loki:3 hello_1_mpi
Process 0 of 3 running on loki
Process 2 of 3 running on loki
Process 1 of 3 running on loki
...
Process 1 of 3 running on exin
Process 0 of 3 running on 

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Gilles Gouaillardet

Siegmar,


the "big ORTE update" is a bunch of backports from master to v3.x

btw, does the same error occurs with master ?


i noted mpirun simply does

ssh exin orted ...

can you double check the right orted (e.g. 
/usr/local/openmpi-3.0.0_64_cc/bin/orted)


or you can try to

mpirun --mca orte_launch_agent /usr/local/openmpi-3.0.0_64_cc/bin/orted ...


Cheers,


Gilles



On 5/31/2017 3:24 PM, Siegmar Gross wrote:

Hi Gilles,

I configured Open MPI with the following command.

../openmpi-v3.x-201705250239-d5200ea/configure \
  --prefix=/usr/local/openmpi-3.0.0_64_cc \
  --libdir=/usr/local/openmpi-3.0.0_64_cc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \
  --with-jdk-headers=/usr/local/jdk1.8.0_66/include \
  JAVA_HOME=/usr/local/jdk1.8.0_66 \
  LDFLAGS="-m64 -mt -Wl,-z -Wl,noexecstack -L/usr/local/lib64 
-L/usr/local/cuda/lib64" \

  CC="cc" CXX="CC" FC="f95" \
  CFLAGS="-m64 -mt -I/usr/local/include -I/usr/local/cuda/include" \
  CXXFLAGS="-m64 -I/usr/local/include -I/usr/local/cuda/include" \
  FCFLAGS="-m64" \
  CPP="cpp -I/usr/local/include -I/usr/local/cuda/include" \
  CXXCPP="cpp -I/usr/local/include -I/usr/local/cuda/include" \
  --enable-mpi-cxx \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --with-cuda=/usr/local/cuda \
  --with-valgrind=/usr/local/valgrind \
  --enable-mpi-thread-multiple \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags="-m64 -mt" \
  --with-wrapper-cxxflags="-m64" \
  --with-wrapper-fcflags="-m64" \
  --with-wrapper-ldflags="-mt" \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc

Do you know when the fixes pending in the big ORTE update PR
are committed? Perhaps Ralph has a point suggesting not to spend
time with the problem if it may already be resolved. Nevertheless,
I added the requested information after the commands below.


Am 31.05.2017 um 04:43 schrieb Gilles Gouaillardet:

Ralph,


the issue Siegmar initially reported was

loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi


per what you wrote, this should be equivalent to

loki hello_1 111 mpiexec -np 3 --host loki:2,exin:1 hello_1_mpi

and this is what i initially wanted to double check (but i made a 
typo in my reply)



anyway, the logs Siegmar posted indicate the two commands produce the 
same output


-- 

There are not enough slots available in the system to satisfy the 3 
slots

that were requested by the application:
   hello_1_mpi

Either request fewer slots for your application, or make more slots 
available

for use.
-- 




to me, this is incorrect since the command line made 3 available slots.
also, i am unable to reproduce any of these issues :-(



Siegmar,

can you please post your configure command line, and try these 
commands from loki


mpiexec -np 3 --host loki:2,exin --mca plm_base_verbose 5 hostname



loki hello_1 112 mpiexec -np 3 --host loki:2,exin --mca 
plm_base_verbose 5 hostname
[loki:25620] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh 
path NULL
[loki:25620] plm:base:set_hnp_name: initial bias 25620 nodename hash 
3121685933

[loki:25620] plm:base:set_hnp_name: final jobfam 64424
[loki:25620] [[64424,0],0] plm:rsh_setup on agent ssh : rsh path NULL
[loki:25620] [[64424,0],0] plm:base:receive start comm
[loki:25620] [[64424,0],0] plm:base:setup_job
[loki:25620] [[64424,0],0] plm:base:setup_vm
[loki:25620] [[64424,0],0] plm:base:setup_vm creating map
[loki:25620] [[64424,0],0] setup:vm: working unmanaged allocation
[loki:25620] [[64424,0],0] using dash_host
[loki:25620] [[64424,0],0] checking node loki
[loki:25620] [[64424,0],0] ignoring myself
[loki:25620] [[64424,0],0] checking node exin
[loki:25620] [[64424,0],0] plm:base:setup_vm add new daemon [[64424,0],1]
[loki:25620] [[64424,0],0] plm:base:setup_vm assigning new daemon 
[[64424,0],1] to node exin

[loki:25620] [[64424,0],0] plm:rsh: launching vm
[loki:25620] [[64424,0],0] plm:rsh: local shell: 2 (tcsh)
[loki:25620] [[64424,0],0] plm:rsh: assuming same remote shell as 
local shell

[loki:25620] [[64424,0],0] plm:rsh: remote shell: 2 (tcsh)
[loki:25620] [[64424,0],0] plm:rsh: final template argv:
/usr/bin/ssh   orted -mca ess "env" -mca 
ess_base_jobid "4222091264" -mca ess_base_vpid "" -mca 
ess_base_num_procs "2" -mca orte_hnp_uri 
"4222091264.0;tcp://193.174.24.40:38978" -mca orte_node_regex 
"loki,exin" --mca plm_base_verbose "5" -mca plm "rsh"

[loki:25620] [[64424,0],0] plm:rsh:launch daemon 0 not a child of mine
[loki:25620] [[64424,0],0] plm:rsh: adding node exin to launch list
[loki:25620] [[64424,0],0] plm:rsh: activating launch event
[loki:25620] [[64

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Gilles Gouaillardet

Thanks Siegmar,


i was finally able to reproduce it.

the error is triggered by the VM topology, and i was able to reproduce 
it by manually removing the "NUMA" objects from the topology.



as a workaround, you can

mpirun --map-by socket ...


i will follow-up on the devel ML with Ralph.



Best regards,


Gilles


On 5/31/2017 4:20 PM, Siegmar Gross wrote:

Hi Gilles,

Am 31.05.2017 um 08:38 schrieb Gilles Gouaillardet:

Siegmar,


the "big ORTE update" is a bunch of backports from master to v3.x

btw, does the same error occurs with master ?


Yes, it does, but the error occurs only if I use a real machine with
my virtual machine "exin". I get the expected result if I use two
real machines and I also get the expected output if I login on exin
and start the command on exin.

exin fd1026 108 mpiexec -np 3 --host loki:2,exin hostname
exin
loki
loki
exin fd1026 108



loki hello_1 111 mpiexec -np 1 --host loki which orted
/usr/local/openmpi-master_64_cc/bin/orted
loki hello_1 111 mpiexec -np 3 --host loki:2,exin hostname
-- 


There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  hostname

Either request fewer slots for your application, or make more slots 
available

for use.
-- 


loki hello_1 112 mpiexec -np 3 --host loki:6,exin:6 hostname
loki
loki
loki
loki hello_1 113 mpiexec -np 3 --host loki:2,exin:6 hostname
-- 


There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  hostname

Either request fewer slots for your application, or make more slots 
available

for use.
-- 


loki hello_1 114 mpiexec -np 3 --host loki:2,nfs1 hostname
loki
loki
nfs1
loki hello_1 115



i noted mpirun simply does

ssh exin orted ...

can you double check the right orted (e.g. 
/usr/local/openmpi-3.0.0_64_cc/bin/orted)


loki hello_1 110 mpiexec -np 1 --host loki which orted
/usr/local/openmpi-3.0.0_64_cc/bin/orted
loki hello_1 111 mpiexec -np 1 --host exin which orted
/usr/local/openmpi-3.0.0_64_cc/bin/orted
loki hello_1 112



or you can try to

mpirun --mca orte_launch_agent 
/usr/local/openmpi-3.0.0_64_cc/bin/orted ...


loki hello_1 112 mpirun --mca orte_launch_agent 
/usr/local/openmpi-3.0.0_64_cc/bin/orted -np 3 --host loki:2,exin 
hostname
-- 


There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  hostname

Either request fewer slots for your application, or make more slots 
available

for use.
-- 


loki hello_1 113



Kind regards

Siegmar



Cheers,


Gilles



On 5/31/2017 3:24 PM, Siegmar Gross wrote:

Hi Gilles,

I configured Open MPI with the following command.

../openmpi-v3.x-201705250239-d5200ea/configure \
  --prefix=/usr/local/openmpi-3.0.0_64_cc \
  --libdir=/usr/local/openmpi-3.0.0_64_cc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \
  --with-jdk-headers=/usr/local/jdk1.8.0_66/include \
  JAVA_HOME=/usr/local/jdk1.8.0_66 \
  LDFLAGS="-m64 -mt -Wl,-z -Wl,noexecstack -L/usr/local/lib64 
-L/usr/local/cuda/lib64" \

  CC="cc" CXX="CC" FC="f95" \
  CFLAGS="-m64 -mt -I/usr/local/include -I/usr/local/cuda/include" \
  CXXFLAGS="-m64 -I/usr/local/include -I/usr/local/cuda/include" \
  FCFLAGS="-m64" \
  CPP="cpp -I/usr/local/include -I/usr/local/cuda/include" \
  CXXCPP="cpp -I/usr/local/include -I/usr/local/cuda/include" \
  --enable-mpi-cxx \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --with-cuda=/usr/local/cuda \
  --with-valgrind=/usr/local/valgrind \
  --enable-mpi-thread-multiple \
  --with-hwloc=internal \
  --without-verbs \
  --with-wrapper-cflags="-m64 -mt" \
  --with-wrapper-cxxflags="-m64" \
  --with-wrapper-fcflags="-m64" \
  --with-wrapper-ldflags="-mt" \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc

Do you know when the fixes pending in the big ORTE update PR
are committed? Perhaps Ralph has a point suggesting not to spend
time with the problem if it may already be resolved. Nevertheless,
I added the requested information after the commands below.


Am 31.05.2017 um 04:43 schrieb Gilles Gouaillardet:

Ralph,


the issue Siegmar initially reported was

loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi


per what you wrote, this should be equivalent to

loki hello_1 111 mpiexec -np 3 --host loki:2,exin:1 hello_1_mpi

and th

Re: [OMPI users] "undefined reference to `MPI_Comm_create_group'" error message when using Open MPI 1.6.2

2017-06-08 Thread Gilles Gouaillardet
MPI_Comm_create_group was not available in Open MPI v1.6.
so unless you are willing to create your own subroutine in your
application, you'd rather upgrade to Open MPI v2

i recomment you configure Open MPI with
--disable-dlopen --prefix=

unless you plan to scale on thousands of nodes, you should be just
fine with that.

Cheers,

Gilles


On Thu, Jun 8, 2017 at 6:58 PM, Arham Amouie via users
 wrote:
> Hello. Open MPI 1.6.2 is installed on the cluster I'm using. At the moment I
> can't upgrade Open MPI on the computing nodes of this system. My C code
> contains many calls to MPI functions. When I try to 'make' this code on the
> cluster, the only error that I get is "undefined reference to
> `MPI_Comm_create_group'".
>
> I'm able to install a newer version (like 2.1.1) of Open MPI only on the
> frontend of this cluster. Using newer version, the code is compiled and
> linked successfully. But in this case I face problem in running the program,
> since the newer version of Open MPI is not installed on the computing nodes.
>
> Is there any way that I can compile and link the code using Open MPI 1.6.2?
>
> Thanks,
> Arham Amouei
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-14 Thread Gilles Gouaillardet

Ted,


fwiw, the 'master' branch has the behavior you expect.


meanwhile, you can simple edit your 'dum.sh' script and replace

/home/buildadina/src/aborttest02/aborttest02.exe

with

exec /home/buildadina/src/aborttest02/aborttest02.exe


Cheers,


Gilles


On 6/15/2017 3:01 AM, Ted Sussman wrote:

Hello,

My question concerns MPI_ABORT, indirect execution of executables by mpirun and 
Open
MPI 2.1.1.  When mpirun runs executables directly, MPI_ABORT works as expected, 
but
when mpirun runs executables indirectly, MPI_ABORT does not work as expected.

If Open MPI 1.4.3 is used instead of Open MPI 2.1.1, MPI_ABORT works as 
expected in all
cases.

The examples given below have been simplified as far as possible to show the 
issues.

---

Example 1

Consider an MPI job run in the following way:

mpirun ... -app addmpw1

where the appfile addmpw1 lists two executables:

-n 1 -host gulftown ... aborttest02.exe
-n 1 -host gulftown ... aborttest02.exe

The two executables are executed on the local node gulftown.  aborttest02 calls 
MPI_ABORT
for rank 0, then sleeps.

The above MPI job runs as expected.  Both processes immediately abort when rank 
0 calls
MPI_ABORT.

---

Example 2

Now change the above example as follows:

mpirun ... -app addmpw2

where the appfile addmpw2 lists shell scripts:

-n 1 -host gulftown ... dum.sh
-n 1 -host gulftown ... dum.sh

dum.sh invokes aborttest02.exe.  So aborttest02.exe is executed indirectly by 
mpirun.

In this case, the MPI job only aborts process 0 when rank 0 calls MPI_ABORT.  
Process 1
continues to run.  This behavior is unexpected.



I have attached all files to this E-mail.  Since there are absolute pathnames 
in the files, to
reproduce my findings, you will need to update the pathnames in the appfiles 
and shell
scripts.  To run example 1,

sh run1.sh

and to run example 2,

sh run2.sh

---

I have tested these examples with Open MPI 1.4.3 and 2.0.3.  In Open MPI 1.4.3, 
both
examples work as expected.  Open MPI 2.0.3 has the same behavior as Open MPI 
2.1.1.

---

I would prefer that Open MPI 2.1.1 aborts both processes, even when the 
executables are
invoked indirectly by mpirun.  If there is an MCA setting that is needed to 
make Open MPI
2.1.1 abort both processes, please let me know.


Sincerely,

Theodore Sussman


The following section of this message contains a file attachment
prepared for transmission using the Internet MIME message format.
If you are using Pegasus Mail, or any other MIME-compliant system,
you should be able to save it or view it from within your mailer.
If you cannot, please ask your system administrator for assistance.

 File information ---
  File:  config.log.bz2
  Date:  14 Jun 2017, 13:35
  Size:  146548 bytes.
  Type:  Binary


The following section of this message contains a file attachment
prepared for transmission using the Internet MIME message format.
If you are using Pegasus Mail, or any other MIME-compliant system,
you should be able to save it or view it from within your mailer.
If you cannot, please ask your system administrator for assistance.

 File information ---
  File:  ompi_info.bz2
  Date:  14 Jun 2017, 13:35
  Size:  24088 bytes.
  Type:  Binary


The following section of this message contains a file attachment
prepared for transmission using the Internet MIME message format.
If you are using Pegasus Mail, or any other MIME-compliant system,
you should be able to save it or view it from within your mailer.
If you cannot, please ask your system administrator for assistance.

 File information ---
  File:  aborttest02.tgz
  Date:  14 Jun 2017, 13:52
  Size:  4285 bytes.
  Type:  Binary


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] Double free or corruption problem updated result

2017-06-17 Thread Gilles Gouaillardet
Ashwin,

did you try to run your app with a MPICH-based library (mvapich,
IntelMPI or even stock mpich) ?
or did you try with Open MPI v1.10 ?
the stacktrace does not indicate the double free occurs in MPI...

it seems you ran valgrind vs a shell and not your binary.
assuming your mpirun command is
mpirun lmparbin_all
i suggest you try again with
mpirun --tag-output valgrind lmparbin_all
that will generate one valgrind log per task, but these are prefixed
so it should be easier to figure out what is going wrong

Cheers,

Gilles


On Sun, Jun 18, 2017 at 11:41 AM, ashwin .D  wrote:
> There is a sequential version of the same program COSMO (no reference to
> MPI) that I can run without any problems. Of course it takes a lot longer to
> complete. Now I also ran valgrind (not sure whether that is useful or not)
> and I have enclosed the logs.
>
> On Sat, Jun 17, 2017 at 7:20 PM, ashwin .D  wrote:
>>
>> Hello Gilles,
>>I am enclosing all the information you requested.
>>
>> 1)  as an attachment I enclose the log file
>> 2) I did rebuild OpenMPI 2.1.1 with the --enable-debug feature and I
>> reinstalled it /usr/lib/local.
>> I ran all the examples in the examples directory. All passed except
>> oshmem_strided_puts where I got this message
>>
>> [[48654,1],0][pshmem_iput.c:70:pshmem_short_iput] Target PE #1 is not in
>> valid range
>> --
>> SHMEM_ABORT was invoked on rank 0 (pid 13409, host=a-Vostro-3800) with
>> errorcode -1.
>> --
>>
>>
>> 3) I deleted all old OpenMPI versions under /usr/local/lib.
>> 4) I am using the COSMO weather model - http://www.cosmo-model.org/ to run
>> simulations
>> The support staff claim they have seen no errors with a similar setup.
>> They use
>>
>> 1) gfortran 4.8.5
>> 2) OpenMPI 1.10.1
>>
>> The only difference is I use OpenMPI 2.1.1.
>>
>> 5) I did try this option as well mpirun --mca btl tcp,self -np 4 cosmo.
>> and I got the same error as in the mpi_logs file
>>
>> 6) Regarding compiler and linking options on Ubuntu 16.04
>>
>> mpif90 --showme:compile and --showme:link give me the options for
>> compiling and linking.
>>
>> Here are the options from my makefile
>>
>> -pthread -lmpi_usempi -lmpi_mpifh -lmpi for linking
>>
>> 7) I have a 64 bit OS.
>>
>> Well I think I have responded all of your questions. In any case I have
>> not please let me know and I will respond ASAP. The only thing I have not
>> done is look at /usr/local/include. I saw some old OpenMPI files there. If
>> those need to be deleted I will do after I hear from you.
>>
>> Best regards,
>> Ashwin.
>>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-15 Thread Gilles Gouaillardet
o its own process group upon launch.
>> When we issue a
>> "kill", however, we only issue it to the individual process (instead
>> of the process group
>> that is headed by that child process). This is probably a bug as I
>> don´t believe that is
>> what we intended, but set that aside for now.
>>
>> 2.x: each process is put into its own process group upon launch. When
>> we issue a
>> "kill", we issue it to the process group. Thus, every child proc of
>> that child proc will
>> receive it. IIRC, this was the intended behavior.
>>
>> It is rather trivial to make the change (it only involves 3 lines of
>> code), but I´m not sure
>> of what our intended behavior is supposed to be. Once we clarify that,
>> it is also trivial
>> to add another MCA param (you can never have too many!) to allow you
>> to select the
>> other behavior.
>>
>>
>> On Jun 15, 2017, at 5:23 AM, Ted Sussman <ted.suss...@adina.com>
>> wrote:
>>
>> Hello Gilles,
>>
>> Thank you for your quick answer.  I confirm that if exec is used, both
>> processes
>> immediately
>> abort.
>>
>> Now suppose that the line
>>
>> echo "After aborttest:
>> OMPI_COMM_WORLD_RANK="$OMPI_COMM_WORLD_RANK
>>
>> is added to the end of dum.sh.
>>
>> If Example 2 is run with Open MPI 1.4.3, the output is
>>
>> After aborttest: OMPI_COMM_WORLD_RANK=0
>>
>> which shows that the shell script for the process with rank 0
>> continues after the
>> abort,
>> but that the shell script for the process with rank 1 does not
>> continue after the
>> abort.
>>
>> If Example 2 is run with Open MPI 2.1.1, with exec used to invoke
>> aborttest02.exe, then
>> there is no such output, which shows that both shell scripts do not
>> continue after
>> the abort.
>>
>> I prefer the Open MPI 1.4.3 behavior because our original application
>> depends
>> upon the
>> Open MPI 1.4.3 behavior.  (Our original application will also work if
>> both
>> executables are
>> aborted, and if both shell scripts continue after the abort.)
>>
>> It might be too much to expect, but is there a way to recover the Open
>> MPI 1.4.3
>> behavior
>> using Open MPI 2.1.1?
>>
>> Sincerely,
>>
>> Ted Sussman
>>
>>
>> On 15 Jun 2017 at 9:50, Gilles Gouaillardet wrote:
>>
>> Ted,
>>
>>
>> fwiw, the 'master' branch has the behavior you expect.
>>
>>
>> meanwhile, you can simple edit your 'dum.sh' script and replace
>>
>> /home/buildadina/src/aborttest02/aborttest02.exe
>>
>> with
>>
>> exec /home/buildadina/src/aborttest02/aborttest02.exe
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>>
>> On 6/15/2017 3:01 AM, Ted Sussman wrote:
>> Hello,
>>
>> My question concerns MPI_ABORT, indirect execution of
>> executables by mpirun and Open
>> MPI 2.1.1.  When mpirun runs executables directly, MPI_ABORT
>> works as expected, but
>> when mpirun runs executables indirectly, MPI_ABORT does not
>> work as expected.
>>
>> If Open MPI 1.4.3 is used instead of Open MPI 2.1.1, MPI_ABORT
>> works as expected in all
>> cases.
>>
>> The examples given below have been simplified as far as possible
>> to show the issues.
>>
>> ---
>>
>> Example 1
>>
>> Consider an MPI job run in the following way:
>>
>> mpirun ... -app addmpw1
>>
>> where the appfile addmpw1 lists two executables:
>>
>> -n 1 -host gulftown ... aborttest02.exe
>> -n 1 -host gulftown ... aborttest02.exe
>>
>> The two executables are executed on the local node gulftown.
>>  aborttest02 calls MPI_ABORT
>> for rank 0, then sleeps.
>>
>> The above MPI job runs as expected.  Both processes immediately
>> abort when rank 0 calls
>> MPI_ABORT.
>>
>> ---
>>
>> Example 2
>>
>> Now change the above example as follows:
>>
>> mpirun ... -app addmpw2
>>
>> where the appfile addmpw2 lists shell scripts:
>>
>> -n 1 -host gulftown ... dum.sh
>> -n 1 -host gulftown ... dum.sh
>>
&g

Re: [OMPI users] Double free or corruption problem updated result

2017-06-19 Thread Gilles Gouaillardet

Ashwin,


the valgrind logs clearly indicate you are trying to access some memory 
that was already free'd



for example

[1,0]:==4683== Invalid read of size 4
[1,0]:==4683==at 0x795DC2: __src_input_MOD_organize_input 
(src_input.f90:2318)
[1,0]:==4683==  Address 0xb4001d0 is 0 bytes inside a block of 
size 24 free'd
[1,0]:==4683==by 0x63F3690: free_NC_var (in 
/usr/local/lib/libnetcdf.so.11.0.3)


[1,0]:==4683==by 0x63BB431: nc_close (in 
/usr/local/lib/libnetcdf.so.11.0.3)
[1,0]:==4683==by 0x435A9F: __io_utilities_MOD_close_file 
(io_utilities.f90:995)

[1,0]:==4683==  Block was alloc'd at
[1,0]:==4683==by 0x63F378C: new_x_NC_var (in 
/usr/local/lib/libnetcdf.so.11.0.3)
[1,0]:==4683==by 0x63BAF85: nc_open (in 
/usr/local/lib/libnetcdf.so.11.0.3)

[1,0]:==4683==by 0x547E6F6: nf_open_ (nf_control.F90:189)

so the double-free error could be a side effect of this.

at this stage, i suggest you fix your application, and see if it 
resolves your issue.

(e.g. there is no need to try an other MPI library and/or version for now)

Cheers,

Gilles

On 6/18/2017 2:41 PM, ashwin .D wrote:

Hello Gilles,
   First of all I am extremely grateful for this communication from 
you on a weekend and that too few hours after I
posted my email. Well I am not sure I can go on posting log files as you 
rightly point out that MPI is not the source of the
problem. Still I have enclosed the valgrind log files as you requested. I have 
downloaded the MPICH packages as you suggested
and I am going to install them shortly. But before I do that I think I have a 
clue on the source of my problem(double free or corruption) and I would really 
appreciate
your advice.
As I mentioned before COSMO has been compiled with mpif90 for shared memory 
usage and with gfortran for sequential access.
But it is dependent on a lot of external third party software such as zlib, 
libcurl, hdf5, netcdf and netcdf-fortran. When I
looked at the config.log of those packages all of them had  been compiled with 
gfortran and gcc and some cases g++ with
enable-shared option. So my question then is could that be a source of the 
"mismatch" ?

In other words I would have to recompile all those packages with mpif90 and 
mpicc and then try another test. At the very
least there should be no mixing of gcc/gfortran compiled code with mpif90 
compiled code. Comments ?
Best regards,
Ashwin.

>Ashwin,

>did you try to run your app with a MPICH-based library (mvapich,
>IntelMPI or even stock mpich) ?
>or did you try with Open MPI v1.10 ?
>the stacktrace does not indicate the double free occurs in MPI...
>it seems you ran valgrind vs a shell and not your binary.
>assuming your mpirun command is
>mpirun lmparbin_all
>i suggest you try again with
>mpirun --tag-output valgrind lmparbin_all
>that will generate one valgrind log per task, but these are prefixed
>so it should be easier to figure out what is going wrong

>Cheers,

>Gilles


On Sun, Jun 18, 2017 at 11:41 AM, ashwin .D > wrote:
> There is a sequential version of the same program COSMO (no reference to
> MPI) that I can run without any problems. Of course it takes a lot longer to
> complete. Now I also ran valgrind (not sure whether that is useful or not)
> and I have enclosed the logs.

On Sun, Jun 18, 2017 at 8:11 AM, ashwin .D > wrote:


There is a sequential version of the same program COSMO (no
reference to MPI) that I can run without any problems. Of course
it takes a lot longer to complete. Now I also ran valgrind (not
sure whether that is useful or not) and I have enclosed the logs.

On Sat, Jun 17, 2017 at 7:20 PM, ashwin .D > wrote:

Hello Gilles,
   I am enclosing all the information you
requested.

1)  as an attachment I enclose the log file
2) I did rebuild OpenMPI 2.1.1 with the --enable-debug feature
and I reinstalled it /usr/lib/local.
I ran all the examples in the examples directory. All passed
except oshmem_strided_puts where I got this message

[[48654,1],0][pshmem_iput.c:70:pshmem_short_iput] Target PE #1
is not in valid range

--
SHMEM_ABORT was invoked on rank 0 (pid 13409,
host=a-Vostro-3800) with errorcode -1.

--


3) I deleted all old OpenMPI versions under /usr/local/lib.
4) I am using the COSMO weather model -
http://www.cosmo-model.org/ to run simulations
The support staff claim they have seen no errors with a
similar setup. They use

1) gfortran 4.8.5
2) OpenMPI 1.10.1

The only difference is I use OpenMPI 2.1.1.

5) I did try this option as 

Re: [OMPI users] OMPI users] Double free or corruption with OpenMPI 2.0

2017-06-13 Thread Gilles Gouaillardet
Hi

Can you please post your configure command line for 2.1.1 ?
On which architecture are you running? x86_64 ?

Cheers,

Gilles

"ashwin .D"  wrote:
>Also when I try to build and run a make check I get these errors - Am I clear 
>to proceed or is my installation broken ? This is on Ubuntu 16.04 LTS. 
>
>==
>   Open MPI 2.1.1: test/datatype/test-suite.log
>==
>
># TOTAL: 9
># PASS:  8
># SKIP:  0
># XFAIL: 0
># FAIL:  1
># XPASS: 0
># ERROR: 0
>
>.. contents:: :depth: 2
>
>FAIL: external32
>
>
>/home/t/openmpi-2.1.1/test/datatype/.libs/lt-external32: symbol lookup error: 
>/home/openmpi-2.1.1/test/datatype/.libs/lt-external32: undefined symbol: 
>ompi_datatype_pack_external_size
>FAIL external32 (exit status: 
>
>
>On Tue, Jun 13, 2017 at 5:24 PM, ashwin .D  wrote:
>
>Hello,
>
>  I am using OpenMPI 2.0.0 with a computational fluid dynamics 
>software and I am encountering a series of errors when running this with 
>mpirun. This is my lscpu output
>
>CPU(s): 4 On-line CPU(s) list: 0-3 Thread(s) per core: 2 Core(s) per socket: 2 
>Socket(s): 1 and I am running OpenMPI's mpirun in the following
>way
>mpirun -np 4 cfd_software
>
>and I get double free or corruption every single time.
>
>I have two questions -
>
>1) I am unable to capture the standard error that mpirun throws in a file
>How can I go about capturing the standard error of mpirun ? 
>2) Has this error i.e. double free or corruption been reported by others ? Is 
>there a Is a 
>bug fix available ?
>
>Regards,
>Ashwin.
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OMPI users] [OMPI USERS] Jumbo frames

2017-05-05 Thread Gilles Gouaillardet
Alberto,

Are you saying the program hang even without jumbo frame (aka 1500 MTU) ?
At first, make sure there is no firewall running, and then you can try
mpirun --mca btl tcp,vader,self --mca oob_tcp_if_include eth0 --mca 
btl_tcp_if_include eth0 ...
(Replace eth0 with the interface name you want to use)

Cheers,

Gilles

Alberto Ortiz  wrote:
>I am using version 1.10.6 on archlinux.
>
>The option I should pass to mpirun should then be "-mca btl_tcp_mtu 13000"? 
>Just to be sure.
>
>Thank you,
>
>Alberto
>
>
>El 5 may. 2017 16:26, "r...@open-mpi.org"  escribió:
>
>If you are looking to use TCP packets, then you want to set the send/recv 
>buffer size in the TCP btl, not the openib one, yes?
>
>Also, what version of OMPI are you using?
>
>> On May 5, 2017, at 7:16 AM, Alberto Ortiz  wrote:
>>
>> Hi,
>> I have a program running with openMPI over a network using a gigabit switch. 
>> This switch supports jumbo frames up to 13.000 bytes, so, in order to test 
>> and see if it would be faster communicating with this frame lengths, I am 
>> trying to use them with my program. I have set the MTU in each node to be 
>> 13.000 but when running the program it doesn't even initiate, it gets 
>> blocked. I have tried different lengths from 1.500 up to 13.000 but it 
>> doesn't work with any length.
>>
>> I have searched and only found that I have to set OMPI with "-mca 
>> btl_openib_ib_mtu 13000" or the length to be used, but I don't seem to get 
>> it working.
>>
>> Which are the steps to get OMPI to use larger TCP packets length? Is it 
>> possible to reach 13000 bytes instead of the standard 1500?
>>
>> Thank yo in advance,
>> Alberto
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>___
>users mailing list
>users@lists.open-mpi.org
>https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet

Gabriele,


so it seems pml/pami assumes there is an infiniband card available (!)

i guess IBM folks will comment on that shortly.


meanwhile, you do not need pami since you are running on a single node

mpirun --mca pml ^pami ...

should do the trick

(if it does not work, can run and post the logs)

mpirun --mca pml ^pami --mca pml_base_verbose 100 ...


Cheers,


Gilles


On 5/19/2017 4:01 PM, Gabriele Fatigati wrote:

Hi John,
Infiniband is not used, there is a single node on this machine.

2017-05-19 8:50 GMT+02:00 John Hearns via users 
<users@lists.open-mpi.org <mailto:users@lists.open-mpi.org>>:


Gabriele,   pleae run  'ibv_devinfo'
It looks to me like you may have the physical interface cards in
these systems, but you do not have the correct drivers or
libraries loaded.

I have had similar messages when using Infiniband on x86 systems -
which did not have libibverbs installed.


On 19 May 2017 at 08:41, Gabriele Fatigati <g.fatig...@cineca.it
<mailto:g.fatig...@cineca.it>> wrote:

Hi Gilles, using your command:

[openpower:88536] mca: base: components_register: registering
framework pml components
[openpower:88536] mca: base: components_register: found loaded
component pami
[openpower:88536] mca: base: components_register: component
pami register function successful
[openpower:88536] mca: base: components_open: opening pml
components
[openpower:88536] mca: base: components_open: found loaded
component pami
[openpower:88536] mca: base: components_open: component pami
open function successful
[openpower:88536] select: initializing pml component pami
findActiveDevices Error
We found no active IB device ports
[openpower:88536] select: init returned failure for component pami
[openpower:88536] PML pami cannot be selected

--
No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed componnets can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

  Host:  openpower
  Framework: pml

--


2017-05-19 7:03 GMT+02:00 Gilles Gouaillardet
<gil...@rist.or.jp <mailto:gil...@rist.or.jp>>:

Gabriele,


pml/pami is here, at least according to ompi_info


can you update your mpirun command like this

mpirun --mca pml_base_verbose 100 ..


and post the output ?


Cheers,

Gilles

On 5/18/2017 10:41 PM, Gabriele Fatigati wrote:

Hi Gilles, attached the requested info

        2017-05-18 15:04 GMT+02:00 Gilles Gouaillardet
<gilles.gouaillar...@gmail.com
<mailto:gilles.gouaillar...@gmail.com>
<mailto:gilles.gouaillar...@gmail.com
<mailto:gilles.gouaillar...@gmail.com>>>:

Gabriele,

can you
ompi_info --all | grep pml

also, make sure there is nothing in your
environment pointing to
an other Open MPI install
for example
ldd a.out
should only point to IBM libraries

Cheers,

Gilles


On Thursday, May 18, 2017, Gabriele Fatigati
<g.fatig...@cineca.it <mailto:g.fatig...@cineca.it>
<mailto:g.fatig...@cineca.it
<mailto:g.fatig...@cineca.it>>> wrote:

Dear OpenMPI users and developers, I'm using
IBM Spectrum MPI
10.1.0 based on OpenMPI, so I hope there are
some MPI expert
can help me to solve the problem.

When I run a simple Hello World MPI program, I
get the follow
error message:


A requested component was not found, or was
unable to be
opened.  This
means that this component is either not
installed or is unable
to be
used on your system (e.g., sometimes this
means that shared
libraries
that the component requires are unable to be
found/loaded). Note that
Open MPI stopped checking

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Gilles Gouaillardet

Allan,


i just noted smd has a Mellanox card, while other nodes have QLogic cards.

mtl/psm works best for QLogic while btl/openib (or mtl/mxm) work best 
for Mellanox,


but these are not interoperable. also, i do not think btl/openib can be 
used with QLogic cards


(please someone correct me if i am wrong)


from the logs, i can see that smd (Mellanox) is not even able to use the 
infiniband port.


if you run with 2 MPI tasks, both run on smd and hence btl/vader is 
used, that is why it works


if you run with more than 2 MPI tasks, then smd and other nodes are 
used, and every MPI task fall back to btl/tcp


for inter node communication.

[smd][[41971,1],1][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] 
connect() to 192.168.1.196 failed: No route to host (113)


this usually indicates a firewall, but since both ssh and oob/tcp are 
fine, this puzzles me.



what if you

mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca btl_tcp_if_include 192.168.1.0/24 --mca pml ob1 --mca btl 
tcp,sm,vader,self  ring


that should work with no error messages, and then you can try with 12 
MPI tasks


(note internode MPI communications will use tcp only)


if you want optimal performance, i am afraid you cannot run any MPI task 
on smd (so mtl/psm can be used )


(btw, make sure PSM support was built in Open MPI)

a suboptimal option is to force MPI communications on IPoIB with

/* make sure all nodes can ping each other via IPoIB first */

mpirun --mca oob_tcp_if_include 192.168.1.0/24 --mca btl_tcp_if_include 
10.1.0.0/24 --mca pml ob1 --mca btl tcp,sm,vader,self




Cheers,


Gilles


On 5/19/2017 3:50 PM, Allan Overstreet wrote:

Gilles,

On which node is mpirun invoked ?

The mpirun command was involed on node smd.

Are you running from a batch manager?

No.

Is there any firewall running on your nodes ?

No CentOS minimal does not have a firewall installed and Ubuntu 
Mate's firewall is disabled.


All three of your commands have appeared to run successfully. The 
outputs of the three commands are attached.


mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 true &> cmd1


mpirun -np 12 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 true &> cmd2


mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 ring &> cmd3


If I increase the number of processors in the ring program, mpirun 
will not succeed.


mpirun -np 12 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 ring &> cmd4



On 05/19/2017 02:18 AM, Gilles Gouaillardet wrote:

Allan,


- on which node is mpirun invoked ?

- are you running from a batch manager ?

- is there any firewall running on your nodes ?


the error is likely occuring when wiring-up mpirun/orted

what if you

mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 true


then (if the previous command worked)

mpirun -np 12 --hostfile nodes --mca oob_tcp_if_include 
192.168.1.0/24 --mca oob_base_verbose 100 true


and finally (if both previous commands worked)

mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 
--mca oob_base_verbose 100 ring



Cheers,

Gilles

On 5/19/2017 3:07 PM, Allan Overstreet wrote:
I experiencing many different errors with openmpi version 2.1.1. I 
have had a suspicion that this might be related to the way the 
servers were connected and configured. Regardless below is a diagram 
of how the server are configured.


__  _
   [__]|=|
   /::/|_|
   HOST: smd
   Dual 1Gb Ethernet Bonded
   .-> Bond0 IP: 192.168.1.200
   |   Infiniband Card: MHQH29B-XTR <.
   |   Ib0 IP: 10.1.0.1  |
   |   OS: Ubuntu Mate   |
   |   __ _ |
   | [__]|=||
   | /::/|_||
   |   HOST: sm1 |
   |   Dual 1Gb Ethernet Bonded  |
   |-> Bond0 IP: 192.168.1.196   |
   |   Infiniband Card: QLOGIC QLE7340 <-|
   |   Ib0 IP: 10.1.0.2  |
   |   OS: Centos 7 Minimal  |
   |   __ _ |
   | [__]|=||
   |-. /::/|_||
   | | HOST: sm2 |
   

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-14 Thread Gilles Gouaillardet
Peter and all,

an easier option is to configure Open MPI with --mpirun-prefix-by-default
this will automagically add rpath to the libs.

Cheers,

Gilles

On Thu, Sep 14, 2017 at 6:43 PM, Peter Kjellström  wrote:
> On Wed, 13 Sep 2017 20:13:54 +0430
> Mahmood Naderan  wrote:
> ...
>> `/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/libc.a(strcmp.o)'
>> can not be used when making an executable; recompile with -fPIE and
>> relink with -pie collect2: ld returned 1 exit status
>>
>>
>> With such an error, I thought it is better to forget static linking!
>> (as it is related to libc) and work with the shared libs and
>> LD_LIBRARY_PATH
>
> First, I think giving up on static linking is the right choice.
>
> If the main thing you were after was the convenience of a binary that
> will run without the need to setup LD_LIBRARY_PATH correctly you should
> have a look at passing -rpath to the linker.
>
> In short, "mpicc -Wl,-rpath=/my/lib/path helloworld.c -o hello", will
> compile a dynamic binary "hello" with built in search path
> to "/my/lib/path".
>
> With OpenMPI this will be added as a "runpath" due to how the wrappers
> are designed. Both rpath and runpath works for finding "/my/lib/path"
> wihtout LD_LIBRARY_PATH but the difference is in priority. rpath is
> higher priority than LD_LIBRARY_PATH etc. and runpath is lower.
>
> You can check your rpath or runpath in a binary using the command
> chrpath (package on rhel/centos/... is chrpath):
>
> $ chrpath hello
> hello: RUNPATH=/my/lib/path
>
> If what you really wanted is the rpath behavior (winning over any
> LD_LIBRARY_PATH in the environment etc.) then you need to modify the
> openmpi wrappers (rebuild openmpi) such that it does NOT pass
> "--enable-new-dtags" to the linker.
>
> /Peter
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-14 Thread Gilles Gouaillardet
Mahmood,

there is a typo, it should be
-Wl,-rpath,/.../

(note the minus before rpath)


Cheers,

Gilles

On Thu, Sep 14, 2017 at 6:58 PM, Mahmood Naderan  wrote:
>>In short, "mpicc -Wl,-rpath=/my/lib/path helloworld.c -o hello", will
>>compile a dynamic binary "hello" with built in search path
>>to "/my/lib/path".
>
> Excuse me... Is that a path or file? I get this:
>
> mpif90 -g -pthread -Wl,rpath=/share/apps/computer/OpenBLAS-0.2.18 -o
> iotk_print_kinds.x iotk_print_kinds.o libiotk.a
> /usr/bin/ld: rpath=/share/apps/computer/OpenBLAS-0.2.18: No such file: No
> such file or directory
> collect2: ld returned 1 exit status
>
>
> However, the lib files are there.
>
> [root@cluster source]# ls -l
> /share/apps/computer/OpenBLAS-0.2.18/libopenblas*
> lrwxrwxrwx 1 nfsnobody nfsnobody   32 Sep  8 14:40
> /share/apps/computer/OpenBLAS-0.2.18/libopenblas.a ->
> libopenblas_bulldozerp-r0.2.18.a
> -rw-r--r-- 1 nfsnobody nfsnobody 28075178 Sep  8 14:41
> /share/apps/computer/OpenBLAS-0.2.18/libopenblas_bulldozerp-r0.2.18.a
> -rwxr-xr-x 1 nfsnobody nfsnobody 14906048 Sep  8 14:41
> /share/apps/computer/OpenBLAS-0.2.18/libopenblas_bulldozerp-r0.2.18.so
> lrwxrwxrwx 1 nfsnobody nfsnobody   33 Sep  8 14:41
> /share/apps/computer/OpenBLAS-0.2.18/libopenblas.so ->
> libopenblas_bulldozerp-r0.2.18.so
> lrwxrwxrwx 1 nfsnobody nfsnobody   33 Sep  8 14:41
> /share/apps/computer/OpenBLAS-0.2.18/libopenblas.so.0 ->
> libopenblas_bulldozerp-r0.2.18.so
>
>
>
> Please note that, I added that option to the linker section of make.inc from
> ESPRESSO
>
> # compiler flags: C, F90, F77
> # C flags must include DFLAGS and IFLAGS
> # F90 flags must include MODFLAGS, IFLAGS, and FDFLAGS with appropriate
> syntax
> CFLAGS = -O3 $(DFLAGS) $(IFLAGS)
> F90FLAGS   = $(FFLAGS) -x f95-cpp-input $(FDFLAGS) $(IFLAGS) $(MODFLAGS)
> FFLAGS = -O3 -g
> # Linker, linker-specific flags (if any)
> # Typically LD coincides with F90 or MPIF90, LD_LIBS is empty
> LD = mpif90
> LDFLAGS= -g -pthread -Wl,rpath=/share/apps/computer/OpenBLAS-0.2.18
> LD_LIBS=
>
>
> Regards,
> Mahmood
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-22 Thread Gilles Gouaillardet
Was there an error in the copy/paste ?

The mpicc command should be
mpicc  /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c

Cheers,

Gilles

On Fri, Sep 22, 2017 at 3:33 PM, Tim Jim <timothy.m@gmail.com> wrote:

> Thanks for the thoughts and comments. Here is the setup information:
> OpenMPI Ver. 3.0.0. Please see attached for the compressed config.log and
> ompi_info --all call.
> In this compile, my install steps were:
> 1. declared  "export nvml_enable=no" and "export enable_opencl=no" in the
> terminal
> and the rest as seen in the logs:
> 2. ./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0
> 3. make all install
>
> I ultimately would like CUDA to be utilised if it can speed up my
> computation time - should I still attempt to get openMPI working without
> CUDA first?
>
> Thanks for the heads up about compiling the executables first - I tried
> mpicc again with the compiled version but got the following output:
>
> tjim@DESKTOP-TA3P0PS:~/Documents$ mpicc /opt/openmpi/openmpi-3.0.0_
> src/examples/hello_c
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_start':
> (.text+0x0): multiple definition of `_start'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.text+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_fini':
> (.fini+0x0): multiple definition of `_fini'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.fini+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.rodata+0x0): multiple
> definition of `_IO_stdin_used'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.rodata.cst4+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `data_start':
> (.data+0x0): multiple definition of `__data_start'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.data+0x0):
> first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `data_start':
> (.data+0x8): multiple definition of `__dso_handle'
> /usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o:(.data+0x0): first defined here
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function `_init':
> (.init+0x0): multiple definition of `_init'
> /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.init+0x0):
> first defined here
> /usr/lib/gcc/x86_64-linux-gnu/5/crtend.o:(.tm_clone_table+0x0): multiple
> definition of `__TMC_END__'
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.data+0x10): first
> defined here
> /usr/bin/ld: error in 
> /opt/openmpi/openmpi-3.0.0_src/examples/hello_c(.eh_frame);
> no .eh_frame_hdr table will be created.
> collect2: error: ld returned 1 exit status
>
> Is this due to a failed install?
> Regards,
> Tim
>
>
> On 22 September 2017 at 01:10, Sylvain Jeaugey <sjeau...@nvidia.com>
> wrote:
>
>> The issue is related to openCL, not NVML.
>>
>> So the correct export would be "export enable_opencl=no" (you may want to
>> "export enable_nvml=no" as well).
>>
>>
>> On 09/21/2017 12:32 AM, Tim Jim wrote:
>>
>> Hi,
>>
>> I tried as you suggested: export nvml_enable=no, then reconfigured and
>> ran make all install again, but mpicc is still producing the same error.
>> What should I try next?
>>
>> Many thanks,
>> Tim
>>
>> On 21 September 2017 at 16:12, Gilles Gouaillardet <gil...@rist.or.jp>
>> wrote:
>>
>>> Tim,
>>>
>>>
>>> do that in your shell, right before invoking configure.
>>>
>>> export nvml_enable=no
>>>
>>> ./configure ...
>>>
>>> make && make install
>>>
>>>
>>> you can keep the --without-cuda flag (i think this is unrelated though)
>>>
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On 9/21/2017 3:54 PM, Tim Jim wrote:
>>>
>>>> Dear Gilles,
>>>>
>>>> Thanks for the mail - where should I set export nvml_enable=no? Should
>>>> I reconfigure with default cuda support or keep the --without-cuda flag?
>>>>
>>>> Kind regards,
>>>> Tim
>>>>
>>>> On 21 September 2017 at 15:22, Gilles Gouaillardet <gil...@rist.or.jp
>>>> <mailto:gil...@rist.or.jp>> wrote:
>>>>
>>>> Tim,
>>>>
>>>>
>>>> i am not familiar with CUDA, but that might help
>>>>
>>>> can you please
>>>>
>>>> export nvml_enable=no
>>

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-22 Thread Gilles Gouaillardet

Great it is finally working !


nvml and opencl are only used by hwloc, and i do not think Open MPI is 
using these features,


so i suggest you go ahead, reconfigure and rebuild Open MPI and see how 
things go



Cheers,


Gilles


On 9/22/2017 4:59 PM, Tim Jim wrote:

Hi Gilles,

Yes, you're right. I wanted to double check the compile but didn't 
notice I was pointing to the exec I compiled from a previous Make.


mpicc now seems to work, running mpirun hello_c gives:

Hello, world, I am 0 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)
Hello, world, I am 3 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)
Hello, world, I am 1 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)
Hello, world, I am 2 of 4, (Open MPI v3.0.0, package: Open MPI 
tjim@DESKTOP-TA3P0PS Distribution, ident: 3.0.0, repo rev: v3.0.0, Sep 
12, 2017, 115)


- which I assume that means it's working?

Should I try a recompile with CUDA while declaring "export 
nvml_enable=no" and "export enable_opencl=no"? What effects do these 
declarations have on the normal functioning of mpi?


Many thanks.


On 22 September 2017 at 15:55, Gilles Gouaillardet 
<gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> 
wrote:


Was there an error in the copy/paste ?

The mpicc command should be
mpicc  /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c

Cheers,

Gilles

On Fri, Sep 22, 2017 at 3:33 PM, Tim Jim <timothy.m@gmail.com
<mailto:timothy.m@gmail.com>> wrote:

Thanks for the thoughts and comments. Here is the setup
information:
OpenMPI Ver. 3.0.0. Please see attached for the compressed
config.log and ompi_info --all call.
In this compile, my install steps were:
1. declared  "export nvml_enable=no" and "export
enable_opencl=no" in the terminal
and the rest as seen in the logs:
2. ./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0
3. make all install

I ultimately would like CUDA to be utilised if it can speed up
my computation time - should I still attempt to get openMPI
working without CUDA first?

Thanks for the heads up about compiling the executables first
- I tried mpicc again with the compiled version but got the
following output:

tjim@DESKTOP-TA3P0PS:~/Documents$ mpicc
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`_start':
(.text+0x0): multiple definition of `_start'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.text+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`_fini':
(.fini+0x0): multiple definition of `_fini'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.fini+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.rodata+0x0):
multiple definition of `_IO_stdin_used'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.rodata.cst4+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`data_start':
(.data+0x0): multiple definition of `__data_start'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:(.data+0x0):
first defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`data_start':
(.data+0x8): multiple definition of `__dso_handle'
/usr/lib/gcc/x86_64-linux-gnu/5/crtbegin.o:(.data+0x0): first
defined here
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c: In function
`_init':
(.init+0x0): multiple definition of `_init'

/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o:(.init+0x0):
first defined here
/usr/lib/gcc/x86_64-linux-gnu/5/crtend.o:(.tm_clone_table+0x0):
multiple definition of `__TMC_END__'
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c:(.data+0x10):
first defined here
/usr/bin/ld: error in
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c(.eh_frame); no
.eh_frame_hdr table will be created.
collect2: error: ld returned 1 exit status

Is this due to a failed install?
Regards,
Tim


On 22 September 2017 at 01:10, Sylvain Jeaugey
<sjeau...@nvidia.com <mailto:sjeau...@nvidia.com>> wrote:

The issue is related to openCL, not NVML.

So the correct export would be "export enable_opencl=no"
   

Re: [OMPI users] Strange benchmarks at large message sizes

2017-09-21 Thread Gilles Gouaillardet
Unless you are using mxm, you can disable tcp with

mpirun --mca pml ob1 --mca btl ^tcp ...

coll/tuned select an algorithm based on communicator size and message size. The 
spike could occur because a suboptimal (on your cluster and with your job 
topology) algo is selected.

Note you can force an algo, or redefine the rules of algo selection.

Cheers,

Gilles

Cooper Burns  wrote:
>Ok I tried that ( sorry for delay... Network issues killed our cluster )
>
>Setting the env variable you suggested changed results, but all it did was to 
>move the run time spike from between 4mb and 8mb to between 32kb and 64kb
>
>The nodes I'm running on have infiniband but i think I am running on ethernet 
>for these tests.
>
>Any other ideas?
>
>Thanks!
>
>Cooper
>
>
>Cooper Burns
>
>Senior Research Engineer
>
>    
>
>
>
>(608) 230-1551
>
>convergecfd.com
>
>
>
>
>On Tue, Sep 19, 2017 at 3:44 PM, Howard Pritchard  wrote:
>
>Hello Cooper
>
>
>Could you rerun your test with the following env. variable set
>
>export OMPI_MCA_coll=self,basic,libnbc
>
>and see if that helps?
>
>Also, what type of interconnect are you using - ethernet, IB, ...?
>
>Howard
>
>
>
>2017-09-19 8:56 GMT-06:00 Cooper Burns :
>
>Hello,
>
>I have been running some simple benchmarks and saw some strange behaviour:
>
>All tests are done on 4 nodes with 24 cores each (total of 96 mpi processes)
>
>When I run MPI_Allreduce() I see the run time spike up (about 10x) when I go 
>from reducing a total of 4096KB to 8192KB for example, when count is 2^21 
>(8192 kb of 4 byte ints):
>
>MPI_Allreduce(send_buf, recv_buf, count, MPI_SUM, MPI_COMM_WORLD)
>
>is slower than:
>
>MPI_Allreduce(send_buf, recv_buf, count/2, MPI_INT, MPI_SUM, MPI_COMM_WORLD)
>
>MPI_Allreduce(send_buf + count/2, recv_buf + count/2, count/2,MPI_INT,  
>MPI_SUM, MPI_COMM_WORLD)
>
>Just wondering if anyone knows what the cause of this behaviour is.
>
>Thanks!
>
>Cooper
>
>
>
>Cooper Burns
>
>Senior Research Engineer
>
>    
>
>
>
>(608) 230-1551
>
>convergecfd.com
>
>
>
>
>___
>users mailing list
>users@lists.open-mpi.org
>https://lists.open-mpi.org/mailman/listinfo/users
>
>
>
>___
>users mailing list
>users@lists.open-mpi.org
>https://lists.open-mpi.org/mailman/listinfo/users
>
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Libnl bug in openmpi v3.0.0?

2017-09-20 Thread Gilles Gouaillardet

Thanks for the report,


is this related to https://github.com/open-mpi/ompi/issues/4211 ?

there is a known issue when libnl-3 is installed but libnl-route-3 is not


Cheers,


Gilles


On 9/21/2017 8:53 AM, Stephen Guzik wrote:

When compiling (on Debian stretch), I see:

In file included from libnl_utils.h:52:0,
  from reachable_netlink_utils_common.c:48:
libnl1_utils.h:54:26: error: too few arguments to function ‘nl_geterror’
  #define NL_GETERROR(err) nl_geterror()
   ^
libnl1_utils.h:80:5: note: in expansion of macro ‘NL_GETERROR’
  NL_GETERROR(err)); \
  ^~~
reachable_netlink_utils_common.c:310:5: note: in expansion of macro
‘NL_RECVMSGS’
  NL_RECVMSGS(unlsk->nlh, arg, EHOSTUNREACH, err, out);
  ^~~
In file included from /usr/include/libnl3/netlink/netlink.h:31:0,
  from libnl1_utils.h:47,
  from libnl_utils.h:52,
  from reachable_netlink_utils_common.c:48:
/usr/include/libnl3/netlink/errno.h:56:21: note: declared here
  extern const char * nl_geterror(int);

Modifying openmpi-3.0.0/opal/mca/reachable/netlink/libnl1_utils.h from

#define NL_GETERROR(err) nl_geterror()

to

#define NL_GETERROR(err) nl_geterror(err)

as in libnl3_utils.h allows for successful compilation.  But from
configure, I see

checking for libraries that use libnl v1... (none)
checking for libraries that use libnl v3... ibverbs nl-3

so I wonder if perhaps there is something more serious is going on.  Any
suggestions?

Thanks,
Stephen Guzik

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Libnl bug in openmpi v3.0.0?

2017-09-20 Thread Gilles Gouaillardet

Stephen,


this is very likely related to the issue already reported in github.

meanwhile, you can apply the attached patch

patch configure < configure.diff

and then re-configure and make.

note this is a temporary workaround, it simply prevent the build of the 
reachable/netlink component,

and the upcoming real fix will be able to build this component.

Cheers,

Gilles

On 9/21/2017 9:22 AM, Gilles Gouaillardet wrote:

Thanks for the report,


is this related to https://github.com/open-mpi/ompi/issues/4211 ?

there is a known issue when libnl-3 is installed but libnl-route-3 is not


Cheers,


Gilles


On 9/21/2017 8:53 AM, Stephen Guzik wrote:

When compiling (on Debian stretch), I see:

In file included from libnl_utils.h:52:0,
  from reachable_netlink_utils_common.c:48:
libnl1_utils.h:54:26: error: too few arguments to function ‘nl_geterror’
  #define NL_GETERROR(err) nl_geterror()
   ^
libnl1_utils.h:80:5: note: in expansion of macro ‘NL_GETERROR’
  NL_GETERROR(err)); \
  ^~~
reachable_netlink_utils_common.c:310:5: note: in expansion of macro
‘NL_RECVMSGS’
  NL_RECVMSGS(unlsk->nlh, arg, EHOSTUNREACH, err, out);
  ^~~
In file included from /usr/include/libnl3/netlink/netlink.h:31:0,
  from libnl1_utils.h:47,
  from libnl_utils.h:52,
  from reachable_netlink_utils_common.c:48:
/usr/include/libnl3/netlink/errno.h:56:21: note: declared here
  extern const char * nl_geterror(int);

Modifying openmpi-3.0.0/opal/mca/reachable/netlink/libnl1_utils.h from

#define NL_GETERROR(err) nl_geterror()

to

#define NL_GETERROR(err) nl_geterror(err)

as in libnl3_utils.h allows for successful compilation.  But from
configure, I see

checking for libraries that use libnl v1... (none)
checking for libraries that use libnl v3... ibverbs nl-3

so I wonder if perhaps there is something more serious is going on.  Any
suggestions?

Thanks,
Stephen Guzik

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users





--- configure.orig  2017-09-21 11:51:58.073688470 +0900
+++ configure   2017-09-21 11:52:30.589454863 +0900
@@ -155813,7 +155813,7 @@
 # If we found everything
 if test $opal_libnlv3_happy -eq 1; then :
   opal_reachable_netlink_LIBS="-lnl-3 -lnl-route-3"
-   OPAL_HAVE_LIBNL3=1
+   OPAL_HAVE_LIBNL3=1; else opal_reachable_netlink_LIBS=""
 fi
 
 
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-21 Thread Gilles Gouaillardet

Tim,


i am not familiar with CUDA, but that might help

can you please

export nvml_enable=no

and then re-configure and rebuild Open MPI ?


i hope this will help you


Cheers,


Gilles


On 9/21/2017 3:04 PM, Tim Jim wrote:

Hello,

Apologies to bring up this old thread - I finally had a chance to try 
again with openmpi but I am still have trouble getting it to run. I 
downloaded version 3.0.0 hoping it would solve some of the problems 
but on running mpicc for the previous test case, I am still getting an 
undefined reference error. I did as you suggested and also configured 
it to install without cuda using


./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0

and at the end of the summary, CUDA support shows 'no'. Unfortunately, 
the error is still the same, and for some reason, mpicc still seems to 
have referenced my cuda targets.


tjim@DESKTOP-TA3P0PS:~/Documents$ mpicc 
/opt/mpi4py/mpi4py_src/demo/helloworld.c -o hello.bin
mpicc: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1: no 
version information available (required by 
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40)
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined reference 
to `clGetPlatformInfo@OPENCL_1.0'
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined reference 
to `clGetPlatformIDs@OPENCL_1.0'
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined reference 
to `clGetDeviceInfo@OPENCL_1.0'
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined reference 
to `clGetDeviceIDs@OPENCL_1.0'

collect2: error: ld returned 1 exit status

I also attempted to test mpirun, as suggested in the readme, however I 
get the following problem:


 tjim@DESKTOP-TA3P0PS:~/Documents$ mpirun 
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c
mpirun: /usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1: 
no version information available (required by 
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40)

--
Open MPI tried to fork a new process via the "execve" system call but
failed.  Open MPI checks many things before attempting to launch a
child process, but nothing is perfect. This error may be indicative
of another problem on the target host, or even something as silly as
having specified a directory for your application. Your job will now
abort.

  Local host:        DESKTOP-TA3P0PS
  Working dir:       /home/tjim/Documents
  Application name:  /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c
  Error:             Exec format error
--
--
mpirun was unable to start the specified application as it encountered an
error:

Error code: 1
Error name: (null)
Node: DESKTOP-TA3P0PS

when attempting to start process rank 0.
--
4 total processes failed to start
[DESKTOP-TA3P0PS:15231] 3 more processes have sent help message 
help-orte-odls-default.txt / execve error
[DESKTOP-TA3P0PS:15231] Set MCA parameter "orte_base_help_aggregate" 
to 0 to see all help / error messages



Do you have any suggestions to what might have gone wrong on this 
install? I'm not sure if this thread is still alive, so if you need a 
refresh on the situation/any more info, please let me know.

Kind regards,
Tim

On 24 May 2017 at 09:12, Tim Jim > wrote:


Thanks for the thoughts, I'll give it a go. For reference, I have
installed it in the opt directory, as that is where I have kept my
installs currently. Will this be a problem when calling mpi from
other packages?

Thanks,
Tim

On 24 May 2017 06:30, "Reuti" > wrote:

Hi,

Am 23.05.2017 um 05:03 schrieb Tim Jim:

> Dear Reuti,
>
> Thanks for the reply. What options do I have to test whether
it has successfully built?

LIke before: can you compile and run mpihello.c this time –
all as ordinary user in case you installed the Open MPI into
something like $HOME/local/openmpi-2.1.1 and set paths
accordingly. There is no need to be root to install a personal
Open MPI version in your home directory.

-- Reuti


>
> Thanks and kind regards.
> Tim
>
> On 22 May 2017 at 19:39, Reuti > wrote:
> Hi,
>
> > Am 22.05.2017 um 07:22 schrieb Tim Jim
>:
> >
> > Hello,
> >
> > Thanks for your message. I'm trying to get this to work on
a single
> > machine.
>
> Ok.
>
>
> > How might you 

Re: [OMPI users] Libnl bug in openmpi v3.0.0?

2017-09-21 Thread Gilles Gouaillardet

Stephen,


a simpler option is to install the libnl-route-3-dev.

note you will not be able to build the reachable/netlink component 
without this package.



Cheers,


Gilles


On 9/21/2017 1:04 PM, Gilles Gouaillardet wrote:

Stephen,


this is very likely related to the issue already reported in github.

meanwhile, you can apply the attached patch

patch configure < configure.diff

and then re-configure and make.

note this is a temporary workaround, it simply prevent the build of 
the reachable/netlink component,

and the upcoming real fix will be able to build this component.

Cheers,

Gilles

On 9/21/2017 9:22 AM, Gilles Gouaillardet wrote:

Thanks for the report,


is this related to https://github.com/open-mpi/ompi/issues/4211 ?

there is a known issue when libnl-3 is installed but libnl-route-3 is 
not



Cheers,


Gilles


On 9/21/2017 8:53 AM, Stephen Guzik wrote:

When compiling (on Debian stretch), I see:

In file included from libnl_utils.h:52:0,
  from reachable_netlink_utils_common.c:48:
libnl1_utils.h:54:26: error: too few arguments to function 
‘nl_geterror’

  #define NL_GETERROR(err) nl_geterror()
   ^
libnl1_utils.h:80:5: note: in expansion of macro ‘NL_GETERROR’
  NL_GETERROR(err)); \
  ^~~
reachable_netlink_utils_common.c:310:5: note: in expansion of macro
‘NL_RECVMSGS’
  NL_RECVMSGS(unlsk->nlh, arg, EHOSTUNREACH, err, out);
  ^~~
In file included from /usr/include/libnl3/netlink/netlink.h:31:0,
  from libnl1_utils.h:47,
  from libnl_utils.h:52,
  from reachable_netlink_utils_common.c:48:
/usr/include/libnl3/netlink/errno.h:56:21: note: declared here
  extern const char * nl_geterror(int);

Modifying openmpi-3.0.0/opal/mca/reachable/netlink/libnl1_utils.h from

#define NL_GETERROR(err) nl_geterror()

to

#define NL_GETERROR(err) nl_geterror(err)

as in libnl3_utils.h allows for successful compilation.  But from
configure, I see

checking for libraries that use libnl v1... (none)
checking for libraries that use libnl v3... ibverbs nl-3

so I wonder if perhaps there is something more serious is going on.  
Any

suggestions?

Thanks,
Stephen Guzik

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users







___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-21 Thread Gilles Gouaillardet

Tim,


do that in your shell, right before invoking configure.

export nvml_enable=no

./configure ...

make && make install


you can keep the --without-cuda flag (i think this is unrelated though)


Cheers,

Gilles

On 9/21/2017 3:54 PM, Tim Jim wrote:

Dear Gilles,

Thanks for the mail - where should I set export nvml_enable=no? Should 
I reconfigure with default cuda support or keep the --without-cuda flag?


Kind regards,
Tim

On 21 September 2017 at 15:22, Gilles Gouaillardet <gil...@rist.or.jp 
<mailto:gil...@rist.or.jp>> wrote:


Tim,


i am not familiar with CUDA, but that might help

can you please

export nvml_enable=no

and then re-configure and rebuild Open MPI ?


i hope this will help you


Cheers,


Gilles



On 9/21/2017 3:04 PM, Tim Jim wrote:

Hello,

Apologies to bring up this old thread - I finally had a chance
to try again with openmpi but I am still have trouble getting
it to run. I downloaded version 3.0.0 hoping it would solve
some of the problems but on running mpicc for the previous
test case, I am still getting an undefined reference error. I
did as you suggested and also configured it to install without
cuda using

./configure --without-cuda --prefix=/opt/openmpi/openmpi-3.0.0

and at the end of the summary, CUDA support shows 'no'.
Unfortunately, the error is still the same, and for some
reason, mpicc still seems to have referenced my cuda targets.

tjim@DESKTOP-TA3P0PS:~/Documents$ mpicc
/opt/mpi4py/mpi4py_src/demo/helloworld.c -o hello.bin
mpicc:
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1:
no version information available (required by
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40)
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined
reference to `clGetPlatformInfo@OPENCL_1.0'
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined
reference to `clGetPlatformIDs@OPENCL_1.0'
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined
reference to `clGetDeviceInfo@OPENCL_1.0'
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40: undefined
reference to `clGetDeviceIDs@OPENCL_1.0'
collect2: error: ld returned 1 exit status

I also attempted to test mpirun, as suggested in the readme,
however I get the following problem:

 tjim@DESKTOP-TA3P0PS:~/Documents$ mpirun
/opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c
mpirun:
/usr/local/cuda-8.0/targets/x86_64-linux/lib/libOpenCL.so.1:
no version information available (required by
/opt/openmpi/openmpi-3.0.0/lib/libopen-pal.so.40)

--
Open MPI tried to fork a new process via the "execve" system
call but
failed.  Open MPI checks many things before attempting to launch a
child process, but nothing is perfect. This error may be
indicative
of another problem on the target host, or even something as
silly as
having specified a directory for your application. Your job
will now
abort.

  Local host:        DESKTOP-TA3P0PS
  Working dir:       /home/tjim/Documents
  Application name:
 /opt/openmpi/openmpi-3.0.0_src/examples/hello_c.c
  Error:             Exec format error

--

--
mpirun was unable to start the specified application as it
encountered an
error:

Error code: 1
Error name: (null)
Node: DESKTOP-TA3P0PS

when attempting to start process rank 0.

--
4 total processes failed to start
[DESKTOP-TA3P0PS:15231] 3 more processes have sent help
message help-orte-odls-default.txt / execve error
[DESKTOP-TA3P0PS:15231] Set MCA parameter
"orte_base_help_aggregate" to 0 to see all help / error messages


Do you have any suggestions to what might have gone wrong on
this install? I'm not sure if this thread is still alive, so
if you need a refresh on the situation/any more info, please
let me know.
Kind regards,
Tim

On 24 May 2017 at 09:12, Tim Jim <timothy.m@gmail.com
<mailto:timothy.m@gmail.com>
<mailto:timothy.m@gmail.com
<mailto:timothy.m@gmail.com>>> wrote:

    Thanks for the thoughts, I'll give it a go. For reference,
I have
    installed it in the opt directory, as that is where I have
  

Re: [OMPI users] Question concerning compatibility of languages used with building OpenMPI and languages OpenMPI uses to build MPI binaries.

2017-09-20 Thread Gilles Gouaillardet
On Tue, Sep 19, 2017 at 11:58 AM, Jeff Hammond  wrote:

> Fortran is a legit problem, although if somebody builds a standalone Fortran
> 2015 implementation of the MPI interface, it would be decoupled from the MPI
> library compilation.

Is this even doable without making any assumptions ?
For example, how should the MPI C library handle MPI_INTEGER if the
datatype size if it is not known at build time ?
it is likely to be 4 (most common case), but might be 8 (if the
infamous -i8 option, or its equivalent, is used)

Cheers,

Gilles
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread Gilles Gouaillardet

Anthony,


in your script, can you


set -x

env

pbsdsh hostname

mpirun --display-map --display-allocation --mca ess_base_verbose 10 
--mca plm_base_verbose 10 --mca ras_base_verbose 10 hostname



and then compress and send the output ?


Cheers,


Gilles

On 10/3/2017 1:19 PM, Anthony Thyssen wrote:
I noticed that too.  Though the submitting host for torque is a 
different host (main head node, "shrek"),  "node21" is the host that 
torque runs the batch script (and the mpirun command) it being the 
first node in the "dualcore" resource group.


Adding option...

It fixed the hostname in the allocation map, though had no effect on 
the outcome.  The allocation is still simply ignored.


===8>

 --
   The equivalent of an armoured car should always be used to
   protect any secret kept in a cardboard box.
   -- Anthony Thyssen, On the use of Encryption
 --


On Tue, Oct 3, 2017 at 2:00 PM, r...@open-mpi.org 
 > 
wrote:


One thing I can see is that the local host (where mpirun executed)
shows as “node21” in the allocation, while all others show their
FQDN. This might be causing some confusion.

You might try adding "--mca orte_keep_fqdn_hostnames 1” to your
cmd line and see if that helps.



On Oct 2, 2017, at 8:14 PM, Anthony Thyssen
> wrote:

Update...  Problem of all processes runing on first node
(oversubscribed dual-core machine) is NOT resolved.

Changing the mpirun  command in the Torque batch script
("pbs_hello" - See previous) to

   mpirun --nooversubscribe --display-allocation hostname

Then submitting to PBS/Torque using

qsub -l nodes=5:ppn=1:dualcore pbs_hello

To run on 5 dual-core machines. Produces the following result...

===8>
 --
   Using encryption on the Internet is the equivalent of arranging

<    2   3   4   5   6   7   8   9   10   11   >