Re: [OMPI users] Slurm binding not propagated to MPI jobs

2016-11-04 Thread r...@open-mpi.org
See https://github.com/open-mpi/ompi/pull/2365 


Let me know if that solves it for you


> On Nov 3, 2016, at 9:48 AM, Andy Riebs  wrote:
> 
> Getting that support into 2.1 would be terrific -- and might save us from 
> having to write some Slurm prolog scripts to effect that.
> 
> Thanks Ralph!
> 
> On 11/01/2016 11:36 PM, r...@open-mpi.org  wrote:
>> Ah crumby!! We already solved this on master, but it cannot be backported to 
>> the 1.10 series without considerable pain. For some reason, the support for 
>> it has been removed from the 2.x series as well. I’ll try to resolve that 
>> issue and get the support reinstated there (probably not until 2.1).
>> 
>> Can you manage until then? I think the v2 RM’s are thinking Dec/Jan for 
>> 2.1.
>> Ralph
>> 
>> 
>>> On Nov 1, 2016, at 11:38 AM, Riebs, Andy wrote:
>>> 
>>> To close the thread here… I got the following information:
>>>  
>>> Looking at SLURM_CPU_BIND is the right idea, but there are quite a few more 
>>> options. It misses map_cpu, rank, plus the NUMA-based options:
>>> rank_ldom, map_ldom, and mask_ldom. See the srun man pages for 
>>> documentation.
>>>  
>>>  
>>> From: Riebs, Andy 
>>> Sent: Thursday, October 27, 2016 1:53 PM
>>> To: users@lists.open-mpi.org 
>>> Subject: Re: [OMPI users] Slurm binding not propagated to MPI jobs
>>>  
>>> Hi Ralph,
>>> 
>>> I haven't played around in this code, so I'll flip the question over to the 
>>> Slurm list, and report back here when I learn anything.
>>> 
>>> Cheers
>>> Andy
>>> 
>>> On 10/27/2016 01:44 PM, r...@open-mpi.org  wrote:
>>> Sigh - of course it wouldn’t be simple :-( 
>>>  
>>> All right, let’s suppose we look for SLURM_CPU_BIND:
>>>  
>>> * if it includes the word “none”, then we know the user specified that 
>>> they don’t want us to bind
>>>  
>>> * if it includes the word mask_cpu, then we have to check the value of that 
>>> option.
>>>  
>>> * If it is all F’s, then they didn’t specify a binding and we should do 
>>> our thing.
>>>  
>>> * If it is anything else, then we assume they _did_ specify a binding, and 
>>> we leave it alone
>>>  
>>> Would that make sense? Is there anything else that could be in that envar 
>>> which would trip us up?
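
For illustration only, here is a rough C sketch of the heuristic being proposed
above. This is not Open MPI source code and not the eventual fix in the pull
request at the top of the thread; the function names are invented, and the
remaining options (map_cpu, rank, rank_ldom, map_ldom, mask_ldom) mentioned
elsewhere in the thread are only noted, not handled.

    /* Rough sketch of the proposed SLURM_CPU_BIND check -- NOT actual
     * Open MPI code; names are invented for illustration. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* true if every hex digit in a list like "0xffff,0xffff" is 'f' or 'F' */
    static bool mask_is_all_f(const char *list)
    {
        while ('\0' != *list) {
            if (0 == strncmp(list, "0x", 2)) {
                list += 2;                    /* skip the hex prefix */
            } else if (',' == *list) {
                list++;                       /* next mask in the list */
            } else if ('f' == *list || 'F' == *list) {
                list++;
            } else {
                return false;                 /* a digit other than F */
            }
        }
        return true;
    }

    /* true if srun was given an explicit binding directive (including "none") */
    static bool slurm_binding_was_given(void)
    {
        const char *bind = getenv("SLURM_CPU_BIND");
        if (NULL == bind) {
            return false;                     /* envar absent: nothing specified */
        }
        if (NULL != strstr(bind, "none")) {
            return true;                      /* --cpu_bind=none: leave it alone */
        }
        const char *mask = strstr(bind, "mask_cpu:");
        if (NULL != mask) {
            /* an all-F mask means srun imposed no real restriction */
            return !mask_is_all_f(mask + strlen("mask_cpu:"));
        }
        /* map_cpu, rank, rank_ldom, map_ldom, mask_ldom would need similar
         * handling, as noted elsewhere in this thread */
        return true;
    }

    int main(void)
    {
        printf("explicit srun binding request: %s\n",
               slurm_binding_was_given() ? "yes" : "no");
        return 0;
    }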
>>>  
>>>  
>>> On Oct 27, 2016, at 10:37 AM, Andy Riebs wrote:
>>>  
>>> Yes, they still exist:
>>> $ srun --ntasks-per-node=2 -N1 env | grep BIND | sort -u
>>> SLURM_CPU_BIND_LIST=0x
>>> SLURM_CPU_BIND=quiet,mask_cpu:0x
>>> SLURM_CPU_BIND_TYPE=mask_cpu:
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> Here are the relevant Slurm configuration options that could conceivably 
>>> change the behavior from system to system:
>>> SelectType  = select/cons_res
>>> SelectTypeParameters= CR_CPU
>>> 
>>>  
>>> On 10/27/2016 01:17 PM, r...@open-mpi.org  wrote:
>>> And if there is no --cpu_bind on the cmd line? Do these not exist?
>>>  
>>> On Oct 27, 2016, at 10:14 AM, Andy Riebs wrote:
>>>  
>>> Hi Ralph,
>>> 
>>> I think I've found the magic keys...
>>> 
>>> $ srun --ntasks-per-node=2 -N1 --cpu_bind=none env | grep BIND
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=none
>>> SLURM_CPU_BIND_LIST=
>>> SLURM_CPU_BIND=quiet,none
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=none
>>> SLURM_CPU_BIND_LIST=
>>> SLURM_CPU_BIND=quiet,none
>>> $ srun --ntasks-per-node=2 -N1 --cpu_bind=core env | grep BIND
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=mask_cpu:
>>> SLURM_CPU_BIND_LIST=0x,0x
>>> SLURM_CPU_BIND=quiet,mask_cpu:0x,0x
>>> SLURM_CPU_BIND_VERBOSE=quiet
>>> SLURM_CPU_BIND_TYPE=mask_cpu:
>>> SLURM_CPU_BIND_LIST=0x,0x
>>> SLURM_CPU_BIND=quiet,mask_cpu:0x,0x
>>> 
>>> Andy
>>> 
>>> On 10/27/2016 11:57 AM, r...@open-mpi.org  wrote:
>>> 
>>> Hey Andy
>>> 
>>> Is there a SLURM envar that would tell us the binding option from the srun 
>>> cmd line? We automatically bind when direct launched due to user complaints 
>>> of poor performance if we don’t. If the user specifies a 
>>> binding option, then we detect that we were already bound and 
>>> don’t do it.
>>> 
>>> However, if the user specifies that they not be bound, then we think they 
>>> simply didn’t specify anything - and that 
>>> isn’t the case. If we can see something that tells us 
>>> “they explicitly said not to do it”, then we can 
>>> avoid the situation.
>>> 
>>> Ralph
>>> 
>>> 
>>> On Oct 27, 2016, at 8:48 AM, Andy Riebs wrote:
>>> 
>>> Hi All,
>>> 
>>> We are running Open MPI version 1.10.2, built with support for Slurm 
>>> version 16.05.0. When a user specifies "--cpu_bind=none", MPI tries to bind 

Re: [OMPI users] error on dlopen

2016-11-04 Thread Mahmood Naderan
> What problems are you referring to?
I mean errors saying that some X.so failed to load. Then the user has to add
some paths to LD_LIBRARY_PATH. Although such a problem can be fixed by adding
an export to .bashrc, I would prefer to avoid that.


>We might need a bit more detail than that; I use "--enable-static
--disable-shared" and I do not get dlopen errors

I have also seen it work on CentOS. But when I tested an application on
Ubuntu 15.04, I saw that error. Maybe an external library that is installed
on CentOS is missing on Ubuntu... this is a guess, though.


Regards,
Mahmood
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Jeff Squyres (jsquyres)
On Nov 4, 2016, at 12:14 PM, Mahmood Naderan  wrote:
> 
> > If there's a reason you did --enable-static --disable-shared
> Basically, I want to prevent dynamic library problems (ldd) in a distributed 
> environment.

What problems are you referring to?

> $ mpifort --showme
> gfortran -I/opt/openmpi-2.0.1/include -pthread -I/opt/openmpi-2.0.1/lib 
> -Wl,-rpath -Wl,/opt/openmpi-2.0.1/lib -Wl,--enable-new-dtags 
> -L/opt/openmpi-2.0.1/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh 
> -lmpi -lopen-rte -lopen-pal -lm -lrt -lutil
> ​
> As I said, --disable-dlopen fixed that error. But if anybody knows how to 
> have --enable-static --disable-shared with dlopen, please let me know.

We might need a bit more detail than that; I use "--enable-static 
--disable-shared" and I do not get dlopen errors.

When you enable static/disable shared, can you build simple MPI applications 
(e.g., hello world)?  I.e., is the problem with Open MPI, or some kind of 
effect that happens with building your large/complex application?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Mahmood Naderan
> If there's a reason you did --enable-static --disable-shared
Basically, I want to prevent dynamic library problems (ldd) in a
distributed environment.


​$ mpifort --showme
gfortran -I/opt/openmpi-2.0.1/include -pthread -I/opt/openmpi-2.0.1/lib
-Wl,-rpath -Wl,/opt/openmpi-2.0.1/lib -Wl,--enable-new-dtags
-L/opt/openmpi-2.0.1/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr
-lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lm -lrt -lutil
​
As I said, --disable-dlopen fixed that error. But if anybody knows how to
have --enable-static --disable-shared with dlopen, please let me know.



Regards,
Mahmood
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
All true - but I reiterate. The source of the problem is that the "--map-by 
node” on the cmd line must come *before* your application. Otherwise, none of 
these suggestions will help.

> On Nov 4, 2016, at 6:52 AM, Jeff Squyres (jsquyres)  
> wrote:
> 
> In your case, using slots or --npernode or --map-by node will result in the 
> same distribution of processes because you're only launching 1 process per 
> node (a.k.a. "1ppn").
> 
> They have more pronounced differences when you're launching more than 1ppn.
> 
> Let's take a step back: you should know that Open MPI uses 3 phases to plan 
> out how it will launch your MPI job:
> 
> 1. Mapping: where each process will go
> 2. Ordering: after mapping, how each process will be numbered (this 
> translates to rank ordering MPI_COMM_WORLD)
> 3. Binding: binding processes to processors
> 
> #3 is not pertinent to this conversation, so I'll leave it out of my 
> discussion below.
> 
> We're mostly talking about #1 here.  Let's look at each of the three options 
> mentioned in this thread individually.  In each of the items below, I assume 
> you are using *just* that option, and *neither of the other 2 options*:
> 
> 1. slots: this tells Open MPI the maximum number of processes that can be 
> placed on a server before it is considered to be "oversubscribed" (and Open 
> MPI won't let you oversubscribe by default).
> 
> So when you say "slots=1", you're basically telling Open MPI to launch 1 
> process per node and then to move on to the next node.  If you said 
> "slots=3", then Open MPI would launch up to 3 processes per node before 
> moving on to the next (until the total np processes were launched).
> 
> *** Be aware that we have changed the hostfile default value of slots (i.e., 
> what number of slots to use if it is not specified in the hostfile) in 
> different versions of Open MPI.  When using hostfiles, in most cases, you'll 
> see either a default value of 1 or the total number of cores on the node.
> 
> 2. --map-by node: in this case, Open MPI will map out processes round robin 
> by *node* instead of its default by *core*.  Hence, even if you had "slots=3" 
> and -np 9, Open MPI would first put a process on node A, then put a process 
> on node B, then a process on node C, and then loop back to putting a 2nd 
> process on node A, ...etc.
> 
> 3. --npernode: in this case, you're telling Open MPI how many processes to 
> put on each node before moving on to the next node.  E.g., if you "mpirun -np 
> 9 ..." (and assuming you have >=3 slots per node), Open MPI will put 3 
> processes on each node before moving on to the next node.
> 
> With the default MPI_COMM_WORLD rank ordering, the practical difference in 
> these three options is:
> 
> Case 1:
> 
> $ cat hostfile
> a slots=3
> b slots=3
> c slots=3
> $ mpirun --hostfile hostfile -np 9 my_mpi_executable
> 
> In this case, you'll end up with MCW ranks 0-2 on a, 3-5 on b, and 6-8 on c.
> 
> Case 2:
> 
> # Setting an arbitrarily large number of slots per host just to be explicitly 
> clear for this example
> $ cat hostfile
> a slots=20
> b slots=20
> c slots=20
> $ mpirun --hostfile hostfile -np 9 --map-by node my_mpi_executable
> 
> In this case, you'll end up with MCW ranks 0,3,6 on a, 1,4,7 on b, and 2,5,8 
> on c.
> 
> Case 3:
> 
> # Setting an arbitrarily large number of slots per host just to be explicitly 
> clear for this example
> $ cat hostfile
> a slots=20
> b slots=20
> c slots=20
> $ mpirun --hostfile hostfile -np 9 --npernode 3 my_mpi_executable
> 
> In this case, you'll end up with the same distribution / rank ordering as 
> case #1, but you'll still have 17 more slots you could have used.
> 
> There are lots of variations on this, too, because these mpirun options (and 
> many others) can be used in conjunction with each other.  But that gets 
> pretty esoteric pretty quickly; most users don't have a need for such 
> complexity.
> 
> 
> 
>> On Nov 4, 2016, at 8:57 AM, Bennet Fauber  wrote:
>> 
>> Mahesh,
>> 
>> Depending what you are trying to accomplish, might using the mpirun option
>> 
>> -pernode  -o-  --pernode
>> 
>> work for you?  That requests that only one process be spawned per
>> available node.
>> 
>> We generally use this for hybrid codes, where the single process will
>> spawn threads to the remaining processors.
>> 
>> Just a thought,   -- bennet
>> 
>> 
>> 
>> 
>> 
>> On Fri, Nov 4, 2016 at 8:39 AM, Mahesh Nanavalla
>>  wrote:
>>> s...
>>> 
>>> Thanks for responding me.
>>> i have solved that as below by limiting slots in hostfile
>>> 
>>> root@OpenWrt:~# cat myhostfile
>>> root@10.73.145.1 slots=1
>>> root@10.74.25.1  slots=1
>>> root@10.74.46.1  slots=1
>>> 
>>> 
>>> I want the difference between the slots limiting in myhostfile and runnig
>>> --map-by node.
>>> 
>>> I am awaiting for your reply.
>>> 
>>> On Fri, Nov 4, 2016 at 5:25 PM, r...@open-mpi.org  wrote:
 
 My apologies - the problem is that you list the option _after_ your
 executable nam

Re: [OMPI users] error on dlopen

2016-11-04 Thread Jeff Squyres (jsquyres)
> On Nov 4, 2016, at 7:07 AM, Mahmood Naderan  wrote:
> 
> > You might have to remove -ldl from the scalapack makefile
> I removed that before... I will try one more time
> 
> Actually, using --disable-dlopen fixed the error.

To clarify:

1. Using --enable-static causes all the plugins in Open MPI to be "slurped up" 
into the MPI libraries.  I.e., they won't be opened at runtime as plugins -- 
they're just part of the MPI library.

This does *not* disable Open MPI from trying to dlopen() additional plugins at 
runtime, though.

2. Using --disable-dlopen *also* causes all the plugins in Open MPI to be 
"slurped up" into the MPI libraries.  It *also* disables Open MPI from trying 
to dlopen() additional plugins at runtime.

> >mpirun --showme

Gilles meant to say:

mpicc --showme
mpifort --showme

-

If there's a reason you did --enable-static --disable-shared, then since you're 
having a problem with the dl library, you might as well also --disable-dlopen 
and then remove the -ldl from where you added it to the Makefile.  Open MPI 
will no longer be using dlopen, but that does not mean that something else 
isn't using dlopen.
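
As an aside, and purely as an editorial illustration rather than anything from
this thread: any code that itself calls dlopen(), like the toy program below,
needs -ldl at link time regardless of how Open MPI was configured.

    /* Toy example (not from this thread): a program that uses dlopen() itself
     * and therefore needs -ldl when linked, independent of Open MPI. */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        /* "libm.so.6" is just an arbitrary library chosen for the example */
        void *handle = dlopen("libm.so.6", RTLD_NOW);
        if (NULL == handle) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }
        dlclose(handle);
        return 0;
    }

Built with something like "cc dltest.c -o dltest -ldl" (the file name is made
up); the point is only that a -ldl requirement can come from code outside
Open MPI as well.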

If you don't have a specific reason for using --enable-static, then you might 
as well not specify *any* of --enable-static, --disable-shared, or 
--disable-dlopen, and then I'm guessing using mpicc/mpifort in your Makefile 
will "just work".

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Jeff Squyres (jsquyres)
In your case, using slots or --npernode or --map-by node will result in the 
same distribution of processes because you're only launching 1 process per node 
(a.k.a. "1ppn").

They have more pronounced differences when you're launching more than 1ppn.

Let's take a step back: you should know that Open MPI uses 3 phases to plan out 
how it will launch your MPI job:

1. Mapping: where each process will go
2. Ordering: after mapping, how each process will be numbered (this translates 
to the rank ordering in MPI_COMM_WORLD)
3. Binding: binding processes to processors

#3 is not pertinent to this conversation, so I'll leave it out of my discussion 
below.

We're mostly talking about #1 here.  Let's look at each of the three options 
mentioned in this thread individually.  In each of the items below, I assume 
you are using *just* that option, and *neither of the other 2 options*:

1. slots: this tells Open MPI the maximum number of processes that can be 
placed on a server before it is considered to be "oversubscribed" (and Open MPI 
won't let you oversubscribe by default).

So when you say "slots=1", you're basically telling Open MPI to launch 1 
process per node and then to move on to the next node.  If you said "slots=3", 
then Open MPI would launch up to 3 processes per node before moving on to the 
next (until the total np processes were launched).

*** Be aware that we have changed the hostfile default value of slots (i.e., 
what number of slots to use if it is not specified in the hostfile) in 
different versions of Open MPI.  When using hostfiles, in most cases, you'll 
see either a default value of 1 or the total number of cores on the node.

2. --map-by node: in this case, Open MPI will map out processes round robin by 
*node* instead of its default by *core*.  Hence, even if you had "slots=3" and 
-np 9, Open MPI would first put a process on node A, then put a process on node 
B, then a process on node C, and then loop back to putting a 2nd process on 
node A, ...etc.

3. --npernode: in this case, you're telling Open MPI how many processes to put 
on each node before moving on to the next node.  E.g., if you "mpirun -np 9 
..." (and assuming you have >=3 slots per node), Open MPI will put 3 processes 
on each node before moving on to the next node.

With the default MPI_COMM_WORLD rank ordering, the practical difference in 
these three options is:

Case 1:

$ cat hostfile
a slots=3
b slots=3
c slots=3
$ mpirun --hostfile hostfile -np 9 my_mpi_executable

In this case, you'll end up with MCW ranks 0-2 on a, 3-5 on b, and 6-8 on c.

Case 2:

# Setting an arbitrarily large number of slots per host just to be explicitly 
clear for this example
$ cat hostfile
a slots=20
b slots=20
c slots=20
$ mpirun --hostfile hostfile -np 9 --map-by node my_mpi_executable

In this case, you'll end up with MCW ranks 0,3,6 on a, 1,4,7 on b, and 2,5,8 on 
c.

Case 3:

# Setting an arbitrarily large number of slots per host just to be explicitly 
clear for this example
$ cat hostfile
a slots=20
b slots=20
c slots=20
$ mpirun --hostfile hostfile -np 9 --npernode 3 my_mpi_executable

In this case, you'll end up with the same distribution / rank ordering as case 
#1, but you'll still have 17 more slots you could have used.

There are lots of variations on this, too, because these mpirun options (and 
many others) can be used in conjunction with each other.  But that gets pretty 
esoteric pretty quickly; most users don't have a need for such complexity.
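
Not part of the original exchange, but a handy way to check which mapping you
actually got: the minimal MPI program below prints each MCW rank and the host
it landed on.

    /* Minimal rank/host reporter -- an editorial addition, not from the thread. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);

        printf("MCW rank %d of %d is on host %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }

Compile it with mpicc and run it as my_mpi_executable in the three cases above;
in case 2, for example, you should see ranks 0, 3, and 6 report host a.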



> On Nov 4, 2016, at 8:57 AM, Bennet Fauber  wrote:
> 
> Mahesh,
> 
> Depending what you are trying to accomplish, might using the mpirun option
> 
> -pernode  -o-  --pernode
> 
> work for you?  That requests that only one process be spawned per
> available node.
> 
> We generally use this for hybrid codes, where the single process will
> spawn threads to the remaining processors.
> 
> Just a thought,   -- bennet
> 
> 
> 
> 
> 
> On Fri, Nov 4, 2016 at 8:39 AM, Mahesh Nanavalla
>  wrote:
>> s...
>> 
>> Thanks for responding me.
>> i have solved that as below by limiting slots in hostfile
>> 
>> root@OpenWrt:~# cat myhostfile
>> root@10.73.145.1 slots=1
>> root@10.74.25.1  slots=1
>> root@10.74.46.1  slots=1
>> 
>> 
>> I want the difference between the slots limiting in myhostfile and runnig
>> --map-by node.
>> 
>> I am awaiting for your reply.
>> 
>> On Fri, Nov 4, 2016 at 5:25 PM, r...@open-mpi.org  wrote:
>>> 
>>> My apologies - the problem is that you list the option _after_ your
>>> executable name, and so we think it is an argument for your executable. You
>>> need to list the option _before_ your executable on the cmd line
>>> 
>>> 
>>> On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla
>>>  wrote:
>>> 
>>> Thanks for reply,
>>> 
>>> But,with space also not running on one process one each node
>>> 
>>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
>>> myhostfile /usr/bin/openmpiWiFiBulb --map-by node
>>> 
>>> And
>>> 
>>> If use like this it,s workin

Re: [OMPI users] OMPI users] mpirun --map-by-node

2016-11-04 Thread Gilles Gouaillardet
As long as you run 3 MPI tasks, both options will produce the same mapping.
If you want to run up to 12 tasks, then --map-by node is the way to go

Mahesh Nanavalla  wrote:
>s...
>
>
>Thanks for responding me.
>
>i have solved that as below by limiting slots in hostfile
>
>
>root@OpenWrt:~# cat myhostfile 
>
>root@10.73.145.1 slots=1
>
>root@10.74.25.1  slots=1
>
>root@10.74.46.1  slots=1
>
>
>
>I want the difference between the slots limiting in myhostfile and runnig 
>--map-by node.
>
>
>I am awaiting for your reply.
>
>
>On Fri, Nov 4, 2016 at 5:25 PM, r...@open-mpi.org  wrote:
>
>My apologies - the problem is that you list the option _after_ your executable 
>name, and so we think it is an argument for your executable. You need to list 
>the option _before_ your executable on the cmd line
>
>
>
>On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla  
>wrote:
>
>
>Thanks for reply,
>
>
>But,with space also not running on one process one each node
>
>
>root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
>myhostfile /usr/bin/openmpiWiFiBulb --map-by node
>
>
>And 
>
>
>If use like this it,s working fine(running one process on each node)
>
>root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --host 
>root@10.74.25.1,root@10.74.46.1,root@10.73.145.1 /usr/bin/openmpiWiFiBulb 
>
>
>But,i want use hostfile only..
>
>kindly help me.
>
>
>
>On Fri, Nov 4, 2016 at 5:00 PM, r...@open-mpi.org  wrote:
>
>you mistyped the option - it is “--map-by node”. Note the space between “by” 
>and “node” - you had typed it with a “-“ instead of a “space”
>
>
>
>On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla  
>wrote:
>
>
>Hi all,
>
>
>I am using openmpi-1.10.3,using quad core processor(node).
>
>
>I am running 3 processes on three nodes(provided by hostfile) each node 
>process is limited  by --map-by-node as below
>
>
>root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
>myhostfile /usr/bin/openmpiWiFiBulb --map-by-node
>
>
>root@OpenWrt:~# cat myhostfile 
>
>root@10.73.145.1:1
>
>root@10.74.25.1:1
>
>root@10.74.46.1:1
>
>
>
>Problem is 3 process running on one node. It's not mapping one process by node.
>
>
>is there any library used to run like above.if yes please tell me that .
>
>
>Kindly help me where am doing wrong...
>
>
>Thanks&Regards,
>
>Mahesh N
>
>
>___
>users mailing list
>users@lists.open-mpi.org
>https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
>___
>users mailing list
>users@lists.open-mpi.org
>https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>___
>users mailing list
>users@lists.open-mpi.org
>https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
>___
>users mailing list
>users@lists.open-mpi.org
>https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Bennet Fauber
Mahesh,

Depending what you are trying to accomplish, might using the mpirun option

-pernode  -o-  --pernode

work for you?  That requests that only one process be spawned per
available node.

We generally use this for hybrid codes, where the single process will
spawn threads to the remaining processors.

Just a thought,   -- bennet





On Fri, Nov 4, 2016 at 8:39 AM, Mahesh Nanavalla
 wrote:
> s...
>
> Thanks for responding me.
> i have solved that as below by limiting slots in hostfile
>
> root@OpenWrt:~# cat myhostfile
> root@10.73.145.1 slots=1
> root@10.74.25.1  slots=1
> root@10.74.46.1  slots=1
>
>
> I want the difference between the slots limiting in myhostfile and runnig
> --map-by node.
>
> I am awaiting for your reply.
>
> On Fri, Nov 4, 2016 at 5:25 PM, r...@open-mpi.org  wrote:
>>
>> My apologies - the problem is that you list the option _after_ your
>> executable name, and so we think it is an argument for your executable. You
>> need to list the option _before_ your executable on the cmd line
>>
>>
>> On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla
>>  wrote:
>>
>> Thanks for reply,
>>
>> But,with space also not running on one process one each node
>>
>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
>> myhostfile /usr/bin/openmpiWiFiBulb --map-by node
>>
>> And
>>
>> If use like this it,s working fine(running one process on each node)
>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --host
>> root@10.74.25.1,root@10.74.46.1,root@10.73.145.1 /usr/bin/openmpiWiFiBulb
>>
>> But,i want use hostfile only..
>> kindly help me.
>>
>>
>> On Fri, Nov 4, 2016 at 5:00 PM, r...@open-mpi.org  wrote:
>>>
>>> you mistyped the option - it is “--map-by node”. Note the space between
>>> “by” and “node” - you had typed it with a “-“ instead of a “space”
>>>
>>>
>>> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla
>>>  wrote:
>>>
>>> Hi all,
>>>
>>> I am using openmpi-1.10.3,using quad core processor(node).
>>>
>>> I am running 3 processes on three nodes(provided by hostfile) each node
>>> process is limited  by --map-by-node as below
>>>
>>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
>>> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node
>>>
>>> root@OpenWrt:~# cat myhostfile
>>> root@10.73.145.1:1
>>> root@10.74.25.1:1
>>> root@10.74.46.1:1
>>>
>>>
>>> Problem is 3 process running on one node. It's not mapping one process by
>>> node.
>>>
>>> is there any library used to run like above.if yes please tell me that .
>>>
>>> Kindly help me where am doing wrong...
>>>
>>> Thanks&Regards,
>>> Mahesh N
>>>
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>>
>>>
>>>
>>> ___
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>>
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Mahesh Nanavalla
s...

Thanks for responding to me.
I have solved that as below by limiting slots in the hostfile:

root@OpenWrt:~# cat myhostfile
root@10.73.145.1 slots=1
root@10.74.25.1  slots=1
root@10.74.46.1  slots=1


What is the difference between limiting slots in myhostfile and running
with --map-by node?

I am awaiting your reply.

On Fri, Nov 4, 2016 at 5:25 PM, r...@open-mpi.org  wrote:

> My apologies - the problem is that you list the option _after_ your
> executable name, and so we think it is an argument for your executable. You
> need to list the option _before_ your executable on the cmd line
>
>
> On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla <
> mahesh.nanavalla...@gmail.com> wrote:
>
> Thanks for reply,
>
> But,with space also not running on one process one each node
>
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
> myhostfile /usr/bin/openmpiWiFiBulb --map-by node
>
> And
>
> If use like this it,s working fine(running one process on each node)
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --host
> root@10.74.25.1,root@10.74.46.1,root@10.73.145.1 /usr/bin/openmpiWiFiBulb
>
> But,i want use hostfile only..
> kindly help me.
>
>
> On Fri, Nov 4, 2016 at 5:00 PM, r...@open-mpi.org  wrote:
>
>> you mistyped the option - it is “--map-by node”. Note the space between
>> “by” and “node” - you had typed it with a “-“ instead of a “space”
>>
>>
>> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla <
>> mahesh.nanavalla...@gmail.com> wrote:
>>
>> Hi all,
>>
>> I am using openmpi-1.10.3,using quad core processor(node).
>>
>> I am running 3 processes on three nodes(provided by hostfile) each node
>> process is limited  by --map-by-node as below
>>
>> *root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
>> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node*
>>
>>
>>
>>
>>
>>
>>
>> root@OpenWrt:~# cat myhostfile
>> root@10.73.145.1:1
>> root@10.74.25.1:1
>> root@10.74.46.1:1
>> Problem is 3 process running on one node. It's not mapping one process by
>> node. Is there any library used to run like above? If yes please tell me
>> that. Kindly help me where am doing wrong...
>> Thanks&Regards, Mahesh N
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>>
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
My apologies - the problem is that you list the option _after_ your executable 
name, and so we think it is an argument for your executable. You need to list 
the option _before_ your executable on the cmd line


> On Nov 4, 2016, at 4:44 AM, Mahesh Nanavalla  
> wrote:
> 
> Thanks for reply,
> 
> But,with space also not running on one process one each node
> 
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
> myhostfile /usr/bin/openmpiWiFiBulb --map-by node
> 
> And 
> 
> If use like this it,s working fine(running one process on each node)
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --host 
> root@10.74.25.1,root@10.74.46.1,root@10.73.145.1 /usr/bin/openmpiWiFiBulb
> 
> But,i want use hostfile only..
> kindly help me.
> 
> 
> On Fri, Nov 4, 2016 at 5:00 PM, r...@open-mpi.org wrote:
> you mistyped the option - it is “--map-by node”. Note the space between “by” 
> and “node” - you had typed it with a “-“ instead of a “space”
> 
> 
>> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla wrote:
>> 
>> Hi all,
>> 
>> I am using openmpi-1.10.3,using quad core processor(node).
>> 
>> I am running 3 processes on three nodes(provided by hostfile) each node 
>> process is limited  by --map-by-node as below
>> 
>> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
>> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node
>> 
>> root@OpenWrt:~# cat myhostfile 
>> root@10.73.145.1:1 
>> root@10.74.25.1:1 
>> root@10.74.46.1:1 
>> 
>> 
>> Problem is 3 process running on one node. It's not mapping one process by
>> node.
>> 
>> is there any library used to run like above.if yes please tell me that .
>> 
>> Kindly help me where am doing wrong...
>> 
>> Thanks&Regards,
>> Mahesh N
>> 
>> ___
>> users mailing list
>> users@lists.open-mpi.org 
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
>> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users 
> 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Mahesh Nanavalla
Thanks for the reply.

But with the space it is still not running one process on each node:

root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
myhostfile /usr/bin/openmpiWiFiBulb --map-by node

And

If I use it like this, it is working fine (running one process on each node):
root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --host
root@10.74.25.1,root@10.74.46.1,root@10.73.145.1 /usr/bin/openmpiWiFiBulb

But I want to use a hostfile only.
Kindly help me.


On Fri, Nov 4, 2016 at 5:00 PM, r...@open-mpi.org  wrote:

> you mistyped the option - it is “--map-by node”. Note the space between
> “by” and “node” - you had typed it with a “-“ instead of a “space”
>
>
> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla <
> mahesh.nanavalla...@gmail.com> wrote:
>
> Hi all,
>
> I am using openmpi-1.10.3,using quad core processor(node).
>
> I am running 3 processes on three nodes(provided by hostfile) each node
> process is limited  by --map-by-node as below
>
> *root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node*
>
>
>
>
>
>
>
> root@OpenWrt:~# cat myhostfile
> root@10.73.145.1:1
> root@10.74.25.1:1
> root@10.74.46.1:1
> Problem is 3 process running on one node. It's not mapping one process by
> node. Is there any library used to run like above? If yes please tell me
> that. Kindly help me where am doing wrong...
> Thanks&Regards, Mahesh N
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread r...@open-mpi.org
you mistyped the option - it is “--map-by node”. Note the space between “by” 
and “node” - you had typed it with a “-“ instead of a “space”


> On Nov 4, 2016, at 4:28 AM, Mahesh Nanavalla  
> wrote:
> 
> Hi all,
> 
> I am using openmpi-1.10.3,using quad core processor(node).
> 
> I am running 3 processes on three nodes(provided by hostfile) each node 
> process is limited  by --map-by-node as below
> 
> root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile 
> myhostfile /usr/bin/openmpiWiFiBulb --map-by-node
> 
> root@OpenWrt:~# cat myhostfile 
> root@10.73.145.1:1 
> root@10.74.25.1:1 
> root@10.74.46.1:1 
> 
> 
> Problem is 3 process running on one node. It's not mapping one process by
> node.
> 
> is there any library used to run like above.if yes please tell me that .
> 
> Kindly help me where am doing wrong...
> 
> Thanks&Regards,
> Mahesh N
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

[OMPI users] mpirun --map-by-node

2016-11-04 Thread Mahesh Nanavalla
Hi all,

I am using openmpi-1.10.3 on quad-core processors (nodes).

I am running 3 processes on three nodes (provided by a hostfile); each node's
process count is limited by --map-by-node as below:

root@OpenWrt:~# /usr/bin/mpirun --allow-run-as-root -np 3 --hostfile
myhostfile /usr/bin/openmpiWiFiBulb --map-by-node

root@OpenWrt:~# cat myhostfile
root@10.73.145.1:1
root@10.74.25.1:1
root@10.74.46.1:1

The problem is that all 3 processes run on one node; it is not mapping one
process per node.

Is there any library used to run like the above? If yes, please tell me.

Kindly help me with where I am going wrong...

Thanks & Regards,
Mahesh N
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Mahmood Naderan
> You might have to remove -ldl from the scalapack makefile
I removed that before... I will try one more time

Actually, using --disable-dlopen fixed the error.

>mpirun --showme

$ mpirun --showme
mpirun: Error: unknown option "--showme"
Type 'mpirun --help' for usage.


Regards,
Mahmood



On Fri, Nov 4, 2016 at 2:12 PM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> You might have to remove -ldl from the scalapack makefile
>
> If it still does not work, can you please post
> mpirun --showme ...
> output ?
>
> Cheers,
>
> Gilles
>
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Gilles Gouaillardet
You might have to remove -ldl from the scalapack makefile

If it still does not work, can you please post
mpirun --showme ...
output ?

Cheers,

Gilles

On Friday, November 4, 2016, Mahmood Naderan  wrote:

> Hi Gilles,
> I noticed that /opt/openmpi-2.0.1/share/openmpi/mpifort-wrapper-data.txt
> is created after "make install". So, I edited it and appended -ldl to
> libs_static.
> Then I ran "make clean && make all" for scalapack.
>
> However, still get the same error!!
>
> So, let me try disabling dlopen.
>
>
> Regards,
> Mahmood
>
>
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Mahmood Naderan
Hi Gilles,
I noticed that /opt/openmpi-2.0.1/share/openmpi/mpifort-wrapper-data.txt is
created after "make install". So, I edited it and appended -ldl to
libs_static.
Then I ran "make clean && make all" for scalapack.

However, I still get the same error!

So, let me try disabling dlopen.


Regards,
Mahmood
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Gilles Gouaillardet

not much difference from a performance point of view.

the difference is more from a space (both memory and disk) point of view


also, if you --disable-dlopen, Open MPI has to be rebuilt whenever a single 
component is updated.


(without it, you can simply make install from the updated component 
directory)


so if you are developing a new component, *not* using --disable-dlopen 
can save you some build time



Cheers,


Gilles


On 11/4/2016 5:12 PM, Mahmood Naderan wrote:
I will try that. Meanwhile, I want to know what is the performance 
effect of disabling/enabling dlopen?


Regards,
Mahmood



On Fri, Nov 4, 2016 at 11:02 AM, Gilles Gouaillardet wrote:


Yes, that is a problem :-(


you might want to reconfigure with

--enable-static --disable-shared --disable-dlopen

and see if it helps


or you can simply manuall edit
/opt/openmpi-2.0.1/share/openmpi/mpifort-wrapper-data.txt,

and append -ldl to the libs_static definition


Cheers,


Gilles





___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Mahmood Naderan
I will try that. Meanwhile, I want to know what the performance effect of
disabling/enabling dlopen is.

Regards,
Mahmood



On Fri, Nov 4, 2016 at 11:02 AM, Gilles Gouaillardet 
wrote:

> Yes, that is a problem :-(
>
>
> you might want to reconfigure with
>
> --enable-static --disable-shared --disable-dlopen
>
> and see if it helps
>
>
> or you can simply manuall edit /opt/openmpi-2.0.1/share/
> openmpi/mpifort-wrapper-data.txt,
>
> and append -ldl to the libs_static definition
>
>
> Cheers,
>
>
> Gilles
>
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Gilles Gouaillardet

Yes, that is a problem :-(


you might want to reconfigure with

--enable-static --disable-shared --disable-dlopen

and see if it helps


or you can simply manually edit 
/opt/openmpi-2.0.1/share/openmpi/mpifort-wrapper-data.txt,


and append -ldl to the libs_static definition


Cheers,


Gilles

On 11/4/2016 4:13 PM, Mahmood Naderan wrote:

>did you build Open MPI as a static only library ?
Yes, I used --enable-static --disable-shared


Please see the output

# mpifort -O3 -o xCbtest --showme blacstest.o btprim.o tools.o Cbt.o 
../../libscalapack.a -ldl
gfortran -O3 -o xCbtest blacstest.o btprim.o tools.o Cbt.o 
../../libscalapack.a -ldl -I/opt/openmpi-2.0.1/include -pthread 
-I/opt/openmpi-2.0.1/lib -Wl,-rpath -Wl,/opt/openmpi-2.0.1/lib 
-Wl,--enable-new-dtags -L/opt/openmpi-2.0.1/lib -lmpi_usempif08 
-lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lm 
-lrt -lutil



I don't see what you said after "-lopen-pal". Is that OK?

Regards,
Mahmood



On Fri, Nov 4, 2016 at 10:23 AM, Gilles Gouaillardet wrote:


Mahmood,


did you build Open MPI as a static only library ?


i guess the -ldl position is wrong. your link command line should be

mpifort -O3 -o xCbtest blacstest.o btprim.o tools.o Cbt.o
../../libscalapack.a -ldl


you can manually

mpifort -O3 -o xCbtest --showme blacstest.o btprim.o tools.o Cbt.o
../../libscalapack.a -ldl


it should show -ldl is added *after* -lopen-pal


Cheers,

Gilles




___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users


___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] error on dlopen

2016-11-04 Thread Mahmood Naderan
>did you build Open MPI as a static only library ?
Yes, I used --enable-static --disable-shared


Please see the output

# mpifort -O3 -o xCbtest --showme blacstest.o btprim.o tools.o Cbt.o
../../libscalapack.a -ldl
gfortran -O3 -o xCbtest blacstest.o btprim.o tools.o Cbt.o
../../libscalapack.a -ldl -I/opt/openmpi-2.0.1/include -pthread
-I/opt/openmpi-2.0.1/lib -Wl,-rpath -Wl,/opt/openmpi-2.0.1/lib
-Wl,--enable-new-dtags -L/opt/openmpi-2.0.1/lib -lmpi_usempif08
-lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lm -lrt
-lutil


I don't see what you said after "-lopen-pal". Is that OK?

Regards,
Mahmood



On Fri, Nov 4, 2016 at 10:23 AM, Gilles Gouaillardet 
wrote:

> Mahmood,
>
>
> did you build Open MPI as a static only library ?
>
>
> i guess the -ldl position is wrong. your link command line should be
>
> mpifort -O3 -o xCbtest blacstest.o btprim.o tools.o Cbt.o
> ../../libscalapack.a -ldl
>
>
> you can manually
>
> mpifort -O3 -o xCbtest --showme blacstest.o btprim.o tools.o Cbt.o
> ../../libscalapack.a -ldl
>
> it should show -ldl is added *after* -lopen-pal
>
>
> Cheers,
>
> Gilles
>
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users