On Feb 21, 2009, at 1:05 AM, Raymond Wan wrote:


Hi Ralph,

Thank you very much for your explanation!


Ralph Castain wrote:
It is a little bit of both:
* historical, because most MPIs default to mapping by slot, and
* performance, because procs that share a node can communicate via shared memory, which is faster than sending messages over an interconnect, and most apps are communication-bound. If your app is disk-intensive, then mapping it -bynode may be a better


Ok -- by this, it seems that there is no "rule" that says one is obviously better than the other. It depends on factors such as disk access versus shared memory access, and which one dominates. So it is worth trying both to see?

Can't hurt! You might be able to tell by knowing what your app is doing, but otherwise, feel free to experiment.
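If it helps to visualize the difference, here is a small Python sketch (illustrative only -- this is not Open MPI's actual mapper code, and the node/slot lists are made up) of how "byslot" and "bynode" assign consecutive ranks:

```python
# Illustrative sketch of byslot vs bynode rank placement.
# nodes is a list of (hostname, slots) pairs; np is the rank count.

def map_byslot(nodes, np):
    """Fill each node's slots completely before moving to the next node."""
    placement = []
    for node, slots in nodes:
        for _ in range(slots):
            if len(placement) == np:
                return placement
            placement.append(node)
    return placement

def map_bynode(nodes, np):
    """Round-robin ranks across the nodes, one slot per pass."""
    placement = []
    used = {node: 0 for node, _ in nodes}
    while len(placement) < np:
        progressed = False
        for node, slots in nodes:
            if len(placement) == np:
                break
            if used[node] < slots:
                placement.append(node)
                used[node] += 1
                progressed = True
        if not progressed:
            break  # more ranks than total slots; oversubscription not modeled
    return placement

nodes = [("n1", 2), ("n2", 2)]
print(map_byslot(nodes, 4))  # ['n1', 'n1', 'n2', 'n2']
print(map_bynode(nodes, 4))  # ['n1', 'n2', 'n1', 'n2']
```

Note that with bynode and np less than the number of nodes, each node gets at most one rank, which matches the guarantee discussed below.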

option for you. That's why we provide it. Note, however, that you can still wind up with multiple procs on a node. All "bynode" means is that the ranks are numbered consecutively by node - it doesn't mean that there is only one proc per node.



I see. But if the number of processes (as specified with -np) is less than the number of nodes and "bynode" is chosen, is it guaranteed that only one process will run on each node?

That is correct

Is there a way to write the hostfile to ensure this?

You don't need to do anything in the hostfile - if you use bynode and np < #nodes, it is guaranteed that you will have only one proc/node


I was curious whether, if a node has 4 slots, writing it 4 times in the hostfile with 1 slot each has any meaning. Or might that be a bad idea, since we would be trying to fool mpirun?

It won't have any meaning as we aggregate the results. In other words, we read through the hostfile, and if a host appears more than once, we simply add the #slots on subsequent entries to the earlier one. So we wind up with just one instance of that host that has the total number of slots allocated to it.
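The aggregation behavior described above can be sketched in a few lines of Python (a hypothetical illustration of the described merging, not Open MPI's actual hostfile parser):

```python
# Duplicate hostfile entries are merged by summing their slot counts,
# preserving the order in which hosts first appear.
from collections import OrderedDict

def aggregate_hostfile(entries):
    """entries: list of (host, slots) pairs as read line-by-line."""
    hosts = OrderedDict()
    for host, slots in entries:
        hosts[host] = hosts.get(host, 0) + slots
    return list(hosts.items())

# "node1 slots=1" written four times collapses to one entry with 4 slots:
print(aggregate_hostfile([("node1", 1)] * 4))  # [('node1', 4)]
```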

If you truly want one proc/node, then you should use the -pernode option. This maps one proc on each node up to either the number of procs you specified or the number of available nodes. If you don't specify -np, we just put one proc on each node in your allocation/hostfile.


I see ... I was not aware of that option; thank you!

Do a "man mpirun" and you will see that there are several mapping options that might interest you, including:

1. npernode - lets you specify how many procs/node (as opposed to "pernode", where you only get one proc/node - obviously, pernode is the equivalent of "-npernode 1")

2. seq - a sequential mapper. This mapper will read a file (which can be different from the hostfile used to specify your allocation) and assign one proc to each entry in a sequential manner like this:

node1 ----> rank 0 goes on node1
node5 ----> rank 1 goes on node5
node1 ----> rank 2 goes on node1
...

3. rank_file - allows you to specify that rank x goes on node foo, and what core/socket that rank should be bound to

The man page will describe all the various options. Which one is best for your app really depends on what the app is doing, the capabilities and topology of your cluster, etc. A little experimentation can help you get a feel for when to use which one.
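For completeness, the npernode/pernode behavior from option 1 above can be sketched the same way (again a hypothetical illustration, not Open MPI code): cap the number of procs placed on each node, with pernode being the cap-of-one case.

```python
# Place at most n_per_node ranks on each node, in node order,
# until np ranks are placed or the nodes run out.

def map_npernode(nodes, n_per_node, np):
    placement = []
    for node in nodes:
        for _ in range(n_per_node):
            if len(placement) == np:
                return placement
            placement.append(node)
    return placement

# -npernode 2 with 5 ranks over three nodes:
print(map_npernode(["n1", "n2", "n3"], 2, 5))  # ['n1', 'n1', 'n2', 'n2', 'n3']
# -pernode is the n_per_node=1 case:
print(map_npernode(["n1", "n2", "n3"], 1, 3))  # ['n1', 'n2', 'n3']
```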

HTH
Ralph



Ray



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
