I may be opening a can of worms...
But what prevents a user from running across clusters in a "normal
OMPI", i.e., a non-ALPS environment? When he puts hosts into his
hostfile, does it parse and abort/filter non-matching hostnames? The
problem for ALPS-based systems is that nodes are addressed via NID,PID
pairs at the portals level; thus, these are unique only within a
cluster. In point of fact, I could rewrite all of the ALPS support to
identify the nodes by "cluster_id".NID. It would be a bit inefficient
within a cluster, because we would have to extract the NID from this
syntax as we go down to the portals layer. It also would lead to a
larger degree of change within the OMPI ALPS code base. However, I can
give ALPS-based systems the same feature set as the rest of the world.
It is just more efficient to use an additional pointer in the
orte_node_t structure, and it results in a far simpler code structure,
which makes it easier to maintain.
The only thing that "this change" really does is to identify the
cluster under which the ALPS allocation is made. If you are addressing
a node in another cluster (e.g., via accept/connect), the
clustername/NID pair is unique for ALPS, just as a hostname on a
cluster node is unique between clusters. If you do a gethostname() on
a normal cluster node, you are going to get mynameNNNNN, or something
similar. If you do a gethostname() on an ALPS node, you are going to
get nidNNNNN; there is no differentiation between cluster A and
cluster B.
Perhaps my earlier comment was not accurate. In reality, this change
provides the same degree of identification for ALPS nodes as a
hostname provides for normal clusters. From your perspective, it is
immaterial that it would also allow us to support our limited form of
multi-cluster support. However, in and of itself, it only provides the
same level of identification as is done for other cluster nodes.
--
Ken
-----Original Message-----
From: Ralph Castain [mailto:r...@lanl.gov]
Sent: Monday, September 22, 2008 2:33 PM
To: Open MPI Developers
Cc: Matney Sr, Kenneth D.
Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r19600
The issue isn't with adding a string. The question is whether or not
OMPI is to support one job running across multiple clusters. We made a
conscious decision (after lengthy discussions on OMPI core and ORTE
mailing lists, plus several telecons) to not do so - we require that
the job execute on a single cluster, while allowing connect/accept to
occur between jobs on different clusters.
It is difficult to understand why we need a string (or our old "cell
id") to tell us which cluster we are on if we are only following that
operating model. From the commit comment, and from what I know of the
system, the only rationale for adding such a designator is to shift
back to the one-mpirun-spanning-multiple-cluster model.
If we are now going to make that change, then it merits a similar
level of consideration as the last decision to move away from that
model. Making that move involves considerably more than just adding a
cluster id string. You may think that now, but the next step is
inevitably to bring back remote launch, killing jobs on all clusters
when one cluster has a problem, etc.
Before we go down this path and re-open Pandora's box, we should at
least agree that is what we intend to do...or agree on what hard
constraints we will place on multi-cluster operations. Frankly, I'm
tired of bouncing back-and-forth on even the most basic design
decisions.
Ralph
On Sep 22, 2008, at 11:55 AM, Richard Graham wrote:
What Ken put in is what is needed for the limited multi-cluster
capabilities we need: just one additional string. I don't think there
is a need for any discussion of such a small change.
Rich
On 9/22/08 1:32 PM, "Ralph Castain" <r...@lanl.gov> wrote:
We really should discuss that as a group first - there is quite a bit
of code required to actually support multi-clusters that has been
removed.
Our operational model that was agreed to quite a while ago is that
mpirun can -only- extend over a single "cell". You can connect/accept
multiple mpiruns that are sitting on different cells, but you cannot
execute a single mpirun across multiple cells.
Please keep this on your own development branch for now. Bringing it
into the trunk will require discussion, as this changes the operating
model and has significant code consequences when we look at abnormal
terminations, comm_spawn, etc.
Thanks
Ralph
On Sep 22, 2008, at 11:26 AM, Richard Graham wrote:
This check-in was in error - I had not realized that the checkout was
from the 1.3 branch, so we will fix this and put these into the trunk
(1.4). We are going to bring in some limited multi-cluster support -
limited is the operative word.
Rich
On 9/22/08 12:50 PM, "Jeff Squyres" <jsquy...@cisco.com> wrote:
I notice that Ken Matney (the committer) is not on the devel list; I
added him explicitly to the CC line.
Ken: please see below.
On Sep 22, 2008, at 12:46 PM, Ralph Castain wrote:
Whoa! We made a decision NOT to support multi-cluster apps in OMPI
over a year ago!
Please remove this from 1.3 - we should discuss if/when this would
even be allowed in the trunk.
Thanks
Ralph
On Sep 22, 2008, at 10:35 AM, mat...@osl.iu.edu wrote:
Author: matney
Date: 2008-09-22 12:35:54 EDT (Mon, 22 Sep 2008)
New Revision: 19600
URL: https://svn.open-mpi.org/trac/ompi/changeset/19600
Log:
Added member to orte_node_t to enable multi-cluster jobs in ALPS
scheduled systems (like Cray XT).
Text files modified:
branches/v1.3/orte/runtime/orte_globals.h | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
Modified: branches/v1.3/orte/runtime/orte_globals.h
==============================================================================
--- branches/v1.3/orte/runtime/orte_globals.h (original)
+++ branches/v1.3/orte/runtime/orte_globals.h 2008-09-22 12:35:54 EDT (Mon, 22 Sep 2008)
@@ -222,6 +222,10 @@
/** Username on this node, if specified */
char *username;
char *slot_list;
+ /** Clustername (machine name of cluster) on which this node
+ resides. ALPS scheduled systems need this to enable
+ multi-cluster support. */
+ char *clustername;
} orte_node_t;
ORTE_DECLSPEC OBJ_CLASS_DECLARATION(orte_node_t);
_______________________________________________
svn mailing list
s...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/svn
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel