On Dec 5, 2007, at 11:23 AM, Ralph H Castain wrote:
Well, I think it is pretty obvious that I am a fan of an attribute
system :)
For completeness, I will point out that we also exchange architecture
and hostname info in the modex.
True - except we should note that hostname info is only exchanged if
someone specifically requests it.
Note that I am a fan of *always* exchanging the hostname information.
I say this because multiple Cisco customers have told us that this is
invaluable debugging information: when a BTL fails to send a message,
for example, we specifically put in the error message "hostA tried to
send to hostB and failed" (vs. "communicator X rank Y tried to send to
rank Z"). System administrators want/need the actual hostnames in
order to [greatly] simplify troubleshooting: determining whether there
is a problem in the fabric and, if so, where it is.
This is especially important for very large fabrics.
Do we really need a complete node map? As far as I can tell, it looks
like the MPI layer only needs a list of local processes. So maybe it
would be better to forget about the node ids at the MPI layer and just
return the local procs.
I agree, though I don't think we want a parallel list of procs. We just
need to set the "local" flag in the existing ompi_proc_t structures.
I agree that the desired end result is that we need that "local" flag
set in the relevant ompi_proc_t's.
As previously implied: strcmp'ing hostnames is not always sufficient
(e.g., on the cray). Hence, sending hostnames around is useful for
the reasons I cited above, but it may not be sufficient for what is
needed.
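
To make the "local" flag idea concrete, here is a minimal sketch in C.
It assumes the RTE hands back a list of local ranks by whatever means it
likes (a hostname comparison, a Cray-specific query, etc.), and the MPI
layer only sets a flag on its existing proc entries. sketch_proc_t and
sketch_mark_local are hypothetical names for illustration; the real
ompi_proc_t has a different layout, and nothing here is actual Open MPI
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ompi_proc_t (assumption: the real struct differs). */
typedef struct {
    uint32_t rank;      /* rank of the process within the job */
    bool     is_local;  /* the "local" flag discussed in this thread */
} sketch_proc_t;

/*
 * Set is_local on each proc whose rank appears in local_ranks, and clear
 * it on all others. local_ranks is assumed to come from the RTE; how the
 * RTE computes it is deliberately out of scope. Returns the number of
 * procs marked local.
 */
static size_t sketch_mark_local(sketch_proc_t *procs, size_t nprocs,
                                const uint32_t *local_ranks, size_t nlocal)
{
    size_t marked = 0;

    assert(procs != NULL || nprocs == 0);
    assert(local_ranks != NULL || nlocal == 0);

    for (size_t i = 0; i < nprocs; ++i) {
        procs[i].is_local = false;
        for (size_t j = 0; j < nlocal; ++j) {
            if (procs[i].rank == local_ranks[j]) {
                procs[i].is_local = true;
                ++marked;
                break;
            }
        }
    }
    return marked;
}
```

The point of the sketch is only that no parallel list of procs and no
node id needs to live at the MPI layer; the flag is derived from whatever
the RTE reports.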
So my vote would be to leave the modex alone, but remove the node id,
and add a function to get the list of local procs. It doesn't matter to
me how the RTE implements that.
I think we would need to be careful here that we don't create a need for
more communication. We have two functions currently in the modex:
1. how to exchange the info required to populate the ompi_proc_t
   structures; and
2. how to identify which of those procs are "local"
The problem with leaving the modex as it currently sits is that some
environments require a different mechanism for exchanging the
ompi_proc_t info. While most can use the RML, some can't. The same
division of capabilities applies to getting the "local" info, so it
makes sense to me to put the modex in a framework.
Otherwise, we wind up with a bunch of #if's in the code to support
environments like the Cray. I believe the MCA system was put in place
precisely to avoid those kinds of practices, so it makes sense to me to
take advantage of it.
FWIW, I'm very against putting #if's in the code for specific
architectures / RTE's. Such differences are what the MCA is for.
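
As a rough illustration of the #if-versus-framework point, the sketch
below selects one of several implementations at run time through a table
of function pointers rather than through preprocessor conditionals. This
is not the real MCA API: every name here (sketch_modex_component_t,
sketch_select, the "rml" and "native" components) is made up for the
example, and the component bodies are stubs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which exchange path a (stub) component took; for illustration only. */
typedef enum { SKETCH_VIA_RML, SKETCH_VIA_NATIVE } sketch_path_t;

/* Hypothetical component interface covering the two modex functions. */
typedef struct {
    const char    *name;
    sketch_path_t (*exchange)(void);    /* populate proc info */
    int           (*mark_local)(void);  /* identify local procs; 0 on success */
} sketch_modex_component_t;

/* Stub for environments that can use the RML. */
static sketch_path_t rml_exchange(void) { return SKETCH_VIA_RML; }
static int rml_mark_local(void) { return 0; }

/* Stub for environments that need their own mechanism (e.g., a Cray). */
static sketch_path_t native_exchange(void) { return SKETCH_VIA_NATIVE; }
static int native_mark_local(void) { return 0; }

static const sketch_modex_component_t sketch_components[] = {
    { "rml",    rml_exchange,    rml_mark_local },
    { "native", native_exchange, native_mark_local },
};

/* Look up a component by name; returns NULL if none matches. */
static const sketch_modex_component_t *sketch_select(const char *name)
{
    size_t n = sizeof(sketch_components) / sizeof(sketch_components[0]);

    assert(name != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(sketch_components[i].name, name) == 0) {
            return &sketch_components[i];
        }
    }
    return NULL;
}
```

The calling code stays identical for every environment; only the
selected table entry differs, which is the property being argued for
here.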
Alternatively, if we did a process attribute system we could just use
predefined attributes, and the runtime can get each process's node id
however it wants.
Same problem as above, isn't it? Probably ignorance on my part, but it
seems to me that we simply exchange a modex framework for an attribute
framework (since each environment would have to get the attribute
values in a different manner) - don't we?
I have no problem with using attributes instead of the modex, but the
issue appears to be the same either way - you still need a framework to
handle the different methods.
I agree -- I don't see the difference. Tim -- can you explain? (I
also didn't quite understand your statement about being a fan of
attribute systems; other than it being an ASCII system with a flat
namespace [why is a flat namespace good, btw?], I don't really see how
it's significantly different than the modex principle...?)
--
Jeff Squyres
Cisco Systems