On Dec 5, 2007, at 11:23 AM, Ralph H Castain wrote:

Well, I think it is pretty obvious that I am a fan of an attribute system :)

For completeness, I will point out that we also exchange architecture
and hostname info in the modex.

True - except we should note that hostname info is only exchanged if someone
specifically requests it.

Note that I am a fan of *always* exchanging the hostname information.

I say this because multiple Cisco customers have told us that this is invaluable debugging information: when a BTL fails to send a message, for example, we specifically put in the error message "hostA tried to send to hostB and failed" (vs. "communicator X rank Y tried to send to rank Z"). System administrators want/need the actual hostnames in order to [greatly] simplify the process of determining whether there is a problem in the fabric and, if so, where it is.

This is especially important for very large fabrics.
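
As a minimal sketch of the kind of message described above (the function name and signature are hypothetical, not taken from any BTL), the hostname-based error text could be built like this:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a send-failure message in terms of
 * hostnames rather than communicator ranks. Returns the snprintf
 * result (number of characters that would have been written). */
static int example_format_send_error(char *buf, size_t len,
                                     const char *src_host,
                                     const char *dst_host)
{
    /* Fall back to a placeholder if a hostname was never exchanged. */
    const char *src = (src_host != NULL) ? src_host : "(unknown host)";
    const char *dst = (dst_host != NULL) ? dst_host : "(unknown host)";

    return snprintf(buf, len, "%s tried to send to %s and failed", src, dst);
}
```

The placeholder branch illustrates why the message depends on hostnames always being exchanged: without them, the text carries no location information at all.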

Do we really need a complete node map? As far as I can tell, it looks
like the MPI layer only needs a list of local processes. So maybe it
would be better to forget about the node ids at the MPI layer and just
return the local procs.

I agree, though I don't think we want a parallel list of procs. We just need
to set the "local" flag in the existing ompi_proc_t structures.

I agree that the desired end result is that we need that "local" flag set in the relevant ompi_proc_t's.

As previously implied: strcmp'ing hostnames is not always sufficient (e.g., on the Cray). Hence, sending hostnames around is useful for the reasons I cited above, but it may not be sufficient for what is needed.
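
For illustration only, here is a sketch of the hostname-comparison approach to setting the "local" flag. The structure below is a hypothetical stand-in and not the real ompi_proc_t; the field and function names are assumptions. As noted above, this comparison alone may not be enough in some environments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ompi_proc_t; the real structure differs. */
typedef struct {
    int         rank;
    const char *hostname;   /* as received in the exchange, may be NULL */
    bool        is_local;   /* the "local" flag under discussion */
} example_proc_t;

/* Set is_local on every proc whose hostname equals my_hostname.
 * Returns the number of procs marked local. */
static size_t example_mark_local_by_hostname(example_proc_t *procs,
                                             size_t nprocs,
                                             const char *my_hostname)
{
    size_t nlocal = 0;

    for (size_t i = 0; i < nprocs; ++i) {
        procs[i].is_local = (procs[i].hostname != NULL &&
                             strcmp(procs[i].hostname, my_hostname) == 0);
        if (procs[i].is_local) {
            ++nlocal;
        }
    }
    return nlocal;
}
```

No parallel list of procs is built here; the flag is set in place on the existing entries, which matches the end result both sides agree on.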

So my vote would be to leave the modex alone, but remove the node id,
and add a function to get the list of local procs. It doesn't matter to
me how the RTE implements that.

I think we would need to be careful here that we don't create a need for
more communication. We have two functions currently in the modex:

1. how to exchange the info required to populate the ompi_proc_t structures;
and

2. how to identify which of those procs are "local"

The problem with leaving the modex as it currently sits is that some
environments require a different mechanism for exchanging the ompi_proc_t
info. While most can use the RML, some can't. The same division of
capabilities applies to getting the "local" info, so it makes sense to me to
put the modex in a framework.

Otherwise, we wind up with a bunch of #if's in the code to support
environments like the Cray. I believe the MCA system was put in place
precisely to avoid those kinds of practices, so it makes sense to me to take
advantage of it.

FWIW, I'm very against putting #if's in the code for specific architectures / RTEs. Such differences are what the MCA is for.
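
To make the contrast with #if's concrete, here is a rough sketch of the framework idea: each environment supplies a component with the same two entry points (exchange the proc info, identify the local procs), and one is selected by name at run time. This is not the actual MCA API; every type, function, and component name below is hypothetical, and the function bodies are placeholders.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical component interface covering the two modex functions. */
typedef struct {
    const char *name;
    int (*exchange_proc_info)(void);  /* populate the proc structures */
    int (*mark_local_procs)(void);    /* set the "local" flag */
} example_modex_component_t;

/* Placeholder implementations; real ones would do the actual work. */
static int example_rml_exchange(void)    { return 0; }
static int example_rml_mark_local(void)  { return 0; }
static int example_cray_exchange(void)   { return 0; }
static int example_cray_mark_local(void) { return 0; }

static const example_modex_component_t example_components[] = {
    { "rml",  example_rml_exchange,  example_rml_mark_local  },
    { "cray", example_cray_exchange, example_cray_mark_local },
};

/* Pick a component by name at run time; returns NULL if none matches. */
static const example_modex_component_t *example_modex_select(const char *name)
{
    size_t n = sizeof(example_components) / sizeof(example_components[0]);

    for (size_t i = 0; i < n; ++i) {
        if (strcmp(example_components[i].name, name) == 0) {
            return &example_components[i];
        }
    }
    return NULL;
}
```

The calling code only ever sees the function pointers, so environment-specific logic stays inside its own component instead of being scattered through preprocessor conditionals.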

Alternatively, if we did a process attribute system we could just use
predefined attributes, and the runtime can get each process's node id
however it wants.

Same problem as above, isn't it? Probably ignorance on my part, but it seems to me that we simply exchange a modex framework for an attribute framework
(since each environment would have to get the attribute values in a
different manner) - don't we?

I have no problem with using attributes instead of the modex, but the issue appears to be the same either way - you still need a framework to handle the
different methods.

I agree -- I don't see the difference. Tim -- can you explain? (I also didn't quite understand your statement about being a fan of attribute systems; other than it being an ASCII system with a flat namespace [why is a flat namespace good, btw?], I don't really see how it's significantly different than the modex principle...?)

--
Jeff Squyres
Cisco Systems
