Okay, George - this is fixed in r21690.
Thanks again
Ralph
On Jul 15, 2009, at 2:40 PM, Ralph Castain wrote:
Ah - interesting scenario!
Definitely a "bug" in the code, then. What it looks like, though, is
that the jdata->num_procs is wrong. There shouldn't be any way that
the num_procs in
Found the bug - we indeed failed to update the jdata->num_procs field
when adding the non-rf-mapped procs to the job.
Fix coming shortly.
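
For anyone following along, here is a minimal sketch of the invariant in question: the job-wide count should always equal the sum of the per-node counts. The types and function names below (sketch_node_t, sketch_job_t, sketch_add_proc, and so on) are hypothetical stand-ins made up for illustration. They are not the actual ORTE structures or the code committed for the fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real ORTE data structures. */
typedef struct {
    int num_procs;            /* procs mapped onto this node */
} sketch_node_t;

typedef struct {
    int num_procs;            /* job-wide total, analogous to jdata->num_procs */
    sketch_node_t *nodes;     /* nodes used by this job */
    size_t num_nodes;
} sketch_job_t;

/* Add one proc to a node and keep the job-wide total in step. */
static void sketch_add_proc(sketch_job_t *job, size_t node_idx)
{
    assert(node_idx < job->num_nodes);
    job->nodes[node_idx].num_procs++;
    job->num_procs++;         /* the kind of update described as missing */
}

/* Sum the per-node proc counts. */
static int sketch_sum_node_procs(const sketch_job_t *job)
{
    int total = 0;
    for (size_t i = 0; i < job->num_nodes; i++) {
        total += job->nodes[i].num_procs;
    }
    return total;
}

/* Return nonzero when the per-node counts agree with the job-wide total. */
static int sketch_counts_consistent(const sketch_job_t *job)
{
    return sketch_sum_node_procs(job) == job->num_procs;
}
```

If a proc is added to a node without going through something like sketch_add_proc, the two counts drift apart and sketch_counts_consistent returns 0, which is the kind of mismatch discussed in this thread.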
Ah - interesting scenario!
Definitely a "bug" in the code, then. What it looks like, though, is that
the jdata->num_procs is wrong. There shouldn't be any way that the num_procs
in the node array is different than jdata->num_procs.
My guess is that the rank_file mapper isn't correctly maintaining
I think I found a better solution (in r21688). Here is what I was
trying to do.
I have a more or less homogeneous cluster. In fact all processors are
identical, except that some are quad core and some dual core. Of
course I care how my processes are mapped on the quad cores, but not
reall
The routed comm system relies on each daemon having complete information as
to where every process is located, so the expectation was that only full
maps would ever be sent. Thus, the nidmap code is set up to always send a
full map.
I don't know how to even generate a "partial" map. I assume you ar
I have a question regarding the mapping. How can I declare a partial
mapping? In fact I only care about how some of the processes are
mapped on some specific nodes. Right now if the rmaps doesn't contain
information about all nodes, we give up (before this patch we
segfaulted).
Does it m