Thanks for the quick and helpful response.I'm willing to make the changes if only I were more confident I understood the issues.
What you call the first usage of num_procs_alive seems to count "local procs alive or about to be launched". It tests on "child->alive || opal_dss.compare(&job, &(child->name->jobid), ORTE_JOBID))". This is only in the !oversubscribed case.
What you call "the later usage" ("0 < opal_sys_limits.num_procs" or "check the system limits") seems to want only the number of alive procs (not including any that are about to be launched). So, as I guess you were driving at, a fresh computation of num_procs_alive seems to be needed at this point, both in v1.5 and in the trunk.
There is actually a third usage of num_procs_alive ("available file descriptors" or "0 < opal_sys_limits.num_files"). I think there are two problems here. One is that the number of procs we're about to launch is not accounted for. Secondly, and perhaps most importantly, if we "wait a little time" and recompute num_procs_alive, we loop over item/child. This is bad since we are already inside an item/child loop! (This is actually the origin of my foray into this mess... MPI_Comm_spawn_multiple gets screwed up by this reuse of variable names.)
I'm attaching an original (trunk) file, a proposed version, and the diff file. Let me know if it looks sane. If so, I'll try it out and put it back and then file a CMR for 1.5.
Some more below... On 07/12/11 18:19, Ralph Castain wrote:
Well, yes and no. Yes - the value definitely needs to be computed in all cases since it is used later on. No - the way it is computed is only correct for that one usage. The later usage (in the block that starts with 0< opal_sys_limits.num_procs)) needs a completely different value. The current computation only comes up with the number of procs for the specific job being launched. The latter code requires the total number of local procs that are alive - a rather different value.
The first usage, to compare with num_processors, includes not only procs for the job being launched but also already-alive procs.
The second usage, to be compared to opal_sys_limits.num_procs, should only have alive procs since app->num_procs will be added later. (Plus, we try to keep a running total of this number of alive procs.)
The third usage, to be compared to opal_sys_limits.num_files, should also only have alive procs for the same reasons.
So v1.5 is incorrect in either case. Would you like me to create a patch to correct it, or would you like to do it (covering both cases)?
Trunk also appears to be incorrect.
On Jul 12, 2011, at 3:35 PM, Eugene Loh wrote:The function orte_odls_base_default_launch_local() has a variable num_procs_alive that is basically initialized like this: if ( oversubscribed ) { ... } else { num_procs_alive = ...; } Specifically, if the "oversubscribed" test passes, the variable is not initialized. (Strictly speaking, this is true only in v1.5. In the trunk, the variable is set to 0 when it is declared, but I'm not sure that's very helpful.) I'm inclined to move the num_procs_alive computation ahead of the "if" block so that this computation is always performed.
files.tar.gz
Description: GNU Zip compressed data