environment-modules)

Ralph Castain Tue, 17 Nov 2009 22:13:48 -0500

Sorry I didn't answer more completely before - a tad tied up today with network 
problems :-/


Actually, both you and Michael pointed out the "flaw" in your own reasoning, 
and hit the reason why we -don't- forward environment. It is obvious, for 
example, that you don't want to forward HOSTNAME and DISPLAY. But how is OMPI 
supposed to know precisely what -is- and -isn't- safe to forward?

Do we grab a sample based on what we developers think? What happens when the 
sys admin of a cluster sets up the environment with site-specific variables on 
the head node that must not be forwarded to the backend compute nodes? We had 
plenty of those at my former employer, and it isn't an uncommon situation. So 
how does OMPI identify and avoid those?

This is why we don't forward anything we are not specifically told to forward. 
The code that crept into the Torque launcher is incorrect and can cause a lot 
of problems. For one thing, it makes the remote processes think they are 
running on a node with the wrong name!

Unfortunately, that code was copy/pasted from a different launcher that 
fork/exec's a local cmd to launch the daemons. In that scenario, passing a copy 
of mpirun's environment to execve is fine - the environment is not passed to 
anything remote. tm_spawn is a different story.

I will consult with other developers, but I do believe the right answer is to 
not forward the entire environment for the previously identified reasons, and 
to implement the "forward all except these" as an alternative to the current 
"-x" option. I'll pass along our decision about what to do with the current 
code.
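
To make the "forward all except these" idea concrete, here is a rough sketch 
in C. It is not existing OMPI code: copy_env_except, free_env and the 
exclusion list are hypothetical names used only for illustration, and it 
assumes the environment is a NULL-terminated array of "NAME=value" strings, 
as with environ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return true if the "NAME=value" entry names one of the excluded
 * variables. The exclude list is NULL-terminated and holds bare names. */
static bool is_excluded(const char *entry, const char *const *exclude)
{
    const char *eq = strchr(entry, '=');
    size_t name_len = eq ? (size_t)(eq - entry) : strlen(entry);

    for (size_t i = 0; exclude[i] != NULL; i++) {
        /* Compare whole names so HOSTNAME does not match HOSTNAME_X. */
        if (strlen(exclude[i]) == name_len &&
            strncmp(entry, exclude[i], name_len) == 0) {
            return true;
        }
    }
    return false;
}

/* Free a NULL-terminated array of heap-allocated strings. */
void free_env(char **env)
{
    if (env == NULL) {
        return;
    }
    for (size_t i = 0; env[i] != NULL; i++) {
        free(env[i]);
    }
    free(env);
}

/* Hypothetical helper: copy env, skipping every variable named in
 * exclude. Returns a new NULL-terminated array, or NULL if an
 * allocation fails. */
char **copy_env_except(char *const *env, const char *const *exclude)
{
    assert(env != NULL && exclude != NULL);

    size_t n = 0;
    while (env[n] != NULL) {
        n++;
    }

    /* calloc zero-fills, so the array is NULL-terminated at all times. */
    char **out = calloc(n + 1, sizeof(*out));
    if (out == NULL) {
        return NULL;
    }

    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_excluded(env[i], exclude)) {
            continue;
        }
        size_t len = strlen(env[i]) + 1;
        out[kept] = malloc(len);
        if (out[kept] == NULL) {
            free_env(out);
            return NULL;
        }
        memcpy(out[kept], env[i], len);
        kept++;
    }
    return out;
}
```

A launcher could then pass the filtered copy, rather than a full copy of 
mpirun's environment, as the env argument when spawning remote daemons. 
Which names belong on the exclusion list (HOSTNAME, DISPLAY, site-specific 
variables) is exactly the part that would have to come from the user or the 
sys admin.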

HTH
Ralph

On Nov 17, 2009, at 1:49 PM, David Singleton wrote:

> 
> Hi Ralph,
> 
> Now I'm in a quandary - if I show you that it's actually Open MPI that is
> propagating the environment then you are likely to "fix it" and then tm
> users will lose a nice feature.  :-)
> 
> Can I suggest that "least surprise" would require that MPI tasks get
> exactly the same environment/limits/... as mpirun so that "mpirun a.out"
> behaves just like "a.out".  [Following this principle we modified
> tm_spawn to propagate the callers rlimits to the spawned tasks.]
> A comment in orterun.c (see below) suggests that Open MPI is trying
> to distinguish between "local" and "remote" processes.  I would have
> thought that distinction should be invisible to users as much as possible
> - a user asking for 4 cpus would like to see the same behaviour if all
> 4 are local or "2 local, 2 remote".
> 
> As to why tm does "The Right Thing": in the case of rsh/ssh the full
> mpirun environment is given to the rsh/ssh process locally while in the tm
> case it is an argument to tm_spawn and so gets given to the process (in
> this case orted) being launched remotely. Relevant lines from 1.3.3 below.
> PBS just passes along the environment it is told to.  We don't use Torque
> but as of 2.3.3, it was still the same as OpenPBS in this respect.
> 
> Michael just pointed out the slight flaw.  The environment should be
> somewhat selectively propagated (exclude HOSTNAME etc).  I guess if you
> were to "fix" plm_tm_module I would put the propagation behaviour in
> tm_spawn and try to handle these exceptional cases.
> 
> Cheers,
> David
> 
> 
> orterun.c:
> 
>    510     /* save the environment for launch purposes. This MUST be
>    511      * done so that we can pass it to any local procs we
>    512      * spawn - otherwise, those local procs won't see any
>    513      * non-MCA envars were set in the enviro prior to calling
>    514      * orterun
>    515      */
>    516     orte_launch_environ = opal_argv_copy(environ);
> 
> 
> plm_rsh_module.c:
> 
>    681 /* actually ssh the child */
>    682 static void ssh_child(int argc, char **argv,
>    683                       orte_vpid_t vpid, int proc_vpid_index)
>    684 {
> 
>    694     /* setup environment */
>    695     env = opal_argv_copy(orte_launch_environ);
> 
>    766     execve(exec_path, exec_argv, env);
> 
> 
> plm_tm_module.c:
> 
>    128 static int plm_tm_launch_job(orte_job_t *jdata)
>    129 {
> 
>    228     /* setup environment */
>    229     env = opal_argv_copy(orte_launch_environ);
> 
>    311     rc = tm_spawn(argc, argv, env, node->launch_id,
>                          tm_task_ids + launched, tm_events + launched);
> 
> 
> 
> Ralph Castain wrote:
>> Not exactly. It completely depends on how Torque was set up - OMPI isn't 
>> forwarding the environment. Torque is.
>>
>> We made a design decision at the very beginning of the OMPI project not to 
>> forward non-OMPI envars unless directed to do so by the user. I'm afraid I 
>> disagree with Michael's claim that other MPIs do forward them - yes, MPICH 
>> does, but not all others do.
>>
>> The world is bigger than MPICH and OMPI :-)
>>
>> Since there is inconsistency in this regard between MPIs, we chose not to 
>> forward. Reason was simple: there is no way to know what is safe to forward 
>> vs what is not (e.g., what to do with DISPLAY), nor what the underlying 
>> environment is trying to forward vs what it isn't. It is very easy to get 
>> cross-wise and cause totally unexpected behavior, as users have complained 
>> about for years.
>>
>> First, if you are using a managed environment like Torque, we recommend that 
>> you work with your sys admin to decide how to configure it. This is the best 
>> way to resolve a problem.
>>
>> Second, if you are not using a managed environment and/or decide not to have 
>> that environment do the forwarding, you can tell OMPI to forward the envars 
>> you need by specifying them via the -x cmd line option. We already have a 
>> request to expand this capability, and I will be doing so as time permits. 
>> One option I'll be adding is the reverse of -x - i.e., "forward all envars 
>> -except- the specified one(s)".
>>
>> HTH
>> ralph

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)
