On Fri, 2010-01-22 at 08:12 -0700, Ralph Castain wrote:
> For SLURM, there is a config file where you can specify what gets propagated. 
> It is clearly an error to include hostname as it messes many things up, not 
> just OMPI. Frankly, I've never seen someone do that on SLURM.
> 
I'm going to check that.

Thanks,
Nadia

> I believe in this case OMPI is likely incorrectly picking up the environment 
> and propagating it. We know this is incorrectly happening on Torque, and it 
> appears to also be happening on SLURM. This is a bug that I will be fixing on 
> Torque - and as soon as Nadia confirms, on SLURM as well.
> 
> I know that on Torque it was an innocent mistake where a line got added to 
> the launch code that shouldn't have...
> 
> On Jan 22, 2010, at 8:07 AM, N.M. Maclaren wrote:
> 
> > On Jan 22 2010, Nadia Derbey wrote:
> >> 
> >> I'm wondering whether the HOSTNAME environment variable shouldn't be
> >> handled as a "special case" when the orted daemons launch the remote
> >> jobs. This particularly applies to batch schedulers where the caller's
> >> environment is copied to the remote job: we are inheriting a $HOSTNAME
> >> which is the name of the host mpirun was called from:
> > 
> > This is slightly orthogonal, but relevant.
> > 
> > This is an ancient mess with propagating environment variables, and predates
> > MPI by many years.  The most traditional form was the demented connexion
> > protocols that propagated TERM - truly wonderful when logging in from SunOS
> > to HP-UX!  Whether it is worth kludging up one variable and leaving the rest
> > is unclear.
> > 
> > Even if systems are fairly homogeneous, it is common for the head node to
> > have a different set of standard values from the others.  TMPDIR is one
> > very common one, but any of the dozen or so path variables is likely to
> > vary, at least sometimes, as are many of the others.
> > 
> > I used to have to write the most DISGUSTING hacks to stop unwanted export
> > when I managed our supercomputer.  Yet there are other systems that will
> > work only if you DO export environment variables.  And there are systems
> > where the secondary nodes aren't real systems, and using the parent hostname
> > would be better, though I haven't managed any.
> > 
> > Realistically, there should be some kind of hook to control which
> > are transferred and which are not.  I haven't found one - if there is, it's
> > a better way to tackle this.
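
A hook of the kind asked for above could look roughly like the following
minimal Python sketch. It is not Open MPI, SLURM or Torque code: the name
filter_env, its exclude/include_only parameters and the default exclusion
set are all hypothetical. Only the variable names HOSTNAME, TERM and TMPDIR
are taken from this thread.

```python
import os

# Hypothetical default: variables this thread names as host-specific.
# Which variables a real site should drop is a local policy decision.
HOST_SPECIFIC = frozenset({"HOSTNAME", "TERM", "TMPDIR"})


def filter_env(env, exclude=HOST_SPECIFIC, include_only=None):
    """Return a filtered copy of env for a remote launch.

    If include_only is given, only those variables are kept (for systems
    that work only when selected variables ARE exported). Otherwise every
    variable is kept except those listed in exclude.
    The input mapping is never modified.
    """
    if include_only is not None:
        return {k: v for k, v in env.items() if k in include_only}
    return {k: v for k, v in env.items() if k not in exclude}


if __name__ == "__main__":
    # Show which variables of the caller's environment would be withheld.
    remote_env = filter_env(dict(os.environ))
    print(sorted(set(os.environ) - set(remote_env)))
```

The two modes mirror the two situations described above: an exclusion list
for sites that must stop unwanted export, and an explicit inclusion list for
systems that depend on certain variables being exported.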
> > 
> > Regards,
> > Nick Maclaren.
> > 
> > 
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
-- 
Nadia Derbey <nadia.der...@bull.net>
