On Mar 3, 2017, at 3:05 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> 
> bottom line, MPI tasks that run on the node where mpirun was invoked inherit
> the core file size limit from mpirun, whereas tasks that run on the other nodes
> use the default core file size limit.

I am not sure that this is inconsistent.  This is quite similar to -- at least 
with ssh -- how environment variables can propagate (or not).

With SSH:
- environment variables:
  - propagate from mpirun's environment if the user specifies -x on the mpirun 
command line
  - are set per the user's shell startup files on each node
- process limits (e.g., corefile size):
  - are set per ssh's defaults and the user's shell startup files on each node
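
For example, the difference is easy to see from the command line (node1/node2
below are just placeholder hostnames; this is only a sketch):

  # environment variables only reach the remote ranks if explicitly forwarded:
  mpirun -x MY_VAR -np 2 -host node1,node2 ./my_app

  # whereas the corefile limit each rank sees comes from the remote shell
  # startup, not from mpirun's environment:
  mpirun -np 2 -host node1,node2 sh -c 'echo "$(hostname): $(ulimit -c)"'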

With a resource manager (e.g., SLURM):
- environment variables:
  - (typically) propagate from mpirun's environment via the resource manager
- process limits (e.g., corefile size):
  - may or may not propagate from mpirun's environment, but the resource 
manager may impose its own limits
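
For example, with SLURM I believe the site-wide knob that controls this is
PropagateResourceLimits in slurm.conf (sketch only; check the local
configuration):

  # slurm.conf: propagate only the corefile limit from the submitting shell;
  # all other limits stay at the compute nodes' defaults
  PropagateResourceLimits=CORE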

It sounds like we already have an MCA param that allows propagating those 
limits to override ssh / resource manager / shell startup file settings.  I 
think that's probably enough.
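
(For anyone who wants that behavior by default rather than per invocation, the
same param can go in the usual per-user MCA params file, e.g.:

  # $HOME/.openmpi/mca-params.conf
  opal_set_max_sys_limits = core:unlimited

-- same effect as the --mca option in the quoted workaround below.)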

> 
> a manual workaround is
> 
> mpirun --mca opal_set_max_sys_limits core:unlimited ...
> 
> 
> I guess we should do something about that, but what?
> 
> - just document it
> 
> - mpirun forwards all/some limits to all the spawned tasks regardless of 
> where they run
> 
> - mpirun forwards all/some limits to all the spawned tasks regardless of 
> where they run, but only if they are 0 or unlimited
> 
> - something else

I think the first option (documenting it) is probably a good idea.  The FAQ 
would likely be a good place for this (and maybe also the README?).

-- 
Jeff Squyres
jsquy...@cisco.com
