Isn't this supposed to be part of cluster 101?

I would rather add it to our FAQ, maybe in a slightly more generic way (not
only focused on 'ulimit -c'). Otherwise we will be bound to define
what is forwarded and what is not, which could create chaos for
knowledgeable users (who know how to deal with these issues).

George


On Mar 3, 2017 3:05 AM, "Gilles Gouaillardet" <gil...@rist.or.jp> wrote:

Folks,


this is a follow-up on
https://www.mail-archive.com/us...@lists.open-mpi.org//msg30715.html


on my cluster, the core file size is 0 by default, but it can be set to
unlimited by any user.

i think this is a pretty common default.


$ ulimit -c
0
$ bash -c 'ulimit -c'
0
$ mpirun -np 1 bash -c 'ulimit -c'
0

$ mpirun -np 1 --host n1 bash -c 'ulimit -c'
0

$ ssh n1
[n1 ~]$ ulimit -c
0
[n1 ~]$ bash -c 'ulimit -c'
0

*but*

$ ssh motomachi-n1 bash -c 'ulimit -c'
unlimited


now if i manually set the core file size to unlimited

$ ulimit -c unlimited
$ ulimit -c
unlimited
$ bash -c 'ulimit -c'
unlimited
$ mpirun -np 1 bash -c 'ulimit -c'
unlimited


*but*

$ mpirun -np 1 --host n1 bash -c 'ulimit -c'
0


fun fact

$ ssh n1 bash -c 'ulimit -c; bash -c "ulimit -c"'
unlimited
0
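
One possible explanation for the "unlimited" lines above, not confirmed anywhere in this thread: ssh joins its command arguments with spaces and hands the result to the remote shell, so the quotes around 'ulimit -c' are gone by the time bash runs. The remote side then executes bash -c ulimit -c, where the command string is just "ulimit" and "-c" becomes $0. A bare ulimit reports the file size limit, which is usually unlimited. A minimal sketch that reproduces this locally, without ssh:

```shell
# What the remote shell would end up running once the arguments are flattened:
flattened=$(bash -c ulimit -c)      # command string is "ulimit"; "-c" becomes $0
file_size=$(bash -c 'ulimit -f')    # file size limit, what a bare ulimit reports
core_size=$(bash -c 'ulimit -c')    # core file size limit, the value we wanted
echo "flattened: $flattened, file size: $file_size, core: $core_size"
```

If that is what happens here, an extra level of quoting such as ssh n1 "bash -c 'ulimit -c'" should report the core file size limit instead.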


bottom line, MPI tasks that run on the same node mpirun was invoked on
inherit the core file size limit from mpirun, whereas tasks that run on
the other node use the default core file size limit.
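
The inheritance half of this can be checked on a single node without mpirun. This is a minimal sketch that assumes only bash: it lowers the soft core file size limit in a subshell and asks a child process what it sees.

```shell
# Runs on a single node; no mpirun or ssh involved.
# The inner parentheses create a subshell, so the caller's own limit is untouched.
inherited=$( (ulimit -S -c 0; bash -c 'ulimit -S -c') )
echo "child sees core file size limit: $inherited"
```

The child reports the value set by its parent, which is the same mechanism by which local MPI tasks pick up the limit from mpirun.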


a manual workaround is

mpirun --mca opal_set_max_sys_limits core:unlimited ...
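
Another option, independent of any Open MPI parameter, is to set the limit in the launched process itself just before the application starts. The sketch below is only an illustration: run_with_core_limit is a hypothetical helper name, and raising the soft limit only works when the hard limit on that node allows it.

```shell
# run_with_core_limit LIMIT COMMAND [ARGS...]   (hypothetical helper)
run_with_core_limit() {
    (
        # Best effort: raising the soft limit fails if the hard limit is lower.
        ulimit -S -c "$1" 2>/dev/null || echo "warning: could not set core limit to $1" >&2
        shift
        # Replace the subshell with the real command so it inherits the new limit.
        exec "$@"
    )
}

# Local check: the command started by the helper sees the requested limit.
run_with_core_limit 0 bash -c 'ulimit -S -c'
```

With mpirun the same two steps would have to run on the remote node, for example inline as mpirun -np 1 --host n1 bash -c 'ulimit -S -c unlimited; exec ./a.out', where ./a.out stands in for the real application.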


i guess we should do something about that, but what?

- just document it

- mpirun forwards all/some limits to all the spawned tasks regardless of
  where they run

- mpirun forwards all/some limits to all the spawned tasks regardless of
  where they run, but only if they are 0 or unlimited

- something else



thoughts, anyone?


Gilles


_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel