Folks,

this is a follow-up to https://www.mail-archive.com/users@lists.open-mpi.org//msg30715.html


on my cluster, the core file size limit is 0 by default, but any user can raise it to unlimited.

i think this is a pretty common default.


$ ulimit -c
0
$ bash -c 'ulimit -c'
0
$ mpirun -np 1 bash -c 'ulimit -c'
0

$ mpirun -np 1 --host n1 bash -c 'ulimit -c'
0

$ ssh n1
[n1 ~]$ ulimit -c
0
[n1 ~]$ bash -c 'ulimit -c'
0

*but*

$ ssh motomachi-n1 bash -c 'ulimit -c'
unlimited


now if i manually set the core file size to unlimited

$ ulimit -c unlimited
$ ulimit -c
unlimited
$ bash -c 'ulimit -c'
unlimited
$ mpirun -np 1 bash -c 'ulimit -c'
unlimited


*but*

$ mpirun -np 1 --host n1 bash -c 'ulimit -c'
0


fun fact

$ ssh n1 bash -c 'ulimit -c; bash -c "ulimit -c"'
unlimited
0


bottom line: MPI tasks that run on the same node mpirun was invoked on inherit
the core file size limit from mpirun, whereas tasks that run on other nodes
use the default core file size limit.
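
the local half of this (a child process inherits the limit of its parent) can be
reproduced without mpirun. a minimal sketch, assuming bash; the subshell is only
there so the lowered limit does not leak into the calling shell:

```shell
#!/bin/bash
# Lower the soft core file size limit inside a subshell, then start a child
# bash from it. Lowering a soft limit is always permitted, so this does not
# depend on how the hard limit is configured.
inherited=$( (ulimit -S -c 0; bash -c 'ulimit -S -c') )

# The child reports the limit it inherited from the subshell.
echo "child sees: $inherited"   # prints "child sees: 0"
```

tasks started on a remote node are not children of the shell mpirun was invoked
from, which is consistent with them not picking up its limit.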


a manual workaround is

mpirun --mca opal_set_max_sys_limits core:unlimited ...


i guess we should do something about that, but what?

- just document it

- mpirun forwards all/some limits to all the spawned tasks, regardless of where they run

- mpirun forwards all/some limits to all the spawned tasks, regardless of where they run,
  but only if they are 0 or unlimited

- something else
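
to make the "forward the limit" options more concrete, here is a user-level
sketch of the idea, not a proposal for how mpirun should implement it. the
function name wrap_with_core_limit is made up for illustration, and i have not
tested it through mpirun. it reads the soft core limit of the calling shell and
builds a small command string that re-applies that limit before exec'ing the
real program:

```shell
#!/bin/bash
# Hypothetical helper (not part of Open MPI): emit a command string that
# re-applies the caller's soft core file size limit and then exec's whatever
# arguments follow it on the "bash -c" command line.
wrap_with_core_limit() {
    local limit
    limit=$(ulimit -S -c)          # "0", a block count, or "unlimited"
    printf 'ulimit -S -c %s; exec "$@"' "$limit"
}

# Show the generated wrapper for the current shell.
echo "$(wrap_with_core_limit)"
```

it could then be used as something like

mpirun -np 1 --host n1 bash -c "$(wrap_with_core_limit)" bash ./a.out

where ./a.out stands for the application. raising the soft limit on the remote
node only succeeds if the hard limit there allows it; otherwise ulimit prints
an error and the program still starts with the remote default.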



thoughts, anyone?


Gilles


_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
