Folks,
This is a follow-up on
https://www.mail-archive.com/users@lists.open-mpi.org//msg30715.html
On my cluster, the core file size limit is 0 by default, but any user can
raise it to unlimited.
I think this is a pretty common default.
$ ulimit -c
0
$ bash -c 'ulimit -c'
0
$ mpirun -np 1 bash -c 'ulimit -c'
0
$ mpirun -np 1 --host n1 bash -c 'ulimit -c'
0
$ ssh n1
[n1 ~]$ ulimit -c
0
[n1 ~]$ bash -c 'ulimit -c'
0
*but*
$ ssh motomachi-n1 bash -c 'ulimit -c'
unlimited
Now, if I manually set the core file size limit to unlimited:
$ ulimit -c unlimited
$ ulimit -c
unlimited
$ bash -c 'ulimit -c'
unlimited
$ mpirun -np 1 bash -c 'ulimit -c'
unlimited
*but*
$ mpirun -np 1 --host n1 bash -c 'ulimit -c'
0
Fun fact:
$ ssh n1 bash -c 'ulimit -c; bash -c "ulimit -c"'
unlimited
0
Bottom line: MPI tasks that run on the node where mpirun was invoked
inherit the core file size limit from mpirun, whereas tasks that run on
other nodes use the default core file size limit.
A manual workaround is:
mpirun --mca opal_set_max_sys_limits core:unlimited ...
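Another way to get the same effect, without relying on the MCA parameter, might be to set the limit inside each task right before the real program starts. The following is only a sketch: with_core_limit is a hypothetical helper (it is not part of Open MPI), and it assumes the hard limit on the target node allows the requested value, as is the case on the cluster described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): run a command with the
# soft core file size limit set to the requested value.
with_core_limit() {
    limit=$1
    shift
    (
        # Only the soft limit is changed. If the hard limit is lower than
        # the requested value, ulimit fails and the command is not run.
        ulimit -S -c "$limit" || exit 1
        exec "$@"
    )
}

# Local check, no MPI needed: print the limit seen by the child process.
with_core_limit 0 sh -c 'ulimit -c'
```

Under mpirun, the same idea can be written inline as mpirun -np 1 --host n1 sh -c 'ulimit -S -c unlimited; exec ./a.out', where ./a.out is a placeholder for the real MPI program.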
I guess we should do something about that, but what?
- just document it
- have mpirun forward all/some limits to all the spawned tasks, regardless
  of where they run
- have mpirun forward all/some limits to all the spawned tasks, regardless
  of where they run, but only if they are 0 or unlimited
- something else
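To make the third option concrete, here is a rough sketch of the "only if 0 or unlimited" rule, written in shell for illustration only. The function names should_forward and core_limit_prefix are hypothetical, and this says nothing about how mpirun would actually implement forwarding.

```shell
#!/bin/sh
# Hypothetical sketch of the third option: forward a limit only when its
# value is 0 or unlimited. None of these names exist in Open MPI.

# Succeeds when the given limit value is one we would forward.
should_forward() {
    case $1 in
        0|unlimited) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the shell prefix a launcher could prepend to the remote command,
# or nothing when the limit is left at the remote default.
core_limit_prefix() {
    if should_forward "$1"; then
        printf 'ulimit -S -c %s; ' "$1"
    fi
}

# Example: what would be prepended for the current soft limit.
core_limit_prefix "$(ulimit -S -c)"
```

With this rule, an intermediate value such as 1024 is not forwarded, so the remote default stays in effect for that task.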
Thoughts, anyone?
Gilles