> On Dec 1, 2017, at 8:10 AM, Götz Waschk <goetz.was...@gmail.com> wrote:
>
> On Fri, Dec 1, 2017 at 10:13 AM, Götz Waschk <goetz.was...@gmail.com> wrote:
>> I have attached my slurm job script; it will simply do an mpirun
>> IMB-MPI1 with 1024 processes. I haven't set any mca parameters, so for
>> instance, vader is enabled.
>
> I have tested again with
>
>   mpirun --mca btl "^vader" IMB-MPI1
>
> and it made no difference.
I've lost track of the earlier parts of this thread, but has anyone suggested logging into the nodes the job is running on, attaching "gdb -p PID" to each of the MPI processes, and typing "where" to see where each one is hanging? I use the script below (trace_all), which depends on a variable $process holding a grep regexp that matches the MPI executable:

    # One-shot gdb command file: print a backtrace, then exit (-batch).
    echo "where" > /tmp/gf
    # PIDs of all processes matching $process, excluding grep and this script.
    pids=$(ps aux | grep $process | grep -v grep | grep -v trace_all | awk '{print $2}')
    for pid in $pids; do
        echo $pid
        # Path of the running program, from column 11 of ps output.
        prog=$(ps auxw | grep " $pid " | grep -v grep | awk '{print $11}')
        gdb -x /tmp/gf -batch $prog $pid
        echo ""
    done

(Note I've switched the command substitutions from backticks to $(...); inside backticks the unquoted $2 and $11 would have been expanded by the outer shell before awk ever saw them.)
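For what it's worth, here is a sketch of the same attach-and-backtrace idea using pgrep and gdb's -ex option, which avoids the temporary command file and the second ps parse. This assumes Linux (/proc) and a reasonably recent gdb, and the IMB-MPI1 default for $process is just an illustration from this thread, not part of my original script:

```shell
#!/bin/sh
# Sketch, not the original trace_all: backtrace every process whose
# command line matches $process. Assumes Linux and gdb with -ex support.
process=${process:-IMB-MPI1}

for pid in $(pgrep -f "$process"); do
    # /proc/PID/exe is the running binary, so no need to re-parse ps
    # output for the program path; gdb -p locates it by itself anyway.
    prog=$(readlink "/proc/$pid/exe")
    echo "=== $pid ($prog) ==="
    # -batch -ex where replaces the /tmp/gf command file.
    gdb -batch -ex where -p "$pid"
    echo
done
```

On a cluster you would run this (or wrap it in pdsh/ssh) on each node the job landed on, since gdb can only attach to local processes.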
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users