Folks,

I noticed several errors such as
http://mtt.open-mpi.org/index.php?do_redir=2244
that did not make any sense to me (at first glance).

I was able to attach to one process when the issue occurred.
The SIGSEGV occurs in thread 2, while thread 1 is invoking
ompi_rte_finalize.

All I can think of is a scenario in which the progress thread (aka thread 2)
is still accessing memory that was just freed, unmapped, or corrupted by
the main thread.
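
To make the suspected race concrete, here is a minimal sketch in C with
pthreads. It is not Open MPI code: progress_ctx, progress_loop and
demo_shutdown are hypothetical names made up for illustration, and I am only
assuming the real failure follows this pattern. It shows the teardown order
that avoids the problem: ask the progress thread to stop, join it, and only
then free what it was using.

```c
/* Hypothetical sketch: none of these names come from Open MPI. */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct progress_ctx {
    atomic_bool stop;        /* set by the main thread to request shutdown */
    atomic_long ticks;       /* number of loop iterations completed */
    unsigned long *shared;   /* memory the progress thread keeps touching */
    pthread_t tid;
};

/* Stand-in for a progress thread: touches shared memory until told to stop. */
static void *progress_loop(void *arg)
{
    struct progress_ctx *ctx = arg;

    while (!atomic_load(&ctx->stop)) {
        ctx->shared[0]++;    /* would be a use-after-free if freed under us */
        atomic_fetch_add(&ctx->ticks, 1);
    }
    return NULL;
}

/* Runs one start/stop cycle with the safe ordering. Returns 0 on success. */
int demo_shutdown(void)
{
    struct progress_ctx ctx;

    atomic_init(&ctx.stop, false);
    atomic_init(&ctx.ticks, 0);
    ctx.shared = calloc(1, sizeof *ctx.shared);
    if (ctx.shared == NULL)
        return -1;

    if (pthread_create(&ctx.tid, NULL, progress_loop, &ctx) != 0) {
        free(ctx.shared);
        return -1;
    }

    /* Let the progress thread run at least one iteration. */
    while (atomic_load(&ctx.ticks) == 0)
        sched_yield();

    /* Safe order: 1) request stop, 2) join, 3) free.
     * Moving free() above pthread_join() would recreate the suspected race:
     * the progress thread could still be dereferencing ctx.shared. */
    atomic_store(&ctx.stop, true);
    pthread_join(ctx.tid, NULL);
    free(ctx.shared);
    ctx.shared = NULL;

    return 0;
}
```

Calling demo_shutdown() from a small main() and building with -pthread is
enough to try it. With free() moved before pthread_join(), a tool such as
AddressSanitizer or valgrind should be able to flag the use-after-free, even
when the run does not crash outright.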

Empirically, the error is more likely to occur when there are many
tasks on one node, e.g.
mpirun --oversubscribe -np 32 a.out

Cheers,

Gilles
