Folks,

I noticed several errors such as http://mtt.open-mpi.org/index.php?do_redir=2244 that did not make any sense to me (at first glance).
I was able to attach to one process when the issue occurs. The SIGSEGV occurs in thread 2, while thread 1 is invoking ompi_rte_finalize. The only scenario I can think of is one in which the progress thread (aka thread 2) is still dealing with some memory that was just freed/unmapped/corrupted by the main thread.

Empirically, I noticed the error is more likely to occur when there are many tasks on one node, e.g.

mpirun --oversubscribe -np 32 a.out

Cheers,

Gilles
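
P.S. To make the suspected scenario concrete, here is a minimal sketch of the shutdown ordering that would avoid this kind of race. It is not Open MPI code: `progress_t`, `progress_start` and `progress_finalize` are made-up names, and the plain pthread setup is only an assumption about what the general pattern looks like. The point is that the main thread asks the progress thread to stop and joins it before freeing anything the progress thread may still touch.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for state shared with a progress thread. */
typedef struct {
    atomic_bool stop;   /* set by the main thread to request shutdown */
    long *counter;      /* heap memory the progress thread keeps touching */
    pthread_t tid;
} progress_t;

/* Progress thread: keeps using the shared memory until asked to stop. */
static void *progress_loop(void *arg)
{
    progress_t *p = arg;
    while (!atomic_load(&p->stop)) {
        (*p->counter)++;   /* would be a use-after-free if counter were freed early */
    }
    return NULL;
}

/* Allocate the shared memory and start the progress thread. Returns 0 on success. */
int progress_start(progress_t *p)
{
    atomic_init(&p->stop, false);
    p->counter = calloc(1, sizeof *p->counter);
    if (p->counter == NULL) {
        return -1;
    }
    if (pthread_create(&p->tid, NULL, progress_loop, p) != 0) {
        free(p->counter);
        p->counter = NULL;
        return -1;
    }
    return 0;
}

/* Safe order: request stop, join, and only then free. Returns 0 on success. */
int progress_finalize(progress_t *p)
{
    atomic_store(&p->stop, true);
    if (pthread_join(p->tid, NULL) != 0) {
        return -1;
    }
    free(p->counter);      /* no other thread can reference it any more */
    p->counter = NULL;
    return 0;
}
```

If the `free()` were moved ahead of the `pthread_join()`, the progress thread could still be inside its loop when the memory goes away, which is the kind of crash I suspect here. Oversubscribing the node would plausibly widen that window, since the progress thread gets scheduled less often.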