Ralph,
I get consistent segfaults during infrastructure teardown in orterun (I
noticed them on OS X). After digging a little, it turns out that the
opal_buffer_t class has already been cleaned up in orte_finalize by the
time orte_proc_info_finalize is called, so the destructors run on
randomly initialized memory. If I change the teardown order so that
orte_proc_info_finalize runs before orte_finalize, things work better, but
I still get a very annoying warning about a "Bad file descriptor in select".
Any better fix?
George.
PS: Here is the patch I am currently using to get rid of the segfaults:
diff --git a/orte/tools/orterun/orterun.c b/orte/tools/orterun/orterun.c
index 85aba0a0f3..506b931d35 100644
--- a/orte/tools/orterun/orterun.c
+++ b/orte/tools/orterun/orterun.c
@@ -222,10 +222,10 @@ int orterun(int argc, char *argv[])
DONE:
/* cleanup and leave */
orte_submit_finalize();
- orte_finalize();
- orte_session_dir_cleanup(ORTE_JOBID_WILDCARD);
/* cleanup the process info */
orte_proc_info_finalize();
+ orte_finalize();
+ orte_session_dir_cleanup(ORTE_JOBID_WILDCARD);
if (orte_debug_flag) {
fprintf(stderr, "exiting with status %d\n", orte_exit_status);
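
To make the ordering problem described above concrete, here is a minimal, self-contained sketch in C. It is not ORTE or OPAL code: every demo_* name is hypothetical, and the class system is reduced to a single destructor pointer that the runtime finalize step clears. The point is only that an object whose destructor is owned by the class system must be released before that class system is torn down.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a class descriptor owned by the runtime. */
typedef struct demo_class {
    void (*destruct)(void *obj);   /* NULL once the class system is finalized */
} demo_class_t;

/* Hypothetical stand-in for a buffer object held by the process info. */
typedef struct demo_buffer {
    demo_class_t *cls;
    bool live;
} demo_buffer_t;

static demo_class_t demo_buffer_class = { NULL };
static demo_buffer_t demo_proc_info_buffer;

static void demo_buffer_destruct(void *obj)
{
    ((demo_buffer_t *)obj)->live = false;
}

/* Bring up the class system and construct the buffer. */
void demo_runtime_init(void)
{
    demo_buffer_class.destruct = demo_buffer_destruct;
    demo_proc_info_buffer.cls = &demo_buffer_class;
    demo_proc_info_buffer.live = true;
}

/* Tear down the class system; destructors are unusable afterwards. */
void demo_runtime_finalize(void)
{
    demo_buffer_class.destruct = NULL;
}

/* Release the buffer. Returns 0 on success, or -1 if the class system is
 * already gone. Real code without this check would call through an
 * invalid pointer here instead of returning an error. */
int demo_proc_info_finalize(void)
{
    demo_buffer_t *b = &demo_proc_info_buffer;
    if (NULL == b->cls || NULL == b->cls->destruct) {
        return -1;
    }
    b->cls->destruct(b);
    return 0;
}

/* Run a full init/teardown cycle in the requested order and report
 * whether the buffer could be released cleanly. */
int demo_teardown(bool proc_info_first)
{
    int rc;
    demo_runtime_init();
    if (proc_info_first) {
        rc = demo_proc_info_finalize();
        demo_runtime_finalize();
    } else {
        demo_runtime_finalize();
        rc = demo_proc_info_finalize();
    }
    return rc;
}
```

In this sketch demo_teardown(true) mirrors the patched order and returns 0, while demo_teardown(false) mirrors the original order and returns -1 where the real code would crash.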