On Fri, Mar 5, 2010 at 12:03 PM, Josh Hursey <jjhur...@open-mpi.org> wrote:
> This type of failure is usually due to prelink'ing being left enabled on one
> or more of the systems. This has come up multiple times on the Open MPI
> list, but is actually a problem between BLCR and the Linux kernel. BLCR has
> a FAQ entry on this that you will want to check out:
> https://upc-bugs.lbl.gov//blcr/doc/html/FAQ.html#prelink
>
> If that does not work, then we can look into other causes.

I also suggest checkpointing and restarting the app with BLCR directly, without Open MPI in the picture. That is, take any simple app, run it with cr_run, checkpoint it with cr_checkpoint, then restart it with cr_restart. Make sure the blcr kernel module is loaded too. If the standalone test fails as well, the problem is in BLCR or the system setup; if it succeeds, the problem is more likely on the Open MPI side.

Regards,
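
P.S. In case it helps, here is a rough sketch of the standalone test described above. It is not an official procedure, only one way to do it. The function name blcr_smoke_test and the choice of `sleep 300` as the test app are my own, and I am assuming the BLCR tools are on your PATH and that cr_checkpoint writes its default context.<pid> file into the current directory.

```shell
#!/bin/sh
# Sketch: checkpoint and restart a trivial process with BLCR alone (no Open MPI).
# blcr_smoke_test is a hypothetical helper name, not part of BLCR.

blcr_smoke_test() {
    # All three BLCR command-line tools must be available.
    for tool in cr_run cr_checkpoint cr_restart; do
        command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; return 1; }
    done

    # The blcr kernel module must be loaded.
    lsmod | grep -q '^blcr' || { echo "blcr module not loaded"; return 1; }

    # Start a simple, dynamically linked app under BLCR control.
    cr_run sleep 300 &
    pid=$!
    sleep 2

    # Checkpoint it and terminate it; assumed to leave context.<pid> behind.
    cr_checkpoint --term "$pid" || { echo "checkpoint failed"; return 1; }
    wait "$pid" 2>/dev/null

    # Restart from the context file and check that the process is back.
    cr_restart "context.$pid" &
    sleep 2
    if kill -0 "$pid" 2>/dev/null; then
        echo "restart ok"
        kill "$pid"
        return 0
    fi
    echo "restart failed"
    return 1
}

# Usage: blcr_smoke_test && echo "BLCR works on its own"
```

Run it on each node involved. If it reports "restart failed" on a node where prelinking is still enabled, that points back to the FAQ entry quoted above rather than to Open MPI.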