Josh,
When I use cr_{run,checkpoint,restart} to start a checkpoint and restart
a single-threaded, single-process app on a different host, it works,
even with prelinking enabled. That's kinda why I assumed the problem
was with the OpenMPI code, and didn't look at the BLCR FAQ that closely,
to be h

Often this type of problem is due to the 'prelink' option in Linux.
BLCR has a FAQ item that discusses this issue and how to resolve it:
https://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#prelink
I would give that a try. If that does not help then you might want to
try checkpointing a single (non-M

Hi, all.
I'm in the middle of testing some of the checkpoint/restart capabilities
of OpenMPI with BLCR on our cluster. I've been able to checkpoint and
restart successfully when I restart on the same nodes as it was running
previously. But when I try to restart on a different host, I always get