Dear All,

I have installed BLCR 0.8.1 and Open MPI 1.3 on a Linux platform.
However, when I try to checkpoint an application, it hangs forever just
before ending.

A checkpoint file is generated. However, when I try to restart from it, I get
the following error:

raj@sun06:~$ ompi-restart ompi_global_snapshot_22390.ckpt
[sun06:22423] *** Process received signal ***
[sun06:22423] Signal: Segmentation fault (11)
[sun06:22423] Signal code: Address not mapped (1)
[sun06:22423] Failing at address: (nil)
[sun06:22423] [ 0] [0xb7fb640c]
[sun06:22423] [ 1] /usr/local/openmpi/lib/libopen-pal.so.0(opal_crs_blcr_restart+0x103) [0xb7f76925]
[sun06:22423] [ 2] opal-restart [0x8049435]
[sun06:22423] [ 3] /lib/libc.so.6(__libc_start_main+0xe5) [0xb7d9a455]
[sun06:22423] [ 4] opal-restart [0x8049001]
[sun06:22423] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 22423 on node sun06 exited on 
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Any help would be greatly appreciated.

Kind regards,

Raj


