Hi Members,

I tried to use checkpoint/restart with Open MPI,
but I cannot get correct checkpoint data.
I prepared the execution environment as follows. The strings in () are
the names of the output files attached to my next e-mail (because of the
mail size limit):

1. installed BLCR and checked that it works correctly with "make check"
2. ran ./configure with some parameters in the Open MPI source dir
(config.output / config.log)
3. ran make and make install (make.output.2 / install.output.2)
4. confirmed that mca_crs_blcr.[la|so] and mca_crs_self.[la|so] exist in
/${INSTALL_DIR}/lib/openmpi
5. created ~/.openmpi/mca-params.conf (mca-params.conf)
6. compiled NPB and ran it with -am ft-enable-cr
7. invoked ompi-checkpoint <MPIRUN_PID>
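
For reference, the check in step 4 can be sketched as a small shell function. This is only an illustration: the function name check_crs_plugins and the way the directory is passed in are my own placeholders, not part of Open MPI or BLCR.

```shell
#!/bin/sh
# Hypothetical helper: verify that the CRS plugin files from step 4 exist.
# $1 is the plugin directory, e.g. "/${INSTALL_DIR}/lib/openmpi".
check_crs_plugins() {
    libdir=$1
    missing=0
    for name in mca_crs_blcr mca_crs_self; do
        for ext in la so; do
            if [ ! -e "$libdir/$name.$ext" ]; then
                # Report each file that is not present.
                echo "missing: $libdir/$name.$ext"
                missing=1
            fi
        done
    done
    # Exit status 0 means all four files were found.
    return "$missing"
}
```

It would be called as check_crs_plugins "/${INSTALL_DIR}/lib/openmpi". Note that the files being present only shows that the components were built and installed, not that they load at run time.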

As a result, I got the message "Checkpoint failed: no processes checkpointed."
(cr_test_cg)

In addition, when I checked the ompi_info output as shown in your demo movie,
I got "MCA crs: none (MCA v2.0, API v2.0, Component v1.4.1)" (open_info.output)
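
The check on the "MCA crs" line can be sketched as follows. The function names list_crs_components and has_blcr_crs are hypothetical helpers of mine; they only parse ompi_info-style text on standard input, and I am assuming each component is reported on a line of the form "MCA crs: <name> (...)" as in the output quoted above.

```shell
#!/bin/sh
# Print the name of every CRS component found in ompi_info-style output on stdin.
list_crs_components() {
    sed -n 's/^[[:space:]]*MCA crs:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Succeed only if a component named exactly "blcr" is listed.
has_blcr_crs() {
    list_crs_components | grep -x 'blcr' >/dev/null
}

# Usage against a real installation (assumes ompi_info is on PATH):
#   ompi_info | has_blcr_crs && echo "blcr CRS listed" || echo "blcr CRS not listed"
```

With the line I quoted, list_crs_components prints only "none", so has_blcr_crs fails, which matches the problem I am seeing.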

What should I do to get checkpointing to work?
Any guidance in this regard would be highly appreciated.

Thank you,
Hideyuki

--
Sincerely Yours,
Hideyuki Jitsumoto (jitum...@gsic.titech.ac.jp)
Tokyo Institute of Technology
Global Scientific Information and Computing center (Matsuoka Lab.)
