[OMPI users] problem restarting multiprocess mpi application

2009-12-13 Thread Kritiraj Sajadah
Dear All, I am running a simple MPI application which looks as follows: ## #include #include #include #include #include int main(int argc, char **argv) { int rank,size; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &rank);

[OMPI users] Problem with checkpointing multihosts, multiprocesses MPI application

2009-12-12 Thread Kritiraj Sajadah
Dear All, I am trying to checkpoint an MPI application which has two processes, each running on a separate host. I run the application as follows: raj@sun32:~$ mpirun -am ft-enable-cr -np 2 --hostfile sunhost -mca btl ^openib -mca snapc_base_global_snapshot_dir /tmp m. and I

[OMPI users] a good grid simulator to run open MPI applications

2009-12-06 Thread Kritiraj Sajadah
Hi All, Can you recommend a good open source grid simulation tool for running Open MPI applications? Thanks, Raj

[OMPI users] get the process Id of mpirun

2009-11-14 Thread Kritiraj Sajadah
Dear All, I am trying to get the process ID of mpirun from within my MPI application. When I use getpid() and getppid(), I get the PID of my application and the PID of "orted --daemonize -mca..." respectively. Is there a way to get the PID of mpirun? In this case, it looks like it

[OMPI users] mpirun noticed that process rank 1 ... exited on signal 13 (Broken pipe).

2009-11-06 Thread Kritiraj Sajadah
Hi Everyone, I have installed Open MPI 1.3 and BLCR 0.8.1 on my laptop (single processor). I am trying to checkpoint a small test application: ### #include #include #include #include #include int main(int argc, char **argv) { int rank,size; MPI_Init(&argc, &argv);

[OMPI users] problem using openmpi with DMTCP

2009-09-28 Thread Kritiraj Sajadah
Dear All, I am trying to integrate DMTCP with Open MPI. If I run a C application, it works fine. But when I execute the program using mpirun, it checkpoints the application but gives an error when restarting it. # [31007] WARNING at connection.cpp:303 in restore;

Re: [OMPI users] configure OPENMPI with DMTCP

2009-08-13 Thread Kritiraj Sajadah
DMTCP > To: "Open MPI Users" <us...@open-mpi.org> > Date: Thursday, August 13, 2009, 2:40 PM > > On Aug 12, 2009, at 3:35 PM, Kritiraj Sajadah wrote: > > > HI, > >   I want to configure OPENMPI to > checkpoint MPI applications using DMTCP. Does any

[OMPI users] configure OPENMPI with DMTCP

2009-08-12 Thread Kritiraj Sajadah
Hi, I want to configure Open MPI to checkpoint MPI applications using DMTCP. Does anyone know how to specify the path to DMTCP when installing Open MPI? Also, I want to use Open MPI with SELF instead of BLCR. Is there a guide for setting up Open MPI with SELF? Thanks a lot.

Re: [OMPI users] Checkpointing automatically at regular intervals

2009-06-30 Thread Kritiraj Sajadah
> > Hi, > > I think that you can write a simple script such as: > > > > while `pgrep mpirun` != "" > > ompi-checkpoint `pidof mpirun` > > sleep 5 > > done > > > > On 30 June 09 at 14:29, Kritiraj Sajadah wrote: > > > >
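The loop quoted in this reply can be sketched as a small POSIX shell function. This is a minimal sketch, not a tested recipe from the thread: the function name `periodic_checkpoint` and the `CKPT_CMD` and `INTERVAL` variables are hypothetical helpers introduced for illustration, and it assumes `ompi-checkpoint` accepts the PID of mpirun as its argument, as the messages in this listing show.

```shell
#!/bin/sh
# Hypothetical knobs for this sketch; they are not Open MPI options.
CKPT_CMD="${CKPT_CMD:-ompi-checkpoint}"   # command that takes the mpirun PID
INTERVAL="${INTERVAL:-300}"               # seconds between checkpoints (5 minutes)

# Request a checkpoint of the given PID repeatedly until that process exits.
periodic_checkpoint() {
    pid="$1"
    # kill -0 delivers no signal; it only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        $CKPT_CMD "$pid"
        sleep "$INTERVAL"
    done
}
```

One possible invocation is `periodic_checkpoint "$(pidof mpirun)"`, which assumes exactly one mpirun is running on the machine. Testing the PID with `kill -0` rather than re-running `pgrep` on every pass keeps the loop tied to one specific job.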

Re: [OMPI users] Apllication level checkpointing tools.

2009-06-30 Thread Kritiraj Sajadah
heckpointing tools. > To: "Open MPI Users" <us...@open-mpi.org> > Date: Tuesday, June 30, 2009, 1:09 PM > Dear Kritiraj, > You can use DMTCP http://sourceforge.net/projects/dmtcp > > On 30 June 09 at 13:59, Kritiraj Sajadah wrote: > > > > > Dear

[OMPI users] Checkpointing automatically at regular intervals

2009-06-30 Thread Kritiraj Sajadah
Dear All, I can manually checkpoint an MPI application using Open MPI and BLCR. However, I now want to checkpoint my application automatically every 5 minutes. Is there a way in Open MPI to ensure automatic checkpointing without user intervention while the application is

[OMPI users] Apllication level checkpointing tools.

2009-06-30 Thread Kritiraj Sajadah
Dear All, I have successfully configured Open MPI with BLCR and did some tests. However, I now want to do some testing with application-level checkpointing tools. I tried using libckpt but could not install it. Does anyone know of any open source application-level checkpointing tools

Re: [OMPI users] vfs_write returned -14

2009-06-20 Thread Kritiraj Sajadah
> Date: Friday, June 19, 2009, 2:48 PM > > On Jun 18, 2009, at 7:33 PM, Kritiraj Sajadah wrote: > > > > > Hello Josh, > > Thank you > again for your response. I tried checkpointing a > > simple C program using BLCR...and got the same error,

Re: [OMPI users] vfs_write returned -14

2009-06-18 Thread Kritiraj Sajadah
nts, environment > variables, ...). > > I should note that for the program that you sent it is > important that  > you compile Open MPI with the Fault Tolerance Thread > enabled to ensure  > a timely checkpoint. Otherwise the checkpoint will be > delayed until  > the MPI prog

Re: [OMPI users] vfs_write returned -14

2009-06-16 Thread Kritiraj Sajadah
application are you trying to checkpoint? > Some of the MPI interfaces are not fully supported at the > moment (outlined in the FT User Document that I mentioned in > a previous email). > > -- Josh > > On Jun 16, 2009, at 11:30 AM, Kritiraj Sajadah wrote: > > >

[OMPI users] vfs_write returned -14

2009-06-16 Thread Kritiraj Sajadah
Dear All, I have installed Open MPI 1.3 and BLCR 0.8.1 on a Linux machine (Ubuntu). However, when I try checkpointing an MPI application, I get the following error: - vfs_write returned -14 - file_header: write returned -14 Can someone help, please? Regards, Raj

[OMPI users] Segmentation fault (11)

2009-06-15 Thread Kritiraj Sajadah
Dear All, I have installed BLCR 0.8.1 and Open MPI 1.3 on a Linux platform. However, when I tried checkpointing an application, it hangs forever just before ending. A checkpoint file is generated. However, when I try restarting it, I get the following error: raj@sun06:~$ ompi-restart

[OMPI users] Compiling and Building OPENMPI for checkpointing using self

2009-06-06 Thread Kritiraj Sajadah
Hi All, I have successfully installed and configured Open MPI to perform checkpointing using the BLCR mechanism. However, I now want to try checkpointing using self. Has anyone done that? If so, I would very much appreciate it if any of you could send me the steps necessary to enable

Re: [OMPI users] *** An error occurred in MPI_Init

2009-05-08 Thread Kritiraj Sajadah
> or "mpicc --showme" and "mpirun --help" to get a bit > more > > information about what you are really using. > > > > I hope this helps. > > Gus Correa > > > - > > Gustavo Correa >

[OMPI users] *** An error occurred in MPI_Init

2009-05-08 Thread Kritiraj Sajadah
Dear All, I have installed and configured Open MPI with BLCR on my laptop: 1) configure and install BLCR ./configure --prefix=/usr/local/ --enable-debug=yes --enable-libcr-tracing=yes --enable-kernel-tracing=yes --enable-testsuite=yes --enable-all-static=yes --enable-static=yes make

[OMPI users] error while loading shared libraries: libcr.so.0: cannot open shared object file: No such file or directory.

2009-05-04 Thread Kritiraj Sajadah
Dear All, I have installed Open MPI and BLCR on my laptop and am trying to checkpoint an MPI application. Both Open MPI and BLCR are installed in /usr/local. When I try to checkpoint an MPI application, I get the following error: error while loading shared libraries: libcr.so.0: cannot
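A loader error of this kind often means the directory holding the library is not on the dynamic linker's search path. The following is a minimal sketch of one way to check that, not the resolution given in the thread: it assumes `libcr.so.0` ended up in `/usr/local/lib` (the message only says the install prefix was `/usr/local`), and `BLCR_LIBDIR` and `add_lib_dir` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Assumption: BLCR's shared library was installed under /usr/local/lib.
BLCR_LIBDIR="${BLCR_LIBDIR:-/usr/local/lib}"

# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

add_lib_dir "$BLCR_LIBDIR"
```

After sourcing this in the shell that launches mpirun, running `ldd` on the failing binary should show whether `libcr.so.0` now resolves. A system-wide alternative is to register the directory with `ldconfig`, which needs root access.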

[OMPI users] mca: base: component_find: unable to open /usr/local/lib/openmpi/mca_crs_blcr: file not found (ignored)

2009-05-04 Thread Kritiraj Sajadah
Dear All, Thanks to Josh and Yaakoub, I was able to configure my Open MPI as follows: raj@raj:./configure --prefix=/usr/local --with-ft=cr --enable-ft-thread --enable-mpi-threads --with-blcr=/usr/local. raj@raj:make all install I try to checkpoint an MPI application using the

[OMPI users] Checkpointing configuration problem

2009-05-01 Thread Kritiraj Sajadah
Dear all, I am trying to install openmpi 1.3 on my laptop. I successfully installed BLCR in /usr/local. When installing openmpi using the following options: ./configure --prefix=/usr/local --with-ft=cr --enable-ft-thread --enable-MPI-thread --with-blcr=/usr/local I got the

[OMPI users] checkpoint file contains nothing

2008-06-29 Thread Kritiraj Sajadah
Hi, I have installed openmpi-1.3a1r18651 and tried to checkpoint an MPI application. raj@portal018:~/examples> mpirun -np 1 -am ft-enable-cr ./myapp.sh & raj@portal018:~/examples> ompi-checkpoint --term 30416 However, when I try to restart the checkpointed file, I get the following