Maybe you could make a system call to ping the other machine.

  // needs <stdio.h> and <stdlib.h>
  char sCommand[512];
  // build the command string (snprintf cannot overrun the buffer)
  snprintf(sCommand, sizeof(sCommand), "ping -c %d -q %s > /dev/null", numPings, sHostName);
  // execute the command
  int iResult = system(sCommand);

If the ping was successful, iResult will have the value 0.

Jody

On Thu, Jul 23, 2009 at 1:36 PM, vipin kumar<vipinkuma...@gmail.com> wrote:
>
> On Thu, Jul 23, 2009 at 3:03 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> It depends on which network fails. If you lose all TCP connectivity, Open
>> MPI should abort the job as the out-of-band system will detect the loss of
>> connection. If you only lose the MPI connection (whether TCP or some other
>> interconnect), then I believe the system will eventually generate an error
>> after it retries sending the message a specified number of times, though it
>> may not abort.
>
> Thank you Ralph,
>
> From your reply I realized that the question I posted earlier did not
> describe the problem properly.
>
> I can't use blocking communication routines in my main program
> ("masterprocess") because any type of network failure (whether physical
> connectivity, TCP connectivity, or the MPI connection, as you said) may occur.
> So I am using non-blocking point-to-point communication routines and TEST
> later for completion of that Request. Once I enter a TEST loop, I test
> for Request completion until TIMEOUT. If TIMEOUT has occurred, I will
> first check whether
>
> 1: The slave machine is reachable or not. (How do I do that? Given: I
> have the IP address and host name of the slave machine.)
>
> 2: If reachable, whether the programs (orted and "slaveprocess") are alive
> or not.
>
> I don't want to abort my master process in case 1, and I hope that the network
> connection will come back up in the future. Fortunately OpenMPI doesn't abort any
> process. Both processes can run independently without communicating.
>
> Thanks and Regards,
> --
> Vipin K.
> Research Engineer,
> C-DOTB, India
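
For the reachability check in step 1, a slightly more defensive version of the ping idea could look like the sketch below. This is only an illustration, not anything provided by Open MPI: the helper names build_ping_command, run_command and host_is_reachable are made up for this example, and it assumes a POSIX system whose ping accepts -c and -q as in the command above. It decodes the value returned by system() with WIFEXITED/WEXITSTATUS, so a ping that exits nonzero can be told apart from a shell that could not be started or was killed by a signal.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: write the ping command into buf.
 * Returns 0 on success, -1 if the command did not fit. */
int build_ping_command(char *buf, size_t len, int num_pings, const char *host)
{
    int n = snprintf(buf, len, "ping -c %d -q %s > /dev/null 2>&1",
                     num_pings, host);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: run cmd through the shell.
 * Returns the command's exit code, or -1 if the shell could not be
 * started or the command did not terminate normally. */
int run_command(const char *cmd)
{
    int status = system(cmd);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Hypothetical helper: 1 if ping reported success for host, else 0.
 * The host string is passed to the shell unescaped, so only use it
 * with trusted host names or IP addresses. */
int host_is_reachable(const char *host, int num_pings)
{
    char cmd[512];
    if (build_ping_command(cmd, sizeof cmd, num_pings, host) != 0)
        return 0;
    return run_command(cmd) == 0;
}
```

After the TEST loop times out, the master could call host_is_reachable(sHostName, numPings) and move on to step 2 only when it returns 1. A successful ping only shows that the machine answers on the network; it says nothing about whether orted or "slaveprocess" is still alive.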