This question may not be entirely appropriate for this forum, but perhaps someone here has already tried this and knows which method is faster.

I am about to hook up a NAS node to my CentOS-based Linux cluster. The NAS storage will be shared among the nodes using GFS2. My OpenMPI program needs to synchronize temporary files between the nodes, which is one of the reasons I am switching to NAS. My program checkpoints/restarts by copying checkpoint/restart (CR) files to a local directory. The current code for taking a checkpoint looks something like:

if (mpi_id == mpi_host_id)
{
   //save global variables and temporary files to the local CR directory
}
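
To make the "save temporary files" step concrete, here is a minimal sketch of it as a plain file copy. The helper names copy_file and checkpoint_file, and passing the ranks in as plain integers, are assumptions made for illustration and are not taken from the actual program. The point is that a copy like this reads every byte into the calling process and writes it back out, so the data passes through whichever node runs the code.

```c
#include <stdio.h>

/* Hypothetical helper (illustration only): copy src to dst using
 * ordinary buffered I/O. Returns 0 on success, -1 on failure. */
int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[1 << 16];
    size_t n;
    int status = 0;

    /* Every block is read into this process and then written out again. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;

    fclose(in);
    if (fclose(out) != 0)
        status = -1;
    return status;
}

/* Hypothetical wrapper: only the host rank copies the file;
 * every other rank returns immediately without touching it. */
int checkpoint_file(int mpi_id, int mpi_host_id,
                    const char *tmp_path, const char *cr_path)
{
    if (mpi_id != mpi_host_id)
        return 0;
    return copy_file(tmp_path, cr_path);
}
```

In the real program, mpi_id would come from MPI_Comm_rank, and checkpoint_file would be called once per temporary file.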

When switching to NAS, I can leave the code as is, assuming GFS2 is smart enough to detect that the temporary files and the CR directory both reside on the NAS node and does not copy the files unnecessarily across the network:

if (mpi_id == mpi_host_id)
{
   //save global variables and temporary files (now residing on the GFS2 server node) to a CR directory (also residing on the GFS2 server node)
}

or I can change the code to:

if (mpi_id == mpi_host_id)
{
   //save global variables to the CR directory residing on the GFS2 server node
   //send an OpenMPI message to the GFS2 server node to copy the temporary files to a CR directory on the GFS2 server node
}

Is the second method a lot faster than the first method or is it about the same?

Regards,
Gijsbert
