Hi Danesh

Make sure you have 700 GB of RAM in total across all the nodes you
are using. Otherwise context switching and memory swapping may be
the problem. MPI doesn't perform well under these conditions (and
may break, particularly on large problems, I suppose).

A good way to go about it is to look at the physical "RAM per core"
of those machines, if they are multi-core, and compare it to the
memory per core your program actually requires. For instance,
700 GB spread across 80 processes is roughly 9 GB per process.
You also need to leave some RAM for the operating system, so plan
on using no more than 80% or so of the physical memory.
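
Here is a back-of-the-envelope sketch of that check. The RAM-per-node
and cores-per-node figures below are made up; put in your cluster's
real numbers:

#include <iostream>

int main()
{
    const double ram_per_node_gb  = 32.0;   // physical RAM per node (assumed)
    const int    cores_per_node   = 8;      // cores per node (assumed)
    const int    nprocs           = 80;     // MPI processes you launch
    const double total_problem_gb = 700.0;  // total memory the solver needs

    const double ram_per_core_gb    = ram_per_node_gb / cores_per_node;
    const double usable_per_core_gb = 0.80 * ram_per_core_gb;  // leave ~20% for the OS
    const double needed_per_proc_gb = total_problem_gb / nprocs;

    std::cout << "Usable RAM per core: " << usable_per_core_gb << " GB\n"
              << "Needed per process:  " << needed_per_proc_gb << " GB\n";

    if (needed_per_proc_gb > usable_per_core_gb)
        std::cout << "Likely to swap: add nodes or run fewer processes per node.\n";
    else
        std::cout << "Should fit in physical memory.\n";

    return 0;
}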

If you or a system administrator has access to the nodes,
you can monitor the memory use with "top".
If you have Ganglia on this cluster, you can also use its memory
report metric.
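
If you cannot log into the nodes while the job runs, another rough
option is to have each rank report its own resident set size from
inside the program. This is just a sketch and it is Linux-specific
(it assumes the /proc filesystem and its VmRSS field exist on your
nodes):

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <mpi.h>

// Read this process' resident set size (in kB) from /proc/self/status.
long rss_kb()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line))
        if (line.compare(0, 6, "VmRSS:") == 0)
            return std::atol(line.c_str() + 6);  // value is given in kB
    return -1;  // field not found (non-Linux system?)
}

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    std::cout << "Rank " << rank << " resident memory: "
              << rss_kb() / 1024.0 << " MB\n";

    MPI_Finalize();
    return 0;
}

You could call something like rss_kb() from inside your time loop to
see whether the memory use keeps growing, which would point to a leak.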

Another possibility is a memory leak, which may be in your program,
or (less likely) in MPI.
Note, however, that Open MPI 1.3.0 and 1.3.1 did have a memory leak
(with InfiniBand only), which was fixed in 1.3.2:

http://www.open-mpi.org/community/lists/announce/2009/04/0030.php
https://svn.open-mpi.org/trac/ompi/ticket/1853

If you are using 1.3.0 or 1.3.1, upgrade to 1.3.2.
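
You can see which version is installed with "ompi_info" on the
command line. To check which version your code is actually built
against, one way (if I remember correctly, Open MPI's mpi.h defines
these macros) is:

#include <iostream>
#include <mpi.h>

int main()
{
#ifdef OMPI_MAJOR_VERSION
    // OMPI_*_VERSION are Open MPI-specific compile-time macros.
    std::cout << "Built against Open MPI " << OMPI_MAJOR_VERSION << "."
              << OMPI_MINOR_VERSION << "." << OMPI_RELEASE_VERSION << "\n";
#else
    std::cout << "Not built against Open MPI headers.\n";
#endif
    return 0;
}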

I hope this helps.

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Danesh Daroui wrote:
Dear all,

I am not sure if this is the right forum to ask this question, so sorry
if I am wrong. I am using ScaLAPACK in my code, and of course MPI
(Open MPI), in an electromagnetic solver program running on a cluster.
I see very strange behavior when I use a large number of processors to
run my code on very large problems. In these cases the computation
itself finishes successfully, but the program then hangs until the wall
time exceeds the limit and the job is terminated by the queue manager
(I use qsub to submit jobs). This happens when, for example, I use more
than 80 processors for a problem which needs more than 700 GB of
memory. For smaller problems everything is OK and all output files are
generated correctly, whereas when this happens the output files are
empty. I am almost sure that there is a synchronization problem and
some processes fail to reach the finalization point while others are
done.

My code is written in C++, and in the "main" function I call a routine
called "Solver". My "Solver" function looks like this:

void Solver()
{
        for (std::vector<double>::iterator ti = times.begin();
             ti != times.end(); ++ti)
        {
                Stopwatch iwatch, dwatch, twatch;

                // some ScaLAPACK operations

                if (iamroot())
                {
                        // some operations only for the root process
                }
        }

        blacs::gridexit(ictxt);
        blacs::exit(1);
}

and my "main" function which calls "Solver" looks like below:


int main()
{
        // some preparing operations

        Solver();

        if (rank == 0)
                std::cout << "Total execution time: "
                          << time.tick() << " s\n" << std::flush;

        err = MPI_Finalize();

        if (MPI_SUCCESS != err)
        {
                std::cerr << "MPI_Finalize failed: " << err << "\n";
                return err;
        }

        return 0;
}

I did put a "blacs::barrier(ictxt, 'A')" at the end of the "Solver"
routine, before calling "blacs::exit(1)", to make sure that all
processes arrive there before MPI_Finalize, but that didn't solve the
problem. Do you have any idea where the problem is?
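
Just to be explicit, with that change the end of my "Solver" routine
looks roughly like this:

        // try to synchronize all processes before tearing down BLACS
        blacs::barrier(ictxt, 'A');   // 'A' = all processes in the grid

        blacs::gridexit(ictxt);       // release the process grid
        blacs::exit(1);               // leave MPI running; MPI_Finalize is called in main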

Thanks in advance,


