On Apr 22, 2010, at 10:08 AM, Rainer Keller wrote:

Hello Oliver,
thanks for the update.

Just my $0.02: the upcoming Open MPI v1.5 will warn users if their session
directory is on NFS (or Lustre).

... or panfs :-)

Samuel K. Gutierrez


Best regards,
Rainer


On Thursday 22 April 2010 11:37:48 am Oliver Geisler wrote:
To sum up and give an update:

The extended communication times when using shared memory communication
between Open MPI processes are caused by the Open MPI session directory
residing on a network filesystem (NFS).

The problem is resolved by setting up a ramdisk or mounting a tmpfs on
each diskless node. Setting the MCA parameter orte_tmpdir_base to point
to the corresponding mount point keeps the shared memory communication
files local, which decreases the communication times by orders of
magnitude.
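
As a rough illustration of the fix described above, the following Python
sketch checks which filesystem type backs a candidate session directory
and builds an mpirun command line that sets orte_tmpdir_base. The
directory /dev/shm, the benchmark name ./bench, and the helper names
fs_type_for and mpirun_command are assumptions made for this example;
they do not come from the thread. The mount table is expected in the
/proc/mounts format.

```python
import os

# Filesystem types named in this thread as problematic for the session directory,
# plus "nfs4" (assumed here as the usual type name for NFSv4 mounts).
NETWORK_FS_TYPES = {"nfs", "nfs4", "lustre", "panfs"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path.

    mounts_text is text in /proc/mounts format:
    "<device> <mount point> <fs type> <options> ...", one mount per line.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = os.path.normpath(fields[1]), fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest match; on ties the later entry wins,
            # because a later mount shadows an earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def mpirun_command(tmpdir, nprocs, program):
    """Build an mpirun argument list that places the session directory under tmpdir."""
    return ["mpirun", "--mca", "orte_tmpdir_base", tmpdir,
            "-np", str(nprocs), program]
```

On a node, one would read /proc/mounts, pass its contents to
fs_type_for("/dev/shm", ...), and only use the directory if the result
is not in NETWORK_FS_TYPES. The list returned by
mpirun_command("/dev/shm", 4, "./bench") can then be handed to
subprocess.run.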

The relation of the problem to the kernel version is not really
resolved, but the kernel may not be "the problem" in this respect.
My benchmark is now running fine on a single node with 4 CPUs, kernel
2.6.33.1 and Open MPI 1.4.1.
When running on multiple nodes I still see higher (TCP) communication
times than I would expect. But that requires some deeper investigation
on my part (e.g. collisions on the network) and should probably be
posted to a new thread.

Thank you guys for your help.

oli


--
------------------------------------------------------------------------
Rainer Keller, PhD                  Tel: +1 (865) 241-6293
Oak Ridge National Lab          Fax: +1 (865) 241-4811
PO Box 2008 MS 6164           Email: kel...@ornl.gov
Oak Ridge, TN 37831-2008    AIM/Skype: rusraink

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
