Oliver,

Thank you for this summary. The finding substantially affects the structural design of software implementations, and it points to a new analysis "opportunity" in our software.
Ken Lloyd

-----Original Message-----
From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On Behalf Of Oliver Geisler
Sent: Thursday, April 22, 2010 9:38 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times

To sum up and give an update:

The extended communication times seen with shared-memory communication between Open MPI processes are caused by the Open MPI session directory lying on the network via NFS. The problem is resolved by setting up a ramdisk or mounting a tmpfs on each diskless node. Setting the MCA parameter orte_tmpdir_base to point to the corresponding mount point keeps shared-memory communication and its files local, which decreases the communication times by orders of magnitude.

The relation of the problem to the kernel version is not really resolved, but the kernel may not be "the problem" in this respect.

My benchmark is now running fine on a single node with 4 CPUs, kernel 2.6.33.1 and Open MPI 1.4.1.

Running on multiple nodes, I still see higher (TCP) communication times than I would expect. But that requires some deeper research into the issue (e.g. collisions on the network) and should probably be posted to a new thread.

Thank you guys for your help.

oli
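
For anyone who wants to check a node for the same condition, here is a rough Python sketch of the fix described above. It looks up which filesystem a candidate session directory lives on (using text in the format of /proc/mounts) and builds an mpirun command line that passes orte_tmpdir_base through the --mca option. The helper names, the set of filesystem types treated as "network", the /dev/shm mount point and the ./bench program name are all assumptions made for illustration; they do not come from the original thread.

```python
import os

# Filesystem types treated as "network" here. This set is an illustrative
# assumption, not an exhaustive or authoritative list.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "lustre", "glusterfs"}


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later entry for the same mount point win,
            # because later mounts are stacked on top of earlier ones.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_network_fs(path, mounts_text):
    """True if path sits on one of the assumed network filesystem types."""
    return fs_type_for(path, mounts_text) in NETWORK_FS_TYPES


def build_mpirun_cmd(tmpdir, nprocs, program):
    """Build an mpirun argument list that sets orte_tmpdir_base via --mca."""
    return ["mpirun", "--mca", "orte_tmpdir_base", tmpdir,
            "-np", str(nprocs), program]


if __name__ == "__main__":
    # Hypothetical values: adjust the mount point and program to your setup.
    local_tmp = "/dev/shm"
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            mounts = handle.read()
        for candidate in ("/tmp", local_tmp):
            print(candidate, "->", fs_type_for(candidate, mounts))
    print(" ".join(build_mpirun_cmd(local_tmp, 4, "./bench")))
```

The same parameter can also be supplied through the environment as OMPI_MCA_orte_tmpdir_base, which may be more convenient on diskless nodes where every job should use the local mount.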