I appreciate the input and have captured it in the ticket. Since this appears to be a NUMA-related issue, the lack of NUMA support in your setup makes the test difficult to interpret.
I agree, though, that this is likely something peculiar to our particular setup. Of primary concern is that it might be related to the relatively old kernel (2.6.18) on these machines. There has been a lot of change since that kernel was released, and some of those changes may be relevant to this problem.

Unfortunately, upgrading the kernel will take persuasive argument. We are going to try and run the reproducers on machines with more modern kernels to see if we get different behavior.

Please feel free to follow this further on the ticket.

Thanks again!
Ralph

On Wed, Jun 10, 2009 at 11:29 AM, Bogdan Costescu <bogdan.coste...@iwr.uni-heidelberg.de> wrote:

> On Wed, 10 Jun 2009, Ralph Castain wrote:
>
>> Meantime, I have filed a bunch of data on this in ticket #1944, so perhaps
>> you might take a glance at that and offer some thoughts?
>>
>> https://svn.open-mpi.org/trac/ompi/ticket/1944
>
> I wasn't able to reproduce this. I have run with the following setup:
> - OS is Scientific Linux 5.1 with a custom compiled kernel based on
>   2.6.22.19, but (due to circumstances that I can't control):
>
>   checking if MCA component maffinity:libnuma can compile... no
>
> - Intel compiler 10.1
> - OpenMPI 1.3.2
> - nodes have 2 CPUs of type E5440 (quad core), 16GB RAM and a ConnectX IB DDR
>
> I've used the platform file that you have provided, but took out the
> references to PanFS and fixed the paths. I've also used the MCA file that
> you have provided.
>
> I have run with nodes=1:ppn=8 and nodes=2:ppn=8 and the test finished
> successfully with m=50 several times. This, together with the earlier post
> also describing a negative result, points to a problem related to your
> particular setup...
>
> --
> Bogdan Costescu
>
> IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
> Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
> E-mail: bogdan.coste...@iwr.uni-heidelberg.de