[O-MPI devel] bug in iof/orted

2005-11-29 Thread George Bosilca
I found something strange in iof and/or orte. If we run an application and it does not finish correctly (one of the processes segfaults), or it leaks some orted processes on the nodes, all subsequent MPI runs will not have any output. The output just disappears somehow ... It started happening only today, …

Re: [O-MPI devel] TCP performance

2005-11-29 Thread Tim S. Woodall
George Bosilca wrote: Tim, It looks a little bit better. Here are the latencies for 1- to 4-byte messages as well as for the maximum length in Netpipe (8 MB). old ob1: 0: 1 bytes 694 times --> 0.06 Mbps in 137.54 usec; 1: 2 bytes 727 times --> 0.11 Mbps i…

Re: [O-MPI devel] TCP performance

2005-11-29 Thread George Bosilca
Tim, It looks a little bit better. Here are the latencies for 1- to 4-byte messages as well as for the maximum length in Netpipe (8 MB). old ob1: 0: 1 bytes 694 times --> 0.06 Mbps in 137.54 usec; 1: 2 bytes 727 times --> 0.11 Mbps in 140.54 usec; 2:…
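For readers decoding NetPIPE's columns: the Mbps figure is simply the message size in bits divided by the reported transfer time, so for 1-byte messages it is really the latency restated. A tiny sketch of that arithmetic (not NetPIPE's code):

    #include <stdio.h>

    /* Reproduce NetPIPE's throughput column from size and time:
     * bits per microsecond is the same unit as megabits per second. */
    double mbps(double bytes, double usec)
    {
        return (bytes * 8.0) / usec;
    }

    int main(void)
    {
        /* 1 byte in 137.54 usec -> ~0.06 Mbps, 2 bytes in 140.54 usec
         * -> ~0.11 Mbps, matching the numbers quoted above. */
        printf("%.2f Mbps\n", mbps(1.0, 137.54));
        printf("%.2f Mbps\n", mbps(2.0, 140.54));
        return 0;
    }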

Re: [O-MPI devel] Linux processor affinity

2005-11-29 Thread Bogdan Costescu
On Tue, 29 Nov 2005, Jeff Squyres wrote: Here's the problem: there are 3 different APIs for processor affinity in Linux. Could you please list them (at least the ones that you know about)? In the kernel source, in kernel/sched.c, the sys_sched_setaffinity function appears only in 2.6.0 (tal…
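For context, the affinity call that ended up in later glibc releases takes a cpu_set_t plus an explicit size. A minimal, self-contained example of binding the current process with that form is sketched below; this is not Open MPI code, and the other prototypes Jeff alludes to are only summarized in the comment, since the preview above cuts off before he lists them:

    #define _GNU_SOURCE
    #include <sched.h>   /* cpu_set_t, CPU_ZERO, CPU_SET, sched_setaffinity */
    #include <stdio.h>

    /* Pin the calling process to CPU 0 using the cpu_set_t prototype:
     *   int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask);
     * One of the older variants discussed later in this thread instead took an
     * unsigned long bitmask array plus a length; the exact set of three
     * prototypes Jeff refers to is not reproduced here.                        */
    int main(void)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        CPU_SET(0, &mask);

        if (sched_setaffinity(0 /* self */, sizeof(mask), &mask) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        puts("bound to CPU 0");
        return 0;
    }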

Re: [O-MPI devel] Linux processor affinity

2005-11-29 Thread Paul H. Hargrove
Eureka! Operationally the 3-argument variants are ALMOST identical. The older version required len == sizeof(long), while the later version allowed the len to vary (so an Altix could have more than 64 CPUs). However, in the kernel both effectively treat the 3rd argument as an array of unsig…
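Paul's description can be illustrated by invoking the syscall directly, which sidesteps whichever prototype a given glibc happens to export. This is only a sketch of the kernel-level behaviour he describes, not Open MPI's affinity code:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <sys/syscall.h>  /* SYS_sched_setaffinity */
    #include <unistd.h>       /* syscall(), getpid()   */

    /* The kernel sees the third argument as an array of unsigned long:
     *   - the older kernels Paul mentions insisted on len == sizeof(long)
     *   - later kernels accept a longer array, so machines with more CPUs
     *     than bits-per-long (e.g. a large Altix) can be described.        */
    int main(void)
    {
        unsigned long mask[2] = { 0x1UL, 0x0UL };   /* bit 0 => CPU 0 */

        /* Single-word call: valid against either kernel behaviour. */
        if (syscall(SYS_sched_setaffinity, getpid(),
                    sizeof(unsigned long), mask) < 0) {
            perror("sched_setaffinity (1 word)");
        }

        /* Two-word call: rejected (EINVAL) by the older kernels described
         * above, accepted by the later ones. */
        if (syscall(SYS_sched_setaffinity, getpid(),
                    2 * sizeof(unsigned long), mask) < 0 && errno == EINVAL) {
            fprintf(stderr, "kernel wants len == sizeof(unsigned long)\n");
        }
        return 0;
    }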

Re: [O-MPI devel] MPI_Probe_tag_c mvapi hand

2005-11-29 Thread Galen M. Shipman
I can replicate this on thor with the trunk; this looks like a multi-NIC issue, as we pass the test when I restrict Open MPI to use a single IB NIC. I will dig into this further, but should we consider the priority of multi-NIC for the 1.0.1 release? Thanks, Galen On Nov 28, 2005, at 7:…

Re: [O-MPI devel] Linux processor affinity

2005-11-29 Thread Paul H. Hargrove
Jeff, et al., My own "research" into processor affinity for the GASNet runtime began by "borrowing" the related autoconf code from OpenMPI. My experience is the same as Jeff's when it comes to looking for a correlation between the API and any system parameter such as libc or kernel version
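Since neither the libc version nor the kernel version predicts which prototype is in scope, the detection in practice boils down to a compile test per candidate signature. Below is a sketch of the kind of probe such autoconf code drives; the actual Open MPI/GASNet configure macros are not reproduced here:

    /* Configure-time probe (illustrative only): redeclare sched_setaffinity
     * with one candidate prototype.  If the system header declares a
     * different prototype, the compiler reports a conflicting declaration
     * and the probe fails, so configure moves on to the next candidate. */
    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/types.h>
    #include <sched.h>

    /* Candidate: the cpu_set_t-plus-size form.  Sibling probes would
     * substitute the unsigned-long-array form, etc. */
    int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask);

    int main(void)
    {
        return 0;
    }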

Re: [O-MPI devel] TCP performance

2005-11-29 Thread Tim S. Woodall
George, Can you try out the changes I just committed on the trunk? We were doing more select/recvs than necessary. Thanks, Tim George Bosilca wrote: I ran Netpipe on 4 different clusters with different OSes and Ethernet devices. The result is that nearly the same behaviour happens all the ti…
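The general pattern behind a fix like this, sketched below purely for illustration (this is not the actual ob1/TCP BTL code), is to drain a readable socket with repeated non-blocking recv() calls instead of returning to select() after every partial read, which costs an extra syscall and event-loop pass per fragment:

    #include <errno.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Once select() has reported the socket readable, keep reading until the
     * expected bytes have arrived or the kernel says EAGAIN.  Returns true
     * when the caller should stop (complete, closed, or error) and false when
     * the socket is drained and select() should be re-armed. */
    bool drain_recv(int fd, char *buf, size_t want, size_t *have)
    {
        while (*have < want) {
            ssize_t n = recv(fd, buf + *have, want - *have, MSG_DONTWAIT);
            if (n > 0) {
                *have += (size_t)n;
            } else if (n == 0) {
                return true;                 /* peer closed; let caller decide */
            } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
                return false;                /* drained; go back to select()   */
            } else if (errno != EINTR) {
                return true;                 /* real error; let caller handle  */
            }
        }
        return true;                         /* message complete */
    }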

[O-MPI devel] Linux processor affinity

2005-11-29 Thread Jeff Squyres
Greetings all. I'm writing this to ask for help from the general development community. We've run into a problem with Linux processor affinity, and although I've individually talked to a lot of people about this, no one has been able to come up with a solution. So I thought I'd open this to

[O-MPI devel] TCP performance

2005-11-29 Thread George Bosilca
I ran Netpipe on 4 different clusters with different OSes and Ethernet devices. The result is that nearly the same behaviour happens all the time for small messages. Basically, our latency is really bad. Attached are 2 of the graphs: one on a Mac OS X cluster (wotan) and a Linux 2.6.10 32-bit on…
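For anyone who wants to reproduce the small-message numbers without NetPIPE itself, a bare-bones MPI ping-pong gives a comparable one-way latency estimate (a sketch, not NetPIPE; "pingpong" is just a placeholder binary name):

    #include <mpi.h>
    #include <stdio.h>

    /* Minimal ping-pong latency test in the spirit of NetPIPE's
     * small-message measurements.  Run with two ranks, e.g.:
     *   mpirun -np 2 ./pingpong                                 */
    int main(int argc, char **argv)
    {
        const int iters = 1000;
        char byte = 0;
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0) {
            /* Half the round-trip time is the usual one-way latency estimate. */
            printf("1-byte latency: %.2f usec\n",
                   (t1 - t0) / (2.0 * iters) * 1e6);
        }
        MPI_Finalize();
        return 0;
    }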

[O-MPI devel] 1.0.1rc4 is up

2005-11-29 Thread Jeff Squyres
rc4 is up: http://www.open-mpi.org/software/v1.0/ Here's the NEWS: - Fix so that Open MPI correctly handles the Fortran value for .TRUE., regardless of what the Fortran compiler's value for .TRUE. is. - Improved scalability of MX startup. - Fix datatype offset handling in the coll bas