I found something strange in iof and/or orte. If we run an
application and it does not finish correctly (one of the processes
segfaults), or it leaks some orted processes on the nodes, all
subsequent MPI runs will not have any output. The output just
disappears somehow... It started happening only today,
George Bosilca wrote:
Tim,
It looks a little bit better. Here are the latencies for 1- to 4-byte
messages, as well as for the maximum message length in Netpipe (8 MB).
old ob1:
0:   1 bytes   694 times --> 0.06 Mbps in 137.54 usec
1:   2 bytes   727 times --> 0.11 Mbps in 140.54 usec
2:
On Tue, 29 Nov 2005, Jeff Squyres wrote:
Here's the problem: there are 3 different APIs for processor affinity
in Linux.
Could you please list them (at least the ones that you know about)?
In the kernel source, in kernel/sched.c, the sys_sched_setaffinity
function appears only in 2.6.0 (tal
Eureka!
Operationally the 3-argument variants are ALMOST identical. The older
version required len == sizeof(long), while the later version allowed
the len to vary (so an Altix could have more than 64 cpus). However, in
the kernel both effectively treat the 3rd argument as an array of
unsig
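To make the len handling described above concrete, here is a minimal
sketch (not Open MPI's or PLPA's actual code; the function name and
buffer size are illustrative) that binds the calling process to CPU 0
through the raw syscall, trying the variable-length form first and
falling back to the single-word form the older kernels demand:

  #define _GNU_SOURCE
  #include <errno.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  /* Illustrative only: the mask is an array of unsigned long, exactly as
   * the kernel interprets the 3rd argument. */
  int bind_self_to_cpu0(void)
  {
      unsigned long mask[16];          /* room for sizeof(mask) * 8 CPUs */
      memset(mask, 0, sizeof(mask));
      mask[0] = 1UL;                   /* bit 0 == CPU 0 */

      /* Later kernels accept a len larger than sizeof(unsigned long). */
      if (syscall(SYS_sched_setaffinity, 0, sizeof(mask), mask) == 0)
          return 0;

      /* Older kernels require len == sizeof(unsigned long) and reject
       * anything else with EINVAL, so retry with the single-word form. */
      if (errno == EINVAL &&
          syscall(SYS_sched_setaffinity, 0, sizeof(unsigned long), mask) == 0)
          return 0;

      perror("sched_setaffinity");
      return -1;
  }

A real probe would also have to cope with the third affinity API
mentioned earlier in the thread; this sketch deliberately ignores it.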
I can replicate this on thor with the trunk; it looks like a multi-NIC
issue, as we pass the test when I restrict Open MPI to a single IB
NIC. I will dig into this further, but should we consider the
priority of multi-NIC support for the 1.0.1 release?
Thanks,
Galen
On Nov 28, 2005, at 7:
Jeff, et al.,
My own "research" into processor affinity for the GASNet runtime
began by "borrowing" the related autoconf code from OpenMPI. My
experience is the same as Jeff's when it comes to looking for a
correlation between the API and any system parameter such as libc or
kernel version
George,
Can you try out the changes I just committed on the trunk? We were doing
more selects/recvs than necessary.
Thanks,
Tim
Greetings all. I'm writing this to ask for help from the general
development community. We've run into a problem with Linux processor
affinity, and although I've individually talked to a lot of people
about this, no one has been able to come up with a solution. So I
thought I'd open this to
I ran Netpipe on 4 different clusters with different OSes and Ethernet
devices. The result is that nearly the same behaviour happens all the
time for small messages. Basically, our latency is really bad. Attached
are 2 of the graphs, on one Mac OS X cluster (wotan) and a Linux 2.6.10
32 bits on
rc4 is up:
http://www.open-mpi.org/software/v1.0/
Here's the NEWS:
- Fix so that Open MPI correctly handles the Fortran value for .TRUE.,
regardless of what the Fortran compiler's value for .TRUE. is.
- Improved scalability of MX startup.
- Fix datatype offset handling in the coll bas
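On the .TRUE. item above: Fortran compilers do not agree on the bit
pattern of LOGICAL .TRUE. (commonly 1, but other nonzero values such as
-1 occur), while .FALSE. is essentially always 0. The sketch below only
illustrates that portability point and is not Open MPI's actual
conversion code; OMPI_FORTRAN_VALUE_TRUE and ompi_fortran_logical_t are
stand-in names for whatever a configure probe and the compiler's
LOGICAL size would provide.

  /* Illustration only -- not Open MPI's implementation.  The macro and
   * typedef are hypothetical stand-ins for configure-detected values. */
  #define OMPI_FORTRAN_VALUE_TRUE 1     /* assumption: value configure found */
  typedef int ompi_fortran_logical_t;   /* assumption: matches LOGICAL size */

  /* C -> Fortran: emit the compiler's own notion of .TRUE. */
  static inline ompi_fortran_logical_t c_to_logical(int c)
  {
      return c ? OMPI_FORTRAN_VALUE_TRUE : 0;
  }

  /* Fortran -> C: treat any nonzero pattern as true instead of
   * assuming .TRUE. == 1. */
  static inline int logical_to_c(ompi_fortran_logical_t f)
  {
      return f != 0;
  }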