Here's a clue: ompi_coll_tuned_gather_intra_dec_fixed() switches to a
binomial algorithm for job sizes > 60. I changed the threshold to 100
and my NP64 jobs run fine. Now to try and understand what about
ompi_coll_tuned_gather_intra_binomial() is causing these connect
delays...
Oops. One key typo here: This is the IMB-MPI1 gather test, not
barrier. :(
On 9/16/2010 12:05 PM, Steve Wise wrote:
Hi,
I'm debugging a performance problem with running IMB-MPI1/barrier in an
NP64 cluster (8 nodes, 8 cores each). I'm using openmpi-1.4.1 from the
OFED-1.5.1 distribution. The BTL is openib/iWARP via Chelsio's T3
RNIC. In short, an NP60 and smaller run completes in a timely manner as
expect