On Aug 29, 2011, at 3:51 AM, Xin He wrote:

>> -----
>> $ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 hostname
>> svbu-mpi008
>> svbu-mpi009
>> $ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 IMB-MPI1 PingPong
>> #---------------------------------------------------
>> #    Intel (R) MPI Benchmark Suite V3.2, MPI-1 part
>> #---------------------------------------------------
>> 
> Hi, I think these models are reasonably new :)
> The results I gave you were measured with 2 processes, but on 2 different 
> servers. Am I right that the result you showed is 2 processes on one machine?

Nope -- check my output -- I'm running across 2 different servers and through a 
1Gb top-of-rack Ethernet switch (it's not a particularly high-performance 
Ethernet switch, either).

Can you run some native NetPIPE TCP numbers across the same nodes that you ran 
the TIPC MPI tests over?  You should be getting lower latency than what you're 
currently seeing.

Do you have jumbo frames enabled, perchance?  Are you going through only 1 
switch?  If you're on a NUMA server, do you have processor affinity enabled, 
and have the processes located "near" the NIC?

> BTW, I forgot to tell you about SM & TIPC. Unfortunately, TIPC does not beat 
> SM...

That's probably not surprising; SM is tuned pretty well specifically for MPI 
communication across shared memory.
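
If you want a direct on-node comparison, something like the following should isolate the two BTLs (assuming your TIPC component is registered under the name "tipc"):

# Both processes on one node, shared-memory BTL:
$ mpirun -np 2 --mca btl sm,self IMB-MPI1 PingPong

# Same node, forcing the TIPC BTL instead:
$ mpirun -np 2 --mca btl tipc,self IMB-MPI1 PingPong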

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
