On Aug 25, 2011, at 8:25 AM, Xin He wrote:

>> Can you edit your configure.m4 directly and test it and whatnot?  I provided 
>> the configure.m4 as a starting point for you.  :-)  It shouldn't be hard to 
>> make it check linux/tipc.h instead of tipc.h.  I'm happy to give you direct 
>> write access to the bitbucket, if you want it.
> I think me having write access is convenient for both of us :)

Sure -- what's your bitbucket account ID?

>> As we've discussed off-list, we can't take the code upstream until the 
>> contributor agreement is signed, unfortunately.
>> 
> The agreement thing is ongoing right now, but it may take some time.

No worries.  Lawyers tend to take time when reviewing this stuff; we've seen 
the same pattern in most organizations that sign the OMPI agreement.

> But to save time, can you guys do some tests on the TIPC BTL, so that
> when the agreement is ready, the code can be used?

I don't know if any of us has the TIPC support libraries installed.

So... what *is* TIPC?  Is there a writeup anywhere that we can read about what 
it is / how it works?  For example, what makes TIPC perform better than TCP?

>>> I have done some tests using tools like NetPIPE, OSU and IMB, and the results 
>>> show that the TIPC BTL has better performance than the TCP BTL.
>> Great!  Can you share any results?
> Yes, please check the appendix for the results using IMB 3.2.
> 
> I have done the tests on 2 computers. Dell SC1435
> Dual-Core AMD Opteron(tm) Processor 2212 HE x 2
> 4 GB Mem
> Ubuntu Server 10.04 LTS 32-bit Linux 2.6.32-24

I'm not familiar with the Dell or Opteron lines -- how recent are those models?

I ask because your TCP latency is a bit high (about 85us in the 2-process IMB 
PingPong), which might suggest older hardware.  Or you may have built a debug 
version of Open MPI (that's the default if you built from a .svn or .hg 
checkout).  See the top-level HACKING file for how to get an optimized build.
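
For reference, here's a rough sketch of re-configuring without the debugging 
options; HACKING has the authoritative list of flags, and the install prefix 
below is just an example:

-----
# Sketch only -- see the HACKING file for the exact steps for your tree.
# Developer (.svn/.hg) checkouts default to --enable-debug, so turn the
# debugging options off explicitly when configuring:
$ ./configure --disable-debug --disable-mem-debug --disable-mem-profile \
      --prefix=/opt/openmpi
$ make all install
-----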

For example, with my debug build of Open MPI on fairly old Xeons with 1Gb 
ethernet, I'm getting the following PingPong results (again: this is a debug 
build, not an optimized one):

-----
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 \
    hostname
svbu-mpi008
svbu-mpi009
$ mpirun --mca btl tcp,self --bynode -np 2 --mca btl_tcp_if_include eth0 \
    IMB-MPI1 PingPong
#---------------------------------------------------
#    Intel (R) MPI Benchmark Suite V3.2, MPI-1 part    
#---------------------------------------------------
...
#---------------------------------------------------
# Benchmarking PingPong 
# #processes = 2 
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        57.31         0.00
            1         1000        57.71         0.02
            2         1000        57.73         0.03
            4         1000        57.81         0.07
            8         1000        57.78         0.13
-----

An optimized build shaves off only a few microseconds (which isn't too 
important in this case, but it does matter for the low-latency transports):

-----
#---------------------------------------------------
# Benchmarking PingPong 
# #processes = 2 
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        54.62         0.00
            1         1000        54.92         0.02
            2         1000        55.15         0.03
            4         1000        55.16         0.07
            8         1000        55.15         0.14
-----

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

