Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 03:59:28PM -0400, Richard Graham wrote: > On 8/13/07 3:52 PM, "Gleb Natapov" wrote: > > On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > > Here are the items we have identified: > > All those things sound very promising. Is there a tmp

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Richard Graham
On 8/13/07 3:52 PM, "Gleb Natapov" wrote: > On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > Here are the items we have identified: > All those things sound very promising. Is there a tmp branch where you are going to work on this? > > tmp/latency Some changes have alr

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > Here are the items we have identified: All those things sound very promising. Is there a tmp branch where you are going to work on this? > 1)

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Richard Graham
On 8/13/07 12:34 PM, "Galen Shipman" wrote: > OK, here are the numbers on my machines: 0 bytes: mvapich with header caching: 1.56; mvapich without header caching: 1.79; ompi 1.2: 1.59. So on zero bytes ompi is not so bad. Also we can see that header cachin

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 11:12 AM, Galen Shipman wrote: 1) remove 0 byte optimization of not initializing the convertor. This costs us an "if" in MCA_PML_BASE_SEND_REQUEST_INIT and an "if" in mca_pml_ob1_send_request_start_copy +++ Measure the convertor initialization before taking any other act

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 11:28 AM, George Bosilca wrote: Such a scheme is certainly possible, but I see even less use for it than use cases for the existing microbenchmarks. Specifically, header caching *can* happen in real applications (i.e., repeatedly send short messages with the same MPI signatu

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
Brian Barrett wrote: On Aug 13, 2007, at 9:33 AM, George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? OK, here are the numbers on my machines: 0 b

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Galen Shipman
OK, here are the numbers on my machines: 0 bytes: mvapich with header caching: 1.56; mvapich without header caching: 1.79; ompi 1.2: 1.59. So on zero bytes ompi is not so bad. Also we can see that header caching decreases the mvapich latency by 0.23. 1 byte: mvapich with header caching: 1.58; mvapich

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? OK, here are the numbers on my machines: 0 bytes: mvapich with header caching: 1.56; mvapich with

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Brian Barrett
On Aug 13, 2007, at 9:33 AM, George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? OK, here are the numbers on my machines: 0 bytes: mvapich with head

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? OK, here are the numbers on my machines: 0 bytes: mvapich with header caching: 1.56; mvapich without header caching: 1.

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
On Aug 13, 2007, at 11:07 AM, Jeff Squyres wrote: Such a scheme is certainly possible, but I see even less use for it than use cases for the existing microbenchmarks. Specifically, header caching *can* happen in real applications (i.e., repeatedly send short messages with the same MPI signatur

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? OK, here are the numbers on my machines: 0 bytes: mvapich with header caching: 1.56; mvapich without header caching: 1.79; ompi 1.2: 1.59. So on zero bytes ompi is not so bad. Also

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Galen Shipman
I think we need to take a step back from micro-optimizations such as header caching. Rich, George, Brian and I are currently looking into latency improvements. We came up with several areas of performance enhancements that can be done with minimal disruption. The progress issue that Chr

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
We're working on it. Give us a few weeks to finish implementing all the planned optimizations/cleanups in the PML and then we can talk about tricks. We're expecting/hoping to slim down the PML layer by more than 0.5 so this header caching optimization might not make any sense at that point.

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 10:49 AM, George Bosilca wrote: You want a dirtier trick for benchmarks ... Here it is ... Implement a compression-like algorithm based on checksum. The data-type engine can compute a checksum for each fragment and if the checksum matches one in the peer [limited] history (s

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Christian Bell
On Sun, 12 Aug 2007, Gleb Natapov wrote: > > Any objections? We can discuss what approaches we want to take (there's going to be some complications because of the PML driver, etc.); perhaps in the Tuesday Mellanox teleconf...? > My main objection is that the only reason you propo

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
You want a dirtier trick for benchmarks ... Here it is ... Implement a compression-like algorithm based on checksum. The data-type engine can compute a checksum for each fragment and if the checksum matches one in the peer [limited] history (so we can claim our communication protocol is adap

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 10:36:19AM -0400, Jeff Squyres wrote: > On Aug 13, 2007, at 6:36 AM, Gleb Natapov wrote: > >> Pallas, Presta (as far as I know) also use static rank. So let's start to fix all "bogus" benchmarks :-) ? > > All benchmarks are bogus. I have a better optimization. Check a nam

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 10:34 AM, Jeff Squyres wrote: All this being said -- is there another reason to lower our latency? My main goal here is to lower the latency. If header caching is unattractive, then another method would be fine. Oops: s/reason/way/. That makes my sentence make much more s

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 6:36 AM, Gleb Natapov wrote: Pallas, Presta (as far as I know) also use static rank. So let's start to fix all "bogus" benchmarks :-) ? All benchmarks are bogus. I have a better optimization. Check the name of the executable, and if this is some known benchmark, send one byte instead of real

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 12, 2007, at 3:49 PM, Gleb Natapov wrote: - Mellanox tested MVAPICH with the header caching; latency was around 1.4us - Mellanox tested MVAPICH without the header caching; latency was around 1.9us As far as I remember Mellanox results and according to our testing difference between MVAP

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Scott Atchley
On Aug 13, 2007, at 4:06 AM, Pavel Shamis (Pasha) wrote: Any objections? We can discuss what approaches we want to take (there's going to be some complications because of the PML driver, etc.); perhaps in the Tuesday Mellanox teleconf...? My main objection is that the only reason you propose

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Terry D. Dontje
Jeff Squyres wrote: With Mellanox's new HCA (ConnectX), extremely low latencies are possible for short messages between two MPI processes. Currently, OMPI's latency is around 1.9us while all other MPI's (HP MPI, Intel MPI, MVAPICH[2], etc.) are around 1.4us. A big reason for this differ

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 11:06:00AM +0300, Pavel Shamis (Pasha) wrote: > >> Any objections? We can discuss what approaches we want to take (there's going to be some complications because of the PML driver, etc.); perhaps in the Tuesday Mellanox teleconf...? > >> >

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
Any objections? We can discuss what approaches we want to take (there's going to be some complications because of the PML driver, etc.); perhaps in the Tuesday Mellanox teleconf...? My main objection is that the only reason you propose to do this is some bogus benchmark? Palla

Re: [OMPI devel] openib btl header caching

2007-08-12 Thread Gleb Natapov
On Sat, Aug 11, 2007 at 09:55:18AM -0700, Jeff Squyres wrote: > With Mellanox's new HCA (ConnectX), extremely low latencies are possible for short messages between two MPI processes. Currently, OMPI's latency is around 1.9us while all other MPI's (HP MPI, Intel MPI, MVAPICH[2], etc.) a

[OMPI devel] openib btl header caching

2007-08-11 Thread Jeff Squyres
With Mellanox's new HCA (ConnectX), extremely low latencies are possible for short messages between two MPI processes. Currently, OMPI's latency is around 1.9us while all other MPI's (HP MPI, Intel MPI, MVAPICH[2], etc.) are around 1.4us. A big reason for this difference is that, at least