Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Dirk Eddelbuettel
(adding pkg-openmpi-maintain...@lists.alioth.debian.org which I should have added earlier, sorry! --Dirk) On 14 August 2007 at 00:08, Adrian Knoth wrote: | On Mon, Aug 13, 2007 at 04:26:31PM -0500, Dirk Eddelbuettel wrote: | | > > I'll now compile the 1.2.3 release tarball and see if I can repro

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Mon, Aug 13, 2007 at 04:26:31PM -0500, Dirk Eddelbuettel wrote: > > I'll now compile the 1.2.3 release tarball and see if I can reproduce The 1.2.3 release also works fine: adi@debian:~$ ./ompi123/bin/mpirun -np 2 ring 0: sending message (0) to 1 0: sent message 1: waiting for message 1: got

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Dirk Eddelbuettel
Adrian, On 13 August 2007 at 22:28, Adrian Knoth wrote: | On Thu, Aug 02, 2007 at 10:51:13AM +0200, Adrian Knoth wrote: | | > > We (as in the Debian maintainer for Open MPI) got this bug report from | > > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD | > > kernel. | > > Any i

Re: [OMPI devel] Collectives interface change

2007-08-13 Thread Li-Ta Lo
On Thu, 2007-08-09 at 14:49 -0600, Brian Barrett wrote: > Hi all - > > There was significant discussion this week at the collectives meeting > about improving the selection logic for collective components. While > we'd like the automated collectives selection logic laid out in the > Collv2

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 4:28 PM, Adrian Knoth wrote: I'll now compile the 1.2.3 release tarball and see if I can reproduce the segfaults. On the other hand, I guess nobody is using OMPI on GNU/kFreeBSD, so upgrading the openmpi-package to a subversion snapshot would also fix the problem (think of

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Thu, Aug 02, 2007 at 10:51:13AM +0200, Adrian Knoth wrote: > > We (as in the Debian maintainer for Open MPI) got this bug report from > > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD > > kernel. > > Any input would be greatly appreciated! > I'll follow the QEMU instructions

Re: [OMPI devel] Problem in mpool rdma finalize

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 4:04 PM, Gleb Natapov wrote: mpool rdma finalize was an empty function. I changed it to do the "finalize" job - go over all registered segments in the mpool and release them one by one. The mpool uses a reference counter for each memory region and it prevents us from a double free bug. I
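The reference-counting scheme described in this thread can be sketched roughly as follows. This is an illustrative sketch, not the actual Open MPI mpool code; the type and function names are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical sketch of reference-counted registrations: each registered
   region carries a refcount, and deregistration only releases the region
   when the count drops to zero, so a region referenced by both the BTL and
   the mpool is not freed twice during finalize. */

typedef struct {
    void *base;
    int   refcount;
    int   released;   /* 1 once the underlying registration is freed */
} mpool_registration_t;

static void registration_retain(mpool_registration_t *reg)
{
    reg->refcount++;
}

/* Returns 1 if this call actually released the registration, 0 otherwise.
   Real code would call the interconnect driver's unregister routine here. */
static int registration_release(mpool_registration_t *reg)
{
    if (--reg->refcount == 0 && !reg->released) {
        reg->released = 1;
        return 1;
    }
    return 0;
}
```

With this shape, whichever of the BTL finalize or the mpool finalize runs second simply drops the last reference, which is the double-free protection the message describes.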

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 03:59:28PM -0400, Richard Graham wrote: > > > > On 8/13/07 3:52 PM, "Gleb Natapov" wrote: > > > On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > > Here are the > > items we have identified: > > > All those things sounds very promising. Is there > > tmp

Re: [OMPI devel] Problem in mpool rdma finalize

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 05:00:37PM +0300, Pavel Shamis (Pasha) wrote: > Jeff Squyres wrote: > > FWIW: we fixed this recently in the openib BTL by ensuring that all > > registered memory is freed during the BTL finalize (vs. the mpool > > finalize). > > > > This is a new issue because the mpool

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Richard Graham
On 8/13/07 3:52 PM, "Gleb Natapov" wrote: > On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > Here are the > items we have identified: > All those things sounds very promising. Is there > tmp branch where you are going to work on this? > > tmp/latency Some changes have alr

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > Here are the items we have identified: > All those things sounds very promising. Is there tmp branch where you are going to work on this?

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Richard Graham
On 8/13/07 12:34 PM, "Galen Shipman" wrote: > Ok here is the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich without header caching: 1.79 ompi 1.2: 1.59 So on zero bytes ompi not so bad. Also we can see that header cachin

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 11:12 AM, Galen Shipman wrote: 1) remove 0 byte optimization of not initializing the convertor This costs us an "if" in MCA_PML_BASE_SEND_REQUEST_INIT and an "if" in mca_pml_ob1_send_request_start_copy +++ Measure the convertor initialization before taking any other act

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 11:28 AM, George Bosilca wrote: Such a scheme is certainly possible, but I see even less use for it than use cases for the existing microbenchmarks. Specifically, header caching *can* happen in real applications (i.e., repeatedly send short messages with the same MPI signatu

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
Brian Barrett wrote: On Aug 13, 2007, at 9:33 AM, George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here is the numbers on my machines: 0 b

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Galen Shipman
Ok here is the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich without header caching: 1.79 ompi 1.2: 1.59 So on zero bytes ompi not so bad. Also we can see that header caching decrease the mvapich latency on 0.23 1 bytes mvapich with header caching: 1.58 mvapich

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here is the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich with

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Brian Barrett
On Aug 13, 2007, at 9:33 AM, George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here is the numbers on my machines: 0 bytes mvapich with head

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here is the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich without header caching: 1.

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
On Aug 13, 2007, at 11:07 AM, Jeff Squyres wrote: Such a scheme is certainly possible, but I see even less use for it than use cases for the existing microbenchmarks. Specifically, header caching *can* happen in real applications (i.e., repeatedly send short messages with the same MPI signatur

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here is the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich without header caching: 1.79 ompi 1.2: 1.59 So on zero bytes ompi not so bad. Also

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Galen Shipman
I think we need to take a step back from micro-optimizations such as header caching. Rich, George, Brian and I are currently looking into latency improvements. We came up with several areas of performance enhancements that can be done with minimal disruption. The progress issue that Chr

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
We're working on it. Give us a few weeks to finish implementing all the planned optimizations/cleanups in the PML and then we can talk about tricks. We're expecting/hoping to slim down the PML layer by more than 0.5 so this header caching optimization might not make any sense at that point.

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 10:49 AM, George Bosilca wrote: You want a dirtier trick for benchmarks ... Here it is ... Implement a compression like algorithm based on checksum. The data- type engine can compute a checksum for each fragment and if the checksum match one in the peer [limitted] history (s

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Christian Bell
On Sun, 12 Aug 2007, Gleb Natapov wrote: > > Any objections? We can discuss what approaches we want to take > > (there's going to be some complications because of the PML driver, > > etc.); perhaps in the Tuesday Mellanox teleconf...? > > > My main objection is that the only reason you propo

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
You want a dirtier trick for benchmarks ... Here it is ... Implement a compression like algorithm based on checksum. The data- type engine can compute a checksum for each fragment and if the checksum match one in the peer [limitted] history (so we can claim our communication protocol is adap
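The (tongue-in-cheek) checksum-history trick described above could be sketched like this. Everything here is an assumption for illustration: the checksum choice, the history size, and all names are invented, not anything in Open MPI's data-type engine.

```c
#include <stdint.h>
#include <stddef.h>

#define HISTORY 16

/* Illustrative fragment checksum (FNV-1a); the real proposal would use
   whatever checksum the data-type engine already computes. */
static uint32_t frag_checksum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 2166136261u;
    for (size_t i = 0; i < len; i++)
        sum = (sum ^ buf[i]) * 16777619u;
    return sum;
}

/* Returns 1 if the checksum is already in the peer's limited history
   (so only the 4-byte checksum need go on the wire), 0 otherwise (send
   the full fragment and remember its checksum). */
static int history_check_or_add(uint32_t history[], int *count, uint32_t sum)
{
    for (int i = 0; i < *count; i++)
        if (history[i] == sum)
            return 1;
    if (*count < HISTORY)
        history[(*count)++] = sum;   /* real code would need an eviction policy */
    return 0;
}
```

As the message implies, this only "compresses" traffic that repeats byte-for-byte, which is exactly what ping-pong microbenchmarks do and real applications mostly do not; that is the joke.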

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 10:36:19AM -0400, Jeff Squyres wrote: > On Aug 13, 2007, at 6:36 AM, Gleb Natapov wrote: > > >> Pallas, Presta (as i know) also use static rank. So lets start to fix > >> all "bogus" benchmarks :-) ? > >> > > All benchmarks are bogus. I have better optimization. Check a nam

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 10:34 AM, Jeff Squyres wrote: All this being said -- is there another reason to lower our latency? My main goal here is to lower the latency. If header caching is unattractive, then another method would be fine. Oops: s/reason/way/. That makes my sentence make much more s

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 6:36 AM, Gleb Natapov wrote: Pallas, Presta (as i know) also use static rank. So lets start to fix all "bogus" benchmarks :-) ? All benchmarks are bogus. I have better optimization. Check a name of executable and if this is some known benchmark send one byte instead of real

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 12, 2007, at 3:49 PM, Gleb Natapov wrote: - Mellanox tested MVAPICH with the header caching; latency was around 1.4us - Mellanox tested MVAPICH without the header caching; latency was around 1.9us As far as I remember Mellanox results and according to our testing difference between MVAP

Re: [OMPI devel] Problem in mpool rdma finalize

2007-08-13 Thread Pavel Shamis (Pasha)
Jeff Squyres wrote: FWIW: we fixed this recently in the openib BTL by ensuring that all registered memory is freed during the BTL finalize (vs. the mpool finalize). This is a new issue because the mpool finalize was just recently expanded to un-register all of its memory as part of the NIC

Re: [OMPI devel] Problem in mpool rdma finalize

2007-08-13 Thread Jeff Squyres
FWIW: we fixed this recently in the openib BTL by ensuring that all registered memory is freed during the BTL finalize (vs. the mpool finalize). This is a new issue because the mpool finalize was just recently expanded to un-register all of its memory as part of the NIC-restart effort (an

[OMPI devel] Problem in mpool rdma finalize

2007-08-13 Thread Tim Prins
Hi folks, I have run into a problem with mca_mpool_rdma_finalize as implemented in r15557. With the t_win onesided test, running over gm, it segfaults. What appears to be happening is that some memory is registered with gm, and then gets freed by mca_mpool_rdma_finalize. But the free function t

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Scott Atchley
On Aug 13, 2007, at 4:06 AM, Pavel Shamis (Pasha) wrote: Any objections? We can discuss what approaches we want to take (there's going to be some complications because of the PML driver, etc.); perhaps in the Tuesday Mellanox teleconf...? My main objection is that the only reason you propose

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Terry D. Dontje
Jeff Squyres wrote: With Mellanox's new HCA (ConnectX), extremely low latencies are possible for short messages between two MPI processes. Currently, OMPI's latency is around 1.9us while all other MPI's (HP MPI, Intel MPI, MVAPICH[2], etc.) are around 1.4us. A big reason for this differ
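The header-caching technique this thread debates can be sketched roughly as follows. This is a hypothetical illustration, not the actual openib BTL or MVAPICH code; the struct fields and cache layout are assumptions.

```c
#include <string.h>

#define HDR_CACHE_SLOTS 8

/* A send's match signature: for repeated short sends with the same
   signature, the full header is sent once and replaced afterwards by a
   short cache-slot index, shrinking the bytes on the wire. */
typedef struct {
    int src_rank;
    int tag;
    int context_id;   /* communicator id */
} match_hdr_t;

typedef struct {
    match_hdr_t slots[HDR_CACHE_SLOTS];
    int         valid[HDR_CACHE_SLOTS];
} hdr_cache_t;

/* Returns the cache slot on a hit (send only the slot index), or -1 on a
   miss after inserting the header (send the full header this time). */
static int hdr_cache_lookup_or_insert(hdr_cache_t *c, const match_hdr_t *h)
{
    int slot = (unsigned)(h->tag ^ h->context_id ^ h->src_rank) % HDR_CACHE_SLOTS;
    if (c->valid[slot] && memcmp(&c->slots[slot], h, sizeof(*h)) == 0)
        return slot;
    c->slots[slot] = *h;
    c->valid[slot] = 1;
    return -1;
}
```

The saving is a few header bytes per short message, which is why the thread argues over whether it matters outside ping-pong benchmarks that reuse one signature for every send.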

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 11:06:00AM +0300, Pavel Shamis (Pasha) wrote: > >> Any objections? We can discuss what approaches we want to take > >> (there's going to be some complications because of the PML driver, > >> etc.); perhaps in the Tuesday Mellanox teleconf...?

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
Any objections? We can discuss what approaches we want to take (there's going to be some complications because of the PML driver, etc.); perhaps in the Tuesday Mellanox teleconf...? My main objection is that the only reason you propose to do this is some bogus benchmark? Palla