[OMPI devel] Memory corruption with mpool

2009-11-02 Thread Mondrian Nuessle
Hi everybody! We are working on a new, experimental interconnection network (the EXTOLL network) and I am currently working on an MTL component for that hardware. Actually, it works quite well :-) Recently I included the RDMA mpool component for memory registration caching into my code. Again, t

Re: [OMPI devel] Memory corruption with mpool

2009-11-02 Thread Christopher Yeoh
Hi Mondrian,
On Mon, 02 Nov 2009 13:22:11 +0100 Mondrian Nuessle wrote:
>
> If I turn on mpi_leave_pinned (and thus the registration cache is
> actually used), I see occasional memory corruption issues for example
> when I call MPI_Allreduce often.
>
> Debugging with valgrind did not lead to an

Re: [OMPI devel] Memory corruption with mpool

2009-11-02 Thread Mondrian Nuessle
Hi Christopher,
>> Do you have any suggestions how to investigate this situation?
>
> Have you got OMPI_ENABLE_DEBUG defined? The symptoms of what you are
> seeing sound like what might happen if debug is off and you trigger an
> issue I posted about here related to thread safety of mpool.
unfort

Re: [OMPI devel] Adding support for RDMAoE devices.

2009-11-02 Thread Jeff Squyres
I see you removed support for #if defined(HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE) -- that doesn't seem like a good idea. We still have users on older OFEDs without that field. Can you create a 1.5 ticket for this item?
On Nov 1, 2009, at 6:44 AM, Vasily Philipov wrote:
The attached patch

Re: [OMPI devel] Adding support for RDMAoE devices.

2009-11-02 Thread Jeff Squyres
Since we now need 2 levels of m4 checking (1. check to see if we have transport_type, and 2. check to see if we have RDMA_TRANSPORT_RDMAOE), perhaps it would be better to put all these checks into a single place somewhere to avoid proliferating the #if's for these 2 t
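
A minimal sketch of what "a single place" for the two checks might look like, assuming the configure tests produce two feature macros. Only HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE appears in the thread; HAVE_RDMA_TRANSPORT_RDMAOE as a macro name, the demo_* types and functions, and the fall-back-to-IB behavior are all hypothetical stand-ins so the example compiles without the verbs headers. It is not the patch under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* In a real build these would be set by configure (m4) checks.
 * They are defined by hand here only so the sketch compiles standalone. */
#define HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE 1
#define HAVE_RDMA_TRANSPORT_RDMAOE 1 /* hypothetical macro name */

/* Stand-in types (hypothetical, NOT the real verbs definitions). */
enum demo_transport {
    DEMO_TRANSPORT_IB,
    DEMO_TRANSPORT_IWARP,
    DEMO_TRANSPORT_RDMAOE
};

struct demo_device {
    const char *name;
#if defined(HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE)
    enum demo_transport transport_type;
#endif
};

/* The only function that tests the feature macros. Everything else
 * calls this helper instead of repeating the #if's. */
static enum demo_transport demo_get_transport(const struct demo_device *dev)
{
    assert(dev != NULL);
#if defined(HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE)
#  if !defined(HAVE_RDMA_TRANSPORT_RDMAOE)
    /* Assumption: without RDMAoE support, treat such a device as IB. */
    if (dev->transport_type == DEMO_TRANSPORT_RDMAOE) {
        return DEMO_TRANSPORT_IB;
    }
#  endif
    return dev->transport_type;
#else
    /* Assumption: older stacks without the field are treated as IB. */
    return DEMO_TRANSPORT_IB;
#endif
}

/* Caller code stays free of preprocessor conditionals. */
static const char *demo_transport_name(const struct demo_device *dev)
{
    switch (demo_get_transport(dev)) {
    case DEMO_TRANSPORT_IWARP:  return "iWARP";
    case DEMO_TRANSPORT_RDMAOE: return "RDMAoE";
    case DEMO_TRANSPORT_IB:
    default:                    return "IB";
    }
}
```

With this layout, callers only ever use demo_get_transport() or demo_transport_name(), so builds against older stacks that lack the field keep working without each call site repeating both checks.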

Re: [OMPI devel] MPI_Grequest_start and MPI_Wait clarification

2009-11-02 Thread Jeff Squyres
On Oct 25, 2009, at 9:10 PM, Christopher Yeoh wrote: I've been running some threaded test suites against OpenMPI and was just wanting to clarify something in the specification and how OpenMPI implements it. Sorry for the delay in replying (my inbox has become a disaster lately -- please don

Re: [OMPI devel] mpool rdma deadlock

2009-11-02 Thread Jeff Squyres
Ewww yikes. This could definitely be an issue if we weren't (multi-thread) careful when writing these portions of the code. :-( On Oct 28, 2009, at 8:18 AM, Christopher Yeoh wrote: Hi, I've been investigating some OpenMPI deadlocks triggered by a test suite written to test the threa

Re: [OMPI devel] Memory corruption with mpool

2009-11-02 Thread Jeff Squyres
Note that the problems Chris is talking about *should* only occur if you have compiled Open MPI with multi-threaded support. Did you do that, perchance?
On Nov 2, 2009, at 9:26 AM, Mondrian Nuessle wrote:
Hi Christopher,
>> Do you have any suggestions how to investigate this situation?
>