[OMPI devel] request help debugging openib btl problem

2008-02-08 Thread Ralph Campbell
I'm using openmpi 1.2.5 with a QLogic HCA and using the openib btl (not PSM). osu_latency and osu_bw work OK but when I run osu_bibw with a message size of 2MB (1<<21), it hangs in btl_openib_component_progress() waiting for something. I tried adding printfs at each point where ibv_post_send(), i

Re: [OMPI devel] Datasize confusion in MPI_Write can lead to data los!

2008-02-08 Thread George Bosilca
The patch I sent a few minutes ago will only remove the problem for Open MPI. However, their generic test for contiguous data types is still broken. Only checking for COMBINER_NAMED is clearly not enough. A second test checking that the size and the extent of the data types are equal will mak
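The two-part test described in this message can be sketched in plain C without any MPI calls. This is only a toy model: `struct type_info` and `looks_contiguous` are hypothetical names, not ROMIO code, and in a real program the two numbers would presumably come from MPI_Type_size and MPI_Type_get_extent. Equal size and extent is treated here as a necessary check added on top of the named-type test, not as a complete contiguity proof.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a datatype (assumption, not a ROMIO structure). */
struct type_info {
    bool is_named;   /* true when the combiner is the "named" (predefined) one */
    size_t size;     /* bytes of useful data */
    size_t extent;   /* bytes spanned in memory, padding included */
};

/* A named type is only accepted as contiguous when it has no holes,
 * i.e. when its size equals its extent. */
static bool looks_contiguous(const struct type_info *t)
{
    return t->is_named && t->size == t->extent;
}
```

With this model, a predefined type carrying 6 bytes of data in an 8-byte extent is rejected, while a plain 4-byte integer type is accepted.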

Re: [OMPI devel] Datasize confusion in MPI_Write can lead to data los!

2008-02-08 Thread George Bosilca
Here is a sketch of a ROMIO patch for Open MPI. I just wrote it and didn't have time to test it. If you can test it, please let me know if it solves the problem. Thanks, george. Index: iscontig.c === --- iscontig.c (revision

[OMPI devel] PML V will be enabled again

2008-02-08 Thread Aurélien Bouteiller
Hi everyone, All the problems detected the last time PML V was enabled in trunk have been fixed. We invite you to give it a try (add a .ompi_unignore in ompi/mca/pml/v) on your favorite platform and compilation options and report any issues you may encounter. If none are detected, we plan

Re: [OMPI devel] 3rd party code contributions

2008-02-08 Thread Ralph H Castain
I'm going to "re-integrate" Jeff and Brian's comments into one response. I have no problem with either of their observations. I only included the event library, backtrace, and PLPA in my list for completeness. I expected we would continue to treat those as we are, recognizing that this means -someo

Re: [OMPI devel] Datasize confusion in MPI_Write can lead to data los!

2008-02-08 Thread Rainer Keller
Hi George, Good, if you come to the same conclusion with regard to ROMIO using MPI_Type_size internally in ROMIO... So taking iscontig.c ,-] /* This function needs more work. It should check for contiguity in other cases as well.*/ and mail to the romio list or have a specialized vers

Re: [OMPI devel] Datasize confusion in MPI_Write can lead to data los!

2008-02-08 Thread George Bosilca
MPI_Type_size is supposed to return only the size of useful data, which apparently it does (MPI_SHORT_INT is 6 bytes). What I think happens is that the MPI_SHORT_INT type is a predefined one, but it's a really strange predefined type. It's one of the few that are not contiguous. The prob

Re: [OMPI devel] [RFC] Non-blocking collectives (LibNBC) merge to trunk

2008-02-08 Thread Jeff Squyres
Terry -- I reluctantly agree. :-) What I envision is not difficult (a first cut/feature-lean version is probably only several hundred lines of perl?), but I don't have the cycles (at present) to implement it -- my priorities are elsewhere at the moment. If anyone is interested in this, I

[OMPI devel] Datasize confusion in MPI_Write can lead to data los!

2008-02-08 Thread Christoph Niethammer
Hello! I tested Open MPI at HLRS for some time without detecting new problems in the implementation, but now I have noticed some awful ones with MPI_Write which can lead to data loss: When creating a struct for a mixed datatype like struct { short a; int b; } the C compiler introduces a gap of
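The padding described here can be inspected in plain C. The sketch below is illustrative only: the struct name `short_int` and the helper functions are hypothetical, and the exact numbers depend on the compiler and ABI, so no particular gap size is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct mirroring the one in the post. */
struct short_int {
    short a;
    int b;
};

/* Bytes of actual data in the struct (no padding counted). */
static size_t useful_bytes(void)
{
    return sizeof(short) + sizeof(int);
}

/* Bytes the struct occupies in memory, padding included. */
static size_t extent_bytes(void)
{
    return sizeof(struct short_int);
}

/* Gap the compiler leaves between member a and member b. */
static size_t gap_after_a(void)
{
    /* b cannot start before a ends, so the subtraction cannot wrap. */
    assert(offsetof(struct short_int, b) >= sizeof(short));
    return offsetof(struct short_int, b) - sizeof(short);
}
```

If I/O code sizes its transfer from `useful_bytes()` while the buffer is actually laid out over `extent_bytes()`, the difference between the two is the data that can go missing.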

Re: [OMPI devel] 3rd party code contributions

2008-02-08 Thread Brian W. Barrett
On Fri, 8 Feb 2008, Ralph Castain wrote: 1. event library 2. ROMIO 3. VT 4. backtrace 5. PLPA - this one is a little less obvious, but still being released as a separate package 6. libNBC Sorry to Ralph, but I clipped everything from his e-mail, then am going to make references to it. oh wel

Re: [OMPI devel] 3rd party code contributions

2008-02-08 Thread Jeff Squyres
On Feb 8, 2008, at 10:38 AM, Ralph Castain wrote: I thought maybe we should move this to another thread as it really isn't about Torsten's specific RFC. I just took a quick gander at the code base to see how extensive this problem might really be per Terry's concern. What I found was that w

Re: [OMPI devel] ROMIO

2008-02-08 Thread Jeff Squyres
I know that Argonne was engaged at some level to help with the OMPI ROMIO integration -- was it on a formal or informal basis? On Feb 7, 2008, at 12:02 PM, Ralph H Castain wrote: I just -know- this is everyone's favorite subject, but... Brian used to take care of the ROMIO code in Open MPI,

[OMPI devel] 3rd party code contributions

2008-02-08 Thread Ralph Castain
I thought maybe we should move this to another thread as it really isn't about Torsten's specific RFC. I just took a quick gander at the code base to see how extensive this problem might really be per Terry's concern. What I found was that we have added 3rd party code in several places. How we wan