Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-26 Thread Timothy Hayes
It was just a few kinks actually. I think the bitmap type moved from orte to opal, then I think the opal_hash_table functions changed slightly, and I also think the modex stuff was called something like pml_modex where it's now ompi_modex. There were a few extra functions in the module descri

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-26 Thread Lenny Verkhovsky
What error are you getting from the compilation failure? Lenny. On 3/23/09, Timothy Hayes wrote: > > That's a relief to know, although I'm still a bit concerned. I'm looking at > the code for the OpenMPI 1.3 trunk and in the ob1 component I can see the > following sequence: > > mca_pml_o

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-25 Thread Jeff Squyres
George -- correct me if I'm wrong -- we went through and audited ob1 and the relevant BTLs. The only places where WAIT remains are places that are guaranteed to not be problematic. So you shouldn't need to edit ob1 at all. If you're working with Open MPI, you might want to investigate using

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread Timothy Hayes
That's a relief to know, although I'm still a bit concerned. I'm looking at the code for the OpenMPI 1.3 trunk and in the ob1 component I can see the following sequence: mca_pml_ob1_recv_frag_callback_match -> append_frag_to_list -> MCA_PML_OB1_RECV_FRAG_ALLOC -> OMPI_FREE_LIST_WAIT -> __ompi_free

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread George Bosilca
It is a known problem. When the freelist is empty, going into ompi_free_list_wait will block the process until at least one fragment becomes available. As a fragment can become available only when returned by the BTL, this can lead to deadlocks in some cases. The workaround is to ban the us
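
To make the blocking behaviour described above concrete, here is a minimal sketch in C of a toy free list with a non-blocking get and a wait that spins until an item shows up. It is not Open MPI's code: every name in it (toy_free_list_t, toy_free_list_get, toy_free_list_wait, toy_progress_stuck, max_spins) is hypothetical and exists only for illustration. The max_spins bound is an assumption added so the sketch terminates; a truly blocking wait has no such bound, which is where the infinite loop comes from when nothing can return an item.

```c
#include <assert.h>
#include <stddef.h>

/* Toy free list. All names are hypothetical, not Open MPI's API. */
typedef struct toy_item {
    struct toy_item *next;
} toy_item_t;

typedef struct {
    toy_item_t *head;   /* items currently available */
} toy_free_list_t;

/* Progress hook: stands in for whatever would hand items back. */
typedef void (*toy_progress_fn)(toy_free_list_t *fl);

/* Non-blocking get: returns NULL at once when the list is empty. */
static toy_item_t *toy_free_list_get(toy_free_list_t *fl)
{
    toy_item_t *item = fl->head;
    if (item != NULL) {
        fl->head = item->next;
        item->next = NULL;
    }
    return item;
}

/* Put an item back on the list. */
static void toy_free_list_return(toy_free_list_t *fl, toy_item_t *item)
{
    assert(item != NULL);
    item->next = fl->head;
    fl->head = item;
}

/* Blocking-style wait: retry until an item appears, calling the
 * progress hook between attempts. max_spins is an assumption that
 * keeps this sketch finite; returns NULL if the bound is reached. */
static toy_item_t *toy_free_list_wait(toy_free_list_t *fl,
                                      toy_progress_fn progress,
                                      unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; ++i) {
        toy_item_t *item = toy_free_list_get(fl);
        if (item != NULL) {
            return item;
        }
        progress(fl);
    }
    return NULL;
}

/* A progress hook that never returns an item: models a wait issued
 * from a context where nothing can give items back. */
static void toy_progress_stuck(toy_free_list_t *fl)
{
    (void)fl;
}
```

With an empty list and toy_progress_stuck as the hook, toy_free_list_wait only comes back because of max_spins; remove the bound and it spins forever. The non-blocking toy_free_list_get returns NULL immediately in the same situation, leaving the caller free to defer the work and retry later.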

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread Lenny Verkhovsky
Did you try it with OpenMPI version 1.3.1? There have been a few changes and bug fixes (for example r20591, a fix in the ob1 PML). Lenny. 2009/3/23 Timothy Hayes > Hello, > > I'm working on an OpenMPI BTL component and am having a recurring problem, > I was wondering if anyone could shed some light on

[OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread Timothy Hayes
Hello, I'm working on an OpenMPI BTL component and am having a recurring problem; I was wondering if anyone could shed some light on it. I have a component that's quite straightforward: it uses a pair of lightweight sockets to take advantage of being in a virtualised environment (specifically Xen