Re: [OMPI devel] === CREATE FAILURE (trunk) ===

2009-03-23 Thread Ralph Castain
Wow, sometimes I even amaze myself! Two for two on create failures in a single night!! :-) Anyway, both are fixed or shortly will be. However, there will be no MTT runs tonight as neither branch successfully generated a tarball. Ralph On Mar 23, 2009, at 7:30 PM, MPI Team wrote: ERROR

Re: [OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value

2009-03-23 Thread George Bosilca
You are absolutely right, the peer should never be set to -1 on any of the PERUSE callbacks. I checked the code this morning and figured out what the problem was. We report the peer and the tag attached to a request before setting the right values (some code moved around). I submitted a patc

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread Timothy Hayes
That's a relief to know, although I'm still a bit concerned. I'm looking at the code for the OpenMPI 1.3 trunk and in the ob1 component I can see the following sequence: mca_pml_ob1_recv_frag_callback_match -> append_frag_to_list -> MCA_PML_OB1_RECV_FRAG_ALLOC -> OMPI_FREE_LIST_WAIT -> __ompi_free

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread George Bosilca
It is a known problem. When the free list is empty, going into ompi_free_list_wait will block the process until at least one fragment becomes available. As a fragment can become available only when returned by the BTL, this can lead to deadlocks in some cases. The workaround is to ban the us
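The blocking behaviour described above can be illustrated with a minimal sketch in C. It is a toy model, not the Open MPI implementation: toy_free_list_t, toy_try_get, toy_wait and the two progress hooks are hypothetical names invented for illustration, and the wait loop is bounded by max_spins only so that the example terminates (the real wait is described as blocking until a fragment is returned).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy free list; NOT the Open MPI ompi_free_list_t API. */
#define TOY_CAPACITY 2

typedef struct {
    int free_count;              /* fragments currently available */
} toy_free_list_t;

/* Hypothetical progress hook: stands in for the BTL handing fragments back. */
typedef bool (*toy_progress_fn)(toy_free_list_t *list);

static void toy_init(toy_free_list_t *list) {
    list->free_count = TOY_CAPACITY;
}

/* Non-blocking get: fails immediately when the list is empty. */
static bool toy_try_get(toy_free_list_t *list) {
    if (list->free_count == 0) {
        return false;
    }
    list->free_count--;
    return true;
}

/* Hand one fragment back to the list. */
static void toy_return(toy_free_list_t *list) {
    assert(list->free_count < TOY_CAPACITY);
    list->free_count++;
}

/* Blocking-style get. It keeps driving progress until a fragment shows up.
 * The max_spins bound is an assumption of this sketch so that it terminates;
 * without it, a progress hook that never returns a fragment spins forever. */
static bool toy_wait(toy_free_list_t *list, toy_progress_fn progress, int max_spins) {
    for (int i = 0; i < max_spins; i++) {
        if (toy_try_get(list)) {
            return true;
        }
        progress(list);
    }
    return false;                /* would be a deadlock in the unbounded case */
}

/* Progress hook that returns one fragment per call. */
static bool progress_returns_one(toy_free_list_t *list) {
    toy_return(list);
    return true;
}

/* Progress hook that can never return a fragment, e.g. because the caller
 * is itself the only path through which fragments come back. */
static bool progress_stuck(toy_free_list_t *list) {
    (void)list;
    return false;
}
```

With progress_returns_one the wait succeeds on its second iteration; with progress_stuck it never obtains a fragment, which is the situation in which an unbounded wait would hang.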

Re: [OMPI devel] OMPI 1.3 - PERUSE peruse_comm_spec_t peer Negative Value

2009-03-23 Thread Samuel K. Gutierrez
Hi Kiril, Appreciate the quick response. > Hi Samuel, > > On Sat, 21 Mar 2009 18:18:54 -0600 (MDT) > "Samuel K. Gutierrez" wrote: >> Hi All, >> >> I'm writing a simple profiling library which utilizes >>PERUSE. My callback > > So am I :) > >> function counts communication events (see example

Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread Lenny Verkhovsky
Did you try it with OpenMPI version 1.3.1? There have been a few changes and bug fixes (for example r20591, a fix in the ob1 PML). Lenny. 2009/3/23 Timothy Hayes > Hello, > > I'm working on an OpenMPI BTL component and am having a recurring problem, > I was wondering if anyone could shed some light on

[OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread Timothy Hayes
Hello, I'm working on an OpenMPI BTL component and am having a recurring problem; I was wondering if anyone could shed some light on it. I have a component that's quite straightforward: it uses a pair of lightweight sockets to take advantage of being in a virtualised environment (specifically Xen

Re: [OMPI devel] 1.3.1rc5

2009-03-23 Thread Ralph Castain
We have had one user hit it with 1.3.0 - haven't installed 1.3.1 yet. On Mar 23, 2009, at 9:34 AM, Eugene Loh wrote: Jeff Squyres wrote: Looks good to cisco. Ship it. I'm still seeing a very low incidence of the sm segv during startup (.01% -- 23 tests out of ~160k), so let's ship 1.3.1

[OMPI devel] Updated Sonoma/OpenFabrics WebEx URLs

2009-03-23 Thread Jeff Squyres
It looks like the URLs I sent before were incorrect -- they ask for a username/password. Try these URLs instead: Monday, 23 Mar 2009: https://ciscosales.webex.com/ciscosales/j.php?ED=116762862&UID=0&PW=1c8c7f352179 Tuesday, 24 Mar 2009: https://ciscosales.webex.com/ciscosales/j.php?ED=116762862&U

Re: [OMPI devel] 1.3.1rc5

2009-03-23 Thread Eugene Loh
Jeff Squyres wrote: Looks good to cisco. Ship it. I'm still seeing a very low incidence of the sm segv during startup (.01% -- 23 tests out of ~160k), so let's ship 1.3.1 and roll in Eugene's new sm code for 1.3.2. For what it's worth, I just ran a start-up test... "main() {MPI_Init();M

Re: [OMPI devel] Next week: WebEx remote attendance of OpenFabricsSonoma conference

2009-03-23 Thread Jeff Squyres
On Mar 17, 2009, at 9:17 AM, Jeff Squyres (jsquyres) wrote: Monday, 23 Mar 2009: https://ciscosales.webex.com/ciscosales/j.php?ED=116762862 Tuesday, 24 Mar 2009: https://ciscosales.webex.com/ciscosales/j.php?ED=116762862 Wednesday, 25 Mar 2009: https://ciscosales.webex.com/ciscosales/j