Wow, sometimes I even amaze myself! Two for two on create failures in
a single night!!
:-)
Anyway, both are fixed or shortly will be. However, there will be no
MTT runs tonight as neither branch successfully generated a tarball.
Ralph
On Mar 23, 2009, at 7:30 PM, MPI Team wrote:
ERROR
You are absolutely right, the peer should never be set to -1 in any of
the PERUSE callbacks. I checked the code this morning and figured out
what the problem was. We report the peer and the tag attached to a
request before setting the right values (some code moved around). I
submitted a patch.
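[For anyone following along, here is a minimal sketch of the kind of
profiling hook being discussed: it registers a callback that counts
communication events and flags the bogus peer == -1 case described
above. The event name and signatures follow the PERUSE specification as
shipped in Open MPI 1.3's peruse.h; check your installation's header
for the exact prototypes.]

    #include <stdio.h>
    #include <mpi.h>
    #include <peruse.h>

    static unsigned long event_count = 0;

    /* Fired once per request event; spec carries the peer/tag of the
       request that triggered it. */
    static int count_cb(peruse_event_h event_h, MPI_Aint unique_id,
                        peruse_comm_spec_t *spec, void *param)
    {
        event_count++;
        if (-1 == spec->peer) {
            /* Symptom of the bug fixed above: peer/tag were reported
               before the request fields were filled in. */
            fprintf(stderr, "warning: callback saw peer == -1\n");
        }
        return MPI_SUCCESS;
    }

    int main(int argc, char **argv)
    {
        int event;
        peruse_event_h eh;

        MPI_Init(&argc, &argv);
        PERUSE_Init();

        /* Register and activate one event on MPI_COMM_WORLD. */
        PERUSE_Query_event("PERUSE_COMM_REQ_ACTIVATE", &event);
        PERUSE_Event_comm_register(event, MPI_COMM_WORLD, count_cb,
                                   NULL, &eh);
        PERUSE_Event_activate(eh);

        /* ... application communication would happen here ... */

        PERUSE_Event_deactivate(eh);
        PERUSE_Event_release(&eh);
        printf("counted %lu events\n", event_count);
        MPI_Finalize();
        return 0;
    }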
That's a relief to know, although I'm still a bit concerned. I'm looking at
the code for the Open MPI 1.3 trunk, and in the ob1 component I can see the
following call sequence:
mca_pml_ob1_recv_frag_callback_match -> append_frag_to_list ->
MCA_PML_OB1_RECV_FRAG_ALLOC -> OMPI_FREE_LIST_WAIT -> __ompi_free_list_wait
It is a known problem. When the freelist is empty, ompi_free_list_wait
will block the process until at least one fragment becomes available.
As a fragment can become available only when returned by the BTL, this
can lead to deadlocks in some cases. The workaround is to ban the use
of the blocking wait on that path.
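[To make the failure mode concrete, here is a sketch of the allocation
pattern in question, using the 1.3-era free-list macros
(OMPI_FREE_LIST_GET / OMPI_FREE_LIST_WAIT / OMPI_FREE_LIST_RETURN). The
helper name handle_frag_alloc() is hypothetical and only illustrates
the control flow; the non-blocking GET is the usual way to avoid the
deadlock.]

    #include "ompi/constants.h"
    #include "ompi/class/ompi_free_list.h"

    /* Hypothetical helper: allocate a receive fragment without risking
       the deadlock described above. */
    static int handle_frag_alloc(ompi_free_list_t *frag_list)
    {
        ompi_free_list_item_t *item;
        int rc;

        /* OMPI_FREE_LIST_WAIT(frag_list, item, rc) would block here
           until the BTL returns a fragment.  If the only thread able to
           return one is the thread blocked here, the process deadlocks. */

        /* Non-blocking alternative: try to grab a fragment and back off
           if the list is empty. */
        OMPI_FREE_LIST_GET(frag_list, item, rc);
        if (OMPI_SUCCESS != rc || NULL == item) {
            /* Out of fragments: defer the work (e.g. queue the incoming
               header) and retry after progressing the BTLs. */
            return OMPI_ERR_TEMP_OUT_OF_RESOURCE;
        }

        /* ... use the fragment, then hand it back ... */
        OMPI_FREE_LIST_RETURN(frag_list, item);
        return OMPI_SUCCESS;
    }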
Hi Kiril,
Appreciate the quick response.
> Hi Samuel,
>
> On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
> "Samuel K. Gutierrez" wrote:
>> Hi All,
>>
>> I'm writing a simple profiling library which utilizes
>> PERUSE. My callback
>
> So am I :)
>
>> function counts communication events (see example
Did you try it with the Open MPI 1.3.1 version?
There have been a few changes and bug fixes (for example r20591, a fix
in the ob1 PML).
Lenny.
2009/3/23 Timothy Hayes
> Hello,
>
> I'm working on an Open MPI BTL component and am having a recurring problem;
> I was wondering if anyone could shed some light on
Hello,
I'm working on an Open MPI BTL component and am having a recurring problem;
I was wondering if anyone could shed some light on it. I have a component
that's quite straightforward: it uses a pair of lightweight sockets to take
advantage of being in a virtualised environment (specifically Xen
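[As background for readers unfamiliar with the idea, here is a tiny,
self-contained illustration, not Timothy's actual BTL code and with
every name generic, of shuttling a message over a pre-connected socket
pair, the kind of lightweight transport a guest-domain BTL might build
on.]

    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];
        char out[] = "frag";
        char in[sizeof(out)];

        /* One pre-connected, lightweight socket pair standing in for
           the per-peer channel a BTL endpoint would hold. */
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
            return 1;

        /* "Send side": push the fragment down the socket. */
        write(sv[0], out, sizeof(out));
        /* "Receive side": drain it on the other end. */
        read(sv[1], in, sizeof(in));

        close(sv[0]);
        close(sv[1]);
        return memcmp(in, out, sizeof(out)) != 0;
    }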
We have had one user hit it with 1.3.0 - haven't installed 1.3.1 yet.
On Mar 23, 2009, at 9:34 AM, Eugene Loh wrote:
Jeff Squyres wrote:
Looks good to cisco. Ship it.
I'm still seeing a very low incidence of the sm segv during startup
(~0.01% -- 23 tests out of ~160k), so let's ship 1.3.1
It looks like the URLs I sent before were incorrect -- they ask for a
username/password. Try these URLs instead:
Monday, 23 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862&UID=0&PW=1c8c7f352179
Tuesday, 24 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862&U
Jeff Squyres wrote:
Looks good to cisco. Ship it.
I'm still seeing a very low incidence of the sm segv during startup
(~0.01% -- 23 tests out of ~160k), so let's ship 1.3.1 and roll in
Eugene's new sm code for 1.3.2.
For what it's worth, I just ran a start-up test... "main()
{MPI_Init();MPI_Finalize();}"
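[Spelled out as a complete program, the start-up test Eugene describes
is just this: nothing but init and finalize, which is enough to
exercise the sm BTL setup where the segv was seen.]

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        MPI_Finalize();
        return 0;
    }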
On Mar 17, 2009, at 9:17 AM, Jeff Squyres (jsquyres) wrote:
Monday, 23 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862
Tuesday, 24 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862
Wednesday, 25 Mar 2009:
https://ciscosales.webex.com/ciscosales/j