On Mar 17, 2009, at 9:17 AM, Jeff Squyres (jsquyres) wrote:
Monday, 23 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862
Tuesday, 24 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862
Wednesday, 25 Mar 2009:
https://ciscosales.webex.com/ciscosales/j
Jeff Squyres wrote:
Looks good to cisco. Ship it.
I'm still seeing a very low incidence of the sm segv during startup
(0.01% -- 23 tests out of ~160k), so let's ship 1.3.1 and roll in
Eugene's new sm code for 1.3.2.
For what it's worth, I just ran a start-up test... "main()
{ MPI_Init(); MPI_Finalize(); }"
It looks like the URLs I sent before were incorrect -- they ask for a
username/password. Try these URLs instead:
Monday, 23 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862&UID=0&PW=1c8c7f352179
Tuesday, 24 Mar 2009:
https://ciscosales.webex.com/ciscosales/j.php?ED=116762862&U
We have had one user hit it with 1.3.0 - haven't installed 1.3.1 yet.
On Mar 23, 2009, at 9:34 AM, Eugene Loh wrote:
Jeff Squyres wrote:
Looks good to cisco. Ship it.
I'm still seeing a very low incidence of the sm segv during startup
(0.01% -- 23 tests out of ~160k), so let's ship 1.3.1
Hello,
I'm working on an OpenMPI BTL component and am having a recurring problem;
I was wondering if anyone could shed some light on it. I have a component
that's quite straightforward: it uses a pair of lightweight sockets to take
advantage of being in a virtualised environment (specifically Xen
Did you try it with the OpenMPI 1.3.1 version?
There have been a few changes and bug fixes (for example r20591, a fix in
the ob1 PML).
Lenny.
2009/3/23 Timothy Hayes
> Hello,
>
> I'm working on an OpenMPI BTL component and am having a recurring problem,
> I was wondering if anyone could shed some light on
Hi Kiril,
Appreciate the quick response.
> Hi Samuel,
>
> On Sat, 21 Mar 2009 18:18:54 -0600 (MDT)
> "Samuel K. Gutierrez" wrote:
>> Hi All,
>>
>> I'm writing a simple profiling library which utilizes
>>PERUSE. My callback
>
> So am I :)
>
>> function counts communication events (see example
It is a known problem. When the freelist is empty, going into
ompi_free_list_wait will block the process until at least one fragment
becomes available. As a fragment can become available only when
returned by the BTL, this can lead to deadlocks in some cases. The
workaround is to ban the us
That's a relief to know, although I'm still a bit concerned. I'm looking at
the code for the OpenMPI 1.3 trunk and in the ob1 component I can see the
following sequence:
mca_pml_ob1_recv_frag_callback_match -> append_frag_to_list ->
MCA_PML_OB1_RECV_FRAG_ALLOC -> OMPI_FREE_LIST_WAIT -> __ompi_free
You are absolutely right; the peer should never be set to -1 in any of
the PERUSE callbacks. I checked the code this morning and figured out
what the problem was. We report the peer and the tag attached to a
request before setting the right values (some code moved around). I
submitted a patch
Wow, sometimes I even amaze myself! Two for two on create failures in
a single night!!
:-)
Anyway, both are fixed or shortly will be. However, there will be no
MTT runs tonight as neither branch successfully generated a tarball.
Ralph
On Mar 23, 2009, at 7:30 PM, MPI Team wrote:
ERROR