The symptom is that the process hangs forever. It's difficult to distinguish 
this bug from simply running out of registered memory.

The bug is hit if the pml is using the mpi_leave_pinned protocol and the btl 
returns an error from its send function.
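
As a rough illustration of why a leak on the error path ends in a hang rather 
than an error message, here is a toy model of a fixed-size fragment pool. This 
is not the ob1 code: FragPool, send_all, fail_every and release_on_error are 
hypothetical names made up for this sketch, and a real sender would block 
waiting for a free fragment where this model simply returns.

```python
# Toy model only; none of these names exist in Open MPI.
class FragPool:
    """Fixed-size pool of send fragments."""

    def __init__(self, size):
        self.free = size

    def alloc(self):
        if self.free == 0:
            return False  # a real sender would wait here, i.e. hang
        self.free -= 1
        return True

    def release(self):
        self.free += 1


def send_all(pool, n_msgs, fail_every, release_on_error):
    """Try to send n_msgs; every fail_every-th send fails in the 'btl'.

    Returns how many messages were handled before the pool ran dry.
    """
    for i in range(1, n_msgs + 1):
        if not pool.alloc():
            return i - 1  # pool exhausted: where a job would stall
        send_failed = (i % fail_every == 0)
        if send_failed:
            if release_on_error:
                pool.release()  # fixed path: give the fragment back
            # leaky path: the fragment is never returned
            continue
        pool.release()  # normal completion returns the fragment
    return n_msgs
```

With a pool of 4 fragments and every 5th send failing, 
send_all(FragPool(4), 100, 5, False) stops after 20 messages because each 
failed send permanently consumed one fragment, while the same call with 
release_on_error=True handles all 100. From the outside, the leaky case looks 
like any other resource exhaustion.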

-Nathan

________________________________________
From: [email protected] [[email protected]] on behalf of 
Christopher Samuel [[email protected]]
Sent: Thursday, March 01, 2012 7:58 PM
To: [email protected]
Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/03/12 02:56, Nathan Hjelm wrote:

> Found a pretty nasty frag leak (and a minor one) in ob1 (see
> commit below). If this fix addresses some hangs we are seeing on
> infiniband LANL might want a 1.4.6 rolled (or a faster rollout for
> 1.6.0).

What symptoms would an affected job show?  Does it fail with an OMPI
error or does it just hang using 0% CPU?

cheers,
Chris
- --
    Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: [email protected] Phone: +61 (0)3 903 55545
         http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk9QN10ACgkQO2KABBYQAh9aRgCePZXdzqlI8lpfqWtHf8rtFvup
2D8An3E9y411xTyRBpfwHLPpWTzqUiuv
=3EXP
-----END PGP SIGNATURE-----
_______________________________________________
devel mailing list
[email protected]
http://www.open-mpi.org/mailman/listinfo.cgi/devel
