Hi George:
To report that an entire BTL is down, one just sets the ompi_proc_t
argument to NULL. That is how I was using it. That means the
mca_pml_ob1_error_handler could see that it is NULL, and map out the
entire BTL. BTLs can set the ompi_proc_t if they want and the PML is
free t
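For illustration, here is a minimal sketch of the NULL convention described
above. It is not the ob1 code: the sketch_* types and names are hypothetical
stand-ins for mca_btl_base_module_t and ompi_proc_t, and the per-peer branch
is only an assumption about what a PML might choose to do when a proc is
supplied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real Open MPI types (assumption). */
#define SKETCH_MAX_PEERS 4

typedef struct sketch_proc {
    int rank;                         /* index of the peer in the table below */
} sketch_proc_t;

typedef struct sketch_btl {
    const char *name;
    int usable[SKETCH_MAX_PEERS];     /* 1 if this BTL may still reach peer i */
} sketch_btl_t;

/*
 * Sketch of an error handler taking a proc pointer and an info string.
 * proc == NULL means "the entire BTL is down": map it out for every peer.
 * proc != NULL maps out only that peer (assumed behavior).
 * Returns the number of peers for which the BTL was mapped out.
 */
static int sketch_error_handler(sketch_btl_t *btl,
                                const sketch_proc_t *proc,
                                const char *info)
{
    int removed = 0;

    assert(btl != NULL);

    if (NULL == proc) {
        /* Whole BTL reported down: stop using it for all peers. */
        for (int i = 0; i < SKETCH_MAX_PEERS; i++) {
            if (btl->usable[i]) {
                btl->usable[i] = 0;
                removed++;
            }
        }
    } else if (proc->rank >= 0 && proc->rank < SKETCH_MAX_PEERS &&
               btl->usable[proc->rank]) {
        /* Only one peer is affected. */
        btl->usable[proc->rank] = 0;
        removed = 1;
    }

    fprintf(stderr, "%s: %s (%d peer(s) mapped out)\n",
            btl->name, info ? info : "no detail", removed);
    return removed;
}
```

A BTL that cannot tell which peer failed would call the handler with NULL
for the proc, and the handler would then stop using that BTL for every peer.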
I tried and failed to get my cluster up and going yesterday (my MTT runs last
night didn't go well -- they're all flagged as "trial" for the moment for
exactly this reason). I may have just figured out what the major cause of my
problems was; hopefully I'll be able to submit another big MTT run
Per the telecon Tuesday, I committed a new OMPI MPI extension to the trunk:
https://svn.open-mpi.org/trac/ompi/changeset/23018
Please read the commit message and let me know what you think. Suggestions are
welcome.
If everyone is ok with it, I'd like to see this functionality hit the 1.5
The current error system follows a different design. There are basically two
ways to report errors: per peer or global. A per-peer error can only be
triggered by a specific send or receive, and is based on the value of the last
argument of the callbacks. Such errors clearly indicate which is the p
The comment doesn't match the commit itself.
george.
On Apr 20, 2010, at 20:00 , cy...@osl.iu.edu wrote:
> Author: cyeoh
> Date: 2010-04-20 20:00:14 EDT (Tue, 20 Apr 2010)
> New Revision: 23014
> URL: https://svn.open-mpi.org/trac/ompi/changeset/23014
>
> Log:
> fixes #2355 - race in interact
WHAT:
Add two arguments to the mca_pml_ob1_error_handler to make it more
useful for BTLs that may take advantage of that feature. Adding an
ompi_proc_t pointer and a char pointer. This is what the new signature
looks like.
void mca_pml_ob1_error_handler(
struct mca_btl_base_module_t*