[OMPI devel] RFC: Change PML error handler signature

2010-04-21 Thread Rolf vandeVaart
WHAT: Add two arguments, an ompi_proc_t pointer and a char pointer, to mca_pml_ob1_error_handler to make it more useful for BTLs that may take advantage of them. This is what the new signature looks like: void mca_pml_ob1_error_handler( struct mca_btl_base_module_t*

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r23014

2010-04-21 Thread George Bosilca
The comment doesn't match the commit itself. george. On Apr 20, 2010, at 20:00 , cy...@osl.iu.edu wrote: > Author: cyeoh > Date: 2010-04-20 20:00:14 EDT (Tue, 20 Apr 2010) > New Revision: 23014 > URL: https://svn.open-mpi.org/trac/ompi/changeset/23014 > > Log: > fixes #2355 - race in interact

Re: [OMPI devel] RFC: Change PML error handler signature

2010-04-21 Thread George Bosilca
The current error system follows a different design. There are basically two ways to report errors: per peer or global. A per-peer error can only be triggered by a specific send or receive, and is based on the value of the last argument of the callbacks. Such errors clearly indicate which is the p

[OMPI devel] New OMPI MPI extension

2010-04-21 Thread Jeff Squyres
Per the telecon Tuesday, I committed a new OMPI MPI extension to the trunk: https://svn.open-mpi.org/trac/ompi/changeset/23018 Please read the commit message and let me know what you think. Suggestions are welcome. If everyone is ok with it, I'd like to see this functionality hit the 1.5

[OMPI devel] Cisco MTT testing

2010-04-21 Thread Jeff Squyres
I tried and failed to get my cluster up and going yesterday (my MTT runs last night didn't go well -- they're all flagged as "trial" for the moment for exactly this reason). I may have just figured out what the major cause of my problems was; hopefully I'll be able to submit another big MTT run

Re: [OMPI devel] RFC: Change PML error handler signature

2010-04-21 Thread Rolf vandeVaart
Hi George: To report that an entire BTL is down, one just sets the ompi_proc_t argument to NULL. That is how I was using it. That means mca_pml_ob1_error_handler could see that it is NULL and map out the entire BTL. BTLs can set the ompi_proc_t if they want and the PML is free t
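
The NULL convention described in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions: the types sketch_btl and sketch_proc, the enum, and the function sketch_error_handler are hypothetical stand-ins and are not the Open MPI types or the proposed signature. The sketch returns an enum instead of void purely so the chosen branch is visible to a caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; NOT the real Open MPI types. */
struct sketch_btl  { const char *name; };
struct sketch_proc { int rank; };

/* What the handler decides to do (hypothetical; used only in this sketch). */
enum sketch_action { SKETCH_MAP_OUT_BTL, SKETCH_MAP_OUT_PEER };

/*
 * proc == NULL     : the BTL reports that it is down as a whole.
 * proc != NULL     : only the path to that one peer has failed.
 * btlinfo may be NULL when the BTL has no extra detail to pass along.
 */
static enum sketch_action
sketch_error_handler(const struct sketch_btl *btl,
                     const struct sketch_proc *proc,
                     const char *btlinfo)
{
    assert(btl != NULL);
    const char *info = (btlinfo != NULL) ? btlinfo : "(no details)";

    if (proc == NULL) {
        fprintf(stderr, "BTL %s failed entirely: %s\n", btl->name, info);
        return SKETCH_MAP_OUT_BTL;
    }

    fprintf(stderr, "BTL %s lost peer %d: %s\n", btl->name, proc->rank, info);
    return SKETCH_MAP_OUT_PEER;
}
```

A caller that passes NULL for the peer gets the whole-BTL branch, while passing a peer selects the per-peer branch. What the PML then does in each case is left open here, as it is in the thread.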