Re: [OMPI devel] 1.5.4 and 1.4.4 NEWS items

2011-08-18 Thread Jeff Squyres
It will be ABI compatible, yes. On Aug 18, 2011, at 8:11 PM, Christopher Samuel wrote: > On 18/08/11 23:11, Jeff Squyres wrote: > >> 1.4.4 > > Haven't been keeping up I'm afraid - is 1.4.4 backwards > compatible with 1.4.2 ? > > cheers! > C

Re: [OMPI devel] 1.5.4 and 1.4.4 NEWS items

2011-08-18 Thread Christopher Samuel
On 18/08/11 23:11, Jeff Squyres wrote: > 1.4.4 Haven't been keeping up I'm afraid - is 1.4.4 backwards compatible with 1.4.2 ? cheers! Chris -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computation Ini

Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread George Bosilca
On Aug 18, 2011, at 14:58 , TERRY DONTJE wrote: > > > On 8/18/2011 2:32 PM, George Bosilca wrote: >> Terry, >> >> The test succeeded in both of your runs. >> > Not really. Granted the test aborted in both cases however the case you > show below has further issues while the orte is trying t

Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE
Thought I'd throw this out there: I retraced my MTT steps and did find that there were failures of this test back until r24774. r24775 has a comment that looks very relevant. I am talking to the committer of that change now. Sorry for the false accusation. --td On 8/18/2011 2:32 PM, George

[OMPI devel] 1.5.4 is ready

2011-08-18 Thread Jeff Squyres
MTT looks good, manual ABI testing from 1.5.3 passed, etc. All looks good. I'm updating the web site and will send the release notice shortly. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE
On 8/18/2011 2:32 PM, George Bosilca wrote: > Terry, > > The test succeeded in both of your runs. Not really. Granted, the test aborted in both cases; however, the case you show below has further issues while the orte is trying to clean things up. It certainly is not what I would call friendly. B

Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread George Bosilca
Terry, The test succeeded in both of your runs. However, I rolled back before the epoch change (24814) and the output is the following:
MPITEST info (0): Starting MPI_Errhandler_fatal test
MPITEST info (0): This test will abort after printing the results message
MPITEST info (0): If it does

Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE
Just ran MPI_Errhandler_fatal_c with r25063 and it still fails. Everything is the same except I don't see the "readv failed.." message. Have you tried to run this code yourself? It is pretty simple and fails with one node using np=4. --td On 8/18/2011 10:57 AM, Wesley Bland wrote: I just

Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread Ralph Castain
I doubt that will solve the problem. The issue is that procs are continuing to fail while you are trying to respond to the first one. Here is what happens:
1. first proc fails, causing a "connection failed" error that gets reported to the orted errmgr.
2. errmgr_orted starts trying to send "pro

Re: [OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread Wesley Bland
I just checked in a fix (I hope). I think the problem was that the errmgr was removing children from the list of odls children without using the mutex to prevent race conditions. Let me know if the MTT is still having problems tomorrow. Wes > I am seeing the intel test suite tests MPI_Errhandler_
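The race Wes describes (entries being removed from a shared list of children without holding the mutex) can be illustrated with a small, language-neutral sketch. This is not Open MPI code: `ChildList`, `remove`, `snapshot`, and `simulate_failures` are hypothetical names invented for the illustration, and Python threads stand in for whatever concurrent contexts touch the real odls child list. The only point shown is that every check-then-remove on the shared list happens while the lock is held.

```python
import threading


class ChildList:
    """Hypothetical stand-in for a daemon's list of local child processes."""

    def __init__(self, names):
        self._lock = threading.Lock()
        self._children = list(names)

    def remove(self, name):
        """Remove a child while holding the lock; return True if it was present."""
        with self._lock:
            # The membership check and the removal form one critical section,
            # so two reporters can never both remove the same entry.
            if name in self._children:
                self._children.remove(name)
                return True
            return False

    def snapshot(self):
        """Return a copy of the current children, taken under the lock."""
        with self._lock:
            return list(self._children)


def simulate_failures(child_list, names, reporters=4):
    """Let several threads report the same failed children concurrently."""
    removed = []
    removed_lock = threading.Lock()

    def report():
        for name in names:
            if child_list.remove(name):
                with removed_lock:
                    removed.append(name)

    threads = [threading.Thread(target=report) for _ in range(reporters)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return removed
```

In this sketch each child is removed exactly once no matter how many reporters notice the failure. It only covers the list-protection aspect; it does not address failures that keep arriving while the first one is still being handled.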

[OMPI devel] MPI_Errhandler_fatal_c failure

2011-08-18 Thread TERRY DONTJE
I am seeing the intel test suite tests MPI_Errhandler_fatal_c and MPI_Errhandler_fatal_f fail with an oob failure quite a bit. I have not seen this test failing under MTT until the epoch code was added. So I have a suspicion the epoch code might be at fault. Could someone familiar with the ep

[OMPI devel] 1.5.4 and 1.4.4 NEWS items

2011-08-18 Thread Jeff Squyres
Please check through these; thanks!
1.5.4
- Add support for the (as yet unreleased) Mellanox MXM transport.
- Add support for dynamic service levels (SLs) in the openib BTL.
- Fixed C++ bindings cosmetic/warnings issue with MPI::Comm::NULL_COPY_FN and MPI::Comm::NULL_DELETE_FN. Thanks to