For the list: we figured this out. These neighbor tests require np>=4 (whew!).
I added minimum np checks to the tests so that they'll skip (exit 77) if np<4.
Nathan and I worked through the other three tests.


On Mar 18, 2014, at 11:22 PM, Ralph Castain <r...@open-mpi.org> wrote:

> Just to be safe, I blew away my existing installations and got completely 
> fresh checkouts. I am doing a vanilla configure, with the only configure 
> options besides prefix being --enable-orterun-prefix-by-default and 
> --enable-mpi-java (so I can test the Java bindings)
> 
> For 1.7.5, running the IBM test suite, I get the following failures on my 
> 2-node cluster, running map-by node:
> 
> *** WARNING: Test: ineighbor_allgatherv, np=2, variant=1: FAILED
> *** WARNING: Test: neighbor_allgatherv, np=2, variant=1: FAILED
> *** WARNING: Test: ineighbor_alltoallv, np=2, variant=1: FAILED
> *** WARNING: Test: ineighbor_alltoall, np=2, variant=1: FAILED
> *** WARNING: Test: neighbor_alltoallw, np=2, variant=1: FAILED
> *** WARNING: Test: neighbor_alltoallv, np=2, variant=1: FAILED
> *** WARNING: Test: neighbor_alltoall, np=2, variant=1: FAILED
> *** WARNING: Test: ineighbor_alltoallw, np=2, variant=1: FAILED
> *** WARNING: Test: ineighbor_allgather, np=2, variant=1: FAILED
> *** WARNING: Test: neighbor_allgather, np=2, variant=1: FAILED
> *** WARNING: Test: create_group_usempi, np=2, variant=1: FAILED
> *** WARNING: Test: create_group_mpifh, np=2, variant=1: FAILED
> *** WARNING: Test: create_group, np=2, variant=1: FAILED
> *** WARNING: Test: idx_null, np=2, variant=1: FAILED
> 
> 
> From the Intel test suite:
> 
> *** WARNING: Test: MPI_Keyval3_c, np=6, variant=1: FAILED
> *** WARNING: Test: MPI_Allgatherv_c, np=6, variant=1: TIMED OUT (failed)
> *** WARNING: Test: MPI_Graph_create_undef_c, np=6, variant=1: FAILED
> 
> I subsequently removed the map-by node directive so everything basically ran 
> on the head node with mpirun, just in case having the procs on separate nodes 
> was the cause of the problem. However, the exact same failures were observed 
> again.
> 
> Note that the 1.7.5 branch ran clean (except for idx_null, which we 
> understand) yesterday, so this is caused by something new today. I then 
> tested the trunk and got the identical errors.
> 
> I don't see how we can release with this situation, so we appear to be stuck 
> until someone can figure out what happened and fix it.
> Ralph
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/03/14369.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
