[OMPI devel] Still problems with del_procs in trunk

2014-05-23 Thread Rolf vandeVaart
I am still seeing problems with del_procs with openib. Do we believe everything should be working? This is with the latest trunk (updated 1 hour ago).

[rvandevaart@drossetti-ivy0 examples]$ mpirun --mca btl_openib_if_include mlx5_0:1 -np 2 -host drossetti-ivy0,drossetti-ivy1 connectivity_c
Con

Re: [OMPI devel] Still problems with del_procs in trunk

2014-05-25 Thread Gilles Gouaillardet
Rolf,

The assert fails because the endpoint reference count is greater than one. The root cause is that the endpoint has been added to the list of eager_rdma_buffers of the openib btl device (and hence OBJ_RETAIN'ed at ompi/mca/btl/openib/btl_openib_endpoint.c:1009). A simple workaround is not to use e
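The reference-count situation described above can be illustrated with a small standalone C sketch. This is not Open MPI code: `endpoint_t`, `device_t`, `device_add_eager`, `device_remove_eager`, and `endpoint_safe_to_delete` are hypothetical names invented for illustration, and the plain integer counter only stands in for what OBJ_RETAIN does to the real object. It shows why a check that expects a count of one fails while a device list still holds a reference.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EAGER 8

/* Hypothetical stand-in for a reference-counted endpoint object. */
typedef struct {
    int refcount;
} endpoint_t;

/* Hypothetical stand-in for the device's list of eager RDMA endpoints. */
typedef struct {
    endpoint_t *eager_rdma_buffers[MAX_EAGER];
    size_t count;
} device_t;

/* A freshly created endpoint is owned once, by its creator. */
static void endpoint_init(endpoint_t *ep) { ep->refcount = 1; }

/* Adding the endpoint to the device list takes an extra reference,
 * analogous to the OBJ_RETAIN mentioned in the message.
 * Returns 0 on success, -1 if the list is full. */
static int device_add_eager(device_t *dev, endpoint_t *ep)
{
    if (dev->count >= MAX_EAGER) {
        return -1;
    }
    dev->eager_rdma_buffers[dev->count++] = ep;
    ep->refcount++;
    return 0;
}

/* Removing the endpoint from the list drops that extra reference.
 * This is one way to balance the count in this toy model only; it is
 * not a claim about how the real problem was or should be fixed. */
static void device_remove_eager(device_t *dev, endpoint_t *ep)
{
    for (size_t i = 0; i < dev->count; i++) {
        if (dev->eager_rdma_buffers[i] == ep) {
            /* Fill the hole with the last entry and shrink the list. */
            dev->eager_rdma_buffers[i] = dev->eager_rdma_buffers[--dev->count];
            ep->refcount--;
            return;
        }
    }
}

/* Models the failing assertion: deletion expects exactly one owner. */
static int endpoint_safe_to_delete(const endpoint_t *ep)
{
    return ep->refcount == 1;
}
```

In this model, calling `device_add_eager` on an initialized endpoint raises its count to two, so `endpoint_safe_to_delete` returns 0 until `device_remove_eager` is called for the same endpoint.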

Re: [OMPI devel] Still problems with del_procs in trunk

2014-05-27 Thread Nathan Hjelm
On Mon, May 26, 2014 at 12:09:38PM +0900, Gilles Gouaillardet wrote:
> Rolf,
>
> the assert fails because the endpoint reference count is greater than one.
> the root cause is the endpoint has been added to the list of
> eager_rdma_buffers of the openib btl device (and hence OBJ_RETAIN