Adrian,

Has there been any progress on this bug? If you still cannot reproduce it, send either Tim Prins or me a debugging patch and we can run it. Alternatively, we can try to arrange access to one of our machines for you.

This bug is making it difficult for us to continue working off of the trunk since we get these connection errors so frequently.

-- Josh

On Apr 18, 2008, at 2:26 PM, Tim Prins wrote:

To echo what Josh said, there are no special compile flags being used.
If you send me a patch with debug output, I'd be happy to run it for you.

Both odin and sif are fairly normal Linux-based clusters, with Ethernet
and openib IP networks. The Ethernet network has both IPv4 and IPv6, and
the openib network runs IPv4.

Tim

Adrian Knoth wrote:
On Fri, Apr 18, 2008 at 01:00:40PM -0400, Josh Hursey wrote:

The trick is to force Open MPI to use only tcp,self and nothing else.
Did you try adding this (-mca btl tcp,self) to the runtime parameter
set?

Sure. Even with 64 processes, I cannot trigger this behaviour. Neither
on Linux nor Solaris.

Any special compile flags?

I guess a little bit more debug output could probably reveal the
culprit.


