Adrian,
Has there been any progress on this bug? If you still cannot reproduce
it, send either Tim Prins or me a debugging patch and we can run with
it. Or we can try to arrange access to one of our machines for you.
This bug is making it difficult for us to continue working off of the
trunk since we get these connection errors so frequently.
-- Josh
On Apr 18, 2008, at 2:26 PM, Tim Prins wrote:
To echo what Josh said, there are no special compile flags being used.
If you send me a patch with debug output, I'd be happy to run it for
you.
Both odin and sif are fairly normal Linux-based clusters, with ethernet
and openib IP networks. The ethernet network has both ipv4 & ipv6, and
the openib network runs ipv4.
Tim
Adrian Knoth wrote:
On Fri, Apr 18, 2008 at 01:00:40PM -0400, Josh Hursey wrote:
The trick is to force Open MPI to use only tcp,self and nothing
else.
Did you try adding this (-mca btl tcp,self) to the runtime parameter
set?
Sure. Even with 64 processes, I cannot trigger this behaviour. Neither
on Linux nor Solaris.
Any special compile flags?
I guess a little bit more debug output could probably reveal the
culprit.