On Thu, Feb 06, 2014 at 02:45:07PM -0800, Ralph Castain wrote:
> On Feb 6, 2014, at 2:16 PM, Adrian Reber wrote:
>
> > Josh explained to me a few days ago that after a checkpoint has been
> > received, TCP should no longer be used, so that no messages are lost. The
> > communication happens over na
On 13/02/2014 16:36, Ralph Castain wrote:
> Hi Marco
> Quick question for you: we don't support Windows any more anyway. If we just
> remove the #if WIN32 cruft, would that solve the problem?
in theory yes.
Regards
Marco
Hi Marco
Quick question for you: we don't support Windows any more anyway. If we just
remove the #if WIN32 cruft, would that solve the problem?
On Feb 12, 2014, at 11:09 PM, Marco Atzeri wrote:
>
>
> On 12/02/2014 04:18, Ralph Castain wrote:
>> Things are looking relatively good - I see two
Okay, this exposed the problem. The issue is that "ib0" on the two machines is
defined on two completely different IP subnets:
linuxbmc0008: 134.61.202.7
linuxscc004: 192.168.222.4
The OOB doesn't think those two are directly reachable by each other as the
IP/subnet-mask don't match - we
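
The subnet comparison described above can be sketched roughly as follows. This is only an illustration of the idea, not Open MPI's actual OOB code: `same_subnet` is a hypothetical helper, and the /24 prefix length is an assumption, since the thread does not give the netmasks configured on ib0.

```python
import ipaddress


def same_subnet(addr_a: str, addr_b: str, prefix_len: int) -> bool:
    """Return True if both addresses fall in the same network.

    prefix_len is an assumed netmask length (hypothetical; the
    actual masks on ib0 are not stated in the thread).
    """
    # ip_interface keeps the host bits; .network masks them off.
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b


if __name__ == "__main__":
    # ib0 addresses quoted in the message above, assumed /24 mask.
    print(same_subnet("134.61.202.7", "192.168.222.4", 24))
```

Under that assumed mask the two ib0 addresses land in different networks, which matches the observation that they are not treated as directly reachable.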
Attached the output from openmpi/1.7.5a1r30708
$ $MPI_BINDIR/mpiexec -mca oob_tcp_if_include ib0 -mca oob_base_verbose 100 \
    -H linuxscc004 -np 1 hostname 2>&1 \
    | tee oob_base_verbose-linuxbmc0008-175a1r29587.txt
Well, some 5 lines added.
(The ib0 on linuxscc004 is not reachable from linuxbmc00
On 12/02/2014 04:18, Ralph Castain wrote:
> Things are looking relatively good - I see two recurring failures:
> 1. idx_null - no idea what that test does, but it routinely fails
> 2. intercomm_create - this is the 3-way connect/accept/merge. Nathan - I
> believe you had a fix for that?
> Ralph
o
yes, now it is fine
Thanks!
On Wed, Feb 12, 2014 at 8:37 PM, Jeff Squyres (jsquyres) wrote:
> Mike -- this should be fixed. Has Jenkins been re-run yet?
>
>
> On Feb 12, 2014, at 9:30 AM, Ralph Castain
> wrote:
>
> > I can't reproduce this regardless - since you are using a git mirror,
> ar