Greetings,

I'm having trouble getting Open MPI 2.1.2 to work when launching a
process from a Debian 8 host on a remote Debian 9 host. To keep things
simple in this example, I'm just launching date on the remote host.

deb8host$ mpirun -H deb9host date
[deb8host:01552] [[32763,0],0] ORTE_ERROR_LOG: Error in file base/plm_base_launch_support.c at line 954

It works fine when executed from Debian 9 against Debian 8:
deb9host$ mpirun -H deb8host date
Sat Jul 21 13:40:43 CDT 2018

It also works when executed from Debian 8 against Debian 8:
deb8host:~$ mpirun -H deb8host2 date
Sat Jul 21 13:55:57 CDT 2018

The failure results from an error code returned by:
opal_dss.unpack(buffer, &topo, &idx, OPAL_HWLOC_TOPO)

Open MPI was built with the same configure flags on both hosts:

        --prefix=$(PREFIX) \
        --with-verbs \
        --with-libfabric \
        --disable-silent-rules \
        --with-hwloc=/usr \
        --with-libltdl=/usr \
        --with-devel-headers \
        --with-slurm \
        --with-sge \
        --without-tm \
        --disable-heterogeneous \
        --with-contrib-vt-flags=--disable-iotrace \
        --sysconfdir=$(PREFIX)/etc \
        --libdir=$(PREFIX)/lib \
        --includedir=$(PREFIX)/include


deb9host: libhwloc and libhwloc-plugins are 1.11.5-1
deb8host: libhwloc and libhwloc-plugins are 1.10.0-3
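
To make the one difference I know of between the hosts explicit, here is
a small Python sketch that groups hosts by upstream hwloc version. The
version strings are the ones listed above; the function names are my own
and are not part of Open MPI or hwloc. I have not confirmed that the
hwloc difference is what causes the unpack failure, so treat this as a
way to check the environment, not as a diagnosis.

```python
# Package versions copied from the message above. The host names are the
# same placeholders used in the mpirun examples.
HWLOC_VERSIONS = {
    "deb8host": "1.10.0-3",
    "deb9host": "1.11.5-1",
}


def upstream_version(debian_version):
    """Drop the Debian revision ("-3") and return the upstream part as ints.

    Assumes a plain "X.Y.Z-rev" string with no epoch prefix.
    """
    upstream = debian_version.rsplit("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


def mismatched_hosts(versions):
    """Group hosts by upstream version.

    Returns an empty dict when every host has the same upstream version,
    otherwise a dict mapping each version tuple to its list of hosts.
    """
    groups = {}
    for host, version in versions.items():
        groups.setdefault(upstream_version(version), []).append(host)
    return groups if len(groups) > 1 else {}


if __name__ == "__main__":
    # Print one line per distinct upstream version, if there is a mismatch.
    for version, hosts in sorted(mismatched_hosts(HWLOC_VERSIONS).items()):
        print(".".join(str(part) for part in version), sorted(hosts))
```

Hosts that differ only in the Debian revision are treated as matching,
since the upstream hwloc release is the same in that case.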

I've been trying to debug this for the past few days and would
appreciate any help on determining why this failure is occurring
and/or resolving the problem.

-- 
Brian T. Smith
System Fabric Works
Senior Technical Staff
bsm...@systemfabricworks.com
GPG Key: B3C2C7B73BA3CD7F
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users