On 27/10/16 16:00, Darren Wise wrote:

> Along with seven dual socket, quadcore AMD x86-64 CISC nodes running
> ubuntu 16.4LTS, MPICH and OpenMPI are giving me some strange errors but
> as soon as I opt out the SUN box everything runs smoothly.

You're not just mixing architectures, you're mixing Debian-derived
distros on x86 with RHEL-derived distros on SPARC.  So you are likely
mixing quite different OpenMPI versions as well.
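
A quick way to confirm that is to compare what `mpirun --version`
reports on each node.  Here is a minimal Python sketch of the comparison,
assuming OpenMPI's usual "mpirun (Open MPI) X.Y.Z" banner.  The host
names, sample versions and helper names (parse_ompi_version,
group_by_version) are made up for illustration, not taken from your
cluster.

```python
import re
from collections import defaultdict

# Assumed banner format from `mpirun --version`, e.g. "mpirun (Open MPI) 1.10.2".
_BANNER = re.compile(r"\(Open MPI\)\s+(\d+(?:\.\d+)*)")


def parse_ompi_version(banner_text):
    """Return the version as a tuple of ints, or None if no Open MPI banner is found."""
    match = _BANNER.search(banner_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def group_by_version(banners):
    """Map each parsed version to the sorted list of hosts reporting it."""
    groups = defaultdict(list)
    for host, text in banners.items():
        groups[parse_ompi_version(text)].append(host)
    return {version: sorted(hosts) for version, hosts in groups.items()}


if __name__ == "__main__":
    # Hypothetical sample data; in practice you would collect the text
    # yourself, for example by running `mpirun --version` on each host.
    sample = {
        "x86-node1": "mpirun (Open MPI) 1.10.2\n",
        "x86-node2": "mpirun (Open MPI) 1.10.2\n",
        "sparc-box": "mpirun (Open MPI) 1.6.5\n",
    }
    for version, hosts in group_by_version(sample).items():
        print(version, hosts)
```

If more than one version group comes back, the nodes are not running
matching OpenMPI installs.  A host that lands under None did not print
an Open MPI banner at all (it may be picking up MPICH's launcher
instead).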

I'd suggest building the latest OpenMPI 1.10.x release from source on
both and trying that instead.

Ideally I'd suggest building Slurm on all architectures, then building
OpenMPI 1.10.x with the --with-slurm configure flag and launching your
jobs with srun, so Slurm can do the MPI wire-up for you (and you don't
have to bother with SSH issues).  That might be over-engineering it
though. :-)
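
For what it's worth, the build-and-launch sequence I have in mind looks
roughly like the following.  This is only a dry-run sketch in Python
that prints the commands rather than running them.  The install prefix,
job count, task count and program name are placeholders, and depending
on how your Slurm is built you may also need to point OpenMPI's
configure at Slurm's PMI library for direct srun launches to work.

```python
import shlex


def build_commands(prefix, jobs=4):
    """Return the configure/make steps as argv lists (nothing is executed)."""
    return [
        ["./configure", f"--prefix={prefix}", "--with-slurm"],
        ["make", f"-j{jobs}"],
        ["make", "install"],
    ]


def launch_command(ntasks, program):
    """Return an srun invocation that starts `ntasks` copies of `program`."""
    return ["srun", "-n", str(ntasks), program]


if __name__ == "__main__":
    # Placeholder prefix and program name; adjust for your own site.
    steps = build_commands("/opt/openmpi-1.10")
    steps.append(launch_command(16, "./hello_mpi"))
    for argv in steps:
        print(" ".join(shlex.quote(arg) for arg in argv))
```

The same steps would be run in the OpenMPI source tree on each
architecture, so that every node ends up with the same OpenMPI release.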

-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: [email protected] Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
