On 27/10/16 16:00, Darren Wise wrote:

> Along with seven dual socket, quadcore AMD x86-64 CISC nodes running
> ubuntu 16.4LTS, MPICH and OpenMPI are giving me some strange errors but
> as soon as I opt out the SUN box everything runs smoothly.

You're not just mixing architectures, you're mixing Debian-derived distros on x86 with RHEL-derived distros on SPARC. So you are likely mixing quite different OpenMPI versions as well.

I'd suggest building the latest OpenMPI 1.10.x release from source on both and trying that instead.

Ideally I'd suggest building Slurm on all architectures, then building OpenMPI 1.10.x with the --with-slurm flag and launching with srun, so Slurm can do the MPI wire-up for you (and you don't have to bother with SSH issues).

That might be over-engineering it though. :-)

-- 
Christopher Samuel
Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: [email protected]
Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/
http://twitter.com/vlsci
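
One way to confirm the suspected version mismatch before rebuilding anything is to compare what `mpirun --version` reports on each node. The following is only a minimal sketch: the function names and the example hostnames are hypothetical, it assumes passwordless SSH to every node and that `mpirun` on the remote PATH is the Open MPI one, and it assumes the usual `mpirun (Open MPI) X.Y.Z` banner format.

```python
import re
import subprocess

# Assumed banner format: "mpirun (Open MPI) 1.10.4"
VERSION_RE = re.compile(r"\(Open MPI\)\s+(\d+(?:\.\d+)+)")


def parse_open_mpi_version(banner):
    """Return the Open MPI version found in the banner text, or None."""
    match = VERSION_RE.search(banner)
    return match.group(1) if match else None


def mismatched_hosts(banners):
    """Given {host: banner}, return {host: version} if versions differ, else {}."""
    versions = {host: parse_open_mpi_version(text) for host, text in banners.items()}
    return versions if len(set(versions.values())) > 1 else {}


def collect_banners(hosts):
    """Run `mpirun --version` on each host over SSH (assumes passwordless SSH)."""
    banners = {}
    for host in hosts:
        result = subprocess.run(
            ["ssh", host, "mpirun", "--version"],
            capture_output=True,
            text=True,
        )
        # Some builds print the banner on stderr, so keep both streams.
        banners[host] = result.stdout + result.stderr
    return banners


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the real node names.
    report = mismatched_hosts(collect_banners(["x86-node1", "sparc-node1"]))
    for host, version in sorted(report.items()):
        print(host, version or "no Open MPI banner found")
```

A host that reports no Open MPI banner at all (for example because `mpirun` there belongs to MPICH) shows up as a mismatch too, which is also worth knowing before mixing the nodes in one job.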
