Greetings all,

I have been experimenting and wanted to share what I've learned, so that
anyone else who finds this thread can save some time, though it probably
won't be useful to many people.

I've been trying to get some MPI workloads running in marss using the
OpenMPI shared memory interface. I know this sort of defeats the spirit of
MPI, but we have an interest in such workloads. I can't imagine it is very
easy to set up multiple marss instances with a network between them to do
*actual* message passing, so this is one avenue to run multiple MPI ranks
without multiple marss instances.

I started with the parsec ROI image, converted it to a raw image, and
chrooted into it so I could use apt-get and build things without the
slowdown of running inside qemu.
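
For anyone repeating this, the convert-and-chroot step might look roughly
like the sketch below. The file names and mount point are hypothetical, and
it assumes the downloaded image is in qcow2 format with 512-byte sectors;
check the partition's start sector with fdisk -l on the raw image before
mounting.

```shell
# Hypothetical names; adjust to your setup.
SRC_IMAGE="parsec_roi.qcow2"    # assumption: the downloaded image is qcow2
RAW_IMAGE="parsec_roi.raw"
MOUNT_POINT="/mnt/marss_root"

# Convert the disk image to raw format.
convert_to_raw() {
    qemu-img convert -O raw "$SRC_IMAGE" "$RAW_IMAGE"
}

# Byte offset of a partition, given its start sector as printed by
# `fdisk -l` (assumes 512-byte sectors).
partition_offset_bytes() {
    echo $(( $1 * 512 ))
}

# Loop-mount the root partition and chroot into it. Needs root.
# Usage: enter_image_chroot <start_sector>
enter_image_chroot() {
    mkdir -p "$MOUNT_POINT" || return 1
    mount -o loop,offset="$(partition_offset_bytes "$1")" \
        "$RAW_IMAGE" "$MOUNT_POINT" || return 1
    # apt-get inside the chroot needs working DNS.
    cp /etc/resolv.conf "$MOUNT_POINT/etc/resolv.conf"
    chroot "$MOUNT_POINT" /bin/bash
    # Runs once you exit the chroot shell.
    umount "$MOUNT_POINT"
}
```

As root, run convert_to_raw, read the start sector from
fdisk -l parsec_roi.raw, and pass it to enter_image_chroot.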

0. My initial attempt at simply apt-getting libopenmpi failed. I thought
it was due to an old binary, so I built 1.4.5 from source. It turns out the
orte_init() error was caused by missing ssh/rsh [see #3 below], so it could
be that you can simply apt-get install libopenmpi. I did find some bug
reports that the shared memory mode is broken in 1.3.x, though, so use at
your own peril.
1. Due to the age of the Ubuntu distro, you have to change the entries in
/etc/apt/sources.list to point to old-releases.ubuntu.com and then run
apt-get update.
2. Do not try to apt-get openssh-client; this will break your image. It
turns out openssh-client pulls in a few things like mountall and upstart,
which confuse the boot process so that your root partition won't mount. I
didn't investigate why this fails. Perhaps a full dist-upgrade to pull in
new kernels and a new libc would have fixed it, but I didn't try it.
3. OpenMPI requires either ssh or rsh for some reason, even if you're using
the shared memory system and have no networked nodes. Since installing ssh
from apt-get breaks the boot process, you should install rsh-client instead.
It pulls in no dependencies and gets rid of the errors in orte_init() when
trying to run mpirun.
4. To enable the sm mode in MPI, issue the following command:
   mpirun -mca btl self,sm -np 4 ./your_mpi_binary
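
Steps 1, 3 and 4 can be sketched as a small script to run inside the
chroot. This is only an illustration: the function names are made up, and
the sed pattern assumes your sources.list points at the usual
archive.ubuntu.com / security.ubuntu.com hosts (optionally with a country
prefix such as us.), so check the file by hand afterwards.

```shell
# Rewrite the Ubuntu mirror hosts in a sources.list file so they point
# at old-releases.ubuntu.com.
# Usage: use_old_releases /etc/apt/sources.list
use_old_releases() {
    list_file="$1"
    tmp_file="$(mktemp)" || return 1
    sed -E 's#https?://([a-z]+\.)?(archive|security)\.ubuntu\.com#http://old-releases.ubuntu.com#g' \
        "$list_file" > "$tmp_file" && cat "$tmp_file" > "$list_file"
    rm -f "$tmp_file"
}

# Steps 1 and 3: fix the package sources, then install rsh-client
# (not openssh-client, which breaks the boot process). Needs root;
# defined here but not called automatically.
setup_mpi_prereqs() {
    use_old_releases /etc/apt/sources.list || return 1
    apt-get update || return 1
    apt-get install -y rsh-client
}

# Step 4: run an MPI binary over the shared memory transport.
# Usage: run_mpi_sm <ranks> <binary>
run_mpi_sm() {
    mpirun -mca btl self,sm -np "$1" "$2"
}
```

For example, setup_mpi_prereqs followed by run_mpi_sm 4 ./your_mpi_binary
matches the command in step 4.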

That's about it.
Happy simulating.
_______________________________________________
http://www.marss86.org
Marss86-Devel mailing list
https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel