Hi,
I think I'm close to finishing an initial version of the MOSIX support
for Open MPI. A preliminary draft is attached.
The support consists of two modules: an ODLS module for launching processes
under MOSIX, and a BTL module for efficient communication between processes.
I'm not quite there yet -
I can't speak to the BTL itself, but I do have questions as to how this can
work. If MOSIX migrates a process, or starts new processes on another node
during the course of a job, there is no way for MPI to handle the wireup and so
it will fail. We need ALL the procs started at the beginning of t
MOSIX works as a sandbox, wrapping the executed process. Suppose I run
with "-n 3": three processes will be launched via MOSIX on nodes A, B
and C. MOSIX can choose to "migrate" process #2 from B to D - this will
not restart the process, nor will the process know about its current
location unl
I've added some documentation and made a few other changes in the hope
of making the code more readable (the attached diff replaces the
previous one), though the BTL is still giving me that error. There are
some TODOs marking places where I was unsure about the code (it should
still work - I'm not