2015-10-26 8:04 GMT+01:00 Gilles Gouaillardet:
> Federico,
>
> that looks good to me.
> the image does not show the channel between orted and its children.
> this is currently a TCP socket (v1.10) and we are moving to a Unix socket
> (already in master)
>
>
Which is the framework involved in this
Thank you guys, your help is really appreciated! We'll keep in touch for
further information.
Gianmario
On 23 Oct 2015 12:44, "Jeff Squyres (jsquyres)" wrote:
> On Oct 22, 2015, at 7:17 AM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
> >
> > Gianmario,
> >
> > there was c/
Federico,
that looks good to me.
the image does not show the channel between orted and its children.
this is currently a TCP socket (v1.10) and we are moving to a Unix
socket (already in master)
Cheers,
Gilles
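
To make the TCP versus Unix socket distinction above concrete, here is a minimal sketch of a local daemon/child channel built both ways. It is not Open MPI code: the function names (tcp_channel, unix_channel, round_trip) and the socket path are hypothetical, chosen only for illustration. The point is the addressing difference: the TCP variant is identified by an IP address and port, the Unix variant by a filesystem path on the local host.

```python
import os
import socket
import tempfile


def tcp_channel():
    """Loopback TCP pair (hypothetical stand-in for a daemon/child channel)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    child = socket.create_connection(listener.getsockname())
    daemon, _ = listener.accept()
    listener.close()
    return daemon, child


def unix_channel(path):
    """Unix-domain pair; the endpoint is a path in the local filesystem."""
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(1)
    child = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    child.connect(path)
    daemon, _ = listener.accept()
    listener.close()
    return daemon, child


def round_trip(daemon, child, payload=b"hello from child"):
    """Send payload child -> daemon and return what the daemon received."""
    child.sendall(payload)
    received = b""
    while len(received) < len(payload):  # stream sockets may deliver in pieces
        chunk = daemon.recv(len(payload) - len(received))
        if not chunk:
            break
        received += chunk
    daemon.close()
    child.close()
    return received


if __name__ == "__main__":
    print(round_trip(*tcp_channel()))
    with tempfile.TemporaryDirectory() as tmp:
        print(round_trip(*unix_channel(os.path.join(tmp, "toy.sock"))))
```

Both variants carry the same bytes; only the way the endpoint is named differs, which is what matters when a process is restored somewhere else.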
On 10/26/2015 3:28 PM, Federico Reghenzani wrote:
Hi Gilles,
thank you again fo
Hi Gilles,
thank you again for your great answer. Our idea is to migrate tasks
between nodes, possibly individually, while the other tasks keep running
(obviously, if they want to communicate with a "migrating" node, we should
pause them).
Just to be sure we have understood correctly, is the attached i
Each module has the opportunity to provide an ft_event function, which is
supposed to be called when a change in the module's behavior is necessary.
Thus, it is relatively easy to let the BTL know that a particular
destination process will migrate to a new location.
George.
On Fri, Oc
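
The ft_event idea George describes can be sketched as an optional per-module callback that a coordinator invokes on every module that provides one. This is only an illustration under stated assumptions: the class names, the FtState values, and the peer/new_address arguments are all hypothetical and do not reproduce Open MPI's actual ft_event signature.

```python
from enum import Enum


class FtState(Enum):
    # Hypothetical states, invented for this sketch.
    MIGRATE = "migrate"    # a peer is about to move: stop talking to it
    CONTINUE = "continue"  # the peer is reachable again, maybe elsewhere


class ToyBtl:
    """Toy transport module that tracks where each peer lives."""

    name = "toy_btl"

    def __init__(self, peers):
        self.peers = dict(peers)  # peer rank -> address
        self.paused = set()       # peers we must not send to right now

    def ft_event(self, state, peer, new_address=None):
        if state is FtState.MIGRATE:
            self.paused.add(peer)
        elif state is FtState.CONTINUE:
            if new_address is not None:
                self.peers[peer] = new_address
            self.paused.discard(peer)


class ToyModuleWithoutHook:
    """A module that does not provide ft_event at all."""

    name = "no_hook"


def notify_all(modules, state, **details):
    """Call ft_event on every module that provides one; return their names."""
    called = []
    for module in modules:
        hook = getattr(module, "ft_event", None)
        if hook is not None:
            hook(state, **details)
            called.append(module.name)
    return called


if __name__ == "__main__":
    btl = ToyBtl({3: "node-a"})
    modules = [btl, ToyModuleWithoutHook()]
    notify_all(modules, FtState.MIGRATE, peer=3)
    notify_all(modules, FtState.CONTINUE, peer=3, new_address="node-b")
    print(btl.peers, btl.paused)
```

Modules without the hook are simply skipped, which mirrors the "opportunity to provide" wording: the callback is optional, not mandatory.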
On Oct 22, 2015, at 7:17 AM, Gilles Gouaillardet
wrote:
>
> Gianmario,
>
> there was c/r support in the v1.6 series but it has been removed.
To be specific: the C/R support was removed from the v2.x branch because it is
stale / not working. The support is still in master, albeit with Adrian'
Gianmario,
Iirc, there is one pipe between orted and each child's stderr.
stdout is a pty, and stdin is /dev/null, but it might be a pipe on task 0.
This is how stdout/stderr from the tasks ends up being printed by mpirun: orted
does i/o forwarding (aka IOF)
are you trying to migrate only one ta
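
The I/O forwarding arrangement described above can be illustrated with a small sketch in which a parent process plays the daemon's role. It is an assumption-laden toy, not orted's implementation: run_with_iof and tag_lines are hypothetical names, and stdout uses a plain pipe here for simplicity, whereas the message above says the real stdout is a pty.

```python
import subprocess
import sys


def run_with_iof(argv):
    """Spawn argv with daemon-style descriptors and collect its output."""
    child = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,  # stdin is /dev/null
        stdout=subprocess.PIPE,    # simplification: a pipe instead of a pty
        stderr=subprocess.PIPE,    # one pipe per child for stderr
    )
    # communicate() drains both pipes concurrently, avoiding a deadlock
    # when the child fills one pipe while the parent reads the other.
    out, err = child.communicate()
    return child.returncode, out, err


def tag_lines(tag, data):
    """Label each forwarded line with the stream it came from."""
    return [f"[{tag}] {line}" for line in data.decode(errors="replace").splitlines()]


if __name__ == "__main__":
    code = "import sys; print('hello'); print('oops', file=sys.stderr)"
    rc, out, err = run_with_iof([sys.executable, "-c", code])
    for line in tag_lines("stdout", out) + tag_lines("stderr", err):
        print(line)
```

Because the child's descriptors are owned by the parent, anything that checkpoints the child alone has to account for these pipe ends as well.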
Hi Adrian and Gilles,
first of all thank you for your responses. I'm working with Gianmario on
this ambitious project.
2015-10-22 13:17 GMT+02:00 Gilles Gouaillardet <
gilles.gouaillar...@gmail.com>:
> Gianmario,
>
> there was c/r support in the v1.6 series but it has been removed.
> the current
Gianmario,
there was c/r support in the v1.6 series but it has been removed.
the current trend is to do application level checkpointing
(much more efficient and much smaller checkpoint file size)
iirc, ompi took care of closing/restoring all communication, and a third
party checkpoint was require
On Thu, Oct 22, 2015 at 12:15:22PM +0200, Gianmario Pozzi wrote:
> My team and I are working on the possibility to checkpoint a process and
> restart it on another node. We are using the CRIU framework for the
> checkpoint/restart part, but we are facing some issues related to migration.
>
> First
Hi everyone!
My team and I are working on the possibility to checkpoint a process and
restart it on another node. We are using the CRIU framework for the
checkpoint/restart part, but we are facing some issues related to migration.
First of all: we found out that some attempts to C/R an OMPI proces