Let's chat off-list about it - I don't see exactly how this works, but it may 
be similar enough. 


On Aug 27, 2011, at 8:30 AM, Joshua Hursey wrote:

> There is a 'self' checkpointer (CRS component) that does application-level 
> checkpointing, exposed at the MPI level. I don't know how different what you 
> are working on is, but maybe something like that could be harnessed. Note 
> that I have not tested the 'self' checkpointer with the process migration 
> support; it -should- work, but there might be some bugs to work out.
> 
> Documentation and examples at the link below:
>  http://osl.iu.edu/research/ft/ompi-cr/examples.php#example-self
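> 
> To make the idea concrete, here is a minimal, self-contained C sketch of 
> application-level ('self') checkpointing: the application itself saves and 
> restores whatever state it needs. The file name and layout are made up for 
> illustration; this is not the Open MPI CRS callback interface, for which 
> see the examples page above.
> 
>   #include <stdio.h>
> 
>   #define CKPT_FILE "app_state.ckpt"  /* hypothetical checkpoint file */
> 
>   /* Restore the loop counter from a prior checkpoint, if one exists. */
>   static long restore_state(void) {
>       long i = 0;
>       FILE *f = fopen(CKPT_FILE, "r");
>       if (f != NULL) {
>           if (fscanf(f, "%ld", &i) != 1) i = 0;
>           fclose(f);
>       }
>       return i;
>   }
> 
>   /* Save everything needed to resume; here, just the loop counter. */
>   static void checkpoint_state(long i) {
>       FILE *f = fopen(CKPT_FILE, "w");
>       if (f != NULL) {
>           fprintf(f, "%ld\n", i);
>           fclose(f);
>       }
>   }
> 
>   int main(void) {
>       for (long i = restore_state(); i < 1000000; i++) {
>           /* ... real work goes here ... */
>           if (i % 1000 == 0) checkpoint_state(i);
>       }
>       return 0;
>   }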
> 
> -- Josh
> 
> On Aug 26, 2011, at 6:17 PM, Ralph Castain wrote:
> 
>> FWIW: I'm in the process of porting some code from a branch that allows apps 
>> to do on-demand checkpoint/recovery-style operations at the app level. 
>> Specifically, it provides the ability to:
>> 
>> * request a "recovery image": an application-level blob containing the 
>> state information the app needs to recover.
>> 
>> * register a callback point for providing a "recovery image", either to 
>> store for later use (a separate API indicates when to acquire it) or to 
>> provide to another process upon request.
>> 
>> This is at the RTE level, so it would have to be exposed via an appropriate 
>> MPI call to be used at that layer (I'm open to changes to support that use, 
>> if anyone is interested).
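>> 
>> As a rough illustration (not the actual branch code), an RTE-level API 
>> along these lines might look like the following C sketch. All of the names 
>> here are hypothetical placeholders:
>> 
>>   #include <stddef.h>
>> 
>>   /* A "recovery image": an opaque application-level blob of state info. */
>>   typedef struct {
>>       void  *data;
>>       size_t size;
>>   } recovery_image_t;
>> 
>>   /* Callback the app registers to produce its current recovery image. */
>>   typedef int (*recovery_cb_t)(recovery_image_t *img);
>> 
>>   static recovery_cb_t registered_cb = NULL;
>> 
>>   /* Register a callback point for providing a recovery image. */
>>   int rte_register_recovery_cb(recovery_cb_t cb) {
>>       registered_cb = cb;
>>       return 0;
>>   }
>> 
>>   /* Request a recovery image, either to store for later use or to hand 
>>    * to another process upon request. */
>>   int rte_request_recovery_image(recovery_image_t *img) {
>>       if (registered_cb == NULL) return -1;
>>       return registered_cb(img);
>>   }
>> 
>>   /* Example: an app whose entire recoverable state is one counter. */
>>   static long counter;
>>   static int my_cb(recovery_image_t *img) {
>>       img->data = &counter;
>>       img->size = sizeof(counter);
>>       return 0;
>>   }
>> 
>>   int main(void) {
>>       recovery_image_t img;
>>       rte_register_recovery_cb(my_cb);
>>       return rte_request_recovery_image(&img);
>>   }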
>> 
>> 
>> On Aug 26, 2011, at 3:16 PM, Josh Hursey wrote:
>> 
>>> There are some great comments in this thread. Process migration (like
>>> many topics in systems) can get complex fast.
>>> 
>>> The Open MPI process migration implementation is checkpoint/restart-based
>>> (currently using BLCR) and uses an 'eager' style of migration.
>>> This style of migration stops a process completely on the source
>>> machine, checkpoints/terminates it, restarts it on the destination
>>> machine, then rejoins it with the other running processes. I think the
>>> only documentation that we have is at the webpage below (and my PhD
>>> thesis, if you want the finer details):
>>> http://osl.iu.edu/research/ft/ompi-cr/
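>>> 
>>> In outline, the 'eager' sequence above looks like the following C sketch. 
>>> The helpers are empty stubs standing in for the real checkpoint/restart 
>>> machinery (BLCR plus the Open MPI runtime), so this compiles but is only 
>>> an illustration of the ordering, not a working implementation:
>>> 
>>>   /* Hypothetical stand-ins for real process/node/snapshot handles. */
>>>   typedef int proc_t;
>>>   typedef int node_t;
>>>   typedef int snapshot_t;
>>> 
>>>   static void quiesce(proc_t p)        { (void)p; /* drain in-flight msgs */ }
>>>   static snapshot_t checkpoint(proc_t p)     { return p; /* e.g. via BLCR */ }
>>>   static void terminate_proc(proc_t p) { (void)p; /* stop it on the source */ }
>>>   static proc_t restart(snapshot_t s, node_t dst) { (void)dst; return s; }
>>>   static void rejoin(proc_t p)         { (void)p; /* reconnect with peers */ }
>>> 
>>>   /* Eager migration: stop, checkpoint/terminate, restart, rejoin. */
>>>   static void migrate_eager(proc_t p, node_t dst) {
>>>       quiesce(p);                    /* stop the process completely     */
>>>       snapshot_t s = checkpoint(p);  /* take the checkpoint             */
>>>       terminate_proc(p);             /* terminate it on the source      */
>>>       proc_t q = restart(s, dst);    /* restart on the destination      */
>>>       rejoin(q);                     /* rejoin the other running procs  */
>>>   }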
>>> 
>>> We have wanted to experiment with a 'pre-copy' or 'live' migration
>>> style, but have not had the necessary support from the underlying
>>> checkpointer, or the time to devote to making it happen. I think BLCR is
>>> working on including the necessary pieces in a future release (there
>>> are papers where a development version of BLCR has done this with
>>> LAM/MPI). So that might be something of interest.
>>> 
>>> Process migration techniques can benefit from fault prediction and
>>> 'good' target destination selection. Fault prediction lets us move
>>> processes away from soon-to-fail locations, but it can be difficult to
>>> predict failures accurately. Open MPI has some hooks in the runtime
>>> layer that support 'sensors', which might help here. Good target
>>> destination selection is equally complex; the idea is to move processes
>>> to a machine where they can continue supporting the efficient execution
>>> of the application. This might mean moving to the least loaded machine,
>>> or moving to a machine that already hosts other processes of the job in
>>> order to reduce interprocess communication (something like dynamic load
>>> balancing); see the sketch below.
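>>> 
>>> For instance, a naive destination-selection heuristic along the lines just 
>>> described might combine the two criteria; the weighting below is an 
>>> arbitrary illustrative choice, not a measured policy:
>>> 
>>>   /* Score candidate nodes: prefer lightly loaded nodes, with a bonus for 
>>>    * nodes already hosting peer processes of the same job (cheaper 
>>>    * interprocess communication). Returns the index of the best node. */
>>>   static int pick_destination(const double *load, const int *peers, int n) {
>>>       int best = 0;
>>>       double best_score = -1.0e30;
>>>       for (int i = 0; i < n; i++) {
>>>           double score = -load[i] + 0.5 * peers[i];
>>>           if (score > best_score) {
>>>               best_score = score;
>>>               best = i;
>>>           }
>>>       }
>>>       return best;
>>>   }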
>>> 
>>> So there are some ideas to get you started.
>>> 
>>> -- Josh
>>> 
>>> On Thu, Aug 25, 2011 at 12:06 PM, Rayson Ho <raysonlo...@gmail.com> wrote:
>>>> I don't know which SSI project you are referring to... I only know the
>>>> OpenSSI project, and I was one of the first to subscribe to its
>>>> mailing list (back in 2001).
>>>> 
>>>> http://openssi.org/cgi-bin/view?page=openssi.html
>>>> 
>>>> I don't think those OpenSSI clusters are designed for tens of
>>>> thousands of nodes, and I am not sure they scale well to even a
>>>> thousand nodes -- so IMO they have limited use for HPC clusters.
>>>> 
>>>> Rayson
>>>> 
>>>> 
>>>> 
>>>> On Thu, Aug 25, 2011 at 11:45 AM, Durga Choudhury <dpcho...@gmail.com> 
>>>> wrote:
>>>>> Also, in 2005 there was an attempt to implement SSI (Single System
>>>>> Image) functionality in the then-current 2.6.10 kernel. The proposal
>>>>> was very detailed and covered most of the bases of task creation, PID
>>>>> allocation, etc. across a loosely coupled cluster (without using fancy
>>>>> hardware such as an RDMA fabric). Does anybody know if it was ever
>>>>> implemented? Any pointers in this direction?
>>>>> 
>>>>> Thanks and regards
>>>>> Durga
>>>>> 
>>>>> 
>>>>> On Thu, Aug 25, 2011 at 11:08 AM, Rayson Ho <raysonlo...@gmail.com> wrote:
>>>>>> Srinivas,
>>>>>> 
>>>>>> There's also kernel-level checkpointing vs. user-level checkpointing:
>>>>>> if you can checkpoint an MPI task and restart it on a new node, then
>>>>>> that is also "process migration".
>>>>>> 
>>>>>> Of course, doing a checkpoint & restart can be slower than pure
>>>>>> in-kernel process migration, but the advantage is that you don't need
>>>>>> any kernel support, and can in fact do all of it in user space.
>>>>>> 
>>>>>> Rayson
>>>>>> 
>>>>>> 
>>>>>> On Thu, Aug 25, 2011 at 10:26 AM, Ralph Castain <r...@open-mpi.org> 
>>>>>> wrote:
>>>>>>> It also depends on what part of migration interests you - do you want 
>>>>>>> to look at the MPI part of the problem (reconnecting MPI transports, 
>>>>>>> ensuring messages are not lost, etc.) or the RTE part of the problem 
>>>>>>> (where to restart processes, detecting failures, etc.)?
>>>>>>> 
>>>>>>> 
>>>>>>> On Aug 24, 2011, at 7:04 AM, Jeff Squyres wrote:
>>>>>>> 
>>>>>>>> Be aware that process migration is a pretty complex issue.
>>>>>>>> 
>>>>>>>> Josh is probably the best one to answer your question directly, but 
>>>>>>>> he's out today.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
>>>>>>>> 
>>>>>>>>> I am a final-year grad student looking for a final-year project in 
>>>>>>>>> OpenMPI. We are a group of 4 students.
>>>>>>>>> I wanted to know about the "process migration" of MPI 
>>>>>>>>> processes in OpenMPI.
>>>>>>>>> Can anyone suggest ideas for a project related to process 
>>>>>>>>> migration in OpenMPI, or other topics in systems?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> regards,
>>>>>>>>> Srinivas Kundaram
>>>>>>>>> srinu1...@gmail.com
>>>>>>>>> +91-8149399160
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Jeff Squyres
>>>>>>>> jsquy...@cisco.com
>>>>>>>> For corporate legal information go to:
>>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Rayson
>>>>>> 
>>>>>> ==================================================
>>>>>> Open Grid Scheduler - The Official Open Source Grid Engine
>>>>>> http://gridscheduler.sourceforge.net/
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Rayson
>>>> 
>>>> ==================================================
>>>> Open Grid Scheduler - The Official Open Source Grid Engine
>>>> http://gridscheduler.sourceforge.net/
>>> 
>>> 
>>> 
>>> -- 
>>> Joshua Hursey
>>> Postdoctoral Research Associate
>>> Oak Ridge National Laboratory
>>> http://users.nccs.gov/~jjhursey