[Devel] C/R minisummit notes

Daniel Lezcano Wed, 23 Jul 2008 04:36:16 -0700

  * What problems can the Linux community solve with 
checkpoint/restart ?


        Eric Biederman reminds us that at the previous OLS nobody complained 
about checkpoint/restart

        Pavel Emelyanov : The startup of Oracle takes several minutes. If we 
checkpoint just after the startup, Oracle can be restarted from this 
point later, which provides a fast startup

        Oren Laadan : Time travel, we can take monotonic snapshots and go back 
to one of these snapshots.

        Eric Biederman : Priority running, checkpoint/kill an application and 
run another application with a higher priority

        Denis Lunev : Task migration, move an application from one host to 
another host

        Daniel Lezcano : SSI (task migration)

  * Preparing the kernel internals

        OL : Can we implement a kernel module and move CR functionality into 
the kernel itself later ?

        EB : Better to add a little CR functionality into the kernel itself 
and add more afterwards.

        DLu : Problem with kernel version

        OL : Compatibility between intermediate kernel versions should be 
possible with userspace conversion tools

        DLu : Non sequential file for checkpoint statefile is a challenge

        OL : yes, but possible and useful for compression/encryption
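
        The compression point can be illustrated with a minimal sketch. It
        reads the exchange above as being about a statefile that is written
        and read strictly front to back, which is an interpretation and not
        something the notes spell out. The record layout (type, length,
        payload) and the function names are hypothetical and do not come
        from any existing checkpoint format.

```python
import io
import struct
import zlib

# Hypothetical record layout (an assumption, not a format from the notes):
# 4-byte big-endian type, 4-byte big-endian payload length, payload bytes.
HEADER = struct.Struct(">II")


def write_records(out, records):
    """Write (type, payload) records strictly in order through a compressor."""
    comp = zlib.compressobj()
    for rec_type, payload in records:
        # Each record is appended to the stream; nothing is ever rewritten.
        out.write(comp.compress(HEADER.pack(rec_type, len(payload)) + payload))
    out.write(comp.flush())


def read_records(inp):
    """Yield the records back in order.

    For brevity the whole stream is decompressed in memory first; the
    records themselves are then parsed in a single forward pass.
    """
    data = zlib.decompress(inp.read())
    pos = 0
    while pos < len(data):
        rec_type, length = HEADER.unpack_from(data, pos)
        pos += HEADER.size
        yield rec_type, data[pos:pos + length]
        pos += length
```

        Because the writer never seeks backwards, the same stream could be
        piped through an encryption filter or sent over a socket instead of
        being stored in a regular file.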

        We showed that there are five steps to realize a checkpoint:

        1 - Pre-dump
        2 - Freeze
        3 - Dump
        4 - Resume/kill
        5 - Post-dump
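
        The five steps above can be sketched as a fixed sequence of phases.
        The notes do not say what each step does internally, so the handlers
        below are placeholders supplied by the caller. Stopping at the first
        failing step is an assumption made for this sketch.

```python
from enum import Enum


class Step(Enum):
    PRE_DUMP = 1
    FREEZE = 2
    DUMP = 3
    RESUME_OR_KILL = 4
    POST_DUMP = 5


def run_checkpoint(handlers):
    """Run the five steps in order and stop at the first step that fails.

    `handlers` maps each Step to a callable returning True on success.
    Returns the list of steps that completed.
    """
    done = []
    for step in Step:  # Enum members iterate in definition order
        if not handlers[step]():
            break
        done.append(step)
    return done
```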

        At this point we agreed that we want to create a proof of concept 
that can checkpoint/restart the simplest application.

        We will then iteratively add support for more and more kernel resources.

        Process hierarchy created from kernel or userspace ?

        OL : Seems better to send a chunk of data to the kernel, which then 
restores the process hierarchy
        PE : Agreed
        OL : We should be able to checkpoint from inside the container, keep 
that in mind for later.
        
        => we need a syscall or an ioctl
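
        To make the "chunk of data" idea concrete, here is a minimal sketch
        of how userspace might flatten a process hierarchy into one buffer
        and how the receiving side might rebuild the tree from it. The wire
        format and the function names are hypothetical. The notes do not
        define any format, and a real restorer would run in the kernel.

```python
import struct

# Hypothetical wire format (an assumption for illustration): a 4-byte count
# followed by (pid, parent pid) pairs, each a 4-byte big-endian integer.
COUNT = struct.Struct(">I")
ENTRY = struct.Struct(">II")


def pack_hierarchy(parents):
    """Flatten a {pid: parent_pid} mapping into a single chunk of bytes."""
    chunk = COUNT.pack(len(parents))
    for pid, ppid in sorted(parents.items()):
        chunk += ENTRY.pack(pid, ppid)
    return chunk


def unpack_hierarchy(chunk):
    """Rebuild {parent_pid: [child pids]} from a chunk, as a restorer might."""
    (count,) = COUNT.unpack_from(chunk, 0)
    children = {}
    for i in range(count):
        pid, ppid = ENTRY.unpack_from(chunk, COUNT.size + i * ENTRY.size)
        children.setdefault(ppid, []).append(pid)
    return children
```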

        The first items to address before implementing the checkpoint are:
        1 - Make a container object (the context)
        2 - Freeze the container (extend cgroup freezer ?)
        3 - syscall | ioctl
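
        For item 2, a minimal userspace sketch of freezing a group of tasks
        is given below. It assumes the freezer.state control file of the
        cgroup v1 freezer as found in mainline kernels, which accepts FROZEN
        and THAWED. The interface under discussion at the minisummit may
        have differed, and the helper names are hypothetical.

```python
import os
import time


def set_freezer_state(cgroup_dir, state):
    """Write FROZEN or THAWED to the group's freezer.state control file."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError("state must be FROZEN or THAWED")
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as f:
        f.write(state)


def wait_for_state(cgroup_dir, state, timeout=5.0, interval=0.05):
    """Poll freezer.state until it reports `state`; return True on success."""
    deadline = time.monotonic() + timeout
    path = os.path.join(cgroup_dir, "freezer.state")
    while True:
        with open(path) as f:
            if f.read().strip() == state:
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

        The polling helper is there because freezing is not instantaneous:
        the file can report FREEZING while tasks are still being stopped.
        The value of cgroup_dir depends on where the freezer hierarchy is
        mounted on the system.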

        First step:
                * simplest application : a single process, without any files, no 
checkpoint of the text file (same file system for restart), no signals, no 
syscalls in the application, no ipc/no msgq, no network

        Second step:
                * multiple processes + zombie state

        Third step:
                * files, pipe, signals, socketpair ?

        This proof of concept must come with documentation describing what is 
supported, what is not supported and what we plan to do.
