>> I thought that after being on a track with virtio drivers, I will go with
>> doing some work on checkpointing the vkernel. After that we can move
>> forward with vkernel migration using the opportunistic replay approach, as
>> we have the checkpoints generated. But all this depends on your guidance,
>> I can go the way you guys suggest.
>
> Actually, I have asked Matt about vkernel checkpointing recently as I
> have huge interest in being able to suspend to disk or migrate vkernels
> to other machines. I thought that one had to modify the kernel to
> support checkpointing of multiple vmspaces (which the vkernel makes use
> of), and which is currently not supported (actually doesn't sound too
> complicated), but Matt said that it should be possible to do without any
> modification of the real kernel. One would have to teach the vkernel to
> catch the SIGCKPT and to collect and save the vpagetable information for
> the vkernel-processes together with configuration information like
> network device and console, and to restore it upon SIGTHAW.
>
> I might take a stab at it after my exam on Monday.
>
> Being able to checkpoint vkernels would enable us to migrate a running
> vkernel within a few seconds to another box, e.g. for the purpose of
> CPU/network load-balancing, or to save energy by employing VaryOn/Off
> strategies, i.e. shutting down servers with too little load. The whole
> consolidation/virtualization buzz would find its way into
> DragonFly :).
>
> Another great application of vkernels could be for laptops. Imagine your
> laptop runs on DragonFly, and your host kernel is running an X11 display
> server, while all your X11 clients run inside the vkernel. This enables
> you to checkpoint the vkernel, shut down the laptop, and to later
> restore where you left off (assuming X11 doesn't keep state on the
> X11-server).
>
> I also think that vkernel checkpointing is a great feature for virtual
> hosting. Especially combined with swapcache, one could run a huge number
> of vkernels on a single machine, giving each enough main memory, backed
> by an SSD. Whether it performs well is another question :)
>
> Regards,
>
> Michael

There is another component to this: we currently can't checkpoint any
threaded program, because we do not save and restore the segment
registers, which is how TLS is implemented. I started on this (dirty):

http://gitweb.dragonflybsd.org/~sjg/dragonfly.git/commitdiff/e29b6f2d0c961889b55e94ce7ac6aa0e398d7071

but it's really a bit beyond me.

Sam
