On 2013-3-21 23:08, Eric Blake wrote:
> On 03/21/2013 08:56 AM, Stefan Hajnoczi wrote:
>> On Thu, Mar 21, 2013 at 02:42:23PM +0100, Paolo Bonzini wrote:
>>> On 21/03/2013 14:38, Stefan Hajnoczi wrote:
>>>> There already is a guest RAM cloning mechanism: fork the QEMU process.
>>>> Then you have a copy-on-write guest RAM.
>>>>
>>>> In a little more detail:
>>>>
>>>> 1. save non-RAM device state
>>>> 2. quiesce QEMU to a state that is safe for forking
>>>> 3. create an EventNotifier for live savevm completion signal
>>>> 4. fork and pass completion EventNotifier to child
>>>> 5. parent continues running VM
>>>> 6. child performs vmsave of copy-on-write guest RAM
>>>> 7. child signals completion EventNotifier and terminates
>>>> 8. parent raises live savevm completion QMP event
>>>
>>> Forking a threaded program is not so easy, but it could be done if the
>>> child is very simple and only uses syscalls to communicate back with the
>>> parent:
>>
>> On Linux you should be able to use clone(2) to spawn a thread with
>> copy-on-write memory.  Too bad it's not portable because it gets around
>> the messy fork issues.
> 
> And introduces its own messy issues - once you clone() using different
> flags than what fork() does, you have invalidated the use of a LOT of
> libc interfaces in that child; in particular, any use of pthread is
> liable to break.
> 
  I think the core of the fork() idea is snapshotting RAM pages in RAM, just
like LVM2's block snapshots. Very cool idea :).
  The problem is the implementation. An API like the following is needed:
void *mem_snapshot(void *addr, uint64_t len);
  After a brief search I haven't found one on Linux, and I am not sure
whether it is available in the upstream Linux kernel or C library. Making
such an API available and then using it in qemu would be much nicer.
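
  To make the copy-on-write point concrete, here is a minimal sketch of
what fork() already gives us today. It is not qemu code: the function
name fork_snapshot_is_isolated() and the malloc'ed buffer standing in for
guest RAM are made up for illustration. The parent dirties the buffer
after fork(), and the child checks that it still sees the old contents.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if the child still saw the pre-fork contents after the parent
 * overwrote the buffer, 0 if it saw the parent's writes, -1 on error.
 */
int fork_snapshot_is_isolated(size_t len)
{
    unsigned char *ram = malloc(len);   /* stand-in for guest RAM */
    int go[2];                          /* parent -> child: "I have written" */

    if (!ram || len == 0 || pipe(go) < 0) {
        free(ram);
        return -1;
    }
    memset(ram, 'A', len);              /* state at snapshot time */

    pid_t pid = fork();
    if (pid < 0) {
        close(go[0]);
        close(go[1]);
        free(ram);
        return -1;
    }

    if (pid == 0) {
        /* Child: wait until the parent has dirtied its pages, then check. */
        char c;
        close(go[1]);
        if (read(go[0], &c, 1) != 1)
            _exit(2);
        for (size_t i = 0; i < len; i++)
            if (ram[i] != 'A')
                _exit(1);               /* parent's write leaked through */
        _exit(0);
    }

    /* Parent: "keeps running the guest", i.e. modifies the same pages. */
    close(go[0]);
    memset(ram, 'B', len);
    int told = (write(go[1], "x", 1) == 1);
    close(go[1]);

    int status;
    pid_t w = waitpid(pid, &status, 0);
    free(ram);
    if (w != pid || !told || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 0)
        return 1;
    return WEXITSTATUS(status) == 1 ? 0 : -1;
}
```

  Calling fork_snapshot_is_isolated(1 << 20) should return 1. The snapshot
lives in a separate process, though, which is exactly why an in-process
mem_snapshot() would be nicer.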
  It is very challenging to use the fork()/clone() approach in qemu. I guess
there will be a lot of scattered code preparing for fork(), some
resource handling code after fork(), code to query progress, exception
handling, and a child/parent communication mechanism... it seems complex.
But I am looking forward to seeing how well it works.
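
  As a rough idea of how much child/parent plumbing is involved, here is
a small sketch, again not qemu code. It assumes Linux and uses a plain
eventfd as a stand-in for the completion EventNotifier; the names
save_ram_in_child() and write_all() are hypothetical. The child uses only
plain syscalls, and the parent polls with a timeout so that it also
notices a child that dies without signalling.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes. */
static int write_all(int fd, const unsigned char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: fork, let the child dump its copy-on-write view of
 * `ram` to `path`, and wait for the completion signal.
 * Returns 0 on success, -1 on any failure.
 */
int save_ram_in_child(const void *ram, size_t len, const char *path)
{
    int efd = eventfd(0, 0);            /* completion signal */
    if (efd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(efd);
        return -1;
    }

    if (pid == 0) {
        /* Child: syscalls only, no libc state shared with the parent. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        int rc = fd < 0 ? -1 : write_all(fd, ram, len);
        if (fd >= 0 && close(fd) < 0)
            rc = -1;
        uint64_t done = 1;
        if (write(efd, &done, sizeof(done)) != sizeof(done))
            rc = -1;
        _exit(rc == 0 ? 0 : 1);
    }

    /*
     * Parent: a real main loop would keep running the VM here. Each
     * timeout is a point where progress could be queried; WNOHANG catches
     * a child that exited without ever signalling the eventfd.
     */
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    int status = 0;
    pid_t w = 0;
    while (w == 0) {
        int ready = poll(&pfd, 1, 100);
        w = waitpid(pid, &status, ready > 0 ? 0 : WNOHANG);
    }
    close(efd);

    if (w != pid || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

  Even this toy version needs separate error paths for the fork, the
write, the signal and the exit status, and it ignores threads, open file
descriptors and device state entirely.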
  Compared with that, migration to an image uses less memory at the cost
of more I/O, but it is much easier to implement and more portable. Maybe
it can be used as a simple improvement for "migrate to fd" until
an underlying memory snapshot API is available.
-- 
Best Regards

Wenchao Xia