On Mon, Oct 03, 2011 at 08:51:10AM -0500, Anthony Liguori wrote: > On 10/03/2011 08:38 AM, Michael S. Tsirkin wrote: > >On Mon, Oct 03, 2011 at 07:55:48AM -0500, Anthony Liguori wrote: > >>On 10/02/2011 04:08 PM, Michael S. Tsirkin wrote: > >>>On Sun, Oct 02, 2011 at 04:21:47PM -0400, Stefan Berger wrote: > >>>> > >>>>>4) Implement the BERVisitor and make this the default migration protocol. > >>>>> > >>>>>Most of the work will be in 1), though with the implementation in this > >>>>>series we should be able to do it incrementally. I'm not sure if the > >>>>>best approach is doing the mechanical phase 1 conversion, then doing > >>>>>phase 2 sometime after 4), doing phase 1 + 2 as part of 1), or just > >>>>>doing VMState conversions which gives basically the same capabilities as > >>>>>phase 1 + 2. > >>>>> > >>>>>Thoughts? > >>>>Is anyone working on this? If not I may give it a shot (tomorrow++) > >>>>for at least some of the primitives... for enabling vNVRAM metadata > >>>>of course. Indefinite length encoding of constructed data types I > >>>>suppose won't be used otherwise the visitor interface seems wrong > >>>>for parsing and skipping of extra data towards the end of a > >>>>structure if version n wrote the stream and appended some of its > >>>>version n data and now version m< n is trying to read the struct > >>>>and needs to skip the version [m+1, n ] data fields ... in that case > >>>>the de-serialization of the stream should probably be stream-driven > >>>>rather than structure-driven. > >>>> > >>>> Stefan > >>> > >>>Yes I've been struggling with that exactly. > >>>Anthony, any thoughts? > >> > >>It just depends on how you write your visitor. If you used > >>sequences, you'd probably do something like this: > >> > >>start_struct -> > >> check for sequence tag, push starting offset and size onto stack > >> increment offset to next tag > >> > >>type_int (et al) -> > >> check for explicit type, parse data > >> increment offset to next tag > >> > >>end_struct -> > >> pop starting offset and size to temp variables > >> set offset to starting offset + size > >> > >>This is roughly how the QMP input marshaller works FWIW. > >> > >>Regards, > >> > >>Anthony Liguori > > > >One thing I worry about is enabling zero copy for > >large string types (e.g. memory migration). > > Memory shouldn't be done through Visitors. It should be handled as a special > case.
OK, that's fine then. > >So we need to be able to see a tag for memory page + address, > >read that from socket directly at the correct virtual address. > > > >Probably, we can avoid using visitors for memory, and hope > >everything else can stand an extra copy since it's small. > > > >But then, why do we worry about the size of > >encoded device state as Anthony seems to do? > > There's a significant difference between the cost of something on > the wire and the cost of doing a memcpy. The cost of the data on > the wire is directly proportional to downtime. So if we increase > the size of the device state by a factor of 10, we increase the > minimum downtime by a factor of 10. > > Of course, *if* the size of device state is already negligible with > respect to the minimum downtime, then it doesn't matter. This is > easy to quantify though. For a normal migration session today, > what's the total size of the device state in relation to the > calculated bandwidth of the minimum downtime? > > If it's very small, then we can add names and not worry about it. > > Regards, > > Anthony Liguori Yes, it's easy to quantify. I think the following gives us the offset before and after, so the difference is the size we seek, right? diff --git a/savevm.c b/savevm.c index 1feaa70..dbbbcc6 100644 --- a/savevm.c +++ b/savevm.c @@ -1543,6 +1543,7 @@ int qemu_savevm_state_iterate(Monitor *mon, QEMUFile *f) int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f) { SaveStateEntry *se; + unsigned long long vm_state_size; cpu_synchronize_all_states(); @@ -1557,6 +1558,8 @@ int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f) se->save_live_state(mon, f, QEMU_VM_SECTION_END, se->opaque); } + vm_state_size = qemu_ftell(f); + fprintf(stderr, "start size: %lld\n", vm_state_size); QTAILQ_FOREACH(se, &savevm_handlers, entry) { int len; @@ -1577,6 +1580,8 @@ int qemu_savevm_state_complete(Monitor *mon, QEMUFile *f) vmstate_save(f, se); } + vm_state_size = qemu_ftell(f); + fprintf(stderr, "end size: %lld\n", vm_state_size); qemu_put_byte(f, QEMU_VM_EOF); -- MST