* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Thu, May 12, 2022 at 05:58:46PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berra...@redhat.com) wrote:
> > > On Wed, May 11, 2022 at 07:31:45PM +0200, Claudio Fontana wrote:
> > > > That's great, I love when things are simple.
> > > >
> > > > If indeed we want to remove the copy in libvirt (which will also mean
> > > > explicitly fsyncing elsewhere, as the iohelper would not be there
> > > > anymore to do that for us on image creation),
> > > > with QEMU having a "file" protocol support for migration,
> > > >
> > > > do we plan to have libvirt and QEMU both open the file for writing
> > > > concurrently, with QEMU opening O_DIRECT?
> > >
> > > For non-libvirt users, I expect QEMU would open the
> > > file directly. For libvirt usage, it is likely
> > > preferable to pass the pre-opened FD, because that
> > > simplifies file permission handling.
> > >
> > > > The alternative being having libvirt open the file with
> > > > O_DIRECT, write some libvirt stuff in a new, O_DIRECT-friendly
> > > > format, and then pass the fd to qemu to migrate to,
> > > > and QEMU sending its new O_DIRECT friendly stream there.
> > >
> > > Yep.
> > >
> > > > In any case, the expectation here is to have a new
> > > > "file://pathname" or "file:://fdname" as an added feature in QEMU,
> > > > where QEMU would write a new O_DIRECT friendly stream
> > > > directly into the file, taking care of both optional
> > > > parallelization and compression.
> > >
> > > I could see several distinct building blocks
> > >
> > > * First a "file:/some/path" migration protocol
> > >   that can just do "normal" I/O, but still writing
> > >   in the traditional migration data stream
> > >
> > > * Modify existing 'fd:' protocol so that it fstat()s
> > >   and passes over to the 'file' protocol handler if
> > >   it sees the FD is not a socket/pipe
> >
> > We used to have that at one point.
> >
> > > * Add a migration capability "direct-mapped" to
> > >   indicate we want the RAM data written/read directly
> > >   to/from fixed positions in the file, as opposed to
> > >   a stream. Obviously only valid with a sub-set
> > >   of migration protocols (file, and fd: if a seekable
> > >   FD).
> >
> > This worries me about how you're going to cleanly glue this into the
> > migration code; it sounds like what you want it to do is very different
> > to what it currently does.
>
> I've only investigated it lightly, but I see the key bit of code
> is this method which emits the header + ram page content:
>
>
> static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
>                             uint8_t *buf, bool async)
> {
>     ram_transferred_add(save_page_header(rs, rs->f, block,
>                                          offset | RAM_SAVE_FLAG_PAGE));
>     if (async) {
>         qemu_put_buffer_async(rs->f, buf, TARGET_PAGE_SIZE,
>                               migrate_release_ram() &&
>                               migration_in_postcopy());
>     } else {
>         qemu_put_buffer(rs->f, buf, TARGET_PAGE_SIZE);
>     }
>     ram_transferred_add(TARGET_PAGE_SIZE);
>     ram_counters.normal++;
>     return 1;
> }
>
>
> my (perhaps wishful) thinking was that we just have an alternative
> impl of this which doesn't save the page header, and puts the
> page content at a fixed offset.

Hmm OK, probably can; note I think the multifd is separate code (and
currently much cleaner - which you'd make more complex again).

> I'm fuzzy on how we figure out the right offset - I was hoping
> that "RAMState" or "RAMBlock" somehow gives us enough info to figure
> out a deterministic mapping to a file location.

I think that's probably the ram_addr_t type, RAMBlock->offset + the index
into the ramblock; that gets you the same thing as the dirty bitmap
(hmm although we don't have a single one of those any more).

Dave

> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
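
For illustration only, here is a minimal sketch of the fixed-offset idea
discussed in this thread. It is not QEMU code. It assumes the file offset
of a page is a fixed start of the RAM region in the file plus the page's
ram_addr_t (block offset + offset within the block). SketchRAMBlock,
SKETCH_PAGE_SIZE, the ram_region_start parameter and all sketch_*
functions are hypothetical stand-ins, and a real "direct-mapped" format
might lay the file out differently.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for pwrite(), pread(), fileno() */
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: stands in for QEMU's TARGET_PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for the one RAMBlock field used here. */
typedef struct SketchRAMBlock {
    uint64_t offset;    /* start of the block in ram_addr_t space */
} SketchRAMBlock;

/*
 * Deterministic file position of a page: the (assumed) start of the RAM
 * region in the file, plus block->offset + offset within the block.
 */
static off_t sketch_page_file_offset(off_t ram_region_start,
                                     const SketchRAMBlock *block,
                                     uint64_t offset_in_block)
{
    assert(offset_in_block % SKETCH_PAGE_SIZE == 0);  /* page aligned */
    return ram_region_start + (off_t)(block->offset + offset_in_block);
}

/*
 * Header-less counterpart to save_normal_page(): write the page content
 * at its fixed position. Returns 1 on success, -1 on error.
 */
static int sketch_save_page_direct(int fd, off_t ram_region_start,
                                   const SketchRAMBlock *block,
                                   uint64_t offset_in_block,
                                   const uint8_t *buf)
{
    off_t pos = sketch_page_file_offset(ram_region_start, block,
                                        offset_in_block);
    size_t done = 0;

    while (done < SKETCH_PAGE_SIZE) {
        ssize_t n = pwrite(fd, buf + done, SKETCH_PAGE_SIZE - done,
                           pos + (off_t)done);
        if (n < 0 && errno == EINTR) {
            continue;               /* interrupted, retry */
        }
        if (n <= 0) {
            return -1;              /* write error or no progress */
        }
        done += (size_t)n;
    }
    return 1;
}

/* Usage: write one page to a temporary file and read it back. */
static int sketch_roundtrip_ok(void)
{
    SketchRAMBlock block = { .offset = 8192 };
    uint8_t page[SKETCH_PAGE_SIZE], back[SKETCH_PAGE_SIZE];
    FILE *tf = tmpfile();
    int ok;

    if (!tf) {
        return 0;
    }
    memset(page, 0xAB, sizeof(page));
    /* 4096 (region start) + 8192 (block) + 4096 (in block) = 16384 */
    ok = sketch_save_page_direct(fileno(tf), 4096, &block, 4096, page) == 1
         && pread(fileno(tf), back, sizeof(back), 16384)
            == (ssize_t)sizeof(back)
         && memcmp(page, back, sizeof(back)) == 0;
    fclose(tf);
    return ok;
}
```

Because each page has exactly one position in the file, a page that is
dirtied and sent again simply overwrites its earlier copy, which is the
property that distinguishes this from the stream format. The sketch uses
plain buffered file I/O; O_DIRECT would add alignment requirements on the
buffer and offsets that are not handled here.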