On Mon, Jun 10, 2024 at 03:02:10PM -0400, Peter Xu wrote:
> On Mon, Jun 10, 2024 at 02:45:53PM -0300, Fabiano Rosas wrote:
> > >> AIUI, the issue here that users are already allowed to specify in
> > >> libvirt the equivalent to direct-io and multifd independent of each
> > >> other (bypass-cache, parallel). To start requiring both together now in
> > >> some situations would be a regression. I confess I don't know libvirt
> > >> code to know whether this can be worked around somehow, but as I said,
> > >> it's a relatively simple change from the QEMU side.
> > >
> > > Firstly, I definitely want to already avoid all the calls to either
> > > migration_direct_io_start() or *_finish(), now we already need to
> > > explicitly call them in three paths, and that's not intuitive and less
> > > readable, just like the hard coded rdma codes.
> >
> > Right, but that's just a side-effect of how the code is structured and
> > the fact that writes to the stream happen in small chunks. Setting
> > O_DIRECT needs to happen around aligned IO. We could move the calls
> > further down into qemu_put_buffer_at(), but that would be four fcntl()
> > calls for every page.
>
> Hmm.. why we need four fcntl()s instead of two?
>
> >
> > A tangent:
> > one thing that occurred to me now is that we may be able to restrict
> > calls to qemu_fflush() to internal code like add_to_iovec() and maybe
> > use that function to gather the correct amount of data before writing,
> > making sure it disables O_DIRECT in case alignment is about to be
> > broken?
>
> IIUC dio doesn't require alignment if we don't care about perf? I meant it
> should be legal to write(fd, buffer, 5) even if O_DIRECT?

No, we must assume that O_DIRECT requires alignment of both the userspace
memory buffers and the file offset on disk:

[quote man(open)]
  O_DIRECT
      The O_DIRECT flag may impose alignment restrictions on the length
      and address of user-space buffers and the file offset of I/Os. In
      Linux alignment restrictions vary by filesystem and kernel version
      and might be absent entirely. The handling of misaligned O_DIRECT
      I/Os also varies; they can either fail with EINVAL or fall back
      to buffered I/O.
[/quote]

Given QEMU's code base, it is only safe for us to use O_DIRECT with RAM
blocks where we have predictable in-memory alignment, and have defined a
good on-disk offset alignment too.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
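
To make the three alignment conditions concrete, here is a minimal sketch of a guard that refuses a misaligned request up front, rather than depending on whether the kernel fails with EINVAL or silently falls back to buffered I/O. Nothing here is QEMU code: dio_is_aligned(), dio_pwrite() and the 4096-byte DIO_ALIGN are hypothetical names and values chosen for illustration, and the fd is assumed to have been opened with O_DIRECT by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment, for illustration only. The real requirement
 * varies by filesystem and kernel version. */
#define DIO_ALIGN 4096

/* Hypothetical helper: true only if the buffer address, the length and
 * the file offset are all multiples of align (a power of two). */
static bool dio_is_aligned(const void *buf, size_t len, off_t off,
                           size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)buf & (align - 1)) == 0 &&
           (len & (align - 1)) == 0 &&
           ((uint64_t)off & (align - 1)) == 0;
}

/* Hypothetical helper: positional write on an fd the caller opened with
 * O_DIRECT. A misaligned request is rejected with EINVAL before it
 * reaches the kernel. Returns bytes written, or -1 with errno set. */
static ssize_t dio_pwrite(int fd, const void *buf, size_t len, off_t off)
{
    if (!dio_is_aligned(buf, len, off, DIO_ALIGN)) {
        errno = EINVAL;
        return -1;
    }
    return pwrite(fd, buf, len, off);
}
```

With a guard like this, the write(fd, buffer, 5) case from the question is rejected because of its length, however the buffer was allocated. A buffer obtained from posix_memalign(&buf, DIO_ALIGN, DIO_ALIGN) and written in full at an offset that is a multiple of DIO_ALIGN passes all three checks.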