On Wed, Oct 25, 2023 at 02:30:01PM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé <berra...@redhat.com> writes:
> 
> > On Wed, Oct 25, 2023 at 11:32:00AM -0300, Fabiano Rosas wrote:
> >> Daniel P. Berrangé <berra...@redhat.com> writes:
> >> 
> >> > On Tue, Oct 24, 2023 at 04:32:10PM -0300, Fabiano Rosas wrote:
> >> >> Markus Armbruster <arm...@redhat.com> writes:
> >> >> 
> >> >> > Fabiano Rosas <faro...@suse.de> writes:
> >> >> >
> >> >> >> Add the direct-io migration parameter that tells the migration
> >> >> >> code to use O_DIRECT when opening the migration stream file
> >> >> >> whenever possible.
> >> >> >>
> >> >> >> This is currently only used for the secondary channels of fixed-ram
> >> >> >> migration, which can guarantee that writes are page aligned.
> >> >> >>
> >> >> >> However the parameter could be made to affect other types of
> >> >> >> file-based migrations in the future.
> >> >> >>
> >> >> >> Signed-off-by: Fabiano Rosas <faro...@suse.de>
> >> >> >
> >> >> > When would you want to enable @direct-io, and when would you want to
> >> >> > leave it disabled?
> >> >> 
> >> >> That depends on a performance analysis. You'd generally leave it
> >> >> disabled unless there's some indication that the operating system is
> >> >> having trouble draining the page cache.
> >> >
> >> > That's not the usage model I would suggest.
> >> >
> >> 
> >> Hehe I took a shot at answering but I really wanted to say "ask Daniel".
> >> 
> >> > The biggest value of the page cache comes when it holds data that
> >> > will be repeatedly accessed.
> >> >
> >> > When you are saving/restoring a guest to file, that data is used
> >> > once only (assuming there's a large gap between save & restore).
> >> > By using the page cache to save a big guest we essentially purge
> >> > the page cache of most of its existing data that is likely to be
> >> > reaccessed, to fill it up with data never to be reaccessed.
> >> >
> >> > I usually describe save/restore operations as trashing the page
> >> > cache.
> >> >
> >> > IMHO, mgmt apps should request O_DIRECT always unless they expect
> >> > the save/restore operation to run in quick succession, or if they
> >> > know that the host has oodles of free RAM such that existing data
> >> > in the page cache won't be trashed, or
> >> 
> >> Thanks, I'll try to incorporate this to some kind of doc in the next
> >> version.
> >> 
> >> > if the host FS does not support O_DIRECT of course.
> >> 
> >> Should we try to probe for this and inform the user?
> >
> > qemu_open_internal will already produce a nice error message. If it
> > gets EINVAL when using O_DIRECT, it'll retry without O_DIRECT, and if
> > that works, it'll report "filesystem does not support O_DIRECT".
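> >
> > Roughly, that fallback logic looks like this (a sketch, not the
> > exact code):
> >
> >     fd = open(name, flags, mode);
> >     if (fd == -1 && errno == EINVAL && (flags & O_DIRECT)) {
> >         fd = open(name, flags & ~O_DIRECT, mode);
> >         if (fd != -1) {
> >             /* The plain open works, so O_DIRECT was the problem */
> >             close(fd);
> >             error_setg(errp, "filesystem does not support O_DIRECT");
> >             errno = EINVAL;
> >             fd = -1;
> >         }
> >     }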
> >
> > Having said that, I see a problem with /dev/fdset handling, because
> > we're only validating O_ACCMODE and that excludes O_DIRECT.
> >
> > If the mgmt app passes an FD with O_DIRECT already set, then it
> > won't work for VMstate saving, which is unaligned.
> >
> > If the mgmt app passes an FD without O_DIRECT set, then we are
> > not setting O_DIRECT for the multifd RAM threads.
> 
> Worse, the fds get dup'ed, so even without O_DIRECT set initially, if
> we enable it for the secondary channels the main channel will break on
> unaligned writes.
> 
> For now I can only think of requiring two fds: one for the main channel
> and a second one for the rest of the channels, and validating the fd
> flags to make sure O_DIRECT is only allowed to be set on the second fd.
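> 
> Something like this, perhaps (a sketch with a hypothetical name for
> telling the two fds apart):
> 
>     /* is_main_channel_fd is a hypothetical stand-in for however we
>      * end up distinguishing the two fds in the fdset */
>     int flags = fcntl(fd, F_GETFL);
>     if (is_main_channel_fd && (flags & O_DIRECT)) {
>         error_setg(errp, "O_DIRECT is only allowed on the "
>                    "secondary migration fd");
>         return -1;
>     }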

In this new model I think there's no reason for libvirt to set O_DIRECT
for its own initial I/O, so we could just totally ignore O_DIRECT when
initially opening the QIOChannelFile.

Then provide a method on QIOChannelFile to enable O_DIRECT on the fly,
which can be called for the multifd threads setup?
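
Something like this, perhaps (hypothetical method name; on Linux,
O_DIRECT can be toggled after open with fcntl(F_SETFL)):

    /* qio_channel_file_set_direct_io is a hypothetical name */
    int qio_channel_file_set_direct_io(QIOChannelFile *ioc, bool enable,
                                       Error **errp)
    {
        int flags = fcntl(ioc->fd, F_GETFL);
        if (flags == -1) {
            error_setg_errno(errp, errno,
                             "Unable to read file handle flags");
            return -1;
        }

        if (enable) {
            flags |= O_DIRECT;
        } else {
            flags &= ~O_DIRECT;
        }

        if (fcntl(ioc->fd, F_SETFL, flags) == -1) {
            error_setg_errno(errp, errno,
                             "Unable to set O_DIRECT on file handle");
            return -1;
        }
        return 0;
    }

The multifd setup code would call it just before the RAM threads start
writing, leaving the main channel's unaligned VMstate I/O untouched.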

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

