On 02/26/2014 08:41 AM, Stefan Hajnoczi wrote:
> On Wed, Feb 26, 2014 at 11:14:04AM +0100, Peter Lieven wrote:
>> I was wondering if it would be a good idea to set the O_DIRECT mode for the
>> source files of a qemu-img convert process if the source is a host_device?
>>
>> Currently the backup of a host device is polluting the page cache.
> 
> Points to consider:
> 
> 1. O_DIRECT does not work on Linux tmpfs, you get EINVAL when opening
>    the file.  A fallback is necessary.
> 
> 2. O_DIRECT has no readahead so performance could actually decrease.
>    The question is, how important is readahead versus polluting page
>    cache?
> 
> 3. For raw files it would make sense to tell the kernel that access is
>    sequential and data will be used only once.  Then we can get the best
>    of both worlds (avoid polluting page cache but still get readahead).
>    This is done using posix_fadvise(2).

Except that posix_fadvise is advisory only (the kernel is free to ignore
it), and currently not stateful enough inside the kernel to be useful
when handing fds between processes.  For several years now, I've asked
if the kernel could provide better guarantees about what posix_fadvise
can actually do, and expose user-space introspection of those guarantees
through procfs and/or fpathconf.
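
To make the limitation concrete, here is a minimal sketch of the
posix_fadvise approach from point 3, assuming a plain sequential read of
an already-open fd. The helper name read_file_once is hypothetical, and
dropping each chunk with POSIX_FADV_DONTNEED after reading it is my own
choice for approximating "used only once", not something from qemu or
libvirt. Note that the return value only reports bad arguments; it says
nothing about whether the kernel acted on the hint.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read fd from its current offset to EOF, hinting
 * that access is sequential and that pages already consumed can be
 * dropped.  Assumes fd starts at offset 0.  Returns the number of bytes
 * read, or -1 if read() fails (EINTR is not retried in this sketch). */
long long read_file_once(int fd)
{
    char buf[65536];
    off_t done = 0;
    ssize_t n;

    /* Advisory only: a zero return does not mean the hint was honored. */
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        /* A real consumer would process buf here. */

        /* Ask the kernel to drop the range just read; again only a hint. */
        (void)posix_fadvise(fd, done, (off_t)n, POSIX_FADV_DONTNEED);
        done += n;
    }
    return n < 0 ? -1 : (long long)done;
}
```

Nothing in this code can tell whether the cache was actually spared,
which is exactly the introspection gap described above.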

See https://bugzilla.redhat.com/show_bug.cgi?id=634653 for some
backstory on libvirt's dealings with O_DIRECT. I'd really like to ditch
libvirt's use of O_DIRECT in favor of posix_fadvise for avoiding page
cache pollution, but the kernel isn't at a point yet that lets libvirt
do that.  I suppose that if the kernel ever does improve posix_fadvise,
then both libvirt and qemu would benefit from it.
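
For reference, the fallback from point 1 could look roughly like the
sketch below. This is not libvirt or qemu code; open_source_uncached and
its used_direct out-parameter are hypothetical names, and falling back to
a buffered open with a sequential hint is just one possible policy.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc exposes O_DIRECT only with this */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0           /* not available here: buffered I/O only */
#endif

/* Hypothetical helper: try O_DIRECT first and fall back to a buffered
 * open with a sequential-access hint when the open fails with EINVAL
 * (as reported for tmpfs above).  Sets *used_direct to 1 only if the
 * returned fd was opened with O_DIRECT.  Returns the fd, or -1. */
int open_source_uncached(const char *path, int *used_direct)
{
    int fd;

    *used_direct = 0;
    if (O_DIRECT != 0) {
        fd = open(path, O_RDONLY | O_DIRECT);
        if (fd >= 0) {
            *used_direct = 1;
            return fd;
        }
        if (errno != EINVAL) {
            return -1;       /* a real error, not "O_DIRECT unsupported" */
        }
    }

    fd = open(path, O_RDONLY);
    if (fd >= 0) {
        /* Advisory only; the kernel may ignore it. */
        (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    }
    return fd;
}
```

A caller that gets used_direct == 1 still has to issue reads with
suitably aligned buffers, offsets and lengths, which this sketch leaves
out.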

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
