Anthony Liguori wrote:
> Daniel P. Berrange wrote:
>> On Wed, Oct 08, 2008 at 11:06:27AM -0500, Anthony Liguori wrote:
>>   Sorry, it was mistakenly private - fixed now.
>> Xen does use O_DIRECT for the paravirt driver case - blktap is using the
>> combo of AIO+O_DIRECT.
> 
> You have to use O_DIRECT with linux-aio.  And blktap is well known to
> have terrible performance.  Most serious users use blkback/blkfront and
> blkback does not avoid the host page cache.  It maintains data integrity
> by passing through barriers from the guest to the host.  You can
> approximate this in userspace by using fdatasync.
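
[For anyone following along, the AIO+O_DIRECT combination being discussed
looks roughly like the sketch below.  This is a minimal illustration using
the libaio interface, not blktap's actual code; the file name, the sizes
and the missing error handling are mine.  Build with -laio.]

/* Minimal linux-aio + O_DIRECT sketch (illustrative only). */
#define _GNU_SOURCE          /* for O_DIRECT */
#include <libaio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;

    /* O_DIRECT bypasses the host page cache entirely. */
    int fd = open("/tmp/disk.img", O_RDWR | O_DIRECT);

    /* O_DIRECT requires sector-aligned buffers, offsets and lengths. */
    posix_memalign(&buf, 512, 4096);
    memset(buf, 0, 4096);

    io_setup(64, &ctx);                     /* create an AIO context    */
    io_prep_pwrite(&cb, fd, buf, 4096, 0);  /* async write at offset 0  */
    io_submit(ctx, 1, cbs);                 /* queue the request        */
    io_getevents(ctx, 1, 1, &ev, NULL);     /* wait for completion      */

    io_destroy(ctx);
    close(fd);
    free(buf);
    return 0;
}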

Anthony's description of blkback is not accurate (at least for HVM guests using 
PV drivers on Xen 3.2).  blkback does indeed bypass the host page cache 
completely.  Its I/O behavior is akin to O_DIRECT: I/O is DMA'd directly 
to/from guest pages without involving any dom0 buffering.  blkback barrier 
support only enforces write ordering within the blkback I/O stream(s); it does 
nothing to synchronize data in the host page cache.  Data written through 
blkback modifies the storage "underneath" any data in the host page cache 
(without flushing that cache), so subsequent accesses through the page cache by 
qemu-dm will see stale data.  In our own Xen product we must explicitly flush 
the host page cache for the backing store at qemu-dm start-up to guarantee 
correct data access.  It is not safe to access the same backing object with 
both qemu-dm and blkback simultaneously.
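
[For illustration, the kind of start-up flush I'm describing looks roughly
like this.  It is a sketch, not our product's actual code; the helper name,
the parameter and the absent error handling are mine.]

/* Drop any cached pages for the backing object before qemu-dm uses it. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <linux/fs.h>     /* BLKFLSBUF */
#include <unistd.h>

static int flush_backing_cache(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    fstat(fd, &st);

    if (S_ISBLK(st.st_mode)) {
        /* Block device: flush and invalidate the buffer cache. */
        ioctl(fd, BLKFLSBUF, 0);
    } else {
        /* Regular file: write back dirty pages, then ask the kernel to
         * drop the clean ones so later reads go back to the disk. */
        fdatasync(fd);
        posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    }

    close(fd);
    return 0;
}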

> The issue the bug addresses, iozone performs better than native, can be
> addressed in the following way:
> 
> 1) For IDE, you have to disable write-caching in the guest.  This should
> force an fdatasync in the host.
> 2) For virtio-blk, we need to implement barrier support.  This is what
> blkfront/blkback do.

I don't think this is enough.  Barrier semantics are local to a particular I/O 
stream.  There would be no reason for the barrier to affect the host page cache 
(unless the I/Os are buffered by the cache).
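
[To make that concrete: a hypothetical barrier handler for an emulated disk
only has something to flush when its data path is buffered through the host
page cache.  This is illustrative code, not QEMU's or blkback's actual
implementation; the function name and the flag are mine.]

#include <unistd.h>

/* fdatasync() flushes the writes that went through this fd's page cache.
 * If the data path is O_DIRECT there is nothing buffered to flush, and it
 * does nothing for pages another process may have cached for the same
 * backing object. */
static int handle_guest_barrier(int backing_fd, int opened_with_o_direct)
{
    if (opened_with_o_direct) {
        /* Writes already bypassed the host page cache; the barrier only
         * needs the submission path to wait for outstanding I/O. */
        return 0;
    }

    /* Buffered path: force dirty pages for this file to stable storage. */
    return fdatasync(backing_fd);
}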

> 3) For SCSI, we should support ordered queuing which would result in an
> fdatasync when barriers are injected.
> 
> This would result in write performance being what was expected in the
> guest while still letting the host coalesce I/O requests and perform
> scheduling across guests (while respecting each guest's own ordering
> requirements).

I generally agree with your suggestion that the performance benefits of the 
host page cache shouldn't be discarded just to make naive benchmark data 
collection easier.  Anyone suggesting that QEMU-emulated disk I/O could somehow 
outperform the host I/O system should recognize that something is wrong with 
their benchmark setup.  Unfortunately this discussion keeps reappearing in the 
Xen community, and I am sure similar threads will keep resurfacing as 
QEMU/KVM/virtio mature.

Steve

> 
> Regards,
> 
> Anthony Liguori
> 
>>  QEMU code is only used for the IDE emulation case which isn't
>> interesting from a performance POV.
>>
>> Daniel
>>   
> 
