Re: [Qemu-devel] Re: KVM call minutes for Sept 14

2010-09-15 Thread Kevin Wolf
On 14.09.2010 17:11, Anthony Liguori wrote:
> On 09/14/2010 09:47 AM, Chris Wright wrote:
>> 0.13
>> - if all goes well...tomorrow
>
> To tag, it may be Thursday for announcement.  I need to run a regression
> run tonight.
>
>> qed/qcow2
>> - increase concurrency, performance
>
> To achieve performance, a block driver must: 1) support concurrent
> request handling 2) not hold the qemu_mutex for prolonged periods of time.
>
> QED never does (2) and supports (1) in all circumstances except cluster
> allocation today.
>
> qcow2 can do (1) for the data read/write portions of an I/O request.
> All metadata read/write is serialized.  It also does (2) for all
> metadata operations and for CoW operations.
>
> These are implementation details though.  The real claim of QED is that
> by having fewer IO ops required to satisfy a request, it achieves better
> performance especially since it achieves zero syncs in the cluster
> allocation path.  qcow2 has two syncs in the cluster allocation path
> today.  One sync is due to the refcount table.  Another sync is due to
> the fact that it doesn't require fsck support.

The refcount table sync is the sync that makes an fsck unnecessary. For a
simple cluster allocation (no L2 allocation, no COW), we only have one
sync (which is still one sync too many in this path, so we must move it).
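
A minimal sketch of what "one sync" means for the simple allocation case,
assuming a sync write is a write followed by a flush. This is not qcow2
code: the ToyImage class, its offsets and the 8-byte entry layout are
invented for illustration, and os.pwrite is only available on Unix-like
systems.

```python
import os
import struct
import tempfile

ENTRY = struct.Struct("<Q")  # one 8-byte table entry (hypothetical layout)


class ToyImage:
    """Toy image file; offsets and layout are invented for illustration."""

    REFCOUNT_OFFSET = 0   # hypothetical location of the refcount table
    L2_OFFSET = 4096      # hypothetical location of the L2 table

    def __init__(self, fd):
        self.fd = fd
        self.syncs = 0    # number of flushes issued so far

    def pwrite(self, offset, data):
        # Plain write: may stay in a volatile cache for a while.
        os.pwrite(self.fd, data, offset)

    def pwrite_sync(self, offset, data):
        # Write, then force the data to stable storage before returning.
        os.pwrite(self.fd, data, offset)
        os.fsync(self.fd)
        self.syncs += 1

    def allocate_cluster(self, index, cluster_offset):
        # 1) Mark the cluster as in use and wait until that is on disk.
        self.pwrite_sync(self.REFCOUNT_OFFSET + index * ENTRY.size,
                         ENTRY.pack(1))
        # 2) Only then point the L2 entry at the cluster. No sync here.
        self.pwrite(self.L2_OFFSET + index * ENTRY.size,
                    ENTRY.pack(cluster_offset))


with tempfile.TemporaryFile() as f:
    img = ToyImage(f.fileno())
    img.allocate_cluster(0, 65536)
    print("syncs in allocation path:", img.syncs)
```

The single flush sits between the two metadata writes, so its cost is paid
on every allocation; that is why moving it out of this path matters.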

Kevin
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] Re: KVM call minutes for Sept 14

2010-09-15 Thread Kevin Wolf
On 15.09.2010 14:26, Anthony Liguori wrote:
> On 09/15/2010 03:30 AM, Kevin Wolf wrote:
>> On 14.09.2010 17:11, Anthony Liguori wrote:
>>> On 09/14/2010 09:47 AM, Chris Wright wrote:
>>>> 0.13
>>>> - if all goes well...tomorrow
>>>
>>> To tag, it may be Thursday for announcement.  I need to run a regression
>>> run tonight.
>>>
>>>> qed/qcow2
>>>> - increase concurrency, performance
>>>
>>> To achieve performance, a block driver must: 1) support concurrent
>>> request handling 2) not hold the qemu_mutex for prolonged periods of time.
>>>
>>> QED never does (2) and supports (1) in all circumstances except cluster
>>> allocation today.
>>>
>>> qcow2 can do (1) for the data read/write portions of an I/O request.
>>> All metadata read/write is serialized.  It also does (2) for all
>>> metadata operations and for CoW operations.
>>>
>>> These are implementation details though.  The real claim of QED is that
>>> by having fewer IO ops required to satisfy a request, it achieves better
>>> performance especially since it achieves zero syncs in the cluster
>>> allocation path.  qcow2 has two syncs in the cluster allocation path
>>> today.  One sync is due to the refcount table.  Another sync is due to
>>> the fact that it doesn't require fsck support.
>>
>> The refcount table sync is the sync that allows not doing an fsck. For a
>> simple cluster allocation (no L2 allocation, no COW), we only have one
>> sync (which is still one sync too much in this path, so we must move it).
>
> Don't you have to write both a reference count entry and update the L2
> entry?  Both calls would be bdrv_pwrite_sync, no?

No, we don't really care if the L2 entry is on disk. If the guest wants
to have its data safe it needs to issue an explicit flush anyway. The
only thing we want to achieve with bdrv_write_sync is to maintain the
right order between metadata updates to survive a crash without corruption.
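
The ordering argument can be checked with a small model. The sketch below
assumes that writes issued since the last flush may reach the disk in any
combination, and that the corrupt case is an L2 entry on disk without its
refcount update. The function names and the "refcount"/"l2"/"FLUSH" step
labels are made up for this illustration; nothing here is QEMU code.

```python
from itertools import combinations


def crash_states(steps):
    """Return every set of writes that may be on disk after a crash.

    `steps` is a list of write names and "FLUSH" markers. Writes issued
    before a completed flush are durable; later ones may or may not have
    reached the disk, in any combination.
    """
    states = set()
    durable = frozenset()
    pending = []
    for step in steps + ["END"]:
        # Crash just before (or during) this step: durable writes plus
        # any subset of the writes still sitting in the cache.
        for r in range(len(pending) + 1):
            for subset in combinations(pending, r):
                states.add(durable | frozenset(subset))
        if step == "FLUSH":
            durable = durable | frozenset(pending)
            pending = []
        elif step != "END":
            pending.append(step)
    return states


def is_corrupt(state):
    # L2 entry points at a cluster whose refcount never reached the disk.
    return "l2" in state and "refcount" not in state


ordered = crash_states(["refcount", "FLUSH", "l2"])
unordered = crash_states(["refcount", "l2"])

print("ordered corrupt states:  ", [sorted(s) for s in ordered if is_corrupt(s)])
print("unordered corrupt states:", [sorted(s) for s in unordered if is_corrupt(s)])
```

With the flush between the two writes, the worst outcome in this model is
a refcount update without a matching L2 entry, i.e. a leaked cluster and
not a corrupted image. A lost L2 entry only loses data the guest had not
flushed yet.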

Kevin


Re: [Qemu-devel] Re: KVM call minutes for Sept 14

2010-09-15 Thread Anthony Liguori

On 09/15/2010 07:38 AM, Kevin Wolf wrote:
> No, we don't really care if the L2 entry is on disk. If the guest wants
> to have its data safe it needs to issue an explicit flush anyway. The
> only thing we want to achieve with bdrv_write_sync is to maintain the
> right order between metadata updates to survive a crash without corruption.


Ah, yes, this is brand new :-)

I was looking at my QED branch which is a few weeks old.

Regards,

Anthony Liguori




Re: [Qemu-devel] Re: KVM call minutes for Sept 14

2010-09-15 Thread Kevin Wolf
On 15.09.2010 15:21, Anthony Liguori wrote:
> On 09/15/2010 07:38 AM, Kevin Wolf wrote:
>> No, we don't really care if the L2 entry is on disk. If the guest wants
>> to have its data safe it needs to issue an explicit flush anyway. The
>> only thing we want to achieve with bdrv_write_sync is to maintain the
>> right order between metadata updates to survive a crash without corruption.
>
> Ah, yes, this is brand new :-)
>
> I was looking at my QED branch which is a few weeks old.

Well, the whole bdrv_pwrite_sync thing is new - with your benchmarking
you probably caught qcow2 at its worst performance in years. Initially I
just blindly converted everything to be on the safe side, and now we
need to optimize to get the performance back. There are probably some
more syncs that can be removed in less common paths.

Kevin


Re: [Qemu-devel] Re: KVM call minutes for Sept 14

2010-09-15 Thread Anthony Liguori

On 09/15/2010 08:30 AM, Kevin Wolf wrote:
> On 15.09.2010 15:21, Anthony Liguori wrote:
>> On 09/15/2010 07:38 AM, Kevin Wolf wrote:
>>> No, we don't really care if the L2 entry is on disk. If the guest wants
>>> to have its data safe it needs to issue an explicit flush anyway. The
>>> only thing we want to achieve with bdrv_write_sync is to maintain the
>>> right order between metadata updates to survive a crash without corruption.
>>
>> Ah, yes, this is brand new :-)
>>
>> I was looking at my QED branch which is a few weeks old.
>
> Well, the whole bdrv_pwrite_sync thing is new - with your benchmarking
> you probably caught qcow2 at its worst performance in years.

FWIW, we queued a run reverting the sync() stuff entirely as we were
aware of that.  Should have results this morning.

> Initially I
> just blindly converted everything to be on the safe side, and now we
> need to optimize to get the performance back. There are probably some
> more syncs that can be removed in less common paths.
   


Most likely.

Regards,

Anthony Liguori




Re: [Qemu-devel] Re: KVM call minutes for Sept 14

2010-09-15 Thread Kevin Wolf
On 15.09.2010 15:52, Anthony Liguori wrote:
> On 09/15/2010 08:30 AM, Kevin Wolf wrote:
>> On 15.09.2010 15:21, Anthony Liguori wrote:
>>> On 09/15/2010 07:38 AM, Kevin Wolf wrote:
>>>> No, we don't really care if the L2 entry is on disk. If the guest wants
>>>> to have its data safe it needs to issue an explicit flush anyway. The
>>>> only thing we want to achieve with bdrv_write_sync is to maintain the
>>>> right order between metadata updates to survive a crash without corruption.
>>>
>>> Ah, yes, this is brand new :-)
>>>
>>> I was looking at my QED branch which is a few weeks old.
>>
>> Well, the whole bdrv_pwrite_sync thing is new - with your benchmarking
>> you probably caught qcow2 at its worst performance in years.
>
> FWIW, we queued a run reverting the sync() stuff entirely as we were
> aware of that.  Should have results this morning.

Okay. I think that will be helpful, even outside the context of QED. I'd
be interested in how much of a difference it really makes in your tests.

Kevin