Re: [Qemu-devel] [RFC PATCH v2 5/6] qcow2: implement bdrv_preallocate

Hu Tao Mon, 16 Dec 2013 00:28:58 -0800

On Wed, Dec 11, 2013 at 03:33:40PM +0800, Hu Tao wrote:
> On Thu, Nov 28, 2013 at 11:03:04AM +0100, Peter Lieven wrote:
> > On 28.11.2013 09:48, Hu Tao wrote:
> > >On Wed, Nov 27, 2013 at 11:13:40AM +0100, Peter Lieven wrote:
> > >>Am 27.11.2013 11:07, schrieb Fam Zheng:
> > >>>On 2013年11月27日 18:03, Peter Lieven wrote:
> > >>>>Am 27.11.2013 07:40, schrieb Fam Zheng:
> > >>>>>On 2013年11月27日 14:01, Hu Tao wrote:
> > >>>>>>On Wed, Nov 27, 2013 at 11:01:23AM +0800, Fam Zheng wrote:
> > >>>>>>>On 2013年11月27日 10:15, Hu Tao wrote:
> > >>>>>>>>Signed-off-by: Hu Tao <hu...@cn.fujitsu.com>
> > >>>>>>>>---
> > >>>>>>>>    block/qcow2.c | 7 +++++++
> > >>>>>>>>    1 file changed, 7 insertions(+)
> > >>>>>>>>
> > >>>>>>>>diff --git a/block/qcow2.c b/block/qcow2.c
> > >>>>>>>>index b054a01..a23fade 100644
> > >>>>>>>>--- a/block/qcow2.c
> > >>>>>>>>+++ b/block/qcow2.c
> > >>>>>>>>@@ -2180,6 +2180,12 @@ static int 
> > >>>>>>>>qcow2_amend_options(BlockDriverState *bs,
> > >>>>>>>>        return 0;
> > >>>>>>>>    }
> > >>>>>>>>
> > >>>>>>>>+static int qcow2_preallocate(BlockDriverState *bs, int64_t offset,
> > >>>>>>>>+                             int64_t length)
> > >>>>>>>>+{
> > >>>>>>>>+    return bdrv_preallocate(bs->file, offset, length);
> > >>>>>>>>+}
> > >>>>>>>>+
> > >>>>>>>What's the semantics of .bdrv_preallocate? I think you should map
> > >>>>>>>[offset, offset + length) to clusters in image file, and then
> > >>>>>>>forward to bs->file, rather than this direct wrapper.
> > >>>>>>>
> > >>>>>>>E.g. bdrv_preallocate(qcow2_bs, 0, cluster_size) should call
> > >>>>>>>bdrv_preallocate(qcow2_bs->file, offset_off_first_cluster,
> > >>>>>>>cluster_size).
> > >>>>>>You mean data clusters here, right? Is there a single function to get
> > >>>>>>the offset of the first data cluster?
> > >>>>>>
> > >>>>>There is a function, qcow2_get_cluster_offset.
> > >>>>This should return no valid offset as long as the cluster is not 
> > >>>>allocated.
> > >>>>
> > >>>>I think you actually have to "write" all clusters of a qcow2 one by one.
> > >>>>Eventually this write could be an fallocate call instead of a zero 
> > >>>>write.
> > >>>>
> > >>>Yes, I was wrong about qcow2_get_cluster_offset. The logic here is more 
> > >>>like cluster allocation in qcow2_alloc_cluster_offset. Maybe we can 
> > >>>reuse that.
> > >>What I don't like about the preallocation is that we would loose the 
> > >>information that a cluster contains no valid data and would read it e.g. 
> > >>during
> > >>conversion.
> > >So the information is stored in table and you mean we shouldn't clear
> > >table when do preallocation? I'm not sure how the information could be
> > >useful on a newly-created image, but it seems ideal to keep informations
> > >in table.
> > When you want to e.g. convert this qcow2 later the performance is lower 
> > than needed because
> > you read all those preallocated sectors altough you could now they are 
> > empty.
> > >
> > >>I think what we want is a preallocated image with all clusters 
> > >>sequentally mapped into the qcow2 file. Preallocate all the cluster 
> > >>extends, but still
> > >>have the information in the table that the cluster in fact has no valid 
> > >>data. So we would need a valid cluster offset while still haveing the
> > >>flag that the cluster is unallocated. I think this would require 
> > >>thoughtfully checking all the cluster functions if they can easily cope 
> > >>with this.
> > >>
> > >>The quetion is Hu, what do you want to achieve? Do you want that the 
> > >>space on the filesystem is preallocated so you can't overcommit or
> > >>do you also want a sequential mapping of all the clusters into the file?
> > >The goal is to avoid sparse file as it can cause performance problem. So
> > >the first one. I'm not sure about the second but IIUC, one fallocate()
> > >is enough for all clusters if they are sequentially mapped.
> > If you do not premap them they are allocated in the order they are written.
> > So if you are going to preallocate the whole file anyway, you should 
> > sequentally map all clusters into the file
> > AND still keep the information that they are in fact not yet written.
> 
> Can this be achieved by first fallocate() the disk file, then allocate
> metadata? This way all metadata clusters are allocated before any data
> clusters, leaving all data clusters at the end of file.


Any comments?

Re: [Qemu-devel] [RFC PATCH v2 5/6] qcow2: implement bdrv_preallocate

Reply via email to