On Wed, Dec 11, 2013 at 03:33:40PM +0800, Hu Tao wrote: > On Thu, Nov 28, 2013 at 11:03:04AM +0100, Peter Lieven wrote: > > On 28.11.2013 09:48, Hu Tao wrote: > > >On Wed, Nov 27, 2013 at 11:13:40AM +0100, Peter Lieven wrote: > > >>Am 27.11.2013 11:07, schrieb Fam Zheng: > > >>>On 2013年11月27日 18:03, Peter Lieven wrote: > > >>>>Am 27.11.2013 07:40, schrieb Fam Zheng: > > >>>>>On 2013年11月27日 14:01, Hu Tao wrote: > > >>>>>>On Wed, Nov 27, 2013 at 11:01:23AM +0800, Fam Zheng wrote: > > >>>>>>>On 2013年11月27日 10:15, Hu Tao wrote: > > >>>>>>>>Signed-off-by: Hu Tao <hu...@cn.fujitsu.com> > > >>>>>>>>--- > > >>>>>>>> block/qcow2.c | 7 +++++++ > > >>>>>>>> 1 file changed, 7 insertions(+) > > >>>>>>>> > > >>>>>>>>diff --git a/block/qcow2.c b/block/qcow2.c > > >>>>>>>>index b054a01..a23fade 100644 > > >>>>>>>>--- a/block/qcow2.c > > >>>>>>>>+++ b/block/qcow2.c > > >>>>>>>>@@ -2180,6 +2180,12 @@ static int > > >>>>>>>>qcow2_amend_options(BlockDriverState *bs, > > >>>>>>>> return 0; > > >>>>>>>> } > > >>>>>>>> > > >>>>>>>>+static int qcow2_preallocate(BlockDriverState *bs, int64_t offset, > > >>>>>>>>+ int64_t length) > > >>>>>>>>+{ > > >>>>>>>>+ return bdrv_preallocate(bs->file, offset, length); > > >>>>>>>>+} > > >>>>>>>>+ > > >>>>>>>What's the semantics of .bdrv_preallocate? I think you should map > > >>>>>>>[offset, offset + length) to clusters in image file, and then > > >>>>>>>forward to bs->file, rather than this direct wrapper. > > >>>>>>> > > >>>>>>>E.g. bdrv_preallocate(qcow2_bs, 0, cluster_size) should call > > >>>>>>>bdrv_preallocate(qcow2_bs->file, offset_off_first_cluster, > > >>>>>>>cluster_size). > > >>>>>>You mean data clusters here, right? Is there a single function to get > > >>>>>>the offset of the first data cluster? > > >>>>>> > > >>>>>There is a function, qcow2_get_cluster_offset. > > >>>>This should return no valid offset as long as the cluster is not > > >>>>allocated. > > >>>> > > >>>>I think you actually have to "write" all clusters of a qcow2 one by one. > > >>>>Eventually this write could be an fallocate call instead of a zero > > >>>>write. > > >>>> > > >>>Yes, I was wrong about qcow2_get_cluster_offset. The logic here is more > > >>>like cluster allocation in qcow2_alloc_cluster_offset. Maybe we can > > >>>reuse that. > > >>What I don't like about the preallocation is that we would loose the > > >>information that a cluster contains no valid data and would read it e.g. > > >>during > > >>conversion. > > >So the information is stored in table and you mean we shouldn't clear > > >table when do preallocation? I'm not sure how the information could be > > >useful on a newly-created image, but it seems ideal to keep informations > > >in table. > > When you want to e.g. convert this qcow2 later the performance is lower > > than needed because > > you read all those preallocated sectors altough you could now they are > > empty. > > > > > >>I think what we want is a preallocated image with all clusters > > >>sequentally mapped into the qcow2 file. Preallocate all the cluster > > >>extends, but still > > >>have the information in the table that the cluster in fact has no valid > > >>data. So we would need a valid cluster offset while still haveing the > > >>flag that the cluster is unallocated. I think this would require > > >>thoughtfully checking all the cluster functions if they can easily cope > > >>with this. > > >> > > >>The quetion is Hu, what do you want to achieve? Do you want that the > > >>space on the filesystem is preallocated so you can't overcommit or > > >>do you also want a sequential mapping of all the clusters into the file? > > >The goal is to avoid sparse file as it can cause performance problem. So > > >the first one. I'm not sure about the second but IIUC, one fallocate() > > >is enough for all clusters if they are sequentially mapped. > > If you do not premap them they are allocated in the order they are written. > > So if you are going to preallocate the whole file anyway, you should > > sequentally map all clusters into the file > > AND still keep the information that they are in fact not yet written. > > Can this be achieved by first fallocate() the disk file, then allocate > metadata? This way all metadata clusters are allocated before any data > clusters, leaving all data clusters at the end of file.
Any comments?