On 8/13/13 8:58 AM, Artem Bityutskiy wrote:
> # Make the image to be sparse
> $ cp --sparse=always Fedora-x86_64-19-20130627-sda.raw \
>     Fedora-x86_64-19-20130627-sda.raw.sparse
> 
> # Generate the bmap file
> $ bmaptool create Fedora-x86_64-19-20130627-sda.raw.sparse -o \
>     Fedora-x86_64-19-20130627-sda.raw.sparse.bmap

So this is the part that interests me . . . 

There seem to be two issues here: how do we efficiently (compress and) 
transport sparse files while retaining sparseness, and how do we efficiently 
operate on files which are already sparse?

For the latter, you're using your bmap tool to map what is hopefully a static 
file (via the FIBMAP or FIEMAP ioctl, I guess?).

I haven't looked at how you've done it, but you do need to be very careful that 
the file is stable & quiesced on disk.  Mapping it this way can be fraught with 
errors if the file is changing, or has delalloc blocks, etc.  And of course 
getting the mapping wrong means data corruption.  If the file is known to be 
sparse, then going forward, using SEEK_HOLE / SEEK_DATA is probably the best 
approach.
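
To make the SEEK_HOLE / SEEK_DATA suggestion concrete, here is a minimal sketch of walking a file's data extents with lseek().  It assumes Linux and Python 3.3 or later (os.SEEK_DATA and os.SEEK_HOLE); the function name data_extents is made up for illustration and is not how bmaptool does it.  On a filesystem that does not track holes, the kernel reports the whole file as one data extent, so the result is still safe, just not compact.

```python
import errno
import os


def data_extents(path):
    """Return a list of (offset, length) pairs for the data regions of path."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Find the next byte at or after offset that is not in a hole.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but a hole between offset and EOF
                raise
            # Find where that data region ends (EOF counts as a hole).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end - start))
            offset = end
    finally:
        os.close(fd)
    return extents
```

Note that this still only describes the file at the moment of the calls; it does not remove the need for the file to be quiesced while it is being mapped and copied.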

But then there's the issue of transporting these sparse files around.  We have 
had the same problem in the past with large e2image metadata image files, which 
may be terabytes in length, with only gigabytes or megabytes of real data.  
e2image _itself_ creates a sparse file, but bzipping it or rsyncing it still 
processes terabytes of zeros, and loses all notion of sparseness.
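
The gap between "length" and "real data" is easy to see from stat().  A small sketch, assuming Linux, where st_blocks is counted in 512-byte units; the helper name sparseness is hypothetical:

```python
import os


def sparseness(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # st_size is the length a plain byte-stream reader will process;
    # st_blocks * 512 is what is actually allocated on disk.
    return st.st_size, st.st_blocks * 512
```

A tool that simply reads the file from start to end has to get through the first number, even when the second one is orders of magnitude smaller.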

xfs_metadump worked around this by creating its own compact format describing a 
sparse file's data & sparseness, which is "unpacked" into a normal sparse file 
by xfs_mdrestore.

More recently e2image gained something slightly similar, but used the existing 
qcow2 format to encode the sparseness.  Running qemu-img convert to the "raw" 
format turns it back into a "normal" sparse file readable by e2fsprogs tools.

So I guess your solution requires 2 pieces of information: the existing file 
and the mapping file.  Are there mechanisms to ensure that they are in sync?

Another approach, which might (?) be more robust, is to somehow encode that 
sparseness in a single file format that can be transported/compressed/copied 
w/o losing the sparseness information, plus another tool to operate efficiently 
on that format at the destination, either by unpacking it to a normal sparse 
file or piping it to some other process.
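
Purely as an illustration of the single-file idea, here is a rough sketch of a pack/unpack pair.  Everything about the format (the magic string, the record layout, the zero-length end marker) is invented for this example; it is not the xfs_metadump or qcow layout.  It assumes Linux, Python 3.3 or later, and a source file that is not changing while it is packed.

```python
import errno
import os
import struct

MAGIC = b"SPRS0001"             # hypothetical magic, made up for this sketch
HEADER = struct.Struct("<8sQ")  # magic, apparent file size
RECORD = struct.Struct("<QQ")   # extent offset, extent length
CHUNK = 1 << 20


def _read_exact(stream, n):
    """Read exactly n bytes from stream, or raise EOFError."""
    buf = b""
    while len(buf) < n:
        part = stream.read(n - len(buf))
        if not part:
            raise EOFError("truncated stream")
        buf += part
    return buf


def pack(src_path, out):
    """Write only the data extents of src_path to the binary stream out."""
    fd = os.open(src_path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        out.write(HEADER.pack(MAGIC, size))
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # only a hole remains before EOF
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)
            out.write(RECORD.pack(start, end - start))
            pos = start
            while pos < end:
                buf = os.pread(fd, min(CHUNK, end - pos), pos)
                if not buf:
                    raise IOError("file shrank while packing")
                out.write(buf)
                pos += len(buf)
            offset = end
        out.write(RECORD.pack(size, 0))  # zero-length record marks the end
    finally:
        os.close(fd)


def unpack(inp, dst_path):
    """Recreate a sparse file at dst_path from a stream written by pack()."""
    magic, size = HEADER.unpack(_read_exact(inp, HEADER.size))
    if magic != MAGIC:
        raise ValueError("not a stream produced by pack()")
    with open(dst_path, "wb") as dst:
        dst.truncate(size)  # anything never written below stays a hole
        while True:
            start, length = RECORD.unpack(_read_exact(inp, RECORD.size))
            if length == 0:
                break
            dst.seek(start)
            while length:
                buf = _read_exact(inp, min(CHUNK, length))
                dst.write(buf)
                length -= len(buf)
```

Because the packed stream is read strictly front to back, it can go through a compressor or a pipe, and the map cannot drift out of sync with the data since both travel in the same stream.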

Just some thoughts...

-Eric
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
