Am 08.01.2019 um 20:45 hat Vladimir Sementsov-Ogievskiy geschrieben: > bdrv_co_block_status digs bs->file for additional, more accurate search > for hole inside region, reported as DATA by bs since 5daa74a6ebc. > > This accuracy is not free: assume we have qcow2 disk. Actually, qcow2 > knows, where are holes and where is data. But every block_status > request calls lseek additionally. Assume a big disk, full of > data, in any iterative copying block job (or img convert) we'll call > lseek(HOLE) on every iteration, and each of these lseeks will have to > iterate through all metadata up to the end of file. It's obviously > ineffective behavior. And for many scenarios we don't need this lseek > at all. > > So, add an option to omit calling lseek. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
I don't think adding an option is a good idea, people will not know this option or forget to add it. Also, it's an option that you want to specify almost always, so if we do add an option, maybe it should be the default and an option should be added to enable recursive mode? I haven't checked the full list of bdrv_co_block_status() callers yet. So far we're talking about 'qemu-img convert' or the mirror job. Maybe we need to check other callers, too, but for now I'll stick to these. The interesting case seems to be preallocated qcow2 images, where the clusters are allocated on the qcow2 level, but not in the image file. If we don't do the lseek there, we'll have to read in the data and will only then detect that it is zero (qemu-img only, mirror doesn't do this yet - should it have an option to enable a zero-detecting mode? Can detect-zeroes on the target node achieve the same thing?). So at least in qemu-img, we don't get a different result, it's just a bit slower. Also note that this is only metadata preallocation; full preallocation will still return allocated for the protocol layer and so it will always be slow. Now, if this isn't about correctness, but about performance, maybe we can do some heuristics instead of adding a user option? A very simple approach could be, look at the lower layers only if we got a large allocated area. If the image was preallocated, everything is allocated, so if bdrv_co_block_status() returns a short range, it's a hint that qcow2 knows _something_ at least. (That something could just be that someone discarded something from a preallocated image, so we can't jump to the conclusion that it's not preallocated - but it's certainly more likely). And for small buffer, just reading it in and checking whether it is zero can't be that much slower than finding a hole first. So how about we look at the protocol layer only if we got an allocated range of at least a megabyte or something? Or maybe make it depend on the cluster size because with 2 MB clusters that will be always the case, even if the image isn't preallocated. Kevin