07.01.2019 15:52, Max Reitz wrote: > On 24.12.18 16:50, Vladimir Sementsov-Ogievskiy wrote: >> Hi all! >> >> bdrv_co_block_status digs bs->file for additional, more accurate search for >> hole >> inside region, reported as DATA by bs since long-ago >> >> commit 5daa74a6ebce7543aaad178c4061dc087bb4c705 >> Author: Paolo Bonzini <pbonz...@redhat.com> >> Date: Wed Sep 4 19:00:38 2013 +0200 >> >> block: look for zero blocks in bs->file >> >> This accuracy is not free: assume we have qcow2 disk. Actually, qcow2 knows, >> where are holes and where is data. But every block_status request calls lseek >> additionally. Recently (and not the first time) we have faced the situation, >> when lseek works slow. Of course, it is more about kernel implementation of >> it, >> but why to involve lseek at all, if we have the information on qcow2 level? >> Assume a big disk, full of data, in any iterative copying block job (or img >> convert) we'll call lseek(HOLE) on every iteration, and each of these lseeks >> will have to iterate through all metadata up to the end of file. It's >> obviously >> ineffective behavior. >> >> I'm thinking about, how to properly workaround this thing, and I have some >> questions. >> >> What are the cases, when we really need digging for zeros in bs->file? >> >> Why in mirror we care about discard vs write_zeros, keeping in mind, that >> block_status will return ZERO instead for unallocated regions in some cases, >> not >> related to guest view of disk (bdrv_unallocated_blocks_are_zero, backing >> file is >> smaller than request.start)? >> >> And finally, what to do with the problem? >> >> Obviously, the simplest thing is just revert 5daa74a. What will it break? >> >> Otherwise, I need to "revert" it in some cases, and, may be, it should be a >> user-defined parameter.. Should it? And what are the cases? We need to >> "free" >> of 5daa74a at least qcow2.. But again, should it be format-specific option, >> or >> something more generic? >> >> Then, if it would be separate parameter, how should it interfere with >> want_zeros, or even, should it be want_zeros, reworked a bit to be >> want_lseek? > > As far as I have understood, the @want_zero parameter basically is > exactly this. If you want the call to be quick, you set it to false. > If you want the result to be accurate, you set it to true. > > Well, for raw images it'll still go down to bs->file even with > want_zero=false, but I don't think there's much else we can do, really. > > Max >
want_zero also affects, that unallocated areas with bdrv_unallocated_blocks_are_zero() is true are still reported unallocated (without ZERO) if want_zero is not set. (and similar for unallocated areas after end of backing file).. Also, want_zero is currently internal parameter, just set to true in _block_status and false in _is_allocated. Ok, finally, is it OK to add no-lseek option for 'file' block driver, which will behave exactly as want_zero = false? -- Best regards, Vladimir