Am 11.06.2018 um 16:04 schrieb Max Reitz: > On 2018-06-11 15:59, Peter Lieven wrote: >> Am 11.06.2018 um 15:30 schrieb Max Reitz: >>> On 2018-06-07 14:46, Peter Lieven wrote: >>>> We currently don't enforce that the sparse segments we detect during >>>> convert are >>>> aligned. This leads to unnecessary and costly read-modify-write >>>> cycles either >>>> internally in Qemu or in the background on the storage device as >>>> nearly all >>>> modern filesystems or hardware has a 4k alignment internally. >>>> >>>> As we per default set the min_sparse size to 4k it makes perfectly >>>> sense to ensure >>>> that these sparse holes in the file are placed at 4k boundaries. >>>> >>>> The number of RMW cycles when converting an example image [1] to a >>>> raw device that >>>> has 4k sector size is about 4600 4k read requests to perform a total >>>> of about 15000 >>>> write requests. With this path the 4600 additional read requests are >>>> eliminated. >>>> >>>> [1] >>>> https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.vmdk >>>> >>>> >>>> Signed-off-by: Peter Lieven <p...@kamp.de> >>>> --- >>>> qemu-img.c | 21 +++++++++++++++------ >>>> 1 file changed, 15 insertions(+), 6 deletions(-) >>> I like the idea, but it doesn't seem guaranteed that >>> is_allocated_sectors() is called on aligned offsets, so this alignment >>> work may still leave things unaligned. >> I can't image why this should happen. As long as the alignment devides >> the buffer size we either >> write or skip aligned bytes. Maybe get_block_status returns an unaligned >> number of sectors? > Yes, because the source medium does not need to be the same as the > destination (so the source may have e.g. 512-byte clusters). > >>> Furthermore, we should probably not blindly assume 4k but instead use >>> some block limit of the target, like pwrite_zeroes_alignment, or >>> pdiscard_alignment, depending on the case. (Or probably still >>> min_sparse, if that's less.) >>> >>> Since is_allocated_sectors_min() (the only caller of >>> is_allocated_sectors()) is called from just a single place, taking those >>> factors into account should be possible. >> I also thought of this, but for instance for raw-posix I always get a >> request_alignment of 1. > Yes, because request_alignment is a hard requirement. With caching, you > can send requests with any alignment, so it's 1. > > pwrite_zeroes_alignment and pdiscard_alignment are described as "Optimal > alignment", so those should contain the values we/you want. If they are > 0, then you should probably fall back to opt_transfer instead of > request_alignment.
I am still trying to figure out what is the best solution. If I take the optima into account I might ending up transfering more data than necessary just to create an optimal request. I just want to avoid unnecessary RMW cycles. And even if modern byte interfaces advertise a request_alignment of 1 someone has to do the RMW cycle. Either the OS or the harddrive itself. I am thinking about sth like alignment = MAX(request_alignment, opt_transfer, min_sparse) as a starting point? I found that opt_transfer seems to be 0 for everything I found to test. So maybe even reduce the alignment to MAX(request_alignment, min_sparse). Peter