On 11/17/2016 04:01 PM, Max Reitz wrote: > On 17.11.2016 21:13, Eric Blake wrote: >> Right now, the block layer rounds discard requests, so that >> individual drivers are able to assert that discard requests >> will never be unaligned. But there are some ISCSI devices >> that track and coalesce multiple unaligned requests, turning it >> into an actual discard if the requests eventually cover an >> entire page, which implies that it is better to always pass >> discard requests as low down the stack as possible. >> >> In isolation, this patch has no semantic effect, since the >> block layer currently never passes an unaligned request through. >> But the block layer already has code that silently ignores >> drivers that return -ENOTSUP for a discard request that cannot >> be honored (as well as drivers that return 0 even when nothing >> was done). But the next patch will update the block layer to >> fragment discard requests, so that clients are guaranteed that >> they are either dealing with an unaligned head or tail, or an >> aligned core, making it similar to the block layer semantics of >> write zero fragmentation. >>
>> +++ b/block/iscsi.c >> @@ -1083,7 +1083,9 @@ coroutine_fn iscsi_co_pdiscard(BlockDriverState *bs, >> int64_t offset, int count) >> struct IscsiTask iTask; >> struct unmap_list list; >> >> - assert(is_byte_request_lun_aligned(offset, count, iscsilun)); >> + if (!is_byte_request_lun_aligned(offset, count, iscsilun)) { >> + return -ENOTSUP; >> + } >> >> if (!iscsilun->lbp.lbpu) { >> /* UNMAP is not supported by the target */ > > Next line is: > >> return 0; > > Hmm... -ENOTSUP would be the obvious return value here, too. That might > interfere with your next patch, though. Shouldn't interfere. I guess no one value is better than the other; I can respin to pick a consistent value (I'd lean towards -ENOTSUP) if you think it is worth it; but I'd rather get this into 2.8 without worrying about it. > >> diff --git a/block/qcow2.c b/block/qcow2.c >> index e22f6dc..7cfcd84 100644 >> --- a/block/qcow2.c >> +++ b/block/qcow2.c >> @@ -2491,6 +2491,11 @@ static coroutine_fn int >> qcow2_co_pdiscard(BlockDriverState *bs, >> int ret; >> BDRVQcow2State *s = bs->opaque; >> >> + if (!QEMU_IS_ALIGNED(offset | count, s->cluster_size)) { > > Ha! I like "offset | count". It only works because we know that qcow2 guarantees that s->cluster_size is a power of 2 (it does not work at the block layer, where the bs->bl.pdiscard_align need not be a power of 2). > >> + assert(count < s->cluster_size); > > Maybe add a comment for this assertion? E.g. "The block layer will only > generate unaligned discard requests that are smaller than the alignment". Sure, if the maintainer wants a respin. > >> + return -ENOTSUP; >> + } >> + >> qemu_co_mutex_lock(&s->lock); >> ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS, >> QCOW2_DISCARD_REQUEST, false); >> diff --git a/block/sheepdog.c b/block/sheepdog.c >> index 1fb9173..4c9af89 100644 >> --- a/block/sheepdog.c >> +++ b/block/sheepdog.c >> @@ -2829,8 +2829,9 @@ static coroutine_fn int >> sd_co_pdiscard(BlockDriverState *bs, int64_t offset, >> iov.iov_len = sizeof(zero); >> discard_iov.iov = &iov; >> discard_iov.niov = 1; >> - assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0); >> - assert((count & (BDRV_SECTOR_SIZE - 1)) == 0); >> + if (!QEMU_IS_ALIGNED(offset | count, BDRV_SECTOR_SIZE)) { Again, works because this is a power of 2. >> + return -ENOTSUP; >> + } > > Out of interest: Where does sheepdog tell the block layer that requests > have to be aligned to this value? It's Magic ! Don't tell anyone that I told you :) Actually, it's because sheepdog still uses .bdrv_co_readv (sector-based), and not .bdrv_co_preadv (byte-based), so the block layer automatically sets bs->bl.request_alignment to BDRV_SECTOR_SIZE for sheepdog and all other old-school drivers. The block layer then treats bs->bl.request_alignment as the very minimum that can be changed in the image at a time, so it only makes sense that sheepdog can't react to a discard request aligned below those limits. This code is weakening from an assertion to an early error return, and then 5/9 is what starts calling the code even with something aligned smaller than a sector. Someday the sheepdog driver may be relaxed to implement byte-based callbacks, and then we may want to delete the early error return of -ENOTSUP for requests smaller than 512; but that depends on whether sheepdog uses bytes or sectors over the wire. > > With this patch, it doesn't matter though, it only did before, so: > > Reviewed-by: Max Reitz <mre...@redhat.com> > >> acb = sd_aio_setup(bs, &discard_iov, offset >> BDRV_SECTOR_BITS, >> count >> BDRV_SECTOR_BITS); >> acb->aiocb_type = AIOCB_DISCARD_OBJ; >> > > -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature