29.03.2019 7:27, Eric Blake wrote:
> We have a latent bug in our NBD client code, tickled by the brand new
> nbdkit 1.11.10 block status support:
> 
> $ nbdkit --filter=log --filter=truncate -U - \
>             data data="1" size=511 truncate=64K logfile=/dev/stdout \
>             --run 'qemu-img convert $nbd /var/tmp/out'
> ...
> qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `*pnum && 
> QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
> 
> The culprit? Our implementation of .bdrv_co_block_status can return
> unaligned block status for any server that operates with a lower
> actual alignment than what we tell the block layer in
> request_alignment, in violation of the block layer's constraints. To
> date, we've been unable to trip the bug, because qemu as NBD server
> always advertises block sizing (at which point it is a server bug if
> the server sends unaligned status - although qemu 3.1 is such a server
> and I've sent separate patches for 4.0 both to get the server to obey
> the spec, and to let the client to tolerate server oddities at EOF).
> 
> But nbdkit does not (yet) advertise block sizing, and therefore is not
> in violation of the spec for returning block status at whatever
> boundaries it wants, and those unaligned results can occur anywhere
> rather than just at EOF. While we are still wise to avoid sending
> sub-sector read/write requests to a server of unknown origin, we MUST
> consider that a server telling us block status without an advertised
> block size is correct.  So, we either have to munge unaligned answers
> from the server into aligned ones that we hand back to the block
> layer, or we have to tell the block layer about a smaller alignment.
> 
> Similarly, if the server advertises an image size that is not
> sector-aligned, we might as well assume that the server intends to let
> us access those tail bytes, and therefore supports a minimum block
> size of 1, regardless of whether the server supports block status
> (although we still need more patches to fix the problem that with an
> unaligned image, we can send read or block status requests that exceed
> EOF to the server). Again, qemu as server cannot trip this problem
> (because it rounds images to sector alignment), but nbdkit advertised
> unaligned size even before it gained block status support.
> 
> Solve both alignment problems at once by using better heuristics on
> what alignment to report to the block layer when the server did not
> give us something to work with. Note that very few NBD servers
> implement block status (to date, only qemu and nbdkit are known to do
> so); and as the NBD spec mentioned block sizing constraints prior to
> documenting block status, it can be assumed that any future
> implementations of block status are aware that they must advertise
> block size if they want a minimum size other than 1.
> 
> We've had a long history of struggles with picking the right alignment
> to use in the block layer, as evidenced by the commit message of
> fd8d372d (v2.12) that introduced the current choice of forced 512-byte
> alignment.
> 
> There is no iotest coverage for this fix, because qemu can't provoke
> it, and I didn't want to make test 241 dependent on nbdkit.
> 
> Fixes: fd8d372d
> Reported-by: Richard W.M. Jones <rjo...@redhat.com>
> Signed-off-by: Eric Blake <ebl...@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>

> ---
>   block/nbd.c | 19 ++++++++++++++++++-
>   1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index 2e72df528ac..208be596027 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -437,7 +437,24 @@ static void nbd_refresh_limits(BlockDriverState *bs, 
> Error **errp)
>       uint32_t min = s->info.min_block;
>       uint32_t max = MIN_NON_ZERO(NBD_MAX_BUFFER_SIZE, s->info.max_block);
> 
> -    bs->bl.request_alignment = min ? min : BDRV_SECTOR_SIZE;
> +    /*
> +     * If the server did not advertise an alignment:
> +     * - a size that is not sector-aligned implies that an alignment
> +     *   of 1 can be used to access those tail bytes
> +     * - advertisement of block status requires an alignment of 1, so
> +     *   that we don't violate block layer constraints that block
> +     *   status is always aligned (as we can't control whether the
> +     *   server will report sub-sector extents, such as a hole at EOF
> +     *   on an unaligned POSIX file)
> +     * - otherwise, assume the server is so old that we are safer avoiding
> +     *   sub-sector requests
> +     */
> +    if (!min) {
> +        min = (!QEMU_IS_ALIGNED(s->info.size, BDRV_SECTOR_SIZE) ||
> +               s->info.base_allocation) ? 1 : BDRV_SECTOR_SIZE;
> +    }
> +
> +    bs->bl.request_alignment = min;
>       bs->bl.max_pdiscard = max;
>       bs->bl.max_pwrite_zeroes = max;
>       bs->bl.max_transfer = max;
> 


-- 
Best regards,
Vladimir

Reply via email to