On 04/25/2016 06:19 AM, Alex Bligh wrote: > > On 23 Apr 2016, at 00:40, Eric Blake <ebl...@redhat.com> wrote: > >> The upstream NBD Protocol has defined a new extension to allow >> the server to advertise block sizes to the client, as well as >> a way for the client to inform the server that it intends to >> obey block sizes. >> >> Pass any received sizes on to the block layer. >> >> Use the minimum block size as the sector size we pass to the >> kernel - which also has the nice effect of cooperating with >> (non-qemu) servers that don't do read-modify-write when exposing >> a block device with 4k sectors; it can also allow us to visit a >> file larger than 2T on a 32-bit kernel. >>
>> + be32_to_cpus(&info->max_block); >> + TRACE("Block sizes are 0x%" PRIx32 ", 0x%" PRIx32 ", 0x%" >> PRIx32, >> + info->min_block, info->opt_block, info->max_block); >> + break; >> + > > You should probably check min_block <= opt_block <= max_block here opt_block > max_block is possible if max_block is clamped to export size (in the degenerate case where you have a small export that is too small for the granularity of a hole or efficient I/O). But yes, some sanity checks that the server isn't horribly broken might be worthwhile. > > Also should there be a check that BDRV_SECTOR_SIZE >= min_block? No, because we take the server's min_block and feed it into bs->request_align (which forces the block layer to comply with a minimum alignment larger than 512, using code already tested on physical block drives with 4k sectors), see the hunk in nbd-client.c. >> int nbd_init(int fd, QIOChannelSocket *sioc, NbdExportInfo *info) >> { >> - unsigned long sectors = info->size / BDRV_SECTOR_SIZE; >> - if (info->size / BDRV_SECTOR_SIZE != sectors) { >> + unsigned long sector_size = MAX(BDRV_SECTOR_SIZE, info->min_block); >> + unsigned long sectors = info->size / sector_size; >> + if (info->size / sector_size != sectors) { >> LOG("Export size %" PRId64 " too large for 32-bit kernel", >> info->size); >> return -E2BIG; >> } >> @@ -724,18 +784,18 @@ int nbd_init(int fd, QIOChannelSocket *sioc, >> NbdExportInfo *info) >> return -serrno; >> } >> >> - TRACE("Setting block size to %lu", (unsigned long)BDRV_SECTOR_SIZE); >> + TRACE("Setting block size to %lu", sector_size); >> >> - if (ioctl(fd, NBD_SET_BLKSIZE, (unsigned long)BDRV_SECTOR_SIZE) < 0) { >> + if (ioctl(fd, NBD_SET_BLKSIZE, sector_size) < 0) { We also feed the maximum of 512 or the advertised minimum block size into the kernel when using ioctl() for the kernel to take over transmission phase; although I'm not certain whether the kernel obeys NBD_SET_BLKSIZE as a hint rather than a hard rule - but if that needs patching, it needs patching in the kernel implementation, not in qemu. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature