On 01.04.2016 19:49, Olaf Hering wrote: > On Fri, Apr 01, Max Reitz wrote: > >> In any case, do you have a test case where a guest was able to submit a >> request that led to the overflow error you described in the commit message? > > mkfs -t ext4 /dev/sdb1 in a xen guest with qcow2 as backing device. > When I added discard support to libxl I worked with raw images, so I did > not notice this. Not sure why it happens to work in kvm guests. I assume > the frontend driver just works around the qemu bug by limiting its > request size.
Sorry for not having replied in so long. I know next to nothing about Xen, but I'm very much inclined to think the Xen block driver (hw/block/xen_disk.c) is at fault here. The blkif_request_discard structure it uses for accepting discard requests apparently has a uint64_t nr_sectors field. So I think using the Xen block device, it is possible for a guest to submit discard requests with more then INT_MAX sectors involved, and it's the Xen's block device emulation code job to split those requests into smaller ones. That said, I'm not sure this is ideal. It doesn't really look like other block drivers care so much about splitting requests either. So we're a bit in a pickle there. Anyway, for your issue at hand I think there is a simpler solution. bdrv_co_discard() can and already does split requests. So if we remove the calls to blk_check_request() in blk_discard() and bdrv_check_request() in bdrv_co_discard(), everything should already work just fine. Therefore, I think what could work for now is for blk_discard() and bdrv_co_discard() to modify the checks they are doing on the incoming request. In theory, all we need to do is skip the length check for both, i.e. accept requests even if they are longer than INT_MAX / BDRV_SECTOR_SIZE sectors. I'm not sure how we can do that nicely in practice, though. Max
signature.asc
Description: OpenPGP digital signature