Eric,

> Hmm. The current wording of the experimental block size additions does
> NOT allow the client to send a NBD_CMD_TRIM with a size larger than the
> maximum NBD_CMD_WRITE:
> https://github.com/yoe/nbd/blob/extension-info/doc/proto.md#block-size-constraints

Correct

> Maybe we should revisit that in the spec, and/or advertise yet another
> block size (since the maximum size for a trim and/or write_zeroes
> request may indeed be different than the maximum size for a read/write).

I think it's up to the server to either handle large requests, or
for the client to break these up.

The core problem here is that the kernel (and, ahem, most servers) are
ignorant of the block size extension, and need to guess how to break
things up. In my view the client (kernel in this case) should
be breaking the trim requests up into whatever size it uses as the
maximum size write requests. But then it would have to know about block
sizes which are in (another) experimental extension.

I've nothing against you fixing it qemu's server (after all I don't
think there is anything to /prohibit/ a server working on something
larger than the maximum block size), and ...

> But since the kernel is the one sending the large length request, and
> since you are right that this is not a denial-of-service in the amount
> of data being sent in a single NBD message,

... currently there isn't actually a maximum block size advertised (that
being in another experimental addition), so this is just DoS prevention,
which qemu is free to do how it wishes.

> I definitely agree that qemu
> would be wise as a quality-of-implementation to allow the larger size,
> for maximum interoperability, even if it exceeds advertised limits (that
> is, when no limits are advertised, we should handle everything possible
> if it is not so large as to be construed a denial-of-service, and
> NBD_CMD_TRIM is not large; and when limits ARE advertised, a client that
> violates limits is out of spec but we can still be liberal and respond
> successfully to such a client rather than having to outright reject it).
> So I think this patch is headed in the right direction.

I'd agree with that.

What surprises me is that a release kernel is using experimental
NBD extensions; there's no guarantee these won't change. Or does
fstrim work some other way?

Alex

> 
>> 
>> and looking at the kernel side implementation of the nbd device
>> (drivers/block/nbd.c) where it only sends the request header with no data
>> for a NBD_CMD_TRIM.
>> 
>> With this fix in, I am now able to run fstrim on my qcow2 images and keep
>> them small (or at least keep their size proportional to the amount of data
>> present on them).
>> 
>> Signed-off-by: Quentin Casasnovas <quentin.casasno...@oracle.com>
>> CC: Paolo Bonzini <pbonz...@redhat.com>
>> CC: <qemu-devel@nongnu.org>
>> CC: <qemu-sta...@nongnu.org>
>> CC: <qemu-triv...@nongnu.org>
> 
> This is NOT trivial material and should not go in through that tree.
> However, I concur that it qualifies for a backport on a stable branch.
> 
>> ---
>> nbd.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/nbd.c b/nbd.c
>> index b3d9654..e733669 100644
>> --- a/nbd.c
>> +++ b/nbd.c
>> @@ -1209,6 +1209,11 @@ static ssize_t nbd_co_send_reply(NBDRequest *req, 
>> struct nbd_reply *reply,
>>     return rc;
>> }
>> 
>> +static bool nbd_should_check_request_size(const struct nbd_request *request)
>> +{
>> +        return (request->type & NBD_CMD_MASK_COMMAND) != NBD_CMD_TRIM;
>> +}
>> +
>> static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request 
>> *request)
>> {
>>     NBDClient *client = req->client;
>> @@ -1227,7 +1232,8 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, 
>> struct nbd_request *reque
>>         goto out;
>>     }
>> 
>> -    if (request->len > NBD_MAX_BUFFER_SIZE) {
>> +    if (nbd_should_check_request_size(request) &&
>> +        request->len > NBD_MAX_BUFFER_SIZE) {
> 
> I'd rather sort out the implications of this on the NBD protocol before
> taking anything into qemu.  We've got time on our hand, so let's use it
> to get this right.  (That, and I have several pending patches that
> conflict with this as part of adding WRITE_ZEROES and INFO_BLOCK_SIZE
> support, where it may be easier to resubmit this fix on top of my
> pending patches).
> 
>>         LOG("len (%u) is larger than max len (%u)",
>>             request->len, NBD_MAX_BUFFER_SIZE);
>>         rc = -EINVAL;
>> 
> 
> --
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 
> ------------------------------------------------------------------------------
> Mobile security can be enabling, not merely restricting. Employees who
> bring their own devices (BYOD) to work are irked by the imposition of MDM
> restrictions. Mobile Device Manager Plus allows you to control only the
> apps on BYO-devices by containerizing them, leaving personal data untouched!
> https://ad.doubleclick.net/ddm/clk/304595813;131938128;j_______________________________________________
> Nbd-general mailing list
> nbd-gene...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nbd-general

--
Alex Bligh




Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail

Reply via email to