On 11/19/18 4:23 AM, Daniel P. Berrangé wrote:
Simply setting the 'quit' flag after sending NBD_CMD_DISC is enough to
get the coroutine to stop waiting for a reply and thus supress the error
message.
Actually, it's not quite enough - once you actually start performing I/O,
enough coroutines are kicked off that the error still happens:
Hmm, does the client not send requests synchronously ? I expected that
any other I/O operations would have already received their reply by the
time we sent the DISC command.
During handshake phase (NBD_OPT_* messages), client sends requests
synchronously (at most one pending read, in response to the most recent
write). During transmission phase (NBD_CMD_* messages), the client is
allowed to send messages asynchronously. However, the NBD protocol
recommends (but not requires) that the client never send NBD_CMD_DISC to
disconnect until AFTER the client has collected all responses to any
earlier in-flight messages. And if I'm reading the block/nbd-client.c
code correctly, in general we don't try to shutdown a client while there
are any pending commands with unread replies, except when we are
shutting down due to detecting a protocol error (at which point we are
already out of sync with the server and waiting for replies is not going
to be helpful).
But I want to do more testing to make sure I'm not missing out on reporting
an actual error if I add that.
Yes, I'd be a little concerned about missing reporting of an error from
a command other than NBD_CMD_DISC if we did this.
But it looks like your approach of making the qio code behave more
nicely is a smarter cleanup than my idea, so I'm willing to ditch my
code in favor of yours. Testing your approach now :)
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org