On 12 Apr 2016, at 11:34, Daniel P. Berrange <[email protected]> wrote:
> On Tue, Apr 12, 2016 at 10:48:20AM +0100, Daniel P. Berrange wrote:
>> On Sun, Apr 10, 2016 at 10:49:00AM +0100, Alex Bligh wrote:
>>> (Daniel: if you want to replicate the issue, just run qemu-img info
>>> against my gonbdserver with TLS. Every fifth NBD_CMD_DISC doesn't
>>> get through before the TCP session closes).
>>
>> Hmm, I'll have a look at this - I'm not sure its caused by
>> the lack of gnutls_bye, as opposed to incorrect handling
>> of EAGAIN when the block layer sends CMD_DISC
>
> I have tried to reproduce with your server yet, but I did look at the
> code and identified one potential flaw that might cause the behaviour
> you mention. Can you try testing with the following change applied
> to QEMU:
>
> diff --git a/nbd/common.c b/nbd/common.c
> index 8ddb2dd..47757b6 100644
> --- a/nbd/common.c
> +++ b/nbd/common.c
> @@ -50,7 +50,7 @@ ssize_t nbd_wr_syncv(QIOChannel *ioc,
> * qio_channel_yield() that works with AIO contexts
> * and consider using that in this branch */
> qemu_coroutine_yield();
> - } else if (done) {
> + } else if (done || !do_read) {
> /* XXX this is needed by nbd_reply_ready. */
> qio_channel_wait(ioc,
> do_read ? G_IO_IN : G_IO_OUT);
>
>
> The current code will cause it to silently not send CMD_DISC if the socket
> returns EAGAIN on send(). This change will cause it to do a poll to wait
> and retry the send(). That said I'd be pretty surprised if we do actually
> get EAGAIN in this scenario, as I'd expect the outgoing TCP buffers to be
> empty at the point in which we close the client connection, but this fix
> is worth a try.
Well, what with pulling to the latest qemu master I now can't reproduce
the original problem against my server :-/ However, I have verified that
against my server it works with your patch (as well as without). I have
verified each time that I am receiving NBD_CMD_DISC.
It's also now working (with and without your patch) against nbd-server.c
(reference implementation) with my GnuTLS patches. This should be exactly
the same as before.
I suspect the issue may be timing related. Unless there's something else
fixed in master, the main change I've made is rebooted my MBP, which
was running very slowly.
I think you should probably still do a gnutls_bye() though.
--
Alex Bligh
------------------------------------------------------------------------------
Find and fix application performance issues faster with Applications Manager
Applications Manager provides deep performance insights into multiple tiers of
your business applications. It resolves application problems quickly and
reduces your MTTR. Get your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________
Nbd-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nbd-general