Re: [Qemu-devel] [PATCH RFC 1/6] io: only allow return path for socket typed

Daniel P. Berrange Fri, 19 May 2017 05:57:05 -0700

On Fri, May 19, 2017 at 01:51:43PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berra...@redhat.com) wrote:
> > On Fri, May 19, 2017 at 02:43:27PM +0800, Peter Xu wrote:
> > > We don't really have a return path for the other types yet. Let's check
> > > this when .get_return_path() is called.
> > > 
> > > For this, we introduce a new feature bit, and set it up only for socket
> > > typed IO channels.
> > > 
> > > This will help detect earlier failure for postcopy, e.g., logically
> > > speaking postcopy cannot work with "exec:". Before this patch, when we
> > > try to migrate with "migrate -d exec:cat>out", we'll hang the system.
> > > With this patch, we'll get:
> > > 
> > > (qemu) migrate -d exec:cat>out
> > > Unable to open return-path for postcopy
> > 
> > This is wrong - post-copy migration *can* work with exec: - it just entirely
> > depends on what command you are running. Your example ran a command which is
> > unidirectional, but if you ran 'exec:socat ...' you would have a fully
> > bidirectional channel. Actually the channel is always bi-directional, but
> > 'cat' simply won't ever send data back to QEMU.
> 
> The thing is it didn't used to be able to; prior to your conversion to
> channel, postcopy would reject being started with exec: because it
> couldn't open a return path, so it was safe.


It safe but functionally broken because it is valid to want to use
exec migration with post-copy.

> > If QEMU hangs when the other end doesn't send data back, that actually seems
> > like a potentially serious bug in migration code. Even if using the normal
> > 'tcp' migration protocol, if the target QEMU server hangs and fails to
> > send data to QEMU on the return path, the source QEMU must never hang.
> 
> Hmm, we shouldn't get a 'hang' with a postcopy on a link without a
> return path; but it does depend on how the exec: behaves on the
> destination.
> If the destination discards data written to it, then I think the
> behaviour would be:
>    a) Page requests would just get dropped, they'd eventually get
> fulfilled by the background page transmissions, but that could mean that
> a page request would wait for minutes for the page.
>    b) The qemu main thread on the destination can be blocked by that, so
> the monitor might not respond until the page request is fulfilled.
>    c) I'm not quite sure what would happen to the source return-path
> thread
> 
> The behaviour seems to have changed between 2.9.0 (f26 package) and my
> reasonably recent head build.

That's due to the bug we just fixed where we mistakenly didn't
allow bi-directional I/O for exec

  commit 062d81f0e968fe1597474735f3ea038065027372
  Author: Daniel P. Berrange <berra...@redhat.com>
  Date:   Fri Apr 21 12:12:20 2017 +0100

    migration: setup bi-directional I/O channel for exec: protocol
    
    Historically the migration data channel has only needed to be
    unidirectional. Thus the 'exec:' protocol was requesting an
    I/O channel with O_RDONLY on incoming side, and O_WRONLY on
    the outgoing side.
    
    This is fine for classic migration, but if you then try to run
    TLS over it, this fails because the TLS handshake requires a
    bi-directional channel.
    
    Signed-off-by: Daniel P. Berrange <berra...@redhat.com>
    Reviewed-by: Juan Quintela <quint...@redhat.com>
    Signed-off-by: Juan Quintela <quint...@redhat.com>


> A migration_cancel also doesn't work for 'exec' because it doesn't
> support shutdown() - it just sticks in 'cancelling'.
> On a socket that was broken like this the cancel would work because
> it issues a shutdown() which causes the socket to cleanup.

I'm curious why migration_cancel requires shutdown() to work ? Why
isn't it sufficient from the source POV to just close the socket
entirely straight away.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH RFC 1/6] io: only allow return path for socket typed

Reply via email to