> -----Original Message-----
> From: Peter Xu [mailto:pet...@redhat.com]
> Sent: Wednesday, April 25, 2018 11:32 AM
> To: Daniel P. Berrangé <berra...@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilb...@redhat.com>; wangxin (U)
> <wangxinxin.w...@huawei.com>; qemu-devel@nongnu.org;
> quint...@redhat.com; Gonglei (Arei) <arei.gong...@huawei.com>
> Subject: Re: [PATCH] migration/fd: abort migration if receive POLLHUP event
>
> On Wed, Apr 25, 2018 at 11:14:23AM +0800, Peter Xu wrote:
> > On Tue, Apr 24, 2018 at 07:24:05PM +0100, Daniel P. Berrangé wrote:
> > > On Tue, Apr 24, 2018 at 06:16:31PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Wang Xin (wangxinxin.w...@huawei.com) wrote:
> > > > > If the fd socket peer closes early, ppoll may receive a POLLHUP
> > > > > event before the expected POLLIN event, and qemu will do nothing
> > > > > but go into an infinite loop on the POLLHUP event.
> > > > >
> > > > > So, abort the migration if we receive a POLLHUP event.
> > > >
> > > > Hi Wang Xin,
> > > >   Can you explain how you managed to trigger this case? I've not hit it.
Hi David,

I haven't found a way to reproduce it reliably. It happened when I tried to
restore a VM from a remote image with a libvirt command, and the network
broke unexpectedly:

  virsh restore ------> libvirtd
                        run /  \ pass fd
  Remote image <----> libvirt_iohelper <--pipe--> qemu -migration

Here are some key logs:

1) virsh restore /opt/galax/mnt/hibernating/db898219-1cde-45bf-b46e-4e439aa2d573/db898219-1cde-45bf-b46e-4e439aa2d573.img --bypass-cache

2018-04-17T09:45:04.128979+08:00|info|libvirtd[4834]|[5098]|virDomainRestoreFlags[1140]|: enter virDomainRestoreFlags from=/opt/galax/mnt/hibernating/db898219-1cde-45bf-b46e-4e439aa2d573/db898219-1cde-45bf-b46e-4e439aa2d573.img, flags=3

2) Some qemu logs we caught:

2018-04-17T09:45:06.111682+08:00|info|qemu[1628]|[1628]|do_qmp_dispatch[109]|: qmp_cmd_name: migrate-incoming, arguments: {"uri": "fd:37"}
2018-04-17T09:45:06.111735+08:00|info|qemu[1628]|[1628]|monitor_qapi_event_emit[423]|: {"timestamp": {"seconds": 1523929506, "microseconds": 111706}, "event": "MIGRATION", "data": {"status": "setup"}}
2018-04-17T09:45:06.112353+08:00|info|qemu[1628]|[1628]|monitor_qapi_event_emit[423]|: {"timestamp": {"seconds": 1523929506, "microseconds": 112317}, "event": "MIGRATION", "data": {"status": "active"}}
... then nothing more is printed

3) The remote storage broke and the libvirt_iohelper process exited with an
error. The key system messages are below:

2018-04-17T09:45:04.298085+08:00|info|systemd-machined[-]|New machine qemu-1-i-0000023F.
2018-04-17T09:45:04.298547+08:00|info|systemd[-]|Started Virtual Machine qemu-1-i-0000023F.
2018-04-17T09:45:04.298918+08:00|info|systemd[-]|Starting Virtual Machine qemu-1-i-0000023F.
2018-04-17T09:45:06.188259+08:00|info|kernel[-]|[55369.414640] connection6:0: detected conn error (1020)    <-- remote storage broken, libvirt_iohelper exits with an error
2018-04-17T09:45:06.198229+08:00|info|kernel[-]|[55369.423977] connection7:0: detected conn error (1020)
2018-04-17T09:45:06.776992+08:00|warning|iscsid[-]|Kernel reported iSCSI connection 6:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
2018-04-17T09:45:06.777200+08:00|warning|iscsid[-]|Kernel reported iSCSI connection 7:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)

4) Qemu's ppoll() keeps waking up because of the POLLHUP event:

(gdb) p ((GPollFD *)gpollfds->data)[8]
$117 = {fd = 37, events = 1, revents = 16}

> > > > > Signed-off-by: Wang Xin <wangxinxin.w...@huawei.com>
> > > > >
> > > > > diff --git a/migration/fd.c b/migration/fd.c
> > > > > index cd06182..5932c87 100644
> > > > > --- a/migration/fd.c
> > > > > +++ b/migration/fd.c
> > > > > @@ -15,6 +15,7 @@
> > > > >   */
> > > > >
> > > > >  #include "qemu/osdep.h"
> > > > > +#include "qemu/error-report.h"
> > > > >  #include "channel.h"
> > > > >  #include "fd.h"
> > > > >  #include "monitor/monitor.h"
> > > > > @@ -46,6 +47,11 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
> > > > >                                               GIOCondition condition,
> > > > >                                               gpointer opaque)
> > > > >  {
> > > > > +    if (condition & G_IO_HUP) {
> > > > > +        error_report("The migration peer closed, job abort");
> > > > > +        exit(EXIT_FAILURE);
> > > > > +    }
> > > > > +
> > > >
> > > > OK, I wish we had a nicer way of failing; especially for the
> > > > multifd/postcopy recovery worlds, where one failed connection might not
> > > > be fatal; but I don't see how to do that here. Any suggestion?

Closing the fd is the normal way to handle a G_IO_HUP event.
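For reference, the failure mode can be made concrete with a small standalone
C program (an illustration of Linux poll() semantics, not QEMU code): when
the write end of a pipe is closed with no data pending, poll() reports a
bare POLLHUP (decimal 16, matching "revents = 16" in the gdb output above)
even though only POLLIN was requested. A watch that only dispatches on
POLLIN therefore wakes up over and over without ever running its callback:

    #include <poll.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fds[2];
        struct pollfd pfd;
        int n;

        if (pipe(fds) < 0) {
            perror("pipe");
            return 1;
        }
        close(fds[1]);            /* the "peer" exits before sending anything */

        pfd.fd = fds[0];
        pfd.events = POLLIN;      /* we only ask for input... */
        pfd.revents = 0;

        /* ...but poll() still returns immediately with POLLHUP (0x10),
         * which the kernel reports regardless of the requested events. */
        n = poll(&pfd, 1, 1000);
        printf("poll() = %d, events = %d, revents = %d\n",
               n, pfd.events, pfd.revents);
        /* prints: poll() = 1, events = 1, revents = 16 */
        return 0;
    }

Since the fd stays permanently "ready", a main loop that masks revents with
the requested condition before dispatching will spin on it forever, which
matches the behaviour described above.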
> > > This doesn't feel right to me.
> > >
> > > We have passed in a pre-opened FD to QEMU, and we registered a watch
> > > on it to detect when there is data from the src QEMU that is available
> > > to read. Normally the src will have sent something, so we'll get G_IO_IN,
> > > but you're suggesting the client has quit immediately, so we're getting
> > > G_IO_HUP due to end of file.
> > >
> > > The migration_channel_process_incoming() method that we pass the ioc
> > > object to will be calling qio_channel_read(ioc) somewhere to try to
> > > read that data.
> > >
> > > For QEMU to spin in an infinite loop there must be code in
> > > migration_channel_process_incoming() that is ignoring the return
> > > value of qio_channel_read() in some manner, causing it to retry
> > > the read again & again, I presume.
> > >
> > > Putting in this check for G_IO_HUP fixes your immediate problem scenario,
> > > but whatever code was spinning in an infinite loop is still broken, and
> > > I'd guess it is still possible to trigger the loop, e.g. by writing a
> > > single byte and then closing the socket.
> > >
> > > So, IMHO this fix is wrong - we need to find the root cause and fix
> > > that, not try to avoid calling the buggy code.
> >
> > I agree. AFAIU the first read should be in qemu_loadvm_state():
> >
> >     v = qemu_get_be32(f);
> >     if (v != QEMU_VM_FILE_MAGIC) {
> >         error_report("Not a migration stream");
> >         return -EINVAL;
> >     }
> >
> > So I would be curious to learn more about how that infinite loop happened.
>
> Ah, wait. I just noticed that Xin already mentioned the loop - it's an
> infinite loop on the POLLHUP event. I suppose that means we never go into
> fd_accept_incoming_migration() at all?

Yeah, that's what I want to fix.

> If so, I'm not sure whether we should just always watch for G_IO_HUP
> (and possibly G_IO_ERR too) in qio_channel_create_watch():
>
>     GSource *ret = klass->io_create_watch(ioc, condition | G_IO_HUP |
>                                           G_IO_ERR);
>
> Otherwise the same loop may happen for other users of
> qio_channel_add_watch().

Hi Daniel and Peter,

In my scenario, it's clear the client quit immediately and we never got a
POLLIN event; otherwise, the watch would have been unregistered when the
POLLIN event arrived. As Daniel said, normally G_IO_IN will be the first
event, so we need to find out why the POLLIN event never happened. I'll
look into it.

> Regards,
>
> --
> Peter Xu
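As a footnote to Peter's suggestion: below is a minimal self-contained GLib
sketch (a hypothetical illustration built on g_unix_fd_add(), not the actual
QEMU change to qio_channel_create_watch()) of a callback that asks for
G_IO_HUP | G_IO_ERR up front, so a hangup is observed and the watch torn
down instead of the fd being left permanently ready:

    #include <glib.h>
    #include <glib-unix.h>
    #include <unistd.h>

    static gboolean on_ready(gint fd, GIOCondition cond, gpointer user_data)
    {
        GMainLoop *loop = user_data;

        if (cond & (G_IO_HUP | G_IO_ERR)) {
            g_printerr("fd %d: peer closed (condition=0x%x), aborting\n",
                       fd, (unsigned)cond);
            g_main_loop_quit(loop);
            return G_SOURCE_REMOVE;    /* drop the watch: no more wakeups */
        }
        /* cond & G_IO_IN: the normal path; read and process the data here */
        return G_SOURCE_CONTINUE;
    }

    int main(void)
    {
        int fds[2];
        GMainLoop *loop;

        if (pipe(fds) < 0) {
            g_error("pipe failed");
        }
        close(fds[1]);                 /* simulate the peer exiting early */

        loop = g_main_loop_new(NULL, FALSE);
        /* Ask for the error conditions explicitly; per the discussion
         * above, QEMU's channel watches only dispatch for conditions
         * that were actually requested. */
        g_unix_fd_add(fds[0], G_IO_IN | G_IO_HUP | G_IO_ERR, on_ready, loop);
        g_main_loop_run(loop);

        g_main_loop_unref(loop);
        close(fds[0]);
        return 0;
    }

Whether the extra conditions belong in qio_channel_create_watch() itself or
in each caller is exactly the open question in the thread; the sketch only
shows the handler pattern.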