One update for the testing scenario: there is no need to kill OVS. The issue is reproducible with a simple 'del-port' followed by 'add-port'. The virtio driver in the guest can crash on either operation. In my case it most often crashes on 'add-port' after the deletion.

Hi Maxime,

I already saw the patches below and the original Linux kernel virtio issue.
I just had not had enough time to test them. Now I have tested the patches
below and they fix the virtio driver crash. Thanks for the suggestion.

Michael, I tested "[PATCH] virtio_error: don't invoke status callbacks"
and it fixes the QEMU crash in case of a broken guest index. Thanks.

Best regards, Ilya Maximets.

P.S. Previously I mentioned that I could not reproduce the virtio driver
     crash with "[PATCH] virtio_error: don't invoke status callbacks"
     applied. I was wrong. I can reproduce it now. The system was
     misconfigured. Sorry.

On 14.12.2017 12:01, Maxime Coquelin wrote:
> Hi Ilya,
>
> On 12/14/2017 08:06 AM, Ilya Maximets wrote:
>> On 13.12.2017 22:48, Michael S. Tsirkin wrote:
>>> On Wed, Dec 13, 2017 at 04:45:20PM +0300, Ilya Maximets wrote:
>>>>>> That
>>>>>> looks very strange. Some of the functions gets 'old_status', others
>>>>>> the 'new_status'. I'm a bit confused.
>>>>>
>>>>> OK, fair enough. Fixed - let's pass old status everywhere,
>>>>> users that need the new one can get it from the vdev.
>>>>>
>>>>>> And it's not functional in current state:
>>>>>>
>>>>>> hw/net/virtio-net.c:264:28: error: ‘status’ undeclared
>>>>>
>>>>> Fixed too. new version below.
>>>>
>>>> This doesn't fix the segmentation fault.
>>>
>>> Hmm you are right. Looking into it.
>>>
>>>> I have exactly same crash stacktrace:
>>>>
>>>> #0 vhost_memory_unmap hw/virtio/vhost.c:446
>>>> #1 vhost_virtqueue_stop hw/virtio/vhost.c:1155
>>>> #2 vhost_dev_stop hw/virtio/vhost.c:1594
>>>> #3 vhost_net_stop_one hw/net/vhost_net.c:289
>>>> #4 vhost_net_stop hw/net/vhost_net.c:368
>>>> #5 virtio_net_vhost_status (old_status=15 '\017', n=0x5625f3901100) at
>>>> hw/net/virtio-net.c:180
>>>> #6 virtio_net_set_status (vdev=0x5625f3901100, old_status=<optimized
>>>> out>) at hw/net/virtio-net.c:254
>>>> #7 virtio_set_status (vdev=vdev@entry=0x5625f3901100, val=<optimized
>>>> out>) at hw/virtio/virtio.c:1152
>>>> #8 virtio_error (vdev=0x5625f3901100, fmt=fmt@entry=0x5625f014f688 "Guest
>>>> says index %u is available") at hw/virtio/virtio.c:2460
>>>
>>> BTW what is causing this? Why is guest avail index corrupted?
>>
>> My testing environment for the issue:
>>
>> * QEMU 2.10.1
>
> Could you try to backport below patch and try again killing OVS?
>
> commit 2ae39a113af311cb56a0c35b7f212dafcef15303
> Author: Maxime Coquelin <maxime.coque...@redhat.com>
> Date: Thu Nov 16 19:48:35 2017 +0100
>
>     vhost: restore avail index from vring used index on disconnection
>
>     vhost_virtqueue_stop() gets avail index value from the backend,
>     except if the backend is not responding.
>
>     It happens when the backend crashes, and in this case, internal
>     state of the virtio queue is inconsistent, making packets
>     to corrupt the vring state.
>
>     With a Linux guest, it results in following error message on
>     backend reconnection:
>
>     [ 22.444905] virtio_net virtio0: output.0:id 0 is not a head!
>     [ 22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
>     [ 22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5
>
>     Fixes: 283e2c2adcb8 ("net: virtio-net discards TX data after link down")
>     Cc: qemu-sta...@nongnu.org
>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>
> commit 2d4ba6cc741df15df6fbb4feaa706a02e103083a
> Author: Maxime Coquelin <maxime.coque...@redhat.com>
> Date: Thu Nov 16 19:48:34 2017 +0100
>
>     virtio: Add queue interface to restore avail index from vring used index
>
>     In case of backend crash, it is not possible to restore internal
>     avail index from the backend value as vhost_get_vring_base
>     callback fails.
>
>     This patch provides a new interface to restore internal avail index
>     from the vring used index, as done by some vhost-user backend on
>     reconnection.
>
>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>
>
> Cheers,
> Maxime