One update for the testing scenario:

No need to kill OVS. The issue reproducible with simple 'del-port'
and 'add-port'. virtio driver in guest could crash on both operations.
Most times it crashes in my case on 'add-port' after deletion.

Hi Maxime,
I already saw below patches and original linux kernel virtio issue.
Just had no enough time to test them.
Now I tested below patches and they fixes virtio driver crash.
Thanks for suggestion.

Michael,
I tested "[PATCH] virtio_error: don't invoke status callbacks "
and it fixes the QEMU crash in case of broken guest index.
Thanks.

Best regards, Ilya Maximets.

P.S. Previously I mentioned that I can not reproduce virtio driver
     crash with "[PATCH] virtio_error: don't invoke status callbacks"
     applied. I was wrong. I can reproduce now. System was misconfigured.
     Sorry.


On 14.12.2017 12:01, Maxime Coquelin wrote:
> Hi Ilya,
> 
> On 12/14/2017 08:06 AM, Ilya Maximets wrote:
>> On 13.12.2017 22:48, Michael S. Tsirkin wrote:
>>> On Wed, Dec 13, 2017 at 04:45:20PM +0300, Ilya Maximets wrote:
>>>>>> That
>>>>>> looks very strange. Some of the functions gets 'old_status', others
>>>>>> the 'new_status'. I'm a bit confused.
>>>>>
>>>>> OK, fair enough. Fixed - let's pass old status everywhere,
>>>>> users that need the new one can get it from the vdev.
>>>>>
>>>>>> And it's not functional in current state:
>>>>>>
>>>>>> hw/net/virtio-net.c:264:28: error: ‘status’ undeclared
>>>>>
>>>>> Fixed too. new version below.
>>>>
>>>> This doesn't fix the segmentation fault.
>>>
>>> Hmm you are right. Looking into it.
>>>
>>>> I have exactly same crash stacktrace:
>>>>
>>>> #0  vhost_memory_unmap hw/virtio/vhost.c:446
>>>> #1  vhost_virtqueue_stop hw/virtio/vhost.c:1155
>>>> #2  vhost_dev_stop hw/virtio/vhost.c:1594
>>>> #3  vhost_net_stop_one hw/net/vhost_net.c:289
>>>> #4  vhost_net_stop hw/net/vhost_net.c:368
>>>> #5  virtio_net_vhost_status (old_status=15 '\017', n=0x5625f3901100) at 
>>>> hw/net/virtio-net.c:180
>>>> #6  virtio_net_set_status (vdev=0x5625f3901100, old_status=<optimized 
>>>> out>) at hw/net/virtio-net.c:254
>>>> #7  virtio_set_status (vdev=vdev@entry=0x5625f3901100, val=<optimized 
>>>> out>) at hw/virtio/virtio.c:1152
>>>> #8  virtio_error (vdev=0x5625f3901100, fmt=fmt@entry=0x5625f014f688 "Guest 
>>>> says index %u is available") at hw/virtio/virtio.c:2460
>>>
>>> BTW what is causing this? Why is guest avail index corrupted?
>>
>> My testing environment for the issue:
>>
>> * QEMU 2.10.1
> 
> Could you try to backport below patch and try again killing OVS?
> 
> commit 2ae39a113af311cb56a0c35b7f212dafcef15303
> Author: Maxime Coquelin <maxime.coque...@redhat.com>
> Date:   Thu Nov 16 19:48:35 2017 +0100
> 
>     vhost: restore avail index from vring used index on disconnection
> 
>     vhost_virtqueue_stop() gets avail index value from the backend,
>     except if the backend is not responding.
> 
>     It happens when the backend crashes, and in this case, internal
>     state of the virtio queue is inconsistent, making packets
>     to corrupt the vring state.
> 
>     With a Linux guest, it results in following error message on
>     backend reconnection:
> 
>     [   22.444905] virtio_net virtio0: output.0:id 0 is not a head!
>     [   22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
>     [   22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5
> 
>     Fixes: 283e2c2adcb8 ("net: virtio-net discards TX data after link down")
>     Cc: qemu-sta...@nongnu.org
>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> 
> commit 2d4ba6cc741df15df6fbb4feaa706a02e103083a
> Author: Maxime Coquelin <maxime.coque...@redhat.com>
> Date:   Thu Nov 16 19:48:34 2017 +0100
> 
>     virtio: Add queue interface to restore avail index from vring used index
> 
>     In case of backend crash, it is not possible to restore internal
>     avail index from the backend value as vhost_get_vring_base
>     callback fails.
> 
>     This patch provides a new interface to restore internal avail index
>     from the vring used index, as done by some vhost-user backend on
>     reconnection.
> 
>     Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
>     Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>     Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
> 
> 
> Cheers,
> Maxime
> 
> 
> 

Reply via email to