On 2017-06-23 02:53, Michael S. Tsirkin wrote:
On Thu, Jun 22, 2017 at 08:15:58AM +0200, jean-philippe menil wrote:
2017-06-06 1:52 GMT+02:00 Michael S. Tsirkin <m...@redhat.com>:

     On Mon, Jun 05, 2017 at 05:08:25AM +0300, Michael S. Tsirkin wrote:
     > On Mon, Jun 05, 2017 at 12:48:53AM +0200, Jean-Philippe Menil wrote:
     > > Hi,
     > >
     > > while playing with xdp and ebpf, i'm hitting the following:
     > >
     > > [  309.993136] ==================================================================
     > > [  309.994735] BUG: KASAN: use-after-free in free_old_xmit_skbs.isra.29+0x2b7/0x2e0 [virtio_net]
     > > [  309.998396] Read of size 8 at addr ffff88006aa64220 by task sshd/323
     > > [  310.000650]
     > > [  310.002305] CPU: 1 PID: 323 Comm: sshd Not tainted 4.12.0-rc3+ #2
     > > [  310.004018] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
...

     >
     > Since commit 680557cf79f82623e2c4fd42733077d60a843513
     >     virtio_net: rework mergeable buffer handling
     >
     > we no longer need to do the resets; we now have enough space
     > to store a bit saying whether a buffer is an xdp one or not.
     >
     > And that's probably a cleaner way to fix these issues than
     > trying to find and fix the race condition.
     >
     > John?
     >
     > --
     > MST


     I think I see the source of the race. virtio net calls
     netif_device_detach and assumes no packets will be sent after
     this point. However, all it does is stop all queues so
     no new packets will be transmitted.

     Try locking with HARD_TX_LOCK?
     --
     MST


Hi Michael,

From what I see, the race appears when we hit virtnet_reset in virtnet_xdp_set,
that is, when the xdp program is added or removed (I run two instances of the
program to trigger it faster):
virtnet_reset
   _remove_vq_common
     virtnet_del_vqs
       virtnet_free_queues
         kfree(vi->sq)

It's easily repeatable, with 2 cpus and 4 queues on the qemu command line,
running the xdp_ttl tool from Jesper.

For now, I'm able to continue my qualification by testing that xdp_qp is not
null, but that does not seem to be a sustainable trick:
if (xdp_qp && vi->xdp_queue_pairs != xdp_qp)

Maybe this information will make things clearer to you.

Best regards.

Jean-Philippe

I'm pretty clear about the issue here, I was trying to figure out a fix.
Jason, any thoughts?



Hi Jean:

Does the following fix this issue? (I can't reproduce it locally through xdp_ttl)

Thanks

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1f8c15c..3e65c3f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1801,7 +1801,9 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
        /* Make sure no work handler is accessing the device */
        flush_work(&vi->config_work);

+       netif_tx_lock_bh(vi->dev);
        netif_device_detach(vi->dev);
+       netif_tx_unlock_bh(vi->dev);
        cancel_delayed_work_sync(&vi->refill);

