On Fri, Jun 23, 2017 at 1:43 AM, Jason Wang <jasow...@redhat.com> wrote:
>
>
> On 2017年06月23日 02:53, Michael S. Tsirkin wrote:
>>
>> On Thu, Jun 22, 2017 at 08:15:58AM +0200, jean-philippe menil wrote:
>>>
>>> Hi Michael,
>>>
>>> from what i see, the race appear when we hit virtnet_reset in
>>> virtnet_xdp_set.
>>> virtnet_reset
>>>   _remove_vq_common
>>>     virtnet_del_vqs
>>>       virtnet_free_queues
>>>         kfree(vi->sq)
>>> when the xdp program (with two instances of the program to trigger it
>>> faster)
>>> is added or removed.
>>>
>>> It's easily repeatable, with 2 cpus and 4 queues on the qemu command
>>> line,
>>> running the xdp_ttl tool from Jesper.
>>>
>>> For now, i'm able to continue my qualification, testing if xdp_qp is not
>>> null,
>>> but do not seem to be a sustainable trick.
>>>   if (xdp_qp && vi->xdp_queues_pairs != xdp_qp)
>>>
>>> Maybe it will be more clear to you with theses informations.
>>>
>>> Best regards.
>>>
>>> Jean-Philippe
>>
>>
>> I'm pretty clear about the issue here, I was trying to figure out a fix.
>> Jason, any thoughts?
>>
>>
>
> Hi Jean:
>
> Does the following fix this issue? (I can't reproduce it locally through
> xdp_ttl)

It is tricky here.

From my understanding of the code base, the tx_lock is not sufficient
here, because in virtnet_del_vqs() all vqs are deleted and one vq maps
to one txq.

I am afraid you have to add a spinlock somewhere to serialize
free_old_xmit_skbs() vs. vring_del_virtqueue(). As you can see, they
are in different layers, so it is hard to figure out where to add it...

Also, make sure we don't sleep inside the spinlock; I see a
synchronize_net().
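
To make the idea concrete, here is a rough userspace sketch of the
serialization I have in mind. It is only an analogy, not a patch: the
names demo_sq, demo_dev, demo_reclaim() and demo_teardown() are made up
for illustration, and a pthread mutex stands in for the spinlock. The
reclaim path plays the role of free_old_xmit_skbs(), and the teardown
path plays the role of the virtqueue deletion. Teardown detaches the
queue pointer under the lock and does the free, and anything that might
block, only after dropping it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a send queue; not the real virtio_net layout. */
struct demo_sq {
    int pending; /* transmitted buffers waiting to be reclaimed */
};

struct demo_dev {
    pthread_mutex_t lock; /* userspace stand-in for the proposed spinlock */
    struct demo_sq *sq;   /* NULL once the queue has been torn down */
};

/* Allocate the queue with `pending` buffers outstanding. Returns 0 or -1. */
int demo_dev_init(struct demo_dev *dev, int pending)
{
    assert(dev != NULL);
    if (pthread_mutex_init(&dev->lock, NULL) != 0)
        return -1;
    dev->sq = malloc(sizeof(*dev->sq));
    if (dev->sq == NULL) {
        pthread_mutex_destroy(&dev->lock);
        return -1;
    }
    dev->sq->pending = pending;
    return 0;
}

/* Plays the role of free_old_xmit_skbs(): reclaim under the lock.
 * Returns the number reclaimed, or -1 if the queue is already gone. */
int demo_reclaim(struct demo_dev *dev)
{
    int reclaimed = -1;

    pthread_mutex_lock(&dev->lock);
    if (dev->sq != NULL) {
        reclaimed = dev->sq->pending;
        dev->sq->pending = 0;
    }
    pthread_mutex_unlock(&dev->lock);
    return reclaimed;
}

/* Plays the role of the teardown path: detach under the lock, then do
 * anything that may sleep, and the free itself, after dropping it. */
void demo_teardown(struct demo_dev *dev)
{
    struct demo_sq *old;

    pthread_mutex_lock(&dev->lock);
    old = dev->sq;
    dev->sq = NULL; /* later reclaim calls see NULL, not freed memory */
    pthread_mutex_unlock(&dev->lock);

    /* A blocking wait (the analogue of synchronize_net()) would go here. */
    free(old); /* free(NULL) is a no-op, so a second teardown is harmless */
}
```

With this shape, a reclaim that runs after teardown returns -1 instead
of touching freed memory, and nothing blocks while the lock is held.
Where such a lock would actually live in the driver is still the open
question, since the two real functions sit in different layers.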