Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coque...@redhat.com>
> Sent: Friday, June 18, 2021 4:48 PM
> To: Xia, Chenbo <chenbo....@intel.com>; dev@dpdk.org;
> david.march...@redhat.com
> Cc: sta...@dpdk.org
> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
>
> On 6/18/21 10:21 AM, Xia, Chenbo wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coque...@redhat.com>
> >> Sent: Friday, June 18, 2021 4:01 PM
> >> To: Xia, Chenbo <chenbo....@intel.com>; dev@dpdk.org;
> >> david.march...@redhat.com
> >> Cc: sta...@dpdk.org
> >> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> >>
> >> On 6/18/21 6:34 AM, Xia, Chenbo wrote:
> >>> Hi Maxime,
> >>>
> >>>> -----Original Message-----
> >>>> From: Maxime Coquelin <maxime.coque...@redhat.com>
> >>>> Sent: Thursday, June 17, 2021 11:38 PM
> >>>> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo <chenbo....@intel.com>
> >>>> Cc: Maxime Coquelin <maxime.coque...@redhat.com>; sta...@dpdk.org
> >>>> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
> >>>>
> >>>> Since the Vhost-user device initialization has been reworked,
> >>>> enabling the application to start using the device as soon as
> >>>> the first queue pair is ready, NUMA reallocation no more
> >>>> happened on queue pairs other than the first one since
> >>>> numa_realloc() was returning early if the device was running.
> >>>>
> >>>> This patch fixes this issue by only preventing the device
> >>>> metadata to be allocated if the device is running. For the
> >>>> virtqueues, a vring state change notification is sent to
> >>>> notify the application of its disablement. Since the callback
> >>>> is supposed to be blocking, it is safe to reallocate it
> >>>> afterwards.
> >>>
> >>> Is there a corner case? Numa_realloc may happen during vhost-user msg
> >>> set_vring_addr/kick, set_mem_table and iotlb msg.
> >>> And iotlb msg will not take vq access lock. It could happen when
> >>> numa_realloc happens on iotlb msg and app accesses vq in the meantime?
> >>
> >> I think we are safe wrt numa_realloc(), because the app's
> >> .vring_state_changed() callback only returns once it is no longer
> >> processing the rings.
> >
> > Yes, I think it should be. But in this iotlb msg case (take vhost pmd for
> > example), can't vhost pmd still access vq since the vq access lock is not
> > taken? Am I missing something?
>
> Vhost PMD sends RTE_ETH_EVENT_QUEUE_STATE, and my assumption was that
> the application would stop processing the rings when handling this
> event and only return from the callback when it's done, but it seems
> that's not done, at least in testpmd. So we may not rely on that after
> all :/.
>
> We cannot rely on the VQ's access lock since the goal of numa_realloc is
> to reallocate the vhost_virtqueue itself, which contains the access_lock.
> Relying on it would cause a use after free.
>
> Maybe the safest thing to do is to just skip the reallocation if
> vq->ready == true.
>
> Having vq->ready == true means we already received all the vrings info
> from QEMU, which means the driver has already initialized the device.
>
> It should not change runtime behavior compared to this patch since it
> would not reallocate anyway.
>
> What do you think?

That sounds good to me 😊

Thanks,
Chenbo

> > Thanks,
> > Chenbo
> >
> >>
> >>> Thanks,
> >>> Chenbo
> >>>
> >>>> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
> >>>> Cc: sta...@dpdk.org
> >>>>
> >>>> Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
> >>>> ---
> >>>>  lib/vhost/vhost_user.c | 11 ++++++++---
> >>>>  1 file changed, 8 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> >>>> index 0e9e26ebe0..6e7b327ef8 100644
> >>>> --- a/lib/vhost/vhost_user.c
> >>>> +++ b/lib/vhost/vhost_user.c
> >>>> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  	struct batch_copy_elem *new_batch_copy_elems;
> >>>>  	int ret;
> >>>>
> >>>> -	if (dev->flags & VIRTIO_DEV_RUNNING)
> >>>> -		return dev;
> >>>> -
> >>>>  	old_dev = dev;
> >>>>  	vq = old_vq = dev->virtqueue[index];
> >>>>
> >>>> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  		return dev;
> >>>>  	}
> >>>>  	if (oldnode != newnode) {
> >>>> +		if (vq->ready) {
> >>>> +			vq->ready = false;
> >>>> +			vhost_user_notify_queue_state(dev, index, 0);
> >>>> +		}
> >>>> +
> >>>>  		VHOST_LOG_CONFIG(INFO,
> >>>>  			"reallocate vq from %d to %d node\n", oldnode, newnode);
> >>>>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
> >>>> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
> >>>>  		rte_free(old_vq);
> >>>>  	}
> >>>>
> >>>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
> >>>> +		goto out;
> >>>> +
> >>>>  	/* check if we need to reallocate dev */
> >>>>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
> >>>>  			MPOL_F_NODE | MPOL_F_ADDR);
> >>>> --
> >>>> 2.31.1
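
For reference, the alternative agreed on in this thread (skip the reallocation whenever vq->ready is true, rather than clearing the flag and notifying the application) could be sketched roughly as below. This is only an illustration and not the actual follow-up patch: struct mock_vq and mock_numa_realloc() are hypothetical stand-ins, plain malloc()/free() replace rte_malloc_socket()/rte_free(), and the NUMA node is a simulated field instead of a get_mempolicy() lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct vhost_virtqueue. */
struct mock_vq {
	int access_lock;	/* placeholder for the lock embedded in the vq */
	bool ready;		/* set once all vring info has been received */
	int node;		/* simulated NUMA node the struct lives on */
};

/*
 * Move the virtqueue to new_node unless it is already marked ready.
 * Returns the (possibly new) virtqueue pointer.
 */
struct mock_vq *
mock_numa_realloc(struct mock_vq *vq, int new_node)
{
	struct mock_vq *new_vq;

	assert(vq != NULL);

	/*
	 * A ready vq may be in use by the application. The lock lives
	 * inside the struct being freed, so it cannot protect the move:
	 * leave the vq where it is.
	 */
	if (vq->ready)
		return vq;

	if (vq->node == new_node)
		return vq;

	new_vq = malloc(sizeof(*new_vq));
	if (new_vq == NULL)
		return vq;	/* keep the old allocation on failure */

	memcpy(new_vq, vq, sizeof(*new_vq));
	new_vq->node = new_node;
	free(vq);		/* the embedded lock is freed with it */
	return new_vq;
}
```

With this guard a ready virtqueue keeps its original pointer, so a data path that already holds that pointer never sees it freed underneath it. The cost is that a queue which became ready on the wrong node stays there.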