> Subject: [EXTERNAL] Re: [PATCH v2 3/8] net/netvsc: add multi-process VF device > removal support > > On Fri, 20 Feb 2026 18:45:22 -0800 > [email protected] wrote: > > > From: Long Li <[email protected]> > > > > When a VF device is hot-removed by the primary process, secondary > > processes must be notified to release their references to the VF port. > > Without this, secondary processes retain stale port references leading > > to crashes or undefined behavior when accessing the removed device. > > > > This patch adds multi-process communication infrastructure to > > coordinate VF removal across all processes: > > > > - Shared memory (netvsc_shared_data) to track secondary process count > > - Multi-process message handlers (NETVSC_MP_REQ_VF_REMOVE) to notify > > secondaries when primary removes a VF device > > - Secondary handler calls rte_eth_dev_release_port() to cleanly release > > the VF port in its own process space > > - Primary waits for all secondaries to acknowledge removal before > > proceeding > > > > The implementation uses rte_mp_request_sync() to ensure all secondary > > processes respond within NETVSC_MP_REQ_TIMEOUT_SEC (5 seconds) before > > the primary completes the VF removal sequence. > > > > Fixes: 7fc4c0997b04 ("net/netvsc: fix hot adding multiple VF PCI > > devices") > > Cc: [email protected] > > > > Signed-off-by: Long Li <[email protected]> > > AI review feedback: > > **Patch 3 (net/netvsc: add multi-process VF device removal support)** — adds > MP infrastructure to coordinate VF removal across processes. > > **Three concerns:** > > 1. **Race window on `secondary_cnt` during probe (~50% confidence).** The > secondary increments `secondary_cnt` *after* `rte_eth_dev_probing_finish()`, > but `netvsc_init_once()` and device setup happen before that. A primary > removing a VF during this window sees `secondary_cnt == 0`, skips > `rte_mp_request_sync()`, and the secondary never gets notified — leaving it > with > a stale VF port reference. > > 2. **Misleading "VF is already locked by primary" comment.** In > `netvsc_secondary_handle_device_remove()`, the code reads `hv->vf_ctx.vf_port` > from shared memory with a comment saying the primary's lock protects it. But > `rte_rwlock_t` is process-local — it doesn't work cross-process. The actual > synchronization comes from the MP message exchange itself (the primary sends > the message after setting state, the secondary handles it after receiving). > The > comment should reflect that. > > 3. **`netvsc_init_once()` not protected by the spinlock.** It's called from > `eth_hn_probe()` without `netvsc_shared_data_lock`, while > `netvsc_uninit_once()` is called *inside* the lock. If two netvsc devices > probe > concurrently in the same process, the `init_done` flag check could race. Low > risk > since DPDK probe is typically single-threaded, but inconsistent with the > uninit > path. > > **Minor style notes:** `MZ_NETVSC_SHARED_DATA` uses macro-style naming > but is a `const char *` variable — could be a `#define` for consistency with > `NETVSC_MP_NAME`. The stub `netvsc_mp_primary_handle()` that always > returns 0 is benign but could mask future protocol errors. > > **Overall:** Sound infrastructure, suitable for merging with the comment fix > and > awareness of the probe-time race window.
I'm sending v3 addressing those comments. Thanks, Long

