The vhost_user_read_cb() and rte_vhost_driver_unregister() can be called at the same time by 2 threads. Eg thread1 calls vhost_user_read_cb() and removes the vsocket from conn_list, then thread2 calls rte_vhost_driver_unregister() and frees the vsocket since it is NOT in the conn_list. So thread1 will access invalid memory when trying to reconnect.
The fix is to move the "removing of vsocket from conn_list" to end of the vhost_user_read_cb(), then avoid the race condition. The core trace is Program terminated with signal 11, Segmentation fault. Signed-off-by: Zhike Wang <wangzh...@jd.com> --- lib/librte_vhost/socket.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c index ebb2ff6..01a9dec 100644 --- a/lib/librte_vhost/socket.c +++ b/lib/librte_vhost/socket.c @@ -318,16 +318,16 @@ struct vhost_user { vhost_destroy_device(conn->vid); + if (vsocket->reconnect) { + create_unix_socket(vsocket); + vhost_user_start_client(vsocket); + } + pthread_mutex_lock(&vsocket->conn_mutex); TAILQ_REMOVE(&vsocket->conn_list, conn, next); pthread_mutex_unlock(&vsocket->conn_mutex); free(conn); - - if (vsocket->reconnect) { - create_unix_socket(vsocket); - vhost_user_start_client(vsocket); - } } } -- 1.8.3.1