On Tue, Jun 10, 2025 at 10:08:49AM -0500, Jaehoon Kim wrote:
> When the source VM attempts to connect to the destination VM's Unix
> domain socket (cpr.sock) during a cpr-transfer test, race conditions can
> occur if the socket file isn't ready. This can lead to connection
> failures when running tests.
> 
> This patch creates and listens on the socket in advance, and passes the
> pre-listened FD directly. This avoids timing issues and improves the
> reliability of CPR tests.
> 
> Reviewed-by: Jason J. Herne <jjhe...@linux.ibm.com>
> Signed-off-by: Jaehoon Kim <jh...@linux.ibm.com>
> 
> ---
> Changes since v1:
> - In v1, the patch added a wait loop to poll the existence of the socket
>   file (cpr_validate_socket_path()).
> 
> - This version instead creates the socket beforehand and passes its FD
>   to the destination QEMU, eliminating the race condition entirely.
> 
> - Commit title and message changed accordingly.
> ---
>  migration/cpr-transfer.c          |  3 +-
>  tests/qtest/migration/cpr-tests.c | 72 ++++++++++++++++++++++++++++++-
>  2 files changed, 72 insertions(+), 3 deletions(-)

> diff --git a/tests/qtest/migration/cpr-tests.c 
> b/tests/qtest/migration/cpr-tests.c
> index 5536e14610..6f90160e21 100644
> --- a/tests/qtest/migration/cpr-tests.c
> +++ b/tests/qtest/migration/cpr-tests.c
> @@ -50,6 +50,51 @@ static void *test_mode_transfer_start(QTestState *from, 
> QTestState *to)
>      return NULL;
>  }
>  
> +/*
> + * Create a pre-listened UNIX domain socket at the specified path.
> + *
> + * This is used to eliminate a race condition that can occur
> + * intermittently in qtest during CPR tests. By pre-creating and
> + * listening on the socket, we avoid timing-related issues.
> + */
> +static int setup_socket_listener(const char *path)
> +{
> +    struct sockaddr_un un;
> +    size_t pathlen;
> +    int sock_fd;
> +
> +    sock_fd = socket(PF_UNIX, SOCK_STREAM, 0);
> +    if (sock_fd < 0) {
> +        g_test_message("Failed to create Unix socket");
> +        return -1;
> +    }
> +
> +    pathlen = strlen(path);
> +    if (pathlen >= sizeof(un.sun_path)) {
> +        g_test_message("UNIX socket path '%s' is too long", path);
> +        close(sock_fd);
> +        return -1;
> +    }
> +
> +    memset(&un, 0, sizeof(un));
> +    un.sun_family = AF_UNIX;
> +    strncpy(un.sun_path, path, sizeof(un.sun_path) - 1);
> +
> +    if (bind(sock_fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
> +        g_test_message("Failed to bind socket to %s", path);
> +        close(sock_fd);
> +        return -1;
> +    }
> +
> +    if (listen(sock_fd, 1) < 0) {
> +        g_test_message("Failed to listen on socket %s", path);
> +        close(sock_fd);
> +        return -1;
> +    }
> +
> +    return sock_fd;
> +}

This is effectively re-implementing 'unix_listen', so just use
that function.

> @@ -75,6 +120,29 @@ static void test_mode_transfer_common(bool incoming_defer)
>          "              'path': '%s' } } ]",
>          mig_path);
>  
> +    /*
> +     * Determine socket address type and value.
> +     * If socket creation fails, provide the socket path to the target,
> +     * so it can create the Unix domain socket itself.
> +     * Otherwise, use the pre-listened socket file descriptor directly.
> +     */
> +    int cpr_sockfd = setup_socket_listener(cpr_path);
> +
> +    if (cpr_sockfd < 0) {

A failure of this function (or in future 'unix_listen') shouldn't
trigger any fallback logic - we should report it and fail thue
test.

> +        addr_type = g_strdup("unix");
> +        addr_key = g_strdup("path");
> +        addr_value = g_strdup(cpr_path);
> +    } else {
> +        addr_type = g_strdup("fd");
> +        addr_key = g_strdup("str");
> +        addr_value = g_strdup_printf("%d", cpr_sockfd);
> +    }
> +
> +    opts_target = g_strdup_printf("-incoming cpr,addr.transport=socket,"
> +                                  "addr.type=%s,addr.%s=%s %s",
> +                                  addr_type, addr_key, addr_value, opts);
> +
> +
>      MigrateCommon args = {
>          .start.opts_source = opts,
>          .start.opts_target = opts_target,
> -- 
> 2.49.0
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Reply via email to