Re: [PATCH 2/2] nbd/server: Allow users to adjust handshake limit in QMP
On Wed, Oct 02, 2024 at 05:14:42PM GMT, Markus Armbruster wrote:
> >> Are we confident we'll never need less than a full second?
> >
> > Hmm, recent "[PATCH v2] chardev: introduce 'reconnect-ms' and deprecate
> > 'reconnect'" shows that at least sometimes second is not enough precision.
> >
> > Maybe, using milliseconds consistently for all relatively short time
> > intervals in QAPI would be a good rule?
>
> Ideally, we'd use a single unit for time: nanoseconds.  But we missed
> that chance long ago, and now are stuck with a mix of seconds,
> milliseconds, microseconds, and nanoseconds.
>
> I think a good rule is to pick the first from this list that will surely
> provide all the precision we'll ever need.
>
> In this case, milliseconds should do.

I'll use milliseconds in the next revision.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH for-9.1 6/9] block/nbd: Use URI parsing code from glib
On Sun, Sep 22, 2024 at 08:51:22PM GMT, Richard W.M. Jones wrote:
> On Thu, Mar 28, 2024 at 02:13:42PM +, Richard W.M. Jones wrote:
> > On Thu, Mar 28, 2024 at 03:06:03PM +0100, Thomas Huth wrote:
> > > Since version 2.66, glib has useful URI parsing functions, too.
> > > Use those instead of the QEMU-internal ones to be finally able
> > > to get rid of the latter. The g_uri_get_host() also takes care
> > > of removing the square brackets from IPv6 addresses, so we can
> > > drop that part of the QEMU code now, too.
> > >
> > > -    p = uri->path ? uri->path : "";
> > > +    p = g_uri_get_path(uri) ?: "";
> > >      if (p[0] == '/') {
> > >          p++;
> > >      }
> >
> > Looks ok,
> >
> > Reviewed-by: Richard W.M. Jones
>
> Or maybe not.  This caused a regression in the nbdkit test suite (when
> we use qemu-img from 9.1).  It seems the exportname part of the NBD
> URI gets munged:
>
> https://gitlab.com/qemu-project/qemu/-/issues/2584

To be more specific, it looks like
g_uri_get_path("./name//with//..//slashes") is getting munged to
"name/slashes".  That is, glib is blindly assuming that ./ and XXX/../
can be dropped, and // can be simplified to /, which may be true for
arbitrary file names but not true for arbitrary URIs (since URIs have
application-specific semantics, which may not match path name
traversal semantics).

Looks like we need to report a bug to glib, and/or see if glib's URI
functions have a flag for turning off this unwanted munging.  Or we
may just want to document this corner case change as intentional.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
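The distinction being argued above — filesystem-style path normalization versus treating a URI path as opaque — can be demonstrated outside of glib. This is not QEMU code, just a Python sketch: `urlsplit` keeps the path verbatim (what an NBD client needs, since the path is the export name), while `posixpath.normpath` performs the same dot-segment and slash collapsing that was observed from `g_uri_get_path()`.

```python
import posixpath
from urllib.parse import urlsplit

# An NBD URI whose export name deliberately contains ".", "..", and "//".
# NBD treats the path component as an opaque export name, not a file path.
uri = "nbd://localhost/./name//with//..//slashes"

# urlsplit preserves the path byte-for-byte, like QEMU's old URI parser:
raw_path = urlsplit(uri).path
print(raw_path)    # /./name//with//..//slashes

# Filesystem-style normalization (comparable to the observed glib munging)
# silently collapses it to a different export name:
print(posixpath.normpath(raw_path))    # /name/slashes
```

The two printed values naming *different* exports is exactly the regression hit by the nbdkit test suite.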
Re: [PATCH 2/2] qapi/block-core: Drop drive-backup's "Any other error" documentation
On Wed, Sep 11, 2024 at 03:24:59PM GMT, Markus Armbruster wrote:
> We've always been rather lax about documenting errors.  Many "Errors"
> sections are obviously not exhaustive.  Only drive-backup is explicit
> about this: "Any other error returns a GenericError".
>
> Not useful.  Drop.
>
> Signed-off-by: Markus Armbruster
> ---
>  qapi/block-core.json | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Eric Blake

>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 82f59a7758..485388be32 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1853,7 +1853,6 @@
>  #
>  # Errors:
>  #     - If @device does not exist, DeviceNotFound
> -#     - Any other error returns a GenericError.
>  #
>  # Since: 1.3
>  #
> -- 
> 2.46.0
>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH 1/2] qapi: Drop "with an explanation" from error descriptions
On Wed, Sep 11, 2024 at 03:24:58PM GMT, Markus Armbruster wrote:
> All errors come with an explanation, namely the human-readable error
> message in the error response's @desc member.  Drop the redundant
> "with an explanation" phrase.
>
> Signed-off-by: Markus Armbruster
> ---
>  qapi/block-core.json | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)

Reviewed-by: Eric Blake

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH 01/39] docs/spin: replace assert(0) with g_assert_not_reached()
On Wed, Sep 11, 2024 at 07:33:59AM GMT, Eric Blake wrote:
> On Tue, Sep 10, 2024 at 03:15:28PM GMT, Pierrick Bouvier wrote:
> > Signed-off-by: Pierrick Bouvier
> > ---
>
> A general suggestion for the entire series: please use a commit
> message that explains why this is a good idea.  Even something as
> boiler-plate as "refer to commit XXX for rationale" that can be
> copy-pasted into all the other commits is better than nothing,
> although a self-contained message is best.  Maybe:
>
> This patch is part of a series that moves towards a consistent use of
> g_assert_not_reached() rather than an ad hoc mix of different
> assertion mechanisms.

Or summarize your cover letter:

Use of assert(false) can trip spurious control flow warnings from
some versions of gcc:
https://lore.kernel.org/qemu-devel/54bb02a6-1b12-460a-97f6-3f478ef76...@linaro.org/

Solve that by unifying the code base on g_assert_not_reached()
instead.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH 01/39] docs/spin: replace assert(0) with g_assert_not_reached()
On Tue, Sep 10, 2024 at 03:15:28PM GMT, Pierrick Bouvier wrote:
> Signed-off-by: Pierrick Bouvier
> ---

A general suggestion for the entire series: please use a commit
message that explains why this is a good idea.  Even something as
boiler-plate as "refer to commit XXX for rationale" that can be
copy-pasted into all the other commits is better than nothing,
although a self-contained message is best.  Maybe:

This patch is part of a series that moves towards a consistent use of
g_assert_not_reached() rather than an ad hoc mix of different
assertion mechanisms.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
[PULL 1/1] nbd/server: CVE-2024-7409: Avoid use-after-free when closing server
Commit 3e7ef738 plugged the use-after-free of the global nbd_server
object, but overlooked a use-after-free of nbd_server->listener.
Although this race is harder to hit, notice that our shutdown path
first drops the reference count of nbd_server->listener, then triggers
actions that can result in a pending client reaching the
nbd_blockdev_client_closed() callback, which in turn calls
qio_net_listener_set_client_func on a potentially stale object.

If we know we don't want any more clients to connect, and have already
told the listener socket to shut down, then we should not be trying to
update the listener socket's associated function.

Reproducer:

> #!/usr/bin/python3
>
> import os
> from threading import Thread
>
> def start_stop():
>     while 1:
>         os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-start", "arguments":{"addr":{"type":"unix","data":{"path":"/tmp/nbd-sock"}}}}\'')
>         os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-stop"}\'')
>
> def nbd_list():
>     while 1:
>         os.system('/path/to/build/qemu-nbd -L -k /tmp/nbd-sock')
>
> def test():
>     sst = Thread(target=start_stop)
>     sst.start()
>     nlt = Thread(target=nbd_list)
>     nlt.start()
>
>     sst.join()
>     nlt.join()
>
> test()

Fixes: CVE-2024-7409
Fixes: 3e7ef738c8 ("nbd/server: CVE-2024-7409: Close stray clients at server-stop")
CC: qemu-sta...@nongnu.org
Reported-by: Andrey Drobyshev
Signed-off-by: Eric Blake
Message-ID: <20240822143617.800419-2-ebl...@redhat.com>
Reviewed-by: Stefan Hajnoczi
---
 blockdev-nbd.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index f73409ae494..b36f41b7c5a 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -92,10 +92,13 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,

 static void nbd_update_server_watch(NBDServerData *s)
 {
-    if (!s->max_connections || s->connections < s->max_connections) {
-        qio_net_listener_set_client_func(s->listener, nbd_accept, NULL, NULL);
-    } else {
-        qio_net_listener_set_client_func(s->listener, NULL, NULL, NULL);
+    if (s->listener) {
+        if (!s->max_connections || s->connections < s->max_connections) {
+            qio_net_listener_set_client_func(s->listener, nbd_accept, NULL,
+                                             NULL);
+        } else {
+            qio_net_listener_set_client_func(s->listener, NULL, NULL, NULL);
+        }
     }
 }

@@ -113,6 +116,7 @@ static void nbd_server_free(NBDServerData *server)
      */
     qio_net_listener_disconnect(server->listener);
     object_unref(OBJECT(server->listener));
+    server->listener = NULL;
     QLIST_FOREACH_SAFE(conn, &server->conns, next, tmp) {
         qio_channel_shutdown(QIO_CHANNEL(conn->cioc),
                              QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
-- 
2.46.0
[PATCH for-9.1] nbd/server: CVE-2024-7409: Avoid use-after-free when closing server
Commit 3e7ef738 plugged the use-after-free of the global nbd_server
object, but overlooked a use-after-free of nbd_server->listener.
Although this race is harder to hit, notice that our shutdown path
first drops the reference count of nbd_server->listener, then triggers
actions that can result in a pending client reaching the
nbd_blockdev_client_closed() callback, which in turn calls
qio_net_listener_set_client_func on a potentially stale object.

If we know we don't want any more clients to connect, and have already
told the listener socket to shut down, then we should not be trying to
update the listener socket's associated function.

Reproducer:

> #!/usr/bin/python3
>
> import os
> from threading import Thread
>
> def start_stop():
>     while 1:
>         os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-start", "arguments":{"addr":{"type":"unix","data":{"path":"/tmp/nbd-sock"}}}}\'')
>         os.system('virsh qemu-monitor-command VM \'{"execute": "nbd-server-stop"}\'')
>
> def nbd_list():
>     while 1:
>         os.system('/path/to/build/qemu-nbd -L -k /tmp/nbd-sock')
>
> def test():
>     sst = Thread(target=start_stop)
>     sst.start()
>     nlt = Thread(target=nbd_list)
>     nlt.start()
>
>     sst.join()
>     nlt.join()
>
> test()

Fixes: CVE-2024-7409
Fixes: 3e7ef738c8 ("nbd/server: CVE-2024-7409: Close stray clients at server-stop")
CC: qemu-sta...@nongnu.org
Reported-by: Andrey Drobyshev
Signed-off-by: Eric Blake
---
 blockdev-nbd.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index f73409ae494..b36f41b7c5a 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -92,10 +92,13 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,

 static void nbd_update_server_watch(NBDServerData *s)
 {
-    if (!s->max_connections || s->connections < s->max_connections) {
-        qio_net_listener_set_client_func(s->listener, nbd_accept, NULL, NULL);
-    } else {
-        qio_net_listener_set_client_func(s->listener, NULL, NULL, NULL);
+    if (s->listener) {
+        if (!s->max_connections || s->connections < s->max_connections) {
+            qio_net_listener_set_client_func(s->listener, nbd_accept, NULL,
+                                             NULL);
+        } else {
+            qio_net_listener_set_client_func(s->listener, NULL, NULL, NULL);
+        }
     }
 }

@@ -113,6 +116,7 @@ static void nbd_server_free(NBDServerData *server)
      */
     qio_net_listener_disconnect(server->listener);
     object_unref(OBJECT(server->listener));
+    server->listener = NULL;
     QLIST_FOREACH_SAFE(conn, &server->conns, next, tmp) {
         qio_channel_shutdown(QIO_CHANNEL(conn->cioc),
                              QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
-- 
2.46.0
Re: [PATCH v1 1/1] block/file-posix: Avoid maybe-uninitialized warning
On Mon, Aug 12, 2024 at 04:43:23PM GMT, Edgar E. Iglesias wrote:
> From: "Edgar E. Iglesias"
>
> Avoid a maybe-uninitialized warning in raw_refresh_zoned_limits()
> by initializing zoned.
>
> With GCC 14.1.0:
> In function ‘raw_refresh_zoned_limits’,
>     inlined from ‘raw_refresh_limits’ at ../qemu/block/file-posix.c:1522:5:
> ../qemu/block/file-posix.c:1405:17: error: ‘zoned’ may be used uninitialized [-Werror=maybe-uninitialized]
>  1405 |     if (ret < 0 || zoned == BLK_Z_NONE) {
>       |                    ^~
> ../qemu/block/file-posix.c:1401:20: note: ‘zoned’ was declared here
>  1401 |     BlockZoneModel zoned;
>       |                    ^
> cc1: all warnings being treated as errors
>
> Signed-off-by: Edgar E. Iglesias
> ---
>  block/file-posix.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake

>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index ff928b5e85..90fa54352c 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1398,7 +1398,7 @@ static void raw_refresh_zoned_limits(BlockDriverState *bs, struct stat *st,
>                                       Error **errp)
>  {
>      BDRVRawState *s = bs->opaque;
> -    BlockZoneModel zoned;
> +    BlockZoneModel zoned = BLK_Z_NONE;
>      int ret;
>
>      ret = get_sysfs_zoned_model(st, &zoned);
> -- 
> 2.43.0
>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PULL 5/5] nbd/server: CVE-2024-7409: Close stray clients at server-stop
On Sun, Aug 11, 2024 at 11:02:52AM GMT, Michael Tokarev wrote:
> 09.08.2024 00:53, Eric Blake wrote:
> > A malicious client can attempt to connect to an NBD server, and then
> > intentionally delay progress in the handshake, including if it does
> > not know the TLS secrets.  Although the previous two patches reduce
>
> Eric, from the 5-patch series, only this last patch is Cc'd for stable,
> but it obviously does not work without all 4 previous patches.  Do you
> mean the whole series should be applied to -stable?
>
> I picked up patches 2-5 for 7.2 and 9.0.

You are correct that patch 5 in isolation won't work due to missing
prerequisites, but also that patch 1 is fluff that doesn't need
backporting; my apologies for not more judiciously adding the cc to
all 4 patches worth the backport effort.

I'm in the middle of efforts to backport only 2-5 to various RHEL
releases, so your choice to do the same for 7.2 and 9.0 matches what
I'm doing downstream.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
[PATCH 2/2] nbd/server: Allow users to adjust handshake limit in QMP
Although defaulting the handshake limit to 10 seconds was a nice QoI
change to weed out intentionally slow clients, it can interfere with
integration testing done with manual NBD_OPT commands over 'nbdsh
--opt-mode'.  Expose a QMP knob 'handshake-max-secs' to allow the user
to alter the timeout away from the default.

The parameter name here intentionally matches the spelling of the
constant added in commit fb1c2aaa98, and not the command-line spelling
added in the previous patch for qemu-nbd; that's because in QMP,
longer names serve as good self-documentation, and unlike the command
line, machines don't have problems generating longer spellings.

Signed-off-by: Eric Blake
---
 qapi/block-export.json         | 10 ++++++++++
 include/block/nbd.h            |  6 +++---
 block/monitor/block-hmp-cmds.c |  4 ++--
 blockdev-nbd.c                 | 26 ++++++++++++++++++++------
 4 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index ce33fe378df..c110dd375ad 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -17,6 +17,10 @@
 #
 # @addr: Address on which to listen.
 #
+# @handshake-max-secs: Time limit, in seconds, at which a client that
+#     has not completed the negotiation handshake will be disconnected,
+#     or 0 for no limit (since 9.2; default: 10).
+#
 # @tls-creds: ID of the TLS credentials object (since 2.6).
 #
 # @tls-authz: ID of the QAuthZ authorization object used to validate
@@ -34,6 +38,7 @@
 ##
 { 'struct': 'NbdServerOptions',
   'data': { 'addr': 'SocketAddress',
+            '*handshake-max-secs': 'uint32',
            '*tls-creds': 'str',
            '*tls-authz': 'str',
            '*max-connections': 'uint32' } }
@@ -52,6 +57,10 @@
 #
 # @addr: Address on which to listen.
 #
+# @handshake-max-secs: Time limit, in seconds, at which a client that
+#     has not completed the negotiation handshake will be disconnected,
+#     or 0 for no limit (since 9.2; default: 10).
+#
 # @tls-creds: ID of the TLS credentials object (since 2.6).
 #
 # @tls-authz: ID of the QAuthZ authorization object used to validate
@@ -72,6 +81,7 @@
 ##
 { 'command': 'nbd-server-start',
   'data': { 'addr': 'SocketAddressLegacy',
+            '*handshake-max-secs': 'uint32',
            '*tls-creds': 'str',
            '*tls-authz': 'str',
            '*max-connections': 'uint32' },
diff --git a/include/block/nbd.h b/include/block/nbd.h
index d4f8b21aecc..92987c76fd6 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -428,9 +428,9 @@ void nbd_client_put(NBDClient *client);
 void nbd_server_is_qemu_nbd(int max_connections);
 bool nbd_server_is_running(void);
 int nbd_server_max_connections(void);
-void nbd_server_start(SocketAddress *addr, const char *tls_creds,
-                      const char *tls_authz, uint32_t max_connections,
-                      Error **errp);
+void nbd_server_start(SocketAddress *addr, uint32_t handshake_max_secs,
+                      const char *tls_creds, const char *tls_authz,
+                      uint32_t max_connections, Error **errp);
 void nbd_server_start_options(NbdServerOptions *arg, Error **errp);

 /* nbd_read
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index bdf2eb50b68..cae6e3064a0 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -402,8 +402,8 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict)
         goto exit;
     }

-    nbd_server_start(addr, NULL, NULL, NBD_DEFAULT_MAX_CONNECTIONS,
-                     &local_err);
+    nbd_server_start(addr, NBD_DEFAULT_HANDSHAKE_MAX_SECS, NULL, NULL,
+                     NBD_DEFAULT_MAX_CONNECTIONS, &local_err);
     qapi_free_SocketAddress(addr);
     if (local_err != NULL) {
         goto exit;
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index f73409ae494..0cdabfbd82c 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -28,6 +28,7 @@ typedef struct NBDConn {

 typedef struct NBDServerData {
     QIONetListener *listener;
+    uint32_t handshake_max_secs;
     QCryptoTLSCreds *tlscreds;
     char *tlsauthz;
     uint32_t max_connections;
@@ -84,8 +85,7 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,
     nbd_update_server_watch(nbd_server);

     qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server");
-    /* TODO - expose handshake timeout as QMP option */
-    nbd_client_new(cioc, NBD_DEFAULT_HANDSHAKE_MAX_SECS,
+    nbd_client_new(cioc, nbd_server->handshake_max_secs,
                    nbd_server->tlscreds, nbd_server->tlsauthz,
                    nbd_blockdev_client_closed, conn);
 }

@@ -158,9
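For reference, the knob added by this patch would be exercised from QMP roughly as follows; the socket path, timeout value, and max-connections value are illustrative, not prescriptive:

```json
{ "execute": "nbd-server-start",
  "arguments": {
    "addr": { "type": "unix",
              "data": { "path": "/tmp/nbd.sock" } },
    "handshake-max-secs": 60,
    "max-connections": 10 } }
```

Passing "handshake-max-secs": 0 restores the pre-limit behavior of waiting indefinitely, which is useful when single-stepping an NBD_OPT exchange under a debugger.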
[PATCH 1/2] qemu-nbd: Allow users to adjust handshake limit
Although defaulting the handshake limit to 10 seconds was a nice QoI
change to weed out intentionally slow clients, it can interfere with
integration testing done with manual NBD_OPT commands over 'nbdsh
--opt-mode'.  Expose a command line option to allow the user to alter
the timeout away from the default.

This option is unlikely to be used in enough scenarios to warrant a
short option letter.  The option --handshake-limit intentionally
differs from the name of the constant added in commit fb1c2aaa98
(limit instead of max_secs) and the QMP name to be added in the next
commit; this is because typing a longer command-line name is
undesirable and there is sufficient --help text to document the units.

Signed-off-by: Eric Blake
---
 docs/tools/qemu-nbd.rst |  5 +++++
 qemu-nbd.c              | 41 ++++++++++++++++++++++++---------------
 2 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst
index 329f44d9895..f55c10eb39c 100644
--- a/docs/tools/qemu-nbd.rst
+++ b/docs/tools/qemu-nbd.rst
@@ -154,6 +154,11 @@ driver options if :option:`--image-opts` is specified.

   Set the NBD volume export description, as a human-readable string.

+.. option:: --handshake-limit=N
+
+  Set the timeout for a client to successfully complete its handshake
+  to N seconds (default 10), or 0 for no limit.
+
 .. option:: -L, --list

   Connect as a client and list all details about the exports exposed by
diff --git a/qemu-nbd.c b/qemu-nbd.c
index a186d2e1190..4a611b4a4ed 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -57,19 +57,20 @@
 #define HAVE_NBD_DEVICE 0
 #endif

-#define SOCKET_PATH                "/var/lock/qemu-nbd-%s"
-#define QEMU_NBD_OPT_CACHE         256
-#define QEMU_NBD_OPT_AIO           257
-#define QEMU_NBD_OPT_DISCARD       258
-#define QEMU_NBD_OPT_DETECT_ZEROES 259
-#define QEMU_NBD_OPT_OBJECT        260
-#define QEMU_NBD_OPT_TLSCREDS      261
-#define QEMU_NBD_OPT_IMAGE_OPTS    262
-#define QEMU_NBD_OPT_FORK          263
-#define QEMU_NBD_OPT_TLSAUTHZ      264
-#define QEMU_NBD_OPT_PID_FILE      265
-#define QEMU_NBD_OPT_SELINUX_LABEL 266
-#define QEMU_NBD_OPT_TLSHOSTNAME   267
+#define SOCKET_PATH                  "/var/lock/qemu-nbd-%s"
+#define QEMU_NBD_OPT_CACHE           256
+#define QEMU_NBD_OPT_AIO             257
+#define QEMU_NBD_OPT_DISCARD         258
+#define QEMU_NBD_OPT_DETECT_ZEROES   259
+#define QEMU_NBD_OPT_OBJECT          260
+#define QEMU_NBD_OPT_TLSCREDS        261
+#define QEMU_NBD_OPT_IMAGE_OPTS      262
+#define QEMU_NBD_OPT_FORK            263
+#define QEMU_NBD_OPT_TLSAUTHZ        264
+#define QEMU_NBD_OPT_PID_FILE        265
+#define QEMU_NBD_OPT_SELINUX_LABEL   266
+#define QEMU_NBD_OPT_TLSHOSTNAME     267
+#define QEMU_NBD_OPT_HANDSHAKE_LIMIT 268

 #define MBR_SIZE 512

@@ -80,6 +81,7 @@ static int nb_fds;
 static QIONetListener *server;
 static QCryptoTLSCreds *tlscreds;
 static const char *tlsauthz;
+static int handshake_limit = NBD_DEFAULT_HANDSHAKE_MAX_SECS;

 static void usage(const char *name)
 {
@@ -101,6 +103,7 @@ static void usage(const char *name)
 "  -v, --verbose             display extra debugging information\n"
 "  -x, --export-name=NAME    expose export by name (default is empty string)\n"
 "  -D, --description=TEXT    export a human-readable description\n"
+"  --handshake-limit=N       limit client's handshake to N seconds (default 10)\n"
 "\n"
 "Exposing part of the image:\n"
 "  -o, --offset=OFFSET       offset into the image\n"
@@ -390,8 +393,7 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,
     nb_fds++;
     nbd_update_server_watch();

-    /* TODO - expose handshake timeout as command line option */
-    nbd_client_new(cioc, NBD_DEFAULT_HANDSHAKE_MAX_SECS,
+    nbd_client_new(cioc, handshake_limit,
                    tlscreds, tlsauthz, nbd_client_closed, NULL);
 }

@@ -569,6 +571,8 @@ int main(int argc, char **argv)
         { "object", required_argument, NULL, QEMU_NBD_OPT_OBJECT },
         { "export-name", required_argument, NULL, 'x' },
         { "description", required_argument, NULL, 'D' },
+        { "handshake-limit", required_argument, NULL,
+          QEMU_NBD_OPT_HANDSHAKE_LIMIT },
         { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS },
         { "tls-hostname", required_argument, NULL, QEMU_NBD_OPT_TLSHOSTNAME },
         { "tls-authz", required_argument, NULL, QEMU_NBD_OPT_TLSAUTHZ },
@@ -815,6 +819,13 @@ int main(int argc, char **argv)
         case QEMU_NBD_OPT_SELINUX_LABEL:
             selinux_label = optarg;
             break;
+        case QEMU_NBD_OPT_HANDSHAKE_LIMIT:
+            if (qemu_strtoi(optarg, NULL, 0, &handshake_limit) < 0 ||
+                handshake_limit < 0) {
+                error_report("Invalid handshake limit '%s'", optarg);
+                exit(EXIT_FAILURE);
+            }
+            break;
         }
     }

-- 
2.46.0
[PATCH for-9.2 0/2] NBD: tune handshake timeout
Originally posted as the tail part of this v4 series:
https://lists.gnu.org/archive/html/qemu-devel/2024-08/msg01154.html
but not deemed necessary for 9.1, as it is a feature addition that was
not proposed before soft freeze (even if the point of the feature is
to enable interop testing to revert back to 9.0 behavior, allowing for
unlimited time when debugging an NBD_OPT interchange under gdb).

As part of closing out the CVE, where the v4 patches used
handshake_limit, the actual pull request used handshake_max_secs (a
bit more self-documenting); I rebased that rename into the QMP code,
but prefer to keep the qemu-nbd command-line spelling shorter.  But
I'm open to any arguments on why the names should be the same, or on
any other better spellings to expose to the user.

Eric Blake (2):
  qemu-nbd: Allow users to adjust handshake limit
  nbd/server: Allow users to adjust handshake limit in QMP

 docs/tools/qemu-nbd.rst        |  5 +++++
 qapi/block-export.json         | 10 ++++++++++
 include/block/nbd.h            |  6 +++---
 block/monitor/block-hmp-cmds.c |  4 ++--
 blockdev-nbd.c                 | 26 ++++++++++++++++++++------
 qemu-nbd.c                     | 41 ++++++++++++++++++++---------------
 6 files changed, 64 insertions(+), 28 deletions(-)

-- 
2.46.0
[PULL 1/5] nbd: Minor style and typo fixes
Touch up a comment with the wrong type name, and an over-long line,
both noticed while working on upcoming patches.

Signed-off-by: Eric Blake
Message-ID: <20240807174943.771624-10-ebl...@redhat.com>
Reviewed-by: Daniel P. Berrangé
---
 nbd/server.c | 2 +-
 qemu-nbd.c   | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 892797bb111..ecd9366ba64 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1972,7 +1972,7 @@ static void nbd_export_request_shutdown(BlockExport *blk_exp)
     blk_exp_ref(&exp->common);

     /*
-     * TODO: Should we expand QMP NbdServerRemoveNode enum to allow a
+     * TODO: Should we expand QMP BlockExportRemoveMode enum to allow a
      * close mode that stops advertising the export to new clients but
      * still permits existing clients to run to completion? Because of
      * that possibility, nbd_export_close() can be called more than
diff --git a/qemu-nbd.c b/qemu-nbd.c
index d7b3ccab21c..8e104ef22c3 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -588,7 +588,8 @@ int main(int argc, char **argv)
     pthread_t client_thread;
     const char *fmt = NULL;
     Error *local_err = NULL;
-    BlockdevDetectZeroesOptions detect_zeroes = BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
+    BlockdevDetectZeroesOptions detect_zeroes =
+        BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
     QDict *options = NULL;
     const char *export_name = NULL; /* defaults to "" later for server mode */
     const char *export_description = NULL;
-- 
2.46.0
[PULL 5/5] nbd/server: CVE-2024-7409: Close stray clients at server-stop
A malicious client can attempt to connect to an NBD server, and then
intentionally delay progress in the handshake, including if it does
not know the TLS secrets.  Although the previous two patches reduce
this behavior by capping the default max-connections parameter and
killing slow clients, they did not eliminate the possibility of a
client waiting to close the socket until after the QMP nbd-server-stop
command is executed, at which point qemu would SEGV when trying to
dereference the NULL nbd_server global which is no longer present.
This amounts to a denial of service attack.

Worse, if another NBD server is started before the malicious client
disconnects, I cannot rule out additional adverse effects when the old
client interferes with the connection count of the new server
(although the most likely is a crash due to an assertion failure when
checking nbd_server->connections > 0).

For environments without this patch, the CVE can be mitigated by
ensuring (such as via a firewall) that only trusted clients can
connect to an NBD server.  Note that using frameworks like libvirt
that ensure that TLS is used and that nbd-server-stop is not executed
while any trusted clients are still connected will only help if there
is also no possibility for an untrusted client to open a connection
but then stall on the NBD handshake.

Given the previous patches, it would be possible to guarantee that no
clients remain connected by having nbd-server-stop sleep for longer
than the default handshake deadline before finally freeing the global
nbd_server object, but that could make QMP non-responsive for a long
time.  So instead, this patch fixes the problem by tracking all client
sockets opened while the server is running, and forcefully closing any
such sockets remaining without a completed handshake at the time of
nbd-server-stop, then waiting until the coroutines servicing those
sockets notice the state change.
nbd-server-stop now has a second AIO_WAIT_WHILE_UNLOCKED (the first is
indirectly through the blk_exp_close_all_type() that disconnects all
clients that completed handshakes), but forced socket shutdown is
enough to progress the coroutines and quickly tear down all clients
before the server is freed, thus finally fixing the CVE.

This patch relies heavily on the fact that nbd/server.c guarantees
that it only calls nbd_blockdev_client_closed() from the main loop
(see the assertion in nbd_client_put() and the hoops used in
nbd_client_put_nonzero() to achieve that); if we did not have that
guarantee, we would also need a mutex protecting our accesses of the
list of connections to survive re-entrancy from independent iothreads.

Although I did not actually try to test old builds, it looks like this
problem has existed since at least commit 862172f45c (v2.12.0, 2017) -
even back when that patch started using a QIONetListener to handle
listening on multiple sockets, nbd_server_free() was already unaware
that the nbd_blockdev_client_closed callback can be reached later by a
client thread that has not completed handshakes (and therefore the
client's socket never got added to the list closed in
nbd_export_close_all), despite that patch intentionally tearing down
the QIONetListener to prevent new clients.

Reported-by: Alexander Ivanov
Fixes: CVE-2024-7409
CC: qemu-sta...@nongnu.org
Signed-off-by: Eric Blake
Message-ID: <20240807174943.771624-14-ebl...@redhat.com>
Reviewed-by: Daniel P. Berrangé
---
 blockdev-nbd.c | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 24ba5382db0..f73409ae494 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -21,12 +21,18 @@
 #include "io/channel-socket.h"
 #include "io/net-listener.h"

+typedef struct NBDConn {
+    QIOChannelSocket *cioc;
+    QLIST_ENTRY(NBDConn) next;
+} NBDConn;
+
 typedef struct NBDServerData {
     QIONetListener *listener;
     QCryptoTLSCreds *tlscreds;
     char *tlsauthz;
     uint32_t max_connections;
     uint32_t connections;
+    QLIST_HEAD(, NBDConn) conns;
 } NBDServerData;

 static NBDServerData *nbd_server;
@@ -51,6 +57,14 @@ int nbd_server_max_connections(void)

 static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
 {
+    NBDConn *conn = nbd_client_owner(client);
+
+    assert(qemu_in_main_thread() && nbd_server);
+
+    object_unref(OBJECT(conn->cioc));
+    QLIST_REMOVE(conn, next);
+    g_free(conn);
+
     nbd_client_put(client);
     assert(nbd_server->connections > 0);
     nbd_server->connections--;
@@ -60,14 +74,20 @@ static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
 static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,
                        gpointer opaque)
 {
+    NBDConn *conn = g_new0(NBDConn, 1);
+
+    assert(qemu_in_main_thread() && nbd_server);
     nbd_server->connections++;
+    object_ref(OBJECT(cioc));
+    conn
[PULL 3/5] nbd/server: CVE-2024-7409: Cap default max-connections to 100
Allowing an unlimited number of clients to any web service is a recipe
for a rudimentary denial of service attack: the client merely needs to
open lots of sockets without closing them, until qemu no longer has
any more fds available to allocate.

For qemu-nbd, we default to allowing only 1 connection unless more are
explicitly asked for (-e or --shared); this was historically picked as
a nice default (without an explicit -t, a non-persistent qemu-nbd goes
away after a client disconnects, without needing any additional
follow-up commands), and we are not going to change that interface now
(besides, someday we want to point people towards qemu-storage-daemon
instead of qemu-nbd).

But for qemu proper, and the newer qemu-storage-daemon, the QMP
nbd-server-start command has historically had a default of an
unlimited number of connections, in part because unlike qemu-nbd it is
inherently persistent until nbd-server-stop.  Allowing multiple client
sockets is particularly useful for clients that can take advantage of
MULTI_CONN (creating parallel sockets to increase throughput),
although known clients that do so (such as libnbd's nbdcopy) typically
use only 8 or 16 connections (the benefits of scaling diminish once
more sockets are competing for kernel attention).  Picking a number
large enough for typical use cases, but not unlimited, makes it
slightly harder for a malicious client to perform a denial of service
merely by opening lots of connections without progressing through the
handshake.

This change does not eliminate CVE-2024-7409 on its own, but reduces
the chance for fd exhaustion or unlimited memory usage as an attack
surface.  On the other hand, by itself, it makes it more obvious that
with a finite limit, we have the problem of an unauthenticated client
holding 100 fds opened as a way to block out a legitimate client from
being able to connect; thus, later patches will further add timeouts
to reject clients that are not making progress.
This is an INTENTIONAL change in behavior, and will break any client
of nbd-server-start that was not passing an explicit max-connections
parameter, yet expects more than 100 simultaneous connections.  We are
not aware of any such client (as stated above, most clients aware of
MULTI_CONN get by just fine on 8 or 16 connections, and probably cope
with later connections failing by relying on the earlier connections;
libvirt has not yet been passing max-connections, but generally
creates NBD servers with the intent for a single client for the sake
of live storage migration; meanwhile, the KubeSAN project anticipates
a large cluster sharing multiple clients [up to 8 per node, and up to
100 nodes in a cluster], but it currently uses qemu-nbd with an
explicit --shared=0 rather than qemu-storage-daemon with
nbd-server-start).

We considered using a deprecation period (declare that omitting
max-connections is deprecated, and make it mandatory in 3 releases -
then we don't need to pick an arbitrary default); that has zero risk
of breaking any apps that accidentally depended on more than 100
connections, and where such breakage might not be noticed under unit
testing but only under the larger loads of production usage.  But it
does not close the denial-of-service hole until far into the future,
and requires all apps to change to add the parameter even if 100 was
good enough.  It also has a drawback that any app (like libvirt) that
is accidentally relying on an unlimited default should seriously
consider their own CVE now, at which point they are going to change to
pass explicit max-connections sooner than waiting for 3 qemu releases.
Finally, if our changed default breaks an app, that app can always
pass in an explicit max-connections with a larger value.

It is also intentional that the HMP interface to nbd-server-start is
not changed to expose max-connections (any client needing to fine-tune
things should be using QMP).

Suggested-by: Daniel P. Berrangé
Signed-off-by: Eric Blake
Message-ID: <20240807174943.771624-12-ebl...@redhat.com>
Reviewed-by: Daniel P. Berrangé
[ericb: Expand commit message to summarize Dan's argument for why we
break corner-case back-compat behavior without a deprecation period]
Signed-off-by: Eric Blake
---
 qapi/block-export.json         | 4 ++--
 include/block/nbd.h            | 7 +++++++
 block/monitor/block-hmp-cmds.c | 3 ++-
 blockdev-nbd.c                 | 8 ++++++++
 4 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index 665d5fd0262..ce33fe378df 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -28,7 +28,7 @@
 # @max-connections: The maximum number of connections to allow at the
 #     same time, 0 for unlimited.  Setting this to 1 also stops the
 #     server from advertising multiple client support (since 5.2;
-#     default: 0)
+#     default: 100)
 #
 # Since: 4.2
 ##
@@ -63,7 +63,7 @@
 # @max-connections: The maximum number of connec
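For clients that really do need a different cap than the new default, the knob stays explicit.  A QMP sketch of opting in to a larger limit (the listener address here is made up for illustration; per the doc text in the patch, 0 still means unlimited, so pre-patch behavior remains reachable):

```json
{ "execute": "nbd-server-start",
  "arguments": {
    "addr": { "type": "inet",
              "data": { "host": "::", "port": "10809" } },
    "max-connections": 1000 } }
```

An app such as libvirt that wants the old unlimited behavior would instead pass "max-connections": 0, accepting the DoS exposure described above.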
[PULL 2/5] nbd/server: Plumb in new args to nbd_client_add()
Upcoming patches to fix a CVE need to track an opaque pointer passed
in by the owner of a client object, as well as a request for a time
limit on how fast negotiation must complete.  Prepare for that by
changing the signature of nbd_client_new() and adding an accessor to
get at the opaque pointer, although for now the two servers
(qemu-nbd.c and blockdev-nbd.c) do not change behavior even though
they pass in a new default timeout value.

Suggested-by: Vladimir Sementsov-Ogievskiy
Signed-off-by: Eric Blake
Message-ID: <20240807174943.771624-11-ebl...@redhat.com>
Reviewed-by: Daniel P. Berrangé
[eblake: s/LIMIT/MAX_SECS/ as suggested by Dan]
Signed-off-by: Eric Blake
---
 include/block/nbd.h | 11 ++++++++++-
 blockdev-nbd.c      |  6 ++++--
 nbd/server.c        | 20 +++++++++++++++---
 qemu-nbd.c          |  4 +++-
 4 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 4e7bd6342f9..1d4d65922d1 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -33,6 +33,12 @@ typedef struct NBDMetaContexts NBDMetaContexts;

 extern const BlockExportDriver blk_exp_nbd;

+/*
+ * NBD_DEFAULT_HANDSHAKE_MAX_SECS: Number of seconds in which client must
+ * succeed at NBD_OPT_GO before being forcefully dropped as too slow.
+ */
+#define NBD_DEFAULT_HANDSHAKE_MAX_SECS 10
+
 /* Handshake phase structs - this struct is passed on the wire */

 typedef struct NBDOption {
@@ -403,9 +409,12 @@ AioContext *nbd_export_aio_context(NBDExport *exp);
 NBDExport *nbd_export_find(const char *name);

 void nbd_client_new(QIOChannelSocket *sioc,
+                    uint32_t handshake_max_secs,
                     QCryptoTLSCreds *tlscreds,
                     const char *tlsauthz,
-                    void (*close_fn)(NBDClient *, bool));
+                    void (*close_fn)(NBDClient *, bool),
+                    void *owner);
+void *nbd_client_owner(NBDClient *client);
 void nbd_client_get(NBDClient *client);
 void nbd_client_put(NBDClient *client);

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 213012435f4..267a1de903f 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -64,8 +64,10 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,
     nbd_update_server_watch(nbd_server);

     qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server");
-    nbd_client_new(cioc, nbd_server->tlscreds, nbd_server->tlsauthz,
-                   nbd_blockdev_client_closed);
+    /* TODO - expose handshake timeout as QMP option */
+    nbd_client_new(cioc, NBD_DEFAULT_HANDSHAKE_MAX_SECS,
+                   nbd_server->tlscreds, nbd_server->tlsauthz,
+                   nbd_blockdev_client_closed, NULL);
 }

 static void nbd_update_server_watch(NBDServerData *s)

diff --git a/nbd/server.c b/nbd/server.c
index ecd9366ba64..fee04415d83 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -124,12 +124,14 @@ struct NBDMetaContexts {
 struct NBDClient {
     int refcount; /* atomic */
     void (*close_fn)(NBDClient *client, bool negotiated);
+    void *owner;

     QemuMutex lock;

     NBDExport *exp;
     QCryptoTLSCreds *tlscreds;
     char *tlsauthz;
+    uint32_t handshake_max_secs;
     QIOChannelSocket *sioc; /* The underlying data channel */
     QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
@@ -3191,6 +3193,7 @@ static coroutine_fn void nbd_co_client_start(void *opaque)

     qemu_co_mutex_init(&client->send_lock);

+    /* TODO - utilize client->handshake_max_secs */
     if (nbd_negotiate(client, &local_err)) {
         if (local_err) {
             error_report_err(local_err);
@@ -3205,14 +3208,17 @@ static coroutine_fn void nbd_co_client_start(void *opaque)
 }

 /*
- * Create a new client listener using the given channel @sioc.
+ * Create a new client listener using the given channel @sioc and @owner.
  * Begin servicing it in a coroutine.  When the connection closes, call
- * @close_fn with an indication of whether the client completed negotiation.
+ * @close_fn with an indication of whether the client completed negotiation
+ * within @handshake_max_secs seconds (0 for unbounded).
  */
 void nbd_client_new(QIOChannelSocket *sioc,
+                    uint32_t handshake_max_secs,
                     QCryptoTLSCreds *tlscreds,
                     const char *tlsauthz,
-                    void (*close_fn)(NBDClient *, bool))
+                    void (*close_fn)(NBDClient *, bool),
+                    void *owner)
 {
     NBDClient *client;
     Coroutine *co;
@@ -3225,13 +3231,21 @@ void nbd_client_new(QIOChannelSocket *sioc,
         object_ref(OBJECT(client->tlscreds));
     }
     client->tlsauthz = g_strdup(tlsauthz);
+    client->handshake_max_secs = handshake_max_secs;
     client->sioc = sioc;
     qio_channel_set_delay(QIO_CHANNEL(sioc), false);
     object_ref(OBJECT(client->sioc));
     client->ioc = QIO_CHANNEL(sioc);
     object_ref(OBJECT(client->ioc)
[PULL 4/5] nbd/server: CVE-2024-7409: Drop non-negotiating clients
A client that opens a socket but does not negotiate is merely hogging
qemu's resources (an open fd and a small amount of memory); and a
malicious client that can access the port where NBD is listening can
attempt a denial of service attack by intentionally opening and
abandoning lots of unfinished connections.  The previous patch put a
default bound on the number of such ongoing connections, but once that
limit is hit, no more clients can connect (including legitimate ones).
The solution is to insist that clients complete handshake within a
reasonable time limit, defaulting to 10 seconds.  A client that has
not successfully completed NBD_OPT_GO by then (including the case of
where the client didn't know TLS credentials to even reach the point
of NBD_OPT_GO) is wasting our time and does not deserve to stay
connected.  Later patches will allow fine-tuning the limit away from
the default value (including disabling it for doing integration
testing of the handshake process itself).

Note that this patch in isolation actually makes it more likely to see
qemu SEGV after nbd-server-stop, as any client socket still connected
when the server shuts down will now be closed after 10 seconds rather
than at the client's whims.  That will be addressed in the next patch.

For a demo of this patch in action:

$ qemu-nbd -f raw -r -t -e 10 file &
$ nbdsh --opt-mode -c '
H = list()
for i in range(20):
  print(i)
  H.insert(i, nbd.NBD())
  H[i].set_opt_mode(True)
  H[i].connect_uri("nbd://localhost")
'
$ kill $!

where later connections get to start progressing once earlier ones are
forcefully dropped for taking too long, rather than hanging.

Suggested-by: Daniel P. Berrangé
Signed-off-by: Eric Blake
Message-ID: <20240807174943.771624-13-ebl...@redhat.com>
Reviewed-by: Daniel P. Berrangé
[eblake: rebase to changes earlier in series, reduce scope of timer]
Signed-off-by: Eric Blake
---
 nbd/server.c     | 28 +++++++++++++++++++++++++++-
 nbd/trace-events |  1 +
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/nbd/server.c b/nbd/server.c
index fee04415d83..c30e687fc8b 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -3186,22 +3186,48 @@ static void nbd_client_receive_next_request(NBDClient *client)
     }
 }

+static void nbd_handshake_timer_cb(void *opaque)
+{
+    QIOChannel *ioc = opaque;
+
+    trace_nbd_handshake_timer_cb();
+    qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
+}
+
 static coroutine_fn void nbd_co_client_start(void *opaque)
 {
     NBDClient *client = opaque;
     Error *local_err = NULL;
+    QEMUTimer *handshake_timer = NULL;

     qemu_co_mutex_init(&client->send_lock);

-    /* TODO - utilize client->handshake_max_secs */
+    /*
+     * Create a timer to bound the time spent in negotiation. If the
+     * timer expires, it is likely nbd_negotiate will fail because the
+     * socket was shutdown.
+     */
+    if (client->handshake_max_secs > 0) {
+        handshake_timer = aio_timer_new(qemu_get_aio_context(),
+                                        QEMU_CLOCK_REALTIME,
+                                        SCALE_NS,
+                                        nbd_handshake_timer_cb,
+                                        client->sioc);
+        timer_mod(handshake_timer,
+                  qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+                  client->handshake_max_secs * NANOSECONDS_PER_SECOND);
+    }
+
     if (nbd_negotiate(client, &local_err)) {
         if (local_err) {
             error_report_err(local_err);
         }
+        timer_free(handshake_timer);
         client_close(client, false);
         return;
     }

+    timer_free(handshake_timer);
     WITH_QEMU_LOCK_GUARD(&client->lock) {
         nbd_client_receive_next_request(client);
     }

diff --git a/nbd/trace-events b/nbd/trace-events
index 00ae3216a11..cbd0a4ab7e4 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -76,6 +76,7 @@ nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload
 nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64
 nbd_co_receive_align_compliance(const char *op, uint64_t from, uint64_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx64 ", align=0x%" PRIx32
 nbd_trip(void) "Reading request"
+nbd_handshake_timer_cb(void) "client took too long to negotiate"

 # client-connection.c
 nbd_connect_thread_sleep(uint64_t timeout) "timeout %" PRIu64
-- 
2.46.0
Re: [PATCH] block/blkio: use FUA flag on write zeroes only if supported
On Thu, Aug 08, 2024 at 10:05:45AM GMT, Stefano Garzarella wrote:
> libblkio supports BLKIO_REQ_FUA with write zeros requests only since
> version 1.4.0, so let's inform the block layer that the blkio driver
> supports it only in this case. Otherwise we can have runtime errors
> as reported in https://issues.redhat.com/browse/RHEL-32878
>
> Fixes: fd66dbd424 ("blkio: add libblkio block driver")
> Cc: qemu-sta...@nongnu.org
> Buglink: https://issues.redhat.com/browse/RHEL-32878
> Signed-off-by: Stefano Garzarella
> ---
>  meson.build   | 2 ++
>  block/blkio.c | 6 ++++--
>  2 files changed, 6 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
Re: [PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
On Thu, Aug 08, 2024 at 09:54:26AM GMT, Markus Armbruster wrote:
> Eric Blake writes:
>
> > My next patch needs to convert text from an untrusted input into an
> > output representation that is suitable for display on a terminal is
> > useful to more than just the json-writer; the text should normally be
> > UTF-8, but blindly allowing all Unicode code points (including ASCII
> > ESC) through to a terminal risks remote-code-execution attacks on some
> > terminals.  Extract the existing body of json-writer's quoted_str into
> > a new helper routine mod_utf8_sanitize, and generalize it to also work
> > on data that is length-limited rather than NUL-terminated.
>
> This is two changes in one patch: factor out, and generalize.
>
> You don't actually use the generalized variant.  Please leave that for
> later, and keep this patch simple.

Makes sense.  Will simplify in v2.

> > [I was
> > actually surprised that glib does not have such a sanitizer already -
> > Google turns up lots of examples of rolling your own string
> > sanitizer.]

See https://gitlab.gnome.org/GNOME/glib/-/issues/3426 where I asked if
glib should provide a more generic sanitizer.  In turn, I found glib
does have:

  char* g_uri_escape_string (const char* unescaped,
                             const char* reserved_chars_allowed,
                             gboolean allow_utf8)

which is a bit more powerful (you have some fine-tuning on what gets
escaped), but a different escaping mechanism (%XX instead of \u
escapes, where % rather than \ is special).  I'm not sure whether it
is nicer to have a malloc'd result or to append into a larger g_string
as the base operation (and where you could write an easy wrapper to
provide the other operation).

> > +/**
> > + * mod_utf8_sanitize:
> > + * @buf: Destination buffer
> > + * @str: Modified UTF-8 string to sanitize
> > + *
> > + * Append the contents of the NUL-terminated Modified UTF-8 string
> > + * @str into @buf, with escape sequences used for non-printable ASCII
>
> "Append into" or "append to"?

"append to" works, will simplify.

> > + * and for quote characters.  The result is therefore safe for output
> > + * to a terminal.
>
> Actually, we escape double quote, backslash, ASCII control characters,
> and non-ASCII characters, i.e. we leave just printable ASCII characters
> other than '"' and '\\' unescaped.  See the code quoted below.
>
> Escaping '\\' is of course necessary.
>
> Escaping '"' is necessary only for callers that want to enclose the
> result in double quotes without ambiguity.  Which is what the existing
> caller wants.  Future callers may prefer not to escape '"'.  But we
> can worry about this when we run into such callers.

If we encounter more users, I could see this morphing into a backend
that takes a flag argument on knobs like whether to escape " or ',
whether to preserve printing Unicode, ...; coupled with frontends with
fewer arguments that pass the right flags into the backend.

> > + *
> > + * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as
> > + * "\xC0\x80".
> > + */
>
> Missing: behavior for invalid sequences.  Each such sequence is
> replaced by a replacement character U+FFFD.  For the definition of
> "invalid sequence", see mod_utf8_codepoint().

Indeed, more documentation here wouldn't hurt (both on \ and " being
escaped, and on U+FFFD codepoints being placed into the output
stream).

> On the function name.  "Sanitize" could be many things.  This function
> actually does two things: (1) replace invalid sequences, and (2)
> escape to printable ASCII.  What about
> append_mod_utf8_as_printable_ascii()?  Admittedly a mouthful.

Naming is always tough.  Your suggestion is longer, but indeed
accurate.  Maybe a shorter compromise of append_escaped_mod_utf8()?

> > +void mod_utf8_sanitize(GString *buf, const char *str)
> > +{
> > +    mod_utf8_sanitize_len(buf, str, -1);
> > +}

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
Re: [PATCH v4 5/7] nbd/server: CVE-2024-7409: Close stray client sockets at shutdown
On Wed, Aug 07, 2024 at 07:29:25PM GMT, Daniel P. Berrangé wrote:
> On Wed, Aug 07, 2024 at 12:43:31PM -0500, Eric Blake wrote:
> > A malicious client can attempt to connect to an NBD server, and then
> > intentionally delay progress in the handshake, including if it does
> > not know the TLS secrets.  Although this behavior can be bounded by
> > the max-connections parameter, the QMP nbd-server-start currently
> > defaults to unlimited incoming client connections.  Worse, if the

I need to touch that sentence up to match the earlier patches.  (The
curse of rebasing late at night...)

> > client waits to close the socket until after the QMP nbd-server-stop
> > command is executed, qemu will then SEGV when trying to dereference
> > the NULL nbd_server global which is no longer present, which amounts
> > to a denial of service attack.  If another NBD server is started
> > before the malicious client disconnects, I cannot rule out additional
> > adverse effects when the old client interferes with the connection
> > count of the new server.

I _think_ the effect is most likely to be an assertion failure
(nbd_server->connections > 0), since we recommend compiling qemu with
assertions enabled.  But "most likely" is not the same as "certainty".

> >
> > For environments without this patch, the CVE can be mitigated by
> > ensuring (such as via a firewall) that only trusted clients can
> > connect to an NBD server.  Note that using frameworks like libvirt
> > that ensure that TLS is used and that nbd-server-stop is not executed
> > while any trusted clients are still connected will only help if there
> > is also no possibility for an untrusted client to open a connection
> > but then stall on the NBD handshake.
> >
> > Given the previous patches, it would be possible to guarantee that no
> > clients remain connected by having nbd-server-stop sleep for longer
> > than the default handshake deadline before finally freeing the global
> > nbd_server object, but that could make QMP non-responsive for a long
> > time.  So instead, this patch fixes the problem by tracking all client
> > sockets opened while the server is running, and forcefully closing any
> > such sockets remaining without a completed handshake at the time of
> > nbd-server-stop, then waiting until the coroutines servicing those
> > sockets notice the state change.  nbd-server-stop now has a second
> > AIO_WAIT_WHILE_UNLOCKED (the first is indirectly through the
> > blk_exp_close_all_type() that disconnects all clients that completed
> > handshakes), but forced socket shutdown is enough to progress the
> > coroutines and quickly tear down all clients before the server is
> > freed, thus finally fixing the CVE.
> >
> > This patch relies heavily on the fact that nbd/server.c guarantees
> > that it only calls nbd_blockdev_client_closed() from the main loop
> > (see the assertion in nbd_client_put() and the hoops used in
> > nbd_client_put_nonzero() to achieve that); if we did not have that
> > guarantee, we would also need a mutex protecting our accesses of the
> > list of connections to survive re-entrancy from independent iothreads.
> >
> > Although I did not actually try to test old builds, it looks like this
> > problem has existed since at least commit 862172f45c (v2.12.0, 2017) -
> > even back when that patch started using a QIONetListener to handle
> > listening on multiple sockets, nbd_server_free() was already unaware
> > that the nbd_blockdev_client_closed callback can be reached later by a
> > client thread that has not completed handshakes (and therefore the
> > client's socket never got added to the list closed in
> > nbd_export_close_all), despite that patch intentionally tearing down
> > the QIONetListener to prevent new clients.
> >
> > Reported-by: Alexander Ivanov
> > Fixes: CVE-2024-7409
> > Signed-off-by: Eric Blake
> > ---
> >  blockdev-nbd.c | 35 ++++++++++++++++++++++++++++++++++-
> >  1 file changed, 34 insertions(+), 1 deletion(-)
>
> Reviewed-by: Daniel P. Berrangé

Thanks for a speedy review.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
Re: [PATCH v4 3/7] nbd/server: CVE-2024-7409: Change default max-connections to 100
On Wed, Aug 07, 2024 at 07:24:56PM GMT, Daniel P. Berrangé wrote:
> On Wed, Aug 07, 2024 at 12:43:29PM -0500, Eric Blake wrote:
> > Allowing an unlimited number of clients to any web service is a recipe
> > for a rudimentary denial of service attack: the client merely needs to
> > open lots of sockets without closing them, until qemu no longer has
> > any more fds available to allocate.
> >
> > For qemu-nbd, we default to allowing only 1 connection unless more are
> > explicitly asked for (-e or --shared); this was historically picked as
> > a nice default (without an explicit -t, a non-persistent qemu-nbd goes
> > away after a client disconnects, without needing any additional
> > follow-up commands), and we are not going to change that interface now
> > (besides, someday we want to point people towards qemu-storage-daemon
> > instead of qemu-nbd).
> >
> > But for qemu proper, the QMP nbd-server-start command has historically
> > had a default of unlimited number of connections, in part because
> > unlike qemu-nbd it is inherently persistent.  Allowing multiple client
> > sockets is particularly useful for clients that can take advantage of
> > MULTI_CONN (creating parallel sockets to increase throughput),
> > although known clients that do so (such as libnbd's nbdcopy) typically
> > use only 8 or 16 connections (the benefits of scaling diminish once
> > more sockets are competing for kernel attention).  Picking a number
> > large enough for typical use cases, but not unlimited, makes it
> > slightly harder for a malicious client to perform a denial of service
> > merely by opening lots of connections without progressing through the
> > handshake.
> >
> > This change does not eliminate CVE-2024-7409 on its own, but reduces
> > the chance for fd exhaustion or unlimited memory usage as an attack
> > surface.  On the other hand, by itself, it makes it more obvious that
> > with a finite limit, we have the problem of an unauthenticated client
> > holding 100 fds opened as a way to block out a legitimate client from
> > being able to connect; thus, later patches will further add timeouts
> > to reject clients that are not making progress.
> >
> > Suggested-by: Daniel P. Berrangé
> > Signed-off-by: Eric Blake
> > ---
> >  qapi/block-export.json         | 4 ++--
> >  include/block/nbd.h            | 7 +++++++
> >  block/monitor/block-hmp-cmds.c | 3 ++-
> >  blockdev-nbd.c                 | 8 ++++++++
> >  4 files changed, 19 insertions(+), 3 deletions(-)
> >
> > diff --git a/qapi/block-export.json b/qapi/block-export.json
> > index 665d5fd0262..ce33fe378df 100644
> > --- a/qapi/block-export.json
> > +++ b/qapi/block-export.json
> > @@ -28,7 +28,7 @@
> >  # @max-connections: The maximum number of connections to allow at the
> >  #     same time, 0 for unlimited.  Setting this to 1 also stops the
> >  #     server from advertising multiple client support (since 5.2;
> > -#     default: 0)
> > +#     default: 100)
> >  #
> >  # Since: 4.2
> >  ##
> > @@ -63,7 +63,7 @@
> >  # @max-connections: The maximum number of connections to allow at the
> >  #     same time, 0 for unlimited.  Setting this to 1 also stops the
> >  #     server from advertising multiple client support (since 5.2;
> > -#     default: 0).
> > +#     default: 100).
> >  #
> >  # Errors:
> >  #     - if the server is already running
>
> This is considered a backwards compatibility break.
> An intentional one.
>
> Our justification is that we believe it is the least worst option
> to mitigate the DoS vector.  Let's explore the reasoning for that
> belief...

I'll probably crib quite a bit of this analysis into the commit
message.

> An alternative would be to deprecate the absence of 'max-connections',
> and after the deprecation period, make it into a mandatory
> parameter, forcing apps to make a decision.  This would be strictly
> compliant with our QAPI change policy.
>
> How does that differ in impact from changing the defaults...
>
> * Changed default
>   - Small risk of breaking app if needing > 100 concurrent conns
>   - Immediately closes the DoS hole for all apps using QEMU

Reduces, rather than closes.  QEMU is no longer vulnerable to running
out of fds (likely) or memory (less likely, since fds are more
limited); but once there is a cap, the malicious client can still tie
up 100 fds and block any other connections from proceeding.

Presumably, though, in the bigger picture, if you have difficulties
connecting your legitimate client (because a client has ti
Re: [PATCH v4 2/7] nbd/server: Plumb in new args to nbd_client_add()
On Wed, Aug 07, 2024 at 06:58:36PM GMT, Daniel P. Berrangé wrote:
> On Wed, Aug 07, 2024 at 12:43:28PM -0500, Eric Blake wrote:
> > Upcoming patches to fix a CVE need to track an opaque pointer passed
> > in by the owner of a client object, as well as reequest for a time

s/reequest/request/

> > limit on how fast negotiation must complete.  Prepare for that by
> > changing the signature of nbd_client_new() and adding an accessor to
> > get at the opaque pointer, although for now the two servers
> > (qemu-nbd.c and blockdev-nbd.c) do not change behavior.
> >
> > Suggested-by: Vladimir Sementsov-Ogievskiy
> > Signed-off-by: Eric Blake
> > ---
> >  include/block/nbd.h | 11 ++++++++++-
> >  blockdev-nbd.c      |  6 ++++--
> >  nbd/server.c        | 20 +++++++++++++++---
> >  qemu-nbd.c          |  4 +++-
> >  4 files changed, 34 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/block/nbd.h b/include/block/nbd.h
> > index 4e7bd6342f9..5fe14786414 100644
> > --- a/include/block/nbd.h
> > +++ b/include/block/nbd.h
> > @@ -33,6 +33,12 @@ typedef struct NBDMetaContexts NBDMetaContexts;
> >
> >  extern const BlockExportDriver blk_exp_nbd;
> >
> > +/*
> > + * NBD_DEFAULT_HANDSHAKE_LIMIT: Number of seconds in which client must
> > + * succeed at NBD_OPT_GO before being forcefully dropped as too slow.
> > + */
> > +#define NBD_DEFAULT_HANDSHAKE_LIMIT 10
>
> Suggest
>
> s/NBD_DEFAULT_HANDSHAKE_LIMIT/NBD_DEFAULT_HANDSHAKE_MAX_SECS/

I like it.

> > +
> >  /* Handshake phase structs - this struct is passed on the wire */
> >
> >  typedef struct NBDOption {
> > @@ -403,9 +409,12 @@ AioContext *nbd_export_aio_context(NBDExport *exp);
> >  NBDExport *nbd_export_find(const char *name);
> >
> >  void nbd_client_new(QIOChannelSocket *sioc,
> > +                    uint32_t handshake_limit,
>
> s/handshake_limit/handshake_max_secs/
>
> to make the units of the parameter self-documenting.
>
> Since this is a non-functional suggestion
>
> Reviewed-by: Daniel P. Berrangé

Will adjust the series with the fallout.  At this point, I'm leaning
towards queuing 1-5 in an upcoming pull request, but leaving 6-7 for
the 9.2 development cycle.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
Re: [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal
On Mon, Aug 05, 2024 at 08:11:31PM GMT, Richard W.M. Jones wrote:
> On Mon, Aug 05, 2024 at 01:48:12PM -0500, Eric Blake wrote:
> > On Fri, Aug 02, 2024 at 02:26:04PM GMT, Eric Blake wrote:
> > > I've requested a CVE from Red Hat, and hope to have an assigned number
> > > soon.  Meanwhile, we can get review started, to make sure this is
> > > ready to include in 9.1.  'qemu-img info' should never print untrusted
> > > data in a way that might take over a user's terminal.
> > >
> > > There are probably other spots where qemu-img info is printing
> > > untrusted data (such as filenames), where we probably ought to use the
> > > same sanitization tactics as shown here.  Identifying those spots
> > > would be a useful part of this review, and may mean a v2 that is even
> > > more extensive in the number of patches.
> >
> > In fact, should we insist that 'qemu-img info XXX' refuse to accept
> > any control characters on any command-line filename, and that it
> > reject any backing file name with control characters as a malformed
> > qcow2 file?  For reference, we JUST fixed qemu-img info to change
> > qcow2 files with a claimed backing file of json:... to favor the local
> > file ./json:... over the potentially dangerous user-controlled
> > format/protocol description, so we are _already_ changing how strict
> > qemu-img is for 9.1, and adding one more restriction to avoid
> > escape-sequence madness makes sense.
> >
> > Note that with:
> >
> >   touch $'\e[m' && qemu-img info --output=json $'\e[m'
> >
> > we already escape our output, but without --output=json, we are
> > vulnerable to user-controlled ESC leaking through to stdout for more
> > than just the NBD server errors that I addressed in v1 of this patch
> > series.  Hence my question on whether v2 of the series should touch
> > more places in the code, and whether doing something like flat-out
> > refusing users stupid enough to embed control characters in their
> > filenames is a safe change this close to release.
>
> You could say if someone gives you a "malicious" text file which you
> cat to stdout, it could change your terminal settings.  I don't think
> therefore any of this is very serious.  We should probably fix any
> obvious things, but it doesn't need to happen right before 9.1 is
> released, we can take our time.

After consulting with Red Hat security, they agree: their decision at
this time is that any CVE related to escape sequences taking over a
terminal would best be filed against the terminal and/or shell that
allowed the escape, rather than against every single app that can
produce such pass-through output.  So at this time they were unwilling
to issue a CVE against qemu for this particular issue, and I will
clean up the subject line for v2.

So I agree that cleaning up low-hanging fruit is still worth it, but
no longer a reason to rush this series into 9.1.  If it still gets a
timely positive review, I can include it alongside the other NBD
patches (we are fixing a CVE with qemu hitting SEGV on
nbd-server-stop).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
[PATCH v4 1/7] nbd: Minor style fixes
Touch up a comment with the wrong type name, and an over-long line,
both noticed while working on upcoming patches.

Signed-off-by: Eric Blake
---
 nbd/server.c | 2 +-
 qemu-nbd.c   | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 892797bb111..ecd9366ba64 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1972,7 +1972,7 @@ static void nbd_export_request_shutdown(BlockExport *blk_exp)
     blk_exp_ref(&exp->common);

     /*
-     * TODO: Should we expand QMP NbdServerRemoveNode enum to allow a
+     * TODO: Should we expand QMP BlockExportRemoveMode enum to allow a
      * close mode that stops advertising the export to new clients but
      * still permits existing clients to run to completion?  Because of
      * that possibility, nbd_export_close() can be called more than

diff --git a/qemu-nbd.c b/qemu-nbd.c
index d7b3ccab21c..8e104ef22c3 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -588,7 +588,8 @@ int main(int argc, char **argv)
     pthread_t client_thread;
     const char *fmt = NULL;
     Error *local_err = NULL;
-    BlockdevDetectZeroesOptions detect_zeroes = BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
+    BlockdevDetectZeroesOptions detect_zeroes =
+        BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
     QDict *options = NULL;
     const char *export_name = NULL; /* defaults to "" later for server mode */
     const char *export_description = NULL;
-- 
2.45.2
[PATCH v4 5/7] nbd/server: CVE-2024-7409: Close stray client sockets at shutdown
A malicious client can attempt to connect to an NBD server, and then
intentionally delay progress in the handshake, including if it does
not know the TLS secrets.  Although this behavior can be bounded by
the max-connections parameter, the QMP nbd-server-start currently
defaults to unlimited incoming client connections.  Worse, if the
client waits to close the socket until after the QMP nbd-server-stop
command is executed, qemu will then SEGV when trying to dereference
the NULL nbd_server global which is no longer present, which amounts
to a denial of service attack.  If another NBD server is started
before the malicious client disconnects, I cannot rule out additional
adverse effects when the old client interferes with the connection
count of the new server.

For environments without this patch, the CVE can be mitigated by
ensuring (such as via a firewall) that only trusted clients can
connect to an NBD server.  Note that using frameworks like libvirt
that ensure that TLS is used and that nbd-server-stop is not executed
while any trusted clients are still connected will only help if there
is also no possibility for an untrusted client to open a connection
but then stall on the NBD handshake.

Given the previous patches, it would be possible to guarantee that no
clients remain connected by having nbd-server-stop sleep for longer
than the default handshake deadline before finally freeing the global
nbd_server object, but that could make QMP non-responsive for a long
time.  So instead, this patch fixes the problem by tracking all client
sockets opened while the server is running, and forcefully closing any
such sockets remaining without a completed handshake at the time of
nbd-server-stop, then waiting until the coroutines servicing those
sockets notice the state change.  nbd-server-stop now has a second
AIO_WAIT_WHILE_UNLOCKED (the first is indirectly through the
blk_exp_close_all_type() that disconnects all clients that completed
handshakes), but forced socket shutdown is enough to progress the
coroutines and quickly tear down all clients before the server is
freed, thus finally fixing the CVE.

This patch relies heavily on the fact that nbd/server.c guarantees
that it only calls nbd_blockdev_client_closed() from the main loop
(see the assertion in nbd_client_put() and the hoops used in
nbd_client_put_nonzero() to achieve that); if we did not have that
guarantee, we would also need a mutex protecting our accesses of the
list of connections to survive re-entrancy from independent iothreads.

Although I did not actually try to test old builds, it looks like this
problem has existed since at least commit 862172f45c (v2.12.0, 2017) -
even back when that patch started using a QIONetListener to handle
listening on multiple sockets, nbd_server_free() was already unaware
that the nbd_blockdev_client_closed callback can be reached later by a
client thread that has not completed handshakes (and therefore the
client's socket never got added to the list closed in
nbd_export_close_all), despite that patch intentionally tearing down
the QIONetListener to prevent new clients.

Reported-by: Alexander Ivanov
Fixes: CVE-2024-7409
Signed-off-by: Eric Blake
---
 blockdev-nbd.c | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 19c57897819..4e38ff46747 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -21,12 +21,18 @@
 #include "io/channel-socket.h"
 #include "io/net-listener.h"

+typedef struct NBDConn {
+    QIOChannelSocket *cioc;
+    QLIST_ENTRY(NBDConn) next;
+} NBDConn;
+
 typedef struct NBDServerData {
     QIONetListener *listener;
     QCryptoTLSCreds *tlscreds;
     char *tlsauthz;
     uint32_t max_connections;
     uint32_t connections;
+    QLIST_HEAD(, NBDConn) conns;
 } NBDServerData;

 static NBDServerData *nbd_server;
@@ -51,6 +57,14 @@ int nbd_server_max_connections(void)

 static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
 {
+    NBDConn *conn = nbd_client_owner(client);
+
+    assert(qemu_in_main_thread() && nbd_server);
+
+    object_unref(OBJECT(conn->cioc));
+    QLIST_REMOVE(conn, next);
+    g_free(conn);
+
     nbd_client_put(client);
     assert(nbd_server->connections > 0);
     nbd_server->connections--;
@@ -60,14 +74,20 @@ static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
 static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,
                        gpointer opaque)
 {
+    NBDConn *conn = g_new0(NBDConn, 1);
+
+    assert(qemu_in_main_thread() && nbd_server);
     nbd_server->connections++;
+    object_ref(OBJECT(cioc));
+    conn->cioc = cioc;
+    QLIST_INSERT_HEAD(&nbd_server->conns, conn, next);
     nbd_update_server_watch(nbd_server);

     qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server");
     /* TODO - expose handshake limit as QMP option */
     nbd_
[PATCH for-9.1 v4 0/7] CVE-2024-7409
v3 was here: https://lists.gnu.org/archive/html/qemu-devel/2024-08/msg00818.html since then: - re-add a minor patch from v2 (now patch 1) - refactor how the client opaque pointer is handled (patch 2) - add two new patches to prevent malicious clients from consuming inordinate resources: change the default max-connections from unlimited to capped at 100 (patch 3), and add code to kill any client that takes longer than 10 seconds after connect to reach NBD_OPT_GO (patch 4) [Dan] - squash the connection list handling into a single patch (5) [Dan] - two new additional patches for reverting back to 9.0 behavior for integration testing purposes; I'm okay if these last two miss 9.1 Eric Blake (7): nbd: Minor style fixes nbd/server: Plumb in new args to nbd_client_add() nbd/server: CVE-2024-7409: Change default max-connections to 100 nbd/server: CVE-2024-7409: Drop non-negotiating clients nbd/server: CVE-2024-7409: Close stray client sockets at shutdown qemu-nbd: Allow users to adjust handshake limit nbd/server: Allow users to adjust handshake limit in QMP docs/tools/qemu-nbd.rst| 5 +++ qapi/block-export.json | 18 +++--- include/block/nbd.h| 20 +-- block/monitor/block-hmp-cmds.c | 3 +- blockdev-nbd.c | 62 +++--- nbd/server.c | 51 +--- qemu-nbd.c | 44 nbd/trace-events | 1 + 8 files changed, 173 insertions(+), 31 deletions(-) -- 2.45.2
[PATCH v4 6/7] qemu-nbd: Allow users to adjust handshake limit
Although defaulting the handshake limit to 10 seconds was a nice QoI change to weed out intentionally slow clients, it can interfere with integration testing done with manual NBD_OPT commands over 'nbdsh --opt-mode'. Expose a command line option to allow the user to alter the timeout away from the default. This option is unlikely to be used in enough scenarios to warrant a short option letter. Signed-off-by: Eric Blake --- I'm not sure if this is 9.1 material. It is a new feature (user-visible command line option) implemented after soft freeze; on the other hand, it allows one to recover the behavior that existed prior to plugging the CVE which may be useful in integration testing. --- docs/tools/qemu-nbd.rst | 5 + qemu-nbd.c | 41 ++--- 2 files changed, 31 insertions(+), 15 deletions(-) diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst index 329f44d9895..f55c10eb39c 100644 --- a/docs/tools/qemu-nbd.rst +++ b/docs/tools/qemu-nbd.rst @@ -154,6 +154,11 @@ driver options if :option:`--image-opts` is specified. Set the NBD volume export description, as a human-readable string. +.. option:: --handshake-limit=N + + Set the timeout for a client to successfully complete its handshake + to N seconds (default 10), or 0 for no limit. + .. 
option:: -L, --list Connect as a client and list all details about the exports exposed by diff --git a/qemu-nbd.c b/qemu-nbd.c index 7bf86a6566b..4287a595d92 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -57,19 +57,20 @@ #define HAVE_NBD_DEVICE 0 #endif -#define SOCKET_PATH"/var/lock/qemu-nbd-%s" -#define QEMU_NBD_OPT_CACHE 256 -#define QEMU_NBD_OPT_AIO 257 -#define QEMU_NBD_OPT_DISCARD 258 -#define QEMU_NBD_OPT_DETECT_ZEROES 259 -#define QEMU_NBD_OPT_OBJECT260 -#define QEMU_NBD_OPT_TLSCREDS 261 -#define QEMU_NBD_OPT_IMAGE_OPTS262 -#define QEMU_NBD_OPT_FORK 263 -#define QEMU_NBD_OPT_TLSAUTHZ 264 -#define QEMU_NBD_OPT_PID_FILE 265 -#define QEMU_NBD_OPT_SELINUX_LABEL 266 -#define QEMU_NBD_OPT_TLSHOSTNAME 267 +#define SOCKET_PATH "/var/lock/qemu-nbd-%s" +#define QEMU_NBD_OPT_CACHE 256 +#define QEMU_NBD_OPT_AIO 257 +#define QEMU_NBD_OPT_DISCARD 258 +#define QEMU_NBD_OPT_DETECT_ZEROES 259 +#define QEMU_NBD_OPT_OBJECT 260 +#define QEMU_NBD_OPT_TLSCREDS261 +#define QEMU_NBD_OPT_IMAGE_OPTS 262 +#define QEMU_NBD_OPT_FORK263 +#define QEMU_NBD_OPT_TLSAUTHZ264 +#define QEMU_NBD_OPT_PID_FILE265 +#define QEMU_NBD_OPT_SELINUX_LABEL 266 +#define QEMU_NBD_OPT_TLSHOSTNAME 267 +#define QEMU_NBD_OPT_HANDSHAKE_LIMIT 268 #define MBR_SIZE 512 @@ -80,6 +81,7 @@ static int nb_fds; static QIONetListener *server; static QCryptoTLSCreds *tlscreds; static const char *tlsauthz; +static int handshake_limit = NBD_DEFAULT_HANDSHAKE_LIMIT; static void usage(const char *name) { @@ -101,6 +103,7 @@ static void usage(const char *name) " -v, --verbose display extra debugging information\n" " -x, --export-name=NAMEexpose export by name (default is empty string)\n" " -D, --description=TEXTexport a human-readable description\n" +" --handshake-limit=N limit client's handshake to N seconds (default 10)\n" "\n" "Exposing part of the image:\n" " -o, --offset=OFFSET offset into the image\n" @@ -390,8 +393,7 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc, nb_fds++; 
nbd_update_server_watch(); -/* TODO - expose handshake limit as command line option */ -nbd_client_new(cioc, NBD_DEFAULT_HANDSHAKE_LIMIT, +nbd_client_new(cioc, handshake_limit, tlscreds, tlsauthz, nbd_client_closed, NULL); } @@ -569,6 +571,8 @@ int main(int argc, char **argv) { "object", required_argument, NULL, QEMU_NBD_OPT_OBJECT }, { "export-name", required_argument, NULL, 'x' }, { "description", required_argument, NULL, 'D' }, +{ "handshake-limit", required_argument, NULL, + QEMU_NBD_OPT_HANDSHAKE_LIMIT }, { "tls-creds", required_argument, NULL, QEMU_NBD_OPT_TLSCREDS }, { "tls-hostname", required_argument, NULL, QEMU_NBD_OPT_TLSHOSTNAME }, { "tls-authz", required_argument, NULL, QEMU_NBD_OPT_TLSAUTHZ }, @@ -815,6 +819,13 @@ int main(int argc, char **argv) case QEMU_NBD_OPT_SELINUX_LABEL: selinux_label = optarg; break; +case QEMU_NBD_OPT_HANDSHAKE_LIMIT: +if (qemu_strtoi(optarg, NULL, 0, &handshake_limit) < 0 || +handshake_limit < 0) { +error_report("Invalid handshake limit '%s'", optarg); +exit(EXIT_FAILURE); +} +break; } } -- 2.45.2
[PATCH v4 3/7] nbd/server: CVE-2024-7409: Change default max-connections to 100
Allowing an unlimited number of clients to any web service is a recipe for a rudimentary denial of service attack: the client merely needs to open lots of sockets without closing them, until qemu no longer has any more fds available to allocate. For qemu-nbd, we default to allowing only 1 connection unless more are explicitly asked for (-e or --shared); this was historically picked as a nice default (without an explicit -t, a non-persistent qemu-nbd goes away after a client disconnects, without needing any additional follow-up commands), and we are not going to change that interface now (besides, someday we want to point people towards qemu-storage-daemon instead of qemu-nbd). But for qemu proper, the QMP nbd-server-start command has historically had a default of unlimited number of connections, in part because unlike qemu-nbd it is inherently persistent. Allowing multiple client sockets is particularly useful for clients that can take advantage of MULTI_CONN (creating parallel sockets to increase throughput), although known clients that do so (such as libnbd's nbdcopy) typically use only 8 or 16 connections (the benefits of scaling diminish once more sockets are competing for kernel attention). Picking a number large enough for typical use cases, but not unlimited, makes it slightly harder for a malicious client to perform a denial of service merely by opening lots of connections without progressing through the handshake. This change does not eliminate CVE-2024-7409 on its own, but reduces the chance for fd exhaustion or unlimited memory usage as an attack surface. On the other hand, by itself, it makes it more obvious that with a finite limit, we have the problem of an unauthenticated client holding 100 fds opened as a way to block out a legitimate client from being able to connect; thus, later patches will further add timeouts to reject clients that are not making progress. Suggested-by: Daniel P. 
Berrangé Signed-off-by: Eric Blake --- qapi/block-export.json | 4 ++-- include/block/nbd.h| 7 +++ block/monitor/block-hmp-cmds.c | 3 ++- blockdev-nbd.c | 8 4 files changed, 19 insertions(+), 3 deletions(-) diff --git a/qapi/block-export.json b/qapi/block-export.json index 665d5fd0262..ce33fe378df 100644 --- a/qapi/block-export.json +++ b/qapi/block-export.json @@ -28,7 +28,7 @@ # @max-connections: The maximum number of connections to allow at the # same time, 0 for unlimited. Setting this to 1 also stops the # server from advertising multiple client support (since 5.2; -# default: 0) +# default: 100) # # Since: 4.2 ## @@ -63,7 +63,7 @@ # @max-connections: The maximum number of connections to allow at the # same time, 0 for unlimited. Setting this to 1 also stops the # server from advertising multiple client support (since 5.2; -# default: 0). +# default: 100). # # Errors: # - if the server is already running diff --git a/include/block/nbd.h b/include/block/nbd.h index 5fe14786414..fd5044359dc 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -39,6 +39,13 @@ extern const BlockExportDriver blk_exp_nbd; */ #define NBD_DEFAULT_HANDSHAKE_LIMIT 10 +/* + * NBD_DEFAULT_MAX_CONNECTIONS: Number of client sockets to allow at + * once; must be large enough to allow a MULTI_CONN-aware client like + * nbdcopy to create its typical number of 8-16 sockets. 
+ */ +#define NBD_DEFAULT_MAX_CONNECTIONS 100 + /* Handshake phase structs - this struct is passed on the wire */ typedef struct NBDOption { diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c index d954bec6f1e..bdf2eb50b68 100644 --- a/block/monitor/block-hmp-cmds.c +++ b/block/monitor/block-hmp-cmds.c @@ -402,7 +402,8 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict) goto exit; } -nbd_server_start(addr, NULL, NULL, 0, &local_err); +nbd_server_start(addr, NULL, NULL, NBD_DEFAULT_MAX_CONNECTIONS, + &local_err); qapi_free_SocketAddress(addr); if (local_err != NULL) { goto exit; diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 11f878b6db3..19c57897819 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -170,6 +170,10 @@ void nbd_server_start(SocketAddress *addr, const char *tls_creds, void nbd_server_start_options(NbdServerOptions *arg, Error **errp) { +if (!arg->has_max_connections) { +arg->max_connections = NBD_DEFAULT_MAX_CONNECTIONS; +} + nbd_server_start(arg->addr, arg->tls_creds, arg->tls_authz, arg->max_connections, errp); } @@ -182,6 +186,10 @@ void qmp_nbd_server_start(SocketAddressLegacy *addr, { SocketAddress *addr_flat = socket_address_flatten(addr); +if (!has_max_connections) { +max_connections = NBD_DEFAULT_MAX_CONNECTIONS; +} + nbd_server_start(addr_flat, tls_creds, tls_authz, max_connections, errp); qapi_free_SocketAddress(addr_flat); } -- 2.45.2
[PATCH v4 4/7] nbd/server: CVE-2024-7409: Drop non-negotiating clients
A client that opens a socket but does not negotiate is merely hogging qemu's resources (an open fd and a small amount of memory); and a malicious client that can access the port where NBD is listening can attempt a denial of service attack by intentionally opening and abandoning lots of unfinished connections. The previous patch put a default bound on the number of such ongoing connections, but once that limit is hit, no more clients can connect (including legitimate ones). The solution is to insist that clients complete the handshake within a reasonable time limit, defaulting to 10 seconds. A client that has not successfully completed NBD_OPT_GO by then (including the case where the client didn't know TLS credentials to even reach the point of NBD_OPT_GO) is wasting our time and does not deserve to stay connected. Later patches will allow fine-tuning the limit away from the default value (including disabling it for doing integration testing of the handshake process itself). Note that this patch in isolation actually makes it more likely to see qemu SEGV after nbd-server-stop, as any client socket still connected when the server shuts down will now be closed after 10 seconds rather than at the client's whims. That will be addressed in the next patch. For a demo of this patch in action:

$ qemu-nbd -f raw -r -t -e 10 file &
$ nbdsh --opt-mode -c '
H = list()
for i in range(20):
  print(i)
  H.insert(i, nbd.NBD())
  H[i].set_opt_mode(True)
  H[i].connect_uri("nbd://localhost")
'

where later connections get to start progressing once earlier ones are forcefully dropped for taking too long, rather than hanging. Suggested-by: Daniel P. 
Berrangé Signed-off-by: Eric Blake --- nbd/server.c | 31 ++- nbd/trace-events | 1 + 2 files changed, 31 insertions(+), 1 deletion(-) diff --git a/nbd/server.c b/nbd/server.c index 31b77bf0d4f..a470052d957 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -132,6 +132,7 @@ struct NBDClient { QCryptoTLSCreds *tlscreds; char *tlsauthz; uint32_t handshake_limit; +QEMUTimer *handshake_timer; QIOChannelSocket *sioc; /* The underlying data channel */ QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */ @@ -3186,6 +3187,14 @@ static void nbd_client_receive_next_request(NBDClient *client) } } +static void nbd_handshake_timer_cb(void *opaque) +{ +QIOChannel *ioc = opaque; + +trace_nbd_handshake_timer_cb(); +qio_channel_shutdown(ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); +} + static coroutine_fn void nbd_co_client_start(void *opaque) { NBDClient *client = opaque; @@ -3193,15 +3202,35 @@ static coroutine_fn void nbd_co_client_start(void *opaque) qemu_co_mutex_init(&client->send_lock); -/* TODO - utilize client->handshake_limit */ +/* + * Create a timer to bound the time spent in negotiation. If the + * timer expires, it is likely nbd_negotiate will fail because the + * socket was shutdown. 
+ */ +client->handshake_timer = aio_timer_new(qemu_get_aio_context(), +QEMU_CLOCK_REALTIME, +SCALE_NS, +nbd_handshake_timer_cb, +client->sioc); +if (client->handshake_limit > 0) { +timer_mod(client->handshake_timer, + qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + + client->handshake_limit * NANOSECONDS_PER_SECOND); +} + if (nbd_negotiate(client, &local_err)) { if (local_err) { error_report_err(local_err); } +timer_free(client->handshake_timer); +client->handshake_timer = NULL; client_close(client, false); return; } +timer_free(client->handshake_timer); +client->handshake_timer = NULL; + WITH_QEMU_LOCK_GUARD(&client->lock) { nbd_client_receive_next_request(client); } diff --git a/nbd/trace-events b/nbd/trace-events index 00ae3216a11..cbd0a4ab7e4 100644 --- a/nbd/trace-events +++ b/nbd/trace-events @@ -76,6 +76,7 @@ nbd_co_receive_request_payload_received(uint64_t cookie, uint64_t len) "Payload nbd_co_receive_ext_payload_compliance(uint64_t from, uint64_t len) "client sent non-compliant write without payload flag: from=0x%" PRIx64 ", len=0x%" PRIx64 nbd_co_receive_align_compliance(const char *op, uint64_t from, uint64_t len, uint32_t align) "client sent non-compliant unaligned %s request: from=0x%" PRIx64 ", len=0x%" PRIx64 ", align=0x%" PRIx32 nbd_trip(void) "Reading request" +nbd_handshake_timer_cb(void) "client took too long to negotiate" # client-connection.c nbd_connect_thread_sleep(uint64_t timeout) "timeout %" PRIu64 -- 2.45.2
[PATCH v4 7/7] nbd/server: Allow users to adjust handshake limit in QMP
Although defaulting the handshake limit to 10 seconds was a nice QoI change to weed out intentionally slow clients, it can interfere with integration testing done with manual NBD_OPT commands over 'nbdsh --opt-mode'. Expose a QMP knob to allow the user to alter the timeout away from the default. Signed-off-by: Eric Blake --- I'm not sure if this is 9.1 material. It is a new feature (user-visible QMP addition) implemented after soft freeze; on the other hand, it allows one to recover the behavior that existed prior to plugging the CVE which may be useful in integration testing. --- qapi/block-export.json | 14 -- include/block/nbd.h| 2 +- block/monitor/block-hmp-cmds.c | 2 +- blockdev-nbd.c | 19 ++- 4 files changed, 28 insertions(+), 9 deletions(-) diff --git a/qapi/block-export.json b/qapi/block-export.json index ce33fe378df..b485b380f1a 100644 --- a/qapi/block-export.json +++ b/qapi/block-export.json @@ -30,13 +30,18 @@ # server from advertising multiple client support (since 5.2; # default: 100) # +# @handshake-limit: Time limit, in seconds, at which a client that +# has not completed the negotiation handshake will be disconnected, +# or 0 for no limit (since 9.1; default: 10). +# # Since: 4.2 ## { 'struct': 'NbdServerOptions', 'data': { 'addr': 'SocketAddress', '*tls-creds': 'str', '*tls-authz': 'str', -'*max-connections': 'uint32' } } +'*max-connections': 'uint32', +'*handshake-limit': 'uint32' } } ## # @nbd-server-start: @@ -65,6 +70,10 @@ # server from advertising multiple client support (since 5.2; # default: 100). # +# @handshake-limit: Time limit, in seconds, at which a client that +# has not completed the negotiation handshake will be disconnected, +# or 0 for no limit (since 9.1; default: 10). 
+# # Errors: # - if the server is already running # @@ -74,7 +83,8 @@ 'data': { 'addr': 'SocketAddressLegacy', '*tls-creds': 'str', '*tls-authz': 'str', -'*max-connections': 'uint32' }, +'*max-connections': 'uint32', +'*handshake-limit': 'uint32' }, 'allow-preconfig': true } ## diff --git a/include/block/nbd.h b/include/block/nbd.h index fd5044359dc..22071aea797 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -430,7 +430,7 @@ bool nbd_server_is_running(void); int nbd_server_max_connections(void); void nbd_server_start(SocketAddress *addr, const char *tls_creds, const char *tls_authz, uint32_t max_connections, - Error **errp); + uint32_t handshake_limit, Error **errp); void nbd_server_start_options(NbdServerOptions *arg, Error **errp); /* nbd_read diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c index bdf2eb50b68..a258e029f78 100644 --- a/block/monitor/block-hmp-cmds.c +++ b/block/monitor/block-hmp-cmds.c @@ -403,7 +403,7 @@ void hmp_nbd_server_start(Monitor *mon, const QDict *qdict) } nbd_server_start(addr, NULL, NULL, NBD_DEFAULT_MAX_CONNECTIONS, - &local_err); + NBD_DEFAULT_HANDSHAKE_LIMIT, &local_err); qapi_free_SocketAddress(addr); if (local_err != NULL) { goto exit; diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 4e38ff46747..ad21bd1412d 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -30,6 +30,7 @@ typedef struct NBDServerData { QIONetListener *listener; QCryptoTLSCreds *tlscreds; char *tlsauthz; +uint32_t handshake_limit; uint32_t max_connections; uint32_t connections; QLIST_HEAD(, NBDConn) conns; @@ -84,8 +85,7 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc, nbd_update_server_watch(nbd_server); qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server"); -/* TODO - expose handshake limit as QMP option */ -nbd_client_new(cioc, NBD_DEFAULT_HANDSHAKE_LIMIT, +nbd_client_new(cioc, nbd_server->handshake_limit, nbd_server->tlscreds, nbd_server->tlsauthz, nbd_blockdev_client_closed, conn); } @@ 
-160,7 +160,7 @@ static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp) void nbd_server_start(SocketAddress *addr, const char *tls_creds, const char *tls_authz, uint32_t max_connections, - Error **errp) + uint32_t handshake_limit, Error **errp) { if (nbd_server) { error_setg(errp, "NBD server already running")
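Assuming the QAPI schema above is applied as posted, a QMP client could start a server with an explicit handshake limit like this (the socket path and values are illustrative; per the schema, handshake-limit 0 disables the timeout, e.g. for integration testing):

```json
{ "execute": "nbd-server-start",
  "arguments": {
    "addr": { "type": "unix",
              "data": { "path": "/tmp/nbd.sock" } },
    "max-connections": 100,
    "handshake-limit": 0 } }
```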
[PATCH v4 2/7] nbd/server: Plumb in new args to nbd_client_add()
Upcoming patches to fix a CVE need to track an opaque pointer passed in by the owner of a client object, as well as a request for a time limit on how fast negotiation must complete. Prepare for that by changing the signature of nbd_client_new() and adding an accessor to get at the opaque pointer, although for now the two servers (qemu-nbd.c and blockdev-nbd.c) do not change behavior. Suggested-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Eric Blake --- include/block/nbd.h | 11 ++- blockdev-nbd.c | 6 -- nbd/server.c| 20 +--- qemu-nbd.c | 4 +++- 4 files changed, 34 insertions(+), 7 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 4e7bd6342f9..5fe14786414 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -33,6 +33,12 @@ typedef struct NBDMetaContexts NBDMetaContexts; extern const BlockExportDriver blk_exp_nbd; +/* + * NBD_DEFAULT_HANDSHAKE_LIMIT: Number of seconds in which client must + * succeed at NBD_OPT_GO before being forcefully dropped as too slow. 
+ */ +#define NBD_DEFAULT_HANDSHAKE_LIMIT 10 + /* Handshake phase structs - this struct is passed on the wire */ typedef struct NBDOption { @@ -403,9 +409,12 @@ AioContext *nbd_export_aio_context(NBDExport *exp); NBDExport *nbd_export_find(const char *name); void nbd_client_new(QIOChannelSocket *sioc, +uint32_t handshake_limit, QCryptoTLSCreds *tlscreds, const char *tlsauthz, -void (*close_fn)(NBDClient *, bool)); +void (*close_fn)(NBDClient *, bool), +void *owner); +void *nbd_client_owner(NBDClient *client); void nbd_client_get(NBDClient *client); void nbd_client_put(NBDClient *client); diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 213012435f4..11f878b6db3 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -64,8 +64,10 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc, nbd_update_server_watch(nbd_server); qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server"); -nbd_client_new(cioc, nbd_server->tlscreds, nbd_server->tlsauthz, - nbd_blockdev_client_closed); +/* TODO - expose handshake limit as QMP option */ +nbd_client_new(cioc, NBD_DEFAULT_HANDSHAKE_LIMIT, + nbd_server->tlscreds, nbd_server->tlsauthz, + nbd_blockdev_client_closed, NULL); } static void nbd_update_server_watch(NBDServerData *s) diff --git a/nbd/server.c b/nbd/server.c index ecd9366ba64..31b77bf0d4f 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -124,12 +124,14 @@ struct NBDMetaContexts { struct NBDClient { int refcount; /* atomic */ void (*close_fn)(NBDClient *client, bool negotiated); +void *owner; QemuMutex lock; NBDExport *exp; QCryptoTLSCreds *tlscreds; char *tlsauthz; +uint32_t handshake_limit; QIOChannelSocket *sioc; /* The underlying data channel */ QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */ @@ -3191,6 +3193,7 @@ static coroutine_fn void nbd_co_client_start(void *opaque) qemu_co_mutex_init(&client->send_lock); +/* TODO - utilize client->handshake_limit */ if (nbd_negotiate(client, &local_err)) { if (local_err) { 
error_report_err(local_err); @@ -3205,14 +3208,17 @@ static coroutine_fn void nbd_co_client_start(void *opaque) } /* - * Create a new client listener using the given channel @sioc. + * Create a new client listener using the given channel @sioc and @owner. * Begin servicing it in a coroutine. When the connection closes, call - * @close_fn with an indication of whether the client completed negotiation. + * @close_fn with an indication of whether the client completed negotiation + * within @handshake_limit seconds (0 for unbounded). */ void nbd_client_new(QIOChannelSocket *sioc, +uint32_t handshake_limit, QCryptoTLSCreds *tlscreds, const char *tlsauthz, -void (*close_fn)(NBDClient *, bool)) +void (*close_fn)(NBDClient *, bool), +void *owner) { NBDClient *client; Coroutine *co; @@ -3225,13 +3231,21 @@ void nbd_client_new(QIOChannelSocket *sioc, object_ref(OBJECT(client->tlscreds)); } client->tlsauthz = g_strdup(tlsauthz); +client->handshake_limit = handshake_limit; client->sioc = sioc; qio_channel_set_delay(QIO_CHANNEL(sioc), false); object_ref(OBJECT(client->sioc)); client->ioc = QIO_CHANNEL(sioc); object_ref(OBJECT(client->ioc)); client->close_fn = close_fn; +client->owner = owner; co = qemu_coroutine_create(nbd_co_client_start, client); qemu_coroutine_enter(co); } + +void * +nbd_client_owner(NBDClient *client) +{ +return client->owner; +} d
Re: [PATCH v3 2/2] nbd: Clean up clients more efficiently
On Tue, Aug 06, 2024 at 10:32:54AM GMT, Daniel P. Berrangé wrote: > On Mon, Aug 05, 2024 at 09:21:36PM -0500, Eric Blake wrote: > > Since an NBD server may be long-living, serving clients that > > repeatedly connect and disconnect, it can be more efficient to clean > > up after each client disconnects, rather than storing a list of > > resources to clean up when the server exits. Rewrite the list of > > known clients to be double-linked so that we can get O(1) deletion to > > keep the list pruned to size as clients exit. This in turn requires > > each client to track an opaque pointer of owner information (although > > qemu-nbd doesn't need to refer to it). > > I tend to feel that this needs to be squashed into the previous > patch. The previous patch effectively creates unbounded memory > usage in the NBD server. ie consider a client that connects and > immediately disconnects, not sending any data, in a tight loop. > It will "leak" NBDConn & QIOChanelSocket pointers for each > iteration of the loop, only to be cleaned up when the NBD Server > is shutdown. Hmm. Offline, we observed that qemu's default of unlimited clients may not be the smartest thing - if max-connections is not set, we may want to default it to something like 100 (even when MULTI_CONN is exploited, most clients that take advantage of it will probably only use 8 or 16 parallel sockets; more than that tends to run into other limits rather than providing continual improvements). I can add such a change in default in v4. In contrast, qemu-nbd defaults to max-connections 1 for historical reasons (among others, if you have a non-persistent server, it is really nice that qemu-nbd automatically exits after the first client disconnects, rather than needing to clean up the server yourself), but that's too small for MULTI_CONN to be of any use. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v3 2/2] nbd: Clean up clients more efficiently
On Tue, Aug 06, 2024 at 10:32:54AM GMT, Daniel P. Berrangé wrote: > On Mon, Aug 05, 2024 at 09:21:36PM -0500, Eric Blake wrote: > > Since an NBD server may be long-living, serving clients that > > repeatedly connect and disconnect, it can be more efficient to clean > > up after each client disconnects, rather than storing a list of > > resources to clean up when the server exits. Rewrite the list of > > known clients to be double-linked so that we can get O(1) deletion to > > keep the list pruned to size as clients exit. This in turn requires > > each client to track an opaque pointer of owner information (although > > qemu-nbd doesn't need to refer to it). > > I tend to feel that this needs to be squashed into the previous > patch. The previous patch effectively creates unbounded memory > usage in the NBD server. ie consider a client that connects and > immediately disconnects, not sending any data, in a tight loop. > It will "leak" NBDConn & QIOChanelSocket pointers for each > iteration of the loop, only to be cleaned up when the NBD Server > is shutdown. > > Especially given that we tagged the previous commit with a CVE > and not this commit, people could be misled into backporting > only the former commit leaving them open to the leak. Makes sense. Will respin v4 along those lines; although I plan to refactor slightly: have patch 1 pick up the part of this patch that allows server.c to track a client's owner, then patch 2 be the CVE fix creating the doubly-linked list of QIOSocketChannel wrappers as owners. Also, I realized that nbd_server->connections == 0 is now effectively redundant with QLIST_EMPTY(&nbd_server->conns), so that's another cleanup to squash in. 
> > @@ -103,14 +112,9 @@ static void nbd_server_free(NBDServerData *server) > > */ > > qio_net_listener_disconnect(server->listener); > > object_unref(OBJECT(server->listener)); > > -while (!QSLIST_EMPTY(&server->conns)) { > > -NBDConn *conn = QSLIST_FIRST(&server->conns); > > - > > +QLIST_FOREACH_SAFE(conn, &server->conns, next, tmp) { > > qio_channel_shutdown(QIO_CHANNEL(conn->cioc), > > QIO_CHANNEL_SHUTDOWN_BOTH, > > NULL); > > -object_unref(OBJECT(conn->cioc)); > > -QSLIST_REMOVE_HEAD(&server->conns, next); > > -g_free(conn); > > } > > > > AIO_WAIT_WHILE_UNLOCKED(NULL, server->connections > 0); -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v3 2/2] nbd: Clean up clients more efficiently
On Mon, Aug 05, 2024 at 09:21:36PM GMT, Eric Blake wrote: > Since an NBD server may be long-living, serving clients that > repeatedly connect and disconnect, it can be more efficient to clean > up after each client disconnects, rather than storing a list of > resources to clean up when the server exits. Rewrite the list of > known clients to be double-linked so that we can get O(1) deletion to > keep the list pruned to size as clients exit. This in turn requires > each client to track an opaque pointer of owner information (although > qemu-nbd doesn't need to refer to it). > > Signed-off-by: Eric Blake > --- > include/block/nbd.h | 4 +++- > blockdev-nbd.c | 27 --- > nbd/server.c| 15 --- > qemu-nbd.c | 2 +- > 4 files changed, 32 insertions(+), 16 deletions(-) > > diff --git a/include/block/nbd.h b/include/block/nbd.h > index 4e7bd6342f9..7dce9b9c35b 100644 > --- a/include/block/nbd.h > +++ b/include/block/nbd.h > @@ -405,7 +405,9 @@ NBDExport *nbd_export_find(const char *name); > void nbd_client_new(QIOChannelSocket *sioc, > QCryptoTLSCreds *tlscreds, > const char *tlsauthz, > -void (*close_fn)(NBDClient *, bool)); > +void (*close_fn)(NBDClient *, bool), > +void *owner); > +void *nbd_client_owner(NBDClient *client); > void nbd_client_get(NBDClient *client); > void nbd_client_put(NBDClient *client); > > diff --git a/blockdev-nbd.c b/blockdev-nbd.c > index b8f00f402c6..660f89d881e 100644 > --- a/blockdev-nbd.c > +++ b/blockdev-nbd.c > @@ -23,7 +23,7 @@ > > typedef struct NBDConn { > QIOChannelSocket *cioc; > -QSLIST_ENTRY(NBDConn) next; > +QLIST_ENTRY(NBDConn) next; > } NBDConn; > > typedef struct NBDServerData { > @@ -32,10 +32,11 @@ typedef struct NBDServerData { > char *tlsauthz; > uint32_t max_connections; > uint32_t connections; > -QSLIST_HEAD(, NBDConn) conns; > +QLIST_HEAD(, NBDConn) conns; > } NBDServerData; > > static NBDServerData *nbd_server; > +static uint32_t nbd_cookie; /* Generation count of nbd_server */ Dead line leftover from v2; gone in my tree now. 
> static int qemu_nbd_connections = -1; /* Non-negative if this is qemu-nbd */ > > static void nbd_update_server_watch(NBDServerData *s); > @@ -57,10 +58,16 @@ int nbd_server_max_connections(void) > > static void nbd_blockdev_client_closed(NBDClient *client, bool ignored) > { > +NBDConn *conn = nbd_client_owner(client); > + > assert(qemu_in_main_thread() && nbd_server); > > +object_unref(OBJECT(conn->cioc)); > +QLIST_REMOVE(conn, next); > +g_free(conn); > + > nbd_client_put(client); > -assert(nbd_server->connections > 0); > +assert(nbd_server && nbd_server->connections > 0); The 'nbd_server && ' in this hunk is redundant with a couple lines above. > nbd_server->connections--; > nbd_update_server_watch(nbd_server); > } > @@ -74,12 +81,12 @@ static void nbd_accept(QIONetListener *listener, > QIOChannelSocket *cioc, > nbd_server->connections++; > object_ref(OBJECT(cioc)); > conn->cioc = cioc; > -QSLIST_INSERT_HEAD(&nbd_server->conns, conn, next); > +QLIST_INSERT_HEAD(&nbd_server->conns, conn, next); > nbd_update_server_watch(nbd_server); > > qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server"); > nbd_client_new(cioc, nbd_server->tlscreds, nbd_server->tlsauthz, > - nbd_blockdev_client_closed); > + nbd_blockdev_client_closed, conn); > } > > static void nbd_update_server_watch(NBDServerData *s) > @@ -93,6 +100,8 @@ static void nbd_update_server_watch(NBDServerData *s) > > static void nbd_server_free(NBDServerData *server) > { > +NBDConn *conn, *tmp; > + > if (!server) { > return; > } > @@ -103,14 +112,9 @@ static void nbd_server_free(NBDServerData *server) > */ > qio_net_listener_disconnect(server->listener); > object_unref(OBJECT(server->listener)); > -while (!QSLIST_EMPTY(&server->conns)) { > -NBDConn *conn = QSLIST_FIRST(&server->conns); > - > +QLIST_FOREACH_SAFE(conn, &server->conns, next, tmp) { > qio_channel_shutdown(QIO_CHANNEL(conn->cioc), > QIO_CHANNEL_SHUTDOWN_BOTH, > NULL); > -object_unref(OBJECT(conn->cioc)); > -QSLIST_REMOVE_HEAD(&server->conns, next); > 
-g_free(conn); > } > >
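As an aside for reviewers less familiar with the queue macros: the QSLIST-to-QLIST switch above is exactly what buys the O(1) deletion claimed in the commit message, because each entry carries a back link and removal needs no walk from the head. A minimal stand-alone sketch using the BSD <sys/queue.h> LIST_* macros (the family QEMU's QLIST macros derive from); Conn, conn_new and conn_close are made-up names for illustration, not the patch's API:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Stand-in for the NBDConn node: each entry embeds both a forward
 * link and a back link, so LIST_REMOVE() works in O(1) without
 * traversing the list from the head. */
typedef struct Conn {
    int fd;                     /* placeholder for the QIOChannelSocket */
    LIST_ENTRY(Conn) next;
} Conn;

static LIST_HEAD(, Conn) conns = LIST_HEAD_INITIALIZER(conns);

static Conn *conn_new(int fd)
{
    Conn *c = calloc(1, sizeof(*c));
    c->fd = fd;
    LIST_INSERT_HEAD(&conns, c, next);
    return c;
}

/* O(1) removal of an arbitrary element; SLIST_REMOVE() would have to
 * walk from the head to find the predecessor. */
static void conn_close(Conn *c)
{
    LIST_REMOVE(c, next);
    free(c);
}

static int conn_count(void)
{
    int n = 0;
    Conn *c;
    LIST_FOREACH(c, &conns, next) {
        n++;
    }
    return n;
}
```

This mirrors why the patch must thread an opaque owner pointer through nbd_client_new(): the client has to hand its own list node back to the server so the server can unlink it directly.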
Re: [PATCH v2 1/3] nbd: CVE-XXX: Use cookie to track generation of nbd-server
On Fri, Aug 02, 2024 at 06:00:32PM GMT, Vladimir Sementsov-Ogievskiy wrote: > > -static void nbd_blockdev_client_closed(NBDClient *client, bool ignored) > > +static void nbd_blockdev_client_closed(NBDClient *client, uint32_t cookie, > > + bool ignored) > > { > > nbd_client_put(client); > > -assert(nbd_server->connections > 0); > > -nbd_server->connections--; > > -nbd_update_server_watch(nbd_server); > > +/* Ignore any (late) connection made under a previous server */ > > +if (cookie == nbd_cookie) { > > creating a getter nbd_client_get_cookie(client), and use it instead of > passing together with client, will simplify the patch a lot. [*] > > Hmm.. don't we need some atomic accessors for nbd_cookie? and for > nbs_server->connections.. The function is called from client, which live in > coroutine and maybe in another thread? At least client code do atomic > accesses of client->refcount.. > > > +assert(nbd_server->connections > 0); > > +nbd_server->connections--; > > +nbd_update_server_watch(nbd_server); The client code already jumps through hoops to make sure nbd_blockdev_client_closed() is the last thing called, and that nbd_client_put() is only reached from the main thread (any coroutines fired off a one-shot bh); but I added asserts in v3 to make it clear I'm relying on the synchronous nature of coroutines yielding only at known points and the code executing only in the main thread as the reason why we don't need explicit locking here. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
[PATCH v3 1/2] nbd: CVE-2024-7409: Close stray client sockets at server shutdown
A malicious client can attempt to connect to an NBD server, and then intentionally delay progress in the handshake, including if it does not know the TLS secrets. Although this behavior can be bounded by the max-connections parameter, the QMP nbd-server-start currently defaults to unlimited incoming client connections. Worse, if the client waits to close the socket until after the QMP nbd-server-stop command is executed, qemu will then SEGV when trying to dereference the NULL nbd_server global that is no longer present, which amounts to a denial of service attack. If another NBD server is started before the malicious client disconnects, I cannot rule out additional adverse effects when the old client interferes with the connection count of the new server. For environments without this patch, the CVE can be mitigated by ensuring (such as via a firewall) that only trusted clients can connect to an NBD server. Note that using frameworks like libvirt that ensure that TLS is used and that nbd-server-stop is not executed while any trusted clients are still connected will only help if there is also no possibility for an untrusted client to open a connection but then stall on the NBD handshake. This patch fixes the problem by tracking all client sockets opened while the server is running, and forcefully closing any such sockets remaining without a completed handshake at the time of nbd-server-stop, then waiting until the coroutines servicing those sockets notice the state change. nbd-server-stop may now take slightly longer to execute, but the extra time is independent of client response behaviors, and is generally no worse than the time already taken by the blk_exp_close_all_type() that disconnects all clients that completed handshakes (since that code also has an AIO_WAIT_WHILE_UNLOCKED).
For a long-running server with lots of clients rapidly connecting and disconnecting, the memory used to track all client sockets can result in some memory overhead, but it is not a leak; the next patch will further optimize that by cleaning up memory as clients go away. At any rate, this patch in isolation is sufficient to fix the CVE. This patch relies heavily on the fact that nbd/server.c guarantees that it only calls nbd_blockdev_client_closed() from the main loop (see the assertion in nbd_client_put() and the hoops used in nbd_client_put_nonzero() to achieve that); if we did not have that guarantee, we would also need a mutex protecting our accesses of the list of connections to survive re-entrancy from independent iothreads. Although I did not actually try to test old builds, it looks like this problem has existed since at least commit 862172f45c (v2.12.0, 2017) - even back in that patch to start using a QIONetListener to handle listening on multiple sockets, nbd_server_free() was already unaware that the nbd_blockdev_client_closed callback can be reached later by a client thread that has not completed handshakes (and therefore the client's socket never got added to the list closed in nbd_export_close_all), despite that patch intentionally tearing down the QIONetListener to prevent new clients. 
Reported-by: Alexander Ivanov Fixes: CVE-2024-7409 Signed-off-by: Eric Blake --- blockdev-nbd.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 213012435f4..b8f00f402c6 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -21,12 +21,18 @@ #include "io/channel-socket.h" #include "io/net-listener.h" +typedef struct NBDConn { +QIOChannelSocket *cioc; +QSLIST_ENTRY(NBDConn) next; +} NBDConn; + typedef struct NBDServerData { QIONetListener *listener; QCryptoTLSCreds *tlscreds; char *tlsauthz; uint32_t max_connections; uint32_t connections; +QSLIST_HEAD(, NBDConn) conns; } NBDServerData; static NBDServerData *nbd_server; @@ -51,6 +57,8 @@ int nbd_server_max_connections(void) static void nbd_blockdev_client_closed(NBDClient *client, bool ignored) { +assert(qemu_in_main_thread() && nbd_server); + nbd_client_put(client); assert(nbd_server->connections > 0); nbd_server->connections--; @@ -60,7 +68,13 @@ static void nbd_blockdev_client_closed(NBDClient *client, bool ignored) static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc, gpointer opaque) { +NBDConn *conn = g_new0(NBDConn, 1); + +assert(qemu_in_main_thread() && nbd_server); nbd_server->connections++; +object_ref(OBJECT(cioc)); +conn->cioc = cioc; +QSLIST_INSERT_HEAD(&nbd_server->conns, conn, next); nbd_update_server_watch(nbd_server); qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server"); @@ -83,8 +97,24 @@ static void nbd_server_free(NBDServerData *server) return; } +/* + * Forcefully close the listener socket, and any clients that have + * not yet di
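The "forcefully closing ... then waiting until the coroutines servicing those sockets notice" strategy works because shutting down a socket makes any stalled reader observe EOF instead of blocking forever, so the per-client coroutine is guaranteed to reach its cleanup path. A minimal POSIX sketch of just that property, using a plain socketpair()/shutdown() as a stand-in for qio_channel_shutdown() (not QEMU code; force_disconnect_and_read is an invented name):

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* sv[0] plays the server side of a connection; sv[1] plays a stalled
 * client that never progresses in the handshake.  After the server
 * forces SHUT_RDWR, a read on the client side that would otherwise
 * block forever returns 0 (EOF) immediately, so the code servicing
 * that socket runs to completion on its own. */
static ssize_t force_disconnect_and_read(void)
{
    int sv[2];
    char byte;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    shutdown(sv[0], SHUT_RDWR);   /* server-side forced teardown */
    n = read(sv[1], &byte, 1);    /* stalled peer now sees EOF, not a hang */
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

This is why the added AIO_WAIT_WHILE_UNLOCKED terminates in bounded time regardless of what the malicious client does: the wait only depends on our own coroutines observing the EOF, never on the remote end cooperating.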
[PATCH v3 2/2] nbd: Clean up clients more efficiently
Since an NBD server may be long-living, serving clients that repeatedly connect and disconnect, it can be more efficient to clean up after each client disconnects, rather than storing a list of resources to clean up when the server exits. Rewrite the list of known clients to be double-linked so that we can get O(1) deletion to keep the list pruned to size as clients exit. This in turn requires each client to track an opaque pointer of owner information (although qemu-nbd doesn't need to refer to it). Signed-off-by: Eric Blake --- include/block/nbd.h | 4 +++- blockdev-nbd.c | 27 --- nbd/server.c| 15 --- qemu-nbd.c | 2 +- 4 files changed, 32 insertions(+), 16 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 4e7bd6342f9..7dce9b9c35b 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -405,7 +405,9 @@ NBDExport *nbd_export_find(const char *name); void nbd_client_new(QIOChannelSocket *sioc, QCryptoTLSCreds *tlscreds, const char *tlsauthz, -void (*close_fn)(NBDClient *, bool)); +void (*close_fn)(NBDClient *, bool), +void *owner); +void *nbd_client_owner(NBDClient *client); void nbd_client_get(NBDClient *client); void nbd_client_put(NBDClient *client); diff --git a/blockdev-nbd.c b/blockdev-nbd.c index b8f00f402c6..660f89d881e 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -23,7 +23,7 @@ typedef struct NBDConn { QIOChannelSocket *cioc; -QSLIST_ENTRY(NBDConn) next; +QLIST_ENTRY(NBDConn) next; } NBDConn; typedef struct NBDServerData { @@ -32,10 +32,11 @@ typedef struct NBDServerData { char *tlsauthz; uint32_t max_connections; uint32_t connections; -QSLIST_HEAD(, NBDConn) conns; +QLIST_HEAD(, NBDConn) conns; } NBDServerData; static NBDServerData *nbd_server; +static uint32_t nbd_cookie; /* Generation count of nbd_server */ static int qemu_nbd_connections = -1; /* Non-negative if this is qemu-nbd */ static void nbd_update_server_watch(NBDServerData *s); @@ -57,10 +58,16 @@ int nbd_server_max_connections(void) static void 
nbd_blockdev_client_closed(NBDClient *client, bool ignored) { +NBDConn *conn = nbd_client_owner(client); + assert(qemu_in_main_thread() && nbd_server); +object_unref(OBJECT(conn->cioc)); +QLIST_REMOVE(conn, next); +g_free(conn); + nbd_client_put(client); -assert(nbd_server->connections > 0); +assert(nbd_server && nbd_server->connections > 0); nbd_server->connections--; nbd_update_server_watch(nbd_server); } @@ -74,12 +81,12 @@ static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc, nbd_server->connections++; object_ref(OBJECT(cioc)); conn->cioc = cioc; -QSLIST_INSERT_HEAD(&nbd_server->conns, conn, next); +QLIST_INSERT_HEAD(&nbd_server->conns, conn, next); nbd_update_server_watch(nbd_server); qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server"); nbd_client_new(cioc, nbd_server->tlscreds, nbd_server->tlsauthz, - nbd_blockdev_client_closed); + nbd_blockdev_client_closed, conn); } static void nbd_update_server_watch(NBDServerData *s) @@ -93,6 +100,8 @@ static void nbd_update_server_watch(NBDServerData *s) static void nbd_server_free(NBDServerData *server) { +NBDConn *conn, *tmp; + if (!server) { return; } @@ -103,14 +112,9 @@ static void nbd_server_free(NBDServerData *server) */ qio_net_listener_disconnect(server->listener); object_unref(OBJECT(server->listener)); -while (!QSLIST_EMPTY(&server->conns)) { -NBDConn *conn = QSLIST_FIRST(&server->conns); - +QLIST_FOREACH_SAFE(conn, &server->conns, next, tmp) { qio_channel_shutdown(QIO_CHANNEL(conn->cioc), QIO_CHANNEL_SHUTDOWN_BOTH, NULL); -object_unref(OBJECT(conn->cioc)); -QSLIST_REMOVE_HEAD(&server->conns, next); -g_free(conn); } AIO_WAIT_WHILE_UNLOCKED(NULL, server->connections > 0); @@ -119,6 +123,7 @@ static void nbd_server_free(NBDServerData *server) object_unref(OBJECT(server->tlscreds)); } g_free(server->tlsauthz); +nbd_cookie++; g_free(server); } diff --git a/nbd/server.c b/nbd/server.c index 892797bb111..90f48b42a47 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -124,6 +124,7 @@ struct 
NBDMetaContexts { struct NBDClient { int refcount; /* atomic */ void (*close_fn)(NBDClient *client, bool negotiated); +void *owner; QemuMutex lock; @@ -3205,14 +3206,15 @@ static coroutine_fn void nbd_co_client_start(void *opaque) } /* - * Create a new client listener using the given channel @sioc. + * Create a new client listener using the given channel
[PATCH for-9.1 v3 0/2] NBD CVE-2024-7409
v2 was here: https://lists.gnu.org/archive/html/qemu-devel/2024-08/msg00253.html Since then: - CVE number assigned - drop old patch 1. Instead of tracking nbd_server generation, the code now ensures that nbd_server can't be set to NULL until all clients have disconnected - rewrite to force qio shutdown coupled with AIO_WAIT to ensure all clients actually disconnect quickly (from the server's perspective. A client may still hold its socket open longer, but will eventually see EPIPE or EOF when finally using it) - patch 2 is optional, although I like the notion of a doubly-linked list (where the client has to remember an opaque pointer) over a singly-linked one (where the client is unchanged, but a lot of repeated client connect/disconnect over a long-lived server can chew up memory and slow down the eventual nbd-server-stop) Eric Blake (2): nbd: CVE-2024-7409: Close stray client sockets at server shutdown nbd: Clean up clients more efficiently include/block/nbd.h | 4 +++- blockdev-nbd.c | 39 +-- nbd/server.c| 15 --- qemu-nbd.c | 2 +- 4 files changed, 53 insertions(+), 7 deletions(-) -- 2.45.2
Re: [PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal
On Fri, Aug 02, 2024 at 02:26:04PM GMT, Eric Blake wrote: > I've requested a CVE from Red Hat, and hope to have an assigned number > soon. Meanwhile, we can get review started, to make sure this is > ready to include in 9.1. 'qemu-img info' should never print untrusted > data in a way that might take over a user's terminal. > > There are probably other spots where qemu-img info is printing > untrusted data (such as filenames), where we probably ought to use the > same sanitization tactics as shown here. Identifying those spots > would be a useful part of this review, and may mean a v2 that is even > more extensive in the number of patches. In fact, should we insist that 'qemu-img info XXX' refuse to accept any control characters on any command-line filename, and that it reject any backing file name with control characters as a malformed qcow2 file? For reference, we JUST fixed qemu-img info to change qcow2 files with a claimed backing file of json:... to favor the local file ./json:... over the potentially dangerous user-controlled format/protocol description, so we are _already_ changing how strict qemu-img is for 9.1, and adding one more restriction to avoid escape-sequence madness makes sense. Note that with: touch $'\e[m' && qemu-img info --output=json $'\e[m' we already escape our output, but without --output=json, we are vulnerable to user-controlled ESC leaking through to stdout for more than just the NBD server errors that I addressed in v1 of this patch series. Hence my question on whether v2 of the series should touch more places in the code, and whether doing something like flat-out refusing users stupid enough to embed control characters in their filenames is a safe change this close to release. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
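For what it's worth, the flat-out-refusal idea floated above amounts to a one-pass byte scan per filename. A hypothetical sketch (filename_is_safe is an invented name, not a proposal for the actual qemu-img code); rejecting only bytes below 0x20 plus DEL (0x7F) covers ESC and CR while leaving legitimate multi-byte UTF-8 alone, since every byte of such a sequence is >= 0x80:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policy check: refuse any filename containing ASCII
 * control bytes (0x00-0x1F, 0x7F).  ESC (0x1B) is the interesting
 * one for terminal escape-sequence attacks; continuation bytes of
 * UTF-8 sequences are all >= 0x80 and pass through untouched. */
static bool filename_is_safe(const char *name)
{
    const unsigned char *p;

    for (p = (const unsigned char *)name; *p; p++) {
        if (*p < 0x20 || *p == 0x7f) {
            return false;
        }
    }
    return true;
}
```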
[PATCH 2/2] qemu-img: CVE-XXX Sanitize untrusted output from NBD server
Error messages from an NBD server must be treated as untrusted; a malicious server can inject escape sequences to try to trigger RCE flaws in whatever terminal happens to be running qemu-img. The easiest solution is to sanitize the output with the same code we use to produce sanitized (pseudo-)JSON over QMP. Rich Jones originally pointed this flaw out at: https://lists.libguestfs.org/archives/list/gues...@lists.libguestfs.org/thread/2NXA23G2V3HPWJYAO726PLNBEAAEUJAU/ With this patch, and a malicious server run with nbdkit 1.40 as: $ nbdkit --log=null eval open=' printf \ "EPERM x\\r mess up the output \e[31mmess up the output\e[m mess up" >&2; \ exit 1 ' get_size=' echo 0 ' --run 'qemu-img info "$uri"' we now get: qemu-img: Could not open 'nbd://localhost': Requested export not available server reported: /tmp/nbdkitOZHOKB/open: x\r mess up the output \u001B[31mmess up the output\u001B[m mess up instead of an attempt to hide the name of the Unix socket and forcing the terminal to render part of the text red. Note that I did _not_ sanitize the string being sent through trace-events in trace_nbd_server_error_msg; this is because I assume that our trace engines already treat all strings as untrusted input and apply their own escaping as needed. Reported-by: "Richard W.M. Jones" Signed-off-by: Eric Blake --- If my assumption about allowing raw escape bytes through to trace_ calls is wrong (such as when tracing to stderr), let me know. That's a much bigger audit to determine which trace points, if any, should sanitize data before tracing, and/or change the trace engines to sanitize all strings (with possible knock-on effects if trace output changes unexpectedly for a tool expecting something unsanitized).
--- nbd/client.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/nbd/client.c b/nbd/client.c index c89c7504673..baa20d10d69 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -23,6 +23,7 @@ #include "trace.h" #include "nbd-internal.h" #include "qemu/cutils.h" +#include "qemu/unicode.h" /* Definitions for opaque data types */ @@ -230,7 +231,9 @@ static int nbd_handle_reply_err(QIOChannel *ioc, NBDOptionReply *reply, } if (msg) { -error_append_hint(errp, "server reported: %s\n", msg); +g_autoptr(GString) buf = g_string_sized_new(reply->length); +mod_utf8_sanitize(buf, msg); +error_append_hint(errp, "server reported: %s\n", buf->str); } err: -- 2.45.2
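The observable effect of the fix can be reproduced with a much simpler, ASCII-only stand-in for mod_utf8_sanitize(): copy printable characters through and render everything else as a \u escape, so a server-supplied "\x1b[31m..." can no longer reprogram the terminal. This sketch deliberately ignores the UTF-8 decoding and quote escaping the real helper performs (sanitize_ascii is an invented name):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified terminal sanitizer: printable ASCII passes through,
 * every other byte becomes a \u00XX escape.  'out' must have room
 * for at least 6 * strlen(in) + 1 bytes (worst case: all escapes). */
static void sanitize_ascii(char *out, const char *in)
{
    const unsigned char *p;

    for (p = (const unsigned char *)in; *p; p++) {
        if (*p >= 0x20 && *p < 0x7f) {
            *out++ = *p;
        } else {
            out += sprintf(out, "\\u%04X", *p);
        }
    }
    *out = '\0';
}
```

Running this over the red-text payload from the commit message yields the same "\u001B[31m..." rendering shown in the patched qemu-img output.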
[PATCH 1/2] util: Refactor json-writer's string sanitizer to be public
My next patch needs to convert text from an untrusted input into an output representation that is suitable for display on a terminal; that conversion is useful to more than just the json-writer. The text should normally be UTF-8, but blindly allowing all Unicode code points (including ASCII ESC) through to a terminal risks remote-code-execution attacks on some terminals. Extract the existing body of json-writer's quoted_str into a new helper routine mod_utf8_sanitize, and generalize it to also work on data that is length-limited rather than NUL-terminated. [I was actually surprised that glib does not have such a sanitizer already - Google turns up lots of examples of rolling your own string sanitizer.] If desired in the future, we may want to tweak whether the output is guaranteed to be ASCII (using lots of \u escape sequences, including surrogate pairs for code points outside the BMP) or if we are okay passing printable Unicode through (we still need to escape control characters). But for now, I went for minimal code churn, including the fact that the resulting function allows a non-UTF-8 2-byte synonym for U+0000.
Signed-off-by: Eric Blake --- include/qemu/unicode.h | 3 ++ qobject/json-writer.c | 47 +-- util/unicode.c | 84 ++ 3 files changed, 88 insertions(+), 46 deletions(-) diff --git a/include/qemu/unicode.h b/include/qemu/unicode.h index 7fa10b8e604..e1013b29f72 100644 --- a/include/qemu/unicode.h +++ b/include/qemu/unicode.h @@ -4,4 +4,7 @@ int mod_utf8_codepoint(const char *s, size_t n, char **end); ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint); +void mod_utf8_sanitize(GString *buf, const char *str); +void mod_utf8_sanitize_len(GString *buf, const char *str, ssize_t len); + #endif diff --git a/qobject/json-writer.c b/qobject/json-writer.c index 309a31d57a9..5c14574efee 100644 --- a/qobject/json-writer.c +++ b/qobject/json-writer.c @@ -104,53 +104,8 @@ static void pretty_newline_or_space(JSONWriter *writer) static void quoted_str(JSONWriter *writer, const char *str) { -const char *ptr; -char *end; -int cp; - g_string_append_c(writer->contents, '"'); - -for (ptr = str; *ptr; ptr = end) { -cp = mod_utf8_codepoint(ptr, 6, &end); -switch (cp) { -case '\"': -g_string_append(writer->contents, "\\\""); -break; -case '\\': -g_string_append(writer->contents, "\\\\"); -break; -case '\b': -g_string_append(writer->contents, "\\b"); -break; -case '\f': -g_string_append(writer->contents, "\\f"); -break; -case '\n': -g_string_append(writer->contents, "\\n"); -break; -case '\r': -g_string_append(writer->contents, "\\r"); -break; -case '\t': -g_string_append(writer->contents, "\\t"); -break; -default: -if (cp < 0) { -cp = 0xFFFD; /* replacement character */ -} -if (cp > 0xFFFF) { -/* beyond BMP; need a surrogate pair */ -g_string_append_printf(writer->contents, "\\u%04X\\u%04X", - 0xD800 + ((cp - 0x10000) >> 10), - 0xDC00 + ((cp - 0x10000) & 0x3FF)); -} else if (cp < 0x20 || cp >= 0x7F) { -g_string_append_printf(writer->contents, "\\u%04X", cp); -} else { -g_string_append_c(writer->contents, cp); -} -} -}; - +mod_utf8_sanitize(writer->contents, str);
g_string_append_c(writer->contents, '"'); } diff --git a/util/unicode.c b/util/unicode.c index 8580bc598b3..a419ed4de35 100644 --- a/util/unicode.c +++ b/util/unicode.c @@ -154,3 +154,87 @@ ssize_t mod_utf8_encode(char buf[], size_t bufsz, int codepoint) buf[4] = 0; return 4; } + +/** + * mod_utf8_sanitize: + * @buf: Destination buffer + * @str: Modified UTF-8 string to sanitize + * + * Append the contents of the NUL-terminated Modified UTF-8 string + * @str into @buf, with escape sequences used for non-printable ASCII + * and for quote characters. The result is therefore safe for output + * to a terminal. + * + * Modified UTF-8 is exactly like UTF-8, except U+0000 is encoded as + * "\xC0\x80". + */ +void mod_utf8_sanitize(GString *buf, const char *str) +{ +mod_utf8_sanitize_len(buf, str, -1); +} + +/** + * mod_utf8_sanitize_len: + * @buf: Destination buffer + * @str: Modified UTF-8 string to sanitize + * @len: Length of @str, or negative to stop at NUL terminator + * + * Append
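For reviewers checking the surrogate-pair arithmetic being moved verbatim out of json-writer.c: subtracting 0x10000 from a beyond-BMP code point leaves a 20-bit value whose high 10 bits select the high surrogate (based at 0xD800) and whose low 10 bits select the low surrogate (based at 0xDC00). A tiny self-contained check of that math (to_surrogates is an illustrative name only, not part of the patch):

```c
#include <assert.h>

/* Split a supplementary-plane code point (0x10000..0x10FFFF) into a
 * UTF-16 surrogate pair, exactly as the \u%04X\u%04X branch of the
 * sanitizer does for code points where cp > 0xFFFF. */
static void to_surrogates(int cp, unsigned *hi, unsigned *lo)
{
    *hi = 0xD800 + ((cp - 0x10000) >> 10);
    *lo = 0xDC00 + ((cp - 0x10000) & 0x3FF);
}
```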
[PATCH for-9.1 0/2] NBD: don't print raw server error text to terminal
I've requested a CVE from Red Hat, and hope to have an assigned number soon. Meanwhile, we can get review started, to make sure this is ready to include in 9.1. 'qemu-img info' should never print untrusted data in a way that might take over a user's terminal. There are probably other spots where qemu-img info is printing untrusted data (such as filenames), where we probably ought to use the same sanitization tactics as shown here. Identifying those spots would be a useful part of this review, and may mean a v2 that is even more extensive in the number of patches. In patch 1, I created mod_utf8_sanitize_len(), with the intent that I could sanitize just a prefix of a string without having to copy it into a NUL-terminated buffer. I didn't end up needing it in my current version of patch 2 (since the code was already copying to a NUL-terminated buffer for trace purposes), but we may find uses for it; in fact, it raises the question of whether any of our trace_ calls need to sanitize untrusted data (or whether we can rely on ALL trace engines to be doing that on our behalf, already). Eric Blake (2): util: Refactor json-writer's string sanitizer to be public qemu-img: CVE-XXX Sanitize untrusted output from NBD server include/qemu/unicode.h | 3 ++ nbd/client.c | 5 ++- qobject/json-writer.c | 47 +-- util/unicode.c | 84 ++ 4 files changed, 92 insertions(+), 47 deletions(-) -- 2.45.2
Re: [PATCH v2 1/3] nbd: CVE-XXX: Use cookie to track generation of nbd-server
On Fri, Aug 02, 2024 at 06:00:32PM GMT, Vladimir Sementsov-Ogievskiy wrote: > On 02.08.24 04:32, Eric Blake wrote: > [..] > > > -static void nbd_blockdev_client_closed(NBDClient *client, bool ignored) > > +static void nbd_blockdev_client_closed(NBDClient *client, uint32_t cookie, > > + bool ignored) > > { > > nbd_client_put(client); > > -assert(nbd_server->connections > 0); > > -nbd_server->connections--; > > -nbd_update_server_watch(nbd_server); > > +/* Ignore any (late) connection made under a previous server */ > > +if (cookie == nbd_cookie) { > > creating a getter nbd_client_get_cookie(client), and use it instead of > passing together with client, will simplify the patch a lot. [*] I may be able to avoid the cookie altogether if I can add an AIO_WAIT_WHILE(, nbd_server->connections > 0) after forcefully closing all of the client sockets (nbd_client_new _should_ progress pretty rapidly towards eventually calling nbd_blockdev_client_closed once the socket is closed) - but that still requires patch 2 to keep a list of open clients. > > Hmm.. don't we need some atomic accessors for nbd_cookie? and for > nbs_server->connections.. The function is called from client, which live in > coroutine and maybe in another thread? At least client code do atomic > accesses of client->refcount.. > > > +assert(nbd_server->connections > 0); > > +nbd_server->connections--; > > +nbd_update_server_watch(nbd_server); > > +} > > } > > > > [..] > > > @@ -1621,7 +1622,7 @@ static void client_close(NBDClient *client, bool > > negotiated) > > > > /* Also tell the client, so that they release their reference. */ > > if (client->close_fn) { > > -client->close_fn(client, negotiated); > > +client->close_fn(client, client->close_cookie, negotiated); > > [*] passing client->close_cokkie together with client itself looks like we > lack a getter for .close_cookie Whether the cookie be a uint32_t or the void* server object itself, it is opaque to the client, but the server needs to track something. 
> > > } > > } > > > > [..] > > > Hmm, instead of cookies and additional NBDConn objects in the next patch, > could we simply have a list of connected NBDClient objects in NBDServer and > link to NBDServer in NBDClient? (Ok we actually don't have NBDServer, but > NBDServerData in qemu, and several global variables in qemu-nbd, so some > refactoring is needed, to put common state to NBDServer, and add clients list > to it) > > This way, in nbd_server_free we'll just call client_close() in a loop. And in > client_close we'll have nbd_server_client_detach(client->server, client), > instead of client->close_fn(...). And server is freed only after all clients > are closed. And client never try to detach from another server. > > This way, we also may implement several NBD servers working simultaneously if > we want. Yes, we do eventually want to get to the point of being able to open parallel NBD servers on different ports simultaneously, at which point having a client remember which server it is associated makes sense (so at a bare minimum, pass in a void* instead of a uint32_t to nbd_client_new). And given that we can have an NBD server with more than one export, and with exports running in different threads (if multithread io is enabled), I probably also need to add missing locking to protect nbd_server (whether or not it stays global or we eventually reach the point of having parallel servers on separate ports). Looks like I have work cut out for me before posting a v3. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v2 1/3] nbd: CVE-XXX: Use cookie to track generation of nbd-server
On Thu, Aug 01, 2024 at 08:32:06PM GMT, Eric Blake wrote: > As part of the QMP command nbd-server-start, the blockdev code was > creating a single global nbd_server object, and telling the qio code > to accept one or more client connections to the exposed listener > socket. But even though we tear down the listener socket during a > subsequent QMP nbd-server-stop, the qio code has handed off to a > coroutine that may be executing asynchronously with the server > shutdown, such that a client that connects to the socket before > nbd-server-stop but delays disconnect or completion of the NBD > handshake until after the followup QMP command can result in the > nbd_blockdev_client_closed() callback running after nbd_server has > already been torn down, causing a SEGV. Worse, if a second nbd_server > object is created (possibly on a different socket address), a late > client resolution from the first server can subtly interfere with the > second server. This can be abused by a malicious client that does not > know TLS secrets to cause qemu to SEGV after a legitimate client has > concluded storage migration to a destination qemu, which can be turned > into a denial of service attack even when qemu is set up to require > only TLS clients. Now assigned CVE-2024-7409; I'll make sure that is mentioned in the commit message, whether a v2 is needed or whether this gets a positive review as-is modulo the reference. I have not (yet?) determined how long the bug has existed, but note that the code in question appears unchanged since at least qemu 6.0 (from 2021), and probably predates that, so it is long-standing. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
[PATCH v2 for-9.1 0/3] Avoid NBD crash on nbd-server-stop
v1 was here: https://lists.gnu.org/archive/html/qemu-devel/2024-06/msg01609.html Since then, I've applied with Red Hat to get a CVE assigned (a client not using TLS should never be able to cause the destination qemu to crash during live storage migration), and will update to include that number once it is assigned. But since the issue has already been on list, I see no need to further embargo this patch attempt. I've also concluded from my audit that Alexander's initial hunch was right: there's nothing simple we can do to prevent qio coroutines from reaching our callback into nbd_blockdev_client_closed() after a QMP nbd-server-stop (I could not find a graceful way to close the qio channel and then forcefully wait for the coroutine to finish), so that function has to be prepared to handle late clients, but the fix in patch 1 is a bit more involved than Alexander's attempt since we must also not corrupt a subsequent nbd-server-start. Patch 2 is insufficient on its own to prevent the problems, but reduces the amount of time that a client can hang on to server resources after nbd-server-stop (the client will see EPIPE rather than being able to carry on a prolonged NBD_OPT_* conversation). [I'm also aware of some Coverity analysis pointing to potential race conditions in block/nbd.c; if those need fixes, I hope to also post patches for those in time for inclusion in the same pull request that picks up this series] Eric Blake (3): nbd: CVE-XXX: Use cookie to track generation of nbd-server nbd: CVE-XXX: Close stray client sockets at server shutdown nbd: Minor style fixes include/block/nbd.h | 3 ++- blockdev-nbd.c | 39 ++- nbd/server.c| 14 +- qemu-nbd.c | 8 +--- 4 files changed, 50 insertions(+), 14 deletions(-) -- 2.45.2
[PATCH v2 2/3] nbd: CVE-XXX: Close stray client sockets at server shutdown
A malicious client can attempt to connect to an NBD server, and then
intentionally not progress in the handshake.  This can form a resource
leak for as long as the client wants, when the client does not catch
our attention by failing to provide TLS credentials; although it can
be bounded by max-connections per NBD server that does not use the
default of unlimited.  Better is to forcefully disconnect any
remaining client sockets when the NBD server is shut down.

While less severe than the issue fixed in the previous patch, this can
still be considered defense against a potential denial of service
attack, so it is still categorized under the same CVE.

Signed-off-by: Eric Blake
---

I do not know if I need to worry about multi-threaded access (is it
possible that more than one client trying to connect simultaneously
means that I need to access nbd_server->conns atomically)?

 blockdev-nbd.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 1ddcbd7b247..9bbd86ebc31 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -21,12 +21,18 @@
 #include "io/channel-socket.h"
 #include "io/net-listener.h"

+typedef struct NBDConn {
+    QIOChannelSocket *cioc;
+    QSLIST_ENTRY(NBDConn) next;
+} NBDConn;
+
 typedef struct NBDServerData {
     QIONetListener *listener;
     QCryptoTLSCreds *tlscreds;
     char *tlsauthz;
     uint32_t max_connections;
     uint32_t connections;
+    QSLIST_HEAD(, NBDConn) conns;
 } NBDServerData;

 static NBDServerData *nbd_server;
@@ -65,8 +71,13 @@ static void nbd_blockdev_client_closed(NBDClient *client, uint32_t cookie,
 static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,
                        gpointer opaque)
 {
+    NBDConn *conn = g_new0(NBDConn, 1);
+
     assert(nbd_server);
     nbd_server->connections++;
+    object_ref(OBJECT(cioc));
+    conn->cioc = cioc;
+    QSLIST_INSERT_HEAD(&nbd_server->conns, conn, next);
     nbd_update_server_watch(nbd_server);

     qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server");
@@ -89,8 +100,21 @@ static void nbd_server_free(NBDServerData *server)
         return;
     }

+    /*
+     * Forcefully close the listener socket, and any clients that have
+     * not yet disconnected on their own.
+     */
     qio_net_listener_disconnect(server->listener);
     object_unref(OBJECT(server->listener));
+    while (!QSLIST_EMPTY(&server->conns)) {
+        NBDConn *conn = QSLIST_FIRST(&server->conns);
+
+        qio_channel_close(QIO_CHANNEL(conn->cioc), NULL);
+        object_unref(OBJECT(conn->cioc));
+        QSLIST_REMOVE_HEAD(&server->conns, next);
+        g_free(conn);
+    }
     if (server->tlscreds) {
         object_unref(OBJECT(server->tlscreds));
     }
-- 
2.45.2
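The bookkeeping pattern in the patch above — record every accepted connection in a singly linked list so that server teardown can force-close any straggler that never finished the handshake — can be illustrated outside of QEMU. The sketch below is a hypothetical stand-in in plain C (no glib or qio): `Conn`, `Server`, `server_track()`, and `server_drain()` are invented names that mirror the roles of `NBDConn`, `NBDServerData`, `nbd_accept()`, and the drain loop added to `nbd_server_free()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented stand-in for NBDConn: one node per accepted socket. */
typedef struct Conn {
    int fd;                 /* placeholder for a QIOChannelSocket */
    struct Conn *next;
} Conn;

/* Invented stand-in for NBDServerData. */
typedef struct Server {
    Conn *conns;            /* head of singly linked list */
    unsigned connections;
} Server;

/* Mirror of nbd_accept(): remember the connection before handing it off
 * to the (possibly never-completing) handshake. */
static void server_track(Server *s, int fd)
{
    Conn *c = calloc(1, sizeof(*c));
    c->fd = fd;
    c->next = s->conns;
    s->conns = c;
    s->connections++;
}

/* Mirror of the loop added to nbd_server_free(): drain the list,
 * force-closing every socket still lingering; returns how many
 * connections were closed.  Real code would also close the socket and
 * drop its reference (qio_channel_close + object_unref). */
static unsigned server_drain(Server *s)
{
    unsigned closed = 0;
    while (s->conns) {
        Conn *c = s->conns;
        s->conns = c->next;
        free(c);
        closed++;
    }
    s->connections = 0;
    return closed;
}
```

The key property is that teardown no longer depends on clients cooperating: every socket the server ever accepted is reachable from the server object itself.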
[PATCH v2 3/3] nbd: Minor style fixes
Touch up a comment with the wrong type name, and an over-long line,
both noticed while working on the previous patches.

Signed-off-by: Eric Blake
---
 nbd/server.c | 2 +-
 qemu-nbd.c   | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 7c37d9753f0..3debc416dd0 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1973,7 +1973,7 @@ static void nbd_export_request_shutdown(BlockExport *blk_exp)
     blk_exp_ref(&exp->common);

     /*
-     * TODO: Should we expand QMP NbdServerRemoveNode enum to allow a
+     * TODO: Should we expand QMP BlockExportRemoveMode enum to allow a
      * close mode that stops advertising the export to new clients but
      * still permits existing clients to run to completion? Because of
      * that possibility, nbd_export_close() can be called more than
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 3ad50eec18e..c0bd16217cd 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -589,7 +589,8 @@ int main(int argc, char **argv)
     pthread_t client_thread;
     const char *fmt = NULL;
     Error *local_err = NULL;
-    BlockdevDetectZeroesOptions detect_zeroes = BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
+    BlockdevDetectZeroesOptions detect_zeroes =
+        BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF;
     QDict *options = NULL;
     const char *export_name = NULL; /* defaults to "" later for server mode */
     const char *export_description = NULL;
-- 
2.45.2
[PATCH v2 1/3] nbd: CVE-XXX: Use cookie to track generation of nbd-server
As part of the QMP command nbd-server-start, the blockdev code was
creating a single global nbd_server object, and telling the qio code
to accept one or more client connections to the exposed listener
socket.  But even though we tear down the listener socket during a
subsequent QMP nbd-server-stop, the qio code has handed off to a
coroutine that may be executing asynchronously with the server
shutdown, such that a client that connects to the socket before
nbd-server-stop but delays disconnect or completion of the NBD
handshake until after the followup QMP command can result in the
nbd_blockdev_client_closed() callback running after nbd_server has
already been torn down, causing a SEGV.

Worse, if a second nbd_server object is created (possibly on a
different socket address), a late client resolution from the first
server can subtly interfere with the second server.

This can be abused by a malicious client that does not know TLS
secrets to cause qemu to SEGV after a legitimate client has concluded
storage migration to a destination qemu, which can be turned into a
denial of service attack even when qemu is set up to require only TLS
clients.

For environments without this patch, the CVE can be mitigated by
ensuring (such as via a firewall) that only trusted clients can
connect to an NBD server; using frameworks like libvirt that ensure
that nbd-server-stop is not executed while any trusted clients are
still connected will only help if there is also no possibility for an
untrusted client to open a connection but then stall on the NBD
handshake.

Fix this by passing a cookie through to each client connection
(whether or not that client actually knows the TLS details to
successfully negotiate); then increment the cookie every time a server
is torn down so that we can recognize any late client actions
lingering from an old server.
This patch does not address the fact that client sockets can be left
open indefinitely after the server is torn down (possibly creating a
resource leak, if a malicious client intentionally leaves lots of open
sockets paused partway through NBD negotiation); the next patch will
add some code to forcefully close any lingering clients as soon as
possible when the server is torn down.

Reported-by: Alexander Ivanov
Signed-off-by: Eric Blake
---
 include/block/nbd.h | 3 ++-
 blockdev-nbd.c      | 17 -
 nbd/server.c        | 12
 qemu-nbd.c          | 5 +++--
 4 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 4e7bd6342f9..9c43bcf8a26 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -405,7 +405,8 @@ NBDExport *nbd_export_find(const char *name);
 void nbd_client_new(QIOChannelSocket *sioc,
                     QCryptoTLSCreds *tlscreds,
                     const char *tlsauthz,
-                    void (*close_fn)(NBDClient *, bool));
+                    void (*close_fn)(NBDClient *, uint32_t, bool),
+                    uint32_t cookie);
 void nbd_client_get(NBDClient *client);
 void nbd_client_put(NBDClient *client);

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 213012435f4..1ddcbd7b247 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -30,6 +30,7 @@ typedef struct NBDServerData {
 } NBDServerData;

 static NBDServerData *nbd_server;
+static uint32_t nbd_cookie; /* Generation count of nbd_server */
 static int qemu_nbd_connections = -1; /* Non-negative if this is qemu-nbd */

 static void nbd_update_server_watch(NBDServerData *s);
@@ -49,23 +50,28 @@ int nbd_server_max_connections(void)
     return nbd_server ? nbd_server->max_connections : qemu_nbd_connections;
 }

-static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
+static void nbd_blockdev_client_closed(NBDClient *client, uint32_t cookie,
+                                       bool ignored)
 {
     nbd_client_put(client);
-    assert(nbd_server->connections > 0);
-    nbd_server->connections--;
-    nbd_update_server_watch(nbd_server);
+    /* Ignore any (late) connection made under a previous server */
+    if (cookie == nbd_cookie) {
+        assert(nbd_server->connections > 0);
+        nbd_server->connections--;
+        nbd_update_server_watch(nbd_server);
+    }
 }

 static void nbd_accept(QIONetListener *listener, QIOChannelSocket *cioc,
                        gpointer opaque)
 {
+    assert(nbd_server);
     nbd_server->connections++;
     nbd_update_server_watch(nbd_server);

     qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server");
     nbd_client_new(cioc,
                    nbd_server->tlscreds, nbd_server->tlsauthz,
-                   nbd_blockdev_client_closed);
+                   nbd_blockdev_client_closed, nbd_cookie);
 }

 static void nbd_update_server_watch(NBDServerData *s)
@@ -89,6 +95,7 @@ static void nbd_server_free(NBDServerData *server)
         object_unref(OBJECT(server->tlscreds));
     }
     g_free(server->tlsauthz);
+    nbd_cooki
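The generation-cookie idea from the commit message above can be sketched in isolation. This is an illustrative model, not QEMU code: `server_start()`, `server_stop()`, `client_opened()`, and `client_closed()` are invented names, and the only point being demonstrated is that a callback carrying a stale cookie must not touch the current server's state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bumped on every server teardown, analogous to nbd_cookie. */
static uint32_t current_cookie;
static unsigned connections;

static void server_start(void)
{
    connections = 0;
}

static void server_stop(void)
{
    current_cookie++;   /* invalidate all outstanding client callbacks */
}

static void client_opened(void)
{
    connections++;
}

/* Analogous to nbd_blockdev_client_closed(): each client remembers the
 * cookie of the server generation it connected under.  Returns true if
 * the close was accounted against the live server, false if it was a
 * late callback from an already torn-down server. */
static bool client_closed(uint32_t cookie)
{
    if (cookie != current_cookie) {
        return false;   /* leftover from an old server: ignore */
    }
    assert(connections > 0);
    connections--;
    return true;
}
```

Without the cookie check, the late `client_closed()` from the old generation would decrement (or assert on) a counter belonging to a different server instance, which is exactly the SEGV scenario in the commit message.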
Re: [PATCH] iotests/024: exclude 'backing file format' field from the output
On Tue, Jul 30, 2024 at 12:47:01PM GMT, Andrey Drobyshev wrote: > Apparently 'qemu-img info' doesn't report the backing file format field > for qed (as it does for qcow2): > > $ qemu-img create -f qed base.qed 1M && qemu-img create -f qed -b base.qed -F > qed top.qed 1M > $ qemu-img create -f qcow2 base.qcow2 1M && qemu-img create -f qcow2 -b > base.qcow2 -F qcow2 top.qcow2 1M > $ qemu-img info top.qed | grep 'backing file format' > $ qemu-img info top.qcow2 | grep 'backing file format' > backing file format: qcow2 > > This leads to the 024 test failure with -qed. Let's just filter the > field out and exclude it from the output. > > This is a fixup for the commit f93e65ee51 ("iotests/{024, 271}: add > testcases for qemu-img rebase"). > > Found-by: Thomas Huth > Signed-off-by: Andrey Drobyshev > --- > tests/qemu-iotests/024 | 2 +- > tests/qemu-iotests/024.out | 1 - > 2 files changed, 1 insertion(+), 2 deletions(-) Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v3] scripts/qcow2-to-stdout.py: Add script to write qcow2 images to stdout
t; +break > +raise err > + > + > +# write_qcow2_content() expects a raw input file. If we have a different > +# format we can use qemu-storage-daemon to make it appear as raw. > +@contextmanager > +def get_input_as_raw_file(input_file, input_format): > +if input_format == "raw": > +yield input_file > +return > +try: > +temp_dir = tempfile.mkdtemp() > +pid_file = os.path.join(temp_dir, "pid") > +raw_file = os.path.join(temp_dir, "raw") > +open(raw_file, "wb").close() > +ret = subprocess.run( > +[ > +QEMU_STORAGE_DAEMON, > +"--daemonize", > +"--pidfile", pid_file, > +"--blockdev", > f"driver=file,node-name=file0,driver=file,filename={input_file},read-only=on", > +"--blockdev", > f"driver={input_format},node-name=disk0,file=file0,read-only=on", > +"--export", > f"type=fuse,id=export0,node-name=disk0,mountpoint={raw_file},writable=off", > +], > +capture_output=True, > +) Does q-s-d exposing an image as raw still support lseek(SEEK_HOLE) efficiently? > +parser.add_argument( > +"-v", > +dest="qcow2_version", > +metavar="qcow2_version", > +help=f"qcow2 version (default: {QCOW2_DEFAULT_VERSION})", > +default=QCOW2_DEFAULT_VERSION, > +type=int, > +choices=[2, 3], Is it really worth trying to create v2 images? These days, v3 images are hands down better, and we should be encouraging people to upgrade their tools to v3 all around, rather than making it easy to still consume v2 images. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
On Fri, Jun 28, 2024 at 11:58:37AM GMT, Alexander Ivanov wrote:
> Ping?
>
> On 6/7/24 17:00, Alexander Ivanov wrote:
> >  static void nbd_blockdev_client_closed(NBDClient *client, bool ignored)
> >  {
> >      nbd_client_put(client);
> > +    if (nbd_server == NULL) {
> > +        return;
> > +    }
> >      assert(nbd_server->connections > 0);
> >      nbd_server->connections--;
> >      nbd_update_server_watch(nbd_server);

I've spent more time looking at this, and I'm stumped.  Maybe Dan can
help out.  If I understand correctly, here's the race situation we
have:

qemu main loop:
  QMP nbd-server-start
  - calls nbd_server_start
  - assigns nbd_server = g_new0()
  - calls qio_net_listener_open_sync
  - calls nbd_server_update_watch
  - calls qio_net_listener_set_client_func(, nbd_accept, )
  - waiting for client is now in coroutine
  - returns control to QMP main loop

client:
  opens TCP socket

coroutine:
  - qio notices incoming connection
  - calls nbd_accept callback
  - starts NBD handshake nbd_client_new(, nbd_blockdev_client_closed)

qemu main loop:
  QMP nbd-server-stop
  - calls nbd_server_stop
  - calls nbd_server_free
  - calls qio_net_listener_disconnect()
  - marks qio channel as useless
  - sets nbd_server = NULL

coroutine:
  - qio sees channel is useless, disconnects

client:
  deals gracefully with dead connection

coroutine:
  - calls nbd_blockdev_client_closed callback
  - segfault since nbd_server->connections derefs NULL

Thread 1 "qemu-kvm" received signal SIGSEGV, Segmentation fault.
0x5600af59 in nbd_blockdev_client_closed (client=0x5810dfc0, ignored=false)
    at ../blockdev-nbd.c:56
56          assert(nbd_server->connections > 0);
(gdb) p nbd_server
$1 = (NBDServerData *) 0x0
(gdb) bt
#0  0x5600af59 in nbd_blockdev_client_closed (client=0x5810dfc0, ignored=false) at ../blockdev-nbd.c:56
#1  0x55ff72f9 in client_close (client=0x5810dfc0, negotiated=false) at ../nbd/server.c:1624
#2  0x55ffbc49 in nbd_co_client_start (opaque=0x5810dfc0) at ../nbd/server.c:3198
#3  0x56220629 in coroutine_trampoline (i0=1488434896, i1=21845)

Worse, the race could go another way: if another QMP nbd-server-start
is called fast enough before the coroutine finishes the nbd_accept
code from the first connection, it could attempt to modify the second
nbd_server object, which may be associated with a completely different
socket.

As far as I can tell, the problem stems from the fact that the attempt
to establish the connection with the client continues in a background
coroutine despite our efforts to honor the QMP nbd-server-stop
command.  But I'm not sure of the proper way to inform qio to abandon
all incoming traffic to a given net listener.  Or maybe I should just
change the semantics of QMP nbd-server-stop to fail if there are known
connections from nbd_accept() - but I still don't want that to wait
indefinitely, so I still want some way to tell the qio channel that
the server is going away and to tear down sockets as soon as possible.
As a stopgap, something like this might prevent the SEGV:

diff --git i/blockdev-nbd.c w/blockdev-nbd.c
index 213012435f4..2f5ea094407 100644
--- i/blockdev-nbd.c
+++ w/blockdev-nbd.c
@@ -277,6 +277,12 @@ void qmp_nbd_server_stop(Error **errp)

     blk_exp_close_all_type(BLOCK_EXPORT_TYPE_NBD);

+    if (qio_net_listener_is_connected(nbd_server->listener) ||
+        nbd_server->connections > 0) {
+        error_setg(errp, "NBD server still has connected clients");
+        return;
+    }
+
     nbd_server_free(nbd_server);
     nbd_server = NULL;
 }

but it's not as graceful as I'd like (it would be nicer to have the
nbd-server-stop command wait until it knows all connections are
cleaned, instead of failing up front).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org
Re: query dirty areas according to bitmap via QMP or qemu-nbd
On Mon, Jul 29, 2024 at 03:51:17PM GMT, Fiona Ebner wrote: > > In particular, tests/qemu-img-bitmaps gives the magic decoder ring: > > > > | # x-dirty-bitmap is a hack for reading bitmaps; it abuses block status to > > | # report "data":false for portions of the bitmap which are set > > | IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket" > > | nbd_server_start_unix_socket -r -f qcow2 \ > > | -B b0 -B b1 -B b2 -B b3 "$TEST_IMG" > > | $QEMU_IMG map --output=json --image-opts \ > > | "$IMG,x-dirty-bitmap=qemu:dirty-bitmap:b0" | _filter_qemu_img_map > > > > meaning the map output includes "data":false for the dirty portions > > and "data":true for the unchanged portions recorded in bitmap b0 as > > read from the JSON map output. > > > > Oh, I didn't think about checking the NBD block driver for such an > option :) And thank you for all the explanations! Glad it helped. The 'x-' in the name is a giveaway that the command is not intended for public use, but for development, it makes iotests easier to write, so I see no reason to drop it. > > >> > >> If it is not currently possible, would upstream be interested too in the > >> feature, either for QMP or qemu-nbd? > > > > Improving qemu-img to get at the information without quite the hacky > > post-processing deciphering would indeed be a useful patch, but it has > > never risen to the level of enough of an itch for me to write it > > myself (especially since 'nbdinfo --map's output works just as well). > > > > I might just go with the above for now, but who knows if I'll get around > to this some day. Three approaches that come to mind are: > > 1. qemu-img bitmap --dump > > Other bitmap actions won't be supported in combination with NBD. This seems like an independently useful option (for any image with embedded bitmaps, get at the contents of that bitmap). I'm not sure how much code you could reuse from the 'qemu-img map', or if it is better to just write the iteration loops from scratch. 
More interesting is the observation that since NBD metacontexts are NOT visible as a local bitmap (but rather as block status information), it may still not do what you want for NBD connections. > > 2. qemu-img map --bitmap NAME > > Should it use a dedicated output format compared to the usual "map" > output (both human and json) with just "start/offset + length + dirty > bit" triples? Again, this would work well with local qcow2 images, but I'm not sure if it would translate nicely into NBD client images without more code in block/nbd.c to expose a metacontext as a bitmap rather than its current use of feeding block status. > > 3. qemu-nbd --map CONTEXT > > With only supporting one context at a time? Would be limited to NBD of > course which the other two won't be. The other two would be hard to implement for NBD, while this one is hard to implement for qcow2. It might be less hacky than x-dirty-bitmap, but also starts duplicating efforts with what nbdinfo can already do. > > > All would require connecting to the NBD export with the correct meta > context, which currently means using x_dirty_bitmap internally. So would > that even be okay as part of a non-experimental command, or would it > require to teach the NBD client code to deal with multiple meta contexts > first? If you want qemu-img to treat both qcow2 bitmaps and NBD metacontexts transparently from the same command line, we'll definitely need some work on how to make block/nbd.c expose an NBD metacontext as a bitmap (separately from how the existing x-dirty-bitmap just repurposes what is plugged into the block_status callbacks). But handling only one metacontext at a time, rather than having to do multiple in parallel, is probably okay. 
When you start reading more than one context at once, you have to worry about things like the two contexts returning different lengths of information ("base:allocation" might tell you about a 64k hole while qemu:dirty-bitmap:XXX tells you about a 1M dirty region - but consolidating that into contiguous extents of combined output information is tricky, especially if you don't want to re-query status you already know about one context but not the other). > > Best Regards, > Fiona > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
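The "tricky" consolidation described above — two metadata contexts reporting extents at different granularities — boils down to a lockstep merge. Below is a minimal sketch with hypothetical types (not QEMU's actual NBD extent handling): it emits one output extent per overlap of the two input streams and deliberately makes no attempt to coalesce adjacent output extents that end up with identical flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical extent: length in bytes plus context-specific flags. */
typedef struct Extent {
    uint64_t length;
    uint32_t flags;
} Extent;

/* Walk both contexts in lockstep, emitting the overlap of the current
 * extent from each (e.g. a 64k "base:allocation" hole against 16k
 * dirty-bitmap chunks).  Flags from context b are shifted into the
 * high 16 bits so both answers survive in one word.  out[] needs room
 * for at most na + nb entries; returns the number emitted. */
static int merge_extents(const Extent *a, int na,
                         const Extent *b, int nb,
                         Extent *out)
{
    int ia = 0, ib = 0, n = 0;
    uint64_t ra = na ? a[0].length : 0;  /* bytes left in current extents */
    uint64_t rb = nb ? b[0].length : 0;

    while (ia < na && ib < nb) {
        uint64_t len = ra < rb ? ra : rb;

        out[n].length = len;
        out[n].flags = a[ia].flags | (b[ib].flags << 16);
        n++;
        ra -= len;
        rb -= len;
        if (ra == 0 && ++ia < na) {
            ra = a[ia].length;
        }
        if (rb == 0 && ++ib < nb) {
            rb = b[ib].length;
        }
    }
    return n;
}
```

The re-query problem mentioned above shows up here as the leftover `ra`/`rb` state: whichever context answered with the longer extent still has unconsumed bytes that must not be fetched again.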
Re: query dirty areas according to bitmap via QMP or qemu-nbd
On Fri, Jul 26, 2024 at 04:16:41PM GMT, Fiona Ebner wrote: > Hi, > > sorry if I'm missing the obvious, but is there a way to get the dirty > areas according to a dirty bitmap via QMP? I mean as something like > offset + size + dirty-flag triples. In my case, the bitmap is also > exported via NBD, so same question for qemu-nbd being the client. Over QMP, no - that can produce a potentially large response and possible long time in computing the data, so we have never felt the need to introduce a new QMP command for that purpose. So over NBD is the preferred solution. > > I can get the info with "nbdinfo --map", but would like to avoid > requiring a tool outside QEMU. By default, QEMU as an NBD client only reads the "base:allocation" NBD metacontext, and is not wired to read more than one NBD metacontext at once (weaker than nbdinfo's capabilities). But I have intentionally left in a hack (accessible through QMP as well as from the command line) for connecting a qemu NBD client to an alternative NBD metacontext that feeds the block status, at which point 2 bits of information from the alternative context are observable through the result of block status calls. Note that using such an NBD connection for anything OTHER than block status calls is inadvisable (qemu might incorrectly optimize reads based on its misinterpretation of those block status bits); but as long as you limit the client to block status calls, it's a great way to read out a "qemu:dirty-bitmap:..." metacontext using only a qemu NBD client connection. git grep -l x-dirty-bitmap tests/qemu-iotests shows several of the iotests using the backdoor in just that manner. 
In particular, tests/qemu-img-bitmaps gives the magic decoder ring: | # x-dirty-bitmap is a hack for reading bitmaps; it abuses block status to | # report "data":false for portions of the bitmap which are set | IMG="driver=nbd,server.type=unix,server.path=$nbd_unix_socket" | nbd_server_start_unix_socket -r -f qcow2 \ | -B b0 -B b1 -B b2 -B b3 "$TEST_IMG" | $QEMU_IMG map --output=json --image-opts \ | "$IMG,x-dirty-bitmap=qemu:dirty-bitmap:b0" | _filter_qemu_img_map meaning the map output includes "data":false for the dirty portions and "data":true for the unchanged portions recorded in bitmap b0 as read from the JSON map output. > > If it is not currently possible, would upstream be interested too in the > feature, either for QMP or qemu-nbd? Improving qemu-img to get at the information without quite the hacky post-processing deciphering would indeed be a useful patch, but it has never risen to the level of enough of an itch for me to write it myself (especially since 'nbdinfo --map's output works just as well). -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [Libguestfs] Passing block size preferences from NBD -> qemu -> Linux
[adding qemu-block in cc] On Sat, Jul 13, 2024 at 03:40:36PM GMT, Richard W.M. Jones wrote: > This is expanding on the commit message I wrote here: > > https://gitlab.com/nbdkit/nbdkit/-/commit/780599d2e77c7cc4c1a7e99d0a933289761a9b27 > > A simple "one-liner" to test if NBD block size preferences are passed > correctly through qemu and into a Linux guest is this: > > $ nbdkit memory 1G --filter=blocksize-policy \ >blocksize-minimum=4096 \ > blocksize-preferred=65536 \ > blocksize-maximum=8M \ > --run ' > LIBGUESTFS_HV=/path/to/qemu-system-x86_64 \ > LIBGUESTFS_BACKEND=direct \ > guestfish --format=raw -a "$uri" \ > run : \ > debug sh "head -1 /sys/block/*/queue/*_io_size" : \ > debug sh "for d in /dev/sd? ; do sg_inq -p 0xb0 \$d ; done" \ > ' > > Current qemu (9.0.0) does not pass the block size preferences > correctly. It's a problem in qemu, not in Linux. > > qemu's NBD client requests the block size preferences from nbdkit and > reads them correctly. I verified this by adding some print statements > into nbd/client.c. The sizes are stored in BDRVNBDState 'info' field. > > qemu's virtio-scsi driver *can* present a block limits VPD page (0xb0) > containing these limits (see hw/scsi/scsi-disk.c), and Linux is able > to see the contents of this page using tools like 'sg_inq'. Linux > appears to translate the information faithfully into > /sys/block/sdX/queue/{minimum,optimal}_io_size files. > > However the virtio-scsi driver in qemu populates this information from > the qemu command line (-device [...]min_io_size=512,opt_io_size=4096). > It doesn't pass the information through from the NBD source backing > the drive. Is guestfish the one synthesizing the '-device min_io_size=512' used by qemu? I don't see it in the nbdkit command line posted above. Or is guestfish leaving it up to qemu to advertise its defaults, and this is merely a case of qemu favoring its defaults over what the device advertised? > > Fixing this seems like a non-trivial amount of work. 
Indeed, if guestfish is passing command-line defaults for qemu to use, we have to determine when to prioritize hardware advertisements over command-line defaults, while still maintaining flexibility to intentionally pick different sizes than what hardware advertised for the purposes of performance testing. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
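One plausible shape for that prioritization is a simple precedence chain. This is purely illustrative — `pick_io_size()` and its arguments are invented for this sketch, not qemu's actual virtio-scsi code — but it captures the policy question: an explicit command-line value wins (preserving the ability to test sizes other than what the hardware advertises), then the backend's advertisement, then a built-in default.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical policy helper: choose the io size a virtual device
 * should advertise to the guest.  A value of 0 means "not provided"
 * at that level. */
static uint32_t pick_io_size(uint32_t cmdline,   /* -device min_io_size=... */
                             uint32_t backend,   /* e.g. NBD block size hint */
                             uint32_t fallback)  /* built-in default */
{
    if (cmdline) {
        return cmdline;   /* user asked explicitly, e.g. for testing */
    }
    if (backend) {
        return backend;   /* pass through what the source advertised */
    }
    return fallback;
}
```

The hard part in real qemu is the first branch: distinguishing "the user explicitly passed min_io_size=512" from "512 is just the property's default", since only the former should override the backend.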
Re: [PATCH] nbd: Prevent NULL pointer dereference in nbd_blockdev_client_closed()
On Sat, Jun 08, 2024 at 11:36:59AM GMT, Alexander Ivanov wrote: > There is a bug reproducer in the attachment. Summarizing the reproducer, you are repeatedly calling QMP nbd-server-start/nbd-server-stop on qemu as NBD server in one thread, and repeatedly calling 'qemu-nbd -L' in another, to try and provoke the race where the server is stopped while qemu-nbd -L is still trying to grab information. > > > On 6/7/24 17:00, Alexander Ivanov wrote: > > In some cases, the NBD server can be stopped before > > nbd_blockdev_client_closed() is called, causing the nbd_server variable > > to be nullified. This leads to a NULL pointer dereference when accessing > > nbd_server. Am I correct that the NULL pointer deref was in qemu-nbd in your reproducer, and not in qemu-kvm? > > > > Add a NULL check for nbd_server to the nbd_blockdev_client_closed() > > function to prevent NULL pointer dereference. > > > > Signed-off-by: Alexander Ivanov > > --- > > blockdev-nbd.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/blockdev-nbd.c b/blockdev-nbd.c > > index 213012435f..fb1f30ae0d 100644 > > --- a/blockdev-nbd.c > > +++ b/blockdev-nbd.c > > @@ -52,6 +52,9 @@ int nbd_server_max_connections(void) > > static void nbd_blockdev_client_closed(NBDClient *client, bool ignored) > > { > > nbd_client_put(client); > > +if (nbd_server == NULL) { > > +return; > > +} > > assert(nbd_server->connections > 0); While this does indeed avoid the NULL dereference right here, I still want to understand why nbd_server is getting set to NULL while there is still an active client, and make sure we don't have any other NULL derefs. I'll respond again once I've studied the code a bit more. > > nbd_server->connections--; > > nbd_update_server_watch(nbd_server); > > -- > Best regards, > Alexander Ivanov -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
[PULL v2 2/2] iotests: test NBD+TLS+iothread
Prevent regressions when using NBD with TLS in the presence of
iothreads, adding coverage of the fix to qio channels made in the
previous patch.

The shell function pick_unused_port() was copied from
nbdkit.git/tests/functions.sh.in, where it had all authors from Red
Hat, agreeing to the resulting relicensing from 2-clause BSD to GPLv2.

CC: qemu-sta...@nongnu.org
CC: "Richard W.M. Jones"
Signed-off-by: Eric Blake
Message-ID: <20240531180639.1392905-6-ebl...@redhat.com>
Reviewed-by: Daniel P. Berrangé
---
 tests/qemu-iotests/tests/nbd-tls-iothread     | 168 ++
 tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 ++
 2 files changed, 222 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
 create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out

diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread b/tests/qemu-iotests/tests/nbd-tls-iothread
new file mode 100755
index 000..a2fb07206e5
--- /dev/null
+++ b/tests/qemu-iotests/tests/nbd-tls-iothread
@@ -0,0 +1,168 @@
+#!/usr/bin/env bash
+# group: rw quick
+#
+# Test of NBD+TLS+iothread
+#
+# Copyright (C) 2024 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#

+# creator
+owner=ebl...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1        # failure is the default!
+
+_cleanup()
+{
+    _cleanup_qemu
+    _cleanup_test_img
+    rm -f "$dst_image"
+    tls_x509_cleanup
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+. ./common.tls
+. ./common.nbd
+
+_supported_fmt qcow2 # Hardcoded to qcow2 command line and QMP below
+_supported_proto file
+
+# pick_unused_port
+#
+# Picks and returns an "unused" port, setting the global variable
+# $port.
+#
+# This is inherently racy, but we need it because qemu does not currently
+# permit NBD+TLS over a Unix domain socket
+pick_unused_port ()
+{
+    if ! (ss --version) >/dev/null 2>&1; then
+        _notrun "ss utility required, skipped this test"
+    fi
+
+    # Start at a random port to make it less likely that two parallel
+    # tests will conflict.
+    port=$(( 5 + (RANDOM%15000) ))
+    while ss -ltn | grep -sqE ":$port\b"; do
+        ((port++))
+        if [ $port -eq 65000 ]; then port=5; fi
+    done
+    echo picked unused port
+}
+
+tls_x509_init
+
+size=1G
+DST_IMG="$TEST_DIR/dst.qcow2"
+
+echo
+echo "== preparing TLS creds and spare port =="
+
+pick_unused_port
+tls_x509_create_root_ca "ca1"
+tls_x509_create_server "ca1" "server1"
+tls_x509_create_client "ca1" "client1"
+tls_obj_base=tls-creds-x509,id=tls0,verify-peer=true,dir="${tls_dir}"
+
+echo
+echo "== preparing image =="
+
+_make_test_img $size
+$QEMU_IMG create -f qcow2 "$DST_IMG" $size | _filter_img_create
+
+echo
+echo === Starting Src QEMU ===
+echo
+
+_launch_qemu -machine q35 \
+    -object iothread,id=iothread0 \
+    -object "${tls_obj_base}"/client1,endpoint=client \
+    -device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true,
+              "bus":"pcie.0"}' \
+    -device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0",
+              "bus":"root0", "iothread":"iothread0"}' \
+    -device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1",
+              "bus":"virtio_scsi_pci0.0"}' \
+    -blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false},
+                "filename":"'"$TEST_IMG"'", "node-name":"drive_sys1"}' \
+    -blockdev '{"driver":"qcow2", "node-name":"drive_image1",
+                "file":"drive_sys1"}'
+h1=$QEMU_HANDLE
+_send_qemu_cmd $h1 '{"execute": "qmp_capabilities"}' 'return'
+echo
[PULL v2 1/2] qio: Inherit follow_coroutine_ctx across TLS
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an
assertion failure:

qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *):
Assertion `qemu_get_current_aio_context() ==
qemu_coroutine_get_aio_context(co)' failed.

It turns out that when we removed AioContext locking, we did so by
having NBD tell its qio channels that it wanted to opt in to
qio_channel_set_follow_coroutine_ctx(); but while we opted in on the
main channel, we did not opt in on the TLS wrapper channel.
qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently
no coverage of NBD+TLS+iothread, or we would have noticed this
regression sooner.  (I'll add that in the next patch.)

But while we could manually opt in to the TLS channel in nbd/server.c
(a one-line change), it is more generic if all qio channels that wrap
other channels inherit the follow status, in the same way that they
inherit feature bits.

CC: Stefan Hajnoczi
CC: Daniel P. Berrangé
CC: qemu-sta...@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-34786
Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0)
Signed-off-by: Eric Blake
Reviewed-by: Stefan Hajnoczi
Reviewed-by: Daniel P. Berrangé
Message-ID: <20240518025246.791593-5-ebl...@redhat.com>
---
 io/channel-tls.c     | 26 +++---
 io/channel-websock.c |  1 +
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 1d9c9c72bfb..67b9760 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -69,37 +69,40 @@ qio_channel_tls_new_server(QIOChannel *master,
                            const char *aclname,
                            Error **errp)
 {
-    QIOChannelTLS *ioc;
+    QIOChannelTLS *tioc;
+    QIOChannel *ioc;

-    ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
+    tioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
+    ioc = QIO_CHANNEL(tioc);

-    ioc->master = master;
+    tioc->master = master;
+    ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
     if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
-        qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN);
+        qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
     }
     object_ref(OBJECT(master));

-    ioc->session = qcrypto_tls_session_new(
+    tioc->session = qcrypto_tls_session_new(
         creds, NULL, aclname, QCRYPTO_TLS_CREDS_ENDPOINT_SERVER, errp);
-    if (!ioc->session) {
+    if (!tioc->session) {
         goto error;
     }

     qcrypto_tls_session_set_callbacks(
-        ioc->session,
+        tioc->session,
         qio_channel_tls_write_handler,
         qio_channel_tls_read_handler,
-        ioc);
+        tioc);

-    trace_qio_channel_tls_new_server(ioc, master, creds, aclname);
-    return ioc;
+    trace_qio_channel_tls_new_server(tioc, master, creds, aclname);
+    return tioc;

 error:
-    object_unref(OBJECT(ioc));
+    object_unref(OBJECT(tioc));
     return NULL;
 }

@@ -116,6 +119,7 @@ qio_channel_tls_new_client(QIOChannel *master,
     ioc = QIO_CHANNEL(tioc);

     tioc->master = master;
+    ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
     if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
         qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
     }
diff --git a/io/channel-websock.c b/io/channel-websock.c
index a12acc27cf2..de39f0d182d 100644
--- a/io/channel-websock.c
+++ b/io/channel-websock.c
@@ -883,6 +883,7 @@ qio_channel_websock_new_server(QIOChannel *master)
     ioc = QIO_CHANNEL(wioc);

     wioc->master = master;
+    ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
     if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
         qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
     }
-- 
2.45.1
[PULL v2 0/2] NBD patches for 2024-05-30
The following changes since commit 3b2fe44bb7f605f179e5e7feb2c13c2eb3abbb80: Merge tag 'pull-request-2024-05-29' of https://gitlab.com/thuth/qemu into staging (2024-05-29 08:38:20 -0700) are available in the Git repository at: https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2024-05-30-v2 for you to fetch changes up to a73c99378022ebb785481e84cfe1e81097546268: iotests: test NBD+TLS+iothread (2024-06-03 09:17:11 -0500) NBD patches for 2024-05-30 - Fix AioContext assertion with NBD+TLS ---- Eric Blake (2): qio: Inherit follow_coroutine_ctx across TLS iotests: test NBD+TLS+iothread io/channel-tls.c | 26 ++-- io/channel-websock.c | 1 + tests/qemu-iotests/tests/nbd-tls-iothread | 168 ++ tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 + 4 files changed, 238 insertions(+), 11 deletions(-) create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out -- 2.45.1
[PATCH v3 2/2] iotests: test NBD+TLS+iothread
Prevent regressions when using NBD with TLS in the presence of iothreads, adding coverage the fix to qio channels made in the previous patch. The shell function pick_unused_port() was copied from nbdkit.git/tests/functions.sh.in, where it had all authors from Red Hat, agreeing to the resulting relicensing from 2-clause BSD to GPLv2. CC: qemu-sta...@nongnu.org CC: "Richard W.M. Jones" Signed-off-by: Eric Blake --- tests/qemu-iotests/tests/nbd-tls-iothread | 168 ++ tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 ++ 2 files changed, 222 insertions(+) create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread b/tests/qemu-iotests/tests/nbd-tls-iothread new file mode 100755 index 000..a2fb07206e5 --- /dev/null +++ b/tests/qemu-iotests/tests/nbd-tls-iothread @@ -0,0 +1,168 @@ +#!/usr/bin/env bash +# group: rw quick +# +# Test of NBD+TLS+iothread +# +# Copyright (C) 2024 Red Hat, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. +# + +# creator +owner=ebl...@redhat.com + +seq=`basename $0` +echo "QA output created by $seq" + +status=1# failure is the default! + +_cleanup() +{ +_cleanup_qemu +_cleanup_test_img +rm -f "$dst_image" +tls_x509_cleanup +} +trap "_cleanup; exit \$status" 0 1 2 3 15 + +# get standard environment, filters and checks +cd .. +. 
./common.rc +. ./common.filter +. ./common.qemu +. ./common.tls +. ./common.nbd + +_supported_fmt qcow2 # Hardcoded to qcow2 command line and QMP below +_supported_proto file + +# pick_unused_port +# +# Picks and returns an "unused" port, setting the global variable +# $port. +# +# This is inherently racy, but we need it because qemu does not currently +# permit NBD+TLS over a Unix domain socket +pick_unused_port () +{ +if ! (ss --version) >/dev/null 2>&1; then +_notrun "ss utility required, skipped this test" +fi + +# Start at a random port to make it less likely that two parallel +# tests will conflict. +port=$(( 5 + (RANDOM%15000) )) +while ss -ltn | grep -sqE ":$port\b"; do +((port++)) +if [ $port -eq 65000 ]; then port=5; fi +done +echo picked unused port +} + +tls_x509_init + +size=1G +DST_IMG="$TEST_DIR/dst.qcow2" + +echo +echo "== preparing TLS creds and spare port ==" + +pick_unused_port +tls_x509_create_root_ca "ca1" +tls_x509_create_server "ca1" "server1" +tls_x509_create_client "ca1" "client1" +tls_obj_base=tls-creds-x509,id=tls0,verify-peer=true,dir="${tls_dir}" + +echo +echo "== preparing image ==" + +_make_test_img $size +$QEMU_IMG create -f qcow2 "$DST_IMG" $size | _filter_img_create + +echo +echo === Starting Src QEMU === +echo + +_launch_qemu -machine q35 \ +-object iothread,id=iothread0 \ +-object "${tls_obj_base}"/client1,endpoint=client \ +-device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true, + "bus":"pcie.0"}' \ +-device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0", + "bus":"root0", "iothread":"iothread0"}' \ +-device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1", + "bus":"virtio_scsi_pci0.0"}' \ +-blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false}, +"filename":"'"$TEST_IMG"'", "node-name":"drive_sys1"}' \ +-blockdev '{"driver":"qcow2", "node-name":"drive_image1", +"file":"drive_sys1"}' +h1=$QEMU_HANDLE +_send_qemu_cmd $h1 '{"execute": "qmp_capabilities"}' 'return' + +echo +echo === 
Starting Dst VM2 === +echo + +_launch_qemu -machine q35 \ +-object iothread,id=iothread
[PATCH v3 1/2] qio: Inherit follow_coroutine_ctx across TLS
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an assertion failure: qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed. It turns out that when we removed AioContext locking, we did so by having NBD tell its qio channels that it wanted to opt in to qio_channel_set_follow_coroutine_ctx(); but while we opted in on the main channel, we did not opt in on the TLS wrapper channel. qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently no coverage of NBD+TLS+iothread, or we would have noticed this regression sooner. (I'll add that in the next patch) But while we could manually opt in to the TLS channel in nbd/server.c (a one-line change), it is more generic if all qio channels that wrap other channels inherit the follow status, in the same way that they inherit feature bits. CC: Stefan Hajnoczi CC: Daniel P. Berrangé CC: qemu-sta...@nongnu.org Fixes: https://issues.redhat.com/browse/RHEL-34786 Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0) Signed-off-by: Eric Blake Reviewed-by: Stefan Hajnoczi Reviewed-by: Daniel P. 
Berrangé Message-ID: <20240518025246.791593-5-ebl...@redhat.com> --- io/channel-tls.c | 26 +++--- io/channel-websock.c | 1 + 2 files changed, 16 insertions(+), 11 deletions(-) diff --git a/io/channel-tls.c b/io/channel-tls.c index 1d9c9c72bfb..67b9760 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -69,37 +69,40 @@ qio_channel_tls_new_server(QIOChannel *master, const char *aclname, Error **errp) { -QIOChannelTLS *ioc; +QIOChannelTLS *tioc; +QIOChannel *ioc; -ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS)); +tioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS)); +ioc = QIO_CHANNEL(tioc); -ioc->master = master; +tioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { -qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN); +qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } object_ref(OBJECT(master)); -ioc->session = qcrypto_tls_session_new( +tioc->session = qcrypto_tls_session_new( creds, NULL, aclname, QCRYPTO_TLS_CREDS_ENDPOINT_SERVER, errp); -if (!ioc->session) { +if (!tioc->session) { goto error; } qcrypto_tls_session_set_callbacks( -ioc->session, +tioc->session, qio_channel_tls_write_handler, qio_channel_tls_read_handler, -ioc); +tioc); -trace_qio_channel_tls_new_server(ioc, master, creds, aclname); -return ioc; +trace_qio_channel_tls_new_server(tioc, master, creds, aclname); +return tioc; error: -object_unref(OBJECT(ioc)); +object_unref(OBJECT(tioc)); return NULL; } @@ -116,6 +119,7 @@ qio_channel_tls_new_client(QIOChannel *master, ioc = QIO_CHANNEL(tioc); tioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } diff --git a/io/channel-websock.c b/io/channel-websock.c index a12acc27cf2..de39f0d182d 100644 --- a/io/channel-websock.c +++ b/io/channel-websock.c @@ -883,6 
+883,7 @@ qio_channel_websock_new_server(QIOChannel *master) ioc = QIO_CHANNEL(wioc); wioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } -- 2.45.1
[PATCH v3 0/2] Fix NBD+TLS regression in presence of iothread
In v3: - 2/2: fix iotest filtering [kwolf] v2 was here: https://lists.gnu.org/archive/html/qemu-devel/2024-05/msg03517.html and this time, I'll wait for R-b before sending my v2 pull request: https://lists.gnu.org/archive/html/qemu-devel/2024-05/msg06206.html Eric Blake (2): qio: Inherit follow_coroutine_ctx across TLS iotests: test NBD+TLS+iothread io/channel-tls.c | 26 +-- io/channel-websock.c | 1 + tests/qemu-iotests/tests/nbd-tls-iothread | 168 ++ tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 ++ 4 files changed, 238 insertions(+), 11 deletions(-) create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out -- 2.45.1
Re: [PATCH v2 2/2] iotests: test NBD+TLS+iothread
On Fri, May 31, 2024 at 04:36:20PM GMT, Kevin Wolf wrote: > Am 18.05.2024 um 04:50 hat Eric Blake geschrieben: > > Prevent regressions when using NBD with TLS in the presence of > > iothreads, adding coverage of the fix to qio channels made in the > > previous patch. > > > > CC: qemu-sta...@nongnu.org > > Signed-off-by: Eric Blake > > > diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread.out > > b/tests/qemu-iotests/tests/nbd-tls-iothread.out > > new file mode 100644 > > index 000..b5e12dcbe7a > > --- /dev/null > > +++ b/tests/qemu-iotests/tests/nbd-tls-iothread.out > > +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 > > +Formatting > > '/home/eblake/qemu/build/tests/qemu-iotests/scratch/qcow2-file-nbd-tls-iothread/dst.qcow2', > > fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib > > size=1073741824 lazy_refcounts=off refcount_bits=16 > > /home/eblake is suboptimal in reference output. :-) Indeed. Will fix, which means I need a v2 pull request. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
[PULL 2/2] iotests: test NBD+TLS+iothread
Prevent regressions when using NBD with TLS in the presence of iothreads, adding coverage the fix to qio channels made in the previous patch. The shell function pick_unused_port() was copied from nbdkit.git/tests/functions.sh.in, where it had all authors from Red Hat, agreeing to the resulting relicensing from 2-clause BSD to GPLv2. CC: qemu-sta...@nongnu.org CC: "Richard W.M. Jones" Signed-off-by: Eric Blake Message-ID: <20240518025246.791593-6-ebl...@redhat.com> --- tests/qemu-iotests/tests/nbd-tls-iothread | 168 ++ tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 ++ 2 files changed, 222 insertions(+) create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread b/tests/qemu-iotests/tests/nbd-tls-iothread new file mode 100755 index 000..26ccff8ddb7 --- /dev/null +++ b/tests/qemu-iotests/tests/nbd-tls-iothread @@ -0,0 +1,168 @@ +#!/usr/bin/env bash +# group: rw quick +# +# Test of NBD+TLS+iothread +# +# Copyright (C) 2024 Red Hat, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. +# + +# creator +owner=ebl...@redhat.com + +seq=`basename $0` +echo "QA output created by $seq" + +status=1# failure is the default! 
+ +_cleanup() +{ +_cleanup_qemu +_cleanup_test_img +rm -f "$dst_image" +tls_x509_cleanup +} +trap "_cleanup; exit \$status" 0 1 2 3 15 + +# get standard environment, filters and checks +cd .. +. ./common.rc +. ./common.filter +. ./common.qemu +. ./common.tls +. ./common.nbd + +_supported_fmt qcow2 # Hardcoded to qcow2 command line and QMP below +_supported_proto file + +# pick_unused_port +# +# Picks and returns an "unused" port, setting the global variable +# $port. +# +# This is inherently racy, but we need it because qemu does not currently +# permit NBD+TLS over a Unix domain socket +pick_unused_port () +{ +if ! (ss --version) >/dev/null 2>&1; then +_notrun "ss utility required, skipped this test" +fi + +# Start at a random port to make it less likely that two parallel +# tests will conflict. +port=$(( 5 + (RANDOM%15000) )) +while ss -ltn | grep -sqE ":$port\b"; do +((port++)) +if [ $port -eq 65000 ]; then port=5; fi +done +echo picked unused port +} + +tls_x509_init + +size=1G +DST_IMG="$TEST_DIR/dst.qcow2" + +echo +echo "== preparing TLS creds and spare port ==" + +pick_unused_port +tls_x509_create_root_ca "ca1" +tls_x509_create_server "ca1" "server1" +tls_x509_create_client "ca1" "client1" +tls_obj_base=tls-creds-x509,id=tls0,verify-peer=true,dir="${tls_dir}" + +echo +echo "== preparing image ==" + +_make_test_img $size +$QEMU_IMG create -f qcow2 "$DST_IMG" $size + +echo +echo === Starting Src QEMU === +echo + +_launch_qemu -machine q35 \ +-object iothread,id=iothread0 \ +-object "${tls_obj_base}"/client1,endpoint=client \ +-device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true, + "bus":"pcie.0"}' \ +-device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0", + "bus":"root0", "iothread":"iothread0"}' \ +-device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1", + "bus":"virtio_scsi_pci0.0"}' \ +-blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false}, +"filename":"'"$TEST_IMG"'", "node-name":"drive_sys1"}' \ 
+-blockdev '{"driver":"qcow2", "node-name":"drive_image1", +"file":"drive_sys1"}' +h1=$QEMU_HANDLE +_send_qemu_cmd $h1 '{"execute": "qmp_capabilities"}' 'return' + +echo +echo === Starting Dst VM2 === +echo
Re: [PATCH v2 2/2] iotests: test NBD+TLS+iothread
Adding a bit of self-review (in case you want to amend this before pushing, instead of waiting for me to get back online), On Fri, May 17, 2024 at 09:50:15PM GMT, Eric Blake wrote: > Prevent regressions when using NBD with TLS in the presence of > iothreads, adding coverage the fix to qio channels made in the > previous patch. > > CC: qemu-sta...@nongnu.org > Signed-off-by: Eric Blake > --- > tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++ > tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 ++ > 2 files changed, 224 insertions(+) > create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread > create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out > > diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread > b/tests/qemu-iotests/tests/nbd-tls-iothread > new file mode 100755 > index 000..a737224a90e > --- /dev/null > +++ b/tests/qemu-iotests/tests/nbd-tls-iothread > @@ -0,0 +1,170 @@ > +#!/usr/bin/env bash > +# group: rw quick > +# > +# Test of NBD+TLS+iothread > +# > +# Copyright (C) 2024 Red Hat, Inc. > +# > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 2 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see <http://www.gnu.org/licenses/>. > +# > + > +# creator > +owner=ebl...@redhat.com > + > +seq=`basename $0` > +echo "QA output created by $seq" > + > +status=1# failure is the default! 
> + > +_cleanup() > +{ > +_cleanup_qemu > +_cleanup_test_img > +rm -f "$dst_image" > +tls_x509_cleanup > +} > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +# get standard environment, filters and checks > +cd .. > +. ./common.rc > +. ./common.filter > +. ./common.qemu > +. ./common.tls > +. ./common.nbd > + > +_supported_fmt qcow2 # Hardcoded to qcow2 command line and QMP below > +_supported_proto file > +_require_command QEMU_NBD This line can probably be dropped. I originally included it thinking I might reuse common.nbd's nbd_server_start_tcp_socket to pick an unused port via a throwaway qemu-nbd, then kill the qemu-nbd process before starting up the two qemu processes. But in the end, using ss to probe a port's use seems a bit more elegant than a throwaway qemu-nbd process, although it may make CI testing harder by dragging in another dependency that is less universal. > + > +# pick_unused_port > +# Copied from nbdkit/tests/functions.sh.in with compatible 2-clause BSD > license I'm not sure if I have to include the license text verbatim in this file, and/or have this function moved to a helper utility file. The original source file that I borrowed pick_unused_port from has: # Copyright Red Hat # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are # met: # # * Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # * Neither the name of Red Hat nor the names of its contributors may be # used to endorse or promote products derived from this software without # specific prior written permission. 
# # THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, # THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A # PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF # USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND # ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT # OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. > +# > +# Picks and returns an "unused" port, setting the global variable > +# $port. > +# >
Re: [PATCH v2 0/2] Fix NBD+TLS regression in presence of iothread
On Fri, May 17, 2024 at 09:50:13PM GMT, Eric Blake wrote: > In v2: > - correct list email address > - add iotest > - add R-b > > I'm offline next week, and have been communicating with Stefan who may > want to push this through his block tree instead of waiting for me to > get back. I also meant to add that I did test that the iotest 2/2 fails unless 1/2 is applied. > > Eric Blake (2): > qio: Inherit follow_coroutine_ctx across TLS > iotests: test NBD+TLS+iothread > > io/channel-tls.c | 26 +-- > io/channel-websock.c | 1 + > tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++ > tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 ++ > 4 files changed, 240 insertions(+), 11 deletions(-) > create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread > create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out > > -- > 2.45.0 > > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
[PATCH v2 1/2] qio: Inherit follow_coroutine_ctx across TLS
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an assertion failure: qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed. It turns out that when we removed AioContext locking, we did so by having NBD tell its qio channels that it wanted to opt in to qio_channel_set_follow_coroutine_ctx(); but while we opted in on the main channel, we did not opt in on the TLS wrapper channel. qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently no coverage of NBD+TLS+iothread, or we would have noticed this regression sooner. (I'll add that in the next patch) But while we could manually opt in to the TLS channel in nbd/server.c (a one-line change), it is more generic if all qio channels that wrap other channels inherit the follow status, in the same way that they inherit feature bits. CC: Stefan Hajnoczi CC: Daniel P. Berrangé CC: qemu-sta...@nongnu.org Fixes: https://issues.redhat.com/browse/RHEL-34786 Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0) Signed-off-by: Eric Blake Reviewed-by: Stefan Hajnoczi Reviewed-by: Daniel P. 
Berrangé --- io/channel-tls.c | 26 +++--- io/channel-websock.c | 1 + 2 files changed, 16 insertions(+), 11 deletions(-) diff --git a/io/channel-tls.c b/io/channel-tls.c index 1d9c9c72bfb..67b9760 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -69,37 +69,40 @@ qio_channel_tls_new_server(QIOChannel *master, const char *aclname, Error **errp) { -QIOChannelTLS *ioc; +QIOChannelTLS *tioc; +QIOChannel *ioc; -ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS)); +tioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS)); +ioc = QIO_CHANNEL(tioc); -ioc->master = master; +tioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { -qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN); +qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } object_ref(OBJECT(master)); -ioc->session = qcrypto_tls_session_new( +tioc->session = qcrypto_tls_session_new( creds, NULL, aclname, QCRYPTO_TLS_CREDS_ENDPOINT_SERVER, errp); -if (!ioc->session) { +if (!tioc->session) { goto error; } qcrypto_tls_session_set_callbacks( -ioc->session, +tioc->session, qio_channel_tls_write_handler, qio_channel_tls_read_handler, -ioc); +tioc); -trace_qio_channel_tls_new_server(ioc, master, creds, aclname); -return ioc; +trace_qio_channel_tls_new_server(tioc, master, creds, aclname); +return tioc; error: -object_unref(OBJECT(ioc)); +object_unref(OBJECT(tioc)); return NULL; } @@ -116,6 +119,7 @@ qio_channel_tls_new_client(QIOChannel *master, ioc = QIO_CHANNEL(tioc); tioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } diff --git a/io/channel-websock.c b/io/channel-websock.c index a12acc27cf2..de39f0d182d 100644 --- a/io/channel-websock.c +++ b/io/channel-websock.c @@ -883,6 +883,7 @@ qio_channel_websock_new_server(QIOChannel 
*master) ioc = QIO_CHANNEL(wioc); wioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } -- 2.45.0
[PATCH v2 2/2] iotests: test NBD+TLS+iothread
Prevent regressions when using NBD with TLS in the presence of iothreads, adding coverage the fix to qio channels made in the previous patch. CC: qemu-sta...@nongnu.org Signed-off-by: Eric Blake --- tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++ tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 ++ 2 files changed, 224 insertions(+) create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread b/tests/qemu-iotests/tests/nbd-tls-iothread new file mode 100755 index 000..a737224a90e --- /dev/null +++ b/tests/qemu-iotests/tests/nbd-tls-iothread @@ -0,0 +1,170 @@ +#!/usr/bin/env bash +# group: rw quick +# +# Test of NBD+TLS+iothread +# +# Copyright (C) 2024 Red Hat, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. +# + +# creator +owner=ebl...@redhat.com + +seq=`basename $0` +echo "QA output created by $seq" + +status=1# failure is the default! + +_cleanup() +{ +_cleanup_qemu +_cleanup_test_img +rm -f "$dst_image" +tls_x509_cleanup +} +trap "_cleanup; exit \$status" 0 1 2 3 15 + +# get standard environment, filters and checks +cd .. +. ./common.rc +. ./common.filter +. ./common.qemu +. ./common.tls +. 
./common.nbd + +_supported_fmt qcow2 # Hardcoded to qcow2 command line and QMP below +_supported_proto file +_require_command QEMU_NBD + +# pick_unused_port +# Copied from nbdkit/tests/functions.sh.in with compatible 2-clause BSD license +# +# Picks and returns an "unused" port, setting the global variable +# $port. +# +# This is inherently racy, but we need it because qemu does not currently +# permit NBD+TLS over a Unix domain socket +pick_unused_port () +{ +if ! (ss --version) >/dev/null 2>&1; then +_notrun "ss utility required, skipped this test" +fi + +# Start at a random port to make it less likely that two parallel +# tests will conflict. +port=$(( 5 + (RANDOM%15000) )) +while ss -ltn | grep -sqE ":$port\b"; do +((port++)) +if [ $port -eq 65000 ]; then port=5; fi +done +echo picked unused port +} + +tls_x509_init + +size=1G +DST_IMG="$TEST_DIR/dst.qcow2" + +echo +echo "== preparing TLS creds and spare port ==" + +pick_unused_port +tls_x509_create_root_ca "ca1" +tls_x509_create_server "ca1" "server1" +tls_x509_create_client "ca1" "client1" +tls_obj_base=tls-creds-x509,id=tls0,verify-peer=true,dir="${tls_dir}" + +echo +echo "== preparing image ==" + +_make_test_img $size +$QEMU_IMG create -f qcow2 "$DST_IMG" $size + +echo +echo === Starting Src QEMU === +echo + +_launch_qemu -machine q35 \ +-object iothread,id=iothread0 \ +-object "${tls_obj_base}"/client1,endpoint=client \ +-device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true, + "bus":"pcie.0"}' \ +-device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0", + "bus":"root0", "iothread":"iothread0"}' \ +-device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1", + "bus":"virtio_scsi_pci0.0"}' \ +-blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false}, +"filename":"'"$TEST_IMG"'", "node-name":"drive_sys1"}' \ +-blockdev '{"driver":"qcow2", "node-name":"drive_image1", +"file":"drive_sys1"}' +h1=$QEMU_HANDLE +_send_qemu_cmd $h1 '{"execute": "qmp_capabilities"}' 
'return' + +echo +echo === Starting Dst VM2 === +echo + +_launch_qemu -machine q35 \ +-object iothread,id=iothread0 \ +-object "${tls_obj_base}"/server1,endpoint=server \ +-device '{"driver":"pcie-root-port", "
[PATCH v2 0/2] Fix NBD+TLS regression in presence of iothread
In v2: - correct list email address - add iotest - add R-b I'm offline next week, and have been communicating with Stefan who may want to push this through his block tree instead of waiting for me to get back. Eric Blake (2): qio: Inherit follow_coroutine_ctx across TLS iotests: test NBD+TLS+iothread io/channel-tls.c | 26 +-- io/channel-websock.c | 1 + tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++ tests/qemu-iotests/tests/nbd-tls-iothread.out | 54 ++ 4 files changed, 240 insertions(+), 11 deletions(-) create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out -- 2.45.0
[PATCH] qio: Inherit follow_coroutine_ctx across TLS
Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an assertion failure: qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed. It turns out that when we removed AioContext locking, we did so by having NBD tell its qio channels that it wanted to opt in to qio_channel_set_follow_coroutine_ctx(); but while we opted in on the main channel, we did not opt in on the TLS wrapper channel. qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently no coverage of NBD+TLS+iothread, or we would have noticed this regression sooner. (I'll add that in the next patch) But while we could manually opt in to the TLS channel in nbd/server.c, it is more generic if all qio channels that wrap other channels inherit the follow status, in the same way that they inherit feature bits. CC: Stefan Hajnoczi CC: Daniel P. Berrangé CC: qemu-sta...@nongnu.org Fixes: https://issues.redhat.com/browse/RHEL-34786 Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0) Signed-off-by: Eric Blake --- Maybe we should turn ioc->follow_coroutine_ctx into a QIO_CHANNEL_FEATURE_* bit? And I have not yet written the promised qemu-iotests patch, but I wanted to get this on the list before I'm offline for a week. 
--- io/channel-tls.c | 26 +++--- io/channel-websock.c | 1 + 2 files changed, 16 insertions(+), 11 deletions(-) diff --git a/io/channel-tls.c b/io/channel-tls.c index 1d9c9c72bfb..67b9760 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -69,37 +69,40 @@ qio_channel_tls_new_server(QIOChannel *master, const char *aclname, Error **errp) { -QIOChannelTLS *ioc; +QIOChannelTLS *tioc; +QIOChannel *ioc; -ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS)); +tioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS)); +ioc = QIO_CHANNEL(tioc); -ioc->master = master; +tioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { -qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN); +qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } object_ref(OBJECT(master)); -ioc->session = qcrypto_tls_session_new( +tioc->session = qcrypto_tls_session_new( creds, NULL, aclname, QCRYPTO_TLS_CREDS_ENDPOINT_SERVER, errp); -if (!ioc->session) { +if (!tioc->session) { goto error; } qcrypto_tls_session_set_callbacks( -ioc->session, +tioc->session, qio_channel_tls_write_handler, qio_channel_tls_read_handler, -ioc); +tioc); -trace_qio_channel_tls_new_server(ioc, master, creds, aclname); -return ioc; +trace_qio_channel_tls_new_server(tioc, master, creds, aclname); +return tioc; error: -object_unref(OBJECT(ioc)); +object_unref(OBJECT(tioc)); return NULL; } @@ -116,6 +119,7 @@ qio_channel_tls_new_client(QIOChannel *master, ioc = QIO_CHANNEL(tioc); tioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } diff --git a/io/channel-websock.c b/io/channel-websock.c index a12acc27cf2..de39f0d182d 100644 --- a/io/channel-websock.c +++ b/io/channel-websock.c @@ -883,6 +883,7 @@ qio_channel_websock_new_server(QIOChannel *master) ioc 
= QIO_CHANNEL(wioc); wioc->master = master; +ioc->follow_coroutine_ctx = master->follow_coroutine_ctx; if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) { qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN); } base-commit: 922582ace2df59572a671f5c0c5c6c5c706995e5 -- 2.45.0
[PULL 2/2] nbd/server: Mark negotiation functions as coroutine_fn
nbd_negotiate() is already marked coroutine_fn. And given the fix in the previous patch to have nbd_negotiate_handle_starttls not create and wait on a g_main_loop (as that would violate coroutine constraints), it is worth marking the rest of the related static functions reachable only during option negotiation as also being coroutine_fn. Suggested-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Eric Blake Message-ID: <20240408160214.1200629-6-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy [eblake: drop one spurious coroutine_fn marking] Signed-off-by: Eric Blake --- nbd/server.c | 102 +-- 1 file changed, 59 insertions(+), 43 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 98ae0e16326..892797bb111 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -195,8 +195,9 @@ static inline void set_be_option_rep(NBDOptionReply *rep, uint32_t option, /* Send a reply header, including length, but no payload. * Return -errno on error, 0 on success. */ -static int nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type, - uint32_t len, Error **errp) +static coroutine_fn int +nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type, + uint32_t len, Error **errp) { NBDOptionReply rep; @@ -211,15 +212,15 @@ static int nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type, /* Send a reply header with default 0 length. * Return -errno on error, 0 on success. */ -static int nbd_negotiate_send_rep(NBDClient *client, uint32_t type, - Error **errp) +static coroutine_fn int +nbd_negotiate_send_rep(NBDClient *client, uint32_t type, Error **errp) { return nbd_negotiate_send_rep_len(client, type, 0, errp); } /* Send an error reply. * Return -errno on error, 0 on success. */ -static int G_GNUC_PRINTF(4, 0) +static coroutine_fn int G_GNUC_PRINTF(4, 0) nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type, Error **errp, const char *fmt, va_list va) { @@ -259,7 +260,7 @@ nbd_sanitize_name(const char *name) /* Send an error reply. 
* Return -errno on error, 0 on success. */ -static int G_GNUC_PRINTF(4, 5) +static coroutine_fn int G_GNUC_PRINTF(4, 5) nbd_negotiate_send_rep_err(NBDClient *client, uint32_t type, Error **errp, const char *fmt, ...) { @@ -275,7 +276,7 @@ nbd_negotiate_send_rep_err(NBDClient *client, uint32_t type, /* Drop remainder of the current option, and send a reply with the * given error type and message. Return -errno on read or write * failure; or 0 if connection is still live. */ -static int G_GNUC_PRINTF(4, 0) +static coroutine_fn int G_GNUC_PRINTF(4, 0) nbd_opt_vdrop(NBDClient *client, uint32_t type, Error **errp, const char *fmt, va_list va) { @@ -288,7 +289,7 @@ nbd_opt_vdrop(NBDClient *client, uint32_t type, Error **errp, return ret; } -static int G_GNUC_PRINTF(4, 5) +static coroutine_fn int G_GNUC_PRINTF(4, 5) nbd_opt_drop(NBDClient *client, uint32_t type, Error **errp, const char *fmt, ...) { @@ -302,7 +303,7 @@ nbd_opt_drop(NBDClient *client, uint32_t type, Error **errp, return ret; } -static int G_GNUC_PRINTF(3, 4) +static coroutine_fn int G_GNUC_PRINTF(3, 4) nbd_opt_invalid(NBDClient *client, Error **errp, const char *fmt, ...) { int ret; @@ -319,8 +320,9 @@ nbd_opt_invalid(NBDClient *client, Error **errp, const char *fmt, ...) * If @check_nul, require that no NUL bytes appear in buffer. * Return -errno on I/O error, 0 if option was completely handled by * sending a reply about inconsistent lengths, or 1 on success. */ -static int nbd_opt_read(NBDClient *client, void *buffer, size_t size, -bool check_nul, Error **errp) +static coroutine_fn int +nbd_opt_read(NBDClient *client, void *buffer, size_t size, + bool check_nul, Error **errp) { if (size > client->optlen) { return nbd_opt_invalid(client, errp, @@ -343,7 +345,8 @@ static int nbd_opt_read(NBDClient *client, void *buffer, size_t size, /* Drop size bytes from the unparsed payload of the current option. 
* Return -errno on I/O error, 0 if option was completely handled by * sending a reply about inconsistent lengths, or 1 on success. */ -static int nbd_opt_skip(NBDClient *client, size_t size, Error **errp) +static coroutine_fn int +nbd_opt_skip(NBDClient *client, size_t size, Error **errp) { if (size > client->optlen) { return nbd_opt_invalid(client, errp, @@ -366,8 +369,9 @@ static int nbd_opt_skip(NBDClient *client, size_t size, Error **errp) * Return -errno on I/O error, 0 if option was completely handled by * sending a reply about inconsistent lengths, or 1 on success. */ -static int nbd_opt_read_name(NBDClient *client, char **name, uint32_t *length, - Error **errp) +stati
[PULL 0/2] NBD patches for 2024-04-25
The following changes since commit 5da72194df36535d773c8bdc951529ecd5e31707: Merge tag 'pull-tcg-20240424' of https://gitlab.com/rth7680/qemu into staging (2024-04-24 15:51:49 -0700) are available in the Git repository at: https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2024-04-25 for you to fetch changes up to 4fa333e08dd96395a99ea8dd9e4c73a29dd23344: nbd/server: Mark negotiation functions as coroutine_fn (2024-04-25 12:59:19 -0500) NBD patches for 2024-04-25 - Avoid calling poll() within coroutine ---- Eric Blake (1): nbd/server: Mark negotiation functions as coroutine_fn Zhu Yangyang (1): nbd/server: do not poll within a coroutine context nbd/nbd-internal.h | 10 - nbd/client.c | 28 ++-- nbd/common.c | 11 - nbd/server.c | 128 + 4 files changed, 105 insertions(+), 72 deletions(-) -- 2.44.0
[PULL 1/2] nbd/server: do not poll within a coroutine context
From: Zhu Yangyang Coroutines are not supposed to block. Instead, they should yield. The client performs TLS upgrade outside of an AIOContext, during synchronous handshake; this still requires g_main_loop. But the server responds to TLS upgrade inside a coroutine, so a nested g_main_loop is wrong. Since the two callbacks no longer share more than the setting of data.complete and data.error, it's just as easy to use static helpers instead of trying to share a common code path. It is also possible to add assertions that no other code is interfering with the eventual path to qio reaching the callback, whether or not it required a yield or main loop. Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation") Signed-off-by: Zhu Yangyang [eblake: move callbacks to their use point, add assertions] Signed-off-by: Eric Blake Message-ID: <20240408160214.1200629-5-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/nbd-internal.h | 10 -- nbd/client.c | 28 nbd/common.c | 11 --- nbd/server.c | 28 +++- 4 files changed, 47 insertions(+), 30 deletions(-) diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h index dfa02f77ee4..91895106a95 100644 --- a/nbd/nbd-internal.h +++ b/nbd/nbd-internal.h @@ -72,16 +72,6 @@ static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size, return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? 
-EIO : 0; } -struct NBDTLSHandshakeData { -GMainLoop *loop; -bool complete; -Error *error; -}; - - -void nbd_tls_handshake(QIOTask *task, - void *opaque); - int nbd_drop(QIOChannel *ioc, size_t size, Error **errp); #endif diff --git a/nbd/client.c b/nbd/client.c index 29ffc609a4b..c89c7504673 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -596,13 +596,31 @@ static int nbd_request_simple_option(QIOChannel *ioc, int opt, bool strict, return 1; } +/* Callback to learn when QIO TLS upgrade is complete */ +struct NBDTLSClientHandshakeData { +bool complete; +Error *error; +GMainLoop *loop; +}; + +static void nbd_client_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSClientHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (data->loop) { +g_main_loop_quit(data->loop); +} +} + static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, const char *hostname, Error **errp) { int ret; QIOChannelTLS *tioc; -struct NBDTLSHandshakeData data = { 0 }; +struct NBDTLSClientHandshakeData data = { 0 }; ret = nbd_request_simple_option(ioc, NBD_OPT_STARTTLS, true, errp); if (ret <= 0) { @@ -619,18 +637,20 @@ static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, return NULL; } qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-client-tls"); -data.loop = g_main_loop_new(g_main_context_default(), FALSE); trace_nbd_receive_starttls_tls_handshake(); qio_channel_tls_handshake(tioc, - nbd_tls_handshake, + nbd_client_tls_handshake, &data, NULL, NULL); if (!data.complete) { +data.loop = g_main_loop_new(g_main_context_default(), FALSE); g_main_loop_run(data.loop); +assert(data.complete); +g_main_loop_unref(data.loop); } -g_main_loop_unref(data.loop); + if (data.error) { error_propagate(errp, data.error); object_unref(OBJECT(tioc)); diff --git a/nbd/common.c b/nbd/common.c index 3247c1d618a..589a748cfe6 100644 --- a/nbd/common.c +++ b/nbd/common.c @@ -47,17 +47,6 @@ int nbd_drop(QIOChannel *ioc, size_t size, 
Error **errp) } -void nbd_tls_handshake(QIOTask *task, - void *opaque) -{ -struct NBDTLSHandshakeData *data = opaque; - -qio_task_propagate_error(task, &data->error); -data->complete = true; -g_main_loop_quit(data->loop); -} - - const char *nbd_opt_lookup(uint32_t opt) { switch (opt) { diff --git a/nbd/server.c b/nbd/server.c index c3484cc1ebc..98ae0e16326 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -748,6 +748,23 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) return rc; } +/* Callback to learn when QIO TLS upgrade is complete */ +struct NBDTLSServerHandshakeData { +bool complete; +Error *error; +Coroutine *co; +}; + +static void nbd_server_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSServerHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (!qemu_coroutine_entered(dat
Re: [PATCH v3 09/13] block/gluster: Use URI parsing code from glib
On Thu, Apr 18, 2024 at 12:10:52PM +0200, Thomas Huth wrote: > Since version 2.66, glib has useful URI parsing functions, too. > Use those instead of the QEMU-internal ones to be finally able > to get rid of the latter. > > Since g_uri_get_path() returns a const pointer, we also need to > tweak the parameter of parse_volume_options() (where we use the > result of g_uri_get_path() as input). > > Reviewed-by: Eric Blake > Reviewed-by: Daniel P. Berrangé > Signed-off-by: Thomas Huth > --- > block/gluster.c | 71 - > 1 file changed, 35 insertions(+), 36 deletions(-) > > @@ -364,57 +363,57 @@ static int > qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf, > QAPI_LIST_PREPEND(gconf->server, gsconf); > > /* transport */ > -if (!uri->scheme || !strcmp(uri->scheme, "gluster")) { > +uri_scheme = g_uri_get_scheme(uri); > +if (!uri_scheme || !strcmp(uri_scheme, "gluster")) { > gsconf->type = SOCKET_ADDRESS_TYPE_INET; It may be worth a mention in the commit message that we are aware that this provides a positive user-visible change as a side-effect: namely, by virtue of using glib's parser (which normalizes the scheme to lowercase) instead of our own (which did not), we now accept GLUSTER:// URIs in addition to the usual gluster:// spelling. Similar comments to all the other affected patches in the series. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v3 04/13] tests: Update our CI to use CentOS Stream 9 instead of 8
On Thu, Apr 18, 2024 at 12:10:47PM +0200, Thomas Huth wrote: > RHEL 9 (and thus also the derivatives) are available since two years > now, so according to QEMU's support policy, we can drop the active Grammar suggestion: RHEL 9 (and thus also the derivatives) have been available for two years now, > support for the previous major version 8 now. > > Another reason for doing this is that Centos Stream 8 will go EOL soon: > > https://blog.centos.org/2023/04/end-dates-are-coming-for-centos-stream-8-and-centos-linux-7/ > > "After May 31, 2024, CentOS Stream 8 will be archived >and no further updates will be provided." > > Thus upgrade our CentOS Stream container to major version 9 now. > > Reviewed-by: Daniel P. Berrangé > Signed-off-by: Thomas Huth -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH] tests/unit: Remove debug statements in test-nested-aio-poll.c
On Fri, Apr 19, 2024 at 10:58:19AM +0200, Philippe Mathieu-Daudé wrote: > We are running this test since almost a year; it is Grammar suggestion: We have been running this test for almost a year; > safe to remove its debug statements, which clutter > CI jobs output: > > ▶ 88/100 /nested-aio-poll OK > io_read 0x16bb26158 > io_poll_true 0x16bb26158 > > io_poll_ready > io_read 0x16bb26164 > < io_poll_ready > io_poll_true 0x16bb26158 > io_poll_false 0x16bb26164 > > io_poll_ready > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_poll_false 0x16bb26164 > io_read 0x16bb26164 > < io_poll_ready > 88/100 qemu:unit / test-nested-aio-pollOK > > Signed-off-by: Philippe Mathieu-Daudé > --- > tests/unit/test-nested-aio-poll.c | 7 --- > 1 file changed, 7 deletions(-) Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v3 01/13] tests: Remove Ubuntu 20.04 container
On Thu, Apr 18, 2024 at 12:10:44PM +0200, Thomas Huth wrote: > Since Ubuntu 22.04 is now available since two years, we can stop Grammar suggestion: Since Ubuntu 22.04 has now been available for more than two years, > actively supporting the previous LTS version of Ubuntu now. > > Reviewed-by: Philippe Mathieu-Daudé > Signed-off-by: Thomas Huth > --- -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
MAINTAINERS tweak [was: [PATCH for-9.1 0/9] Switch to glib URI parsing code]
[Trying Peter Lieven's alternate address...] On Thu, Mar 28, 2024 at 03:05:57PM +0100, Thomas Huth wrote: > In the QEMU 9.1 development cycle, we can drop the support for > Ubuntu 20.04 and CentOS 8 since the following major versions of > these distributions are available since 2 years already. Every time I've replied to any message in this thread, I've gotten a response: | +Your message to p...@kamp.de couldn't be delivered. | kamp.de couldn't confirm that your message was sent from a trusted location. | eblake Office 365 pl | Action Required Recipient | SPF validation error | | How to Fix It | Your organization's email admin will have to diagnose and fix your domain's email settings. Please forward this message to your | +email admin. | | | | More Info for Email Admins | Status code: 550 5.7.23 | | This error occurs when Sender Policy Framework (SPF) validation for the sender's domain fails. If you're the sender's email | +admin, make sure the SPF records for your domain at your domain registrar are set up correctly. Office 365 supports only one | +SPF record (a TXT record that defines SPF) for your domain. Include the following domain name: spf.protection.outlook.com. If | +you have a hybrid configuration (some mailboxes in the cloud, and some mailboxes on premises) or if you're an Exchange Online | +Protection standalone customer, add the outbound IP address of your on-premises servers to the TXT record. Red Hat IT says that it is unlikely to be Red Hat's SPF settings, and suspects that it is instead something caused by whatever Peter is using to bounce mail from his alias Peter Lieven to his Office 365 account. As I appear to be unable to contact Peter (even my use of direct email, bypassing the list, and using a personal account instead of my Red Hat email) about this issue, I'm wondering if Peter is still an active contributor to the project. 
But while typing this email, to see if RBD, iSCSI, and NFS need a new entry in MAINTAINERS, I did a search through the list archives, where the last email I found from Peter was https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg00574.html, which asked to update MAINTAINERS to his new address, and that has not made it in so far... -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v2 13/13] util/uri: Remove the old URI parsing code
On Fri, Apr 12, 2024 at 03:24:15PM +0200, Thomas Huth wrote: > Now that we switched all consumers of the URI code to use the URI > parsing functions from glib instead, we can remove our internal > URI parsing code since it is not used anymore. > > Signed-off-by: Thomas Huth > --- > include/qemu/uri.h | 99 --- > util/uri.c | 1466 > util/meson.build |2 +- > 3 files changed, 1 insertion(+), 1566 deletions(-) > delete mode 100644 include/qemu/uri.h > delete mode 100644 util/uri.c Nice. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v2 12/13] block/ssh: Use URI parsing code from glib
On Fri, Apr 12, 2024 at 03:24:14PM +0200, Thomas Huth wrote: > Since version 2.66, glib has useful URI parsing functions, too. > Use those instead of the QEMU-internal ones to be finally able > to get rid of the latter. > > Reviewed-by: Richard W.M. Jones > Signed-off-by: Thomas Huth > --- > block/ssh.c | 75 - > 1 file changed, 40 insertions(+), 35 deletions(-) > > > -if (g_strcmp0(uri->scheme, "ssh") != 0) { > +if (g_strcmp0(g_uri_get_scheme(uri), "ssh") != 0) { Yet another case-sensitive spot to consider. > > -qdict_put_str(options, "path", uri->path); > - > -/* Pick out any query parameters that we understand, and ignore > - * the rest. > - */ > -for (i = 0; i < qp->n; ++i) { > -if (strcmp(qp->p[i].name, "host_key_check") == 0) { > -qdict_put_str(options, "host_key_check", qp->p[i].value); > +qdict_put_str(options, "path", uri_path); > + > +uri_query = g_uri_get_query(uri); > +if (uri_query) { > +g_uri_params_iter_init(&qp, uri_query, -1, "&", G_URI_PARAMS_NONE); > +while (g_uri_params_iter_next(&qp, &qp_name, &qp_value, &gerror)) { > +if (!qp_name || !qp_value || gerror) { > +warn_report("Failed to parse SSH URI parameters '%s'.", > +uri_query); > +break; > +} > +/* > + * Pick out the query parameters that we understand, and ignore > + * (or rather warn about) the rest. > + */ > +if (g_str_equal(qp_name, "host_key_check")) { > +qdict_put_str(options, "host_key_check", qp_value); > +} else { > +warn_report("Unsupported parameter '%s' in URI.", qp_name); Do we want the trailing '.' in warn_report? The warning is new; it was not in the old code, nor mentioned in the commit message. It seems like a good idea, but we should be more intentional if we intend to make that change. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
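For readers following the g_uri_params_iter loop above, here is a rough Python analogue of the same pattern: accept the query parameters we understand and collect a warning for the rest. The function name and parameter list are invented for illustration and are not QEMU code:

```python
from urllib.parse import urlsplit, parse_qsl

def collect_known_params(url, known=("host_key_check",)):
    """Pick out understood query parameters; warn about any others."""
    options = {}
    warnings = []
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        if name in known:
            options[name] = value
        else:
            warnings.append(f"Unsupported parameter '{name}' in URI")
    return options, warnings

opts, warns = collect_known_params("ssh://host/disk.img?host_key_check=yes&bogus=1")
print(opts)    # {'host_key_check': 'yes'}
print(warns)   # ["Unsupported parameter 'bogus' in URI"]
```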
Re: [PATCH v2 11/13] block/nfs: Use URI parsing code from glib
On Fri, Apr 12, 2024 at 03:24:13PM +0200, Thomas Huth wrote: > Since version 2.66, glib has useful URI parsing functions, too. > Use those instead of the QEMU-internal ones to be finally able > to get rid of the latter. > > Signed-off-by: Thomas Huth > --- > block/nfs.c | 110 ++-- > 1 file changed, 54 insertions(+), 56 deletions(-) > > } > -if (g_strcmp0(uri->scheme, "nfs") != 0) { > +if (!g_str_equal(g_uri_get_scheme(uri), "nfs")) { Another case where we should be considering whether g_ascii_strcasecmp is better, as a separate patch. > -for (i = 0; i < qp->n; i++) { > -uint64_t val; > -if (!qp->p[i].value) { > -error_setg(errp, "Value for NFS parameter expected: %s", > - qp->p[i].name); > -goto out; > -} > -if (parse_uint_full(qp->p[i].value, 0, &val)) { > -error_setg(errp, "Illegal value for NFS parameter: %s", Pre-existing,... > + > +uri_query = g_uri_get_query(uri); > +if (uri_query) { > +g_uri_params_iter_init(&qp, uri_query, -1, "&", G_URI_PARAMS_NONE); > +while (g_uri_params_iter_next(&qp, &qp_name, &qp_value, &gerror)) { > +uint64_t val; > +if (!qp_name || gerror) { > +error_setg(errp, "Failed to parse NFS parameter"); > +return -EINVAL; > +} > +if (!qp_value) { > +error_setg(errp, "Value for NFS parameter expected: %s", > + qp_name); > +return -EINVAL; > +} > +if (parse_uint_full(qp_value, 0, &val)) { > +error_setg(errp, "Illegal value for NFS parameter: %s", ...but since we're touching it, I prefer 'Invalid' over 'Illegal' (any error message implying that you broke a law when you passed in bad data is a bit aggressive). Not a show-stopper to leave it alone. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
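The parse_uint_full() calls above enforce that the entire string parses as an unsigned integer. A loose Python sketch of that contract, using the 'Invalid' wording suggested above (the real helper lives in QEMU's cutils.c and its base/prefix rules may differ in detail):

```python
def parse_uint_full(s, base=0):
    """Strict unsigned parse: the whole string must be a non-negative
    integer. Base 0 honors a 0x prefix, loosely like C's strtoul."""
    try:
        val = int(s, base)
    except (TypeError, ValueError):
        raise ValueError(f"Invalid value for NFS parameter: {s}")
    if val < 0:
        raise ValueError(f"Invalid value for NFS parameter: {s}")
    return val

assert parse_uint_full("42") == 42
assert parse_uint_full("0x10") == 16
```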
Re: [PATCH v2 10/13] block/nbd: Use URI parsing code from glib
On Fri, Apr 12, 2024 at 03:24:12PM +0200, Thomas Huth wrote: > Since version 2.66, glib has useful URI parsing functions, too. > Use those instead of the QEMU-internal ones to be finally able > to get rid of the latter. The g_uri_get_host() also takes care > of removing the square brackets from IPv6 addresses, so we can > drop that part of the QEMU code now, too. > > Reviewed-by: Richard W.M. Jones > Signed-off-by: Thomas Huth > --- > block/nbd.c | 76 ++--- > 1 file changed, 37 insertions(+), 39 deletions(-) > > > /* transport */ > -if (!g_strcmp0(uri->scheme, "nbd")) { > +uri_scheme = g_uri_get_scheme(uri); > +if (!g_strcmp0(uri_scheme, "nbd")) { > is_unix = false; As with gluster, we should probably be using g_ascii_strcasecmp() here to match the RFC; again, worth a separate patch. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v2 09/13] block/gluster: Use URI parsing code from glib
On Fri, Apr 12, 2024 at 09:40:18AM -0500, Eric Blake wrote: > > @@ -364,57 +363,57 @@ static int > > qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf, > > QAPI_LIST_PREPEND(gconf->server, gsconf); > > > > /* transport */ > > -if (!uri->scheme || !strcmp(uri->scheme, "gluster")) { > > +uri_scheme = g_uri_get_scheme(uri); > > +if (!uri_scheme || !strcmp(uri_scheme, "gluster")) { > > Pre-existing, but per RFC 3986, we should probably be using strcasecmp > for scheme comparisons (I'm not sure if g_uri_parse guarantees a > lower-case return, even when the user passed in upper case). That can > be a separate patch. Even better, g_ascii_strcasecmp() (since strcasecmp can be locale-specific which is generally not what we need here) -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
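To make the locale concern concrete: an ASCII-only fold (what g_ascii_strcasecmp() effectively performs) touches only A-Z, while full language-aware case rules can do surprising things, the classic example being the Turkish dotted capital I. A Python sketch; the helper name is invented for illustration:

```python
def ascii_casefold(s):
    # Fold only ASCII A-Z, leaving every other character untouched;
    # this mirrors the locale-independent behavior of g_ascii_strcasecmp().
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)

assert ascii_casefold("GLUSTER") == "gluster"
assert ascii_casefold("İ") == "İ"          # non-ASCII left alone

# Full Unicode case mapping is another matter: U+0130 (Turkish dotted
# capital I) lowercases to 'i' plus a combining dot above -- two code
# points -- which is exactly the kind of language-dependent surprise a
# URI scheme comparison must avoid.
assert "İ".lower() == "i\u0307"
```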
Re: [PATCH v2 09/13] block/gluster: Use URI parsing code from glib
On Fri, Apr 12, 2024 at 03:24:11PM +0200, Thomas Huth wrote: > Since version 2.66, glib has useful URI parsing functions, too. > Use those instead of the QEMU-internal ones to be finally able > to get rid of the latter. > > Signed-off-by: Thomas Huth > --- > block/gluster.c | 71 - > 1 file changed, 35 insertions(+), 36 deletions(-) > > diff --git a/block/gluster.c b/block/gluster.c > index cc74af06dc..1c9505f8bb 100644 > --- a/block/gluster.c > +++ b/block/gluster.c > @@ -17,7 +17,6 @@ > #include "qapi/error.h" > #include "qapi/qmp/qdict.h" > #include "qapi/qmp/qerror.h" > -#include "qemu/uri.h" > #include "qemu/error-report.h" > #include "qemu/module.h" > #include "qemu/option.h" > @@ -289,9 +288,9 @@ static void glfs_clear_preopened(glfs_t *fs) > } > } > > -static int parse_volume_options(BlockdevOptionsGluster *gconf, char *path) > +static int parse_volume_options(BlockdevOptionsGluster *gconf, const char > *path) Is it worth mentioning in the commit message that this includes a const-correctness tweak? > @@ -364,57 +363,57 @@ static int > qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf, > QAPI_LIST_PREPEND(gconf->server, gsconf); > > /* transport */ > -if (!uri->scheme || !strcmp(uri->scheme, "gluster")) { > +uri_scheme = g_uri_get_scheme(uri); > +if (!uri_scheme || !strcmp(uri_scheme, "gluster")) { Pre-existing, but per RFC 3986, we should probably be using strcasecmp for scheme comparisons (I'm not sure if g_uri_parse guarantees a lower-case return, even when the user passed in upper case). That can be a separate patch. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v5 2/2] nbd/server: Mark negotiation functions as coroutine_fn
On Wed, Apr 10, 2024 at 10:05:28AM +0300, Vladimir Sementsov-Ogievskiy wrote: > > > > @@ -755,7 +764,8 @@ struct NBDTLSServerHandshakeData { > > > >Coroutine *co; > > > >}; > > > > > > > > -static void nbd_server_tls_handshake(QIOTask *task, void *opaque) > > > > +static coroutine_fn void > > > > > > This is not, that's a callback for tls handshake, which is not coroutine > > > context as I understand. > > > > The call chain is: > > > > nbd_negotiate() - coroutine_fn before this patch > >nbd_negotiate_options() - marked coroutine_fn here > > nbd_negotiate_handle_starttls() - marked coroutine_fn here > >qio_channel_tls_handshake() - in io/channel-tls.c; not marked > > coroutine_fn, but > > works both in or out of coroutine > > context > > ... > > nbd_server_tls_handshake() - renamed in 1/2 of this series > > > > > without this hunk: > > > > I can drop that particular marking. There are cases where the > > callback is reached without having done the qemu_coroutine_yield() in > > nbd_negotiate_handle_starttls; but there are also cases where the IO > > took enough time that the coroutine yielded and it is that callback > > that reawakens it. > > The key thing for me is that in this case (when coroutine yielded), > nbd_server_tls_handshake() would finally be called from glib IO handler, set > in > >qio_channel_tls_handshake() > qio_channel_tls_handshake_task() > qio_channel_add_watch_full() > g_source_set_callback(source, (GSourceFunc)func, user_data, > notify); <<< > > , which would definitely not be a coroutine context. > > > Do I understand correctly, that "coroutine_fn" means "only call from > coroutine context"[1], or does it mean "could be called from coroutine > context"[2]? I'm always fuzzy on that distinction myself. But reading the docs helps... > > The comment in osdep.h says: > > * Mark a function that executes in coroutine context > |} > * > | > * Functions that execute in coroutine context cannot be called directly from > | > * normal functions. ... 
> > So I assume, we mean [1]. ...[1] sounds more appropriate. Adding the marker is more of a promise that "I've audited that this function does not block and is therefore safe for a function in coroutine context to use", but sometimes additionally implies "this function assumes a coroutine is present; if you are not in a coroutine, things might break". But with a bit of thought, it is obvious that a coroutine can call functions that are not marked with coroutine_fn: any non-blocking syscall fits in this category (since we don't have control over system headers to add coroutine_fn annotations to those). So on those grounds, it is okay for a function marked coroutine_fn to call another function not marked coroutine_fn - it just makes the auditing harder to ensure that you aren't violating your promise of no blocking calls. It's too close to 9.0-rc3 for my comfort to include this patch series. Even though this bug can cause wedged migrations, I'd feel more comfortable with preparing the pull request for this series (including your fix for dropping this one annotation) for 9.1 and for qemu-stable once the release is complete. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
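The [1] reading has a rough analogue in Python's asyncio that may help: a coroutine may call any ordinary function so long as it does not block, but a blocking wait must become a yield to the event loop. Hypothetical illustration, not QEMU code:

```python
import asyncio

def helper(x):
    # Plain function, but non-blocking: safe to call from a coroutine,
    # just as a coroutine_fn may call an unannotated non-blocking helper.
    return x * 2

async def worker():
    v = helper(21)
    # Calling something blocking here (e.g. time.sleep) would stall the
    # whole event loop; a coroutine must yield instead:
    await asyncio.sleep(0)
    return v

print(asyncio.run(worker()))  # prints 42
```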
Re: [PATCH v5 2/2] nbd/server: Mark negotiation functions as coroutine_fn
On Tue, Apr 09, 2024 at 09:30:39AM +0300, Vladimir Sementsov-Ogievskiy wrote: > On 08.04.24 19:00, Eric Blake wrote: > > nbd_negotiate() is already marked coroutine_fn. And given the fix in > > the previous patch to have nbd_negotiate_handle_starttls not create > > and wait on a g_main_loop (as that would violate coroutine > > constraints), it is worth marking the rest of the related static > > functions reachable only during option negotiation as also being > > coroutine_fn. > > > > Suggested-by: Vladimir Sementsov-Ogievskiy > > Signed-off-by: Eric Blake > > --- > > nbd/server.c | 102 +-- > > 1 file changed, 59 insertions(+), 43 deletions(-) > > > > diff --git a/nbd/server.c b/nbd/server.c > > index 98ae0e16326..1857fba51c1 100644 > > [..] > > > { > > int rc; > > g_autofree char *name = NULL; > > @@ -755,7 +764,8 @@ struct NBDTLSServerHandshakeData { > > Coroutine *co; > > }; > > > > -static void nbd_server_tls_handshake(QIOTask *task, void *opaque) > > +static coroutine_fn void > > This is not, that's a callback for tls handshake, which is not coroutine > context as I understand. The call chain is: nbd_negotiate() - coroutine_fn before this patch nbd_negotiate_options() - marked coroutine_fn here nbd_negotiate_handle_starttls() - marked coroutine_fn here qio_channel_tls_handshake() - in io/channel-tls.c; not marked coroutine_fn, but works both in or out of coroutine context ... nbd_server_tls_handshake() - renamed in 1/2 of this series > without this hunk: I can drop that particular marking. There are cases where the callback is reached without having done the qemu_coroutine_yield() in nbd_negotiate_handle_starttls; but there are also cases where the IO took enough time that the coroutine yielded and it is that callback that reawakens it. > > > Reviewed-by: Vladimir Sementsov-Ogievskiy Thanks. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
[PATCH v5 1/2] nbd/server: do not poll within a coroutine context
From: Zhu Yangyang Coroutines are not supposed to block. Instead, they should yield. The client performs TLS upgrade outside of an AIOContext, during synchronous handshake; this still requires g_main_loop. But the server responds to TLS upgrade inside a coroutine, so a nested g_main_loop is wrong. Since the two callbacks no longer share more than the setting of data.complete and data.error, it's just as easy to use static helpers instead of trying to share a common code path. It is also possible to add assertions that no other code is interfering with the eventual path to qio reaching the callback, whether or not it required a yield or main loop. Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation") Signed-off-by: Zhu Yangyang [eblake: move callbacks to their use point, add assertions] Signed-off-by: Eric Blake --- nbd/nbd-internal.h | 10 -- nbd/client.c | 28 nbd/common.c | 11 --- nbd/server.c | 28 +++- 4 files changed, 47 insertions(+), 30 deletions(-) diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h index dfa02f77ee4..91895106a95 100644 --- a/nbd/nbd-internal.h +++ b/nbd/nbd-internal.h @@ -72,16 +72,6 @@ static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size, return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? 
-EIO : 0; } -struct NBDTLSHandshakeData { -GMainLoop *loop; -bool complete; -Error *error; -}; - - -void nbd_tls_handshake(QIOTask *task, - void *opaque); - int nbd_drop(QIOChannel *ioc, size_t size, Error **errp); #endif diff --git a/nbd/client.c b/nbd/client.c index 29ffc609a4b..c89c7504673 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -596,13 +596,31 @@ static int nbd_request_simple_option(QIOChannel *ioc, int opt, bool strict, return 1; } +/* Callback to learn when QIO TLS upgrade is complete */ +struct NBDTLSClientHandshakeData { +bool complete; +Error *error; +GMainLoop *loop; +}; + +static void nbd_client_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSClientHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (data->loop) { +g_main_loop_quit(data->loop); +} +} + static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, const char *hostname, Error **errp) { int ret; QIOChannelTLS *tioc; -struct NBDTLSHandshakeData data = { 0 }; +struct NBDTLSClientHandshakeData data = { 0 }; ret = nbd_request_simple_option(ioc, NBD_OPT_STARTTLS, true, errp); if (ret <= 0) { @@ -619,18 +637,20 @@ static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, return NULL; } qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-client-tls"); -data.loop = g_main_loop_new(g_main_context_default(), FALSE); trace_nbd_receive_starttls_tls_handshake(); qio_channel_tls_handshake(tioc, - nbd_tls_handshake, + nbd_client_tls_handshake, &data, NULL, NULL); if (!data.complete) { +data.loop = g_main_loop_new(g_main_context_default(), FALSE); g_main_loop_run(data.loop); +assert(data.complete); +g_main_loop_unref(data.loop); } -g_main_loop_unref(data.loop); + if (data.error) { error_propagate(errp, data.error); object_unref(OBJECT(tioc)); diff --git a/nbd/common.c b/nbd/common.c index 3247c1d618a..589a748cfe6 100644 --- a/nbd/common.c +++ b/nbd/common.c @@ -47,17 +47,6 @@ int nbd_drop(QIOChannel *ioc, size_t size, 
Error **errp) } -void nbd_tls_handshake(QIOTask *task, - void *opaque) -{ -struct NBDTLSHandshakeData *data = opaque; - -qio_task_propagate_error(task, &data->error); -data->complete = true; -g_main_loop_quit(data->loop); -} - - const char *nbd_opt_lookup(uint32_t opt) { switch (opt) { diff --git a/nbd/server.c b/nbd/server.c index c3484cc1ebc..98ae0e16326 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -748,6 +748,23 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) return rc; } +/* Callback to learn when QIO TLS upgrade is complete */ +struct NBDTLSServerHandshakeData { +bool complete; +Error *error; +Coroutine *co; +}; + +static void nbd_server_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSServerHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (!qemu_coroutine_entered(data->co)) { +aio_co_wake(data->co); +} +} /* Handle NBD_OPT_STARTTLS. Return NUL
[PATCH v5 2/2] nbd/server: Mark negotiation functions as coroutine_fn
nbd_negotiate() is already marked coroutine_fn. And given the fix in the previous patch to have nbd_negotiate_handle_starttls not create and wait on a g_main_loop (as that would violate coroutine constraints), it is worth marking the rest of the related static functions reachable only during option negotiation as also being coroutine_fn. Suggested-by: Vladimir Sementsov-Ogievskiy Signed-off-by: Eric Blake --- nbd/server.c | 102 +-- 1 file changed, 59 insertions(+), 43 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index 98ae0e16326..1857fba51c1 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -195,8 +195,9 @@ static inline void set_be_option_rep(NBDOptionReply *rep, uint32_t option, /* Send a reply header, including length, but no payload. * Return -errno on error, 0 on success. */ -static int nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type, - uint32_t len, Error **errp) +static coroutine_fn int +nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type, + uint32_t len, Error **errp) { NBDOptionReply rep; @@ -211,15 +212,15 @@ static int nbd_negotiate_send_rep_len(NBDClient *client, uint32_t type, /* Send a reply header with default 0 length. * Return -errno on error, 0 on success. */ -static int nbd_negotiate_send_rep(NBDClient *client, uint32_t type, - Error **errp) +static coroutine_fn int +nbd_negotiate_send_rep(NBDClient *client, uint32_t type, Error **errp) { return nbd_negotiate_send_rep_len(client, type, 0, errp); } /* Send an error reply. * Return -errno on error, 0 on success. */ -static int G_GNUC_PRINTF(4, 0) +static coroutine_fn int G_GNUC_PRINTF(4, 0) nbd_negotiate_send_rep_verr(NBDClient *client, uint32_t type, Error **errp, const char *fmt, va_list va) { @@ -259,7 +260,7 @@ nbd_sanitize_name(const char *name) /* Send an error reply. * Return -errno on error, 0 on success. 
*/ -static int G_GNUC_PRINTF(4, 5) +static coroutine_fn int G_GNUC_PRINTF(4, 5) nbd_negotiate_send_rep_err(NBDClient *client, uint32_t type, Error **errp, const char *fmt, ...) { @@ -275,7 +276,7 @@ nbd_negotiate_send_rep_err(NBDClient *client, uint32_t type, /* Drop remainder of the current option, and send a reply with the * given error type and message. Return -errno on read or write * failure; or 0 if connection is still live. */ -static int G_GNUC_PRINTF(4, 0) +static coroutine_fn int G_GNUC_PRINTF(4, 0) nbd_opt_vdrop(NBDClient *client, uint32_t type, Error **errp, const char *fmt, va_list va) { @@ -288,7 +289,7 @@ nbd_opt_vdrop(NBDClient *client, uint32_t type, Error **errp, return ret; } -static int G_GNUC_PRINTF(4, 5) +static coroutine_fn int G_GNUC_PRINTF(4, 5) nbd_opt_drop(NBDClient *client, uint32_t type, Error **errp, const char *fmt, ...) { @@ -302,7 +303,7 @@ nbd_opt_drop(NBDClient *client, uint32_t type, Error **errp, return ret; } -static int G_GNUC_PRINTF(3, 4) +static coroutine_fn int G_GNUC_PRINTF(3, 4) nbd_opt_invalid(NBDClient *client, Error **errp, const char *fmt, ...) { int ret; @@ -319,8 +320,9 @@ nbd_opt_invalid(NBDClient *client, Error **errp, const char *fmt, ...) * If @check_nul, require that no NUL bytes appear in buffer. * Return -errno on I/O error, 0 if option was completely handled by * sending a reply about inconsistent lengths, or 1 on success. */ -static int nbd_opt_read(NBDClient *client, void *buffer, size_t size, -bool check_nul, Error **errp) +static coroutine_fn int +nbd_opt_read(NBDClient *client, void *buffer, size_t size, + bool check_nul, Error **errp) { if (size > client->optlen) { return nbd_opt_invalid(client, errp, @@ -343,7 +345,8 @@ static int nbd_opt_read(NBDClient *client, void *buffer, size_t size, /* Drop size bytes from the unparsed payload of the current option. * Return -errno on I/O error, 0 if option was completely handled by * sending a reply about inconsistent lengths, or 1 on success. 
*/ -static int nbd_opt_skip(NBDClient *client, size_t size, Error **errp) +static coroutine_fn int +nbd_opt_skip(NBDClient *client, size_t size, Error **errp) { if (size > client->optlen) { return nbd_opt_invalid(client, errp, @@ -366,8 +369,9 @@ static int nbd_opt_skip(NBDClient *client, size_t size, Error **errp) * Return -errno on I/O error, 0 if option was completely handled by * sending a reply about inconsistent lengths, or 1 on success. */ -static int nbd_opt_read_name(NBDClient *client, char **name, uint32_t *length, - Error **errp) +static coroutine_fn int +nbd_opt_read_name(NBDClient *client, char **name, uint32_t *length, + Error **errp) { int ret; uint32_t len; @@ -402,8 +406,8 @@ static int nbd_o
[PATCH v5 for-9.0? 0/2] Fix NBD TLS poll in coroutine
v4 was here: https://lists.gnu.org/archive/html/qemu-devel/2024-04/msg00624.html Since then: add some asserts [Vladimir], add second patch with more coroutine_fn annotations [Vladimir] Eric Blake (1): nbd/server: Mark negotiation functions as coroutine_fn Zhu Yangyang (1): nbd/server: do not poll within a coroutine context nbd/nbd-internal.h | 10 nbd/client.c | 28 -- nbd/common.c | 11 nbd/server.c | 128 - 4 files changed, 105 insertions(+), 72 deletions(-) -- 2.44.0
Re: [PATCH v4] nbd/server: do not poll within a coroutine context
On Mon, Apr 08, 2024 at 11:46:39AM +0300, Vladimir Sementsov-Ogievskiy wrote: > On 05.04.24 20:44, Eric Blake wrote: > > From: Zhu Yangyang > > > > Coroutines are not supposed to block. Instead, they should yield. > > > > The client performs TLS upgrade outside of an AIOContext, during > > synchronous handshake; this still requires g_main_loop. But the > > server responds to TLS upgrade inside a coroutine, so a nested > > g_main_loop is wrong. Since the two callbacks no longer share more > > than the setting of data.complete and data.error, it's just as easy to > > use static helpers instead of trying to share a common code path. > > > > Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation") > > Signed-off-by: Zhu Yangyang > > [eblake: move callbacks to their use point] > > Signed-off-by: Eric Blake > > Reviewed-by: Vladimir Sementsov-Ogievskiy I'm debating whether it is worth trying to shove this into 9.0; -rc3 is very late, and the problem is pre-existing, so I'm leaning towards no. At which point, it's better to get this right. 
> > still, some notes below > > > --- > > > > v3: https://lists.gnu.org/archive/html/qemu-devel/2024-04/msg00375.html > > > > in v4, factor even the struct to the .c files, avoiding a union [Vladimir] > > > > nbd/nbd-internal.h | 10 -- > > nbd/client.c | 27 +++ > > nbd/common.c | 11 --- > > nbd/server.c | 29 +++-- > > 4 files changed, 46 insertions(+), 31 deletions(-) > > > > +++ b/nbd/client.c > > @@ -596,13 +596,31 @@ static int nbd_request_simple_option(QIOChannel *ioc, > > int opt, bool strict, > > return 1; > > } > > > > +/* Callback to learn when QIO TLS upgrade is complete */ > > +struct NBDTLSClientHandshakeData { > > +bool complete; > > +Error *error; > > +GMainLoop *loop; > > +}; > > + > > +static void nbd_client_tls_handshake(QIOTask *task, void *opaque) > > +{ > > +struct NBDTLSClientHandshakeData *data = opaque; > > + > > +qio_task_propagate_error(task, &data->error); > > +data->complete = true; > > +if (data->loop) { > > +g_main_loop_quit(data->loop); > > +} > > +} > > + > > static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, > > QCryptoTLSCreds *tlscreds, > > const char *hostname, Error > > **errp) > > { > > int ret; > > QIOChannelTLS *tioc; > > -struct NBDTLSHandshakeData data = { 0 }; > > +struct NBDTLSClientHandshakeData data = { 0 }; > > > > ret = nbd_request_simple_option(ioc, NBD_OPT_STARTTLS, true, errp); > > if (ret <= 0) { > > @@ -619,18 +637,19 @@ static QIOChannel *nbd_receive_starttls(QIOChannel > > *ioc, > > return NULL; > > } > > qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-client-tls"); > > -data.loop = g_main_loop_new(g_main_context_default(), FALSE); > > trace_nbd_receive_starttls_tls_handshake(); > > qio_channel_tls_handshake(tioc, > > - nbd_tls_handshake, > > + nbd_client_tls_handshake, > > &data, > > NULL, > > NULL); > > > > if (!data.complete) { > > +data.loop = g_main_loop_new(g_main_context_default(), FALSE); > > g_main_loop_run(data.loop); > > +g_main_loop_unref(data.loop); > > probably good to assert(data.complete); 
Seems reasonable. > > +++ b/nbd/server.c > > @@ -748,6 +748,23 @@ static int nbd_negotiate_handle_info(NBDClient > > *client, Error **errp) > > return rc; > > } > > > > +/* Callback to learn when QIO TLS upgrade is complete */ > > +struct NBDTLSServerHandshakeData { > > +bool complete; > > +Error *error; > > +Coroutine *co; > > +}; > > + > > +static void nbd_server_tls_handshake(QIOTask *task, void *opaque) > > +{ > > +struct NBDTLSServerHandshakeData *data = opaque; > > + > > +qio_task_propagate_error(task, &data->error); > > +data->complete = true; > > +if (!qemu_coroutine_entered(data->co)) { > > +aio_co_wake(data->co); > > +} > > +}
[PATCH v4] nbd/server: do not poll within a coroutine context
From: Zhu Yangyang Coroutines are not supposed to block. Instead, they should yield. The client performs TLS upgrade outside of an AIOContext, during synchronous handshake; this still requires g_main_loop. But the server responds to TLS upgrade inside a coroutine, so a nested g_main_loop is wrong. Since the two callbacks no longer share more than the setting of data.complete and data.error, it's just as easy to use static helpers instead of trying to share a common code path. Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation") Signed-off-by: Zhu Yangyang [eblake: move callbacks to their use point] Signed-off-by: Eric Blake --- v3: https://lists.gnu.org/archive/html/qemu-devel/2024-04/msg00375.html in v4, factor even the struct to the .c files, avoiding a union [Vladimir] nbd/nbd-internal.h | 10 -- nbd/client.c | 27 +++ nbd/common.c | 11 --- nbd/server.c | 29 +++-- 4 files changed, 46 insertions(+), 31 deletions(-) diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h index dfa02f77ee4..91895106a95 100644 --- a/nbd/nbd-internal.h +++ b/nbd/nbd-internal.h @@ -72,16 +72,6 @@ static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size, return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? 
-EIO : 0; } -struct NBDTLSHandshakeData { -GMainLoop *loop; -bool complete; -Error *error; -}; - - -void nbd_tls_handshake(QIOTask *task, - void *opaque); - int nbd_drop(QIOChannel *ioc, size_t size, Error **errp); #endif diff --git a/nbd/client.c b/nbd/client.c index 29ffc609a4b..c7141d7a098 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -596,13 +596,31 @@ static int nbd_request_simple_option(QIOChannel *ioc, int opt, bool strict, return 1; } +/* Callback to learn when QIO TLS upgrade is complete */ +struct NBDTLSClientHandshakeData { +bool complete; +Error *error; +GMainLoop *loop; +}; + +static void nbd_client_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSClientHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (data->loop) { +g_main_loop_quit(data->loop); +} +} + static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, const char *hostname, Error **errp) { int ret; QIOChannelTLS *tioc; -struct NBDTLSHandshakeData data = { 0 }; +struct NBDTLSClientHandshakeData data = { 0 }; ret = nbd_request_simple_option(ioc, NBD_OPT_STARTTLS, true, errp); if (ret <= 0) { @@ -619,18 +637,19 @@ static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, return NULL; } qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-client-tls"); -data.loop = g_main_loop_new(g_main_context_default(), FALSE); trace_nbd_receive_starttls_tls_handshake(); qio_channel_tls_handshake(tioc, - nbd_tls_handshake, + nbd_client_tls_handshake, &data, NULL, NULL); if (!data.complete) { +data.loop = g_main_loop_new(g_main_context_default(), FALSE); g_main_loop_run(data.loop); +g_main_loop_unref(data.loop); } -g_main_loop_unref(data.loop); + if (data.error) { error_propagate(errp, data.error); object_unref(OBJECT(tioc)); diff --git a/nbd/common.c b/nbd/common.c index 3247c1d618a..589a748cfe6 100644 --- a/nbd/common.c +++ b/nbd/common.c @@ -47,17 +47,6 @@ int nbd_drop(QIOChannel *ioc, size_t size, Error **errp) } -void 
nbd_tls_handshake(QIOTask *task, - void *opaque) -{ -struct NBDTLSHandshakeData *data = opaque; - -qio_task_propagate_error(task, &data->error); -data->complete = true; -g_main_loop_quit(data->loop); -} - - const char *nbd_opt_lookup(uint32_t opt) { switch (opt) { diff --git a/nbd/server.c b/nbd/server.c index c3484cc1ebc..ea13cf0e766 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -748,6 +748,23 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) return rc; } +/* Callback to learn when QIO TLS upgrade is complete */ +struct NBDTLSServerHandshakeData { +bool complete; +Error *error; +Coroutine *co; +}; + +static void nbd_server_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSServerHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (!qemu_coroutine_entered(data->co)) { +aio_co_wake(data->co); +} +} /* Handle NBD_OPT_STARTTLS. Return NULL to drop connection, or else the * new channel for all
Re: [PATCH v3] nbd/server: do not poll within a coroutine context
On Fri, Apr 05, 2024 at 05:10:16PM +0300, Vladimir Sementsov-Ogievskiy wrote: > On 04.04.24 04:42, Eric Blake wrote: > > From: Zhu Yangyang > > > > Coroutines are not supposed to block. Instead, they should yield. > > > > The client performs TLS upgrade outside of an AIOContext, during > > synchronous handshake; this still requires g_main_loop. But the > > server responds to TLS upgrade inside a coroutine, so a nested > > g_main_loop is wrong. Since the two callbacks no longer share more > > than the setting of data.complete and data.error, it's just as easy to > > use static helpers instead of trying to share a common code path. > > > > Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation") > > Signed-off-by: Zhu Yangyang > > [eblake: move callbacks to their use point] > > Signed-off-by: Eric Blake > > --- > > > > After looking at this more, I'm less convinced that there is enough > > common code here to even be worth trying to share in common.c. This > > takes the essence of the v2 patch, but refactors it a bit. > > Maybe, do the complete split, and make separate structure definitions in > client.c and server.c, and don't make shared NBDTLSHandshakeData with union? > Finally, it's just a simple opaque-structure for static callback function, > seems good to keep it in .c as well. Sure, v4 coming up along those lines. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
[PATCH v3] nbd/server: do not poll within a coroutine context
From: Zhu Yangyang Coroutines are not supposed to block. Instead, they should yield. The client performs TLS upgrade outside of an AIOContext, during synchronous handshake; this still requires g_main_loop. But the server responds to TLS upgrade inside a coroutine, so a nested g_main_loop is wrong. Since the two callbacks no longer share more than the setting of data.complete and data.error, it's just as easy to use static helpers instead of trying to share a common code path. Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation") Signed-off-by: Zhu Yangyang [eblake: move callbacks to their use point] Signed-off-by: Eric Blake --- After looking at this more, I'm less convinced that there is enough common code here to even be worth trying to share in common.c. This takes the essence of the v2 patch, but refactors it a bit. v2: https://lists.gnu.org/archive/html/qemu-devel/2024-04/msg00019.html nbd/nbd-internal.h | 20 ++-- nbd/client.c | 21 + nbd/common.c | 11 --- nbd/server.c | 21 - 4 files changed, 43 insertions(+), 30 deletions(-) diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h index dfa02f77ee4..087c6bfc002 100644 --- a/nbd/nbd-internal.h +++ b/nbd/nbd-internal.h @@ -63,6 +63,16 @@ #define NBD_SET_TIMEOUT _IO(0xab, 9) #define NBD_SET_FLAGS _IO(0xab, 10) +/* Used in NBD_OPT_STARTTLS handling */ +struct NBDTLSHandshakeData { +bool complete; +Error *error; +union { +GMainLoop *loop; +Coroutine *co; +} u; +}; + /* nbd_write * Writes @size bytes to @ioc. Returns 0 on success. */ @@ -72,16 +82,6 @@ static inline int nbd_write(QIOChannel *ioc, const void *buffer, size_t size, return qio_channel_write_all(ioc, buffer, size, errp) < 0 ? 
-EIO : 0; } -struct NBDTLSHandshakeData { -GMainLoop *loop; -bool complete; -Error *error; -}; - - -void nbd_tls_handshake(QIOTask *task, - void *opaque); - int nbd_drop(QIOChannel *ioc, size_t size, Error **errp); #endif diff --git a/nbd/client.c b/nbd/client.c index 29ffc609a4b..c9dc5265404 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -596,6 +596,18 @@ static int nbd_request_simple_option(QIOChannel *ioc, int opt, bool strict, return 1; } +/* Callback to learn when QIO TLS upgrade is complete */ +static void nbd_client_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (data->u.loop) { +g_main_loop_quit(data->u.loop); +} +} + static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, QCryptoTLSCreds *tlscreds, const char *hostname, Error **errp) @@ -619,18 +631,19 @@ static QIOChannel *nbd_receive_starttls(QIOChannel *ioc, return NULL; } qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-client-tls"); -data.loop = g_main_loop_new(g_main_context_default(), FALSE); trace_nbd_receive_starttls_tls_handshake(); qio_channel_tls_handshake(tioc, - nbd_tls_handshake, + nbd_client_tls_handshake, &data, NULL, NULL); if (!data.complete) { -g_main_loop_run(data.loop); +data.u.loop = g_main_loop_new(g_main_context_default(), FALSE); +g_main_loop_run(data.u.loop); +g_main_loop_unref(data.u.loop); } -g_main_loop_unref(data.loop); + if (data.error) { error_propagate(errp, data.error); object_unref(OBJECT(tioc)); diff --git a/nbd/common.c b/nbd/common.c index 3247c1d618a..589a748cfe6 100644 --- a/nbd/common.c +++ b/nbd/common.c @@ -47,17 +47,6 @@ int nbd_drop(QIOChannel *ioc, size_t size, Error **errp) } -void nbd_tls_handshake(QIOTask *task, - void *opaque) -{ -struct NBDTLSHandshakeData *data = opaque; - -qio_task_propagate_error(task, &data->error); -data->complete = true; -g_main_loop_quit(data->loop); -} - - const char *nbd_opt_lookup(uint32_t opt) { switch (opt) { 
diff --git a/nbd/server.c b/nbd/server.c index c3484cc1ebc..d16726a6326 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -748,6 +748,17 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp) return rc; } +/* Callback to learn when QIO TLS upgrade is complete */ +static void nbd_server_tls_handshake(QIOTask *task, void *opaque) +{ +struct NBDTLSHandshakeData *data = opaque; + +qio_task_propagate_error(task, &data->error); +data->complete = true; +if (!qemu_coroutine_entered(data->u.co)) { +aio_co_wake(data->u.co); +} +} /* Handle NBD_OPT_STA
Re: [PATCH 06/19] block/stream: fix -Werror=maybe-uninitialized false-positives
On Wed, Apr 03, 2024 at 01:24:11PM +0400, Marc-André Lureau wrote: > > > Unfortunately, it doesn't work in all cases. It seems to have issues > > > with some guards: > > > ../block/stream.c: In function ‘stream_run’: > > > ../block/stream.c:216:12: error: ‘ret’ may be used uninitialized > > > [-Werror=maybe-uninitialized] > > >216 | if (ret < 0) { > > That one looks like: int ret; WITH_GRAPH_RDLOCK_GUARD() { ret = ...; } if (copy) { ret = ...; } if (ret < 0) I suspect the compiler is seeing the uncertainty possible from the second conditional, and letting it take priority over the certainty that the tweaked macro provided for the first conditional. > > > > > > > So, updated macro helps in some cases, but doesn't help here? Interesting, > > why. > > > > > What should we do? change the macros + cherry-pick the missing > > > false-positives, or keep this series as is? An uglier macro, with sufficient comments as to why it is ugly (in order to let us have fewer false positives where we have to add initializers) may be less churn in the code base, but I'm not necessarily sold on the ugly macro. Let's see if anyone else expresses an opinion. > > > > > > > > > > I think macro + missing is better. No reason to add dead-initializations in > > cases where new macros helps. > > Ok > > > Still, would be good to understand, what's the difference, why it help on > > some cases and not help in another. > > I don't know, it's like if the analyzer was lazy for this particular > case, although there is nothing much different from other usages. > > If I replace: > for (... *var2 = (void *)true; var2; > with: > for (... *var2 = (void *)true; var2 || true; > > then it doesn't warn.. but it also doesn't work. We want the body to execute exactly once, not infloop. > > Interestingly as well, if I change: > for (... *var2 = (void *)true; var2; var2 = NULL) > for: > for (...
*var2 = GML_OBJ_(); var2; var2 = NULL) > > GML_OBJ_() simply being &(GraphLockable) { }), an empty compound > literal, then it doesn't work, in all usages. So the compiler is not figuring out that the compound literal is sufficient for an unconditional one time through the for loop body. What's worse, different compiler versions will change behavior over time. Making the code ugly to pacify a particular compiler, when that compiler might improve in the future, is a form of tilting at windmills. > > All in all, I am not sure the trick of using var2 is really reliable either. And that's my biggest argument for not making the macro more complex than it already is. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH 06/19] block/stream: fix -Werror=maybe-uninitialized false-positives
On Tue, Apr 02, 2024 at 12:58:43PM +0300, Vladimir Sementsov-Ogievskiy wrote: > > > Again, same false-positives, because of WITH_GRAPH_RDLOCK_GUARD().. > > > > > > Didn't you try to change WITH_ macros somehow, so that compiler believe > > > in our good intentions? > > > > > > > > > #define WITH_QEMU_LOCK_GUARD_(x, var) \ > > for (g_autoptr(QemuLockable) var = \ > > qemu_lockable_auto_lock(QEMU_MAKE_LOCKABLE_NONNULL((x))); \ > > var; \ > > qemu_lockable_auto_unlock(var), var = NULL) > > > > I can't think of a clever way to rewrite this. The compiler probably > > thinks the loop may not run, due to the "var" condition. But how to > > convince it otherwise? it's hard to introduce another variable too.. > > > hmm. maybe like this? > > #define WITH_QEMU_LOCK_GUARD_(x, var) \ > for (g_autoptr(QemuLockable) var = \ > qemu_lockable_auto_lock(QEMU_MAKE_LOCKABLE_NONNULL((x))), \ > var2 = (void *)(true); \ > var2; \ > qemu_lockable_auto_unlock(var), var2 = NULL) > > > probably, it would be simpler for compiler to understand the logic this way. > Could you check? Wouldn't that attach __attribute__((cleanup(xxx))) to var2, at which point we could cause the compiler to call xxx((void*)(true)) if the user does an early return inside the lock guard, with disastrous consequences? Or is the __attribute__ applied only to the first out of two declarations in a list? -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH v2 1/1] nbd/server: do not poll within a coroutine context
nbd_negotiate_handle_starttls(NBDClient *client, > > qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-server-tls"); > trace_nbd_negotiate_handle_starttls_handshake(); > -data.loop = g_main_loop_new(g_main_context_default(), FALSE); > +data.co = qemu_coroutine_self(); > qio_channel_tls_handshake(tioc, > - nbd_tls_handshake, > + nbd_server_tls_handshake, >&data, >NULL, > NULL); > > -if (!data.complete) { > -g_main_loop_run(data.loop); > +while (!data.complete) { > +qemu_coroutine_yield(); > } > -g_main_loop_unref(data.loop); > + > if (data.error) { > object_unref(OBJECT(tioc)); > error_propagate(errp, data.error); > -- > 2.33.0 > Thanks for the updated patch - it looks like we are heading in a good direction. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH for-9.1 6/9] block/nbd: Use URI parsing code from glib
Adjusting cc list to add upstream NBD and drop developers unrelated to this part of the qemu series... On Thu, Mar 28, 2024 at 02:13:42PM +, Richard W.M. Jones wrote: > On Thu, Mar 28, 2024 at 03:06:03PM +0100, Thomas Huth wrote: > > Since version 2.66, glib has useful URI parsing functions, too. > > Use those instead of the QEMU-internal ones to be finally able > > to get rid of the latter. The g_uri_get_host() also takes care > > of removing the square brackets from IPv6 addresses, so we can > > drop that part of the QEMU code now, too. > > > > > > if (is_unix) { > > /* nbd+unix:///export?socket=path */ > > -if (uri->server || uri->port || strcmp(qp->p[0].name, "socket")) { > > +const char *uri_socket = g_hash_table_lookup(qp, "socket"); > > +if (uri_server || uri_port != -1 || !uri_socket) { > > ret = -EINVAL; > > goto out; > > } The spec for NBD URIs is at: https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md Should any of this spec mention case-insensitive concerns, such as whether 'NBD://' may be equivalent to 'nbd://', and whether 'nbd+unix:///?socket=x' is equivalent to 'nbd+unix:///?Socket=x'? Right now, I think that most implementations of NBD servers and clients happen to use case-sensitive parsing; but glib provides the option to do case-insensitive query parsing. If I read https://docs.gtk.org/glib/type_func.Uri.parse_params.html correctly, passing G_URI_PARAMS_CASE_INSENSITIVE (which you did not do) would mean that 'nbd+unix:///?socket=ignore&Socket=/for/real' would result in this g_hash_table_lookup finding only "Socket", not "socket". Maybe it is worth an explicit addition to the NBD URI spec to mention that we intend to be case-sensitive (in the parts where it can be; I'm not sure if the scheme part must be handled case-insensitively without re-reading the RFCs), and therefore that 'Socket=' does NOT have the same meaning as 'socket='. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH for-9.1 6/9] block/nbd: Use URI parsing code from glib
On Thu, Mar 28, 2024 at 03:06:03PM +0100, Thomas Huth wrote: > Since version 2.66, glib has useful URI parsing functions, too. > Use those instead of the QEMU-internal ones to be finally able > to get rid of the latter. The g_uri_get_host() also takes care > of removing the square brackets from IPv6 addresses, so we can > drop that part of the QEMU code now, too. > > Signed-off-by: Thomas Huth > --- > block/nbd.c | 66 ++--- > 1 file changed, 38 insertions(+), 28 deletions(-) > > diff --git a/block/nbd.c b/block/nbd.c > index ef05f7cdfd..95b507f872 100644 > --- a/block/nbd.c > +++ b/block/nbd.c > @@ -31,7 +31,6 @@ > #include "qemu/osdep.h" > > #include "trace.h" > -#include "qemu/uri.h" > #include "qemu/option.h" > #include "qemu/cutils.h" > #include "qemu/main-loop.h" > @@ -1514,30 +1513,34 @@ static void nbd_client_close(BlockDriverState *bs) > > static int nbd_parse_uri(const char *filename, QDict *options) > { > -URI *uri; > +GUri *uri; Is it worth using 'g_autoptr(GUri) uri = NULL;' here, to simplify cleanup later? > const char *p; > -QueryParams *qp = NULL; > +GHashTable *qp = NULL; Presumably would be easier if qp is also auto-free. > +int qp_n; > int ret = 0; > bool is_unix; > +const char *uri_scheme, *uri_query, *uri_server; > +int uri_port; > > -uri = uri_parse(filename); > +uri = g_uri_parse(filename, G_URI_FLAGS_NONE, NULL); The glib API is fairly close to what we have in qemu, making this a nice switchover. 
> /* nbd[+tcp]://host[:port]/export */ > -if (!uri->server) { > +if (!uri_server) { > ret = -EINVAL; > goto out; > } > > -/* strip braces from literal IPv6 address */ > -if (uri->server[0] == '[') { > -host = qstring_from_substr(uri->server, 1, > - strlen(uri->server) - 1); > -} else { > -host = qstring_from_str(uri->server); > -} > - > qdict_put_str(options, "server.type", "inet"); > -qdict_put(options, "server.host", host); > +qdict_put_str(options, "server.host", uri_server); > > -port_str = g_strdup_printf("%d", uri->port ?: NBD_DEFAULT_PORT); > +port_str = g_strdup_printf("%d", uri_port != -1 ? uri_port > +: NBD_DEFAULT_PORT); > qdict_put_str(options, "server.port", port_str); If a user requests nbd://hostname:0/export, this now sets server.port to "0" instead of "10809". Is that an intentional change? No one actually passes an explicit ":0" port on purpose, but we do have to worry about malicious URIs. > g_free(port_str); > } > > out: > if (qp) { > -query_params_free(qp); > +g_hash_table_destroy(qp); > } > -uri_free(uri); > +g_uri_unref(uri); > return ret; It may be possible to eliminate the out label altogether, if GHashTable has the appropriate auto-free magic. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH 00/19] -Werror=maybe-uninitialized fixes
On Thu, Mar 28, 2024 at 02:20:33PM +0400, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > Hi, > > Depending on -Doptimization=, GCC (13.2.1 here) produces different > maybe-uninitialized warnings: > - g: produces -Werror=maybe-uninitialized errors > - 0: clean build > - 1: produces -Werror=maybe-uninitialized errors > - 2: clean build > - 3: produces few -Werror=maybe-uninitialized errors > - s: produces -Werror=maybe-uninitialized errors > > Most are false-positive, because prior LOCK_GUARD should guarantee an > initialization path. Few of them are a bit trickier. Finally, I found > a potential related memory leak. > > thanks Couple of subject lines are inconsistent; I suggest: > > Marc-André Lureau (19): > util/coroutine: fix -Werror=maybe-uninitialized false-positive > util/timer: with -Werror=maybe-uninitialized false-positive s/with/fix/ > hw/qxl: fix -Werror=maybe-uninitialized false-positives > nbd: with -Werror=maybe-uninitialized false-positive s/with/fix/ > block/mirror: fix -Werror=maybe-uninitialized false-positive > block/stream: fix -Werror=maybe-uninitialized false-positives > hw/ahci: fix -Werror=maybe-uninitialized false-positive > hw/vhost-scsi: fix -Werror=maybe-uninitialized > hw/sdhci: fix -Werror=maybe-uninitialized false-positive > hw/rdma: fix -Werror=maybe-uninitialized false-positive > migration/block: fix -Werror=maybe-uninitialized false-positive > migration: fix -Werror=maybe-uninitialized false-positives > hw/virtio-blk: fix -Werror=maybe-uninitialized false-positive > plugins: fix -Werror=maybe-uninitialized false-positive > migration: fix -Werror=maybe-uninitialized false-positive > tests: fix -Werror=maybe-uninitialized > hw/nvme: fix -Werror=maybe-uninitialized > hw/virtio: fix -Werror=maybe-uninitialized > RFC: hw/virtio: a potential leak fix > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Re: [PATCH 04/19] nbd: with -Werror=maybe-uninitialized false-positive
On Thu, Mar 28, 2024 at 02:20:37PM +0400, marcandre.lur...@redhat.com wrote: > From: Marc-André Lureau > > ../nbd/client-connection.c:419:8: error: ‘wait_co’ may be used uninitialized > [-Werror=maybe-uninitialized] > > Signed-off-by: Marc-André Lureau > --- > nbd/client-connection.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Eric Blake > > diff --git a/nbd/client-connection.c b/nbd/client-connection.c > index f9da67c87e..b11e266807 100644 > --- a/nbd/client-connection.c > +++ b/nbd/client-connection.c > @@ -410,7 +410,7 @@ nbd_co_establish_connection(NBDClientConnection *conn, > NBDExportInfo *info, > */ > void nbd_co_establish_connection_cancel(NBDClientConnection *conn) > { > -Coroutine *wait_co; > +Coroutine *wait_co = NULL; > > WITH_QEMU_LOCK_GUARD(&conn->mutex) { > wait_co = g_steal_pointer(&conn->wait_co); > -- > 2.44.0 > -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org