Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Wed, Feb 16, 2022 at 07:14:58PM +0200, Nir Soffer wrote: > > > I'm not the best at writing python iotests; I welcome a language > > > translation of this aspect. > > > > > > > > Let me try:) > > Thanks! This is much nicer and will be easier to maintain. > > > > > > > #!/usr/bin/env python3 > > > > import os > > import iotests > > import nbd What happens here if libnbd is not installed? Is there a way to make the test gracefully skip instead of error out? > > def check_multi_conn(self, export_name, multi_conn): > > cl = nbd.NBD() > > cl.connect_uri(nbd_uri.format(export_name)) > > self.assertEqual(cl.can_multi_conn(), multi_conn) > > cl.shutdown() > > The test will be more clear and the code more reusable if the helper > was doing only connect/disconnect. > > @contextmanager > def open_nbd(nbd_uri, export_name): > h = nbd.NBD() > h.connect_uri(nbd_uri.format(export_name)) This can throw an exception if the connect fails; should it be inside the try? I'm comparing to libnbd/python/t/220-opt-list.py, which also sets up a context manager. > try: > yield h > finally: > h.shutdown() > > Any test that need nbd handle can do: > > with open_nbd(nbd_uri, export_name) as h: > use the handle... > > Good example when we can use this is the block status cache test, > using complicated qemu-img commands instead of opening > a client and calling block_status a few times. > > And this also cleans up at the end of the test so failure > will not affect the next test. > > The helper can live in the iotest module instead of inventing it for > every new test. Moving it into iotest makes the question about what to do if libnbd is not installed even more important; while we could perhaps catch and deal with a failed 'import' for this test, skipping the iotest module due to a failed import has knock-on effects to all other iotests even when they don't care if libnbd is installed. > > > > > def test_default_settings(self): > > self.server_start() > > self.export_add('r')) > > self.export_add('w', writable=True) > > self.check_multi_conn('r', True) > > self.check_multi_conn('w', False) > > With the helper suggested, this test will be: > > with open_nbd(nbd_uri, "r") as h: > self.assertTrue(h.can_multi_conn()) > > with open_nbd(nbd_uri, "w") as h: > self.assertFalse(h.can_multi_conn()) > > Now you don't need to check what check_multicon() is doing when > reading the tests, and it is very clear what open_nbd() does based > on the name and usage as context manager. Yes, I like that aspect of a context manager. > > > > > def test_multiconn_option(self): > > self.server_start() > > self.export_add('r', multi_conn='off') > > self.export_add('w', writable=True, multi_conn='on') > > It will be more natural to use: > > self.start_server() > self.add_export(...) > > In C the other way is more natural because you don't have namespaces > or classes. > > > self.check_multi_conn('r', False) > > self.check_multi_conn('w', True) > > And I think you agree since you did not use: > > self.multi_con_check(...) > Useful naming advice. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Wed, Feb 16, 2022 at 11:08:06AM +0300, Vladimir Sementsov-Ogievskiy wrote: > 16.02.2022 02:24, Eric Blake wrote: > > > > +++ b/tests/qemu-iotests/tests/nbd-multiconn > > > > @@ -0,0 +1,188 @@ > > > > +#!/usr/bin/env bash > > > > +# group: rw auto quick > > > > +# > > > > +# Test that qemu-nbd MULTI_CONN works > > > > +# > > > > +echo > > > > +echo "=== Initial image setup ===" > > > > +echo > > > > + > > > > +_make_test_img 4M > > > > +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | > > > > _filter_qemu_io > > > > +_launch_qemu 2> >(_filter_nbd) > > > > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return" > > > > +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add", > > > > + "arguments":{"driver":"qcow2", "node-name":"n", > > > > +"file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return" > > > > I'm not the best at writing python iotests; I welcome a language > > translation of this aspect. > > > > Let me try:) > > > #!/usr/bin/env python3 > > import os > import iotests > import nbd > from iotests import qemu_img_create, qemu_io_silent > > > disk = os.path.join(iotests.test_dir, 'disk') > size = '4M' > nbd_sock = os.path.join(iotests.test_dir, 'nbd_sock') > nbd_uri = 'nbd+unix:///{}?socket=' + nbd_sock ... Thanks; I'm playing with this (and the improvements suggested in followup messages) in preparation for a v3 posting. > > > > +nbdsh -u "nbd+unix:///r?socket=$nbd_unix_socket" -c ' > > > > +assert h.can_multi_conn() > > > > +h.shutdown() > > > > +print("nbdsh passed")' > > > > +nbdsh -u "nbd+unix:///w?socket=$nbd_unix_socket" -c ' > > > > +assert not h.can_multi_conn() > > > > +h.shutdown() > > > > +print("nbdsh passed")' > > > > > > > > > > Mixing of shell and python is very confusing. Wouldn't it be much cleaner > > > to write the test in python? > > > > Here, nbdsh -c 'python snippet' is used as a shell command line > > parameter. Writing python code to call out to a system() command > > where one of the arguments to that command is a python script snippet > > is going to be just as annoying as writing it in bash. Then again, since libnbd already includes python bindings, we wouldn't have to detour through nbdsh, but just use the python bindings directly (and I see that your translation did that). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Wed, Feb 16, 2022 at 02:44:52PM +0100, Markus Armbruster wrote: > > > > +## > > +# @NbdExportMultiConn: > > +# > > +# Possible settings for advertising NBD multiple client support. > > +# > > +# @off: Do not advertise multiple clients. > > +# > > +# @on: Allow multiple clients (for writable clients, this is only safe > > +# if the underlying BDS is cache-consistent, such as when backed > > +# by the raw file driver); ignored if the NBD server was set up > > +# with max-connections of 1. > > +# > > +# @auto: Behaves like @off if the export is writable, and @on if the > > +#export is read-only. > > +# > > +# Since: 7.0 > > +## > > +{ 'enum': 'NbdExportMultiConn', > > + 'data': ['off', 'on', 'auto'] } > > Have you considered using OnOffAuto from common.json? Sounds good to me. > > Duplicating it as NbdExportMultiConn lets you document the values right > in the enum. If you reuse it, you have to document them where the enum > is used, i.e. ... > > > + > > ## > > # @BlockExportOptionsNbd: > > # > > @@ -95,11 +119,15 @@ > > #the metadata context name "qemu:allocation-depth" to > > #inspect allocation details. (since 5.2) > > # > > +# @multi-conn: Controls whether multiple client support is advertised, if > > the > > +# server's max-connections is not 1. (default auto, since 7.0) > > +# > > ... here. Yep, that's a good change to make for v3. I'll do it. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
16.02.2022 20:14, Nir Soffer wrote: On Wed, Feb 16, 2022 at 10:08 AM Vladimir Sementsov-Ogievskiy wrote: 16.02.2022 02:24, Eric Blake wrote: On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote: On Tue, Feb 15, 2022 at 7:22 PM Eric Blake wrote: According to the NBD spec, a server advertising NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will not see any cache inconsistencies: when properly separated by a single flush, actions performed by one client will be visible to another client, regardless of which client did the flush. We satisfy these conditions in qemu when our block layer is backed by the local filesystem (by virtue of the semantics of fdatasync(), and the fact that qemu itself is not buffering writes beyond flushes). It is harder to state whether we satisfy these conditions for network-based protocols, so the safest course of action is to allow users to opt-in to advertising multi-conn. We may later tweak defaults to advertise by default when the block layer can confirm that the underlying protocol driver is cache consistent between multiple writers, but for now, this at least allows savvy users (such as virt-v2v or nbdcopy) to explicitly start qemu-nbd or qemu-storage-daemon with multi-conn advertisement in a known-safe setup where the client end can then benefit from parallel clients. It makes sense, and will be used by oVirt. Actually we are already using multiple connections for writing about 2 years, based on your promise that if every client writes to district areas this is safe. I presume s/district/distinct/, but yes, I'm glad we're finally trying to make the code match existing practice ;) +++ b/docs/tools/qemu-nbd.rst @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified. .. option:: -e, --shared=NUM Allow up to *NUM* clients to share the device (default - ``1``), 0 for unlimited. Safe for readers, but for now, - consistency is not guaranteed between multiple writers. + ``1``), 0 for unlimited. Removing the note means that now consistency is guaranteed between multiple writers, no? Or maybe we want to mention here that consistency depends on the protocol and users can opt in, or refer to the section where this is discussed? Yeah, a link to the QAPI docs where multi-conn is documented might be nice, except I'm not sure the best way to do that in our sphinx documentation setup. +## +# @NbdExportMultiConn: +# +# Possible settings for advertising NBD multiple client support. +# +# @off: Do not advertise multiple clients. +# +# @on: Allow multiple clients (for writable clients, this is only safe +# if the underlying BDS is cache-consistent, such as when backed +# by the raw file driver); ignored if the NBD server was set up +# with max-connections of 1. +# +# @auto: Behaves like @off if the export is writable, and @on if the +#export is read-only. +# +# Since: 7.0 +## +{ 'enum': 'NbdExportMultiConn', + 'data': ['off', 'on', 'auto'] } Are we going to have --multi-con=(on|off|auto)? Oh. The QMP command (which is immediately visible through nbd-server-add/block-storage-add to qemu and qemu-storage-daemon) gains "multi-conn":"on", but you may be right that qemu-nbd would want a command line option (either that, or we accellerate our plans that qsd should replace qemu-nbd). +++ b/blockdev-nbd.c @@ -44,6 +44,11 @@ bool nbd_server_is_running(void) return nbd_server || is_qemu_nbd; } +int nbd_server_max_connections(void) +{ +return nbd_server ? nbd_server->max_connections : -1; +} -1 is a little bit strange for a limit, maybe 1 is a better default when we nbd_server == NULL? When can this happen? In qemu, if you haven't used the QMP command 'nbd-server-start' yet. In qemu-nbd, always (per the nbd_server_is_running function just above). My iotest only covered the qemu/qsd side, not the qemu-nbd side, so it looks like I need a v3... +++ b/nbd/server.c +/* + * Determine whether to advertise multi-conn. Default is auto, + * which resolves to on for read-only and off for writable. But + * if the server has max-connections 1, that forces the flag off. Looks good, this can be enabled automatically based on the driver if we want, so users using auto will can upgrade to multi-con automatically. Yes, that's part of why I made it a tri-state with a default of 'auto'. + */ +if (!arg->has_multi_conn) { +arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO; +} +if (nbd_server_max_connections() == 1) { +arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF; +} +if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) { +multi_conn = readonly; +} else { +multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON; +} This part is a little bit confusing - we do: - initialize args->multi_con if it has not value - set the temporary multi_con based now initialized args->multi_con I think it will be nicer to se
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Wed, Feb 16, 2022 at 10:08 AM Vladimir Sementsov-Ogievskiy wrote: > > 16.02.2022 02:24, Eric Blake wrote: > > On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote: > >> On Tue, Feb 15, 2022 at 7:22 PM Eric Blake wrote: > >> > >>> According to the NBD spec, a server advertising > >>> NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will > >>> not see any cache inconsistencies: when properly separated by a single > >>> flush, actions performed by one client will be visible to another > >>> client, regardless of which client did the flush. We satisfy these > >>> conditions in qemu when our block layer is backed by the local > >>> filesystem (by virtue of the semantics of fdatasync(), and the fact > >>> that qemu itself is not buffering writes beyond flushes). It is > >>> harder to state whether we satisfy these conditions for network-based > >>> protocols, so the safest course of action is to allow users to opt-in > >>> to advertising multi-conn. We may later tweak defaults to advertise > >>> by default when the block layer can confirm that the underlying > >>> protocol driver is cache consistent between multiple writers, but for > >>> now, this at least allows savvy users (such as virt-v2v or nbdcopy) to > >>> explicitly start qemu-nbd or qemu-storage-daemon with multi-conn > >>> advertisement in a known-safe setup where the client end can then > >>> benefit from parallel clients. > >>> > >> > >> It makes sense, and will be used by oVirt. Actually we are already using > >> multiple connections for writing about 2 years, based on your promise > >> that if every client writes to district areas this is safe. > > > > I presume s/district/distinct/, but yes, I'm glad we're finally trying > > to make the code match existing practice ;) > > > >>> +++ b/docs/tools/qemu-nbd.rst > >>> @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified. > >>> .. option:: -e, --shared=NUM > >>> > >>> Allow up to *NUM* clients to share the device (default > >>> - ``1``), 0 for unlimited. Safe for readers, but for now, > >>> - consistency is not guaranteed between multiple writers. > >>> + ``1``), 0 for unlimited. > >>> > >> > >> Removing the note means that now consistency is guaranteed between > >> multiple writers, no? > >> > >> Or maybe we want to mention here that consistency depends on the protocol > >> and users can opt in, or refer to the section where this is discussed? > > > > Yeah, a link to the QAPI docs where multi-conn is documented might be > > nice, except I'm not sure the best way to do that in our sphinx > > documentation setup. > > > >>> +## > >>> +# @NbdExportMultiConn: > >>> +# > >>> +# Possible settings for advertising NBD multiple client support. > >>> +# > >>> +# @off: Do not advertise multiple clients. > >>> +# > >>> +# @on: Allow multiple clients (for writable clients, this is only safe > >>> +# if the underlying BDS is cache-consistent, such as when backed > >>> +# by the raw file driver); ignored if the NBD server was set up > >>> +# with max-connections of 1. > >>> +# > >>> +# @auto: Behaves like @off if the export is writable, and @on if the > >>> +#export is read-only. > >>> +# > >>> +# Since: 7.0 > >>> +## > >>> +{ 'enum': 'NbdExportMultiConn', > >>> + 'data': ['off', 'on', 'auto'] } > >>> > >> > >> Are we going to have --multi-con=(on|off|auto)? > > > > Oh. The QMP command (which is immediately visible through > > nbd-server-add/block-storage-add to qemu and qemu-storage-daemon) > > gains "multi-conn":"on", but you may be right that qemu-nbd would want > > a command line option (either that, or we accellerate our plans that > > qsd should replace qemu-nbd). > > > >>> +++ b/blockdev-nbd.c > >>> @@ -44,6 +44,11 @@ bool nbd_server_is_running(void) > >>> return nbd_server || is_qemu_nbd; > >>> } > >>> > >>> +int nbd_server_max_connections(void) > >>> +{ > >>> +return nbd_server ? nbd_server->max_connections : -1; > >>> +} > >>> > >> > >> -1 is a little bit strange for a limit, maybe 1 is a better default when > >> we nbd_server == NULL? When can this happen? > > > > In qemu, if you haven't used the QMP command 'nbd-server-start' yet. > > In qemu-nbd, always (per the nbd_server_is_running function just > > above). My iotest only covered the qemu/qsd side, not the qemu-nbd > > side, so it looks like I need a v3... > > > >>> +++ b/nbd/server.c > > > >>> +/* > >>> + * Determine whether to advertise multi-conn. Default is auto, > >>> + * which resolves to on for read-only and off for writable. But > >>> + * if the server has max-connections 1, that forces the flag off. > >>> > >> > >> Looks good, this can be enabled automatically based on the driver > >> if we want, so users using auto will can upgrade to multi-con > >> automatically. > > > > Yes, that's part of why I made it a tri-state with a default of 'auto'. > > > >> > >> > >>> + */ > >>> +if (!arg->has_multi_conn) { > >>> +arg->mul
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
Eric Blake writes: > According to the NBD spec, a server advertising > NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will > not see any cache inconsistencies: when properly separated by a single > flush, actions performed by one client will be visible to another > client, regardless of which client did the flush. We satisfy these > conditions in qemu when our block layer is backed by the local > filesystem (by virtue of the semantics of fdatasync(), and the fact > that qemu itself is not buffering writes beyond flushes). It is > harder to state whether we satisfy these conditions for network-based > protocols, so the safest course of action is to allow users to opt-in > to advertising multi-conn. We may later tweak defaults to advertise > by default when the block layer can confirm that the underlying > protocol driver is cache consistent between multiple writers, but for > now, this at least allows savvy users (such as virt-v2v or nbdcopy) to > explicitly start qemu-nbd or qemu-storage-daemon with multi-conn > advertisement in a known-safe setup where the client end can then > benefit from parallel clients. > > Note, however, that we don't want to advertise MULTI_CONN when we know > that a second client cannot connect (for historical reasons, qemu-nbd > defaults to a single connection while nbd-server-add and QMP commands > default to unlimited connections; but we already have existing means > to let either style of NBD server creation alter those defaults). The > harder part of this patch is setting up an iotest to demonstrate > behavior of multiple NBD clients to a single server. It might be > possible with parallel qemu-io processes, but concisely managing that > in shell is painful. I found it easier to do by relying on the libnbd > project's nbdsh, which means this test will be skipped on platforms > where that is not available. > > Signed-off-by: Eric Blake > Fixes: https://bugzilla.redhat.com/1708300 > --- > > v1 was in Aug 2021 [1], with further replies in Sep [2] and Oct [3]. > > [1] https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg04900.html > [2] https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00038.html > [3] https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06744.html > > Since then, I've tweaked the QAPI to mention 7.0 (instead of 6.2), and > reworked the logic so that default behavior is unchanged for now > (advertising multi-conn on a writable export requires opt-in during > the command line or QMP, but remains default for a readonly export). > I've also expanded the amount of testing done in the new iotest. > > docs/interop/nbd.txt | 1 + > docs/tools/qemu-nbd.rst| 3 +- > qapi/block-export.json | 34 +++- > include/block/nbd.h| 3 +- > blockdev-nbd.c | 5 + > nbd/server.c | 27 ++- > MAINTAINERS| 1 + > tests/qemu-iotests/tests/nbd-multiconn | 188 + > tests/qemu-iotests/tests/nbd-multiconn.out | 112 > 9 files changed, 363 insertions(+), 11 deletions(-) > create mode 100755 tests/qemu-iotests/tests/nbd-multiconn > create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out > > diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt > index bdb0f2a41aca..6c99070b99c8 100644 > --- a/docs/interop/nbd.txt > +++ b/docs/interop/nbd.txt > @@ -68,3 +68,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE > * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports, > NBD_CMD_FLAG_FAST_ZERO > * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth" > +* 7.0: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports > diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst > index 6031f9689312..1de785524c36 100644 > --- a/docs/tools/qemu-nbd.rst > +++ b/docs/tools/qemu-nbd.rst > @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified. > .. option:: -e, --shared=NUM > >Allow up to *NUM* clients to share the device (default > - ``1``), 0 for unlimited. Safe for readers, but for now, > - consistency is not guaranteed between multiple writers. > + ``1``), 0 for unlimited. > > .. option:: -t, --persistent > > diff --git a/qapi/block-export.json b/qapi/block-export.json > index f183522d0d2c..0a27e8ee84f9 100644 > --- a/qapi/block-export.json > +++ b/qapi/block-export.json > @@ -21,7 +21,9 @@ > # recreated on the fly while the NBD server is active. > # If missing, it will default to denying access (since 4.0). > # @max-connections: The maximum number of connections to allow at the same > -# time, 0 for unlimited. (since 5.2; default: 0) > +# time, 0 for unlimited. Setting this to 1 also stops > +# the server from advertising multiple client support > +# (since 5.2; default: 0) > # >
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Wed, Feb 16, 2022 at 12:13 PM Richard W.M. Jones wrote: > On Tue, Feb 15, 2022 at 05:24:14PM -0600, Eric Blake wrote: > > Oh. The QMP command (which is immediately visible through > > nbd-server-add/block-storage-add to qemu and qemu-storage-daemon) > > gains "multi-conn":"on", but you may be right that qemu-nbd would want > > a command line option (either that, or we accellerate our plans that > > qsd should replace qemu-nbd). > > I really hope there will always be something called "qemu-nbd" > that acts like qemu-nbd. > I share this hope. Most projects I work on are based on qemu-nbd. However in oVirt use case, we want to provide an NBD socket for clients to allow direct access to disks. One of the issues we need to solve for this is having a way to tell if the qemu-nbd is active, so we can terminate idle transfers. The way we do this with the ovirt-imageio server is to query the status of the transfer, and use the idle time (time since last request) and active status (has inflight requests) to detect a stale transfer that should be terminated. An example use case is a process on a remote host that started an image transfer, and killed or crashed in the middle of the transfer without cleaning up properly. To be more specific, every request to the imageio server (read, write, flush, zero, options) updates a timestamp in the transfer state. When we get the status we report the time since that timestamp was updated. Additionally we keep and report the number of inflight requests, so we can tell the case when requests are blocked on inaccessible storage (e.g. non responsive NFS). We don't have a way to do this with qemu-nbd, but I guess that using qemu-storage-daemon when we have qmp access will make such monitoring possible. Nir
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Tue, Feb 15, 2022 at 05:24:14PM -0600, Eric Blake wrote: > Oh. The QMP command (which is immediately visible through > nbd-server-add/block-storage-add to qemu and qemu-storage-daemon) > gains "multi-conn":"on", but you may be right that qemu-nbd would want > a command line option (either that, or we accellerate our plans that > qsd should replace qemu-nbd). I really hope there will always be something called "qemu-nbd" that acts like qemu-nbd. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
16.02.2022 02:24, Eric Blake wrote: On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote: On Tue, Feb 15, 2022 at 7:22 PM Eric Blake wrote: According to the NBD spec, a server advertising NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will not see any cache inconsistencies: when properly separated by a single flush, actions performed by one client will be visible to another client, regardless of which client did the flush. We satisfy these conditions in qemu when our block layer is backed by the local filesystem (by virtue of the semantics of fdatasync(), and the fact that qemu itself is not buffering writes beyond flushes). It is harder to state whether we satisfy these conditions for network-based protocols, so the safest course of action is to allow users to opt-in to advertising multi-conn. We may later tweak defaults to advertise by default when the block layer can confirm that the underlying protocol driver is cache consistent between multiple writers, but for now, this at least allows savvy users (such as virt-v2v or nbdcopy) to explicitly start qemu-nbd or qemu-storage-daemon with multi-conn advertisement in a known-safe setup where the client end can then benefit from parallel clients. It makes sense, and will be used by oVirt. Actually we are already using multiple connections for writing about 2 years, based on your promise that if every client writes to district areas this is safe. I presume s/district/distinct/, but yes, I'm glad we're finally trying to make the code match existing practice ;) +++ b/docs/tools/qemu-nbd.rst @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified. .. option:: -e, --shared=NUM Allow up to *NUM* clients to share the device (default - ``1``), 0 for unlimited. Safe for readers, but for now, - consistency is not guaranteed between multiple writers. + ``1``), 0 for unlimited. Removing the note means that now consistency is guaranteed between multiple writers, no? Or maybe we want to mention here that consistency depends on the protocol and users can opt in, or refer to the section where this is discussed? Yeah, a link to the QAPI docs where multi-conn is documented might be nice, except I'm not sure the best way to do that in our sphinx documentation setup. +## +# @NbdExportMultiConn: +# +# Possible settings for advertising NBD multiple client support. +# +# @off: Do not advertise multiple clients. +# +# @on: Allow multiple clients (for writable clients, this is only safe +# if the underlying BDS is cache-consistent, such as when backed +# by the raw file driver); ignored if the NBD server was set up +# with max-connections of 1. +# +# @auto: Behaves like @off if the export is writable, and @on if the +#export is read-only. +# +# Since: 7.0 +## +{ 'enum': 'NbdExportMultiConn', + 'data': ['off', 'on', 'auto'] } Are we going to have --multi-con=(on|off|auto)? Oh. The QMP command (which is immediately visible through nbd-server-add/block-storage-add to qemu and qemu-storage-daemon) gains "multi-conn":"on", but you may be right that qemu-nbd would want a command line option (either that, or we accellerate our plans that qsd should replace qemu-nbd). +++ b/blockdev-nbd.c @@ -44,6 +44,11 @@ bool nbd_server_is_running(void) return nbd_server || is_qemu_nbd; } +int nbd_server_max_connections(void) +{ +return nbd_server ? nbd_server->max_connections : -1; +} -1 is a little bit strange for a limit, maybe 1 is a better default when we nbd_server == NULL? When can this happen? In qemu, if you haven't used the QMP command 'nbd-server-start' yet. In qemu-nbd, always (per the nbd_server_is_running function just above). My iotest only covered the qemu/qsd side, not the qemu-nbd side, so it looks like I need a v3... +++ b/nbd/server.c +/* + * Determine whether to advertise multi-conn. Default is auto, + * which resolves to on for read-only and off for writable. But + * if the server has max-connections 1, that forces the flag off. Looks good, this can be enabled automatically based on the driver if we want, so users using auto will can upgrade to multi-con automatically. Yes, that's part of why I made it a tri-state with a default of 'auto'. + */ +if (!arg->has_multi_conn) { +arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO; +} +if (nbd_server_max_connections() == 1) { +arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF; +} +if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) { +multi_conn = readonly; +} else { +multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON; +} This part is a little bit confusing - we do: - initialize args->multi_con if it has not value - set the temporary multi_con based now initialized args->multi_con I think it will be nicer to separate arguments parsing, so there is no need to initialize it here or have arg->has_multi_conn, but I did not ch
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Tue, Feb 15, 2022 at 09:23:36PM +0200, Nir Soffer wrote: > On Tue, Feb 15, 2022 at 7:22 PM Eric Blake wrote: > > > According to the NBD spec, a server advertising > > NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will > > not see any cache inconsistencies: when properly separated by a single > > flush, actions performed by one client will be visible to another > > client, regardless of which client did the flush. We satisfy these > > conditions in qemu when our block layer is backed by the local > > filesystem (by virtue of the semantics of fdatasync(), and the fact > > that qemu itself is not buffering writes beyond flushes). It is > > harder to state whether we satisfy these conditions for network-based > > protocols, so the safest course of action is to allow users to opt-in > > to advertising multi-conn. We may later tweak defaults to advertise > > by default when the block layer can confirm that the underlying > > protocol driver is cache consistent between multiple writers, but for > > now, this at least allows savvy users (such as virt-v2v or nbdcopy) to > > explicitly start qemu-nbd or qemu-storage-daemon with multi-conn > > advertisement in a known-safe setup where the client end can then > > benefit from parallel clients. > > > > It makes sense, and will be used by oVirt. Actually we are already using > multiple connections for writing about 2 years, based on your promise > that if every client writes to district areas this is safe. I presume s/district/distinct/, but yes, I'm glad we're finally trying to make the code match existing practice ;) > > +++ b/docs/tools/qemu-nbd.rst > > @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified. > > .. option:: -e, --shared=NUM > > > >Allow up to *NUM* clients to share the device (default > > - ``1``), 0 for unlimited. Safe for readers, but for now, > > - consistency is not guaranteed between multiple writers. > > + ``1``), 0 for unlimited. > > > > Removing the note means that now consistency is guaranteed between > multiple writers, no? > > Or maybe we want to mention here that consistency depends on the protocol > and users can opt in, or refer to the section where this is discussed? Yeah, a link to the QAPI docs where multi-conn is documented might be nice, except I'm not sure the best way to do that in our sphinx documentation setup. > > +## > > +# @NbdExportMultiConn: > > +# > > +# Possible settings for advertising NBD multiple client support. > > +# > > +# @off: Do not advertise multiple clients. > > +# > > +# @on: Allow multiple clients (for writable clients, this is only safe > > +# if the underlying BDS is cache-consistent, such as when backed > > +# by the raw file driver); ignored if the NBD server was set up > > +# with max-connections of 1. > > +# > > +# @auto: Behaves like @off if the export is writable, and @on if the > > +#export is read-only. > > +# > > +# Since: 7.0 > > +## > > +{ 'enum': 'NbdExportMultiConn', > > + 'data': ['off', 'on', 'auto'] } > > > > Are we going to have --multi-con=(on|off|auto)? Oh. The QMP command (which is immediately visible through nbd-server-add/block-storage-add to qemu and qemu-storage-daemon) gains "multi-conn":"on", but you may be right that qemu-nbd would want a command line option (either that, or we accellerate our plans that qsd should replace qemu-nbd). > > +++ b/blockdev-nbd.c > > @@ -44,6 +44,11 @@ bool nbd_server_is_running(void) > > return nbd_server || is_qemu_nbd; > > } > > > > +int nbd_server_max_connections(void) > > +{ > > +return nbd_server ? nbd_server->max_connections : -1; > > +} > > > > -1 is a little bit strange for a limit, maybe 1 is a better default when > we nbd_server == NULL? When can this happen? In qemu, if you haven't used the QMP command 'nbd-server-start' yet. In qemu-nbd, always (per the nbd_server_is_running function just above). My iotest only covered the qemu/qsd side, not the qemu-nbd side, so it looks like I need a v3... > > +++ b/nbd/server.c > > +/* > > + * Determine whether to advertise multi-conn. Default is auto, > > + * which resolves to on for read-only and off for writable. But > > + * if the server has max-connections 1, that forces the flag off. > > > > Looks good, this can be enabled automatically based on the driver > if we want, so users using auto will can upgrade to multi-con automatically. Yes, that's part of why I made it a tri-state with a default of 'auto'. > > > > + */ > > +if (!arg->has_multi_conn) { > > +arg->multi_conn = NBD_EXPORT_MULTI_CONN_AUTO; > > +} > > +if (nbd_server_max_connections() == 1) { > > +arg->multi_conn = NBD_EXPORT_MULTI_CONN_OFF; > > +} > > +if (arg->multi_conn == NBD_EXPORT_MULTI_CONN_AUTO) { > > +multi_conn = readonly; > > +} else { > > +multi_conn = arg->multi_conn == NBD_EXPORT_MULTI_CONN_ON; > > +} > > > > This part is a little bit con
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
15.02.2022 20:18, Eric Blake wrote: According to the NBD spec, a server advertising NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will not see any cache inconsistencies: when properly separated by a single flush, actions performed by one client will be visible to another client, regardless of which client did the flush. We satisfy these conditions in qemu when our block layer is backed by the local filesystem (by virtue of the semantics of fdatasync(), and the fact that qemu itself is not buffering writes beyond flushes). It is harder to state whether we satisfy these conditions for network-based protocols, so the safest course of action is to allow users to opt-in to advertising multi-conn. We may later tweak defaults to advertise by default when the block layer can confirm that the underlying protocol driver is cache consistent between multiple writers, but for now, this at least allows savvy users (such as virt-v2v or nbdcopy) to explicitly start qemu-nbd or qemu-storage-daemon with multi-conn advertisement in a known-safe setup where the client end can then benefit from parallel clients. Note, however, that we don't want to advertise MULTI_CONN when we know Here we change existing default behavior. Pre-patch MULTI_CONN is still advertised for readonly export even when connections number is limited to 1. Still may be it's a good change? Let's then note it here.. Could it break some existing clients? Hard to imagine. Otherwise patch looks good to me, nonsignificant notes below. that a second client cannot connect (for historical reasons, qemu-nbd defaults to a single connection while nbd-server-add and QMP commands default to unlimited connections; but we already have existing means to let either style of NBD server creation alter those defaults). The harder part of this patch is setting up an iotest to demonstrate behavior of multiple NBD clients to a single server. It might be possible with parallel qemu-io processes, but concisely managing that in shell is painful. I found it easier to do by relying on the libnbd project's nbdsh, which means this test will be skipped on platforms where that is not available. Signed-off-by: Eric Blake Fixes: https://bugzilla.redhat.com/1708300 --- [..] +## +# @NbdExportMultiConn: +# +# Possible settings for advertising NBD multiple client support. +# +# @off: Do not advertise multiple clients. +# +# @on: Allow multiple clients (for writable clients, this is only safe +# if the underlying BDS is cache-consistent, such as when backed +# by the raw file driver); ignored if the NBD server was set up +# with max-connections of 1. +# +# @auto: Behaves like @off if the export is writable, and @on if the +#export is read-only. +# +# Since: 7.0 +## +{ 'enum': 'NbdExportMultiConn', + 'data': ['off', 'on', 'auto'] } Probably we should use generic OnOffAuto type that we already have.. But in your way documentation looks better. + ## # @BlockExportOptionsNbd: # [..] + +_make_test_img 4M +$QEMU_IO -c 'w -P 1 0 2M' -c 'w -P 2 2M 2M' "$TEST_IMG" | _filter_qemu_io +_launch_qemu 2> >(_filter_nbd) +_send_qemu_cmd $QEMU_HANDLE '{"execute":"qmp_capabilities"}' "return" +_send_qemu_cmd $QEMU_HANDLE '{"execute":"blockdev-add", + "arguments":{"driver":"qcow2", "node-name":"n", +"file":{"driver":"file", "filename":"'"$TEST_IMG"'"}}}' "return" +export nbd_unix_socket + [..] +nbdsh -c ' +import os +sock = os.getenv("nbd_unix_socket") Just interested why you pass it through environment and no simply do '... sock = "'"$nbd_unix_socket"'"... ' +h = [] + +for i in range(3): + h.append(nbd.NBD()) Looks like Python:) And if something looks like Python, it's Python, we know. Should I say about PEP8 that recommends 4 whitespaces?) + h[i].connect_unix(sock) + assert h[i].can_multi_conn() + +buf1 = h[0].pread(1024 * 1024, 0) +if buf1 != b"\x01" * 1024 * 1024: + print("Unexpected initial read") +buf2 = b"\x03" * 1024 * 1024 +h[1].pwrite(buf2, 0) +h[2].flush() +buf1 = h[0].pread(1024 * 1024, 0) +if buf1 == buf2: + print("Flush appears to be consistent across connections") + +for i in range(3): + h[i].shutdown() +' -- Best regards, Vladimir
Re: [PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
On Tue, Feb 15, 2022 at 7:22 PM Eric Blake wrote: > According to the NBD spec, a server advertising > NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will > not see any cache inconsistencies: when properly separated by a single > flush, actions performed by one client will be visible to another > client, regardless of which client did the flush. We satisfy these > conditions in qemu when our block layer is backed by the local > filesystem (by virtue of the semantics of fdatasync(), and the fact > that qemu itself is not buffering writes beyond flushes). It is > harder to state whether we satisfy these conditions for network-based > protocols, so the safest course of action is to allow users to opt-in > to advertising multi-conn. We may later tweak defaults to advertise > by default when the block layer can confirm that the underlying > protocol driver is cache consistent between multiple writers, but for > now, this at least allows savvy users (such as virt-v2v or nbdcopy) to > explicitly start qemu-nbd or qemu-storage-daemon with multi-conn > advertisement in a known-safe setup where the client end can then > benefit from parallel clients. > It makes sense, and will be used by oVirt. Actually we are already using multiple connections for writing about 2 years, based on your promise that if every client writes to district areas this is safe. Note, however, that we don't want to advertise MULTI_CONN when we know > that a second client cannot connect (for historical reasons, qemu-nbd > defaults to a single connection while nbd-server-add and QMP commands > default to unlimited connections; but we already have existing means > to let either style of NBD server creation alter those defaults). The > harder part of this patch is setting up an iotest to demonstrate > behavior of multiple NBD clients to a single server. It might be > possible with parallel qemu-io processes, but concisely managing that > in shell is painful. I found it easier to do by relying on the libnbd > project's nbdsh, which means this test will be skipped on platforms > where that is not available. > > Signed-off-by: Eric Blake > Fixes: https://bugzilla.redhat.com/1708300 > --- > > v1 was in Aug 2021 [1], with further replies in Sep [2] and Oct [3]. > > [1] https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg04900.html > [2] https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00038.html > [3] https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06744.html > > Since then, I've tweaked the QAPI to mention 7.0 (instead of 6.2), and > reworked the logic so that default behavior is unchanged for now > (advertising multi-conn on a writable export requires opt-in during > the command line or QMP, but remains default for a readonly export). > I've also expanded the amount of testing done in the new iotest. > > docs/interop/nbd.txt | 1 + > docs/tools/qemu-nbd.rst| 3 +- > qapi/block-export.json | 34 +++- > include/block/nbd.h| 3 +- > blockdev-nbd.c | 5 + > nbd/server.c | 27 ++- > MAINTAINERS| 1 + > tests/qemu-iotests/tests/nbd-multiconn | 188 + > tests/qemu-iotests/tests/nbd-multiconn.out | 112 > 9 files changed, 363 insertions(+), 11 deletions(-) > create mode 100755 tests/qemu-iotests/tests/nbd-multiconn > create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out > > diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt > index bdb0f2a41aca..6c99070b99c8 100644 > --- a/docs/interop/nbd.txt > +++ b/docs/interop/nbd.txt > @@ -68,3 +68,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", > NBD_CMD_CACHE > * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports, > NBD_CMD_FLAG_FAST_ZERO > * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth" > +* 7.0: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports > diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst > index 6031f9689312..1de785524c36 100644 > --- a/docs/tools/qemu-nbd.rst > +++ b/docs/tools/qemu-nbd.rst > @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified. > .. option:: -e, --shared=NUM > >Allow up to *NUM* clients to share the device (default > - ``1``), 0 for unlimited. Safe for readers, but for now, > - consistency is not guaranteed between multiple writers. > + ``1``), 0 for unlimited. > Removing the note means that now consistency is guaranteed between multiple writers, no? Or maybe we want to mention here that consistency depends on the protocol and users can opt in, or refer to the section where this is discussed? .. option:: -t, --persistent > > diff --git a/qapi/block-export.json b/qapi/block-export.json > index f183522d0d2c..0a27e8ee84f9 100644 > --- a/qapi/block-export.json > +++ b/qapi/block-export.json > @@ -21,7 +21,9 @@ > # recreated
[PATCH v2] nbd/server: Allow MULTI_CONN for shared writable exports
According to the NBD spec, a server advertising NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will not see any cache inconsistencies: when properly separated by a single flush, actions performed by one client will be visible to another client, regardless of which client did the flush. We satisfy these conditions in qemu when our block layer is backed by the local filesystem (by virtue of the semantics of fdatasync(), and the fact that qemu itself is not buffering writes beyond flushes). It is harder to state whether we satisfy these conditions for network-based protocols, so the safest course of action is to allow users to opt-in to advertising multi-conn. We may later tweak defaults to advertise by default when the block layer can confirm that the underlying protocol driver is cache consistent between multiple writers, but for now, this at least allows savvy users (such as virt-v2v or nbdcopy) to explicitly start qemu-nbd or qemu-storage-daemon with multi-conn advertisement in a known-safe setup where the client end can then benefit from parallel clients. Note, however, that we don't want to advertise MULTI_CONN when we know that a second client cannot connect (for historical reasons, qemu-nbd defaults to a single connection while nbd-server-add and QMP commands default to unlimited connections; but we already have existing means to let either style of NBD server creation alter those defaults). The harder part of this patch is setting up an iotest to demonstrate behavior of multiple NBD clients to a single server. It might be possible with parallel qemu-io processes, but concisely managing that in shell is painful. I found it easier to do by relying on the libnbd project's nbdsh, which means this test will be skipped on platforms where that is not available. Signed-off-by: Eric Blake Fixes: https://bugzilla.redhat.com/1708300 --- v1 was in Aug 2021 [1], with further replies in Sep [2] and Oct [3]. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg04900.html [2] https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00038.html [3] https://lists.gnu.org/archive/html/qemu-devel/2021-10/msg06744.html Since then, I've tweaked the QAPI to mention 7.0 (instead of 6.2), and reworked the logic so that default behavior is unchanged for now (advertising multi-conn on a writable export requires opt-in during the command line or QMP, but remains default for a readonly export). I've also expanded the amount of testing done in the new iotest. docs/interop/nbd.txt | 1 + docs/tools/qemu-nbd.rst| 3 +- qapi/block-export.json | 34 +++- include/block/nbd.h| 3 +- blockdev-nbd.c | 5 + nbd/server.c | 27 ++- MAINTAINERS| 1 + tests/qemu-iotests/tests/nbd-multiconn | 188 + tests/qemu-iotests/tests/nbd-multiconn.out | 112 9 files changed, 363 insertions(+), 11 deletions(-) create mode 100755 tests/qemu-iotests/tests/nbd-multiconn create mode 100644 tests/qemu-iotests/tests/nbd-multiconn.out diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt index bdb0f2a41aca..6c99070b99c8 100644 --- a/docs/interop/nbd.txt +++ b/docs/interop/nbd.txt @@ -68,3 +68,4 @@ NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE * 4.2: NBD_FLAG_CAN_MULTI_CONN for shareable read-only exports, NBD_CMD_FLAG_FAST_ZERO * 5.2: NBD_CMD_BLOCK_STATUS for "qemu:allocation-depth" +* 7.0: NBD_FLAG_CAN_MULTI_CONN for shareable writable exports diff --git a/docs/tools/qemu-nbd.rst b/docs/tools/qemu-nbd.rst index 6031f9689312..1de785524c36 100644 --- a/docs/tools/qemu-nbd.rst +++ b/docs/tools/qemu-nbd.rst @@ -139,8 +139,7 @@ driver options if ``--image-opts`` is specified. .. option:: -e, --shared=NUM Allow up to *NUM* clients to share the device (default - ``1``), 0 for unlimited. Safe for readers, but for now, - consistency is not guaranteed between multiple writers. + ``1``), 0 for unlimited. .. option:: -t, --persistent diff --git a/qapi/block-export.json b/qapi/block-export.json index f183522d0d2c..0a27e8ee84f9 100644 --- a/qapi/block-export.json +++ b/qapi/block-export.json @@ -21,7 +21,9 @@ # recreated on the fly while the NBD server is active. # If missing, it will default to denying access (since 4.0). # @max-connections: The maximum number of connections to allow at the same -# time, 0 for unlimited. (since 5.2; default: 0) +# time, 0 for unlimited. Setting this to 1 also stops +# the server from advertising multiple client support +# (since 5.2; default: 0) # # Since: 4.2 ## @@ -50,7 +52,9 @@ # recreated on the fly while the NBD server is active. # If missing, it will default to denying access (since 4.0). # @max-connections: The