Re: [PATCH 12/15] iotests: Disable AQMP logging under non-debug modes
On Fri, Sep 17, 2021 at 8:58 PM John Snow wrote: > > > On Fri, Sep 17, 2021 at 10:30 AM Hanna Reitz wrote: > >> On 17.09.21 07:40, John Snow wrote: >> > Disable the aqmp logger, which likes to (at the moment) print out >> > intermediate warnings and errors that cause session termination; disable >> > them so they don't interfere with the job output. >> > >> > Leave any "CRITICAL" warnings enabled though, those are ones that we >> > should never see, no matter what. >> >> I mean, looks OK to me, but from what I understand (i.e. little), >> qmp_client doesn’t log CRITICAL messages, at least I can’t see any. Only >> ERRORs. >> >> > There's *one* critical message in protocol.py, used for a circumstance > that I *think* should be impossible. I do not think I currently use any > WARNING level statements. > > >> I guess I’m missing some CRITICAL messages in external functions called >> from qmp_client.py, but shouldn’t we still keep ERRORs? >> > > ...Mayybe? > > The errors logged by AQMP are *almost always* raised as Exceptions > somewhere else, eventually. Sometimes when we encounter them in one > context, we need to save them and then re-raise them in a different > execution context. There's one good exception to this: My pal, EOFError. > > If the reader context encounters EOF, it raises EOFError and this causes a > disconnect to be scheduled asynchronously. *Any* Exception that causes a > disconnect to be scheduled asynchronously is dutifully logged as an ERROR. > At this point in the code, we don't really know if the user of the library > considers this an "error" yet or not. I've waffled a lot on how exactly to > treat this circumstance. ...Hm, I guess that's really the only case where I > have an error that really ought to be suppressed. I suppose what I will do > here is: if the exception happens to be an EOFError I will drop the > severity of the log message down to INFO. I don't know why it takes being > challenged on this stuff to start thinking clearly about it, but here we > are. Thank you for your feedback :~) > > --js > Oh, CI testing reminds me of why I am a liar here. the mirror-top-perms test intentionally expects not to be able to connect, but we're treated to these two additional lines of output: +ERROR:qemu.aqmp.qmp_client.qemub-2536319:Negotiation failed: EOFError +ERROR:qemu.aqmp.qmp_client.qemub-2536319:Failed to establish session: EOFError Uh. I guess a temporary suppression in mirror-top-perms, then ...? In most other cases I rather imagine we do want to see this kind of output to help give more information about how things have failed -- they ARE errors. We just happen to not care about them right here. The only thing I don't exactly like about this is that it requires some knowledge by the caller to know to disable it. It makes writing negative tests a bit more annoying because the library leans so heavily on yelling loudly when it encounters problems.
Re: [PATCH 13/15] iotests: Accommodate async QMP Exception classes
On Fri, Sep 17, 2021 at 10:35 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > (But continue to support the old ones for now, too.) > > > > There are very few cases of any user of QEMUMachine or a subclass > > thereof relying on a QMP Exception type. If you'd like to check for > > yourself, you want to grep for all of the derivatives of QMPError, > > excluding 'AQMPError' and its derivatives. That'd be these: > > > > - QMPError > > - QMPConnectError > > - QMPCapabilitiesError > > - QMPTimeoutError > > - QMPProtocolError > > - QMPResponseError > > - QMPBadPortError > > > > > > Signed-off-by: John Snow > > --- > > scripts/simplebench/bench_block_job.py| 3 ++- > > tests/qemu-iotests/tests/mirror-top-perms | 6 +- > > 2 files changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/scripts/simplebench/bench_block_job.py > b/scripts/simplebench/bench_block_job.py > > index 4f03c12169..a403c35b08 100755 > > --- a/scripts/simplebench/bench_block_job.py > > +++ b/scripts/simplebench/bench_block_job.py > > @@ -28,6 +28,7 @@ > > sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', > 'python')) > > from qemu.machine import QEMUMachine > > from qemu.qmp import QMPConnectError > > +from qemu.aqmp import ConnectError > > > > > > def bench_block_job(cmd, cmd_args, qemu_args): > > @@ -49,7 +50,7 @@ def bench_block_job(cmd, cmd_args, qemu_args): > > vm.launch() > > except OSError as e: > > return {'error': 'popen failed: ' + str(e)} > > -except (QMPConnectError, socket.timeout): > > +except (QMPConnectError, ConnectError, socket.timeout): > > return {'error': 'qemu failed: ' + str(vm.get_log())} > > > > try: > > diff --git a/tests/qemu-iotests/tests/mirror-top-perms > b/tests/qemu-iotests/tests/mirror-top-perms > > index 451a0666f8..7d448f4d23 100755 > > --- a/tests/qemu-iotests/tests/mirror-top-perms > > +++ b/tests/qemu-iotests/tests/mirror-top-perms > > @@ -103,7 +103,11 @@ class TestMirrorTopPerms(iotests.QMPTestCase): > > print('ERROR: VM B launched successfully, this should not > have ' > > 'happened') > > except qemu.qmp.QMPConnectError: > > -assert 'Is another process using the image' in > self.vm_b.get_log() > > +pass > > +except qemu.aqmp.ConnectError: > > +pass > > + > > +assert 'Is another process using the image' in > self.vm_b.get_log() > > But this assertion will fail if there was no exception, and so we won’t > get to see the real problem, which is the original VM aborting (see the > doc string). > Uh, hm. OK, so the intent was that if vm_b somehow starts successfully that we will fail the test based on output in the diff, but we'll continue on to see other kinds of explosions so that the output is more useful for diagnosing the failure. Gotcha. It doesn’t really matter that much that VM B can start (hence it being a > logged error message, not a fatal error), and when it can start, of > course it won’t print an error – but what’s important is that the > original VM will then abort. > > I mean, not an absolute showstopper by any means, but still, the > assertion was deliberately placed into the `except` block. > > Hanna > I misunderstood the "test style" here. I'll fix it. (Uh, I also forgot to explicitly import qemu.aqmp. It happens to work anyway because of $reasons, but it's not very good style. I'll fix that, too.) --js
Re: [PATCH 12/15] iotests: Disable AQMP logging under non-debug modes
On Fri, Sep 17, 2021 at 10:30 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > Disable the aqmp logger, which likes to (at the moment) print out > > intermediate warnings and errors that cause session termination; disable > > them so they don't interfere with the job output. > > > > Leave any "CRITICAL" warnings enabled though, those are ones that we > > should never see, no matter what. > > I mean, looks OK to me, but from what I understand (i.e. little), > qmp_client doesn’t log CRITICAL messages, at least I can’t see any. Only > ERRORs. > > There's *one* critical message in protocol.py, used for a circumstance that I *think* should be impossible. I do not think I currently use any WARNING level statements. > I guess I’m missing some CRITICAL messages in external functions called > from qmp_client.py, but shouldn’t we still keep ERRORs? > ...Mayybe? The errors logged by AQMP are *almost always* raised as Exceptions somewhere else, eventually. Sometimes when we encounter them in one context, we need to save them and then re-raise them in a different execution context. There's one good exception to this: My pal, EOFError. If the reader context encounters EOF, it raises EOFError and this causes a disconnect to be scheduled asynchronously. *Any* Exception that causes a disconnect to be scheduled asynchronously is dutifully logged as an ERROR. At this point in the code, we don't really know if the user of the library considers this an "error" yet or not. I've waffled a lot on how exactly to treat this circumstance. ...Hm, I guess that's really the only case where I have an error that really ought to be suppressed. I suppose what I will do here is: if the exception happens to be an EOFError I will drop the severity of the log message down to INFO. I don't know why it takes being challenged on this stuff to start thinking clearly about it, but here we are. Thank you for your feedback :~) --js
Re: [PATCH 11/15] python/aqmp: Create sync QMP wrapper for iotests
On Fri, Sep 17, 2021 at 10:23 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > This is a wrapper around the async QMPClient that mimics the old, > > synchronous QEMUMonitorProtocol class. It is designed to be > > interchangeable with the old implementation. > > > > It does not, however, attempt to mimic Exception compatibility. > > > > Signed-off-by: John Snow > > --- > > python/qemu/aqmp/legacy.py | 131 + > > 1 file changed, 131 insertions(+) > > create mode 100644 python/qemu/aqmp/legacy.py > > Looks reasonable to me, although I can’t give an R-b in good > conscience. I’m tempted to give a neo-tag (“Looks-okay-to:”), but let’s > just be boring and do an > > Acked-by: Hanna Reitz > > The mischief maker in me wants to add the custom tag, but I suppose I'll be kind to the people who crunch the stats and use the vanilla tag. :) Thanks a bunch! --js
Re: [PATCH 10/15] python/machine: Add support for AQMP backend
On Fri, Sep 17, 2021 at 10:16 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > To use the AQMP backend, Machine just needs to be a little more diligent > > about what happens when closing a QMP connection. The operation is no > > longer a freebie in the async world. > > > > Because async QMP continues to check for messages asynchronously, it's > > almost certainly likely that the loop will have exited due to EOF after > > issuing the last 'quit' command. That error will ultimately be bubbled > > up when attempting to close the QMP connection. The manager class here > > then is free to discard it if it was expected. > > > > Signed-off-by: John Snow > > > > --- > > > > Yes, I regret that this class has become quite a dumping ground for > > complexity around the exit path. It's in need of a refactor to help > > separate out the exception handling and cleanup mechanisms from the > > VM-related stuff, but it's not a priority to do that just yet -- but > > it's on the list. > > > > --- > > python/qemu/machine/machine.py | 51 ++ > > 1 file changed, 46 insertions(+), 5 deletions(-) > > > > diff --git a/python/qemu/machine/machine.py > b/python/qemu/machine/machine.py > > index 6e58d2f951..8f5a6649e5 100644 > > --- a/python/qemu/machine/machine.py > > +++ b/python/qemu/machine/machine.py > > @@ -342,9 +342,15 @@ def _post_shutdown(self) -> None: > > # Comprehensive reset for the failed launch case: > > self._early_cleanup() > > > > -if self._qmp_connection: > > -self._qmp.close() > > -self._qmp_connection = None > > +try: > > +self._close_qmp_connection() > > +except Exception as err: # pylint: disable=broad-except > > +LOG.warning( > > +"Exception closing QMP connection: %s", > > +str(err) if str(err) else type(err).__name__ > > +) > > +finally: > > +assert self._qmp_connection is None > > > > self._close_qemu_log_file() > > > > @@ -420,6 +426,31 @@ def _launch(self) -> None: > > close_fds=False) > > self._post_launch() > > > > +def _close_qmp_connection(self) -> None: > > +""" > > +Close the underlying QMP connection, if any. > > + > > +Dutifully report errors that occurred while closing, but assume > > +that any error encountered indicates an abnormal termination > > +process and not a failure to close. > > +""" > > +if self._qmp_connection is None: > > +return > > + > > +try: > > +self._qmp.close() > > +except EOFError: > > +# EOF can occur as an Exception here when using the Async > > +# QMP backend. It indicates that the server closed the > > +# stream. If we successfully issued 'quit' at any point, > > +# then this was expected. If the remote went away without > > +# our permission, it's worth reporting that as an abnormal > > +# shutdown case. > > +if not self._has_quit: > > +raise > > +finally: > > +self._qmp_connection = None > > + > > def _early_cleanup(self) -> None: > > """ > > Perform any cleanup that needs to happen before the VM exits. > > @@ -461,8 +492,18 @@ def _soft_shutdown(self, timeout: Optional[int]) -> > None: > > > > if self._qmp_connection: > > if not self._has_quit: > > -# Might raise ConnectionReset > > -self.qmp('quit') > > +try: > > +# May raise ExecInterruptedError or StateError if > the > > +# connection dies or has already died. > > +self.qmp('quit') > > +finally: > > +# If 'quit' fails, we'll still want to call close(), > > +# which will raise any final errors that may have > > +# occurred while trying to send 'quit'. > > +self._close_qmp_connection() > > +else: > > +# Regardless, we want to tidy up the socket. > > +self._close_qmp_connection() > > Why can’t we wait for _post_shutdown to do that? Has it something to do > with this operation being “no longer a freeby” (I’m not quite sure what > this refers to, execution time, or the set of possible exceptions, or > perhaps something else entirely), and so we want to do it in the > already-not-instantaneous _soft_shutdown? > > Hanna > > I wrote that commit message too casually. By "freebie", I meant that closing a simple sync socket does not have any potential for raising an error, so we don't really have to worry about that operation failing. The async machinery, by comparison, uses the disconnection point as its synchronization point where it gathers the
Re: [PATCH] qemu-img: Add -F shorthand to convert
17.09.2021 23:13, Eric Blake wrote: On Mon, Sep 13, 2021 at 08:17:35AM -0500, Eric Blake wrote: Although we have long supported 'qemu-img convert -o backing_file=foo,backing_fmt=bar', the fact that we have a shortcut -B for backing_file but none for backing_fmt has made it more likely that users accidentally run into: qemu-img: warning: Deprecated use of backing file without explicit backing format when using -B instead of -o. For similarity with other qemu-img commands, such as create and compare, add '-F $fmt' as the shorthand for '-o backing_fmt=$fmt'. Update iotest 122 for coverage of both spellings. Signed-off-by: Eric Blake --- Self-review (and late, too), but... +++ b/docs/tools/qemu-img.rst @@ -414,7 +414,7 @@ Command description: 4 Error on reading data -.. option:: convert [--object OBJECTDEF] [--image-opts] [--target-image-opts] [--target-is-zero] [--bitmaps [--skip-broken-bitmaps]] [-U] [-C] [-c] [-p] [-q] [-n] [-f FMT] [-t CACHE] [-T SRC_CACHE] [-O OUTPUT_FMT] [-B BACKING_FILE] [-o OPTIONS] [-l SNAPSHOT_PARAM] [-S SPARSE_SIZE] [-r RATE_LIMIT] [-m NUM_COROUTINES] [-W] FILENAME [FILENAME2 [...]] OUTPUT_FILENAME +.. option:: convert [--object OBJECTDEF] [--image-opts] [--target-image-opts] [--target-is-zero] [--bitmaps [--skip-broken-bitmaps]] [-U] [-C] [-c] [-p] [-q] [-n] [-f FMT] [-t CACHE] [-T SRC_CACHE] [-O OUTPUT_FMT] [-B BACKING_FILE [-F backing_fmt]] [-o OPTIONS] [-l SNAPSHOT_PARAM] [-S SPARSE_SIZE] [-r RATE_LIMIT] [-m NUM_COROUTINES] [-W] FILENAME [FILENAME2 [...]] OUTPUT_FILENAME s/backing_fmt/BACKING_FMT/ would be more consistent here +++ b/qemu-img-cmds.hx @@ -46,7 +46,7 @@ SRST ERST DEF("convert", img_convert, -"convert [--object objectdef] [--image-opts] [--target-image-opts] [--target-is-zero] [--bitmaps] [-U] [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B backing_file] [-o options] [-l snapshot_param] [-S sparse_size] [-r rate_limit] [-m num_coroutines] [-W] [--salvage] filename [filename2 [...]] output_filename") +"convert [--object objectdef] [--image-opts] [--target-image-opts] [--target-is-zero] [--bitmaps] [-U] [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B backing_file [-F backing_fmt]] [-o options] [-l snapshot_param] [-S sparse_size] [-r rate_limit] [-m num_coroutines] [-W] [--salvage] filename [filename2 [...]] output_filename") SRST .. option:: convert [--object OBJECTDEF] [--image-opts] [--target-image-opts] [--target-is-zero] [--bitmaps] [-U] [-C] [-c] [-p] [-q] [-n] [-f FMT] [-t CACHE] [-T SRC_CACHE] [-O OUTPUT_FMT] [-B BACKING_FILE] [-o OPTIONS] [-l SNAPSHOT_PARAM] [-S SPARSE_SIZE] [-r RATE_LIMIT] [-m NUM_COROUTINES] [-W] [--salvage] FILENAME [FILENAME2 [...]] OUTPUT_FILENAME and I missed this line here. We have too much not-quite-identical duplication :( Followup patch coming. That's for sure accidental complexity :) -- Best regards, Vladimir
Re: [PATCH 09/15] python/machine: remove has_quit argument
On Fri, Sep 17, 2021 at 9:59 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > If we spy on the QMP commands instead, we don't need callers to remember > > to pass it. Seems like a fair trade-off. > > > > The one slightly weird bit is overloading this instance variable for > > wait(), where we use it to mean "don't issue the qmp 'quit' > > command". This means that wait() will "fail" if the QEMU process does > > not terminate of its own accord. > > > > In most cases, we probably did already actually issue quit -- some > > iotests do this -- but in some others, we may be waiting for QEMU to > > terminate for some other reason. > > > > Signed-off-by: John Snow > > --- > > python/qemu/machine/machine.py | 35 +++--- > > tests/qemu-iotests/040 | 7 +-- > > tests/qemu-iotests/218 | 2 +- > > tests/qemu-iotests/255 | 2 +- > > 4 files changed, 23 insertions(+), 23 deletions(-) > > Looks good overall, some bikeshedding below. > > > diff --git a/python/qemu/machine/machine.py > b/python/qemu/machine/machine.py > > index 056d340e35..6e58d2f951 100644 > > --- a/python/qemu/machine/machine.py > > +++ b/python/qemu/machine/machine.py > > @@ -170,6 +170,7 @@ def __init__(self, > > self._console_socket: Optional[socket.socket] = None > > self._remove_files: List[str] = [] > > self._user_killed = False > > +self._has_quit = False > > Sounds a bit weird, because evidently it has quit. > > I mean, it was always more of a has_sent_quit or quit_issued, but now it > really seems a bit wrong. > > "quit_issued" seems like it might help overall, I'll do that. > [...] > > > @@ -529,7 +526,9 @@ def wait(self, timeout: Optional[int] = 30) -> None: > > :param timeout: Optional timeout in seconds. Default 30 > seconds. > > A value of `None` is an infinite wait. > > """ > > -self.shutdown(has_quit=True, timeout=timeout) > > +# In case QEMU was stopped without issuing 'quit': > > This comment confused me more than anything and only looking at the > function’s name and doc string was I able to understand why we have > has_quit = True here, and only by scratching my head asking myself why > you’d write the comment did I understand the comment’s purpose. > > What isn’t quite clear in the comment is that we kind of expect > _has_quit to already be True, because the VM was probably exited with > `quit`. But we don’t know for sure, so we have to force _has_quit to True. > > But it’s also not necessary to explain, I think. The idea is simply > that this function should do nothing to make the VM quit, so it’s > absolutely natural that we’d set _has_quit to True, because the caller > guarantees that they already made the VM quit. No need to explain why > this might already be True, and why it’s still OK. > > So I’d just say something like “Do not send a `quit` to the VM, just > wait for it to exit”. > > I'll just drop the comment, then. > Hanna > > > +self._has_quit = True > > +self.shutdown(timeout=timeout) > > > > def set_qmp_monitor(self, enabled: bool = True) -> None: > > """ > >
Re: [PATCH] qemu-img: Add -F shorthand to convert
On Mon, Sep 13, 2021 at 08:17:35AM -0500, Eric Blake wrote: > Although we have long supported 'qemu-img convert -o > backing_file=foo,backing_fmt=bar', the fact that we have a shortcut -B > for backing_file but none for backing_fmt has made it more likely that > users accidentally run into: > > qemu-img: warning: Deprecated use of backing file without explicit backing > format > > when using -B instead of -o. For similarity with other qemu-img > commands, such as create and compare, add '-F $fmt' as the shorthand > for '-o backing_fmt=$fmt'. Update iotest 122 for coverage of both > spellings. > > Signed-off-by: Eric Blake > --- > Self-review (and late, too), but... > +++ b/docs/tools/qemu-img.rst > @@ -414,7 +414,7 @@ Command description: >4 > Error on reading data > > -.. option:: convert [--object OBJECTDEF] [--image-opts] > [--target-image-opts] [--target-is-zero] [--bitmaps [--skip-broken-bitmaps]] > [-U] [-C] [-c] [-p] [-q] [-n] [-f FMT] [-t CACHE] [-T SRC_CACHE] [-O > OUTPUT_FMT] [-B BACKING_FILE] [-o OPTIONS] [-l SNAPSHOT_PARAM] [-S > SPARSE_SIZE] [-r RATE_LIMIT] [-m NUM_COROUTINES] [-W] FILENAME [FILENAME2 > [...]] OUTPUT_FILENAME > +.. option:: convert [--object OBJECTDEF] [--image-opts] > [--target-image-opts] [--target-is-zero] [--bitmaps [--skip-broken-bitmaps]] > [-U] [-C] [-c] [-p] [-q] [-n] [-f FMT] [-t CACHE] [-T SRC_CACHE] [-O > OUTPUT_FMT] [-B BACKING_FILE [-F backing_fmt]] [-o OPTIONS] [-l > SNAPSHOT_PARAM] [-S SPARSE_SIZE] [-r RATE_LIMIT] [-m NUM_COROUTINES] [-W] > FILENAME [FILENAME2 [...]] OUTPUT_FILENAME s/backing_fmt/BACKING_FMT/ would be more consistent here > +++ b/qemu-img-cmds.hx > @@ -46,7 +46,7 @@ SRST > ERST > > DEF("convert", img_convert, > -"convert [--object objectdef] [--image-opts] [--target-image-opts] > [--target-is-zero] [--bitmaps] [-U] [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t > cache] [-T src_cache] [-O output_fmt] [-B backing_file] [-o options] [-l > snapshot_param] [-S sparse_size] [-r rate_limit] [-m num_coroutines] [-W] > [--salvage] filename [filename2 [...]] output_filename") > +"convert [--object objectdef] [--image-opts] [--target-image-opts] > [--target-is-zero] [--bitmaps] [-U] [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t > cache] [-T src_cache] [-O output_fmt] [-B backing_file [-F backing_fmt]] [-o > options] [-l snapshot_param] [-S sparse_size] [-r rate_limit] [-m > num_coroutines] [-W] [--salvage] filename [filename2 [...]] output_filename") > SRST > .. option:: convert [--object OBJECTDEF] [--image-opts] > [--target-image-opts] [--target-is-zero] [--bitmaps] [-U] [-C] [-c] [-p] [-q] > [-n] [-f FMT] [-t CACHE] [-T SRC_CACHE] [-O OUTPUT_FMT] [-B BACKING_FILE] [-o > OPTIONS] [-l SNAPSHOT_PARAM] [-S SPARSE_SIZE] [-r RATE_LIMIT] [-m > NUM_COROUTINES] [-W] [--salvage] FILENAME [FILENAME2 [...]] OUTPUT_FILENAME and I missed this line here. We have too much not-quite-identical duplication :( Followup patch coming. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH 08/15] python/aqmp: Create MessageModel and StandaloneModel classes
On Fri, Sep 17, 2021, 9:39 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > This allows 'Greeting' to be subclass of 'Message'. We need the adapter > > classes to avoid some typing problems that occur if we try to put too > > much into the 'Model' class itself; the exact details of why are left as > > an exercise to the reader. > > > > Why bother? This makes 'Greeting' ⊆ 'Message', which is taxonomically > > true; but the real motivation is to be able to inherit and abuse all of > > the Mapping dunders so that we can call dict(greeting) or > > bytes(greeting), for example. > > > > Signed-off-by: John Snow > > --- > > python/qemu/aqmp/models.py | 67 -- > > 1 file changed, 50 insertions(+), 17 deletions(-) > > (I feel like this is a bit much outside of my scope, so this’ll have to > do without my R-b.) > It's a patch I like the least, too. I shouldn't have included it here in this series. It needs more time in the oven and it should be included ... somewhere else. Sorry about this one. >
Re: [PATCH v3 00/17] iotests: support zstd
17.09.2021 17:49, Hanna Reitz wrote: On 15.09.21 16:45, Hanna Reitz wrote: On 14.09.21 12:25, Vladimir Sementsov-Ogievskiy wrote: These series makes tests pass with IMGOPTS='compression_type=zstd' Also, python iotests start to support IMGOPTS (they didn't before). Unfortunately, the problem I have now is that it makes the tests fail with other IMGOPTS. My regular test set includes 'refcount_bits=1', 'compat=0.10', and 'data_file=$TEST_IMG.ext_data_file'. These fail now, because the Python tests don’t have a way to specify which test options they don’t support (like _unsupported_imgopts). Handling data_file of course is extra tricky because now every disk image consists of two files, and the qemu-img create invocation needs to expand '$TEST_IMG', like _make_test_img does (in the line where imgopts_expanded is set). I think we need an unsupported_imgopts parameter for Python tests, and it needs to be filled (perhaps https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg00082.html can serve as a starting point). And for the time being, I think Python tests should be skipped altogether when IMGOPTS contains data_file. (Perhaps I should explicitly say that this means I didn’t include this series in my pull request this week, because, well, my pre-pull tests were failing :/) That was clean :) I'll resend (Also wanted to let you know I’ll be on PTO the next two weeks, so I won’t be able to review a v4 or talk about how exactly we want to handle IMGOPTS other than compression_type until Oct 4.) -- Best regards, Vladimir
Re: [PATCH 07/15] python/aqmp: add send_fd_scm
On Fri, Sep 17, 2021 at 9:34 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > The single space is indeed required to successfully transmit the file > > descriptor to QEMU. > > Yeah, socket_scm_helper.c said “Send a blank to notify qemu”. > > > Signed-off-by: John Snow > > --- > > python/qemu/aqmp/qmp_client.py | 17 + > > 1 file changed, 17 insertions(+) > > > > diff --git a/python/qemu/aqmp/qmp_client.py > b/python/qemu/aqmp/qmp_client.py > > index d2ad7459f9..58f85990bc 100644 > > --- a/python/qemu/aqmp/qmp_client.py > > +++ b/python/qemu/aqmp/qmp_client.py > > @@ -9,6 +9,8 @@ > > > > import asyncio > > import logging > > +import socket > > +import struct > > from typing import ( > > Dict, > > List, > > @@ -624,3 +626,18 @@ async def execute(self, cmd: str, > > """ > > msg = self.make_execute_msg(cmd, arguments, oob=oob) > > return await self.execute_msg(msg) > > + > > +@upper_half > > +@require(Runstate.RUNNING) > > +def send_fd_scm(self, fd: int) -> None: > > +""" > > +Send a file descriptor to the remote via SCM_RIGHTS. > > +""" > > +assert self._writer is not None > > +sock = self._writer.transport.get_extra_info('socket') > > + > > +# Python 3.9+ adds socket.send_fds(...) > > +sock.sendmsg( > > +[b' '], > > +[(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack('@i', > fd))] > > +) > > AFAIU the socket can be either TCP or a UNIX socket > (AsyncProtocol._do_accept’s docstring sounds this way), so should we > check that this is a UNIX socket? (Or is it fine to just run into the > error that I suspect we would get with a TCP socket?) > > Hanna > > Uhh, hm. I was going to say "Yeah, just let it fail!" but ... upon going to document what error to expect in this case, I am realizing it fails silently. So, uh, that's not ideal. I'll fix this to make it bark. --js
Re: [PATCH 05/15] python/qmp: add send_fd_scm directly to QEMUMonitorProtocol
On Fri, Sep 17, 2021 at 9:21 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > It turns out you can do this directly from Python ... and because of > > this, you don't need to worry about setting the inheritability of the > > fds or spawning another process. > > > > Doing this is helpful because it allows QEMUMonitorProtocol to keep its > > file descriptor and socket object as private implementation details. > > > > *that* is helpful in turn because it allows me to write a compatible, > > alternative implementation. > > Bit of a weird indentation here. > > > Signed-off-by: John Snow > > --- > > python/qemu/machine/machine.py | 44 +++--- > > python/qemu/qmp/__init__.py| 21 +++- > > 2 files changed, 18 insertions(+), 47 deletions(-) > > > > diff --git a/python/qemu/machine/machine.py > b/python/qemu/machine/machine.py > > index ae945ca3c9..1c6532a3d6 100644 > > --- a/python/qemu/machine/machine.py > > +++ b/python/qemu/machine/machine.py > > @@ -213,48 +213,22 @@ def add_fd(self: _T, fd: int, fdset: int, > > def send_fd_scm(self, fd: Optional[int] = None, > > file_path: Optional[str] = None) -> int: > > [...] > > > if file_path is not None: > > assert fd is None > > -fd_param.append(file_path) > > +with open(file_path, "rb") as passfile: > > +fd = passfile.fileno() > > +self._qmp.send_fd_scm(fd) > > Seems a bit strange to send an fd that is then immediately closed, but > that’s what socket_scm_helper did, and so it looks like the fd is > effectively duplicated. OK then. > > Same boat. It's weird, but it seems to work, and it's how the old interface (ultimately) behaved, so ... https://i.imgur.com/O0CQXoh.png > Reviewed-by: Hanna Reitz > >
Re: [PATCH 04/15] python/qmp: clear events on get_events() call
On Fri, Sep 17, 2021 at 8:51 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > All callers in the tree *already* clear the events after a call to > > get_events(). Do it automatically instead and update callsites to remove > > the manual clear call. > > > > These semantics are quite a bit easier to emulate with async QMP, and > > nobody appears to be abusing some emergent properties of what happens if > > you decide not to clear them, so let's dial down to the dumber, simpler > > thing. > > > > Specifically: callers of clear() right after a call to get_events() are > > more likely expressing their desire to not see any events they just > > retrieved, whereas callers of clear_events() not in relation to a recent > > call to pull_event/get_events are likely expressing their desire to > > simply drop *all* pending events straight onto the floor. In the sync > > world, this is safe enough; in the async world it's nearly impossible to > > promise that nothing happens between getting and clearing the > > events. > > > > Making the retrieval also clear the queue is vastly simpler. > > > > Signed-off-by: John Snow > > --- > > python/qemu/machine/machine.py | 1 - > > python/qemu/qmp/__init__.py| 4 +++- > > python/qemu/qmp/qmp_shell.py | 1 - > > 3 files changed, 3 insertions(+), 3 deletions(-) > > [...] > > > diff --git a/python/qemu/qmp/__init__.py b/python/qemu/qmp/__init__.py > > index 269516a79b..ba15668c25 100644 > > --- a/python/qemu/qmp/__init__.py > > +++ b/python/qemu/qmp/__init__.py > > @@ -374,7 +374,9 @@ def get_events(self, wait: bool = False) -> > List[QMPMessage]: > > @return The list of available QMP events. > > """ > > self.__get_events(wait) > > -return self.__events > > +events = self.__events > > +self.__events = [] > > +return events > > Maybe it’s worth updating the doc string that right now just says “Get a > list of available QMP events”? > > Yes, that's an oversight on my part. I'm updating it to: "Get a list of available QMP events and clear the list of pending events." and adding your RB. (Also, perhaps renaming it to get_new_events, but that’s an even weaker > suggestion.) > I tried to avoid churn in the iotests as much as I could manage, so I think I will leave this be for now. I have an impression that the number of cases where one might wish to see events that have already been witnessed is actually quite low, so I am not sure that it needs disambiguation from a case that is essentially never used. (I have certainly been wrong before, though.) > Anyway: > > Reviewed-by: Hanna Reitz > > ~ thankyou ~
Re: [PATCH 03/15] python/aqmp: Return cleared events from EventListener.clear()
On Fri, Sep 17, 2021 at 8:36 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > This serves two purposes: > > > > (1) It is now possible to discern whether or not clear() removed any > > event(s) from the queue with absolute certainty, and > > > > (2) It is now very easy to get a List of all pending events in one > > chunk, which is useful for the sync bridge. > > > > Signed-off-by: John Snow > > --- > > python/qemu/aqmp/events.py | 9 +++-- > > 1 file changed, 7 insertions(+), 2 deletions(-) > > Not sure if `clear` is an ideal name then, but `drain` sounds like > something that would block, and `drop` is really much different from > `clear`. Also, doesn’t matter, having Collection.delete return the > deleted element is a common thing in any language’s standard library, so > why not have `clear` do the same. > > It isn't too late to change the name, but it sounds like you don't necessarily prefer any of those others over what's there now. > Reviewed-by: Hanna Reitz > > Thanks!
Re: [PATCH v2] nbd/server: Suppress Broken pipe errors on abrupt disconnection
On Wed, Sep 15, 2021 at 12:11:35PM +0300, Vladimir Sementsov-Ogievskiy wrote: > > > >There are two methods of terminating the transmission phase: > > > >... > > > >"The client or the server drops the TCP session (in which case it > > > >SHOULD shut down the TLS session first). This is referred to as > > > >'initiating a hard disconnect'." > > > > > > Right. Here the method is defined, no word that client can do it at any > > > time. Note that right now, most TLS clients are NOT gracefully shutting down the TLS session, on either hard or soft disconnect. And this is even observable in non-deterministic behavior of what message the server reports when it diagnoses that a TLS client has gone away. As argued elsewhere, a hard disconnect does not necessarily lose data, but it does add non-determinism into the equation, which makes regression testing a bit harder. > > > > I don't read this as a definition. > > If so, next paragraphs that specify client behavior doesn't make sense. > > > > > But we should probably clarify the spec to note that dropping the > > connection is fine, because it is - no one has made any argument so > > far that there's an actual reason to waste time on NBD_CMD_DISC. > > > > I agree that it's safe enough.. > > Hmm. If we update the spec to clarify that sending DISC is not necessary, is > there any reason to send it at all? We can stop sending it in Qemu as well. No, I'd still prefer that qemu send NBD_CMD_DISC where possible. Although existing servers tolerate hard disconnects, the way I read the NBD spec is that clients SHOULD prefer soft disconnect, in case it deals with a server where the difference matters. When implementing 'qemu-nbd --list', I remember having to rejigger code to add in a graceful NBD_OPT_ABORT into more places, for the same reason as sending NBD_CMD_DISC at the end of transmission phase reduces the chance for non-determinism on how the server may react. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2] nbd/server: Suppress Broken pipe errors on abrupt disconnection
On Wed, Sep 15, 2021 at 10:00:25AM +0100, Richard W.M. Jones wrote: > > >I would say the spec is at best contradictory, but if you read other > > >parts of the spec, then it's pretty clear that we're allowed to drop > > >the connection whenever we like. This section says as much: > > > > > >https://github.com/NetworkBlockDevice/nbd/blob/5805b25ad3da96e7c0b3160cda51ea19eb518d5b/doc/proto.md#terminating-the-transmission-phase > > > > Hmm, that was exactly the section I read and quoted :) > > > > > > > > There are two methods of terminating the transmission phase: > > > ... > > > "The client or the server drops the TCP session (in which case it > > > SHOULD shut down the TLS session first). This is referred to as > > > 'initiating a hard disconnect'." > > > > Right. Here the method is defined, no word that client can do it at any > > time. > > I don't read this as a definition. > > But we should probably clarify the spec to note that dropping the > connection is fine, because it is - no one has made any argument so > far that there's an actual reason to waste time on NBD_CMD_DISC. > > Rich. > > > Next, in same section: > > > >Either side MAY initiate a hard disconnect if it detects a violation by > > the other party of a mandatory condition within this document. > > > > Next > >The client MAY issue a soft disconnect at any time > > > > And finally: > > > >The client and the server MUST NOT initiate any form of disconnect other > > than in one of the above circumstances. I went and re-read the upstream NBD discussion when this part of the doc was added... https://lists.debian.org/nbd/2016/04/msg00420.html https://lists.debian.org/nbd/2016/04/msg00425.html The argument back then was that sending CMD_DISC is desirable (clients want to be nice to the server) but not mandatory (server must be prepared for clients that aren't nice). Sending CMD_DISC is what counts as initiating soft disconnect; terminating abruptly without CMD_DISC is hard disconnect (servers must not do it without detecting some other first, but clients doing it when they have nothing further to send is something we have to live with). At the time the text was added, there was a question of whether to add a command NBD_CMD_CLOSE that guaranteed clean server shutdown (the client had to wait for the server's response): https://lists.debian.org/nbd/2016/04/msg00236.html but in the end, it was decided that the semantics of NBD_CMD_DISC were already good enough for that purpose. Which in turn means that clients really do expect servers to gracefully flush things to disk on NBD_CMD_DISC, and merely disconnecting (a hard shutdown) after NBD_CMD_FLUSH is a bit riskier than sending a soft shutdown request. > > > > > > Or do you mean some other spec section I missed? > > > > > > > >Anyway we're dropping the TCP connection because sometimes we are just > > >interrogating an NBD server eg to find out what it supports, and doing > > >a graceful shutdown is a waste of time and internet. You're right that it costs another couple of packets, particularly for things like 'nbdinfo --list', but is it really dominating the time spent in the normal use of NBD? Micro-optimizing the shutdown doesn't have as much payoff as optimizing the speed of NBD_CMD_READ/WRITE. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH 14/15] python/aqmp: Remove scary message
On Fri, Sep 17, 2021 at 10:39 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > The scary message interferes with the iotests output. Coincidentally, if > > iotests works by removing this, then it's good evidence that we don't > > really need to scare people away from using it. > > > > Signed-off-by: John Snow > > --- > > python/qemu/aqmp/__init__.py | 14 -- > > 1 file changed, 14 deletions(-) > > I definitely won’t give an R-b for this one, because that would require > me reviewing the AQMP series, and, well, you know. > > Yep, no worries. I'd feel bad if you started digging into it. Not the best use of your time. I am trying to shield you from this junk, not pull you into it. (but, I hope the new library is easy to use. I went out of my way to make sure the transition would be as seamless as possible for iotest writers, and I genuinely hope I achieved that. Though as you've seen, there's a few messy bits -- One of the reasons for sending this series out to list so soon was precisely to force someone to witness the ugly bits so I could align my thinking on how best to address them.) Also, if I were to review the AQMP series and could actually give R-bs > in good faith, why would I accept adding this message? I mean, if I’d > reviewed it, I’d’ve had to trust it. > > So, öhm, I’m fine with dropping this message because evidently I’d’ve > never agreed to adding it in the first place (had I reviewed the AQMP > series). > > Hanna > > I added as a pre-emptive mollification, it's been in my series for a while. I jokingly refer to it as "tech preview". The asyncio stuff is fairly new (even to Python programmers) and though I have tried my very hardest to test that library as thoroughly as I possibly could, it seems like everyone's reaction has been "Ah jeez, I dunno if I can truly review this" so I added a little warning to ease minds a bit and try to make people feel like they were committing to less.
Re: [PATCH 15/15] python, iotests: replace qmp with aqmp
On Fri, Sep 17, 2021 at 10:40 AM Hanna Reitz wrote: > On 17.09.21 07:40, John Snow wrote: > > Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with the sync > > wrapper from qemu.aqmp instead. > > > > Add an escape hatch in the form of the environment variable > > QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into using > > the old interface, proving that both implementations work concurrently. > > > > Signed-off-by: John Snow > > --- > > python/qemu/machine/machine.py | 7 ++- > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > diff --git a/python/qemu/machine/machine.py > b/python/qemu/machine/machine.py > > index 8f5a6649e5..6b005dd5d1 100644 > > --- a/python/qemu/machine/machine.py > > +++ b/python/qemu/machine/machine.py > > @@ -41,7 +41,6 @@ > > ) > > > > from qemu.qmp import ( # pylint: disable=import-error > > -QEMUMonitorProtocol, > > QMPMessage, > > QMPReturnValue, > > SocketAddrT, > > @@ -50,6 +49,12 @@ > > from . import console_socket > > > > > > +if os.environ.get('QEMU_PYTHON_LEGACY_QMP'): > > +from qemu.qmp import QEMUMonitorProtocol > > +else: > > +from qemu.aqmp.legacy import QEMUMonitorProtocol > > + > > + > > LOG = logging.getLogger(__name__) > > Black magic. > > Reviewed-by: Hanna Reitz > Tested-by: Hanna Reitz > > Thanks for taking a look! Sorry for making you look at python :)
Re: [PATCH v3 00/17] iotests: support zstd
On 15.09.21 16:45, Hanna Reitz wrote: On 14.09.21 12:25, Vladimir Sementsov-Ogievskiy wrote: These series makes tests pass with IMGOPTS='compression_type=zstd' Also, python iotests start to support IMGOPTS (they didn't before). Unfortunately, the problem I have now is that it makes the tests fail with other IMGOPTS. My regular test set includes 'refcount_bits=1', 'compat=0.10', and 'data_file=$TEST_IMG.ext_data_file'. These fail now, because the Python tests don’t have a way to specify which test options they don’t support (like _unsupported_imgopts). Handling data_file of course is extra tricky because now every disk image consists of two files, and the qemu-img create invocation needs to expand '$TEST_IMG', like _make_test_img does (in the line where imgopts_expanded is set). I think we need an unsupported_imgopts parameter for Python tests, and it needs to be filled (perhaps https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg00082.html can serve as a starting point). And for the time being, I think Python tests should be skipped altogether when IMGOPTS contains data_file. (Perhaps I should explicitly say that this means I didn’t include this series in my pull request this week, because, well, my pre-pull tests were failing :/) (Also wanted to let you know I’ll be on PTO the next two weeks, so I won’t be able to review a v4 or talk about how exactly we want to handle IMGOPTS other than compression_type until Oct 4.) Hanna
Re: [PATCH 15/15] python, iotests: replace qmp with aqmp
On 17.09.21 07:40, John Snow wrote: Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with the sync wrapper from qemu.aqmp instead. Add an escape hatch in the form of the environment variable QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into using the old interface, proving that both implementations work concurrently. Signed-off-by: John Snow --- python/qemu/machine/machine.py | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py index 8f5a6649e5..6b005dd5d1 100644 --- a/python/qemu/machine/machine.py +++ b/python/qemu/machine/machine.py @@ -41,7 +41,6 @@ ) from qemu.qmp import ( # pylint: disable=import-error -QEMUMonitorProtocol, QMPMessage, QMPReturnValue, SocketAddrT, @@ -50,6 +49,12 @@ from . import console_socket +if os.environ.get('QEMU_PYTHON_LEGACY_QMP'): +from qemu.qmp import QEMUMonitorProtocol +else: +from qemu.aqmp.legacy import QEMUMonitorProtocol + + LOG = logging.getLogger(__name__) Black magic. Reviewed-by: Hanna Reitz Tested-by: Hanna Reitz
Re: [PATCH 14/15] python/aqmp: Remove scary message
On 17.09.21 07:40, John Snow wrote: The scary message interferes with the iotests output. Coincidentally, if iotests works by removing this, then it's good evidence that we don't really need to scare people away from using it. Signed-off-by: John Snow --- python/qemu/aqmp/__init__.py | 14 -- 1 file changed, 14 deletions(-) I definitely won’t give an R-b for this one, because that would require me reviewing the AQMP series, and, well, you know. Also, if I were to review the AQMP series and could actually give R-bs in good faith, why would I accept adding this message? I mean, if I’d reviewed it, I’d’ve had to trust it. So, öhm, I’m fine with dropping this message because evidently I’d’ve never agreed to adding it in the first place (had I reviewed the AQMP series). Hanna
Re: [PATCH 13/15] iotests: Accommodate async QMP Exception classes
On 17.09.21 07:40, John Snow wrote: (But continue to support the old ones for now, too.) There are very few cases of any user of QEMUMachine or a subclass thereof relying on a QMP Exception type. If you'd like to check for yourself, you want to grep for all of the derivatives of QMPError, excluding 'AQMPError' and its derivatives. That'd be these: - QMPError - QMPConnectError - QMPCapabilitiesError - QMPTimeoutError - QMPProtocolError - QMPResponseError - QMPBadPortError Signed-off-by: John Snow --- scripts/simplebench/bench_block_job.py| 3 ++- tests/qemu-iotests/tests/mirror-top-perms | 6 +- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/scripts/simplebench/bench_block_job.py b/scripts/simplebench/bench_block_job.py index 4f03c12169..a403c35b08 100755 --- a/scripts/simplebench/bench_block_job.py +++ b/scripts/simplebench/bench_block_job.py @@ -28,6 +28,7 @@ sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python')) from qemu.machine import QEMUMachine from qemu.qmp import QMPConnectError +from qemu.aqmp import ConnectError def bench_block_job(cmd, cmd_args, qemu_args): @@ -49,7 +50,7 @@ def bench_block_job(cmd, cmd_args, qemu_args): vm.launch() except OSError as e: return {'error': 'popen failed: ' + str(e)} -except (QMPConnectError, socket.timeout): +except (QMPConnectError, ConnectError, socket.timeout): return {'error': 'qemu failed: ' + str(vm.get_log())} try: diff --git a/tests/qemu-iotests/tests/mirror-top-perms b/tests/qemu-iotests/tests/mirror-top-perms index 451a0666f8..7d448f4d23 100755 --- a/tests/qemu-iotests/tests/mirror-top-perms +++ b/tests/qemu-iotests/tests/mirror-top-perms @@ -103,7 +103,11 @@ class TestMirrorTopPerms(iotests.QMPTestCase): print('ERROR: VM B launched successfully, this should not have ' 'happened') except qemu.qmp.QMPConnectError: -assert 'Is another process using the image' in self.vm_b.get_log() +pass +except qemu.aqmp.ConnectError: +pass + +assert 'Is another process using the image' in self.vm_b.get_log() But this assertion will fail if there was no exception, and so we won’t get to see the real problem, which is the original VM aborting (see the doc string). It doesn’t really matter that much that VM B can start (hence it being a logged error message, not a fatal error), and when it can start, of course it won’t print an error – but what’s important is that the original VM will then abort. I mean, not an absolute showstopper by any means, but still, the assertion was deliberately placed into the `except` block. Hanna
Re: [PATCH 12/15] iotests: Disable AQMP logging under non-debug modes
On 17.09.21 07:40, John Snow wrote: Disable the aqmp logger, which likes to (at the moment) print out intermediate warnings and errors that cause session termination; disable them so they don't interfere with the job output. Leave any "CRITICAL" warnings enabled though, those are ones that we should never see, no matter what. I mean, looks OK to me, but from what I understand (i.e. little), qmp_client doesn’t log CRITICAL messages, at least I can’t see any. Only ERRORs. I guess I’m missing some CRITICAL messages in external functions called from qmp_client.py, but shouldn’t we still keep ERRORs? Hanna Signed-off-by: John Snow --- tests/qemu-iotests/iotests.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index 273d2777ae..47e5f9738b 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -1383,6 +1383,8 @@ def execute_setup_common(supported_fmts: Sequence[str] = (), if debug: sys.argv.remove('-d') logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN)) +if not debug: +logging.getLogger("qemu.aqmp.qmp_client").setLevel(logging.CRITICAL) _verify_image_format(supported_fmts, unsupported_fmts) _verify_protocol(supported_protocols, unsupported_protocols)
Re: [PATCH 11/15] python/aqmp: Create sync QMP wrapper for iotests
On 17.09.21 07:40, John Snow wrote: This is a wrapper around the async QMPClient that mimics the old, synchronous QEMUMonitorProtocol class. It is designed to be interchangeable with the old implementation. It does not, however, attempt to mimic Exception compatibility. Signed-off-by: John Snow --- python/qemu/aqmp/legacy.py | 131 + 1 file changed, 131 insertions(+) create mode 100644 python/qemu/aqmp/legacy.py Looks reasonable to me, although I can’t give an R-b in good conscience. I’m tempted to give a neo-tag (“Looks-okay-to:”), but let’s just be boring and do an Acked-by: Hanna Reitz
Re: [PATCH 10/15] python/machine: Add support for AQMP backend
On 17.09.21 07:40, John Snow wrote: To use the AQMP backend, Machine just needs to be a little more diligent about what happens when closing a QMP connection. The operation is no longer a freebie in the async world. Because async QMP continues to check for messages asynchronously, it's almost certainly likely that the loop will have exited due to EOF after issuing the last 'quit' command. That error will ultimately be bubbled up when attempting to close the QMP connection. The manager class here then is free to discard it if it was expected. Signed-off-by: John Snow --- Yes, I regret that this class has become quite a dumping ground for complexity around the exit path. It's in need of a refactor to help separate out the exception handling and cleanup mechanisms from the VM-related stuff, but it's not a priority to do that just yet -- but it's on the list. --- python/qemu/machine/machine.py | 51 ++ 1 file changed, 46 insertions(+), 5 deletions(-) diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py index 6e58d2f951..8f5a6649e5 100644 --- a/python/qemu/machine/machine.py +++ b/python/qemu/machine/machine.py @@ -342,9 +342,15 @@ def _post_shutdown(self) -> None: # Comprehensive reset for the failed launch case: self._early_cleanup() -if self._qmp_connection: -self._qmp.close() -self._qmp_connection = None +try: +self._close_qmp_connection() +except Exception as err: # pylint: disable=broad-except +LOG.warning( +"Exception closing QMP connection: %s", +str(err) if str(err) else type(err).__name__ +) +finally: +assert self._qmp_connection is None self._close_qemu_log_file() @@ -420,6 +426,31 @@ def _launch(self) -> None: close_fds=False) self._post_launch() +def _close_qmp_connection(self) -> None: +""" +Close the underlying QMP connection, if any. + +Dutifully report errors that occurred while closing, but assume +that any error encountered indicates an abnormal termination +process and not a failure to close. +""" +if self._qmp_connection is None: +return + +try: +self._qmp.close() +except EOFError: +# EOF can occur as an Exception here when using the Async +# QMP backend. It indicates that the server closed the +# stream. If we successfully issued 'quit' at any point, +# then this was expected. If the remote went away without +# our permission, it's worth reporting that as an abnormal +# shutdown case. +if not self._has_quit: +raise +finally: +self._qmp_connection = None + def _early_cleanup(self) -> None: """ Perform any cleanup that needs to happen before the VM exits. @@ -461,8 +492,18 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None: if self._qmp_connection: if not self._has_quit: -# Might raise ConnectionReset -self.qmp('quit') +try: +# May raise ExecInterruptedError or StateError if the +# connection dies or has already died. +self.qmp('quit') +finally: +# If 'quit' fails, we'll still want to call close(), +# which will raise any final errors that may have +# occurred while trying to send 'quit'. +self._close_qmp_connection() +else: +# Regardless, we want to tidy up the socket. +self._close_qmp_connection() Why can’t we wait for _post_shutdown to do that? Has it something to do with this operation being “no longer a freeby” (I’m not quite sure what this refers to, execution time, or the set of possible exceptions, or perhaps something else entirely), and so we want to do it in the already-not-instantaneous _soft_shutdown? Hanna # May raise subprocess.TimeoutExpired self._subp.wait(timeout=timeout)
Re: [PATCH 09/15] python/machine: remove has_quit argument
On 17.09.21 07:40, John Snow wrote: If we spy on the QMP commands instead, we don't need callers to remember to pass it. Seems like a fair trade-off. The one slightly weird bit is overloading this instance variable for wait(), where we use it to mean "don't issue the qmp 'quit' command". This means that wait() will "fail" if the QEMU process does not terminate of its own accord. In most cases, we probably did already actually issue quit -- some iotests do this -- but in some others, we may be waiting for QEMU to terminate for some other reason. Signed-off-by: John Snow --- python/qemu/machine/machine.py | 35 +++--- tests/qemu-iotests/040 | 7 +-- tests/qemu-iotests/218 | 2 +- tests/qemu-iotests/255 | 2 +- 4 files changed, 23 insertions(+), 23 deletions(-) Looks good overall, some bikeshedding below. diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py index 056d340e35..6e58d2f951 100644 --- a/python/qemu/machine/machine.py +++ b/python/qemu/machine/machine.py @@ -170,6 +170,7 @@ def __init__(self, self._console_socket: Optional[socket.socket] = None self._remove_files: List[str] = [] self._user_killed = False +self._has_quit = False Sounds a bit weird, because evidently it has quit. I mean, it was always more of a has_sent_quit or quit_issued, but now it really seems a bit wrong. [...] @@ -529,7 +526,9 @@ def wait(self, timeout: Optional[int] = 30) -> None: :param timeout: Optional timeout in seconds. Default 30 seconds. A value of `None` is an infinite wait. """ -self.shutdown(has_quit=True, timeout=timeout) +# In case QEMU was stopped without issuing 'quit': This comment confused me more than anything and only looking at the function’s name and doc string was I able to understand why we have has_quit = True here, and only by scratching my head asking myself why you’d write the comment did I understand the comment’s purpose. What isn’t quite clear in the comment is that we kind of expect _has_quit to already be True, because the VM was probably exited with `quit`. But we don’t know for sure, so we have to force _has_quit to True. But it’s also not necessary to explain, I think. The idea is simply that this function should do nothing to make the VM quit, so it’s absolutely natural that we’d set _has_quit to True, because the caller guarantees that they already made the VM quit. No need to explain why this might already be True, and why it’s still OK. So I’d just say something like “Do not send a `quit` to the VM, just wait for it to exit”. Hanna +self._has_quit = True +self.shutdown(timeout=timeout) def set_qmp_monitor(self, enabled: bool = True) -> None: """
Re: [PATCH 08/15] python/aqmp: Create MessageModel and StandaloneModel classes
On 17.09.21 07:40, John Snow wrote: This allows 'Greeting' to be subclass of 'Message'. We need the adapter classes to avoid some typing problems that occur if we try to put too much into the 'Model' class itself; the exact details of why are left as an exercise to the reader. Why bother? This makes 'Greeting' ⊆ 'Message', which is taxonomically true; but the real motivation is to be able to inherit and abuse all of the Mapping dunders so that we can call dict(greeting) or bytes(greeting), for example. Signed-off-by: John Snow --- python/qemu/aqmp/models.py | 67 -- 1 file changed, 50 insertions(+), 17 deletions(-) (I feel like this is a bit much outside of my scope, so this’ll have to do without my R-b.)
Re: [PATCH 07/15] python/aqmp: add send_fd_scm
On 17.09.21 07:40, John Snow wrote: The single space is indeed required to successfully transmit the file descriptor to QEMU. Yeah, socket_scm_helper.c said “Send a blank to notify qemu”. Signed-off-by: John Snow --- python/qemu/aqmp/qmp_client.py | 17 + 1 file changed, 17 insertions(+) diff --git a/python/qemu/aqmp/qmp_client.py b/python/qemu/aqmp/qmp_client.py index d2ad7459f9..58f85990bc 100644 --- a/python/qemu/aqmp/qmp_client.py +++ b/python/qemu/aqmp/qmp_client.py @@ -9,6 +9,8 @@ import asyncio import logging +import socket +import struct from typing import ( Dict, List, @@ -624,3 +626,18 @@ async def execute(self, cmd: str, """ msg = self.make_execute_msg(cmd, arguments, oob=oob) return await self.execute_msg(msg) + +@upper_half +@require(Runstate.RUNNING) +def send_fd_scm(self, fd: int) -> None: +""" +Send a file descriptor to the remote via SCM_RIGHTS. +""" +assert self._writer is not None +sock = self._writer.transport.get_extra_info('socket') + +# Python 3.9+ adds socket.send_fds(...) +sock.sendmsg( +[b' '], +[(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack('@i', fd))] +) AFAIU the socket can be either TCP or a UNIX socket (AsyncProtocol._do_accept’s docstring sounds this way), so should we check that this is a UNIX socket? (Or is it fine to just run into the error that I suspect we would get with a TCP socket?) Hanna
Re: [PATCH 06/15] python, iotests: remove socket_scm_helper
On 17.09.21 07:40, John Snow wrote: It's not used anymore, now. Signed-off-by: John Snow --- tests/qemu-iotests/socket_scm_helper.c | 136 - python/qemu/machine/machine.py | 3 - python/qemu/machine/qtest.py | 2 - tests/Makefile.include | 1 - tests/meson.build | 4 - tests/qemu-iotests/iotests.py | 3 - tests/qemu-iotests/meson.build | 5 - tests/qemu-iotests/testenv.py | 8 +- 8 files changed, 1 insertion(+), 161 deletions(-) delete mode 100644 tests/qemu-iotests/socket_scm_helper.c delete mode 100644 tests/qemu-iotests/meson.build Oh, huh. Nice. Reviewed-by: Hanna Reitz
Re: [PATCH 05/15] python/qmp: add send_fd_scm directly to QEMUMonitorProtocol
On 17.09.21 07:40, John Snow wrote: It turns out you can do this directly from Python ... and because of this, you don't need to worry about setting the inheritability of the fds or spawning another process. Doing this is helpful because it allows QEMUMonitorProtocol to keep its file descriptor and socket object as private implementation details. *that* is helpful in turn because it allows me to write a compatible, alternative implementation. Bit of a weird indentation here. Signed-off-by: John Snow --- python/qemu/machine/machine.py | 44 +++--- python/qemu/qmp/__init__.py| 21 +++- 2 files changed, 18 insertions(+), 47 deletions(-) diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py index ae945ca3c9..1c6532a3d6 100644 --- a/python/qemu/machine/machine.py +++ b/python/qemu/machine/machine.py @@ -213,48 +213,22 @@ def add_fd(self: _T, fd: int, fdset: int, def send_fd_scm(self, fd: Optional[int] = None, file_path: Optional[str] = None) -> int: [...] if file_path is not None: assert fd is None -fd_param.append(file_path) +with open(file_path, "rb") as passfile: +fd = passfile.fileno() +self._qmp.send_fd_scm(fd) Seems a bit strange to send an fd that is then immediately closed, but that’s what socket_scm_helper did, and so it looks like the fd is effectively duplicated. OK then. Reviewed-by: Hanna Reitz
Re: [PATCH 04/15] python/qmp: clear events on get_events() call
On 17.09.21 07:40, John Snow wrote: All callers in the tree *already* clear the events after a call to get_events(). Do it automatically instead and update callsites to remove the manual clear call. These semantics are quite a bit easier to emulate with async QMP, and nobody appears to be abusing some emergent properties of what happens if you decide not to clear them, so let's dial down to the dumber, simpler thing. Specifically: callers of clear() right after a call to get_events() are more likely expressing their desire to not see any events they just retrieved, whereas callers of clear_events() not in relation to a recent call to pull_event/get_events are likely expressing their desire to simply drop *all* pending events straight onto the floor. In the sync world, this is safe enough; in the async world it's nearly impossible to promise that nothing happens between getting and clearing the events. Making the retrieval also clear the queue is vastly simpler. Signed-off-by: John Snow --- python/qemu/machine/machine.py | 1 - python/qemu/qmp/__init__.py| 4 +++- python/qemu/qmp/qmp_shell.py | 1 - 3 files changed, 3 insertions(+), 3 deletions(-) [...] diff --git a/python/qemu/qmp/__init__.py b/python/qemu/qmp/__init__.py index 269516a79b..ba15668c25 100644 --- a/python/qemu/qmp/__init__.py +++ b/python/qemu/qmp/__init__.py @@ -374,7 +374,9 @@ def get_events(self, wait: bool = False) -> List[QMPMessage]: @return The list of available QMP events. """ self.__get_events(wait) -return self.__events +events = self.__events +self.__events = [] +return events Maybe it’s worth updating the doc string that right now just says “Get a list of available QMP events”? (Also, perhaps renaming it to get_new_events, but that’s an even weaker suggestion.) Anyway: Reviewed-by: Hanna Reitz
Re: [PATCH 03/15] python/aqmp: Return cleared events from EventListener.clear()
On 17.09.21 07:40, John Snow wrote: This serves two purposes: (1) It is now possible to discern whether or not clear() removed any event(s) from the queue with absolute certainty, and (2) It is now very easy to get a List of all pending events in one chunk, which is useful for the sync bridge. Signed-off-by: John Snow --- python/qemu/aqmp/events.py | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) Not sure if `clear` is an ideal name then, but `drain` sounds like something that would block, and `drop` is really much different from `clear`. Also, doesn’t matter, having Collection.delete return the deleted element is a common thing in any language’s standard library, so why not have `clear` do the same. Reviewed-by: Hanna Reitz
Re: [PATCH 02/15] python/aqmp: add .empty() method to EventListener
On 17.09.21 07:40, John Snow wrote: Synchronous clients may want to know if they're about to block waiting for an event or not. A method such as this is necessary to implement a compatible interface for the old QEMUMonitorProtocol using the new async internals. Signed-off-by: John Snow --- python/qemu/aqmp/events.py | 6 ++ 1 file changed, 6 insertions(+) Reviewed-by: Hanna Reitz
Re: [PATCH 01/15] python/aqmp: add greeting property to QMPClient
On 17.09.21 07:40, John Snow wrote: Expose the greeting as a read-only property of QMPClient so it can be retrieved at-will. Signed-off-by: John Snow --- python/qemu/aqmp/qmp_client.py | 5 + 1 file changed, 5 insertions(+) Reviewed-by: Hanna Reitz
Re: [PATCH v3 16/16] iotests/linters: check mypy files all at once
On 16.09.21 06:09, John Snow wrote: We can circumvent the '__main__' redefinition problem by passing --scripts-are-modules. Take mypy out of the loop per-filename and check everything in one go: it's quite a bit faster. Is it possible to pull this to the beginning of the series? Just because patch 14 has to make everything quite slow (which might be a tiny nuisance in bisecting some day?). Signed-off-by: John Snow --- tests/qemu-iotests/linters.py | 62 --- 1 file changed, 29 insertions(+), 33 deletions(-) Reviewed-by: Hanna Reitz
Re: [PATCH v3 14/16] iotests/linters: Add workaround for mypy bug #9852
On 16.09.21 06:09, John Snow wrote: This one is insidious: if you use the invocation "from {namespace} import {subpackage}" as mirror-top-perms does, mypy will fail on every-other invocation *if* the package being imported is a package. Now, I could just edit mirror-top-perms to avoid this invocation, but since I tripped on a landmine, I might as well head it off at the pass and make sure nobody else trips on the same landmine. It seems to have something to do with the order in which files are checked as well, meaning the random order in which set(os.listdir()) produces the list of files to test will cause problems intermittently. mypy >= 0.920 isn't released yet, so add this workaround for now. See also: https://github.com/python/mypy/issues/11010 https://github.com/python/mypy/issues/9852 Signed-off-by: John Snow --- tests/qemu-iotests/linters.py | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/qemu-iotests/linters.py b/tests/qemu-iotests/linters.py index 4df062a973..9c97324e87 100755 --- a/tests/qemu-iotests/linters.py +++ b/tests/qemu-iotests/linters.py @@ -100,6 +100,9 @@ def run_linters( '--warn-unused-ignores', '--no-implicit-reexport', '--namespace-packages', +# Until we can use mypy >= 0.920, see +# https://github.com/python/mypy/issues/9852 +'--no-incremental', filename, ), I’m afraid I still don’t really understand this, but I’m happy with this given as the reported workaround and you saying it works. Reviewed-by: Hanna Reitz Question is, when “can we use” mypy >= 0.920? Should we check the version string and append this switch as required? Hanna
Re: [PATCH v3 11/16] iotests/297: return error code from run_linters()
On 16.09.21 06:09, John Snow wrote: This turns run_linters() into a bit of a hybrid test; returning non-zero on failed execution while also printing diffable information. This is done for the benefit of the avocado simple test runner, which will soon be attempting to execute this test from a different environment. (Note: universal_newlines is added to the pylint invocation for type consistency with the mypy run -- it's not strictly necessary, but it avoids some typing errors caused by our re-use of the 'p' variable.) Signed-off-by: John Snow Reviewed-by: Vladimir Sementsov-Ogievskiy --- tests/qemu-iotests/297 | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) I don’t think I like this very much. Returning an integer error code seems archaic. (You can perhaps already see that this is going to be one of these reviews of mine where I won’t say anything is really wrong, but where I will just lament subjectively missing beauty.) As you say, run_linters() to me seems very iotests-specific still: It emits a specific output that is compared against a reference output. Fine for 297, but not fine for a function provided by a linters.py, I’d say. I’d prefer run_linters() to return something like a Map[str, Optional[str]], that would map a tool to its output in case of error, i.e. ideally: `{'pylint': None, 'mypy': None}` 297 could format it something like ``` for tool, output in run_linters().items(): print(f'=== {tool} ===') if output is not None: print(output) ``` Or something. To check for error, you could put a Python script in python/tests that checks `any(output is not None for output in run_linters().values())` or something (and on error print the output). Pulling out run_linters() into an external file and having it print something to stdout just seems too iotests-centric to me. I suppose as long as the return code is right (which this patch is for) it should work for Avocado’s simple tests, too (which I don’t like very much either, by the way, because they too seem archaic to me), but, well. It almost seems like the Avocado test should just run ./check then. Come to think of it, to be absolutely blasphemous, why not. I could say all of this seems like quite some work that could be done by a python/tests script that does this: ``` #!/bin/sh set -e cat >/tmp/qemu-parrot.shNone: +) -> int: +ret = 0 print('=== pylint ===') sys.stdout.flush() # Todo notes are fine, but fixme's or xxx's should probably just be # fixed (in tests, at least) -subprocess.run( +p = subprocess.run( ('python3', '-m', 'pylint', '--score=n', '--notes=FIXME,XXX', *files), cwd=directory, env=env, check=False, +universal_newlines=True, ) +ret += p.returncode print('=== mypy ===') sys.stdout.flush() @@ -113,9 +116,12 @@ def run_linters( universal_newlines=True ) +ret += p.returncode if p.returncode != 0: print(p.stdout) +return ret + def main() -> None: for linter in ('pylint-3', 'mypy'):
Re: [PATCH v3 10/16] iotests/297: Add 'directory' argument to run_linters
On 16.09.21 06:09, John Snow wrote: Allow run_linters to work well if it's executed from a different directory. Signed-off-by: John Snow Reviewed-by: Vladimir Sementsov-Ogievskiy --- tests/qemu-iotests/297 | 3 +++ 1 file changed, 3 insertions(+) Reviewed-by: Hanna Reitz
Re: [PATCH v3 09/16] iotests/297: Separate environment setup from test execution
On 16.09.21 06:09, John Snow wrote: Move environment setup into main(), leaving pure test execution behind in run_linters(). Signed-off-by: John Snow Reviewed-by: Vladimir Sementsov-Ogievskiy --- tests/qemu-iotests/297 | 36 +--- 1 file changed, 21 insertions(+), 15 deletions(-) Reviewed-by: Hanna Reitz
Re: [PATCH v3 08/16] iotests/297: Create main() function
On 16.09.21 06:09, John Snow wrote: Instead of running "run_linters" directly, create a main() function that will be responsible for environment setup, leaving run_linters() responsible only for execution of the linters. (That environment setup will be moved over in forthcoming commits.) Signed-off-by: John Snow Reviewed-by: Vladimir Sementsov-Ogievskiy --- tests/qemu-iotests/297 | 12 1 file changed, 8 insertions(+), 4 deletions(-) Reviewed-by: Hanna Reitz
Re: [PATCH v3 07/16] iotests/297: Don't rely on distro-specific linter binaries
On 16.09.21 06:09, John Snow wrote: 'pylint-3' is another Fedora-ism. Use "python3 -m pylint" or "python3 -m mypy" to access these scripts instead. This style of invocation will prefer the "correct" tool when run in a virtual environment. Note that we still check for "pylint-3" before the test begins -- this check is now "overly strict", but shouldn't cause anything that was already running correctly to start failing. Signed-off-by: John Snow Reviewed-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Philippe Mathieu-Daudé --- tests/qemu-iotests/297 | 45 -- 1 file changed, 26 insertions(+), 19 deletions(-) I know it sounds silly, but to be honest I have no idea if replacing `mypy` by `python3 -m mypy` is correct, because no mypy documentation seems to suggest it. From what I understand, that’s generally how Python “binaries” work, though (i.e., installed as a module invokable with `python -m`, and then providing some stub binary that, well, effectively does this, but kind of in a weird way, I just don’t understand it), and none of the parameters seem to be hurt in this conversion, so: Reviewed-by: Hanna Reitz
Re: [PATCH v3 06/16] iotests/297: Add get_files() function
On 16.09.21 06:09, John Snow wrote: Split out file discovery into its own method to begin separating out the "environment setup" and "test execution" phases. Signed-off-by: John Snow --- tests/qemu-iotests/297 | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297 index 3e86f788fc..b3a56a3a29 100755 --- a/tests/qemu-iotests/297 +++ b/tests/qemu-iotests/297 @@ -21,6 +21,7 @@ import re import shutil import subprocess import sys +from typing import List import iotests @@ -56,10 +57,15 @@ def is_python_file(filename: str, directory: str = '.') -> bool: return False +def get_test_files(directory: str = '.') -> List[str]: +named_test_dir = os.path.join(directory, 'tests') +named_tests = [f"tests/{entry}" for entry in os.listdir(named_test_dir)] +check_tests = set(os.listdir(directory) + named_tests) - set(SKIP_FILES) +return list(filter(lambda f: is_python_file(f, directory), check_tests)) Seeing a filter() makes me immensely happy, but I thought that was unpythonic? Reviewed-by: Hanna Reitz
Re: [PATCH v3 05/16] iotests/297: modify is_python_file to work from any CWD
On 16.09.21 06:09, John Snow wrote: Add a directory argument to is_python_file to allow it to work correctly no matter what CWD we happen to run it from. This is done in anticipation of running the iotests from another directory (./python/). “the iotests” or just 297? All of the iotests would sound like an ambitious goal. Signed-off-by: John Snow --- tests/qemu-iotests/297 | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) Reviewed-by: Hanna Reitz
Re: [PATCH v3 04/16] iotests/migrate-bitmaps-test: delint
On 16.09.21 06:09, John Snow wrote: Mostly uninteresting stuff. Move the test injections under a function named main() so that the variables used during that process aren't in the global scope. Signed-off-by: John Snow --- tests/qemu-iotests/tests/migrate-bitmaps-test | 50 +++ 1 file changed, 28 insertions(+), 22 deletions(-) Reviewed-by: Hanna Reitz
Re: [PATCH v3 03/16] iotests/migrate-bitmaps-postcopy-test: declare instance variables
On 16.09.21 06:09, John Snow wrote: Signed-off-by: John Snow --- tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test index 00ebb5c251..9c00ae61c8 100755 --- a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test +++ b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test @@ -115,6 +115,9 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase): self.vm_a_events = [] self.vm_b_events = [] +self.discards1_sha256: str +self.all_discards_sha256: str + def start_postcopy(self): """ Run migration until RESUME event on target. Return this event. """ for i in range(nb_bitmaps): Isn’t this obsolete after e2ad17a62d9da45fbcddc3faa3d6ced519a9453a? Hanna
Re: [PATCH v3 02/16] iotests/mirror-top-perms: Adjust imports
On 16.09.21 06:09, John Snow wrote: We need to import things from the qemu namespace; importing the namespace alone doesn't bring the submodules with it -- unless someone else (like iotests.py) imports them too. Adjust the imports. Signed-off-by: John Snow --- tests/qemu-iotests/tests/mirror-top-perms | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) Reviewed-by: Hanna Reitz
Re: [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects
On Sep 17 09:38, Kevin Wolf wrote: > Am 17.09.2021 um 08:21 hat Klaus Jensen geschrieben: > > On Sep 16 18:30, Klaus Jensen wrote: > > > On Sep 16 14:41, Kevin Wolf wrote: > > > > Am 14.09.2021 um 22:37 hat Klaus Jensen geschrieben: > > > > > From: Klaus Jensen > > > > > > > > > > Hi, > > > > > > > > > > This is an attempt at adressing a bunch of issues that have presented > > > > > themselves since we added subsystem support. It's been brewing for a > > > > > while now. > > > > > > > > > > Fundamentally, I've come to the conclusion that modeling namespaces > > > > > and > > > > > subsystems as "devices" is wrong. They should have been user-creatable > > > > > objects. We've run into multiple issues with wrt. hotplugging due to > > > > > how > > > > > namespaces hook up to the controller with a bus. The bus-based design > > > > > made a lot of sense when we didn't have subsystem support and it > > > > > follows > > > > > the design of hw/scsi. But, the problem here is that the bus-based > > > > > design dictates a one parent relationship, and with shared namespaces, > > > > > that is just not true. If the namespaces are considered to have a > > > > > single > > > > > parent, that parent is the subsystem, not any specific controller. > > > > > > > > > > This series adds a set of experimental user-creatable objects: > > > > > > > > > > -object x-nvme-subsystem > > > > > -object x-nvme-ns-nvm > > > > > -object x-nvme-ns-zoned > > > > > > > > > > It also adds a new controller device (-device x-nvme-ctrl) that > > > > > supports > > > > > these new objects (and gets rid of a bunch of deprecated and confusing > > > > > parameters). This new approach has a bunch of benefits (other than > > > > > just > > > > > fixing the hotplugging issues properly) - we also get support for some > > > > > nice introspection through some new dynamic properties: > > > > > > > > > > (qemu) qom-get /machine/peripheral/nvme-ctrl-1 attached-namespaces > > > > > [ > > > > > "/objects/nvm-1", > > > > > "/objects/zns-1" > > > > > ] > > > > > > > > > > (qemu) qom-list /objects/zns-1 > > > > > type (string) > > > > > subsys (link) > > > > > nsid (uint32) > > > > > uuid (string) > > > > > attached-ctrls (str) > > > > > eui64 (string) > > > > > blockdev (string) > > > > > pi-first (bool) > > > > > pi-type (NvmeProtInfoType) > > > > > extended-lba (bool) > > > > > metadata-size (uint16) > > > > > lba-size (size) > > > > > zone-descriptor-extension-size (size) > > > > > zone-cross-read (bool) > > > > > zone-max-open (uint32) > > > > > zone-capacity (size) > > > > > zone-size (size) > > > > > zone-max-active (uint32) > > > > > > > > > > (qemu) qom-get /objects/zns-1 pi-type > > > > > "none" > > > > > > > > > > (qemu) qom-get /objects/zns-1 eui64 > > > > > "52:54:00:17:67:a0:40:15" > > > > > > > > > > (qemu) qom-get /objects/zns-1 zone-capacity > > > > > 12582912 > > > > > > > > > > Currently, there are no shortcuts, so you have to define the full > > > > > topology to get it up and running. Notice that the topology is > > > > > explicit > > > > > (the 'subsys' and 'attached-ctrls' links). There are no 'nvme-bus' > > > > > anymore. > > > > > > > > > > -object x-nvme-subsystem,id=subsys0,subnqn=foo > > > > > -device x-nvme-ctrl,id=nvme-ctrl-0,serial=foo,subsys=subsys0 > > > > > -device x-nvme-ctrl,id=nvme-ctrl-1,serial=bar,subsys=subsys0 > > > > > -drive id=nvm-1,file=nvm-1.img,format=raw,if=none,discard=unmap > > > > > -object > > > > > x-nvme-ns-nvm,id=nvm-1,blockdev=nvm-1,nsid=1,subsys=subsys0,attached-ctrls=nvme-ctrl-1 > > > > > -drive id=nvm-2,file=nvm-2.img,format=raw,if=none,discard=unmap > > > > > -object > > > > > x-nvme-ns-nvm,id=nvm-2,blockdev=nvm-2,nsid=2,subsys=subsys0,attached-ctrls=nvme-ctrl-0 > > > > > > > > I may be wrong here, but my first gut feeling when seeing this was that > > > > referencing the controller device in the namespace object feels > > > > backwards. Usually, we have objects that are created independently and > > > > then the devices reference them. > > > > > > > > Your need to use a machine_done notifier is probably related to that, > > > > too, because it goes against the normal initialisation order, so you > > > > have to wait. Error handling also isn't really possible in the notifier > > > > any more, so this series seems to just print something to stderr, but > > > > ignore the error otherwise. > > > > > > > > Did you consider passing a list of namespaces to the controller device > > > > instead? > > > > > > > > I guess a problem that you have with both ways is that support for > > > > list options isn't great in QemuOpts, which is still used both for > > > > -object and -device in the system emulator... > > > > > > Heh. Exactly. The ability to better support lists with -object through > > > QAPI is why I did it like this... > > I see. I really need to continue with the QAPIfication work, in
Re: [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects
Am 17.09.2021 um 08:21 hat Klaus Jensen geschrieben: > On Sep 16 18:30, Klaus Jensen wrote: > > On Sep 16 14:41, Kevin Wolf wrote: > > > Am 14.09.2021 um 22:37 hat Klaus Jensen geschrieben: > > > > From: Klaus Jensen > > > > > > > > Hi, > > > > > > > > This is an attempt at adressing a bunch of issues that have presented > > > > themselves since we added subsystem support. It's been brewing for a > > > > while now. > > > > > > > > Fundamentally, I've come to the conclusion that modeling namespaces and > > > > subsystems as "devices" is wrong. They should have been user-creatable > > > > objects. We've run into multiple issues with wrt. hotplugging due to how > > > > namespaces hook up to the controller with a bus. The bus-based design > > > > made a lot of sense when we didn't have subsystem support and it follows > > > > the design of hw/scsi. But, the problem here is that the bus-based > > > > design dictates a one parent relationship, and with shared namespaces, > > > > that is just not true. If the namespaces are considered to have a single > > > > parent, that parent is the subsystem, not any specific controller. > > > > > > > > This series adds a set of experimental user-creatable objects: > > > > > > > > -object x-nvme-subsystem > > > > -object x-nvme-ns-nvm > > > > -object x-nvme-ns-zoned > > > > > > > > It also adds a new controller device (-device x-nvme-ctrl) that supports > > > > these new objects (and gets rid of a bunch of deprecated and confusing > > > > parameters). This new approach has a bunch of benefits (other than just > > > > fixing the hotplugging issues properly) - we also get support for some > > > > nice introspection through some new dynamic properties: > > > > > > > > (qemu) qom-get /machine/peripheral/nvme-ctrl-1 attached-namespaces > > > > [ > > > > "/objects/nvm-1", > > > > "/objects/zns-1" > > > > ] > > > > > > > > (qemu) qom-list /objects/zns-1 > > > > type (string) > > > > subsys (link) > > > > nsid (uint32) > > > > uuid (string) > > > > attached-ctrls (str) > > > > eui64 (string) > > > > blockdev (string) > > > > pi-first (bool) > > > > pi-type (NvmeProtInfoType) > > > > extended-lba (bool) > > > > metadata-size (uint16) > > > > lba-size (size) > > > > zone-descriptor-extension-size (size) > > > > zone-cross-read (bool) > > > > zone-max-open (uint32) > > > > zone-capacity (size) > > > > zone-size (size) > > > > zone-max-active (uint32) > > > > > > > > (qemu) qom-get /objects/zns-1 pi-type > > > > "none" > > > > > > > > (qemu) qom-get /objects/zns-1 eui64 > > > > "52:54:00:17:67:a0:40:15" > > > > > > > > (qemu) qom-get /objects/zns-1 zone-capacity > > > > 12582912 > > > > > > > > Currently, there are no shortcuts, so you have to define the full > > > > topology to get it up and running. Notice that the topology is explicit > > > > (the 'subsys' and 'attached-ctrls' links). There are no 'nvme-bus' > > > > anymore. > > > > > > > > -object x-nvme-subsystem,id=subsys0,subnqn=foo > > > > -device x-nvme-ctrl,id=nvme-ctrl-0,serial=foo,subsys=subsys0 > > > > -device x-nvme-ctrl,id=nvme-ctrl-1,serial=bar,subsys=subsys0 > > > > -drive id=nvm-1,file=nvm-1.img,format=raw,if=none,discard=unmap > > > > -object > > > > x-nvme-ns-nvm,id=nvm-1,blockdev=nvm-1,nsid=1,subsys=subsys0,attached-ctrls=nvme-ctrl-1 > > > > -drive id=nvm-2,file=nvm-2.img,format=raw,if=none,discard=unmap > > > > -object > > > > x-nvme-ns-nvm,id=nvm-2,blockdev=nvm-2,nsid=2,subsys=subsys0,attached-ctrls=nvme-ctrl-0 > > > > > > I may be wrong here, but my first gut feeling when seeing this was that > > > referencing the controller device in the namespace object feels > > > backwards. Usually, we have objects that are created independently and > > > then the devices reference them. > > > > > > Your need to use a machine_done notifier is probably related to that, > > > too, because it goes against the normal initialisation order, so you > > > have to wait. Error handling also isn't really possible in the notifier > > > any more, so this series seems to just print something to stderr, but > > > ignore the error otherwise. > > > > > > Did you consider passing a list of namespaces to the controller device > > > instead? > > > > > > I guess a problem that you have with both ways is that support for > > > list options isn't great in QemuOpts, which is still used both for > > > -object and -device in the system emulator... > > > > Heh. Exactly. The ability to better support lists with -object through > > QAPI is why I did it like this... I see. I really need to continue with the QAPIfication work, in the hope that devices will have the same level of support for lists in the future... QOM really has usable support for lists, it's just the outer layer that doesn't support them well. I wonder how hard (or how stupid) it would be to support JSON syntax without QAPIfying things first. > > Having the list of namespaces on