On 5/3/21 11:02 AM, Max Reitz wrote:
On 30.04.21 23:03, Emanuele Giuseppe Esposito wrote:
On 30/04/2021 13:59, Max Reitz wrote:
On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:
Attaching a gdbserver implies that the QMP socket
should wait indefinitely for an answer from QEMU.
Signed-off-by: Emanuele Giuseppe Esposito <eespo...@redhat.com>
---
python/qemu/machine.py | 3 +++
tests/qemu-iotests/iotests.py | 10 +++++++++-
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 12752142c9..d6142271c2 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -409,6 +409,9 @@ def _launch(self) -> None:
                                            stderr=subprocess.STDOUT,
                                            shell=False,
                                            close_fds=False)
+
+        if 'gdbserver' in self._wrapper:
+            self._qmp_timer = None
Why doesn’t __init__() evaluate this? This doesn’t feel like the
right place for it. If we want to evaluate it here,
self._qmp_timer shouldn’t exist, and instead the timeout should be a
_post_launch() parameter. (Which I would have nothing against, by
the way.)
Uhm... I got a comment on a previous version saying that for the
"event" callbacks a property was better than passing a parameter
around, which I honestly agree with.
I think that comment was in the sense of providing a default value,
which can be expressed by having a property that is set in __init__.
My comment was along the lines that "_post_launch()" behaves as an
event-loop hook and isn't the sort of thing I want to pass parameters to.
It's a private method, so the only possibility for someone passing a
parameter to it is another class method anyway.
We have a hierarchy of things that depend on the Machine class and I
didn't want to start cascading optional parameters into the subclasses.
It was my intent that the information needed to run _post_launch()
correctly should be known by the state of the object -- which I think
should be true anyway.
I don’t have anything against making this a property, but I also don’t
have anything against making it a _post_launch() parameter. I could
even live with both, i.e. set _qmp_timer to 15 in __init__, then have a
_post_launch parameter, and pass either self._qmp_timer or None if
self._wrapper includes 'gdbserver'.
What I do mind is that I don’t understand why the property is modified
here. The value of self._qmp_timer is supposed to be 15 by default and
None if self._wrapper includes 'gdbserver'. It should thus be changed
to None the moment self._wrapper is made to include 'gdbserver'. Because
self._wrapper is set only in __init__, this should happen in __init__.
What should __init__() do? The check here is to see whether the invocation
uses gdb (and, a couple of patches ahead, also valgrind), in order to remove
the timer.
If I understand what you mean, you want something like
def __init__(self, timer):
Oh, no. We can optionally do that perhaps later, but what I meant is
just to put this in __init__() (without adding any parameters to it):
self._qmp_timer = 15.0 if 'gdbserver' not in self._wrapper else None
I think self._qmp_timer should always reflect what timeout we are going
to use when a VM is launched. So if the conditions influencing the
timeout change, it should be updated immediately to reflect this. The
only condition we have right now is the content of self._wrapper, which
is only set in __init__, so self._qmp_timer should be set once in
__init__ and not changed afterwards.
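Concretely, a minimal sketch of what I mean (method bodies abbreviated;
this assumes _post_launch() hands the timer to self._qmp.accept(), as in
my sketch further down):

    def __init__(self, binary, args=(), wrapper=(), **kwargs):
        ...
        self._wrapper = wrapper
        # The wrapper is the only condition influencing the timeout, and
        # it is never changed after __init__, so decide the timeout here:
        self._qmp_timer = 15.0 if 'gdbserver' not in self._wrapper else None

    def _post_launch(self) -> None:
        self._qmp.accept(self._qmp_timer)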
That sounds academic, but imagine what would happen if we had a
set_qmp_timer() method: the timeout could be adjusted, but launch() would
just ignore it and update the property, even though the conditions
influencing the timeout didn’t change between set_qmp_timer() and launch().
Or if we had a get_qmp_timer(): a caller would read a timeout of 15.0
before launch(), even though the timeout is actually going to be None.
Therefore, I think a property should not be updated just before it is
read, but instead when any condition that’s supposed to influence its
value changes.
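(To make that hazard concrete, a hypothetical example; set_qmp_timer() and
get_qmp_timer() do not exist on QEMUMachine, and the gdbserver address is
made up:)

    vm = QEMUMachine(binary, wrapper=['gdbserver', 'localhost:1234'])
    vm.get_qmp_timer()   # would report 15.0 here ...
    vm.launch()          # ... even though launch() actually uses None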
I agree with Max's reasoning here.
I am also not a fan of squishing magic into this class; changing class
behavior based on introspection of wrapper arguments feels like a
layering violation.
Maybe what you want is a subclass or a wrapper class that knows how to
run QEMU using gdbserver, and changes some behaviors accordingly?
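Something along these lines, perhaps (a rough sketch only, not part of this
series; the class name and gdb address are made up, and it assumes the
_qmp_timer attribute this series introduces):

    class GDBServerMachine(QEMUMachine):
        """A QEMUMachine run under gdbserver; debugging sessions can
        block indefinitely, so the QMP accept() timeout is disabled."""

        def __init__(self, binary, gdb_addr='localhost:1234', **kwargs):
            super().__init__(binary, wrapper=['gdbserver', gdb_addr],
                             **kwargs)
            self._qmp_timer = None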
The factoring of Machine is quite bad already, admittedly, and is in
need of a good spit-shine. Too many init parameters, too many state
variables, too many methods that got patched in to support one specific
use-case at one point or another. At a certain point, I begin to worry
about how it's possible to audit how all of these one-off features
behave and interact. It's getting complex.
Is it time to dream up a refactoring for how the Machine class behaves?
I suggested making it a parameter because a property that is updated right
when it is read sounds like it should be a parameter instead. I.e., one
would say:
    def __init__(self):
        self._qmp_timeout_default = 15.0

    def _post_launch(self, qmp_timeout):
        self._qmp.accept(qmp_timeout)

    def launch(self):
        ...
        qmp_timeout = None if 'gdbserver' in self._wrapper \
            else self._qmp_timeout_default
        self._post_launch(qmp_timeout)
Which is basically the structure your patch has, which gave me the idea.
[...]
        self._post_launch()

    def _early_cleanup(self) -> None:
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 05d0dc0751..380527245e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
[...]
@@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
             output_list += [key + '=' + obj[key]]
         return ','.join(output_list)
+    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
+        if qemu_gdb:
+            wait = 0.0
[...]
Second, I don’t understand this. If the caller wants to block
waiting on an event, then that should have nothing to do with whether
we have gdb running or not. As far as I understand, setting wait to
0.0 is the same as wait = False, i.e. we don’t block and just return
None immediately if there is no pending event.
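(For reference, my understanding of the wait semantics in the qemu.qmp code
of this era, where wait may be a bool or a float timeout in seconds:)

    vm.get_qmp_events(wait=False)  # poll: return pending events, never block
    vm.get_qmp_events(wait=0.0)    # falsy, so effectively the same as False
    vm.get_qmp_events(wait=True)   # block without timeout until an event
    vm.get_qmp_events(wait=30.0)   # block up to 30 s, then QMPTimeoutError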
You're right, this might not be needed here. The problem I had was
that calling gdb and pausing at a breakpoint or something for a while
would make the QMP socket time out, thus aborting the whole test. In
order to avoid that, I need to stop or delay the timers.
I can't remember why I added this check here. At some point I am sure
the test was failing because of socket timeout expiration, but I
cannot reproduce the problem when I comment out this check in
get_qmp_events. The other check in patch 3 should be enough.
Hm, ok. I’d guessed that you intended the wait=0.0 or wait=False to
mean that we get an infinite timeout (i.e., no timeout), but that’s
exactly why I didn’t get it. wait=0.0 doesn’t give an infinite timeout,
but instead basically times out immediately.
Max
Well, I suppose if we don't need it, then that makes things easier too :)