On 8/29/24 2:56 PM, Markus Armbruster wrote:
Daniil Tatianin <d-tatia...@yandex-team.ru> writes:

The "reconnect" option only allows to specify the time in seconds,
which is way too long for certain workflows.

We have a lightweight disk backend server, which takes about 20ms to
live update, but due to this limitation in QEMU, previously the guest
disk controller would hang for one second because it would take this
long for QEMU to reinitialize the socket connection.

Make it possible to specify a smaller timeout by treating the value in
"reconnect" as milliseconds via the new "reconnect-is-ms" option.

Signed-off-by: Daniil Tatianin <d-tatia...@yandex-team.ru>
Your use case demonstrates that a granularity of seconds was the wrong
choice for the reconnection delay.

[...]

diff --git a/qapi/char.json b/qapi/char.json
index ef58445cee..61aeccf09d 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -272,8 +272,13 @@
# (default: false) (Since: 3.1)
#
# @reconnect: For a client socket, if a socket is disconnected, then
-# attempt a reconnect after the given number of seconds. Setting
-# this to zero disables this function. (default: 0) (Since: 2.2)
+# attempt a reconnect after the given number of seconds (unless
+# @reconnect-is-ms is set to true, in that case the number is
+# treated as milliseconds). Setting this to zero disables
+# this function. (default: 0) (Since: 2.2)
+#
+# @reconnect-is-ms: The value specified in @reconnect should be treated
+# as milliseconds. (default: false) (Since: 9.2)
#
# Since: 1.4
##
@@ -287,7 +292,8 @@
'*telnet': 'bool',
'*tn3270': 'bool',
'*websocket': 'bool',
- '*reconnect': 'int' },
+ '*reconnect': 'int',
+ '*reconnect-is-ms': 'bool' },
'base': 'ChardevCommon' }
##
I don't like this interface.

Indeed. After discussing this more internally we have actually settled on one of the alternatives you have mentioned below.

PRO: compatible extension; no management application updates needed
unless they want to support sub-second delays.

CON: specifying a sub-second delay takes two parameters, which is
awkward.

CON: trap in combination with -set. Before the patch, something like
-set chardev.ID.reconnect=N means N seconds no matter what.
Afterwards, it depends on the value of reconnect-is-ms, which may be
set far away. Mitigating factor: -set is obscure.

Alternatives:

1. Change @reconnect to 'number', specify sub-second delays as
fractions.

PRO: compatible extension; no management application updates needed
unless they want to support sub-second delays.

CON: first use of floating-point for time in seconds in QAPI, as far
as I can see.

CON: QemuOpts can't do floating-point.

2. Deprecate @reconnect in favour of a new member with a more suitable
unit. Error if both are present.

PRO: after @reconnect is gone, the interface is what it arguably
should've been from the start.

CON: incompatible change. Management application updates needed
within the deprecation grace period.

This is the one we decided to go with. We simply added a new "reconnect-ms" option, which doesn't replace the old one for backwards compatibility, but also disallows both to be specified at the same time, making them mutually exclusive. I think deprecation would be the way to go here.


Let's get additional input from management application developers. I
cc'ed some.

Sure, sounds great. Thanks!


Related: NetdevSocketOptions member @reconnect.


Reply via email to