Commit 2155d2dd introduced rate limiting for BLOCK_IO_ERROR to emit an
event only once a second. This makes sense for cases in which the guest
keeps running and can submit more requests that would possibly also fail
because there is a problem with the backend.

However, if the error policy is configured to stop the VM on errors,
rate limiting is both unnecessary and harmful. It is unnecessary because
a stopped guest can't issue more requests. It is harmful because
stopping the VM is an important state change that management tools need
to keep track of, even if it happens more than once in a given second.
If an event is dropped, the management tool sees the VM going to paused
state without an associated error, so it has a hard time deciding how to
handle the situation.

This patch disables rate limiting for action=stop by treating every
BLOCK_IO_ERROR with action=stop as a distinct event that never compares
equal to another one. If the error is reported to the guest or ignored,
the rate limiting stays in place.

Fixes: 2155d2dd7f73 ('block-backend: per-device throttling of BLOCK_IO_ERROR reports')
Signed-off-by: Kevin Wolf <[email protected]>
---
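As an illustration only, the comparison rule described in the commit
message can be modelled outside of QEMU roughly as follows. This is a
minimal sketch: SketchEvent and sketch_throttle_equal() are hypothetical
names, events are reduced to three plain strings instead of QDicts, and
only the first argument's action is inspected, as in the hunk below.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a queued event; not QEMU's real type. */
typedef struct {
    const char *event;    /* event name, e.g. "BLOCK_IO_ERROR" */
    const char *qom_path; /* device that raised the event */
    const char *action;   /* "ignore", "report" or "stop" */
} SketchEvent;

/* Returns true if the two events may be merged by the rate limiter. */
static bool sketch_throttle_equal(const SketchEvent *a, const SketchEvent *b)
{
    assert(a && b);

    if (strcmp(a->event, b->event) != 0) {
        return false;
    }

    if (strcmp(a->event, "BLOCK_IO_ERROR") == 0) {
        /* action=stop events never compare equal, so none is dropped */
        if (strcmp(a->action, "stop") == 0) {
            return false;
        }
        /* other I/O errors are throttled per device */
        return strcmp(a->qom_path, b->qom_path) == 0;
    }

    return true;
}
```

With this model, two "report" errors from the same device compare equal
and can be merged, while two "stop" errors from the same device do not.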
 qapi/block-core.json |  2 +-
 monitor/monitor.c    | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index b82af742561..4118d884f46 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5789,7 +5789,7 @@
 # .. note:: If action is "stop", a `STOP` event will eventually follow
 #    the `BLOCK_IO_ERROR` event.
 #
-# .. note:: This event is rate-limited.
+# .. note:: This event is rate-limited, except if action is "stop".
 #
 # Since: 0.13
 #
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 1273eb72605..93bd2b93e65 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -525,6 +525,18 @@ static gboolean qapi_event_throttle_equal(const void *a, const void *b)
                        qdict_get_str(evb->data, "node-name"));
     }
 
+    /*
+     * If the VM is stopped after an I/O error, this is important information
+     * for the management tool to keep track of the state of QEMU and we can't
+     * merge any events. At the same time, stopping the VM means that the guest
+     * can't send additional requests and the number of events is already
+     * limited, so we can do without rate limiting.
+     */
+    if (eva->event == QAPI_EVENT_BLOCK_IO_ERROR &&
+        !strcmp(qdict_get_str(eva->data, "action"), "stop")) {
+        return FALSE;
+    }
+
     if (eva->event == QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE ||
         eva->event == QAPI_EVENT_BLOCK_IO_ERROR) {
         return !strcmp(qdict_get_str(eva->data, "qom-path"),
-- 
2.53.0

