__send_to_port() busy-waits in virtqueue_get_buf() while holding
outvq_lock with IRQs disabled. If the host stops draining the TX
virtqueue, this loop never terminates.

This was observed during secondary VM boot: virtio_mem plugged memory
in multiple iterations, each emitting dev_info() messages through the
hvc console. A writev() on the hvc TTY entered __send_to_port() and
stalled in the spin loop. When the watchdog bark ISR fired on another
CPU, it attempted printk(), which tried to acquire outvq_lock through
the same path and spun indefinitely. With all CPUs stuck, the watchdog
could not be serviced and triggered a bite.

Add a 200 ms deadline using ktime_get_mono_fast_ns() to bound the spin
loop. ktime_get_mono_fast_ns() is NMI-safe and takes no sleeping or
spinning locks, so it is safe to call with IRQs disabled and spinlocks
held.

The 200 ms value is chosen to be far above normal host response latency
(microseconds) to avoid spurious exits, yet well below the watchdog
bark-to-bite window (typically 3 s) so that CPUs can escape the loop
and complete the bark handler before a bite occurs.
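
For illustration only, the deadline-bounded spin can be sketched in
user space. This is not the driver code: mono_ns() stands in for
ktime_get_mono_fast_ns() using clock_gettime(CLOCK_MONOTONIC), and
spin_until(), never_done() and countdown_done() are hypothetical names
introduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_MSEC 1000000ULL
#define NSEC_PER_SEC  1000000000ULL

/* Assumption: user-space stand-in for ktime_get_mono_fast_ns(). */
static uint64_t mono_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: poll done(arg) until it returns true or until
 * timeout_ms has elapsed. Returns true if done() succeeded, false if
 * the deadline expired first.
 */
static bool spin_until(bool (*done)(void *), void *arg, uint64_t timeout_ms)
{
	uint64_t deadline = mono_ns() + timeout_ms * NSEC_PER_MSEC;

	while (!done(arg)) {
		if (mono_ns() >= deadline)
			return false;
	}
	return true;
}

/* Models a host that never drains the queue. */
static bool never_done(void *arg)
{
	(void)arg;
	return false;
}

/* Models a host that completes after a few polls. */
static bool countdown_done(void *arg)
{
	int *remaining = arg;

	return --(*remaining) <= 0;
}
```

With never_done() the helper returns false once the budget has elapsed
instead of spinning forever; with countdown_done() it returns true well
before the deadline, which mirrors the normal microsecond-latency case.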

Signed-off-by: Peng Yang <[email protected]>
---
 drivers/char/virtio_console.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 9a33217c68d9..b3535681dfe1 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -27,6 +27,7 @@
 #include <linux/module.h>
 #include <linux/dma-mapping.h>
 #include <linux/string_choices.h>
+#include <linux/timekeeping.h>
 #include "../tty/hvc/hvc_console.h"
 
 #define is_rproc_enabled IS_ENABLED(CONFIG_REMOTEPROC)
@@ -601,6 +602,7 @@ static ssize_t __send_to_port(struct port *port, struct scatterlist *sg,
        int err;
        unsigned long flags;
        unsigned int len;
+       u64 deadline;
 
        out_vq = port->out_vq;
 
@@ -632,10 +634,18 @@ static ssize_t __send_to_port(struct port *port, struct scatterlist *sg,
         * buffer and relax the spinning requirement.  The downside is
         * we need to kmalloc a GFP_ATOMIC buffer each time the
         * console driver writes something out.
+        *
+        * To avoid spinning forever if the host stops processing the
+        * TX virtqueue (e.g. during VM shutdown), a 200ms deadline is
+        * used to break out of the loop as a fallback.
         */
-       while (!virtqueue_get_buf(out_vq, &len)
-               && !virtqueue_is_broken(out_vq))
+       deadline = ktime_get_mono_fast_ns() + 200ULL * NSEC_PER_MSEC;
+       while (!virtqueue_get_buf(out_vq, &len) &&
+              !virtqueue_is_broken(out_vq)) {
+               if (ktime_get_mono_fast_ns() >= deadline)
+                       break;
                cpu_relax();
+       }
 done:
        spin_unlock_irqrestore(&port->outvq_lock, flags);
 

---
base-commit: 97e797263a5e963da3d1e66e743fd518567dfe37
change-id: 20260420-add_timeout_to___send_to_port-104ce7bcf241

Best regards,
--  
Peng Yang <[email protected]>

