On Mon, 26 Jan 2026 12:29:32 +0100 Mauro Carvalho Chehab <[email protected]> wrote:
> On Mon, 26 Jan 2026 12:23:30 +0100 > Mauro Carvalho Chehab <[email protected]> (by way of Mauro Carvalho > Chehab <[email protected]>) wrote: > > > On Fri, 23 Jan 2026 16:16:03 +0000 > > Jonathan Cameron via qemu development <[email protected]> wrote: > > > > > > > > > > > > > + for i in range(0, attempts): > > > > > > + try: > > > > > > + obj = self.qmp_monitor.cmd_obj(msg) > > > > > > + > > > > > > + if obj and "return" in obj and not obj["return"]: > > > > > > + break > > > > > > + > > > > > > + except Exception as e: # pylint: > > > > > > disable=W0718 > > > > > > + print(f"Command: {command}") > > > > > > + print(f"Failed to inject error: {e}.") > > > > > > + obj = None > > > > > > + > > > > > > + if attempts > 1: > > > > > > + print(f"Error inject attempt {i + 1}/{attempts} > > > > > > failed.") > > > > > > + > > > > > > + if i + 1 < attempts: > > > > > > + sleep(0.1) > > > > > > > > ... and here, we sleep for 0.1 seconds. > > > > > > > > > > > > > > Do we care about a sleep at the end? Feels like a micro optimization > > > > > that > > > > > isn't needed. > > > > > > > > This is not a micro-optimization. It is more to ensure that we won't > > > > respin it too fast. > > > > > > > > What happens is that QMP interface asks the BIOS to send an async > > > > message to OSPM, cleaning an ack register. When the OSPM reads the > > > > error, it writes 1 to the ack register. > > > > > > > > If we send messages too fast, the logic at ghes.c will detect that > > > > the ack didn't happen, imediately returning an errocr code. > > > > > > > > On such case, we sleep for 100ms before trying again. > > > I was suggesting the opposite. Just sleep one more time at the end > > > before timing out. > > > So instead of > > > if i + 1 < attempts > > > sleep(0.1) > > > > > > simply > > > sleep(0.1) > > > > If one writes an external loop calling fuzzy with different parameters, > > like: > > > > for i in $(seq 1 360000); do > > scripts/ghes_inject.py fuzzy -T proc-arm; > > scripts/ghes_inject.py fuzzy -T firmware-error; > > done > > > > The extra unneeded would sleep waste 10 hours doing nothing. True if it fails every time, which you were suggesting was very rare. Anyhow I really don't mind that much, just seemed like a tiny bit over engineered for a rare case. > > Btw, the same applies when using the -c parameter: > > scripts/ghes_inject.py fuzzy -T proc-arm -c 360000 > > The goal here is to optimize in a way that we could one day have a > CI running lots of tests in a reasonable time to detect regressions > at QEMU + Linux Kernel + rasdaemon. > > So, we don't want unneeded delays. We only need to sleep if a > retry attempt failed and it will be retrying again. > > Regards, >
