On Mon, 26 Jan 2026 12:23:30 +0100
Mauro Carvalho Chehab <[email protected]> (by way of Mauro Carvalho 
Chehab <[email protected]>) wrote:

> On Fri, 23 Jan 2026 16:16:03 +0000
> Jonathan Cameron via qemu development <[email protected]> wrote:
> 
> > > >     
> > > > > +        for i in range(0, attempts):
> > > > > +            try:
> > > > > +                obj = self.qmp_monitor.cmd_obj(msg)
> > > > > +
> > > > > +                if obj and "return" in obj and not obj["return"]:
> > > > > +                    break
> > > > > +
> > > > > +            except Exception as e:                     # pylint: disable=W0718
> > > > > +                print(f"Command: {command}")
> > > > > +                print(f"Failed to inject error: {e}.")
> > > > > +                obj = None
> > > > > +
> > > > > +            if attempts > 1:
> > > > > +                print(f"Error inject attempt {i + 1}/{attempts} failed.")
> > > > > +
> > > > > +            if i + 1 < attempts:
> > > > > +                sleep(0.1)    
> > > 
> > > ... and here, we sleep for 0.1 seconds.
> > >   
> > > > 
> > > > Do we care about a sleep at the end?  Feels like a micro optimization
> > > > that isn't needed.
> > > 
> > > This is not a micro-optimization. It is more to ensure that we won't
> > > respin it too fast.
> > > 
> > > What happens is that the QMP interface asks the BIOS to send an async
> > > message to OSPM, clearing an ack register. When the OSPM reads the
> > > error, it writes 1 to the ack register.
> > > 
> > > If we send messages too fast, the logic at ghes.c will detect that
> > > the ack didn't happen, immediately returning an error code.
> > > 
> > > In such a case, we sleep for 100ms before trying again.
> > I was suggesting the opposite.  Just sleep one more time at the end
> > before timing out.
> > So instead of
> >     if i + 1 < attempts:
> >         sleep(0.1)
> > 
> > simply
> >     sleep(0.1)
> 
> If one writes an external loop calling fuzzy with different parameters,
> like:
> 
>       for i in $(seq 1 360000); do
>           scripts/ghes_inject.py fuzzy -T proc-arm;
>           scripts/ghes_inject.py fuzzy -T firmware-error;
>       done
> 
> The extra unneeded sleep would waste 10 hours doing nothing.

Btw, the same applies when using the -c parameter:

             scripts/ghes_inject.py fuzzy -T proc-arm -c 360000
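
As a rough check of that number for the -c case, a small sketch.
It assumes every run ends with exactly one trailing sleep(0.1) that
is not needed; wasted_hours() is a name made up for this example:

```python
def wasted_hours(runs, sleep_ms=100):
    """Hours spent if every run ends with one unneeded sleep.

    sleep_ms=100 mirrors the sleep(0.1) in the patch; the assumption
    that each run pays exactly one trailing sleep is ours.
    """
    return runs * sleep_ms / 1000 / 3600


# 360000 runs, as in the -c example above
print(wasted_hours(360000))
```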

The goal here is to keep this fast enough that one day we could have
a CI running lots of tests in a reasonable time to detect regressions
in QEMU + Linux kernel + rasdaemon.

So, we don't want unneeded delays. We only need to sleep when an
attempt failed and another retry will follow.
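
To illustrate, here is a minimal standalone sketch of that retry
pattern. It is not the code from the patch: send_with_retries(),
send and sleep_fn are hypothetical names used only here, with send
standing in for the self.qmp_monitor.cmd_obj(msg) call:

```python
from time import sleep


def send_with_retries(send, attempts=3, delay=0.1, sleep_fn=sleep):
    """Call send() up to `attempts` times; return True on success.

    send and sleep_fn are hypothetical hooks for this sketch. In the
    patch, the call is self.qmp_monitor.cmd_obj(msg).
    """
    for i in range(attempts):
        try:
            obj = send()
            # Same success check as the patch: a "return" key whose
            # value is empty.
            if obj and "return" in obj and not obj["return"]:
                return True
        except Exception as e:  # pylint: disable=W0718
            print(f"Failed to inject error: {e}.")

        # Sleep only when another attempt will follow, so neither a
        # success nor the last failure pays the delay.
        if i + 1 < attempts:
            sleep_fn(delay)

    return False
```

With attempts=3, a command that always fails sleeps twice, and a
command that succeeds at the first attempt never sleeps.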

Regards,
