On Mon, 26 Jan 2026 12:29:32 +0100
Mauro Carvalho Chehab <[email protected]> wrote:

> On Mon, 26 Jan 2026 12:23:30 +0100
> Mauro Carvalho Chehab <[email protected]> (by way of Mauro Carvalho 
> Chehab <[email protected]>) wrote:
> 
> > On Fri, 23 Jan 2026 16:16:03 +0000
> > Jonathan Cameron via qemu development <[email protected]> wrote:
> >   
> > > > >       
> > > > > > +        for i in range(0, attempts):
> > > > > > +            try:
> > > > > > +                obj = self.qmp_monitor.cmd_obj(msg)
> > > > > > +
> > > > > > +                if obj and "return" in obj and not obj["return"]:
> > > > > > +                    break
> > > > > > +
> > > > > > +            except Exception as e:                     # pylint: 
> > > > > > disable=W0718
> > > > > > +                print(f"Command: {command}")
> > > > > > +                print(f"Failed to inject error: {e}.")
> > > > > > +                obj = None
> > > > > > +
> > > > > > +            if attempts > 1:
> > > > > > +                print(f"Error inject attempt {i + 1}/{attempts} 
> > > > > > failed.")
> > > > > > +
> > > > > > +            if i + 1 < attempts:
> > > > > > +                sleep(0.1)      
> > > > 
> > > > ... and here, we sleep for 0.1 seconds.
> > > >     
> > > > > 
> > > > > Do we care about a sleep at the end?  Feels like a micro optimization 
> > > > > that
> > > > > isn't needed.      
> > > > 
> > > > This is not a micro-optimization. It is more to ensure that we won't
> > > > respin it too fast.
> > > > 
> > > > What happens is that QMP interface asks the BIOS to send an async
> > > > message to OSPM, cleaning an ack register. When the OSPM reads the
> > > > error, it writes 1 to the ack register.
> > > > 
> > > > If we send messages too fast, the logic at ghes.c will detect that
> > > > the ack didn't happen, imediately returning an errocr code.
> > > > 
> > > > On such case, we sleep for 100ms before trying again.    
> > > I was suggesting the opposite.  Just sleep one more time at the end
> > > before timing out.
> > > So instead of
> > >   if i + 1 < attempts
> > >           sleep(0.1)
> > > 
> > > simply
> > >   sleep(0.1)  
> > 
> > If one writes an external loop calling fuzzy with different parameters,
> > like:
> > 
> >     for i in $(seq 1 360000); do
> >             scripts/ghes_inject.py fuzzy -T proc-arm;
> >             scripts/ghes_inject.py fuzzy -T firmware-error;
> >         done
> > 
> > The extra unneeded would sleep waste 10 hours doing nothing.

True if it fails every time, which you were suggesting was very rare. 

Anyhow I really don't mind that much, just seemed like a tiny
bit over engineered for a rare case. 
 
> 
> Btw, the same applies when using the -c parameter:
> 
>              scripts/ghes_inject.py fuzzy -T proc-arm -c 360000
> 
> The goal here is to optimize in a way that we could one day have a
> CI running lots of tests in a reasonable time to detect regressions
> at QEMU + Linux Kernel + rasdaemon.
> 
> So, we don't want unneeded delays. We only need to sleep if a
> retry attempt failed and it will be retrying again.
> 
> Regards,
> 


Reply via email to