On Fri, 12 May 2023 12:25:41 +0200
Jan Kiszka <[email protected]> wrote:

> On 11.05.23 14:07, 'Henning Schild' via EFI Boot Guard wrote:
> > Hi,
> > 
> > i did not receive this one directly so the reply might look weird.
> >   
> >> Okay, the story is this:
> >>
> >> In my case there are two different devices. You can think of them
> >> like Ubuntu and the Ubuntu installer. (They are not similar to
> >> Ubuntu; I only want to show the hierarchy between the images.)
> >>
> >>
> >> These two images are using efibootguard as bootloader.
> >>
> >> In the installer image there is a program that checks for a
> >> watchdog on the device.
> >>
> >> I want to inform the user whether any watchdog is available to
> >> use.
> >>
> >> If there is no watchdog to use, the user will already have seen a
> >> warning like "WARNING: Cannot probe watchdog".
> >>
> >> If the user wants to continue without the watchdog feature, the
> >> installation will proceed without a watchdog.
> >>
> >> The image (the one that is not the installer) will set the
> >> watchdog timeout and will also set a variable at runtime (that is
> >> what I had in mind), watchdog_may_fail = 1, so that even if a
> >> timeout is set for an unsupported watchdog, it will not work.
> >>
> >> In this case we also need to get the "watchdog probed or not"
> >> info from userspace, because the installer will inform the user
> >> about the watchdog probe.
> >>
> >> On the installer side I need to learn whether a watchdog was
> >> probed, to inform the user.
> >>
> >> On the image (non-installer) side I need to set
> >> watchdog_may_fail = 1 because I'll put the image into the
> >> installer with the watchdog timeout set by default.
> > 
> > Thanks for the detailed explanation.
> > 
> > So you have an installer that flashes and also modifies/configures
> > an image. And you want to enable the whole watchdog chain if
> > possible, but want to still have a bootable device if that device
> > would not yet be supported by ebg, but by the linux-kernel.
> > 
> > Maybe a "manual retry run via installer" would be the best option.
> > So you flash the image and if it looks like watchdog support could
> > work you ask the user whether they want to enable that and modify
> > that just flashed image. If that does not work the user installs
> > again and answers a watchdog question with "please disable". That
> > would require potentially installing twice but would give users
> > very clear feedback on which of the devices the watchdog feature
> > works and on which it did not. All without having to change ebg,
> > but only change "a postinstall function".
> > 
> > installer code:
> >   if ! exists /dev/watchdog*:
> >     warn "watchdog not supported on this device"
> >   else:
> >     enable_watchdog = user_dialog("enable watchdog?", default=true)
> >     if enable_watchdog:
> >       user_info("if that install fails to enable the watchdog in the
> >                 bootloader, please install again with watchdog
> >                 feature disabled")
> >       set bg_env timeout 60 (default was 0)
> >   
> > You could also have a "enable/disable watchdog on previous install"
> > button in the installer user interaction. So people could change
> > that setting without having to re-install/flash.
> > 
> > But all that said, I still think it might be a nice feature for ebg
> > to not be so strict about that watchdog. For generic images that
> > should work anywhere, that strictness is not acceptable. There,
> > using the watchdog where available is a nice mitigation of a
> > certain, arguably not too high, risk, because only some classes of
> > update errors require watchdog support in the bootloader, while
> > many other classes of errors can be handled by the Linux kernel and
> > its watchdog support just fine.
> 
> That's why we support "timeout=0" - you can turn this feature off if
> you take the risk. But it makes little sense to have a "best effort"
> watchdog, even more as there is no easy way to spot that, at least so
> far. It will only create more problems than it can solve.

I am currently writing a new watchdog driver for IPMI watchdogs, found
e.g. in Supermicro and other server-class machines.

One needs to send two commands to a controller and, while doing so,
wait several times for certain state transitions. At the moment I have
implemented the waiting as busy-waiting, which would turn into an
endless loop should the controller enter an error state for some
reason. So far I have never seen any errors, but they could potentially
happen at any time, especially since such an IPMI BMC could be used by
others at the same time and there might be races.

So I will have to implement error handling and, with it, some retry
logic (for sporadic errors). But the retries will also have to stop
eventually, and I will have to give up and decide how to continue (for
temporary or permanent errors). At that point I would likely just
signal a warning and continue booting, simply because I have no clue
whether I am dealing with a temporary error (driver does not work this
time, even with retry logic) or a permanent one (driver never worked on
that machine).
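
To make the "bounded wait plus retry, then warn and continue" idea a
bit more concrete, here is a rough sketch in Python. It is not EBG
code (the real driver would be C in the EFI environment) and only
models the control flow. All names (wait_for_state, arm_with_retry,
WatchdogTimeout) and the default timeout and retry values are made up
for illustration.

```python
import time


class WatchdogTimeout(Exception):
    """Raised when the controller does not reach the expected state in time."""


def wait_for_state(read_state, wanted, timeout_s=1.0, poll_s=0.01,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll read_state() until it returns `wanted` or the deadline passes.

    This replaces an unbounded busy-wait: a stuck controller ends in an
    exception instead of an endless loop.
    """
    deadline = clock() + timeout_s
    while True:
        if read_state() == wanted:
            return
        if clock() >= deadline:
            raise WatchdogTimeout(f"state {wanted!r} not reached")
        sleep(poll_s)


def arm_with_retry(arm_once, retries=3, warn=print):
    """Call arm_once() up to `retries` times.

    Returns True on success. After the last failed attempt it emits a
    final warning and returns False so the caller can continue booting.
    """
    for attempt in range(1, retries + 1):
        try:
            arm_once()
            return True
        except WatchdogTimeout as err:
            warn(f"watchdog arm attempt {attempt}/{retries} failed: {err}")
    warn("WARNING: could not arm watchdog, continuing boot without it")
    return False
```

Here arm_once would be the hypothetical function that sends the two
commands and calls wait_for_state between them. The sketch cannot tell
a temporary error from a permanent one either; it only guarantees that
the boot path terminates.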

Suggestions on how to deal with that are welcome. As I said, I will
implement a "best effort with retry and eventual warning" and do my
best to test that.

At that point we would have one watchdog driver that could itself fail
to arm the watchdog, and we could think about how such problems could
be signaled to the OS.
But this would likely be patches on top. And "we have a driver that
failed (this time?)" is still something else than "we do not have a
driver at all".
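
Whatever the signaling mechanism ends up being, the distinction could
be modeled as a small set of states. Again just a Python sketch with
invented names (WatchdogStatus, classify); nothing like this exists in
ebg today, and how the value would reach the OS is left open.

```python
from enum import Enum


class WatchdogStatus(Enum):
    DISABLED = "disabled"            # timeout=0, user opted out
    ARMED = "armed"                  # a driver matched and armed the watchdog
    DRIVER_FAILED = "driver-failed"  # a driver matched but could not arm
    NO_DRIVER = "no-driver"          # no driver matched this machine


def classify(timeout, driver_matched, armed):
    """Map the boot-time outcome to one of the states above."""
    if timeout == 0:
        return WatchdogStatus.DISABLED
    if not driver_matched:
        return WatchdogStatus.NO_DRIVER
    return WatchdogStatus.ARMED if armed else WatchdogStatus.DRIVER_FAILED
```

An installer or update tool could then treat "driver-failed" as "maybe
retry on next boot" and "no-driver" as "this hardware is not
supported".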

Henning

> Jan
> 
> > 
> > So one could add a new entry to that env that would allow booting
> > without a watchdog even if a timeout is configured (as a wish but
> > not a need). A user interface to give people feedback could be an
> > on-screen warning and a delay. But well ... on embedded there would
> > be no screen or nobody to look at it.
> > 
> > Henning
> >   
> 
