On Mon, Nov 08, 2021 at 09:09:20AM -0700, Simon Glass wrote:
> Hi Tom,
> 
> On Mon, 8 Nov 2021 at 09:05, Tom Rini <tr...@konsulko.com> wrote:
> >
> > On Mon, Nov 08, 2021 at 08:58:33AM -0700, Simon Glass wrote:
> > > Hi,
> > >
> > > On Sun, 7 Nov 2021 at 04:18, Heinrich Schuchardt
> > > <heinrich.schucha...@canonical.com> wrote:
> > > >
> > > >
> > > >
> > > > On 11/6/21 14:53, Tom Rini wrote:
> > > > > On Sat, Nov 06, 2021 at 04:55:44AM +0100, Heinrich Schuchardt wrote:
> > > > >>
> > > > >>
> > > > >> On 11/6/21 02:52, Andre Przywara wrote:
> > > > >>> On Fri, 5 Nov 2021 18:56:34 -0400
> > > > >>> Tom Rini <tr...@konsulko.com> wrote:
> > > > >>>
> > > > >>>> On Fri, Nov 05, 2021 at 09:38:50PM +0100, Heinrich Schuchardt 
> > > > >>>> wrote:
> > > > >>>>> On 11/5/21 20:17, Tom Rini wrote:
> > > > >>>>>> On Fri, Nov 05, 2021 at 07:37:02PM +0100, Heinrich Schuchardt 
> > > > >>>>>> wrote:
> > > > >>>>>>> On 11/5/21 17:12, Simon Glass wrote:
> > > > >>>>>>>> Hi,
> > > > >>>>>>>>
> > > > >>>>>>>> On Fri, 5 Nov 2021 at 08:21, Tom Rini <tr...@konsulko.com> 
> > > > >>>>>>>> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Fri, Nov 05, 2021 at 12:14:47PM +0100, Stefan Roese wrote:
> > > > >>>>>>>>>> Hi Andre,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Added Tom to Cc.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On 05.11.21 11:04, Andre Przywara wrote:
> > > > >>>>>>>>>>> On Thu, 4 Nov 2021 20:02:41 -0600
> > > > >>>>>>>>>>> Simon Glass <s...@chromium.org> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>> On Thu, 4 Nov 2021 at 19:22, Stefan Roese <s...@denx.de> 
> > > > >>>>>>>>>>>> wrote:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Hi Andre,
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On 05.11.21 00:11, Andre Przywara wrote:
> > > > >>>>>>>>>>>>>> On Thu, 4 Nov 2021 11:37:57 +0100
> > > > >>>>>>>>>>>>>> Stefan Roese <s...@denx.de> wrote:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Hi Stefan,
> > > > >>>>>>>>>>>>>>> On 04.11.21 04:55, Samuel Holland wrote:
> > > > >>>>>>>>>>>>>>>> This series hooks up the watchdog uclass to 
> > > > >>>>>>>>>>>>>>>> automatically register
> > > > >>>>>>>>>>>>>>>> watchdog devices for use with sysreset, doing a bit of 
> > > > >>>>>>>>>>>>>>>> minor cleanup
> > > > >>>>>>>>>>>>>>>> along the way.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> The goal is for this to replace the sunxi board-level 
> > > > >>>>>>>>>>>>>>>> non-DM reset_cpu()
> > > > >>>>>>>>>>>>>>>> function. I was surprised to find that the wdt_reboot 
> > > > >>>>>>>>>>>>>>>> driver requires
> > > > >>>>>>>>>>>>>>>> its own undocumented device tree node, which 
> > > > >>>>>>>>>>>>>>>> references the watchdog
> > > > >>>>>>>>>>>>>>>> device by phandle. This is problematic for us, because 
> > > > >>>>>>>>>>>>>>>> sunxi-u-boot.dtsi
> > > > >>>>>>>>>>>>>>>> file covers 20 different SoCs with varying watchdog 
> > > > >>>>>>>>>>>>>>>> node phandle names.
> > > > >>>>>>>>>>>>>>>> So it would have required adding a -u-boot.dtsi file 
> > > > >>>>>>>>>>>>>>>> for each board.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Hooking things up automatically makes sense to me; 
> > > > >>>>>>>>>>>>>>>> this is what Linux
> > > > >>>>>>>>>>>>>>>> does. However, I put the code behind a new option to 
> > > > >>>>>>>>>>>>>>>> avoid surprises for
> > > > >>>>>>>>>>>>>>>> other platforms.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Changes in v3:
> > > > >>>>>>>>>>>>>>>>         - Move condition to wdt-uclass.c to fix build 
> > > > >>>>>>>>>>>>>>>> errors.
> > > > >>>>>>>>>>>>>>>>         - Include watchdog name in error message.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Changes in v2:
> > > > >>>>>>>>>>>>>>>>         - Extend the "if SYSRESET" block to the end of 
> > > > >>>>>>>>>>>>>>>> the file.
> > > > >>>>>>>>>>>>>>>>         - Also make gpio_reboot_probe function static.
> > > > >>>>>>>>>>>>>>>>         - Rebase on top of 492ee6b8d0e7 (now handle 
> > > > >>>>>>>>>>>>>>>> all watchdogs).
> > > > >>>>>>>>>>>>>>>>         - Added patches 5-6 as an example of how the 
> > > > >>>>>>>>>>>>>>>> new option will be used.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Samuel Holland (6):
> > > > >>>>>>>>>>>>>>>>          sysreset: Add uclass Kconfig dependency to 
> > > > >>>>>>>>>>>>>>>> drivers
> > > > >>>>>>>>>>>>>>>>          sysreset: Mark driver probe functions as 
> > > > >>>>>>>>>>>>>>>> static
> > > > >>>>>>>>>>>>>>>>          sysreset: watchdog: Move watchdog reference 
> > > > >>>>>>>>>>>>>>>> to plat data
> > > > >>>>>>>>>>>>>>>>          watchdog: Automatically register device with 
> > > > >>>>>>>>>>>>>>>> sysreset
> > > > >>>>>>>>>>>>>>>>          sunxi: Avoid duplicate reset_cpu with 
> > > > >>>>>>>>>>>>>>>> SYSRESET enabled
> > > > >>>>>>>>>>>>>>>>          sunxi: Use sysreset framework for 
> > > > >>>>>>>>>>>>>>>> poweroff/reset
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>         arch/arm/Kconfig                     |  3 +++
> > > > >>>>>>>>>>>>>>>>         arch/arm/mach-sunxi/board.c          |  2 ++
> > > > >>>>>>>>>>>>>>>>         drivers/sysreset/Kconfig             | 11 
> > > > >>>>>>>>>>>>>>>> ++++++--
> > > > >>>>>>>>>>>>>>>>         drivers/sysreset/sysreset_gpio.c     |  2 +-
> > > > >>>>>>>>>>>>>>>>         drivers/sysreset/sysreset_resetctl.c |  2 +-
> > > > >>>>>>>>>>>>>>>>         drivers/sysreset/sysreset_syscon.c   |  2 +-
> > > > >>>>>>>>>>>>>>>>         drivers/sysreset/sysreset_watchdog.c | 40 
> > > > >>>>>>>>>>>>>>>> ++++++++++++++++++++++------
> > > > >>>>>>>>>>>>>>>>         drivers/watchdog/wdt-uclass.c        |  8 
> > > > >>>>>>>>>>>>>>>> ++++++
> > > > >>>>>>>>>>>>>>>>         include/sysreset.h                   | 10 
> > > > >>>>>>>>>>>>>>>> +++++++
> > > > >>>>>>>>>>>>>>>>         9 files changed, 67 insertions(+), 13 
> > > > >>>>>>>>>>>>>>>> deletions(-)
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Applied to u-boot-marvell
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Mmmh, why u-boot-marvell,
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Because I'm handling watchdog related changed since a few 
> > > > >>>>>>>>>>>>> years and we
> > > > >>>>>>>>>>>>> did not create a specific subsystem repo for this and I'm 
> > > > >>>>>>>>>>>>> usually
> > > > >>>>>>>>>>>>> using my "marvell" one for this.
> > > > >>>>>>>>>
> > > > >>>>>>>>> And fwiw, there's a few other cases like this.  If it's too 
> > > > >>>>>>>>> confusing,
> > > > >>>>>>>>> maybe we should just roll out a few more repositories, I 
> > > > >>>>>>>>> think it's
> > > > >>>>>>>>> easier to do that now than pre-gitlab?
> > > > >>>>>>>>>>>>>> and why did this end up already in master?
> > > > >>>>>>>>>>>>>> Isn't that material for the next merge window? After all 
> > > > >>>>>>>>>>>>>> this changes
> > > > >>>>>>>>>>>>>> quite a bit, for a lot of boards, and I did not have a 
> > > > >>>>>>>>>>>>>> closer look at
> > > > >>>>>>>>>>>>>> the sunxi parts yet.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> I was hesitating also a bit. But since this patchset is 
> > > > >>>>>>>>>>>>> on the list in
> > > > >>>>>>>>>>>>> v1 since over 2 months now (2021-08-21) I thought it was 
> > > > >>>>>>>>>>>>> "ready" for
> > > > >>>>>>>>>>>>> inclusion now. We are at -rc1 and I think we still have 
> > > > >>>>>>>>>>>>> enough time to
> > > > >>>>>>>>>>>>> fix any resulting problems in this release cycle.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Why do we have the merge window then? This is clearly not a 
> > > > >>>>>>>>>>> regression or
> > > > >>>>>>>>>>> general fix.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> AFAIU, we are a bit less strict here in U-Boot. Patches that 
> > > > >>>>>>>>>> were posted
> > > > >>>>>>>>>> before the merge-window and skipped the review process (most 
> > > > >>>>>>>>>> likely
> > > > >>>>>>>>>> because of lack of time) are often still integrated in the 
> > > > >>>>>>>>>> early rcX
> > > > >>>>>>>>>> cycles. At least this is how I handle it usually.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Tom, is my understanding here correct?
> > > > >>>>>>>>>
> > > > >>>>>>>>> Yes.  We are not as strict as the kernel is about what can 
> > > > >>>>>>>>> come in
> > > > >>>>>>>>> between rc1 and rc2 (and to a certain degree, post rc2).  I 
> > > > >>>>>>>>> leave things
> > > > >>>>>>>>> up to the discretion of the custodians.  People tend of have 
> > > > >>>>>>>>> less time
> > > > >>>>>>>>> to handle U-Boot changes than other stuff, so I try and be 
> > > > >>>>>>>>> flexible in
> > > > >>>>>>>>> picking things up.
> > > > >>>>>>>>>>>> Yes I agree, that should be plenty of time for people to 
> > > > >>>>>>>>>>>> review it.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Well, if there would be people to review the sunxi parts :-(
> > > > >>>>>>>>>>> I am totally fine with the generic patches (as they have 
> > > > >>>>>>>>>>> been reviewed),
> > > > >>>>>>>>>>> but the sunxi integration is somewhat risky.
> > > > >>>>>>>>>>> I was explicitly deprioritising that in my queue, as it 
> > > > >>>>>>>>>>> really doesn't
> > > > >>>>>>>>>>> change, add or fix anything, it's mere refactoring, from 
> > > > >>>>>>>>>>> the user's point
> > > > >>>>>>>>>>> of view.
> > > > >>>>>>>>>>>>> Do you see any specific issues?
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Patch 6/6 changes the config for all 157 Allwinner boards, 
> > > > >>>>>>>>>>> so I think that
> > > > >>>>>>>>>>> deserves at least some testing, *before* merging it.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> I expect that Samuel did some testing. But still, I agree 
> > > > >>>>>>>>>> that it
> > > > >>>>>>>>>> would be much better, if these patches - especially the 
> > > > >>>>>>>>>> Allwinner parts
> > > > >>>>>>>>>> got more extensive testing.
> > > > >>>>>>>>>>> I will do as much testing now as possible, but I am not 
> > > > >>>>>>>>>>> happy about that
> > > > >>>>>>>>>>> situation.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Understood. Should we revert patch 6/6 for now?
> > > > >>>>>>>>>
> > > > >>>>>>>>> FWIW, given Samuel has been doing a number of allwinner 
> > > > >>>>>>>>> changes, I had
> > > > >>>>>>>>> also assumed it was sufficiently tested, which is why I 
> > > > >>>>>>>>> didn't raise a
> > > > >>>>>>>>> further concern when I saw the widespread nature of the 
> > > > >>>>>>>>> overall changes,
> > > > >>>>>>>>> just figured it was a few more ready-to-go cleanups that 
> > > > >>>>>>>>> weren't quite
> > > > >>>>>>>>> picked up in time.  Please do speak up if you want me to 
> > > > >>>>>>>>> revert the last
> > > > >>>>>>>>> part.
> > > > >>>>>>>>
> > > > >>>>>>>> Also it is often true that people find problems by testing on 
> > > > >>>>>>>> master
> > > > >>>>>>>> so applying it helps to shake the tree a bit.
> > > > >>>>>>>>
> > > > >>>>>>>> Regards,
> > > > >>>>>>>> Simon
> > > > >>>>>>>
> > > > >>>>>>> We don't actually have a problem with this series but with a 
> > > > >>>>>>> previous
> > > > >>>>>>> watchdog patch. The culprit according to bisecting is:
> > > > >>>>>>>
> > > > >>>>>>> b147bd3607f8 ("sunxi: Enable watchdog timer support by default")
> > > > >>>>>>>
> > > > >>>>>>> When booting the OrangePi PC the watchdog triggers while Linux 
> > > > >>>>>>> is booting,
> > > > >>>>>>> ca. 16 s after leaving the UEFI subsystem. This matches 
> > > > >>>>>>> WDT_MAX_TIMEOUT in
> > > > >>>>>>> drivers/watchdog/sunxi_wdt.c.
> > > > >>>>>>>
> > > > >>>>>>> If I run
> > > > >>>>>>> => wdt dev watchdog@1c20ca0
> > > > >>>>>>> => wdt stop
> > > > >>>>>>>
> > > > >>>>>>> before the bootefi command booting succeeds.
> > > > >>>>>>>
> > > > >>>>>>> We don't disarm the watchdog and Linux does not do it for us in 
> > > > >>>>>>> time.
> > > > >>>>>>>
> > > > >>>>>>> The UEFI specification requires that the default watchdog reset 
> > > > >>>>>>> time is 300
> > > > >>>>>>> s. We should never arm the Sunxi hardware watchdog except 
> > > > >>>>>>> within the
> > > > >>>>>>> watchdog reset driver.
> > > > >>>>>>>
> > > > >>>>>>> The solution is to disable CONFIG_WATCHDOG_AUTOSTART on SUNXI. 
> > > > >>>>>>> See
> > > > >>>>>>>
> > > > >>>>>>> [PATCH 1/1] watchdog: don't autostart watchdog on Sunxi boards
> > > > >>>>>>> https://lists.denx.de/pipermail/u-boot/2021-November/466318.html
> > > > >>>>>>
> > > > >>>>>> This means we never did come up with a satisfactory to everyone 
> > > > >>>>>> solution
> > > > >>>>>> to what UEFI thinks a watchdog should do, and what other types of
> > > > >>>>>> deployment think a watchdog should do, yes?
> > > > >>>>>
> > > > >>>>> Dear Tom,
> > > > >>>>>
> > > > >>>>> The issue is *not* UEFI specific.
> > > > >>>>>
> > > > >>>>> A watchdog timeout of 16 seconds is too short for Linux to boot 
> > > > >>>>> no matter
> > > > >>>>> whether you use the EFI stub or the legacy entry point.
> > > > >>>>>
> > > > >>>>> I only referred to the UEFI specification as it indicates what 
> > > > >>>>> can be
> > > > >>>>> considered as a reasonable timeout interval: 300 seconds.
> > > > >>>>
> > > > >>>> 16 seconds from the last time we pet the watchdog in U-Boot to the
> > > > >>>> kernel being able to take over is quite reasonable.
> > > > >>>
> > > > >>> How do we know that the kernel takes over? What if the kernel/EFI
> > > > >>> payload doesn't have a watchdog driver? I was assuming that the
> > > > >>> watchdog would be disabled as soon as we boot a kernel or an EFI app
> > > > >>> calls ExitBootServices (maybe even earlier).
> > > > >>> But this sounds like a generic problem, not sunxi specific. So how 
> > > > >>> do
> > > > >>> other platforms solve this?
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Andre
> > > > >>
> > > > >> The UEFI specification has this requirement in chapter "3.1.2 Load 
> > > > >> Option
> > > > >> Processing":
> > > > >>
> > > > >> "... the boot manager must enable the watchdog timer for 5 minutes 
> > > > >> by using
> > > > >> the EFI_BOOT_SERVICES.SetWatchdogTimer() boot service prior to 
> > > > >> calling
> > > > >> EFI_BOOT_SERVICES.StartImage(). If a boot option returns control to 
> > > > >> the boot
> > > > >> manager, the boot manager must disable the watchdog timer with an 
> > > > >> additional
> > > > >> call to the SetWatchdogTimer() boot service."
> > > > >>
> > > > >> This means that having an armed watchdog when starting the kernel is
> > > > >> correct.
> > > > >>
> > > > >> If you start a watchdog in the firmware which is not disabled or 
> > > > >> reset by
> > > > >> the operating system, you are out of luck and won't be able to boot.
> > > > >>
> > > > >> Current Linux has driver drivers/watchdog/sunxi_wdt.c compatible to
> > > > >> "allwinner,sun4i-a10-wdt","allwinner,sun6i-a31-wdt" and enabled by
> > > > >> CONFIG_SUNXI_WATCHDOG. This driver was introduced in Linux v3.12. It
> > > > >> originally had compatible "allwinner,sun4i-wdt" only.
> > > > >>
> > > > >> Debian Bullseye has the driver enabled as a module. In the bootlog 
> > > > >> of the
> > > > >> Orange Pi PC I find:
> > > > >> [   12.321909] sunxi-wdt 1c20ca0.watchdog: Watchdog enabled 
> > > > >> (timeout=16 sec,
> > > > >> nowayout=0)
> > > > >> This message appears approximately *20 seconds* after the EFI stub 
> > > > >> hands
> > > > >> over to the main kernel. Adding the driver to initrd shortens this 
> > > > >> to *18
> > > > >> seconds*. The message occurs after file system checks which can be a 
> > > > >> lengthy
> > > > >> operation. In Debian systemd manages the watchdog.
> > > > >>
> > > > >> As I said: 16 seconds is way too short for a hardware watchdog 
> > > > >> timeout.
> > > > >
> > > > > What's the time if you build it in?
> > > > >
> > > >
> > > > For sure you will find some board and configuration that is faster.
> > > >
> > > > But why should I care? This series breaks booting Debian on my board. So
> > > > it needs to be fixed. So, please, apply my patch that is doing so.
> > >
> > > Five minutes sounds completely unacceptable for embedded platforms.
> > > The user will surely have packaged the item up and will be just
> > > heading out to drop it off for return...
> >
> > I'm trying to avoid bringing up the long discussion from the previous
> > thread about this :)
> >
> > > Do we need to add a special case for UEFI here? E.g. bootefi could use
> > > a hook to lengthen the watchdog?
> >
> > Well, the problem is that the hardware watchdog has a maximum period of
> > 16 seconds, I believe.
> 
> Well at least we could have a hook to display a warning message. I
> don't understand why 16 seconds is too long, actually. Is this because
> it is going via grub?

As Heinrich noted, it takes from kernel start more than 16 seconds to
reach the point in the boot sequence where the initrd is found (so no,
not EFI slowing things down), loaded and the watchdog module loaded.
And since I was surprised here, yes, all watchdogs are done as modules
in the upstream Debian kernel.  Other platforms I gather just have a
longer maximum period.

-- 
Tom

Attachment: signature.asc
Description: PGP signature

Reply via email to