On Mon, Nov 08, 2021 at 05:09:14PM -0700, Simon Glass wrote: > Hi Heinrich, > > On Mon, 8 Nov 2021 at 09:17, Heinrich Schuchardt > <heinrich.schucha...@canonical.com> wrote: > > > > On 11/8/21 17:05, Tom Rini wrote: > > > On Mon, Nov 08, 2021 at 08:58:33AM -0700, Simon Glass wrote: > > >> Hi, > > >> > > >> On Sun, 7 Nov 2021 at 04:18, Heinrich Schuchardt > > >> <heinrich.schucha...@canonical.com> wrote: > > >>> > > >>> > > >>> > > >>> On 11/6/21 14:53, Tom Rini wrote: > > >>>> On Sat, Nov 06, 2021 at 04:55:44AM +0100, Heinrich Schuchardt wrote: > > >>>>> > > >>>>> > > >>>>> On 11/6/21 02:52, Andre Przywara wrote: > > >>>>>> On Fri, 5 Nov 2021 18:56:34 -0400 > > >>>>>> Tom Rini <tr...@konsulko.com> wrote: > > >>>>>> > > >>>>>>> On Fri, Nov 05, 2021 at 09:38:50PM +0100, Heinrich Schuchardt wrote: > > >>>>>>>> On 11/5/21 20:17, Tom Rini wrote: > > >>>>>>>>> On Fri, Nov 05, 2021 at 07:37:02PM +0100, Heinrich Schuchardt > > >>>>>>>>> wrote: > > >>>>>>>>>> On 11/5/21 17:12, Simon Glass wrote: > > >>>>>>>>>>> Hi, > > >>>>>>>>>>> > > >>>>>>>>>>> On Fri, 5 Nov 2021 at 08:21, Tom Rini <tr...@konsulko.com> > > >>>>>>>>>>> wrote: > > >>>>>>>>>>>> > > >>>>>>>>>>>> On Fri, Nov 05, 2021 at 12:14:47PM +0100, Stefan Roese wrote: > > >>>>>>>>>>>>> Hi Andre, > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Added Tom to Cc. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> On 05.11.21 11:04, Andre Przywara wrote: > > >>>>>>>>>>>>>> On Thu, 4 Nov 2021 20:02:41 -0600 > > >>>>>>>>>>>>>> Simon Glass <s...@chromium.org> wrote: > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Hi, > > >>>>>>>>>>>>>>> On Thu, 4 Nov 2021 at 19:22, Stefan Roese <s...@denx.de> > > >>>>>>>>>>>>>>> wrote: > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> Hi Andre, > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> On 05.11.21 00:11, Andre Przywara wrote: > > >>>>>>>>>>>>>>>>> On Thu, 4 Nov 2021 11:37:57 +0100 > > >>>>>>>>>>>>>>>>> Stefan Roese <s...@denx.de> wrote: > > >>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>> Hi Stefan, > > >>>>>>>>>>>>>>>>>> On 04.11.21 04:55, Samuel Holland wrote: > > >>>>>>>>>>>>>>>>>>> This series hooks up the watchdog uclass to > > >>>>>>>>>>>>>>>>>>> automatically register > > >>>>>>>>>>>>>>>>>>> watchdog devices for use with sysreset, doing a bit of > > >>>>>>>>>>>>>>>>>>> minor cleanup > > >>>>>>>>>>>>>>>>>>> along the way. > > >>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>> The goal is for this to replace the sunxi board-level > > >>>>>>>>>>>>>>>>>>> non-DM reset_cpu() > > >>>>>>>>>>>>>>>>>>> function. I was surprised to find that the wdt_reboot > > >>>>>>>>>>>>>>>>>>> driver requires > > >>>>>>>>>>>>>>>>>>> its own undocumented device tree node, which references > > >>>>>>>>>>>>>>>>>>> the watchdog > > >>>>>>>>>>>>>>>>>>> device by phandle. This is problematic for us, because > > >>>>>>>>>>>>>>>>>>> sunxi-u-boot.dtsi > > >>>>>>>>>>>>>>>>>>> file covers 20 different SoCs with varying watchdog > > >>>>>>>>>>>>>>>>>>> node phandle names. > > >>>>>>>>>>>>>>>>>>> So it would have required adding a -u-boot.dtsi file > > >>>>>>>>>>>>>>>>>>> for each board. > > >>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>> Hooking things up automatically makes sense to me; this > > >>>>>>>>>>>>>>>>>>> is what Linux > > >>>>>>>>>>>>>>>>>>> does. However, I put the code behind a new option to > > >>>>>>>>>>>>>>>>>>> avoid surprises for > > >>>>>>>>>>>>>>>>>>> other platforms. > > >>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>> Changes in v3: > > >>>>>>>>>>>>>>>>>>> - Move condition to wdt-uclass.c to fix build > > >>>>>>>>>>>>>>>>>>> errors. > > >>>>>>>>>>>>>>>>>>> - Include watchdog name in error message. > > >>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>> Changes in v2: > > >>>>>>>>>>>>>>>>>>> - Extend the "if SYSRESET" block to the end of > > >>>>>>>>>>>>>>>>>>> the file. > > >>>>>>>>>>>>>>>>>>> - Also make gpio_reboot_probe function static. > > >>>>>>>>>>>>>>>>>>> - Rebase on top of 492ee6b8d0e7 (now handle > > >>>>>>>>>>>>>>>>>>> all watchdogs). > > >>>>>>>>>>>>>>>>>>> - Added patches 5-6 as an example of how the > > >>>>>>>>>>>>>>>>>>> new option will be used. > > >>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>> Samuel Holland (6): > > >>>>>>>>>>>>>>>>>>> sysreset: Add uclass Kconfig dependency to > > >>>>>>>>>>>>>>>>>>> drivers > > >>>>>>>>>>>>>>>>>>> sysreset: Mark driver probe functions as > > >>>>>>>>>>>>>>>>>>> static > > >>>>>>>>>>>>>>>>>>> sysreset: watchdog: Move watchdog reference > > >>>>>>>>>>>>>>>>>>> to plat data > > >>>>>>>>>>>>>>>>>>> watchdog: Automatically register device with > > >>>>>>>>>>>>>>>>>>> sysreset > > >>>>>>>>>>>>>>>>>>> sunxi: Avoid duplicate reset_cpu with > > >>>>>>>>>>>>>>>>>>> SYSRESET enabled > > >>>>>>>>>>>>>>>>>>> sunxi: Use sysreset framework for > > >>>>>>>>>>>>>>>>>>> poweroff/reset > > >>>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>>> arch/arm/Kconfig | 3 +++ > > >>>>>>>>>>>>>>>>>>> arch/arm/mach-sunxi/board.c | 2 ++ > > >>>>>>>>>>>>>>>>>>> drivers/sysreset/Kconfig | 11 > > >>>>>>>>>>>>>>>>>>> ++++++-- > > >>>>>>>>>>>>>>>>>>> drivers/sysreset/sysreset_gpio.c | 2 +- > > >>>>>>>>>>>>>>>>>>> drivers/sysreset/sysreset_resetctl.c | 2 +- > > >>>>>>>>>>>>>>>>>>> drivers/sysreset/sysreset_syscon.c | 2 +- > > >>>>>>>>>>>>>>>>>>> drivers/sysreset/sysreset_watchdog.c | 40 > > >>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++------ > > >>>>>>>>>>>>>>>>>>> drivers/watchdog/wdt-uclass.c | 8 > > >>>>>>>>>>>>>>>>>>> ++++++ > > >>>>>>>>>>>>>>>>>>> include/sysreset.h | 10 > > >>>>>>>>>>>>>>>>>>> +++++++ > > >>>>>>>>>>>>>>>>>>> 9 files changed, 67 insertions(+), 13 > > >>>>>>>>>>>>>>>>>>> deletions(-) > > >>>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>>> Applied to u-boot-marvell > > >>>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>>> Mmmh, why u-boot-marvell, > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> Because I'm handling watchdog related changed since a few > > >>>>>>>>>>>>>>>> years and we > > >>>>>>>>>>>>>>>> did not create a specific subsystem repo for this and I'm > > >>>>>>>>>>>>>>>> usually > > >>>>>>>>>>>>>>>> using my "marvell" one for this. > > >>>>>>>>>>>> > > >>>>>>>>>>>> And fwiw, there's a few other cases like this. If it's too > > >>>>>>>>>>>> confusing, > > >>>>>>>>>>>> maybe we should just roll out a few more repositories, I think > > >>>>>>>>>>>> it's > > >>>>>>>>>>>> easier to do that now than pre-gitlab? > > >>>>>>>>>>>>>>>>> and why did this end up already in master? > > >>>>>>>>>>>>>>>>> Isn't that material for the next merge window? After all > > >>>>>>>>>>>>>>>>> this changes > > >>>>>>>>>>>>>>>>> quite a bit, for a lot of boards, and I did not have a > > >>>>>>>>>>>>>>>>> closer look at > > >>>>>>>>>>>>>>>>> the sunxi parts yet. > > >>>>>>>>>>>>>>>> > > >>>>>>>>>>>>>>>> I was hesitating also a bit. But since this patchset is on > > >>>>>>>>>>>>>>>> the list in > > >>>>>>>>>>>>>>>> v1 since over 2 months now (2021-08-21) I thought it was > > >>>>>>>>>>>>>>>> "ready" for > > >>>>>>>>>>>>>>>> inclusion now. We are at -rc1 and I think we still have > > >>>>>>>>>>>>>>>> enough time to > > >>>>>>>>>>>>>>>> fix any resulting problems in this release cycle. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Why do we have the merge window then? This is clearly not a > > >>>>>>>>>>>>>> regression or > > >>>>>>>>>>>>>> general fix. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> AFAIU, we are a bit less strict here in U-Boot. Patches that > > >>>>>>>>>>>>> were posted > > >>>>>>>>>>>>> before the merge-window and skipped the review process (most > > >>>>>>>>>>>>> likely > > >>>>>>>>>>>>> because of lack of time) are often still integrated in the > > >>>>>>>>>>>>> early rcX > > >>>>>>>>>>>>> cycles. At least this is how I handle it usually. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Tom, is my understanding here correct? > > >>>>>>>>>>>> > > >>>>>>>>>>>> Yes. We are not as strict as the kernel is about what can > > >>>>>>>>>>>> come in > > >>>>>>>>>>>> between rc1 and rc2 (and to a certain degree, post rc2). I > > >>>>>>>>>>>> leave things > > >>>>>>>>>>>> up to the discretion of the custodians. People tend of have > > >>>>>>>>>>>> less time > > >>>>>>>>>>>> to handle U-Boot changes than other stuff, so I try and be > > >>>>>>>>>>>> flexible in > > >>>>>>>>>>>> picking things up. > > >>>>>>>>>>>>>>> Yes I agree, that should be plenty of time for people to > > >>>>>>>>>>>>>>> review it. > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Well, if there would be people to review the sunxi parts :-( > > >>>>>>>>>>>>>> I am totally fine with the generic patches (as they have > > >>>>>>>>>>>>>> been reviewed), > > >>>>>>>>>>>>>> but the sunxi integration is somewhat risky. > > >>>>>>>>>>>>>> I was explicitly deprioritising that in my queue, as it > > >>>>>>>>>>>>>> really doesn't > > >>>>>>>>>>>>>> change, add or fix anything, it's mere refactoring, from the > > >>>>>>>>>>>>>> user's point > > >>>>>>>>>>>>>> of view. > > >>>>>>>>>>>>>>>> Do you see any specific issues? > > >>>>>>>>>>>>>> > > >>>>>>>>>>>>>> Patch 6/6 changes the config for all 157 Allwinner boards, > > >>>>>>>>>>>>>> so I think that > > >>>>>>>>>>>>>> deserves at least some testing, *before* merging it. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> I expect that Samuel did some testing. But still, I agree > > >>>>>>>>>>>>> that it > > >>>>>>>>>>>>> would be much better, if these patches - especially the > > >>>>>>>>>>>>> Allwinner parts > > >>>>>>>>>>>>> got more extensive testing. > > >>>>>>>>>>>>>> I will do as much testing now as possible, but I am not > > >>>>>>>>>>>>>> happy about that > > >>>>>>>>>>>>>> situation. > > >>>>>>>>>>>>> > > >>>>>>>>>>>>> Understood. Should we revert patch 6/6 for now? > > >>>>>>>>>>>> > > >>>>>>>>>>>> FWIW, given Samuel has been doing a number of allwinner > > >>>>>>>>>>>> changes, I had > > >>>>>>>>>>>> also assumed it was sufficiently tested, which is why I didn't > > >>>>>>>>>>>> raise a > > >>>>>>>>>>>> further concern when I saw the widespread nature of the > > >>>>>>>>>>>> overall changes, > > >>>>>>>>>>>> just figured it was a few more ready-to-go cleanups that > > >>>>>>>>>>>> weren't quite > > >>>>>>>>>>>> picked up in time. Please do speak up if you want me to > > >>>>>>>>>>>> revert the last > > >>>>>>>>>>>> part. > > >>>>>>>>>>> > > >>>>>>>>>>> Also it is often true that people find problems by testing on > > >>>>>>>>>>> master > > >>>>>>>>>>> so applying it helps to shake the tree a bit. > > >>>>>>>>>>> > > >>>>>>>>>>> Regards, > > >>>>>>>>>>> Simon > > >>>>>>>>>> > > >>>>>>>>>> We don't actually have a problem with this series but with a > > >>>>>>>>>> previous > > >>>>>>>>>> watchdog patch. The culprit according to bisecting is: > > >>>>>>>>>> > > >>>>>>>>>> b147bd3607f8 ("sunxi: Enable watchdog timer support by default") > > >>>>>>>>>> > > >>>>>>>>>> When booting the OrangePi PC the watchdog triggers while Linux > > >>>>>>>>>> is booting, > > >>>>>>>>>> ca. 16 s after leaving the UEFI subsystem. This matches > > >>>>>>>>>> WDT_MAX_TIMEOUT in > > >>>>>>>>>> drivers/watchdog/sunxi_wdt.c. > > >>>>>>>>>> > > >>>>>>>>>> If I run > > >>>>>>>>>> => wdt dev watchdog@1c20ca0 > > >>>>>>>>>> => wdt stop > > >>>>>>>>>> > > >>>>>>>>>> before the bootefi command booting succeeds. > > >>>>>>>>>> > > >>>>>>>>>> We don't disarm the watchdog and Linux does not do it for us in > > >>>>>>>>>> time. > > >>>>>>>>>> > > >>>>>>>>>> The UEFI specification requires that the default watchdog reset > > >>>>>>>>>> time is 300 > > >>>>>>>>>> s. We should never arm the Sunxi hardware watchdog except within > > >>>>>>>>>> the > > >>>>>>>>>> watchdog reset driver. > > >>>>>>>>>> > > >>>>>>>>>> The solution is to disable CONFIG_WATCHDOG_AUTOSTART on SUNXI. > > >>>>>>>>>> See > > >>>>>>>>>> > > >>>>>>>>>> [PATCH 1/1] watchdog: don't autostart watchdog on Sunxi boards > > >>>>>>>>>> https://lists.denx.de/pipermail/u-boot/2021-November/466318.html > > >>>>>>>>> > > >>>>>>>>> This means we never did come up with a satisfactory to everyone > > >>>>>>>>> solution > > >>>>>>>>> to what UEFI thinks a watchdog should do, and what other types of > > >>>>>>>>> deployment think a watchdog should do, yes? > > >>>>>>>> > > >>>>>>>> Dear Tom, > > >>>>>>>> > > >>>>>>>> The issue is *not* UEFI specific. > > >>>>>>>> > > >>>>>>>> A watchdog timeout of 16 seconds is too short for Linux to boot no > > >>>>>>>> matter > > >>>>>>>> whether you use the EFI stub or the legacy entry point. > > >>>>>>>> > > >>>>>>>> I only referred to the UEFI specification as it indicates what can > > >>>>>>>> be > > >>>>>>>> considered as a reasonable timeout interval: 300 seconds. > > >>>>>>> > > >>>>>>> 16 seconds from the last time we pet the watchdog in U-Boot to the > > >>>>>>> kernel being able to take over is quite reasonable. > > >>>>>> > > >>>>>> How do we know that the kernel takes over? What if the kernel/EFI > > >>>>>> payload doesn't have a watchdog driver? I was assuming that the > > >>>>>> watchdog would be disabled as soon as we boot a kernel or an EFI app > > >>>>>> calls ExitBootServices (maybe even earlier). > > >>>>>> But this sounds like a generic problem, not sunxi specific. So how do > > >>>>>> other platforms solve this? > > >>>>>> > > >>>>>> Cheers, > > >>>>>> Andre > > >>>>> > > >>>>> The UEFI specification has this requirement in chapter "3.1.2 Load > > >>>>> Option > > >>>>> Processing": > > >>>>> > > >>>>> "... the boot manager must enable the watchdog timer for 5 minutes by > > >>>>> using > > >>>>> the EFI_BOOT_SERVICES.SetWatchdogTimer() boot service prior to calling > > >>>>> EFI_BOOT_SERVICES.StartImage(). If a boot option returns control to > > >>>>> the boot > > >>>>> manager, the boot manager must disable the watchdog timer with an > > >>>>> additional > > >>>>> call to the SetWatchdogTimer() boot service." > > >>>>> > > >>>>> This means that having an armed watchdog when starting the kernel is > > >>>>> correct. > > >>>>> > > >>>>> If you start a watchdog in the firmware which is not disabled or > > >>>>> reset by > > >>>>> the operating system, you are out of luck and won't be able to boot. > > >>>>> > > >>>>> Current Linux has driver drivers/watchdog/sunxi_wdt.c compatible to > > >>>>> "allwinner,sun4i-a10-wdt","allwinner,sun6i-a31-wdt" and enabled by > > >>>>> CONFIG_SUNXI_WATCHDOG. This driver was introduced in Linux v3.12. It > > >>>>> originally had compatible "allwinner,sun4i-wdt" only. > > >>>>> > > >>>>> Debian Bullseye has the driver enabled as a module. In the bootlog of > > >>>>> the > > >>>>> Orange Pi PC I find: > > >>>>> [ 12.321909] sunxi-wdt 1c20ca0.watchdog: Watchdog enabled > > >>>>> (timeout=16 sec, > > >>>>> nowayout=0) > > >>>>> This message appears approximately *20 seconds* after the EFI stub > > >>>>> hands > > >>>>> over to the main kernel. Adding the driver to initrd shortens this to > > >>>>> *18 > > >>>>> seconds*. The message occurs after file system checks which can be a > > >>>>> lengthy > > >>>>> operation. In Debian systemd manages the watchdog. > > >>>>> > > >>>>> As I said: 16 seconds is way too short for a hardware watchdog > > >>>>> timeout. > > >>>> > > >>>> What's the time if you build it in? > > >>>> > > >>> > > >>> For sure you will find some board and configuration that is faster. > > >>> > > >>> But why should I care? This series breaks booting Debian on my board. So > > >>> it needs to be fixed. So, please, apply my patch that is doing so. > > >> > > >> Five minutes sounds completely unacceptable for embedded platforms. > > >> The user will surely have packaged the item up and will be just > > >> heading out to drop it off for return... > > > > I have no problem with people switching on the SUNXI hardware watchdog > > for their specific embedded solution. But here it was switched on for > > all SUNXI boards and breaks booting into Linux distributions. > > > > If the Linux distribution resets the watchdog *after* file system checks > > because it is managed by systemd (as is true for Debian), 5 minutes is > > realistic. > > > > > > > > I'm trying to avoid bringing up the long discussion from the previous > > > thread about this :) > > > > > >> Do we need to add a special case for UEFI here? E.g. bootefi could use > > >> a hook to lengthen the watchdog? > > > > The problem is not UEFI related. > > Oh dear, that is definitely quite slow. > > > > > > > > > Well, the problem is that the hardware watchdog has a maximum period of > > > 16 seconds, I believe. > > > > > > > Yes, Sunxi is limited to 16 seconds. > > I suppose it isn't possible to disable the watchdog once it is enabled?
You can, it's just not a general solution because, well, other platforms with a more reasonable overall period may still violate the UEFI spec but not get tripped up. And globally disabling the watchdog before "bootefi" or similar is also not a great idea. It probably is, sadly, the best option to just disable by default (so not enable until the kernel does) watchdog on sunxi. -- Tom
signature.asc
Description: PGP signature