Hi Heinrich,

On Mon, 8 Nov 2021 at 09:17, Heinrich Schuchardt
<heinrich.schucha...@canonical.com> wrote:
>
> On 11/8/21 17:05, Tom Rini wrote:
> > On Mon, Nov 08, 2021 at 08:58:33AM -0700, Simon Glass wrote:
> >> Hi,
> >>
> >> On Sun, 7 Nov 2021 at 04:18, Heinrich Schuchardt
> >> <heinrich.schucha...@canonical.com> wrote:
> >>>
> >>>
> >>>
> >>> On 11/6/21 14:53, Tom Rini wrote:
> >>>> On Sat, Nov 06, 2021 at 04:55:44AM +0100, Heinrich Schuchardt wrote:
> >>>>>
> >>>>>
> >>>>> On 11/6/21 02:52, Andre Przywara wrote:
> >>>>>> On Fri, 5 Nov 2021 18:56:34 -0400
> >>>>>> Tom Rini <tr...@konsulko.com> wrote:
> >>>>>>
> >>>>>>> On Fri, Nov 05, 2021 at 09:38:50PM +0100, Heinrich Schuchardt wrote:
> >>>>>>>> On 11/5/21 20:17, Tom Rini wrote:
> >>>>>>>>> On Fri, Nov 05, 2021 at 07:37:02PM +0100, Heinrich Schuchardt wrote:
> >>>>>>>>>> On 11/5/21 17:12, Simon Glass wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, 5 Nov 2021 at 08:21, Tom Rini <tr...@konsulko.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Nov 05, 2021 at 12:14:47PM +0100, Stefan Roese wrote:
> >>>>>>>>>>>>> Hi Andre,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Added Tom to Cc.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 05.11.21 11:04, Andre Przywara wrote:
> >>>>>>>>>>>>>> On Thu, 4 Nov 2021 20:02:41 -0600
> >>>>>>>>>>>>>> Simon Glass <s...@chromium.org> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>> On Thu, 4 Nov 2021 at 19:22, Stefan Roese <s...@denx.de> 
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi Andre,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On 05.11.21 00:11, Andre Przywara wrote:
> >>>>>>>>>>>>>>>>> On Thu, 4 Nov 2021 11:37:57 +0100
> >>>>>>>>>>>>>>>>> Stefan Roese <s...@denx.de> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi Stefan,
> >>>>>>>>>>>>>>>>>> On 04.11.21 04:55, Samuel Holland wrote:
> >>>>>>>>>>>>>>>>>>> This series hooks up the watchdog uclass to automatically 
> >>>>>>>>>>>>>>>>>>> register
> >>>>>>>>>>>>>>>>>>> watchdog devices for use with sysreset, doing a bit of 
> >>>>>>>>>>>>>>>>>>> minor cleanup
> >>>>>>>>>>>>>>>>>>> along the way.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> The goal is for this to replace the sunxi board-level 
> >>>>>>>>>>>>>>>>>>> non-DM reset_cpu()
> >>>>>>>>>>>>>>>>>>> function. I was surprised to find that the wdt_reboot 
> >>>>>>>>>>>>>>>>>>> driver requires
> >>>>>>>>>>>>>>>>>>> its own undocumented device tree node, which references 
> >>>>>>>>>>>>>>>>>>> the watchdog
> >>>>>>>>>>>>>>>>>>> device by phandle. This is problematic for us, because 
> >>>>>>>>>>>>>>>>>>> sunxi-u-boot.dtsi
> >>>>>>>>>>>>>>>>>>> file covers 20 different SoCs with varying watchdog node 
> >>>>>>>>>>>>>>>>>>> phandle names.
> >>>>>>>>>>>>>>>>>>> So it would have required adding a -u-boot.dtsi file for 
> >>>>>>>>>>>>>>>>>>> each board.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hooking things up automatically makes sense to me; this 
> >>>>>>>>>>>>>>>>>>> is what Linux
> >>>>>>>>>>>>>>>>>>> does. However, I put the code behind a new option to 
> >>>>>>>>>>>>>>>>>>> avoid surprises for
> >>>>>>>>>>>>>>>>>>> other platforms.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Changes in v3:
> >>>>>>>>>>>>>>>>>>>          - Move condition to wdt-uclass.c to fix build 
> >>>>>>>>>>>>>>>>>>> errors.
> >>>>>>>>>>>>>>>>>>>          - Include watchdog name in error message.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Changes in v2:
> >>>>>>>>>>>>>>>>>>>          - Extend the "if SYSRESET" block to the end of 
> >>>>>>>>>>>>>>>>>>> the file.
> >>>>>>>>>>>>>>>>>>>          - Also make gpio_reboot_probe function static.
> >>>>>>>>>>>>>>>>>>>          - Rebase on top of 492ee6b8d0e7 (now handle all 
> >>>>>>>>>>>>>>>>>>> watchdogs).
> >>>>>>>>>>>>>>>>>>>          - Added patches 5-6 as an example of how the new 
> >>>>>>>>>>>>>>>>>>> option will be used.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Samuel Holland (6):
> >>>>>>>>>>>>>>>>>>>           sysreset: Add uclass Kconfig dependency to 
> >>>>>>>>>>>>>>>>>>> drivers
> >>>>>>>>>>>>>>>>>>>           sysreset: Mark driver probe functions as static
> >>>>>>>>>>>>>>>>>>>           sysreset: watchdog: Move watchdog reference to 
> >>>>>>>>>>>>>>>>>>> plat data
> >>>>>>>>>>>>>>>>>>>           watchdog: Automatically register device with 
> >>>>>>>>>>>>>>>>>>> sysreset
> >>>>>>>>>>>>>>>>>>>           sunxi: Avoid duplicate reset_cpu with SYSRESET 
> >>>>>>>>>>>>>>>>>>> enabled
> >>>>>>>>>>>>>>>>>>>           sunxi: Use sysreset framework for poweroff/reset
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>          arch/arm/Kconfig                     |  3 +++
> >>>>>>>>>>>>>>>>>>>          arch/arm/mach-sunxi/board.c          |  2 ++
> >>>>>>>>>>>>>>>>>>>          drivers/sysreset/Kconfig             | 11 
> >>>>>>>>>>>>>>>>>>> ++++++--
> >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_gpio.c     |  2 +-
> >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_resetctl.c |  2 +-
> >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_syscon.c   |  2 +-
> >>>>>>>>>>>>>>>>>>>          drivers/sysreset/sysreset_watchdog.c | 40 
> >>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++------
> >>>>>>>>>>>>>>>>>>>          drivers/watchdog/wdt-uclass.c        |  8 ++++++
> >>>>>>>>>>>>>>>>>>>          include/sysreset.h                   | 10 +++++++
> >>>>>>>>>>>>>>>>>>>          9 files changed, 67 insertions(+), 13 
> >>>>>>>>>>>>>>>>>>> deletions(-)
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Applied to u-boot-marvell
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Mmmh, why u-boot-marvell,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Because I'm handling watchdog related changed since a few 
> >>>>>>>>>>>>>>>> years and we
> >>>>>>>>>>>>>>>> did not create a specific subsystem repo for this and I'm 
> >>>>>>>>>>>>>>>> usually
> >>>>>>>>>>>>>>>> using my "marvell" one for this.
> >>>>>>>>>>>>
> >>>>>>>>>>>> And fwiw, there's a few other cases like this.  If it's too 
> >>>>>>>>>>>> confusing,
> >>>>>>>>>>>> maybe we should just roll out a few more repositories, I think 
> >>>>>>>>>>>> it's
> >>>>>>>>>>>> easier to do that now than pre-gitlab?
> >>>>>>>>>>>>>>>>> and why did this end up already in master?
> >>>>>>>>>>>>>>>>> Isn't that material for the next merge window? After all 
> >>>>>>>>>>>>>>>>> this changes
> >>>>>>>>>>>>>>>>> quite a bit, for a lot of boards, and I did not have a 
> >>>>>>>>>>>>>>>>> closer look at
> >>>>>>>>>>>>>>>>> the sunxi parts yet.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I was hesitating also a bit. But since this patchset is on 
> >>>>>>>>>>>>>>>> the list in
> >>>>>>>>>>>>>>>> v1 since over 2 months now (2021-08-21) I thought it was 
> >>>>>>>>>>>>>>>> "ready" for
> >>>>>>>>>>>>>>>> inclusion now. We are at -rc1 and I think we still have 
> >>>>>>>>>>>>>>>> enough time to
> >>>>>>>>>>>>>>>> fix any resulting problems in this release cycle.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Why do we have the merge window then? This is clearly not a 
> >>>>>>>>>>>>>> regression or
> >>>>>>>>>>>>>> general fix.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> AFAIU, we are a bit less strict here in U-Boot. Patches that 
> >>>>>>>>>>>>> were posted
> >>>>>>>>>>>>> before the merge-window and skipped the review process (most 
> >>>>>>>>>>>>> likely
> >>>>>>>>>>>>> because of lack of time) are often still integrated in the 
> >>>>>>>>>>>>> early rcX
> >>>>>>>>>>>>> cycles. At least this is how I handle it usually.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Tom, is my understanding here correct?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes.  We are not as strict as the kernel is about what can come 
> >>>>>>>>>>>> in
> >>>>>>>>>>>> between rc1 and rc2 (and to a certain degree, post rc2).  I 
> >>>>>>>>>>>> leave things
> >>>>>>>>>>>> up to the discretion of the custodians.  People tend of have 
> >>>>>>>>>>>> less time
> >>>>>>>>>>>> to handle U-Boot changes than other stuff, so I try and be 
> >>>>>>>>>>>> flexible in
> >>>>>>>>>>>> picking things up.
> >>>>>>>>>>>>>>> Yes I agree, that should be plenty of time for people to 
> >>>>>>>>>>>>>>> review it.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Well, if there would be people to review the sunxi parts :-(
> >>>>>>>>>>>>>> I am totally fine with the generic patches (as they have been 
> >>>>>>>>>>>>>> reviewed),
> >>>>>>>>>>>>>> but the sunxi integration is somewhat risky.
> >>>>>>>>>>>>>> I was explicitly deprioritising that in my queue, as it really 
> >>>>>>>>>>>>>> doesn't
> >>>>>>>>>>>>>> change, add or fix anything, it's mere refactoring, from the 
> >>>>>>>>>>>>>> user's point
> >>>>>>>>>>>>>> of view.
> >>>>>>>>>>>>>>>> Do you see any specific issues?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Patch 6/6 changes the config for all 157 Allwinner boards, so 
> >>>>>>>>>>>>>> I think that
> >>>>>>>>>>>>>> deserves at least some testing, *before* merging it.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I expect that Samuel did some testing. But still, I agree that 
> >>>>>>>>>>>>> it
> >>>>>>>>>>>>> would be much better, if these patches - especially the 
> >>>>>>>>>>>>> Allwinner parts
> >>>>>>>>>>>>> got more extensive testing.
> >>>>>>>>>>>>>> I will do as much testing now as possible, but I am not happy 
> >>>>>>>>>>>>>> about that
> >>>>>>>>>>>>>> situation.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Understood. Should we revert patch 6/6 for now?
> >>>>>>>>>>>>
> >>>>>>>>>>>> FWIW, given Samuel has been doing a number of allwinner changes, 
> >>>>>>>>>>>> I had
> >>>>>>>>>>>> also assumed it was sufficiently tested, which is why I didn't 
> >>>>>>>>>>>> raise a
> >>>>>>>>>>>> further concern when I saw the widespread nature of the overall 
> >>>>>>>>>>>> changes,
> >>>>>>>>>>>> just figured it was a few more ready-to-go cleanups that weren't 
> >>>>>>>>>>>> quite
> >>>>>>>>>>>> picked up in time.  Please do speak up if you want me to revert 
> >>>>>>>>>>>> the last
> >>>>>>>>>>>> part.
> >>>>>>>>>>>
> >>>>>>>>>>> Also it is often true that people find problems by testing on 
> >>>>>>>>>>> master
> >>>>>>>>>>> so applying it helps to shake the tree a bit.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Simon
> >>>>>>>>>>
> >>>>>>>>>> We don't actually have a problem with this series but with a 
> >>>>>>>>>> previous
> >>>>>>>>>> watchdog patch. The culprit according to bisecting is:
> >>>>>>>>>>
> >>>>>>>>>> b147bd3607f8 ("sunxi: Enable watchdog timer support by default")
> >>>>>>>>>>
> >>>>>>>>>> When booting the OrangePi PC the watchdog triggers while Linux is 
> >>>>>>>>>> booting,
> >>>>>>>>>> ca. 16 s after leaving the UEFI subsystem. This matches 
> >>>>>>>>>> WDT_MAX_TIMEOUT in
> >>>>>>>>>> drivers/watchdog/sunxi_wdt.c.
> >>>>>>>>>>
> >>>>>>>>>> If I run
> >>>>>>>>>> => wdt dev watchdog@1c20ca0
> >>>>>>>>>> => wdt stop
> >>>>>>>>>>
> >>>>>>>>>> before the bootefi command booting succeeds.
> >>>>>>>>>>
> >>>>>>>>>> We don't disarm the watchdog and Linux does not do it for us in 
> >>>>>>>>>> time.
> >>>>>>>>>>
> >>>>>>>>>> The UEFI specification requires that the default watchdog reset 
> >>>>>>>>>> time is 300
> >>>>>>>>>> s. We should never arm the Sunxi hardware watchdog except within 
> >>>>>>>>>> the
> >>>>>>>>>> watchdog reset driver.
> >>>>>>>>>>
> >>>>>>>>>> The solution is to disable CONFIG_WATCHDOG_AUTOSTART on SUNXI. See
> >>>>>>>>>>
> >>>>>>>>>> [PATCH 1/1] watchdog: don't autostart watchdog on Sunxi boards
> >>>>>>>>>> https://lists.denx.de/pipermail/u-boot/2021-November/466318.html
> >>>>>>>>>
> >>>>>>>>> This means we never did come up with a satisfactory to everyone 
> >>>>>>>>> solution
> >>>>>>>>> to what UEFI thinks a watchdog should do, and what other types of
> >>>>>>>>> deployment think a watchdog should do, yes?
> >>>>>>>>
> >>>>>>>> Dear Tom,
> >>>>>>>>
> >>>>>>>> The issue is *not* UEFI specific.
> >>>>>>>>
> >>>>>>>> A watchdog timeout of 16 seconds is too short for Linux to boot no 
> >>>>>>>> matter
> >>>>>>>> whether you use the EFI stub or the legacy entry point.
> >>>>>>>>
> >>>>>>>> I only referred to the UEFI specification as it indicates what can be
> >>>>>>>> considered as a reasonable timeout interval: 300 seconds.
> >>>>>>>
> >>>>>>> 16 seconds from the last time we pet the watchdog in U-Boot to the
> >>>>>>> kernel being able to take over is quite reasonable.
> >>>>>>
> >>>>>> How do we know that the kernel takes over? What if the kernel/EFI
> >>>>>> payload doesn't have a watchdog driver? I was assuming that the
> >>>>>> watchdog would be disabled as soon as we boot a kernel or an EFI app
> >>>>>> calls ExitBootServices (maybe even earlier).
> >>>>>> But this sounds like a generic problem, not sunxi specific. So how do
> >>>>>> other platforms solve this?
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Andre
> >>>>>
> >>>>> The UEFI specification has this requirement in chapter "3.1.2 Load 
> >>>>> Option
> >>>>> Processing":
> >>>>>
> >>>>> "... the boot manager must enable the watchdog timer for 5 minutes by 
> >>>>> using
> >>>>> the EFI_BOOT_SERVICES.SetWatchdogTimer() boot service prior to calling
> >>>>> EFI_BOOT_SERVICES.StartImage(). If a boot option returns control to the 
> >>>>> boot
> >>>>> manager, the boot manager must disable the watchdog timer with an 
> >>>>> additional
> >>>>> call to the SetWatchdogTimer() boot service."
> >>>>>
> >>>>> This means that having an armed watchdog when starting the kernel is
> >>>>> correct.
> >>>>>
> >>>>> If you start a watchdog in the firmware which is not disabled or reset 
> >>>>> by
> >>>>> the operating system, you are out of luck and won't be able to boot.
> >>>>>
> >>>>> Current Linux has driver drivers/watchdog/sunxi_wdt.c compatible to
> >>>>> "allwinner,sun4i-a10-wdt","allwinner,sun6i-a31-wdt" and enabled by
> >>>>> CONFIG_SUNXI_WATCHDOG. This driver was introduced in Linux v3.12. It
> >>>>> originally had compatible "allwinner,sun4i-wdt" only.
> >>>>>
> >>>>> Debian Bullseye has the driver enabled as a module. In the bootlog of 
> >>>>> the
> >>>>> Orange Pi PC I find:
> >>>>> [   12.321909] sunxi-wdt 1c20ca0.watchdog: Watchdog enabled (timeout=16 
> >>>>> sec,
> >>>>> nowayout=0)
> >>>>> This message appears approximately *20 seconds* after the EFI stub hands
> >>>>> over to the main kernel. Adding the driver to initrd shortens this to 
> >>>>> *18
> >>>>> seconds*. The message occurs after file system checks which can be a 
> >>>>> lengthy
> >>>>> operation. In Debian systemd manages the watchdog.
> >>>>>
> >>>>> As I said: 16 seconds is way too short for a hardware watchdog timeout.
> >>>>
> >>>> What's the time if you build it in?
> >>>>
> >>>
> >>> For sure you will find some board and configuration that is faster.
> >>>
> >>> But why should I care? This series breaks booting Debian on my board. So
> >>> it needs to be fixed. So, please, apply my patch that is doing so.
> >>
> >> Five minutes sounds completely unacceptable for embedded platforms.
> >> The user will surely have packaged the item up and will be just
> >> heading out to drop it off for return...
>
> I have no problem with people switching on the SUNXI hardware watchdog
> for their specific embedded solution. But here it was switched on for
> all SUNXI boards and breaks booting into Linux distributions.
>
> If the Linux distribution resets the watchdog *after* file system checks
> because it is managed by systemd (as is true for Debian), 5 minutes is
> realistic.
>
> >
> > I'm trying to avoid bringing up the long discussion from the previous
> > thread about this :)
> >
> >> Do we need to add a special case for UEFI here? E.g. bootefi could use
> >> a hook to lengthen the watchdog?
>
> The problem is not UEFI related.

Oh dear, that is definitely quite slow.

>
> >
> > Well, the problem is that the hardware watchdog has a maximum period of
> > 16 seconds, I believe.
> >
>
> Yes, Sunxi is limited to 16 seconds.

I suppose it isn't possible to disable the watchdog once it is enabled?

Regards,
Simon

Reply via email to