On Thu, 31 Aug 2017 18:10:14 -0400 Don Zickus <dzic...@redhat.com> wrote:
> On Thu, Aug 31, 2017 at 09:15:58AM +0200, Thomas Gleixner wrote: > > The lockup detector is broken is several ways: > > > > - It's deadlock prone vs. CPU hotplug in various ways. Some of these > > are due to recursive cpus_read_lock() others are due to > > cpus_read_lock() from CPU hotplug callbacks which immediately lock > > the machine because cpus are write locked. > > > > - The handling of the cpu hotplug threads happens sideways to the > > smpboot thread infrastructure, which is racy and pointless > > > > - The handling of the user space sysctl interface is a complete > > trainwreck as it fiddles directly with variables which can be > > modified or evaluated by the running watchdogs. > > > > - The perf event initialization is a steaming pile of duct tape as it > > idiotically tries to create perf events over and over even if perf is > > not functional (no hardware, ....). To avoid excessive dmesg spam it > > contains magic printk ratelimiting along with either wrong or useless > > messages. > > > > - The code structure is horrible as ifdef sections are scattered all > > over the place which makes it unreadable > > > > - There is more wreckage, but see the changelogs for the ugly details. > > > > Before I get utterly grumpy, I just pretend that I don't give a sh*t! > > > > The following series sanitizes the facility and addresses the problems. > > Hi Thomas, > > Thanks for the patchset. I agree with most your issues you complained > about, just wasn't smart enough to figure out the right way to solve them. > Despite your aggressive comments, I will review the code to see if it covers > the scenarios that have popped up over the years and run some testing on my > side. Probably need a few days to do that. The powerpc bits look fine, there's no real changes pending there, so just take them through your tree if you like. I had a glance throught the series, no comments yet. The powerpc watchdog already duplicates the proc tunables rather than using them directly, so in theory it did not need the 2 stage reconfigure. In practice, it has a brown paper bag bug because it does not stop the watchdog before changing its internal variables :P 2 stage is probably safer and clearer way to go though. Thanks, Nick