The lockup detector is broken in several ways:

    - It's deadlock prone vs. CPU hotplug in various ways. Some of these
      are due to recursive cpus_read_lock(), others are due to
      cpus_read_lock() being taken from CPU hotplug callbacks, which
      immediately locks up the machine because cpus are write locked.

    - The handling of the CPU hotplug threads happens sideways to the
      smpboot thread infrastructure, which is racy and pointless.

    - The handling of the user space sysctl interface is a complete
      trainwreck as it fiddles directly with variables which can be
      modified or evaluated by the running watchdogs.

    - The perf event initialization is a steaming pile of duct tape as it
      idiotically tries to create perf events over and over even if perf is
      not functional (no hardware, ....). To avoid excessive dmesg spam it
      contains magic printk ratelimiting along with either wrong or useless
      messages.

    - The code structure is horrible as ifdef sections are scattered all
      over the place, which makes it unreadable.

    - There is more wreckage, but see the changelogs for the ugly details.
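
To illustrate the perf event point above: a minimal userspace sketch of
"probe once and remember the result" instead of retrying the event
creation for every CPU. All names here (fake_perf_event_create(),
hld_available(), hld_enable_cpu()) are hypothetical stand-ins made up for
the illustration. This is not the code from the series.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the real event creation. It always fails
 * here to model a machine without usable perf hardware. */
static int create_attempts;

static bool fake_perf_event_create(int cpu)
{
	(void)cpu;
	create_attempts++;
	return false;
}

enum probe_state { PROBE_UNKNOWN, PROBE_OK, PROBE_FAILED };
static enum probe_state hld_probe_state = PROBE_UNKNOWN;

/* Probe exactly once and cache the outcome, so a non-functional perf
 * is detected a single time and can be reported a single time. */
static bool hld_available(void)
{
	if (hld_probe_state == PROBE_UNKNOWN)
		hld_probe_state = fake_perf_event_create(0) ?
				  PROBE_OK : PROBE_FAILED;
	return hld_probe_state == PROBE_OK;
}

/* Per-CPU enable path: bail out early when the probe already failed. */
static bool hld_enable_cpu(int cpu)
{
	if (!hld_available())
		return false;
	return fake_perf_event_create(cpu);
}
```

With a cached probe result, calling hld_enable_cpu() for any number of
CPUs attempts the event creation only once when perf is not functional,
so no printk ratelimiting is needed to keep dmesg quiet.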

Before I get utterly grumpy, I just pretend that I don't give a sh*t!

The following series sanitizes the facility and addresses the problems.

Thanks,

        tglx
---
 arch/parisc/kernel/process.c   |    2 
 arch/powerpc/kernel/watchdog.c |   22 -
 arch/x86/events/intel/core.c   |   11 
 include/linux/nmi.h            |  121 +++----
 include/linux/smpboot.h        |    4 
 kernel/cpu.c                   |    6 
 kernel/smpboot.c               |   22 -
 kernel/sysctl.c                |   22 -
 kernel/watchdog.c              |  638 ++++++++++++++---------------------------
 kernel/watchdog_hld.c          |  193 ++++++------
 10 files changed, 433 insertions(+), 608 deletions(-)

      
