On Tue 2016-12-06 11:17 -0500, Don Zickus wrote:
> On an overloaded system, it is possible that a change in the watchdog
> threshold can be delayed long enough to trigger a false positive.
> 
> This can easily be achieved by having a cpu spinning indefinitely on a task,
> while another cpu updates watchdog threshold.
> 
> What happens is while trying to park the watchdog threads, the hrtimers on the
> other cpus trigger and reprogram themselves with the new slower watchdog
> threshold.  Meanwhile, the nmi watchdog is still programmed with the old
> faster threshold.
> 
> Because the one cpu is blocked, it prevents the thread parking on the other
> cpus from completing, which is needed to shutdown the nmi watchdog and
> reprogram it correctly.  As a result, a false positive from the nmi watchdog
> is reported.
> 
> Fix this by setting a park_in_progress flag to block all lockups
> until the parking is complete.
> 
> Fix provided by Ulrich Obergfell.
> 
> Cc: Ulrich Obergfell <uober...@redhat.com>
> Signed-off-by: Don Zickus <dzic...@redhat.com>
> ---
>  include/linux/nmi.h   | 1 +
>  kernel/watchdog.c     | 9 +++++++++
>  kernel/watchdog_hld.c | 3 +++
>  3 files changed, 13 insertions(+)

Looks fine to me.
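
For readers following the thread, the gating idea in the quoted description
can be sketched in user-space C.  This is only an illustration under stated
assumptions, not the code from the patch: struct wd_state, check_lockup(),
begin_park() and end_park() are hypothetical names, and C11 atomics stand in
for whatever primitive the kernel change actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the park_in_progress flag named in the patch. */
static atomic_int park_in_progress;

/* Hypothetical per-CPU watchdog state, measured in abstract ticks. */
struct wd_state {
    unsigned long last_touch;   /* last tick at which the CPU made progress */
    unsigned long threshold;    /* ticks without progress that count as a lockup */
};

/* Set the flag before parking the watchdog threads. */
static void begin_park(void)
{
    atomic_store(&park_in_progress, 1);
}

/* Clear the flag once every thread has been parked and reprogrammed. */
static void end_park(void)
{
    atomic_store(&park_in_progress, 0);
}

/*
 * Return true if a lockup should be reported.  While parking is in
 * progress the check is skipped, because the timers may be running
 * with mismatched (old and new) thresholds.
 */
static bool check_lockup(const struct wd_state *s, unsigned long now)
{
    if (atomic_load(&park_in_progress) != 0)
        return false;
    return now - s->last_touch > s->threshold;
}
```

With the flag set, a CPU that would otherwise exceed its threshold is not
reported; once the flag is cleared, detection resumes unchanged.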

Reviewed-by: Aaron Tomlin <atom...@redhat.com>

-- 
Aaron Tomlin
