Re: [RFC PATCH 22/23] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter

2018-06-13 Thread Randy Dunlap
On 06/13/2018 05:58 PM, Ricardo Neri wrote:
> On Tue, Jun 12, 2018 at 10:26:57PM -0700, Randy Dunlap wrote:
>> On 06/12/2018 05:57 PM, Ricardo Neri wrote:
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>>> b/Documentation/admin-guide/kernel-parameters.txt
>>> index f2040d4..a8833c7 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -2577,7 +2577,7 @@
>>> Format: [state][,regs][,debounce][,die]
>>>  
>>> nmi_watchdog=   [KNL,BUGS=X86] Debugging features for SMP kernels
>>> -   Format: [panic,][nopanic,][num]
>>> +   Format: [panic,][nopanic,][num,][hpet]
>>> Valid num: 0 or 1
>>> 0 - turn hardlockup detector in nmi_watchdog off
>>> 1 - turn hardlockup detector in nmi_watchdog on
>>
>> This says that I can use "nmi_watchdog=hpet" without using 0 or 1.
>> Is that correct?
> 
> Yes, this what I meant. In my view, if you set nmi_watchdog=hpet it
> implies that you want to activate the NMI watchdog. In this case, perf.
> 
> I can see how this will be ambiguous for the case of perf and arch NMI
> watchdogs.
> 
> Alternative, a new parameter could be added; such as nmi_watchdog_type. I
> didn't want to add it in this patchset as I think that a single parameter
> can handle the enablement and type of the NMI watchdog.
> 
> What do you think?

I think it's OK like it is.

thanks,
-- 
~Randy
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations

2018-06-13 Thread Nicholas Piggin
On Wed, 13 Jun 2018 18:31:17 -0700
Ricardo Neri  wrote:

> On Wed, Jun 13, 2018 at 09:52:25PM +1000, Nicholas Piggin wrote:
> > On Wed, 13 Jun 2018 11:26:49 +0200 (CEST)
> > Thomas Gleixner  wrote:
> >   
> > > On Wed, 13 Jun 2018, Peter Zijlstra wrote:  
> > > > On Wed, Jun 13, 2018 at 05:41:41PM +1000, Nicholas Piggin wrote:
> > > > > On Tue, 12 Jun 2018 17:57:32 -0700
> > > > > Ricardo Neri  wrote:
> > > > > 
> > > > > > Instead of exposing individual functions for the operations of the 
> > > > > > NMI
> > > > > > watchdog, define a common interface that can be used across multiple
> > > > > > implementations.
> > > > > > 
> > > > > > The struct nmi_watchdog_ops is defined for such operations. These 
> > > > > > initial
> > > > > > definitions include the enable, disable, start, stop, and cleanup
> > > > > > operations.
> > > > > > 
> > > > > > Only a single NMI watchdog can be used in the system. The 
> > > > > > operations of
> > > > > > this NMI watchdog are accessed via the new variable nmi_wd_ops. This
> > > > > > variable is set to point the operations of the first NMI watchdog 
> > > > > > that
> > > > > > initializes successfully. Even though at this moment, the only 
> > > > > > available
> > > > > > NMI watchdog is the perf-based hardlockup detector. More 
> > > > > > implementations
> > > > > > can be added in the future.
> > > > > 
> > > > > Cool, this looks pretty nice at a quick glance. sparc and powerpc at
> > > > > least have their own NMI watchdogs, it would be good to have those
> > > > > converted as well.
> > > > 
> > > > Yeah, agreed, this looks like half a patch.
> > > 
> > > Though I'm not seeing the advantage of it. That kind of NMI watchdogs are
> > > low level architecture details so having yet another 'ops' data structure
> > > with a gazillion of callbacks, checks and indirections does not provide
> > > value over the currently available weak stubs.  
> > 
> > The other way to go of course is librify the perf watchdog and make an
> > x86 watchdog that selects between perf and hpet... I also probably
> > prefer that for code such as this, but I wouldn't strongly object to
> > ops struct if I'm not writing the code. It's not that bad is it?  
> 
> My motivation to add the ops was that the hpet and perf watchdog share
> significant portions of code.

Right, a good motivation.

> I could look into creating the library for
> common code and relocate the hpet watchdog into arch/x86 for the hpet-
> specific parts.

If you can investigate that approach, that would be appreciated. I hope
I did not misunderstand you there, Thomas.

Basically you would have perf infrastructure and hpet infrastructure,
and then the x86 watchdog driver will use one or the other of those. The
generic watchdog driver will be just a simple shim that uses the perf
infrastructure. Then hopefully the powerpc driver would require almost
no change.

Thanks,
Nick
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 14/23] watchdog/hardlockup: Decouple the hardlockup detector from perf

2018-06-13 Thread Nicholas Piggin
On Wed, 13 Jun 2018 18:19:01 -0700
Ricardo Neri  wrote:

> On Wed, Jun 13, 2018 at 10:43:24AM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 12, 2018 at 05:57:34PM -0700, Ricardo Neri wrote:  
> > > The current default implementation of the hardlockup detector assumes that
> > > it is implemented using perf events.  
> > 
> > The sparc and powerpc things are very much not using perf.  
> 
> Isn't it true that the current hardlockup detector
> (under kernel/watchdog_hld.c) is based on perf?

arch/powerpc/kernel/watchdog.c is a powerpc implementation that uses
the kernel/watchdog_hld.c framework.

> As far as I understand,
> this hardlockup detector is constructed using perf events for architectures
> that don't provide an NMI watchdog. Perhaps I can be more specific and say
> that this synthetized detector is based on perf.

The perf detector is like that, but we want NMI watchdogs to share
the watchdog_hld code as much as possible even for arch specific NMI
watchdogs, so that kernel and user interfaces and behaviour are
consistent.

Other arch watchdogs like sparc are a little older so they are not
using HLD. You don't have to change those for your series, but it
would be good to bring them into the fold if possible at some time.
IIRC sparc was slightly non-trivial because it has some differences
in sysctl or cmdline APIs that we don't want to break.

But powerpc at least needs to be updated if you change hld apis.

Thanks,
Nick
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations

2018-06-13 Thread Ricardo Neri
On Wed, Jun 13, 2018 at 09:52:25PM +1000, Nicholas Piggin wrote:
> On Wed, 13 Jun 2018 11:26:49 +0200 (CEST)
> Thomas Gleixner  wrote:
> 
> > On Wed, 13 Jun 2018, Peter Zijlstra wrote:
> > > On Wed, Jun 13, 2018 at 05:41:41PM +1000, Nicholas Piggin wrote:  
> > > > On Tue, 12 Jun 2018 17:57:32 -0700
> > > > Ricardo Neri  wrote:
> > > >   
> > > > > Instead of exposing individual functions for the operations of the NMI
> > > > > watchdog, define a common interface that can be used across multiple
> > > > > implementations.
> > > > > 
> > > > > The struct nmi_watchdog_ops is defined for such operations. These 
> > > > > initial
> > > > > definitions include the enable, disable, start, stop, and cleanup
> > > > > operations.
> > > > > 
> > > > > Only a single NMI watchdog can be used in the system. The operations 
> > > > > of
> > > > > this NMI watchdog are accessed via the new variable nmi_wd_ops. This
> > > > > variable is set to point the operations of the first NMI watchdog that
> > > > > initializes successfully. Even though at this moment, the only 
> > > > > available
> > > > > NMI watchdog is the perf-based hardlockup detector. More 
> > > > > implementations
> > > > > can be added in the future.  
> > > > 
> > > > Cool, this looks pretty nice at a quick glance. sparc and powerpc at
> > > > least have their own NMI watchdogs, it would be good to have those
> > > > converted as well.  
> > > 
> > > Yeah, agreed, this looks like half a patch.  
> > 
> > Though I'm not seeing the advantage of it. That kind of NMI watchdogs are
> > low level architecture details so having yet another 'ops' data structure
> > with a gazillion of callbacks, checks and indirections does not provide
> > value over the currently available weak stubs.
> 
> The other way to go of course is librify the perf watchdog and make an
> x86 watchdog that selects between perf and hpet... I also probably
> prefer that for code such as this, but I wouldn't strongly object to
> ops struct if I'm not writing the code. It's not that bad is it?

My motivation to add the ops was that the hpet and perf watchdog share
significant portions of code. I could look into creating the library for
common code and relocate the hpet watchdog into arch/x86 for the hpet-
specific parts.

Thanks and BR,
Ricardo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations

2018-06-13 Thread Ricardo Neri
On Wed, Jun 13, 2018 at 10:42:19AM +0200, Peter Zijlstra wrote:
> On Wed, Jun 13, 2018 at 05:41:41PM +1000, Nicholas Piggin wrote:
> > On Tue, 12 Jun 2018 17:57:32 -0700
> > Ricardo Neri  wrote:
> > 
> > > Instead of exposing individual functions for the operations of the NMI
> > > watchdog, define a common interface that can be used across multiple
> > > implementations.
> > > 
> > > The struct nmi_watchdog_ops is defined for such operations. These initial
> > > definitions include the enable, disable, start, stop, and cleanup
> > > operations.
> > > 
> > > Only a single NMI watchdog can be used in the system. The operations of
> > > this NMI watchdog are accessed via the new variable nmi_wd_ops. This
> > > variable is set to point the operations of the first NMI watchdog that
> > > initializes successfully. Even though at this moment, the only available
> > > NMI watchdog is the perf-based hardlockup detector. More implementations
> > > can be added in the future.
> > 
> > Cool, this looks pretty nice at a quick glance. sparc and powerpc at
> > least have their own NMI watchdogs, it would be good to have those
> > converted as well.
> 
> Yeah, agreed, this looks like half a patch.

I planned to look into the conversion of sparc and powerpc. I just wanted
to see the reception to these patches before jumping and do potentially
useless work. Comments in this thread lean towards keep using the weak
stubs.

Thanks and BR,
Ricardo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 14/23] watchdog/hardlockup: Decouple the hardlockup detector from perf

2018-06-13 Thread Ricardo Neri
On Wed, Jun 13, 2018 at 10:43:24AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 05:57:34PM -0700, Ricardo Neri wrote:
> > The current default implementation of the hardlockup detector assumes that
> > it is implemented using perf events.
> 
> The sparc and powerpc things are very much not using perf.

Isn't it true that the current hardlockup detector
(under kernel/watchdog_hld.c) is based on perf? As far as I understand,
this hardlockup detector is constructed using perf events for architectures
that don't provide an NMI watchdog. Perhaps I can be more specific and say
that this synthetized detector is based on perf.

On a side note, I saw that powerpc might use a perf-based hardlockup
detector if it has perf events [1].

Please let me know if my understanding is not correct.

Thanks and BR,
Ricardo

[1]. https://elixir.bootlin.com/linux/v4.17/source/arch/powerpc/Kconfig#L218

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 16/23] watchdog/hardlockup: Add an HPET-based hardlockup detector

2018-06-13 Thread Ricardo Neri
On Tue, Jun 12, 2018 at 10:23:47PM -0700, Randy Dunlap wrote:
> Hi,

Hi Randy,

> 
> On 06/12/2018 05:57 PM, Ricardo Neri wrote:
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index c40c7b7..6e79833 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -828,6 +828,16 @@ config HARDLOCKUP_DETECTOR_PERF
> > bool
> > select SOFTLOCKUP_DETECTOR
> >  
> > +config HARDLOCKUP_DETECTOR_HPET
> > +   bool "Use HPET Timer for Hard Lockup Detection"
> > +   select SOFTLOCKUP_DETECTOR
> > +   select HARDLOCKUP_DETECTOR
> > +   depends on HPET_TIMER && HPET
> > +   help
> > + Say y to enable a hardlockup detector that is driven by an 
> > High-Precision
> > + Event Timer. In addition to selecting this option, the command-line
> > + parameter nmi_watchdog option. See 
> > Documentation/admin-guide/kernel-parameters.rst
> 
> The "In addition ..." thing is a broken (incomplete) sentence.

Oops. I apologize. I missed this I will fix it in my next version.

Thanks and BR,
Ricardo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 22/23] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter

2018-06-13 Thread Ricardo Neri
On Tue, Jun 12, 2018 at 10:26:57PM -0700, Randy Dunlap wrote:
> On 06/12/2018 05:57 PM, Ricardo Neri wrote:
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> > b/Documentation/admin-guide/kernel-parameters.txt
> > index f2040d4..a8833c7 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -2577,7 +2577,7 @@
> > Format: [state][,regs][,debounce][,die]
> >  
> > nmi_watchdog=   [KNL,BUGS=X86] Debugging features for SMP kernels
> > -   Format: [panic,][nopanic,][num]
> > +   Format: [panic,][nopanic,][num,][hpet]
> > Valid num: 0 or 1
> > 0 - turn hardlockup detector in nmi_watchdog off
> > 1 - turn hardlockup detector in nmi_watchdog on
> 
> This says that I can use "nmi_watchdog=hpet" without using 0 or 1.
> Is that correct?

Yes, this what I meant. In my view, if you set nmi_watchdog=hpet it
implies that you want to activate the NMI watchdog. In this case, perf.

I can see how this will be ambiguous for the case of perf and arch NMI
watchdogs.

Alternative, a new parameter could be added; such as nmi_watchdog_type. I
didn't want to add it in this patchset as I think that a single parameter
can handle the enablement and type of the NMI watchdog.

What do you think?

Thanks and BR,
Ricardo
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations

2018-06-13 Thread Nicholas Piggin
On Wed, 13 Jun 2018 11:26:49 +0200 (CEST)
Thomas Gleixner  wrote:

> On Wed, 13 Jun 2018, Peter Zijlstra wrote:
> > On Wed, Jun 13, 2018 at 05:41:41PM +1000, Nicholas Piggin wrote:  
> > > On Tue, 12 Jun 2018 17:57:32 -0700
> > > Ricardo Neri  wrote:
> > >   
> > > > Instead of exposing individual functions for the operations of the NMI
> > > > watchdog, define a common interface that can be used across multiple
> > > > implementations.
> > > > 
> > > > The struct nmi_watchdog_ops is defined for such operations. These 
> > > > initial
> > > > definitions include the enable, disable, start, stop, and cleanup
> > > > operations.
> > > > 
> > > > Only a single NMI watchdog can be used in the system. The operations of
> > > > this NMI watchdog are accessed via the new variable nmi_wd_ops. This
> > > > variable is set to point the operations of the first NMI watchdog that
> > > > initializes successfully. Even though at this moment, the only available
> > > > NMI watchdog is the perf-based hardlockup detector. More implementations
> > > > can be added in the future.  
> > > 
> > > Cool, this looks pretty nice at a quick glance. sparc and powerpc at
> > > least have their own NMI watchdogs, it would be good to have those
> > > converted as well.  
> > 
> > Yeah, agreed, this looks like half a patch.  
> 
> Though I'm not seeing the advantage of it. That kind of NMI watchdogs are
> low level architecture details so having yet another 'ops' data structure
> with a gazillion of callbacks, checks and indirections does not provide
> value over the currently available weak stubs.

The other way to go of course is librify the perf watchdog and make an
x86 watchdog that selects between perf and hpet... I also probably
prefer that for code such as this, but I wouldn't strongly object to
ops struct if I'm not writing the code. It's not that bad is it?

Thanks,
Nick
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Julien Thierry




On 13/06/18 10:57, Thomas Gleixner wrote:

On Wed, 13 Jun 2018, Julien Thierry wrote:

On 13/06/18 10:20, Thomas Gleixner wrote:

Adding NMI delivery support at low level architecture irq chip level is
perfectly fine, but the exposure of that needs to be restricted very
much. Adding it to the generic interrupt control interfaces is not going to
happen. That's doomed to begin with and a complete abuse of the interface
as the handler can not ever be used for that.



Understood, however the need would be to provide a way for a driver to request
an interrupt to be delivered as an NMI (if irqchip supports it).


s/driver/specialized code written by people who know what they are doing/


But from your response this would be out of the question (in the
interrupt/irq/irqchip definitions).


Adding some magic to the irq chip is fine, because that's where the low
level integration needs to be done, but exposing it through the generic
interrupt subsystem is a NONO for obvious reasons.


Or somehow the concerned irqchip informs the arch it supports NMI delivery and
it is up to the interested drivers to query the arch whether NMI delivery is
supported by the system?


Yes, we need some infrastructure for that, but that needs to be separate
and with very limited exposure.



Right, makes sense. I'll check with Marc how such an infrastructure 
should be introduced.


Thanks,

--
Julien Thierry
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Marc Zyngier
On 13/06/18 10:20, Thomas Gleixner wrote:
> On Wed, 13 Jun 2018, Julien Thierry wrote:
>> On 13/06/18 09:34, Peter Zijlstra wrote:
>>> On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:
 diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
 index 5426627..dbc5e02 100644
 --- a/include/linux/interrupt.h
 +++ b/include/linux/interrupt.h
 @@ -61,6 +61,8 @@
*interrupt handler after suspending interrupts. For
 system
*wakeup devices users need to implement wakeup
 detection in
*their interrupt handlers.
 + * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as
 non-maskable, if
 + *supported by the chip.
*/
>>>
>>> NAK on the first 6 patches. You really _REALLY_ don't want to expose
>>> NMIs to this level.
>>>
>>
>> I've been working on something similar on arm64 side, and effectively the one
>> thing that might be common to arm64 and intel is the interface to set an
>> interrupt as NMI. So I guess it would be nice to agree on the right approach
>> for this.
>>
>> The way I did it was by introducing a new irq_state and let the irqchip 
>> driver
>> handle most of the work (if it supports that state):
>>
>> https://lkml.org/lkml/2018/5/25/181
>>
>> This has not been ACKed nor NAKed. So I am just asking whether this is a more
>> suitable approach, and if not, is there any suggestions on how to do this?
> 
> I really didn't pay attention to that as it's burried in the GIC/ARM series
> which is usually Marc's playground.

I'm working my way through it ATM now that I have some brain cycles back.

> Adding NMI delivery support at low level architecture irq chip level is
> perfectly fine, but the exposure of that needs to be restricted very
> much. Adding it to the generic interrupt control interfaces is not going to
> happen. That's doomed to begin with and a complete abuse of the interface
> as the handler can not ever be used for that.

I can only agree with that. Allowing random driver to use request_irq()
to make anything an NMI ultimately turns it into a complete mess ("hey,
NMI is *faster*, let's use that"), and a potential source of horrible
deadlocks.

What I'd find more palatable is a way for an irqchip to be able to
prioritize some interrupts based on a set of architecturally-defined
requirements, and a separate NMI requesting/handling framework that is
separate from the IRQ API, as the overall requirements are likely to
completely different.

It shouldn't have to be nearly as complex as the IRQ API, and require
much stricter requirements in terms of what you can do there (flow
handling should definitely be different).

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Thomas Gleixner
On Wed, 13 Jun 2018, Julien Thierry wrote:
> On 13/06/18 10:20, Thomas Gleixner wrote:
> > Adding NMI delivery support at low level architecture irq chip level is
> > perfectly fine, but the exposure of that needs to be restricted very
> > much. Adding it to the generic interrupt control interfaces is not going to
> > happen. That's doomed to begin with and a complete abuse of the interface
> > as the handler can not ever be used for that.
> > 
> 
> Understood, however the need would be to provide a way for a driver to request
> an interrupt to be delivered as an NMI (if irqchip supports it).

s/driver/specialized code written by people who know what they are doing/

> But from your response this would be out of the question (in the
> interrupt/irq/irqchip definitions).

Adding some magic to the irq chip is fine, because that's where the low
level integration needs to be done, but exposing it through the generic
interrupt subsystem is a NONO for obvious reasons.

> Or somehow the concerned irqchip informs the arch it supports NMI delivery and
> it is up to the interested drivers to query the arch whether NMI delivery is
> supported by the system?

Yes, we need some infrastructure for that, but that needs to be separate
and with very limited exposure.

Thanks,

tglx


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs

2018-06-13 Thread Thomas Gleixner
On Tue, 12 Jun 2018, Ricardo Neri wrote:
> + /* There are no CPUs to monitor. */
> + if (!cpumask_weight(>monitored_mask))
> + return NMI_HANDLED;
> +
>   inspect_for_hardlockups(regs);
>  
> + /*
> +  * Target a new CPU. Keep trying until we find a monitored CPU. CPUs
> +  * are addded and removed to this mask at cpu_up() and cpu_down(),
> +  * respectively. Thus, the interrupt should be able to be moved to
> +  * the next monitored CPU.
> +  */
> + spin_lock(_data->lock);

Yuck. Taking a spinlock from NMI ... 

> + for_each_cpu_wrap(cpu, >monitored_mask, smp_processor_id() + 1) {
> + if (!irq_set_affinity(hld_data->irq, cpumask_of(cpu)))
> + break;

... and then calling into generic interrupt code which will take even more
locks is completely broken.

Guess what happens when the NMI hits a section where one of those locks is
held? Then you need another watchdog to decode the lockup you just ran into.

Thanks,

tglx


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Julien Thierry



On 13/06/18 10:36, Julien Thierry wrote:



On 13/06/18 10:20, Thomas Gleixner wrote:

On Wed, 13 Jun 2018, Julien Thierry wrote:

On 13/06/18 09:34, Peter Zijlstra wrote:

On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5426627..dbc5e02 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -61,6 +61,8 @@
    *    interrupt handler after suspending interrupts. 
For

system
    *    wakeup devices users need to implement wakeup
detection in
    *    their interrupt handlers.
+ * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as
non-maskable, if
+ *    supported by the chip.
    */


NAK on the first 6 patches. You really _REALLY_ don't want to expose
NMIs to this level.



I've been working on something similar on arm64 side, and effectively 
the one

thing that might be common to arm64 and intel is the interface to set an
interrupt as NMI. So I guess it would be nice to agree on the right 
approach

for this.

The way I did it was by introducing a new irq_state and let the 
irqchip driver

handle most of the work (if it supports that state):

https://lkml.org/lkml/2018/5/25/181

This has not been ACKed nor NAKed. So I am just asking whether this 
is a more
suitable approach, and if not, is there any suggestions on how to do 
this?


I really didn't pay attention to that as it's burried in the GIC/ARM 
series

which is usually Marc's playground.

Adding NMI delivery support at low level architecture irq chip level is
perfectly fine, but the exposure of that needs to be restricted very
much. Adding it to the generic interrupt control interfaces is not 
going to

happen. That's doomed to begin with and a complete abuse of the interface
as the handler can not ever be used for that.



Understood, however the need would be to provide a way for a driver to 
request an interrupt to be delivered as an NMI (if irqchip supports it).


But from your response this would be out of the question (in the 
interrupt/irq/irqchip definitions).


Or somehow the concerned irqchip informs the arch it supports NMI 
delivery and it is up to the interested drivers to query the arch 
whether NMI delivery is supported by the system?


Actually scratch that last part, it is also missing a way for the driver 
to actually communicate to the irqchip that its interrupt should be 
treated as an NMI, so it wouldn't work...


--
Julien Thierry
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI

2018-06-13 Thread Thomas Gleixner
On Tue, 12 Jun 2018, Ricardo Neri wrote:
> @@ -183,6 +184,8 @@ static irqreturn_t hardlockup_detector_irq_handler(int 
> irq, void *data)
>   if (!(hdata->flags & HPET_DEV_PERI_CAP))
>   kick_timer(hdata);
>  
> + pr_err("This interrupt should not have happened. Ensure delivery mode 
> is NMI.\n");

Eeew.

>  /**
> + * hardlockup_detector_nmi_handler() - NMI Interrupt handler
> + * @val: Attribute associated with the NMI. Not used.
> + * @regs:Register values as seen when the NMI was asserted
> + *
> + * When an NMI is issued, look for hardlockups. If the timer is not periodic,
> + * kick it. The interrupt is always handled when if delivered via the
> + * Front-Side Bus.
> + *
> + * Returns:
> + *
> + * NMI_DONE if the HPET timer did not cause the interrupt. NMI_HANDLED
> + * otherwise.
> + */
> +static int hardlockup_detector_nmi_handler(unsigned int val,
> +struct pt_regs *regs)
> +{
> + struct hpet_hld_data *hdata = hld_data;
> + unsigned int use_fsb;
> +
> + /*
> +  * If FSB delivery mode is used, the timer interrupt is programmed as
> +  * edge-triggered and there is no need to check the ISR register.
> +  */
> + use_fsb = hdata->flags & HPET_DEV_FSB_CAP;
> +
> + if (!use_fsb && !is_hpet_wdt_interrupt(hdata))
> + return NMI_DONE;

So for 'use_fsb == True' every single NMI will fall through into the
watchdog code below.

> + inspect_for_hardlockups(regs);
> +
> + if (!(hdata->flags & HPET_DEV_PERI_CAP))
> + kick_timer(hdata);

And in case that the HPET does not support periodic mode this reprogramms
the timer on every NMI which means that while perf is running the watchdog
will never ever detect anything.

Aside of that, reading TWO HPET registers for every NMI is insane. HPET
access is horribly slow, so any high frequency perf monitoring will take a
massive performance hit.

Thanks,

tglx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Julien Thierry




On 13/06/18 10:20, Thomas Gleixner wrote:

On Wed, 13 Jun 2018, Julien Thierry wrote:

On 13/06/18 09:34, Peter Zijlstra wrote:

On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5426627..dbc5e02 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -61,6 +61,8 @@
*interrupt handler after suspending interrupts. For
system
*wakeup devices users need to implement wakeup
detection in
*their interrupt handlers.
+ * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as
non-maskable, if
+ *supported by the chip.
*/


NAK on the first 6 patches. You really _REALLY_ don't want to expose
NMIs to this level.



I've been working on something similar on arm64 side, and effectively the one
thing that might be common to arm64 and intel is the interface to set an
interrupt as NMI. So I guess it would be nice to agree on the right approach
for this.

The way I did it was by introducing a new irq_state and let the irqchip driver
handle most of the work (if it supports that state):

https://lkml.org/lkml/2018/5/25/181

This has not been ACKed nor NAKed. So I am just asking whether this is a more
suitable approach, and if not, is there any suggestions on how to do this?


I really didn't pay attention to that as it's burried in the GIC/ARM series
which is usually Marc's playground.

Adding NMI delivery support at low level architecture irq chip level is
perfectly fine, but the exposure of that needs to be restricted very
much. Adding it to the generic interrupt control interfaces is not going to
happen. That's doomed to begin with and a complete abuse of the interface
as the handler can not ever be used for that.



Understood, however the need would be to provide a way for a driver to 
request an interrupt to be delivered as an NMI (if irqchip supports it).


But from your response this would be out of the question (in the 
interrupt/irq/irqchip definitions).


Or somehow the concerned irqchip informs the arch it supports NMI 
delivery and it is up to the interested drivers to query the arch 
whether NMI delivery is supported by the system?


Thanks,

--
Julien Thierry
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations

2018-06-13 Thread Thomas Gleixner
On Wed, 13 Jun 2018, Peter Zijlstra wrote:
> On Wed, Jun 13, 2018 at 05:41:41PM +1000, Nicholas Piggin wrote:
> > On Tue, 12 Jun 2018 17:57:32 -0700
> > Ricardo Neri  wrote:
> > 
> > > Instead of exposing individual functions for the operations of the NMI
> > > watchdog, define a common interface that can be used across multiple
> > > implementations.
> > > 
> > > The struct nmi_watchdog_ops is defined for such operations. These initial
> > > definitions include the enable, disable, start, stop, and cleanup
> > > operations.
> > > 
> > > Only a single NMI watchdog can be used in the system. The operations of
> > > this NMI watchdog are accessed via the new variable nmi_wd_ops. This
> > > variable is set to point the operations of the first NMI watchdog that
> > > initializes successfully. Even though at this moment, the only available
> > > NMI watchdog is the perf-based hardlockup detector. More implementations
> > > can be added in the future.
> > 
> > Cool, this looks pretty nice at a quick glance. sparc and powerpc at
> > least have their own NMI watchdogs, it would be good to have those
> > converted as well.
> 
> Yeah, agreed, this looks like half a patch.

Though I'm not seeing the advantage of it. That kind of NMI watchdogs are
low level architecture details so having yet another 'ops' data structure
with a gazillion of callbacks, checks and indirections does not provide
value over the currently available weak stubs.

> > Is hpet a cross platform thing, or just x86? We should avoid
> > proliferation of files under kernel/ I think, so with these watchdog
> > driver structs then maybe implementations could go in drivers/ or
> > arch/
> 
> HPET is mostly an x86 thing (altough it can be found elsewhere), but the

On ia64 and I doubt that anyone wants to take on the task of underwater
welding it to Itanic.

> whole thing relies on the x86 NMI mechanism and is thus firmly arch/
> material (like the sparc and ppc thing).

Right. Trying to make this 'generic' is not really solving anything.

Thanks,

tglx

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Thomas Gleixner
On Wed, 13 Jun 2018, Julien Thierry wrote:
> On 13/06/18 09:34, Peter Zijlstra wrote:
> > On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:
> > > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> > > index 5426627..dbc5e02 100644
> > > --- a/include/linux/interrupt.h
> > > +++ b/include/linux/interrupt.h
> > > @@ -61,6 +61,8 @@
> > >*interrupt handler after suspending interrupts. For
> > > system
> > >*wakeup devices users need to implement wakeup
> > > detection in
> > >*their interrupt handlers.
> > > + * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as
> > > non-maskable, if
> > > + *supported by the chip.
> > >*/
> > 
> > NAK on the first 6 patches. You really _REALLY_ don't want to expose
> > NMIs to this level.
> > 
> 
> I've been working on something similar on arm64 side, and effectively the one
> thing that might be common to arm64 and intel is the interface to set an
> interrupt as NMI. So I guess it would be nice to agree on the right approach
> for this.
> 
> The way I did it was by introducing a new irq_state and let the irqchip driver
> handle most of the work (if it supports that state):
> 
> https://lkml.org/lkml/2018/5/25/181
>
> This has not been ACKed nor NAKed. So I am just asking whether this is a more
> suitable approach, and if not, is there any suggestions on how to do this?

I really didn't pay attention to that as it's burried in the GIC/ARM series
which is usually Marc's playground.

Adding NMI delivery support at low level architecture irq chip level is
perfectly fine, but the exposure of that needs to be restricted very
much. Adding it to the generic interrupt control interfaces is not going to
happen. That's doomed to begin with and a complete abuse of the interface
as the handler can not ever be used for that.

Thanks,

tglx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Julien Thierry

Hi Peter, Ricardo,

On 13/06/18 09:34, Peter Zijlstra wrote:

On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5426627..dbc5e02 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -61,6 +61,8 @@
   *interrupt handler after suspending interrupts. For system
   *wakeup devices users need to implement wakeup detection in
   *their interrupt handlers.
+ * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as non-maskable, 
if
+ *supported by the chip.
   */


NAK on the first 6 patches. You really _REALLY_ don't want to expose
NMIs to this level.



I've been working on something similar on arm64 side, and effectively 
the one thing that might be common to arm64 and intel is the interface 
to set an interrupt as NMI. So I guess it would be nice to agree on the 
right approach for this.


The way I did it was by introducing a new irq_state and let the irqchip 
driver handle most of the work (if it supports that state):


https://lkml.org/lkml/2018/5/25/181

This has not been ACKed nor NAKed. So I am just asking whether this is a 
more suitable approach, and if not, is there any suggestions on how to 
do this?


Thanks,

--
Julien Thierry
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI

2018-06-13 Thread Peter Zijlstra
On Tue, Jun 12, 2018 at 05:57:37PM -0700, Ricardo Neri wrote:

+static bool is_hpet_wdt_interrupt(struct hpet_hld_data *hdata)
+{
+   unsigned long this_isr;
+   unsigned int lvl_trig;
+
+   this_isr = hpet_readl(HPET_STATUS) & BIT(hdata->num);
+
+   lvl_trig = hpet_readl(HPET_Tn_CFG(hdata->num)) & HPET_TN_LEVEL;
+
+   if (lvl_trig && this_isr)
+   return true;
+
+   return false;
+}

> +static int hardlockup_detector_nmi_handler(unsigned int val,
> +struct pt_regs *regs)
> +{
> + struct hpet_hld_data *hdata = hld_data;
> + unsigned int use_fsb;
> +
> + /*
> +  * If FSB delivery mode is used, the timer interrupt is programmed as
> +  * edge-triggered and there is no need to check the ISR register.
> +  */
> + use_fsb = hdata->flags & HPET_DEV_FSB_CAP;

Please do explain.. That FSB thing basically means MSI. But there's only
a single NMI vector. How do we know this NMI came from the HPET?

> +
> + if (!use_fsb && !is_hpet_wdt_interrupt(hdata))

So you add _2_ HPET reads for every single NMI that gets triggered...
and IIRC HPET reads are _slloww_.

> + return NMI_DONE;
> +
> + inspect_for_hardlockups(regs);
> +
> + if (!(hdata->flags & HPET_DEV_PERI_CAP))
> + kick_timer(hdata);
> +
> + /* Acknowledge interrupt if in level-triggered mode */
> + if (!use_fsb)
> + hpet_writel(BIT(hdata->num), HPET_STATUS);
> +
> + return NMI_HANDLED;

So if I read this right, when in FSB/MSI mode, we'll basically _always_
claim every single NMI as handled?

That's broken.

> +}
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 14/23] watchdog/hardlockup: Decouple the hardlockup detector from perf

2018-06-13 Thread Peter Zijlstra
On Tue, Jun 12, 2018 at 05:57:34PM -0700, Ricardo Neri wrote:
> The current default implementation of the hardlockup detector assumes that
> it is implemented using perf events.

The sparc and powerpc things are very much not using perf.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations

2018-06-13 Thread Peter Zijlstra
On Wed, Jun 13, 2018 at 05:41:41PM +1000, Nicholas Piggin wrote:
> On Tue, 12 Jun 2018 17:57:32 -0700
> Ricardo Neri  wrote:
> 
> > Instead of exposing individual functions for the operations of the NMI
> > watchdog, define a common interface that can be used across multiple
> > implementations.
> > 
> > The struct nmi_watchdog_ops is defined for such operations. These initial
> > definitions include the enable, disable, start, stop, and cleanup
> > operations.
> > 
> > Only a single NMI watchdog can be used in the system. The operations of
> > this NMI watchdog are accessed via the new variable nmi_wd_ops. This
> > variable is set to point the operations of the first NMI watchdog that
> > initializes successfully. Even though at this moment, the only available
> > NMI watchdog is the perf-based hardlockup detector. More implementations
> > can be added in the future.
> 
> Cool, this looks pretty nice at a quick glance. sparc and powerpc at
> least have their own NMI watchdogs, it would be good to have those
> converted as well.

Yeah, agreed, this looks like half a patch.

> Is hpet a cross platform thing, or just x86? We should avoid
> proliferation of files under kernel/ I think, so with these watchdog
> driver structs then maybe implementations could go in drivers/ or
> arch/

HPET is mostly an x86 thing (altough it can be found elsewhere), but the
whole thing relies on the x86 NMI mechanism and is thus firmly arch/
material (like the sparc and ppc thing).
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI

2018-06-13 Thread Peter Zijlstra
On Tue, Jun 12, 2018 at 05:57:23PM -0700, Ricardo Neri wrote:
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 5426627..dbc5e02 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -61,6 +61,8 @@
>   *interrupt handler after suspending interrupts. For system
>   *wakeup devices users need to implement wakeup detection in
>   *their interrupt handlers.
> + * IRQF_DELIVER_AS_NMI - Configure interrupt to be delivered as 
> non-maskable, if
> + *supported by the chip.
>   */

NAK on the first 6 patches. You really _REALLY_ don't want to expose
NMIs to this level.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations

2018-06-13 Thread Nicholas Piggin
On Tue, 12 Jun 2018 17:57:32 -0700
Ricardo Neri  wrote:

> Instead of exposing individual functions for the operations of the NMI
> watchdog, define a common interface that can be used across multiple
> implementations.
> 
> The struct nmi_watchdog_ops is defined for such operations. These initial
> definitions include the enable, disable, start, stop, and cleanup
> operations.
> 
> Only a single NMI watchdog can be used in the system. The operations of
> this NMI watchdog are accessed via the new variable nmi_wd_ops. This
> variable is set to point the operations of the first NMI watchdog that
> initializes successfully. Even though at this moment, the only available
> NMI watchdog is the perf-based hardlockup detector. More implementations
> can be added in the future.

Cool, this looks pretty nice at a quick glance. sparc and powerpc at
least have their own NMI watchdogs, it would be good to have those
converted as well.

Is hpet a cross platform thing, or just x86? We should avoid
proliferation of files under kernel/ I think, so with these watchdog
driver structs then maybe implementations could go in drivers/ or
arch/

Thanks,
Nick
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu