Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-22 Thread Josh Hunt
On Mon, Aug 19, 2019 at 4:16 PM Josh Hunt wrote: > > On Mon, Aug 19, 2019 at 2:17 PM Josh Hunt wrote: > > > > On Mon, Aug 12, 2019 at 12:42 PM Josh Hunt wrote: > > > > > > On Mon, Aug 12, 2019 at 12:34 PM Thomas Gleixner > > > wrote: > > > > > > > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > >

Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-19 Thread Josh Hunt
On Mon, Aug 19, 2019 at 2:17 PM Josh Hunt wrote: > > On Mon, Aug 12, 2019 at 12:42 PM Josh Hunt wrote: > > > > On Mon, Aug 12, 2019 at 12:34 PM Thomas Gleixner wrote: > > > > > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > > > On Mon, Aug 12, 2019 at 10:55 AM Thomas Gleixner > > > > wrote: > > >

Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-19 Thread Josh Hunt
On Mon, Aug 12, 2019 at 12:42 PM Josh Hunt wrote: > > On Mon, Aug 12, 2019 at 12:34 PM Thomas Gleixner wrote: > > > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > > On Mon, Aug 12, 2019 at 10:55 AM Thomas Gleixner > > > wrote: > > > > > > > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > > > > Was the

Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-12 Thread Josh Hunt
On Mon, Aug 12, 2019 at 12:34 PM Thomas Gleixner wrote: > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > On Mon, Aug 12, 2019 at 10:55 AM Thomas Gleixner wrote: > > > > > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > > > Was there any progress made on debugging this issue? We are still > > > > seeing

Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-12 Thread Thomas Gleixner
On Mon, 12 Aug 2019, Josh Hunt wrote: > On Mon, Aug 12, 2019 at 10:55 AM Thomas Gleixner wrote: > > > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > > Was there any progress made on debugging this issue? We are still > > > seeing it on 4.19.44: > > > > I haven't seen anyone looking at this. > > > > C

Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-12 Thread Josh Hunt
On Mon, Aug 12, 2019 at 10:55 AM Thomas Gleixner wrote: > > On Mon, 12 Aug 2019, Josh Hunt wrote: > > Was there any progress made on debugging this issue? We are still > > seeing it on 4.19.44: > > I haven't seen anyone looking at this. > > Can you please try the patch Ingo posted: > > https://l

Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-12 Thread Thomas Gleixner
On Mon, 12 Aug 2019, Josh Hunt wrote: > Was there any progress made on debugging this issue? We are still > seeing it on 4.19.44: I haven't seen anyone looking at this. Can you please try the patch Ingo posted: https://lore.kernel.org/lkml/20150501070226.gb18...@gmail.com/ and if it fixes the

Re: Long standing kernel warning: perfevents: irq loop stuck!

2019-08-12 Thread Josh Hunt
that tick really > fast (like cycles, uops retired etc.) > > -Andi Was there any progress made on debugging this issue? We are still seeing it on 4.19.44: [ 2660.685392] --------[ cut here ] [ 2660.685392] perfevents: irq loop stuck! [ 2660.685392] WARNING: CPU: 1 PID: 4436 at arc

Re: Long standing kernel warning: perfevents: irq loop stuck!

2018-02-26 Thread Andi Kleen
> Given the HSD143 errata and its possible relevance, have you tried > changing the magic number to 32, does it then still fix things? > > No real objection to the patch as such, it just needs a coherent comment > and a tested-by tag I think. 128 min period will affect a lot of valid use cases wi

Re: Long standing kernel warning: perfevents: irq loop stuck!

2018-02-26 Thread Cong Wang
On Fri, Feb 23, 2018 at 4:14 AM, Peter Zijlstra wrote: > On Thu, Feb 22, 2018 at 08:59:47PM -0800, Cong Wang wrote: >> Hello, >> >> We keep seeing the following kernel warning from 3.10 kernel to 4.9 >> kernel, it exists for a rather long time. >> >> Google search shows there was a patch from Ingo

Re: Long standing kernel warning: perfevents: irq loop stuck!

2018-02-23 Thread Peter Zijlstra
On Thu, Feb 22, 2018 at 08:59:47PM -0800, Cong Wang wrote: > Hello, > > We keep seeing the following kernel warning from 3.10 kernel to 4.9 > kernel, it exists for a rather long time. > > Google search shows there was a patch from Ingo: > https://patchwork.kernel.org/patch/6308681/ > > but it do

Long standing kernel warning: perfevents: irq loop stuck!

2018-02-22 Thread Cong Wang
(7710 > 7696), lowering kernel.perf_event_max_sample_rate to 25000 [14751.091121] perfevents: irq loop stuck! [14751.095169] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 4.099 msecs [14751.103265] perf: interrupt took too long (40100 > 9637), lowering kernel.perf_event_max_

Re: perf: WARNING perfevents: irq loop stuck!

2015-05-18 Thread Vince Weaver
On Fri, 8 May 2015, Ingo Molnar wrote: > > * Ingo Molnar wrote: > > > > > * Vince Weaver wrote: > > > > > So this is just a warning, and I've reported it before, but the > > > perf_fuzzer triggers this fairly regularly on my Haswell system. > > > > > > It looks like fixed counter 0 (retire

Re: perf: WARNING perfevents: irq loop stuck!

2015-05-08 Thread Ingo Molnar
* Ingo Molnar wrote: > > * Vince Weaver wrote: > > > So this is just a warning, and I've reported it before, but the > > perf_fuzzer triggers this fairly regularly on my Haswell system. > > > > It looks like fixed counter 0 (retired instructions) being set to > > fffe occasiona

Re: perf: WARNING perfevents: irq loop stuck!

2015-05-08 Thread Ingo Molnar
* Vince Weaver wrote: > On Fri, 1 May 2015, Ingo Molnar wrote: > > > So fffe corresponds to 2 events left until overflow, > > right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we > > allow these super short periods. > > > > Maybe like on Broadwell we need a quirk
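
The "2 events left until overflow" reasoning and the idea of a minimum-period quirk can be sketched in a few lines of C. This is only an illustration, not the kernel's code: the helper names `events_until_overflow` and `clamp_min_period` are hypothetical, the 48-bit counter width is an assumption, and the floor of 128 is the figure mentioned later in this thread, not a confirmed hardware limit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many more events a counter of the given
 * width (in bits) can count before it wraps and raises an overflow
 * interrupt. The width is an assumption supplied by the caller.
 */
static uint64_t events_until_overflow(uint64_t raw, unsigned width)
{
    assert(width > 0 && width < 64);
    uint64_t mask = (UINT64_C(1) << width) - 1;
    return (mask - (raw & mask)) + 1;
}

/*
 * Hypothetical quirk: never let the remaining period drop below a
 * per-CPU-model floor, so the counter cannot overflow again while the
 * interrupt handler is still running.
 */
static uint64_t clamp_min_period(uint64_t left, uint64_t min_period)
{
    return left < min_period ? min_period : left;
}
```

Under these assumptions, a 48-bit counter holding 0xfffffffffffe has 2 events left, and clamping with a floor of 128 would raise that remaining period to 128.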

Re: perf: WARNING perfevents: irq loop stuck!

2015-05-07 Thread Vince Weaver
On Fri, 1 May 2015, Ingo Molnar wrote: > So fffe corresponds to 2 events left until overflow, > right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we > allow these super short periods. > > Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well, > one sim

Re: perf: WARNING perfevents: irq loop stuck!

2015-05-01 Thread Vince Weaver
On Fri, 1 May 2015, Ingo Molnar wrote: > > * Vince Weaver wrote: > > > So this is just a warning, and I've reported it before, but the > > perf_fuzzer triggers this fairly regularly on my Haswell system. > > > > It looks like fixed counter 0 (retired instructions) being set to > > ff

Re: perf: WARNING perfevents: irq loop stuck!

2015-05-01 Thread Ingo Molnar
* Vince Weaver wrote: > So this is just a warning, and I've reported it before, but the > perf_fuzzer triggers this fairly regularly on my Haswell system. > > It looks like fixed counter 0 (retired instructions) being set to > fffe occasionally causes an irq loop storm and gets >

perf: WARNING perfevents: irq loop stuck!

2015-04-30 Thread Vince Weaver
te is cleared. [ 8224.179407] [ cut here ] [ 8224.184368] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_intel.c:1602 intel_pmu_handle_irq+0x2bc/0x450() [ 8224.195686] perfevents: irq loop stuck! [ 8224.199835] Modules linked in: fuse x86_pkg_temp_thermal intel_power

Re: perfevents: irq loop stuck!

2014-05-19 Thread Vince Weaver
On Mon, 19 May 2014, Vince Weaver wrote: > The fuzzing also turned up a few other issues, and in the end after 2 days > it locked up the machine so hard that it also took out the ethernet switch > due to some sort of packet transmit storm, which is a failure mode I > have to admit I haven't enco

Re: perfevents: irq loop stuck!

2014-05-19 Thread Vince Weaver
t storm, which is a failure mode I have to admit I haven't encountered before. Vince [69213.252805] [ cut here ] [69213.260637] WARNING: CPU: 4 PID: 11343 at arch/x86/kernel/cpu/perf_event_intel.c:1373 intel_pmu_handle_irq+0x2a4/0x3c0() [69213.276788] perfevents: irq lo

Re: perfevents: irq loop stuck!

2014-05-16 Thread Peter Zijlstra
On Fri, May 16, 2014 at 12:25:28AM -0400, Vince Weaver wrote: > anyway I'm not sure if it's worth tracking this more if it's possible to > mostly fix the case by fixing the sample_period bounds. Right, so let's start with that, if it triggers again, we'll have another look. FWIW I ran with the be

Re: perfevents: irq loop stuck!

2014-05-15 Thread Vince Weaver
On Thu, 15 May 2014, Peter Zijlstra wrote: > > So, not sure how to fix this without a total re-write, unless we want to > > cheat and just say sample_period is capped at 62-bits or something. > > 63 bits should do I think, but yes, we hit a very similar bug a few days > ago in the sched_deadline
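
Capping sample_period at 63 bits amounts to rejecting any value with the top bit set. A minimal sketch of such a bounds check follows; the function name `check_sample_period` is hypothetical and this is not claimed to be the patch that was eventually applied.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical validation step: refuse any period that would become
 * negative once stored in a signed 64-bit field, i.e. any value with
 * bit 63 set. Returns 0 if the period is acceptable, -EINVAL if not.
 */
static int check_sample_period(uint64_t sample_period)
{
    if (sample_period & (UINT64_C(1) << 63))
        return -EINVAL;
    return 0;
}
```

A check of this shape leaves every period below 2^63 untouched, so ordinary sampling configurations would not be affected.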

Re: perfevents: irq loop stuck!

2014-05-15 Thread Peter Zijlstra
On Wed, May 14, 2014 at 10:55:40PM -0400, Vince Weaver wrote: > On Tue, 13 May 2014, Vince Weaver wrote: > > > pe[32].sample_period=0xc0bd; > > > > Should it be possible to open an event with a large negative sample_period > > like that? > > so this seems to be a real bug.

Re: perfevents: irq loop stuck!

2014-05-14 Thread Vince Weaver
On Tue, 13 May 2014, Vince Weaver wrote: > pe[32].sample_period=0xc0bd; > > Should it be possible to open an event with a large negative sample_period > like that? so this seems to be a real bug. attr->sample_period is a u64 value, but internally it gets cast to s64 and is
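
The u64-to-s64 problem described here is easy to reproduce outside the kernel. The sketch below uses a hypothetical helper, `period_as_signed`, and an illustrative value with the top bit set; it is not the exact value from the fuzzer report, which is truncated in this archive.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: view a user-supplied unsigned
 * 64-bit period the way a signed 64-bit field sees it. Converting an
 * out-of-range value is implementation-defined in older C standards;
 * common two's complement compilers such as GCC and Clang wrap it.
 */
static int64_t period_as_signed(uint64_t sample_period)
{
    return (int64_t)sample_period;
}

/* Illustrative value with bit 63 set (an assumption for this sketch). */
static const uint64_t example_large_period = UINT64_C(0xC000000000000000);
```

With the wrapping behaviour noted in the comment, `period_as_signed(example_large_period)` is negative, so any later arithmetic that assumes a positive period goes wrong.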

perfevents: irq loop stuck!

2014-05-13 Thread Vince Weaver
33692] perfevents: irq loop stuck! [ 425.839116] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp coretemp kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul snd_hda_intel