On Mon, 12 Aug 2019, Josh Hunt wrote:
> Was there any progress made on debugging this issue? We are still
> seeing it on 4.19.44:
I haven't seen anyone looking at this.
Can you please try the patch Ingo posted:
https://lore.kernel.org/lkml/20150501070226.gb18...@gmail.com/
and see if it fixes the issue for you.
> ... events that tick really
> fast (like cycles, uops retired etc.)
>
> -Andi
Was there any progress made on debugging this issue? We are still
seeing it on 4.19.44:
[ 2660.685392] ------------[ cut here ]------------
[ 2660.685392] perfevents: irq loop stuck!
[ 2660.685392] WARNING: CPU: 1 PID: 4436 at arc...
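For reference, the "perfevents: irq loop stuck!" line above comes from a loop
guard in the Intel PMI handler, intel_pmu_handle_irq(). A minimal sketch of
that guard follows; it is not the upstream code, and the helper functions are
placeholders for the real MSR reads, per-counter overflow handling and
intel_pmu_reset():

/*
 * Sketch of the loop guard that prints "perfevents: irq loop stuck!".
 * read_global_overflow_status(), service_overflowed_counters() and
 * reset_pmu() are placeholders for what the real handler does.
 */
#include <linux/kernel.h>	/* WARN_ONCE() */

extern u64 read_global_overflow_status(void);
extern void service_overflowed_counters(u64 status);
extern void reset_pmu(void);

static int pmi_handler_sketch(void)
{
	int handled = 0, loops = 0;
	u64 status;

again:
	status = read_global_overflow_status();
	if (!status)
		return handled;		/* nothing pending: the NMI was not ours */

	if (++loops > 100) {
		/*
		 * A counter programmed with a tiny period can overflow again
		 * before the previous overflow is even acknowledged, so the
		 * handler would spin here forever.  Warn once and reset the
		 * PMU instead.
		 */
		WARN_ONCE(1, "perfevents: irq loop stuck!\n");
		reset_pmu();
		return handled;
	}

	handled++;
	service_overflowed_counters(status);	/* reprogram periods, emit samples */
	goto again;				/* status may already be set again */
}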
> Given the HSD143 errata and its possible relevance, have you tried
> changing the magic number to 32? Does it then still fix things?
>
> No real objection to the patch as such, it just needs a coherent comment
> and a tested-by tag, I think.
A minimum period of 128 will affect a lot of valid use cases ...
On Thu, Feb 22, 2018 at 08:59:47PM -0800, Cong Wang wrote:
> Hello,
>
> We keep seeing the following kernel warning from the 3.10 kernel to the
> 4.9 kernel; it has existed for a rather long time.
>
> Google search shows there was a patch from Ingo:
> https://patchwork.kernel.org/patch/6308681/
>
> but it does not appear to have been merged.
... perf: interrupt took too long (7710 > 7696), lowering kernel.perf_event_max_sample_rate to 25000
[14751.091121] perfevents: irq loop stuck!
[14751.095169] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 4.099 msecs
[14751.103265] perf: interrupt took too long (40100 > 9637), lowering kernel.perf_event_max_sample_rate to ...
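The "lowering kernel.perf_event_max_sample_rate" lines above are the kernel
throttling itself because the perf NMI handler ran too long. The current
limit can be read back from the usual sysctl path; a small sketch, assuming
procfs is mounted at /proc:

/* Print the current kernel.perf_event_max_sample_rate (in Hz), which the
 * kernel lowers on its own when perf NMI handling takes too long. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/kernel/perf_event_max_sample_rate", "r");
	long rate;

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fscanf(f, "%ld", &rate) != 1) {
		fclose(f);
		fprintf(stderr, "unexpected sysctl contents\n");
		return 1;
	}
	fclose(f);
	printf("current max sample rate: %ld Hz\n", rate);
	return 0;
}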
On Fri, 1 May 2015, Ingo Molnar wrote:
> So fffe corresponds to 2 events left until overflow,
> right? And on Haswell we don't set x86_pmu.limit_period AFAICS, so we
> allow these super short periods.
>
> Maybe like on Broadwell we need a quirk on Nehalem/Haswell as well,
> one similar to ...
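The Broadwell quirk referred to here clamps overly short periods through the
x86_pmu.limit_period hook. A Haswell variant along the same lines might look
like the sketch below, modeled on the upstream bdw_limit_period() and written
for kernel context (it relies on the kernel's X86_CONFIG() and
INTEL_ARCH_EVENT_MASK helpers). The minimum value and the event match are
exactly what this thread is debating, so treat them as placeholders:

/*
 * Sketch of a limit_period quirk in the spirit of bdw_limit_period().
 * The hook may only raise the remaining period.  The 128 minimum and the
 * event match (retired instructions, event 0xc0 umask 0x01) are
 * placeholders, not a settled choice.
 */
static u64 hsw_limit_period_sketch(struct perf_event *event, u64 left)
{
	if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
	    X86_CONFIG(.event = 0xc0, .umask = 0x01)) {
		if (left < 128)
			left = 128;
	}
	return left;
}

/* wired up at PMU init time, e.g.: x86_pmu.limit_period = hsw_limit_period_sketch; */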
* Vince Weaver wrote:
> So this is just a warning, and I've reported it before, but the
> perf_fuzzer triggers this fairly regularly on my Haswell system.
>
> It looks like fixed counter 0 (retired instructions) being set to
> fffe occasionally causes an irq loop storm and gets ...
>
... te is cleared.
[ 8224.179407] ------------[ cut here ]------------
[ 8224.184368] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_intel.c:1602 intel_pmu_handle_irq+0x2bc/0x450()
[ 8224.195686] perfevents: irq loop stuck!
[ 8224.199835] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp ...
On Mon, 19 May 2014, Vince Weaver wrote:
> The fuzzing also turned up a few other issues, and in the end after 2 days
> it locked up the machine so hard that it also took out the ethernet switch
> due to some sort of packet transmit storm, which is a failure mode I
> have to admit I haven't encountered before.
>
> Vince
[69213.252805] ------------[ cut here ]------------
[69213.260637] WARNING: CPU: 4 PID: 11343 at arch/x86/kernel/cpu/perf_event_intel.c:1373 intel_pmu_handle_irq+0x2a4/0x3c0()
[69213.276788] perfevents: irq loop stuck!
On Fri, May 16, 2014 at 12:25:28AM -0400, Vince Weaver wrote:
> anyway I'm not sure if it's worth tracking this more if it's possible to
> mostly fix the case by fixing the sample_period bounds.
Right, so let's start with that; if it triggers again, we'll have another
look.
FWIW I ran with the be...
On Thu, 15 May 2014, Peter Zijlstra wrote:
> > So, not sure how to fix this without a total re-write, unless we want to
> > cheat and just say sample_period is capped at 62-bits or something.
>
> 63 bits should do I think, but yes, we hit a very similar bug a few days
> ago in the sched_deadline ...
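Capping sample_period at 63 bits boils down to refusing any value with bit 63
set before the event is created; once cast to s64 such a period is negative.
A minimal sketch of that check against the uapi struct (where exactly upstream
enforces this, and in which versions, may differ):

/*
 * Sketch: reject sample periods that would go negative once treated as s64.
 * Uses only the uapi perf_event_attr layout.
 */
#include <errno.h>
#include <linux/perf_event.h>

static int sample_period_is_sane(const struct perf_event_attr *attr)
{
	/* Only meaningful in period mode; in freq mode sample_freq is used. */
	if (!attr->freq && (attr->sample_period & (1ULL << 63)))
		return -EINVAL;	/* bit 63 set: huge "negative" period */
	return 0;
}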
On Tue, 13 May 2014, Vince Weaver wrote:
> pe[32].sample_period=0xc0bd;
>
> Should it be possible to open an event with a large negative sample_period
> like that?
So this seems to be a real bug.
attr->sample_period is a u64 value, but internally it gets cast to
s64 and is ...
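A self-contained way to poke at this from userspace is to hand
perf_event_open() a sample_period with bit 63 set and see whether the kernel
accepts it. The period below is illustrative only, not the value the fuzzer
actually generated:

/*
 * Open a sampling event whose sample_period has bit 63 set, i.e. a value
 * that is negative once the kernel treats it as s64.  Kernels that cap the
 * period at 63 bits should refuse this with EINVAL.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

int main(void)
{
	struct perf_event_attr pe;
	long fd;

	memset(&pe, 0, sizeof(pe));
	pe.type = PERF_TYPE_HARDWARE;
	pe.size = sizeof(pe);
	pe.config = PERF_COUNT_HW_INSTRUCTIONS;
	pe.sample_period = (1ULL << 63) | 0xc0bd;	/* illustrative "negative" period */
	pe.disabled = 1;
	pe.exclude_kernel = 1;

	fd = syscall(__NR_perf_event_open, &pe, 0, -1, -1, 0);
	if (fd < 0)
		perror("perf_event_open");	/* EINVAL expected on kernels with the cap */
	else
		printf("kernel accepted the period, fd=%ld\n", fd);
	return 0;
}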
...33692] perfevents: irq loop stuck!
[ 425.839116] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp coretemp kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul snd_hda_intel