Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-22 Thread Jörn Engel
On Wed, Jul 22, 2015 at 03:54:36PM -0700, Andrew Morton wrote:
> On Tue, 21 Jul 2015 15:07:57 -0700 Spencer Baugh wrote:
>
> > From: Joern Engel
> >
> > We have observed cases where the soft lockup detector triggered, but no
> > kernel bug existed. Instead we had a buggy realtime thread that

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-22 Thread Andrew Morton
On Tue, 21 Jul 2015 15:07:57 -0700 Spencer Baugh wrote:
> From: Joern Engel
>
> We have observed cases where the soft lockup detector triggered, but no
> kernel bug existed. Instead we had a buggy realtime thread that
> monopolized a cpu. So let's kill the responsible party and not panic
>

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-22 Thread Jörn Engel
On Wed, Jul 22, 2015 at 09:35:28AM +0200, Mike Galbraith wrote:
> On Tue, 2015-07-21 at 23:33 -0700, Jörn Engel wrote:
>
> > One could argue that killing the realtime thread is even better than
> > panic, as things can restart with a blank slate even faster. But the
> > real benefit is that we

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-22 Thread Don Zickus
On Wed, Jul 22, 2015 at 09:35:28AM +0200, Mike Galbraith wrote:
> On Tue, 2015-07-21 at 23:33 -0700, Jörn Engel wrote:
>
> > One could argue that killing the realtime thread is even better than
> > panic, as things can restart with a blank slate even faster. But the
> > real benefit is that we

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-22 Thread Mike Galbraith
On Tue, 2015-07-21 at 23:33 -0700, Jörn Engel wrote:
> One could argue that killing the realtime thread is even better than
> panic, as things can restart with a blank slate even faster. But the
> real benefit is that we get better debug data for the failing component.
> If we had a kernel bug,

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-22 Thread yalin wang
> On Jul 22, 2015, at 06:07, Spencer Baugh wrote:
>
> From: Joern Engel
>
> We have observed cases where the soft lockup detector triggered, but no
> kernel bug existed. Instead we had a buggy realtime thread that
> monopolized a cpu. So let's kill the responsible party and not panic
> the

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-22 Thread Jörn Engel
On Wed, Jul 22, 2015 at 07:41:48AM +0200, Mike Galbraith wrote:
> On Tue, 2015-07-21 at 22:18 -0700, Jörn Engel wrote:
> >
> > Not sure if this patch is something for mainline, but those two
> > alternatives have problems of their own. Not panicking on lockups can
> > leave a system disabled

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-21 Thread Mike Galbraith
On Tue, 2015-07-21 at 22:18 -0700, Jörn Engel wrote:
> On Wed, Jul 22, 2015 at 06:36:30AM +0200, Mike Galbraith wrote:
> > On Tue, 2015-07-21 at 15:07 -0700, Spencer Baugh wrote:
> >
> > > We have observed cases where the soft lockup detector triggered, but no
> > > kernel bug existed. Instead

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-21 Thread Jörn Engel
On Wed, Jul 22, 2015 at 06:36:30AM +0200, Mike Galbraith wrote:
> On Tue, 2015-07-21 at 15:07 -0700, Spencer Baugh wrote:
>
> > We have observed cases where the soft lockup detector triggered, but no
> > kernel bug existed. Instead we had a buggy realtime thread that
> > monopolized a cpu. So

Re: [PATCH] soft lockup: kill realtime threads before panic

2015-07-21 Thread Mike Galbraith
On Tue, 2015-07-21 at 15:07 -0700, Spencer Baugh wrote:
> We have observed cases where the soft lockup detector triggered, but no
> kernel bug existed. Instead we had a buggy realtime thread that
> monopolized a cpu. So let's kill the responsible party and not panic
> the entire system.
If you

[PATCH] soft lockup: kill realtime threads before panic

2015-07-21 Thread Spencer Baugh
From: Joern Engel
We have observed cases where the soft lockup detector triggered, but no kernel bug existed. Instead we had a buggy realtime thread that monopolized a cpu. So let's kill the responsible party and not panic the entire system.
Signed-off-by: Joern Engel
Signed-off-by: Spencer
