Re: BUG: soft lockup on all kernels after 2.6.3x

2013-03-09 Thread Alexey Vlasov
On Thu, Mar 07, 2013 at 08:57:28AM -0800, Eric Dumazet wrote:
> 
> Well, remove all alien patches and try to reproduce the bug with a
> pristine linux kernel.

I wrote to Spender (the grsec developer) and he confirmed that the problem
may be in the grsec patch.

Thank you greatly for your answers!

-- 
BRGDS. Alexey Vlasov.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: soft lockup on all kernels after 2.6.3x

2013-03-07 Thread Eric Dumazet
On Thu, 2013-03-07 at 20:37 +0400, Alexey Vlasov wrote:
> On Thu, Mar 07, 2013 at 08:20:23AM -0800, Eric Dumazet wrote:
> >
> > What are gr_ symbols ?
> 
> This is grsecurity patches ;)
>  

Well, remove all alien patches and try to reproduce the bug with a
pristine linux kernel.





Re: BUG: soft lockup on all kernels after 2.6.3x

2013-03-07 Thread richard -rw- weinberger
On Thu, Mar 7, 2013 at 5:37 PM, Alexey Vlasov <ren...@renton.name> wrote:
> On Thu, Mar 07, 2013 at 08:20:23AM -0800, Eric Dumazet wrote:
>>
>> What are gr_ symbols ?
>
> This is grsecurity patches ;)

Please reproduce without grsec...

-- 
Thanks,
//richard


Re: BUG: soft lockup on all kernels after 2.6.3x

2013-03-07 Thread Alexey Vlasov
On Thu, Mar 07, 2013 at 08:20:23AM -0800, Eric Dumazet wrote:
>
> What are gr_ symbols ?

This is grsecurity patches ;)
 
> Mar  7 00:50:00 l25 [1735187.889877]  [8110e118] ? is_path_reachable+0x48/0x60
> Mar  7 00:50:00 l25 [1735187.889880]  [8110e163] ? path_is_under+0x33/0x60
> Mar  7 00:50:00 l25 [1735187.889887]  [812257a4] ? gr_is_outside_chroot+0x54/0x70
> Mar  7 00:50:00 l25 [1735187.889890]  [81225815] ? gr_chroot_fchdir+0x55/0x80
> Mar  7 00:50:00 l25 [1735187.889894]  [810f9b2e] ? filename_lookup.clone.39+0x9e/0xe0
> Mar  7 00:50:00 l25 [1735187.889897]  [810fc85c] ? user_path_at_empty+0x5c/0xb0
> Mar  7 00:50:00 l25 [1735187.889903]  [8102b5f9] ? __do_page_fault+0x1b9/0x480
> Mar  7 00:50:00 l25 [1735187.889907]  [814b09c2] ? page_fault+0x22/0x30
> Mar  7 00:50:00 l25 [1735187.889910]  [810f146e] ? vfs_fstatat+0x3e/0x90
> Mar  7 00:50:00 l25 [1735187.889914]  [812278cb] ? gr_learn_resource+0x3b/0x1e0
> Mar  7 00:50:00 l25 [1735187.889918]  [810f168f] ? sys_newstat+0x1f/0x50
> Mar  7 00:50:00 l25 [1735187.889922]  [810ea4b4] ? filp_close+0x54/0x80
> Mar  7 00:50:00 l25 [1735187.889925]  [814b09c2] ? page_fault+0x22/0x30
> Mar  7 00:50:00 l25 [1735187.889928]  [814b0f49] ? system_call_fastpath+0x18/0x1d


Re: BUG: soft lockup on all kernels after 2.6.3x

2013-03-07 Thread Eric Dumazet
On Thu, 2013-03-07 at 16:54 +0400, Alexey Vlasov wrote:
> Hi,
> 
> On Sat, Feb 09, 2013 at 07:07:53AM -0800, Eric Dumazet wrote:
> > >
> > > I used 2.6.2x kernel for a long time on my shared hosting and I didn't
> > > have any problems. Kernels worked well and server uptime was about 2-3
> > > years.
> > > 
> > > ...
> > > 
> > > it doesn't happen on an empty server, only on loaded ones. Unfortunately
> > > I don't know how to provoke such hanging artificially.
> > > 
>  
> > Your traces don't contain symbols, it's quite hard to guess the issue.
> 
> Well, the server became heavily loaded and began to crash almost once a day.
> 
> =
> BUG: soft lockup - CPU#1 stuck for 23s! [httpd:21686]
> Call Trace:
> [8110bba5] ? mntput_no_expire+0x25/0x170
> [810f9389] ? path_lookupat+0x189/0x890
> [810f9b67] ? filename_lookup.clone.39+0xd7/0xe0
> [810fc85c] ? user_path_at_empty+0x5c/0xb0
> [8102b5f9] ? __do_page_fault+0x1b9/0x480
> [810f146e] ? vfs_fstatat+0x3e/0x90
> [810c54bf] ? remove_vma+0x5f/0x70
> [810f168f] ? sys_newstat+0x1f/0x50
> [814b09c2] ? page_fault+0x22/0x30
> [814b0f49] ? system_call_fastpath+0x18/0x1d
> =
> 
> There's a full trace in attachment. 
> 


This looks like a VFS issue.

A "umount" is in progress, blocking almost all other CPUs in lg_local_lock().

What are gr_ symbols ?

Mar  7 00:50:00 l25 [1735187.889877]  [8110e118] ? is_path_reachable+0x48/0x60
Mar  7 00:50:00 l25 [1735187.889880]  [8110e163] ? path_is_under+0x33/0x60
Mar  7 00:50:00 l25 [1735187.889887]  [812257a4] ? gr_is_outside_chroot+0x54/0x70
Mar  7 00:50:00 l25 [1735187.889890]  [81225815] ? gr_chroot_fchdir+0x55/0x80
Mar  7 00:50:00 l25 [1735187.889894]  [810f9b2e] ? filename_lookup.clone.39+0x9e/0xe0
Mar  7 00:50:00 l25 [1735187.889897]  [810fc85c] ? user_path_at_empty+0x5c/0xb0
Mar  7 00:50:00 l25 [1735187.889903]  [8102b5f9] ? __do_page_fault+0x1b9/0x480
Mar  7 00:50:00 l25 [1735187.889907]  [814b09c2] ? page_fault+0x22/0x30
Mar  7 00:50:00 l25 [1735187.889910]  [810f146e] ? vfs_fstatat+0x3e/0x90
Mar  7 00:50:00 l25 [1735187.889914]  [812278cb] ? gr_learn_resource+0x3b/0x1e0
Mar  7 00:50:00 l25 [1735187.889918]  [810f168f] ? sys_newstat+0x1f/0x50
Mar  7 00:50:00 l25 [1735187.889922]  [810ea4b4] ? filp_close+0x54/0x80
Mar  7 00:50:00 l25 [1735187.889925]  [814b09c2] ? page_fault+0x22/0x30
Mar  7 00:50:00 l25 [1735187.889928]  [814b0f49] ? system_call_fastpath+0x18/0x1d
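
The question about the gr_ symbols can also be checked mechanically: pull the symbol names out of a trace and list the ones that carry a given prefix. What follows is only a minimal Python sketch, assuming each frame ends in symbol+0xoffset/0xsize as in the log above. The names frame_symbols and patched_symbols and the default prefix tuple are hypothetical helpers for illustration, not part of any kernel tool.

```python
import re

# Matches the "symbol+0xoff/0xsize" tail of a call-trace frame.
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frame_symbols(trace_text):
    """Return the symbol names found in a call trace, in order of appearance."""
    return FRAME_RE.findall(trace_text)


def patched_symbols(trace_text, prefixes=("gr_",)):
    """Return the unique symbols whose name starts with one of the prefixes.

    The default prefix is an assumption taken from this thread.
    """
    seen = []
    for sym in frame_symbols(trace_text):
        if sym.startswith(tuple(prefixes)) and sym not in seen:
            seen.append(sym)
    return seen


# A few frames copied from the log above (timestamps shortened).
sample = """\
[1735187.889880]  [8110e163] ? path_is_under+0x33/0x60
[1735187.889887]  [812257a4] ? gr_is_outside_chroot+0x54/0x70
[1735187.889890]  [81225815] ? gr_chroot_fchdir+0x55/0x80
[1735187.889894]  [810f9b2e] ? filename_lookup.clone.39+0x9e/0xe0
[1735187.889914]  [812278cb] ? gr_learn_resource+0x3b/0x1e0
"""

print(patched_symbols(sample))
```

A non-empty result only shows that out-of-tree code appears in the trace; it does not by itself prove that this code caused the lockup.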




Re: BUG: soft lockup on all kernels after 2.6.3x (include full log)

2013-03-07 Thread Alexey Vlasov
On Thu, Mar 07, 2013 at 05:34:14PM +0400, Alexey Vlasov wrote:
> 
> There's a full trace in attachment.
 
-- 
BRGDS. Alexey Vlasov.


bug_softlockup.txt.gz
Description: Binary data


Re: BUG: soft lockup on all kernels after 2.6.3x

2013-03-07 Thread Alexey Vlasov
Hi,

On Sat, Feb 09, 2013 at 07:07:53AM -0800, Eric Dumazet wrote:
> >
> > I used 2.6.2x kernel for a long time on my shared hosting and I didn't
> > have any problems. Kernels worked well and server uptime was about 2-3
> > years.
> >
> > ...
> >
> > it doesn't happen on an empty server, only on loaded ones. Unfortunately
> > I don't know how to provoke such hanging artificially.
> >

> Your traces don't contain symbols, it's quite hard to guess the issue.

Well, the server became heavily loaded and began to crash almost once a day.

=
BUG: soft lockup - CPU#1 stuck for 23s! [httpd:21686]
Call Trace:
[8110bba5] ? mntput_no_expire+0x25/0x170
[810f9389] ? path_lookupat+0x189/0x890
[810f9b67] ? filename_lookup.clone.39+0xd7/0xe0
[810fc85c] ? user_path_at_empty+0x5c/0xb0
[8102b5f9] ? __do_page_fault+0x1b9/0x480
[810f146e] ? vfs_fstatat+0x3e/0x90
[810c54bf] ? remove_vma+0x5f/0x70
[810f168f] ? sys_newstat+0x1f/0x50
[814b09c2] ? page_fault+0x22/0x30
[814b0f49] ? system_call_fastpath+0x18/0x1d
=

There's a full trace in attachment.

-- 
BRGDS. Alexey Vlasov.
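
When such reports arrive almost daily, it can help to tally which CPU and which process each "BUG: soft lockup" header names. The sketch below is a minimal Python example, assuming the header format shown in the message above; parse_soft_lockup and lockups_by_command are hypothetical helper names, and the sample lines are the two headers quoted in this thread.

```python
import re
from collections import Counter

# "BUG: soft lockup - CPU#<n> stuck for <secs>s! [<comm>:<pid>]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for\s+(\d+)s!\s+\[(.+):(\d+)\]"
)


def parse_soft_lockup(line):
    """Return cpu, seconds, comm and pid from a soft lockup header, or None."""
    m = LOCKUP_RE.search(line)
    if m is None:
        return None
    cpu, secs, comm, pid = m.groups()
    return {"cpu": int(cpu), "seconds": int(secs), "comm": comm, "pid": int(pid)}


def lockups_by_command(lines):
    """Count soft lockup reports per process name."""
    hits = (parse_soft_lockup(line) for line in lines)
    return Counter(hit["comm"] for hit in hits if hit is not None)


log = [
    "BUG: soft lockup - CPU#1 stuck for 23s! [httpd:21686]",
    "Feb  8 10:27:45 10.2.0.7 [470393.417168] BUG: soft lockup - CPU#2 "
    "stuck for 61s! [vsftpd:29013]",
    "Call Trace:",
]

print(parse_soft_lockup(log[0]))
print(lockups_by_command(log))
```

Lockups spread across unrelated processes, such as httpd and vsftpd here, would suggest a shared kernel path rather than a fault in one application, though a tally like this is only a hint.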


Re: BUG: soft lockup on all kernels after 2.6.3x

2013-02-09 Thread Alexey Vlasov
On Sat, Feb 09, 2013 at 07:07:53AM -0800, Eric Dumazet wrote:
> Did you compile the kernel yourself, or is it a standard kernel (distro
> provided) ?
> 
> Your traces don't contain symbols, it's quite hard to guess the issue.

I compile the kernel myself. Should I add CONFIG_DEBUG_INFO? OK, then
I'll try it. Maybe I should switch on something else to get more info
for debugging?
Thanks.

-- 
BRGDS. Alexey Vlasov.
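
Before rebuilding, the build's .config can be scanned for options that affect how readable a trace is. This is a minimal Python sketch under stated assumptions: only CONFIG_DEBUG_INFO is named in this thread, while CONFIG_KALLSYMS (symbol names in traces) and CONFIG_FRAME_POINTER (more reliable backtraces) are added here as an illustrative guess at what else is worth checking. read_kconfig and missing_debug_options are hypothetical helper names.

```python
def read_kconfig(text):
    """Parse kernel .config text into {option: value}; unset options map to 'n'."""
    opts = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO
            opts[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
    return opts


# Assumption: this list is illustrative, not a recommendation from the thread.
WANTED = ("CONFIG_KALLSYMS", "CONFIG_DEBUG_INFO", "CONFIG_FRAME_POINTER")


def missing_debug_options(text, wanted=WANTED):
    """Return the wanted options that are not built in (=y)."""
    opts = read_kconfig(text)
    return [name for name in wanted if opts.get(name, "n") != "y"]


sample_config = """\
CONFIG_KALLSYMS=y
# CONFIG_DEBUG_INFO is not set
CONFIG_FRAME_POINTER=y
CONFIG_HZ=1000
"""

print(missing_debug_options(sample_config))
```

In practice the text would be read from the .config file in the kernel build tree.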


Re: BUG: soft lockup on all kernels after 2.6.3x

2013-02-09 Thread Eric Dumazet
On Sat, 2013-02-09 at 18:10 +0400, Alexey Vlasov wrote:
> Hello.
> 
> I used 2.6.2x kernel for a long time on my shared hosting and I didn't
> have any problems. Kernels worked well and server uptime was about 2-3
> years.
> 
> But investigating some strange hangings of my clients' sites I came to
> this:
> http://bugs.mysql.com/bug.php?id=50399
> From this bug it is clear that the mysql client hangs on kernels older than
> 2.6.32 (unfortunately I can't remember whether that is true of 2.6.30-31).
> 
> It is not clear whether this is a bug in the kernel, libc or mysql-client; I
> didn't manage to find out. I decided it would be simpler (as it seemed to
> me at that moment) to start using 2.6.3x kernels, and that caused
> greater problems. Using the new kernels on my production servers
> under peak load, I got uptimes from an hour to 1-3 months.
> 
> I even collected some statistics on how long each kernel from version
> 2.6.32 on can work under peak load. It sounds funny but my clients are
> not happy with all these reboots.
> 
> Of all the kernels from 2.6.32 to 3.7.4, I can say that
> 2.6.35 is the most stable; I have about 30 servers on it. But they usually
> hang once every 1-3 months.
> 
> Returning to the problem of kernels >= 2.6.32: as I have noticed, they
> all hang alike, printing this on the console:
> 
> ...
> Feb  8 10:27:45 10.2.0.7 [470393.417168] BUG: soft lockup - CPU#2 stuck for 61s! [vsftpd:29013]
> ...
> [see the attachment]
> 
> it doesn't happen on an empty server, only on loaded ones. Unfortunately
> I don't know how to provoke such hanging artificially.
> 
> I've attached a trace. In fact I don't know what to do with all
> these bugs: I can't use 2.6.2x because of the MySQL hang, and kernels
> >= 2.6.3x hang themselves.
> 

Did you compile the kernel yourself, or is it a standard kernel (distro
provided) ?

Your traces don't contain symbols, it's quite hard to guess the issue.



