Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-30 Thread Ben Hutchings
[Dropped John Stultz from the cc list as this doesn't seem to relate to
his changes.]

On Fri, 2012-04-27 at 15:30 +0200, Sven Hoexter wrote:
 On Fri, Apr 27, 2012 at 04:19:21AM +0100, Ben Hutchings wrote:
 
 Hi,
 
  So it looks like in this case at least you're seeing a bug in USB error
  recovery and not anything to do with timing using the TSC vs HPET.
 
 Ok, a few minutes ago I became aware of another system showing the given
 symptoms, this time with a Sandy Bridge CPU (E3-1220) and the same call trace.
 
 
 I'm now wondering what I could do to help debug this issue further.
 
 
 All those Dell systems (R210 and R210 II) are equipped with DRAC
 KVM cards. They expose the input of the KVM Java applet as USB
 devices to the system. For at least one of the affected systems
 I'm pretty sure that nobody tried to use the KVM applet while
 the issue appeared, but that doesn't mean anything.
 No external USB devices were connected to the systems.
 
 Kernel is linux-image-2.6.32-5-amd64  2.6.32-38

The description of an update for RHEL 6
https://rhn.redhat.com/errata/RHSA-2012-0481.html says:

BZ#797205
Due to a bug in the hid_reset() function, a deadlock could occur
when a Dell iDRAC controller was reset. Consequently, its USB
keyboard or mouse device became unresponsive. A patch that fixes
the underlying code has been provided to address this bug and
the hangs no longer occur in the described scenario.

Could this possibly be the same bug?

Ben.

-- 
Ben Hutchings
Design a system any fool can use, and only a fool will want to use it.




Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-27 Thread Sven Hoexter
On Fri, Apr 27, 2012 at 04:19:21AM +0100, Ben Hutchings wrote:

Hi,

 So it looks like in this case at least you're seeing a bug in USB error
 recovery and not anything to do with timing using the TSC vs HPET.

Ok, a few minutes ago I became aware of another system showing the given
symptoms, this time with a Sandy Bridge CPU (E3-1220) and the same call trace.


I'm now wondering what I could do to help debug this issue further.


All those Dell systems (R210 and R210 II) are equipped with DRAC
KVM cards. They expose the input of the KVM Java applet as USB
devices to the system. For at least one of the affected systems
I'm pretty sure that nobody tried to use the KVM applet while
the issue appeared, but that doesn't mean anything.
No external USB devices were connected to the systems.

Kernel is linux-image-2.6.32-5-amd64  2.6.32-38

Sven






Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-26 Thread Ben Hutchings
On Thu, 2012-04-26 at 16:17 +0200, Sven Hoexter wrote:
 On Thu, Apr 26, 2012 at 01:45:30PM +0100, Ben Hutchings wrote:
 
 Hi,
 
  You can use 'echo w > /proc/sysrq-trigger' to get a traceback for all
  the tasks in D state, which might provide some clues.
 
 ok, see the attached file.


 Apr 26 16:08:34 vdf1 kernel: [6726714.281854] SysRq : Show Blocked State
 Apr 26 16:08:34 vdf1 kernel: [6726714.281883]   task PC stack   pid father
 Apr 26 16:08:34 vdf1 kernel: [6726714.281912] events/0  D  015  2 0x
 Apr 26 16:08:34 vdf1 kernel: [6726714.281946]  814891f0 0046  88043e4e5c0c
 Apr 26 16:08:34 vdf1 kernel: [6726714.281997]   88000fa15780 f9e0 88043e4e5fd8
 Apr 26 16:08:34 vdf1 kernel: [6726714.282048]  00015780 00015780 88043e4e8000 88043e4e82f8
 Apr 26 16:08:34 vdf1 kernel: [6726714.282099] Call Trace:
 Apr 26 16:08:34 vdf1 kernel: [6726714.282127]  [8105af0e] ? __mod_timer+0x141/0x153
 Apr 26 16:08:34 vdf1 kernel: [6726714.287558]  [8105a9f4] ? try_to_del_timer_sync+0x63/0x6c
 Apr 26 16:08:34 vdf1 kernel: [6726714.287589]  [812fbd24] ? schedule_timeout+0xa5/0xdd
 Apr 26 16:08:34 vdf1 kernel: [6726714.287617]  [8105aa88] ? process_timeout+0x0/0x5
 Apr 26 16:08:34 vdf1 kernel: [6726714.287651]  [a00fc8f7] ? ehci_endpoint_disable+0xa4/0x141 [ehci_hcd]
 Apr 26 16:08:34 vdf1 kernel: [6726714.287699]  [a00b8f0d] ? usb_ep0_reinit+0x13/0x34 [usbcore]
 Apr 26 16:08:34 vdf1 kernel: [6726714.287731]  [a00b970a] ? usb_reset_and_verify_device+0x87/0x3d6 [usbcore]
 Apr 26 16:08:34 vdf1 kernel: [6726714.287779]  [a00be1c7] ? usb_kill_urb+0x10/0xbb [usbcore]
 Apr 26 16:08:34 vdf1 kernel: [6726714.287811]  [a00be1c7] ? usb_kill_urb+0x10/0xbb [usbcore]
 Apr 26 16:08:34 vdf1 kernel: [6726714.287842]  [a00b9aed] ? usb_reset_device+0x94/0x124 [usbcore]
 Apr 26 16:08:34 vdf1 kernel: [6726714.287873]  [a00d96a0] ? hid_reset+0x91/0x122 [usbhid]
 Apr 26 16:08:34 vdf1 kernel: [6726714.287903]  [810619f7] ? worker_thread+0x188/0x21d
 Apr 26 16:08:34 vdf1 kernel: [6726714.287932]  [a00d960f] ? hid_reset+0x0/0x122 [usbhid]

This is a global workqueue task, and it's running a work item that does
error handling for a USB HID (keyboard or mouse, or possibly a UPS).
Presumably this is taking a long time for some reason.

[...]
 Apr 26 16:08:34 vdf1 kernel: [6726714.288131] sshd  D  0 17858  17854 0x
 Apr 26 16:08:34 vdf1 kernel: [6726714.288164]  88043e473880 0086  0ffc
 Apr 26 16:08:34 vdf1 kernel: [6726714.288215]   f9e0 880270a55fd8
 Apr 26 16:08:34 vdf1 kernel: [6726714.288266]  00015780 00015780 88043c100710 88043c100a08
 Apr 26 16:08:34 vdf1 kernel: [6726714.288316] Call Trace:
 Apr 26 16:08:34 vdf1 kernel: [6726714.288340]  [8100f6c4] ? __switch_to+0x1ad/0x297
 Apr 26 16:08:34 vdf1 kernel: [6726714.288367]  [812fbcad] ? schedule_timeout+0x2e/0xdd
 Apr 26 16:08:34 vdf1 kernel: [6726714.288397]  [810482ed] ? finish_task_switch+0x3a/0xaf
 Apr 26 16:08:34 vdf1 kernel: [6726714.288426]  [812fb8f0] ? thread_return+0x79/0xe0
 Apr 26 16:08:34 vdf1 kernel: [6726714.288453]  [812fbb64] ? wait_for_common+0xde/0x15b
 Apr 26 16:08:34 vdf1 kernel: [6726714.288482]  [8104a4cc] ? default_wake_function+0x0/0x9
 Apr 26 16:08:34 vdf1 kernel: [6726714.288510]  [81062326] ? flush_work+0x75/0x87
 Apr 26 16:08:34 vdf1 kernel: [6726714.288538]  [81061d00] ? wq_barrier_func+0x0/0x9
 Apr 26 16:08:34 vdf1 kernel: [6726714.288567]  [811fbd44] ? n_tty_poll+0x5e/0x138
[...]

This is the sshd task, and it's waiting for a terminal-related work item
to finish.  Presumably this is queued behind the work item for the USB
HID cleanup.

So it looks like in this case at least you're seeing a bug in USB error
recovery and not anything to do with timing using the TSC vs HPET.

Ben.

-- 
Ben Hutchings
I haven't lost my mind; it's backed up on tape somewhere.




Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-26 Thread Sven Hoexter
On Thu, Apr 26, 2012 at 04:49:56AM +0100, Ben Hutchings wrote:
 On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:

Hi,

  Searching through munin graphs we could narrow down the starting point of 
  this issue
  to the point when the hpet interrupts for one CPU core multiplied. 
  Sometimes they
  multiplied by six. Looking further we've found the Kernel [events/$x] in 
  state D
  where $x is the number of the CPU core which has the high number of hpet 
  interrupts.
 
  When we started strace -f on the sshd master process everything works until 
  you logout.
  Then you'll again see the forked sshd process hanging in state D.
 
 This is strange, because D state means uninterruptible sleep (not
 handling signals).  But perhaps the sshd process was repeatedly changing
 between uninterruptible and interruptible state.

Is it possible to gather such data? I guess grep'ing through ps output
is not the right tool here.

From a system currently suffering from this issue:

ps aux|grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        15  0.0  0.0      0     0 ?        D    Apr25   0:53 [events/0]
root      4162  0.0  0.0      0     0 ?        Ds   08:33   0:00 [bash]
480       7875  0.0  0.0      0     0 ?        Ds   09:28   0:00 [bash]
root      9407  0.0  0.0  76644  3392 ?        Ds   09:49   0:00 sshd: root@pts/79
480      11310  0.0  0.0   8940   884 ?        S    09:59   0:00 grep D
480      11765  0.0  0.0      0     0 ?        Ds   Apr25   0:00 [bash]
root     12803  0.0  0.0  76644  3392 ?        Ds   Apr25   0:00 sshd: root@pts/12
root     13762  0.0  0.0  76644  3392 ?        Ds   Apr25   0:00 sshd: root@pts/73
root     15111  0.0  0.0      0     0 ?        Ds   Apr25   0:00 [bash]
root     19361  0.0  0.0      0     0 ?        Ds   Apr25   0:00 [bash]
root     20966  0.0  0.0      0     0 ?        Ds   Apr25   0:00 [bash]
root     29323  0.0  0.0      0     0 ?        Ds   Apr25   0:00 [bash]
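
For the record, a slightly more targeted way to list the D-state tasks together
with the kernel function they are blocked in would be something like this
(standard procps options, shown here only as a sketch):

  # print pid, parent, state, wait channel and command for tasks in D state
  ps -eo pid,ppid,stat,wchan:32,comm | awk '$3 ~ /^D/'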

Sven






Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-26 Thread Sven Hoexter
On Wed, Apr 25, 2012 at 09:33:44PM -0700, John Stultz wrote:

Hi,

 When you can connect to the system that is having problems, do you
 see any problems with the time? ie: does date show the correct time,
 and does it increment normally?

I don't see any jumps in time here:

while true; do /sbin/hwclock --show; date; done
Thu 26 Apr 2012 09:50:19 AM CEST  -0.781566 seconds
Thu Apr 26 09:50:18 CEST 2012
Thu 26 Apr 2012 09:50:20 AM CEST  -1.000351 seconds
Thu Apr 26 09:50:19 CEST 2012
Thu 26 Apr 2012 09:50:21 AM CEST  -1.000321 seconds
Thu Apr 26 09:50:20 CEST 2012
Thu 26 Apr 2012 09:50:22 AM CEST  -1.000299 seconds
Thu Apr 26 09:50:21 CEST 2012
Thu 26 Apr 2012 09:50:23 AM CEST  -1.000342 seconds
Thu Apr 26 09:50:22 CEST 2012
Thu 26 Apr 2012 09:50:24 AM CEST  -1.000315 seconds
Thu Apr 26 09:50:23 CEST 2012
Thu 26 Apr 2012 09:50:25 AM CEST  -0.984703 seconds
Thu Apr 26 09:50:24 CEST 2012
Thu 26 Apr 2012 09:50:26 AM CEST  -1.000319 seconds
Thu Apr 26 09:50:25 CEST 2012
Thu 26 Apr 2012 09:50:27 AM CEST  -1.000324 seconds
Thu Apr 26 09:50:26 CEST 2012
Thu 26 Apr 2012 09:50:28 AM CEST  -1.000320 seconds
Thu Apr 26 09:50:27 CEST 2012
Thu 26 Apr 2012 09:50:29 AM CEST  -1.000326 seconds
Thu Apr 26 09:50:28 CEST 2012
Thu 26 Apr 2012 09:50:30 AM CEST  -1.000307 seconds
Thu Apr 26 09:50:29 CEST 2012
Thu 26 Apr 2012 09:50:31 AM CEST  -1.000337 seconds
Thu Apr 26 09:50:30 CEST 2012
Thu 26 Apr 2012 09:50:32 AM CEST  -1.004597 seconds
Thu Apr 26 09:50:32 CEST 2012
Thu 26 Apr 2012 09:50:33 AM CEST  -0.984694 seconds
Thu Apr 26 09:50:32 CEST 2012
Thu 26 Apr 2012 09:50:34 AM CEST  -1.000320 seconds
Thu Apr 26 09:50:33 CEST 2012
Thu 26 Apr 2012 09:50:35 AM CEST  -1.000321 seconds
Thu Apr 26 09:50:34 CEST 2012
Thu 26 Apr 2012 09:50:36 AM CEST  -1.000320 seconds
Thu Apr 26 09:50:35 CEST 2012
Thu 26 Apr 2012 09:50:37 AM CEST  -1.000321 seconds
Thu Apr 26 09:50:36 CEST 2012
Thu 26 Apr 2012 09:50:38 AM CEST  -1.004611 seconds
Thu Apr 26 09:50:38 CEST 2012
Thu 26 Apr 2012 09:50:39 AM CEST  -0.984718 seconds
Thu Apr 26 09:50:38 CEST 2012
Thu 26 Apr 2012 09:50:40 AM CEST  -1.000274 seconds
Thu Apr 26 09:50:39 CEST 2012
Thu 26 Apr 2012 09:50:41 AM CEST  -1.000315 seconds
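
Since the suspicion is about TSC vs. HPET, I can also record which clocksource
the kernel is actually using; assuming the standard 2.6.32 sysfs layout on these
machines, that should be readable with:

  cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  cat /sys/devices/system/clocksource/clocksource0/available_clocksource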


 It sounds like if there is some HPET irq issue, it would likely be
 due to some sort of global wakeup to handle local apics that halt in
 deep sleep modes.  Its likely that getting /proc/timer_list output
 would help (both before and after the problem).

I've attached /proc/timer_list from a system that is currently
suffering from this problem. Unfortunately I don't have the state
from before. I could also grab it from a system that is currently
not affected but has suffered in the past, if that helps.
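
To collect the "before" state on the other machines, I could run a crude
periodic snapshot in the background, e.g. (log path is just an example):

  while true; do
      date
      cat /proc/timer_list
      grep -i hpet /proc/interrupts
      sleep 1800
  done >> /var/tmp/timer-snapshots.log 2>&1 &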

Regards,
Sven
Timer List Version: v0.5
HRTIMER_MAX_CLOCK_BASES: 2
now at 83615781793881 nsecs

cpu: 0
 clock 0:
  .base:   880008a100c8
  .index:  0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset: 1335342613769994271 nsecs
active timers:
 #0: 8801a0967d08, hrtimer_wakeup, S:01, futex_wait_queue_me, java/18099
 # expires at 1335426229557084000-1335426229557134000 nsecs [in 
1335342613775290119 to 1335342613775340119 nsecs]
 #1: 88023deb7d08, hrtimer_wakeup, S:01, futex_wait_queue_me, java/18106
 # expires at 1335426254026767000-1335426254026817000 nsecs [in 
1335342638244973119 to 1335342638245023119 nsecs]
 #2: 8801f184dd08, hrtimer_wakeup, S:01, futex_wait_queue_me, java/18105
 # expires at 1335426270572844000-1335426270572894000 nsecs [in 
1335342654791050119 to 1335342654791100119 nsecs]
 #3: 8801f3d9bd08, hrtimer_wakeup, S:01, futex_wait_queue_me, java/18110
 # expires at 1335426275040889000-1335426275040939000 nsecs [in 
1335342659259095119 to 1335342659259145119 nsecs]
 #4: 8801a097fd08, hrtimer_wakeup, S:01, futex_wait_queue_me, java/18111
 # expires at 133542628422666-133542628422671 nsecs [in 
1335342668444866119 to 1335342668444916119 nsecs]
 clock 1:
  .base:   880008a10108
  .index:  1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset: 0 nsecs
active timers:
 #0: 880008a101b0, tick_sched_timer, S:01, tick_nohz_restart_sched_tick, 
swapper/0
 # expires at 8361578400-8361578400 nsecs [in 2206119 to 2206119 nsecs]
 #1: 88021ebbb968, hrtimer_wakeup, S:01, schedule_hrtimeout_range, 
init/5900
 # expires at 83616008255524-83616013255522 nsecs [in 226461643 to 231461641 
nsecs]
 #2: 8802181f1968, hrtimer_wakeup, S:01, schedule_hrtimeout_range, 
init/6795
 # expires at 83616584188235-83616589188233 nsecs [in 802394354 to 807394352 
nsecs]
 #3: 88021d9af968, hrtimer_wakeup, S:01, schedule_hrtimeout_range, 
init/6032
 # expires at 83616708138883-83616713138881 nsecs [in 926345002 to 931345000 
nsecs]
 #4: 88023030d968, hrtimer_wakeup, S:01, schedule_hrtimeout_range, 
mysqld/4211
 # expires at 83616968110539-83616973110538 nsecs [in 1186316658 to 1191316657 
nsecs]
 #5: 880215f7b968, hrtimer_wakeup, S:01, schedule_hrtimeout_range, 
init/7172
 # expires at 

Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-26 Thread Ben Hutchings
On Thu, 2012-04-26 at 10:02 +0200, Sven Hoexter wrote:
 On Thu, Apr 26, 2012 at 04:49:56AM +0100, Ben Hutchings wrote:
  On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:
 
 Hi,
 
   Searching through munin graphs we could narrow down the starting point of 
   this issue
   to the point when the hpet interrupts for one CPU core multiplied. 
   Sometimes they
   multiplied by six. Looking further we've found the Kernel [events/$x] in 
   state D
   where $x is the number of the CPU core which has the high number of hpet 
   interrupts.
  
   When we started strace -f on the sshd master process everything works 
   until you logout.
   Then you'll again see the forked sshd process hanging in state D.
  
  This is strange, because D state means uninterruptible sleep (not
  handling signals).  But perhaps the sshd process was repeatedly changing
  between uninterruptible and interruptible state.
 
 Is it possible to gather such data? I guess grep'ing through ps output
 is not the right tool here.
 
 From a system currently suffering from this issue:
[...]

You can use 'echo w > /proc/sysrq-trigger' to get a traceback for all
the tasks in D state, which might provide some clues.
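
If sysrq is restricted it may need to be enabled first; a minimal sequence,
assuming the output ends up in the kernel ring buffer (and kern.log via klogd):

  echo 1 > /proc/sys/kernel/sysrq     # enable all sysrq functions
  echo w > /proc/sysrq-trigger        # dump blocked (D state) tasks
  dmesg | tail -n 100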

Ben.

-- 
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison




Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-26 Thread Sven Hoexter
On Thu, Apr 26, 2012 at 01:45:30PM +0100, Ben Hutchings wrote:

Hi,

 You can use 'echo w > /proc/sysrq-trigger' to get a traceback for all
 the tasks in D state, which might provide some clues.

ok, see the attached file.

Regards,
Sven

Apr 26 16:08:34 vdf1 kernel: [6726714.281854] SysRq : Show Blocked State
Apr 26 16:08:34 vdf1 kernel: [6726714.281883]   taskPC 
stack   pid father
Apr 26 16:08:34 vdf1 kernel: [6726714.281912] events/0  D   
   015  2 0x
Apr 26 16:08:34 vdf1 kernel: [6726714.281946]  814891f0 
0046  88043e4e5c0c
Apr 26 16:08:34 vdf1 kernel: [6726714.281997]   
88000fa15780 f9e0 88043e4e5fd8
Apr 26 16:08:34 vdf1 kernel: [6726714.282048]  00015780 
00015780 88043e4e8000 88043e4e82f8
Apr 26 16:08:34 vdf1 kernel: [6726714.282099] Call Trace:
Apr 26 16:08:34 vdf1 kernel: [6726714.282127]  [8105af0e] ? 
__mod_timer+0x141/0x153
Apr 26 16:08:34 vdf1 kernel: [6726714.287558]  [8105a9f4] ? 
try_to_del_timer_sync+0x63/0x6c
Apr 26 16:08:34 vdf1 kernel: [6726714.287589]  [812fbd24] ? 
schedule_timeout+0xa5/0xdd
Apr 26 16:08:34 vdf1 kernel: [6726714.287617]  [8105aa88] ? 
process_timeout+0x0/0x5
Apr 26 16:08:34 vdf1 kernel: [6726714.287651]  [a00fc8f7] ? 
ehci_endpoint_disable+0xa4/0x141 [ehci_hcd]
Apr 26 16:08:34 vdf1 kernel: [6726714.287699]  [a00b8f0d] ? 
usb_ep0_reinit+0x13/0x34 [usbcore]
Apr 26 16:08:34 vdf1 kernel: [6726714.287731]  [a00b970a] ? 
usb_reset_and_verify_device+0x87/0x3d6 [usbcore]
Apr 26 16:08:34 vdf1 kernel: [6726714.287779]  [a00be1c7] ? 
usb_kill_urb+0x10/0xbb [usbcore]
Apr 26 16:08:34 vdf1 kernel: [6726714.287811]  [a00be1c7] ? 
usb_kill_urb+0x10/0xbb [usbcore]
Apr 26 16:08:34 vdf1 kernel: [6726714.287842]  [a00b9aed] ? 
usb_reset_device+0x94/0x124 [usbcore]
Apr 26 16:08:34 vdf1 kernel: [6726714.287873]  [a00d96a0] ? 
hid_reset+0x91/0x122 [usbhid]
Apr 26 16:08:34 vdf1 kernel: [6726714.287903]  [810619f7] ? 
worker_thread+0x188/0x21d
Apr 26 16:08:34 vdf1 kernel: [6726714.287932]  [a00d960f] ? 
hid_reset+0x0/0x122 [usbhid]
Apr 26 16:08:34 vdf1 kernel: [6726714.287961]  [8106502a] ? 
autoremove_wake_function+0x0/0x2e
Apr 26 16:08:34 vdf1 kernel: [6726714.287990]  [8106186f] ? 
worker_thread+0x0/0x21d
Apr 26 16:08:34 vdf1 kernel: [6726714.288018]  [81064d5d] ? 
kthread+0x79/0x81
Apr 26 16:08:34 vdf1 kernel: [6726714.288046]  [81011baa] ? 
child_rip+0xa/0x20
Apr 26 16:08:34 vdf1 kernel: [6726714.288072]  [81064ce4] ? 
kthread+0x0/0x81
Apr 26 16:08:34 vdf1 kernel: [6726714.288098]  [81011ba0] ? 
child_rip+0x0/0x20
Apr 26 16:08:34 vdf1 kernel: [6726714.288131] sshd  D   
   0 17858  17854 0x
Apr 26 16:08:34 vdf1 kernel: [6726714.288164]  88043e473880 
0086  0ffc
Apr 26 16:08:34 vdf1 kernel: [6726714.288215]   
 f9e0 880270a55fd8
Apr 26 16:08:34 vdf1 kernel: [6726714.288266]  00015780 
00015780 88043c100710 88043c100a08
Apr 26 16:08:34 vdf1 kernel: [6726714.288316] Call Trace:
Apr 26 16:08:34 vdf1 kernel: [6726714.288340]  [8100f6c4] ? 
__switch_to+0x1ad/0x297
Apr 26 16:08:34 vdf1 kernel: [6726714.288367]  [812fbcad] ? 
schedule_timeout+0x2e/0xdd
Apr 26 16:08:34 vdf1 kernel: [6726714.288397]  [810482ed] ? 
finish_task_switch+0x3a/0xaf
Apr 26 16:08:34 vdf1 kernel: [6726714.288426]  [812fb8f0] ? 
thread_return+0x79/0xe0
Apr 26 16:08:34 vdf1 kernel: [6726714.288453]  [812fbb64] ? 
wait_for_common+0xde/0x15b
Apr 26 16:08:34 vdf1 kernel: [6726714.288482]  [8104a4cc] ? 
default_wake_function+0x0/0x9
Apr 26 16:08:34 vdf1 kernel: [6726714.288510]  [81062326] ? 
flush_work+0x75/0x87
Apr 26 16:08:34 vdf1 kernel: [6726714.288538]  [81061d00] ? 
wq_barrier_func+0x0/0x9
Apr 26 16:08:34 vdf1 kernel: [6726714.288567]  [811fbd44] ? 
n_tty_poll+0x5e/0x138
Apr 26 16:08:34 vdf1 kernel: [6726714.288594]  [811f8732] ? 
tty_poll+0x56/0x6d
Apr 26 16:08:34 vdf1 kernel: [6726714.288622]  [810fc782] ? 
do_select+0x37b/0x57a
Apr 26 16:08:34 vdf1 kernel: [6726714.288650]  [810fcdfb] ? 
__pollwait+0x0/0xd6
Apr 26 16:08:34 vdf1 kernel: [6726714.288677]  [810fced1] ? 
pollwake+0x0/0x5b
Apr 26 16:08:34 vdf1 kernel: [6726714.288704]  [810fced1] ? 
pollwake+0x0/0x5b
Apr 26 16:08:34 vdf1 kernel: [6726714.288731]  [810fced1] ? 
pollwake+0x0/0x5b
Apr 26 16:08:34 vdf1 kernel: [6726714.288757]  [810fced1] ? 
pollwake+0x0/0x5b
Apr 26 16:08:34 vdf1 kernel: [6726714.288785]  [8127f29b] ? 
tcp_recvmsg+0x98b/0xa9e
Apr 26 16:08:34 vdf1 kernel: [6726714.288813]  [8103fc9e] ? 
update_curr+0xa6/0x147
Apr 26 16:08:34 vdf1 kernel: 

Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-25 Thread Sven Hoexter
Package: linux-image-2.6.32-5-amd64
Version: 2.6.32-41squeeze2
Severity: important

Hi,

since about December 2011 we've seen systems where SSH sessions suddenly hang and
further logins on the physical TTY or via SSH are no longer possible. In some cases
ssh logins still work and you see the motd and maybe can even issue one or two commands.
(I brought this issue up on debian-user in March, with a private reply from a
fellow DD yesterday: http://lists.debian.org/debian-user/2012/03/msg01204.html)


Over time we observed that ssh logins without a PTS (ssh -T) still work. Looking at
other sessions, sshd was in state D and entries in /dev/pts/ were created correctly.
Searching through munin graphs we could narrow down the starting point of this issue
to the moment when the hpet interrupts for one CPU core multiplied, sometimes
by a factor of six. Looking further, we found the kernel thread [events/$x] in state D,
where $x is the number of the CPU core with the high number of hpet interrupts.
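
For reference, the per-CPU hpet interrupt counters behind those munin graphs can
also be read directly, assuming the hpet timers show up in /proc/interrupts on the
affected machines:

  grep -i hpet /proc/interrupts
  watch -n 60 -d 'grep -i hpet /proc/interrupts'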

When we start strace -f on the sshd master process, everything works until you log out.
Then you'll again see the forked sshd process hanging in state D.

So far we've seen this issue exclusively on Linux 2.6.32-based systems,
most often on Debian/Squeeze, less often on Ubuntu 10.04, and once or twice on
a RHEL 6.1 system.

Searching further, I've seen posts on a Dell PowerEdge mailing list referencing
Red Hat BZ#750201 and Intel CPU erratum AAO67 for Nehalem (rapid C-state switching).
The Red Hat bug is currently non-public, but through our technical contact at Red Hat I was
able to receive a summary of this bug and other referenced bugs, which describe more or
less exactly our issue.

According to Red Hat this should be fixed in their kernel 2.6.32-220.7.1.el6,
which cites the following in the changelog:
- [x86] hpet: Disable per-cpu hpet timer if ARAT is supported (Prarit Bhargava) [772884 750201]
- [x86] Improve TSC calibration using a delayed workqueue (Prarit Bhargava) [772884 750201]
- [kernel] clocksource: Add clocksource_register_hz/khz interface (Prarit Bhargava) [772884 750201]
- [kernel] clocksource: Provide a generic mult/shift factor calculation (Prarit Bhargava) [772884 750201]

(Maybe that helps to track down the relevant changes.)

As a workaround, it might help to disable C-states in the BIOS or on the kernel
command line with intel_idle.max_cstate=0 processor.max_cstate=1.
Since we run into the issue only from time to time on the same system, we could not yet
verify either workaround. Rumours indicate that sometimes disabling it in the BIOS did
not help because the kernel enabled C-states again.
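
A sketch of how those parameters could be applied on a squeeze system with the
stock GRUB 2 setup (untested as a fix here; the exact GRUB_CMDLINE_LINUX_DEFAULT
contents are only an example):

  # /etc/default/grub
  GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_idle.max_cstate=0 processor.max_cstate=1"

  update-grub
  reboot
  cat /proc/cmdline    # verify the parameters were picked up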

My current guess is that it's somehow related to the Intel Nehalem CPU bug and only happens
if you have a high single-threaded load, which leads to one or more cores being switched into
a C6 sleep state so that the CPU can overclock one core. The marketing name is Turbo Boost.

Regarding the CPUs I know this happens with:
- Intel X3430
- Intel X3450
- Intel L3426

We see it in almost all cases on Dell R210 systems with the X3430 CPU.
Rumours claim it also happens with other Dell models based on other CPUs from the
Intel Nehalem series with Turbo Boost.


It would be great if someone could track down the needed changes and incorporate them
into a point release. In general I would be available for testing, but we still have
no way to reproduce it besides waiting a few months. :(


Regards,
Sven

-- System Information:
Debian Release: 6.0.4
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-0.bpo.1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-image-2.6.32-5-amd64 depends on:
ii  debconf [debconf-2.0]   1.5.36.1 Debian configuration management sy
ii  initramfs-tools [linux-init 0.99~bpo60+1 tools for generating an initramfs
ii  linux-base  3.4~bpo60+1  Linux image base package
ii  module-init-tools   3.12-2   tools for managing Linux kernel mo

Versions of packages linux-image-2.6.32-5-amd64 recommends:
pn  firmware-linux-free   none (no description available)

Versions of packages linux-image-2.6.32-5-amd64 suggests:
pn  grub | lilo   none (no description available)
pn  linux-doc-2.6.32  none (no description available)






Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-25 Thread Ben Hutchings
On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:
 Package: linux-image-2.6.32-5-amd64
 Version: 2.6.32-41squeeze2
 Severity: important
 
 Hi,
 
 since about December 2011 we've seen systems where SSH sessions suddenly hang and
 further logins on the physical TTY or via SSH are no longer possible. In some cases
 ssh logins still work and you see the motd and maybe can even issue one or two commands.
 (I've brought this issue up on debian-user in march with a private reply from 
 a
 fellow DD yesterday http://lists.debian.org/debian-user/2012/03/msg01204.html)
 
 
 Over time we observed that ssh logins without a PTS (ssh -T) still work. Looking at
 other sessions, sshd was in state D and entries in /dev/pts/ were created correctly.
 Searching through munin graphs we could narrow down the starting point of 
 this issue
 to the point when the hpet interrupts for one CPU core multiplied. Sometimes 
 they
 multiplied by six. Looking further we've found the Kernel [events/$x] in 
 state D
 where $x is the number of the CPU core which has the high number of hpet 
 interrupts.

 When we started strace -f on the sshd master process everything works until 
 you logout.
 Then you'll again see the forked sshd process hanging in state D.

This is strange, because D state means uninterruptible sleep (not
handling signals).  But perhaps the sshd process was repeatedly changing
between uninterruptible and interruptible state.

 Up to that point we've seen this issue exclusively on Linux 2.6.32 based 
 systems,
 most often on Debian/Squeeze and less often on Ubuntu 10.04 and once or twice 
 on
 a RHEL 6.1 system.
 
 Searching further I've seen references on a Dell PowerEdge mailinglist 
 referencing
 RedHat BZ#750201 and Intel CPU errata number AAO67 for Nehalem (rapid C state 
 switching).
 The RedHat bug is currently non-public but through our technical contact at 
 RedHat I was
 able to receive a summary of this bug and other referenced bugs which 
 describe more or
 less exactly our issue.
 
 According to RedHat that should be fixed in their Kernel 2.6.32-220.7.1.el6
 citing the following in the changelog:
 - [x86] hpet: Disable per-cpu hpet timer if ARAT is supported (Prarit 
 Bhargava) [772884 750201]

This is also in Linux 2.6.32.30 and therefore in Debian's version
2.6.32-31.

 - [x86] Improve TSC calibration using a delayed workqueue (Prarit Bhargava) 
 [772884 750201]
 - [kernel] clocksource: Add clocksource_register_hz/khz interface (Prarit 
 Bhargava) [772884 750201]
 - [kernel] clocksource: Provide a generic mult/shift factor calculation 
 (Prarit Bhargava) [772884 750201]

These are not in Linux 2.6.32-longterm or Debian stable.
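
For reference, one way to double-check that (in a local clone with both Linus'
tree and the 2.6.32.y stable tree fetched) is to ask git which tags contain the
commits, e.g.:

  git describe --contains 08ec0c58fb8a     # first upstream tag containing it
  git tag --contains 08ec0c58fb8a | grep '^v2\.6\.32' \
      || echo 'not in any 2.6.32.y release'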

 (Maybe that helps to track down the relevant changes.)

Based on those summary lines, I think the upstream changes are:

commit 08ec0c58fb8a05d3191d5cb6f5d6f81adb419798 (v2.6.38-rc1~480^2~1)
Author: John Stultz johns...@us.ibm.com
Date:   Tue Jul 27 17:00:00 2010 -0700

x86: Improve TSC calibration using a delayed workqueue

commit f12a15be63d1de9a35971f35f06b73088fa25c3a (v2.6.36-rc1~514^2~4)
Author: John Stultz johns...@us.ibm.com
Date:   Tue Jul 13 17:56:27 2010 -0700

x86: Convert common clocksources to use clocksource_register_hz/khz

commit 7d2f944a2b836c69a9d260a0a5f0d1720d57fdff (v2.6.33-rc1~363^2~12)
Author: Thomas Gleixner t...@linutronix.de
Date:   Wed Nov 11 14:05:29 2009 +

clocksource: Provide a generic mult/shift factor calculation

The latter two are dependencies for the first, which is presumably the
really important change.  At a guess, better TSC calibration helps us to
avoid switching to the HPET.  John, do you know whether your changes
have that effect?

 As a workaround it could work to disable C-states in the BIOS or on the 
 Kernel commandline
 with intel_idle.max_cstate=0 processor.max_cstate=1.
 Since we run into that issue only from time to time on the same system we 
 could not yet
 verify either workaround. Rumours indicate that sometimes disabling it in the 
 BIOS did
 not help because the Kernel enabled C-states again.
 
 My current guess is that it's somehow related to the Intel Nehalem CPU bug and only happens
 if you have a high single-threaded load, which leads to one or more cores being switched into
 a C6 sleep state so that the CPU can overclock one core. The marketing name is Turbo Boost.
 
 Regarding the CPUs I know this happens with:
 - Intel X3430
 - Intel X3450
 - Intel L3426
 
 We see it in almost all cases on Dell R210 with the X3430 CPUs.
 Rumours claim it also happens with other Dell models based on other CPUs from 
 the
 Intel Nehalem series with TurboBoost. 
 
 
 It would be great if someone could track down the needed changes and incorporate them
 into a point release. In general I would be available for testing, but we still have
 no way to reproduce it besides waiting a few months. :(

This really isn't a viable means of testing a bug fix.

So we're really going to need to get some sort of explanation of how
these 

Bug#670398: linux-image-2.6.32-5-amd64: SSH logins hang while hpet interrupts multiply on Intel Nehalem CPUs

2012-04-25 Thread John Stultz

On 04/25/2012 08:49 PM, Ben Hutchings wrote:

On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:

Package: linux-image-2.6.32-5-amd64
Version: 2.6.32-41squeeze2
Severity: important

Hi,

since about December 2011 we've seen systems where SSH sessions suddenly hang and
further logins on the physical TTY or via SSH are no longer possible. In some cases
ssh logins still work and you see the motd and maybe can even issue one or two commands.
(I've brought this issue up on debian-user in march with a private reply from a
fellow DD yesterday http://lists.debian.org/debian-user/2012/03/msg01204.html)


Over time we observed that ssh logins without a PTS (ssh -T) still work. Looking at
other sessions, sshd was in state D and entries in /dev/pts/ were created correctly.
Searching through munin graphs we could narrow down the starting point of this 
issue
to the point when the hpet interrupts for one CPU core multiplied. Sometimes 
they
multiplied by six. Looking further we've found the Kernel [events/$x] in state D
where $x is the number of the CPU core which has the high number of hpet 
interrupts.

When we started strace -f on the sshd master process everything works until you 
logout.
Then you'll again see the forked sshd process hanging in state D.

This is strange, because D state means uninterruptible sleep (not
handling signals).  But perhaps the sshd process was repeatedly changing
between uninterruptible and interruptible state.


[snip]

Based on those summary lines, I think the upstream changes are:

commit 08ec0c58fb8a05d3191d5cb6f5d6f81adb419798 (v2.6.38-rc1~480^2~1)
Author: John Stultz johns...@us.ibm.com
Date:   Tue Jul 27 17:00:00 2010 -0700

 x86: Improve TSC calibration using a delayed workqueue

commit f12a15be63d1de9a35971f35f06b73088fa25c3a (v2.6.36-rc1~514^2~4)
Author: John Stultz johns...@us.ibm.com
Date:   Tue Jul 13 17:56:27 2010 -0700

 x86: Convert common clocksources to use clocksource_register_hz/khz

commit 7d2f944a2b836c69a9d260a0a5f0d1720d57fdff (v2.6.33-rc1~363^2~12)
Author: Thomas Gleixner t...@linutronix.de
Date:   Wed Nov 11 14:05:29 2009 +

 clocksource: Provide a generic mult/shift factor calculation

The latter two are dependencies for the first, which is presumably the
really important change.  At a guess, better TSC calibration helps us to
avoid switching to the HPET.  John, do you know whether your changes
have that effect?

No, the TSC calibration just makes sure we get the same fine-grained
cpu_khz value on every boot, which avoids time skew. It should not affect
whether we switch to the HPET, which would be a sign of an unsynced or
halting TSC.


When you can connect to the system that is having problems, do you see 
any problems with the time? ie: does date show the correct time, and 
does it increment normally?


It sounds like if there is some HPET irq issue, it would likely be due
to some sort of global wakeup to handle local APICs that halt in deep
sleep modes. It's likely that getting /proc/timer_list output would help
(both before and after the problem).


Also I'd try to bring in tglx for his thoughts.

thanks
-john



