We've been able to reproduce the bug in a more isolated environment.

I wrote a Python script (pgslam.py) that generates a load similar enough
to our production traffic. In addition, I wrote a bash script that sets
up a hi1.4xlarge EC2 instance to reproduce the issue. During the tests,
I launched the pgslam.py script from another instance and pointed it at
the instance prepared with the bash script.

The following command causes the EC2 instance built with that script to
lock up in under a minute:

$ python pgslam.py 'host=10.10.10.10 user=pgslam password=pgslam' 800

These messages appear in the console log:

[706342.844192] BUG: soft lockup - CPU#7 stuck for 23s! [postgres:9266]
[706342.844272] Stack:
[706342.844296] Call Trace:
[706342.844409] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc 
cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 
59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
[706370.844190] BUG: soft lockup - CPU#7 stuck for 23s! [postgres:9266]
[706370.844519] Stack:
[706370.844549] Call Trace:
[706370.844916] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc 
cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 
59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
[706371.320186] INFO: rcu_sched detected stalls on CPUs/tasks: { 0 11 13} 
(detected by 7, t=15002 jiffies)
[706406.844191] BUG: soft lockup - CPU#7 stuck for 24s! [postgres:9266]
[706406.844293] Stack:
[706406.844330] Call Trace:
[706406.844461] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc 
cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 
59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
[706434.844191] BUG: soft lockup - CPU#7 stuck for 22s! [postgres:9266]
[706434.844273] Stack:
[706434.844297] Call Trace:
[706434.844411] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc 
cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 
59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
[706462.844192] BUG: soft lockup - CPU#7 stuck for 22s! [postgres:9266]
[706462.844273] Stack:
[706462.844297] Call Trace:
[706462.844412] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc 
cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 
59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
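
For anyone comparing console logs across instances, here is a rough
sketch of how the soft lockup lines above could be tallied. It is not
part of pgslam.py or the setup script; the function name
summarize_soft_lockups and the returned field names are made up for
illustration, and the pattern only assumes the message format shown in
this log.

```python
import re
from collections import Counter

# Matches lines such as:
#   [706342.844192] BUG: soft lockup - CPU#7 stuck for 23s! [postgres:9266]
# The leading "[" is optional because it can be lost when logs are pasted.
SOFT_LOCKUP = re.compile(
    r"\[?\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_soft_lockups(log_text):
    """Hypothetical helper: collect soft lockup events from console log text.

    Returns (events, counts). events is a list of dicts, one per matching
    line; counts maps (cpu, command, pid) to the number of reports.
    """
    events = []
    for line in log_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match is None:
            continue  # Stack:, Call Trace:, Code: and other lines are skipped
        events.append({
            "timestamp": float(match.group("ts")),
            "cpu": int(match.group("cpu")),
            "stuck_s": int(match.group("secs")),
            "comm": match.group("comm"),
            "pid": int(match.group("pid")),
        })
    counts = Counter((e["cpu"], e["comm"], e["pid"]) for e in events)
    return events, counts
```

Feeding the excerpt above through this function should give five
events, all for CPU 7 and postgres pid 9266, which makes it easy to see
that a single backend stays stuck rather than the lockup moving between
processes.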

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1011792

Title:
  Kernel lockup running 3.0.0 and 3.2.0 on multiple EC2 instance types
