[Bug 1787258] Re: 3.13.0-155.205 Kernel Panic - divide by zero

2018-08-15 Thread Matt Wilson
What instance type saw this kernel panic?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1787258

Title:
  3.13.0-155.205 Kernel Panic - divide by zero

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1787258/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1780548] [NEW] SSH server won't start, exit code 255

2018-07-07 Thread Matt Wilson
Public bug reported:

I keep trying to set up external SSH access using openssh-server on my
18.04 system, and it fails with this error:

sudo service ssh status
● ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2018-07-07 09:33:19 CDT; 12min ago
  Process: 3243 ExecStartPre=/usr/sbin/sshd -t (code=exited, status=255)

Jul 07 09:33:19 warehouse systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Jul 07 09:33:19 warehouse systemd[1]: ssh.service: Scheduled restart job, restart counter is at 5.
Jul 07 09:33:19 warehouse systemd[1]: Stopped OpenBSD Secure Shell server.
Jul 07 09:33:19 warehouse systemd[1]: ssh.service: Start request repeated too quickly.
Jul 07 09:33:19 warehouse systemd[1]: ssh.service: Failed with result 'exit-code'.
Jul 07 09:33:19 warehouse systemd[1]: Failed to start OpenBSD Secure Shell server.

I was in the process of uninstalling the openssh-server and ssh packages
and was prompted to start a bug report.  If it's in error, just let me
know.  My ssh config file is all default except for
passwordauthentication = yes.  I've toggled that to default as well, and
still get the same error.

ProblemType: Package
DistroRelease: Ubuntu 18.04
Package: openssh-server 1:7.6p1-4
ProcVersionSignature: Ubuntu 4.15.0-23.25-generic 4.15.18
Uname: Linux 4.15.0-23-generic x86_64
NonfreeKernelModules: kpatch_livepatch_Ubuntu_4_15_0_23_25_generic_40
ApportVersion: 2.20.9-0ubuntu7.2
AptOrdering:
 openssh-server:amd64: Install
 ssh:amd64: Install
 NULL: ConfigurePending
Architecture: amd64
Date: Sat Jul  7 09:48:11 2018
ErrorMessage: installed openssh-server package post-installation script subprocess returned error exit status 1
InstallationDate: Installed on 2018-07-07 (0 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
Python3Details: /usr/bin/python3.6, Python 3.6.5, python3-minimal, 3.6.5-3
PythonDetails: N/A
RelatedPackageVersions:
 dpkg 1.19.0.5ubuntu2
 apt  1.6.2
SSHDConfig:
 Error: command ['/usr/sbin/sshd', '-T'] failed with exit code 255: /etc/ssh/sshd_config: line 1: Bad configuration option: \342\200\213\342\200\213
 /etc/ssh/sshd_config: terminating, 1 bad configuration options
SourcePackage: openssh
Title: package openssh-server 1:7.6p1-4 failed to install/upgrade: installed openssh-server package post-installation script subprocess returned error exit status 1
UpgradeStatus: No upgrade log present (probably fresh install)
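
The SSHDConfig error above reports `\342\200\213` twice on line 1 of sshd_config. Those octal escapes are the UTF-8 bytes E2 80 8B, which encode U+200B ZERO WIDTH SPACE, a character that is invisible in most editors. The following is a minimal sketch for locating such characters; the names `INVISIBLE` and `find_invisible` and the set of characters checked are illustrative assumptions, not part of any OpenSSH tooling.

```python
# Sketch: report invisible Unicode characters in a config file's text.
# INVISIBLE and find_invisible() are hypothetical names for illustration.

INVISIBLE = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE (BOM)",
}


def find_invisible(text):
    """Return (line_number, column, name) for each invisible character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in INVISIBLE:
                hits.append((lineno, col, INVISIBLE[ch]))
    return hits


if __name__ == "__main__":
    # The octal escapes from the sshd error message, decoded as UTF-8,
    # followed by made-up placeholder text.
    raw = b"\342\200\213\342\200\213\n# rest of the config would follow\n"
    for lineno, col, name in find_invisible(raw.decode("utf-8")):
        print(f"line {lineno}, column {col}: {name}")
```

To check a real file, one could read it with `open(path, encoding="utf-8").read()` and pass the result to `find_invisible`; the path is left to the reader.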

** Affects: openssh (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-package bionic

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1780548

Title:
  SSH server won't start, exit code 255

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openssh/+bug/1780548/+subscriptions


[Bug 1668129] Re: Amazon I3 Instance Buffer I/O error on dev nvme0n1

2017-03-01 Thread Matt Wilson
I imagine CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is set for the Ubuntu
kernel?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1668129

Title:
  Amazon I3 Instance Buffer I/O error on dev nvme0n1

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1668129/+subscriptions



[Bug 1668129] Re: Amazon I3 Instance Buffer I/O error on dev nvme0n1

2017-03-01 Thread Matt Wilson
Yes, ballooning has been a constant source of problems, which is why it
is disabled in the Amazon Linux AMI.

We do not currently support DMA to/from guest physical addresses outside
of the E820 map for ENA networking or NVMe storage interfaces. This
effectively means that ballooning needs to be disabled, or perhaps some
changes would need to be made in the Xen swiotlb code to bounce data
that resides in guest physical addresses that are outside of the E820
map.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1668129

Title:
  Amazon I3 Instance Buffer I/O error on dev nvme0n1

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1668129/+subscriptions



[Bug 1668129] Re: Amazon I3 Instance Buffer I/O error on dev nvme0n1

2017-03-01 Thread Matt Wilson
Dan,

It appears that the requests being submitted refer to DMA addresses that
exceed the guest physical memory range, which is why the requests fail.
The address seen is outside the E820 map:

[ 0.00] e820: BIOS-provided physical RAM map:
[ 0.00] BIOS-e820: [mem 0x-0x0009dfff] usable
[ 0.00] BIOS-e820: [mem 0x0009e000-0x0009] reserved
[ 0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[ 0.00] BIOS-e820: [mem 0x0010-0x7fff] usable
[ 0.00] BIOS-e820: [mem 0xfc00-0x] reserved
[ 0.00] BIOS-e820: [mem 0x0001-0x000fbfff] usable
[ 0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[ 0.00] e820: remove [mem 0x000a-0x000f] usable
[ 0.00] e820: last_pfn = 0xfc max_arch_pfn = 0x4
[ 0.00] e820: last_pfn = 0x8 max_arch_pfn = 0x4
[ 0.00] e820: [mem 0x8000-0xfbff] available for PCI devices
[ 5.595004] e820: reserve RAM buffer [mem 0x0009e000-0x0009]

We see an address of 0xfc7ffb000
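
The check described here amounts to asking whether an address falls inside any usable range of the E820 map. A minimal sketch follows. The hex values in the log above are truncated in this archive, so the ranges in the code are ASSUMED, illustrative values (usable RAM ending just below 0xfc0000000), not figures copied from the log; only the address 0xfc7ffb000 comes from the message. `USABLE_RANGES` and `in_usable_ram` are hypothetical names.

```python
# Sketch: test whether a DMA address lies inside a "usable" E820 range.
# ASSUMPTION: these ranges are illustrative only; the log in this archive
# has lost digits, so they are not taken from it verbatim.
USABLE_RANGES = [
    (0x0000000000100000, 0x000000007FFFFFFF),
    (0x0000000100000000, 0x0000000FBFFFFFFF),
]


def in_usable_ram(addr, ranges=USABLE_RANGES):
    """Return True if addr falls inside any (start, end) inclusive range."""
    return any(start <= addr <= end for start, end in ranges)


if __name__ == "__main__":
    addr = 0xFC7FFB000  # the address reported in the message
    print(hex(addr), "inside usable RAM:", in_usable_ram(addr))
```

Under these assumed ranges the reported address lies above the last usable byte, which is consistent with the failure described.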

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1668129

Title:
  Amazon I3 Instance Buffer I/O error on dev nvme0n1

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1668129/+subscriptions



[Bug 1497428] Re: kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

2016-01-11 Thread Matt Wilson
Dan,

This BUG_ON has been demoted to only trigger when DEBUG_VM is set in
upstream:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=97ee4ba7cbd30f1858f0d16911e042737c53f2ef

I'm looking into why there's a one page difference between the E820
tables and SRAT. You're right that there seems to be an off-by-one in
one or the other.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1497428

Title:
  kernel BUG at /build/buildd/linux-3.13.0/mm/page_alloc.c:968

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1497428/+subscriptions



[Bug 1349883] Re: dmesg time wildly incorrect on paravirtual EC2 instances.

2014-09-15 Thread Matt Wilson
Hi Stefan,

I looked at this a long time back (circa 2011), and things may have
changed since then. See:
https://forums.aws.amazon.com/thread.jspa?threadID=59753

When I looked at this last, we weren't emulating TSC and the CPUID flags
that advertise invariant TSC came through. This was making the scheduler
clock "stable" and printk() would use native_sched_clock() for x86, i.e.
rdtscll().

I don't know how this is involved with setting up the pvclock
timesource...

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1349883

Title:
  dmesg time wildly incorrect on paravirtual EC2 instances.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349883/+subscriptions



[Bug 1304001] Re: xen:balloon errors in 14.04 beta

2014-07-08 Thread Matt Wilson
Not precisely. What toolstack are you using? I can try to reproduce
outside of our control plane with a config file that would work on your
toolstack.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1304001

Title:
   xen:balloon errors in 14.04 beta

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304001/+subscriptions



[Bug 1304001] Re: xen:balloon errors in 14.04 beta

2014-07-03 Thread Matt Wilson
Boris, all:

We did a test with disabling the SRAT entirely, but the balloon messages
persisted.

ami-af8d9ac6 (ubuntu/images-milestone/hvm/ubuntu-trusty-14.04-beta2-amd64-server-20140326)

$ dmesg | grep -i 'srat\|node\|numa\|balloon' |grep -iv inode
[0.00] No NUMA configuration found
[0.00] Faking a node at [mem 0x-0xefff]
[0.00] Initmem setup node 0 [mem 0x-0xefff]
[0.00]   NODE_DATA [mem 0xefffa000-0xefffefff]
[0.00]  [ea00-ea0003bf] PMD -> [8800eba0-8800ef5f] on node 0
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x0009dfff]
[0.00]   node   0: [mem 0x0010-0xefff]
[0.00] On node 0 totalpages: 982941
[0.00] setup_percpu: NR_CPUS:256 nr_cpumask_bits:256 nr_cpu_ids:15 nr_node_ids:1
[0.00] Built 1 zonelists in Node order, mobility grouping on.  Total pages: 967560
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=15, Nodes=1
[0.166158]  node  #0, CPUs:#1cpu 1 spinlock event irq 75
[0.181508] x86: Booted up 1 node, 2 CPUs
[0.427625] xen:balloon: Initialising balloon driver
[0.428050] xen_balloon: Initialising balloon driver
[0.432041] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[2.436064] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[6.444075] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   14.460116] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   30.492124] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   62.556264] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   94.620288] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[  126.684103] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[  158.748099] xen:balloon: reserve_additional_memory: add_memory() failed: -17
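
The timestamps on the failure messages above suggest a retry delay that doubles and then levels off at about 32 seconds, and -17 corresponds to EEXIST in Linux's errno numbering. A minimal sketch that extracts the intervals from the log lines is shown below; `retry_intervals` is a hypothetical helper, and the interpretation as a capped backoff is an inference from the timestamps, not something stated in the thread.

```python
import re

# The add_memory() failure lines from the dmesg output above.
LOG = """\
[0.432041] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[2.436064] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[6.444075] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   14.460116] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   30.492124] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   62.556264] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[   94.620288] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[  126.684103] xen:balloon: reserve_additional_memory: add_memory() failed: -17
[  158.748099] xen:balloon: reserve_additional_memory: add_memory() failed: -17
"""

# Matches the leading "[   14.460116]" style timestamp.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def retry_intervals(log):
    """Return the gaps between failures, rounded to whole seconds."""
    times = []
    for line in log.splitlines():
        if "add_memory() failed" not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            times.append(float(match.group(1)))
    return [round(later - earlier) for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    print(retry_intervals(LOG))
```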

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1304001

Title:
   xen:balloon errors in 14.04 beta

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304001/+subscriptions



[Bug 1093644] Re: [Samsung NP535U3C-A03DE] Blank screen on resume from suspend

2013-01-26 Thread Matt Wilson
I have a very similar setup, and had the same issue.
Samsung NP535U3C-B01US
AMD A6-4455M APU with Radeon(tm) HD 7500G Graphics
3.5.0-17-generic
Ubuntu 12.10

Suspend was OK; resume gave a blank screen with no backlight, but the
system was otherwise fully functional.

*Not an expert; actually, I have no idea what I'm doing*

1. Installed FGLRX drivers from "additional drivers". Did not work; black screen on reboot.
2. Updated kernel to 3.5.0-23 and then 3.8 mainline. Did not work, and ended up causing some boot issues. Uninstalled everything but 3.5.0-17.
3. Installed driver directly from AMD. New version released 1/17/2013:
http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx
Did not work; the install failed, and it turned out my kernel headers were not installed. No idea what that means.

4. Installed headers. Maybe I messed something up with the kernel updates. Success.
sudo apt-get install linux-headers-$(uname -r)

5. Installed driver directly from AMD. Success.

Suspend and resume now work correctly, and the screen comes back on as it should. However, the brightness function keys no longer work.
My root cause was a driver issue.

I was able to run "dmesg" and  "/sys/power/pm_trace" before the fix, so
if you are interested in seeing that, please let me know.  Similar setup
to Mr. Kukol

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1093644

Title:
  [Samsung NP535U3C-A03DE] Blank screen on resume from suspend

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1093644/+subscriptions



[Bug 1078619] Re: [raring] xen power managment (freq scaling) fails on linux 3.7

2013-01-15 Thread Matt Wilson
See my post here:
http://lists.xen.org/archives/html/xen-devel/2013-01/msg00941.html

The correct values should be returned already via rdmsr if
"cpufreq=dom0-kernel" is specified on the Xen command line. Looking at
the LP report, it doesn't seem that this option was used.

Likely you will also need to use "dom0_vcpus_pin=1" if you want dom0 to
be the cpufreq controller on AMD.

Matt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1078619

Title:
  [raring] xen power managment (freq scaling) fails on linux 3.7

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1078619/+subscriptions



[Bug 1011792] Re: Kernel lockup running 3.0.0 and 3.2.0 on multiple EC2 instance types

2012-09-20 Thread Matt Wilson
For what it's worth, I started running this test case on the Amazon
Linux AMI (ami-aecd60c7) yesterday. It hasn't crashed. The DB is now >96
GiB.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1011792

Title:
  Kernel lockup running 3.0.0 and 3.2.0 on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1011792/+subscriptions



[Bug 1052275] Re: "BUG: Bad page state in process" when running on EC2

2012-09-17 Thread Matt Wilson
** Attachment added: "i-b557f6ce.txt"
   
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/1052275/+attachment/3321706/+files/i-b557f6ce.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1052275

Title:
  "BUG: Bad page state in process" when running on EC2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/1052275/+subscriptions



[Bug 1052275] Re: "BUG: Bad page state in process" when running on EC2

2012-09-17 Thread Matt Wilson
** Attachment added: "i-af57f6d4.txt"
   
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/1052275/+attachment/3321705/+files/i-af57f6d4.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1052275

Title:
  "BUG: Bad page state in process" when running on EC2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/1052275/+subscriptions



[Bug 1052275] [NEW] "BUG: Bad page state in process" when running on EC2

2012-09-17 Thread Matt Wilson
Public bug reported:

After running for some time, several m1.large 64-bit instances started
repeatedly hitting this BUG_ON()

[525758.322281] BUG: Bad page state in process pdnsd  pfn:1d1a6f
[525758.322290] page:88000b26f848 flags:887c count:2 mapcount:0 
mapping:8800d2da0860 index:99
[525758.322294] Pid: 731, comm: pdnsd Not tainted 2.6.32-346-ec2 #51-Ubuntu
[525758.322296] Call Trace:
[525758.322305]  [] bad_page+0xd0/0x130
[525758.322307]  [] prep_new_page+0x1aa/0x1c0
[525758.322310]  [] ? zone_watermark_ok+0x25/0xe0
[525758.322312]  [] get_page_from_freelist+0x16b/0x550
[525758.322315]  [] __alloc_pages_nodemask+0xd6/0x180
[525758.322319]  [] do_anonymous_page+0x21d/0x540
[525758.322321]  [] handle_mm_fault+0x427/0x4f0
[525758.322333]  [] do_page_fault+0x147/0x390
[525758.322335]  [] page_fault+0x28/0x30

One instance ultimately hit a GPF:
[525758.336588] general protection fault:  [#1] SMP 
[525758.336598] last sysfs file: 
/sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
[525758.336601] CPU 1 
[525758.336603] Modules linked in: ipv6 raid0 md_mod
[525758.336610] Pid: 731, comm: pdnsd Tainted: GB  2.6.32-346-ec2 
#51-Ubuntu 
[525758.336613] RIP: e030:[]  [] 
get_page_from_freelist+0x1f6/0x550
[525758.336623] RSP: e02b:8801dce4bce8  EFLAGS: 00010096
[525758.336625] RAX: 816b1570 RBX: 816b1480 RCX: 
0040
[525758.336628] RDX: dead00100100 RSI:  RDI: 
0005
[525758.336630] RBP: 8801dce4bdb8 R08: 00010ffa R09: 

[525758.336633] R10: 0005 R11:  R12: 
88000b26f848
[525758.336636] R13: 0001 R14: dead00200200 R15: 
0002
[525758.336642] FS:  7f4b928ee700() GS:880002e7e000() 
knlGS:
[525758.336645] CS:  e033 DS:  ES:  CR0: 80050033
[525758.336647] CR2: 7f4b900e9ff8 CR3: 0001debdf000 CR4: 
2660
[525758.336650] DR0:  DR1:  DR2: 

[525758.336653] DR3:  DR6: 0ff0 DR7: 

[525758.336656] Process pdnsd (pid: 731, threadinfo 8801dce4a000, task 
8801dce40300)
[525758.336659] Stack:
[525758.336660]  8801dcdaf0c0 0002dcdaf0c0  
a3c0
[525758.336665] <0> 88010041 8801dcdaf0c0 dce4be28 
0001
[525758.336670] <0> 00030040  816b6088 
816b34c0
[525758.336677] Call Trace:
[525758.336682]  [] __alloc_pages_nodemask+0xd6/0x180
[525758.336687]  [] do_anonymous_page+0x21d/0x540
[525758.336690]  [] handle_mm_fault+0x427/0x4f0
[525758.336695]  [] do_page_fault+0x147/0x390
[525758.336698]  [] page_fault+0x28/0x30
[525758.336701] Code: 84 b0 00 00 00 4b 8d 44 ef 05 48 c1 e0 04 4c 8b 64 18 08 49 83 ec 28 49 8b 44 24 30 49 8b 54 24 28 49 be 00 02 20 00 00 00 ad de <48> 89 42 08 48 89 10 48 b8 00 01 10 00 00 00 ad de 49 89 44 24 
[525758.336747] RIP  [] get_page_from_freelist+0x1f6/0x550
[525758.336752]  RSP 
[525758.336757] ---[ end trace 371c569b99678b87 ]---

** Affects: linux-ec2 (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1052275

Title:
  "BUG: Bad page state in process" when running on EC2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/1052275/+subscriptions



[Bug 1011792] Re: Kernel lockup running 3.0.0 and 3.2.0 on multiple EC2 instance types

2012-08-21 Thread Matt Wilson
"@Matt, when you produce those cpu stacktraces, how do you do that? Is
that from a dump or somehow tapping into the still running instance?"

@smb, these are traces from running, but unresponsive, instances. I pull
the traces from the vCPU context in the hypervisor, then resolve symbols
from the System.map of the kernel running in the guest.
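
The symbol-resolution step described above can be sketched as a nearest-lower-address lookup in System.map. This is only an illustration of the idea, not the tooling actually used. The sample map entries are ASSUMED values, chosen so that they agree with offsets such as try_to_wake_up+0xca seen in the traces in this thread (the full 64-bit addresses are a guess, since the archive has lost the leading digits); `load_system_map` and `resolve` are hypothetical names.

```python
import bisect

# ASSUMPTION: illustrative System.map lines ("address type name");
# the addresses are not taken from a real System.map.
SAMPLE_SYSTEM_MAP = """\
ffffffff81001000 T hypercall_page
ffffffff8105e450 T try_to_wake_up
ffffffff8105e650 T default_wake_function
"""


def load_system_map(text):
    """Parse System.map text into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+0xoffset' for the closest symbol at or below addr."""
    addrs = [a for a, _ in symbols]
    index = bisect.bisect_right(addrs, addr) - 1
    if index < 0:
        return None  # address is below the first known symbol
    base, name = symbols[index]
    return f"{name}+{addr - base:#x}"


if __name__ == "__main__":
    table = load_system_map(SAMPLE_SYSTEM_MAP)
    print(resolve(table, 0xFFFFFFFF8105E51A))
```

A lookup like this cannot tell whether an address lies past the end of the matched function, so stale return addresses left on the stack still resolve to some symbol.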

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1011792

Title:
  Kernel lockup running 3.0.0 and 3.2.0 on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1011792/+subscriptions



[Bug 1011792] Re: Scheduler deadlock running 3.0.0 on multiple EC2 instance types

2012-08-15 Thread Matt Wilson
We've observed this on another instance running 12.04, ami-3c994355,
with a read-heavy postgresql load.

CPU0
rip: 8105e51a try_to_wake_up+0xca
flags: 1202 i nz
rsp: 880f22deb7d0
rax: 0004   rcx: 880f22deb900   rdx: 0082
rbx: 880c633196e0   rsi:    rdi: 880c63319cc0
rbp: 880f22deb820r8: ea0024739ec0r9: 0001
r10: 57ff5831a9739ec0   r11:    r12: 880c63319cc0
r13:    r14:    r15: 0003
 cs: e033ss: e02bds: es: 
 fs:  @ 7ff790365740
 gs:  @ 880f22de8000/
Code (instr addr 8105e51a)
66 90 45 85 ff 74 0e 44 8b 7d cc e9 aa 00 00 00 0f 1f 00 f3 90 <44> 8b 4b 28 45 
85 c9 75 f5 48 8b


Stack:
 81c70004 81c70004 0001 00041e7b3969
  880a931f9c70 0001 880f22fbcaf0
 880f22deb900  880f22deb830 8105e662
 880f22deb850 81089356 007fff57 880f22fbcad8

Call Trace:
  [] try_to_wake_up+0xca  <--
  [] default_wake_function+0x12
  [] autoremove_wake_function+0x16
  [] wake_bit_function+0x3b
  [] __wake_up_common+0x58
  [] __wake_up+0x48
  [] __wake_up_bit+0x31
  [] unlock_page+0x2a
  [] mpage_end_io+0x46
  [] bio_endio+0x1d
  [] req_bio_endio.isra.45+0xa3
  [] blk_update_request+0xf5
  [] blk_update_bidi_request+0x31
  [] __blk_end_bidi_request+0x20
  [] __blk_end_request_all+0x1f
  [] blkif_interrupt+0x1cc
  [] handle_irq_event_percpu+0x55
  [] radix_tree_lookup+0xb
  [] handle_irq_event+0x4e
  [] handle_edge_irq+0x84
  [] __xen_evtchn_do_upcall+0x199
  [] xen_evtchn_do_upcall+0x2f
  [] xen_do_hypervisor_callback+0x1e
  [] hypercall_page+0x3aa
  [] hypercall_page+0x3aa
  [] xen_poll_irq_timeout+0x3e
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x97
  [] xen_spin_lock_flags+0x63
  [] _raw_spin_lock_irqsave+0x2e
  [] update_shares+0x9e
  [] rebalance_domains+0x48
  [] run_rebalance_domains+0x48
  [] __do_softirq+0xa8
  [] __xen_evtchn_do_upcall+0x207
  [] call_softirq+0x1c
  [] do_softirq+0x65
  [] irq_exit+0x8e
  [] xen_evtchn_do_upcall+0x35
  [] xen_do_hypervisor_callback+0x1e
CPU1
rip: 810013aa hypercall_page+0x3aa
flags: 1246 i z p
rsp: 880ec1225ec8
rax:    rcx: 810013aa   rdx: 
rbx: 880ec1225fd8   rsi:    rdi: 0001
rbp: 880ec1225ee0r8: r9: 
r10: 0001   r11: 0246   r12: 81cdbd60
r13: 0001   r14:    r15: 
 cs: e033ss: e02bds: 002bes: 002b
 fs:  @ 7ff790365740
 gs:  @ 880f22e04000/
Code (instr addr 810013aa)
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc 
cc cc cc cc cc cc


Stack:
   8100a190 880ec1225f10
 8101b893 880ec1225fd8 81cdbd60 
  880ec1225f40 81012236 8100a9d9
 58e9d471e076535d   880ec1225f50

Call Trace:
  [] hypercall_page+0x3aa  <--
  [] xen_safe_halt+0x10
  [] default_idle+0x53
  [] cpu_idle+0xd6
  [] xen_irq_enable_direct_reloc+0x4
  [] cpu_bringup_and_idle+0xe
CPU2
rip: 810013aa hypercall_page+0x3aa
flags: 1202 i nz
rsp: 880ec0ef1930
rax:    rcx: 810013aa   rdx: 
rbx:    rsi: 880ec0ef1948   rdi: 0003
rbp: 880ec0ef1978r8: 880ec316d700r9: 880ee5400100
r10: 0022   r11: 0202   r12: 001d
r13: 0001   r14: 880ec0ef1a01   r15: 880c63319c00
 cs: e033ss: e02bds: es: 
 fs:  @ 7f7566bae700
 gs:  @ 880f22e2/
Code (instr addr 810013aa)
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc 
cc cc cc cc cc cc


Stack:
 0246 fffa 813a464e 880ec0ef1964
 0001  0010813a404e 880ec0ef1978
 880c63319cc0 880ec0ef1988 813a6130 880ec0ef19d8
 8163a7a1 0002  

Call Trace:
  [] hypercall_page+0x3aa  <--
  [] xen_poll_irq_timeout+0x3e
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x97
  [] xen_spin_lock_flags+0x63
  [] _raw_spin_lock_irqsave+0x2e
  [] task_rq_lock+0x40
  [] task_sched_runtime+0x2c
  [] thread_group_cputime+0x78
  [] thread_group_times+0x33
  [] do_task_stat+0x6c3
  [] mem_cgroup_add_lru_list+0x1a
  [] xen_set_pte_at+0x39
  [] pte_mfn_to_pfn+0x89
  [] xen_pte_val+0x30
  [] __raw_callee_save_xen_pte_val+0x11
  [] unlock_page+0x2a
  [] follow_page+0x322
  [] sys_mincore+0x131
  [] proc_tgid_stat+0x14
  [] proc_single_show+0x5c

[Bug 1011792] Re: Scheduler deadlock running 3.0.0 on multiple EC2 instance types

2012-08-14 Thread Matt Wilson
Stack traces from a second hi1.4xlarge running ami-8baa73e2:

CPU0
rip: 8105711a try_to_wake_up+0xca
flags: 1202 i nz
rsp: 880f22dfc740
rax: 0006   rcx: 880f22dfc870   rdx: 0082
rbx: 880b13d9   rsi:    rdi: 0001
rbp: 880f22dfc790r8: ea0008b0b118r9: 
r10: 57ffc73d4450b118   r11: 0202   r12: 880b13d905f8
r13:    r14:    r15: 0003
 cs: e033ss: e02bds: es: 
 fs:  @ 7f6132beb700
 gs:  @ 880f22df9000/
Code (instr addr 8105711a)
66 90 45 85 ff 74 0e 44 8b 7d cc e9 a8 00 00 00 0f 1f 00 f3 90 <8b> 7b 28 85 ff 
75 f7 48 8b 13 31


Stack:
 81007b02 00012940 00012940 000621c005e8
 8110b7f7 880b13dbfc30  880f22fc47d0
 880f22dfc870  880f22dfc7a0 81057262
 880f22dfc7c0 810813c6 810073ed 880f22fc47b8

Call Trace:
  [] try_to_wake_up+0xca  <--
  [] check_events+0x12
  [] mempool_free_slab+0x17
  [] default_wake_function+0x12
  [] autoremove_wake_function+0x16
  [] xen_force_evtchn_callback+0xd
  [] wake_bit_function+0x3b
  [] __wake_up_common+0x58
  [] mempool_free_slab+0x17
  [] __wake_up+0x48
  [] __wake_up_bit+0x31
  [] unlock_page+0x2a
  [] mpage_end_io+0x46
  [] bio_endio+0x1d
  [] dec_pending+0x87
  [] clone_endio+0xa3
  [] bio_endio+0x1d
  [] req_bio_endio.isra.45+0xa3
  [] blk_update_request+0xf5
  [] blk_update_bidi_request+0x31
  [] __blk_end_request_all+0x37
  [] blkif_interrupt+0x167
  [] handle_irq_event_percpu+0x55
  [] radix_tree_lookup+0xb
  [] handle_irq_event+0x4b
  [] handle_edge_irq+0x7c
  [] __xen_evtchn_do_upcall+0x199
  [] xen_evtchn_do_upcall+0x2f
  [] xen_do_hypervisor_callback+0x1e
  [] hypercall_page+0x3aa
  [] hypercall_page+0x3aa
  [] xen_poll_irq_timeout+0x3e
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x98
  [] xen_spin_lock_flags+0x63
  [] _raw_spin_lock_irqsave+0x2e
  [] update_shares+0x92
  [] tcp_init_xmit_timers+0x30
  [] rebalance_domains+0x48
  [] xen_timer_interrupt+0x2c
  [] run_rebalance_domains+0x48
  [] __do_softirq+0xa8
  [] __xen_evtchn_do_upcall+0x207
  [] call_softirq+0x1c
  [] do_softirq+0x65
  [] irq_exit+0x8e
  [] xen_evtchn_do_upcall+0x35
  [] xen_do_hypervisor_callback+0x1e
CPU1
rip: 810013aa hypercall_page+0x3aa
flags: 1206 i nz p
rsp: 88001001b7a0
rax:    rcx: 810013aa   rdx: 
rbx:    rsi: 88001001b7b8   rdi: 0003
rbp: 88001001b7e8r8: 880ec85a3000r9: 880f21c000d0
r10: 6de60240   r11: 0206   r12: 0017
r13: 0001   r14: 880ec5acd801   r15: 
 cs: e033ss: e02bds: es: 
 fs:  @ 7fa13a3ac700
 gs:  @ 880f22e14000/
Code (instr addr 810013aa)
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc 
cc cc cc cc cc cc


Stack:
 0001 fffa 8138512e 88001001b7d4
 0001  000a81384b3e 88001001b7e8
 81efce20 88001001b7f8 81386b80 88001001b848
 815ecdcd 1000  

Call Trace:
  [] hypercall_page+0x3aa  <--
  [] xen_poll_irq_timeout+0x3e
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x98
  [] xen_spin_lock+0x4a
  [] _raw_spin_lock_irq+0x15
  [] __make_request+0xbd
  [] _dm_request.isra.21+0x104
  [] generic_make_request.part.51+0x24a
  [] bio_add_page+0x53
  [] do_mpage_readpage+0x470
  [] generic_make_request+0x45
  [] submit_bio+0x87
  [] mpage_readpages+0x120
  [] __alloc_pages_nodemask+0x109
  [] xen_force_evtchn_callback+0xd
  [] read_pages+0x48
  [] __do_page_cache_readahead+0x163
  [] ra_submit+0x21
  [] do_sync_mmap_readahead.isra.26+0x94
  [] filemap_fault+0x33e
  [] __do_fault+0x54
  [] pvclock_clocksource_read+0x55
  [] handle_pte_fault+0xfa
  [] xen_pmd_val+0xe
  [] __raw_callee_save_xen_pmd_val+0x11
  [] handle_mm_fault+0x1f8
  [] do_page_fault+0x14e
  [] pvclock_clocksource_read+0x55
  [] xen_clocksource_read+0x20
  [] xen_clocksource_get_cycles+0x9
  [] getnstimeofday+0x57
  [] page_fault+0x25
CPU2
rip: 810013aa hypercall_page+0x3aa
flags: 1202 i nz
rsp: 880f22e32c68
rax:    rcx: 810013aa   rdx: 
rbx:    rsi: 880f22e32c80   rdi: 0003
rbp: 880f22e32cb0r8: 880ec85a3600r9: 880f21c00100
r10: 0001   r11: 0202   r12: 001d
r13: 0001   r14: 880f22e0b901   r15: 880f22e92900
 cs: e033ss: e02bds: 002bes: 002b
 fs:  @ 7f8883dfd700
 gs:  @ 880f22e2f000/
Code (instr addr 810013aa)
cc cc cc cc cc cc cc cc 

[Bug 1011792] Re: Scheduler deadlock running 3.0.0 on multiple EC2 instance types

2012-08-14 Thread Matt Wilson
CPU stack traces from a hi1.4xlarge PV instance running ami-8baa73e2:

CPU 0 is the only running CPU. The others are blocked.

CPU0
rip: 8105711a try_to_wake_up+0xca
flags: 1202 i nz
rsp: 880f22dfc870
rax: 0008   rcx:    rdx: 0002
rbx: 880012ca8000   rsi: 880e6af80048   rdi: 0001
rbp: 880f22dfc8c0r8: r9: 
r10:    r11:    r12: 880012ca85f8
r13:    r14: 880e6af80048   r15: 0003
 cs: e033ss: e02bds: es: 
 fs:  @ 7fd01a7e7700
 gs:  @ 880f22df9000/
Code (instr addr 8105711a)
66 90 45 85 ff 74 0e 44 8b 7d cc e9 a8 00 00 00 0f 1f 00 f3 90 <8b> 7b 28 85 ff 
75 f7 48 8b 13 31


Stack:
 880ec61086c0 8801a1f5dde8 880f22dfc890 000881079a45
  880e6af81648 0001 880ec6058708
   880f22dfc8d0 81057262
 880f22dfc8f0 810813c6 ea0015215a40 880ec60586f0

Call Trace:
  [] try_to_wake_up+0xca  <--
  [] default_wake_function+0x12
  [] autoremove_wake_function+0x16
  [] __wake_up_common+0x58
  [] xen_restore_fl_direct_reloc+0x4
  [] __wake_up+0x48
  [] __freed_request+0x66
  [] freed_request+0x3a
  [] __blk_put_request.part.53+0x63
  [] __blk_put_request+0x50
  [] blk_finish_request+0xb1
  [] __blk_end_request_all+0x4b
  [] blkif_interrupt+0x167
  [] handle_irq_event_percpu+0x55
  [] radix_tree_lookup+0xb
  [] handle_irq_event+0x4b
  [] handle_edge_irq+0x7c
  [] __xen_evtchn_do_upcall+0x199
  [] xen_evtchn_do_upcall+0x2f
  [] xen_do_hypervisor_callback+0x1e
  [] hypercall_page+0x3aa
  [] hypercall_page+0x3aa
  [] xen_poll_irq_timeout+0x3e
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x98
  [] xen_spin_lock_flags+0x63
  [] _raw_spin_lock_irqsave+0x2e
  [] update_shares+0x92
  [] rebalance_domains+0x48
  [] run_rebalance_domains+0x48
  [] __do_softirq+0xa8
  [] __xen_evtchn_do_upcall+0x207
  [] call_softirq+0x1c
  [] do_softirq+0x65
  [] irq_exit+0x8e
  [] xen_evtchn_do_upcall+0x35
  [] xen_do_hypervisor_callback+0x1e
CPU1
rip: 810013aa hypercall_page+0x3aa
flags: 1206 i nz p
rsp: 88004e92dbf0
rax:    rcx: 810013aa   rdx: 
rbx:    rsi: 88004e92dc08   rdi: 0003
rbp: 88004e92dc38r8: 880ec85a3000r9: 880f21c000d0
r10: 0001   r11: 0206   r12: 0017
r13: 0001   r14: 880f22e0b901   r15: 880f22e26900
 cs: e033ss: e02bds: es: 
 fs:  @ 7f8f0c9cf700
 gs:  @ 880f22e14000/
Code (instr addr 810013aa)
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc 
cc cc cc cc cc cc


Stack:
 0049 fffa 8138512e 88004e92dc24
 0001  000a81384b3e 88004e92dc38
 880f22e0b940 88004e92dc48 81386b80 88004e92dc98
 815ecdcd   

Call Trace:
  [] hypercall_page+0x3aa  <--
  [] xen_poll_irq_timeout+0x3e
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x98
  [] xen_spin_lock+0x4a
  [] _raw_spin_lock+0xe
  [] double_rq_lock+0x2c
  [] load_balance+0xfa
  [] _raw_spin_unlock_irqrestore+0x1e
  [] idle_balance+0xab
  [] __schedule+0x6a9
  [] schedule+0x3f
  [] do_nanosleep+0x9c
  [] hrtimer_nanosleep+0xb8
  [] update_rmtp+0x70
  [] hrtimer_start_range_ns+0x14
  [] sys_nanosleep+0x57
  [] system_call_fastpath+0x16
CPU2
rip: 810013aa hypercall_page+0x3aa
flags: 1202 i nz
rsp: 880001ab7bf0
rax:    rcx: 810013aa   rdx: 
rbx:    rsi: 880001ab7c08   rdi: 0003
rbp: 880001ab7c38   r8: 880ec85a3600   r9: 880f21c00100
r10: 0001   r11: 0202   r12: 001d
r13: 0001   r14: 880f22e0b901   r15: 880f22e41900
 cs: e033   ss: e02b   ds:    es:
 fs:  @ 7f51a28f6700
 gs:  @ 880f22e2f000/
Code (instr addr 810013aa)
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc 
cc cc cc cc cc cc


Stack:
 0049 fffa 8138512e 880001ab7c24
 0001  001081384b3e 880001ab7c38
 880f22e0b940 880001ab7c48 81386b80 880001ab7c98
 815ecdcd   

Call Trace:
  [] hypercall_page+0x3aa  <--
  [] xen_poll_irq_timeout+0x3e
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x98
  [] xen_spin_lock+0x4a
  [] _raw_spin_lock+0xe
  [] double_rq_lock+0x2c
  [] load_balance+0xfa
  [] _raw_spin_unlock_irqrestore+0x1e
  [] idle_balance+0xab
  [] __schedule+0x6a9
  [] schedule+0x3f

[Bug 1011792] Re: Scheduler deadlock running 3.0.0 on multiple EC2 instance types

2012-08-13 Thread Matt Wilson
Due to the nature of the issue encountered, we cannot run this command.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1011792

Title:
  Scheduler deadlock running 3.0.0 on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1011792/+subscriptions



[Bug 1011792] Re: Scheduler deadlock running 3.0.0-20-virtual on c1.xlarge EC2 instance

2012-08-13 Thread Matt Wilson
This has been observed on
https://launchpad.net/ubuntu/oneiric/+package/linux-image-3.0.0-17-virtual

** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** Tags added: oneiric

** Summary changed:

- Scheduler deadlock running 3.0.0-20-virtual on c1.xlarge EC2 instance
+ Scheduler deadlock running 3.0.0 on multiple EC2 instance types

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1011792

Title:
  Scheduler deadlock running 3.0.0 on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1011792/+subscriptions



[Bug 1011792] Re: Scheduler deadlock running 3.0.0-20-virtual on c1.xlarge EC2 instance

2012-06-11 Thread Matt Wilson
vCPUs 0, 2 and 3 are stuck waiting on a spinlock. vCPU 1 is running, with
the RIP showing various values inside try_to_wake_up().

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1011792

Title:
  Scheduler deadlock running 3.0.0-20-virtual on c1.xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-lts-backport-oneiric/+bug/1011792/+subscriptions



[Bug 1011792] [NEW] Scheduler deadlock running 3.0.0-20-virtual on c1.xlarge EC2 instance

2012-06-11 Thread Matt Wilson
Public bug reported:

Scheduler deadlocks have been observed on c1.xlarge EC2 instances
running 10.04.3 LTS with the 3.0.0-20-virtual Oneiric backport kernel.
The symptoms appear similar to bug 929941, where multiple CPUs are
waiting on scheduler runqueue locks. But in this case, only a few CPUs
are stuck.

A typical set of stack traces from the guest state looks like:

VCPU0
rip: 810013aa hypercall_page+0x3aa
flags: 1202 i nz
rsp: 8801b3c27910
rax:    rcx: 810013aa   rdx: 8801b3c27954
rbx: 88000265cb30   rsi: 8801b3c27938   rdi: 0003
rbp: 8801b3c27958   r8: 0001   r9: 0001
r10:    r11: 0202   r12: 0011
r13: 0001   r14: 0001   r15: 
 cs: e033   ss: e02b   ds:    es:
 fs:  @ 7f4ce223f700
 gs:  @ 8801bfed4000/

cr0: 80050033
cr2: 0061ade0
cr3: 0e93d000
cr4: 2660

dr0: 
dr1: 
dr2: 
dr3: 
dr6: 0ff0
dr7: 0400
Code (instr addr 810013aa)
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc 
cc cc cc cc cc cc


Stack:
 0246  81394b42 8801b3c27938
  8801b3c27954 0001 
 000481394ad6 8801b3c27968 81394b60 8801b3c279b8
 8100933f 8801b3c27a48  8801b3c27998

Call Trace:
  [] hypercall_page+0x3aa  <--
  [] xen_poll_irq_timeout+0x42
  [] xen_poll_irq+0x10
  [] xen_spin_lock_slow+0x7f
  [] xen_spin_lock_flags+0x75
  [] _raw_spin_lock_irqsave+0x2f
  [] task_rq_lock+0x40
  [] task_sched_runtime+0x29
  [] thread_group_cputime+0x88
  [] apparmor_ptrace_access_check+0x39
  [] thread_group_times+0x33
  [] do_task_stat+0x6d2
  [] _raw_spin_lock+0xe
  [] seq_open+0x4f
  [] sched_autogroup_show+0x70
  [] sched_autogroup_show+0x70
  [] single_open+0x7a
  [] sched_open+0x20
  [] proc_single_open+0x1b
  [] mntput_no_expire+0x60
  [] mntput+0x1d
  [] proc_tgid_stat+0x14
  [] proc_single_show+0x61
  [] seq_read+0xf2
  [] vfs_read+0xc5
  [] sys_read+0x51
  [] system_call_fastpath+0x16


VCPU1
rip: 8105a777 try_to_wake_up+0xd7
flags: 1202 i nz
rsp: 8801bfef28f0
rax: 0003   rcx:    rdx: 0001
rbx: 00012980   rsi: 8801b1990078   rdi: 
rbp: 8801bfef2950   r8:    r9:
r10:    r11: fb981853   r12: 88000265c530
r13:    r14: 88000265cb30   r15: 
 cs: e033   ss: e02b   ds:    es:
 fs:  @ 7ff68926c700
 gs:  @ 8801bfeef000/

cr0: 8005003b
cr2: 00441d80
cr3: 1174cb000
cr4: 2660

dr0: 
dr1: 
dr2: 
dr3: 
dr6: 0ff0
dr7: 0400
Code (instr addr 8105a777)
00 00 eb 0c 66 2e 0f 1f 84 00 00 00 00 00 f3 90 41 8b 54 24 28 <85> d2 75 f5 49 
8b 14 24 31 c0 83


Stack:
  8801b41d2858 8801bfef2950 8153e51e
 00030004 8801b1990078 8801bfef2930 8800026f6c18
 0001 8800026f6c30  
 8801bfef2960 8105a962 8801bfef29b0 81049709

Call Trace:
  [] try_to_wake_up+0xd7  <--
  [] ip_finish_output+0x16e
  [] default_wake_function+0x12
  [] __wake_up_common+0x59
  [] __wake_up_locked+0x18
  [] ep_poll_callback+0xa4
  [] __wake_up_common+0x59
  [] __wake_up_sync_key+0x53
  [] sock_def_readable+0x3e
  [] tcp_rcv_established+0x26a
  [] xen_force_evtchn_callback+0xd
  [] check_events+0x12
  [] tcp_v4_do_rcv+0x125
  [] tcp_v4_rcv+0x5a9
  [] ip_local_deliver_finish+0xdd
  [] ip_local_deliver+0x80
  [] ip_rcv_finish+0x119
  [] ip_rcv+0x228
  [] packet_rcv_spkt+0x4d
  [] __netif_receive_skb+0x1e0
  [] netif_receive_skb+0x80
  [] handle_incoming_queue+0x134
  [] xennet_poll+0x277
  [] net_rx_action+0x108
  [] _raw_spin_lock+0xe
  [] __do_softirq+0xbf
  [] handle_edge_irq+0x9d
  [] call_softirq+0x1c
  [] do_softirq+0x65
  [] irq_exit+0xbd
  [] xen_evtchn_do_upcall+0x35
  [] xen_do_hypervisor_callback+0x1e


VCPU2
rip: 810013aa hypercall_page+0x3aa
flags: 1202 i nz
rsp: 8801bff0da00
rax:    rcx: 810013aa   rdx: 8801bff0da44
rbx: 8800026f6c00   rsi: 8801bff0da28   rdi: 0003
rbp: 8801bff0da48   r8: 00c3   r9: c110
r10: 0010   r11: 0202   r12: 001d
r13: 0001   r14: 0001   r15: 
 cs: e033   ss: e02b   ds:    es:
 fs:  @ 7f408128b700
 gs:  @ 8801bff0a000/

cr0: 80050033
cr2: 0061ade0
cr3: 289f2000
cr4: 2660

dr0: 
dr1: 
dr2: 
dr3: 
dr6: 0ff0
dr7: 0400
Code (instr addr 81001

[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-06-05 Thread Matt Wilson
We've had a customer report a very similar looking lockup on
3.0.0-20-virtual. Full version info, "3.0.0-20-virtual (buildd@yellow)
(gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1) ) #34~lucid1-Ubuntu SMP Wed
May 2 17:24:41 UTC 2012 (Ubuntu 3.0.0-20.34~lucid1-virtual 3.0.30)"

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-13 Thread Matt Wilson
I've never been able to reproduce the problem with synthetic workloads.
I've asked customers that experience the lockup regularly to test the v3
builds in an environment that won't cause production problems, but
haven't received results.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on multiple EC2 instance types

2012-03-12 Thread Matt Wilson
This has also been observed on c1.xlarge, adjusting the summary

** Summary changed:

- Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance
+ Kernel deadlock in scheduler on multiple EC2 instance types

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on multiple EC2 instance types

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-24 Thread Matt Wilson
The required CONFIG_XEN_COMPAT value for ec2 is documented here:
http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/AdvancedUsers.html

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-16 Thread Matt Wilson
$ git clone git://kernel.ubuntu.com/smb/ubuntu-lucid.git
Cloning into ubuntu-lucid...
remote: error: Could not read b43f7c4d8d293aa9f47a7094852ebd5355e4f38f
remote: fatal: Failed to traverse parents of commit 3becab1d2df01d54a4e889cf2d69ccb902cd43c3
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack failed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-15 Thread Matt Wilson
Stefan,

Which commit has the race condition comment? I'm aware of a problem with
SUSE's kernel with regard to PV ticketlocks and HYPERVISOR_poll(), but I
don't see any mention in upstream 3.2.x or XenLinux 2.6.18.

Your 10.04 2.6.32-era kernel doesn't have ticketlocks, so the underlying
hypervisor version should not be a factor. But for the sake of argument,
the lockups are observed on Xen hypervisors newer than 3.2.

What are you using for upstream Xen components for 2.6.32? Is it the
SUSE tree?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.{2, 4}xlarge EC2 instance

2012-02-13 Thread Matt Wilson
** Attachment added: "/proc/interrupts as an attachment"
   
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2736482/+files/proc-interrupts.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-13 Thread Matt Wilson
I also suspect something going sideways in the PV spinlock code, but
nothing has changed in the underlying hardware or hypervisor in this
area. There have been bugs in the PV spinlock code in the past,
including using mb() instead of barrier() in the unlock path, which
could cause the VCPU holding a lock to trigger a kick on the VCPU
waiting before the memory write is complete. I looked at the 10.04
kernel, and this particular bug is already addressed in the PV spinlock
code.

These instances are under load when they hang. Here's the uptime and
/proc/interrupts output from one instance before it hung, but after it
was operational:

Linux ip-10-94-81-231 2.6.32-341-ec2 #42-Ubuntu SMP Tue Dec 6 14:56:13 UTC 2011 x86_64 GNU/Linux
16:10:54 up 16 days, 19:52,  0 users,  load average: 9.86, 5.01, 3.41
            CPU0        CPU1        CPU2        CPU3
 16:   186872780   170473347   170447163   170493692   Dynamic-percpu  timer
 17:   191775644   350788322   357828130   357481319   Dynamic-percpu  resched
 18:       67019       74008       66602       66485   Dynamic-percpu  callfunc
 19:      189590      193987      188670      181119   Dynamic-percpu  call1func
 20:           0           0           0           0   Dynamic-percpu  reboot
 21:   165290618   177938588   177538577   177157514   Dynamic-percpu  spinlock
 22:         410           0           0           0   Dynamic-level   xenbus
 23:           0           0           0           0   Dynamic-level   suspend
 24:         341           0       74180               Dynamic-level   xencons
 25:      392339      664199      899350      700455   Dynamic-level   blkif
 26:    19953668    46164431    58214738    57029478   Dynamic-level   blkif
 27:  1483445834           0           0           0   Dynamic-level   eth0
NMI:           0           0           0           0   Non-maskable interrupts
RES:   191775644   350788323   357828131   357481320   Rescheduling interrupts
CAL:      256609      267995      255272      247604   Function call interrupts

Over the weekend, m2.4xlarge instances hung as well. I'll work on
getting dmesg output.
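
For anyone who wants to watch the same counters on their own instance, here
is a minimal sketch of a helper that pulls single rows out of
/proc/interrupts. The function name irq_rows is made up for illustration,
and it matches only on the last word of each row (so "spinlock" or "eth0"
works, but "Rescheduling interrupts" would have to be matched as
"interrupts").

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): print the rows of an
# interrupts table whose final field equals the given label.
# The file argument defaults to /proc/interrupts.
irq_rows() {
    label=$1
    file=${2:-/proc/interrupts}
    # $NF is the last whitespace-separated field on the line
    awk -v label="$label" '$NF == label' "$file"
}

# Example use on a running guest:
#   irq_rows spinlock
#   irq_rows resched
```

Running it a few minutes apart shows how quickly the spinlock and resched
counters climb on each VCPU while the instance is under load.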


** Summary changed:

- Kernel deadlock in scheduler on m2.2xlarge EC2 instance
+ Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.{2,4}xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-10 Thread Matt Wilson
Overnight an instance running 2.6.32-316 locked up. The stack traces are
attached.

** Attachment added: "stack traces from instance running 2.6.32-316"
   
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2730182/+files/i-804475e2.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.2xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
** Attachment added: "stack traces from instance running 2.6.32-342 (1/1)"
   
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2728703/+files/ubuntu-deadlock-2.6.32-342-1.txt

** Visibility changed to: Private

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.2xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
** Attachment added: "stack traces from instance running 2.6.32-341 (2/2)"
   
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2728702/+files/ubuntu-deadlock-2.6.32-341-2.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.2xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.2xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] Re: Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
** Attachment added: "stack traces from instance running 2.6.32-341 (1/2)"
   
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+attachment/2728701/+files/ubuntu-deadlock-2.6.32-341-1.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.2xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 929941] [NEW] Kernel deadlock in scheduler on m2.2xlarge EC2 instance

2012-02-09 Thread Matt Wilson
Private bug reported:

After running for some indeterminate period of time, the 2.6.32-341-ec2
and 2.6.32-342-ec2 kernels stop responding when running on m2.2xlarge
EC2 instances. No console output is emitted. Stack dumps gathered by
examining CPU context information show that all VCPUs are stuck waiting
on spinlocks. This could be a deadlock in the scheduling code.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-341-ec2 2.6.32-341.42
ProcVersionSignature: User Name 2.6.32-341.42-ec2 2.6.32.49+drm33.21
Uname: Linux 2.6.32-341-ec2 x86_64
Architecture: amd64
Date: Fri Feb 10 01:56:17 2012
Ec2AMI: ami-55dc0b3c
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: m1.xlarge
Ec2Kernel: aki-427d952b
Ec2Ramdisk: unavailable
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-ec2

** Affects: linux-ec2 (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug ec2-images lucid

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/929941

Title:
  Kernel deadlock in scheduler on m2.2xlarge EC2 instance

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/929941/+subscriptions



[Bug 919431] Re: CPU soft lockup in Xen PTE allocation on m2.2xlarge instances

2012-01-20 Thread Matt Wilson
The hypercall fails because the page being pinned still has a writable
mapping. Perhaps the page that's being pinned for PTEs was reused?

One fix that was applied to the upstream kernel for such problems was
this:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=64141da587241301ce8638cc945f8b67853156ec

I don't think that's the cause in this case since XFS isn't in use.
Perhaps some other kernel subsystem is leaving pages behind in the
vmalloc area with write permissions set?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/919431

Title:
  CPU soft lockup in Xen PTE allocation on m2.2xlarge instances

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-meta-ec2/+bug/919431/+subscriptions



[Bug 919431] [NEW] CPU soft lockup in Xen PTE allocation on m2.2xlarge instances

2012-01-20 Thread Matt Wilson
Public bug reported:

The following soft lockup is seen randomly on m2.2xlarge instances in
EC2:

[1284451.875485] BUG: soft lockup - CPU#3 stuck for 61s! [identify:24060]
[1284451.875485] Modules linked in: ipv6 ipt_REJECT ipt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp xt_owner iptable_filter ip_tables x_tables raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear md_mod
[1284451.875485] CPU 3:
[1284451.875485] Modules linked in: ipv6 ipt_REJECT ipt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_tcpudp xt_owner iptable_filter ip_tables x_tables raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear md_mod
[1284451.875485] Pid: 24060, comm: identify Tainted: G      D    2.6.32-316-ec2 #31-Ubuntu
[1284451.875485] RIP: e030:[]  [] 0x810063aa
[1284451.875485] RSP: e02b:8800eba4d930  EFLAGS: 0246
[1284451.875485] RAX:  RBX: 88025b7634c8 RCX: 810063aa
[1284451.875485] RDX: 0019 RSI: 8800eba4d948 RDI: 0003
[1284451.875485] RBP: 8800eba4d968 R08: 88025b763708 R09: 0040
[1284451.875485] R10: 7ff0 R11: 0246 R12: 0015
[1284451.875485] R13: 0045 R14: 88025b7634a8 R15: 0001
[1284451.875485] FS:  7f1a7f487700() GS:880002f35000() knlGS:
[1284451.875485] CS:  e033 DS:  ES:  CR0: 80050033
[1284451.875485] CR2: 7f16bfb0a398 CR3: 01001000 CR4: 2660
[1284451.875485] DR0:  DR1:  DR2:
[1284451.875485] DR3:  DR6: 0ff0 DR7: 0400
[1284451.875485] Call Trace:
[1284451.875485]  [] ? xen_poll_irq+0x7f/0xc0
[1284451.875485]  [] xen_spin_wait+0x84/0x170
[1284451.875485]  [] ? _spin_lock+0x3e/0x60
[1284451.875485]  [] _spin_lock+0x53/0x60
[1284451.875485]  [] _pin_lock+0x28/0x290
[1284451.875485]  [] mm_unpin+0x1f/0x40
[1284451.875485]  [] arch_exit_mmap+0x91/0xa0
[1284451.875485]  [] exit_mmap+0x38/0x1b0
[1284451.875485]  [] mmput+0x2d/0x100
[1284451.875485]  [] exit_mm+0x10d/0x150
[1284451.875485]  [] do_exit+0x132/0x370
[1284451.875485]  [] oops_end+0xa7/0xf0
[1284451.875485]  [] die+0x56/0x90
[1284451.875485]  [] do_trap+0xc4/0x170
[1284451.875485]  [] do_invalid_op+0xb0/0xd0
[1284451.875485]  [] ? do_lN_entry_update+0x174/0x180
[1284451.875485]  [] ? zone_watermark_ok+0x25/0xe0
[1284451.875485]  [] ? prep_new_page+0x11e/0x1c0
[1284451.875485]  [] invalid_op+0x25/0x30
[1284451.875485]  [] ? do_lN_entry_update+0x174/0x180
[1284451.875485]  [] ? do_lN_entry_update+0x11e/0x180
[1284451.875485]  [] ? __alloc_pages_nodemask+0xd6/0x180
[1284451.875485]  [] xen_l2_entry_update+0x13b/0x160
[1284451.875485]  [] __pte_alloc+0x166/0x170
[1284451.875485]  [] handle_mm_fault+0x495/0x4f0
[1284451.875485]  [] do_page_fault+0x147/0x390
[1284451.875485]  [] page_fault+0x28/0x30

The pinning hypercall in do_lN_entry_update() failed, triggering a BUG()
which invokes do_invalid_op(). Unfortunately we don't get the BUG()
message. Instead, we end up deadlocked on the pin lock.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-ec2 2.6.32.341.22
ProcVersionSignature: User Name 2.6.32-341.42-ec2 2.6.32.49+drm33.21
Uname: Linux 2.6.32-341-ec2 x86_64
Architecture: amd64
Date: Fri Jan 20 22:43:24 2012
Ec2AMI: ami-55dc0b3c
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: m2.2xlarge
Ec2Kernel: aki-427d952b
Ec2Ramdisk: unavailable
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-meta-ec2

** Affects: linux-meta-ec2 (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug ec2-images lucid

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/919431

Title:
  CPU soft lockup in Xen PTE allocation on m2.2xlarge instances

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-meta-ec2/+bug/919431/+subscriptions



[Bug 919431] Re: CPU soft lockup in Xen PTE allocation on m2.2xlarge instances

2012-01-20 Thread Matt Wilson
-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/919431

Title:
  CPU soft lockup in Xen PTE allocation on m2.2xlarge instances

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-meta-ec2/+bug/919431/+subscriptions



[Bug 704022] Re: xen_emul_unplug=unnecessary on kernel cmdline is required in ec2 hvm

2012-01-20 Thread Matt Wilson
Stefan,

The ec2 kernels already have xen-netfront and xen-blkfront compiled in.
If xen-platform-pci was also compiled in, or included in the initramfs,
then the HW emulation will be unplugged properly and you'll switch over
to the PV drivers. The following results in PV drivers for the root
volume on my test instance:

add xen-platform-pci to /etc/initramfs-tools/modules
run update-initramfs -u
remove xen_emul_unplug=unnecessary from the kernel boot command line
reboot
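
A rough sketch of the first step as a shell function, in case it helps to
script it. The function name add_initramfs_module is made up here; it only
appends the module name when the file does not already list it, so it is
safe to run twice. The remaining steps are left as comments because the boot
loader config differs between images.

```shell
#!/bin/sh
# Hypothetical helper (not part of initramfs-tools): append a module name
# to a modules file unless a line with exactly that name already exists.
add_initramfs_module() {
    module=$1
    modules_file=$2
    # -x matches whole lines only, -F treats the name as a fixed string
    if ! grep -qxF "$module" "$modules_file" 2>/dev/null; then
        printf '%s\n' "$module" >> "$modules_file"
    fi
}

# Intended use on the instance, as root:
#   add_initramfs_module xen-platform-pci /etc/initramfs-tools/modules
#   update-initramfs -u
#   ...then drop xen_emul_unplug=unnecessary from the kernel command line
#   in the boot loader config and reboot.
```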

See also bug 804219

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/704022

Title:
  xen_emul_unplug=unnecessary on kernel cmdline is required in ec2 hvm

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/704022/+subscriptions



[Bug 634487] Re: t1.micro instance hangs when installing java

2011-06-21 Thread Matt Wilson
I think that the root cause is a corrupted p2m_host[] list via a PV-GRUB
bug. Updated PV-GRUB AKIs are now available. These can be used in us-
east-1 to verify the fix:

32-bit: aki-805ea7e9 
64-bit: aki-825ea7eb

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/634487

Title:
  t1.micro instance hangs when installing java

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-release-notes/+bug/634487/+subscriptions



[Bug 710754] Re: natty kernel does not boot on t1.micro in arch i386

2011-06-02 Thread Matt Wilson
The permanent fix for this is likely in PV-GRUB. See:
https://patchwork.kernel.org/patch/727511/

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/710754

Title:
  natty kernel does not boot on t1.micro in arch i386



[Bug 686692] Re: natty kernel does not boot on ec2 t1.micro

2011-06-02 Thread Matt Wilson
The permanent fix for this is likely in PV-GRUB. See:
https://patchwork.kernel.org/patch/727511/

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/686692

Title:
  natty kernel does not boot on ec2 t1.micro



[Bug 636091] Re: Touchpad stops working when wifi/3G connects

2011-05-03 Thread Matt Wilson
I just installed natty narwhal on my dell 1420 and as soon as I punched
my sudo password to unlock my key to get on the wireless, my touchpad
stopped working.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/636091

Title:
  Touchpad stops working when wifi/3G connects



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-02-02 Thread Matt Wilson
Mike,

You bring up a good point about CFS' need for good process time
accounting. I think that this upstream patch may fix a lot of problems:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=8a22b9996b001c88f2bfb54c6de6a05fc39e177a

This patch is in 2.6.34.7, but it may not be in the 10.04 kernel.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is a direct subscriber.
https://bugs.launchpad.net/bugs/708920

Title:
  Strange 'fork/clone' blocking behavior under high cpu usage on EC2



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-02-01 Thread Matt Wilson
I've done a lot of looking at this today. It feels like the problem may
lie in the process scheduler. When I pin the CPU burning process to CPU0
(through "taskset -pc 0 $pid_printed_by_a_out"), and pin a bash shell
also to CPU0, I see failure of the bash process to wake after sleeping
(i.e., it's runnable, but CFS isn't giving it time). I've seen the bash
process start to be scheduled after around 3 minutes, and I've also seen
it just sit there.

Every time I've seen a scheduler debug trace (triggered via "echo w >
/proc/sysrq-trigger"), there have been other runnable processes on the
spinning CPU that don't seem to be getting scheduled at all.
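
For anyone trying to repeat this, a small sketch that prints the commands
used above rather than running them, so they can be reviewed before being
fed to a root shell. The function name pin_commands is invented for
illustration; the taskset and sysrq lines are the ones quoted in this
comment.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the commands that pin each
# given PID to one CPU and then trigger the sysrq "w" dump.
pin_commands() {
    cpu=$1
    shift
    for pid in "$@"; do
        printf 'taskset -pc %s %s\n' "$cpu" "$pid"
    done
    printf '%s\n' 'echo w > /proc/sysrq-trigger'
}

# Example: pin the spinner (PID passed by the caller) and the current
# shell to CPU0:
#   pin_commands 0 "$spinner_pid" "$$"
```

The printed lines can be piped to a root shell once they look right; the
sysrq output lands in the kernel log.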

I've not been able to reproduce this problem on the kernel used in the
Amazon Linux AMI (currently 2.6.34.7). This is in line with other users'
observations (http://twitter.com/#!/synack/status/30415380321140737).

I think that Canonical might need to look into what (if any) changes
they've made to CFS in the 10.04 kernel tree. It's also possible that
improvements have been made in CFS between 2.6.32 and 2.6.34 that
account for better performance.



[Bug 710754] Re: natty kernel does not boot on t1.micro in arch i386

2011-02-01 Thread Matt Wilson
We use:
CONFIG_PHYSICAL_START=0x100
CONFIG_PHYSICAL_ALIGN=0x100

It sounds like that works for you too?



Re: [Bug 710754] Re: natty kernel does not boot on t1.micro in arch i386

2011-02-01 Thread Matt Wilson
What is CONFIG_PHYSICAL_ALIGN?




On Feb 1, 2011 12:16 PM, Scott Moser wrote:

On Tue, 1 Feb 2011, Matt Wilson wrote:

> Are you using CONFIG_RELOCATABLE=y for your kernels? If so,
> CONFIG_PHYSICAL_START should not be a factor.

$ egrep "(CONFIG_RELOCATABLE|CONFIG_PHYSICAL_START)" /boot/config-2.6.38-1-virtual
CONFIG_PHYSICAL_START=0x100
CONFIG_RELOCATABLE=y
$ uname -a
Linux ip-10-112-14-12 2.6.38-1-virtual #28-Ubuntu SMP Fri Jan 28 18:38:01 UTC 2011 i686 i686 i386 GNU/Linux

--
You received this bug notification because you are a direct subscriber
of the bug.
https://bugs.launchpad.net/bugs/710754

Title:
  natty kernel does not boot on t1.micro in arch i386

Status in “grub” package in Ubuntu:
  New
Status in “linux” package in Ubuntu:
  In Progress

Bug description:
  $ ec2-run-instances --region us-east-1 --instance-type t1.micro --key
  mykey ami-5c3fcf35

  That results in instance that has no console output and is not
  reachable.

  Note, that under bug 686692 the amd64 on t1.micro was fixed.

  ProblemType: Bug
  DistroRelease: Ubuntu 11.04
  Package: linux-image-2.6.38-1-virtual 2.6.38-1.28
  Regression: Yes
  Reproducible: Yes
  ProcVersionSignature: User Name 2.6.38-1.28-virtual 2.6.38-rc2
  Uname: Linux 2.6.38-1-virtual i686
  AlsaDevices:
   total 0
   crw--- 1 root root 116,  1 2011-01-31 16:40 seq
   crw--- 1 root root 116, 33 2011-01-31 16:40 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  Architecture: i386
  ArecordDevices: Error: [Errno 2] No such file or directory
  CurrentDmesg:

  Date: Mon Jan 31 16:45:05 2011
  Ec2AMI: ami-5c3fcf35
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: us-east-1b
  Ec2InstanceType: t1.micro
  # above edited, originally reported on m1.small as t1.micro does not boot
  Ec2Kernel: aki-407d9529
  Ec2Ramdisk: unavailable
  Lspci:

  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  ProcEnviron:
   PATH=(custom, user)
   LANG=en_US.UTF-8
   LC_MESSAGES=en_US.utf8
   SHELL=/bin/bash
  ProcKernelCmdLine: root=LABEL=uec-rootfs ro console=hvc0
  ProcModules: acpiphp 23425 0 - Live 0xedc1
  SourcePackage: linux


[Bug 710754] Re: natty kernel does not boot on t1.micro in arch i386

2011-02-01 Thread Matt Wilson
Are you using CONFIG_RELOCATABLE=y for your kernels? If so,
CONFIG_PHYSICAL_START should not be a factor.
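
One way to answer this on a running instance is to read the options out of the installed kernel config. A minimal sketch, assuming the config lives at /boot/config-<release> as it does on Ubuntu; kconfig_value is a hypothetical helper name.

```shell
#!/bin/sh
# kconfig_value FILE OPTION: print the value of OPTION from a kernel config file.
# Prints nothing and returns 1 if the option has no "OPTION=value" line.
kconfig_value() {
    sed -n "s/^$2=//p" "$1" | grep .
}

# Example:
#   kconfig_value "/boot/config-$(uname -r)" CONFIG_RELOCATABLE
#   kconfig_value "/boot/config-$(uname -r)" CONFIG_PHYSICAL_START
```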



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-31 Thread Matt Wilson
If anyone has a machine that they can get into the hanging state (with
fork() blocking), can you run "echo w > /proc/sysrq-trigger" as root
and post the results?
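
For anyone collecting this, a sketch of a helper that sends the key and reads the result back from the kernel log. sysrq_dump is a hypothetical name, and the optional second argument is an assumption added only so the function can be tried against an ordinary file.

```shell
#!/bin/sh
# sysrq_dump KEY [TRIGGER]: send one SysRq command key, then show recent kernel log.
# TRIGGER defaults to /proc/sysrq-trigger; the override exists only for dry runs.
sysrq_dump() {
    key=$1
    trigger=${2:-/proc/sysrq-trigger}
    printf '%s\n' "$key" > "$trigger" || return 1
    # The dump is written to the kernel log, so read it back from there.
    dmesg 2>/dev/null | tail -n 200
    return 0
}

# Example (as root): sysrq_dump w > blocked-tasks.txt
```

The "w" key dumps tasks in the blocked (uninterruptible) state; "t" dumps all tasks and is much noisier.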



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-31 Thread Matt Wilson
Alec,

Do you have instance IDs from your hanging instances?



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-31 Thread Matt Wilson
Jordan,

Do you see this behavior at boot, or only after your instance has been
up and running for a while?



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-31 Thread Matt Wilson
Alec,

Do any hung task kernel stack traces get emitted during your hangs?



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-30 Thread Matt Wilson
If you have an instance in a state where fork() will hang if you spin a
CPU, it would be a good experiment to see if irqbalance helps at all.



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-30 Thread Matt Wilson
Hi Mike,

Let's focus on the fork() hangs in this bug. It's true that the two
could be related, but the symptoms don't quite line up.

You say you can reproduce the behavior on 2.6.32-311. Do you have a
procedure for getting an instance into the broken state, so you can then
cause fork() hangs with spinning CPUs?

Also, is irqbalance running on your instances? If not, does the fork()
hanging behavior change if irqbalance is running on the instance
(started at boot)?
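
A quick way to check the first question, as a sketch: it assumes pgrep from procps is installed and that the daemon's process name is exactly "irqbalance".

```shell
#!/bin/sh
# irqbalance_status: report whether a process named exactly "irqbalance" exists.
# If pgrep is missing, this also reports "not running".
irqbalance_status() {
    if pgrep -x irqbalance >/dev/null 2>&1; then
        echo "running"
    else
        echo "not running"
    fi
}

irqbalance_status
```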



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
Mike, can you click on the "affects me" for this bug?



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
Gavin,

Can you reproduce the issue at will? I'm struggling to find a way to
reproduce the issue on a freshly booted instance.



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
The following kernel stack was captured on a system in "fork() hangs"
state via "echo t > /proc/sysrq-trigger". The code for libctest is here:
https://gist.github.com/2d2b78987ea451c2edd6

<6>[853486.204130] libctest  R  running task0 13658   1417 0x
<4>[853486.204132]    88088b03a980 8800e3489ea8
<4>[853486.204134]  810909f7 88088b03aa48 88088b03a980 8800e3489ec8
<4>[853486.204137]  81109f04 880889a27c80 8002 8804d4470680
<4>[853486.204139] Call Trace:
<4>[853486.204141]  [] ? __call_rcu+0x77/0x1c0
<4>[853486.204143]  [] ? mntput_no_expire+0x24/0x110
<4>[853486.204145]  [] ? alloc_fd+0x4b/0x160
<4>[853486.204148]  [] ? putname+0x30/0x50
<4>[853486.204150]  [] ? do_sys_open+0x106/0x160
<4>[853486.204152]  [] ? sys_open+0x1b/0x20
<4>[853486.204154]  [] ? system_call_fastpath+0x16/0x1b
<4>[853486.204157]  [] ? system_call+0x0/0x52
<4>[853486.204159]  [] ? system_call+0x0/0x52



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
Some discussions on this are at http://twitter.com/#!/mjmalone

Video posted by
http://twitter.com/#!/jordansissel/status/30421571315175425



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
Attaching /proc/slabinfo from a system that can be used to cause fork()
hangs.

** Attachment added: "/proc/slabinfo from a sick instance"
   https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/708920/+attachment/1811279/+files/slabinfo.txt



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
It seems that this reproduction case only happens after the system has
been used for some unknown amount of time. At that point, fork() hangs
can be triggered at will. If the instance is rebooted, the test case no
longer causes hangs.



[Bug 708920] [NEW] Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
Private bug reported:

There have been reports of fork() hangs on Lucid when running on EC2.
See this YouTube video for an example:
http://www.youtube.com/watch?v=rbURfuAmtXw

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-305-ec2 2.6.32-305.9
ProcVersionSignature: User Name 2.6.32-305.9-ec2 2.6.32.11+drm33.2
Uname: Linux 2.6.32-305-ec2 x86_64
Architecture: amd64
Date: Thu Jan 27 22:25:15 2011
Ec2AMI: ami-fd4aa494
Ec2AMIManifest: ubuntu-images-us/ubuntu-lucid-10.04-amd64-server-20100427.1.manifest.xml
Ec2AvailabilityZone: us-east-1c
Ec2InstanceType: m1.xlarge
Ec2Kernel: aki-0b4aa462
Ec2Ramdisk: unavailable
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-ec2

** Affects: linux-ec2 (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug ec2-images lucid



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
On a system in this condition, sometimes hung task traces are seen:


kernel: [65098.694112] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [65098.694117] cron D 880001885380 0 21248 569 0x
kernel: [65098.694121] 880772e25d20 0282 0001 880772e25ca0
kernel: [65098.694124] 0002 880772e25ce8 8802e100c678 880772e25fd8
kernel: [65098.694126] 8802e100c2c0 8802e100c2c0 8802e100c2c0 880772e25fd8
kernel: [65098.694128] Call Trace:
kernel: [65098.694137] [] ? cache_alloc_refill+0x6a/0x270
kernel: [65098.694141] [] schedule_timeout+0x1dd/0x2c0
kernel: [65098.694145] [] wait_for_common+0xf2/0x1d0
kernel: [65098.694150] [] ? default_wake_function+0x0/0x10
kernel: [65098.694152] [] wait_for_completion+0x18/0x20
kernel: [65098.694156] [] do_fork+0x149/0x430
kernel: [65098.694160] [] ? mntput_no_expire+0x24/0x110
kernel: [65098.694164] [] ? set_one_prio+0x70/0xd0
kernel: [65098.694168] [] sys_vfork+0x20/0x30
kernel: [65098.694171] [] stub_vfork+0x13/0x20
kernel: [65098.694174] [] ? system_call_fastpath+0x16/0x1b



[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson
This is a transcription of the test program from the youtube video:


#include 
#include 
#include 

int main(int argc, char **argv) {
  int children = 0;
  int status;
  int i = 0;

  if (argc < 2) {
    printf("Usage: %s 


[Bug 708920] Re: Strange 'fork/clone' blocking behavior under high cpu usage on EC2

2011-01-27 Thread Matt Wilson




[Bug 664708] Re: package alsa-utils 1.0.23-2ubuntu3.3 failed to install/upgrade: subprocess installed post-installation script returned error exit status 1

2010-10-21 Thread matt wilson
*** This bug is a duplicate of bug 664645 ***
https://bugs.launchpad.net/bugs/664645



[Bug 664708] [NEW] package alsa-utils 1.0.23-2ubuntu3.3 failed to install/upgrade: subprocess installed post-installation script returned error exit status 1

2010-10-21 Thread matt wilson
*** This bug is a duplicate of bug 664645 ***
https://bugs.launchpad.net/bugs/664645

Public bug reported:

Binary package hint: alsa-utils

trying to install guarddog firewall from software centre

ProblemType: Package
DistroRelease: Ubuntu 10.10
Package: alsa-utils 1.0.23-2ubuntu3.3
ProcVersionSignature: Ubuntu 2.6.35-22.35-generic 2.6.35.4
Uname: Linux 2.6.35-22-generic i686
Architecture: i386
Date: Thu Oct 21 20:32:06 2010
ErrorMessage: subprocess installed post-installation script returned error exit status 1
InstallationMedia: Ubuntu-Netbook 10.04 "Lucid Lynx" - Release i386 (20100429.4)
SourcePackage: alsa-utils
Title: package alsa-utils 1.0.23-2ubuntu3.3 failed to install/upgrade: subprocess installed post-installation script returned error exit status 1

** Affects: alsa-utils (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: apport-package i386 maverick ubuntu-une



[Bug 415667] Re: libx11-data: compose ellipsis problem

2009-11-10 Thread Matt Wilson
I can confirm that this still exists in Karmic Koala.



[Bug 415667] Re: libx11-data: compose ellipsis problem

2009-10-04 Thread Matt Wilson

** Attachment added: "Xorg old"
   http://launchpadlibrarian.net/32972005/Xorg.0.log.old



[Bug 415667] Re: libx11-data: compose ellipsis problem

2009-10-04 Thread Matt Wilson

** Attachment added: "Xorg log"
   http://launchpadlibrarian.net/32971969/Xorg.0.log



[Bug 415667] Re: libx11-data: compose ellipsis problem

2009-10-04 Thread Matt Wilson

** Attachment added: "lspci -vvnn output"
   http://launchpadlibrarian.net/32971888/lspci-vvnn.log



[Bug 415667] Re: libx11-data: compose ellipsis problem

2009-10-04 Thread Matt Wilson
I'm seeing this too, after upgrading 8.10 to 9.04. It happens in Pidgin,
Firefox and gnome-terminal, but NOT for some reason in xterm or urxvt.
My locale is en_NZ.UTF-8. I'll attach the requested logs.



[Bug 227595] Re: Package Nagios3 and plugins

2008-05-14 Thread Matt Wilson
I'm not sure where this should go either, but I agree that it would be
really nice to have nagios3 packaged up.

Maybe I should take a try at it.



[Bug 38538] Re: man pages suggest info pages that don't exist.

2008-02-25 Thread Matt Wilson
I just discovered this bug when I read the man page for mkfifo and then
tried to read the info for mkfifo.

I followed foolishchild's advice:

cd /usr/share/info
sudo gunzip coreutils.info.gz
sudo vim coreutils.info

comment out (delete?) the first "END-INFO-DIR-ENTRY" and
second "START-INFO-DIR-ENTRY" lines.
I used "##" as comment strings.

sudo gzip coreutils.info
sudo install-info --debug --infodir=/usr/share/info -- coreutils.info

And now the problem is solved, at least for my box.
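
The manual edit in vim could also be scripted. This is only a sketch of the same change: comment_dir_markers is a hypothetical name, and it assumes both marker strings sit at the start of their lines in coreutils.info.

```shell
#!/bin/sh
# comment_dir_markers FILE: print FILE with the first END-INFO-DIR-ENTRY line
# and the second START-INFO-DIR-ENTRY line prefixed by "##".
comment_dir_markers() {
    awk '
        /^END-INFO-DIR-ENTRY/   { if (++ends == 1)   $0 = "##" $0 }
        /^START-INFO-DIR-ENTRY/ { if (++starts == 2) $0 = "##" $0 }
        { print }
    ' "$1"
}

# Example, between the gunzip and gzip steps:
#   comment_dir_markers coreutils.info > /tmp/coreutils.info.new
#   sudo cp /tmp/coreutils.info.new coreutils.info
```

The function writes to standard output instead of editing in place, so the result can be inspected before it replaces the original file.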

Should we consider putting this workaround in place for now, while we
wait for the better fix?



[Bug 187356] Re: /etc/init.d/kannel depends on nonexistant /var/run/kannel directory

2008-02-08 Thread Matt Wilson
Yeah, I think my way will work fine.  And if not, we'll hear about it
quickly!

:)



[Bug 187356] Re: /etc/init.d/kannel depends on nonexistant /var/run/kannel directory

2008-02-07 Thread Matt Wilson
Hi David,

Do we need to fix your problems before we fix my problem?

By the way, I've been poking around to see how other init scripts deal
with the fact that /var/run is flushed after every reboot.

This is how /etc/init.d/klogd makes sure that it has a subdirectory in
/var/run:

case "$1" in
  start)
    log_begin_msg "Starting kernel log daemon..."
    # create klog-writeable pid and fifo directory
    mkdir -p /var/run/klogd
    chown klog:klog /var/run/klogd
    mkfifo -m 700 $kmsgpipe
    chown klog:klog $kmsgpipe


And this is how /etc/init.d/slony1 does it:

prepare_start() {
    mkdir -p /var/run/slony1 \
      && chown postgres:postgres /var/run/slony1/ \
      && chmod 2775 /var/run/slony1/
}

So, it does not look like a standard method exists.  I would like to get
some really good sysadmins to help us with this.  There really should be
a standard way to do this.
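
As a starting point for discussion, both snippets reduce to "create the directory, set owner and group, set mode", which "install -d" from coreutils does in one call. A sketch only: ensure_run_dir is a hypothetical name, not an existing convention.

```shell
#!/bin/sh
# ensure_run_dir DIR USER GROUP MODE: create DIR (and any missing parents),
# then set its owner, group and mode. Needs root when USER is another user.
ensure_run_dir() {
    install -d -o "$2" -g "$3" -m "$4" "$1"
}

# Examples, mirroring the two init scripts above:
#   ensure_run_dir /var/run/slony1 postgres postgres 2775
#   ensure_run_dir /var/run/klogd klog klog 755    # mode assumed, klogd sets none
```

Calling it again on an existing directory just reapplies the ownership and mode, so it is safe to run on every start.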

Matt



[Bug 187356] [NEW] /etc/init.d/kannel depends on nonexistant /var/run/kannel directory

2008-01-30 Thread Matt Wilson
Public bug reported:

Binary package hint: kannel

The /etc/init.d/kannel script tries to put files inside /var/run/kannel.
That directory doesn't exist at startup, because /var/run gets erased on every reboot.

I suggest adding something like this into the /etc/init.d/kannel
script:

# Create the PIDFILES dir if it doesn't exist.
test ! -d $PIDFILES && mkdir $PIDFILES && chown kannel $PIDFILES

This is how I have fixed the problem on my boxes, anyway.
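
A slightly more defensive variant of the same guard, as a sketch: the variable is quoted, and mkdir -p is used so missing parents are not an error. make_pid_dir is a hypothetical name; $PIDFILES is the variable from the init script.

```shell
#!/bin/sh
# make_pid_dir DIR OWNER: create DIR if it is missing, then hand it to OWNER.
# Does nothing when DIR already exists.
make_pid_dir() {
    [ -d "$1" ] || { mkdir -p "$1" && chown "$2" "$1"; }
}

# In the init script this would be called as:
#   make_pid_dir "$PIDFILES" kannel
```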

** Affects: kannel (Ubuntu)
 Importance: Undecided
 Status: New
