[Bug 999755] Re: Kernel crash in rb_next doing ohai loops

2012-07-19 Thread Gavin Heavyside
We think this is an interplay between the kernel bug, ohai, and Ruby 1.9.3. I was unable to reproduce the crash using the Opscode omnibus installer (http://www.opscode.com/blog/2012/06/29/omnibus-chef-packaging/), which uses a bundled Ruby 1.9.2, so you could potentially try that as a

[Bug 999755] Re: Kernel crash on EC2 & VirtualBox

2012-05-28 Thread Gavin Heavyside
And another one from the debug kernel on EC2, with a slightly different call stack: [ 4389.480352] [ cut here ] [ 4389.480884] kernel BUG at /home/smb/precise-amd64/ubuntu-2.6/kernel/sched_fair.c:1239! [ 4389.480894] invalid opcode: [#1] SMP [ 4389.480902] CPU 0 [ 4

[Bug 999755] Re: Kernel crash on EC2 & VirtualBox

2012-05-28 Thread Gavin Heavyside
And we've just reproduced on EC2 with the debug kernel: [248587.286290] [ cut here ] [248587.286765] kernel BUG at /home/smb/precise-amd64/ubuntu-2.6/kernel/sched_fair.c:1239! [248587.286775] invalid opcode: [#1] SMP [248587.286783] CPU 0 [248587.286786] Modules lin

[Bug 999755] Re: Kernel crash on EC2 & VirtualBox

2012-05-28 Thread Gavin Heavyside
We've got small EC2 instances (single processor) that haven't exhibited this behaviour, but we get it with large EC2 instances (2 CPUs); the VirtualBox machine I just reproduced it with was specifically set to 2 CPUs. It seems to me that this bug might only occur on multi-CPU boxes? ** Summary ch

[Bug 999755] Re: Kernel crash on EC2 m1.large instances

2012-05-28 Thread Gavin Heavyside
I've just reproduced this crash using the stock 3.2.0-24-39 kernel on VirtualBox on OS X (Lion). I created a 2-CPU VM using the latest VirtualBox (4.1.16 r78094), for Ubuntu 64-bit, default 8GB disk. The steps I followed were: * Install 64-bit 12.04 Server LTS, minimal install from ISO downloaded

[Bug 999755] Re: Kernel crash on EC2 m1.large instances

2012-05-24 Thread Gavin Heavyside
We've also seen this on the -24.38 and -24.39 kernels now: [56843.390534] BUG: unable to handle kernel NULL pointer dereference at 0010 [56843.390551] IP: [] rb_next+0x1/0x50 [56843.390566] PGD 1d20a7067 PUD 1d29a2067 PMD 0 [56843.390575] Oops: [#1] SMP [56843.390583] CPU 1 [5

[Bug 999755] Re: Kernel crash on EC2 m1.large instances

2012-05-20 Thread Gavin Heavyside
Triggered this again by running ohai in a continuous loop; it took about 24 hours to occur: [18438803.627371] BUG: unable to handle kernel NULL pointer dereference at 0010 [18438803.627388] IP: [] rb_next+0x1/0x50 [18438803.627402] PGD 1d0efa067 PUD 1d232d067 PMD 0 [18438803.627411] Oo

[Bug 999755] Re: Kernel crash on EC2 m1.large instances

2012-05-19 Thread Gavin Heavyside
BTW Xen version from dmesg is: Xen version: 3.4.3-2.6.18 (preserve-AD) This is on EC2 so we have no control over this. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/999755 Title: Kernel crash on

[Bug 999755] Re: Kernel crash on EC2 m1.large instances

2012-05-19 Thread Gavin Heavyside
I've reproduced this by running the ohai command from the Opscode Chef ohai gem (0.6.12) in a loop, although it took nearly 2 days before it triggered. Basically I ran `gem install ohai; while true; do ohai; done` in a screen session. The stack trace is: [18362917.357055] BUG: unable to handle k