Closing this bug with Won't fix as this kernel / release is no longer supported.
Please feel free to open a new bug report if you're still experiencing this on
a newer release (Bionic 18.04.3 / Disco 19.04).
Thanks!
** Changed in: linux (Ubuntu)
Status: Confirmed => Won't Fix
--
You recei
Hi,
We just completed an upgrade to precise across our instances and it
looks like the issue is still persisting on kernel 3.2.0-61-virtual.
I have only seen this on Amazon's m1.large instances so far. I've attached
a new stack trace.
** Attachment added: "precise_crash.txt"
https://bugs.launchp
I am sorry, I unfortunately got distracted by trying to finish a feature for
the next release. And I must admit that right now I have no good idea how to
proceed. The pages that got dumped show, at least to me, no pattern that points
to a certain process. You might be in a better position there sinc
We've had a few more panics, but the page has been empty a few of the times
it was printed out. Is it helpful to post any more traces, or is there
any other information that would be useful to gather for debugging?
Ah yeah. Well, maybe it is not the only way to make it happen, but it is a
rather successful one. I would really love to find anything that
allows me to reproduce the problem on a local host. So I grasp at any straw
that looks promising.
We're using https://github.com/ariya/phantomjs/tree/1.7 . The recent
traces are just from machines that are running PhantomJS. We have been
seeing crashes on other servers without PhantomJS, but I only have the
kernel you compiled for us running on those servers since they crash the
most frequent
So the first one did not show any immediately obvious hint. And I think
the lockup that was posted in comment #52 is a completely different
issue (I am also wondering about the kernel version in there; is that a
mainline kernel?). Anyway, that rather seems to be a bug which I thought
we had a patch
Just saw a crash on kernel 3.2.46.
Here's the attached console output.
** Attachment added: "kernel_3_2.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3730533/+files/kernel_3_2.txt
One crash from this weekend
** Attachment added: "linkworker01_bad_page_20130707.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3729233/+files/linkworker01_bad_page_20130707.txt
And another crash from this weekend.
There was a third, but the memory page that it dumped contains some
non-public information, so I can't post it here.
** Attachment added: "linkworker02_bad_page_20130706.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3729234
Yeah, that documentation is part of paramiko, which is imported in a
shared Python library that some code on this particular server uses (but
does not make use of). PhantomJS is our own build, but it's also not
installed or running on other instances where we've seen this issue.
Hopefully (odd t
Hm, so that middle part looks a bit like Python documentation. Could it be
part of phantomjs? Btw, for Lucid/10.04, how is phantomjs obtained? At
least it is not a separate package as of Precise/12.04 and later.
I wonder whether any part of that (or something else which is added to the
Ok got some more information now.
** Attachment added: "linkworker01_bad_page_20130702.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3722260/+files/linkworker01_bad_page_20130702.txt
Thanks and sorry, yeah, the dump would be on the console if I had not
messed up the conversion between the reported struct page and the memory
I try to read from. So what you saw is basically the dump function
crashing because it accesses the wrong place. I hope I got it right
this time an
Is it supposed to dump the contents to the console? Had 2 crashes this
weekend, attached are the stack traces but I don't really see anything
different.
** Attachment added: "linkworker03_crash.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3720432/+files/linkwo
debug kernel stack trace
** Attachment added: "linkworker01_crash.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3720433/+files/linkworker01_crash.txt
That is unfortunate news. Right now I can only think of a rather desperate
approach. I added a 64bit dbg1 kernel to the same location as from
comment #41. That one hopefully works (I am not really able to test it). If it
works as expected it will dump the memory contents of the page that
appears bad on the free
Unfortunately, I just saw a panic on those newer kernels as well.
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1178707
Title:
Kernel Panics - ec2
To manage notifications about this bug go to:
https://
Not really anything substantial, but recently there was a new upstream
stable release for 2.6.32 which had some mm updates and also a few
places claiming to fix memory leaks. As it is still unclear what causes
the problems it would be good to install that updated kernel into at
least one affected
Also not sure if this is helpful, but here's an output of "sysctl -a".
** Attachment added: "sysctl.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3708175/+files/sysctl.txt
I tested this on a local Lucid PVM and tracing was available there.
Maybe debugfs is not mounted by default on EC2? For 2.6.32 it was
anon_vma_unlink. But it probably does not matter that much which kernel.
It is more about getting a feeling for how much relative activity processes generate.
I guess I need to do a bit more
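Whether debugfs is mounted can be checked from user space. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, options, ...); the helper name `debugfs_mountpoints` is made up for illustration and is not something used in this bug:

```python
def debugfs_mountpoints(mounts_text):
    """Return the mount points of all debugfs entries in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed field order: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] == "debugfs":
            points.append(fields[1])
    return points


# Hypothetical sample content, for illustration only.
sample_mounts = (
    "/dev/xvda1 / ext4 rw,relatime 0 0\n"
    "none /sys/kernel/debug debugfs rw,relatime 0 0\n"
)
print(debugfs_mountpoints(sample_mounts))
```

On a real instance the input would be the contents of /proc/mounts. An empty result would suggest mounting it by hand, typically with `mount -t debugfs none /sys/kernel/debug`, before looking for the tracing directory.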
It doesn't look like dynamic ftracing is available in the 2.6.32 kernels
we are running, only in the 3.x kernels. I assume you meant the
unlink_anon_vmas function?
There's a lot of output, so it's really hard to discern much from it. We
have phantomjs running on one of the servers experiencing the cra
The irqbalance problem on Xen.org sounds like the daemon crashing (which
is not the case here). In the Red Hat bug report it feels like people use
crash when they mean hang. I remember there were some requests about
backporting interrupt related patches. But due to the differences in the
EC2 kernels
Stefan,
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=430 , granted
that is very old and
https://bugzilla.redhat.com/show_bug.cgi?id=550724#c81
I also found this related bug that seems to be having similar crashes to
ours reported by an Amazon engineer
https://bugs.launchpad.net/ubuntu/
Maybe you have pointers to those reports about irqbalance? I am not really
sure what could be monitored to find more information. I went back and
looked at all the bad page error messages and one thing that all of them
seem to have in common is that there is a page->mapping set which has
bit 0 set. And
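For context, bit 0 of page->mapping is the kernel's PAGE_MAPPING_ANON flag: when it is set, the rest of the value points to an anon_vma instead of an address_space. A rough sketch for checking that bit across collected traces, assuming the bad page line carries a `mapping:<hex>` field (the message layout differs between kernel versions, and the sample addresses below are invented):

```python
import re

PAGE_MAPPING_ANON = 0x1  # bit 0 of page->mapping marks an anonymous page

# Assumed field layout; adjust the pattern to the actual console output.
MAPPING_RE = re.compile(r"mapping:\s*(?:0x)?([0-9a-fA-F]+)")


def anon_mappings(log_text):
    """Yield (mapping_value, is_anon) for each hex mapping field in log_text."""
    for match in MAPPING_RE.finditer(log_text):
        value = int(match.group(1), 16)
        yield value, bool(value & PAGE_MAPPING_ANON)


# Invented sample lines, for illustration only.
sample = (
    "page:ffffea0001234560 flags:0100000000000014 count:0 mapcount:0 "
    "mapping:ffff8800deadbee1 index:0x7f\n"
    "page:ffffea0001234580 flags:0100000000000000 count:0 mapcount:0 "
    "mapping:(null) index:0x0\n"
)
for value, is_anon in anon_mappings(sample):
    print(hex(value), "anon" if is_anon else "not anon")
```

Lines whose mapping is printed as `(null)` are skipped by the pattern, so only pages with a leftover mapping pointer are reported.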
Yeah, I had read some bug reports about instability with irqbalance on
Xen, but I'm just grasping at straws.
The software and configurations are identical on the m1.large and
m2.xlarge for this class of servers.
Are there any particular values I could graph and start monitoring to
see if network IO
Hard to imagine how dynamically pinning IRQ handlers to certain CPUs
would make a difference. But who knows. If the description of instance
types is correct the main differences between the two instance types
would be that m2 has more memory (7.5GB / 17.1 GB) but has only one
420GB virtual drive, w
@Stefan,
One interesting thing is we are seeing the crashes on m1.larges of a
certain server type, but that same type running on m2.xlarge has not
seen any crashes. We see the same network and IO patterns in both cases but
no crashes on the larger instance type.
I disabled irqbalancer on one group
That "kernel BUG" probably does not mean that much. Given there seems to
be at least one (but likely more) page on the free list which is not
really released, this will result in more and more fallout. Is it
possible to elaborate more on disk and network setup (at least anything
that differs to a
Here's another backtrace from today. This occurred on a c1.medium, but
the backtrace actually contained a mention of "kernel BUG".
** Attachment added: "i-deb173a1_20130516.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3678351/+files/i-deb173a1_20130516.txt
One common trait these instances share is that they are heavy on network I/O.
Instances of larger sizes with the same network I/O seem to be stable.
Some have sustained bandwidth of 4MB/sec in/out with packet rates of up to
30k/sec.
Oh, right, I forgot that the version string came later. But since the
symptom is distributed over such a variety of availability zones and
even different instance types, it seems rather unlikely to be related to
something on the host.
Unfortunately the kernel messages that are seen only tell us so
Adding an additional backtrace from this morning on Kernel 3
** Attachment added: "kernel_3_20131505.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3677408/+files/kernel_3_20131505.txt
No NFS is involved. All the mounts are ephemeral storage.
Instance types seemed to be isolated to m1.large and c1.xlarge so far.
We have the same configuration running on m2.xlarge that we have for
some m1.larges and have not seen crashes there (but I wouldn't rule out
since we didn't start diggin
And just for reference (I had the feeling there was something similar
before): bug 1007082 has a comment #36 that claims it was related there
to fsc on NFS. Is that involved here as well?
Looking at the dmesg snippets of the various kernels there seem to be multiple
pages that have that bad page state. The locations seem random (maybe
visualizing may yield some pattern). It happens the same with the ec2 kernel
and the virtual flavour which actually are very different in the Xen c
@Joseph is there any additional information I can provide to help the
debugging?
I've only tested this on 10.04 images. It would be a bit difficult to
try on a newer release given the software dependencies we currently have.
Does this only happen on the 10.04 images? Have you also tested other
releases?
** Changed in: linux (Ubuntu)
Importance: Undecided => High
** Tags added: kernel-da-key
apport information
** Tags added: apport-collected
** Description changed:
Kernel Versions affected: 2.6.32-346-ec2 #51-Ubuntu
2.6.32-309-ec2 #18-Ubuntu SMP
3.0.0-32-virtual
#51~lucid1-Ubuntu
The instances' workloads range from
- an nginx proxy server, just proxies connections to different backends
running in-memory database
avg cpu: 30%
- a server running an in-house in-memory database, taking connections from the
nginx proxy servers
avg cpu: 20%
- queue worker servers
avg cpu:
** Attachment added: "console_output_kernel_3-b.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3672123/+files/console_output_kernel_3-b.txt
** Attachment added: "console_output_kernel_2_6_32_309.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3672119/+files/console_output_kernel_2_6_32_309.txt
** Attachment added: "console_output_kernel_3.txt"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3672116/+files/console_output_kernel_3.txt
** Attachment added: "apport.linux-image-3.0.0-32-virtual.y8mP6P.apport"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3672113/+files/apport.linux-image-3.0.0-32-virtual.y8mP6P.apport
** Attachment added: "apport.linux-image-2.6.32-351-ec2.xiDdHy.apport"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3672110/+files/apport.linux-image-2.6.32-351-ec2.xiDdHy.apport
** Attachment added: "apport.linux-image-2.6.32-346-ec2.iIBV_c.apport"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1178707/+attachment/3672107/+files/apport.linux-image-2.6.32-346-ec2.iIBV_c.apport