My testing on the 3.9 kernel has been underway since the note above.  It
has surpassed 11 days of running the loads from the attached scripts, and
even higher.  The previous 3.2 and 3.5 kernel tests never exceeded 4.5
days before hanging solidly, and usually lasted less.

So, at the very least, the 3.9 kernel appears to be considerably more
robust, since I could not cause it to hang solidly as I could in my 2.6
and 3.2/3.5 kernel testbeds.  It would be good to see 3.9 backported to
Precise for supported usage on our deployed 12.04 systems.  I will also
write another bug for the 2.6 systems that are suffering the most, so
that perhaps something can be done there as well.  BUT.....

... I could not tag this bug as either "kernel-bug-exists-upstream" or
"kernel-fixed-upstream" because, while the "solid hang/failure" symptom
*is* fixed in the upstream kernel, we *still* experienced the same hangs,
but lasting only 5-10 minutes per event, through at least the latter half
of the 3.9 kernel testing.  I had no way to measure these hangs other than
my own observations at my testing consoles; my impression is that they
occurred a couple of times a day.  I first noticed them a few days into
the test, and cannot say for sure whether they were there from the
beginning.  To most network operations folks, 5-10 minutes of outage
from our servers would look the same as a permanent solid hang: one
can't have customers twiddling their thumbs for that long in the middle
of a transaction.

I believe these "transient" hangs also occurred in my 3.2/3.5 testing,
but I didn't time them since I was most concerned about the solid
hang/failure.  When any of the kernels, including this 3.9 test, hangs
like this, I can see that all CPUs are 100% busy, and I presume it's the
same symptom I reported earlier in my kdb session attachment: the
constant rescheduling of all processes for another page.  I did not
break in with kdb to confirm that in this round of testing, because I
didn't want to risk disrupting the longer-term survival testing that was
my primary goal.  I can confirm that pings were still answered during
these hangs and that the serial console remained unresponsive for the
5-10 minutes of each hang.
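
Since the transient hangs were only timed by eye, one possible way to
record them automatically is a heartbeat process that notes when its own
wake-ups arrive late.  The following is only a minimal sketch and was not
part of the test scripts attached to this bug; the names find_stalls and
monitor, and the 1-second interval and 5-second threshold, are
illustrative assumptions.  It relies on the monitor itself being starved
during a hang, so the stall shows up as a gap between heartbeats.

```python
import time


def find_stalls(timestamps, interval, threshold):
    """Return (start, extra_delay) pairs for late heartbeats.

    A heartbeat counts as late when the gap to the previous one exceeds
    interval + threshold.  extra_delay is the gap minus the expected
    interval, i.e. roughly how long the process was stalled.
    """
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + threshold:
            stalls.append((prev, gap - interval))
    return stalls


def monitor(interval=1.0, threshold=5.0, beats=10,
            clock=time.monotonic, sleep=time.sleep):
    """Collect beats heartbeats and report any stalls between them.

    Timestamps are kept in memory rather than written to disk on every
    beat, because disk writes may themselves block under memory pressure.
    clock and sleep are parameters only so the loop can be exercised
    without waiting in real time.
    """
    stamps = [clock()]
    for _ in range(beats):
        sleep(interval)
        stamps.append(clock())
    return find_stalls(stamps, interval, threshold)


if __name__ == "__main__":
    # Hypothetical usage: watch for one hour, then print what was seen.
    for start, delay in monitor(beats=3600):
        print("stall at t=%.1f lasting about %.1f s" % (start, delay))
```

A sketch like this only shows that user space stopped being scheduled; it
cannot say why, so it would complement a kdb session rather than replace
one.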

** Changed in: linux (Ubuntu)
       Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1154876

Title:
  3.2.0-38 and earlier systems hang with heavy memory usage

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1154876/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
