Re: Big problems with 7.1 locking up :-(

Tomas Randa Mon, 12 Jan 2009 13:05:19 -0800

Hello,

I have similar problems. The last "good" kernel I have is from the stable branch, October 8. After the next upgrade, I saw big problems with performance.

I tried ULE, 4BSD, etc., but nothing helps except downgrading the system again.

Now I am trying 7.1-p1 and the problems are here again. MySQL spends a lot of time waiting with the status "waiting for opening table" or "waiting for close tables".

I have 32-bit FreeBSD with PAE, 1x Xeon 5420, a Supermicro motherboard, and an Areca SATA controller. Could the problem be in the "da" device, for example?


Thanks,
Tomas Randa

Garance A Drosihn wrote:

At 2:55 PM +0000 1/12/09, Robert Watson wrote:
On Fri, 9 Jan 2009, Garance A Drosihn wrote:
At 2:39 PM -0500 1/9/09, Robert Blayzor wrote:
On Jan 8, 2009, at 8:58 PM, Pete French wrote:
I have a number of HP 1U servers, all of which were running 7.0 perfectly happily. I have been testing 7.1 in its various incarnations for the last couple of months on our test server and it has performed perfectly.
I noticed a problem with 7.0 on a couple of Dell servers. [...] We've since then compiled the kernel under the BSD scheduler to rule that out, and so far so good.
Since ULE is now default in 7.1 and not in 7.0, perhaps you can try that?
FWIW, the other guy I know who is having this problem had already switched to using ULE under 7.0-release, and did not have any problems with it. So *his* problem was probably not related to SCHED_ULE, unless something has recently changed there.
Turns out he hasn't reverted back to 7.0-release just yet, so he's going to try SCHED_4BSD and see if that helps his situation.
Scheduler changes always come with some risk of exposing bugs that have existed in the code for a long time but never really manifested themselves. ULE is well shaken-out, having been under development for at least five years, but it is possible that some problems will become visible as a result of the switch. I would encourage people to stick with ULE, but if you're having a stability problem then experimenting with scheduler as a variable that could be triggering the problem may well be useful to help track down the bug.
Just to follow up on this:  My friend did switch back to a 7.1 kernel with
SCHED_4BSD, and he still ran into problems.  The error messages weren't
the same, but errors did happen in the same high disk-I/O situations as
the lockup happened with SCHED_ULE.  At this point he's fallen back to
the 7.0-kernel that he had been running (which also has SCHED_ULE), and
all the problems have gone away.  So at the moment he's running with a
7.0-ish kernel and the 7.1-release userland, without the hanging problems.
So the problem is something in the kernel, but it is *NOT* the scheduler
(at least, not in his case).

He is not eager to do a whole lot of experiments to track down the
problem, since this is happening on busy production machines and he
can't afford to have a lot of downtime on them (especially now that the
semester at RPI has started up).  The systems have some large (2 TB)
filesystems on them, and the lockups occur in high disk-I/O situations.
He's seeing the problem on one system which is a dual CPU quad-core
xeon, and another which is a 64 bit P4 with hyperthreading.  The one
thing in common between the two setups is that the boot drives + a
3ware controller (with its array of RAID disks) is moved from one
machine to the other one:

  "its a 3ware 9500 12 port model, the boot drive is connected to
   an ICH6 in IDE mode, and yes, I've run it in single, single with
   hyper threading, and 8 way mode.  All 64 bit."

We still have no idea where the problem really is.  For all we know,
someone spilled a Pepsi on it when he wasn't looking...

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
