On 19/04/2008, at 3:14 AM, John Baldwin wrote:
On Thursday 10 April 2008 06:33:40 pm Aristedes Maniatis wrote:
http://www.ish.com.au/s/LOR/1.jpg
http://www.ish.com.au/s/LOR/2.jpg
http://www.ish.com.au/s/LOR/3.jpg (this overlaps with [2])
These are all garbage in kuickshow. :(
They work fine for me in Firefox, but I don't know what sort of JPEGs
the Sony camera saves. Anyhow, I've also now resaved them as PNG (about
twice the size). Please let me know if that worked.
http://www.ish.com.au/s/LOR/1.png , etc
kuickshow still had issues, but FF worked ok. The specific LOR at the
end is real, but a minor one. Basically, the console driver locks
(e.g. "sio", "scrlock") are higher in the order than the various thread
locks, so any printf while holding a thread lock will trigger a LOR.
The problem at the bottom of the screen, though, is a real issue. It's
a LOR of two different sleepqueue chain locks. The problem is that when
setrunnable() encounters a swapped-out thread it tries to wake up
proc0, but if proc0 is asleep (which is typical) then its thread lock
is a sleep queue chain lock, so waking up a swapped-out thread from
wakeup() will usually trigger this LOR.
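
To make the console-lock case concrete, here is a rough userland sketch
of an order check. It is only a toy model, not the kernel's witness
code: the names (order_pos, held_locks, acquire) and the numeric
positions are invented for illustration. A lock that sits earlier in
the order must be taken first, so taking a console lock while a thread
lock is already held is flagged as a reversal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical order positions, for illustration only. A smaller value
 * means "higher in the order", i.e. it must be acquired first. */
enum order_pos { POS_CONSOLE = 1, POS_THREAD = 2 };

#define MAX_HELD 8

/* The set of locks one thread currently holds, by order position. */
struct held_locks {
    int pos[MAX_HELD];
    int count;
};

/* True if taking a lock at position `pos` would reverse the order,
 * i.e. some lock that belongs later in the order is already held. */
static bool would_reverse(const struct held_locks *h, int pos)
{
    for (int i = 0; i < h->count; i++)
        if (h->pos[i] > pos)
            return true;
    return false;
}

/* Record the acquisition and report whether it was a reversal. */
static bool acquire(struct held_locks *h, int pos)
{
    bool lor = would_reverse(h, pos);

    assert(h->count < MAX_HELD);
    h->pos[h->count++] = pos;
    return lor;
}
```

In this model, acquire(&h, POS_THREAD) followed by
acquire(&h, POS_CONSOLE) reports a reversal on the second call, which
mirrors a printf issued while a thread lock is held.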
I think the best fix is to not have setrunnable() kick proc0 directly.
Perhaps setrunnable() should return an int and return true if proc0
needs to be awakened and false otherwise. Then the sleepq code (because
only sleeping threads can be swapped out anyway) can return that value
from sleepq_resume_thread() and can call kick_proc0() directly once it
has dropped all of its own locks.
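
A rough userland sketch of that idea follows. It is not kernel code and
not a patch: the model_* names, the struct, and the boolean standing in
for a sleepqueue chain lock are all invented for illustration. The
point is only the shape of the change: the setrunnable step reports
that proc0 needs a kick, and the caller performs the kick after its own
lock has been released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a thread; not the kernel's struct thread. */
struct model_thread {
    bool swapped_out;
    bool runnable;
};

static bool chain_lock_held;   /* models a sleepqueue chain lock */
static int  proc0_kicks;       /* counts wakeups sent to proc0 */

/* Models kick_proc0(): must never run with the chain lock held. */
static void model_kick_proc0(void)
{
    assert(!chain_lock_held);
    proc0_kicks++;
}

/* Models the proposed setrunnable(): it no longer kicks proc0 itself.
 * Returns 1 if the caller must wake proc0 later, 0 otherwise. */
static int model_setrunnable(struct model_thread *td)
{
    td->runnable = true;
    return td->swapped_out ? 1 : 0;
}

/* Models the wakeup path: remember the result while locked, then
 * kick proc0 only after every lock taken here has been dropped. */
static void model_wakeup(struct model_thread *td)
{
    int kick;

    chain_lock_held = true;            /* lock the chain */
    kick = model_setrunnable(td);
    chain_lock_held = false;           /* drop the chain lock */

    if (kick)
        model_kick_proc0();
}
```

Calling model_wakeup() on a thread with swapped_out set kicks proc0
exactly once, and only after the modelled chain lock is released; a
thread that is not swapped out causes no kick at all.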
--
John Baldwin
The way you describe it, it almost sounds like this LOR should be
happening for everyone, all the time. To try and eliminate the factors
which trigger it for us, we tried the following: removed PAE from
kernel, disabled PF. Neither of these things made any difference and
the error is fairly quickly reproducible (within a couple of hours
running various things to load the machine). The one thing we did not
test yet is removing ZFS from the picture. Note also that this box ran
for years and years on FreeBSD 4.x without a hiccup (non PAE, ipfw
instead of pf and no ZFS of course).
Since I've ordered a replacement machine to go into production now, I
am happy to make this one available for whatever testing would benefit
the FreeBSD community to track down the problem.
If useful, we could upgrade this machine to the 7-STABLE branch and use
the new tools Robert Watson recently wrote to dump better crash logs.
Let me know, though I don't know a lot about them yet apart from what
I've read on this list.
Regards
Ari Maniatis
-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001 fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A