Last night we killed one more bug. This is what was causing the
lockups that Doug Gilbert was seeing. He is also seeing a second, much
less frequent problem, which I will have to look into next. I know what the
problem is - I just need to code up the solution and test it.
The problem I was seeing where eh_wait was getting zeroed had a
simple explanation. I was running the tests in single-user mode, and the
error handling threads incorrectly shut down in single-user mode.
In addition, the error handling threads are left sitting around as zombies
(which is why I thought they were still running in the first place),
because nobody reaps them.
I need to tweak the set of blocked signals so that it is possible to
unload a module, yet these things don't get killed by init when dropping
down to single-user mode.
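
To illustrate the idea of the signal-mask tweak, here is a minimal user-space
sketch. It is an analogy only, not the kernel change itself: the error handling
threads are kernel threads and set their mask through kernel-internal
interfaces. The helper names block_all_except() and is_blocked() are
hypothetical, and the choice of which signal to leave deliverable is an
assumption made purely for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Block every signal except `keep` for the calling process.
 * `keep` plays the role of the one signal used to ask the thread to exit
 * (for example at module unload); which signal that should be is an
 * assumption of this sketch.
 * Note: a user process can never block SIGKILL or SIGSTOP, so this is
 * only an analogy for what a kernel thread can do.
 * Returns 0 on success, -1 on failure.
 */
static int block_all_except(int keep)
{
    sigset_t mask;

    if (sigfillset(&mask) != 0)
        return -1;
    if (sigdelset(&mask, keep) != 0)
        return -1;
    return sigprocmask(SIG_SETMASK, &mask, NULL);
}

/* Return 1 if `signo` is currently blocked, 0 if not, -1 on error. */
static int is_blocked(int signo)
{
    sigset_t cur;

    /* A NULL new set leaves the mask unchanged and only reports it. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, signo);
}
```

With block_all_except(SIGHUP), for instance, a SIGTERM sent during a runlevel
change stays pending instead of terminating the process, while SIGHUP is still
delivered.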
I sent off a set of diffs against 2.3.32-pre4 this morning, which
contain this bugfix plus the patches I sent out yesterday.
I haven't had a chance to browse linux-kernel in the past couple
of days. If there is anything being said over there that I need to be
aware of, please let me know.
I will be on vacation starting this coming Saturday for about 10 days.
I will have a laptop, and I will be responding to mail (but I won't be
checking more than once a day or so). I won't have the ability to test
anything on a live kernel (no SCSI in the laptop).
-Eric
"The world was a library, and its books were the stones, leaves,
brooks, grass, and the birds of the earth. We learned to do what only
a student of nature ever learns, and that was to feel beauty."
Chief Luther Standing Bear - Teton Sioux
*** scsi_lib.c.orig Mon Dec 13 23:54:24 1999
--- scsi_lib.c Mon Dec 13 23:54:48 1999
***************
*** 109,114 ****
--- 109,115 ----
  		for (req = q->current_request; req; req = req->next) {
  			if (req->next == NULL) {
  				req->next = &SCpnt->request;
+ 				break;
  			}
  		}
  	}
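
For anyone wondering why the missing break matters, here is a small
stand-alone sketch of the same append loop. It is my reading of the patch, not
a copy of the kernel code: struct request is cut down to a hypothetical
two-field version, append_request() and list_length() are made-up helpers, and
I am assuming the request being appended has a NULL next pointer on entry.
Under that assumption, without the break the loop steps onto the node it just
appended, sees a NULL next, links the node to itself, and then never
terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct request; only the
 * link field matters for this illustration. */
struct request {
    struct request *next;
    int id;
};

/* Append new_req to the tail of the list starting at head.
 * Assumes head is non-NULL and new_req->next is NULL on entry. */
static void append_request(struct request *head, struct request *new_req)
{
    struct request *req;

    assert(head != NULL);
    assert(new_req != NULL && new_req->next == NULL);

    for (req = head; req; req = req->next) {
        if (req->next == NULL) {
            req->next = new_req;
            /* Stop here. Otherwise the next iteration visits new_req,
             * finds new_req->next == NULL, sets new_req->next = new_req,
             * and the loop spins on that self-link forever. */
            break;
        }
    }
}

/* Count the nodes in the list, giving up after `limit` steps so that a
 * cyclic list cannot hang the caller. Returns -1 if the limit is hit. */
static int list_length(const struct request *head, int limit)
{
    int n = 0;

    for (; head; head = head->next) {
        if (++n > limit)
            return -1;
    }
    return n;
}
```

The bounded list_length() is only there so that a corrupted (cyclic) list can
be detected in a test rather than hanging it.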