Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE - FURTHER UPDATE

2005-04-05 Thread Karl Denninger
On Tue, Apr 05, 2005 at 08:25:04PM -0500, Karl Denninger wrote:
> 
> This patch appears to be "safe".
> 
> I have about 2 hours on the production machine right now post-rebuild (which
> had to complete first) with the added "callout_drain" in, have taken two DMA
> WRITE retries, and have not yet seen any evidence of destabilization.
> 
> This is good evidence but not proof - before I took out the original line
> the FIRST write retry would immediately cause the system to become unstable.

I have since forced over a dozen errors (under high-rate I/O) and still see
no stability problems.

Patch looks good.

-- 
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant & Kids Rights Activist
http://www.denninger.net        My home on the net - links to everything I do!
http://scubaforum.org           Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net         SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com    Musings Of A Sentient Mind


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: DANGER WILL ROBINSON! SERIOUS problem with current 5.4-PRERELEASE - FURTHER UPDATE

2005-04-05 Thread Karl Denninger
On Thu, Mar 31, 2005 at 11:06:08AM -0600, Karl Denninger wrote:
> On Thu, Mar 31, 2005 at 12:02:20PM -0500, Matthew N. Dodd wrote:
> > On Wed, 30 Mar 2005, Karl Denninger wrote:
> > > Removing the FIRST delta, which is:
> > >
> > > 218a219,221
> > >   if (!dumping)
> > >   callout_reset(&request->callout, request->timeout * hz,
> > > (timeout_t*)ata_timeout, request);
> > >
> > > appears to get rid of the crashes while not harming data integrity OR the
> > > requeueing.
> > 
> > I'd be interested to know if the attached patch does anything.
> > 
> > -- 
> > 10 40 80 C0 00 FF FF FF FF C0 00 00 00 00 10 AA AA 03 00 00 00 08 00
> > Index: ata-queue.c
> > ===
> > RCS file: /home/ncvs/src/sys/dev/ata/ata-queue.c,v
> > retrieving revision 1.32.2.6
> > diff -u -u -r1.32.2.6 ata-queue.c
> > --- ata-queue.c 23 Mar 2005 04:50:26 -  1.32.2.6
> > +++ ata-queue.c 31 Mar 2005 17:00:46 -
> > @@ -217,8 +217,7 @@
> >  }
> >  else {
> > if (!dumping)
> > -   callout_reset(&request->callout, request->timeout * hz,
> > - (timeout_t*)ata_timeout, request);
> > +callout_drain(&request->callout);
> > if (request->bio && !(request->flags & ATA_R_TIMEOUT)) {
> > ATA_DEBUG_RQ(request, "finish bio_taskqueue");
> > bio_taskqueue(request->bio, (bio_task_t *)ata_completed, request);
> > 
> 
> It'll be a few hours before I will know on the production machine - the RAID
> array has to rebuild before I can trigger the problem, and we're scheduled
> for some power work here in an hour or so - which I suspect will get in the
> way.
> 
> What do you expect the patch to do, given that removing the delta appears to
> fix the instability problem?

This patch appears to be "safe".

I have about 2 hours on the production machine right now post-rebuild (which
had to complete first) with the added "callout_drain" in, have taken two DMA
WRITE retries, and have not yet seen any evidence of destabilization.

This is good evidence but not proof - before I took out the original line
the FIRST write retry would immediately cause the system to become unstable.
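
For anyone following along: the thread does not say why re-arming the callout
in the finish path destabilizes the box. One possible reading (an assumption,
not something confirmed above) is that a timeout left armed on a request that
has already finished can fire later against that finished request, whereas
draining the callout leaves nothing pending. The sketch below is a
single-threaded userspace toy of that idea only. Every toy_* name and
run_scenario() are made up for illustration; none of this is the kernel
callout(9) API or the ata-queue.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace stand-ins (hypothetical); NOT the kernel callout(9) API. */
struct toy_callout {
    int armed;                  /* 1 while a timeout is pending */
    int ticks_left;             /* ticks until the handler fires */
    void (*func)(void *);       /* handler to run on expiry */
    void *arg;                  /* argument passed to the handler */
};

struct toy_request {
    struct toy_callout callout;
    int completed;              /* set once the request has finished */
    int stale_timeouts;         /* timeouts that fired after completion */
};

/* Arm (or re-arm) the timer; similar in spirit to callout_reset(9). */
static void toy_callout_reset(struct toy_callout *c, int ticks,
                              void (*func)(void *), void *arg)
{
    assert(c != NULL);
    c->armed = 1;
    c->ticks_left = ticks;
    c->func = func;
    c->arg = arg;
}

/*
 * Cancel any pending timer; similar in spirit to callout_drain(9).
 * The real call also waits for a handler that is already running;
 * this single-threaded toy has nothing to wait for.
 */
static void toy_callout_drain(struct toy_callout *c)
{
    assert(c != NULL);
    c->armed = 0;
}

/* Advance the toy clock by one tick and fire the handler on expiry. */
static void toy_tick(struct toy_callout *c)
{
    if (c->armed && --c->ticks_left <= 0) {
        c->armed = 0;
        c->func(c->arg);
    }
}

/* Timeout handler: count firings that hit an already finished request. */
static void toy_timeout(void *arg)
{
    struct toy_request *r = arg;

    if (r->completed)
        r->stale_timeouts++;
}

/*
 * Start a request, finish it, then run the clock past the timeout.
 * Returns how many timeouts fired after the request had completed.
 */
int run_scenario(int use_drain)
{
    struct toy_request r = { .completed = 0, .stale_timeouts = 0 };
    int i;

    toy_callout_reset(&r.callout, 5, toy_timeout, &r);  /* armed at start */
    toy_tick(&r.callout);                               /* some time passes */

    /* Completion path: the one spot the patch changes. */
    if (use_drain)
        toy_callout_drain(&r.callout);                      /* patched */
    else
        toy_callout_reset(&r.callout, 5, toy_timeout, &r);  /* original */
    r.completed = 1;

    for (i = 0; i < 10; i++)
        toy_tick(&r.callout);
    return r.stale_timeouts;
}
```

In this toy, run_scenario(0) (re-arm on completion) reports one timeout
firing against the finished request, and run_scenario(1) (drain on
completion) reports none. Whether that is the actual mechanism behind the
crashes is not established in this thread.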

-- 
Karl Denninger ([EMAIL PROTECTED]) Internet Consultant & Kids Rights Activist
http://www.denninger.net        My home on the net - links to everything I do!
http://scubaforum.org           Your UNCENSORED place to talk about DIVING!
http://www.spamcuda.net         SPAM FREE mailboxes - FREE FOR A LIMITED TIME!
http://genesis3.blogspot.com    Musings Of A Sentient Mind

