On Tue, Mar 13, 2012 at 08:15:33PM +0100, Mark Kettenis wrote:
> > Date: Mon, 12 Mar 2012 22:39:31 +0100 (CET)
> > From: Mark Kettenis <mark.kette...@xs4all.nl>
> >
> > > Date: Sat, 25 Feb 2012 09:55:57 +0100
> > > From: Paul de Weerd <we...@weirdnet.nl>
> > >
> > > I recently got a v215 from a friend and have installed OpenBSD on it.
> > > Occasionally, it will panic during boot. This happened during
> > > install and I see it now during regular reboots. I can pretty much
> > > reproduce this at will with a couple of reboots.
> > >
> > > Could this be faulty hardware? To reset the ALOM password, I
> > > installed Solaris 10 (took an eternity) and that never showed any
> > > problems, but I guess that doesn't prove much.
> > >
> > > First the panic and then full dmesg (from a successful boot) are
> > > included below.
> >
> > I doubt this is faulty hardware. I've seen similar reports for a
> > v445, which has the same crappy Acer Labs pciide(4) controller. I
> > fear that the wdc.c changes made in April 2011 introduced this
> > behaviour.
>
> So thanks to Paul giving me access to the machine in question I've
> been able to figure out what's going wrong here.
>
> The data error always happens when running wdcintr() for channel 1.
> Now on these machines we have the following line in dmesg
>
> ...
> pciide0: channel 1 disabled (no drives)
> ...
>
> indicating that there is no actual hardware connected to channel 1.
> As a result of this we skip further initialization of the channel.
> Therefore it shouldn't be a terrible surprise that the chip doesn't
> like it when we try to read registers associated with this channel.
> On crappy PC hardware this won't be noticed, but on sparc64 this
> results in an unrecoverable fault.
>
> The solution is easy. We shouldn't be calling wdcintr() for a channel
> that isn't properly initialized.
>
> ok?
>
>
> Index: pciide.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/pciide.c,v
> retrieving revision 1.337
> diff -u -p -r1.337 pciide.c
> --- pciide.c	15 Jan 2012 15:16:23 -0000	1.337
> +++ pciide.c	13 Mar 2012 18:54:50 -0000
> @@ -1838,6 +1838,9 @@ pciide_pci_intr(void *arg)
>  		if (cp->compat)
>  			continue;
>  
> +		if (cp->hw_ok == 0)
> +			continue;
> +
>  		if (pciide_intr_flag(cp) == 0)
>  			continue;
>  
Makes sense to me. ok krw@
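
For anyone reading this thread later, the idea behind the diff can be sketched outside the kernel. This is only an illustration under stated assumptions, not the driver code: struct chan, chan_intr() and shared_intr() are hypothetical stand-ins for the real channel structure, wdcintr() and pciide_pci_intr(), and the intr_pending field stands in for pciide_intr_flag(). Only the compat and hw_ok checks are taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified channel state; not the real driver structure. */
struct chan {
	int compat;		/* channel uses legacy compat interrupts */
	int hw_ok;		/* nonzero once the channel was fully initialized */
	int intr_pending;	/* stand-in for pciide_intr_flag() */
	int handled;		/* how often the per-channel handler ran */
};

/* Stand-in for wdcintr(): a real driver would read channel registers here. */
static int
chan_intr(struct chan *cp)
{
	cp->handled++;
	cp->intr_pending = 0;
	return 1;
}

/* Shared interrupt handler: returns nonzero if any channel claimed it. */
static int
shared_intr(struct chan *chans, size_t n)
{
	size_t i;
	int rv = 0;

	assert(chans != NULL);
	for (i = 0; i < n; i++) {
		struct chan *cp = &chans[i];

		if (cp->compat)
			continue;
		if (cp->hw_ok == 0)	/* never initialized: leave it alone */
			continue;
		if (cp->intr_pending == 0)
			continue;
		if (chan_intr(cp))
			rv = 1;
	}
	return rv;
}
```

With two channels where only the first has hw_ok set, shared_intr() runs the handler for the first channel and never reaches chan_intr() for the second, even if its pending flag happens to be set. That corresponds to the "channel 1 disabled (no drives)" case described above.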