re: ncr53c9x fallout from asserting kernel lock in scsipi_base
> On Fri, Mar 09, 2012 at 02:49:43PM +, Julian Coleman wrote: > > Hi, > > > > After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to > > assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based > > sparc64 crashed with: > > > > panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file > > "/usr/src/sys/dev/scsipi/scsipi_base.c", line 221 > > > > The backtrace is: > > > > kern_assert > > scsipi_lookup_periph > > scsipi_async_event > > ncr53c9x_update_xfer_mode > > ncr53c9x_init > > ncr53c9x_attach > > > > The attached allows it to boot and run, but is it correct? > > assuming ncr53c9x is MP-safe, it is correct. it isn't, but these happen during the non-interrupt driven autoconfig phase and kernel lock isn't held here. .mrg.
re: ncr53c9x fallout from asserting kernel lock in scsipi_base
> After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to > assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based > sparc64 crashed with: > > panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file > "/usr/src/sys/dev/scsipi/scsipi_base.c", line 221 > > The backtrace is: > > kern_assert > scsipi_lookup_periph > scsipi_async_event > ncr53c9x_update_xfer_mode > ncr53c9x_init > ncr53c9x_attach > > The attached allows it to boot and run, but is it correct? i had a look at all of the scsi drivers, and this and ninjascsi are the only broken ones. i have a change identical to this in my tree. it seems the best solution for now, but please add /* XXXSMP scsipi */ comments. .mrg.
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
On Sat, Mar 10, 2012 at 03:22:43AM +0900, Izumi Tsutsui wrote: > > Since attach is usually called when the system is cold, there are no > > other CPUs running so the system is effectively in KERNEL_LOCK(). > > > > Rather than fix the driver, maybe init_main should take out KERNEL_LOCK() > > until it exits cold state. > > What about hot-plug? (we have esp at pcmcia) While it's not a big problem to not have KERNEL_LOCK() while cold (execpt for DIAGNOSTIC), I really hope autoconf runs under KERNEL_LOCK() for hot-plug devices. Otherwise bad things can happen ... -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
matt@ wrote: > On Mar 9, 2012, at 7:53 AM, Manuel Bouyer wrote: > > > On Fri, Mar 09, 2012 at 04:44:54PM +0100, Martin Husemann wrote: > >> On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote: > >>> if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK > >>> itself, and so should not need to take it before calling back in > >>> scsipi. > >> > >> Where is that dealt with? I.e. where is the lock taken before the attach > >> function is called? > > > > This I don't know. Maybe it's not taken when running autoconf. > > If that's true I'm surprised there isn't more issues than this ... > > Since attach is usually called when the system is cold, there are no > other CPUs running so the system is effectively in KERNEL_LOCK(). > > Rather than fix the driver, maybe init_main should take out KERNEL_LOCK() > until it exits cold state. What about hot-plug? (we have esp at pcmcia) --- Izumi Tsutsui
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
On Mar 9, 2012, at 7:53 AM, Manuel Bouyer wrote: > On Fri, Mar 09, 2012 at 04:44:54PM +0100, Martin Husemann wrote: >> On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote: >>> if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK >>> itself, and so should not need to take it before calling back in >>> scsipi. >> >> Where is that dealt with? I.e. where is the lock taken before the attach >> function is called? > > This I don't know. Maybe it's not taken when running autoconf. > If that's true I'm surprised there isn't more issues than this ... Since attach is usually called when the system is cold, there are no other CPUs running so the system is effectively in KERNEL_LOCK(). Rather than fix the driver, maybe init_main should take out KERNEL_LOCK() until it exits cold state.
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
On Fri, Mar 09, 2012 at 04:44:54PM +0100, Martin Husemann wrote: > On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote: > > if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK > > itself, and so should not need to take it before calling back in > > scsipi. > > Where is that dealt with? I.e. where is the lock taken before the attach > function is called? This I don't know. Maybe it's not taken when running autoconf. If that's true I'm surprised there isn't more issues than this ... -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote: > if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK > itself, and so should not need to take it before calling back in > scsipi. Where is that dealt with? I.e. where is the lock taken before the attach function is called? Martin
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
On Fri, Mar 09, 2012 at 04:30:56PM +0100, Martin Husemann wrote: > On Fri, Mar 09, 2012 at 04:17:42PM +0100, Manuel Bouyer wrote: > > assuming ncr53c9x is MP-safe, it is correct. > > Just out of curiosity: assuming it isn't, what would be different? if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK itself, and so should not need to take it before calling back in scsipi. -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
On Fri, Mar 09, 2012 at 04:17:42PM +0100, Manuel Bouyer wrote: > assuming ncr53c9x is MP-safe, it is correct. Just out of curiosity: assuming it isn't, what would be different? Martin
Re: ncr53c9x fallout from asserting kernel lock in scsipi_base
On Fri, Mar 09, 2012 at 02:49:43PM +, Julian Coleman wrote: > Hi, > > After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to > assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based > sparc64 crashed with: > > panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file > "/usr/src/sys/dev/scsipi/scsipi_base.c", line 221 > > The backtrace is: > > kern_assert > scsipi_lookup_periph > scsipi_async_event > ncr53c9x_update_xfer_mode > ncr53c9x_init > ncr53c9x_attach > > The attached allows it to boot and run, but is it correct? assuming ncr53c9x is MP-safe, it is correct. -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
ncr53c9x fallout from asserting kernel lock in scsipi_base
Hi, After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based sparc64 crashed with: panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file "/usr/src/sys/dev/scsipi/scsipi_base.c", line 221 The backtrace is: kern_assert scsipi_lookup_periph scsipi_async_event ncr53c9x_update_xfer_mode ncr53c9x_init ncr53c9x_attach The attached allows it to boot and run, but is it correct? Thanks, J -- My other computer also runs NetBSD/Sailing at Newbiggin http://www.netbsd.org// http://www.newbigginsailingclub.org/ Index: src/sys/dev/ic/ncr53c9x.c === RCS file: /cvsroot/src/sys/dev/ic/ncr53c9x.c,v retrieving revision 1.143 diff -u -r1.143 ncr53c9x.c --- src/sys/dev/ic/ncr53c9x.c 31 Jul 2011 18:39:00 - 1.143 +++ src/sys/dev/ic/ncr53c9x.c 9 Mar 2012 14:41:02 - @@ -546,7 +546,10 @@ ti->offset = 0; ti->cfg3 = 0; + /* We need to have the kernel lock in scsipi_lookup_periph() */ + KERNEL_LOCK(1, curlwp); ncr53c9x_update_xfer_mode(sc, r); + KERNEL_UNLOCK_ONE(curlwp); } if (doreset) { @@ -558,7 +561,9 @@ } /* Notify upper layer */ + KERNEL_LOCK(1, curlwp); scsipi_async_event(&sc->sc_channel, ASYNC_EVENT_RESET, NULL); + KERNEL_UNLOCK_ONE(curlwp); } /*