re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread matthew green

> On Fri, Mar 09, 2012 at 02:49:43PM +, Julian Coleman wrote:
> > Hi,
> > 
> > After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to
> > assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based
> > sparc64 crashed with:
> > 
> >   panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file 
> > "/usr/src/sys/dev/scsipi/scsipi_base.c", line 221
> > 
> > The backtrace is:
> > 
> >   kern_assert
> >   scsipi_lookup_periph
> >   scsipi_async_event
> >   ncr53c9x_update_xfer_mode
> >   ncr53c9x_init
> >   ncr53c9x_attach
> > 
> > The attached allows it to boot and run, but is it correct?
> 
> assuming ncr53c9x is MP-safe, it is correct.

it isn't, but these happen during the non-interrupt driven autoconfig
phase and kernel lock isn't held here.


.mrg.


re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread matthew green

> After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to
> assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based
> sparc64 crashed with:
> 
>   panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file 
> "/usr/src/sys/dev/scsipi/scsipi_base.c", line 221
> 
> The backtrace is:
> 
>   kern_assert
>   scsipi_lookup_periph
>   scsipi_async_event
>   ncr53c9x_update_xfer_mode
>   ncr53c9x_init
>   ncr53c9x_attach
> 
> The attached allows it to boot and run, but is it correct?

i had a look at all of the scsi drivers, and this and ninjascsi are
the only broken ones.  i have a change identical to this in my tree.

it seems the best solution for now, but please add /* XXXSMP scsipi */
comments.


.mrg.


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Manuel Bouyer
On Sat, Mar 10, 2012 at 03:22:43AM +0900, Izumi Tsutsui wrote:
> > Since attach is usually called when the system is cold, there are no
> > other CPUs running so the system is effectively in KERNEL_LOCK().
> > 
> > Rather than fix the driver, maybe init_main should take out KERNEL_LOCK()
> > until it exits cold state.  
> 
> What about hot-plug?  (we have esp at pcmcia)

While it's not a big problem to not have KERNEL_LOCK() while cold
(execpt for DIAGNOSTIC), I really hope autoconf runs under KERNEL_LOCK()
for hot-plug devices. Otherwise bad things can happen ...

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Izumi Tsutsui
matt@ wrote:

> On Mar 9, 2012, at 7:53 AM, Manuel Bouyer wrote:
> 
> > On Fri, Mar 09, 2012 at 04:44:54PM +0100, Martin Husemann wrote:
> >> On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote:
> >>> if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK
> >>> itself, and so should not need to take it before calling back in
> >>> scsipi.
> >> 
> >> Where is that dealt with? I.e. where is the lock taken before the attach
> >> function is called?
> > 
> > This I don't know. Maybe it's not taken when running autoconf.
> > If that's true I'm surprised there isn't more issues than this ...
> 
> Since attach is usually called when the system is cold, there are no
> other CPUs running so the system is effectively in KERNEL_LOCK().
> 
> Rather than fix the driver, maybe init_main should take out KERNEL_LOCK()
> until it exits cold state.  

What about hot-plug?  (we have esp at pcmcia)

---
Izumi Tsutsui


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Matt Thomas

On Mar 9, 2012, at 7:53 AM, Manuel Bouyer wrote:

> On Fri, Mar 09, 2012 at 04:44:54PM +0100, Martin Husemann wrote:
>> On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote:
>>> if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK
>>> itself, and so should not need to take it before calling back in
>>> scsipi.
>> 
>> Where is that dealt with? I.e. where is the lock taken before the attach
>> function is called?
> 
> This I don't know. Maybe it's not taken when running autoconf.
> If that's true I'm surprised there isn't more issues than this ...

Since attach is usually called when the system is cold, there are no
other CPUs running so the system is effectively in KERNEL_LOCK().

Rather than fix the driver, maybe init_main should take out KERNEL_LOCK()
until it exits cold state.  


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Manuel Bouyer
On Fri, Mar 09, 2012 at 04:44:54PM +0100, Martin Husemann wrote:
> On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote:
> > if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK
> > itself, and so should not need to take it before calling back in
> > scsipi.
> 
> Where is that dealt with? I.e. where is the lock taken before the attach
> function is called?

This I don't know. Maybe it's not taken when running autoconf.
If that's true I'm surprised there isn't more issues than this ...

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Martin Husemann
On Fri, Mar 09, 2012 at 04:39:08PM +0100, Manuel Bouyer wrote:
> if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK
> itself, and so should not need to take it before calling back in
> scsipi.

Where is that dealt with? I.e. where is the lock taken before the attach
function is called?

Martin


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Manuel Bouyer
On Fri, Mar 09, 2012 at 04:30:56PM +0100, Martin Husemann wrote:
> On Fri, Mar 09, 2012 at 04:17:42PM +0100, Manuel Bouyer wrote:
> > assuming ncr53c9x is MP-safe, it is correct.
> 
> Just out of curiosity: assuming it isn't, what would be different?

if ncr53c9x is not MP-safe, it should be running under the KERNEL_LOCK
itself, and so should not need to take it before calling back in
scsipi.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Martin Husemann
On Fri, Mar 09, 2012 at 04:17:42PM +0100, Manuel Bouyer wrote:
> assuming ncr53c9x is MP-safe, it is correct.

Just out of curiosity: assuming it isn't, what would be different?

Martin


Re: ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Manuel Bouyer
On Fri, Mar 09, 2012 at 02:49:43PM +, Julian Coleman wrote:
> Hi,
> 
> After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to
> assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based
> sparc64 crashed with:
> 
>   panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file 
> "/usr/src/sys/dev/scsipi/scsipi_base.c", line 221
> 
> The backtrace is:
> 
>   kern_assert
>   scsipi_lookup_periph
>   scsipi_async_event
>   ncr53c9x_update_xfer_mode
>   ncr53c9x_init
>   ncr53c9x_attach
> 
> The attached allows it to boot and run, but is it correct?

assuming ncr53c9x is MP-safe, it is correct.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


ncr53c9x fallout from asserting kernel lock in scsipi_base

2012-03-09 Thread Julian Coleman
Hi,

After the change in revision 1.156 of src/sys/dev/scsipi/scsipi_base.c to
assert that the kernel lock is held in scsipi_lookup_periph(), my SBus-based
sparc64 crashed with:

  panic: kernel diagnostic assertion "KERNEL_LOCKED_P()" failed: file 
"/usr/src/sys/dev/scsipi/scsipi_base.c", line 221

The backtrace is:

  kern_assert
  scsipi_lookup_periph
  scsipi_async_event
  ncr53c9x_update_xfer_mode
  ncr53c9x_init
  ncr53c9x_attach

The attached allows it to boot and run, but is it correct?

Thanks,

J

-- 
  My other computer also runs NetBSD/Sailing at Newbiggin
http://www.netbsd.org//   http://www.newbigginsailingclub.org/
Index: src/sys/dev/ic/ncr53c9x.c
===
RCS file: /cvsroot/src/sys/dev/ic/ncr53c9x.c,v
retrieving revision 1.143
diff -u -r1.143 ncr53c9x.c
--- src/sys/dev/ic/ncr53c9x.c   31 Jul 2011 18:39:00 -  1.143
+++ src/sys/dev/ic/ncr53c9x.c   9 Mar 2012 14:41:02 -
@@ -546,7 +546,10 @@
ti->offset = 0;
ti->cfg3   = 0;
 
+   /* We need to have the kernel lock in scsipi_lookup_periph() */
+   KERNEL_LOCK(1, curlwp);
ncr53c9x_update_xfer_mode(sc, r);
+   KERNEL_UNLOCK_ONE(curlwp);
}
 
if (doreset) {
@@ -558,7 +561,9 @@
}
 
/* Notify upper layer */
+   KERNEL_LOCK(1, curlwp);
scsipi_async_event(&sc->sc_channel, ASYNC_EVENT_RESET, NULL);
+   KERNEL_UNLOCK_ONE(curlwp);
 }
 
 /*