re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread matthew green

 Module Name:  src
 Committed By: bouyer
 Date: Wed Apr 18 20:37:49 UTC 2012
 
 Modified Files:
   src/sys/dev/scsipi: scsipi_base.c
 
 Log Message:
 Fix KASSERT(): autoconf doesn't run under the KERNEL_LOCK

this is true, but can you please fix it differently?  ie autoconf
isn't cold.  most of scsi autoconf runs after cold is gone, so
i'd rather make whatever callers deal with it so that they work
cold or not.

thanks.


.mrg.


Re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread Manuel Bouyer
On Thu, Apr 19, 2012 at 04:41:04PM +1000, matthew green wrote:
 
  Module Name:src
  Committed By:   bouyer
  Date:   Wed Apr 18 20:37:49 UTC 2012
  
  Modified Files:
  src/sys/dev/scsipi: scsipi_base.c
  
  Log Message:
  Fix KASSERT(): autoconf doesn't run under the KERNEL_LOCK
 
 this is true, but can you please fix it differently?  ie autoconf
 isn't cold.  most of scsi autoconf runs after cold is gone, so
 i'd rather make whatever callers deal with it so that they work
 cold or not.

But if we're not cold, interrupts are enabled so I hope whatever calls
into scsi autoconf runs under the big lock if the underlying driver
isn't SMP-safe (and if it is, it's its job to take the KERNEL_LOCK).

The problem I've seen is with a driver I'm working on, similar to mfi(4)
(and I suspect mfi(4) would have had the same problem):
the driver's attach calls config_found_sm_loc() to attach the scsibus
while cold, so the big lock isn't helt at this point.

If the driver's attach is called after cold (e.g. after a detach/rescan
of the pci bus), the driver's attach should be called with KERNEL_LOCK
held, or bad things may happen when interrupts are enabled for this driver.

What kind of senario do you have in mind ?

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 ans d'experience feront toujours la difference
--


re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread matthew green

   Module Name:  src
   Committed By: bouyer
   Date: Wed Apr 18 20:37:49 UTC 2012
   
   Modified Files:
 src/sys/dev/scsipi: scsipi_base.c
   
   Log Message:
   Fix KASSERT(): autoconf doesn't run under the KERNEL_LOCK
  
  this is true, but can you please fix it differently?  ie autoconf
  isn't cold.  most of scsi autoconf runs after cold is gone, so
  i'd rather make whatever callers deal with it so that they work
  cold or not.
 
 But if we're not cold, interrupts are enabled so I hope whatever calls
 into scsi autoconf runs under the big lock if the underlying driver
 isn't SMP-safe (and if it is, it's its job to take the KERNEL_LOCK).

the point of the KASSERT()'s it to make sure that, until scsipi is
made smp safe, calls into scsi are all done with the kernel lock.

 The problem I've seen is with a driver I'm working on, similar to mfi(4)
 (and I suspect mfi(4) would have had the same problem):
 the driver's attach calls config_found_sm_loc() to attach the scsibus
 while cold, so the big lock isn't helt at this point.

yeah - i'm pretty sure i've seen this and fixed it in a few other
places rather than the above change.

 If the driver's attach is called after cold (e.g. after a detach/rescan
 of the pci bus), the driver's attach should be called with KERNEL_LOCK
 held, or bad things may happen when interrupts are enabled for this driver.

there should be no reliance on cold being set for normal driver
attach.  it might be a module loaded after boot.  how ever the
driver is loaded, it will need to work without cold being set.

in my mind, the scsi code should try to run regardless of the value
of cold and that's why i replied above.

 What kind of senario do you have in mind ?

modules, as above.  or simply drvctl -d / -r.  IMO, only platform
specific code really should depend upon cold.  scsipi, as a
relatively high level subsystem, should not.

thanks.


.mrg.


Re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread Manuel Bouyer
On Thu, Apr 19, 2012 at 06:25:54PM +1000, matthew green wrote:
 [...]
 
  If the driver's attach is called after cold (e.g. after a detach/rescan
  of the pci bus), the driver's attach should be called with KERNEL_LOCK
  held, or bad things may happen when interrupts are enabled for this driver.
 
 there should be no reliance on cold being set for normal driver
 attach.  it might be a module loaded after boot.  how ever the
 driver is loaded, it will need to work without cold being set.

If it's a module laoded after boot, and the driver is not SMP-safe,
its attach function has to be called with KERNEL_LOCK held. Or you may
have problems when this driver starts receiving interrupts before its
attach function has returned (which is typically the case, interrupt enable
occurs somewhere in _attach()).

 
 in my mind, the scsi code should try to run regardless of the value
 of cold and that's why i replied above.
 
  What kind of senario do you have in mind ?
 
 modules, as above.  or simply drvctl -d / -r.  IMO, only platform
 specific code really should depend upon cold.  scsipi, as a
 relatively high level subsystem, should not.

But I don't think it's the job of non-SMP-safe drivers to make sure scsipi
is called with KERNEL_LOCK held (and scsipi may be the only subsystem
to check for KERNEL_LOCK(), but others do rely on it as well - e.g.
sys/dev/ata). Either attach() functions should always be called with
KERNEL_LOCK(), or for checks as above we have to accept that when
cold, locking is not fully up yet and lock checks should be bypassed.

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 ans d'experience feront toujours la difference
--


re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread matthew green

   If the driver's attach is called after cold (e.g. after a detach/rescan
   of the pci bus), the driver's attach should be called with KERNEL_LOCK
   held, or bad things may happen when interrupts are enabled for this 
   driver.
  
  there should be no reliance on cold being set for normal driver
  attach.  it might be a module loaded after boot.  how ever the
  driver is loaded, it will need to work without cold being set.
 
 If it's a module laoded after boot, and the driver is not SMP-safe,
 its attach function has to be called with KERNEL_LOCK held. Or you may
 have problems when this driver starts receiving interrupts before its
 attach function has returned (which is typically the case, interrupt enable
 occurs somewhere in _attach()).

i don't see how this matters.  drivers should never enable
interrupts if they're not ready to service them.

if this is happening, then it is a driver bug.

  in my mind, the scsi code should try to run regardless of the value
  of cold and that's why i replied above.
  
   What kind of senario do you have in mind ?
  
  modules, as above.  or simply drvctl -d / -r.  IMO, only platform
  specific code really should depend upon cold.  scsipi, as a
  relatively high level subsystem, should not.
 
 But I don't think it's the job of non-SMP-safe drivers to make sure scsipi
 is called with KERNEL_LOCK held (and scsipi may be the only subsystem
 to check for KERNEL_LOCK(), but others do rely on it as well - e.g.
 sys/dev/ata). Either attach() functions should always be called with
 KERNEL_LOCK(), or for checks as above we have to accept that when
 cold, locking is not fully up yet and lock checks should be bypassed.

the lack of checks elsewhere is not the point here.  we should add
them to dev/ata (and more) as well, if they depend upon these.

currently, autoconf isn't run with the kernel lock held at any point
and i don't think that saying that we should hold it during autoconf
is a good change.  we should fix the cases that trigger problems when
they appear, and it sounds like you've hit one in your new driver.
holding the kernel lock here would be a major step backwards.

scsipi depends upon kernel lock.  thus, callers should arrange for it
to be held.  since these are drivers calling, it is upto each driver
to do this currently.  this is what we've done with other problems we
have hit as they've arrived.

it may be that we have this problem in a large number of drivers but
these KASSERT()s have been in place for a long time now and, until
your change, we've fixed it at the caller level not worked around it
at the scsipi level.

your change can't work with the drivers that are broken without the
major regression of holding kernel lock over autoconfig, or, never
ever having them be modules or detached/reattached.

please make drvctl -d and rescan work for your drivers.  this will
make the checks against cold go away.


.mrg.


Re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread Manuel Bouyer
On Thu, Apr 19, 2012 at 07:00:56PM +1000, matthew green wrote:
 
If the driver's attach is called after cold (e.g. after a detach/rescan
of the pci bus), the driver's attach should be called with KERNEL_LOCK
held, or bad things may happen when interrupts are enabled for this 
driver.
   
   there should be no reliance on cold being set for normal driver
   attach.  it might be a module loaded after boot.  how ever the
   driver is loaded, it will need to work without cold being set.
  
  If it's a module laoded after boot, and the driver is not SMP-safe,
  its attach function has to be called with KERNEL_LOCK held. Or you may
  have problems when this driver starts receiving interrupts before its
  attach function has returned (which is typically the case, interrupt enable
  occurs somewhere in _attach()).
 
 i don't see how this matters.  drivers should never enable
 interrupts if they're not ready to service them.
 
 if this is happening, then it is a driver bug.

It looks like we're talking about different things.
I'm talking about this senario, with a non-smp-safe driver.

foo_attach()
{
do software and hardware initialisation();

s = splbio();
enable_interrupt(foo_intr());
finish_init_and_attach_children();
splx(s);
return;
}

foo_intr()
{

}

In this context, the driver enables interrupt, but don't expect to have its
interrupt handler called before the splx(), because
finish_init_and_attach_children() is running at IPL_BIO. But if foo_attach()
is not called with KERNEL_LOCK held, another CPU may call foo_intr(),
breaking the driver's spl protection.

And I don't think it's the driver's business to make sure KERNEL_LOCK is
held in it's attach routine: it's a non-MPSAFE driver and it should not
even have to know about KERNEL_LOCK.

If attach routines can be called without KERNEL_LOCK held after cold,
then all non-MPSAFE drivers (i.e. almost all of them) needs to be fixed.


 
   in my mind, the scsi code should try to run regardless of the value
   of cold and that's why i replied above.
   
What kind of senario do you have in mind ?
   
   modules, as above.  or simply drvctl -d / -r.  IMO, only platform
   specific code really should depend upon cold.  scsipi, as a
   relatively high level subsystem, should not.
  
  But I don't think it's the job of non-SMP-safe drivers to make sure scsipi
  is called with KERNEL_LOCK held (and scsipi may be the only subsystem
  to check for KERNEL_LOCK(), but others do rely on it as well - e.g.
  sys/dev/ata). Either attach() functions should always be called with
  KERNEL_LOCK(), or for checks as above we have to accept that when
  cold, locking is not fully up yet and lock checks should be bypassed.
 
 the lack of checks elsewhere is not the point here.  we should add
 them to dev/ata (and more) as well, if they depend upon these.
 
 currently, autoconf isn't run with the kernel lock held at any point
 and i don't think that saying that we should hold it during autoconf
 is a good change.  we should fix the cases that trigger problems when
 they appear, and it sounds like you've hit one in your new driver.
 holding the kernel lock here would be a major step backwards.
 
 scsipi depends upon kernel lock.  thus, callers should arrange for it
 to be held.  since these are drivers calling, it is upto each driver
 to do this currently.  this is what we've done with other problems we
 have hit as they've arrived.

if the caller is MPSAFE, I agree. Non-MPSAFE driver should not have to care
about KERNEL_LOCK (precisely because they are not MPSAFE).
The basic assumption about kernel locking was that that driver and subsystems
using spl locking would continue to work as it, without changes. If we
change this assumption a lot of code needs to be changed (that is, almost
all our drivers).

 
 it may be that we have this problem in a large number of drivers but
 these KASSERT()s have been in place for a long time now and, until
 your change, we've fixed it at the caller level not worked around it
 at the scsipi level.

AFAIK the only drivers that has been changed are ncr53c9x and USB, and they
are MP-safe (they already use mutex instead of spl locking).

 
 your change can't work with the drivers that are broken without the
 major regression of holding kernel lock over autoconfig, or, never
 ever having them be modules or detached/reattached.

Maybe non-MPSAFE drivers can't be modules or detached/reattached, indeed.
But that's a problem with autoconf not dealing with non-MPSAFE drivers,
not the driver themselve (they were working before the MP changes, they
should continue to work as-is). If you want autoconf to not run
under the KERNEL_LOCK when not needed, then is has to be extended so that
MPSAFE drivers can register themselves as MPSAFE, so that non-MPSAFE
drivers are still called with KERNEL_LOCK.

 
 please make drvctl -d and rescan work for your drivers.  this will
 make the checks against cold go away.

I don't 

re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread matthew green

  scsipi depends upon kernel lock.  thus, callers should arrange for it
  to be held.  since these are drivers calling, it is upto each driver
  to do this currently.  this is what we've done with other problems we
  have hit as they've arrived.
 
 if the caller is MPSAFE, I agree. Non-MPSAFE driver should not have to care
 about KERNEL_LOCK (precisely because they are not MPSAFE).
 The basic assumption about kernel locking was that that driver and subsystems
 using spl locking would continue to work as it, without changes. If we
 change this assumption a lot of code needs to be changed (that is, almost
 all our drivers).

we've never had autoconfig run with the kernel lock AFAICT, so this
assumption has never been true.  the thing that does work is that
IPL_VM using drivers will automatically get kernel locked in their
interrupt handlers.

  these KASSERT()s have been in place for a long time now and, until
  your change, we've fixed it at the caller level not worked around it
  at the scsipi level.
 
 AFAIK the only drivers that has been changed are ncr53c9x and USB, and they
 are MP-safe (they already use mutex instead of spl locking).

usb is not MP safe in -current, only on the usbmp branch.  all those
drivers take care of taking the kernel lock when calling into the
scsipi code since they almost *always* run when not cold anyway.

infact, the asserts were added in -current due to the usbmp work
and the understanding that they would advance the rest of the system.

  your change can't work with the drivers that are broken without the
  major regression of holding kernel lock over autoconfig, or, never
  ever having them be modules or detached/reattached.
 
 Maybe non-MPSAFE drivers can't be modules or detached/reattached, indeed.

that's not true.  eg, radeondrm works just fine being MP safe or not.
(it actually is MP safe, though it doesn't announce this to any of the
rest of the system currently.)

 But that's a problem with autoconf not dealing with non-MPSAFE drivers,
 not the driver themselve (they were working before the MP changes, they
 should continue to work as-is). If you want autoconf to not run
 under the KERNEL_LOCK when not needed, then is has to be extended so that
 MPSAFE drivers can register themselves as MPSAFE, so that non-MPSAFE
 drivers are still called with KERNEL_LOCK.

it seems the wrong answer to me.  instead of fixing the one or
two drivers you've found problems in, you want to add more
infrastructure (IMO clutter) to the system and patch all the
drivers that *are* ok.  considering that this problem is
very specific to scsipi, i don't think that is a viable
solution.

  please make drvctl -d and rescan work for your drivers.  this will
  make the checks against cold go away.
 
 I don't want to touch each driver in our tree for drvctl -d/rescan to work.

as in aside, you should make drvctl -d work for any drivers
you maintain.  everyone (tm) should.  irrespective of modules
or otherwise, the end result of it working tends to be bug
fixes.

 If we want this to work for non-MPSAFE driver, then this has to be fixed
 in autoconf (if the problem really exists, which I've not looked at yet).

i don't believe there is a problem in autoconfig.  there is a
problem with some (probably most) scsi drivers here, but that
is about it.  it is true that the contracts have changed, but
this is progress after all.  the assert exists solely to find
places that are wrong and fix them.  your change hides this.

this isn't about SMPifing a whole driver, but adding a call to
taken/release the kernel lock around a few (often just one)
calls in a driver.  it has never been a difficult change to
make.  if you have drivers that needs it, i am happy to send
patches for testing.


.mrg.


Re: CVS commit: src/sys/dev/scsipi

2012-04-19 Thread Warner Losh

On Apr 19, 2012, at 6:03 PM, matthew green wrote:
 But that's a problem with autoconf not dealing with non-MPSAFE drivers,
 not the driver themselve (they were working before the MP changes, they
 should continue to work as-is). If you want autoconf to not run
 under the KERNEL_LOCK when not needed, then is has to be extended so that
 MPSAFE drivers can register themselves as MPSAFE, so that non-MPSAFE
 drivers are still called with KERNEL_LOCK.
 
 it seems the wrong answer to me.  instead of fixing the one or
 two drivers you've found problems in, you want to add more
 infrastructure (IMO clutter) to the system and patch all the
 drivers that *are* ok.  considering that this problem is
 very specific to scsipi, i don't think that is a viable
 solution.

FreeBSD started out with MPSAFE and then went to NEEDS_GIANT as flags for the 
drivers.  I'm with Matthew: patch the drivers that are broken (or not known to 
be safe) and then you have a convenient thing to grep for when you want to 
expand the drivers that are safe.  Much better that way, and it turned out to 
be a big win in FreeBSD when we went from opt-in MPSAFE to out-out...

Warner