Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-15 Thread Paul Goyette

Updating the status on these changes:

One comment questioned whether or not a version bump was required, and 
I've more-or-less convinced myself it is at least desired.  While 
properly-working modules from the pre-update will continue to work on a 
post-update kernel, the reverse is not necessarily true.  A module 
written for a post-update kernel and which takes advantage of the 
changed locking protocol will fail on a pre-update kernel.


Another comment suggested that the name of newly-created file 
kern/kern_cfg.c should be changed to more closely match its contents, 
while an earlier comment had suggested generalizing the filename!  I've 
taken the more-specific path and the file is now called kern_cfglock.c
If other stuff gets added to this file later, we can come up with a more 
generic name at that time.


Some additional changes have been included in this latest set of diffs. 
Mostly, there were some KASSERT()s sprinkled throughout that tried to 
ensure that the code had not been called recursively.  Since recursion 
is now explicitly supported, all of the "module_active" stuff has been 
reworked.


Additionally, there was a statically-allocated list of "pending" modules 
that needed to have their initialization completed.  This has been 
changed to keep multiple stack-allocated lists, one for each depth.



I'm planning to commit these changes next weekend, unless there are any 
significant objections.  If anyone else needs to coordinate changes in 
order to "ride-the-version-bump", let me know.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-08 Thread Paul Goyette

On Sun, 8 Aug 2010, Mindaugas Rasiukevicius wrote:


Paul Goyette  wrote:

Following up on all the various comments from jmcneil@, pooka@, and
rmind@, I've attached a revised set of diffs.  The most significant
change between this and the previous revision is the separation of the
kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This
allows inclusion of the kernel code directly into rump, rather than
having to re-implement it in rump/klock.c


Please use kern_cflock.c or something like that, as kern_cfg.c is too
generic for such specific facility.


My original plan was to use kern_cfglock.c but pooka@ suggested using a 
more-generic name "in case we have more stuff in the future".  :)


I personally prefer the more-specific name myself, but I cannot make you 
both happy!



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-08 Thread Mindaugas Rasiukevicius
Paul Goyette  wrote:
> Following up on all the various comments from jmcneil@, pooka@, and 
> rmind@, I've attached a revised set of diffs.  The most significant 
> change between this and the previous revision is the separation of the 
> kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This 
> allows inclusion of the kernel code directly into rump, rather than 
> having to re-implement it in rump/klock.c

Please use kern_cflock.c or something like that, as kern_cfg.c is too
generic for such specific facility.

-- 
Mindaugas


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to work on 
this!  :)


Attached is the latest version of this change.  For simplicity, I have broken 
the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of the 
locking in sys/kern_module.c has been updated to use this instead of the 
internal module_lock mutex.  Additionally, the module_autoload() routine now 
does its own locking, rather than requiring its caller to do it.  And the 
description of locking in the module(9) man page is updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the various 
xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other bits 
and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I would 
appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify that 
recursion actually works!


Following up on all the various comments from jmcneil@, pooka@, and 
rmind@, I've attached a revised set of diffs.  The most significant 
change between this and the previous revision is the separation of the 
kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This 
allows inclusion of the kernel code directly into rump, rather than 
having to re-implement it in rump/klock.c




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Fri, 6 Aug 2010, Mindaugas Rasiukevicius wrote:


Hello,

Paul Goyette  wrote:


Well, it was a slow day at the office today, so I found some time to
work on this!  :)

Attached is the latest version of this change.  For simplicity, I have
broken the patches up into five separate groups:

<...>


Few comments on the patch:


-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */


Why bumping?


Seemed like the right thing to do, since the locking protocol is 
changing, as well as the global module_lock mutex disappearing.  But I 
really can't hink of any ill effects, unless someone has a module that 
suddenly fails link-and-load because it references module_lock.  It's 
not very likely.


I'll defer, or I'll wait to ride someone else's bump just to be safe.




+ * of of kernel functionality, such as device configuration and loading


One "of" too much.


Fixed.


+void
+kernconfig_lock(void)
+{
+   lwp_t   *l;
+
+   /*
+* OK to check this unlocked, since it could only be set to
+* curlwp by the current thread itself, and not by an interrupt
+* or any other LWP.
+*/


Please add KASSERT(!cpu_intr_p());


Done.  I was worried about this, and forgot to ask.  Thanks for picking 
it up.





+   l = curlwp;
+   if (kernconfig_lwp == l) {
+   kernconfig_recurse++;
+   KASSERT(kernconfig_recurse != 0);


Such (++ then != 0) assert is a bit useless.  How about:

KASSERT(kernconfig_recurse > 1);


The intent here was to ensure that we didn't overflow the counter back 
to zero.  It might have made sense when the counter was a uint8_t, but 
probably not with an int!  It can probably be removed completely.





+   } else {
+   mutex_enter(&kernconfig_mutex);
+   kernconfig_lwp = l;
+   kernconfig_recurse = 1;
+   }
+}


Somebody will confuse l with 1. :)


This came straight from ad@'s suggestions, and there seems to be a lot 
of precedent in the sys/kern/ code to use l.  But I will select a better 
variable name.





+   if (--kernconfig_recurse == 0) {
+   kernconfig_lwp = 0;
+   mutex_exit(&kernconfig_mutex);


kernconfig_lwp = NULL;


Ooops!  :)




+bool
+kernconfig_is_held(void)
+{
+
+   return (mutex_owned(&kernconfig_mutex));


No need for brackets around function.


Got it.




+   KASSERT(kernconfig_is_held);


But missing brackets in few cases.


Yeah, you're the second one to point that out.  I fixed them once, not 
sure how they crept back in.  But they've been fixed again.



Otherwise seems OK to me.  Thanks.



Thank you for the detailed look.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Mindaugas Rasiukevicius
Hello,

Paul Goyette  wrote:
> 
> Well, it was a slow day at the office today, so I found some time to 
> work on this!  :)
> 
> Attached is the latest version of this change.  For simplicity, I have 
> broken the patches up into five separate groups:
> 
> <...>

Few comments on the patch:

> -#define  __NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
> +#define  __NetBSD_Version__  599003900   /* NetBSD 5.99.39 */

Why bumping?

> + * of of kernel functionality, such as device configuration and loading

One "of" too much.

> +void
> +kernconfig_lock(void)
> +{
> + lwp_t   *l;
> +
> + /*
> +  * OK to check this unlocked, since it could only be set to
> +  * curlwp by the current thread itself, and not by an interrupt
> +  * or any other LWP.
> +  */

Please add KASSERT(!cpu_intr_p());

> + l = curlwp;
> + if (kernconfig_lwp == l) {
> + kernconfig_recurse++;
> + KASSERT(kernconfig_recurse != 0);

Such (++ then != 0) assert is a bit useless.  How about:

KASSERT(kernconfig_recurse > 1);

> + } else {
> + mutex_enter(&kernconfig_mutex);
> + kernconfig_lwp = l;
> + kernconfig_recurse = 1;
> + }
> +}

Somebody will confuse l with 1. :)

> + if (--kernconfig_recurse == 0) {
> + kernconfig_lwp = 0;
> + mutex_exit(&kernconfig_mutex);

kernconfig_lwp = NULL;

> +bool
> +kernconfig_is_held(void)
> +{
> +
> + return (mutex_owned(&kernconfig_mutex));

No need for brackets around function.

> + KASSERT(kernconfig_is_held);

But missing brackets in few cases.

Otherwise seems OK to me.  Thanks.

-- 
Mindaugas


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to 
work on this!  :)


Attached is the latest version of this change.  For simplicity, I have 
broken the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of 
the locking in sys/kern_module.c has been updated to use this instead of 
the internal module_lock mutex.  Additionally, the module_autoload() 
routine now does its own locking, rather than requiring its caller to do 
it.  And the description of locking in the module(9) man page is 
updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the 
various xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other 
bits and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I 
would appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify 
that recursion actually works!




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/sys/param.h
===
RCS file: /cvsroot/src/sys/sys/param.h,v
retrieving revision 1.373
diff -u -p -r1.373 param.h
--- sys/sys/param.h 28 Jul 2010 11:03:47 -  1.373
+++ sys/sys/param.h 6 Aug 2010 00:46:05 -
@@ -63,7 +63,7 @@
  * 2.99.9  (299000900)
  */
 
-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */
 
 #define __NetBSD_Prereq__(M,m,p) (M) * 1) + \
 (m) * 100) + (p) * 100) <= __NetBSD_Version__)
Index: sys/sys/systm.h
===
RCS file: /cvsroot/src/sys/sys/systm.h,v
retrieving revision 1.239
diff -u -p -r1.239 systm.h
--- sys/sys/systm.h 31 Jan 2010 02:04:43 -  1.239
+++ sys/sys/systm.h 6 Aug 2010 00:46:06 -
@@ -492,6 +492,11 @@ void   kernel_lock_init(void);
 void   _kernel_lock(int);
 void   _kernel_unlock(int, int *);
 
+void   kernconfig_lock_init(void);
+void   kernconfig_lock(void);
+void   kernconfig_unlock(void);
+bool   kernconfig_is_held(void);
+
 #if defined(MULTIPROCESSOR) || defined(_MODULE)
 #defineKERNEL_LOCK(count, lwp) \
 do {   \
Index: sys/sys/module.h
===
RCS file: /cvsroot/src/sys/sys/module.h,v
retrieving revision 1.24
diff -u -p -r1.24 module.h
--- sys/sys/module.h26 Jun 2010 07:23:57 -  1.24
+++ sys/sys/module.h6 Aug 2010 00:46:07 -
@@ -115,7 +115,6 @@ __link_set_add_rodata(modules, name##_mo
 TAILQ_HEAD(modlist, module);
 
 extern struct vm_map   *module_map;
-extern kmutex_tmodule_lock;
 extern u_int   module_count;
 extern u_int   module_builtinlist;
 extern struct modlist  module_list;
Index: sys/kern/kern_lock.c
===
RCS file: /cvsroot/src/sys/kern/kern_lock.c,v
retrieving revision 1.150
diff -u -p -r1.150 kern_lock.c
--- sys/kern/kern_lock.c20 Dec 2009 20:42:23 -  1.150
+++ sys/kern/kern_lock.c6 Aug 2010 00:46:07 -
@@ -39,6 +39,7 @@ __KERNEL_RCSID(0, "$NetBSD: kern_lock.c,
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -56,6 +57,10 @@ bool kernel_lock_dodebug;
 __cpu_simple_lock_t kernel_lock[CACHE_LINE_SIZE / sizeof(__cpu_simple_lock_t)]
 __aligned(CACHE_LINE_SIZE);
 
+static kmutex_t kernconfig_mutex;
+static lwp_t *kernconfig_lwp;
+static int kernconfig_recurse;
+
 void
 assert_sleepable(void)
 {
@@ -306,3 +311,59 @@ _kernel_unlock(int nlocks, int *countp)
if (countp != NULL)
*countp = olocks;
 }
+
+/*
+ * Functions for manipulating the kernel configuration lock.  This
+ * recursive lock should be used to protect all additions and removals
+ * of of kernel functionality, such as device configuration and loading
+ * of modular kernel components.
+ */
+
+void
+kernconfig_lock_init(void)
+{
+
+   mutex_init(&kernconfig_mutex, MUTEX_DEFAULT, IPL_NONE);
+   kernconfig_lwp = NULL;

Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:

Attached is the latest version of this change.  For simplicity, I have broken 
the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of the 
locking in sys/kern_module.c has been updated to use this instead of the 
internal module_lock mutex.  Additionally, the module_autoload() routine now 
does its own locking, rather than requiring its caller to do it.  And the 
description of locking in the module(9) man page is updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the various 
xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other bits 
and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I would 
appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify that 
recursion actually works!


I should add that I'm actually running with these changes on one of my 
local machines with no visible problems, and the above-added test case 
works.


The machine has a "AMD Phenom(tm) 9600 Quad-Core Processor" at 2.3MHz.

{103} modstat
NAME CLASS  SOURCE REFS  SIZE REQUIRES
bpf  driver builtin0 --
compat   misc   builtin0 --
coredump misc   filesys1 3402 -
exec_elf64   misc   filesys0 7960 coredump
exec_script  misc   filesys0 1256 -
ffs  vfsboot   0 153316   -
kernfs   vfsfilesys0 12914-
nfs  vfsfilesys0 159471   -
ptyfsvfsfilesys0 10118-
secmodel_bsd44   secmodel   builtin0 -suser,securelevel
securelevel  secmodel   builtin1 --
susersecmodel   builtin1 --
tmpfsvfsfilesys0 19513-
wapblvfsbuiltin0 --
{104}



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Paul Goyette

OK, got it.

I'll have another set of patches in a few days.

On Thu, 5 Aug 2010, Andrew Doran wrote:


On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote:

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(&sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
   * as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(&sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is
certainly a much larger task than dealing with the immediate problem
- modules which want to load other modules.

So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described


For module lock the wait time isn't really a big concern.  Module load
and unload is a heavyweight operation so the wait time is expected.
There is nothing intrinsically wrong with holding a mutex for "a long time"
provided that you've got a good handle on the potential side effects.
It's a different story for other locks in the system like say proc_lock,
which can't be held for a long time without disrupting the user experience
and/or deadlocking the system.


   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?


condvar gives you another lock in a roundabout way, without priority
inheritance to help move things along if something has clogged the
pipework. :-).  So I don't see benefit over a mutex.


2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this "ultimate" solution?  I think
   it would be a long time before we could find and resolve all of the
   "issues" that might be created in the various "threaded" situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those "situations", and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.


I'm not suggesting that you need to tackle autoconfiguration locking at
the same time.. What I'm suggesting is to put the primitives in place
(say in kern_lock.c or something) and then use these for the benefit of
the modules code. It would provide a statment of direction and avoid
re-hashing things later when somebody decides to tackle autoconf.
Sorry for being unclear.


3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.


Ok well I think the simplest course of action would be to replace the
mutex_owned() with a global variable.  It would do much the same thing
but for the price of 4 or 8 bytes there is no question of contradicting
the mantra. So like:

/*
* Ok to check this unlocked, as for the value to equal curlwp it must
* have been set by the current thread of execution (i.e. not interrupt
* context or another LWP).
*/
l = curlwp;
if (sysconfig_lwp == l) {
/* recurse */
} else {
mutex_enter(&sysconfig_lock);
sysconfig_lwp = l;
/* etc etc */
}




--

Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Andrew Doran
On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote:
> On Wed, 4 Aug 2010, Andrew Doran wrote:
> 
> >Sorry for not replying sooner.
> >
> >Please don't add a generic recursive lock facility, it would be abused.
> 
> Yeah!  My current patches have no new generic facilities at all.
> 
> >I'd like something roughly like the following.  I think this should
> >also cover major configuration tasks such as device attach and detach,
> >so it wouldn't really be module specific.  The naming is a poor suggestion.
> >
> >void
> >sysconfig_lock(void)
> >{
> >
> > if (mutex_owned(&sysconfig_mutex)) {
> > /*
> >  * At this point it's OK for the thread to hold other
> >  * locks, since the system configuration lock was the
> >  * first taken on the way in.
> >  */
> > sysconfig_recurse++;
> > return;
> > }
> > /*
> >  * I don't remember the correct argument sequence to the
> >  * following, but basically we need to assert that NO locks,
> >  * sleep or spin, other than kernel_lock are held,
> >* as the system configuration lock would be the outermost
> >  * lock in the system.
> >  */
> > LOCKDEBUG_BARRIER(...);
> > mutex_enter(&sysconfig_mutex);
> > KASSERT(sysconfig_recurse == 0);
> >}
> >
> >For the boot autoconfig path where you have things being driven by the
> >kernel, this would work OK.  Situations where there's a kthread or
> >something attaching and detaching devices on the fly are a bit different,
> >since they likely have local atomicity requirements (need to take device
> >private locks).  There you'd probably need the caller to take the lock.
> >USB comes to mind (along with some hardware RAID drivers that do hotplug
> >attach/detach).
> >
> >Thoughts?
> 
> Well, at first glance, this proposal makes sense.  But it is
> certainly a much larger task than dealing with the immediate problem
> - modules which want to load other modules.
> 
> So, I really have three questions:
> 
> 1. In keeping with earlier concerns about holding locks for long periods
>of time (eeh and mrg both commented on this), the approach described

For module lock the wait time isn't really a big concern.  Module load
and unload is a heavyweight operation so the wait time is expected.
There is nothing intrinsically wrong with holding a mutex for "a long time"
provided that you've got a good handle on the potential side effects.
It's a different story for other locks in the system like say proc_lock,
which can't be held for a long time without disrupting the user experience
and/or deadlocking the system.

>above would seem to have an even longer hold-time than the current
>module_lock.  Would not something similar to haad's condvar approach
>be appropriate for this proposal, as well as for the more-limited-in-
>scope recursive module lock?

condvar gives you another lock in a roundabout way, without priority
inheritance to help move things along if something has clogged the
pipework. :-).  So I don't see benefit over a mutex.

> 2. Would it be reasonable to solve the immediate problem as previously
>proposed, rather than waiting for this "ultimate" solution?  I think
>it would be a long time before we could find and resolve all of the
>"issues" that might be created in the various "threaded" situations
>that may exist.  I know I'm certainly not sufficiently qualified to
>identify all those "situations", and I also know that I don't have
>sufficient available work-hours to do this in any reasonable time-
>frame.
> 
>I'd still like to move forward with the most recent solution to the
>module_lock problem.

I'm not suggesting that you need to tackle autoconfiguration locking at
the same time.. What I'm suggesting is to put the primitives in place
(say in kern_lock.c or something) and then use these for the benefit of
the modules code. It would provide a statment of direction and avoid
re-hashing things later when somebody decides to tackle autoconf.
Sorry for being unclear.

> 3. Since we're still technically abusing the mutex(9) man-page caveat
>about using mutex_owned() for making locking decisions, it would
>be very much appreciated (at least by myself) to have an explanation
>of WHY it is OK in this case to do it here, but not somewhere else.

Ok well I think the simplest course of action would be to replace the
mutex_owned() with a global variable.  It would do much the same thing
but for the price of 4 or 8 bytes there is no question of contradicting
the mantra. So like:

/*
 * Ok to check this unlocked, as for the value to equal curlwp it must
 * have been set by the current thread of execution (i.e. not interrupt
 * context or another LWP).
 */
l = curlwp;
if (sysconfig_lwp == l) { 
/* recurse */
} else {
mutex_enter(&sysconfig_lock);
sysconfig_lwp = l;
/* etc etc */
}



Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-04 Thread Paul Goyette

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(&sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
* as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(&sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is certainly a 
much larger task than dealing with the immediate problem - modules which 
want to load other modules.


So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described
   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?

2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this "ultimate" solution?  I think
   it would be a long time before we could find and resolve all of the
   "issues" that might be created in the various "threaded" situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those "situations", and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.

3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-