Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-15 Thread Paul Goyette

Updating the status on these changes:

One comment questioned whether or not a version bump was required, and 
I've more-or-less convinced myself it is at least desired.  While 
properly-working modules from the pre-update will continue to work on a 
post-update kernel, the reverse is not necessarily true.  A module 
written for a post-update kernel and which takes advantage of the 
changed locking protocol will fail on a pre-update kernel.


Another comment suggested that the name of newly-created file 
kern/kern_cfg.c should be changed to more closely match its contents, 
while an earlier comment had suggested generalizing the filename!  I've 
taken the more-specific path and the file is now called kern_cfglock.c
If other stuff gets added to this file later, we can come up with a more 
generic name at that time.


Some additional changes have been included in this latest set of diffs. 
Mostly, there were some KASSERT()s sprinkled throughout that tried to 
ensure that the code had not been called recursively.  Since recursion 
is now explicitly supported, all of the "module_active" stuff has been 
reworked.


Additionally, there was a statically-allocated list of "pending" modules 
that needed to have their initialization completed.  This has been 
changed to keep multiple stack-allocated lists, one for each depth.



I'm planning to commit these changes next weekend, unless there are any 
significant objections.  If anyone else needs to coordinate changes in 
order to "ride-the-version-bump", let me know.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-08 Thread Paul Goyette

On Sun, 8 Aug 2010, Mindaugas Rasiukevicius wrote:


Paul Goyette  wrote:

Following up on all the various comments from jmcneil@, pooka@, and
rmind@, I've attached a revised set of diffs.  The most significant
change between this and the previous revision is the separation of the
kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This
allows inclusion of the kernel code directly into rump, rather than
having to re-implement it in rump/klock.c


Please use kern_cflock.c or something like that, as kern_cfg.c is too
generic for such specific facility.


My original plan was to use kern_cfglock.c but pooka@ suggested using a 
more-generic name "in case we have more stuff in the future".  :)


I personally prefer the more-specific name myself, but I cannot make you 
both happy!



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-08 Thread Mindaugas Rasiukevicius
Paul Goyette  wrote:
> Following up on all the various comments from jmcneil@, pooka@, and 
> rmind@, I've attached a revised set of diffs.  The most significant 
> change between this and the previous revision is the separation of the 
> kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This 
> allows inclusion of the kernel code directly into rump, rather than 
> having to re-implement it in rump/klock.c

Please use kern_cflock.c or something like that, as kern_cfg.c is too
generic for such specific facility.

-- 
Mindaugas


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to work on 
this!  :)


Attached is the latest version of this change.  For simplicity, I have broken 
the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of the 
locking in sys/kern_module.c has been updated to use this instead of the 
internal module_lock mutex.  Additionally, the module_autoload() routine now 
does its own locking, rather than requiring its caller to do it.  And the 
description of locking in the module(9) man page is updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the various 
xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other bits 
and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I would 
appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify that 
recursion actually works!


Following up on all the various comments from jmcneil@, pooka@, and 
rmind@, I've attached a revised set of diffs.  The most significant 
change between this and the previous revision is the separation of the 
kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This 
allows inclusion of the kernel code directly into rump, rather than 
having to re-implement it in rump/klock.c




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Fri, 6 Aug 2010, Mindaugas Rasiukevicius wrote:


Hello,

Paul Goyette  wrote:


Well, it was a slow day at the office today, so I found some time to
work on this!  :)

Attached is the latest version of this change.  For simplicity, I have
broken the patches up into five separate groups:

<...>


Few comments on the patch:


-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */


Why bumping?


Seemed like the right thing to do, since the locking protocol is 
changing, as well as the global module_lock mutex disappearing.  But I 
really can't hink of any ill effects, unless someone has a module that 
suddenly fails link-and-load because it references module_lock.  It's 
not very likely.


I'll defer, or I'll wait to ride someone else's bump just to be safe.




+ * of of kernel functionality, such as device configuration and loading


One "of" too much.


Fixed.


+void
+kernconfig_lock(void)
+{
+   lwp_t   *l;
+
+   /*
+* OK to check this unlocked, since it could only be set to
+* curlwp by the current thread itself, and not by an interrupt
+* or any other LWP.
+*/


Please add KASSERT(!cpu_intr_p());


Done.  I was worried about this, and forgot to ask.  Thanks for picking 
it up.





+   l = curlwp;
+   if (kernconfig_lwp == l) {
+   kernconfig_recurse++;
+   KASSERT(kernconfig_recurse != 0);


Such (++ then != 0) assert is a bit useless.  How about:

KASSERT(kernconfig_recurse > 1);


The intent here was to ensure that we didn't overflow the counter back 
to zero.  It might have made sense when the counter was a uint8_t, but 
probably not with an int!  It can probably be removed completely.





+   } else {
+   mutex_enter(&kernconfig_mutex);
+   kernconfig_lwp = l;
+   kernconfig_recurse = 1;
+   }
+}


Somebody will confuse l with 1. :)


This came straight from ad@'s suggestions, and there seems to be a lot 
of precedent in the sys/kern/ code to use l.  But I will select a better 
variable name.





+   if (--kernconfig_recurse == 0) {
+   kernconfig_lwp = 0;
+   mutex_exit(&kernconfig_mutex);


kernconfig_lwp = NULL;


Ooops!  :)




+bool
+kernconfig_is_held(void)
+{
+
+   return (mutex_owned(&kernconfig_mutex));


No need for brackets around function.


Got it.




+   KASSERT(kernconfig_is_held);


But missing brackets in few cases.


Yeah, you're the second one to point that out.  I fixed them once, not 
sure how they crept back in.  But they've been fixed again.



Otherwise seems OK to me.  Thanks.



Thank you for the detailed look.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Mindaugas Rasiukevicius
Hello,

Paul Goyette  wrote:
> 
> Well, it was a slow day at the office today, so I found some time to 
> work on this!  :)
> 
> Attached is the latest version of this change.  For simplicity, I have 
> broken the patches up into five separate groups:
> 
> <...>

Few comments on the patch:

> -#define  __NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
> +#define  __NetBSD_Version__  599003900   /* NetBSD 5.99.39 */

Why bumping?

> + * of of kernel functionality, such as device configuration and loading

One "of" too much.

> +void
> +kernconfig_lock(void)
> +{
> + lwp_t   *l;
> +
> + /*
> +  * OK to check this unlocked, since it could only be set to
> +  * curlwp by the current thread itself, and not by an interrupt
> +  * or any other LWP.
> +  */

Please add KASSERT(!cpu_intr_p());

> + l = curlwp;
> + if (kernconfig_lwp == l) {
> + kernconfig_recurse++;
> + KASSERT(kernconfig_recurse != 0);

Such (++ then != 0) assert is a bit useless.  How about:

KASSERT(kernconfig_recurse > 1);

> + } else {
> + mutex_enter(&kernconfig_mutex);
> + kernconfig_lwp = l;
> + kernconfig_recurse = 1;
> + }
> +}

Somebody will confuse l with 1. :)

> + if (--kernconfig_recurse == 0) {
> + kernconfig_lwp = 0;
> + mutex_exit(&kernconfig_mutex);

kernconfig_lwp = NULL;

> +bool
> +kernconfig_is_held(void)
> +{
> +
> + return (mutex_owned(&kernconfig_mutex));

No need for brackets around function.

> + KASSERT(kernconfig_is_held);

But missing brackets in few cases.

Otherwise seems OK to me.  Thanks.

-- 
Mindaugas


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to 
work on this!  :)


Attached is the latest version of this change.  For simplicity, I have 
broken the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of 
the locking in sys/kern_module.c has been updated to use this instead of 
the internal module_lock mutex.  Additionally, the module_autoload() 
routine now does its own locking, rather than requiring its caller to do 
it.  And the description of locking in the module(9) man page is 
updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the 
various xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other 
bits and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I 
would appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify 
that recursion actually works!




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/sys/param.h
===
RCS file: /cvsroot/src/sys/sys/param.h,v
retrieving revision 1.373
diff -u -p -r1.373 param.h
--- sys/sys/param.h 28 Jul 2010 11:03:47 -  1.373
+++ sys/sys/param.h 6 Aug 2010 00:46:05 -
@@ -63,7 +63,7 @@
  * 2.99.9  (299000900)
  */
 
-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */
 
 #define __NetBSD_Prereq__(M,m,p) (M) * 1) + \
 (m) * 100) + (p) * 100) <= __NetBSD_Version__)
Index: sys/sys/systm.h
===
RCS file: /cvsroot/src/sys/sys/systm.h,v
retrieving revision 1.239
diff -u -p -r1.239 systm.h
--- sys/sys/systm.h 31 Jan 2010 02:04:43 -  1.239
+++ sys/sys/systm.h 6 Aug 2010 00:46:06 -
@@ -492,6 +492,11 @@ void   kernel_lock_init(void);
 void   _kernel_lock(int);
 void   _kernel_unlock(int, int *);
 
+void   kernconfig_lock_init(void);
+void   kernconfig_lock(void);
+void   kernconfig_unlock(void);
+bool   kernconfig_is_held(void);
+
 #if defined(MULTIPROCESSOR) || defined(_MODULE)
 #defineKERNEL_LOCK(count, lwp) \
 do {   \
Index: sys/sys/module.h
===
RCS file: /cvsroot/src/sys/sys/module.h,v
retrieving revision 1.24
diff -u -p -r1.24 module.h
--- sys/sys/module.h26 Jun 2010 07:23:57 -  1.24
+++ sys/sys/module.h6 Aug 2010 00:46:07 -
@@ -115,7 +115,6 @@ __link_set_add_rodata(modules, name##_mo
 TAILQ_HEAD(modlist, module);
 
 extern struct vm_map   *module_map;
-extern kmutex_tmodule_lock;
 extern u_int   module_count;
 extern u_int   module_builtinlist;
 extern struct modlist  module_list;
Index: sys/kern/kern_lock.c
===
RCS file: /cvsroot/src/sys/kern/kern_lock.c,v
retrieving revision 1.150
diff -u -p -r1.150 kern_lock.c
--- sys/kern/kern_lock.c20 Dec 2009 20:42:23 -  1.150
+++ sys/kern/kern_lock.c6 Aug 2010 00:46:07 -
@@ -39,6 +39,7 @@ __KERNEL_RCSID(0, "$NetBSD: kern_lock.c,
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -56,6 +57,10 @@ bool kernel_lock_dodebug;
 __cpu_simple_lock_t kernel_lock[CACHE_LINE_SIZE / sizeof(__cpu_simple_lock_t)]
 __aligned(CACHE_LINE_SIZE);
 
+static kmutex_t kernconfig_mutex;
+static lwp_t *kernconfig_lwp;
+static int kernconfig_recurse;
+
 void
 assert_sleepable(void)
 {
@@ -306,3 +311,59 @@ _kernel_unlock(int nlocks, int *countp)
if (countp != NULL)
*countp = olocks;
 }
+
+/*
+ * Functions for manipulating the kernel configuration lock.  This
+ * recursive lock should be used to protect all additions and removals
+ * of of kernel functionality, such as device configuration and loading
+ * of modular kernel components.
+ */
+
+void
+kernconfig_lock_init(void)
+{
+
+   mutex_init(&kernconfig_mutex, MUTEX_DEFAULT, IPL_NONE);
+   kernconfig_lwp = NULL;

Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:

Attached is the latest version of this change.  For simplicity, I have broken 
the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of the 
locking in sys/kern_module.c has been updated to use this instead of the 
internal module_lock mutex.  Additionally, the module_autoload() routine now 
does its own locking, rather than requiring its caller to do it.  And the 
description of locking in the module(9) man page is updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the various 
xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other bits 
and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I would 
appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify that 
recursion actually works!


I should add that I'm actually running with these changes on one of my 
local machines with no visible problems, and the above-added test case 
works.


The machine has a "AMD Phenom(tm) 9600 Quad-Core Processor" at 2.3MHz.

{103} modstat
NAME CLASS  SOURCE REFS  SIZE REQUIRES
bpf  driver builtin0 --
compat   misc   builtin0 --
coredump misc   filesys1 3402 -
exec_elf64   misc   filesys0 7960 coredump
exec_script  misc   filesys0 1256 -
ffs  vfsboot   0 153316   -
kernfs   vfsfilesys0 12914-
nfs  vfsfilesys0 159471   -
ptyfsvfsfilesys0 10118-
secmodel_bsd44   secmodel   builtin0 -suser,securelevel
securelevel  secmodel   builtin1 --
susersecmodel   builtin1 --
tmpfsvfsfilesys0 19513-
wapblvfsbuiltin0 --
{104}



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Paul Goyette

OK, got it.

I'll have another set of patches in a few days.

On Thu, 5 Aug 2010, Andrew Doran wrote:


On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote:

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(&sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
   * as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(&sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is
certainly a much larger task than dealing with the immediate problem
- modules which want to load other modules.

So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described


For module lock the wait time isn't really a big concern.  Module load
and unload is a heavyweight operation so the wait time is expected.
There is nothing intrinsically wrong with holding a mutex for "a long time"
provided that you've got a good handle on the potential side effects.
It's a different story for other locks in the system like say proc_lock,
which can't be held for a long time without disrupting the user experience
and/or deadlocking the system.


   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?


condvar gives you another lock in a roundabout way, without priority
inheritance to help move things along if something has clogged the
pipework. :-).  So I don't see benefit over a mutex.


2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this "ultimate" solution?  I think
   it would be a long time before we could find and resolve all of the
   "issues" that might be created in the various "threaded" situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those "situations", and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.


I'm not suggesting that you need to tackle autoconfiguration locking at
the same time.. What I'm suggesting is to put the primitives in place
(say in kern_lock.c or something) and then use these for the benefit of
the modules code. It would provide a statment of direction and avoid
re-hashing things later when somebody decides to tackle autoconf.
Sorry for being unclear.


3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.


Ok well I think the simplest course of action would be to replace the
mutex_owned() with a global variable.  It would do much the same thing
but for the price of 4 or 8 bytes there is no question of contradicting
the mantra. So like:

/*
* Ok to check this unlocked, as for the value to equal curlwp it must
* have been set by the current thread of execution (i.e. not interrupt
* context or another LWP).
*/
l = curlwp;
if (sysconfig_lwp == l) {
/* recurse */
} else {
mutex_enter(&sysconfig_lock);
sysconfig_lwp = l;
/* etc etc */
}




--

Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Andrew Doran
On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote:
> On Wed, 4 Aug 2010, Andrew Doran wrote:
> 
> >Sorry for not replying sooner.
> >
> >Please don't add a generic recursive lock facility, it would be abused.
> 
> Yeah!  My current patches have no new generic facilities at all.
> 
> >I'd like something roughly like the following.  I think this should
> >also cover major configuration tasks such as device attach and detach,
> >so it wouldn't really be module specific.  The naming is a poor suggestion.
> >
> >void
> >sysconfig_lock(void)
> >{
> >
> > if (mutex_owned(&sysconfig_mutex)) {
> > /*
> >  * At this point it's OK for the thread to hold other
> >  * locks, since the system configuration lock was the
> >  * first taken on the way in.
> >  */
> > sysconfig_recurse++;
> > return;
> > }
> > /*
> >  * I don't remember the correct argument sequence to the
> >  * following, but basically we need to assert that NO locks,
> >  * sleep or spin, other than kernel_lock are held,
> >* as the system configuration lock would be the outermost
> >  * lock in the system.
> >  */
> > LOCKDEBUG_BARRIER(...);
> > mutex_enter(&sysconfig_mutex);
> > KASSERT(sysconfig_recurse == 0);
> >}
> >
> >For the boot autoconfig path where you have things being driven by the
> >kernel, this would work OK.  Situations where there's a kthread or
> >something attaching and detaching devices on the fly are a bit different,
> >since they likely have local atomicity requirements (need to take device
> >private locks).  There you'd probably need the caller to take the lock.
> >USB comes to mind (along with some hardware RAID drivers that do hotplug
> >attach/detach).
> >
> >Thoughts?
> 
> Well, at first glance, this proposal makes sense.  But it is
> certainly a much larger task than dealing with the immediate problem
> - modules which want to load other modules.
> 
> So, I really have three questions:
> 
> 1. In keeping with earlier concerns about holding locks for long periods
>of time (eeh and mrg both commented on this), the approach described

For module lock the wait time isn't really a big concern.  Module load
and unload is a heavyweight operation so the wait time is expected.
There is nothing intrinsically wrong with holding a mutex for "a long time"
provided that you've got a good handle on the potential side effects.
It's a different story for other locks in the system like say proc_lock,
which can't be held for a long time without disrupting the user experience
and/or deadlocking the system.

>above would seem to have an even longer hold-time than the current
>module_lock.  Would not something similar to haad's condvar approach
>be appropriate for this proposal, as well as for the more-limited-in-
>scope recursive module lock?

condvar gives you another lock in a roundabout way, without priority
inheritance to help move things along if something has clogged the
pipework. :-).  So I don't see benefit over a mutex.

> 2. Would it be reasonable to solve the immediate problem as previously
>proposed, rather than waiting for this "ultimate" solution?  I think
>it would be a long time before we could find and resolve all of the
>"issues" that might be created in the various "threaded" situations
>that may exist.  I know I'm certainly not sufficiently qualified to
>identify all those "situations", and I also know that I don't have
>sufficient available work-hours to do this in any reasonable time-
>frame.
> 
>I'd still like to move forward with the most recent solution to the
>module_lock problem.

I'm not suggesting that you need to tackle autoconfiguration locking at
the same time.. What I'm suggesting is to put the primitives in place
(say in kern_lock.c or something) and then use these for the benefit of
the modules code. It would provide a statment of direction and avoid
re-hashing things later when somebody decides to tackle autoconf.
Sorry for being unclear.

> 3. Since we're still technically abusing the mutex(9) man-page caveat
>about using mutex_owned() for making locking decisions, it would
>be very much appreciated (at least by myself) to have an explanation
>of WHY it is OK in this case to do it here, but not somewhere else.

Ok well I think the simplest course of action would be to replace the
mutex_owned() with a global variable.  It would do much the same thing
but for the price of 4 or 8 bytes there is no question of contradicting
the mantra. So like:

/*
 * Ok to check this unlocked, as for the value to equal curlwp it must
 * have been set by the current thread of execution (i.e. not interrupt
 * context or another LWP).
 */
l = curlwp;
if (sysconfig_lwp == l) { 
/* recurse */
} else {
mutex_enter(&sysconfig_lock);
sysconfig_lwp = l;
/* etc etc */
}



Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-04 Thread Paul Goyette

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(&sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
* as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(&sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is certainly a 
much larger task than dealing with the immediate problem - modules which 
want to load other modules.


So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described
   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?

2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this "ultimate" solution?  I think
   it would be a long time before we could find and resolve all of the
   "issues" that might be created in the various "threaded" situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those "situations", and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.

3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Module and device configuration locking [was Re: Modules loading modules?]

2010-08-04 Thread Andrew Doran
Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.

I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(&sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
 * as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(&sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?



Re: Modules loading modules?

2010-08-03 Thread Paul Goyette

On Tue, 3 Aug 2010, haad wrote:

Well, still no hurry, but it would be nice if some additional eyes were 
pointed this way.  I've got the recursive-module-load test case added

to the existing tests/modules/ stuff (and it works)!

BTW, since this is changing the kernel ABI, I guess it will need a 
version bump.  Do we have any other bumps coming soon that we can 
"share"?


I talked with Andrew today and he said that basicaly we can use
mutex_owned for this particular case. That should make your code much
simpler :) sorry for any additional work :)


Well, we've come full circle!  :)

It seems to me that, even if it is "OK to use mutex_owned() for this 
particular case", perhaps we should not.  Most everyone seemed to agree 
that your condvar solution was "technically correct", and if we did use 
the mutex_owned() solution it would only serve as a temptation for 
someone else in the future!


We now have two solutions, both of which actually work (tested on my 
home machine).  One solution (mutex_owned) might be slightly simpler and 
slightly more efficient than the other (condvar), but this section of 
code is certainly not performance-critical.  Therefore, is there any 
reason for preferring the mutex_owned solution?


I'm happy to go either way based on concensus here, but I'm leaning 
towards the condvar solution.  It doesn't require any "exemptions" or 
long explanations of why it is "OK to use mutex_owned" in ways that 
would appear to violate our own documentation.  And it also addresses 
eeh's (and mrg's) concerns of keeping the lock across potentially 
long-duration  operations.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-03 Thread haad
On Tue, Aug 3, 2010 at 3:03 PM, Paul Goyette  wrote:
> On Mon, 2 Aug 2010, Paul Goyette wrote:
>
>> On Tue, 3 Aug 2010, matthew green wrote:
>>
>>> i think this looks good enough.  wait for some others eyes, though.
>>
>> No hurry.  I'm going to try to add a new atf test case for the recursive
>> load case, and it will take some time for me to come up to speed on the atf
>> stuff. :)
>
> Well, still no hurry, but it would be nice if some additional eyes were
> pointed this way.  I've got the recursive-module-load test case added to the
> existing tests/modules/ stuff (and it works)!
>
> BTW, since this is changing the kernel ABI, I guess it will need a version
> bump.  Do we have any other bumps coming soon that we can "share"?

I talked with Andrew today and he said that basicaly we can use
mutex_owned for this particular case. That should make your code much
simpler :) sorry for any additional work :)



-- 


Regards.

Adam


re: Modules loading modules?

2010-08-03 Thread Paul Goyette

On Mon, 2 Aug 2010, Paul Goyette wrote:


On Tue, 3 Aug 2010, matthew green wrote:


i think this looks good enough.  wait for some others eyes, though.


No hurry.  I'm going to try to add a new atf test case for the recursive load 
case, and it will take some time for me to come up to speed on the atf stuff. 
:)


Well, still no hurry, but it would be nice if some additional eyes were 
pointed this way.  I've got the recursive-module-load test case added to 
the existing tests/modules/ stuff (and it works)!


BTW, since this is changing the kernel ABI, I guess it will need a 
version bump.  Do we have any other bumps coming soon that we can 
"share"?



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Tue, 3 Aug 2010, matthew green wrote:


i think this looks good enough.  wait for some others eyes, though.


No hurry.  I'm going to try to add a new atf test case for the recursive 
load case, and it will take some time for me to come up to speed on the 
atf stuff.  :)




please commit the module_autoload() changes separately.


Will do.  I will commit the changes to sys/kern/kern_module.c, 
sys/sys/module.h, and share/man/man9/module.9 separately from the 
other changes.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


re: Modules loading modules?

2010-08-02 Thread matthew green

> On Tue Aug 03 2010 at 02:17:43 +1000, matthew green wrote:
> > > Now, on to sensible stuff.  I'm quite certain that warning was written
> > > to make people avoid writing bad code without understanding locking --
> > > if you need to used mutex_owned() to decide if you should lock a normal
> > > mutex, your code is broken.  The other less likely possibility is that
> > > someone plans to change mutex_owned in the future.
> > > 
> > > Further data point: the same warning is in rw_held, yet it was used to
> > > implement recursive locking in vlockmgr until its very recent demise.
> > > 
> > > Ignoring man page mantras and focusing on how the code works, I do not
> > > see anything wrong with Paul's use of mutex_owned().
> > 
> > this just does not match my actual experience in the kernel.  i had
> > weird pmap-style problems and asserts firing wrongly while i did not
> > obey the rules in the manual directly.
> 
> Not knowing more details it's difficult to comment.  But since you are
> talking about the pmap, maybe your experiences are with spin mutexes
> instead of adaptive ones?

that could be it.  i certainly had different issues trying to use a
sleep-able lock in this code.

i find it confusing that mutex_owned() returns true if the mutex is
locked (regardless of who has it) for spin, where as adaptive only
returns true if this LWP has it locked.


.mrg.


re: Modules loading modules?

2010-08-02 Thread matthew green

i think this looks good enough.  wait for some others eyes, though.

please commit the module_autoload() changes separately.


.mrg.


Re: Modules loading modules?

2010-08-02 Thread Antti Kantee
On Tue Aug 03 2010 at 02:17:43 +1000, matthew green wrote:
> > Now, on to sensible stuff.  I'm quite certain that warning was written
> > to make people avoid writing bad code without understanding locking --
> > if you need to used mutex_owned() to decide if you should lock a normal
> > mutex, your code is broken.  The other less likely possibility is that
> > someone plans to change mutex_owned in the future.
> > 
> > Further data point: the same warning is in rw_held, yet it was used to
> > implement recursive locking in vlockmgr until its very recent demise.
> > 
> > Ignoring man page mantras and focusing on how the code works, I do not
> > see anything wrong with Paul's use of mutex_owned().
> 
> this just does not match my actual experience in the kernel.  i had
> weird pmap-style problems and asserts firing wrongly while i did not
> obey the rules in the manual directly.

Not knowing more details it's difficult to comment.  But since you are
talking about the pmap, maybe your experiences are with spin mutexes
instead of adaptive ones?


re: Modules loading modules?

2010-08-02 Thread Paul Goyette
The most recent verion of this patch set was recently posted, and it 
uses the cv_wait() mechanism suggested by Adam Hamsik.


This discussion is still very intresting, but perhaps we could start a 
new thread (or at least change the subject line)?  :)



On Tue, 3 Aug 2010, matthew green wrote:




On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.


That's the mantra, yes.


ie, you can not even KASSERT(!mutex_owned()).


Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.


what is not possible is to safely to what the manual says.  you will
not have the right thing happen.  i've seen this when i've been
working on the sparc64 pmap.


Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().


this just does not match my actual experience in the kernel.  i had
weird pmap-style problems and asserts firing wrongly while i did not
obey the rules in the manual directly.


.mrg.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


re: Modules loading modules?

2010-08-02 Thread matthew green

> On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:
> > this is an incomplete reading of the manual page, and you can not
> > use mutex_owned() the way you are trying to (regardless of what
> > pooka posted.) you can't even using it in various forms of assertions
> > safely.  from the man page:
> > 
> > It should not be used to make locking decisions at run time, or to
> > verify that a lock is not held.
> 
> That's the mantra, yes.
> 
> > ie, you can not even KASSERT(!mutex_owned()).
> 
> Strictly speaking you can in a case where you have two locks always
> are taken as l1,l2 and released as l2,l1 provided you're not dealing
> with a spin mutex.  Does it make any sense?  no (l2 is useless).
> Is it possible?  yes.

what is not possible is to safely to what the manual says.  you will
not have the right thing happen.  i've seen this when i've been
working on the sparc64 pmap.

> Now, on to sensible stuff.  I'm quite certain that warning was written
> to make people avoid writing bad code without understanding locking --
> if you need to used mutex_owned() to decide if you should lock a normal
> mutex, your code is broken.  The other less likely possibility is that
> someone plans to change mutex_owned in the future.
> 
> Further data point: the same warning is in rw_held, yet it was used to
> implement recursive locking in vlockmgr until its very recent demise.
> 
> Ignoring man page mantras and focusing on how the code works, I do not
> see anything wrong with Paul's use of mutex_owned().

this just does not match my actual experience in the kernel.  i had
weird pmap-style problems and asserts firing wrongly while i did not
obey the rules in the manual directly.


.mrg.


Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Paul Goyette wrote:

Yes, the cv mechanism, coupled with changing the semantics of 
module_autoload() (to not require the caller to obtain the lock) makes good 
sense here.  I'm working on it and I should be able to post new diffs 
tomorrow.



OK, it's time for Round #3!

Attached are diffs that implement Adam's suggested cv-based solution,
along with moving the locking for module_autolock() into the routine 
itself (for consistency with other routines).  I've also updated the 
module(4) man page this time around.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

MODULE_LOCKING.diff.gz
Description: Binary data


Re: Locking for module_autoload() (was: Modules loading modules?)

2010-08-02 Thread Adam Hamsik

On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote:

> On Sun, 1 Aug 2010, John Nemeth wrote:
> 
>>I'm thinking the acquisition of module_lock should be pushed into
>> module_autoload() instead of having the caller acquire it for this very
>> reason.  It makes it hard to change the way locking works in the
>> MODULAR code if you expect the caller to acquire the lock.  I don't
>> know why it was done this way originally, or what the consequences (if
>> any) would be for making the change.  Andrew, any thoughts on this?
> 
> Attached are diffs for moving the locking out of each caller and into 
> module_autoload() itself.  Most of these changes are fairly trivial, but a 
> couple of them should be looked at more closely in case I've missed any 
> subtleties in the original code:
> 
>   kern/kern_syscall.c
>   net/if_ppp.c

I think that this patch has same problem as original code you can call 
module_autoload twice in same lwp. (you are holding module_lock while calling 
module_do_load).

Regards

Adam.



Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Antti Kantee wrote:


On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.


That's the mantra, yes.


ie, you can not even KASSERT(!mutex_owned()).


Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.

Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().


but i'm still not sure why we're going to such lengths to hold a
lock across such a heavy weight operation like loading a module.
that may involve disk speeds, and so you're looking at waiting
millions of cycles for the lock.  aka, what eeh posted.


It's held for millions of cycles already now and nobody has pointed out
measurable problems.  But if it is deemed necessary, you can certainly
hide a cv underneath.  The efficiency requirements for the module lock
are probably anyway in the "who cares wins ... not" spectrum.  At least
I'm not aware of any fastpath wanting to use it.

Anyway, no real opinion there.  A cv most likely is the safe, no-brainer
choice.


Yes, the cv mechanism, coupled with changing the semantics of 
module_autoload() (to not require the caller to obtain the lock) makes 
good sense here.  I'm working on it and I should be able to post new 
diffs tomorrow.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-02 Thread Antti Kantee
On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:
> this is an incomplete reading of the manual page, and you can not
> use mutex_owned() the way you are trying to (regardless of what
> pooka posted.) you can't even using it in various forms of assertions
> safely.  from the man page:
> 
>   It should not be used to make locking decisions at run time, or to
>   verify that a lock is not held.

That's the mantra, yes.

> ie, you can not even KASSERT(!mutex_owned()).

Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.

Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().

> but i'm still not sure why we're going to such lengths to hold a
> lock across such a heavy weight operation like loading a module.
> that may involve disk speeds, and so you're looking at waiting
> millions of cycles for the lock.  aka, what eeh posted.

It's held for millions of cycles already now and nobody has pointed out
measurable problems.  But if it is deemed necessary, you can certainly
hide a cv underneath.  The efficiency requirements for the module lock
are probably anyway in the "who cares wins ... not" spectrum.  At least
I'm not aware of any fastpath wanting to use it.

Anyway, no real opinion there.  A cv most likely is the safe, no-brainer
choice.


Re: Modules loading modules?

2010-08-02 Thread Paul Goyette
At last, a response that doesn't just say "you can't do that" but 
instead provides a crystal-clear alternative.   :)


I like this approach since it avoids the need to implement the new 
"recursive mutex" type (which leads to the tempting-but-forbidden 
mis-use of mutex_owned()).  It also would appear to alleviate eeh's 
and mrg's concerns for holding the mutex across long, blocking 
operations.


I still think it is preferable to move all locking associated with 
module_autoload() into the routine itself as described in the other 
thread (with subject line "Locking for module_autoload()").  Then it 
will be nearly trivial to replace the current locking with your 
suggested replacement.



On Mon, 2 Aug 2010, Adam Hamsik wrote:



On Aug,Sunday 1 2010, at 3:53 PM, Antti Kantee wrote:


On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote:

One question:  Since an adaptive kmutex_t already includes an owner
field, would we really need to have another copy of it in the rmutex_t
structure?


Good point.  I think it's ok to do:

 if (mutex_owned(mtx))
   cnt++
 else
   lock


When I did locking for dm driver, it was said by rmind@ that mutex_owned should 
not be used for any locking decisions and should be used only for diagnostic 
purposes. What I would in this case is add two new routines to module_framework 
->

module_lock()
module_unlock()

These two will then do something like this.

kmutex_t module_lock;
uint32_t refcnt;
lwp_t mod_lwp;
condvar_t cv;


module_lock() {

mutex_enter(module_lock);
start
if (refcnt == 0) {
refcnt++;
mod_lwp = curlwp;
else {
if (mod_lwp == curlwp)
refcnt++;
else {
cv_wait(cv);
goto start
}
}

mutex_exit(module_lock);
}


mudule_unlock() {
 mutex_enter(module_lock);
KASSERT(mod_lwp != curlwp); /* module_unlock should be only called from 
lwp which called module_lock */

refcnt--;

if (refcnt == 0) {
mod_lwp = 0;
cv_broadcast(cv);
}

 mutex_exit(module_lock);
}

This way module_lock can be called multiple times by one lwp and will sleep for 
everyone else.

Regards

Adam.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Locking for module_autoload() (was: Modules loading modules?)

2010-08-01 Thread Paul Goyette

On Mon, 2 Aug 2010, Adam Hamsik wrote:



On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote:


On Sun, 1 Aug 2010, John Nemeth wrote:


   I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?


Attached are diffs for moving the locking out of each caller and into 
module_autoload() itself.  Most of these changes are fairly trivial, but a 
couple of them should be looked at more closely in case I've missed any 
subtleties in the original code:

kern/kern_syscall.c
net/if_ppp.c


I think that this patch has same problem as original code you can call 
module_autoload twice in same lwp. (you are holding module_lock while 
calling module_do_load).


This patch wasn;t intended to solve the nested-call problem.  This patch 
only purpose was to modify the existing semantics of the module_autoload 
routine to match that of the other module routines.  In particular, 
module_autoload is the only routine that requires its caller to acquire 
the mutex first;  all the other routines do all the required locking 
internally to each routine.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, Eduardo Horvath wrote:


Why is there a need to hold a mutex across module loading instead of, say,
having a reference count tht's protected by a mutex?  I thought holding a
mutex across a potentially blocking operation such as a module load is
considered bad practice.


Please note that I am not introducing the holding of the lock over the 
potentially blocking operation.  This is already done.


Several others have already indicated that the atomicity of the module 
load (including the loading of any required modules) is a desirable 
property.  I'm simply looking for a mechanism that allows us to keep 
this property while allowing modules to load other modules.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Adam Hamsik

On Aug,Sunday 1 2010, at 3:53 PM, Antti Kantee wrote:

> On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote:
>> One question:  Since an adaptive kmutex_t already includes an owner 
>> field, would we really need to have another copy of it in the rmutex_t 
>> structure?
> 
> Good point.  I think it's ok to do:
> 
>  if (mutex_owned(mtx))
>cnt++
>  else
>lock

When I did locking for dm driver, it was said by rmind@ that mutex_owned should 
not be used for any locking decisions and should be used only for diagnostic 
purposes. What I would in this case is add two new routines to module_framework 
-> 

module_lock()
module_unlock()

These two will then do something like this.

kmutex_t module_lock;
uint32_t refcnt;
lwp_t mod_lwp;
condvar_t cv;


module_lock() {

mutex_enter(module_lock);
start
if (refcnt == 0) {
refcnt++;
mod_lwp = curlwp;
else {
if (mod_lwp == curlwp)
refcnt++;
else {
cv_wait(cv);
goto start
}
}

mutex_exit(module_lock);
}


mudule_unlock() {
  mutex_enter(module_lock);
KASSERT(mod_lwp != curlwp); /* module_unlock should be only called from 
lwp which called module_lock */

refcnt--;

if (refcnt == 0) {
mod_lwp = 0;
cv_broadcast(cv);
}

  mutex_exit(module_lock);
}

This way module_lock can be called multiple times by one lwp and will sleep for 
everyone else. 

Regards

Adam.



re: Modules loading modules?

2010-08-01 Thread matthew green

> According to the mutex(9) man page:
> 
>  mutex_owned(mtx)
> 
> For adaptive mutexes, return non-zero if the current LWP holds
> the mutex. ...

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.

ie, you can not even KASSERT(!mutex_owned()).

the documentation matches my experience with using mutex(9) and
KASSERT().

but i'm still not sure why we're going to such lengths to hold a
lock across such a heavy weight operation like loading a module.
that may involve disk speeds, and so you're looking at waiting
millions of cycles for the lock.  aka, what eeh posted.


.mrg


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Mon, 2 Aug 2010, Masao Uebayashi wrote:


mutex_owned(mtx)

  ...

  mutex_owned() is provided for making diagnostic checks to verify
  that a lock is held.  For example:

  KASSERT(mutex_owned(&driver_lock));

  It should not be used to make locking decisions at run time, or to
  verify that a lock is not held.

Thus you can't use it for your decision.


Yes, I did read the mutex(9) man page.

However, your assertion is at odds with pooka's suggestion, from
http://mail-index.netbsd.org/tech-kern/2010/08/01/msg008632.html


Good point.  I think it's ok to do:

  if (mutex_owned(mtx))
cnt++
  else
lock




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Locking for module_autoload() (was: Modules loading modules?)

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, John Nemeth wrote:


I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?


Attached are diffs for moving the locking out of each caller and into 
module_autoload() itself.  Most of these changes are fairly trivial, but 
a couple of them should be looked at more closely in case I've missed 
any subtleties in the original code:


kern/kern_syscall.c
net/if_ppp.c


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: dev/acpi/acpi.c
===
RCS file: /cvsroot/src/sys/dev/acpi/acpi.c,v
retrieving revision 1.208
diff -u -p -r1.208 acpi.c
--- dev/acpi/acpi.c 25 Jul 2010 12:54:46 -  1.208
+++ dev/acpi/acpi.c 2 Aug 2010 02:25:26 -
@@ -226,11 +226,8 @@ acpi_null(void)
 void
 acpi_load_verbose(void)
 {
-   if (acpi_verbose_loaded == 0) {
-   mutex_enter(&module_lock);
+   if (acpi_verbose_loaded == 0)
module_autoload("acpiverbose", MODULE_CLASS_MISC);
-   mutex_exit(&module_lock);
-   }
 }
 
 void
Index: dev/dm/dm_target.c
===
RCS file: /cvsroot/src/sys/dev/dm/dm_target.c,v
retrieving revision 1.13
diff -u -p -r1.13 dm_target.c
--- dev/dm/dm_target.c  18 May 2010 15:10:41 -  1.13
+++ dev/dm/dm_target.c  2 Aug 2010 02:25:26 -
@@ -83,9 +83,7 @@ dm_target_autoload(const char *dm_target
gen = module_gen;
 
/* Try to autoload target module */
-   mutex_enter(&module_lock);
(void) module_autoload(name, MODULE_CLASS_MISC);
-   mutex_exit(&module_lock);
} while (gen != module_gen);
 
mutex_enter(&dm_target_mutex);
Index: dev/mii/mii_physubr.c
===
RCS file: /cvsroot/src/sys/dev/mii/mii_physubr.c,v
retrieving revision 1.71
diff -u -p -r1.71 mii_physubr.c
--- dev/mii/mii_physubr.c   25 Jul 2010 14:44:34 -  1.71
+++ dev/mii/mii_physubr.c   2 Aug 2010 02:25:27 -
@@ -71,11 +71,8 @@ const char *mii_get_descr_stub(int oui, 
  */
 void mii_load_verbose(void)
 {
-   if (mii_verbose_loaded == 0) {
-   mutex_enter(&module_lock);
+   if (mii_verbose_loaded == 0)
module_autoload("miiverbose", MODULE_CLASS_MISC);
-   mutex_exit(&module_lock);
-   }
 }  
 
 static void mii_phy_statusmsg(struct mii_softc *);
Index: dev/pci/pci_subr.c
===
RCS file: /cvsroot/src/sys/dev/pci/pci_subr.c,v
retrieving revision 1.84
diff -u -p -r1.84 pci_subr.c
--- dev/pci/pci_subr.c  25 Jul 2010 14:14:25 -  1.84
+++ dev/pci/pci_subr.c  2 Aug 2010 02:25:27 -
@@ -323,11 +323,8 @@ int pciverbose_loaded = 0;
  */
 void pci_load_verbose(void)
 {
-   if (pciverbose_loaded == 0) {
-   mutex_enter(&module_lock);
+   if (pciverbose_loaded == 0)
module_autoload("pciverbose", MODULE_CLASS_MISC);
-   mutex_exit(&module_lock);
-   }
 }
 
 const char *pci_findvendor_stub(pcireg_t id_reg)
Index: dev/scsipi/scsipiconf.c
===
RCS file: /cvsroot/src/sys/dev/scsipi/scsipiconf.c,v
retrieving revision 1.39
diff -u -p -r1.39 scsipiconf.c
--- dev/scsipi/scsipiconf.c 25 Jul 2010 13:49:58 -  1.39
+++ dev/scsipi/scsipiconf.c 2 Aug 2010 02:25:27 -
@@ -107,11 +107,8 @@ scsipi_command(struct scsipi_periph *per
 void
 scsipi_load_verbose(void)
 {
-   if (scsi_verbose_loaded == 0) {
-   mutex_enter(&module_lock);
+   if (scsi_verbose_loaded == 0)
module_autoload("scsiverbose", MODULE_CLASS_MISC);
-   mutex_exit(&module_lock);
-   }
 }
 
 /*
Index: dev/usb/usb_subr.c
===
RCS file: /cvsroot/src/sys/dev/usb/usb_subr.c,v
retrieving revision 1.174
diff -u -p -r1.174 usb_subr.c
--- dev/usb/usb_subr.c  27 Jul 2010 16:15:30 -  1.174
+++ dev/usb/usb_subr.c  2 Aug 2010 02:25:27 -
@@ -122,11 +122,8 @@ int usb_verbose_loaded = 0;
  */
 void usb_load_verbose(void)
 {
- 

Re: Modules loading modules?

2010-08-01 Thread John Nemeth
On Nov 17,  5:24am, Paul Goyette wrote:
} On Sun, 1 Aug 2010, Antti Kantee wrote:
} 
} > I'm not sure if it's a good idea to change the size of kmutex_t.  I
} > guess plenty of data structures have carefully been adjusted by hand
} > to its size and I don't know of any automatic way to recalculate that
} > stuff.
} >
} > Even if not, since this is the only user and we probably won't have
} > that many of them even in the future, why not just define a new type
} > ``rmutex'' which contains a kmutex, an owner and the counter?  It 
} > feels wrong to punish all the normal kmutex users for just one use.
} > It'll also make the implementation a lot simpler to test, since it's
} > purely MI.
} >
} > "separate normal case and worst case"
} 
} Round two!  Taking pooka's suggestion, this version is built on top of 
} (rather than beside) the existing non-recursive mutex.  As such, it does 
} not affect any MD code.
} 
} Attached is a set of diffs that
} 
} 1. Adds sys/sys/rmutex.h and sys/kern/kern_rmutex.c to implement
} recursive adaptive mutexes.  (Conspicuously missing is an rmutex(9)
} man page...  It will happen before this gets committed.)
} 
} 2. Converts the existing module_lock from a normal kmutex_t to an
} rmutex_t
} 
} 3. Updates all of the (surprisingly many) places where module_lock
} is acquired.

 I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?

} Compile-tested on port-amd64 (including rumptest).  Since there are no 
} MD-changes in this version, there "shouldn't be" any issues with 
} building on other ports.
} 
} As previously noted, there is only one known use case for this so far: 
} modules loading other modules from within their xxx_modcmd() routine. 
} The specific use case we have involves loading the acpicpu driver/module 
} which eventually results in an attempt to load acpiverbose.
} 
} It would be really nice if the community could
} 
} A. Compile-test on additional architectures
} B. Test to see that existing mutex operations still work correctly
} C. Exercise the known use case if possible
} D. Identify additional use cases
} 
}-- End of excerpt from Paul Goyette


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, John Nemeth wrote:


I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?


I was thinking the same thing.

There is one case in kern/kern_exec.c where the lock was held while the 
code loops through a list of possible modules.  Looking closer, that 
code actually releases the lock and yield()s before re-acquiring it 
within the loop, so it probably could be modified without any adverse 
effects.


This would actually simplify the interface, since ALL of the routines 
would be doing their own locking, and noone in the rest of the kernel 
would need to know about the lock.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Quentin Garnier
On Sun, Aug 01, 2010 at 06:05:20PM -0700, Paul Goyette wrote:
> On Mon, 2 Aug 2010, Quentin Garnier wrote:
> 
> >>+int
> >>+rmutex_tryenter(rmutex_t *rmtx)
> >>+{
> >>+   int rv = 1;
> >>+
> >>+   if (mutex_owned(&rmtx->rmtx_mtx)) {
> >>+   rmtx->rmtx_recurse++;
> >>+   KASSERT(rmtx->rmtx_recurse != 0);
> >>+   } else if ((rv = mutex_tryenter(&rmtx->rmtx_mtx)) != 0) {
> >>+   rmtx->rmtx_recurse++;
> >>+   KASSERT(rmtx->rmtx_recurse != 0);
> >>+   }
> >>+   return rv;
> >>+}
> >
> >I am probably not getting the idea, but I fail to see how this qualifies
> >as a mutex.  My reading of this piece of code is that rmutex_enter() is
> >always going to succeed immediately.
> 
> According to the mutex(9) man page:
> 
> mutex_owned(mtx)
> 
>For adaptive mutexes, return non-zero if the current LWP holds
>the mutex. ...

Ah, that was what I was missing.  Thanks.

-- 
Quentin Garnier - c...@cubidou.net - c...@netbsd.org
"See the look on my face from staying too long in one place
[...] every time the morning breaks I know I'm closer to falling"
KT Tunstall, Saving My Face, Drastic Fantastic, 2007.


pgp0WclWeuAf7.pgp
Description: PGP signature


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Mon, 2 Aug 2010, Quentin Garnier wrote:


+int
+rmutex_tryenter(rmutex_t *rmtx)
+{
+   int rv = 1;
+
+   if (mutex_owned(&rmtx->rmtx_mtx)) {
+   rmtx->rmtx_recurse++;
+   KASSERT(rmtx->rmtx_recurse != 0);
+   } else if ((rv = mutex_tryenter(&rmtx->rmtx_mtx)) != 0) {
+   rmtx->rmtx_recurse++;
+   KASSERT(rmtx->rmtx_recurse != 0);
+   }
+   return rv;
+}


I am probably not getting the idea, but I fail to see how this qualifies
as a mutex.  My reading of this piece of code is that rmutex_enter() is
always going to succeed immediately.


According to the mutex(9) man page:

mutex_owned(mtx)

   For adaptive mutexes, return non-zero if the current LWP holds
   the mutex. ...

The rmutex_init() code ensures that we have an adaptive mutex.  So what 
this says is


If _we_ already own the mutex,
increment counter
(rv is still initialized to 1)
Else
if we can acquire the mutex (updating rv)
increment the counter

Return



Please correct me if I'm wrong--and I am very probably wrong--, but
isn't the point of a recursive mutex to allow the thread initially
taking the mutex to take it again, but not other threads?  The code
would have to be a lot more complicated than that.

All I see here is a fancy no-op, but maybe it's just me.


It would be a no-op for a recursive spin mutex.  That's why I didn't
allow them!  :)


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Quentin Garnier
On Sun, Aug 01, 2010 at 05:17:01PM -0700, Paul Goyette wrote:
[...]
> +void
> +rmutex_enter(rmutex_t *rmtx)
> +{
> + if (mutex_owned(&rmtx->rmtx_mtx)) {
> + rmtx->rmtx_recurse++;
> + KASSERT(rmtx->rmtx_recurse != 0);
> + } else {
> + mutex_enter(&rmtx->rmtx_mtx);
> + rmtx->rmtx_recurse = 1;
> + }
> +}
> +
> +void
> +rmutex_exit(rmutex_t *rmtx)
> +{
> + KASSERT(mutex_owned(&rmtx->rmtx_mtx));
> + KASSERT(rmtx->rmtx_recurse != 0);
> + if (--rmtx->rmtx_recurse == 0)
> + mutex_exit(&rmtx->rmtx_mtx);
> +}
> +
> +int
> +rmutex_tryenter(rmutex_t *rmtx)
> +{
> + int rv = 1;
> +
> + if (mutex_owned(&rmtx->rmtx_mtx)) {
> + rmtx->rmtx_recurse++;
> + KASSERT(rmtx->rmtx_recurse != 0);
> + } else if ((rv = mutex_tryenter(&rmtx->rmtx_mtx)) != 0) {
> + rmtx->rmtx_recurse++;
> + KASSERT(rmtx->rmtx_recurse != 0);
> + }
> + return rv;
> +}

I am probably not getting the idea, but I fail to see how this qualifies
as a mutex.  My reading of this piece of code is that rmutex_enter() is
always going to succeed immediately.

Please correct me if I'm wrong--and I am very probably wrong--, but
isn't the point of a recursive mutex to allow the thread initially
taking the mutex to take it again, but not other threads?  The code
would have to be a lot more complicated than that.

All I see here is a fancy no-op, but maybe it's just me.

-- 
Quentin Garnier - c...@cubidou.net - c...@netbsd.org
"See the look on my face from staying too long in one place
[...] every time the morning breaks I know I'm closer to falling"
KT Tunstall, Saving My Face, Drastic Fantastic, 2007.


pgpgTt12zdsHJ.pgp
Description: PGP signature


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, Antti Kantee wrote:


I'm not sure if it's a good idea to change the size of kmutex_t.  I
guess plenty of data structures have carefully been adjusted by hand
to its size and I don't know of any automatic way to recalculate that
stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It 
feels wrong to punish all the normal kmutex users for just one use.

It'll also make the implementation a lot simpler to test, since it's
purely MI.

"separate normal case and worst case"


Round two!  Taking pooka's suggestion, this version is built on top of 
(rather than beside) the existing non-recursive mutex.  As such, it does 
not affect any MD code.


Attached is a set of diffs that

1. Adds sys/sys/rmutex.h and sys/kern/kern_rmutex.c to implement
   recursive adaptive mutexes.  (Conspicuously missing is an rmutex(9)
   man page...  It will happen before this gets committed.)

2. Converts the existing module_lock from a normal kmutex_t to an
   rmutex_t

3. Updates all of the (surprisingly many) places where module_lock
   is acquired.

Compile-tested on port-amd64 (including rumptest).  Since there are no 
MD-changes in this version, there "shouldn't be" any issues with 
building on other ports.



As previously noted, there is only one known use case for this so far: 
modules loading other modules from within their xxx_modcmd() routine. 
The specific use case we have involves loading the acpicpu driver/module 
which eventually results in an attempt to load acpiverbose.


It would be really nice if the community could

A. Compile-test on additional architectures
B. Test to see that existing mutex operations still work correctly
C. Exercise the known use case if possible
D. Identify additional use cases


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/conf/files
===
RCS file: /cvsroot/src/sys/conf/files,v
retrieving revision 1.992
diff -u -p -r1.992 files
--- sys/conf/files  7 Jul 2010 01:09:39 -   1.992
+++ sys/conf/files  2 Aug 2010 00:08:26 -
@@ -1468,6 +1468,7 @@ file  kern/kern_prot.c
 file   kern/kern_ras.c
 file   kern/kern_rate.c
 file   kern/kern_resource.c
+file   kern/kern_rmutex.c
 file   kern/kern_runq.c
 file   kern/kern_rwlock.c
 file   kern/kern_rwlock_obj.c
Index: sys/sys/rmutex.h
===
RCS file: sys/sys/rmutex.h
diff -N sys/sys/rmutex.h
--- /dev/null   1 Jan 1970 00:00:00 -
+++ sys/sys/rmutex.h2 Aug 2010 00:08:27 -
@@ -0,0 +1,53 @@
+/* $NetBSD$*/
+
+/*-
+ * Copyright (c) 2010 The NetBSD Foundation, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef_SYS_RMUTEX_H_
+#define_SYS_RMUTEX_H_
+
+#include 
+
+/*
+ * A recursive mutex is simply an adaptive mutex which can be acquired
+ * multiple times by a single owner.
+ */
+ 
+struct rmutex {
+   kmutex_trmtx_mtx;
+   uint32_trmtx_recurse;
+};
+
+typedef struct rmutex rmutex_t;
+
+void   rmutex_init(rmutex_t *);
+void   rmutex_destroy(rmutex_t *);
+void   rmutex_enter(rmutex_t 

Re: Modules loading modules?

2010-08-01 Thread Eduardo Horvath
On Sun, 1 Aug 2010, Paul Goyette wrote:

> Good point, and it will be a lot less work, too!  :)  And it solves the
> problem of not permitting a rmutex being used with condvars.
> 
> One question:  Since an adaptive kmutex_t already includes an owner field,
> would we really need to have another copy of it in the rmutex_t structure?

Why is there a need to hold a mutex across module loading instead of, say, 
having a reference count tht's protected by a mutex?  I thought holding a 
mutex across a potentially blocking operation such as a module load is 
considered bad practice.

Eduardo


Re: Modules loading modules?

2010-08-01 Thread Antti Kantee
On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote:
> One question:  Since an adaptive kmutex_t already includes an owner 
> field, would we really need to have another copy of it in the rmutex_t 
> structure?

Good point.  I think it's ok to do:

  if (mutex_owned(mtx))
cnt++
  else
lock


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, Antti Kantee wrote:


Well, folks, here is a first pass recursive locks!  The attached diffs
are against -current as of a few minutes ago.


Oh, heh, I thought we have recursive lock support.  But with that gone
from the vfs locks, I guess not (apart from the kernel lock ;).


:)



I'm not sure if it's a good idea to change the size of kmutex_t.  I guess
plenty of data structures have carefully been adjusted by hand to its
size and I don't know of any automatic way to recalculate that stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It feels
wrong to punish all the normal kmutex users for just one use.  It'll also
make the implementation a lot simpler to test, since it's purely MI.

"separate normal case and worst case"


Good point, and it will be a lot less work, too!  :)  And it solves the 
problem of not permitting a rmutex being used with condvars.


One question:  Since an adaptive kmutex_t already includes an owner 
field, would we really need to have another copy of it in the rmutex_t 
structure?



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Antti Kantee
On Sat Jul 31 2010 at 15:48:26 -0700, Paul Goyette wrote:
> >>If modload-from-modcmd is found necessary, sounds more like a case for
> >>the infamous recursive lock.
> >
> >Recursive lock is the way to go.  I think the same lock should also cover
> >all device configuration activites (i.e. autoconf) and any other
> >heavy lifting where we have chunks of the system coming and going.
> 
> Well, folks, here is a first pass recursive locks!  The attached diffs 
> are against -current as of a few minutes ago.

Oh, heh, I thought we have recursive lock support.  But with that gone
from the vfs locks, I guess not (apart from the kernel lock ;).

I'm not sure if it's a good idea to change the size of kmutex_t.  I guess
plenty of data structures have carefully been adjusted by hand to its
size and I don't know of any automatic way to recalculate that stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It feels
wrong to punish all the normal kmutex users for just one use.  It'll also
make the implementation a lot simpler to test, since it's purely MI.

"separate normal case and worst case"


Re: Modules loading modules?

2010-07-31 Thread Paul Goyette

On Wed, 28 Jul 2010, Andrew Doran wrote:


it seems to me the root problem is that module_mutex is held while
calling into the module startup routines.

thus, the right solution is to remove this requirement.


Yes, that's what is needed.


I'm far from convinced that's a good idea.  First, it will probably
make the module code a nightmare -- what happens when you have multiple
interleaved loads, some of which fail at some point in their dependency
stack, and let's just throw in a manual modunload to mix up things
further.  Second, and pretty much related to number one, it goes against
one of the most fundamental principles of robust code: atomic actions.


This: atomicity.  The atomic behaviour is relied upon to give "all or
nothing" semantics upon load and unload, like transactions in a database.

Admittely some modules (not of my making) are sloppy about this and so
break that bit of the interface contract.


If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.


Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other
heavy lifting where we have chunks of the system coming and going.


Well, folks, here is a first pass recursive locks!  The attached diffs 
are against -current as of a few minutes ago.


Some caveats:

1. This has only been compile-tested for x86 (amd64 & i386).

2. Other architectures have not even been compiled yet.  "It should
   work" but you never know.

3. HPPA is an exceptional exception here, since it is the only in-tree
   architecture I found that cannot use SIMPLE_LOCKS.

4. There is only one known use case for this so far:  modules loading
   other modules from within their xxx_modcmd() routine.  The specific
   use case we have involves loading the acpicpu driver/module which
   eventually results in an attempt to load acpiverbose.

It would be really nice if the community could

A. Compile-test on additional architectures, especially HPPA
B. Test to see that existing mutex operations still work correctly
C. Exercise the known use case if possible
D. Identify additional use cases


There is one place in the code (marked with great big XXX) where I am 
simply unsure if I need to use a CAS-type operation for updating the 
reference count on the recursive mutex.  I _think_ it is save to simply 
auto-increment the field, but I am totally willing for any of you 
experts to tell my why this might not be safe.   :)


I will be updating one of my home machines (amd64) to use this code in 
the next few days.  I don't plan on making any commits until I've had 
some positive testing feedback, and hopefully some expert commentary on 
the XXX.  (Some additional "concensus" that this is the "right thing to 
do" would also be appreciated!)




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/arch/alpha/include/mutex.h
===
RCS file: /cvsroot/src/sys/arch/alpha/include/mutex.h,v
retrieving revision 1.4
diff -u -p -r1.4 mutex.h
--- sys/arch/alpha/include/mutex.h  28 Apr 2008 20:23:11 -  1.4
+++ sys/arch/alpha/include/mutex.h  31 Jul 2010 22:34:30 -
@@ -43,7 +43,11 @@ struct kmutex {
 
 struct kmutex {
union {
-   volatile uintptr_t  mtxa_owner;
+   struct {
+   volatile uintptr_t  mtxa_owner;
+   volatile uint8_tmtxa_recurse;
+   volatile uint8_tmtxa_unused[3];
+   } a;
struct {
volatile uint8_tmtxs_flags;
ipl_cookie_tmtxs_ipl;
@@ -53,7 +57,8 @@ struct kmutex {
__cpu_simple_lock_t mtx_lock;
 };
 
-#definemtx_owner   u.mtxa_owner
+#definemtx_owner   u.a.mtxa_owner
+#definemtx_recurse u.a.mtxa_recurse
 #definemtx_flags   u.s.mtxs_flags
 #definemtx_ipl u.s.mtxs_ipl
 
Index: sys/arch/arm/include/mutex.h
===
RCS file: /cvsroot/src/sys/arch/arm/include/mutex.h,v
retrieving revision 1.10
diff -u -p -r1.10 mutex.h
--- sys/arch/arm/include/mutex.h28 Apr 2008 20:23:14 -  1.10
+++ sys/arch/arm/include/mutex.h31 Jul 2010 22:34:31 -
@@ -57,7 +57,11 @@ struct kmutex {
 struct kmutex {
union {
/* Adaptive mutex */
- 

Re: Modules loading modules?

2010-07-29 Thread Paul Goyette

On Wed, 28 Jul 2010, Andrew Doran wrote:


If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.


Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other
heavy lifting where we have chunks of the system coming and going.


OK, I'm willing to make an attempt to implement the recursive_lock if 
noone else wants to give it a try!  It will probably take some time...




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-28 Thread Andrew Doran
On Mon, Jul 26, 2010 at 02:49:56PM +0300, Antti Kantee wrote:
> On Sun Jul 25 2010 at 15:17:29 -0700, Paul Goyette wrote:
> > On Mon, 26 Jul 2010, matthew green wrote:
> > 
> > >
> > >it seems to me the root problem is that module_mutex is held while
> > >calling into the module startup routines.
> > >
> > >thus, the right solution is to remove this requirement.
> > 
> > Yes, that's what is needed.
> 
> I'm far from convinced that's a good idea.  First, it will probably
> make the module code a nightmare -- what happens when you have multiple
> interleaved loads, some of which fail at some point in their dependency
> stack, and let's just throw in a manual modunload to mix up things
> further.  Second, and pretty much related to number one, it goes against
> one of the most fundamental principles of robust code: atomic actions.

This: atomicity.  The atomic behaviour is relied upon to give "all or
nothing" semantics upon load and unload, like transactions in a database.

Admittely some modules (not of my making) are sloppy about this and so
break that bit of the interface contract.

> If modload-from-modcmd is found necessary, sounds more like a case for
> the infamous recursive lock.

Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other 
heavy lifting where we have chunks of the system coming and going.



Re: Modules loading modules?

2010-07-26 Thread Paul Goyette

On Mon, 26 Jul 2010, der Mouse wrote:


We have a modular device driver, let's call it xxxmod.  [...]  It []
might attempt to use an optional module (e.g., zzzverbose) to print
some device attachment messages.



First, a required module cannot be optional.  If the desired module
is not present, or if it is present but its own
xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated
back to the original "outer" call to module_load() which will also
fail.



The second reason why this is not suitable is that the "outer" load
will add a reference to the module, preventing it from being
auto-unloaded.


Surely the right answer here is to provide a way to say "refer to this
module, but it's ok for its load to fail, and it's ok for it to get
auto-unloaded", including passing up whatever information is necessary
for the calling module to do something useful in failure cases?


I actually considered adding a set of "desired" modules as well as the 
current "required" modules.  That would certainly take care of the 
immediate problem with the xxx_verbose modules.


But it doesn't address the need to have one module/driver loading 
another one for a child device, and it would be a maintenance nightmare 
to require updating the dependency-list every time a new child driver 
gets created.



It really seems to me that the module system is there to help us, not
to shackle us, and that if it has properties which are leading to
problems, one of the options we should at the very least be considering
is changing those properties.


Yes, the current restriction appears to be burdensome.  I'm just trying 
to get some consensus on that - so far I've seen only one negative 
opinion (from pooka@ IIRC).  If we can agree that it's desirable for one 
module's xxx_modcmd() to load another module, then we can move to step 2 
and figure out how to make it happen without breaking anything.


If we don't get consensus, then I'll most likely need to revert the 
xxx_verbose modules.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-26 Thread der Mouse
> We have a modular device driver, let's call it xxxmod.  [...]  It []
> might attempt to use an optional module (e.g., zzzverbose) to print
> some device attachment messages.

> First, a required module cannot be optional.  If the desired module
> is not present, or if it is present but its own
> xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated
> back to the original "outer" call to module_load() which will also
> fail.

> The second reason why this is not suitable is that the "outer" load
> will add a reference to the module, preventing it from being
> auto-unloaded.

Surely the right answer here is to provide a way to say "refer to this
module, but it's ok for its load to fail, and it's ok for it to get
auto-unloaded", including passing up whatever information is necessary
for the calling module to do something useful in failure cases?

It really seems to me that the module system is there to help us, not
to shackle us, and that if it has properties which are leading to
problems, one of the options we should at the very least be considering
is changing those properties.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Modules loading modules?

2010-07-26 Thread Antti Kantee
On Sun Jul 25 2010 at 15:17:29 -0700, Paul Goyette wrote:
> On Mon, 26 Jul 2010, matthew green wrote:
> 
> >
> >it seems to me the root problem is that module_mutex is held while
> >calling into the module startup routines.
> >
> >thus, the right solution is to remove this requirement.
> 
> Yes, that's what is needed.

I'm far from convinced that's a good idea.  First, it will probably
make the module code a nightmare -- what happens when you have multiple
interleaved loads, some of which fail at some point in their dependency
stack, and let's just throw in a manual modunload to mix up things
further.  Second, and pretty much related to number one, it goes against
one of the most fundamental principles of robust code: atomic actions.

If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.

(no comment on the actual problem)


Re: Modules loading modules?

2010-07-26 Thread Paul Goyette

On Mon, 26 Jul 2010, Jukka Ruohonen wrote:


On Mon, Jul 26, 2010 at 06:41:11AM +1000, matthew green wrote:

it seems to me the root problem is that module_mutex is held while
calling into the module startup routines.


Here is one related question: is it ensured that the module lock is dropped
immediately after a modular device driver returns from its attachment
routine?  I am thinking of a case where a modular driver defers its
configuration by using config_interrupts(9) or config_finalize_register(9).


Not necessarily.

If a module is being loaded as a dependency of (required by) another 
module, then the lock is not dropped.


The lock is dropped only after the requested module and all of its 
dependencies have been loaded and their xxx_modcmd() have been called.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-25 Thread Jukka Ruohonen
On Mon, Jul 26, 2010 at 06:41:11AM +1000, matthew green wrote:
> it seems to me the root problem is that module_mutex is held while
> calling into the module startup routines.

Here is one related question: is it ensured that the module lock is dropped
immediately after a modular device driver returns from its attachment
routine?  I am thinking of a case where a modular driver defers its
configuration by using config_interrupts(9) or config_finalize_register(9).

- Jukka.


re: Modules loading modules?

2010-07-25 Thread matthew green

> On Mon, 26 Jul 2010, matthew green wrote:
> 
> >
> > it seems to me the root problem is that module_mutex is held while
> > calling into the module startup routines.
> >
> > thus, the right solution is to remove this requirement.
> 
> Yes, that's what is needed.
> 
> But it sounds a lot simpler than it is.

oh, i'm sure it's not trivial.  :-(


.mrg.


re: Modules loading modules?

2010-07-25 Thread Paul Goyette

On Mon, 26 Jul 2010, matthew green wrote:



it seems to me the root problem is that module_mutex is held while
calling into the module startup routines.

thus, the right solution is to remove this requirement.


Yes, that's what is needed.

But it sounds a lot simpler than it is.  Among other issues, we'd need 
to ensure that we continue to detect circular dependencies between both 
sets of required modules.  And, since we call the module startup routine 
for a required module _before_ we call it for the required module, we 
would need to release and then reacquire the mutex, all the while being 
sure that the data structures don't get changed out from under us.


We would end up with something like this sequence,where module A 
requires A1 and A2, and module B requires B1 and B2, and hopefully 
(B1 != A && B2 != A) (which would be a hidden circular dependency).


mutex_enter()
load module A
load required module A1
mutex_exit()
call A1's init routine
mutex_enter()
load required module A2
mutex_exit()
call A2's init routine
mutex_enter()
load module B
load required module B1
mutex_exit()
call B1's init routine
mutex_enter()
load required module B2
mutex_exit()
call B2's init routine
mutex_enter()
mutex_exit()
call B's init routine
mutex_enter()
mutex_exit()
mutex_enter()
mutex_exit()
call A's init routine
mutex_enter()
mutex_exit()



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


re: Modules loading modules?

2010-07-25 Thread matthew green

it seems to me the root problem is that module_mutex is held while
calling into the module startup routines.

thus, the right solution is to remove this requirement.


.mrg.


Modules loading modules?

2010-07-25 Thread Paul Goyette


Consider the following scenario:

We have a modular device driver, let's call it xxxmod.  The driver's
xxx_modcmd(MODULE_CMD_INIT, ...) routine handles all of its initialization,
including interaction with autoconf(9).  It therefore might attempt to use
an optional module (e.g., zzzverbose) to print some device attachment
messages.

(Another possible scenario would be to have one modular driver xxxmod
that calls config_found_xxx() which in turn requires another modular
driver yyymod.)

We have no problem if the zzzverbose module is already auto-loaded. But
if the module is not present, or if it has already been auto- unloaded,
the current code will call module_autoload() to attempt to (re)load it.
Since this call happens in xxx_modcmd(), we end calling mutex_enter()
while the mutex is still owned.

There is currently a "requires" mechanism that allows recursion within
the module loading process.  But it doesn't quite solve the problem, for
at least three reasons.  First, a required module cannot be optional. If
the desired module is not present, or if it is present but its own
xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated back
to the original "outer" call to module_load() which will also fail.

The second reason why this is not suitable is that the "outer" load will
add a reference to the module, preventing it from being auto-unloaded.
(A key benefit of having the xxx_verbose modules is that their large text
tables can be unloaded.)

A third reason that tends to make this option unuseable is the need to
identify ahead of time all of the possible optional modules and maintain
the dependency list.  Adding a new driver would thus require updating the
dependency list in the parent device's driver.

Any suggestions on how to resolve this conundrum?  Without a solution,
the zzz_verbose modules and modularized drivers that might use the 
zzz_verbose modules cannot co-exist, and we won't be able to have modular

drivers for devices whose children are served by other modular drivers.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-