Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-15 Thread Paul Goyette

Updating the status on these changes:

One comment questioned whether or not a version bump was required, and 
I've more-or-less convinced myself it is at least desired.  While 
properly-working modules from the pre-update will continue to work on a 
post-update kernel, the reverse is not necessarily true.  A module 
written for a post-update kernel and which takes advantage of the 
changed locking protocol will fail on a pre-update kernel.


Another comment suggested that the name of newly-created file 
kern/kern_cfg.c should be changed to more closely match its contents, 
while an earlier comment had suggested generalizing the filename!  I've 
taken the more-specific path and the file is now called kern_cfglock.c
If other stuff gets added to this file later, we can come up with a more 
generic name at that time.


Some additional changes have been included in this latest set of diffs. 
Mostly, there were some KASSERT()s sprinkled throughout that tried to 
ensure that the code had not been called recursively.  Since recursion 
is now explicitly supported, all of the module_active stuff has been 
reworked.


Additionally, there was a statically-allocated list of pending modules 
that needed to have their initialization completed.  This has been 
changed to keep multiple stack-allocated lists, one for each depth.



I'm planning to commit these changes next weekend, unless there are any 
significant objections.  If anyone else needs to coordinate changes in 
order to ride-the-version-bump, let me know.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-08 Thread Mindaugas Rasiukevicius
Paul Goyette p...@whooppee.com wrote:
 Following up on all the various comments from jmcneil@, pooka@, and 
 rmind@, I've attached a revised set of diffs.  The most significant 
 change between this and the previous revision is the separation of the 
 kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This 
 allows inclusion of the kernel code directly into rump, rather than 
 having to re-implement it in rump/klock.c

Please use kern_cflock.c or something like that, as kern_cfg.c is too
generic for such specific facility.

-- 
Mindaugas


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to 
work on this!  :)


Attached is the latest version of this change.  For simplicity, I have 
broken the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of 
the locking in sys/kern_module.c has been updated to use this instead of 
the internal module_lock mutex.  Additionally, the module_autoload() 
routine now does its own locking, rather than requiring its caller to do 
it.  And the description of locking in the module(9) man page is 
updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the 
various xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other 
bits and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I 
would appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify 
that recursion actually works!




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/sys/param.h
===
RCS file: /cvsroot/src/sys/sys/param.h,v
retrieving revision 1.373
diff -u -p -r1.373 param.h
--- sys/sys/param.h 28 Jul 2010 11:03:47 -  1.373
+++ sys/sys/param.h 6 Aug 2010 00:46:05 -
@@ -63,7 +63,7 @@
  * 2.99.9  (299000900)
  */
 
-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */
 
 #define __NetBSD_Prereq__(M,m,p) (M) * 1) + \
 (m) * 100) + (p) * 100) = __NetBSD_Version__)
Index: sys/sys/systm.h
===
RCS file: /cvsroot/src/sys/sys/systm.h,v
retrieving revision 1.239
diff -u -p -r1.239 systm.h
--- sys/sys/systm.h 31 Jan 2010 02:04:43 -  1.239
+++ sys/sys/systm.h 6 Aug 2010 00:46:06 -
@@ -492,6 +492,11 @@ void   kernel_lock_init(void);
 void   _kernel_lock(int);
 void   _kernel_unlock(int, int *);
 
+void   kernconfig_lock_init(void);
+void   kernconfig_lock(void);
+void   kernconfig_unlock(void);
+bool   kernconfig_is_held(void);
+
 #if defined(MULTIPROCESSOR) || defined(_MODULE)
 #defineKERNEL_LOCK(count, lwp) \
 do {   \
Index: sys/sys/module.h
===
RCS file: /cvsroot/src/sys/sys/module.h,v
retrieving revision 1.24
diff -u -p -r1.24 module.h
--- sys/sys/module.h26 Jun 2010 07:23:57 -  1.24
+++ sys/sys/module.h6 Aug 2010 00:46:07 -
@@ -115,7 +115,6 @@ __link_set_add_rodata(modules, name##_mo
 TAILQ_HEAD(modlist, module);
 
 extern struct vm_map   *module_map;
-extern kmutex_tmodule_lock;
 extern u_int   module_count;
 extern u_int   module_builtinlist;
 extern struct modlist  module_list;
Index: sys/kern/kern_lock.c
===
RCS file: /cvsroot/src/sys/kern/kern_lock.c,v
retrieving revision 1.150
diff -u -p -r1.150 kern_lock.c
--- sys/kern/kern_lock.c20 Dec 2009 20:42:23 -  1.150
+++ sys/kern/kern_lock.c6 Aug 2010 00:46:07 -
@@ -39,6 +39,7 @@ __KERNEL_RCSID(0, $NetBSD: kern_lock.c,
 #include sys/systm.h
 #include sys/kernel.h
 #include sys/lockdebug.h
+#include sys/mutex.h
 #include sys/cpu.h
 #include sys/syslog.h
 #include sys/atomic.h
@@ -56,6 +57,10 @@ bool kernel_lock_dodebug;
 __cpu_simple_lock_t kernel_lock[CACHE_LINE_SIZE / sizeof(__cpu_simple_lock_t)]
 __aligned(CACHE_LINE_SIZE);
 
+static kmutex_t kernconfig_mutex;
+static lwp_t *kernconfig_lwp;
+static int kernconfig_recurse;
+
 void
 assert_sleepable(void)
 {
@@ -306,3 +311,59 @@ _kernel_unlock(int nlocks, int *countp)
if (countp != NULL)
*countp = olocks;
 }
+
+/*
+ * Functions for manipulating the kernel configuration lock.  This
+ * recursive lock should be used to protect all additions and removals
+ * of of kernel functionality, such as device configuration and loading
+ * of modular kernel components.
+ */
+
+void
+kernconfig_lock_init(void)
+{
+
+   

Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Mindaugas Rasiukevicius
Hello,

Paul Goyette p...@whooppee.com wrote:
 
 Well, it was a slow day at the office today, so I found some time to 
 work on this!  :)
 
 Attached is the latest version of this change.  For simplicity, I have 
 broken the patches up into five separate groups:
 
 ...

Few comments on the patch:

 -#define  __NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
 +#define  __NetBSD_Version__  599003900   /* NetBSD 5.99.39 */

Why bumping?

 + * of of kernel functionality, such as device configuration and loading

One of too much.

 +void
 +kernconfig_lock(void)
 +{
 + lwp_t   *l;
 +
 + /*
 +  * OK to check this unlocked, since it could only be set to
 +  * curlwp by the current thread itself, and not by an interrupt
 +  * or any other LWP.
 +  */

Please add KASSERT(!cpu_intr_p());

 + l = curlwp;
 + if (kernconfig_lwp == l) {
 + kernconfig_recurse++;
 + KASSERT(kernconfig_recurse != 0);

Such (++ then != 0) assert is a bit useless.  How about:

KASSERT(kernconfig_recurse  1);

 + } else {
 + mutex_enter(kernconfig_mutex);
 + kernconfig_lwp = l;
 + kernconfig_recurse = 1;
 + }
 +}

Somebody will confuse l with 1. :)

 + if (--kernconfig_recurse == 0) {
 + kernconfig_lwp = 0;
 + mutex_exit(kernconfig_mutex);

kernconfig_lwp = NULL;

 +bool
 +kernconfig_is_held(void)
 +{
 +
 + return (mutex_owned(kernconfig_mutex));

No need for brackets around function.

 + KASSERT(kernconfig_is_held);

But missing brackets in few cases.

Otherwise seems OK to me.  Thanks.

-- 
Mindaugas


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Fri, 6 Aug 2010, Mindaugas Rasiukevicius wrote:


Hello,

Paul Goyette p...@whooppee.com wrote:


Well, it was a slow day at the office today, so I found some time to
work on this!  :)

Attached is the latest version of this change.  For simplicity, I have
broken the patches up into five separate groups:

...


Few comments on the patch:


-#define__NetBSD_Version__  599003800   /* NetBSD 5.99.38 */
+#define__NetBSD_Version__  599003900   /* NetBSD 5.99.39 */


Why bumping?


Seemed like the right thing to do, since the locking protocol is 
changing, as well as the global module_lock mutex disappearing.  But I 
really can't hink of any ill effects, unless someone has a module that 
suddenly fails link-and-load because it references module_lock.  It's 
not very likely.


I'll defer, or I'll wait to ride someone else's bump just to be safe.




+ * of of kernel functionality, such as device configuration and loading


One of too much.


Fixed.


+void
+kernconfig_lock(void)
+{
+   lwp_t   *l;
+
+   /*
+* OK to check this unlocked, since it could only be set to
+* curlwp by the current thread itself, and not by an interrupt
+* or any other LWP.
+*/


Please add KASSERT(!cpu_intr_p());


Done.  I was worried about this, and forgot to ask.  Thanks for picking 
it up.





+   l = curlwp;
+   if (kernconfig_lwp == l) {
+   kernconfig_recurse++;
+   KASSERT(kernconfig_recurse != 0);


Such (++ then != 0) assert is a bit useless.  How about:

KASSERT(kernconfig_recurse  1);


The intent here was to ensure that we didn't overflow the counter back 
to zero.  It might have made sense when the counter was a uint8_t, but 
probably not with an int!  It can probably be removed completely.





+   } else {
+   mutex_enter(kernconfig_mutex);
+   kernconfig_lwp = l;
+   kernconfig_recurse = 1;
+   }
+}


Somebody will confuse l with 1. :)


This came straight from ad@'s suggestions, and there seems to be a lot 
of precedent in the sys/kern/ code to use l.  But I will select a better 
variable name.





+   if (--kernconfig_recurse == 0) {
+   kernconfig_lwp = 0;
+   mutex_exit(kernconfig_mutex);


kernconfig_lwp = NULL;


Ooops!  :)




+bool
+kernconfig_is_held(void)
+{
+
+   return (mutex_owned(kernconfig_mutex));


No need for brackets around function.


Got it.




+   KASSERT(kernconfig_is_held);


But missing brackets in few cases.


Yeah, you're the second one to point that out.  I fixed them once, not 
sure how they crept back in.  But they've been fixed again.



Otherwise seems OK to me.  Thanks.



Thank you for the detailed look.


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-06 Thread Paul Goyette

On Thu, 5 Aug 2010, Paul Goyette wrote:


On Thu, 5 Aug 2010, Paul Goyette wrote:


OK, got it.

I'll have another set of patches in a few days.


Well, it was a slow day at the office today, so I found some time to work on 
this!  :)


Attached is the latest version of this change.  For simplicity, I have broken 
the patches up into five separate groups:


Part 1 defines a set of new kernel locking primitives in file
kern/kern_lock.c to implement the recursive kernconfig_mutex.  (I'm not 
really stuck on the name, so open to alternative suggestions!)  All of the 
locking in sys/kern_module.c has been updated to use this instead of the 
internal module_lock mutex.  Additionally, the module_autoload() routine now 
does its own locking, rather than requiring its caller to do it.  And the 
description of locking in the module(9) man page is updated.


Part 2 removes all of the explicit module_{,un}lock() calls from the various 
xxx_verbose modules that I'd previous worked on.


Part 3 removes all of the explicit module_{,un}lock() calls from other bits 
and pieces of the kernel.  In a couple of places, new calls to

kernconfig_{,un}lock() are inserted.

Part 4 updates rump to provide equivalent locking routines.  (pooka, I would 
appreciate you looking at this.)


Part 5 adds a new atf test-case to the existing module tests to verify that 
recursion actually works!


Following up on all the various comments from jmcneil@, pooka@, and 
rmind@, I've attached a revised set of diffs.  The most significant 
change between this and the previous revision is the separation of the 
kernconfig_lock_*() stuff into a separate kern_cfg.c source file.  This 
allows inclusion of the kernel code directly into rump, rather than 
having to re-implement it in rump/klock.c




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

ML.diff.tgz
Description: Binary data


Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-05 Thread Paul Goyette

OK, got it.

I'll have another set of patches in a few days.

On Thu, 5 Aug 2010, Andrew Doran wrote:


On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote:

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
   * as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is
certainly a much larger task than dealing with the immediate problem
- modules which want to load other modules.

So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described


For module lock the wait time isn't really a big concern.  Module load
and unload is a heavyweight operation so the wait time is expected.
There is nothing intrinsically wrong with holding a mutex for a long time
provided that you've got a good handle on the potential side effects.
It's a different story for other locks in the system like say proc_lock,
which can't be held for a long time without disrupting the user experience
and/or deadlocking the system.


   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?


condvar gives you another lock in a roundabout way, without priority
inheritance to help move things along if something has clogged the
pipework. :-).  So I don't see benefit over a mutex.


2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this ultimate solution?  I think
   it would be a long time before we could find and resolve all of the
   issues that might be created in the various threaded situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those situations, and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.


I'm not suggesting that you need to tackle autoconfiguration locking at
the same time.. What I'm suggesting is to put the primitives in place
(say in kern_lock.c or something) and then use these for the benefit of
the modules code. It would provide a statment of direction and avoid
re-hashing things later when somebody decides to tackle autoconf.
Sorry for being unclear.


3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.


Ok well I think the simplest course of action would be to replace the
mutex_owned() with a global variable.  It would do much the same thing
but for the price of 4 or 8 bytes there is no question of contradicting
the mantra. So like:

/*
* Ok to check this unlocked, as for the value to equal curlwp it must
* have been set by the current thread of execution (i.e. not interrupt
* context or another LWP).
*/
l = curlwp;
if (sysconfig_lwp == l) {
/* recurse */
} else {
mutex_enter(sysconfig_lock);
sysconfig_lwp = l;
/* etc etc */
}





Re: Module and device configuration locking [was Re: Modules loading modules?]

2010-08-04 Thread Paul Goyette

On Wed, 4 Aug 2010, Andrew Doran wrote:


Sorry for not replying sooner.

Please don't add a generic recursive lock facility, it would be abused.


Yeah!  My current patches have no new generic facilities at all.


I'd like something roughly like the following.  I think this should
also cover major configuration tasks such as device attach and detach,
so it wouldn't really be module specific.  The naming is a poor suggestion.

void
sysconfig_lock(void)
{

if (mutex_owned(sysconfig_mutex)) {
/*
 * At this point it's OK for the thread to hold other
 * locks, since the system configuration lock was the
 * first taken on the way in.
 */
sysconfig_recurse++;
return;
}
/*
 * I don't remember the correct argument sequence to the
 * following, but basically we need to assert that NO locks,
 * sleep or spin, other than kernel_lock are held,
* as the system configuration lock would be the outermost
 * lock in the system.
 */
LOCKDEBUG_BARRIER(...);
mutex_enter(sysconfig_mutex);
KASSERT(sysconfig_recurse == 0);
}

For the boot autoconfig path where you have things being driven by the
kernel, this would work OK.  Situations where there's a kthread or
something attaching and detaching devices on the fly are a bit different,
since they likely have local atomicity requirements (need to take device
private locks).  There you'd probably need the caller to take the lock.
USB comes to mind (along with some hardware RAID drivers that do hotplug
attach/detach).

Thoughts?


Well, at first glance, this proposal makes sense.  But it is certainly a 
much larger task than dealing with the immediate problem - modules which 
want to load other modules.


So, I really have three questions:

1. In keeping with earlier concerns about holding locks for long periods
   of time (eeh and mrg both commented on this), the approach described
   above would seem to have an even longer hold-time than the current
   module_lock.  Would not something similar to haad's condvar approach
   be appropriate for this proposal, as well as for the more-limited-in-
   scope recursive module lock?

2. Would it be reasonable to solve the immediate problem as previously
   proposed, rather than waiting for this ultimate solution?  I think
   it would be a long time before we could find and resolve all of the
   issues that might be created in the various threaded situations
   that may exist.  I know I'm certainly not sufficiently qualified to
   identify all those situations, and I also know that I don't have
   sufficient available work-hours to do this in any reasonable time-
   frame.

   I'd still like to move forward with the most recent solution to the
   module_lock problem.

3. Since we're still technically abusing the mutex(9) man-page caveat
   about using mutex_owned() for making locking decisions, it would
   be very much appreciated (at least by myself) to have an explanation
   of WHY it is OK in this case to do it here, but not somewhere else.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-03 Thread haad
On Tue, Aug 3, 2010 at 3:03 PM, Paul Goyette p...@whooppee.com wrote:
 On Mon, 2 Aug 2010, Paul Goyette wrote:

 On Tue, 3 Aug 2010, matthew green wrote:

 i think this looks good enough.  wait for some others eyes, though.

 No hurry.  I'm going to try to add a new atf test case for the recursive
 load case, and it will take some time for me to come up to speed on the atf
 stuff. :)

 Well, still no hurry, but it would be nice if some additional eyes were
 pointed this way.  I've got the recursive-module-load test case added to the
 existing tests/modules/ stuff (and it works)!

 BTW, since this is changing the kernel ABI, I guess it will need a version
 bump.  Do we have any other bumps coming soon that we can share?

I talked with Andrew today and he said that basicaly we can use
mutex_owned for this particular case. That should make your code much
simpler :) sorry for any additional work :)



-- 


Regards.

Adam


Re: Modules loading modules?

2010-08-03 Thread Paul Goyette

On Tue, 3 Aug 2010, haad wrote:

Well, still no hurry, but it would be nice if some additional eyes were 
pointed this way.  I've got the recursive-module-load test case added

to the existing tests/modules/ stuff (and it works)!

BTW, since this is changing the kernel ABI, I guess it will need a 
version bump.  Do we have any other bumps coming soon that we can 
share?


I talked with Andrew today and he said that basicaly we can use
mutex_owned for this particular case. That should make your code much
simpler :) sorry for any additional work :)


Well, we've come full circle!  :)

It seems to me that, even if it is OK to use mutex_owned() for this 
particular case, perhaps we should not.  Most everyone seemed to agree 
that your condvar solution was technically correct, and if we did use 
the mutex_owned() solution it would only serve as a temptation for 
someone else in the future!


We now have two solutions, both of which actually work (tested on my 
home machine).  One solution (mutex_owned) might be slightly simpler and 
slightly more efficient than the other (condvar), but this section of 
code is certainly not performance-critical.  Therefore, is there any 
reason for preferring the mutex_owned solution?


I'm happy to go either way based on concensus here, but I'm leaning 
towards the condvar solution.  It doesn't require any exemptions or 
long explanations of why it is OK to use mutex_owned in ways that 
would appear to violate our own documentation.  And it also addresses 
eeh's (and mrg's) concerns of keeping the lock across potentially 
long-duration  operations.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


re: Modules loading modules?

2010-08-02 Thread matthew green

 According to the mutex(9) man page:
 
  mutex_owned(mtx)
 
 For adaptive mutexes, return non-zero if the current LWP holds
 the mutex. ...

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.

ie, you can not even KASSERT(!mutex_owned()).

the documentation matches my experience with using mutex(9) and
KASSERT().

but i'm still not sure why we're going to such lengths to hold a
lock across such a heavy weight operation like loading a module.
that may involve disk speeds, and so you're looking at waiting
millions of cycles for the lock.  aka, what eeh posted.


.mrg


Re: Modules loading modules?

2010-08-02 Thread Adam Hamsik

On Aug,Sunday 1 2010, at 3:53 PM, Antti Kantee wrote:

 On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote:
 One question:  Since an adaptive kmutex_t already includes an owner 
 field, would we really need to have another copy of it in the rmutex_t 
 structure?
 
 Good point.  I think it's ok to do:
 
  if (mutex_owned(mtx))
cnt++
  else
lock

When I did locking for dm driver, it was said by rmind@ that mutex_owned should 
not be used for any locking decisions and should be used only for diagnostic 
purposes. What I would in this case is add two new routines to module_framework 
- 

module_lock()
module_unlock()

These two will then do something like this.

kmutex_t module_lock;
uint32_t refcnt;
lwp_t mod_lwp;
condvar_t cv;


module_lock() {

mutex_enter(module_lock);
start
if (refcnt == 0) {
refcnt++;
mod_lwp = curlwp;
else {
if (mod_lwp == curlwp)
refcnt++;
else {
cv_wait(cv);
goto start
}
}

mutex_exit(module_lock);
}


mudule_unlock() {
  mutex_enter(module_lock);
KASSERT(mod_lwp != curlwp); /* module_unlock should be only called from 
lwp which called module_lock */

refcnt--;

if (refcnt == 0) {
mod_lwp = 0;
cv_broadcast(cv);
}

  mutex_exit(module_lock);
}

This way module_lock can be called multiple times by one lwp and will sleep for 
everyone else. 

Regards

Adam.



Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Sun, 1 Aug 2010, Eduardo Horvath wrote:


Why is there a need to hold a mutex across module loading instead of, say,
having a reference count tht's protected by a mutex?  I thought holding a
mutex across a potentially blocking operation such as a module load is
considered bad practice.


Please note that I am not introducing the holding of the lock over the 
potentially blocking operation.  This is already done.


Several others have already indicated that the atomicity of the module 
load (including the loading of any required modules) is a desirable 
property.  I'm simply looking for a mechanism that allows us to keep 
this property while allowing modules to load other modules.




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Locking for module_autoload() (was: Modules loading modules?)

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Adam Hamsik wrote:



On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote:


On Sun, 1 Aug 2010, John Nemeth wrote:


   I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?


Attached are diffs for moving the locking out of each caller and into 
module_autoload() itself.  Most of these changes are fairly trivial, but a 
couple of them should be looked at more closely in case I've missed any 
subtleties in the original code:

kern/kern_syscall.c
net/if_ppp.c


I think that this patch has same problem as original code you can call 
module_autoload twice in same lwp. (you are holding module_lock while 
calling module_do_load).


This patch wasn;t intended to solve the nested-call problem.  This patch 
only purpose was to modify the existing semantics of the module_autoload 
routine to match that of the other module routines.  In particular, 
module_autoload is the only routine that requires its caller to acquire 
the mutex first;  all the other routines do all the required locking 
internally to each routine.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-02 Thread Antti Kantee
On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:
 this is an incomplete reading of the manual page, and you can not
 use mutex_owned() the way you are trying to (regardless of what
 pooka posted.) you can't even using it in various forms of assertions
 safely.  from the man page:
 
   It should not be used to make locking decisions at run time, or to
   verify that a lock is not held.

That's the mantra, yes.

 ie, you can not even KASSERT(!mutex_owned()).

Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.

Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().

 but i'm still not sure why we're going to such lengths to hold a
 lock across such a heavy weight operation like loading a module.
 that may involve disk speeds, and so you're looking at waiting
 millions of cycles for the lock.  aka, what eeh posted.

It's held for millions of cycles already now and nobody has pointed out
measurable problems.  But if it is deemed necessary, you can certainly
hide a cv underneath.  The efficiency requirements for the module lock
are probably anyway in the who cares wins ... not spectrum.  At least
I'm not aware of any fastpath wanting to use it.

Anyway, no real opinion there.  A cv most likely is the safe, no-brainer
choice.


Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Antti Kantee wrote:


On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.


That's the mantra, yes.


ie, you can not even KASSERT(!mutex_owned()).


Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.

Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().


but i'm still not sure why we're going to such lengths to hold a
lock across such a heavy weight operation like loading a module.
that may involve disk speeds, and so you're looking at waiting
millions of cycles for the lock.  aka, what eeh posted.


It's held for millions of cycles already now and nobody has pointed out
measurable problems.  But if it is deemed necessary, you can certainly
hide a cv underneath.  The efficiency requirements for the module lock
are probably anyway in the who cares wins ... not spectrum.  At least
I'm not aware of any fastpath wanting to use it.

Anyway, no real opinion there.  A cv most likely is the safe, no-brainer
choice.


Yes, the cv mechanism, coupled with changing the semantics of 
module_autoload() (to not require the caller to obtain the lock) makes 
good sense here.  I'm working on it and I should be able to post new 
diffs tomorrow.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Locking for module_autoload() (was: Modules loading modules?)

2010-08-02 Thread Adam Hamsik

On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote:

 On Sun, 1 Aug 2010, John Nemeth wrote:
 
I'm thinking the acquisition of module_lock should be pushed into
 module_autoload() instead of having the caller acquire it for this very
 reason.  It makes it hard to change the way locking works in the
 MODULAR code if you expect the caller to acquire the lock.  I don't
 know why it was done this way originally, or what the consequences (if
 any) would be for making the change.  Andrew, any thoughts on this?
 
 Attached are diffs for moving the locking out of each caller and into 
 module_autoload() itself.  Most of these changes are fairly trivial, but a 
 couple of them should be looked at more closely in case I've missed any 
 subtleties in the original code:
 
   kern/kern_syscall.c
   net/if_ppp.c

I think that this patch has same problem as original code you can call 
module_autoload twice in same lwp. (you are holding module_lock while calling 
module_do_load).

Regards

Adam.



Re: Modules loading modules?

2010-08-02 Thread Paul Goyette

On Mon, 2 Aug 2010, Paul Goyette wrote:

Yes, the cv mechanism, coupled with changing the semantics of 
module_autoload() (to not require the caller to obtain the lock) makes good 
sense here.  I'm working on it and I should be able to post new diffs 
tomorrow.



OK, it's time for Round #3!

Attached are diffs that implement Adam's suggested cv-based solution,
along with moving the locking for module_autolock() into the routine 
itself (for consistency with other routines).  I've also updated the 
module(4) man page this time around.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-

MODULE_LOCKING.diff.gz
Description: Binary data


re: Modules loading modules?

2010-08-02 Thread matthew green

 On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:
  this is an incomplete reading of the manual page, and you can not
  use mutex_owned() the way you are trying to (regardless of what
  pooka posted.) you can't even using it in various forms of assertions
  safely.  from the man page:
  
  It should not be used to make locking decisions at run time, or to
  verify that a lock is not held.
 
 That's the mantra, yes.
 
  ie, you can not even KASSERT(!mutex_owned()).
 
 Strictly speaking you can in a case where you have two locks always
 are taken as l1,l2 and released as l2,l1 provided you're not dealing
 with a spin mutex.  Does it make any sense?  no (l2 is useless).
 Is it possible?  yes.

what is not possible is to safely to what the manual says.  you will
not have the right thing happen.  i've seen this when i've been
working on the sparc64 pmap.

 Now, on to sensible stuff.  I'm quite certain that warning was written
 to make people avoid writing bad code without understanding locking --
 if you need to used mutex_owned() to decide if you should lock a normal
 mutex, your code is broken.  The other less likely possibility is that
 someone plans to change mutex_owned in the future.
 
 Further data point: the same warning is in rw_held, yet it was used to
 implement recursive locking in vlockmgr until its very recent demise.
 
 Ignoring man page mantras and focusing on how the code works, I do not
 see anything wrong with Paul's use of mutex_owned().

this just does not match my actual experience in the kernel.  i had
weird pmap-style problems and asserts firing wrongly while i did not
obey the rules in the manual directly.


.mrg.


re: Modules loading modules?

2010-08-02 Thread Paul Goyette
The most recent verion of this patch set was recently posted, and it 
uses the cv_wait() mechanism suggested by Adam Hamsik.


This discussion is still very intresting, but perhaps we could start a 
new thread (or at least change the subject line)?  :)



On Tue, 3 Aug 2010, matthew green wrote:




On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote:

this is an incomplete reading of the manual page, and you can not
use mutex_owned() the way you are trying to (regardless of what
pooka posted.) you can't even using it in various forms of assertions
safely.  from the man page:

It should not be used to make locking decisions at run time, or to
verify that a lock is not held.


That's the mantra, yes.


ie, you can not even KASSERT(!mutex_owned()).


Strictly speaking you can in a case where you have two locks always
are taken as l1,l2 and released as l2,l1 provided you're not dealing
with a spin mutex.  Does it make any sense?  no (l2 is useless).
Is it possible?  yes.


what is not possible is to safely to what the manual says.  you will
not have the right thing happen.  i've seen this when i've been
working on the sparc64 pmap.


Now, on to sensible stuff.  I'm quite certain that warning was written
to make people avoid writing bad code without understanding locking --
if you need to used mutex_owned() to decide if you should lock a normal
mutex, your code is broken.  The other less likely possibility is that
someone plans to change mutex_owned in the future.

Further data point: the same warning is in rw_held, yet it was used to
implement recursive locking in vlockmgr until its very recent demise.

Ignoring man page mantras and focusing on how the code works, I do not
see anything wrong with Paul's use of mutex_owned().


this just does not match my actual experience in the kernel.  i had
weird pmap-style problems and asserts firing wrongly while i did not
obey the rules in the manual directly.


.mrg.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-02 Thread Antti Kantee
On Tue Aug 03 2010 at 02:17:43 +1000, matthew green wrote:
  Now, on to sensible stuff.  I'm quite certain that warning was written
  to make people avoid writing bad code without understanding locking --
  if you need to used mutex_owned() to decide if you should lock a normal
  mutex, your code is broken.  The other less likely possibility is that
  someone plans to change mutex_owned in the future.
  
  Further data point: the same warning is in rw_held, yet it was used to
  implement recursive locking in vlockmgr until its very recent demise.
  
  Ignoring man page mantras and focusing on how the code works, I do not
  see anything wrong with Paul's use of mutex_owned().
 
 this just does not match my actual experience in the kernel.  i had
 weird pmap-style problems and asserts firing wrongly while i did not
 obey the rules in the manual directly.

Not knowing more details it's difficult to comment.  But since you are
talking about the pmap, maybe your experiences are with spin mutexes
instead of adaptive ones?


Re: Modules loading modules?

2010-08-01 Thread Antti Kantee
On Sat Jul 31 2010 at 15:48:26 -0700, Paul Goyette wrote:
 If modload-from-modcmd is found necessary, sounds more like a case for
 the infamous recursive lock.
 
 Recursive lock is the way to go.  I think the same lock should also cover
 all device configuration activites (i.e. autoconf) and any other
 heavy lifting where we have chunks of the system coming and going.
 
 Well, folks, here is a first pass recursive locks!  The attached diffs 
 are against -current as of a few minutes ago.

Oh, heh, I thought we have recursive lock support.  But with that gone
from the vfs locks, I guess not (apart from the kernel lock ;).

I'm not sure if it's a good idea to change the size of kmutex_t.  I guess
plenty of data structures have carefully been adjusted by hand to its
size and I don't know of any automatic way to recalculate that stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It feels
wrong to punish all the normal kmutex users for just one use.  It'll also
make the implementation a lot simpler to test, since it's purely MI.

separate normal case and worst case


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, Antti Kantee wrote:


Well, folks, here is a first pass recursive locks!  The attached diffs
are against -current as of a few minutes ago.


Oh, heh, I thought we have recursive lock support.  But with that gone
from the vfs locks, I guess not (apart from the kernel lock ;).


:)



I'm not sure if it's a good idea to change the size of kmutex_t.  I guess
plenty of data structures have carefully been adjusted by hand to its
size and I don't know of any automatic way to recalculate that stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It feels
wrong to punish all the normal kmutex users for just one use.  It'll also
make the implementation a lot simpler to test, since it's purely MI.

separate normal case and worst case


Good point, and it will be a lot less work, too!  :)  And it solves the 
problem of not permitting a rmutex being used with condvars.


One question:  Since an adaptive kmutex_t already includes an owner 
field, would we really need to have another copy of it in the rmutex_t 
structure?



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Antti Kantee
On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote:
 One question:  Since an adaptive kmutex_t already includes an owner 
 field, would we really need to have another copy of it in the rmutex_t 
 structure?

Good point.  I think it's ok to do:

  if (mutex_owned(mtx))
cnt++
  else
lock


Re: Modules loading modules?

2010-08-01 Thread Eduardo Horvath
On Sun, 1 Aug 2010, Paul Goyette wrote:

 Good point, and it will be a lot less work, too!  :)  And it solves the
 problem of not permitting a rmutex being used with condvars.
 
 One question:  Since an adaptive kmutex_t already includes an owner field,
 would we really need to have another copy of it in the rmutex_t structure?

Why is there a need to hold a mutex across module loading instead of, say, 
having a reference count tht's protected by a mutex?  I thought holding a 
mutex across a potentially blocking operation such as a module load is 
considered bad practice.

Eduardo


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, Antti Kantee wrote:


I'm not sure if it's a good idea to change the size of kmutex_t.  I
guess plenty of data structures have carefully been adjusted by hand
to its size and I don't know of any automatic way to recalculate that
stuff.

Even if not, since this is the only user and we probably won't have
that many of them even in the future, why not just define a new type
``rmutex'' which contains a kmutex, an owner and the counter?  It 
feels wrong to punish all the normal kmutex users for just one use.

It'll also make the implementation a lot simpler to test, since it's
purely MI.

separate normal case and worst case


Round two!  Taking pooka's suggestion, this version is built on top of 
(rather than beside) the existing non-recursive mutex.  As such, it does 
not affect any MD code.


Attached is a set of diffs that

1. Adds sys/sys/rmutex.h and sys/kern/kern_rmutex.c to implement
   recursive adaptive mutexes.  (Conspicuously missing is an rmutex(9)
   man page...  It will happen before this gets committed.)

2. Converts the existing module_lock from a normal kmutex_t to an
   rmutex_t

3. Updates all of the (surprisingly many) places where module_lock
   is acquired.

Compile-tested on port-amd64 (including rumptest).  Since there are no 
MD-changes in this version, there shouldn't be any issues with 
building on other ports.



As previously noted, there is only one known use case for this so far: 
modules loading other modules from within their xxx_modcmd() routine. 
The specific use case we have involves loading the acpicpu driver/module 
which eventually results in an attempt to load acpiverbose.


It would be really nice if the community could

A. Compile-test on additional architectures
B. Test to see that existing mutex operations still work correctly
C. Exercise the known use case if possible
D. Identify additional use cases


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/conf/files
===
RCS file: /cvsroot/src/sys/conf/files,v
retrieving revision 1.992
diff -u -p -r1.992 files
--- sys/conf/files  7 Jul 2010 01:09:39 -   1.992
+++ sys/conf/files  2 Aug 2010 00:08:26 -
@@ -1468,6 +1468,7 @@ file  kern/kern_prot.c
 file   kern/kern_ras.c
 file   kern/kern_rate.c
 file   kern/kern_resource.c
+file   kern/kern_rmutex.c
 file   kern/kern_runq.c
 file   kern/kern_rwlock.c
 file   kern/kern_rwlock_obj.c
Index: sys/sys/rmutex.h
===
RCS file: sys/sys/rmutex.h
diff -N sys/sys/rmutex.h
--- /dev/null   1 Jan 1970 00:00:00 -
+++ sys/sys/rmutex.h2 Aug 2010 00:08:27 -
@@ -0,0 +1,53 @@
+/* $NetBSD$*/
+
+/*-
+ * Copyright (c) 2010 The NetBSD Foundation, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
+ * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+ * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef_SYS_RMUTEX_H_
+#define_SYS_RMUTEX_H_
+
+#include sys/mutex.h
+
+/*
+ * A recursive mutex is simply an adaptive mutex which can be acquired
+ * multiple times by a single owner.
+ */
+ 
+struct rmutex {
+   kmutex_trmtx_mtx;
+   uint32_trmtx_recurse;
+};
+
+typedef struct rmutex rmutex_t;
+
+void   rmutex_init(rmutex_t *);
+void   rmutex_destroy(rmutex_t *);
+void   

Re: Modules loading modules?

2010-08-01 Thread Quentin Garnier
On Sun, Aug 01, 2010 at 05:17:01PM -0700, Paul Goyette wrote:
[...]
 +void
 +rmutex_enter(rmutex_t *rmtx)
 +{
 + if (mutex_owned(rmtx-rmtx_mtx)) {
 + rmtx-rmtx_recurse++;
 + KASSERT(rmtx-rmtx_recurse != 0);
 + } else {
 + mutex_enter(rmtx-rmtx_mtx);
 + rmtx-rmtx_recurse = 1;
 + }
 +}
 +
 +void
 +rmutex_exit(rmutex_t *rmtx)
 +{
 + KASSERT(mutex_owned(rmtx-rmtx_mtx));
 + KASSERT(rmtx-rmtx_recurse != 0);
 + if (--rmtx-rmtx_recurse == 0)
 + mutex_exit(rmtx-rmtx_mtx);
 +}
 +
 +int
 +rmutex_tryenter(rmutex_t *rmtx)
 +{
 + int rv = 1;
 +
 + if (mutex_owned(rmtx-rmtx_mtx)) {
 + rmtx-rmtx_recurse++;
 + KASSERT(rmtx-rmtx_recurse != 0);
 + } else if ((rv = mutex_tryenter(rmtx-rmtx_mtx)) != 0) {
 + rmtx-rmtx_recurse++;
 + KASSERT(rmtx-rmtx_recurse != 0);
 + }
 + return rv;
 +}

I am probably not getting the idea, but I fail to see how this qualifies
as a mutex.  My reading of this piece of code is that rmutex_enter() is
always going to succeed immediately.

Please correct me if I'm wrong--and I am very probably wrong--, but
isn't the point of a recursive mutex to allow the thread initially
taking the mutex to take it again, but not other threads?  The code
would have to be a lot more complicated than that.

All I see here is a fancy no-op, but maybe it's just me.

-- 
Quentin Garnier - c...@cubidou.net - c...@netbsd.org
See the look on my face from staying too long in one place
[...] every time the morning breaks I know I'm closer to falling
KT Tunstall, Saving My Face, Drastic Fantastic, 2007.


pgpgTt12zdsHJ.pgp
Description: PGP signature


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Mon, 2 Aug 2010, Quentin Garnier wrote:


+int
+rmutex_tryenter(rmutex_t *rmtx)
+{
+   int rv = 1;
+
+   if (mutex_owned(rmtx-rmtx_mtx)) {
+   rmtx-rmtx_recurse++;
+   KASSERT(rmtx-rmtx_recurse != 0);
+   } else if ((rv = mutex_tryenter(rmtx-rmtx_mtx)) != 0) {
+   rmtx-rmtx_recurse++;
+   KASSERT(rmtx-rmtx_recurse != 0);
+   }
+   return rv;
+}


I am probably not getting the idea, but I fail to see how this qualifies
as a mutex.  My reading of this piece of code is that rmutex_enter() is
always going to succeed immediately.


According to the mutex(9) man page:

mutex_owned(mtx)

   For adaptive mutexes, return non-zero if the current LWP holds
   the mutex. ...

The rmutex_init() code ensures that we have an adaptive mutex.  So what 
this says is


If _we_ already own the mutex,
increment counter
(rv is still initialized to 1)
Else
if we can acquire the mutex (updating rv)
increment the counter

Return



Please correct me if I'm wrong--and I am very probably wrong--, but
isn't the point of a recursive mutex to allow the thread initially
taking the mutex to take it again, but not other threads?  The code
would have to be a lot more complicated than that.

All I see here is a fancy no-op, but maybe it's just me.


It would be a no-op for a recursive spin mutex.  That's why I didn't
allow them!  :)


-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Quentin Garnier
On Sun, Aug 01, 2010 at 06:05:20PM -0700, Paul Goyette wrote:
 On Mon, 2 Aug 2010, Quentin Garnier wrote:
 
 +int
 +rmutex_tryenter(rmutex_t *rmtx)
 +{
 +   int rv = 1;
 +
 +   if (mutex_owned(rmtx-rmtx_mtx)) {
 +   rmtx-rmtx_recurse++;
 +   KASSERT(rmtx-rmtx_recurse != 0);
 +   } else if ((rv = mutex_tryenter(rmtx-rmtx_mtx)) != 0) {
 +   rmtx-rmtx_recurse++;
 +   KASSERT(rmtx-rmtx_recurse != 0);
 +   }
 +   return rv;
 +}
 
 I am probably not getting the idea, but I fail to see how this qualifies
 as a mutex.  My reading of this piece of code is that rmutex_enter() is
 always going to succeed immediately.
 
 According to the mutex(9) man page:
 
 mutex_owned(mtx)
 
For adaptive mutexes, return non-zero if the current LWP holds
the mutex. ...

Ah, that was what I was missing.  Thanks.

-- 
Quentin Garnier - c...@cubidou.net - c...@netbsd.org
See the look on my face from staying too long in one place
[...] every time the morning breaks I know I'm closer to falling
KT Tunstall, Saving My Face, Drastic Fantastic, 2007.


pgp0WclWeuAf7.pgp
Description: PGP signature


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Sun, 1 Aug 2010, John Nemeth wrote:


I'm thinking the acquisition of module_lock should be pushed into
module_autoload() instead of having the caller acquire it for this very
reason.  It makes it hard to change the way locking works in the
MODULAR code if you expect the caller to acquire the lock.  I don't
know why it was done this way originally, or what the consequences (if
any) would be for making the change.  Andrew, any thoughts on this?


I was thinking the same thing.

There is one case in kern/kern_exec.c where the lock was held while the 
code loops through a list of possible modules.  Looking closer, that 
code actually releases the lock and yield()s before re-acquiring it 
within the loop, so it probably could be modified without any adverse 
effects.


This would actually simplify the interface, since ALL of the routines 
would be doing their own locking, and noone in the rest of the kernel 
would need to know about the lock.





-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-08-01 Thread Paul Goyette

On Mon, 2 Aug 2010, Masao Uebayashi wrote:


mutex_owned(mtx)

  ...

  mutex_owned() is provided for making diagnostic checks to verify
  that a lock is held.  For example:

  KASSERT(mutex_owned(driver_lock));

  It should not be used to make locking decisions at run time, or to
  verify that a lock is not held.

Thus you can't use it for your decision.


Yes, I did read the mutex(9) man page.

However, your assertion is at odds with pooka's suggestion, from
http://mail-index.netbsd.org/tech-kern/2010/08/01/msg008632.html


Good point.  I think it's ok to do:

  if (mutex_owned(mtx))
cnt++
  else
lock




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-31 Thread Paul Goyette

On Wed, 28 Jul 2010, Andrew Doran wrote:


it seems to me the root problem is that module_mutex is held while
calling into the module startup routines.

thus, the right solution is to remove this requirement.


Yes, that's what is needed.


I'm far from convinced that's a good idea.  First, it will probably
make the module code a nightmare -- what happens when you have multiple
interleaved loads, some of which fail at some point in their dependency
stack, and let's just throw in a manual modunload to mix up things
further.  Second, and pretty much related to number one, it goes against
one of the most fundamental principles of robust code: atomic actions.


This: atomicity.  The atomic behaviour is relied upon to give all or
nothing semantics upon load and unload, like transactions in a database.

Admittely some modules (not of my making) are sloppy about this and so
break that bit of the interface contract.


If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.


Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other
heavy lifting where we have chunks of the system coming and going.


Well, folks, here is a first pass recursive locks!  The attached diffs 
are against -current as of a few minutes ago.


Some caveats:

1. This has only been compile-tested for x86 (amd64  i386).

2. Other architectures have not even been compiled yet.  It should
   work but you never know.

3. HPPA is an exceptional exception here, since it is the only in-tree
   architecture I found that cannot use SIMPLE_LOCKS.

4. There is only one known use case for this so far:  modules loading
   other modules from within their xxx_modcmd() routine.  The specific
   use case we have involves loading the acpicpu driver/module which
   eventually results in an attempt to load acpiverbose.

It would be really nice if the community could

A. Compile-test on additional architectures, especially HPPA
B. Test to see that existing mutex operations still work correctly
C. Exercise the known use case if possible
D. Identify additional use cases


There is one place in the code (marked with great big XXX) where I am 
simply unsure if I need to use a CAS-type operation for updating the 
reference count on the recursive mutex.  I _think_ it is save to simply 
auto-increment the field, but I am totally willing for any of you 
experts to tell my why this might not be safe.   :)


I will be updating one of my home machines (amd64) to use this code in 
the next few days.  I don't plan on making any commits until I've had 
some positive testing feedback, and hopefully some expert commentary on 
the XXX.  (Some additional concensus that this is the right thing to 
do would also be appreciated!)




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-Index: sys/arch/alpha/include/mutex.h
===
RCS file: /cvsroot/src/sys/arch/alpha/include/mutex.h,v
retrieving revision 1.4
diff -u -p -r1.4 mutex.h
--- sys/arch/alpha/include/mutex.h  28 Apr 2008 20:23:11 -  1.4
+++ sys/arch/alpha/include/mutex.h  31 Jul 2010 22:34:30 -
@@ -43,7 +43,11 @@ struct kmutex {
 
 struct kmutex {
union {
-   volatile uintptr_t  mtxa_owner;
+   struct {
+   volatile uintptr_t  mtxa_owner;
+   volatile uint8_tmtxa_recurse;
+   volatile uint8_tmtxa_unused[3];
+   } a;
struct {
volatile uint8_tmtxs_flags;
ipl_cookie_tmtxs_ipl;
@@ -53,7 +57,8 @@ struct kmutex {
__cpu_simple_lock_t mtx_lock;
 };
 
-#definemtx_owner   u.mtxa_owner
+#definemtx_owner   u.a.mtxa_owner
+#definemtx_recurse u.a.mtxa_recurse
 #definemtx_flags   u.s.mtxs_flags
 #definemtx_ipl u.s.mtxs_ipl
 
Index: sys/arch/arm/include/mutex.h
===
RCS file: /cvsroot/src/sys/arch/arm/include/mutex.h,v
retrieving revision 1.10
diff -u -p -r1.10 mutex.h
--- sys/arch/arm/include/mutex.h28 Apr 2008 20:23:14 -  1.10
+++ sys/arch/arm/include/mutex.h31 Jul 2010 22:34:31 -
@@ -57,7 +57,11 @@ struct kmutex {
 struct kmutex {
union {
/* Adaptive mutex */
-   

Re: Modules loading modules?

2010-07-29 Thread Paul Goyette

On Wed, 28 Jul 2010, Andrew Doran wrote:


If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.


Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other
heavy lifting where we have chunks of the system coming and going.


OK, I'm willing to make an attempt to implement the recursive_lock if 
noone else wants to give it a try!  It will probably take some time...




-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: Modules loading modules?

2010-07-28 Thread Andrew Doran
On Mon, Jul 26, 2010 at 02:49:56PM +0300, Antti Kantee wrote:
 On Sun Jul 25 2010 at 15:17:29 -0700, Paul Goyette wrote:
  On Mon, 26 Jul 2010, matthew green wrote:
  
  
  it seems to me the root problem is that module_mutex is held while
  calling into the module startup routines.
  
  thus, the right solution is to remove this requirement.
  
  Yes, that's what is needed.
 
 I'm far from convinced that's a good idea.  First, it will probably
 make the module code a nightmare -- what happens when you have multiple
 interleaved loads, some of which fail at some point in their dependency
 stack, and let's just throw in a manual modunload to mix up things
 further.  Second, and pretty much related to number one, it goes against
 one of the most fundamental principles of robust code: atomic actions.

This: atomicity.  The atomic behaviour is relied upon to give all or
nothing semantics upon load and unload, like transactions in a database.

Admittely some modules (not of my making) are sloppy about this and so
break that bit of the interface contract.

 If modload-from-modcmd is found necessary, sounds more like a case for
 the infamous recursive lock.

Recursive lock is the way to go.  I think the same lock should also cover
all device configuration activites (i.e. autoconf) and any other 
heavy lifting where we have chunks of the system coming and going.



Re: Modules loading modules?

2010-07-26 Thread Antti Kantee
On Sun Jul 25 2010 at 15:17:29 -0700, Paul Goyette wrote:
 On Mon, 26 Jul 2010, matthew green wrote:
 
 
 it seems to me the root problem is that module_mutex is held while
 calling into the module startup routines.
 
 thus, the right solution is to remove this requirement.
 
 Yes, that's what is needed.

I'm far from convinced that's a good idea.  First, it will probably
make the module code a nightmare -- what happens when you have multiple
interleaved loads, some of which fail at some point in their dependency
stack, and let's just throw in a manual modunload to mix up things
further.  Second, and pretty much related to number one, it goes against
one of the most fundamental principles of robust code: atomic actions.

If modload-from-modcmd is found necessary, sounds more like a case for
the infamous recursive lock.

(no comment on the actual problem)


Re: Modules loading modules?

2010-07-26 Thread Paul Goyette

On Mon, 26 Jul 2010, der Mouse wrote:


We have a modular device driver, let's call it xxxmod.  [...]  It []
might attempt to use an optional module (e.g., zzzverbose) to print
some device attachment messages.



First, a required module cannot be optional.  If the desired module
is not present, or if it is present but its own
xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated
back to the original outer call to module_load() which will also
fail.



The second reason why this is not suitable is that the outer load
will add a reference to the module, preventing it from being
auto-unloaded.


Surely the right answer here is to provide a way to say refer to this
module, but it's ok for its load to fail, and it's ok for it to get
auto-unloaded, including passing up whatever information is necessary
for the calling module to do something useful in failure cases?


I actually considered adding a set of desired modules as well as the 
current required modules.  That would certainly take care of the 
immediate problem with the xxx_verbose modules.


But it doesn't address the need to have one module/driver loading 
another one for a child device, and it would be a maintenance nightmare 
to require updating the dependency-list every time a new child driver 
gets created.



It really seems to me that the module system is there to help us, not
to shackle us, and that if it has properties which are leading to
problems, one of the options we should at the very least be considering
is changing those properties.


Yes, the current restriction appears to be burdensome.  I'm just trying 
to get some consensus on that - so far I've seen only one negative 
opinion (from pooka@ IIRC).  If we can agree that it's desirable for one 
module's xxx_modcmd() to load another module, then we can move to step 2 
and figure out how to make it happen without breaking anything.


If we don't get consensus, then I'll most likely need to revert the 
xxx_verbose modules.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Modules loading modules?

2010-07-25 Thread Paul Goyette


Consider the following scenario:

We have a modular device driver, let's call it xxxmod.  The driver's
xxx_modcmd(MODULE_CMD_INIT, ...) routine handles all of its initialization,
including interaction with autoconf(9).  It therefore might attempt to use
an optional module (e.g., zzzverbose) to print some device attachment
messages.

(Another possible scenario would be to have one modular driver xxxmod
that calls config_found_xxx() which in turn requires another modular
driver yyymod.)

We have no problem if the zzzverbose module is already auto-loaded. But
if the module is not present, or if it has already been auto- unloaded,
the current code will call module_autoload() to attempt to (re)load it.
Since this call happens in xxx_modcmd(), we end calling mutex_enter()
while the mutex is still owned.

There is currently a requires mechanism that allows recursion within
the module loading process.  But it doesn't quite solve the problem, for
at least three reasons.  First, a required module cannot be optional. If
the desired module is not present, or if it is present but its own
xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated back
to the original outer call to module_load() which will also fail.

The second reason why this is not suitable is that the outer load will
add a reference to the module, preventing it from being auto-unloaded.
(A key benefit of having the xxx_verbose modules is that their large text
tables can be unloaded.)

A third reason that tends to make this option unuseable is the need to
identify ahead of time all of the possible optional modules and maintain
the dependency list.  Adding a new driver would thus require updating the
dependency list in the parent device's driver.

Any suggestions on how to resolve this conundrum?  Without a solution,
the zzz_verbose modules and modularized drivers that might use the 
zzz_verbose modules cannot co-exist, and we won't be able to have modular

drivers for devices whose children are served by other modular drivers.



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


re: Modules loading modules?

2010-07-25 Thread matthew green

 On Mon, 26 Jul 2010, matthew green wrote:
 
 
  it seems to me the root problem is that module_mutex is held while
  calling into the module startup routines.
 
  thus, the right solution is to remove this requirement.
 
 Yes, that's what is needed.
 
 But it sounds a lot simpler than it is.

oh, i'm sure it's not trivial.  :-(


.mrg.


Re: Modules loading modules?

2010-07-25 Thread Jukka Ruohonen
On Mon, Jul 26, 2010 at 06:41:11AM +1000, matthew green wrote:
 it seems to me the root problem is that module_mutex is held while
 calling into the module startup routines.

Here is one related question: is it ensured that the module lock is dropped
immediately after a modular device driver returns from its attachment
routine?  I am thinking of a case where a modular driver defers its
configuration by using config_interrupts(9) or config_finalize_register(9).

- Jukka.