Re: Module and device configuration locking [was Re: Modules loading modules?]
Updating the status on these changes: One comment questioned whether or not a version bump was required, and I've more-or-less convinced myself it is at least desired. While properly-working modules from the pre-update will continue to work on a post-update kernel, the reverse is not necessarily true. A module written for a post-update kernel and which takes advantage of the changed locking protocol will fail on a pre-update kernel. Another comment suggested that the name of newly-created file kern/kern_cfg.c should be changed to more closely match its contents, while an earlier comment had suggested generalizing the filename! I've taken the more-specific path and the file is now called kern_cfglock.c If other stuff gets added to this file later, we can come up with a more generic name at that time. Some additional changes have been included in this latest set of diffs. Mostly, there were some KASSERT()s sprinkled throughout that tried to ensure that the code had not been called recursively. Since recursion is now explicitly supported, all of the "module_active" stuff has been reworked. Additionally, there was a statically-allocated list of "pending" modules that needed to have their initialization completed. This has been changed to keep multiple stack-allocated lists, one for each depth. I'm planning to commit these changes next weekend, unless there are any significant objections. If anyone else needs to coordinate changes in order to "ride-the-version-bump", let me know. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | - ML.diff.tgz Description: Binary data
Re: Module and device configuration locking [was Re: Modules loading modules?]
On Sun, 8 Aug 2010, Mindaugas Rasiukevicius wrote: Paul Goyette wrote: Following up on all the various comments from jmcneil@, pooka@, and rmind@, I've attached a revised set of diffs. The most significant change between this and the previous revision is the separation of the kernconfig_lock_*() stuff into a separate kern_cfg.c source file. This allows inclusion of the kernel code directly into rump, rather than having to re-implement it in rump/klock.c Please use kern_cflock.c or something like that, as kern_cfg.c is too generic for such specific facility. My original plan was to use kern_cfglock.c but pooka@ suggested using a more-generic name "in case we have more stuff in the future". :) I personally prefer the more-specific name myself, but I cannot make you both happy! - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Module and device configuration locking [was Re: Modules loading modules?]
Paul Goyette wrote: > Following up on all the various comments from jmcneil@, pooka@, and > rmind@, I've attached a revised set of diffs. The most significant > change between this and the previous revision is the separation of the > kernconfig_lock_*() stuff into a separate kern_cfg.c source file. This > allows inclusion of the kernel code directly into rump, rather than > having to re-implement it in rump/klock.c Please use kern_cflock.c or something like that, as kern_cfg.c is too generic for such specific facility. -- Mindaugas
Re: Module and device configuration locking [was Re: Modules loading modules?]
On Thu, 5 Aug 2010, Paul Goyette wrote: On Thu, 5 Aug 2010, Paul Goyette wrote: OK, got it. I'll have another set of patches in a few days. Well, it was a slow day at the office today, so I found some time to work on this! :) Attached is the latest version of this change. For simplicity, I have broken the patches up into five separate groups: Part 1 defines a set of new kernel locking primitives in file kern/kern_lock.c to implement the recursive kernconfig_mutex. (I'm not really stuck on the name, so open to alternative suggestions!) All of the locking in sys/kern_module.c has been updated to use this instead of the internal module_lock mutex. Additionally, the module_autoload() routine now does its own locking, rather than requiring its caller to do it. And the description of locking in the module(9) man page is updated. Part 2 removes all of the explicit module_{,un}lock() calls from the various xxx_verbose modules that I'd previous worked on. Part 3 removes all of the explicit module_{,un}lock() calls from other bits and pieces of the kernel. In a couple of places, new calls to kernconfig_{,un}lock() are inserted. Part 4 updates rump to provide equivalent locking routines. (pooka, I would appreciate you looking at this.) Part 5 adds a new atf test-case to the existing module tests to verify that recursion actually works! Following up on all the various comments from jmcneil@, pooka@, and rmind@, I've attached a revised set of diffs. The most significant change between this and the previous revision is the separation of the kernconfig_lock_*() stuff into a separate kern_cfg.c source file. This allows inclusion of the kernel code directly into rump, rather than having to re-implement it in rump/klock.c - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | - ML.diff.tgz Description: Binary data
Re: Module and device configuration locking [was Re: Modules loading modules?]
On Fri, 6 Aug 2010, Mindaugas Rasiukevicius wrote: Hello, Paul Goyette wrote: Well, it was a slow day at the office today, so I found some time to work on this! :) Attached is the latest version of this change. For simplicity, I have broken the patches up into five separate groups: <...> Few comments on the patch: -#define__NetBSD_Version__ 599003800 /* NetBSD 5.99.38 */ +#define__NetBSD_Version__ 599003900 /* NetBSD 5.99.39 */ Why bumping? Seemed like the right thing to do, since the locking protocol is changing, as well as the global module_lock mutex disappearing. But I really can't hink of any ill effects, unless someone has a module that suddenly fails link-and-load because it references module_lock. It's not very likely. I'll defer, or I'll wait to ride someone else's bump just to be safe. + * of of kernel functionality, such as device configuration and loading One "of" too much. Fixed. +void +kernconfig_lock(void) +{ + lwp_t *l; + + /* +* OK to check this unlocked, since it could only be set to +* curlwp by the current thread itself, and not by an interrupt +* or any other LWP. +*/ Please add KASSERT(!cpu_intr_p()); Done. I was worried about this, and forgot to ask. Thanks for picking it up. + l = curlwp; + if (kernconfig_lwp == l) { + kernconfig_recurse++; + KASSERT(kernconfig_recurse != 0); Such (++ then != 0) assert is a bit useless. How about: KASSERT(kernconfig_recurse > 1); The intent here was to ensure that we didn't overflow the counter back to zero. It might have made sense when the counter was a uint8_t, but probably not with an int! It can probably be removed completely. + } else { + mutex_enter(&kernconfig_mutex); + kernconfig_lwp = l; + kernconfig_recurse = 1; + } +} Somebody will confuse l with 1. :) This came straight from ad@'s suggestions, and there seems to be a lot of precedent in the sys/kern/ code to use l. But I will select a better variable name. + if (--kernconfig_recurse == 0) { + kernconfig_lwp = 0; + mutex_exit(&kernconfig_mutex); kernconfig_lwp = NULL; Ooops! :) +bool +kernconfig_is_held(void) +{ + + return (mutex_owned(&kernconfig_mutex)); No need for brackets around function. Got it. + KASSERT(kernconfig_is_held); But missing brackets in few cases. Yeah, you're the second one to point that out. I fixed them once, not sure how they crept back in. But they've been fixed again. Otherwise seems OK to me. Thanks. Thank you for the detailed look. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Module and device configuration locking [was Re: Modules loading modules?]
Hello, Paul Goyette wrote: > > Well, it was a slow day at the office today, so I found some time to > work on this! :) > > Attached is the latest version of this change. For simplicity, I have > broken the patches up into five separate groups: > > <...> Few comments on the patch: > -#define __NetBSD_Version__ 599003800 /* NetBSD 5.99.38 */ > +#define __NetBSD_Version__ 599003900 /* NetBSD 5.99.39 */ Why bumping? > + * of of kernel functionality, such as device configuration and loading One "of" too much. > +void > +kernconfig_lock(void) > +{ > + lwp_t *l; > + > + /* > + * OK to check this unlocked, since it could only be set to > + * curlwp by the current thread itself, and not by an interrupt > + * or any other LWP. > + */ Please add KASSERT(!cpu_intr_p()); > + l = curlwp; > + if (kernconfig_lwp == l) { > + kernconfig_recurse++; > + KASSERT(kernconfig_recurse != 0); Such (++ then != 0) assert is a bit useless. How about: KASSERT(kernconfig_recurse > 1); > + } else { > + mutex_enter(&kernconfig_mutex); > + kernconfig_lwp = l; > + kernconfig_recurse = 1; > + } > +} Somebody will confuse l with 1. :) > + if (--kernconfig_recurse == 0) { > + kernconfig_lwp = 0; > + mutex_exit(&kernconfig_mutex); kernconfig_lwp = NULL; > +bool > +kernconfig_is_held(void) > +{ > + > + return (mutex_owned(&kernconfig_mutex)); No need for brackets around function. > + KASSERT(kernconfig_is_held); But missing brackets in few cases. Otherwise seems OK to me. Thanks. -- Mindaugas
Re: Module and device configuration locking [was Re: Modules loading modules?]
On Thu, 5 Aug 2010, Paul Goyette wrote: OK, got it. I'll have another set of patches in a few days. Well, it was a slow day at the office today, so I found some time to work on this! :) Attached is the latest version of this change. For simplicity, I have broken the patches up into five separate groups: Part 1 defines a set of new kernel locking primitives in file kern/kern_lock.c to implement the recursive kernconfig_mutex. (I'm not really stuck on the name, so open to alternative suggestions!) All of the locking in sys/kern_module.c has been updated to use this instead of the internal module_lock mutex. Additionally, the module_autoload() routine now does its own locking, rather than requiring its caller to do it. And the description of locking in the module(9) man page is updated. Part 2 removes all of the explicit module_{,un}lock() calls from the various xxx_verbose modules that I'd previous worked on. Part 3 removes all of the explicit module_{,un}lock() calls from other bits and pieces of the kernel. In a couple of places, new calls to kernconfig_{,un}lock() are inserted. Part 4 updates rump to provide equivalent locking routines. (pooka, I would appreciate you looking at this.) Part 5 adds a new atf test-case to the existing module tests to verify that recursion actually works! - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -Index: sys/sys/param.h === RCS file: /cvsroot/src/sys/sys/param.h,v retrieving revision 1.373 diff -u -p -r1.373 param.h --- sys/sys/param.h 28 Jul 2010 11:03:47 - 1.373 +++ sys/sys/param.h 6 Aug 2010 00:46:05 - @@ -63,7 +63,7 @@ * 2.99.9 (299000900) */ -#define__NetBSD_Version__ 599003800 /* NetBSD 5.99.38 */ +#define__NetBSD_Version__ 599003900 /* NetBSD 5.99.39 */ #define __NetBSD_Prereq__(M,m,p) (M) * 1) + \ (m) * 100) + (p) * 100) <= __NetBSD_Version__) Index: sys/sys/systm.h === RCS file: /cvsroot/src/sys/sys/systm.h,v retrieving revision 1.239 diff -u -p -r1.239 systm.h --- sys/sys/systm.h 31 Jan 2010 02:04:43 - 1.239 +++ sys/sys/systm.h 6 Aug 2010 00:46:06 - @@ -492,6 +492,11 @@ void kernel_lock_init(void); void _kernel_lock(int); void _kernel_unlock(int, int *); +void kernconfig_lock_init(void); +void kernconfig_lock(void); +void kernconfig_unlock(void); +bool kernconfig_is_held(void); + #if defined(MULTIPROCESSOR) || defined(_MODULE) #defineKERNEL_LOCK(count, lwp) \ do { \ Index: sys/sys/module.h === RCS file: /cvsroot/src/sys/sys/module.h,v retrieving revision 1.24 diff -u -p -r1.24 module.h --- sys/sys/module.h26 Jun 2010 07:23:57 - 1.24 +++ sys/sys/module.h6 Aug 2010 00:46:07 - @@ -115,7 +115,6 @@ __link_set_add_rodata(modules, name##_mo TAILQ_HEAD(modlist, module); extern struct vm_map *module_map; -extern kmutex_tmodule_lock; extern u_int module_count; extern u_int module_builtinlist; extern struct modlist module_list; Index: sys/kern/kern_lock.c === RCS file: /cvsroot/src/sys/kern/kern_lock.c,v retrieving revision 1.150 diff -u -p -r1.150 kern_lock.c --- sys/kern/kern_lock.c20 Dec 2009 20:42:23 - 1.150 +++ sys/kern/kern_lock.c6 Aug 2010 00:46:07 - @@ -39,6 +39,7 @@ __KERNEL_RCSID(0, "$NetBSD: kern_lock.c, #include #include #include +#include #include #include #include @@ -56,6 +57,10 @@ bool kernel_lock_dodebug; __cpu_simple_lock_t kernel_lock[CACHE_LINE_SIZE / sizeof(__cpu_simple_lock_t)] __aligned(CACHE_LINE_SIZE); +static kmutex_t kernconfig_mutex; +static lwp_t *kernconfig_lwp; +static int kernconfig_recurse; + void assert_sleepable(void) { @@ -306,3 +311,59 @@ _kernel_unlock(int nlocks, int *countp) if (countp != NULL) *countp = olocks; } + +/* + * Functions for manipulating the kernel configuration lock. This + * recursive lock should be used to protect all additions and removals + * of of kernel functionality, such as device configuration and loading + * of modular kernel components. + */ + +void +kernconfig_lock_init(void) +{ + + mutex_init(&kernconfig_mutex, MUTEX_DEFAULT, IPL_NONE); + kernconfig_lwp = NULL;
Re: Module and device configuration locking [was Re: Modules loading modules?]
On Thu, 5 Aug 2010, Paul Goyette wrote: Attached is the latest version of this change. For simplicity, I have broken the patches up into five separate groups: Part 1 defines a set of new kernel locking primitives in file kern/kern_lock.c to implement the recursive kernconfig_mutex. (I'm not really stuck on the name, so open to alternative suggestions!) All of the locking in sys/kern_module.c has been updated to use this instead of the internal module_lock mutex. Additionally, the module_autoload() routine now does its own locking, rather than requiring its caller to do it. And the description of locking in the module(9) man page is updated. Part 2 removes all of the explicit module_{,un}lock() calls from the various xxx_verbose modules that I'd previous worked on. Part 3 removes all of the explicit module_{,un}lock() calls from other bits and pieces of the kernel. In a couple of places, new calls to kernconfig_{,un}lock() are inserted. Part 4 updates rump to provide equivalent locking routines. (pooka, I would appreciate you looking at this.) Part 5 adds a new atf test-case to the existing module tests to verify that recursion actually works! I should add that I'm actually running with these changes on one of my local machines with no visible problems, and the above-added test case works. The machine has a "AMD Phenom(tm) 9600 Quad-Core Processor" at 2.3MHz. {103} modstat NAME CLASS SOURCE REFS SIZE REQUIRES bpf driver builtin0 -- compat misc builtin0 -- coredump misc filesys1 3402 - exec_elf64 misc filesys0 7960 coredump exec_script misc filesys0 1256 - ffs vfsboot 0 153316 - kernfs vfsfilesys0 12914- nfs vfsfilesys0 159471 - ptyfsvfsfilesys0 10118- secmodel_bsd44 secmodel builtin0 -suser,securelevel securelevel secmodel builtin1 -- susersecmodel builtin1 -- tmpfsvfsfilesys0 19513- wapblvfsbuiltin0 -- {104} - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Module and device configuration locking [was Re: Modules loading modules?]
OK, got it. I'll have another set of patches in a few days. On Thu, 5 Aug 2010, Andrew Doran wrote: On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote: On Wed, 4 Aug 2010, Andrew Doran wrote: Sorry for not replying sooner. Please don't add a generic recursive lock facility, it would be abused. Yeah! My current patches have no new generic facilities at all. I'd like something roughly like the following. I think this should also cover major configuration tasks such as device attach and detach, so it wouldn't really be module specific. The naming is a poor suggestion. void sysconfig_lock(void) { if (mutex_owned(&sysconfig_mutex)) { /* * At this point it's OK for the thread to hold other * locks, since the system configuration lock was the * first taken on the way in. */ sysconfig_recurse++; return; } /* * I don't remember the correct argument sequence to the * following, but basically we need to assert that NO locks, * sleep or spin, other than kernel_lock are held, * as the system configuration lock would be the outermost * lock in the system. */ LOCKDEBUG_BARRIER(...); mutex_enter(&sysconfig_mutex); KASSERT(sysconfig_recurse == 0); } For the boot autoconfig path where you have things being driven by the kernel, this would work OK. Situations where there's a kthread or something attaching and detaching devices on the fly are a bit different, since they likely have local atomicity requirements (need to take device private locks). There you'd probably need the caller to take the lock. USB comes to mind (along with some hardware RAID drivers that do hotplug attach/detach). Thoughts? Well, at first glance, this proposal makes sense. But it is certainly a much larger task than dealing with the immediate problem - modules which want to load other modules. So, I really have three questions: 1. In keeping with earlier concerns about holding locks for long periods of time (eeh and mrg both commented on this), the approach described For module lock the wait time isn't really a big concern. Module load and unload is a heavyweight operation so the wait time is expected. There is nothing intrinsically wrong with holding a mutex for "a long time" provided that you've got a good handle on the potential side effects. It's a different story for other locks in the system like say proc_lock, which can't be held for a long time without disrupting the user experience and/or deadlocking the system. above would seem to have an even longer hold-time than the current module_lock. Would not something similar to haad's condvar approach be appropriate for this proposal, as well as for the more-limited-in- scope recursive module lock? condvar gives you another lock in a roundabout way, without priority inheritance to help move things along if something has clogged the pipework. :-). So I don't see benefit over a mutex. 2. Would it be reasonable to solve the immediate problem as previously proposed, rather than waiting for this "ultimate" solution? I think it would be a long time before we could find and resolve all of the "issues" that might be created in the various "threaded" situations that may exist. I know I'm certainly not sufficiently qualified to identify all those "situations", and I also know that I don't have sufficient available work-hours to do this in any reasonable time- frame. I'd still like to move forward with the most recent solution to the module_lock problem. I'm not suggesting that you need to tackle autoconfiguration locking at the same time.. What I'm suggesting is to put the primitives in place (say in kern_lock.c or something) and then use these for the benefit of the modules code. It would provide a statment of direction and avoid re-hashing things later when somebody decides to tackle autoconf. Sorry for being unclear. 3. Since we're still technically abusing the mutex(9) man-page caveat about using mutex_owned() for making locking decisions, it would be very much appreciated (at least by myself) to have an explanation of WHY it is OK in this case to do it here, but not somewhere else. Ok well I think the simplest course of action would be to replace the mutex_owned() with a global variable. It would do much the same thing but for the price of 4 or 8 bytes there is no question of contradicting the mantra. So like: /* * Ok to check this unlocked, as for the value to equal curlwp it must * have been set by the current thread of execution (i.e. not interrupt * context or another LWP). */ l = curlwp; if (sysconfig_lwp == l) { /* recurse */ } else { mutex_enter(&sysconfig_lock); sysconfig_lwp = l; /* etc etc */ } --
Re: Module and device configuration locking [was Re: Modules loading modules?]
On Wed, Aug 04, 2010 at 10:28:56AM -0700, Paul Goyette wrote: > On Wed, 4 Aug 2010, Andrew Doran wrote: > > >Sorry for not replying sooner. > > > >Please don't add a generic recursive lock facility, it would be abused. > > Yeah! My current patches have no new generic facilities at all. > > >I'd like something roughly like the following. I think this should > >also cover major configuration tasks such as device attach and detach, > >so it wouldn't really be module specific. The naming is a poor suggestion. > > > >void > >sysconfig_lock(void) > >{ > > > > if (mutex_owned(&sysconfig_mutex)) { > > /* > > * At this point it's OK for the thread to hold other > > * locks, since the system configuration lock was the > > * first taken on the way in. > > */ > > sysconfig_recurse++; > > return; > > } > > /* > > * I don't remember the correct argument sequence to the > > * following, but basically we need to assert that NO locks, > > * sleep or spin, other than kernel_lock are held, > >* as the system configuration lock would be the outermost > > * lock in the system. > > */ > > LOCKDEBUG_BARRIER(...); > > mutex_enter(&sysconfig_mutex); > > KASSERT(sysconfig_recurse == 0); > >} > > > >For the boot autoconfig path where you have things being driven by the > >kernel, this would work OK. Situations where there's a kthread or > >something attaching and detaching devices on the fly are a bit different, > >since they likely have local atomicity requirements (need to take device > >private locks). There you'd probably need the caller to take the lock. > >USB comes to mind (along with some hardware RAID drivers that do hotplug > >attach/detach). > > > >Thoughts? > > Well, at first glance, this proposal makes sense. But it is > certainly a much larger task than dealing with the immediate problem > - modules which want to load other modules. > > So, I really have three questions: > > 1. In keeping with earlier concerns about holding locks for long periods >of time (eeh and mrg both commented on this), the approach described For module lock the wait time isn't really a big concern. Module load and unload is a heavyweight operation so the wait time is expected. There is nothing intrinsically wrong with holding a mutex for "a long time" provided that you've got a good handle on the potential side effects. It's a different story for other locks in the system like say proc_lock, which can't be held for a long time without disrupting the user experience and/or deadlocking the system. >above would seem to have an even longer hold-time than the current >module_lock. Would not something similar to haad's condvar approach >be appropriate for this proposal, as well as for the more-limited-in- >scope recursive module lock? condvar gives you another lock in a roundabout way, without priority inheritance to help move things along if something has clogged the pipework. :-). So I don't see benefit over a mutex. > 2. Would it be reasonable to solve the immediate problem as previously >proposed, rather than waiting for this "ultimate" solution? I think >it would be a long time before we could find and resolve all of the >"issues" that might be created in the various "threaded" situations >that may exist. I know I'm certainly not sufficiently qualified to >identify all those "situations", and I also know that I don't have >sufficient available work-hours to do this in any reasonable time- >frame. > >I'd still like to move forward with the most recent solution to the >module_lock problem. I'm not suggesting that you need to tackle autoconfiguration locking at the same time.. What I'm suggesting is to put the primitives in place (say in kern_lock.c or something) and then use these for the benefit of the modules code. It would provide a statment of direction and avoid re-hashing things later when somebody decides to tackle autoconf. Sorry for being unclear. > 3. Since we're still technically abusing the mutex(9) man-page caveat >about using mutex_owned() for making locking decisions, it would >be very much appreciated (at least by myself) to have an explanation >of WHY it is OK in this case to do it here, but not somewhere else. Ok well I think the simplest course of action would be to replace the mutex_owned() with a global variable. It would do much the same thing but for the price of 4 or 8 bytes there is no question of contradicting the mantra. So like: /* * Ok to check this unlocked, as for the value to equal curlwp it must * have been set by the current thread of execution (i.e. not interrupt * context or another LWP). */ l = curlwp; if (sysconfig_lwp == l) { /* recurse */ } else { mutex_enter(&sysconfig_lock); sysconfig_lwp = l; /* etc etc */ }
Re: Module and device configuration locking [was Re: Modules loading modules?]
On Wed, 4 Aug 2010, Andrew Doran wrote: Sorry for not replying sooner. Please don't add a generic recursive lock facility, it would be abused. Yeah! My current patches have no new generic facilities at all. I'd like something roughly like the following. I think this should also cover major configuration tasks such as device attach and detach, so it wouldn't really be module specific. The naming is a poor suggestion. void sysconfig_lock(void) { if (mutex_owned(&sysconfig_mutex)) { /* * At this point it's OK for the thread to hold other * locks, since the system configuration lock was the * first taken on the way in. */ sysconfig_recurse++; return; } /* * I don't remember the correct argument sequence to the * following, but basically we need to assert that NO locks, * sleep or spin, other than kernel_lock are held, * as the system configuration lock would be the outermost * lock in the system. */ LOCKDEBUG_BARRIER(...); mutex_enter(&sysconfig_mutex); KASSERT(sysconfig_recurse == 0); } For the boot autoconfig path where you have things being driven by the kernel, this would work OK. Situations where there's a kthread or something attaching and detaching devices on the fly are a bit different, since they likely have local atomicity requirements (need to take device private locks). There you'd probably need the caller to take the lock. USB comes to mind (along with some hardware RAID drivers that do hotplug attach/detach). Thoughts? Well, at first glance, this proposal makes sense. But it is certainly a much larger task than dealing with the immediate problem - modules which want to load other modules. So, I really have three questions: 1. In keeping with earlier concerns about holding locks for long periods of time (eeh and mrg both commented on this), the approach described above would seem to have an even longer hold-time than the current module_lock. Would not something similar to haad's condvar approach be appropriate for this proposal, as well as for the more-limited-in- scope recursive module lock? 2. Would it be reasonable to solve the immediate problem as previously proposed, rather than waiting for this "ultimate" solution? I think it would be a long time before we could find and resolve all of the "issues" that might be created in the various "threaded" situations that may exist. I know I'm certainly not sufficiently qualified to identify all those "situations", and I also know that I don't have sufficient available work-hours to do this in any reasonable time- frame. I'd still like to move forward with the most recent solution to the module_lock problem. 3. Since we're still technically abusing the mutex(9) man-page caveat about using mutex_owned() for making locking decisions, it would be very much appreciated (at least by myself) to have an explanation of WHY it is OK in this case to do it here, but not somewhere else. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Module and device configuration locking [was Re: Modules loading modules?]
Sorry for not replying sooner. Please don't add a generic recursive lock facility, it would be abused. I'd like something roughly like the following. I think this should also cover major configuration tasks such as device attach and detach, so it wouldn't really be module specific. The naming is a poor suggestion. void sysconfig_lock(void) { if (mutex_owned(&sysconfig_mutex)) { /* * At this point it's OK for the thread to hold other * locks, since the system configuration lock was the * first taken on the way in. */ sysconfig_recurse++; return; } /* * I don't remember the correct argument sequence to the * following, but basically we need to assert that NO locks, * sleep or spin, other than kernel_lock are held, * as the system configuration lock would be the outermost * lock in the system. */ LOCKDEBUG_BARRIER(...); mutex_enter(&sysconfig_mutex); KASSERT(sysconfig_recurse == 0); } For the boot autoconfig path where you have things being driven by the kernel, this would work OK. Situations where there's a kthread or something attaching and detaching devices on the fly are a bit different, since they likely have local atomicity requirements (need to take device private locks). There you'd probably need the caller to take the lock. USB comes to mind (along with some hardware RAID drivers that do hotplug attach/detach). Thoughts?
Re: Modules loading modules?
On Tue, 3 Aug 2010, haad wrote: Well, still no hurry, but it would be nice if some additional eyes were pointed this way. I've got the recursive-module-load test case added to the existing tests/modules/ stuff (and it works)! BTW, since this is changing the kernel ABI, I guess it will need a version bump. Do we have any other bumps coming soon that we can "share"? I talked with Andrew today and he said that basicaly we can use mutex_owned for this particular case. That should make your code much simpler :) sorry for any additional work :) Well, we've come full circle! :) It seems to me that, even if it is "OK to use mutex_owned() for this particular case", perhaps we should not. Most everyone seemed to agree that your condvar solution was "technically correct", and if we did use the mutex_owned() solution it would only serve as a temptation for someone else in the future! We now have two solutions, both of which actually work (tested on my home machine). One solution (mutex_owned) might be slightly simpler and slightly more efficient than the other (condvar), but this section of code is certainly not performance-critical. Therefore, is there any reason for preferring the mutex_owned solution? I'm happy to go either way based on concensus here, but I'm leaning towards the condvar solution. It doesn't require any "exemptions" or long explanations of why it is "OK to use mutex_owned" in ways that would appear to violate our own documentation. And it also addresses eeh's (and mrg's) concerns of keeping the lock across potentially long-duration operations. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Tue, Aug 3, 2010 at 3:03 PM, Paul Goyette wrote: > On Mon, 2 Aug 2010, Paul Goyette wrote: > >> On Tue, 3 Aug 2010, matthew green wrote: >> >>> i think this looks good enough. wait for some others eyes, though. >> >> No hurry. I'm going to try to add a new atf test case for the recursive >> load case, and it will take some time for me to come up to speed on the atf >> stuff. :) > > Well, still no hurry, but it would be nice if some additional eyes were > pointed this way. I've got the recursive-module-load test case added to the > existing tests/modules/ stuff (and it works)! > > BTW, since this is changing the kernel ABI, I guess it will need a version > bump. Do we have any other bumps coming soon that we can "share"? I talked with Andrew today and he said that basicaly we can use mutex_owned for this particular case. That should make your code much simpler :) sorry for any additional work :) -- Regards. Adam
re: Modules loading modules?
On Mon, 2 Aug 2010, Paul Goyette wrote: On Tue, 3 Aug 2010, matthew green wrote: i think this looks good enough. wait for some others eyes, though. No hurry. I'm going to try to add a new atf test case for the recursive load case, and it will take some time for me to come up to speed on the atf stuff. :) Well, still no hurry, but it would be nice if some additional eyes were pointed this way. I've got the recursive-module-load test case added to the existing tests/modules/ stuff (and it works)! BTW, since this is changing the kernel ABI, I guess it will need a version bump. Do we have any other bumps coming soon that we can "share"? - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
re: Modules loading modules?
On Tue, 3 Aug 2010, matthew green wrote: i think this looks good enough. wait for some others eyes, though. No hurry. I'm going to try to add a new atf test case for the recursive load case, and it will take some time for me to come up to speed on the atf stuff. :) please commit the module_autoload() changes separately. Will do. I will commit the changes to sys/kern/kern_module.c, sys/sys/module.h, and share/man/man9/module.9 separately from the other changes. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
re: Modules loading modules?
> On Tue Aug 03 2010 at 02:17:43 +1000, matthew green wrote: > > > Now, on to sensible stuff. I'm quite certain that warning was written > > > to make people avoid writing bad code without understanding locking -- > > > if you need to used mutex_owned() to decide if you should lock a normal > > > mutex, your code is broken. The other less likely possibility is that > > > someone plans to change mutex_owned in the future. > > > > > > Further data point: the same warning is in rw_held, yet it was used to > > > implement recursive locking in vlockmgr until its very recent demise. > > > > > > Ignoring man page mantras and focusing on how the code works, I do not > > > see anything wrong with Paul's use of mutex_owned(). > > > > this just does not match my actual experience in the kernel. i had > > weird pmap-style problems and asserts firing wrongly while i did not > > obey the rules in the manual directly. > > Not knowing more details it's difficult to comment. But since you are > talking about the pmap, maybe your experiences are with spin mutexes > instead of adaptive ones? that could be it. i certainly had different issues trying to use a sleep-able lock in this code. i find it confusing that mutex_owned() returns true if the mutex is locked (regardless of who has it) for spin, where as adaptive only returns true if this LWP has it locked. .mrg.
re: Modules loading modules?
i think this looks good enough. wait for some others eyes, though. please commit the module_autoload() changes separately. .mrg.
Re: Modules loading modules?
On Tue Aug 03 2010 at 02:17:43 +1000, matthew green wrote: > > Now, on to sensible stuff. I'm quite certain that warning was written > > to make people avoid writing bad code without understanding locking -- > > if you need to used mutex_owned() to decide if you should lock a normal > > mutex, your code is broken. The other less likely possibility is that > > someone plans to change mutex_owned in the future. > > > > Further data point: the same warning is in rw_held, yet it was used to > > implement recursive locking in vlockmgr until its very recent demise. > > > > Ignoring man page mantras and focusing on how the code works, I do not > > see anything wrong with Paul's use of mutex_owned(). > > this just does not match my actual experience in the kernel. i had > weird pmap-style problems and asserts firing wrongly while i did not > obey the rules in the manual directly. Not knowing more details it's difficult to comment. But since you are talking about the pmap, maybe your experiences are with spin mutexes instead of adaptive ones?
re: Modules loading modules?
The most recent verion of this patch set was recently posted, and it uses the cv_wait() mechanism suggested by Adam Hamsik. This discussion is still very intresting, but perhaps we could start a new thread (or at least change the subject line)? :) On Tue, 3 Aug 2010, matthew green wrote: On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. That's the mantra, yes. ie, you can not even KASSERT(!mutex_owned()). Strictly speaking you can in a case where you have two locks always are taken as l1,l2 and released as l2,l1 provided you're not dealing with a spin mutex. Does it make any sense? no (l2 is useless). Is it possible? yes. what is not possible is to safely to what the manual says. you will not have the right thing happen. i've seen this when i've been working on the sparc64 pmap. Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). this just does not match my actual experience in the kernel. i had weird pmap-style problems and asserts firing wrongly while i did not obey the rules in the manual directly. .mrg. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
re: Modules loading modules?
> On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: > > this is an incomplete reading of the manual page, and you can not > > use mutex_owned() the way you are trying to (regardless of what > > pooka posted.) you can't even using it in various forms of assertions > > safely. from the man page: > > > > It should not be used to make locking decisions at run time, or to > > verify that a lock is not held. > > That's the mantra, yes. > > > ie, you can not even KASSERT(!mutex_owned()). > > Strictly speaking you can in a case where you have two locks always > are taken as l1,l2 and released as l2,l1 provided you're not dealing > with a spin mutex. Does it make any sense? no (l2 is useless). > Is it possible? yes. what is not possible is to safely to what the manual says. you will not have the right thing happen. i've seen this when i've been working on the sparc64 pmap. > Now, on to sensible stuff. I'm quite certain that warning was written > to make people avoid writing bad code without understanding locking -- > if you need to used mutex_owned() to decide if you should lock a normal > mutex, your code is broken. The other less likely possibility is that > someone plans to change mutex_owned in the future. > > Further data point: the same warning is in rw_held, yet it was used to > implement recursive locking in vlockmgr until its very recent demise. > > Ignoring man page mantras and focusing on how the code works, I do not > see anything wrong with Paul's use of mutex_owned(). this just does not match my actual experience in the kernel. i had weird pmap-style problems and asserts firing wrongly while i did not obey the rules in the manual directly. .mrg.
Re: Modules loading modules?
On Mon, 2 Aug 2010, Paul Goyette wrote: Yes, the cv mechanism, coupled with changing the semantics of module_autoload() (to not require the caller to obtain the lock) makes good sense here. I'm working on it and I should be able to post new diffs tomorrow. OK, it's time for Round #3! Attached are diffs that implement Adam's suggested cv-based solution, along with moving the locking for module_autolock() into the routine itself (for consistency with other routines). I've also updated the module(4) man page this time around. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | - MODULE_LOCKING.diff.gz Description: Binary data
Re: Locking for module_autoload() (was: Modules loading modules?)
On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote: > On Sun, 1 Aug 2010, John Nemeth wrote: > >>I'm thinking the acquisition of module_lock should be pushed into >> module_autoload() instead of having the caller acquire it for this very >> reason. It makes it hard to change the way locking works in the >> MODULAR code if you expect the caller to acquire the lock. I don't >> know why it was done this way originally, or what the consequences (if >> any) would be for making the change. Andrew, any thoughts on this? > > Attached are diffs for moving the locking out of each caller and into > module_autoload() itself. Most of these changes are fairly trivial, but a > couple of them should be looked at more closely in case I've missed any > subtleties in the original code: > > kern/kern_syscall.c > net/if_ppp.c I think that this patch has same problem as original code you can call module_autoload twice in same lwp. (you are holding module_lock while calling module_do_load). Regards Adam.
Re: Modules loading modules?
On Mon, 2 Aug 2010, Antti Kantee wrote: On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. That's the mantra, yes. ie, you can not even KASSERT(!mutex_owned()). Strictly speaking you can in a case where you have two locks always are taken as l1,l2 and released as l2,l1 provided you're not dealing with a spin mutex. Does it make any sense? no (l2 is useless). Is it possible? yes. Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). but i'm still not sure why we're going to such lengths to hold a lock across such a heavy weight operation like loading a module. that may involve disk speeds, and so you're looking at waiting millions of cycles for the lock. aka, what eeh posted. It's held for millions of cycles already now and nobody has pointed out measurable problems. But if it is deemed necessary, you can certainly hide a cv underneath. The efficiency requirements for the module lock are probably anyway in the "who cares wins ... not" spectrum. At least I'm not aware of any fastpath wanting to use it. Anyway, no real opinion there. A cv most likely is the safe, no-brainer choice. Yes, the cv mechanism, coupled with changing the semantics of module_autoload() (to not require the caller to obtain the lock) makes good sense here. I'm working on it and I should be able to post new diffs tomorrow. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: > this is an incomplete reading of the manual page, and you can not > use mutex_owned() the way you are trying to (regardless of what > pooka posted.) you can't even using it in various forms of assertions > safely. from the man page: > > It should not be used to make locking decisions at run time, or to > verify that a lock is not held. That's the mantra, yes. > ie, you can not even KASSERT(!mutex_owned()). Strictly speaking you can in a case where you have two locks always are taken as l1,l2 and released as l2,l1 provided you're not dealing with a spin mutex. Does it make any sense? no (l2 is useless). Is it possible? yes. Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). > but i'm still not sure why we're going to such lengths to hold a > lock across such a heavy weight operation like loading a module. > that may involve disk speeds, and so you're looking at waiting > millions of cycles for the lock. aka, what eeh posted. It's held for millions of cycles already now and nobody has pointed out measurable problems. But if it is deemed necessary, you can certainly hide a cv underneath. The efficiency requirements for the module lock are probably anyway in the "who cares wins ... not" spectrum. At least I'm not aware of any fastpath wanting to use it. Anyway, no real opinion there. A cv most likely is the safe, no-brainer choice.
Re: Modules loading modules?
At last, a response that doesn't just say "you can't do that" but instead provides a crystal-clear alternative. :) I like this approach since it avoids the need to implement the new "recursive mutex" type (which leads to the tempting-but-forbidden mis-use of mutex_owned()). It also would appear to alleviate eeh's and mrg's concerns for holding the mutex across long, blocking operations. I still think it is preferable to move all locking associated with module_autoload() into the routine itself as described in the other thread (with subject line "Locking for module_autoload()"). Then it will be nearly trivial to replace the current locking with your suggested replacement. On Mon, 2 Aug 2010, Adam Hamsik wrote: On Aug,Sunday 1 2010, at 3:53 PM, Antti Kantee wrote: On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote: One question: Since an adaptive kmutex_t already includes an owner field, would we really need to have another copy of it in the rmutex_t structure? Good point. I think it's ok to do: if (mutex_owned(mtx)) cnt++ else lock When I did locking for dm driver, it was said by rmind@ that mutex_owned should not be used for any locking decisions and should be used only for diagnostic purposes. What I would in this case is add two new routines to module_framework -> module_lock() module_unlock() These two will then do something like this. kmutex_t module_lock; uint32_t refcnt; lwp_t mod_lwp; condvar_t cv; module_lock() { mutex_enter(module_lock); start if (refcnt == 0) { refcnt++; mod_lwp = curlwp; else { if (mod_lwp == curlwp) refcnt++; else { cv_wait(cv); goto start } } mutex_exit(module_lock); } mudule_unlock() { mutex_enter(module_lock); KASSERT(mod_lwp != curlwp); /* module_unlock should be only called from lwp which called module_lock */ refcnt--; if (refcnt == 0) { mod_lwp = 0; cv_broadcast(cv); } mutex_exit(module_lock); } This way module_lock can be called multiple times by one lwp and will sleep for everyone else. Regards Adam. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Locking for module_autoload() (was: Modules loading modules?)
On Mon, 2 Aug 2010, Adam Hamsik wrote: On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote: On Sun, 1 Aug 2010, John Nemeth wrote: I'm thinking the acquisition of module_lock should be pushed into module_autoload() instead of having the caller acquire it for this very reason. It makes it hard to change the way locking works in the MODULAR code if you expect the caller to acquire the lock. I don't know why it was done this way originally, or what the consequences (if any) would be for making the change. Andrew, any thoughts on this? Attached are diffs for moving the locking out of each caller and into module_autoload() itself. Most of these changes are fairly trivial, but a couple of them should be looked at more closely in case I've missed any subtleties in the original code: kern/kern_syscall.c net/if_ppp.c I think that this patch has same problem as original code you can call module_autoload twice in same lwp. (you are holding module_lock while calling module_do_load). This patch wasn;t intended to solve the nested-call problem. This patch only purpose was to modify the existing semantics of the module_autoload routine to match that of the other module routines. In particular, module_autoload is the only routine that requires its caller to acquire the mutex first; all the other routines do all the required locking internally to each routine. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Sun, 1 Aug 2010, Eduardo Horvath wrote: Why is there a need to hold a mutex across module loading instead of, say, having a reference count tht's protected by a mutex? I thought holding a mutex across a potentially blocking operation such as a module load is considered bad practice. Please note that I am not introducing the holding of the lock over the potentially blocking operation. This is already done. Several others have already indicated that the atomicity of the module load (including the loading of any required modules) is a desirable property. I'm simply looking for a mechanism that allows us to keep this property while allowing modules to load other modules. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Aug,Sunday 1 2010, at 3:53 PM, Antti Kantee wrote: > On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote: >> One question: Since an adaptive kmutex_t already includes an owner >> field, would we really need to have another copy of it in the rmutex_t >> structure? > > Good point. I think it's ok to do: > > if (mutex_owned(mtx)) >cnt++ > else >lock When I did locking for dm driver, it was said by rmind@ that mutex_owned should not be used for any locking decisions and should be used only for diagnostic purposes. What I would in this case is add two new routines to module_framework -> module_lock() module_unlock() These two will then do something like this. kmutex_t module_lock; uint32_t refcnt; lwp_t mod_lwp; condvar_t cv; module_lock() { mutex_enter(module_lock); start if (refcnt == 0) { refcnt++; mod_lwp = curlwp; else { if (mod_lwp == curlwp) refcnt++; else { cv_wait(cv); goto start } } mutex_exit(module_lock); } mudule_unlock() { mutex_enter(module_lock); KASSERT(mod_lwp != curlwp); /* module_unlock should be only called from lwp which called module_lock */ refcnt--; if (refcnt == 0) { mod_lwp = 0; cv_broadcast(cv); } mutex_exit(module_lock); } This way module_lock can be called multiple times by one lwp and will sleep for everyone else. Regards Adam.
re: Modules loading modules?
> According to the mutex(9) man page: > > mutex_owned(mtx) > > For adaptive mutexes, return non-zero if the current LWP holds > the mutex. ... this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. ie, you can not even KASSERT(!mutex_owned()). the documentation matches my experience with using mutex(9) and KASSERT(). but i'm still not sure why we're going to such lengths to hold a lock across such a heavy weight operation like loading a module. that may involve disk speeds, and so you're looking at waiting millions of cycles for the lock. aka, what eeh posted. .mrg
Re: Modules loading modules?
On Mon, 2 Aug 2010, Masao Uebayashi wrote: mutex_owned(mtx) ... mutex_owned() is provided for making diagnostic checks to verify that a lock is held. For example: KASSERT(mutex_owned(&driver_lock)); It should not be used to make locking decisions at run time, or to verify that a lock is not held. Thus you can't use it for your decision. Yes, I did read the mutex(9) man page. However, your assertion is at odds with pooka's suggestion, from http://mail-index.netbsd.org/tech-kern/2010/08/01/msg008632.html Good point. I think it's ok to do: if (mutex_owned(mtx)) cnt++ else lock - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Locking for module_autoload() (was: Modules loading modules?)
On Sun, 1 Aug 2010, John Nemeth wrote: I'm thinking the acquisition of module_lock should be pushed into module_autoload() instead of having the caller acquire it for this very reason. It makes it hard to change the way locking works in the MODULAR code if you expect the caller to acquire the lock. I don't know why it was done this way originally, or what the consequences (if any) would be for making the change. Andrew, any thoughts on this? Attached are diffs for moving the locking out of each caller and into module_autoload() itself. Most of these changes are fairly trivial, but a couple of them should be looked at more closely in case I've missed any subtleties in the original code: kern/kern_syscall.c net/if_ppp.c - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -Index: dev/acpi/acpi.c === RCS file: /cvsroot/src/sys/dev/acpi/acpi.c,v retrieving revision 1.208 diff -u -p -r1.208 acpi.c --- dev/acpi/acpi.c 25 Jul 2010 12:54:46 - 1.208 +++ dev/acpi/acpi.c 2 Aug 2010 02:25:26 - @@ -226,11 +226,8 @@ acpi_null(void) void acpi_load_verbose(void) { - if (acpi_verbose_loaded == 0) { - mutex_enter(&module_lock); + if (acpi_verbose_loaded == 0) module_autoload("acpiverbose", MODULE_CLASS_MISC); - mutex_exit(&module_lock); - } } void Index: dev/dm/dm_target.c === RCS file: /cvsroot/src/sys/dev/dm/dm_target.c,v retrieving revision 1.13 diff -u -p -r1.13 dm_target.c --- dev/dm/dm_target.c 18 May 2010 15:10:41 - 1.13 +++ dev/dm/dm_target.c 2 Aug 2010 02:25:26 - @@ -83,9 +83,7 @@ dm_target_autoload(const char *dm_target gen = module_gen; /* Try to autoload target module */ - mutex_enter(&module_lock); (void) module_autoload(name, MODULE_CLASS_MISC); - mutex_exit(&module_lock); } while (gen != module_gen); mutex_enter(&dm_target_mutex); Index: dev/mii/mii_physubr.c === RCS file: /cvsroot/src/sys/dev/mii/mii_physubr.c,v retrieving revision 1.71 diff -u -p -r1.71 mii_physubr.c --- dev/mii/mii_physubr.c 25 Jul 2010 14:44:34 - 1.71 +++ dev/mii/mii_physubr.c 2 Aug 2010 02:25:27 - @@ -71,11 +71,8 @@ const char *mii_get_descr_stub(int oui, */ void mii_load_verbose(void) { - if (mii_verbose_loaded == 0) { - mutex_enter(&module_lock); + if (mii_verbose_loaded == 0) module_autoload("miiverbose", MODULE_CLASS_MISC); - mutex_exit(&module_lock); - } } static void mii_phy_statusmsg(struct mii_softc *); Index: dev/pci/pci_subr.c === RCS file: /cvsroot/src/sys/dev/pci/pci_subr.c,v retrieving revision 1.84 diff -u -p -r1.84 pci_subr.c --- dev/pci/pci_subr.c 25 Jul 2010 14:14:25 - 1.84 +++ dev/pci/pci_subr.c 2 Aug 2010 02:25:27 - @@ -323,11 +323,8 @@ int pciverbose_loaded = 0; */ void pci_load_verbose(void) { - if (pciverbose_loaded == 0) { - mutex_enter(&module_lock); + if (pciverbose_loaded == 0) module_autoload("pciverbose", MODULE_CLASS_MISC); - mutex_exit(&module_lock); - } } const char *pci_findvendor_stub(pcireg_t id_reg) Index: dev/scsipi/scsipiconf.c === RCS file: /cvsroot/src/sys/dev/scsipi/scsipiconf.c,v retrieving revision 1.39 diff -u -p -r1.39 scsipiconf.c --- dev/scsipi/scsipiconf.c 25 Jul 2010 13:49:58 - 1.39 +++ dev/scsipi/scsipiconf.c 2 Aug 2010 02:25:27 - @@ -107,11 +107,8 @@ scsipi_command(struct scsipi_periph *per void scsipi_load_verbose(void) { - if (scsi_verbose_loaded == 0) { - mutex_enter(&module_lock); + if (scsi_verbose_loaded == 0) module_autoload("scsiverbose", MODULE_CLASS_MISC); - mutex_exit(&module_lock); - } } /* Index: dev/usb/usb_subr.c === RCS file: /cvsroot/src/sys/dev/usb/usb_subr.c,v retrieving revision 1.174 diff -u -p -r1.174 usb_subr.c --- dev/usb/usb_subr.c 27 Jul 2010 16:15:30 - 1.174 +++ dev/usb/usb_subr.c 2 Aug 2010 02:25:27 - @@ -122,11 +122,8 @@ int usb_verbose_loaded = 0; */ void usb_load_verbose(void) { -
Re: Modules loading modules?
On Nov 17, 5:24am, Paul Goyette wrote: } On Sun, 1 Aug 2010, Antti Kantee wrote: } } > I'm not sure if it's a good idea to change the size of kmutex_t. I } > guess plenty of data structures have carefully been adjusted by hand } > to its size and I don't know of any automatic way to recalculate that } > stuff. } > } > Even if not, since this is the only user and we probably won't have } > that many of them even in the future, why not just define a new type } > ``rmutex'' which contains a kmutex, an owner and the counter? It } > feels wrong to punish all the normal kmutex users for just one use. } > It'll also make the implementation a lot simpler to test, since it's } > purely MI. } > } > "separate normal case and worst case" } } Round two! Taking pooka's suggestion, this version is built on top of } (rather than beside) the existing non-recursive mutex. As such, it does } not affect any MD code. } } Attached is a set of diffs that } } 1. Adds sys/sys/rmutex.h and sys/kern/kern_rmutex.c to implement } recursive adaptive mutexes. (Conspicuously missing is an rmutex(9) } man page... It will happen before this gets committed.) } } 2. Converts the existing module_lock from a normal kmutex_t to an } rmutex_t } } 3. Updates all of the (surprisingly many) places where module_lock } is acquired. I'm thinking the acquisition of module_lock should be pushed into module_autoload() instead of having the caller acquire it for this very reason. It makes it hard to change the way locking works in the MODULAR code if you expect the caller to acquire the lock. I don't know why it was done this way originally, or what the consequences (if any) would be for making the change. Andrew, any thoughts on this? } Compile-tested on port-amd64 (including rumptest). Since there are no } MD-changes in this version, there "shouldn't be" any issues with } building on other ports. } } As previously noted, there is only one known use case for this so far: } modules loading other modules from within their xxx_modcmd() routine. } The specific use case we have involves loading the acpicpu driver/module } which eventually results in an attempt to load acpiverbose. } } It would be really nice if the community could } } A. Compile-test on additional architectures } B. Test to see that existing mutex operations still work correctly } C. Exercise the known use case if possible } D. Identify additional use cases } }-- End of excerpt from Paul Goyette
Re: Modules loading modules?
On Sun, 1 Aug 2010, John Nemeth wrote: I'm thinking the acquisition of module_lock should be pushed into module_autoload() instead of having the caller acquire it for this very reason. It makes it hard to change the way locking works in the MODULAR code if you expect the caller to acquire the lock. I don't know why it was done this way originally, or what the consequences (if any) would be for making the change. Andrew, any thoughts on this? I was thinking the same thing. There is one case in kern/kern_exec.c where the lock was held while the code loops through a list of possible modules. Looking closer, that code actually releases the lock and yield()s before re-acquiring it within the loop, so it probably could be modified without any adverse effects. This would actually simplify the interface, since ALL of the routines would be doing their own locking, and noone in the rest of the kernel would need to know about the lock. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Sun, Aug 01, 2010 at 06:05:20PM -0700, Paul Goyette wrote: > On Mon, 2 Aug 2010, Quentin Garnier wrote: > > >>+int > >>+rmutex_tryenter(rmutex_t *rmtx) > >>+{ > >>+ int rv = 1; > >>+ > >>+ if (mutex_owned(&rmtx->rmtx_mtx)) { > >>+ rmtx->rmtx_recurse++; > >>+ KASSERT(rmtx->rmtx_recurse != 0); > >>+ } else if ((rv = mutex_tryenter(&rmtx->rmtx_mtx)) != 0) { > >>+ rmtx->rmtx_recurse++; > >>+ KASSERT(rmtx->rmtx_recurse != 0); > >>+ } > >>+ return rv; > >>+} > > > >I am probably not getting the idea, but I fail to see how this qualifies > >as a mutex. My reading of this piece of code is that rmutex_enter() is > >always going to succeed immediately. > > According to the mutex(9) man page: > > mutex_owned(mtx) > >For adaptive mutexes, return non-zero if the current LWP holds >the mutex. ... Ah, that was what I was missing. Thanks. -- Quentin Garnier - c...@cubidou.net - c...@netbsd.org "See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007. pgp0WclWeuAf7.pgp Description: PGP signature
Re: Modules loading modules?
On Mon, 2 Aug 2010, Quentin Garnier wrote: +int +rmutex_tryenter(rmutex_t *rmtx) +{ + int rv = 1; + + if (mutex_owned(&rmtx->rmtx_mtx)) { + rmtx->rmtx_recurse++; + KASSERT(rmtx->rmtx_recurse != 0); + } else if ((rv = mutex_tryenter(&rmtx->rmtx_mtx)) != 0) { + rmtx->rmtx_recurse++; + KASSERT(rmtx->rmtx_recurse != 0); + } + return rv; +} I am probably not getting the idea, but I fail to see how this qualifies as a mutex. My reading of this piece of code is that rmutex_enter() is always going to succeed immediately. According to the mutex(9) man page: mutex_owned(mtx) For adaptive mutexes, return non-zero if the current LWP holds the mutex. ... The rmutex_init() code ensures that we have an adaptive mutex. So what this says is If _we_ already own the mutex, increment counter (rv is still initialized to 1) Else if we can acquire the mutex (updating rv) increment the counter Return Please correct me if I'm wrong--and I am very probably wrong--, but isn't the point of a recursive mutex to allow the thread initially taking the mutex to take it again, but not other threads? The code would have to be a lot more complicated than that. All I see here is a fancy no-op, but maybe it's just me. It would be a no-op for a recursive spin mutex. That's why I didn't allow them! :) - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Sun, Aug 01, 2010 at 05:17:01PM -0700, Paul Goyette wrote: [...] > +void > +rmutex_enter(rmutex_t *rmtx) > +{ > + if (mutex_owned(&rmtx->rmtx_mtx)) { > + rmtx->rmtx_recurse++; > + KASSERT(rmtx->rmtx_recurse != 0); > + } else { > + mutex_enter(&rmtx->rmtx_mtx); > + rmtx->rmtx_recurse = 1; > + } > +} > + > +void > +rmutex_exit(rmutex_t *rmtx) > +{ > + KASSERT(mutex_owned(&rmtx->rmtx_mtx)); > + KASSERT(rmtx->rmtx_recurse != 0); > + if (--rmtx->rmtx_recurse == 0) > + mutex_exit(&rmtx->rmtx_mtx); > +} > + > +int > +rmutex_tryenter(rmutex_t *rmtx) > +{ > + int rv = 1; > + > + if (mutex_owned(&rmtx->rmtx_mtx)) { > + rmtx->rmtx_recurse++; > + KASSERT(rmtx->rmtx_recurse != 0); > + } else if ((rv = mutex_tryenter(&rmtx->rmtx_mtx)) != 0) { > + rmtx->rmtx_recurse++; > + KASSERT(rmtx->rmtx_recurse != 0); > + } > + return rv; > +} I am probably not getting the idea, but I fail to see how this qualifies as a mutex. My reading of this piece of code is that rmutex_enter() is always going to succeed immediately. Please correct me if I'm wrong--and I am very probably wrong--, but isn't the point of a recursive mutex to allow the thread initially taking the mutex to take it again, but not other threads? The code would have to be a lot more complicated than that. All I see here is a fancy no-op, but maybe it's just me. -- Quentin Garnier - c...@cubidou.net - c...@netbsd.org "See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007. pgpgTt12zdsHJ.pgp Description: PGP signature
Re: Modules loading modules?
On Sun, 1 Aug 2010, Antti Kantee wrote: I'm not sure if it's a good idea to change the size of kmutex_t. I guess plenty of data structures have carefully been adjusted by hand to its size and I don't know of any automatic way to recalculate that stuff. Even if not, since this is the only user and we probably won't have that many of them even in the future, why not just define a new type ``rmutex'' which contains a kmutex, an owner and the counter? It feels wrong to punish all the normal kmutex users for just one use. It'll also make the implementation a lot simpler to test, since it's purely MI. "separate normal case and worst case" Round two! Taking pooka's suggestion, this version is built on top of (rather than beside) the existing non-recursive mutex. As such, it does not affect any MD code. Attached is a set of diffs that 1. Adds sys/sys/rmutex.h and sys/kern/kern_rmutex.c to implement recursive adaptive mutexes. (Conspicuously missing is an rmutex(9) man page... It will happen before this gets committed.) 2. Converts the existing module_lock from a normal kmutex_t to an rmutex_t 3. Updates all of the (surprisingly many) places where module_lock is acquired. Compile-tested on port-amd64 (including rumptest). Since there are no MD-changes in this version, there "shouldn't be" any issues with building on other ports. As previously noted, there is only one known use case for this so far: modules loading other modules from within their xxx_modcmd() routine. The specific use case we have involves loading the acpicpu driver/module which eventually results in an attempt to load acpiverbose. It would be really nice if the community could A. Compile-test on additional architectures B. Test to see that existing mutex operations still work correctly C. Exercise the known use case if possible D. Identify additional use cases - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -Index: sys/conf/files === RCS file: /cvsroot/src/sys/conf/files,v retrieving revision 1.992 diff -u -p -r1.992 files --- sys/conf/files 7 Jul 2010 01:09:39 - 1.992 +++ sys/conf/files 2 Aug 2010 00:08:26 - @@ -1468,6 +1468,7 @@ file kern/kern_prot.c file kern/kern_ras.c file kern/kern_rate.c file kern/kern_resource.c +file kern/kern_rmutex.c file kern/kern_runq.c file kern/kern_rwlock.c file kern/kern_rwlock_obj.c Index: sys/sys/rmutex.h === RCS file: sys/sys/rmutex.h diff -N sys/sys/rmutex.h --- /dev/null 1 Jan 1970 00:00:00 - +++ sys/sys/rmutex.h2 Aug 2010 00:08:27 - @@ -0,0 +1,53 @@ +/* $NetBSD$*/ + +/*- + * Copyright (c) 2010 The NetBSD Foundation, Inc. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + *notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + *notice, this list of conditions and the following disclaimer in the + *documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS + * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS + * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + * POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef_SYS_RMUTEX_H_ +#define_SYS_RMUTEX_H_ + +#include + +/* + * A recursive mutex is simply an adaptive mutex which can be acquired + * multiple times by a single owner. + */ + +struct rmutex { + kmutex_trmtx_mtx; + uint32_trmtx_recurse; +}; + +typedef struct rmutex rmutex_t; + +void rmutex_init(rmutex_t *); +void rmutex_destroy(rmutex_t *); +void rmutex_enter(rmutex_t
Re: Modules loading modules?
On Sun, 1 Aug 2010, Paul Goyette wrote: > Good point, and it will be a lot less work, too! :) And it solves the > problem of not permitting a rmutex being used with condvars. > > One question: Since an adaptive kmutex_t already includes an owner field, > would we really need to have another copy of it in the rmutex_t structure? Why is there a need to hold a mutex across module loading instead of, say, having a reference count tht's protected by a mutex? I thought holding a mutex across a potentially blocking operation such as a module load is considered bad practice. Eduardo
Re: Modules loading modules?
On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote: > One question: Since an adaptive kmutex_t already includes an owner > field, would we really need to have another copy of it in the rmutex_t > structure? Good point. I think it's ok to do: if (mutex_owned(mtx)) cnt++ else lock
Re: Modules loading modules?
On Sun, 1 Aug 2010, Antti Kantee wrote: Well, folks, here is a first pass recursive locks! The attached diffs are against -current as of a few minutes ago. Oh, heh, I thought we have recursive lock support. But with that gone from the vfs locks, I guess not (apart from the kernel lock ;). :) I'm not sure if it's a good idea to change the size of kmutex_t. I guess plenty of data structures have carefully been adjusted by hand to its size and I don't know of any automatic way to recalculate that stuff. Even if not, since this is the only user and we probably won't have that many of them even in the future, why not just define a new type ``rmutex'' which contains a kmutex, an owner and the counter? It feels wrong to punish all the normal kmutex users for just one use. It'll also make the implementation a lot simpler to test, since it's purely MI. "separate normal case and worst case" Good point, and it will be a lot less work, too! :) And it solves the problem of not permitting a rmutex being used with condvars. One question: Since an adaptive kmutex_t already includes an owner field, would we really need to have another copy of it in the rmutex_t structure? - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Sat Jul 31 2010 at 15:48:26 -0700, Paul Goyette wrote: > >>If modload-from-modcmd is found necessary, sounds more like a case for > >>the infamous recursive lock. > > > >Recursive lock is the way to go. I think the same lock should also cover > >all device configuration activites (i.e. autoconf) and any other > >heavy lifting where we have chunks of the system coming and going. > > Well, folks, here is a first pass recursive locks! The attached diffs > are against -current as of a few minutes ago. Oh, heh, I thought we have recursive lock support. But with that gone from the vfs locks, I guess not (apart from the kernel lock ;). I'm not sure if it's a good idea to change the size of kmutex_t. I guess plenty of data structures have carefully been adjusted by hand to its size and I don't know of any automatic way to recalculate that stuff. Even if not, since this is the only user and we probably won't have that many of them even in the future, why not just define a new type ``rmutex'' which contains a kmutex, an owner and the counter? It feels wrong to punish all the normal kmutex users for just one use. It'll also make the implementation a lot simpler to test, since it's purely MI. "separate normal case and worst case"
Re: Modules loading modules?
On Wed, 28 Jul 2010, Andrew Doran wrote: it seems to me the root problem is that module_mutex is held while calling into the module startup routines. thus, the right solution is to remove this requirement. Yes, that's what is needed. I'm far from convinced that's a good idea. First, it will probably make the module code a nightmare -- what happens when you have multiple interleaved loads, some of which fail at some point in their dependency stack, and let's just throw in a manual modunload to mix up things further. Second, and pretty much related to number one, it goes against one of the most fundamental principles of robust code: atomic actions. This: atomicity. The atomic behaviour is relied upon to give "all or nothing" semantics upon load and unload, like transactions in a database. Admittely some modules (not of my making) are sloppy about this and so break that bit of the interface contract. If modload-from-modcmd is found necessary, sounds more like a case for the infamous recursive lock. Recursive lock is the way to go. I think the same lock should also cover all device configuration activites (i.e. autoconf) and any other heavy lifting where we have chunks of the system coming and going. Well, folks, here is a first pass recursive locks! The attached diffs are against -current as of a few minutes ago. Some caveats: 1. This has only been compile-tested for x86 (amd64 & i386). 2. Other architectures have not even been compiled yet. "It should work" but you never know. 3. HPPA is an exceptional exception here, since it is the only in-tree architecture I found that cannot use SIMPLE_LOCKS. 4. There is only one known use case for this so far: modules loading other modules from within their xxx_modcmd() routine. The specific use case we have involves loading the acpicpu driver/module which eventually results in an attempt to load acpiverbose. It would be really nice if the community could A. Compile-test on additional architectures, especially HPPA B. Test to see that existing mutex operations still work correctly C. Exercise the known use case if possible D. Identify additional use cases There is one place in the code (marked with great big XXX) where I am simply unsure if I need to use a CAS-type operation for updating the reference count on the recursive mutex. I _think_ it is save to simply auto-increment the field, but I am totally willing for any of you experts to tell my why this might not be safe. :) I will be updating one of my home machines (amd64) to use this code in the next few days. I don't plan on making any commits until I've had some positive testing feedback, and hopefully some expert commentary on the XXX. (Some additional "concensus" that this is the "right thing to do" would also be appreciated!) - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -Index: sys/arch/alpha/include/mutex.h === RCS file: /cvsroot/src/sys/arch/alpha/include/mutex.h,v retrieving revision 1.4 diff -u -p -r1.4 mutex.h --- sys/arch/alpha/include/mutex.h 28 Apr 2008 20:23:11 - 1.4 +++ sys/arch/alpha/include/mutex.h 31 Jul 2010 22:34:30 - @@ -43,7 +43,11 @@ struct kmutex { struct kmutex { union { - volatile uintptr_t mtxa_owner; + struct { + volatile uintptr_t mtxa_owner; + volatile uint8_tmtxa_recurse; + volatile uint8_tmtxa_unused[3]; + } a; struct { volatile uint8_tmtxs_flags; ipl_cookie_tmtxs_ipl; @@ -53,7 +57,8 @@ struct kmutex { __cpu_simple_lock_t mtx_lock; }; -#definemtx_owner u.mtxa_owner +#definemtx_owner u.a.mtxa_owner +#definemtx_recurse u.a.mtxa_recurse #definemtx_flags u.s.mtxs_flags #definemtx_ipl u.s.mtxs_ipl Index: sys/arch/arm/include/mutex.h === RCS file: /cvsroot/src/sys/arch/arm/include/mutex.h,v retrieving revision 1.10 diff -u -p -r1.10 mutex.h --- sys/arch/arm/include/mutex.h28 Apr 2008 20:23:14 - 1.10 +++ sys/arch/arm/include/mutex.h31 Jul 2010 22:34:31 - @@ -57,7 +57,11 @@ struct kmutex { struct kmutex { union { /* Adaptive mutex */ -
Re: Modules loading modules?
On Wed, 28 Jul 2010, Andrew Doran wrote: If modload-from-modcmd is found necessary, sounds more like a case for the infamous recursive lock. Recursive lock is the way to go. I think the same lock should also cover all device configuration activites (i.e. autoconf) and any other heavy lifting where we have chunks of the system coming and going. OK, I'm willing to make an attempt to implement the recursive_lock if noone else wants to give it a try! It will probably take some time... - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Mon, Jul 26, 2010 at 02:49:56PM +0300, Antti Kantee wrote: > On Sun Jul 25 2010 at 15:17:29 -0700, Paul Goyette wrote: > > On Mon, 26 Jul 2010, matthew green wrote: > > > > > > > >it seems to me the root problem is that module_mutex is held while > > >calling into the module startup routines. > > > > > >thus, the right solution is to remove this requirement. > > > > Yes, that's what is needed. > > I'm far from convinced that's a good idea. First, it will probably > make the module code a nightmare -- what happens when you have multiple > interleaved loads, some of which fail at some point in their dependency > stack, and let's just throw in a manual modunload to mix up things > further. Second, and pretty much related to number one, it goes against > one of the most fundamental principles of robust code: atomic actions. This: atomicity. The atomic behaviour is relied upon to give "all or nothing" semantics upon load and unload, like transactions in a database. Admittely some modules (not of my making) are sloppy about this and so break that bit of the interface contract. > If modload-from-modcmd is found necessary, sounds more like a case for > the infamous recursive lock. Recursive lock is the way to go. I think the same lock should also cover all device configuration activites (i.e. autoconf) and any other heavy lifting where we have chunks of the system coming and going.
Re: Modules loading modules?
On Mon, 26 Jul 2010, der Mouse wrote: We have a modular device driver, let's call it xxxmod. [...] It [] might attempt to use an optional module (e.g., zzzverbose) to print some device attachment messages. First, a required module cannot be optional. If the desired module is not present, or if it is present but its own xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated back to the original "outer" call to module_load() which will also fail. The second reason why this is not suitable is that the "outer" load will add a reference to the module, preventing it from being auto-unloaded. Surely the right answer here is to provide a way to say "refer to this module, but it's ok for its load to fail, and it's ok for it to get auto-unloaded", including passing up whatever information is necessary for the calling module to do something useful in failure cases? I actually considered adding a set of "desired" modules as well as the current "required" modules. That would certainly take care of the immediate problem with the xxx_verbose modules. But it doesn't address the need to have one module/driver loading another one for a child device, and it would be a maintenance nightmare to require updating the dependency-list every time a new child driver gets created. It really seems to me that the module system is there to help us, not to shackle us, and that if it has properties which are leading to problems, one of the options we should at the very least be considering is changing those properties. Yes, the current restriction appears to be burdensome. I'm just trying to get some consensus on that - so far I've seen only one negative opinion (from pooka@ IIRC). If we can agree that it's desirable for one module's xxx_modcmd() to load another module, then we can move to step 2 and figure out how to make it happen without breaking anything. If we don't get consensus, then I'll most likely need to revert the xxx_verbose modules. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
> We have a modular device driver, let's call it xxxmod. [...] It [] > might attempt to use an optional module (e.g., zzzverbose) to print > some device attachment messages. > First, a required module cannot be optional. If the desired module > is not present, or if it is present but its own > xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated > back to the original "outer" call to module_load() which will also > fail. > The second reason why this is not suitable is that the "outer" load > will add a reference to the module, preventing it from being > auto-unloaded. Surely the right answer here is to provide a way to say "refer to this module, but it's ok for its load to fail, and it's ok for it to get auto-unloaded", including passing up whatever information is necessary for the calling module to do something useful in failure cases? It really seems to me that the module system is there to help us, not to shackle us, and that if it has properties which are leading to problems, one of the options we should at the very least be considering is changing those properties. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: Modules loading modules?
On Sun Jul 25 2010 at 15:17:29 -0700, Paul Goyette wrote: > On Mon, 26 Jul 2010, matthew green wrote: > > > > >it seems to me the root problem is that module_mutex is held while > >calling into the module startup routines. > > > >thus, the right solution is to remove this requirement. > > Yes, that's what is needed. I'm far from convinced that's a good idea. First, it will probably make the module code a nightmare -- what happens when you have multiple interleaved loads, some of which fail at some point in their dependency stack, and let's just throw in a manual modunload to mix up things further. Second, and pretty much related to number one, it goes against one of the most fundamental principles of robust code: atomic actions. If modload-from-modcmd is found necessary, sounds more like a case for the infamous recursive lock. (no comment on the actual problem)
Re: Modules loading modules?
On Mon, 26 Jul 2010, Jukka Ruohonen wrote: On Mon, Jul 26, 2010 at 06:41:11AM +1000, matthew green wrote: it seems to me the root problem is that module_mutex is held while calling into the module startup routines. Here is one related question: is it ensured that the module lock is dropped immediately after a modular device driver returns from its attachment routine? I am thinking of a case where a modular driver defers its configuration by using config_interrupts(9) or config_finalize_register(9). Not necessarily. If a module is being loaded as a dependency of (required by) another module, then the lock is not dropped. The lock is dropped only after the requested module and all of its dependencies have been loaded and their xxx_modcmd() have been called. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Mon, Jul 26, 2010 at 06:41:11AM +1000, matthew green wrote: > it seems to me the root problem is that module_mutex is held while > calling into the module startup routines. Here is one related question: is it ensured that the module lock is dropped immediately after a modular device driver returns from its attachment routine? I am thinking of a case where a modular driver defers its configuration by using config_interrupts(9) or config_finalize_register(9). - Jukka.
re: Modules loading modules?
> On Mon, 26 Jul 2010, matthew green wrote: > > > > > it seems to me the root problem is that module_mutex is held while > > calling into the module startup routines. > > > > thus, the right solution is to remove this requirement. > > Yes, that's what is needed. > > But it sounds a lot simpler than it is. oh, i'm sure it's not trivial. :-( .mrg.
re: Modules loading modules?
On Mon, 26 Jul 2010, matthew green wrote: it seems to me the root problem is that module_mutex is held while calling into the module startup routines. thus, the right solution is to remove this requirement. Yes, that's what is needed. But it sounds a lot simpler than it is. Among other issues, we'd need to ensure that we continue to detect circular dependencies between both sets of required modules. And, since we call the module startup routine for a required module _before_ we call it for the required module, we would need to release and then reacquire the mutex, all the while being sure that the data structures don't get changed out from under us. We would end up with something like this sequence,where module A requires A1 and A2, and module B requires B1 and B2, and hopefully (B1 != A && B2 != A) (which would be a hidden circular dependency). mutex_enter() load module A load required module A1 mutex_exit() call A1's init routine mutex_enter() load required module A2 mutex_exit() call A2's init routine mutex_enter() load module B load required module B1 mutex_exit() call B1's init routine mutex_enter() load required module B2 mutex_exit() call B2's init routine mutex_enter() mutex_exit() call B's init routine mutex_enter() mutex_exit() mutex_enter() mutex_exit() call A's init routine mutex_enter() mutex_exit() - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
re: Modules loading modules?
it seems to me the root problem is that module_mutex is held while calling into the module startup routines. thus, the right solution is to remove this requirement. .mrg.
Modules loading modules?
Consider the following scenario: We have a modular device driver, let's call it xxxmod. The driver's xxx_modcmd(MODULE_CMD_INIT, ...) routine handles all of its initialization, including interaction with autoconf(9). It therefore might attempt to use an optional module (e.g., zzzverbose) to print some device attachment messages. (Another possible scenario would be to have one modular driver xxxmod that calls config_found_xxx() which in turn requires another modular driver yyymod.) We have no problem if the zzzverbose module is already auto-loaded. But if the module is not present, or if it has already been auto- unloaded, the current code will call module_autoload() to attempt to (re)load it. Since this call happens in xxx_modcmd(), we end calling mutex_enter() while the mutex is still owned. There is currently a "requires" mechanism that allows recursion within the module loading process. But it doesn't quite solve the problem, for at least three reasons. First, a required module cannot be optional. If the desired module is not present, or if it is present but its own xxx_modcmd(MODULE_CMD_INIT, ...) fails, the failure is propagated back to the original "outer" call to module_load() which will also fail. The second reason why this is not suitable is that the "outer" load will add a reference to the module, preventing it from being auto-unloaded. (A key benefit of having the xxx_verbose modules is that their large text tables can be unloaded.) A third reason that tends to make this option unuseable is the need to identify ahead of time all of the possible optional modules and maintain the dependency list. Adding a new driver would thus require updating the dependency list in the parent device's driver. Any suggestions on how to resolve this conundrum? Without a solution, the zzz_verbose modules and modularized drivers that might use the zzz_verbose modules cannot co-exist, and we won't be able to have modular drivers for devices whose children are served by other modular drivers. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -