re: Modules loading modules?
According to the mutex(9) man page: mutex_owned(mtx) For adaptive mutexes, return non-zero if the current LWP holds the mutex. ... this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. ie, you can not even KASSERT(!mutex_owned()). the documentation matches my experience with using mutex(9) and KASSERT(). but i'm still not sure why we're going to such lengths to hold a lock across such a heavy weight operation like loading a module. that may involve disk speeds, and so you're looking at waiting millions of cycles for the lock. aka, what eeh posted. .mrg
Re: Modules loading modules?
On Aug,Sunday 1 2010, at 3:53 PM, Antti Kantee wrote: On Sun Aug 01 2010 at 06:10:07 -0700, Paul Goyette wrote: One question: Since an adaptive kmutex_t already includes an owner field, would we really need to have another copy of it in the rmutex_t structure? Good point. I think it's ok to do: if (mutex_owned(mtx)) cnt++ else lock When I did locking for dm driver, it was said by rmind@ that mutex_owned should not be used for any locking decisions and should be used only for diagnostic purposes. What I would in this case is add two new routines to module_framework - module_lock() module_unlock() These two will then do something like this. kmutex_t module_lock; uint32_t refcnt; lwp_t mod_lwp; condvar_t cv; module_lock() { mutex_enter(module_lock); start if (refcnt == 0) { refcnt++; mod_lwp = curlwp; else { if (mod_lwp == curlwp) refcnt++; else { cv_wait(cv); goto start } } mutex_exit(module_lock); } mudule_unlock() { mutex_enter(module_lock); KASSERT(mod_lwp != curlwp); /* module_unlock should be only called from lwp which called module_lock */ refcnt--; if (refcnt == 0) { mod_lwp = 0; cv_broadcast(cv); } mutex_exit(module_lock); } This way module_lock can be called multiple times by one lwp and will sleep for everyone else. Regards Adam.
Re: Modules loading modules?
On Sun, 1 Aug 2010, Eduardo Horvath wrote: Why is there a need to hold a mutex across module loading instead of, say, having a reference count tht's protected by a mutex? I thought holding a mutex across a potentially blocking operation such as a module load is considered bad practice. Please note that I am not introducing the holding of the lock over the potentially blocking operation. This is already done. Several others have already indicated that the atomicity of the module load (including the loading of any required modules) is a desirable property. I'm simply looking for a mechanism that allows us to keep this property while allowing modules to load other modules. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Locking for module_autoload() (was: Modules loading modules?)
On Mon, 2 Aug 2010, Adam Hamsik wrote: On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote: On Sun, 1 Aug 2010, John Nemeth wrote: I'm thinking the acquisition of module_lock should be pushed into module_autoload() instead of having the caller acquire it for this very reason. It makes it hard to change the way locking works in the MODULAR code if you expect the caller to acquire the lock. I don't know why it was done this way originally, or what the consequences (if any) would be for making the change. Andrew, any thoughts on this? Attached are diffs for moving the locking out of each caller and into module_autoload() itself. Most of these changes are fairly trivial, but a couple of them should be looked at more closely in case I've missed any subtleties in the original code: kern/kern_syscall.c net/if_ppp.c I think that this patch has same problem as original code you can call module_autoload twice in same lwp. (you are holding module_lock while calling module_do_load). This patch wasn;t intended to solve the nested-call problem. This patch only purpose was to modify the existing semantics of the module_autoload routine to match that of the other module routines. In particular, module_autoload is the only routine that requires its caller to acquire the mutex first; all the other routines do all the required locking internally to each routine. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. That's the mantra, yes. ie, you can not even KASSERT(!mutex_owned()). Strictly speaking you can in a case where you have two locks always are taken as l1,l2 and released as l2,l1 provided you're not dealing with a spin mutex. Does it make any sense? no (l2 is useless). Is it possible? yes. Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). but i'm still not sure why we're going to such lengths to hold a lock across such a heavy weight operation like loading a module. that may involve disk speeds, and so you're looking at waiting millions of cycles for the lock. aka, what eeh posted. It's held for millions of cycles already now and nobody has pointed out measurable problems. But if it is deemed necessary, you can certainly hide a cv underneath. The efficiency requirements for the module lock are probably anyway in the who cares wins ... not spectrum. At least I'm not aware of any fastpath wanting to use it. Anyway, no real opinion there. A cv most likely is the safe, no-brainer choice.
Re: Modules loading modules?
On Mon, 2 Aug 2010, Antti Kantee wrote: On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. That's the mantra, yes. ie, you can not even KASSERT(!mutex_owned()). Strictly speaking you can in a case where you have two locks always are taken as l1,l2 and released as l2,l1 provided you're not dealing with a spin mutex. Does it make any sense? no (l2 is useless). Is it possible? yes. Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). but i'm still not sure why we're going to such lengths to hold a lock across such a heavy weight operation like loading a module. that may involve disk speeds, and so you're looking at waiting millions of cycles for the lock. aka, what eeh posted. It's held for millions of cycles already now and nobody has pointed out measurable problems. But if it is deemed necessary, you can certainly hide a cv underneath. The efficiency requirements for the module lock are probably anyway in the who cares wins ... not spectrum. At least I'm not aware of any fastpath wanting to use it. Anyway, no real opinion there. A cv most likely is the safe, no-brainer choice. Yes, the cv mechanism, coupled with changing the semantics of module_autoload() (to not require the caller to obtain the lock) makes good sense here. I'm working on it and I should be able to post new diffs tomorrow. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Locking for module_autoload() (was: Modules loading modules?)
On Aug,Monday 2 2010, at 4:27 AM, Paul Goyette wrote: On Sun, 1 Aug 2010, John Nemeth wrote: I'm thinking the acquisition of module_lock should be pushed into module_autoload() instead of having the caller acquire it for this very reason. It makes it hard to change the way locking works in the MODULAR code if you expect the caller to acquire the lock. I don't know why it was done this way originally, or what the consequences (if any) would be for making the change. Andrew, any thoughts on this? Attached are diffs for moving the locking out of each caller and into module_autoload() itself. Most of these changes are fairly trivial, but a couple of them should be looked at more closely in case I've missed any subtleties in the original code: kern/kern_syscall.c net/if_ppp.c I think that this patch has same problem as original code you can call module_autoload twice in same lwp. (you are holding module_lock while calling module_do_load). Regards Adam.
Re: pchb@acpi
Hi! Quentin, From: Quentin Garnier c...@cubidou.net Date: Sun, 1 Aug 2010 15:21:18 + On Sun, Aug 01, 2010 at 11:17:54PM +0900, KIYOHARA Takashi wrote: All recent PC has information on PCI in ACPI. We can attach pchb in acpi like a lot of other acpi devices. p...@acpi has been fairly supported in FreeBSD since before. ftp://ftp.netbsd.org/pub/NetBSD/misc/kiyohara/ia64/p...@acpi-support.diff Do you have anything further in mind? This patch is only dmesg cosmetics. No, I don't. My ia64 machine not need it. However it pass segment information to pci(4) in the future possibly. Thanks, -- kiyohara
Re: Modules loading modules?
On Mon, 2 Aug 2010, Paul Goyette wrote: Yes, the cv mechanism, coupled with changing the semantics of module_autoload() (to not require the caller to obtain the lock) makes good sense here. I'm working on it and I should be able to post new diffs tomorrow. OK, it's time for Round #3! Attached are diffs that implement Adam's suggested cv-based solution, along with moving the locking for module_autolock() into the routine itself (for consistency with other routines). I've also updated the module(4) man page this time around. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | - MODULE_LOCKING.diff.gz Description: Binary data
Re: fd code multithreaded race?
On Sat Jul 31 2010 at 20:31:19 +0300, Antti Kantee wrote: Hi, I'm looking at a KASSERT which is triggering quite rarely for me (in terms of iterations): panic: kernel diagnostic assertion dt-dt_ff[i]-ff_refcnt == 0 failed: file /usr/allsrc/src/sys/rump/librump/rumpkern/../../../kern/kern_descrip.c, line 856 Upon closer examination, it seems that this can trigger while another thread is in fd_getfile() between upping the refcount, testing for ff_file, and fd_putfile(). Removing the KASSERT seems to restore correct operation, but I didn't read the code far enough to see where the race is actually handled and what stops the code from using the wrong file. How-to-repeat: Run tests/fs/puffs/t_fuzz mountfuzz7 in a loop. A multiprocessor kernel might produce a more reliable result, so set RUMP_NCPU unless you have a multiprocessor host. Depending on timings and how the get/put thread runs, you might even see the refcount as 0 in the core. Does anyone see something wrong with the analysis? If not, I'll create a dedidated test and file a PR. kern/43694, tests/kernel/t_filedesc
re: Modules loading modules?
On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. That's the mantra, yes. ie, you can not even KASSERT(!mutex_owned()). Strictly speaking you can in a case where you have two locks always are taken as l1,l2 and released as l2,l1 provided you're not dealing with a spin mutex. Does it make any sense? no (l2 is useless). Is it possible? yes. what is not possible is to safely to what the manual says. you will not have the right thing happen. i've seen this when i've been working on the sparc64 pmap. Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). this just does not match my actual experience in the kernel. i had weird pmap-style problems and asserts firing wrongly while i did not obey the rules in the manual directly. .mrg.
re: Modules loading modules?
The most recent verion of this patch set was recently posted, and it uses the cv_wait() mechanism suggested by Adam Hamsik. This discussion is still very intresting, but perhaps we could start a new thread (or at least change the subject line)? :) On Tue, 3 Aug 2010, matthew green wrote: On Mon Aug 02 2010 at 16:30:03 +1000, matthew green wrote: this is an incomplete reading of the manual page, and you can not use mutex_owned() the way you are trying to (regardless of what pooka posted.) you can't even using it in various forms of assertions safely. from the man page: It should not be used to make locking decisions at run time, or to verify that a lock is not held. That's the mantra, yes. ie, you can not even KASSERT(!mutex_owned()). Strictly speaking you can in a case where you have two locks always are taken as l1,l2 and released as l2,l1 provided you're not dealing with a spin mutex. Does it make any sense? no (l2 is useless). Is it possible? yes. what is not possible is to safely to what the manual says. you will not have the right thing happen. i've seen this when i've been working on the sparc64 pmap. Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). this just does not match my actual experience in the kernel. i had weird pmap-style problems and asserts firing wrongly while i did not obey the rules in the manual directly. .mrg. - | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com| | Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net | | Kernel Developer | | pgoyette at netbsd.org | -
Re: Modules loading modules?
On Tue Aug 03 2010 at 02:17:43 +1000, matthew green wrote: Now, on to sensible stuff. I'm quite certain that warning was written to make people avoid writing bad code without understanding locking -- if you need to used mutex_owned() to decide if you should lock a normal mutex, your code is broken. The other less likely possibility is that someone plans to change mutex_owned in the future. Further data point: the same warning is in rw_held, yet it was used to implement recursive locking in vlockmgr until its very recent demise. Ignoring man page mantras and focusing on how the code works, I do not see anything wrong with Paul's use of mutex_owned(). this just does not match my actual experience in the kernel. i had weird pmap-style problems and asserts firing wrongly while i did not obey the rules in the manual directly. Not knowing more details it's difficult to comment. But since you are talking about the pmap, maybe your experiences are with spin mutexes instead of adaptive ones?