Re: mutexes, locks and so on...
On 11/14/10 20:16, David Holland wrote: On Sat, Nov 13, 2010 at 01:45:40AM +0900, Izumi Tsutsui wrote: > > Wow. I guess you can add me to the list of people leaving. > > There is no perfect world and we don't have enough resources. > > Any help to keep support for ancient machines are appreciate, but > complaints like "we should support it" which prevents improvements > of mainstream will just make NetBSD rotten. What "prevents improvements of mainstream" are we talking about here? We have someone who wants to provide tuned vax-specific locking primitives. The absolute worst possible cost to the "mainstream" that this incurs is a bit of extra cpp and config hackery. (Can we all please get a grip?) Yes, please. And... Wow, just for my seaching for a spin mutex lock against myself at boot time, I started counting the number of times the uvm_fpageqlock spin mutex is taken. By the time the "boot device: xxx" is printed out, that spin-mutex have been taken 423137 times. That looked like a big number to me atleast. (And yes, it is that mutex that is taken twice, which traps me right now. Still looking at why it happens, but I know that the "normal" code does not in fact check for this condition, and happily just accept that you take a spin-lock that is already taken, when you're not multiprocessor.) Johnny
Re: mutexes, locks and so on...
On 11/15/10 14:55, Johnny Billquist wrote: On 11/14/10 20:16, David Holland wrote: On Sat, Nov 13, 2010 at 01:45:40AM +0900, Izumi Tsutsui wrote: > > Wow. I guess you can add me to the list of people leaving. > > There is no perfect world and we don't have enough resources. > > Any help to keep support for ancient machines are appreciate, but > complaints like "we should support it" which prevents improvements > of mainstream will just make NetBSD rotten. What "prevents improvements of mainstream" are we talking about here? We have someone who wants to provide tuned vax-specific locking primitives. The absolute worst possible cost to the "mainstream" that this incurs is a bit of extra cpp and config hackery. (Can we all please get a grip?) Yes, please. And... Wow, just for my seaching for a spin mutex lock against myself at boot time, I started counting the number of times the uvm_fpageqlock spin mutex is taken. By the time the "boot device: xxx" is printed out, that spin-mutex have been taken 423137 times. That looked like a big number to me atleast. (And yes, it is that mutex that is taken twice, which traps me right now. Still looking at why it happens, but I know that the "normal" code does not in fact check for this condition, and happily just accept that you take a spin-lock that is already taken, when you're not multiprocessor.) Aw. Crap. Forget that. Error in my mathematics, since I had two numbers, and one happened to be decimal, and the other hex... The actual number of spin mutexes taken up to that point is 14524. Still a hefty number, but nowhere near what I first though I saw. Johnny
Re: Vax interlock insn's (Re: Please do not yell at people for trying to help you.)
On 11/13/10 11:16, Anders Magnusson wrote: (a little side-note, but may be interesting) On 11/13/2010 04:17 AM, Matt Thomas wrote: Eventually, most operations come down to compare and swap. It's just too damn useful to not have as a primitive. Even if some of the platforms have to emulate it somehow. Just because an architecture is 30+ years old doesn't mean we are forced to ignore algorithms that came after its birth. The VAX now has a fast non-MP emulation of atomic_cas so that should be less of an issue. When I tested NetBSD/vax MP on an 8353 some years ago I found that it ended up unexpectedly slow in some situations. Some timing showed that the interlock insn's were _really_ slow compared to their non-interlocked version; insqti almost took 100 times of insque, bbssi were not that slow compared to bbss but there were an unreasonable amount of time spent on the buses. Not that shocking... The cache is there for a reason. Having an instruction forced to go to physical memory will never be "fast". queue instructions must be even more horrible, since they will probably need to lock up several memory addresses while the operation is performed. That said, our current spin-locking, using 1: BBSSI $0,mutex,1b is really not good, and not needed. The problem being that you keep hitting the same memory cell over and over again, over the bus, thus creating a lot of bus contention as well, and waiting on slow memory. Changing it to 1: BBSSI $0,mutex,2f BRB3f 2: BBS$0,mutex,2b BRB1b 3: will seriously improve performance, since the CPU will then spin hitting it's own CPU cache instead, and only go back to hitting memory once the mutex is free again. Using an MP VAX were just not worth it (at least not with BI bus). Better to optimize for UP VAX and make it work reasonable good. I think it's definitely also worth trying to get MP code in better shape that it is today. I'll see if I can get my fixes to the mutexes atleast to work (and also locks), but rwlocks can not be fixed without more work. :-( Johnny
Re: CVS commit: src/sys/arch/powerpc/oea
(moving this to tech-kern because it's the right place and per request) On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote: > > Every header file should include the things it requires to compile. > > Therefore, there should in principle be no cases where a header file > > (or source file) needs to #include something it doesn't itself use. > > This clarifies my long-unanswered question, thanks! *bow* > I've (re)built about 300 kernels in the last days. I've found: > > - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h > (I've done this locally). Otherwise all sysctl node providers > includes sys/proc.h and uvm/uvm_extern.h. > (This is where I started...) I'm not sure this is a good plan in the long run. Shouldn't it at some point be unhooked fully from the "real" proc structure? > - sys/syscallargs.h should be split into pieces, otherwise all its > users have to know unrelated types (sys/mount.h, sys/cpu.h). Since system calls don't in general pass structures by value, it shouldn't need most of those header files, just forward declarations of the structs. (this is, btw, one of the reasons to avoid silly typedefs) > - sys/proc.h's tsleep(), wakeup(), and friends should be moved into > some common header, because it's widely used API. sys/proc.h will > be used only for "struct proc" related things. Given that this is a deprecated API in the long term I'm not sure it's worthwhile. -- David A. Holland dholl...@netbsd.org
Re: CVS commit: src/sys/arch/powerpc/oea
>>> Every header file should include the things it requires to compile. I've long felt this way: that, except for a very few examples like that are defined to depend on context, the order of #includes should not matter. In particular, if multiple files must be included, any of them may come first - so any file that generates errors if it's included first needs fixing. (Well, unless it's an internal file, one that shouldn't be included directly.) I've got numerous fixes to 4.0.1 for such issues, in case anyone thinks it's worth applying this stance to 4.x. > [...] just forward declarations of the structs. > (this is, btw, one of the reasons to avoid silly typedefs) I'm not sure what typedefs have to do with it. typedeffing a name to an incomplete ("forward") struct type works just fine: struct foo; typedef struct foo FOO; (You can't do anything with a FOO without completing the struct type, but you can work with pointers to them) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: CVS commit: src/sys/arch/powerpc/oea
On Mon, Nov 15, 2010 at 03:47:32PM -0500, der Mouse wrote: > >>> Every header file should include the things it requires to compile. > > I've long felt this way: that, except for a very few examples like > that are defined to depend on context, the order of > #includes should not matter. In particular, if multiple files must be > included, any of them may come first - so any file that generates > errors if it's included first needs fixing. (Well, unless it's an > internal file, one that shouldn't be included directly.) assert.h is the *only* header that is not supposed to be idempotent. Pretty much anything else should be classified as bug. Another item is that too many of our headers depend on non-standard compliant types polluting the namespace. Nothing installed in /usr/include should depend on u_char for example. > I've got numerous fixes to 4.0.1 for such issues, in case anyone thinks > it's worth applying this stance to 4.x. Doing it properly has a chance of creating some fallout and that's not really acceptable for a release branch, especially one that doesn't get many non-critical changes. > > [...] just forward declarations of the structs. > > > (this is, btw, one of the reasons to avoid silly typedefs) > > I'm not sure what typedefs have to do with it. typedeffing a name to > an incomplete ("forward") struct type works just fine: > > struct foo; > typedef struct foo FOO; > > (You can't do anything with a FOO without completing the struct type, > but you can work with pointers to them) Problem is that this requires a guard for the typedef if FOO is supposed to be defined by multiple files. Joerg
Re: CVS commit: src/sys/arch/powerpc/oea
>> I've long felt this way: that, except for a very few examples like >> that are defined to depend on context, the order of >> #includes should not matter. In particular, if multiple files must >> be included, any of them may come first - so any file that generates >> errors if it's included first needs fixing. > assert.h is the *only* header that is not supposed to be idempotent. > Pretty much anything else should be classified as bug. I'm not talking about idempotency. Idempotency is the question of whether #include #include is operationally equivalent to #include But what I'm talking about is whether #include #include and #include #include are operationally equivalent. > Another item is that too many of our headers depend on non-standard > compliant types polluting the namespace. This is another issue I have with our include files, but it's significantly more work to fix, so I've been working around it rather than fixing it right. (It's also not entirely clear what the right fix is when more than two levels of software supplier are involved.) > Nothing installed in /usr/include should depend on u_char for > example. Well...ideally. But there's a big difference between (say) depending on u_char and depending on u_char. (I'm not happy about either, but substantially less happy about the former.) >> struct foo; >> typedef struct foo FOO; > Problem is that this requires a guard for the typedef if FOO is > supposed to be defined by multiple files. True. Most of the cases I've seen for this push the incomplete struct declaration and the typedef into a separate file, wrap it with an idempotency guard, and then include that from files that need the struct and/or type. I've seen a very few cases where this hasn't been done and I think none at all where it couldn't be done if the relevant software authors cared to bother. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: CVS commit: src/sys/arch/powerpc/oea
On Mon, Nov 15, 2010 at 08:34:40PM +, David Holland wrote: > (moving this to tech-kern because it's the right place and per request) > > On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote: > > > Every header file should include the things it requires to compile. > > > Therefore, there should in principle be no cases where a header file > > > (or source file) needs to #include something it doesn't itself use. > > > > This clarifies my long-unanswered question, thanks! > > *bow* > > > I've (re)built about 300 kernels in the last days. I've found: > > > > - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h > > (I've done this locally). Otherwise all sysctl node providers > > includes sys/proc.h and uvm/uvm_extern.h. > > (This is where I started...) > > I'm not sure this is a good plan in the long run. Shouldn't it at some > point be unhooked fully from the "real" proc structure? > > > - sys/syscallargs.h should be split into pieces, otherwise all its > > users have to know unrelated types (sys/mount.h, sys/cpu.h). > > Since system calls don't in general pass structures by value, it > shouldn't need most of those header files, just forward declarations > of the structs. > > (this is, btw, one of the reasons to avoid silly typedefs) OK, you're absolutely right. And I'm fully agree that forward decls are better in modularity POV. > > > - sys/proc.h's tsleep(), wakeup(), and friends should be moved into > > some common header, because it's widely used API. sys/proc.h will > > be used only for "struct proc" related things. > > Given that this is a deprecated API in the long term I'm not sure it's > worthwhile. > > -- > David A. Holland > dholl...@netbsd.org -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
Re: CVS commit: src/sys/arch/powerpc/oea
On Mon, Nov 15, 2010 at 10:41:55PM +, David Laight wrote: > > Indeed. Properly speaking though, headers that are exported to > > userland should define only the precise symbols that userland needs; > > kernel-only material should be kept elsewhere. > > One start would be to add a sys/proc_internal.h so that sys/proc > can be reduced to only stuff that userspace and some kernel parts > are really expected to use. The right way (TM) is to create src/sys/include and put kernel-only headers in there, to be included as e.g. . In the long term the user-visible parts would go in src/sys/include/kern/proc.h, which would be included as . (It has to be kern/ and not sys/ because a couple decades of standards creep and poor API maintenance has led to half of sys/*.h properly belonging to libc. In order to avoid repeating this problem in the future, all APIs should be defined without direct reference to any kern/*.h files; those should only be included from other libc or kernel headers. So libc would grow its own because that's part of the libkvm API.) When done completely the entire kern/ subtree is the same for both userland and the kernel, including MD headers, no other random kernel headers need to be installed, and there's no longer any need for #ifdef _KERNEL. As much as this probably sounds obvious, the first couple of times I set out to do it myself I got it wrong. (And it's wrong in Linux too.) -- David A. Holland dholl...@netbsd.org
Re: CVS commit: src/sys/arch/powerpc/oea
On Mon, Nov 15, 2010 at 03:47:32PM -0500, der Mouse wrote: > > [...] just forward declarations of the structs. > > > (this is, btw, one of the reasons to avoid silly typedefs) > > I'm not sure what typedefs have to do with it. typedeffing a name to > an incomplete ("forward") struct type works just fine: > > struct foo; > typedef struct foo FOO; > > (You can't do anything with a FOO without completing the struct type, > but you can work with pointers to them) But now there's no protection against divergence; that is, if I have typedef struct foo FOO; in one header and a typo'd typedef struct tfoo FOO; in another, assuming suitable ifdef guards as already mentioned, now FOO can be two different things, and the inconsistency in the cut-and-pasted-material might not be detected for some time. However, if I just have struct foo; in multiple headers there aren't very many ways this can be wrong that will compile at all. The only common way for this to go bad is if you've removed struct foo from your program completely; then you have to hunt down all the forward declarations by hand and kill them off. But that's more or less unavoidable. The difference between these two cases is inherent in the fact that the typedef form is declaring two things and the plain struct declaration is declaring only one... there's no particular reason C couldn't provide a way to create a forward declaration (without definition) of a typedef name, but it doesn't. -- David A. Holland dholl...@netbsd.org
Re: Vax interlock insn's (Re: Please do not yell at people for trying to help you.)
On Mon, Nov 15, 2010 at 05:40:05PM +0100, Johnny Billquist wrote: > [...] but rwlocks can not be fixed without more work. :-( I wouldn't be particularly surprised if replacing the current overbred rwlock code with a simple implementation using a single mutex and condvar made it go faster... or at least no slower. -- David A. Holland dholl...@netbsd.org