Re: mutexes, locks and so on...

2010-11-15 Thread Johnny Billquist

On 11/14/10 20:16, David Holland wrote:

On Sat, Nov 13, 2010 at 01:45:40AM +0900, Izumi Tsutsui wrote:
  >  >  Wow. I guess you can add me to the list of people leaving.
  >
  >  There is no perfect world and we don't have enough resources.
  >
  >  Any help to keep support for ancient machines are appreciate, but
  >  complaints like "we should support it" which prevents improvements
  >  of mainstream will just make NetBSD rotten.

What "prevents improvements of mainstream" are we talking about here?
We have someone who wants to provide tuned vax-specific locking
primitives. The absolute worst possible cost to the "mainstream" that
this incurs is a bit of extra cpp and config hackery.

(Can we all please get a grip?)


Yes, please.

And... Wow, just for my seaching for a spin mutex lock against myself at 
boot time, I started counting the number of times the uvm_fpageqlock 
spin mutex is taken. By the time the
"boot device: xxx" is printed out, that spin-mutex have been taken 
423137 times.


That looked like a big number to me atleast. (And yes, it is that mutex 
that is taken twice, which traps me right now. Still looking at why it 
happens, but I know that the "normal" code does not in fact check for 
this condition, and happily just accept that you take a spin-lock that 
is already taken, when you're not multiprocessor.)


Johnny


Re: mutexes, locks and so on...

2010-11-15 Thread Johnny Billquist

On 11/15/10 14:55, Johnny Billquist wrote:

On 11/14/10 20:16, David Holland wrote:

On Sat, Nov 13, 2010 at 01:45:40AM +0900, Izumi Tsutsui wrote:
> > Wow. I guess you can add me to the list of people leaving.
>
> There is no perfect world and we don't have enough resources.
>
> Any help to keep support for ancient machines are appreciate, but
> complaints like "we should support it" which prevents improvements
> of mainstream will just make NetBSD rotten.

What "prevents improvements of mainstream" are we talking about here?
We have someone who wants to provide tuned vax-specific locking
primitives. The absolute worst possible cost to the "mainstream" that
this incurs is a bit of extra cpp and config hackery.

(Can we all please get a grip?)


Yes, please.

And... Wow, just for my seaching for a spin mutex lock against myself at
boot time, I started counting the number of times the uvm_fpageqlock
spin mutex is taken. By the time the
"boot device: xxx" is printed out, that spin-mutex have been taken
423137 times.

That looked like a big number to me atleast. (And yes, it is that mutex
that is taken twice, which traps me right now. Still looking at why it
happens, but I know that the "normal" code does not in fact check for
this condition, and happily just accept that you take a spin-lock that
is already taken, when you're not multiprocessor.)


Aw. Crap. Forget that. Error in my mathematics, since I had two numbers, 
and one happened to be decimal, and the other hex...


The actual number of spin mutexes taken up to that point is 14524. Still 
a hefty number, but nowhere near what I first though I saw.


Johnny



Re: Vax interlock insn's (Re: Please do not yell at people for trying to help you.)

2010-11-15 Thread Johnny Billquist

On 11/13/10 11:16, Anders Magnusson wrote:

(a little side-note, but may be interesting)

On 11/13/2010 04:17 AM, Matt Thomas wrote:

Eventually, most operations come down to compare and swap. It's just too
damn useful to not have as a primitive. Even if some of the platforms
have to emulate it somehow. Just because an architecture is 30+ years old
doesn't mean we are forced to ignore algorithms that came after its
birth.

The VAX now has a fast non-MP emulation of atomic_cas so that should be
less of an issue.

When I tested NetBSD/vax MP on an 8353 some years ago I found that
it ended up unexpectedly slow in some situations. Some timing showed
that the interlock insn's were _really_ slow compared to their
non-interlocked
version; insqti almost took 100 times of insque, bbssi were not that slow
compared to bbss but there were an unreasonable amount of time spent
on the buses.


Not that shocking... The cache is there for a reason. Having an 
instruction forced to go to physical memory will never be "fast".
queue instructions must be even more horrible, since they will probably 
need to lock up several memory addresses while the operation is performed.


That said, our current spin-locking, using

1:   BBSSI  $0,mutex,1b

is really not good, and not needed. The problem being that you keep 
hitting the same memory cell over and over again, over the bus, thus 
creating a lot of bus contention as well, and waiting on slow memory.


Changing it to

1:   BBSSI  $0,mutex,2f
 BRB3f
2:   BBS$0,mutex,2b
 BRB1b
3:

will seriously improve performance, since the CPU will then spin hitting 
it's own CPU cache instead, and only go back to hitting memory once the 
mutex is free again.



Using an MP VAX were just not worth it (at least not with BI bus). Better
to optimize for UP VAX and make it work reasonable good.


I think it's definitely also worth trying to get MP code in better shape 
that it is today. I'll see if I can get my fixes to the mutexes atleast 
to work (and also locks), but rwlocks can not be fixed without more 
work. :-(


Johnny


Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread David Holland
(moving this to tech-kern because it's the right place and per request)

On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote:
 > > Every header file should include the things it requires to compile.
 > > Therefore, there should in principle be no cases where a header file
 > > (or source file) needs to #include something it doesn't itself use.
 > 
 > This clarifies my long-unanswered question, thanks!

*bow*

 > I've (re)built about 300 kernels in the last days.  I've found:
 > 
 > - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h
 >   (I've done this locally).  Otherwise all sysctl node providers
 >   includes sys/proc.h and uvm/uvm_extern.h.
 >   (This is where I started...)

I'm not sure this is a good plan in the long run. Shouldn't it at some
point be unhooked fully from the "real" proc structure?

 > - sys/syscallargs.h should be split into pieces, otherwise all its
 >   users have to know unrelated types (sys/mount.h, sys/cpu.h).

Since system calls don't in general pass structures by value, it
shouldn't need most of those header files, just forward declarations
of the structs.

(this is, btw, one of the reasons to avoid silly typedefs)

 > - sys/proc.h's tsleep(), wakeup(), and friends should be moved into
 >   some common header, because it's widely used API.  sys/proc.h will
 >   be used only for "struct proc" related things.

Given that this is a deprecated API in the long term I'm not sure it's
worthwhile.

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread der Mouse
>>> Every header file should include the things it requires to compile.

I've long felt this way: that, except for a very few examples like
 that are defined to depend on context, the order of
#includes should not matter.  In particular, if multiple files must be
included, any of them may come first - so any file that generates
errors if it's included first needs fixing.  (Well, unless it's an
internal file, one that shouldn't be included directly.)

I've got numerous fixes to 4.0.1 for such issues, in case anyone thinks
it's worth applying this stance to 4.x.

> [...] just forward declarations of the structs.

> (this is, btw, one of the reasons to avoid silly typedefs)

I'm not sure what typedefs have to do with it.  typedeffing a name to
an incomplete ("forward") struct type works just fine:

struct foo;
typedef struct foo FOO;

(You can't do anything with a FOO without completing the struct type,
but you can work with pointers to them)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread Joerg Sonnenberger
On Mon, Nov 15, 2010 at 03:47:32PM -0500, der Mouse wrote:
> >>> Every header file should include the things it requires to compile.
> 
> I've long felt this way: that, except for a very few examples like
>  that are defined to depend on context, the order of
> #includes should not matter.  In particular, if multiple files must be
> included, any of them may come first - so any file that generates
> errors if it's included first needs fixing.  (Well, unless it's an
> internal file, one that shouldn't be included directly.)

assert.h is the *only* header that is not supposed to be idempotent.
Pretty much anything else should be classified as bug.

Another item is that too many of our headers depend on non-standard
compliant types polluting the namespace. Nothing installed in
/usr/include should depend on u_char for example.

> I've got numerous fixes to 4.0.1 for such issues, in case anyone thinks
> it's worth applying this stance to 4.x.

Doing it properly has a chance of creating some fallout and that's not
really acceptable for a release branch, especially one that doesn't get
many non-critical changes.

> > [...] just forward declarations of the structs.
> 
> > (this is, btw, one of the reasons to avoid silly typedefs)
> 
> I'm not sure what typedefs have to do with it.  typedeffing a name to
> an incomplete ("forward") struct type works just fine:
> 
> struct foo;
> typedef struct foo FOO;
> 
> (You can't do anything with a FOO without completing the struct type,
> but you can work with pointers to them)

Problem is that this requires a guard for the typedef if FOO is supposed
to be defined by multiple files.

Joerg


Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread der Mouse
>> I've long felt this way: that, except for a very few examples like
>>  that are defined to depend on context, the order of
>> #includes should not matter.  In particular, if multiple files must
>> be included, any of them may come first - so any file that generates
>> errors if it's included first needs fixing.
> assert.h is the *only* header that is not supposed to be idempotent.
> Pretty much anything else should be classified as bug.

I'm not talking about idempotency.  Idempotency is the question of
whether

#include 
#include 

is operationally equivalent to

#include 

But what I'm talking about is whether

#include 
#include 

and

#include 
#include 

are operationally equivalent.

> Another item is that too many of our headers depend on non-standard
> compliant types polluting the namespace.

This is another issue I have with our include files, but it's
significantly more work to fix, so I've been working around it rather
than fixing it right.  (It's also not entirely clear what the right fix
is when more than two levels of software supplier are involved.)

> Nothing installed in /usr/include should depend on u_char for
> example.

Well...ideally.  But there's a big difference between (say) 
depending on u_char and  depending on u_char.
(I'm not happy about either, but substantially less happy about the
former.)

>> struct foo;
>> typedef struct foo FOO;

> Problem is that this requires a guard for the typedef if FOO is
> supposed to be defined by multiple files.

True.

Most of the cases I've seen for this push the incomplete struct
declaration and the typedef into a separate file, wrap it with an
idempotency guard, and then include that from files that need the
struct and/or type.  I've seen a very few cases where this hasn't been
done and I think none at all where it couldn't be done if the relevant
software authors cared to bother.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread Masao Uebayashi
On Mon, Nov 15, 2010 at 08:34:40PM +, David Holland wrote:
> (moving this to tech-kern because it's the right place and per request)
> 
> On Mon, Nov 15, 2010 at 11:24:21AM +0900, Masao Uebayashi wrote:
>  > > Every header file should include the things it requires to compile.
>  > > Therefore, there should in principle be no cases where a header file
>  > > (or source file) needs to #include something it doesn't itself use.
>  > 
>  > This clarifies my long-unanswered question, thanks!
> 
> *bow*
> 
>  > I've (re)built about 300 kernels in the last days.  I've found:
>  > 
>  > - sys/sysctl.h's struct kinfo_proc should be moved into sys/proc.h
>  >   (I've done this locally).  Otherwise all sysctl node providers
>  >   includes sys/proc.h and uvm/uvm_extern.h.
>  >   (This is where I started...)
> 
> I'm not sure this is a good plan in the long run. Shouldn't it at some
> point be unhooked fully from the "real" proc structure?
> 
>  > - sys/syscallargs.h should be split into pieces, otherwise all its
>  >   users have to know unrelated types (sys/mount.h, sys/cpu.h).
> 
> Since system calls don't in general pass structures by value, it
> shouldn't need most of those header files, just forward declarations
> of the structs.
> 
> (this is, btw, one of the reasons to avoid silly typedefs)

OK, you're absolutely right.

And I'm fully agree that forward decls are better in modularity POV.

> 
>  > - sys/proc.h's tsleep(), wakeup(), and friends should be moved into
>  >   some common header, because it's widely used API.  sys/proc.h will
>  >   be used only for "struct proc" related things.
> 
> Given that this is a deprecated API in the long term I'm not sure it's
> worthwhile.
> 
> -- 
> David A. Holland
> dholl...@netbsd.org

-- 
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635


Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread David Holland
On Mon, Nov 15, 2010 at 10:41:55PM +, David Laight wrote:
 > > Indeed. Properly speaking though, headers that are exported to
 > > userland should define only the precise symbols that userland needs;
 > > kernel-only material should be kept elsewhere.
 > 
 > One start would be to add a sys/proc_internal.h so that sys/proc
 > can be reduced to only stuff that userspace and some kernel parts
 > are really expected to use.

The right way (TM) is to create src/sys/include and put kernel-only
headers in there, to be included as e.g. .

In the long term the user-visible parts would go in
src/sys/include/kern/proc.h, which would be included as .

(It has to be kern/ and not sys/ because a couple decades of standards
creep and poor API maintenance has led to half of sys/*.h properly
belonging to libc. In order to avoid repeating this problem in the
future, all APIs should be defined without direct reference to any
kern/*.h files; those should only be included from other libc or
kernel headers. So libc would grow its own  because that's
part of the libkvm API.)

When done completely the entire kern/ subtree is the same for both
userland and the kernel, including MD headers, no other random kernel
headers need to be installed, and there's no longer any need for
#ifdef _KERNEL.

As much as this probably sounds obvious, the first couple of times I
set out to do it myself I got it wrong. (And it's wrong in Linux too.)

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/arch/powerpc/oea

2010-11-15 Thread David Holland
On Mon, Nov 15, 2010 at 03:47:32PM -0500, der Mouse wrote:
 > > [...] just forward declarations of the structs.
 > 
 > > (this is, btw, one of the reasons to avoid silly typedefs)
 > 
 > I'm not sure what typedefs have to do with it.  typedeffing a name to
 > an incomplete ("forward") struct type works just fine:
 > 
 > struct foo;
 > typedef struct foo FOO;
 > 
 > (You can't do anything with a FOO without completing the struct type,
 > but you can work with pointers to them)

But now there's no protection against divergence; that is, if I have

   typedef struct foo FOO;

in one header and a typo'd

   typedef struct tfoo FOO;

in another, assuming suitable ifdef guards as already mentioned, now
FOO can be two different things, and the inconsistency in the
cut-and-pasted-material might not be detected for some time.

However, if I just have

   struct foo;

in multiple headers there aren't very many ways this can be wrong that
will compile at all. The only common way for this to go bad is if
you've removed struct foo from your program completely; then you have
to hunt down all the forward declarations by hand and kill them off.
But that's more or less unavoidable.

The difference between these two cases is inherent in the fact that
the typedef form is declaring two things and the plain struct
declaration is declaring only one... there's no particular reason C
couldn't provide a way to create a forward declaration (without
definition) of a typedef name, but it doesn't.

-- 
David A. Holland
dholl...@netbsd.org


Re: Vax interlock insn's (Re: Please do not yell at people for trying to help you.)

2010-11-15 Thread David Holland
On Mon, Nov 15, 2010 at 05:40:05PM +0100, Johnny Billquist wrote:
 > [...] but rwlocks can not be fixed without more work. :-(

I wouldn't be particularly surprised if replacing the current overbred
rwlock code with a simple implementation using a single mutex and
condvar made it go faster... or at least no slower.

-- 
David A. Holland
dholl...@netbsd.org