Define static kmutex_t

2012-01-15 Thread Emmanuel Dreyfus
Another DAHDI porting caveat. They use stuff like this to define a
static kmutex_t outside any function:

static DEFINE_SPINLOCK(pseudo_free_list_lock);

NetBSD needs to do this in order to initialize a mutex:

static kmutex_t pseudo_free_list_lock;
mutex_init(&pseudo_free_list_lock, MUTEX_SPIN, IPL_NET);

There is no way to make this fit together, right? As I understand it, I need
to run a driver initialization hook to call mutex_init for all mutexes
that are defined this way in the drivers (I count 12 occurrences).

Am I correct? There Is No Alternative?

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: patch: MFSv3 support (libsa) for boot2 (i386)

2012-01-15 Thread David Laight
On Fri, Jan 13, 2012 at 06:50:19PM +0400, Evgeniy Ivanov wrote:
  The code is linked to an address other than 0x7c00, the first thing
  it does is copy itself to that address.
 
  Are you sure you are disassembling it correctly?
  It looks like you haven't told objdump it is 16-bit code.
 
 Yes, I was handling that output by hand. Thanks for the proper command.
 
  That jmp instruction needs to go to address 7c00, the opcode contains
  the pc-next relative value, the 7bfe value is just a parameter to
  the relocation.
  In the final image you have f30c+3+f1-7c00 is 0x7800 which is ok
  if the code is expected to relocate itself to 0x7800.

 Why do you refer to 0x7800? How is it related to the LOADADDR (0x8800)?

That seemed to be the required LOADADDR to get those instruction bytes.
But possibly I got the sums wrong!
(The local file I looked at is old and uses a different LOADADDR.)

 With --adjust-vma=0x8800 I get the thing I understand:
 88f1:   e9 0c f3                jmp    0x7c00
 0x88f1 + 3 - 3316 (0xf30c) = 0x7c00
 
 And I still miss the meaning of relocation value 7bfe.
 In object file it is
 131:   e9 fe 7b                jmp    0x7d32
 7d32 = 7c00 + 0x132 (i.e. number of bytes before this command).

If you do an 'objdump -r mbr.o' you'll see there is a pc-relative
relocation applied to address 132, 'objdump -d' doesn't look at
the relocations (it would be useful if it did) - so it blindly
prints the wrong target address.

The pc-relative relocation will be defined relative to the location
of the fixup (i.e. 132 not 134) - so 0x7bfe is needed as a parameter
not 0x7c00. To save space this value is put into the object code
rather than the relocation record.
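
The numbers in this exchange can be checked with a short sketch. It assumes
the jump is a 3-byte "e9" near jump with a 16-bit displacement, and that the
linker applies the usual pc-relative rule S + A - P with a symbol value S of
zero. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Length of "e9 lo hi": one opcode byte plus a 16-bit displacement. */
#define JMP_REL16_LEN 3u

/* Target a disassembler prints: address of the next instruction plus the
 * displacement, wrapped to 16 bits. */
static uint16_t jmp_target(uint16_t insn_addr, uint16_t disp)
{
    return (uint16_t)(insn_addr + JMP_REL16_LEN + disp);
}

/* Value stored by a pc-relative fixup, S + A - P, with S assumed to be 0.
 * P is the address of the displacement field itself (insn_addr + 1). */
static uint16_t apply_pcrel16(uint16_t insn_addr, uint16_t addend)
{
    uint16_t field_addr = (uint16_t)(insn_addr + 1u);
    return (uint16_t)(addend - field_addr);
}

/* Addend needed for the resolved jump to land on target: the field sits
 * 2 bytes before the next instruction, hence target - 2. */
static uint16_t addend_for(uint16_t target)
{
    return (uint16_t)(target - (JMP_REL16_LEN - 1u));
}

/* Re-derive the values quoted in the thread. */
static void check_thread_numbers(void)
{
    assert(addend_for(0x7c00) == 0x7bfe);            /* why 7bfe, not 7c00 */
    assert(apply_pcrel16(0x88f1, 0x7bfe) == 0xf30c); /* linked: e9 0c f3   */
    assert(jmp_target(0x88f1, 0xf30c) == 0x7c00);    /* correct final jump */
    assert(jmp_target(0x0131, 0x7bfe) == 0x7d32);    /* objdump -d on .o   */
    assert(jmp_target(0x88f1, 0x7bfe) == 0x04f2);    /* fixup not applied  */
}
```

Read this way, "e9 fe 7b" in a final image is what the bytes look like when
the fixup has not been applied at all: the addend is still sitting in the
displacement field.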

 And then, when linking, how do we get e9 0c f3? That's the thing I dream to
 know :-) Because on MINIX I get in the final image
 88f1:   e9 fe 7b                jmp    0x4f2
 
 While preprocessed sources are the same on both systems and compiled
 with same options.

Something is going wrong with the assemble or link phase - might be
a buggy version of either the assembler or linker.

Possibly using: jmp .start - $LOADADDR + $BOOTADDR will work instead.

David

-- 
David Laight: da...@l8s.co.uk


Re: Define static kmutex_t

2012-01-15 Thread Mindaugas Rasiukevicius
m...@netbsd.org (Emmanuel Dreyfus) wrote:
 Another DAHDI porting caveat. They use stuff like this to define a
 static kmutex_t outside any function:
 
 static DEFINE_SPINLOCK(pseudo_free_list_lock);
 
 NetBSD needs to do this in order to initialize a mutex:
 
 static kmutex_t pseudo_free_list_lock;
 mutex_init(&pseudo_free_list_lock, MUTEX_SPIN, IPL_NET);
 
 There is no way to make this fit together, right? As I understand it, I need
 to run a driver initialization hook to call mutex_init for all mutexes
 that are defined this way in the drivers (I count 12 occurrences).

Right.  Use MUTEX_DEFAULT (instead of MUTEX_SPIN or other), though.
One also needs to mutex_destroy(9) the lock on driver detach/unload.
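
A minimal sketch of that pattern follows. So that it compiles outside the
kernel, kmutex_t, mutex_init() and mutex_destroy() are replaced by trivial
userland stand-ins with placeholder constants. The hook names
dahdi_locks_init() and dahdi_locks_fini() and the second lock are invented
for illustration. In a driver the real definitions come from <sys/mutex.h>,
and the hooks would be called from the attach and detach (or module load and
unload) paths.

```c
#include <assert.h>
#include <stddef.h>

/* --- userland stand-ins, NOT the NetBSD definitions -------------------- */
typedef struct { int initialized; int type; int ipl; } kmutex_t;
enum { MUTEX_DEFAULT = 0 };   /* placeholder value */
enum { IPL_NET = 1 };         /* placeholder value */

static void mutex_init(kmutex_t *mtx, int type, int ipl)
{
    mtx->initialized = 1;
    mtx->type = type;
    mtx->ipl = ipl;
}

static void mutex_destroy(kmutex_t *mtx)
{
    assert(mtx->initialized);  /* destroying an uninitialized lock is a bug */
    mtx->initialized = 0;
}
/* ------------------------------------------------------------------------ */

/* What "static DEFINE_SPINLOCK(x);" becomes: a plain declaration... */
static kmutex_t pseudo_free_list_lock;
static kmutex_t another_lock;             /* hypothetical second lock */

/* ...plus one table, so that the hooks cover every such lock. */
static kmutex_t *const driver_locks[] = {
    &pseudo_free_list_lock,
    &another_lock,
};
#define NLOCKS (sizeof(driver_locks) / sizeof(driver_locks[0]))

/* Call once before any lock is used (attach / module load). */
static void dahdi_locks_init(void)
{
    for (size_t i = 0; i < NLOCKS; i++)
        mutex_init(driver_locks[i], MUTEX_DEFAULT, IPL_NET);
}

/* Call once after the last use (detach / module unload). */
static void dahdi_locks_fini(void)
{
    for (size_t i = 0; i < NLOCKS; i++)
        mutex_destroy(driver_locks[i]);
}
```

Keeping the locks in one table means each of the 12 converted definitions
needs only one extra line, and init and destroy cannot drift apart. As far as
I understand mutex(9), MUTEX_DEFAULT lets the implementation pick a spin or
adaptive mutex from the ipl argument.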

-- 
Mindaugas


Re: patch: MFSv3 support (libsa) for boot2 (i386)

2012-01-15 Thread Evgeniy Ivanov
On Sun, Jan 15, 2012 at 4:40 PM, David Laight da...@l8s.co.uk wrote:
 On Fri, Jan 13, 2012 at 06:50:19PM +0400, Evgeniy Ivanov wrote:
  The code is linked to an address other than 0x7c00, the first thing
  it does is copy itself to that address.
 
  Are you sure you are disassembling it correctly?
  It looks like you haven't told objdump it is 16-bit code.

 Yes, I was handling that output by hand. Thanks for the proper command.

  That jmp instruction needs to go to address 7c00, the opcode contains
  the pc-next relative value, the 7bfe value is just a parameter to
  the relocation.
  In the final image you have f30c+3+f1-7c00 is 0x7800 which is ok
  if the code is expected to relocate itself to 0x7800.

 Why do you refer to 0x7800? How is it related to the LOADADDR (0x8800)?

 That seemed to be the required LOADADDR to get those instruction bytes.
 But possibly I got the sums wrong!
 (The local file I looked at is old and uses a different LOADADDR.)

 With --adjust-vma=0x8800 I get the thing I understand:
 88f1:       e9 0c f3                jmp    0x7c00
 0x88f1 + 3 - 3316 (0xf30c) = 0x7c00

 And I still miss the meaning of relocation value 7bfe.
 In object file it is
 131:   e9 fe 7b                jmp    0x7d32
 7d32 = 7c00 + 0x132 (i.e. number of bytes before this command).

 If you do an 'objdump -r mbr.o' you'll see there is a pc-relative
 relocation applied to address 132, 'objdump -d' doesn't look at
 the relocations (it would be useful if it did) - so it blindly
 prints the wrong target address.

Useful, thanks!

 The pc-relative relocation will be defined relative to the location
 of the fixup (i.e. 132 not 134) - so 0x7bfe is needed as a parameter
 not 0x7c00. To save space this value is put into the object code
 rather than the relocation record.

Aha, I see.

 And then, when linking, how do we get e9 0c f3? That's the thing I dream to
 know :-) Because on MINIX I get in the final image
 88f1:       e9 fe 7b                jmp    0x4f2

 While preprocessed sources are the same on both systems and compiled
 with same options.

 Something is going wrong with the assemble or link phase - might be
 a buggy version of either the assembler or linker.

 Possibly using: jmp .start - $LOADADDR + $BOOTADDR will work instead.

It works, thank you very much! And thanks for all your detailed explanations!



-- 
Evgeniy


Re: heads-up: IPSEC is now FAST_IPSEC

2012-01-15 Thread Greg Troxel

Mindaugas Rasiukevicius rm...@netbsd.org writes:

 Matthias Drochner m.droch...@fz-juelich.de wrote:
 
 I've just made FAST_IPSEC the default implementation which gets
 used if the IPSEC kernel option is present.
 ...
 
 The old KAME implementation is still available through
 the KAME_IPSEC kernel option. The old IPSEC_ESP option
 is meaningless with (FAST_)IPSEC (ESP is always enabled)
 but still in effect with KAME_IPSEC.

 Thanks a lot for working on this.  Are you planning to remove old IPSEC
 code?  It would bring simplifications, clean-up and would make further work
 on network stack less painful.  I think post-netbsd-6 branch (or even now?)
 would be a very good time.

Removing the code so it isn't in NetBSD 6 seems premature.  There
shouldn't be much simplification/cleanup etc. on the branch.  And I
don't know what fraction of people who use IPsec at all use FAST_IPSEC
vs IPSEC - I would suspect that the new code has been exposed to only a
small fraction of the use cases.





Re: kernel crash at ibm x3850

2012-01-15 Thread 6bone

Hello,

sorry for the long delay, but the USB keyboard doesn't work in the ddb 
and I needed some time to configure SOL.


Now the output:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 5.1_STABLE (INSTALL) #4: Sun Jan 15 13:56:06 CET 2012

r...@6bone.informatik.uni-leipzig.de:/usr/obj/sys/arch/amd64/compile/INSTALL
total memory = 511 GB
avail memory = 496 GB
RTC BIOS diagnostic error 80<clock_battery>
SMBIOS rev. 2.7 @ 0x7f0be000 (137 entries)
IBM System x3850 X5 -[71453RG]- (06)
mainbus0 (root)
cpu0 at mainbus0 apid 0: Intel 686-class, 1995MHz, id 0x206e6
fatal protection fault in supervisor mode
trap type 4 code 0 rip 80529b06 cs 8 rflags 10246 cr2  0 cpl 8 rsp 811e0b98

kernel: protection fault trap, code=0
Stopped in pid 0.1 (system) at  0x80529b06: rdmsr
db{0}> bt
?() at 0x80529b06
?() at 0x80524948
?() at 0x8047aed2
?() at 0x80517071
?() at 0x80513cc1
?() at 0x8046c2ea
?() at 0x80522cde
?() at 0x80726fe5
?() at 0x80522b90
?() at 0x805594dc
?() at 0x8046c2ea
?() at 0x805294a6
?() at 0x80432bef

Regards
Uwe

On Fri, 13 Jan 2012, Patrick Welche wrote:



Date: Fri, 13 Jan 2012 12:34:09 +
From: Patrick Welche pr...@cam.ac.uk
To: 6b...@6bone.informatik.uni-leipzig.de
Cc: tech-kern@netbsd.org
Subject: Re: kernel crash at ibm x3850

On Fri, Jan 13, 2012 at 11:54:58AM +0100, 6b...@6bone.informatik.uni-leipzig.de wrote:

if I boot the netbsd-5-1-1 install media on an IBM x3850, the kernel
crashes. You can find the screenshot at
https://suse.uni-leipzig.de/ibm-x3850.jpg

Any ideas what could be the problem?


Maybe you could get a backtrace (type bt) at the prompt showing, and
take a picture of that?
(http://www.netbsd.org/docs/kernel/#ddb)

Cheers,

Patrick



Re: buffer cache ufs changes (preliminary ffsv2 extattr support)

2012-01-15 Thread Mouse
 I'm working on porting the FreeBSD FFSv2 extended attributes support.
 [...]

 1) Add a new bflag, B_ALTDATA.  [...]
 2) instead of using a new flag, add a new 'int type' member [...]

 Although I've done 1 as a POC, I prefer solution 2 ([...]).  What do
 others think?

As a choice of approach to implementing what you want, I think 2 is
better.  It's far more generalizable.  As a piece of SF I read once
said, the number two is ridiculous and can't possibly exist.  It was
talking about universes, but the basic concept applies here too:
there's very little excuse for any number between one and many.

However, I think that constitutes a good implementation of a bad idea.
This makes a file no longer a long list of octets; it becomes multiple
long lists of octets.  The Mac did this, with resource forks and data
forks, and you may note OS X doesn't do it any longer.  I suspect these
will seem like a good idea for a while, until people start discovering
all the things they break, or that break them, and realize that they
didn't learn from history and thus had to repeat it.

That said, it's no skin off my nose.  I've said my piece, and it won't
be affecting me, pragmatically, either way.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: O-A loan

2012-01-15 Thread jaimef

Hello Yamamoto-san

I have rerun build.sh -j16 -m amd64 on ufs+wapbl on the latest 
yamt-pagecache, with and without loan, and yamt-pagecache-base3


Below are the times for the first runs.
DIAGNOSTIC was disabled on all configs.

yamt-pagecache vm.loanread=1 (default)
make release started at:  Sat Jan 14 22:35:43 PST 2012
make release finished at: Sat Jan 14 23:09:35 PST 2012
34:52


yamt-pagecache vm.loanread=0
make release started at:  Sat Jan 14 23:23:59 PST 2012
make release finished at: Sat Jan 14 23:58:31 PST 2012
35:30

yamt-pagecache-base3
make release started at:  Sun Jan 15 03:47:12 PST 2012
make release finished at: Sun Jan 15 04:20:44 PST 2012
33:32

Thanks.

On Thu, 12 Jan 2012, YAMAMOTO Takashi wrote:


Date: Thu, 12 Jan 2012 03:31:59 + (UTC)
From: YAMAMOTO Takashi y...@mwd.biglobe.ne.jp
To: jai...@mauthesis.com
Cc: c...@chuq.com, tech-kern@netbsd.org
Subject: Re: O-A loan

hi,


I did not remove DIAGNOSTIC.
Would you like me to rerun without DIAGNOSTIC?


yes, please.

YAMAMOTO Takashi


On Thu, 12 Jan 2012, YAMAMOTO Takashi wrote:


Date: Thu, 12 Jan 2012 03:14:33 + (UTC)
From: YAMAMOTO Takashi y...@mwd.biglobe.ne.jp
To: jai...@mauthesis.com
Cc: c...@chuq.com, tech-kern@netbsd.org
Subject: Re: O-A loan

hi,

thanks for benchmark!

was it without DIAGNOSTIC?

YAMAMOTO Takashi


Hello Yamamoto-san,

I ran dbench on the same system with yamt-pagecache, yamt-pagecache
without a-o loan, and yamt-pagecache-base3.
http://linbsd.org/yamt.png
The tests were run three times on each kernel and the results were
consistent between reboots/runs.

Thanks.

On Tue, 27 Dec 2011, YAMAMOTO Takashi wrote:


Date: Tue, 27 Dec 2011 02:53:29 + (UTC)
From: YAMAMOTO Takashi y...@mwd.biglobe.ne.jp
To: c...@chuq.com
Cc: tech-kern@netbsd.org
Subject: Re: O-A loan

hi,

i made read with O-A loaning work for easy cases (ie. no locking difficulty)
on yamt-pagecache branch so that someone interested can benchmark.

YAMAMOTO Takashi


hi,


On Tue, Nov 29, 2011 at 06:38:27AM +, YAMAMOTO Takashi wrote:

O-A loaned pages installed on the user address space would have a different
owner than the usual map-entry.uvm_obj.
although it was not a problem when you wrote this patch, at least some
non-mechanical changes would be required after the recent locking
changes in this area.  namely, uvm_map_lock_entry etc now assumes that
any pages mapped in a map entry belong to either the entry's amap or
underlying object.


ok, I didn't think it would be entirely mechanical.  :-)

what if the O-A loan code also changed the entry's uvm_obj to be the vnode
that the pages really belong to?  if the loan range in the amap is fully
populated (which it is in this context) then that shouldn't affect the
logical contents of the entry, it would just cause anyone locking the entry
to also lock the vnode.  if the range of the loan is smaller than the
range of the entry, we could split the entry.  do you think that would work?


it might work, but i have some concerns:
- entry fragmentation
- the extra uobj reference keeps the file even after unlink

YAMAMOTO Takashi



-Chuck




Re: buffer cache ufs changes (preliminary ffsv2 extattr support)

2012-01-15 Thread Martin Husemann
On Sun, Jan 15, 2012 at 08:37:37PM +0100, Manuel Bouyer wrote:
 Although I've done 1 as a POC, I prefer solution 2 (the patch is mostly the
 same, with bflag replaced by b_type). What do others think?

(2) is conceptually what NTFS does IIUC. Consider a file to be a database
table to which arbitrary columns may be added, with column 0 containing all
the traditional octets, and some columns having special meaning.

I am not sure I like the idea, but it clearly is superior to option (1).

Martin


Re: O-A loan

2012-01-15 Thread YAMAMOTO Takashi
hi,

thanks.
is this on a different hardware from the previous one?

YAMAMOTO Takashi

 
 Hello Yamamoto-san
 
 I have run dbench on ufs/wapbl with diagnostics disabled.
 http://linbsd.org/yamt3.png
 
 


Re: O-A loan

2012-01-15 Thread jaimef

This is the same hardware.
rmind noticed in the lockstat output that fileassoc was being called
on every unlink() operation, so I am rerunning the tests without fileassoc.

On Mon, 16 Jan 2012, YAMAMOTO Takashi wrote:


Date: Mon, 16 Jan 2012 03:47:48 + (UTC)
From: YAMAMOTO Takashi y...@mwd.biglobe.ne.jp
To: jai...@mauthesis.com
Cc: c...@chuq.com, tech-kern@netbsd.org
Subject: Re: O-A loan

hi,

thanks.
is this on a different hardware from the previous one?

YAMAMOTO Takashi





Re: buffer cache ufs changes (preliminary ffsv2 extattr support)

2012-01-15 Thread Emmanuel Dreyfus
Mouse mo...@rodents-montreal.org wrote:

 The Mac did this, with resource forks and data
 forks, and you may note OS X doesn't do it any longer.  I suspect these
 will seem like a good idea for a while, until people start discovering
 all the things they break, or that break them, and realize that they
 didn't learn from history and thus had to repeat it.

The problem with multiple-forked files is interaction with the outer
world. When you upload a file to the internet, you lose the non-data
forks, unless you serialize them in some way (the .hqx format did
that just for the Mac). How things will work on an NFS filesystem is also
an issue: you may not want mv(1) or cp(1) to strip non-data forks.

Anyway, we already have the problem with ffsv1 extended attributes. I
added extattr support to cp(1) and mv(1), but there is a lot of work
left to do: dump, restore, tar, cpio, zip, unzip, scp, rcp, rsync...  Of
course there will be situations where protocols or formats do not allow
preservation of extended attributes. In that case, the program may need
to warn the user about lost data.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: PUFFS and existing file that get ENOENT

2012-01-15 Thread YAMAMOTO Takashi
hi,

 Emmanuel Dreyfus m...@netbsd.org wrote:
 
 Hence I come to the conclusion that it may come from
 sys/kern/vfs_lookup.c, but it is very unlikely that there is a bug there
 that went unnoticed for other filesystems.
 
 Further investigation shows that this ENOENT is returned by vget() call
 in puffs_cookie2vnode(). That suggests some kind of race condition, but
 that is not obvious. It means a vnode has been created on a lookup, then
 it gets recycled while looking up one of its children.

it should retry from puffs_cookie2pnode in that case.
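
the retry can be sketched as a loop. everything here is a hypothetical
stand-in (the node table, find_node(), get_ref() and cookie_to_ref() are not
the puffs functions); it only shows the control flow: if taking a reference
fails with ENOENT because the node was being recycled, start again from the
cookie lookup rather than returning the error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct node {
    int cookie;
    int recycling;   /* > 0: that many get_ref() calls will lose the race */
    int refs;
};

/* Toy node table: cookie 2 is "being recycled" for its first two attempts. */
static struct node table[] = { { 1, 0, 0 }, { 2, 2, 0 } };

/* Stand-in for the cookie -> node lookup. */
static struct node *find_node(int cookie)
{
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (table[i].cookie == cookie)
            return &table[i];
    return NULL;
}

/* Stand-in for vget(): fails with ENOENT while the node is being recycled. */
static int get_ref(struct node *n)
{
    if (n->recycling > 0) {
        n->recycling--;
        return ENOENT;
    }
    n->refs++;
    return 0;
}

/* Look up a cookie and take a reference, retrying from the lookup when the
 * reference attempt loses a race with recycling. */
static int cookie_to_ref(int cookie, struct node **out)
{
    assert(out != NULL);
    for (;;) {
        struct node *n = find_node(cookie);
        if (n == NULL)
            return ENOENT;      /* no such node at all: a real error */
        int error = get_ref(n);
        if (error == ENOENT)
            continue;           /* lost the race: look the cookie up again */
        if (error != 0)
            return error;
        *out = n;
        return 0;
    }
}
```

the point of restarting at the lookup, and not just calling get_ref() again,
is that after a recycle the old node pointer may no longer be the right one.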

YAMAMOTO Takashi

 
 -- 
 Emmanuel Dreyfus
 http://hcpnet.free.fr/pubz
 m...@netbsd.org