Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Emmanuel Dreyfus
YAMAMOTO Takashi  wrote:

> > Further investigation shows that this ENOENT is returned by vget() call
> > in puffs_cookie2vnode(). That suggests some kind of race condition, but
> > that is not obvious. It means a vnode has been created on a lookup, then
> > it gets recycled while looking up one of its children.
> it should retry from puffs_cookie2pnode in that case.

I first tried to loop on vget but got a panic because I did not hold
v_interlock anymore. I then came to this patch and got a uvm_fault (backtrace
below)

Index: sys/fs/puffs/puffs_node.c
===================================================================
RCS file: /cvsroot/src/sys/fs/puffs/puffs_node.c,v
retrieving revision 1.13.10.3
diff -U 4 -r1.13.10.3 puffs_node.c
--- sys/fs/puffs/puffs_node.c   2 Nov 2011 20:11:12 -   1.13.10.3
+++ sys/fs/puffs/puffs_node.c   17 Jan 2012 02:36:02 -
@@ -56,8 +56,9 @@
.gop_alloc, should ask userspace
 #endif
 };
 
+static __inline int puffs_vget(struct puffs_mount *, struct vnode *, int);
 static __inline struct puffs_node_hashlist
*puffs_cookie2hashlist(struct puffs_mount *, puffs_cookie_t);
 static struct puffs_node *puffs_cookie2pnode(struct puffs_mount *,
 puffs_cookie_t);
@@ -271,8 +272,23 @@
 
return;
 }
 
+static __inline int puffs_vget(struct puffs_mount *pmp,
+   struct vnode *vp, int flags)
+{
+   int rv;
+
+   while ((rv = vget(vp, flags)) == ENOENT) {
+   printf("*** retry vget %p\n", vp); 
+   mutex_enter(&pmp->pmp_lock);
+   mutex_enter(&vp->v_interlock);
+   mutex_exit(&pmp->pmp_lock);
+   }
+
+   return rv;
+}
+
 static __inline struct puffs_node_hashlist *
 puffs_cookie2hashlist(struct puffs_mount *pmp, puffs_cookie_t ck)
 {
uint32_t hash;
@@ -320,9 +336,9 @@
vp = pmp->pmp_root;
if (vp) {
mutex_enter(&vp->v_interlock);
mutex_exit(&pmp->pmp_lock);
-   if (vget(vp, LK_INTERLOCK) == 0)
+   if (puffs_vget(pmp, vp, LK_INTERLOCK) == 0)
return 0;
} else
mutex_exit(&pmp->pmp_lock);
 
@@ -405,9 +421,9 @@
 
vgetflags = LK_INTERLOCK;
if (lock)
vgetflags |= LK_EXCLUSIVE | LK_RETRY;
-   if ((rv = vget(vp, vgetflags)))
+   if ((rv = puffs_vget(pmp, vp, vgetflags)))
return rv;
 
*vpp = vp;
return 0;



It produced a uvm_fault in the domU, followed by a crash of the dom0 (no
console access on that one, I do not have the dom0 backtrace yet). Here is
what I have been able to copy/paste from the domU (only the panic string is
missing):

trap type 6 code 0 eip c03012da cs 9 eflags 10283 cr2 0 ilevel 7
kernel: supervisor trap page fault, code=0
Stopped in pid 18692.1 (sh) at  netbsd:turnstile_block+0x1aa:   movl
0x10(%eax),%eax
db> bt
turnstile_block(0,1,cb5b42ec,c046d89c,cc3baa9c,cb91ba60,0,cb5b42ec,1,cb4c3000)
at netbsd:turnstile_block+0x1aa
mutex_vector_enter(cb5b42ec,cb5b42ec,0,0,cb3fc39c,cb497000,cc3baacc,c0365ff8,c
c3baac0,6) at netbsd:mutex_vector_enter+0xfa
puffs_cookie2vnode(cb4c3000,bb9090c0,1,1,cc3bab38,0,cc3bab4c,c0350467,cb497000
,cc3bab38) at netbsd:puffs_cookie2vnode+0x187
puffs_vfsop_root(cb497000,cc3bab38,cc3bac28,20002,ca21dc38,ca215bdc,cc3bab2c,c
0365fc5,20,0) at netbsd:puffs_vfsop_root+0x38
lookup(cc3bac00,20002,400,cc3bac1c,cb31a0b8,cb31a0e0,cc3babac,c0355e6c,cc3bac1
c,cc3bab9f) at netbsd:lookup+0x287
namei(cc3bac00,cc3bac70,cc3bac0c,c03bc307,1964000,0,cc3bac3c,bb9067cc,0,0) at
netbsd:namei+0x144
do_sys_stat(bb9067cc,0,cc3bac70,c02e40b0,c0470dc8,0,3bac8c,cb01,41ed,369ddb94)
at netbsd:do_sys_stat+0x37
sys___lstat30(cb91ba60,cc3bad00,cc3bad28,bb916010,c03bc307,61cb000,0,bb9067cc,
bfbfde68,bfbfded8) at netbsd:sys___lstat30+0x29
syscall(cc3bad48,1f,1f,1f,1f,805fced,bb906797,bfbfded8,bb906796,bb9067dc) at
netbsd:syscall+0xc7
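
The suggestion quoted at the top of this message (retry from the cookie lookup rather than from vget()) can be sketched in userland as below. This is only a toy model under stated assumptions: every toy_* name is hypothetical, the table stands in for the puffs node hash, and no locking is modelled. The one point it tries to show is that after ENOENT the stale vnode pointer is never used again; the cookie is looked up afresh on each attempt.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for illustration; these are NOT the kernel's types. */
struct toy_vnode {
	int dying;		/* nonzero while the vnode is being recycled */
	int usecount;
};

struct toy_node {
	unsigned long cookie;
	struct toy_vnode *vp;	/* NULL once the vnode has been reclaimed */
};

#define TOY_NNODES 4
static struct toy_node toy_table[TOY_NNODES];

/* Model of a cookie -> node lookup: only nodes with a vnode attached count. */
static struct toy_node *
toy_cookie2node(unsigned long ck)
{
	for (size_t i = 0; i < TOY_NNODES; i++)
		if (toy_table[i].vp != NULL && toy_table[i].cookie == ck)
			return &toy_table[i];
	return NULL;
}

/* Model of vget(): ENOENT means "this vnode is going away". */
static int
toy_vget(struct toy_vnode *vp)
{
	if (vp->dying)
		return ENOENT;
	vp->usecount++;
	return 0;
}

/*
 * Retry from the lookup, not from vget().  Returns 0 with *vpp set,
 * ESRCH if no node currently carries the cookie (a caller would then
 * create a fresh vnode), or EAGAIN after maxtries failed attempts.
 */
static int
toy_cookie2vnode(unsigned long ck, struct toy_vnode **vpp, int maxtries)
{
	assert(vpp != NULL && maxtries > 0);
	for (int i = 0; i < maxtries; i++) {
		struct toy_node *pn = toy_cookie2node(ck);	/* fresh lookup */
		if (pn == NULL)
			return ESRCH;
		int rv = toy_vget(pn->vp);
		if (rv != ENOENT) {
			if (rv == 0)
				*vpp = pn->vp;
			return rv;
		}
		/* vnode was being recycled: drop the pointer and look up again */
	}
	return EAGAIN;
}
```

The bounded retry count is a choice made for the sketch so that it cannot spin forever; whether the real code should bound the loop is not something this thread settles.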


-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread David Holland
On Mon, Jan 16, 2012 at 08:38:28PM -0500, Mouse wrote:
 > >> And I think the master tree for a (supposedly-)production OS is not
 > >> the place to be carrying out research experiments, not even if
 > >> another such OS is already doing it.
 > 
 > > The trouble, of course, is that there isn't really any such thing
 > > these days as a research platform OS useful for such experiments.
 > 
 > I think NetBSD actually would be a fine platform for it.

Well, sort of. For architectural experiments (as opposed to localized
tweaks to filesystems or network protocols or whatever) you really
want a small and flexible platform. Whether you get it by cutting down
a production system or building a new system (or some mixture) the
trick is figuring out what legacy stuff you need for a convincing
proof of concept, and what's deadweight. NetBSD includes a lot of
stuff that's deadweight for such purposes (beginning with but not
limited to all the historical compat code) so as it stands it's quite
a bit less than ideal.

 > Just not in the main tree.

Yes.

 > I'd suggest just having two long-running branches, an experimental
 > system and a production system, with things pulled back and forth as
 > appropriate, but I think that probably needs a switch away from CVS as
 > a prerequisite, and we all know where _that_ can of worms leads.  I've
 > got a pile of things that could go into the experimental tree myself.
 > (They're mostly in production use on my personal machines, but that
 > hardly qualifies as anything more than successful alpha test.)
 > 
 > Also, maintaining NetBSD-experimental and NetBSD-production doubles
 > certain overhead loads and increases (but not to 2x) others, thus
 > further (putatively) draining already-scarce resources

I was actually thinking about setting something like this up a while
back, but there's a fairly large problem: if the experimental branch
doesn't get much use it'll always be behind and always needing
merging, and thus not very useful, which is a self-perpetuating state
that defeats the purpose; however, if it does, there's a danger that
it effectively becomes HEAD, with the production branch functioning at
best like a stable branch, and that leaves us more or less where we
already are except with a much worse prognosis for whether -current
works on any given day.

The conclusion I came to is that it isn't really a good idea.

I think it might work to maintain a cut-down branch that can be used
as a *base* for experiments but that does not itself actually contain
any experiments (which would have to be proven on their own and then
merged directly into the trunk) but it's not clear if this is worth
the trouble vs. just experimenting on a private copy of HEAD.

Anyway as you note it's not likely workable if it involves CVS
branches.

-- 
David A. Holland
dholl...@netbsd.org


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Mouse
>> And I think the master tree for a (supposedly-)production OS is not
>> the place to be carrying out research experiments, not even if
>> another such OS is already doing it.

> The trouble, of course, is that there isn't really any such thing
> these days as a research platform OS useful for such experiments.

I think NetBSD actually would be a fine platform for it.

Just not in the main tree.

I'd suggest just having two long-running branches, an experimental
system and a production system, with things pulled back and forth as
appropriate, but I think that probably needs a switch away from CVS as
a prerequisite, and we all know where _that_ can of worms leads.  I've
got a pile of things that could go into the experimental tree myself.
(They're mostly in production use on my personal machines, but that
hardly qualifies as anything more than successful alpha test.)

Also, maintaining NetBSD-experimental and NetBSD-production doubles
certain overhead loads and increases (but not to 2x) others, thus
further (putatively) draining already-scarce resources

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: updated patch Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread David Holland
On Mon, Jan 16, 2012 at 10:28:57PM +0100, Manuel Bouyer wrote:
 > I consider lfs a second-class citizen at this time and if forward
 > compat is broken for the lfs module on the branch it's probably not
 > a big deal).

I don't consider that acceptable...

-- 
David A. Holland
dholl...@netbsd.org


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread David Holland
On Mon, Jan 16, 2012 at 04:20:00PM -0500, Mouse wrote:
 > And I think the master tree for a (supposedly-)production OS is not
 > the place to be carrying out research experiments, not even if
 > another such OS is already doing it.
 > 
 > But my opinions seem to correlate negatively with NetBSD's these days.

Not on that point.

The trouble, of course, is that there isn't really any such thing
these days as a research platform OS useful for such experiments.

(I could produce one in a year or so, if anyone should happen to have
deep pockets or a tame funding agency on hand. But in general the
world doesn't seem to think there's any value in such an undertaking.)

-- 
David A. Holland
dholl...@netbsd.org


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread David Holland
On Mon, Jan 16, 2012 at 10:00:05PM +, YAMAMOTO Takashi wrote:
 > have you considered to separate the entity being cached from vnode?
 > iirc, irix called it "buffer cache target" or such.

That sounds like probably a good idea, but I need to think about it
more.

One of the things we need to be able to do is have both physical
(block numbers are device offsets) and virtual (block numbers are file
offsets) buffers. Currently we do this by hanging physical buffers on
the device vnode, but this has always created problems. Also for some
filesystems it may be necessary or desirable to be able to take a
virtual buffer and change it to a physical buffer, or to keep track of
both a virtual and physical identity for the same buffer at once. I
need to figure out if the latter is really necessary or not, and I
don't think it'll become entirely clear until I've gotten well into
hacking up LFS.

 > your vtruncbuf2 function seems to imply needs to have separate
 > v_dirtyblkhd/v_cleanblkhd for each types.

I don't see why and I don't think it makes sense, at least in the long
term. The global buffer data structures hold buffers; they shouldn't
care what's in them or who they belong to. Any type field becomes part
of the identity of the buffer, though, and therefore part of the
lookup key; that's all vtruncbuf2 is doing, although a quick look at
the code suggests that it is not doing it efficiently and it may not
be doing it correctly.
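
To make the "type is part of the lookup key" point concrete, here is a small userland sketch. It is not the NetBSD buffer cache: the toy_* names, the hash function and the struct layout are assumptions made for illustration. The only names taken from the thread are the incore()/incore2() pairing, which the sketch mirrors by having the two-argument lookup call the typed one with type 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy buffer: identity is (owner, block number, type). */
struct toy_buf {
	const void *vp;		/* owner; stands in for a vnode pointer */
	int64_t blkno;
	int type;		/* 0 = regular file data; others fs-defined */
	struct toy_buf *next;
};

#define TOY_NHASH 16
static struct toy_buf *toy_hash[TOY_NHASH];

/*
 * Bucket choice deliberately ignores the type, so buffers that differ
 * only in type share a chain and the comparison below must tell them
 * apart.
 */
static size_t
toy_bucket(const void *vp, int64_t blkno)
{
	return (size_t)((((uintptr_t)vp >> 4) + (uint64_t)blkno) % TOY_NHASH);
}

/* Typed lookup: the type is compared like any other key component. */
static struct toy_buf *
toy_incore2(const void *vp, int64_t blkno, int type)
{
	struct toy_buf *bp;

	for (bp = toy_hash[toy_bucket(vp, blkno)]; bp != NULL; bp = bp->next)
		if (bp->vp == vp && bp->blkno == blkno && bp->type == type)
			return bp;
	return NULL;
}

/* Existing callers keep the old signature and implicitly mean type 0. */
static struct toy_buf *
toy_incore(const void *vp, int64_t blkno)
{
	return toy_incore2(vp, blkno, 0);
}

static void
toy_binsert(struct toy_buf *bp)
{
	size_t h;

	assert(bp != NULL);
	/* one buffer per full key */
	assert(toy_incore2(bp->vp, bp->blkno, bp->type) == NULL);
	h = toy_bucket(bp->vp, bp->blkno);
	bp->next = toy_hash[h];
	toy_hash[h] = bp;
}
```

With this layout, block 0 of type 0 and block 0 of type 1 for the same owner are distinct buffers, and nothing outside the lookup needs to know what a given type value means.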

-- 
David A. Holland
dholl...@netbsd.org


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Matthew Mondor
On Mon, 16 Jan 2012 22:26:30 + (UTC)
y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote:

> hi,
> 
> > On Mon, 16 Jan 2012 10:56:33 + (UTC)
> > y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote:
> > 
> >> when the kernel wants to cache other files.
> >> ie. whenever the kernel decides to reclaim it. :-)
> >> you can increase the chance by running
> >>while :;do sysctl -w kern.maxvnodes=0; done
> >> or something like that.
> > 
> > Wouldn't the performance also drop significantly with a permanently low
> > maxvnodes, though?
> 
> it does never succeed.
> anyway the performance is not a priority when trying to reproduce a bug.

Oh, I had missed the context, thanks for the explanation.
-- 
Matt


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Emmanuel Dreyfus
Manuel Bouyer  wrote:

> Because manu@ has put lots of efforts in getting glusterfs running,
> and I think it's something we can market. But it's unusable with ffsv1
> extattrs, we really need something better.

Well, it works, but it is so slow it suggests NetBSD is a second-class
operating system. I think we need decent extended attribute support
soon, this is a major feature everyone else has right now. The current
situation is getting as embarrassing as when we did not have PAM in 2005.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread David Holland
On Mon, Jan 16, 2012 at 10:39:45PM +0100, Manuel Bouyer wrote:
 > > Indeed. But that isn't really the question. The question is really
 > > whether we're past the date for brand-new feature proposals for
 > > netbsd-6... or at least ones that involve invasive changes.
 > 
 > No, the question is whether we commit the needed bits now
 > for the feature to be pulled up without kernel API major change later,
 > or if we accept a kernel API major change in the branch after netbsd-6-0
 > is branched. "this won't ever be in netbsd-6" is not an option, I don't
 > think we can wait for netbsd-7 for this.

No, that's really not the question. It's been months that we've been
planning out what will and won't be done in time for -6. That process
finally converged and we got a firm schedule. Now you come along with
something major you've never mentioned during this whole time and it
just *needs* to get in at the last minute? Really, I don't think it
can or should.

I mean, I could cite a dozen or two other things that "ought" to be in
netbsd-6, for various definitions of "ought"... many of them much less
invasive and/or dangerous. They're not going to be. It's too bad, in
some sense, but it's already been too long since -5 was released and at
some point one needs to stop and release the system one has, not the
system one would want if one just had another three (or six or nine or
eighteen) months.

I'm sorry if I sound testy and I'm really not trying to start a fight;
but we really do need to get this thing branched and shipped, and what
you're proposing could easily turn into a three-month delay (if it
more or less works) or six or more (if it cracks wide open).

 > > Changing the way the buffer cache is indexed is semantically intrusive
 > > even if it's not physically intrusive. While I think adding a type
 > > field to modify the block number is a good idea, for various reasons,
 > > it needs to be thought through, and it hasn't been yet and there isn't
 > > time to thrash it out fully before the branch deadline. Furthermore,
 > > there's the existing question of indexing by physical vs. virtual
 > > block numbers; that is not in any way a resolved issue and what you're
 > > proposing interacts directly with it, and this too needs to be thought
 > > through and there isn't time before the branch deadline.
 > 
 > Yes, and I don't think we need to wait for these questions to be
 > sorted out to have ffs2 extended attributes.

Then figure out how to do ffs2 extended attributes without changing
the buffer cache. I'm sure it can be done, and without resorting to
gross hacks, either.

 > Even if the buffer cache
 > code is rototilled later, I don't think extended attributes or the new
 > type field will make it harder, and the code in HEAD and the netbsd-6
 > branch is allowed to diverge.

With all due respect, you also didn't think the quota proplib stuff
was going to make things harder; I've already put as much or more time
into straightening it out than it took to fix ufs_rename, and it's
nothing like done yet. I would like to avoid a repeat of that whole
experience, partly because it's an ineffective use of scarce resources
(namely, developer time, both yours and mine) and partly because such
situations breed resentment and infighting.

The buffer cache code desperately needs a major rework. Ideally this
should be done in its entirety before anyone tries to make any new
semantic extensions to it or add any new functionality, to avoid
accidentally making the code worse than it already is and to avoid
adding new requirements to the logic that might turn out to be
impossible or vastly expensive to implement sanely.

If we really need to add features or extend the behavior before the
rework can get done, it's absolutely essential that all of the
ramifications be sorted out in advance and chewed over by as many
people as possible.

I'm sure someone's going to respond to say I'm being alarmist (or at
least think it) -- maybe I am, but the buffer cache, along with other
similar/related code (think genfs_putpages, and also the syncer) is
among the most subtle and delicate in the (or any) kernel. The
implementation in NetBSD is in no way well structured or robust; the
only reason it works at all is that it's been in production all this
time and any regressions that appear get reverted. I also had the
interesting experience of doing a new implementation of roughly the
same design a couple years back; even when done carefully and tidily
and stuffed full of assertions there are many subtle ways for it to go
wrong... many of which I'm sure exist as unfixed problems in the
NetBSD code. We *know* there's a certain background level of weird
unexplainable and unrepeatable filesystem bugs. A while back I tracked
one of those down, because it produced a panic with clear symptoms,
and traced it to race conditions involving buffer flags.

Fixing this up has been basically second on my urgent priority list
after ufs_rename, which had much w

Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread YAMAMOTO Takashi
hi,

> On Mon, Jan 16, 2012 at 10:00:05PM +, YAMAMOTO Takashi wrote:
>> have you considered to separate the entity being cached from vnode?
> 
> What would this buy us ? the data are intimately tied to the inode, cleaning
> the cache when a file is deleted or would be more difficult, isn't it ?

it allows us to have more than one such entities for a vnode.
it's what you want here, isn't it?

YAMAMOTO Takashi

> 
>> iirc, irix called it "buffer cache target" or such.
>> your vtruncbuf2 function seems to imply needs to have separate
>> v_dirtyblkhd/v_cleanblkhd for each types.
> 
> It's possible, I didn't look at vnode cleaning yet.
> 
> -- 
> Manuel Bouyer 
>  NetBSD: 26 ans d'experience feront toujours la difference
> --


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Manuel Bouyer
On Mon, Jan 16, 2012 at 10:00:05PM +, YAMAMOTO Takashi wrote:
> have you considered to separate the entity being cached from vnode?

What would this buy us ? the data are intimately tied to the inode, cleaning
the cache when a file is deleted or would be more difficult, isn't it ?

> iirc, irix called it "buffer cache target" or such.
> your vtruncbuf2 function seems to imply needs to have separate
> v_dirtyblkhd/v_cleanblkhd for each types.

It's possible, I didn't look at vnode cleaning yet.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread YAMAMOTO Takashi
hi,

> On Mon, 16 Jan 2012 10:56:33 + (UTC)
> y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote:
> 
>> when the kernel wants to cache other files.
>> ie. whenever the kernel decides to reclaim it. :-)
>> you can increase the chance by running
>>  while :;do sysctl -w kern.maxvnodes=0; done
>> or something like that.
> 
> Wouldn't the performance also drop significantly with a permanently low
> maxvnodes, though?

it does never succeed.
anyway the performance is not a priority when trying to reproduce a bug.

YAMAMOTO Takashi

> 
> Thanks,
> -- 
> Matt


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Matthew Mondor
On Mon, 16 Jan 2012 10:56:33 + (UTC)
y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote:

> when the kernel wants to cache other files.
> ie. whenever the kernel decides to reclaim it. :-)
> you can increase the chance by running
>   while :;do sysctl -w kern.maxvnodes=0; done
> or something like that.

Wouldn't the performance also drop significantly with a permanently low
maxvnodes, though?

Thanks,
-- 
Matt


Re: RFC: SEEK_DATA/SEEK_HOLE implementation version 2

2012-01-16 Thread YAMAMOTO Takashi
> Hi!
> 
> On Mon, Oct 03, 2011 at 04:54:29AM +, YAMAMOTO Takashi wrote:
>> > The new implementation presents the default one-blob for file systems that
>> > don't implement it. For NetBSD it's currently implemented for UFS and is
>> > tested for FFS with/without WAPBL, ext2fs and lfs. It is present in our
>> > ZFS import but apparently still disabled and I don't have a ZFS partition to
>> > play with. I might be tempted to try it later on my scratch machine :) UDF
>> > is next but shouldn't be that difficult.
>> 
>> why is the VOP_FSYNC call necessary?
> 
> The sparse region search code depends on the indirect blocks being correctly
> written out as it traverses them. If the file is still `dirty' all the
> indirect blocks are present as negative indices so the normal FFS code works
> but their indirect blocks, when addressed with their disc addresses, are not
> up-to-date.
> 
> The FFS sparse region search code depends on the indirect blocks to see where
> actual data is recorded and needs the indirect blocks to be up-to-date. A
> range sync with only the negative range might also suffice but since most if
> not all of the applications of this code are dealing with backup/processing, the
> VOP_FSYNC() is normally a NOP.
> 
> I hope this explanation helps :)

what's wrong with just reporting dirty regions as non-hole?
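
A rough illustration of that alternative, as a userland toy: a block counts as data if it is either allocated on disk or merely dirty in memory, so no flush is needed before scanning. The block map, the toy_* names and the "return nblks when nothing is found" convention are all assumptions of this sketch; the real lseek() interface reports ENXIO rather than a sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-block state; a real file system would derive this differently. */
struct toy_blk {
	bool allocated;		/* block has an on-disk address */
	bool dirty;		/* block has unwritten data in memory */
};

/* Dirty regions are reported as non-hole, whatever the disk says. */
static bool
toy_is_data(const struct toy_blk *b)
{
	return b->allocated || b->dirty;
}

/* First block index >= start holding data; nblks if there is none. */
static size_t
toy_seek_data(const struct toy_blk *map, size_t nblks, size_t start)
{
	size_t i = start;

	assert(map != NULL || nblks == 0);
	while (i < nblks && !toy_is_data(&map[i]))
		i++;
	return i < nblks ? i : nblks;
}

/* First block index >= start that is a hole; nblks models the hole at EOF. */
static size_t
toy_seek_hole(const struct toy_blk *map, size_t nblks, size_t start)
{
	size_t i = start;

	assert(map != NULL || nblks == 0);
	while (i < nblks && toy_is_data(&map[i]))
		i++;
	return i < nblks ? i : nblks;
}
```

The trade-off is that a dirty region which would end up as a hole on disk is over-reported as data, which is conservative for backup-style consumers.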

YAMAMOTO Takashi

> 
> With regards,
> Reinoud


Re: O->A loan

2012-01-16 Thread YAMAMOTO Takashi
hi,

the regression shown by yamt3.png seems bigger than i expected.
i guess there are some bugs...

anyway, thanks!

YAMAMOTO Takashi

> The first one is tmpfs (2GB md)
> The other is UFS
> On Mon, 16 Jan 2012, YAMAMOTO Takashi wrote:
> 
>> Date: Mon, 16 Jan 2012 04:04:31 + (UTC)
>> From: YAMAMOTO Takashi 
>> To: jai...@mauthesis.com
>> Cc: c...@chuq.com, tech-kern@netbsd.org
>> Subject: Re: O->A loan
>> 
>> hi,
>>
>> i'm wondering why the following two are this drastically different.
>> was there configuration changes more than flipping DIAGNOSTIC?
>>
>>  http://linbsd.org/yamt.png
>>  http://linbsd.org/yamt3.png
>>
>> YAMAMOTO Takashi
>>
>>> This is the same hardware.
>>> rmind had noticed in the lockstat output that fileassoc was being called
>>> on all unlink() operation. So I am rerunning the tests without fileassoc.
>>>
>>> On Mon, 16 Jan 2012, YAMAMOTO Takashi wrote:
>>>
>>>> Date: Mon, 16 Jan 2012 03:47:48 + (UTC)
>>>> From: YAMAMOTO Takashi
>>>> To: jai...@mauthesis.com
>>>> Cc: c...@chuq.com, tech-kern@netbsd.org
>>>> Subject: Re: O->A loan
>>>>
>>>> hi,
>>>>
>>>> thanks.
>>>> is this on a different hardware from the previous one?
>>>>
>>>> YAMAMOTO Takashi
>>>>
>>>>>
>>>>> Hello Yamamoto-san
>>>>>
>>>>> I have run dbench on ufs/wapbl with diagnostics disabled.
>>>>> http://linbsd.org/yamt3.png
>>>>>
>>>>>
>>>>> On Thu, 12 Jan 2012, YAMAMOTO Takashi wrote:
>>>>>
>>>>>> Date: Thu, 12 Jan 2012 03:31:59 + (UTC)
>>>>>> From: YAMAMOTO Takashi
>>>>>> To: jai...@mauthesis.com
>>>>>> Cc: c...@chuq.com, tech-kern@netbsd.org
>>>>>> Subject: Re: O->A loan
>>>>>>
>>>>>> hi,
>>>>>>
>>>>>>> I did not remove DIAGNOSTIC.
>>>>>>> Would you like me to rerun without DIAGNOSTIC?
>>>>>>
>>>>>> yes, please.
>>>>>>
>>>>>> YAMAMOTO Takashi
>>>>>>
>>>>>>> On Thu, 12 Jan 2012, YAMAMOTO Takashi wrote:
>>>>>>>
>>>>>>>> Date: Thu, 12 Jan 2012 03:14:33 + (UTC)
>>>>>>>> From: YAMAMOTO Takashi
>>>>>>>> To: jai...@mauthesis.com
>>>>>>>> Cc: c...@chuq.com, tech-kern@netbsd.org
>>>>>>>> Subject: Re: O->A loan
>>>>>>>>
>>>>>>>> hi,
>>>>>>>>
>>>>>>>> thanks for benchmark!
>>>>>>>>
>>>>>>>> was it without DIAGNOSTIC?
>>>>>>>>
>>>>>>>> YAMAMOTO Takashi
>>>>>>>>
>>>>>>>>> Hello Yamamoto-san,
>>>>>>>>>
>>>>>>>>> I ran dbench on the same system with yamt-pagecache, yamt-pagecache
>>>>>>>>> without a-o loan, and yamt-pagecache-base3.
>>>>>>>>> http://linbsd.org/yamt.png
>>>>>>>>> The tests were run three times on each kernel and the results were
>>>>>>>>> consistent between reboots/runs.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> On Tue, 27 Dec 2011, YAMAMOTO Takashi wrote:
>>>>>>>>>
>>>>>>>>>> Date: Tue, 27 Dec 2011 02:53:29 + (UTC)
>>>>>>>>>> From: YAMAMOTO Takashi
>>>>>>>>>> To: c...@chuq.com
>>>>>>>>>> Cc: tech-kern@netbsd.org
>>>>>>>>>> Subject: Re: O->A loan
>>>>>>>>>>
>>>>>>>>>> hi,
>>>>>>>>>>
>>>>>>>>>> i made read with O->A loaning work for easy cases (ie. no locking
>>>>>>>>>> difficulty)
>>>>>>>>>> on yamt-pagecache branch so that someone interested can benchmark.
>>>>>>>>>>
>>>>>>>>>> YAMAMOTO Takashi
>>>>>>>>>>
>>>>>>>>>>> hi,
>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Nov 29, 2011 at 06:38:27AM +, YAMAMOTO Takashi wrote:
>>>>>>>>>>>>> O->A loaned pages installed on the user address space would have
>>>>>>>>>>>>> a different owner than the usual map->entry.uvm_obj.
>>>>>>>>>>>>> although it was not a problem when you wrote this patch, at least
>>>>>>>>>>>>> some non-mechanical changes would be required after the recent
>>>>>>>>>>>>> locking changes in this area.  namely, uvm_map_lock_entry etc now
>>>>>>>>>>>>> assumes that any pages mapped in a map entry belong to either the
>>>>>>>>>>>>> entry's amap or underlying object.
>>>>>>>>>>>>
>>>>>>>>>>>> ok, I didn't think it would be entirely mechanical.  :-)
>>>>>>>>>>>>
>>>>>>>>>>>> what if the O->A loan code also changed the entry's uvm_obj to be
>>>>>>>>>>>> the vnode that the pages really belong to?  if the loan range in
>>>>>>>>>>>> the amap is fully populated (which it is in this context) then
>>>>>>>>>>>> that shouldn't affect the logical contents of the entry, it would
>>>>>>>>>>>> just cause anyone locking the entry to also lock the vnode.  if
>>>>>>>>>>>> the range of the loan is smaller than the range of the entry, we
>>>>>>>>>>>> could split the entry.  do you think that would work?
>>>>>>>>>>>
>>>>>>>>>>> it might work, but i have some concerns:
>>>>>>>>>>> - entry fragmentation
>>>>>>>>>>> - the extra uobj reference keeps the file even after unlink
>>>>>>>>>>>
>>>>>>>>>>> YAMAMOTO Takashi
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -Chuck
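
The entry-splitting idea in the oldest quoted message, and the fragmentation concern raised in reply, can be illustrated with a toy range split. This is a sketch under assumptions: struct toy_entry and toy_split_for_loan() are hypothetical and model only address ranges, not UVM map entries, amaps or object references.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy map entry covering the half-open range [start, end). */
struct toy_entry {
	uint64_t start, end;
	int loaned;		/* 1 if backed by loaned pages */
};

/*
 * Split entry e around the loan range [ls, le), which must lie inside
 * it.  Writes between one and three entries to out[] in address order
 * and returns how many were written.
 */
static size_t
toy_split_for_loan(struct toy_entry e, uint64_t ls, uint64_t le,
    struct toy_entry out[3])
{
	size_t n = 0;

	assert(e.start <= ls && ls < le && le <= e.end);
	if (e.start < ls)	/* part before the loan keeps its state */
		out[n++] = (struct toy_entry){ e.start, ls, e.loaned };
	out[n++] = (struct toy_entry){ ls, le, 1 };
	if (le < e.end)		/* part after the loan keeps its state */
		out[n++] = (struct toy_entry){ le, e.end, e.loaned };
	return n;
}
```

A loan in the middle of an entry turns one entry into three, which is the fragmentation cost mentioned above; a loan that covers the whole entry leaves the count at one.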


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread YAMAMOTO Takashi
hi,

> Hello,
> I'm working on porting the FreeBSD FFSv2 extended attributes support.
> What we have right now only works for ffsv1 (it's a restriction in our
> sources but it could be extended to ffsv2), and uses a file hierarchy
> to store attributes. This has several issues, one being that it doesn't
> integrate with WAPBL and is very slow (glusterfs shows this very well).
> 
> FFSv2 has native extended attributes support, in the form of 2 direct
> blocks reserved for this purpose in the on-disk inode. This was commented out
> in our kernel when FFSv2 support was imported. It should be possible to
> integrate this with WAPBL and handle it as other metadata, so it should
> be fast. fsck will also be able to check it.
> 
> I don't think I'll be able to have this ready for netbsd-6, but I now know
> this requires 2 changes that will require a kernel version bump, so these
> changes need to go in before netbsd-6 is branched so that full
> extended attributes support can be pulled up later.
> 
> The first change is to the buffer cache. Right now the buffer cache is
> indexed by the couple <vnode pointer, block number>, block number being
> a block offset in the file being pointed to by vnode pointer.
> But we'll have 2 kinds of blocks: data blocks (what we have now) and
> extended attributes blocks, so block number is not enough to identify
> blocks from a vnode. FreeBSD uses negative block numbers for extattrs,
> but I find it unclean, I'm not sure it won't cause problems with our
> buffer cache (at least block -1 is used already for vtruncbuf()), and negative
> block numbers are already used in ufs for indirect blocks.
> I see 2 ways to fix this:
> 1) Add a new bflag, B_ALTDATA. When the buffer refers to an extended attr
>   block (and not a data block), this flag is set. This flag can also be
>   passed to bread() and breadn() (not the same namespace, but the same
>   B_ prefix and so the same name, this part of the buffer cache API could also
>   be improved). When looking up a buffer in the cache we also check for
>   this flag. For consumers to be able to specify we're looking up a
>   B_ALTDATA buffer, incore(), getblk() and vtruncbuf() gain a new flag
>   argument. To avoid touching all buffer cache users, I chose to
>   introduce incore2(), getblk2() and vtruncbuf2() with the extra argument,
>   and the original functions just call the *2 version with flag set to 0.
>   This is implemented in buffer.diff, and has been tested to not
>   introduce new problems with existing code.
> 
> 2) instead of using a new flag, add a new 'int type' member to struct buf,
>   which is opaque to the buffer cache itself (the meaning of type > 0 is
>   fs-dependent) but is checked when looking up a buffer.
>   Type 0 would still mean regular vnode data, so that existing users
>   won't have to be changed, other values could be used by filesystems
>   for their internal data usage (for example, ufs could use 1 for
>   first indirect blocks, 2 for second indirect blocks, 3 for third
>   indirect blocks, and some other values for extended attributes. Another
>   filesystem with e.g. blocks to store ACLs could also use its own
>   type to have its ACL blocks entered in the bufcache).
>   In addition to new incore2(), getblk2() and vtruncbuf2() functions
>   with a type argument, we'd also need a bread2() and breadn2() with
>   a type argument.
>   I've not implemented this yet.

have you considered to separate the entity being cached from vnode?
iirc, irix called it "buffer cache target" or such.
your vtruncbuf2 function seems to imply needs to have separate
v_dirtyblkhd/v_cleanblkhd for each types.
besides that, it's more natural representation for NTFS/NFSv4 style
attribute forks.
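
A very rough sketch of what separating the cached entity from the vnode could look like: each vnode carries several cache targets, each with its own clean and dirty lists, so truncating one target never walks the other. All names here (toy_target, toy_cvnode and so on) are made up for illustration and are neither the IRIX nor the NetBSD structures.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_TGT_DATA = 0, TOY_TGT_EXTATTR = 1, TOY_NTARGETS = 2 };

struct toy_cbuf {
	long blkno;
	struct toy_cbuf *next;
};

/* One cache target: its own clean and dirty buffer lists. */
struct toy_target {
	struct toy_cbuf *clean;
	struct toy_cbuf *dirty;
};

/* A vnode owns several targets instead of one pair of lists. */
struct toy_cvnode {
	struct toy_target tgt[TOY_NTARGETS];
};

static void
toy_push(struct toy_cbuf **head, struct toy_cbuf *bp)
{
	bp->next = *head;
	*head = bp;
}

/* Unlink every buffer with blkno >= lbn from one list; return the count. */
static size_t
toy_list_trunc(struct toy_cbuf **head, long lbn)
{
	size_t n = 0;

	while (*head != NULL) {
		if ((*head)->blkno >= lbn) {
			*head = (*head)->next;
			n++;
		} else {
			head = &(*head)->next;
		}
	}
	return n;
}

/* Truncate a single target; the vnode's other targets are untouched. */
static size_t
toy_target_trunc(struct toy_target *t, long lbn)
{
	assert(t != NULL);
	return toy_list_trunc(&t->clean, lbn) + toy_list_trunc(&t->dirty, lbn);
}
```

In this shape the block number alone identifies a buffer within a target, so no type needs to be compared during lookup or truncation.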

YAMAMOTO Takashi

> 
> Although I've done 1 as a POC, I prefer solution 2 (the patch is mostly the
> same, with bflag replaced by b_type). What do others think?
> 
> The second change needed outside of sys/ufs/ffs/ is:
> - new members in struct inode. This is straight from FreeBSD, and this
>   affects modules so requires a kernel version bump
> - ufs_inode.c:ufs_inactive() truncates extended data to 0 as well when
>   freeing the inode. This requires changes to ffs and lfs to ignore
>   IO_EXT (for now).
> - ufs_vnops.c:ufs_strategy(): when requesting extended attribute data
>   (B_ALTDATA actually), get extdata bn instead of regular bn via VOP_BMAP().
> This is in ufs.diff attached.
> 
> The complete diff of actual code (where extended attributes are not working
> yet, there's locking issues, as well as more code needed for reverse endian
> handling, WAPBL and fsck) is also included, so that you can see how
> all of this goes in together.
> 
> I'd really like to be able to get this in netbsd-6 later, which would
> mean (given my own schedule) I have to commit the kern and ufs/ufs parts
> before next friday if we want to avoid kernel API changes in the branch.
> 
> -- 
> Manuel Bouyer 
>  NetBSD: 26 ans d'experience feront toujours la difference
> 

Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Manuel Bouyer
On Mon, Jan 16, 2012 at 10:45:03PM +0100, Martin Husemann wrote:
> On Mon, Jan 16, 2012 at 10:39:45PM +0100, Manuel Bouyer wrote:
> > is branched. "this won't ever be in netbsd-6" is not an option, I don't
> > think we can wait for netbsd-7 for this.
> 
> Why not?

Because manu@ has put lots of efforts in getting glusterfs running,
and I think it's something we can market. But it's unusable with ffsv1
extattrs, we really need something better.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Martin Husemann
On Mon, Jan 16, 2012 at 10:39:45PM +0100, Manuel Bouyer wrote:
> is branched. "this won't ever be in netbsd-6" is not an option, I don't
> think we can wait for netbsd-7 for this.

Why not?

Martin


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Manuel Bouyer
On Mon, Jan 16, 2012 at 08:46:44PM +, David Holland wrote:
> On Mon, Jan 16, 2012 at 07:37:19PM +0100, Manuel Bouyer wrote:
>  > > > The first change is to the buffer cache.
>  > > 
>  > > My first reaction is that I don't think it's a good idea to make major
>  > > changes to the buffer cache at this stage in a release cycle. It's not
>  > > exactly the most robust or stable code we have in the system.
>  > 
>  > On the other hand, changing the kernel ABI in the release branch
>  > is not a good idea either, given the state of the module subsystem ...
> 
> Indeed. But that isn't really the question. The question is really
> whether we're past the date for brand-new feature proposals for
> netbsd-6... or at least ones that involve invasive changes.

No, the question is whether we commit the needed bits now
so the feature can be pulled up later without a major kernel API change,
or whether we accept a major kernel API change in the branch after netbsd-6-0
is branched. "this won't ever be in netbsd-6" is not an option, I don't
think we can wait for netbsd-7 for this.

> 
> releng and/or core will need to rule on that but I don't think it's a
> good idea.
> 
> The buffer cache code is ratty, poorly structured, and full of races
> that may or may not have visible consequences. Any change to it could
> turn out to be unexpectedly disruptive, and we don't at this point
> have time to clean up if the changes turn out to make a mess.
> (Especially since the person who'd likely have to do the cleanup,
> namely me, is already oversubscribed right up to the branch deadline
> on other stuff. Which, not to put too fine a point on it, includes
> sorting out a previous mess.)
> 
>  > I wouldn't call this a major change. The alternative way (adding a b_type
>  > member to struct buf) is even less intrusive; if the alternative (*2())
>  > functions are not used it'll always be 0.
> 
> Changing the way the buffer cache is indexed is semantically intrusive
> even if it's not physically intrusive. While I think adding a type
> field to modify the block number is a good idea, for various reasons,
> it needs to be thought through, and it hasn't been yet and there isn't
> time to thrash it out fully before the branch deadline. Furthermore,
> there's the existing question of indexing by physical vs. virtual
> block numbers; that is not in any way a resolved issue and what you're
> proposing interacts directly with it, and this too needs to be thought
> through and there isn't time before the branch deadline.

Yes, and I don't think we need to wait for these questions to be
sorted out to have ffs2 extended attributes. Even if the buffer cache
code is rototilled later, I don't think extended attributes or the new
type field will make it harder, and the code in HEAD and the netbsd-6
branch is allowed to diverge.

> 
> Meanwhile, adding a whole set of extra functions instead of doing
> things properly with a massedit is highly undesirable; not only is it
> ugly but it leaves us unable to tidy the extra functions away in HEAD
> after branching, which is bad... unless we want to vastly complicate
> any pullup that goes near the buffer cache, which would be worse.

A massive change to the buffer cache code won't be pulled up to netbsd-6
anyway (one reason being the mess it would cause for modules), so
the code will not be the same in netbsd-6 and HEAD in any case.

>  > > (Also, you're aware that it isn't used for file data blocks, right?)
>  > 
>  > It depends on what you call "file data". Directory data blocks end up
>  > there AFAIK, as do symlink data blocks when the link's target doesn't
>  > fit in the inode.
> 
> Right. So, should extended attribute blocks go in the old buffer cache
> or should they be managed by uvm? This is another question we likely
> don't have time to address properly by the branch deadline.

Definitely the buffer cache, so it's covered by the journal. This is
metadata.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread YAMAMOTO Takashi
hi,

> On Mon, Jan 16, 2012 at 10:56:33AM +, YAMAMOTO Takashi wrote:
>> you can increase the chance by running
>>  while :;do sysctl -w kern.maxvnodes=0; done
> 
> It will always fail:
> bacasable#  sysctl -w kern.maxvnodes=0 
> sysctl: kern.maxvnodes: sysctl() failed with Device busy

it tries to reclaim vnodes before failing.

YAMAMOTO Takashi

> 
> -- 
> Emmanuel Dreyfus
> m...@netbsd.org


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Mouse
>> This makes a file no longer a long list of octets; it becomes
>> multiple long lists of octets.  [...]
> [...] I have always found the idea flaky myself (and sorry for the
> "rant"):  [...]

Yeah.  I think it's a very interesting direction to take filesystems.

But this, interesting as it is, is research experimentation; we do not
even nearly understand how to fit multi-fork (to adopt the MacOS term)
files into a Unix paradigm (witness all the programs that we don't
understand how to change for this), and investigating non-understood
things is what research _is_.  And I think the master tree for a
(supposedly-)production OS is not the place to be carrying out research
experiments, not even if another such OS is already doing it.

But my opinions seem to correlate negatively with NetBSD's these days.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML        mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Matthew Mondor
On Sun, 15 Jan 2012 15:21:40 -0500 (EST)
Mouse  wrote:

> However, I think that constitutes a good implementation of a bad idea.
> This makes a file no longer a long list of octets; it becomes multiple
> long lists of octets.  The Mac did this, with resource forks and data
> forks, and you may note OS X doesn't do it any longer.  I suspect these
> will seem like a good idea for a while, until people start discovering
> all the things they break, or that break them, and realize that they
> didn't learn from history and thus had to repeat it.

I didn't know that Apple dropped the idea, but I have always found the
idea flaky myself (and sorry for the "rant"):

- Applications may still implement and maintain metadata as they wish
  without the feature
- Requires changes to support in OS, FS, and many file manipulation
  tools
- No standard API for these, few, incompatible, restricted
  solutions/formats for archival
- Security implications (scanning tools which aren't aware might skip
  "hidden/extended" data; if ACLs are eventually implemented and are
  using these, the implementation should not only support a system
  domain, but also use IDs rather than strings (or at least severely
  sanity-check a restricted string format))
- Inevitable eventual loss of the extended data, possibly because of
  backup procedures not aware of it, moving/copying/editing files with
  non-aware/third-party tools, etc (also consider editors that save to
  another file to then rename)
- An administrative nightmare when tools such as find/locate/grep/diff
  won't disclose data that the admin might be looking for but is now in
  an extended attribute

But this is only the opinion of a user, and I could keep the feature
disabled on my systems, of course, so I don't necessarily object to
optional support for it.
-- 
Matt


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread David Holland
On Mon, Jan 16, 2012 at 07:37:19PM +0100, Manuel Bouyer wrote:
 > > > The first change is to the buffer cache.
 > > 
 > > My first reaction is that I don't think it's a good idea to make major
 > > changes to the buffer cache at this stage in a release cycle. It's not
 > > exactly the most robust or stable code we have in the system.
 > 
 > On the other hand, changing the kernel ABI in the release branch
 > is not a good idea either, given the state of the module subsystem ...

Indeed. But that isn't really the question. The question is really
whether we're past the date for brand-new feature proposals for
netbsd-6... or at least ones that involve invasive changes.

releng and/or core will need to rule on that but I don't think it's a
good idea.

The buffer cache code is ratty, poorly structured, and full of races
that may or may not have visible consequences. Any change to it could
turn out to be unexpectedly disruptive, and we don't at this point
have time to clean up if the changes turn out to make a mess.
(Especially since the person who'd likely have to do the cleanup,
namely me, is already oversubscribed right up to the branch deadline
on other stuff. Which, not to put too fine a point on it, includes
sorting out a previous mess.)

 > I wouldn't call this a major change. The alternative way (adding a b_type
 > member to struct buf) is even less intrusive; if the alternative (*2())
 > functions are not used it'll always be 0.

Changing the way the buffer cache is indexed is semantically intrusive
even if it's not physically intrusive. While I think adding a type
field to modify the block number is a good idea, for various reasons,
it needs to be thought through, and it hasn't been yet and there isn't
time to thrash it out fully before the branch deadline. Furthermore,
there's the existing question of indexing by physical vs. virtual
block numbers; that is not in any way a resolved issue and what you're
proposing interacts directly with it, and this too needs to be thought
through and there isn't time before the branch deadline.

Meanwhile, adding a whole set of extra functions instead of doing
things properly with a massedit is highly undesirable; not only is it
ugly but it leaves us unable to tidy the extra functions away in HEAD
after branching, which is bad... unless we want to vastly complicate
any pullup that goes near the buffer cache, which would be worse.

(Note that adding a flag instead of a type field is even worse; it has
all the same drawbacks, plus we'd be likely to want to rip it out and
replace it with the type field; plus on top of that race conditions
involving the flags are one of the primary things I'm worried about
disrupting.)

 > > (Also, you're aware that it isn't used for file data blocks, right?)
 > 
 > It depends on what you call "file data". Directory data blocks end up
 > there AFAIK, as do symlink data blocks when the link's target doesn't
 > fit in the inode.

Right. So, should extended attribute blocks go in the old buffer cache
or should they be managed by uvm? This is another question we likely
don't have time to address properly by the branch deadline.

-- 
David A. Holland
dholl...@netbsd.org


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Brian Buhrow
hello.  pstat -v should give you what you want to know.

-thanks
-Brian

On Jan 16,  1:17pm, Emmanuel Dreyfus wrote:
} Subject: Re: PUFFS and existing file that get ENOENT
} On Mon, Jan 16, 2012 at 02:02:41PM +0100, Adam Hamsik wrote:
} > Just try to lower that number to some smaller one ?
} 
} sysctl(7) says:
}  kern.maxvnodes (KERN_MAXVNODES)
}  The maximum number of vnodes available on the system.  This can
}  only be raised.
} 
} But it seems I can lower it from 26214 to 200 without a hitch. I have
} no idea how much room it has, however. We cannot get the number of used
} vnodes from userland, can we?
} 
} 
} -- 
} Emmanuel Dreyfus
} m...@netbsd.org
>-- End of excerpt from Emmanuel Dreyfus




Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread Manuel Bouyer
On Mon, Jan 16, 2012 at 04:37:57PM +, David Holland wrote:
> On Sun, Jan 15, 2012 at 08:37:37PM +0100, Manuel Bouyer wrote:
>  > I don't think I'll be able to have this ready for netbsd-6, but I now know
>  > this requires 2 changes that will require a kernel version bump, so these
>  > changes need to go in before netbsd-6 is branched so that full
>  > extended attributes support can be pulled up later.
>  > 
>  > The first change is to the buffer cache.
> 
> My first reaction is that I don't think it's a good idea to make major
> changes to the buffer cache at this stage in a release cycle. It's not
> exactly the most robust or stable code we have in the system.

On the other hand, changing the kernel ABI in the release branch
is not a good idea either, given the state of the module subsystem ...

I wouldn't call this a major change. The alternative way (adding a b_type
member to struct buf) is even less intrusive; if the alternative (*2())
functions are not used it'll always be 0.
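
To make the b_type idea concrete, here is a minimal userland sketch. It is
not the kernel code: struct mock_buf, mock_getblk() and mock_getblk2() are
hypothetical names made up for illustration, and the real struct buf and
getblk() have many more fields and arguments. It only shows that a lookup
keyed on (block number, type) leaves existing callers on type 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NBUF 16

/* Hypothetical, heavily reduced stand-in for struct buf. */
struct mock_buf {
	int	b_inuse;	/* slot holds a valid entry */
	int64_t	b_lblkno;	/* logical block number */
	int	b_type;		/* 0 = ordinary block, nonzero = other kinds */
};

static struct mock_buf mock_cache[MOCK_NBUF];

/* New-style lookup: an entry matches only if block number AND type match. */
static struct mock_buf *
mock_getblk2(int64_t lblkno, int type)
{
	struct mock_buf *freebuf = NULL;

	assert(type >= 0);
	for (size_t i = 0; i < MOCK_NBUF; i++) {
		struct mock_buf *bp = &mock_cache[i];

		if (bp->b_inuse && bp->b_lblkno == lblkno &&
		    bp->b_type == type)
			return bp;
		if (!bp->b_inuse && freebuf == NULL)
			freebuf = bp;
	}
	if (freebuf != NULL) {
		freebuf->b_inuse = 1;
		freebuf->b_lblkno = lblkno;
		freebuf->b_type = type;
	}
	return freebuf;		/* NULL if the mock cache is full */
}

/* Old-style lookup: unchanged callers implicitly use type 0. */
static struct mock_buf *
mock_getblk(int64_t lblkno)
{
	return mock_getblk2(lblkno, 0);
}
```

In this sketch block 5 of type 0 and block 5 of type 1 are distinct cache
entries, and code that never calls the *2() variant only ever sees type 0.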

> 
> (Also, you're aware that it isn't used for file data blocks, right?)

It depends on what you call "file data". Directory data blocks end up
there AFAIK, as do symlink data blocks when the link's target doesn't
fit in the inode.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Return status ENXIO / ESRCH in kern_drvctl.c

2012-01-16 Thread Paul Goyette
While browsing the code for other stuff, I ran into what appears to be 
an inconsistency in drvctl(4).


In most cases, when the device specified cannot be found, we return 
ENXIO - Device not configured.


But in routine drvctl_command_get_properties(), if the device is not 
found, we return ESRCH - No such process.  (This is in file 
sys/kern/kern_drvctl.c rev 1.32 at line 462)


It seems to me that this is incorrect, and we should return ENXIO here.
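
As a rough illustration of what a consistent return would look like, here
is a small userland sketch. It is not the kern_drvctl.c code: the device
table and the mock_find_device() and mock_get_properties() names are
hypothetical. The point is only that the "device not found" result is
produced in one place and passed through unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device list, for illustration only. */
static const char *const mock_devices[] = { "wd0", "sd0" };

/* Returns 0 if the named device exists, ENXIO otherwise. */
static int
mock_find_device(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0;
	    i < sizeof(mock_devices) / sizeof(mock_devices[0]); i++) {
		if (strcmp(name, mock_devices[i]) == 0)
			return 0;
	}
	return ENXIO;	/* "Device not configured" */
}

/* A command handler propagates the lookup error instead of using ESRCH. */
static int
mock_get_properties(const char *name)
{
	int error = mock_find_device(name);

	if (error != 0)
		return error;
	/* ... a real handler would collect the properties here ... */
	return 0;
}
```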


As a side note, it would be nice if we had a drvctl(4) man page.   :)



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |
| Kernel Developer |  | pgoyette at netbsd.org  |
-


Re: buffer cache & ufs changes (preliminary ffsv2 extattr support)

2012-01-16 Thread David Holland
On Sun, Jan 15, 2012 at 08:37:37PM +0100, Manuel Bouyer wrote:
 > I don't think I'll be able to have this ready for netbsd-6, but I now know
 > this requires 2 changes that will require a kernel version bump, so these
 > changes need to go in before netbsd-6 is branched so that full
 > extended attributes support can be pulled up later.
 > 
 > The first change is to the buffer cache.

My first reaction is that I don't think it's a good idea to make major
changes to the buffer cache at this stage in a release cycle. It's not
exactly the most robust or stable code we have in the system.

(Also, you're aware that it isn't used for file data blocks, right?)

-- 
David A. Holland
dholl...@netbsd.org


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Rhialto
On Mon 16 Jan 2012 at 13:17:17 +, Emmanuel Dreyfus wrote:
> But it seems I can lower it from 26214 to 200 without a hitch. I have
> no idea how much room it has, however. We cannot get the number of used
> vnodes from userland, can we?

pstat -v gives the number of "active" vnodes; that may be useful.

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert  -- There's no point being grown-up if you 
\X/ rhialto/at/xs4all.nl-- can't be childish sometimes. -The 4th Doctor


Re: kernel crash at ibm x3850

2012-01-16 Thread 6bone

Hello,

I compiled a kernel with some more debug code.


kernel text is mapped with 6 large pages and 34 normal pages
Loaded initial symtab at 0x81258fa0, strtab at 0x81303f70, # entries 29097

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 5.1_STABLE (INSTALL) #7: Sun Jan 15 23:33:41 CET 2012

r...@6bone.informatik.uni-leipzig.de:/usr/obj/sys/arch/amd64/compile/INSTALL
total memory = 511 GB
avail memory = 496 GB
RTC BIOS diagnostic error 80
SMBIOS rev. 2.7 @ 0x7f0be000 (137 entries)
IBM System x3850 X5 -[71453RG]- (06)
mainbus0 (root)
cpu0 at mainbus0 apid 0: Intel 686-class, 1995MHz, id 0x206e6
fatal protection fault in supervisor mode
trap type 4 code 0 rip 8056e456 cs 8 rflags 10246 cr2  0 cpl 8 rsp 8137ab98
kernel: protection fault trap, code=0 
Stopped in pid 0.1 (system) at  netbsd:rdmsr+0x6:   rdmsr

db{0}> bt
rdmsr() at netbsd:rdmsr+0x6
est_init_once() at netbsd:est_init_once+0x148
_run_once() at netbsd:_run_once+0x67
cpu_identify() at netbsd:cpu_identify+0x171
cpu_attach() at netbsd:cpu_attach+0x21f
config_attach_loc() at netbsd:config_attach_loc+0x15a
mpacpi_config_cpu() at netbsd:mpacpi_config_cpu+0x6e
acpi_madt_walk() at netbsd:acpi_madt_walk+0x45
mpacpi_scan_apics() at netbsd:mpacpi_scan_apics+0x90
mainbus_attach() at netbsd:mainbus_attach+0x26c
config_attach_loc() at netbsd:config_attach_loc+0x15a
cpu_configure() at netbsd:cpu_configure+0x26
main() at netbsd:main+0x1aa
db{0}> show register
ds  0x5
es  0
fs  0x64
gs  0xbed6
rdi 0xcd
rsi 0x805690a0  est_init_once
rbp 0x8137abb0
rbx 0xe
rdx 0x1
rcx 0xcd
rax 0
r8  0x3
r9  0
r10 0x1
r11 0x802e9da0  comcnputc
r12 0x80c4ea00  cpu_info_primary
r13 0x8137ad40
r14 0x8004b0b5e710
r15 0x1
rip 0x8056e456  rdmsr+0x6
cs  0x8
rflags  0x10246
rsp 0x8137ab98
ss  0x10
netbsd:rdmsr+0x6:   rdmsr



Thank you for your efforts

Regards
Uwe


On Fri, 13 Jan 2012, Patrick Welche wrote:

> Date: Fri, 13 Jan 2012 12:34:09 +
> From: Patrick Welche 
> To: 6b...@6bone.informatik.uni-leipzig.de
> Cc: tech-kern@netbsd.org
> Subject: Re: kernel crash at ibm x3850
> 
> On Fri, Jan 13, 2012 at 11:54:58AM +0100, 6b...@6bone.informatik.uni-leipzig.de wrote:
> > if I boot the netbsd-5-1-1 install media on an IBM x3850 the kernel
> > crashes. You can find the screenshot at
> > https://suse.uni-leipzig.de/ibm-x3850.jpg
> > 
> > Any ideas what could be the problem?
> 
> Maybe you could get a backtrace (type "bt") at the prompt showing, and
> take a picture of that?
> (http://www.netbsd.org/docs/kernel/#ddb)
> 
> Cheers,
> 
> Patrick



Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Emmanuel Dreyfus
On Mon, Jan 16, 2012 at 02:02:41PM +0100, Adam Hamsik wrote:
> Just try to lower that number to some smaller one ?

sysctl(7) says:
 kern.maxvnodes (KERN_MAXVNODES)
 The maximum number of vnodes available on the system.  This can
 only be raised.

But it seems I can lower it from 26214 to 200 without a hitch. I have
no idea how much room it has, however. We cannot get the number of used
vnodes from userland, can we?


-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Adam Hamsik

On Jan,Monday 16 2012, at 11:58 AM, Emmanuel Dreyfus wrote:

> On Mon, Jan 16, 2012 at 10:56:33AM +, YAMAMOTO Takashi wrote:
>> you can increase the chance by running
>>  while :;do sysctl -w kern.maxvnodes=0; done
> 
> It will always fail:
> bacasable#  sysctl -w kern.maxvnodes=0 
> sysctl: kern.maxvnodes: sysctl() failed with Device busy


Just try to lower that number to some smaller one ?

Regards

Adam.



Re: heads-up: IPSEC is now FAST_IPSEC

2012-01-16 Thread Matthias Drochner

rm...@netbsd.org said:
>  Are you planning to remove old IPSEC code?

We should provide the KAME code as fallback for at least one
major release. Not that I don't trust the new code, but as
a matter of solid engineering.

> I think post-netbsd-6 branch (or even now?)
> would be a very good time.

Post-6-branch would be OK if no serious problems show up.

While we are here -- there are two places in the KAME code
where it interacts with the "pf" packet filter:
-For policy lookup, a pf packet tag can be used as condition.
-There is some ifdef'd code in sys/dist/pf/net/pf.c
 which has probably never worked in NetBSD, apparently
 for interfaces with HW crypto support. (It does not get
 compiled because someone forgot to include "opt_ipsec.h".)
Can you tell whether this should be pulled into FAST_IPSEC?
Policy lookup is actually something which could use some
improvement, because it is performance critical even if
IPSEC is not used for the connection in question. OpenBSD
has integrated this with the routing framework, but using
a packet filter as packet classifier would also be
a conceivable option. What are your plans with npf?

best regards
Matthias


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Emmanuel Dreyfus
On Mon, Jan 16, 2012 at 10:56:33AM +, YAMAMOTO Takashi wrote:
> you can increase the chance by running
>   while :;do sysctl -w kern.maxvnodes=0; done

It will always fail:
bacasable#  sysctl -w kern.maxvnodes=0 
sysctl: kern.maxvnodes: sysctl() failed with Device busy

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread YAMAMOTO Takashi
hi,

> On Mon, Jan 16, 2012 at 06:25:39AM +, YAMAMOTO Takashi wrote:
>> it should retry from puffs_cookie2pnode in that case.
> 
> I also need to build a test case that reliably reproduces the bug. 
> For now I run our build.sh -Uo release and come back the next day, 
> which is not very convenient.
> 
> As I understand it, I need to look up a node I already know but that is
> being recycled. When does the kernel decide to recycle a vnode?

when the kernel wants to cache other files.
ie. whenever the kernel decides to reclaim it. :-)
you can increase the chance by running
while :;do sysctl -w kern.maxvnodes=0; done
or something like that.
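
The "retry from the lookup" idea can be mocked in userland as below. This
is only a sketch under stated assumptions, not puffs code: mock_lookup()
stands in for the cookie-to-node lookup, mock_ref() for vget(), and
mock_reclaim() simulates a node being recycled and then recreated under
the same cookie (the recreation is an assumption of the mock). All names
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct mock_node {
	int cookie;
	int reclaimed;	/* set once the node has been recycled */
	int refs;
};

static struct mock_node mock_nodes[2];	/* original + replacement storage */
static struct mock_node *mock_slot;	/* the single "hash" slot */
static int mock_reclaim_pending;	/* force one reclaim after a lookup */

/* Stand-in for the cookie lookup: returns the current entry or NULL. */
static struct mock_node *
mock_lookup(int cookie)
{
	if (mock_slot != NULL && mock_slot->cookie == cookie)
		return mock_slot;
	return NULL;
}

/* Stand-in for taking a reference: ENOENT if the node was recycled. */
static int
mock_ref(struct mock_node *np)
{
	if (np->reclaimed)
		return ENOENT;
	np->refs++;
	return 0;
}

/* Simulate recycling np and re-creating a node for the same cookie. */
static void
mock_reclaim(struct mock_node *np)
{
	struct mock_node *fresh =
	    (np == &mock_nodes[0]) ? &mock_nodes[1] : &mock_nodes[0];

	np->reclaimed = 1;
	fresh->cookie = np->cookie;
	fresh->reclaimed = 0;
	fresh->refs = 0;
	mock_slot = fresh;
}

/* On ENOENT from mock_ref(), restart from the lookup, not from the ref. */
static int
mock_cookie2node(int cookie, struct mock_node **npp)
{
	assert(npp != NULL);
	for (;;) {
		struct mock_node *np = mock_lookup(cookie);

		if (np == NULL)
			return ENOENT;	/* no such node at all */
		if (mock_reclaim_pending) {
			mock_reclaim_pending = 0;
			mock_reclaim(np);	/* the race window */
		}
		if (mock_ref(np) == 0) {
			*npp = np;
			return 0;
		}
		/* np went stale under us: drop it and look up again. */
	}
}
```

Looping on mock_ref() alone would spin on the stale pointer forever; going
back through mock_lookup() is what picks up the replacement node.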

YAMAMOTO Takashi

> 
> -- 
> Emmanuel Dreyfus
> m...@netbsd.org


Re: PUFFS and existing file that get ENOENT

2012-01-16 Thread Emmanuel Dreyfus
On Mon, Jan 16, 2012 at 06:25:39AM +, YAMAMOTO Takashi wrote:
> it should retry from puffs_cookie2pnode in that case.

I also need to build a test case that reliably reproduces the bug. 
For now I run our build.sh -Uo release and come back the next day, 
which is not very convenient.

As I understand it, I need to look up a node I already know but that is
being recycled. When does the kernel decide to recycle a vnode?

-- 
Emmanuel Dreyfus
m...@netbsd.org