Re: sparse dumps (was: WAPL panic)

2012-11-13 Thread Chuck Silvers
On Mon, Nov 12, 2012 at 07:46:51AM -0700, Sverre Froyen wrote: On Nov 9, 2012, at 08:01, Chuck Silvers c...@chuq.com wrote: I have tested your patches for NetBSD-current on VMware Fusion (under Mac OSX). Breaking into ddb and entering reboot 0x104 results in a good core dump. As you note,

Re: sparse dumps (was: WAPL panic)

2012-11-12 Thread Sverre Froyen
On Nov 9, 2012, at 08:01, Chuck Silvers c...@chuq.com wrote: On Wed, Nov 07, 2012 at 02:22:49PM +0100, Edgar Fuß wrote: Try to get a sparse dump via machdep.sparse_dump=1 How long is that supposed to take? It said dump, paused for a few seconds, then counted from 44 down to 38 and then

Re: WAPL panic

2012-11-07 Thread Edgar Fuß
wapbl_register_inode shouldn't be able to reach that panic... Maybe that's some stack frame optimization. it's in wapbl_register_deallocation. Yes, as it says: wapbl_register_deallocation: out of resources ffs_truncate calls both, but mkdir shouldn't result in things being released... or so

Re: WAPL panic

2012-11-07 Thread Edgar Fuß
Shouldn't really happen with mkdir. But it did. Twice. You could try a kernel with WAPBL_DEBUG I will try that. You should see a few thousand inode records after such a crash. But as I wrote, the file system where the mount took minutes was NOT the one I was operating on during the crash. It

amd64 sparse dumps (was: WAPL panic)

2012-11-07 Thread Edgar Fuß
Yup - appears that it did make it. It seems to be missing from CHANGES that amd64 has it.

Re: WAPL panic

2012-11-07 Thread Thor Lancelot Simon
On Wed, Nov 07, 2012 at 02:30:22PM +0100, Edgar Fuß wrote: But it did. Twice. Third time, same backtrace. In this case, a simple mkdir x on the same file system as before. That's a plain FFSv1 sized 553MB with 64k blocks. The only unusual thing is it has some 10 inodes. Does that

Re: WAPL panic

2012-11-07 Thread Edgar Fuß
Are you running all your filesystems with 64K blocks? Most of them, yes. I did some lengthy performance test on various combinations of a RAID 5's stripe size and the file system block size (the results of which I posted on tech-kern). The main concern was whether the nightly backup (a TSM

Re: WAPL panic

2012-11-07 Thread Thor Lancelot Simon
On Wed, Nov 07, 2012 at 02:57:46PM +0100, Edgar Fuß wrote: The winner was 64k fsbsize on 128SpSU (resulting in one fs block per stripe unit and four per stripe). This looks wrong. The 6.0 kernel will never, under any circumstances, issue a transfer larger than 64K to a disk device, not even

Re: WAPL panic

2012-11-07 Thread Edgar Fuß
This looks wrong. Remember I was only testing read performance then. The 6.0 kernel will never, under any circumstances, issue a transfer larger than 64K to a disk device, not even a pseudodevice like RAIDframe I'm well aware of that. That's why now, I'm testing write performance. What

Re: WAPL panic

2012-11-07 Thread David Holland
On Wed, Nov 07, 2012 at 11:34:08AM +0100, Edgar Fuß wrote: wapbl_register_inode shouldn't be able to reach that panic... Maybe that's some stack frame optimization. Well... as far as I can tell wapbl_register_inode does not call wapbl_register_deallocation, so it shouldn't be. But maybe ddb

Re: WAPL panic

2012-11-07 Thread David Holland
On Wed, Nov 07, 2012 at 12:04:01PM +0100, J. Hannken-Illjes wrote: ffs_truncate calls both, but mkdir shouldn't result in things being released... or so I'd think. It does. Just before returning, ufs_direnter() tries to shorten the directory and calls UFS_TRUNCATE() aka ffs_truncate().

Re: WAPL panic

2012-11-07 Thread Mouse
Are you running all your filesystems with 64K blocks? Almost nobody uses them. Yet another case where I'm an outlier, I guess. :) I routinely use filesystems created with -f 8192 -b 65536 -n 1 (I have a few filesystems which contain small numbers of multi-gig files, with almost no creates and

Re: WAPL panic

2012-11-07 Thread Simon Burge
Mouse wrote: Are you running all your filesystems with 64K blocks? Almost nobody uses them. Yet another case where I'm an outlier, I guess. :) I use 64k/8k blocks/frags as well for my media filesystems, and have done as long as I can remember. A certain ex-employer of mine used 64k/64k

re: WAPL panic

2012-11-07 Thread matthew green
On Wed, Nov 07, 2012 at 02:30:22PM +0100, Edgar Fuß wrote: But it did. Twice. Third time, same backtrace. In this case, a simple mkdir x on the same file system as before. That's a plain FFSv1 sized 553MB with 64k blocks. The only unusual thing is it has some 10 inodes.

WAPL panic

2012-11-06 Thread Edgar Fuß
So, while investigating my WAPL performance problems, it looks like I can crash the machine (not reliably, but more often than not) with a simple seq 1 3000 | xargs mkdir command. I get the following backtrace in ddb (wetware OCR): panic: wapbl_register_deallocation: out of resources

Re: WAPL panic

2012-11-06 Thread Paul Goyette
On Tue, 6 Nov 2012, Edgar Fuß wrote: It's unreasonable to take a dump because that would take an estimated four to five hours. Is there any reasonable way to get a dump out of a 16G box? If I remember correctly, i386 has partial/sparse crash dumps, but they're not yet implemented in amd64.

Re: WAPL panic

2012-11-06 Thread Thor Lancelot Simon
On Tue, Nov 06, 2012 at 02:25:10PM -0800, Paul Goyette wrote: On Tue, 6 Nov 2012, Edgar Fuß wrote: It's unreasonable to take a dump because that would take an estimated four to five hours. Is there any reasonable way to get a dump out of a 16G box? If I remember correctly, i386 has

Re: WAPL panic

2012-11-06 Thread Michael van Elst
e...@math.uni-bonn.de (Edgar Fuß) writes: So, while investigating my WAPL performance problems, it looks like I can crash the machine (not reliably, but more often than not) with a simple seq 1 3000 | xargs mkdir command. I get the following backtrace in ddb (wetware

Re: WAPL panic

2012-11-06 Thread Christos Zoulas
In article 20121106221628.gl22...@trav.math.uni-bonn.de, Edgar Fuß e...@math.uni-bonn.de wrote: So, while investigating my WAPL performance problems, it looks like I can crash the machine (not reliably, but more often than not) with a simple seq 1 3000 | xargs mkdir command. I get the

Re: WAPL panic

2012-11-06 Thread Paul Goyette
On Tue, 6 Nov 2012, Thor Lancelot Simon wrote: On Tue, Nov 06, 2012 at 02:25:10PM -0800, Paul Goyette wrote: On Tue, 6 Nov 2012, Edgar Fuß wrote: It's unreasonable to take a dump because that would take an estimated four to five hours. Is there any reasonable way to get a dump out of a 16G

Re: WAPL panic

2012-11-06 Thread David Holland
On Tue, Nov 06, 2012 at 11:16:29PM +0100, Edgar Fuß wrote: So, while investigating my WAPL performance problems, it looks like I can crash the machine (not reliably, but more often than not) with a simple seq 1 3000 | xargs mkdir command. I get the following backtrace in ddb (wetware