Re: NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-08-01 Thread Mouse
>> It's worth remembering that -o tcp is an option.
> Not for NFS through a (stateful) filtering router, no.

True.

But then, not over network hops that drop port 2049, either.  Break the
assumptions underlying the 'net and you have to expect breakage from
stuff built atop it.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-08-01 Thread Hauke Fath
At 19:14 Uhr + 31.07.2019, m...@netbsd.org wrote:
>It's worth remembering that -o tcp is an option.

Not for NFS through a (stateful) filtering router, no. Reboot the router,
and you will have to walk up to every client and reboot it. With NFS over
UDP, the clients will recover.
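
For illustration, the two transports as mount options (a minimal
sketch; server name and paths are hypothetical):

    # NFS over TCP: one connection carries all RPCs, so a stateful
    # filter that forgets the connection strands the client
    mount -t nfs -o tcp server:/export /mnt

    # NFS over UDP (the NetBSD default): each request stands alone,
    # so the client just retransmits until the path works again
    mount -t nfs server:/export /mnt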

Cheerio,
hauke


--
"It's never straight up and down" (DEVO)




Re: NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-08-01 Thread Hauke Fath
At 10:45 Uhr +0200 31.07.2019, Edgar Fuß wrote:
>Thanks to riastradh@, this turned out to be caused by an (UDP, hard) NFS
>mount combined with a mis-configured IPFilter that blocked all but the
>first fragment of a fragmented NFS reply (e.g., readdir) combined with a
>NetBSD design error (or so Taylor says) that a vnode lock may be held
>across I/O, in this case, network I/O.

I ran into a similar issue 2004ish, connecting Red Hat Linux clients to a
(NetBSD) NFS (UDP) server through a (NetBSD, ipfilter) filtering router.
Darren back then told me Linux sends fragmented packets tail-first, which
ipfilter was not prepared to deal with.

I switched to pf, which was able to deal with the scenario just fine, and
didn't look back.
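
Presumably pf coped because it normalises traffic up front: reassembling
fragments before the rules run makes fragment order irrelevant. A pf.conf
line of that era would be something like (a sketch from memory, syntax
per the then-current pf.conf(5)):

    # reassemble IP fragments before filtering, so tail-first
    # fragment trains are evaluated as complete packets
    scrub in all fragment reassemble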

Cheerio,
hauke


--
"It's never straight up and down" (DEVO)




Re: NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-07-31 Thread maya
On Wed, Jul 31, 2019 at 07:11:54AM -0700, Jason Thorpe wrote:
> 
> > On Jul 31, 2019, at 1:45 AM, Edgar Fuß  wrote:
> > 
> > NetBSD design error (or so Taylor says) that a vnode lock may be held 
> > across I/O
> 
> 100%
> 
> NetBSD's VFS locking protocol needs a serious overhaul.  At least one other 
> BSD-family VFS (the one in XNU) completely eliminated locking of vnodes at 
> the VFS layer (it's all pushed into the file system back-ends who now have 
> more control over their own locking requirements).  It does have some 
> additional complexities around reference / busy counting and vnode identity, 
> but it works very well in practice.
> 
> I don't know what FreeBSD has done in this area.
> 
> -- thorpej
> 

IMNT_MPSAFE, which NFS isn't?


Re: NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-07-31 Thread maya
On Wed, Jul 31, 2019 at 11:42:26AM -0500, Don Lee wrote:
> If you go back a few years, you can find a thread where I reported tstile 
> lockups on PPC. I don’t remember the details, but it was back in 6.1 as I 
> recall. This is not a new problem, and not limited to NFS. I still have a 
> similar problem with my 7.2 system, usually triggered when I do backups 
> (dump/restore). The dump operation locks up and cannot be killed. The system 
> continues, except any process that trips over the tstile also locks up. 
> Eventually, the system grinds to a complete halt. (can’t even log in) If I 
> catch it before that point, I can almost reboot, but I have to power cycle to 
> kill the tstile process(es), or the reboot also hangs.

It's worth remembering that -o tcp is an option.


Re: NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-07-31 Thread Don Lee
If you go back a few years, you can find a thread where I reported tstile 
lockups on PPC. I don’t remember the details, but it was back in 6.1 as I 
recall. This is not a new problem, and not limited to NFS. I still have a 
similar problem with my 7.2 system, usually triggered when I do backups 
(dump/restore). The dump operation locks up and cannot be killed. The system 
continues, except any process that trips over the tstile also locks up. 
Eventually, the system grinds to a complete halt. (can’t even log in) If I 
catch it before that point, I can almost reboot, but I have to power cycle to 
kill the tstile process(es), or the reboot also hangs.

-dgl-

> On Jul 31, 2019, at 9:11 AM, Jason Thorpe  wrote:
> 
> 
>> On Jul 31, 2019, at 1:45 AM, Edgar Fuß  wrote:
>> 
>> NetBSD design error (or so Taylor says) that a vnode lock may be held 
>> across I/O
> 
> 100%
> 
> NetBSD's VFS locking protocol needs a serious overhaul.  At least one other 
> BSD-family VFS (the one in XNU) completely eliminated locking of vnodes at 
> the VFS layer (it's all pushed into the file system back-ends who now have 
> more control over their own locking requirements).  It does have some 
> additional complexities around reference / busy counting and vnode identity, 
> but it works very well in practice.
> 
> I don't know what FreeBSD has done in this area.
> 
> -- thorpej
> 



Re: NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-07-31 Thread Jason Thorpe


> On Jul 31, 2019, at 1:45 AM, Edgar Fuß  wrote:
> 
> NetBSD design error (or so Taylor says) that a vnode lock may be held across 
> I/O

100%

NetBSD's VFS locking protocol needs a serious overhaul.  At least one other 
BSD-family VFS (the one in XNU) completely eliminated locking of vnodes at the 
VFS layer (it's all pushed into the file system back-ends who now have more 
control over their own locking requirements).  It does have some additional 
complexities around reference / busy counting and vnode identity, but it works 
very well in practice.

I don't know what FreeBSD has done in this area.

-- thorpej



NFS lockup after UDP fragments getting lost (was: 8.1 tstile lockup after nfs send error 51)

2019-07-31 Thread Edgar Fuß
Thanks to riastradh@, this turned out to be caused by an (UDP, hard) NFS mount 
combined with a mis-configured IPFilter that blocked all but the first fragment 
of a fragmented NFS reply (e.g., readdir) combined with a NetBSD design error 
(or so Taylor says) that a vnode lock may be held across I/O, in this case, 
network I/O.

It should be reproducible with a default NFS mount, a

    block in all with frag-body

IPFilter rule, and then trying to readdir.
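
Spelled out, the recipe would look roughly like this (a sketch;
server and paths are hypothetical):

    # install the mis-configured rule that drops all but the
    # first fragment
    echo 'block in all with frag-body' | ipf -f -
    # default (UDP, hard) NFS mount
    mount -t nfs server:/export /mnt
    # any fragmented reply, e.g. a large readdir, should now wedge
    ls /mnt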

Now, in some cases, the machine in question recovered after fixing the filter 
rules; in others, it didn't, forcing a reboot. This strikes me as a bug, because 
the same lock-up could just as well have been caused by network problems instead 
of ipfilter mis-configuration.

It looks like the operation whose reply was lost sometimes doesn't get 
retried. Do we have some weird bug where the first fragment arriving stops the 
timeout, but the blocking of the remaining fragments causes it to wedge?
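
One thing I could check is whether the client retransmits at all,
e.g. by watching the RPC counters while the mount is wedged (a
sketch; counter layout per nfsstat(1)):

    # print NFS statistics at one-second intervals; if the retry
    # machinery works, the Retries/Timeouts counters should climb
    nfsstat -w 1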


Re: 8.1 tstile lockup after nfs send error 51

2019-07-29 Thread Edgar Fuß
Here are stack traces of all the frozen processes (with a few newlines 
inserted manually):

Crash version 8.1_STABLE, image version 8.1_STABLE.
Output from a running system is unreliable.

crash> trace/t 0t16306
trace: pid 16306 lid 1 at 0x8001578daa20
sleepq_block() at sleepq_block+0x97
turnstile_block() at turnstile_block+0x24f
rw_vector_enter() at rw_vector_enter+0x120
genfs_lock() at genfs_lock+0x47
VOP_LOCK() at VOP_LOCK+0x80
vn_lock() at vn_lock+0x11
nfs_root() at nfs_root+0x2b
VFS_ROOT() at VFS_ROOT+0x35
lookup_once() at lookup_once+0x250
namei_tryemulroot.constprop.9() at namei_tryemulroot.constprop.9+0x3b1
namei() at namei+0x43
fd_nameiat.isra.2() at fd_nameiat.isra.2+0x54
do_sys_statat() at do_sys_statat+0x77
sys___stat50() at sys___stat50+0x28
syscall() at syscall+0x1ec
--- syscall (number 439) ---
787fc74f98ca:

crash> trace/t 0t16685
trace: pid 16685 lid 1 at 0x80015791aae0
sleepq_block() at sleepq_block+0x97
turnstile_block() at turnstile_block+0x24f
rw_vector_enter() at rw_vector_enter+0x120
genfs_lock() at genfs_lock+0x47
VOP_LOCK() at VOP_LOCK+0x80
vn_lock() at vn_lock+0x11
nfs_root() at nfs_root+0x2b
VFS_ROOT() at VFS_ROOT+0x35
lookup_once() at lookup_once+0x250
namei_tryemulroot.constprop.9() at namei_tryemulroot.constprop.9+0x3b1
namei() at namei+0x43
chdir_lookup() at chdir_lookup+0x70
sys_chdir() at sys_chdir+0x26
syscall() at syscall+0x1ec
--- syscall (number 12) ---
758c3b8820ca:

crash> trace/t 0t19979
trace: pid 19979 lid 1 at 0x80013ed7ea20
sleepq_block() at sleepq_block+0x97
turnstile_block() at turnstile_block+0x24f
rw_vector_enter() at rw_vector_enter+0x120
genfs_lock() at genfs_lock+0x47
VOP_LOCK() at VOP_LOCK+0x80
vn_lock() at vn_lock+0x11
nfs_root() at nfs_root+0x2b
VFS_ROOT() at VFS_ROOT+0x35
lookup_once() at lookup_once+0x250
namei_tryemulroot.constprop.9() at namei_tryemulroot.constprop.9+0x3b1
namei() at namei+0x43
fd_nameiat.isra.2() at fd_nameiat.isra.2+0x54
do_sys_statat() at do_sys_statat+0x77
sys___lstat50() at sys___lstat50+0x25
syscall() at syscall+0x1ec
--- syscall (number 441) ---
7c0a9c4a1a7a:

crash> trace/t 0t25025
trace: pid 25025 lid 1 at 0x8001576628a0
sleepq_block() at sleepq_block+0x97
turnstile_block() at turnstile_block+0x24f
rw_vector_enter() at rw_vector_enter+0x120
genfs_lock() at genfs_lock+0x47
VOP_LOCK() at VOP_LOCK+0x80
vn_lock() at vn_lock+0x11
nfs_lookup() at nfs_lookup+0xf84
VOP_LOOKUP() at VOP_LOOKUP+0x85
lookup_once() at lookup_once+0x18d
namei_tryemulroot.constprop.9() at namei_tryemulroot.constprop.9+0x3b1
namei() at namei+0x43
fd_nameiat.isra.2() at fd_nameiat.isra.2+0x54
do_sys_statat() at do_sys_statat+0x77
sys___stat50() at sys___stat50+0x28
syscall() at syscall+0x1ec
--- syscall (number 439) ---
6fd53d4f98ca:

crash> trace/t 0t28321
trace: pid 28321 lid 1 at 0x80013f8768a0
sleepq_block() at sleepq_block+0x97
turnstile_block() at turnstile_block+0x24f
rw_vector_enter() at rw_vector_enter+0x120
genfs_lock() at genfs_lock+0x47
VOP_LOCK() at VOP_LOCK+0x80
vn_lock() at vn_lock+0x11
nfs_lookup() at nfs_lookup+0xf84
VOP_LOOKUP() at VOP_LOOKUP+0x85
lookup_once() at lookup_once+0x18d
namei_tryemulroot.constprop.9() at namei_tryemulroot.constprop.9+0x3b1
namei() at namei+0x43
fd_nameiat.isra.2() at fd_nameiat.isra.2+0x54
do_sys_statat() at do_sys_statat+0x77
sys___stat50() at sys___stat50+0x28
syscall() at syscall+0x1ec
--- syscall (number 439) ---
7656ab6f98ca:

crash> trace/t 0t18618
trace: pid 18618 lid 1 at 0x80013f9b2a20
sleepq_block() at sleepq_block+0x97
turnstile_block() at turnstile_block+0x24f
rw_vector_enter() at rw_vector_enter+0x120
genfs_lock() at genfs_lock+0x47
VOP_LOCK() at VOP_LOCK+0x80
vn_lock() at vn_lock+0x11
nfs_root() at nfs_root+0x2b
VFS_ROOT() at VFS_ROOT+0x35
lookup_once() at lookup_once+0x250
namei_tryemulroot.constprop.9() at namei_tryemulroot.constprop.9+0x3b1
namei() at namei+0x43
fd_nameiat.isra.2() at fd_nameiat.isra.2+0x54
do_sys_statat() at do_sys_statat+0x77
sys___lstat50() at sys___lstat50+0x25
syscall() at syscall+0x1ec
--- syscall (number 441) ---
7cbc7f0a1a7a:

crash> trace/t 0t22881
trace: pid 22881 lid 1 at 0x80013eef6a20
sleepq_block() at sleepq_block+0x97
turnstile_block() at turnstile_block+0x24f
rw_vector_enter() at rw_vector_enter+0x120
genfs_lock() at genfs_lock+0x47
VOP_LOCK() at VOP_LOCK+0x80
vn_lock() at vn_lock+0x11
nfs_root() at nfs_root+0x2b
VFS_ROOT() at VFS_ROOT+0x35
lookup_once() at lookup_once+0x250
namei_tryemulroot.constprop.9() at namei_tryemulroot.constprop.9+0x3b1
namei() at namei+0x43
fd_nameiat.isra.2() at fd_nameiat.isra.2+0x54
do_sys_statat() at do_sys_statat+0x77
sys___stat50() at sys___stat50+0x28
syscall() at syscall+0x1ec
--- syscall (number 439) ---
792b4e4f98ca:


8.1 tstile lockup after nfs send error 51

2019-07-29 Thread Edgar Fuß
I experienced an "nfs send error 51" on an NFS-imported file system, 
and after that, any process accessing that FS seems to be frozen in tstile.

Any way out short of re-booting?

Anything to analyze before?
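
The obvious starting point seems to be the crash(8) drill from the
older tstile threads (the address being whatever ps prints for a
stuck LWP):

    crash
    crash> ps                   # look for LWPs waiting on "tstile"
    crash> t/a <address>        # stack trace of one stuck LWP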


Re: tstile lockup

2012-11-27 Thread Lars Heidieker
On 11/23/2012 05:06 PM, Edgar Fuß wrote:
>> Try running `svn ...' as `lockstat -T rwlock svn ...'.  With luck, we'll get
>> more information on lock congestion.
> Ouch! I overlooked this post of yours until a colleague asked me about it.
> 
> Elapsed time: 18.33 seconds.
> 
> -- RW lock sleep (reader)
> 
> Total%  Count   Time/ms  Lock          Caller
> ------  -----  --------  ------------  --------------
> 100.00      1  11696.42  fe8419e266f8  wapbl_begin+54
> 
> Does that help?
> 

Hi,

there is one rw_lock that is taken in read mode in wapbl_begin, and that
lock is held in writer mode across log flushing.
So while a transaction is being flushed, further scheduling of transaction
data is on hold; that might be the cause.

Lars

-- 


Mystical explanations:
Mystical explanations are considered deep;
the truth is that they are not even superficial.

   -- Friedrich Nietzsche
   [ Die Fröhliche Wissenschaft (The Gay Science), Book 3, 126 ]


Re: tstile lockup

2012-11-24 Thread haad
Can you use addr2line with that wapbl address to find out what line it is?
On Nov 23, 2012 5:06 PM, "Edgar Fuß"  wrote:

> > Try running `svn ...' as `lockstat -T rwlock svn ...'.  With luck, we'll
> > get more information on lock congestion.
> Ouch! I overlooked this post of yours until a colleague asked me about it
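
Roughly, the steps would be (a sketch; the kernel image with debug
info is whatever your build produced, e.g. netbsd.gdb):

    # base address of wapbl_begin in the running kernel
    nm /netbsd | grep ' wapbl_begin$'
    # add the +54 offset from the lockstat output, then resolve
    # the absolute address against the debug kernel
    addr2line -f -e netbsd.gdb <base+offset>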

Re: tstile lockup

2012-11-23 Thread Edgar Fuß
> Try running `svn ...' as `lockstat -T rwlock svn ...'.  With luck, we'll get more
> information on lock congestion.
Ouch! I overlooked this post of yours until a colleague asked me about it.

Elapsed time: 18.33 seconds.

-- RW lock sleep (reader)

Total%  Count   Time/ms  Lock          Caller
------  -----  --------  ------------  --------------
100.00      1  11696.42  fe8419e266f8  wapbl_begin+54

Does that help?


Re: tstile lockup

2012-11-19 Thread J. Hannken-Illjes
On Nov 19, 2012, at 6:40 PM, Edgar Fuß  wrote:

>> Do you get a deadlock
> No.
> 
>> will the system come back to work after some time?
> Yes. At least for appropriate values of "some time".
> This may take minutes (at least in the dd case, I haven't seen this in the 
> svn case).


Try running `svn ...' as `lockstat -T rwlock svn ...'.  With luck, we'll get more
information on lock congestion.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: tstile lockup

2012-11-19 Thread Edgar Fuß
> Do you get a deadlock
No.

> will the system come back to work after some time?
Yes. At least for appropriate values of "some time".
This may take minutes (at least in the dd case, I haven't seen this in the 
svn case).


Re: tstile lockup

2012-11-19 Thread J. Hannken-Illjes
On Nov 19, 2012, at 4:53 PM, Edgar Fuß  wrote:

> OK, this is the svn process (directly running on the file server, not 
> operating 
> via NFS) tstile-ing:
> 
> crash> ps | grep "\(vnode\|tstile\)"
> 25051  1  3  0  0  fe82ec17d200  svn  tstile
> crash> t/a fe82ec17d200
> trace: pid 25051 lid 1 at 0xfe811e901700
> sleepq_block() at sleepq_block+0xad
> turnstile_block() at turnstile_block+0x2bb
> rw_vector_enter() at rw_vector_enter+0x1eb
> wapbl_begin() at wapbl_begin+0x54

> rw_vector_enter() at rw_vector_enter+0x1eb
> genfs_lock() at genfs_lock+0x7b
> VOP_LOCK() at VOP_LOCK+0x32
> vn_lock() at vn_lock+0x7d
> vget() at vget+0xa2
> ufs_ihashget() at ufs_ihashget+0x84
> ffs_vget() at ffs_vget+0xc1
> ufs_fhtovp() at ufs_fhtovp+0x1f
> ffs_fhtovp() at ffs_fhtovp+0x55
> VFS_FHTOVP() at VFS_FHTOVP+0x1c
> nfsrv_fhtovp() at nfsrv_fhtovp+0x9a
> nfsrv_getattr() at nfsrv_getattr+0x13a
> nfssvc_nfsd() at nfssvc_nfsd+0x1d7
> sys_nfssvc() at sys_nfssvc+0x22c
> syscall() at syscall+0xc4
> crash> ps | grep "\(vnode\|tstile\)"
> 442    4  3  5  0  fe841628c060  slave  tstile
> 
> These are somewhat difficult to catch in the act since you have to copy-paste
> the address from the ps output into the t command.
> 
> Now, does that help to diagnose the problem?

Ok, so you see threads waiting for the wapbl rwlock and
some vnode rwlock.  This is ok as long as it doesn't deadlock.

Do you get a deadlock or will the system come back to work
after some time?

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: tstile lockup

2012-11-19 Thread Edgar Fuß
OK, this is the svn process (directly running on the file server, not operating 
via NFS) tstile-ing:

crash> ps | grep "\(vnode\|tstile\)"
25051  1  3  0  0  fe82ec17d200  svn  tstile
crash> t/a fe82ec17d200
trace: pid 25051 lid 1 at 0xfe811e901700
sleepq_block() at sleepq_block+0xad
turnstile_block() at turnstile_block+0x2bb
rw_vector_enter() at rw_vector_enter+0x1eb
wapbl_begin() at wapbl_begin+0x54
ffs_valloc() at ffs_valloc+0x5e
ufs_makeinode() at ufs_makeinode+0x62
ufs_create() at ufs_create+0x5c
VOP_CREATE() at VOP_CREATE+0x38
vn_open() at vn_open+0x2ed
do_open() at do_open+0xcd
sys_open() at sys_open+0x59
syscall() at syscall+0xc4
crash> ps | grep "\(vnode\|tstile\)"
25051  1  3  0  0  fe82ec17d200  svn  tstile

And this is both the local svn and one nfsd slave tstile-ing:
crash> ps | grep "\(vnode\|tstile\)"
10532  1  3  0  0  fe82eca5d5c0  svn    tstile
442    4  3  5  0  fe841628c060  slave  tstile
crash> t/a fe841628c060
trace: pid 442 lid 4 at 0xfe811dbfe4e0
sleepq_block() at sleepq_block+0xad
turnstile_block() at turnstile_block+0x2bb
rw_vector_enter() at rw_vector_enter+0x1eb
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x32
vn_lock() at vn_lock+0x7d
vget() at vget+0xa2
ufs_ihashget() at ufs_ihashget+0x84
ffs_vget() at ffs_vget+0xc1
ufs_fhtovp() at ufs_fhtovp+0x1f
ffs_fhtovp() at ffs_fhtovp+0x55
VFS_FHTOVP() at VFS_FHTOVP+0x1c
nfsrv_fhtovp() at nfsrv_fhtovp+0x9a
nfsrv_getattr() at nfsrv_getattr+0x13a
nfssvc_nfsd() at nfssvc_nfsd+0x1d7
sys_nfssvc() at sys_nfssvc+0x22c
syscall() at syscall+0xc4
crash> ps | grep "\(vnode\|tstile\)"
442    4  3  5  0  fe841628c060  slave  tstile

These are somewhat difficult to catch in the act since you have to copy-paste
the address from the ps output into the t command.

Now, does that help to diagnose the problem?


Re: tstile lockup

2012-11-19 Thread Martin Husemann
On Mon, Nov 19, 2012 at 12:59:13PM +0100, Edgar Fuß wrote:
> Should I try to get crash(8) outputs of the "real world" situation?

I guess that would be good - even if only to verify this is related or
not.

Martin


Re: tstile lockup

2012-11-19 Thread Edgar Fuß
> Why do you think both lockups are related?
Because the "real world" problem also involves large amounts of metadata being 
written and also results in nfsd's stuck in tstile.
Should I try to get crash(8) outputs of the "real world" situation?


Re: tstile lockup

2012-11-19 Thread Martin Husemann
On Mon, Nov 19, 2012 at 12:31:47PM +0100, Edgar Fuß wrote:
> The problem is that this "lock-up", artificial as the dd to the block device 
> may seem, appears to happen "real-world" during an svn update command: the
> other nfsd threads get stuck to the point where other clients get "nfs server 
> not responding" messages.

Why do you think both lockups are related?

Martin


Re: tstile lockup

2012-11-19 Thread Edgar Fuß
On Wed, Oct 31, 2012 at 05:42:12PM +0100, Edgar Fuß wrote:
> > Invoke crash(8), then just perform "ps" and "t/a address" on each LWP
> > which seems to be stuck (on "tstile" or elsewhere).
> So it seems I can sort of lock up the machine for minutes with a simple
>   dd if=/dev/zero of=/dev/dk14 bs=64k count=1000
> (In case it matters, dk14 is on a RAID5 on 4+1 mpt(4) SAS discs)
> I get an impressive dd throughput of around 350kB/s
> 
> Here's the crash(8) output during one of the partial lock-ups:
[...]
> Does that help?
Apparently not.
What can I do to debug this?
The problem is that this "lock-up", artificial as the dd to the block device 
may seem, appears to happen "real-world" during an svn update command: the
other nfsd threads get stuck to the point where other clients get "nfs server 
not responding" messages.


Re: RAIDframe level 5 write performance (was: tstile lockup)

2012-11-02 Thread Thor Lancelot Simon
On Fri, Nov 02, 2012 at 06:02:01PM +0100, Edgar Fuß wrote:
> 
> Writing to that RAID's block device (raid2d) in 64k blocks gives me a dazzling
> throughput of 2.4MB/s and a dd mostly waiting in vnode.

Writing to the block device from userspace is not a good idea.  How is
performance through the filesystem?

The "block device" actually accesses the metadata cache.  It is not intended
for bulk writes and allocates memory in the kernel at every request.  It's
fine for what the filesystem needs it for, but not for what you're trying
to use it for.
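
The comparison I have in mind would be something like (partition
letters hypothetical; newfs is destructive):

    newfs /dev/rraid2a               # put a filesystem on the set
    mount /dev/raid2a /mnt
    dd if=/dev/zero of=/mnt/test bs=64k count=1000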

Thor


RAIDframe level 5 write performance (was: tstile lockup)

2012-11-02 Thread Edgar Fuß
There seems to be a fundamental problem with writing to a level 5 RAIDframe set,
at least to the block device.

I've created five small wedges in the spared-out region of my 3TB SAS discs.
In case it matters, they are connected to an mpt(4) controller.

Then I configured a 5-component, 32-SpSU, level 5 RAID set.
Unless I've gone mad, 32 sectors per SU equal 16k per SU (512-byte sectors),
i.e. 64k of data per stripe across the four data components.

Writing to that RAID's block device (raid2d) in 64k blocks gives me a dazzling
throughput of 2.4MB/s and a dd mostly waiting in vnode.
Writing to the raw device (rraid2d) gives 240MB/s, i.e. is two orders of
magnitude faster.
Reading is at 22MB/s for the block device and 315MB/s for the raw device.
Writing 16k chunks to the block dev drops to 530kB/s.
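
In command form, the measurements above were roughly (a sketch;
the actual transfer counts were larger):

    dd if=/dev/zero of=/dev/raid2d bs=64k count=1000    # block write: ~2.4MB/s
    dd if=/dev/zero of=/dev/rraid2d bs=64k count=1000   # raw write:   ~240MB/s
    dd if=/dev/raid2d of=/dev/null bs=64k count=1000    # block read:  ~22MB/s
    dd if=/dev/rraid2d of=/dev/null bs=64k count=1000   # raw read:    ~315MB/s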

To make sure there's nothing wrong with my discs, I then configured a two-
component level 1 array with 128 SpSU (again giving a 64k stripe size, I hope).
With that, I write 57MB/s to the block and 85MB/s to the raw device.
Reading is 21MB/s for the block dev and 87MB/s for the raw dev.

So there's nothing wrong with my discs, I think; there's something wrong with
level 5 RAIDframe. My impression is that NetBSD 4 didn't have that issue but
I can't test because the discs on that machine are too large.

How can I analyze/debug this? Performance is so insanely bad I cannot possibly
put this file server into production.


tstile lockup (was: Serious WAPL performance problems)

2012-10-31 Thread Edgar Fuß
> Invoke crash(8), then just perform "ps" and "t/a address" on each LWP
> which seems to be stuck (on "tstile" or elsewhere).
So it seems I can sort of lock up the machine for minutes with a simple
    dd if=/dev/zero of=/dev/dk14 bs=64k count=1000
(In case it matters, dk14 is on a RAID5 on 4+1 mpt(4) SAS discs.)
I get an impressive dd throughput of around 350kB/s.

Here's the crash(8) output during one of the partial lock-ups:

crash> ps | fgrep vnode
25995  1  3  1  100  fe8414a24100  dd  vnode
crash> t/a fe8414a24100
trace: pid 25995 lid 1 at 0xfe811e7de8f0
sleepq_block() at sleepq_block+0xad
cv_wait() at cv_wait+0x9f
vflushbuf() at vflushbuf+0x10f
ffs_spec_fsync() at ffs_spec_fsync+0x3e
VOP_FSYNC() at VOP_FSYNC+0x3e
vinvalbuf() at vinvalbuf+0xf2
spec_close() at spec_close+0x73
VOP_CLOSE() at VOP_CLOSE+0x33
vn_close() at vn_close+0x4e
closef() at closef+0x4a
fd_close() at fd_close+0x116
syscall() at syscall+0xc4
crash> ps | fgrep tstile
10629  5  3  5  0  fe838c989680  slave  tstile
10629  4  3  3  0  fe838c989260  slave  tstile
10629  3  3  4  0  fe8415e82060  slave  tstile
10629  2  3  2  0  fe8267e74ac0  slave  tstile
crash> t/a fe838c989680
trace: pid 10629 lid 5 at 0xfe811e25e4f0
sleepq_block() at sleepq_block+0xad
turnstile_block() at turnstile_block+0x2bb
rw_vector_enter() at rw_vector_enter+0x1eb
wapbl_begin() at wapbl_begin+0x54
ffs_write() at ffs_write+0x8b9
VOP_WRITE() at VOP_WRITE+0x37
nfsrv_write() at nfsrv_write+0x923
nfssvc_nfsd() at nfssvc_nfsd+0x1d7
sys_nfssvc() at sys_nfssvc+0x22c
syscall() at syscall+0xc4
crash> t/a fe838c989260
trace: pid 10629 lid 4 at 0xfe811e27e4f0
sleepq_block() at sleepq_block+0xad
turnstile_block() at turnstile_block+0x2bb
rw_vector_enter() at rw_vector_enter+0x1eb
wapbl_flush() at wapbl_flush+0x54
ffs_fsync() at ffs_fsync+0x112
VOP_FSYNC() at VOP_FSYNC+0x3e
nfsrv_commit() at nfsrv_commit+0x69a
nfssvc_nfsd() at nfssvc_nfsd+0x1d7
sys_nfssvc() at sys_nfssvc+0x22c
syscall() at syscall+0xc4
crash> t/a fe8415e82060
trace: pid 10629 lid 3 at 0xfe811dbde3c0
sleepq_block() at sleepq_block+0xad
turnstile_block() at turnstile_block+0x2bb
rw_vector_enter() at rw_vector_enter+0x1eb
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x32
vn_lock() at vn_lock+0x7d
vget() at vget+0xa2
ufs_ihashget() at ufs_ihashget+0x84
ffs_vget() at ffs_vget+0xc1
ufs_fhtovp() at ufs_fhtovp+0x1f
ffs_fhtovp() at ffs_fhtovp+0x55
VFS_FHTOVP() at VFS_FHTOVP+0x1c
nfsrv_fhtovp() at nfsrv_fhtovp+0x9a
nfsrv_write() at nfsrv_write+0x502
nfssvc_nfsd() at nfssvc_nfsd+0x1d7
sys_nfssvc() at sys_nfssvc+0x22c
syscall() at syscall+0xc4
crash> t/a fe8267e74ac0
trace: pid 10629 lid 2 at 0xfe811e29e3c0
sleepq_block() at sleepq_block+0xad
turnstile_block() at turnstile_block+0x2bb
rw_vector_enter() at rw_vector_enter+0x1eb
genfs_lock() at genfs_lock+0x7b
VOP_LOCK() at VOP_LOCK+0x32
vn_lock() at vn_lock+0x7d
vget() at vget+0xa2
ufs_ihashget() at ufs_ihashget+0x84
ffs_vget() at ffs_vget+0xc1
ufs_fhtovp() at ufs_fhtovp+0x1f
ffs_fhtovp() at ffs_fhtovp+0x55
VFS_FHTOVP() at VFS_FHTOVP+0x1c
nfsrv_fhtovp() at nfsrv_fhtovp+0x9a
nfsrv_write() at nfsrv_write+0x502
nfssvc_nfsd() at nfssvc_nfsd+0x1d7
sys_nfssvc() at sys_nfssvc+0x22c
syscall() at syscall+0xc4
crash> ps | fgrep tstile
10629  5  3  5  0  fe838c989680  slave  tstile
10629  4  3  3  0  fe838c989260  slave  tstile
10629  3  3  4  0  fe8415e82060  slave  tstile
10629  2  3  2  0  fe8267e74ac0  slave  tstile
crash> ps | fgrep vnode
25995  1  3  1  100  fe8414a24100  dd  vnode
crash> t/a fe8414a24100
trace: pid 25995 lid 1 at 0xfe811e7de8f0
sleepq_block() at sleepq_block+0xad
cv_wait() at cv_wait+0x9f
vflushbuf() at vflushbuf+0x10f
ffs_spec_fsync() at ffs_spec_fsync+0x3e
VOP_FSYNC() at VOP_FSYNC+0x3e
vinvalbuf() at vinvalbuf+0xf2
spec_close() at spec_close+0x73
VOP_CLOSE() at VOP_CLOSE+0x33
vn_close() at vn_close+0x4e
closef() at closef+0x4a
fd_close() at fd_close+0x116
syscall() at syscall+0xc4
crash> ps | fgrep bio
0  81  3  2  200  fe811dcc7b20  ioflush  biolock
0  43  1  5  200  fe811d3631a0  softbio/5
0  37  1  4  200  fe811d33c160  softbio/4
0  31  1  3  200  fe811d318120  softbio/3
0  25  1  2  200  fe811d2f10e0  softbio/2
0  19  1  1  200  fe811d2cd0a0  softbio/1
0   4  1  0  200  fe841df43000  softbio/0
crash> t/a fe811dcc7b20
trace: pid 0 lid 81 at 0xfe811d8de980
sleepq_block() at sleepq_block+0xad
cv_timedwait() at cv_timedwait+0xa7
bbusy() at bbusy+0xe4
getblk() at getblk+0x60
bio_doread.clone.1() at bio_doread.clone.1+0x1d
bread() at bread+0x1a
ffs_update() at ffs_update+0x12c
ffs_sync()