>> It's worth remembering that -o tcp is an option.
> Not for NFS through a (stateful) filtering router, no.
True.
But then, not over network hops that drop port 2049, either. Break the
assumptions underlying the 'net and you have to expect breakage from
stuff built atop it.
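For reference, forcing TCP for an NFS mount on the client looks something like
this (server name and paths are made up):
mount -t nfs -o tcp fileserver:/export /mnt
or, as an /etc/fstab line:
fileserver:/export /mnt nfs rw,tcp 0 0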
At 10:45 +0200 on 31.07.2019, Edgar Fuß wrote:
>Thanks to riastradh@, this turned out to be caused by an (UDP, hard) NFS
>mount combined with a mis-configured IPFilter that blocked all but the
>first fragment of a fragmented NFS reply (e.g., readdir) combined with a
>NetBSD design error (or so Taylor says) that a vnode lock may be held
>across I/O.
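The IPFilter side of this would need a rule that lets the non-first fragments
of the NFS replies through as well, presumably something like the following
sketch (interface name made up):
pass in quick on fxp0 proto udp from any to any port = 2049 keep state keep frags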
On Wed, Jul 31, 2019 at 07:11:54AM -0700, Jason Thorpe wrote:
>
> > On Jul 31, 2019, at 1:45 AM, Edgar Fuß wrote:
> >
> > NetBSD design error (or so Taylor says) that a vnode lock may be held
> > across I/O
>
> 100%
>
> NetBSD's VFS locking protocol needs a serious overhaul. At least one other
> BSD-family VFS (the one in XNU) completely eliminated locking of vnodes
On Wed, Jul 31, 2019 at 11:42:26AM -0500, Don Lee wrote:
> If you go back a few years, you can find a thread where I reported tstile
> lockups on PPC. I don’t remember the details, but it was back in 6.1 as I
> recall. This is not a new problem, and not limited to NFS. I still have a
> similar problem with my 7.2 system, usually triggered when I do backups
Here are stack traces of all the frozen processes (with a few newlines
inserted manually):
Crash version 8.1_STABLE, image version 8.1_STABLE.
Output from a running system is unreliable.
crash> trace/t 0t16306
trace: pid 16306 lid 1 at 0x8001578daa20
sleepq_block() at sleepq_block+0x97
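The warning comes from running crash(8) against the live kernel; for
consistent output one would point it at a dump saved by savecore(8) instead,
roughly like this (file names are the usual savecore defaults):
crash -M /var/crash/netbsd.0.core -N /var/crash/netbsd.0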
I experienced an "nfs send error 51" on an NFS-imported file system,
and after that, any process accessing that FS seems to be frozen in tstile.
Any way out short of re-booting?
Anything to analyze before that?
On 11/23/2012 05:06 PM, Edgar Fuß wrote:
Try running `svn ...' as `lockstat -T rwlock svn ...'. With luck, we'll get
more information on lock congestion.
Ouch! I overlooked this post of yours until a colleague asked me about it.
Elapsed time: 18.33 seconds.
-- RW lock sleep (reader)
Total% Count Time/ms Lock
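For the archives: lockstat(8) wraps the command to be measured, so a full
invocation would look something like this (working-copy path made up):
lockstat -T rwlock svn update /home/user/wc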
Can you use addr2line with that wapbl address to find out what line it is?
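Something along these lines, against a kernel image with debug info; the
address here is a made-up placeholder for the one in the lockstat output:
addr2line -f -e netbsd.gdb 0xffffffff80aa4b20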
On Wed, Oct 31, 2012 at 05:42:12PM +0100, Edgar Fuß wrote:
Invoke crash(8), then just perform ps and t/a address on each LWP
which seems to be stuck (on tstile or elsewhere).
So it seems I can sort of lock up the machine for minutes with a simple
dd if=/dev/zero of=/dev/dk14 bs=64k count=1000
(In case it matters, dk14 is on a RAID5 on 4+1 mpt(4)
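While such a dd is running, the pile-up should be visible from another shell,
since tstile shows up in the WCHAN column:
ps axl | grep tstile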
On Mon, Nov 19, 2012 at 12:31:47PM +0100, Edgar Fuß wrote:
The problem is that this lock-up, artificial as the dd to the block device
may seem, appears to happen real-world during an svn update command: the
other nfsd threads get stuck to the point where other clients get "nfs
server not responding".
Why do you think both lockups are related?
Because the real-world problem also involves large amounts of metadata being
written and also results in nfsds stuck in tstile.
Should I try to get crash(8) outputs of the real-world situation?
On Mon, Nov 19, 2012 at 12:59:13PM +0100, Edgar Fuß wrote:
Should I try to get crash(8) outputs of the real-world situation?
I guess that would be good - even if only to verify this is related or
not.
Martin
OK, this is the svn process (directly running on the file server, not
operating via NFS) tstile-ing:
crash ps | grep \(vnode\|tstile\)
25051 1 3 0 0 fe82ec17d200 svn tstile
crash t/a fe82ec17d200
trace: pid 25051 lid 1 at 0xfe811e901700
sleepq_block() at
Do you get a deadlock?
No.
Will the system come back to work after some time?
Yes. At least for appropriate values of "some time".
This may take minutes (at least in the dd case; I haven't seen this in the
svn case).
Try
There seems to be a fundamental problem with writing to a level 5 RAIDframe set,
at least to the block device.
I've created five small wedges in the spared-out region of my 3TB SAS discs.
In case it matters, they are connected to an mpt(4) controller.
Then I configured a 5-component, 32-SpSU,
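For what it's worth, a raidctl(8) config for such a set would look roughly
like this (component names made up; the layout line is sectors per stripe
unit, SUs per parity unit, SUs per recon unit, RAID level):
START array
1 5 0
START disks
/dev/dk15
/dev/dk16
/dev/dk17
/dev/dk18
/dev/dk19
START layout
32 1 1 5
START queue
fifo 100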
On Fri, Nov 02, 2012 at 06:02:01PM +0100, Edgar Fuß wrote:
Writing to that RAID's block device (raid2d) in 64k blocks gives me a dazzling
throughput of 2.4MB/s and a dd mostly waiting in vnode.
Writing to the block device from userspace is not a good idea. How is
performance through the file system?
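I.e., the comparison would be writing through the mounted file system instead
of the raw device, something like this (mount point made up):
dd if=/dev/zero of=/mnt/raid2/testfile bs=64k count=10000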