-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jul 19, 2006, at 8:48 AM, Scott Francis wrote:

On 7/18/06, R. Tyler Ballance <[EMAIL PROTECTED]> wrote:
Howdy,

I'm working on debugging a quirky bug (aren't they all) when using an
OpenBSD NFS client with a FreeBSD NFS server. I'm fairly sure it's
independent of the NFS server, but I can't say for certain because we
rely on FreeBSD servers; the Mac OS X and Red Hat NFS clients function
properly. I'm still working out specific, reliable reproduction steps
for the bug, but in short, it leaves the OpenBSD machine completely
frozen. Interestingly enough, the OpenBSD machine still responds to
pings over the network, but all physical and virtual terminals become
completely locked. (The one exception is the keyboard shortcut that
drops the machine into ddb when ddb.console=1.)

The basic question is, what are my options for pinpointing this bug?
If I remember correctly, I can set up ddb over a serial console
through some means, but the machine is atop a bookshelf and about
50ft from my workstation ;) I've examined the tcpdump output on the
server side of things, but nothing looks out of order there, with the
exception of the sudden drop in data being transferred. I'm wondering
if there's any way from ddb I can accurately gauge _where_ the lockup
is happening, and then of course, how it is happening ;)

You're on the right track with tcpdump, I think. I'd run it on the
OpenBSD client and write the output to a file; when/if the box freezes
again, you should be able to reboot and see at which point network
data stopped being logged for the client.
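
A minimal sketch of that capture, in Python. It only builds the tcpdump
command line. The interface name em0, the server address 192.0.2.10 and
the output path are hypothetical placeholders; -n, -i and -w are
standard tcpdump flags, and 2049 is the usual NFS port, though a given
setup may use others:

```python
NFS_PORT = 2049  # usual NFS port; an assumption about this setup


def tcpdump_argv(interface, server, outfile):
    """Build a tcpdump command that saves NFS traffic to `outfile`.

    `interface`, `server` and `outfile` are supplied by the caller;
    nothing here is taken from the original thread.
    """
    return [
        "tcpdump",
        "-n",              # do not resolve addresses to names
        "-i", interface,   # capture on this interface
        "-w", outfile,     # write raw packets to a file
        "host", server, "and", "port", str(NFS_PORT),
    ]
```

Run as root, for example with
subprocess.run(tcpdump_argv("em0", "192.0.2.10", "/var/tmp/nfs.pcap")).
After a hard freeze, packets still buffered in memory may never reach
the disk, so the tail of the file is only an approximation of the last
traffic seen.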

That's a novel idea, hadn't thought of it to be honest ;)

I'm still not convinced that this is a network-related problem at
all. I've yet to pin down exactly where the problem stems from, but it
seems to lie in how OpenBSD handles NFS mounts when certain actions are
performed and then interrupted. The real-world scenario for this bug is
a user uploading a large file: if the user is prematurely disconnected,
or interrupts the transfer for any reason, the OpenBSD client locks up.
The test case uses dd(1) to transfer a large amount of data to the NFS
mount; interrupting it with a SIGINT makes the machine lock up. Today
I'm testing whether actions like mv(1) or cp(1) from a local disk to
the NFS mount behave the same way when sent a SIGINT.
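
The dd(1) test case might be scripted roughly as follows. This is a
sketch, not the exact commands used: the function name interrupted_dd,
the block size, the default size, the delay and the timeout are all
assumptions, and the target is whatever file on the NFS mount you pick.

```python
import signal
import subprocess
import time


def interrupted_dd(target, total_mb=1024, delay=2.0, timeout=30.0):
    """Start dd writing zeros to `target`, send it SIGINT after `delay`
    seconds, and return dd's exit status.

    All defaults are illustrative assumptions, not values from the thread.
    A negative return value means dd was terminated by a signal.
    """
    argv = [
        "dd",
        "if=/dev/zero",
        "of=" + target,
        "bs=1048576",              # 1 MiB blocks, spelled out in bytes
        "count=" + str(total_mb),
    ]
    child = subprocess.Popen(argv, stderr=subprocess.DEVNULL)
    time.sleep(delay)              # let the transfer get under way
    child.send_signal(signal.SIGINT)
    # Raises subprocess.TimeoutExpired if dd never exits after the signal.
    return child.wait(timeout=timeout)
```

Calling interrupted_dd("/mnt/nfs/testfile") on the OpenBSD client (the
mount point is hypothetical) would exercise the interrupted write.
Swapping the argv for a cp or mv command would cover the other cases.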

Are you using soft/interruptible mounts on the server side? What
version of OpenBSD and NFS?

3.9-RELEASE on OpenBSD, and yes, interruptible mounts are enabled.

Cheers,

- -R. Tyler Ballance
Lead Developer, bleep. LLC
http://www.bleepsoft.com
iD8DBQFEvnoUqO6nEJfroRsRAjs2AJ9so78tFX4LY5vo4+VOGvdpKqpKGwCdG2+h
oz3962FQ2oMwZ7KFCVrfkJk=
=FLXw
-----END PGP SIGNATURE-----
