What Derrick Said. You have to leave a packet capture running continuously on the off chance that this might happen... Not just the first error packet, but the last couple of RPCs just before that. So you really want

1. a network monitor that implements stop triggers. These used to be rather expensive, but maybe Ethereal finally implemented them? I don't know.

2. a true broadcast network, or the ability to tap your switch, so you don't have to run monitor software directly on the fileserver.

Running software on the client is not likely to be useful unless you can reliably predict which system will be affected.
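To make the stop-trigger idea concrete, here is a minimal sketch in Python of what such a monitor does: it keeps only the most recent few packets in a ring buffer and stops as soon as the first error packet shows up, so the saved window holds the last couple of RPCs plus the error itself. Everything in it is an assumption for illustration: the packet records are made-up dicts, and `capture_until_trigger`, `is_error`, and `pre_window` are hypothetical names, not part of any real capture tool.

```python
from collections import deque


def capture_until_trigger(packets, is_trigger, pre_window=2):
    """Scan packets in order, remembering only the last `pre_window` of them.

    Stops at the first packet for which is_trigger(pkt) is true and returns
    the remembered packets followed by that trigger packet. Returns None if
    the stream ends without the trigger firing.
    """
    ring = deque(maxlen=pre_window)  # old packets fall off the front
    for pkt in packets:
        if is_trigger(pkt):
            return list(ring) + [pkt]
        ring.append(pkt)
    return None


def is_error(pkt):
    # Hypothetical trigger condition: the record is marked as an error.
    return pkt["kind"] == "error"


# Hypothetical traffic: seven normal RPCs, one error, then more traffic
# that arrives after contact is lost and is never examined.
traffic = [{"seq": n, "kind": "rpc"} for n in range(1, 8)]
traffic.append({"seq": 8, "kind": "error"})
traffic.append({"seq": 9, "kind": "rpc"})

window = capture_until_trigger(traffic, is_error, pre_window=2)
print([pkt["seq"] for pkt in window])
```

The point of the ring buffer is that the capture can run indefinitely without filling the disk, since nothing is kept until the trigger fires.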
Wait a sec. At this point, you're thinking you know which system will be affected, it's this one at 192.168.18.34, right? But what I'm saying is -- after you reboot that machine, and it comes back up and is running normally for a while, which client will be next to experience this bug? Is it always the same one? Even after reboots? If so, that is new, useful, and surprising information.

My experience was that the affected client would vary and not be particularly reproducible, which means that you have to monitor a whole lot of connections simultaneously, hence a tap on the switch. Make sense?

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Derrick J Brashear
Sent: Sunday, August 21, 2005 1:42 AM
To: [email protected]
Subject: Re: [OpenAFS-devel] "Lost contact with file server" problems

it needs to include the first error packet, e.g. the window where it loses contact, to be useful

once it's down, that's not interesting

Derrick

_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
