Admittedly there is WAY too much blood in my caffeine stream at the moment but ...
According to your email describing the Wireshark trace, the client sends a request, times out, retransmits, times out, and errors out; but the server sees the request, replies, sees the retransmit, and replies again. That sounds less like an NFS issue and more like a network issue. Assuming the Wireshark traffic is as I understand it from your email, I see only a couple of options:

1 - The reply packet is incorrectly addressed (bad ARP entry?).
2 - The reply is correctly addressed but 'lost' in network transit (collisions, spanning tree issue, ...).

The obvious third choice is: this analysis is faulty because either I have totally misread your email (likely, see caffeine above) or you have misrepresented the traffic (more fairly put: you may have left out details you thought non-essential, resulting in a potential misrepresentation of the traffic, and I fell for it).

So the first question is: do you agree with my statement of your traffic analysis? The second question (assuming 'yes', as 'no' shortens the discussion drastically <grin>) is: can you confirm/repeat it (did it happen that way, was it a one-time thing only in that instance, ...)?

Again assuming yes, I'd recommend you check network routing, then check for periods of high collisions or similar difficulties. Given that it works for a while and then fails, I'm less inclined to think basic routing/connectivity and more inclined to think something more 'intermittent', like discarded packets, dhcpd address changes, routing changes (RIP?), or even a weird switch spanning tree issue due to some cross-connected switches in the network (yes I've seen it, don't ask).

If I am incorrect in my assumptions, then can we have a clearer, more detailed description of the wire traffic, so we can understand which end (presumably the client?) has the issue and when/where in the conversation it occurs (connection setup, file attribute/access, directory attribute/access, data transfer, ...)?
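To make the "which end loses it" question concrete, here is a rough sketch in Python of how I would line the two captures up. It leans on the fact that an RPC retransmission reuses the transaction ID (XID) of the original call, so the same XID can be followed on both machines. Everything about the input is my own assumption: I'm pretending each capture has already been boiled down (by hand, or with a tool such as tshark) to a list of (xid, kind) pairs, where kind is 'call' or 'reply'. The function name and the verdict strings are made up for illustration.

```python
from collections import defaultdict


def classify_xids(client_records, server_records):
    """Compare RPC traffic seen on the client with traffic seen on the server.

    Each argument is an iterable of (xid, kind) pairs, where kind is the
    string 'call' or 'reply'. This record format is hypothetical; it is
    not something Wireshark produces directly.

    Returns a dict mapping every XID the client sent to a short verdict.
    """
    def summarize(records):
        # Count calls and replies per XID; retransmits share the XID,
        # so they simply raise the 'call' count.
        counts = defaultdict(lambda: {"call": 0, "reply": 0})
        for xid, kind in records:
            counts[xid][kind] += 1
        return counts

    client = summarize(client_records)
    server = summarize(server_records)

    verdicts = {}
    for xid, seen in client.items():
        srv = server.get(xid, {"call": 0, "reply": 0})
        if seen["reply"] > 0:
            verdicts[xid] = "ok"
        elif srv["call"] == 0:
            verdicts[xid] = "request lost before server"
        elif srv["reply"] == 0:
            verdicts[xid] = "server saw request but did not reply"
        else:
            verdicts[xid] = "reply lost between server and client"
    return verdicts


# Made-up data shaped like the situation described in the email:
# XID 0x1a is sent twice and never answered as far as the client can
# tell, while the server answered both copies.
client = [(0x1A, "call"), (0x1A, "call"), (0x2B, "call"), (0x2B, "reply")]
server = [
    (0x1A, "call"), (0x1A, "reply"),
    (0x1A, "call"), (0x1A, "reply"),
    (0x2B, "call"), (0x2B, "reply"),
]
verdicts = classify_xids(client, server)
for xid, verdict in sorted(verdicts.items()):
    print(hex(xid), verdict)
```

If every failing XID comes out as "reply lost between server and client", that points at options 1 and 2 above rather than at NFS itself. It only narrows things down, though: a reply that leaves the server's capture point can still die anywhere along the path back.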
I'm NOT an NFS guy, but assuming it isn't a network issue, it seems likely that knowing whether it is always the same NFS operation that fails, and which operation that is, might help (assuming it is possible to tell that from logs or a trace or ...). Like I said, I'm not an NFS guy and have no idea what NFS data it is possible to gather OR how much of a pain in the butt it is to gather it. I'm only asking based on the general principle that narrowing the scope of inquiry to a single operation/function/code section is usually a good thing; basically a 'the more detailed data available the better' request. So please keep the initial NFS-specific data gathering to a low pain level until someone with more NFS experience jumps in with better-defined data requests.

On 2010-04-01 19:43, Michael ODonnell wrote:
> I gathered some Enet traffic for Wireshark analysis on both machines thus:
>
>    dumpcap -i eth0 -w /tmp/`hostname`.pcap
>
> ...and viewed the client traffic with Wireshark, which (apparently)
> confirms that the client did indeed wait a while and then (apparently)
> retransmitted the NFS request. The weird thing is that Wireshark analysis
> of corresponding traffic on the server shows the first request coming in
> and being turned around immediately, then we later see the retransmitted
> request arrive and it, too, is promptly processed and the response goes
> out immediately.

_______________________________________________
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/