On 27 Apr 2012, at 19:11, Peter Wells wrote:

> I’m trying to work out why the fileserver keeps pausing like this.
I'm not sure how much of OpenAFS's file transport protocol you're aware of, so sorry if some of this retreads old ground. OpenAFS uses a UDP-based RPC mechanism called RX. RX provides a reliable connection layer on top of UDP by implementing its own acknowledgment and congestion control scheme. Originally this was pretty much unique, but over the years RX has converged more and more on a TCP-style mechanism for congestion control. Unfortunately, OpenAFS releases up until 1.6 were stuck in a neverworld between RX's old burst-based transmission algorithm and a TCP-style mechanism for flow control. The behaviour that you are seeing is a product of a number of unfortunate issues in the 1.4.x RX stack.

> The last packet before the pauses is an ACK from the client with a mixture of
> 32 +ve and -ve acknowledgements… then silence between the server and client
> for 1.2 seconds…

RX has what we term 'hard' and 'soft' ACKs. A hard ACK moves the congestion control window forwards; a soft ACK is roughly analogous to a TCP SACK - it implies that that packet has been received, but we have missing packets and so we cannot move the window forwards. In the 1.4 releases, the maximum window size is 32 packets, which is why you are stalling once there are 32 pending acknowledgments. (There's a rough sketch of the window mechanics at the end of this note.)

There is a bug in 1.4 which means that we don't immediately start retransmitting when it becomes obvious that packets have been missed (TCP will retransmit if more than 2 packets have been received subsequent to a missing packet). So, we have to wait until the packets time out. A timeout is a hard error: it forces the connection back into slow start (which drops the window size), and so you'll see transmission rates slowly ramp back up from here. (Again, see the retransmit sketch at the end of this note.)

> As the rate picks up, the client will NACK a data packet, and then subsequent
> ACK packets grow in length (in terms of the number of ACKS) until they reach
> 32, at which time there is another long pause.

What's interesting about this trace is how regular your stalls are. I can't easily explain this regularity, other than that it looks like the connection is regularly dropping particular packet types.

> Average rtt is 0.104, with 17838 samples
> Minimum rtt is 0.000, maximum is 2.147
>
> That’s a pretty large maximum rtt and I was wondering if this was somehow
> skewing the calculation of the retransmit timeout value, somehow causing the
> fileserver to snooze before suddenly realising it should be retransmitting
> packets.

RTT calculation in 1.4 is very, very broken, as it feeds far too many samples into the RTT algorithm. However, the effect here shouldn't be to inflate the RTT number itself, just to remove the smoothing factor. (The last sketch at the end of this note shows what that smoothing normally looks like.)

> Any thoughts you have will be much appreciated. The AFS versions are as
> follows in case it helps:

I would be very interested in seeing how 1.6.1 performs with this network configuration. It is unlikely that any work is going to get done on fixing the 1.4.x transport, but if you can reproduce these issues with 1.6, I'd really like to look at some packet traces and work out what's going on.

Cheers,

Simon.
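P.S. As promised, a very rough C sketch of the hard-vs-soft ACK window behaviour. This is illustrative only - the names (sender_state, handle_ack, MAX_WINDOW) are made up for the example and it is not the actual RX source - but it shows why a sender with a 32-packet window stalls once every outstanding packet has only been soft-acked.

    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_WINDOW 32              /* 1.4.x maximum congestion window */

    struct sender_state {
        uint32_t firstUnacked;         /* lowest sequence number not yet hard-acked */
        uint32_t nextToSend;           /* next sequence number we would transmit */
    };

    /* Room to put another data packet on the wire? */
    static bool
    window_has_room(const struct sender_state *s)
    {
        return (s->nextToSend - s->firstUnacked) < MAX_WINDOW;
    }

    /*
     * A hard ack acknowledges everything up to and including 'seq', so the
     * window slides forward.  A soft ack (like a TCP SACK) only says "I have
     * seen this later packet"; the hole before it keeps the window pinned,
     * and once 32 packets are outstanding the sender simply stops sending.
     */
    static void
    handle_ack(struct sender_state *s, uint32_t seq, bool hard)
    {
        if (hard && seq >= s->firstUnacked)
            s->firstUnacked = seq + 1;     /* window advances */
        /* soft ack: note that the packet arrived, but firstUnacked stays put */
    }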
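The retransmit sketch, in the same spirit (again illustrative, not RX code). TCP-style fast retransmit resends a missing packet once a few later packets have been acknowledged; the 1.4 behaviour described above is effectively the "do nothing until the timer fires" path, at which point the window collapses back to slow start.

    #include <stdbool.h>
    #include <stdint.h>

    #define RETRANSMIT_THRESHOLD 2     /* TCP-style: resend after >2 later acks */

    struct tx_packet {
        uint32_t seq;
        bool     acked;                /* has the peer acknowledged this packet? */
        unsigned ackedAfter;           /* packets sent after it that have been acked */
    };

    /*
     * Fast retransmit: the packet is still unacked but more than
     * RETRANSMIT_THRESHOLD later packets have been acknowledged, so it is
     * almost certainly lost - resend it now and keep the window open.
     * Without this check, nothing happens until the retransmit timeout
     * fires, which is treated as a hard error.
     */
    static bool
    should_fast_retransmit(const struct tx_packet *p)
    {
        return !p->acked && p->ackedAfter > RETRANSMIT_THRESHOLD;
    }

    /* On a retransmit timeout the window collapses and has to ramp up again. */
    static void
    on_retransmit_timeout(uint32_t *cwnd, uint32_t *ssthresh)
    {
        *ssthresh = (*cwnd / 2 > 2) ? *cwnd / 2 : 2;
        *cwnd = 1;                     /* back to slow start */
    }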
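And the RTT sketch. This is the textbook Jacobson/Karels smoothed estimator, with the usual names and gains rather than RX's; the point is that the exponential averages only smooth anything if samples arrive at a sensible rate. Feed in far too many samples, as 1.4 effectively does, and the "smoothed" value just chases the most recent measurements.

    struct rtt_state {
        double srtt;       /* smoothed round-trip time, seconds */
        double rttvar;     /* smoothed mean deviation           */
        double rto;        /* retransmit timeout, seconds       */
    };

    static void
    rtt_sample(struct rtt_state *r, double sample)
    {
        const double alpha = 0.125;    /* gain for the mean      */
        const double beta  = 0.25;     /* gain for the deviation */

        if (r->srtt == 0.0) {          /* first measurement */
            r->srtt   = sample;
            r->rttvar = sample / 2.0;
        } else {
            double err = sample - r->srtt;
            r->srtt   += alpha * err;
            r->rttvar += beta * ((err < 0 ? -err : err) - r->rttvar);
        }
        r->rto = r->srtt + 4.0 * r->rttvar;
    }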
