On Sat, Jan 18, 2003 at 08:42:04AM +0000, Steve Schmitz wrote:

> Does the Linux NAT code already do this?
Possibly, but I'll have to check the source code to verify. It could either strip the option or set any scale factors inside the option to zero. But doing that is not much simpler than actually supporting non-zero factors. All these approaches share the limitation that they only work if the code sees the TCP handshake of the connection.

> So I conclude that either the OpenBSD firewall code has no trouble with
> wscale but the NAT code has, or the Linux NAT clears out the wscale TCP
> options from the initial SYN packet - i.e. does exactly what you propose.

It's the OpenBSD TCP sequence number tracking code that stalls such connections, and that code is used whenever you filter a TCP connection statefully (with 'keep state'). pf always creates a state entry when any translation (nat, rdr or binat) is applied to a connection. If you were filtering statelessly with pf and doing nat on the Linux box, that might explain why the connection didn't stall.

In the tcpdumped session you quoted, the client was using 'wscale 0' and the server 'wscale 9'. That means the client's window values weren't shifted/multiplied at all, while the server's were shifted left by 9 bits (multiplied by 2^9 = 512). The server started out sending window values of 12 (meaning 12*512 = 6144) and increased them to 52 (meaning 52*512 = 26624). As long as the client sent smaller segments, pf let them through. But the first larger packet got dropped, and the client retransmitted it until the connection timed out.

So you might not always see a stall, depending on the kind of traffic the client sends. If it's all small packets (like an interactive SQL session, where the client sends only small commands), it could work. Also, the server might have used a lower scaling factor on other connections. wscale 9 is quite large: it means the server wants to be able to advertise a maximum window of 65535*512 bytes, about 32 MB.
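The window arithmetic above can be sketched in a few lines of Python. The `tracker_accepts` helper is a hypothetical illustration of a sequence tracker that doesn't know the negotiated scale factor, not pf's actual code:

```python
def effective_window(advertised, wscale):
    """Window scaling shifts the 16-bit advertised value left by wscale bits."""
    return advertised << wscale

# The server's advertised windows from the quoted tcpdump (wscale 9):
assert effective_window(12, 9) == 6144      # 12 * 512
assert effective_window(52, 9) == 26624     # 52 * 512

# Maximum window a peer using wscale 9 can advertise, about 32 MB:
assert effective_window(65535, 9) == 33553920

# A tracker that missed (or ignored) the wscale option in the handshake
# treats the raw 16-bit value as the window.  A client segment that lies
# within the real 26624-byte window but beyond the unscaled 52 bytes
# looks out of window to it and gets dropped:
def tracker_accepts(seq_offset, advertised, known_wscale):
    return seq_offset < effective_window(advertised, known_wscale)

assert tracker_accepts(1000, 52, 9)         # real window: accepted
assert not tracker_accepts(1000, 52, 0)     # scale unknown: dropped
```

This is also why small interactive traffic can squeak through: segments that stay under the unscaled window value never trigger the drop.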
Such a large window would mean the client is invited to send up to 32 MB of data before getting an acknowledgment. I don't know how Linux calculates its scaling factors, but I guess it might depend on the memory available for such buffers at run-time, and it might have chosen a lower scaling factor during the second test. But that's just a guess :)

It's also interesting that your client chose wscale 0, indicating that it doesn't want to scale its own windows (because it has no large buffers?) but is willing to support the peer doing so.

If you worry about the performance impact of disabled window scaling, it depends on the nature of your traffic. If only the server uses large (scaled) windows, only bulk traffic from client to server would benefit. If your client only sends small queries but gets large results back, a scaling factor on the server's windows alone wouldn't improve performance.

Daniel