Will Pierce created THRIFT-1737:
-----------------------------------

    Summary: UDP socket support for python
        Key: THRIFT-1737
        URL: https://issues.apache.org/jira/browse/THRIFT-1737
    Project: Thrift
 Issue Type: New Feature
 Components: Python - Library
   Reporter: Will Pierce

This patch adds support for UDP socket servers and clients in python. This avoids the overhead and network latency of TCP handshaking, _especially_ for "oneway" service methods.

One useful feature of a UDP service is that clients don't need to rebuild their connection to the server when a UDP packet is lost, so the "blast radius" of a timeout exception is limited to a single service call, not the entire "connection". Also, framing is not necessary because each UDP packet carries its length in its header.

This transport is not suitable for large messages because UDP is inherently limited to 64 KB packet lengths, and the practical limit is often much smaller (500 - 1500 bytes) depending on intermediate links and whether UDP fragments are reassembled. Avoid large query/response payloads with this transport.

h2. Implementation

UDP support is implemented by subclassing TSocket and TServerSocket into TUDPSocket and TServerUDPSocket, and adding a TDatagramTransport. The server's accept() method actually receives an entire inbound request packet. An inbound packet is wrapped as a stream with StringIO, and the response "connection" records the sender's host+port so responses are delivered from the server's socket back to the client. The TDatagramTransport converts the EOFError raised after reaching the end of the packet into a TTransport exception, to accommodate TServers.

h2. Testing:

The unit tests now have a TestUDP.py script which runs a UDP server and client, exercises several of the ThriftTest service calls, and verifies that responses match expectations. It ensures that "oneway" method calls are truly non-blocking, 1 packet "send and forget". It also forces a timeout in the middle of a sequence of blocking RPC calls, which confirms that a timeout only breaks a single RPC, not the entire client.

I haven't used this with server types other than TThreadedServer, or in a big environment yet. There may be edge-cases that haven't surfaced yet.

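The accept() behaviour described under Implementation can be illustrated with plain stdlib sockets. This is only a rough sketch, not the patch's code: DatagramConnection and udp_accept are hypothetical names, and io.BytesIO stands in for the StringIO wrapper used on python2.7.

```python
import io
import socket


class DatagramConnection:
    """One inbound request packet plus the address to reply to.

    Hypothetical stand-in for the per-request "connection" in the patch.
    """

    def __init__(self, sock, packet, peer):
        self._sock = sock
        self._rbuf = io.BytesIO(packet)  # whole request, readable as a stream
        self._wbuf = io.BytesIO()        # response is buffered until flush()
        self.peer = peer                 # sender's (host, port)

    def read(self, n):
        data = self._rbuf.read(n)
        if not data and n > 0:
            # End of packet. The patch's TDatagramTransport is described as
            # converting this condition into a TTransport exception.
            raise EOFError("end of datagram")
        return data

    def write(self, data):
        self._wbuf.write(data)

    def flush(self):
        # Deliver the buffered response as one datagram to the sender.
        self._sock.sendto(self._wbuf.getvalue(), self.peer)
        self._wbuf = io.BytesIO()


def udp_accept(sock, max_size=65535):
    """'Accept' by receiving one complete request packet."""
    packet, peer = sock.recvfrom(max_size)
    return DatagramConnection(sock, packet, peer)


def demo():
    """Round-trip one packet over loopback (assumes UDP on 127.0.0.1 works)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    try:
        client.sendto(b"ping", server.getsockname())
        conn = udp_accept(server)
        conn.write(conn.read(4).upper())
        conn.flush()
        reply, _ = client.recvfrom(65535)
        return reply
    finally:
        client.close()
        server.close()
```

Calling demo() sends a single request packet and reads back a single response packet. No connection state survives between calls, which is why a lost packet only affects the one call that was in flight.
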
Tested with IPv4 and IPv6 on localhost and python2.7 (dev box is fedora17).

h2. Minor bugfix:

The python RunClientServer.py test script had a 1-line bug where it ran some other test scripts twice by mistake (probably a cut and paste error).

h2. General warnings for posterity:

* UDP packets are *easily spoofed*!
** don't use this on public-internet facing interfaces
** spoofed client IP attacks may turn your server into an attack vector
* UDP is not reliable
** clients will have to handle socket.timeout exceptions for every RPC call
** UDP may be _more_ unreliable during network congestion
* No retries.
** this library doesn't do any retries
** there's only one timeout setting per client, which applies to every method call
** but the timeout may be changed with the existing .setTimeout(msec) call
* Compression
** I haven't tested using TZlibTransport wrapping this to compress the packets, but it ought to work (unless there are bugs)

h2. Tuning to avoid Timeouts:

Linux hosts tend to have small default values for the kernel's memory buffers used to queue up UDP packets. When that buffer fills up with packets that the server process hasn't yet processed, the kernel drops new packets, even though they've been fully decoded and pulled off the NIC. This would show up as lots of "socket.timeout" exceptions raised in client code, and no sign of an inbound method call at the server.

If you run "netstat -s" and see increasing "packet receive errors" in the *Udp* section of output, that is strong evidence that you need to increase your hosts' receive buffers.

As root, you can raise the default UDP receive (and send) buffer space to 4MB with:

{noformat}
sysctl -w net.core.rmem_default=4194304
sysctl -w net.core.wmem_default=4194304
{noformat}
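
Since the library does no retries and every RPC call can raise socket.timeout, callers that want retries have to add them themselves. One possible shape is sketched below. call_with_retries is a hypothetical helper, not part of this patch, and it is only appropriate for idempotent calls, because a timeout does not say whether the request or the response was lost.

```python
import socket


def call_with_retries(rpc, attempts=3):
    """Call rpc() and retry only when it raises socket.timeout.

    Hypothetical helper; 'rpc' is any zero-argument callable wrapping one
    service call. The last timeout is re-raised once attempts run out.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return rpc()
        except socket.timeout:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the timeout
```

Assuming a generated ThriftTest client object named client, usage might look like call_with_retries(lambda: client.testString("hello")). Retrying "oneway" calls this way is pointless, since they never wait for a response and so never time out on one.
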