[ https://issues.apache.org/jira/browse/THRIFT-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Duxbury closed THRIFT-1103.
---------------------------------

       Resolution: Fixed
    Fix Version/s: 0.7

I just committed this. Thanks Will!

> TZlibTransport for python, a zlib compressed transport
> ------------------------------------------------------
>
>                 Key: THRIFT-1103
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1103
>             Project: Thrift
>          Issue Type: New Feature
>          Components: Python - Library
>            Reporter: Will Pierce
>            Assignee: Will Pierce
>             Fix For: 0.7
>
>         Attachments: THRIFT-1103.tzlibtransport_for_python_v1.patch,
>                      THRIFT-1103.tzlibtransport_for_python_v2.patch
>
>
> New implementation of zlib compressed transport for python.
> The attached patch provides a zlib compressed transport wrapper for python.
> It is similar to the TFramedTransport, in that it wraps another transport,
> implementing the data compression as a transformation layer on top of the
> underlying transport that it wraps.
> The compression level is configurable in the constructor, from 0 (none) to 9
> (best) and defaults to 9 for best compression. The way this works is that
> every write() to the transport appends more data to the internal cStringIO
> write buffer. When the transport's flush() method is called, the buffered
> bytes are then passed to a zlib Compressor object and flush()ed with
> zlib.Z_SYNC_FLUSH.
> Because the thrift API calls the transport's flush() after writeMessageEnd(),
> this means very small thrift RPC calls don't get compressed well. This
> transport works best on thrift protocols where the payload contains strings
> longer than 10 characters. As with all data compression, the more redundancy
> in the uncompressed input, the greater the resulting compression.
> The TZlibTransport class also implements some basic statistics that track the
> number of raw bytes written and read, versus the decompressed equivalent.
> The getCompRatio() method returns a tuple of
> (readCompressionRatio,writeCompressionRatio) where ratio is computed using:
> compressed_bytes/uncompressed_bytes. (So 10% compression is 0.10, meaning
> smaller numbers are better.) The getCompSavings() method returns the actual
> number of (saved_read_bytes,saved_write_bytes) which might be negative when
> the compression of non-compressible data ends up expanding the data. So
> hopefully, anyone who uses this transport will be able to tell whether the
> compression is saving bandwidth or not.
> I will add the patch in a few minutes.
> I haven't tested this against the C++ TZlibTransport, only against itself.
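
For readers skimming the archive, the write path and the statistics described in the issue can be sketched roughly as follows. This is not the committed patch: the class name ZlibWriteSketch, its attributes, and the demo() helper are hypothetical, io.BytesIO stands in for the cStringIO buffer mentioned above (so the sketch runs on Python 3), and only the write side is modelled. The only real library calls used are zlib.compressobj, zlib.decompressobj, and the zlib.Z_SYNC_FLUSH flush mode.

```python
import io
import zlib


class ZlibWriteSketch:
    """Hypothetical stand-in for the write side of a zlib transport wrapper."""

    def __init__(self, sink, compresslevel=9):
        self._sink = sink                    # any object with write(bytes)
        self._wbuf = io.BytesIO()            # pending, uncompressed bytes
        self._zcomp = zlib.compressobj(compresslevel)
        self.bytes_out = 0                   # uncompressed bytes accepted
        self.bytes_out_comp = 0              # compressed bytes handed to sink

    def write(self, data):
        # Nothing is compressed here; data only accumulates in the buffer.
        self._wbuf.write(data)
        self.bytes_out += len(data)

    def flush(self):
        # Compress everything buffered since the last flush, then sync-flush
        # so a decompressor can recover all of it without the stream ending.
        raw = self._wbuf.getvalue()
        self._wbuf = io.BytesIO()
        packed = self._zcomp.compress(raw) + self._zcomp.flush(zlib.Z_SYNC_FLUSH)
        self.bytes_out_comp += len(packed)
        self._sink.write(packed)

    def write_ratio(self):
        # compressed_bytes / uncompressed_bytes: smaller is better.
        return self.bytes_out_comp / self.bytes_out if self.bytes_out else 0.0

    def write_savings(self):
        # Negative when compression expanded the data.
        return self.bytes_out - self.bytes_out_comp


def demo(payload, level=9):
    """Write one payload, flush once, and report (restored, ratio, savings)."""
    sink = io.BytesIO()
    transport = ZlibWriteSketch(sink, level)
    transport.write(payload)
    transport.flush()
    restored = zlib.decompressobj().decompress(sink.getvalue())
    return restored, transport.write_ratio(), transport.write_savings()
```

Calling demo() with a long, repetitive payload should give a ratio well below 1, while a payload of only a couple of bytes comes out larger than it went in (negative savings), because every flush pays for the zlib header and the sync marker. That is the small-RPC effect the issue describes.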