[
https://issues.apache.org/jira/browse/THRIFT-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Will Pierce updated THRIFT-1103:
--------------------------------
Attachment: THRIFT-1103.tzlibtransport_for_python_v1.patch
Patch attached. Adds TZlibTransport.py into ./lib/py/src/transport/ and adds
TZlibTransport into the transport/__init__.py module's __all__ list.
I tested this on python 2.4 and 2.7. The zlib module is present and provides
the same API in python 2.4 as 2.7 for our needs.
If the patch for THRIFT-1094 is good and can be committed, then it would make it
easier for me to extend the RunClientServer.py/TestServer.py/TestClient.py code
to include testing that exercises the TZlibTransport code. (I did it locally
in my copy of thrift-svn/trunk to test this code, but didn't want to submit a
patch that requires another patch (THRIFT-1094) which hasn't been approved
yet.)
> TZlibTransport for python, a zlib compressed transport
> ------------------------------------------------------
>
> Key: THRIFT-1103
> URL: https://issues.apache.org/jira/browse/THRIFT-1103
> Project: Thrift
> Issue Type: New Feature
> Components: Python - Library
> Reporter: Will Pierce
> Assignee: Will Pierce
> Attachments: THRIFT-1103.tzlibtransport_for_python_v1.patch
>
>
> New implementation of zlib compressed transport for python.
> The attached patch provides a zlib compressed transport wrapper for python.
> It is similar to the TFramedTransport, in that it wraps another transport,
> implementing the data compression as a transformation layer on top of the
> underlying transport that it wraps.
> The compression level is configurable in the constructor, from 0 (none) to 9
> (best) and defaults to 9 for best compression. The way this works is that
> every write() to the transport appends more data to the internal cStringIO
> write buffer. When the transport's flush() method is called, the buffered
> bytes are then passed to a zlib Compressor object and flush()ed with
> zlib.Z_SYNC_FLUSH.
> Because the thrift API calls the transport's flush() after writeMessageEnd(),
> this means very small thrift RPC calls don't get compressed well. This
> transport works best on thrift protocols where the payload contains strings
> longer than 10 characters. As with all data compression, the more redundancy
> in the uncompressed input, the greater the resulting compression.
> The TZlibTransport class also implements some basic statistics that track the
> number of raw bytes written and read, versus the decompressed equivalent.
> The getCompRatio() method returns a tuple of
> (readCompressionRatio,writeCompressionRatio) where ratio is computed using:
> compressed_bytes/uncompressed_bytes. (So 10% compression is 0.10, meaning
> smaller numbers are better.) The getCompSavings() method returns the actual
> number of (saved_read_bytes,saved_write_bytes) which might be negative when
> the compression of non-compressible data ends up expanding the data. So
> hopefully, anyone who uses this transport will be able to tell whether the
> compression is saving bandwidth or not.
> I will add the patch in a few minutes.
> I haven't tested this against the C++ TZlibTransport, only against itself.
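
For readers who want to see the buffering and statistics described above in
isolation, here is a minimal sketch. It is not the code from the attached
patch: the class name SketchZlibWriter and the methods comp_ratio() and
comp_savings() are hypothetical, only the write side is modelled, and
io.BytesIO (Python 3) stands in for the cStringIO buffer. It only assumes the
behaviour stated in the description: write() appends to a buffer, flush()
compresses the buffered bytes and ends with zlib.Z_SYNC_FLUSH, and the ratio
is compressed_bytes/uncompressed_bytes.

```python
import io
import zlib


class SketchZlibWriter:
    """Hypothetical stand-in for the write side of a zlib transport."""

    def __init__(self, sink, compresslevel=9):
        self._sink = sink                  # any object with write(bytes)
        self._wbuf = io.BytesIO()          # plays the role of the write buffer
        self._zcomp = zlib.compressobj(compresslevel)
        self.bytes_out = 0                 # uncompressed bytes accepted
        self.bytes_out_comp = 0            # compressed bytes handed to the sink

    def write(self, data):
        # Nothing is compressed yet; data just accumulates until flush().
        self._wbuf.write(data)

    def flush(self):
        raw = self._wbuf.getvalue()
        self._wbuf = io.BytesIO()
        # Z_SYNC_FLUSH makes everything written so far decodable by the peer
        # while keeping the compression stream open for later messages.
        packed = self._zcomp.compress(raw) + self._zcomp.flush(zlib.Z_SYNC_FLUSH)
        self.bytes_out += len(raw)
        self.bytes_out_comp += len(packed)
        self._sink.write(packed)

    def comp_ratio(self):
        # compressed / uncompressed, so smaller is better; None if nothing sent.
        if not self.bytes_out:
            return None
        return self.bytes_out_comp / self.bytes_out

    def comp_savings(self):
        # Negative when compression expanded the data.
        return self.bytes_out - self.bytes_out_comp


# A redundant payload compresses well.
sink = io.BytesIO()
w = SketchZlibWriter(sink)
w.write(b"hello thrift " * 50)
w.flush()
ratio = w.comp_ratio()
saved = w.comp_savings()

# A tiny payload ends up larger than its input once the zlib header and
# sync-flush marker are added, so the savings go negative.
tiny_sink = io.BytesIO()
tiny = SketchZlibWriter(tiny_sink)
tiny.write(b"ab")
tiny.flush()
tiny_saved = tiny.comp_savings()
```

The second case illustrates why very small RPC calls do not benefit: each
flush() carries fixed framing overhead regardless of how little was buffered.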
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira