[
https://issues.apache.org/jira/browse/THRIFT-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009283#comment-13009283
]
Will Pierce commented on THRIFT-1103:
-------------------------------------
Thanks for committing THRIFT-1094. I'll update the patch this evening (and
attach it as _v2) to include testing of the TZlibTransport wrapper in the
TestServer.py/TestClient.py code (as a --zlib command-line argument to both
scripts).
FYI, the Hudson (Jenkins) build currently seems to be failing on the JavaScript
jslint tasks, which stops the build tests from progressing past the 'test/js'
directory.
> TZlibTransport for python, a zlib compressed transport
> ------------------------------------------------------
>
> Key: THRIFT-1103
> URL: https://issues.apache.org/jira/browse/THRIFT-1103
> Project: Thrift
> Issue Type: New Feature
> Components: Python - Library
> Reporter: Will Pierce
> Assignee: Will Pierce
> Attachments: THRIFT-1103.tzlibtransport_for_python_v1.patch
>
>
> New implementation of zlib compressed transport for python.
> The attached patch provides a zlib compressed transport wrapper for python.
> It is similar to the TFramedTransport, in that it wraps another transport,
> implementing the data compression as a transformation layer on top of the
> underlying transport that it wraps.
> The compression level is configurable in the constructor, from 0 (none) to 9
> (best) and defaults to 9 for best compression. The way this works is that
> every write() to the transport appends more data to the internal cStringIO
> write buffer. When the transport's flush() method is called, the buffered
> bytes are then passed to a zlib Compressor object and flush()ed with
> zlib.Z_SYNC_FLUSH.
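
The buffering and flush behaviour described above can be sketched roughly as
follows. This is only an illustration, not the code from the attached patch:
the ZlibWriteBuffer class and its sink argument are hypothetical names, and
io.BytesIO (Python 3) stands in for the cStringIO buffer mentioned above.

```python
import io
import zlib


class ZlibWriteBuffer:
    """Hypothetical stand-in for the write side of a zlib transport wrapper."""

    def __init__(self, sink, compresslevel=9):
        self._sink = sink                  # any object with a write(bytes) method
        self._wbuf = io.BytesIO()          # pending, still-uncompressed bytes
        self._compressor = zlib.compressobj(compresslevel)

    def write(self, data):
        # Writes only accumulate in the buffer; nothing is compressed yet.
        self._wbuf.write(data)

    def flush(self):
        # Compress everything buffered so far, then sync-flush so the peer
        # can decompress this message without waiting for further input.
        raw = self._wbuf.getvalue()
        out = self._compressor.compress(raw)
        out += self._compressor.flush(zlib.Z_SYNC_FLUSH)
        self._sink.write(out)
        self._wbuf = io.BytesIO()          # start a fresh buffer


sink = io.BytesIO()
writer = ZlibWriteBuffer(sink)
writer.write(b"hello ")
writer.write(b"world")
writer.flush()

# A separate decompressor recovers the message from the sync-flushed bytes.
decoded = zlib.decompressobj().decompress(sink.getvalue())
```

Keeping a single compressor object across flushes means later messages can
reuse the compression history of earlier ones on the same connection.
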
> Because the thrift API calls the transport's flush() after writeMessageEnd(),
> this means very small thrift RPC calls don't get compressed well. This
> transport works best on thrift protocols where the payload contains strings
> longer than 10 characters. As with all data compression, the more redundancy
> in the uncompressed input, the greater the resulting compression.
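
To see the effect of payload size and redundancy, a quick experiment along
these lines may help. The sync_flushed_size helper and the sample payloads are
made up for illustration; only the zlib calls are real.

```python
import zlib


def sync_flushed_size(payload, level=9):
    """Return the compressed size of one payload ended with Z_SYNC_FLUSH."""
    compressor = zlib.compressobj(level)
    out = compressor.compress(payload)
    out += compressor.flush(zlib.Z_SYNC_FLUSH)
    return len(out)


tiny = b"ping"                      # a very small message
redundant = b"status=OK;" * 200     # 2000 bytes of highly repetitive data

tiny_size = sync_flushed_size(tiny)
redundant_size = sync_flushed_size(redundant)

print(len(tiny), "->", tiny_size)
print(len(redundant), "->", redundant_size)
```

The tiny message comes out larger than it went in, because the zlib header and
the sync-flush marker cost more than the compression saves, while the
repetitive payload shrinks considerably.
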
> The TZlibTransport class also implements some basic statistics that track the
> number of raw bytes written and read, versus the decompressed equivalent.
> The getCompRatio() method returns a tuple of
> (readCompressionRatio,writeCompressionRatio) where ratio is computed using:
> compressed_bytes/uncompressed_bytes. (So 10x compression is 0.10, meaning
> smaller numbers are better.) The getCompSavings() method returns the actual
> number of (saved_read_bytes,saved_write_bytes) which might be negative when
> the compression of non-compressible data ends up expanding the data. So
> hopefully, anyone who uses this transport will be able to tell whether the
> compression is saving bandwidth or not.
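
The arithmetic behind these two statistics can be sketched as below. The
function names are hypothetical helpers, not the methods in the patch, and
returning None for an empty stream is an assumption made here to avoid
dividing by zero.

```python
def comp_ratio(compressed_bytes, uncompressed_bytes):
    """Ratio of compressed to uncompressed size; smaller is better."""
    if uncompressed_bytes == 0:
        return None  # assumption: no data yet, so no meaningful ratio
    return compressed_bytes / uncompressed_bytes


def comp_savings(compressed_bytes, uncompressed_bytes):
    """Bytes saved by compression; negative when the data expanded."""
    return uncompressed_bytes - compressed_bytes


# 1000 raw bytes sent as 100 compressed bytes: ratio 0.10, 900 bytes saved.
good_ratio = comp_ratio(100, 1000)
good_savings = comp_savings(100, 1000)

# 4 raw bytes that grew to 12 on the wire: savings are negative.
bad_savings = comp_savings(12, 4)
```
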
> I will add the patch in a few minutes.
> I haven't tested this against the C++ TZlibTransport, only against itself.