[
https://issues.apache.org/jira/browse/THRIFT-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728521#comment-13728521
]
Frens Jan Rumph commented on THRIFT-1121:
-----------------------------------------
Although this issue is closed and considered fixed, I would like to add my two
cents on the combination of TFramedTransport and TSocket.
I have worked with Hector, a Cassandra client that uses Thrift v0.6.1 and, at
least in my analysis, suffers from a great deal of overhead. I came to the
conclusion that the performance regression is caused, at least in my setup, by
overhead on the TCP layer.
In my setup I use the binary protocol over the framed transport over the Thrift
socket (without buffering in v0.6.1). I discovered that two TCP segments are
sent for every frame: one for the length of the frame and one for the frame
itself (or more if the frame is larger than the maximum a TCP segment can
hold). With small messages, the overhead is substantial: 56 extra bytes of
headers in the case of Ethernet + IP + TCP. In addition, in my setup the PSH
flag was raised, causing the 4-byte length of the frame to be pushed from the
TCP stack to the application while the data itself was still on its way.
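To make the two-writes-per-frame pattern concrete, here is a minimal sketch in
plain Java. It is not Thrift's code: FrameWriteSketch, CountingOutputStream and
both write methods are hypothetical names made up for illustration. It only
counts write() calls that reach the underlying stream; whether each call
becomes its own TCP segment depends on the socket and the TCP stack, as it did
in my setup.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

public class FrameWriteSketch {
    /** Hypothetical stand-in for an unbuffered socket stream; counts write() calls. */
    static final class CountingOutputStream extends OutputStream {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int writeCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
            sink.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            sink.write(b, off, len);
        }
    }

    /** Length prefix and payload handed to the stream separately: two writes. */
    static void writeFrameTwoWrites(OutputStream out, byte[] payload) throws IOException {
        // 4-byte big-endian frame length (ByteBuffer is big-endian by default).
        byte[] header = ByteBuffer.allocate(4).putInt(payload.length).array();
        out.write(header, 0, header.length);
        out.write(payload, 0, payload.length);
        out.flush();
    }

    /** Length prefix and payload assembled first, then handed over in one write. */
    static void writeFrameOneWrite(OutputStream out, byte[] payload) throws IOException {
        byte[] frame = ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
        out.write(frame, 0, frame.length);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "ping".getBytes("UTF-8");

        CountingOutputStream separate = new CountingOutputStream();
        writeFrameTwoWrites(separate, payload);

        CountingOutputStream combined = new CountingOutputStream();
        writeFrameOneWrite(combined, payload);

        System.out.println("separate writes: " + separate.writeCalls);
        System.out.println("combined writes: " + combined.writeCalls);
    }
}
```

Both variants put the same bytes on the wire; only the number of writes handed
to the socket differs.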
This issue is alleviated by using a buffered output stream in TSocket.
However, I do think that this solution causes unnecessary memory overhead in
cases where the framed transport is used: the data is then in memory twice. It
is first written into the framed transport buffer, then copied into the
buffered output stream of the socket, and only then written to the TCP stack.
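The double copy can be sketched roughly as follows, again in plain Java rather
than Thrift's actual classes. BufferedFrameSketch, CountingOutputStream and
writeFrameThroughBuffer are hypothetical names, and the ByteArrayOutputStream
merely stands in for the framed transport's write buffer.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

public class BufferedFrameSketch {
    /** Hypothetical stand-in for the raw socket stream; counts write() calls. */
    static final class CountingOutputStream extends OutputStream {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int writeCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
            sink.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            sink.write(b, off, len);
        }
    }

    /** Sends one frame through a frame buffer and a buffered socket stream. */
    static void writeFrameThroughBuffer(OutputStream socketOut, byte[] message)
            throws IOException {
        // Copy 1: the message is collected in the frame buffer so that its
        // length is known before anything is sent.
        ByteArrayOutputStream frameBuffer = new ByteArrayOutputStream();
        frameBuffer.write(message, 0, message.length);

        // Copy 2: length prefix and frame are copied into the buffered stream.
        BufferedOutputStream buffered = new BufferedOutputStream(socketOut);
        byte[] header = ByteBuffer.allocate(4).putInt(frameBuffer.size()).array();
        buffered.write(header, 0, header.length);
        frameBuffer.writeTo(buffered);

        // Only the flush hands the coalesced bytes to the underlying stream.
        buffered.flush();
    }

    public static void main(String[] args) throws IOException {
        CountingOutputStream socketOut = new CountingOutputStream();
        writeFrameThroughBuffer(socketOut, "ping".getBytes("UTF-8"));
        System.out.println("writes reaching the socket: " + socketOut.writeCalls);
    }
}
```

For frames smaller than the buffer, the length and the frame reach the socket
in a single write, at the price of holding the frame in two buffers at once.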
The above goes for the writing side of things. As for the reading side: it
costs one extra system call (first read the length and then, in a separate
call, read the frame from the socket). I do not expect this to account for the
big difference in throughput.
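For the reading side, a similar sketch under the same caveats:
FrameReadSketch, CountingInputStream, readFully and readFrame are hypothetical
names, and the in-memory stream only approximates a socket on which the whole
frame has already arrived.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class FrameReadSketch {
    /** Hypothetical stand-in for the raw socket stream; counts read() calls. */
    static final class CountingInputStream extends InputStream {
        private final ByteArrayInputStream source;
        int readCalls = 0;

        CountingInputStream(byte[] data) {
            this.source = new ByteArrayInputStream(data);
        }

        @Override
        public int read() {
            readCalls++;
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) {
            readCalls++;
            return source.read(b, off, len);
        }

        @Override
        public int available() {
            return source.available();
        }
    }

    /** Reads exactly buf.length bytes or fails with EOFException. */
    static void readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new EOFException("stream ended inside a frame");
            }
            off += n;
        }
    }

    /** Reads one frame: a 4-byte big-endian length, then that many bytes. */
    static byte[] readFrame(InputStream in) throws IOException {
        byte[] header = new byte[4];
        readFully(in, header);
        byte[] payload = new byte[ByteBuffer.wrap(header).getInt()];
        readFully(in, payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = ByteBuffer.allocate(8).putInt(4).put("ping".getBytes("UTF-8")).array();

        CountingInputStream raw = new CountingInputStream(wire);
        readFrame(raw);

        CountingInputStream underBuffer = new CountingInputStream(wire);
        readFrame(new BufferedInputStream(underBuffer));

        System.out.println("unbuffered reads: " + raw.readCalls);
        System.out.println("buffered reads:   " + underBuffer.readCalls);
    }
}
```

Without buffering, the length and the frame are fetched in separate reads;
with a buffered input stream a single underlying read can serve both when the
frame fits in the buffer.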
Of course, in cases where the framed transport isn't used, buffered input and
output streams are very useful.
> Java server performance regression in 0.6
> -----------------------------------------
>
> Key: THRIFT-1121
> URL: https://issues.apache.org/jira/browse/THRIFT-1121
> Project: Thrift
> Issue Type: Bug
> Components: Java - Library
> Affects Versions: 0.6
> Reporter: Todd Lipcon
> Assignee: Bryan Duxbury
> Fix For: 0.8
>
>
> A user reports a 30% performance regression after upgrading some
> high-request-rate Java software from Thrift 0.3 to 0.6. After some
> inspection, it turns out that the changes for THRIFT-959 caused the slowdown.
> However, even after altering the code to use the TFramedTransport,
> performance was still only 80% of version 0.3. I believe the problem is that
> the TFramedTransport must read the length (unbuffered) before reading (only)
> one message. In one particular workload, sent with oneway streaming, the
> server is making many more system calls.
> It wasn't obvious how to compose a Transport that would add back the
> buffering using existing components. We created our own trivial
> TServerSocket that adds the socket buffering back. Performance is now back
> where it was with 0.3.