ruby transport class read buffer very slow due to usage of slice!
-----------------------------------------------------------------
Key: THRIFT-401
URL: https://issues.apache.org/jira/browse/THRIFT-401
Project: Thrift
Issue Type: Improvement
Components: Library (Ruby)
Environment: # uname -a
Linux zvm.local 2.6.9-78.0.1.ELsmp #1 SMP Tue Aug 5 11:02:47 EDT 2008 i686 i686
i386 GNU/Linux
Reporter: Tyler Kovacs
Priority: Minor
Attachments: before.png
We use Thrift as a cross-language transport for Hypertable, an open-source
distributed database. While profiling queries with large responses using the
Ruby Thrift libraries, we discovered that the majority of the time was spent in
thrift/transport.rb. Specifically, the slice! method, which is used to manage
the read buffer (@rbuf), was responsible for almost all of the latency.
We tried an alternative implementation that showed a 300x speedup in our tests.
Instead of repeatedly calling slice! to alter @rbuf (which apparently is
extremely expensive), we maintain an offset counter (@rpos) that starts at
zero and is incremented by sz each time we read from @rbuf. The counter is
reset to zero whenever a new frame is read. Before and after screenshots from
kcachegrind are attached.
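To make the difference between the two strategies concrete, here is a minimal
standalone sketch that drains a string buffer both ways and times each one. The
helper names (drain_with_slice, drain_with_offset) are made up for this
illustration and are not part of the Thrift library. The measured ratio will
depend on the Ruby version and the buffer size, so it should not be expected to
reproduce the figure from our tests.

```ruby
require 'benchmark'

# Hypothetical helpers for illustration only; not Thrift code.

# Strategy 1: destructively remove the first sz bytes from the buffer
# on every read, as the current transport does with slice!.
def drain_with_slice(buf, sz)
  out = []
  out << buf.slice!(0, sz) until buf.empty?
  out
end

# Strategy 2: leave the buffer untouched and advance an offset instead.
def drain_with_offset(buf, sz)
  out = []
  pos = 0
  while pos < buf.length
    out << buf[pos, sz]
    pos += sz
  end
  out
end

# Buffer size and read size are arbitrary choices for the sketch.
data = 'x' * 100_000
Benchmark.bm(8) do |bm|
  bm.report('slice!') { drain_with_slice(data.dup, 4) }
  bm.report('offset') { drain_with_offset(data.dup, 4) }
end
```

Both helpers return the same chunks; only the way the buffer is consumed
differs.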
I'll copy the monkey patch that we use into the description below, and I'll
try to assemble a proper patch later today.
module Thrift
  class FramedTransport < Transport
    def initialize(transport, read=true, write=true)
      @transport = transport
      @rbuf = ''
      @wbuf = ''
      @read = read
      @write = write
      @rpos = 0
    end

    def read(sz)
      return @transport.read(sz) unless @read
      return '' if sz <= 0
      read_frame if @rpos >= @rbuf.length
      @rpos += sz
      @rbuf[@rpos - sz, sz] || ''
    end

    def borrow(requested_length = 0)
      read_frame if @rpos >= @rbuf.length
      # there isn't any more coming, so if it's not enough, it's an error.
      raise EOFError if requested_length > (@rbuf.length - @rpos)
      @rbuf[@rpos, requested_length]
    end

    def consume!(size)
      @rpos += size
      @rbuf[@rpos - size, size]
    end

    private

    def read_frame
      sz = @transport.read_all(4).unpack('N').first
      @rpos = 0
      @rbuf = @transport.read_all(sz)
    end
  end
end
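For anyone who wants to try the offset logic without a Thrift installation,
the following is a rough standalone sketch. StubTransport and OffsetFrameReader
are hypothetical names invented here; the reader copies only the read and
read_frame logic from the patch above and assumes the same framing (a 4-byte
big-endian length followed by the payload).

```ruby
require 'stringio'

# Hypothetical stand-in for the underlying transport; only read_all is modeled.
class StubTransport
  def initialize(bytes)
    @io = StringIO.new(bytes)
  end

  # Return exactly sz bytes or raise, mimicking a blocking read_all.
  def read_all(sz)
    data = @io.read(sz)
    raise EOFError, "wanted #{sz} bytes" if data.nil? || data.bytesize < sz
    data
  end
end

# Hypothetical reader using the same offset-based reads as the patch,
# without the Thrift base class or the write path.
class OffsetFrameReader
  def initialize(transport)
    @transport = transport
    @rbuf = ''
    @rpos = 0
  end

  def read(sz)
    return '' if sz <= 0
    read_frame if @rpos >= @rbuf.length
    @rpos += sz
    @rbuf[@rpos - sz, sz] || ''
  end

  private

  # Load the next frame and rewind the offset to its start.
  def read_frame
    sz = @transport.read_all(4).unpack('N').first
    @rpos = 0
    @rbuf = @transport.read_all(sz)
  end
end

# Build two frames: a 4-byte length prefix followed by the payload.
frame = ->(payload) { [payload.bytesize].pack('N') + payload }
reader = OffsetFrameReader.new(StubTransport.new(frame.call('hello') + frame.call('world!')))

p reader.read(2)
p reader.read(3)
p reader.read(6)
```

The first two reads are served from the first frame without modifying @rbuf.
The third read finds the offset at the end of the buffer, so it loads the
second frame and resets the offset.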