Ben Weintraub created THRIFT-4326:
-------------------------------------
Summary: Ruby BufferedTransport not safe for reuse after reading
corrupted input
Key: THRIFT-4326
URL: https://issues.apache.org/jira/browse/THRIFT-4326
Project: Thrift
Issue Type: Bug
Components: Ruby - Library
Affects Versions: 0.10.0
Environment: Originally observed with Thrift 0.9.3 on Linux with Ruby
2.3.4, but also reproduced on Mac OS X with Thrift 0.10.0.
Reporter: Ben Weintraub
We've experimented with the Ruby {{BufferedTransport}} class as a wrapper
around the {{HttpClientTransport}} class, and found that we were getting
clusters of sporadic {{Thrift::ProtocolException}} errors in Ruby client processes
after network issues caused corruption of some Thrift response bodies.
Using a bare {{HttpClientTransport}} makes these issues disappear.
For a given service, we retain a long-lived protocol instance
({{CompactProtocol}} in our case), which in turn holds a reference to a
long-lived {{BufferedTransport}} instance.
The problem seems to arise when the Thrift client is interrupted (e.g. by a
Ruby timeout exception) before it has consumed the whole of the {{@rbuf}}
instance variable in {{BufferedTransport}}. This leaves {{@index}} pointing to
the middle of the read buffer. When the transport is re-used for the next
service call, the {{BufferedTransport}} continues reading where it left off in
the old buffer, rather than calling through to the wrapped
{{HttpClientTransport}} to read the new response obtained from the last call to
{{#flush}}.
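The stale read described above can be reproduced with a small model. This is
only a sketch: {{FakeHttpTransport}} and {{SketchBufferedTransport}} are
hypothetical stand-ins written for illustration, not the actual Thrift classes,
and the real {{BufferedTransport}} may differ in detail. The model assumes only
what is described above: reads are served from {{@rbuf}} starting at {{@index}},
and {{#flush}} does not touch either of them.

```ruby
# Hypothetical stand-in for HttpClientTransport: each flush makes the next
# queued response body available for reading.
class FakeHttpTransport
  def initialize(responses)
    @responses = responses
    @inbuf = ''
  end

  def flush
    @inbuf = @responses.shift || ''
  end

  def read(size)
    @inbuf.slice!(0, size) || ''
  end
end

# Hypothetical, simplified model of a buffered read path (NOT Thrift's code).
class SketchBufferedTransport
  DEFAULT_BUFFER = 4096

  def initialize(transport)
    @transport = transport
    @rbuf = ''
    @index = 0
  end

  def read(size)
    # The wrapped transport is consulted only once the old buffer is exhausted.
    if @index >= @rbuf.length
      @rbuf = @transport.read([size, DEFAULT_BUFFER].max)
      @index = 0
    end
    chunk = @rbuf[@index, size] || ''
    @index += chunk.length
    chunk
  end

  def flush
    # @rbuf and @index are deliberately left untouched, as in the report.
    @transport.flush
  end
end

transport = SketchBufferedTransport.new(
  FakeHttpTransport.new(%w[AAAAAAAA BBBBBBBB])
)

transport.flush
first = transport.read(4)   # the caller is "interrupted" after 4 of 8 bytes

transport.flush             # next service call; the new response is ready
second = transport.read(4)  # served from the leftover bytes of the old response
```

In this model {{second}} holds bytes from the first response rather than the
second one, which is the kind of mismatch that would surface as a protocol
error in a real client.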
Now I know {{Timeout}} is fundamentally unsafe in Ruby and can lead to all
kinds of issues like this, but I've also found that this same issue can be
triggered by another fairly plausible scenario: if the Thrift service returns a
well-formed Thrift response but with N extra bytes of garbage tacked onto the
end, then the following N service calls through the same
{{BufferedTransport}} instance will fail with a {{Thrift::ProtocolException}},
as the {{BufferedTransport}} will continue attempting to read the left-over
bytes in {{@rbuf}}.
The naive solution seems like it would be to just reset {{@rbuf}} from
{{#flush}}.
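A minimal sketch of that idea follows, using the trailing-garbage scenario. The
class names {{StubTransport}} and {{ResettingBufferedTransport}} are
hypothetical and exist only for this illustration; this is not a patch against
the real {{BufferedTransport}}, and it leaves out write buffering entirely. It
assumes that discarding unread bytes on {{#flush}} is acceptable, which may not
hold for every wrapped transport.

```ruby
# Hypothetical stand-in for the wrapped transport: each flush makes the next
# queued response body available for reading.
class StubTransport
  def initialize(responses)
    @responses = responses
    @pending = ''
  end

  def flush
    @pending = @responses.shift || ''
  end

  def read(size)
    @pending.slice!(0, size) || ''
  end
end

# Hypothetical buffered wrapper whose flush also resets the read buffer.
class ResettingBufferedTransport
  def initialize(transport)
    @transport = transport
    @rbuf = ''
    @index = 0
  end

  def read(size)
    if @index >= @rbuf.length
      @rbuf = @transport.read([size, 4096].max)
      @index = 0
    end
    chunk = @rbuf[@index, size] || ''
    @index += chunk.length
    chunk
  end

  def flush
    # The proposed change: drop any unread bytes from the previous response
    # before the next one is fetched.
    @rbuf = ''
    @index = 0
    @transport.flush
  end
end

# The first response is a 4-byte "message" followed by 3 bytes of garbage.
buffered = ResettingBufferedTransport.new(
  StubTransport.new(%w[MSG1xyz MSG2])
)

buffered.flush
reply_one = buffered.read(4)

buffered.flush
reply_two = buffered.read(4)
```

With the reset in place, the second read in this model comes from the second
response, and the garbage bytes left over from the first are discarded.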