Ben Weintraub created THRIFT-4326: ------------------------------------- Summary: Ruby BufferedTransport not safe for reuse after reading corrupted input Key: THRIFT-4326 URL: https://issues.apache.org/jira/browse/THRIFT-4326 Project: Thrift Issue Type: Bug Components: Ruby - Library Affects Versions: 0.10.0 Environment: Originally observed with Thrift 0.9.3 on Linux with Ruby 2.3.4, but have also reproduced on Mac OS X with Thrifty 0.10.0. Reporter: Ben Weintraub
We've experimented with the Ruby {{BufferedTransport}} class as a wrapper around the {{HttpClientTransport}} class, and found that we were getting clusters sporadic {{Thrift::ProtocolException}} errors in Ruby client processes after network issues caused corruption of some Thrift response bodies. Using a bare {{HttpClientTransport}} makes these issues disappear. For a given service, we retain a long-lived protocol instance ({{CompactProtocol}} in our case), which in turn holds a reference to a long-lived {{BufferedTransport}} instance. The problem seems to stem from the case where the Thrift client is interrupted (e.g. by a Ruby timeout exception) before consuming to the end of the {{@rbuf}} instance variable in {{BufferedTransport}}, leaving {{@index}} pointing to the middle of the read buffer, and meaning that when the transport is re-used upon the next service call, the {{BufferedTransport}} continues reading where it left off in the old buffer, rather than calling through to the wrapped {{HttpClientTransport}} to read the new response obtained from the last call to {{#flush}}. Now I know {{Timeout}} is fundamentally unsafe in Ruby and can lead to all kinds of issues like this, but I've also found that this same issue can be triggered by another fairly plausible scenario: if the Thrift service returns a well-formed Thrift response but with N extra bytes of garbage tacked onto the end, then the next N following service calls through the same {{BufferedTransport}} instance will fail with a {{Thrift::ProtocolException}}, as the {{BufferedTransport}} will continue attempting to read the left-over bytes in {{@rbuf}}. The naive solution seems like it would be to just reset {{@rbuf}} from {{#flush}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)