Michael Scott Leuthaeuser created THRIFT-3805:
-------------------------------------------------

             Summary: Golang server susceptible to memory spike from malformed 
message
                 Key: THRIFT-3805
                 URL: https://issues.apache.org/jira/browse/THRIFT-3805
             Project: Thrift
          Issue Type: Bug
          Components: Go - Library
    Affects Versions: 0.9.3
         Environment: OSX 10.10.5, Debian 7.10
            Reporter: Michael Scott Leuthaeuser
            Priority: Critical


When a thrift server is started through the Go library, using the default 
TServerSocket, the processor from the generated files, and a SimpleServer2 
(which uses BinaryProtocol), it listens on the created socket for incoming 
messages.  However, if the incoming message is *not* from another Thrift 
server, and thus the data submitted over the connection does not conform to the 
Thrift message serialization format (for us, this was triggered by automated 
security scanners connecting to the exposed port and sending it data to see how 
it would react), it triggers a massive memory spike.

This appears to happen because the BinaryProtocol expects the first 4 bytes of 
the file to indicate the size of the following message, interpreted as a signed 
Int32 in BigEndian format.  If this value is positive, it reads that many bytes 
from the connection.

However, to do this, it first *allocates* a slice of that many bytes.  If the 
provided message is not a thrift serialization, this signed Int32 can be 
massive (we're typically seeing it have a value of 1-2 billion), causing it to 
allocate *hundreds* of megabytes of data.  This slice is retained until the 
expected number of bytes are received (which generally doesn't happen for a 
malformed connection), or until the connection is closed from the client end.  
When either happen, the BinaryProtocol then type-casts the entire array to a 
String type, which causes a *second* allocation of equal size, before returning 
the message to be processed.  

Since this processing fails, both the byte slice and the string are discarded 
and reclaimed on the next garbage collection cycle, but in the mean type, it 
will have allocated an amount of memory equal to double the value of that 
Int32, in bytes.  We're typically seeing this allocation exceed 800-1000 MB per 
variable (so 800 MB when the connection is received, and another 800-1000 MB 
when it's closed).  The first allocation will persist as long as the connection 
is maintained, and other connections that are similarly malformed will cause 
concurrent allocation spikes, which can easily crash the server.

To demonstrate this, one can make an extremely simple single-method thrift 
contract, start a server, then netcat to the host port and feed it at least 4 
characters of data.

To fix this, we implemented an iterative buffered read into the BinaryProtocol 
itself.  Instead of allocating the entire array at once, if the message 
indicates it is larger than a certain size (we have it set at 32kb, but the 
actual value is largely arbitrary), it instead reads the message in increments 
of that size, and joins them using the standard library bytes.Buffer type.  
Here's a diff of our changes:

https://gist.github.com/kaedys/53b8fd05690d5d8f202afa7878d6e3d5

This fixes the issue of having a massive initial allocation due to a malformed 
request, while still allowing messages of arbitrary incoming size.  It will 
likely marginally slow down the reading of extremely large messages (ones 
substantially larger than the constant defined), but the bytes.Buffer type is 
well optimized for growing itself as it is written to.  I also realize that 
this only occurs with malformed requests to the server, but I would think a 
server package should be relatively hardened against malformed data being 
submitted to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to