[ https://issues.apache.org/jira/browse/THRIFT-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303044#comment-15303044 ]
Michael Scott Leuthaeuser commented on THRIFT-3805: --------------------------------------------------- Wanted to poke in and see what the status of this was. There a chance it can make it into Thrift 0.9.4? > Golang server susceptible to memory spike from malformed message > ---------------------------------------------------------------- > > Key: THRIFT-3805 > URL: https://issues.apache.org/jira/browse/THRIFT-3805 > Project: Thrift > Issue Type: Bug > Components: Go - Library > Affects Versions: 0.9.3 > Environment: OSX 10.10.5, Debian 7.10 > Reporter: Michael Scott Leuthaeuser > Priority: Critical > Attachments: thrift_golang_memory_spike_fix.patch > > > When a thrift server is started through the Go library, using the default > TServerSocket, the processor from the generated files, and a SimpleServer2 > (which uses BinaryProtocol), it listens on the created socket for incoming > messages. However, if the incoming message is *not* from another Thrift > server, and thus the data submitted over the connection does not conform to > the Thrift message serialization format (for us, this was triggered by > automated security scanners connecting to the exposed port and sending it > data to see how it would react), it triggers a massive memory spike. > This appears to happen because the BinaryProtocol expects the first 4 bytes > of the file to indicate the size of the following message, interpreted as a > signed Int32 in BigEndian format. If this value is positive, it reads that > many bytes from the connection. > However, to do this, it first *allocates* a slice of that many bytes. If the > provided message is not a thrift serialization, this signed Int32 can be > massive (we're typically seeing it have a value of 1-2 billion), causing it > to allocate *hundreds* of megabytes of data. This slice is retained until > the expected number of bytes are received (which generally doesn't happen for > a malformed connection), or until the connection is closed from the client > end. When either happen, the BinaryProtocol then type-casts the entire array > to a String type, which causes a *second* allocation of equal size, before > returning the message to be processed. > Since this processing fails, both the byte slice and the string are discarded > and reclaimed on the next garbage collection cycle, but in the mean type, it > will have allocated an amount of memory equal to double the value of that > Int32, in bytes. We're typically seeing this allocation exceed 800-1000 MB > per variable (so 800 MB when the connection is received, and another 800-1000 > MB when it's closed). The first allocation will persist as long as the > connection is maintained, and other connections that are similarly malformed > will cause concurrent allocation spikes, which can easily crash the server. > To demonstrate this, one can make an extremely simple single-method thrift > contract, start a server, then netcat to the host port and feed it at least 4 > characters of data. > To fix this, we implemented an iterative buffered read into the > BinaryProtocol itself. Instead of allocating the entire array at once, if > the message indicates it is larger than a certain size (we have it set at > 32kb, but the actual value is largely arbitrary), it instead reads the > message in increments of that size, and joins them using the standard library > bytes.Buffer type. Here's a diff of our changes: > https://gist.github.com/kaedys/53b8fd05690d5d8f202afa7878d6e3d5 > This fixes the issue of having a massive initial allocation due to a > malformed request, while still allowing messages of arbitrary incoming size. > It will likely marginally slow down the reading of extremely large messages > (ones substantially larger than the constant defined), but the bytes.Buffer > type is well optimized for growing itself as it is written to. I also > realize that this only occurs with malformed requests to the server, but I > would think a server package should be relatively hardened against malformed > data being submitted to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)