[ https://issues.apache.org/jira/browse/THRIFT-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom May updated THRIFT-1190:
----------------------------

    Summary: readBufferBytesAllocated in TNonblockingServer.java should be AtomicLong to fix FD leakage and general server malfunction  (was: readBufferBytesAllocated TNonblockingServer.java should be AtomicLong to fix FD leakage and general server malfunction)

> readBufferBytesAllocated in TNonblockingServer.java should be AtomicLong to
> fix FD leakage and general server malfunction
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: THRIFT-1190
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1190
>             Project: Thrift
>          Issue Type: Bug
>          Components: Java - Library
>    Affects Versions: 0.6
>        Environment: Any
>            Reporter: Tom May
>        Attachments: thrift.patch
>
>
> We're having a problem using THsHaServer where the server suddenly stops
> closing connections and runs out of file descriptors.
> It looks like the problem is in TNonblockingServer.java. The variable "long
> readBufferBytesAllocated" is shared between threads but isn't synchronized.
> Intermittently, its value can get out of whack due to races and cause this if
> statement to always execute:
>
>     // if this frame will push us over the memory limit, then return.
>     // with luck, more memory will free up the next time around.
>     if (readBufferBytesAllocated.get() + frameSize > MAX_READ_BUFFER_BYTES) {
>       return true;
>     }
>
> We started having this problem when we set MAX_READ_BUFFER_BYTES to a
> somewhat small size, which unmasked the issue.
> I don't see anywhere here to attach a patch so here it is:
>
> Index: lib/java/src/org/apache/thrift/server/TNonblockingServer.java
> ===================================================================
> --- lib/java/src/org/apache/thrift/server/TNonblockingServer.java	(revision 1129978)
> +++ lib/java/src/org/apache/thrift/server/TNonblockingServer.java	(working copy)
> @@ -28,6 +28,7 @@
>  import java.util.HashSet;
>  import java.util.Iterator;
>  import java.util.Set;
> +import java.util.concurrent.atomic.AtomicLong;
>
>  import org.apache.thrift.TByteArrayOutputStream;
>  import org.apache.thrift.TException;
> @@ -87,7 +88,7 @@
>    /**
>     * How many bytes are currently allocated to read buffers.
>     */
> -  private long readBufferBytesAllocated = 0;
> +  private final AtomicLong readBufferBytesAllocated = new AtomicLong(0);
>
>    public TNonblockingServer(AbstractNonblockingServerArgs args) {
>      super(args);
> @@ -479,12 +480,12 @@
>
>      // if this frame will push us over the memory limit, then return.
>      // with luck, more memory will free up the next time around.
> -    if (readBufferBytesAllocated + frameSize > MAX_READ_BUFFER_BYTES) {
> +    if (readBufferBytesAllocated.get() + frameSize > MAX_READ_BUFFER_BYTES) {
>        return true;
>      }
>
>      // increment the amount of memory allocated to read buffers
> -    readBufferBytesAllocated += frameSize;
> +    readBufferBytesAllocated.addAndGet(frameSize);
>
>      // reallocate the readbuffer as a frame-sized buffer
>      buffer_ = ByteBuffer.allocate(frameSize);
> @@ -576,7 +577,7 @@
>      // if we're being closed due to an error, we might have allocated a
>      // buffer that we need to subtract for our memory accounting.
>      if (state_ == READING_FRAME || state_ == READ_FRAME_COMPLETE) {
> -      readBufferBytesAllocated -= buffer_.array().length;
> +      readBufferBytesAllocated.addAndGet(-buffer_.array().length);
>      }
>      trans_.close();
>    }
> @@ -600,7 +601,7 @@
>      // our read buffer count. we do this here as well as in close because
>      // we'd like to free this read memory up as quickly as possible for other
>      // clients.
> -    readBufferBytesAllocated -= buffer_.array().length;
> +    readBufferBytesAllocated.addAndGet(-buffer_.array().length);
>
>      if (response_.len() == 0) {
>        // go straight to reading again. this was probably an oneway method

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
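The patch boils down to replacing an unsynchronized shared `long` with a `java.util.concurrent.atomic.AtomicLong`, so that concurrent `+=` and `-=` updates no longer lose increments and drift the counter. A minimal, self-contained sketch of that accounting pattern follows; the class and method names (`ReadBufferAccounting`, `tryReserve`, `release`) are hypothetical stand-ins, not Thrift's actual API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for TNonblockingServer's read-buffer accounting.
public class ReadBufferAccounting {
    private static final long MAX_READ_BUFFER_BYTES = 16 * 1024 * 1024;

    // AtomicLong makes each increment/decrement a single indivisible update,
    // unlike "readBufferBytesAllocated += frameSize" on a plain long.
    private final AtomicLong readBufferBytesAllocated = new AtomicLong(0);

    /** Reserve frameSize bytes; returns false if over the global limit. */
    boolean tryReserve(int frameSize) {
        if (readBufferBytesAllocated.get() + frameSize > MAX_READ_BUFFER_BYTES) {
            return false; // over the limit; caller retries later
        }
        readBufferBytesAllocated.addAndGet(frameSize);
        return true;
    }

    /** Undo a reservation (on close or when the frame is consumed). */
    void release(int frameSize) {
        readBufferBytesAllocated.addAndGet(-frameSize);
    }

    long allocated() {
        return readBufferBytesAllocated.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ReadBufferAccounting acct = new ReadBufferAccounting();
        Thread[] threads = new Thread[8];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    if (acct.tryReserve(1024)) {
                        acct.release(1024);
                    }
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            th.join();
        }
        // Every reserve is matched by a release, so with AtomicLong the
        // counter returns to exactly 0. With a plain long, lost updates
        // leave it permanently skewed; once skewed high, every frame is
        // rejected and connections pile up -- the FD leak in this report.
        System.out.println(acct.allocated());
    }
}
```

Note that `tryReserve` keeps the patch's check-then-act shape (`get()` followed by `addAndGet()`), so concurrent callers can still overshoot `MAX_READ_BUFFER_BYTES` slightly; the fix targets the lost-update drift, not strict limit enforcement.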