[
https://issues.apache.org/activemq/browse/AMQ-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dejan Bosanac reassigned AMQ-3021:
----------------------------------
Assignee: Dejan Bosanac
> HttpTunnelServlet leaks BlockingQueueTransport objects, causing eventual OOM
> on heap space
> ------------------------------------------------------------------------------------------
>
> Key: AMQ-3021
> URL: https://issues.apache.org/activemq/browse/AMQ-3021
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker, Transport
> Affects Versions: 5.4.1
> Reporter: Stirling Chow
> Assignee: Dejan Bosanac
> Priority: Critical
> Attachments: BlockingQueueTransport.java,
> BlockingQueueTransportLeakTest.java, patch.txt
>
>
> Symptom
> ========
> We have a production system involving a network of 8 Brokers connected over
> HTTP. The Brokers discover each other using SimpleDiscoveryAgent. Our
> network experienced a period of instability during which time numerous
> Broker-to-Broker bridges were created and failed repeatedly. Over the course
> of about 7 hours, two of the Brokers crashed with OOM heap space errors.
> We analyzed the heap dump and discovered several thousand instances of
> org.apache.activemq.transport.http.BlockingQueueTransport. These transports
> were associated with bridges that had failed; however, they were not being
> garbage collected because HttpTunnelServlet was maintaining references to
> them.
> This issue was easily replicated in a test environment where we repeatedly
> broke the connection between a pair of Brokers connected over HTTP. In each
> case, both Brokers maintained *indefinitely* a number of instances of
> BlockingQueueTransport equal to the number of times the network was
> interrupted.
> Cause
> =====
> When a bridge is first created over HTTP, the client broker's
> HttpClientTransport sends a HEAD command to the server broker, which is
> processed by an instance of HttpTunnelServlet. In response,
> HttpTunnelServlet creates an instance of BlockingQueueTransport to represent
> the connection to the client broker. This instance of BlockingQueueTransport
> is stored in a private hash map managed by HttpTunnelServlet and indexed by
> the client's unique ID:
>
>     public class HttpTunnelServlet extends HttpServlet {
>         ...
>         private final Map<String, BlockingQueueTransport> clients =
>             new HashMap<String, BlockingQueueTransport>();
>         ...
>         protected BlockingQueueTransport createTransportChannel(
>                 HttpServletRequest request, HttpServletResponse response)
>                 throws IOException {
>             String clientID = request.getHeader("clientID");
>             ...
>             answer = createTransportChannel();
>             clients.put(clientID, answer);
>             ...
>
> Every time a client broker reestablishes a bridge, it generates a new
> clientID. As a result, the clients hash map accumulates instances of
> BlockingQueueTransport, one for each bridge created. Nowhere in the
> implementation of HttpTunnelServlet is there any code that removes the
> instance when a client broker is no longer connected. In an environment with
> multiple brokers and an unreliable network, the clients hash map can
> accumulate thousands of instances of BlockingQueueTransport.
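The accumulation described above can be reproduced without ActiveMQ. The following is only a minimal sketch: FakeTransport, LeakyRegistry and LeakDemo are hypothetical stand-ins invented for illustration, not classes from the broker. It assumes, as the description states, that every re-established bridge arrives with a new clientID and that nothing ever removes an entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for BlockingQueueTransport (not the ActiveMQ class).
class FakeTransport {
}

// Hypothetical stand-in for the bookkeeping done by the servlet.
class LeakyRegistry {
    private final Map<String, FakeTransport> clients = new HashMap<String, FakeTransport>();

    // Mirrors the put() in the excerpt: an entry is added and never removed.
    synchronized FakeTransport register(String clientID) {
        FakeTransport answer = new FakeTransport();
        clients.put(clientID, answer);
        return answer;
    }

    synchronized int size() {
        return clients.size();
    }
}

class LeakDemo {
    // Simulates a bridge being re-established with a fresh clientID each time
    // and returns how many transports the registry still references.
    static int simulateReconnects(int reconnects) {
        LeakyRegistry registry = new LeakyRegistry();
        for (int i = 0; i < reconnects; i++) {
            registry.register("client-" + i);
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("entries retained: " + simulateReconnects(1000));
    }
}
```

Because the map holds a strong reference to every transport, none of them can be garbage collected, whether or not the bridge behind it is still alive.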
> Solution
> ========
> HttpTunnelServlet needs to remove an instance of BlockingQueueTransport from
> the clients hash map whenever that instance is no longer being used. The
> addition of InactivityMonitor as a default interceptor for the
> BlockingQueueTransport (see AMQ-2764) is a partial solution in that it
> triggers the closure of unused BlockingQueueTransport instances; however,
> HttpTunnelServlet does not detect these closures.
> The solution is included in the attached patch and involves the following changes to
> HttpTunnelServlet (not all changes are directly related to the OOM):
> 1) The addition of a ServiceListener to the BlockingQueueTransport, which is
> triggered when the transport is closed and causes the removal of the
> transport from the clients hash map
> 2) Refactoring of the access to the clients hash map to simplify thread
> safety (in particular, removal of explicit synchronization in favor of
> ConcurrentHashMap)
> 3) An additional check on the BlockingQueueTransport to ensure that it was
> not prematurely closed (the previous code ignored this possibility)
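The three changes listed above could be sketched roughly as follows. This is not the code from patch.txt: CloseListener, SketchTransport and TransportRegistry are hypothetical names made up for illustration, and the real patch attaches its listener to the actual transport rather than to a stand-in class. The sketch assumes Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical close callback, standing in for the listener the patch adds.
interface CloseListener {
    void onClosed();
}

// Hypothetical stand-in for BlockingQueueTransport.
class SketchTransport {
    private volatile boolean closed;
    private volatile CloseListener listener;

    void setCloseListener(CloseListener l) {
        listener = l;
    }

    boolean isClosed() {
        return closed;
    }

    // Marks the transport closed, then notifies the listener if one is set.
    void close() {
        closed = true;
        CloseListener l = listener;
        if (l != null) {
            l.onClosed();
        }
    }
}

// Hypothetical stand-in for the servlet's bookkeeping.
class TransportRegistry {
    // Change 2: a concurrent map replaces explicit synchronization.
    private final ConcurrentMap<String, SketchTransport> clients = new ConcurrentHashMap<>();

    SketchTransport register(String clientID) {
        SketchTransport answer = new SketchTransport();
        // Change 1: closing the transport removes its own map entry.
        // The two-argument remove() only removes this exact instance.
        answer.setCloseListener(() -> clients.remove(clientID, answer));
        if (clients.putIfAbsent(clientID, answer) != null) {
            throw new IllegalStateException("clientID already registered: " + clientID);
        }
        return answer;
    }

    // Change 3: a transport that has already been closed is treated as absent.
    SketchTransport lookup(String clientID) {
        SketchTransport t = clients.get(clientID);
        return (t == null || t.isClosed()) ? null : t;
    }

    int size() {
        return clients.size();
    }
}
```

The listener is attached before the transport is published in the map, so a close that happens immediately after registration still removes the entry.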