[ 
https://issues.apache.org/activemq/browse/AMQ-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dejan Bosanac reassigned AMQ-3021:
----------------------------------

    Assignee: Dejan Bosanac

> HttpTunnelServlet leaks BlockingQueueTransport objects, causing eventual OOM 
> on heap space
> ------------------------------------------------------------------------------------------
>
>                 Key: AMQ-3021
>                 URL: https://issues.apache.org/activemq/browse/AMQ-3021
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Transport
>    Affects Versions: 5.4.1
>            Reporter: Stirling Chow
>            Assignee: Dejan Bosanac
>            Priority: Critical
>         Attachments: BlockingQueueTransport.java, 
> BlockingQueueTransportLeakTest.java, patch.txt
>
>
> Symptom
> ========
> We have a production system involving a network of 8 Brokers connected over 
> HTTP.  The Brokers discover each other using SimpleDiscoveryAgent.  Our 
> network experienced a period of instability during which time numerous 
> Broker-to-Broker bridges were created and failed repeatedly.  Over the course 
> of about 7 hours, two of the Brokers crashed with OOM heap space errors.
> We analyzed the heap dump and discovered several thousand instances of 
> org.apache.activemq.transport.http.BlockingQueueTransport.  These transports 
> were associated with bridges that had failed; however, they were not being 
> garbage collected because HttpTunnelServlet was maintaining references to 
> them.
> This issue was easily replicated in a test environment where we repeatedly 
> broke the connection between a pair of Brokers connected over HTTP.  In each 
> case, both Brokers maintained *indefinitely* a number of instances of 
> BlockingQueueTransport equal to the number of times the network was 
> interrupted.
> Cause
> =====
> When a bridge is first created over HTTP, the client broker's 
> HttpClientTransport sends a HEAD command to the server broker, which is 
> processed by an instance of HttpTunnelServlet.  In response, 
> HttpTunnelServlet creates an instance of BlockingQueueTransport to represent 
> the connection to the client broker.  This instance of BlockingQueueTransport 
> is stored in a private hash map managed by HttpTunnelServlet and indexed by 
> the client's unique ID:
> public class HttpTunnelServlet extends HttpServlet {
> ...
>     private final Map<String, BlockingQueueTransport> clients = new HashMap<String, BlockingQueueTransport>();
> ...
>     protected BlockingQueueTransport createTransportChannel(HttpServletRequest request, HttpServletResponse response) throws IOException {
>         String clientID = request.getHeader("clientID");
> ...
>             answer = createTransportChannel();
>             clients.put(clientID, answer);
> ...
> Every time a client broker reestablishes a bridge, it generates a new 
> clientID.  As a result, the clients hash map accumulates instances of 
> BlockingQueueTransport, one for each bridge created.  Nowhere in the 
> implementation of HttpTunnelServlet is there any code that removes the 
> instance when a client broker is no longer connected.  In an environment with 
> multiple brokers and an unreliable network, the clients hash map can 
> accumulate thousands of instances of BlockingQueueTransport.
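
The accumulation described above can be reproduced in isolation. The following is only a minimal sketch, not the ActiveMQ code: StubTransport and LeakyRegistry are hypothetical stand-ins for BlockingQueueTransport and for the servlet's clients map, and the "client-N" IDs are made up for illustration. It assumes nothing beyond the put-only behaviour the report describes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for BlockingQueueTransport; NOT the ActiveMQ class.
class StubTransport {
    // Represents whatever per-connection state a real transport would hold.
    final byte[] state = new byte[1024];
}

// Hypothetical, simplified model of the servlet's client registry.
class LeakyRegistry {
    private final Map<String, StubTransport> clients = new HashMap<String, StubTransport>();

    // Mirrors the behaviour in the report: one entry is added per clientID
    // and nothing ever removes it.
    synchronized StubTransport createTransportChannel(String clientID) {
        StubTransport answer = new StubTransport();
        clients.put(clientID, answer);
        return answer;
    }

    synchronized int size() {
        return clients.size();
    }

    // Simulates a client broker re-establishing its bridge several times,
    // using a fresh (made-up) clientID on every attempt.
    static int simulateReconnects(int reconnects) {
        LeakyRegistry registry = new LeakyRegistry();
        for (int i = 0; i < reconnects; i++) {
            registry.createTransportChannel("client-" + i);
        }
        return registry.size();
    }
}
```

In this model, LeakyRegistry.simulateReconnects(n) returns n: every interruption leaves one more strongly referenced transport behind, which matches the one-instance-per-interruption count observed in the test environment.
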
> Solution
> =======
> HttpTunnelServlet needs to remove an instance of BlockingQueueTransport from 
> the clients hash map whenever that instance is no longer being used.  The 
> addition of InactivityMonitor as a default interceptor for the 
> BlockingQueueTransport (see AMQ-2764) is a partial solution in that it 
> triggers the closure of unused BlockingQueueTransport instances; however, 
> HttpTunnelServlet does not detect these closures.
> The solution is included as a patch and involves the following changes to 
> HttpTunnelServlet (not all changes are directly related to the OOM):
> 1) The addition of a ServiceListener to the BlockingQueueTransport, which is 
> triggered when the transport is closed and causes the removal of the 
> transport from the clients hash map
> 2) Refactoring of the access to the clients hash map to simplify thread 
> safety (in particular, replacing explicit synchronization with 
> ConcurrentHashMap)
> 3) An additional check on the BlockingQueueTransport to ensure that it was 
> not prematurely closed (the previous code ignored this possibility)
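
The three changes listed above might look roughly like the sketch below. This is not the attached patch and does not use the ActiveMQ API: CloseListener, ClosableTransport and SelfCleaningRegistry are hypothetical names, the close callback stands in for whatever stop notification the real transport provides, and returning null for a duplicate or already closed transport is an arbitrary choice made for the sketch. It assumes Java 8 or later for the lambda.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical callback; stands in for the real transport's stop notification.
interface CloseListener {
    void closed(ClosableTransport transport);
}

// Hypothetical stand-in for BlockingQueueTransport.
class ClosableTransport {
    private volatile CloseListener listener;
    private volatile boolean closed;

    void setCloseListener(CloseListener listener) {
        this.listener = listener;
    }

    boolean isClosed() {
        return closed;
    }

    void close() {
        closed = true;
        CloseListener l = listener;
        if (l != null) {
            l.closed(this);
        }
    }
}

// Hypothetical registry that forgets a transport as soon as it is closed.
class SelfCleaningRegistry {
    // Change 2: a concurrent map instead of explicit synchronization.
    private final ConcurrentMap<String, ClosableTransport> clients =
            new ConcurrentHashMap<String, ClosableTransport>();

    ClosableTransport createTransportChannel(String clientID) {
        ClosableTransport answer = new ClosableTransport();
        // Change 1: on close, remove the mapping, but only if it still
        // points at this exact transport.
        answer.setCloseListener(t -> clients.remove(clientID, t));
        if (clients.putIfAbsent(clientID, answer) != null) {
            return null; // a transport is already registered for this clientID
        }
        // Change 3: the transport may have been closed before registration
        // completed; do not hand out (or retain) a dead transport.
        if (answer.isClosed()) {
            clients.remove(clientID, answer);
            return null;
        }
        return answer;
    }

    ClosableTransport getTransportChannel(String clientID) {
        ClosableTransport t = clients.get(clientID);
        if (t != null && t.isClosed()) {
            clients.remove(clientID, t);
            return null;
        }
        return t;
    }

    int size() {
        return clients.size();
    }
}
```

With this shape, anything that closes the transport (for example an inactivity timeout) also releases the registry's reference, so the map size tracks the number of live connections rather than the number of connections ever made.
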

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
