[
https://issues.apache.org/activemq/browse/AMQ-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rob Davies reassigned AMQ-1709:
-------------------------------
Assignee: Rob Davies
> Network of Brokers Memory Leak Due to Race Condition
> ----------------------------------------------------
>
> Key: AMQ-1709
> URL: https://issues.apache.org/activemq/browse/AMQ-1709
> Project: ActiveMQ
> Issue Type: Bug
> Components: Broker, Transport
> Affects Versions: 4.1.2, 5.0.0
> Reporter: Howard Orner
> Assignee: Rob Davies
>
> When you have a network of brokers configuration with at least 3 brokers, such
> as:
> <broker brokerName="A" persistent="false" ...
> ...
> <transportConnector name="AListener" uri="tcp://localhost:61610"/>
> ...
> <networkConnector name="BConnector" uri="static:(tcp://localhost:61620)"/>
> <networkConnector name="CConnector" uri="static:(tcp://localhost:61630)"/>
> with the other brokers having a similar configuration.
> Then, if you have subscribers trying to connect to all of the brokers, you can
> hit a race condition at startup where the transports accept connections
> from subscribers before the network connectors are initialized. In
> BrokerService.startAllConnectors(), the transports are started first, then the
> NetworkConnectors. As part of starting the network connectors, their
> constructors take a collection obtained by calling
> getBroker().getDurableDestinations(). Normally this list would be empty.
> However, if clients connect before this is called, a list is returned for
> each topic subscribed to. Then, instead of creating standard
> TopicSubscriptions for the network connector, DurableTopicSubscriptions are
> created. I'm not sure whether this should be a problem in itself, but it is,
> because SimpleDispatchPolicy, in the process of iterating through the
> DurableTopicSubscriptions, causes messages to be queued up for prefetch
> without clearing all of the references (for each pass through, it looks like
> three references are registered and only two are cleared). This becomes a
> memory leak. In the logs you see a message saying the PrefetchLimit was
> reached, and then you start seeing logs about memory usage increasing until it
> reaches 100% and everything stops.
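> For reference, here is a rough sketch of the startup ordering I'm describing
> (this is not the actual BrokerService source; the connector accessors and the
> setDurableDestinations() call are assumptions based on the description above):
> // Simplified sketch of the current startAllConnectors() ordering.
> // Transports come up first, so clients can connect and register durable
> // subscription state before the network connectors read
> // getBroker().getDurableDestinations() below.
> protected void startAllConnectors() throws Exception {
>     for (TransportConnector connector : getTransportConnectors()) {
>         connector.start();   // clients can connect from this point on
>     }
>     for (NetworkConnector connector : getNetworkConnectors()) {
>         // race window: if a subscriber got in above, this collection is no
>         // longer empty and the bridge ends up with DurableTopicSubscriptions
>         // instead of plain TopicSubscriptions
>         connector.setDurableDestinations(getBroker().getDurableDestinations());
>         connector.start();
>     }
> }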
> To reproduce this, create a network of brokers configuration of at least 3
> brokers -- the more you have, the more likely you are to hit this without a
> lot of tries, so I suggest a bunch. Start all brokers. Establish a publisher
> on broker A using failover://(tcp://localhost:61610), then establish a bunch
> of subscribers on all the brokers using a similar configuration, i.e.,
> failover://(tcp://localhost:61610), failover://(tcp://localhost:61620). The
> more you have on broker 'A' the better, since you are trying to reproduce the
> race condition. You want the others up so that the other brokers expect
> messages to be passed to them. Once everybody is up and happy, kill broker
> A and restart it. If you do that enough times, you will hit the race
> condition and the memory leak will start. You can also put a break point
> in BrokerService.startAllConnectors() after the transports are started but
> before the network connectors are started. That'll give clients time to
> connect to the transport threads before you tell the VM to continue.
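> As an illustration, a minimal JMS subscriber for the reproduction above could
> look like the following (the topic name TEST.TOPIC is just a placeholder; any
> topic the publisher on broker A uses will do):
> import javax.jms.*;
> import org.apache.activemq.ActiveMQConnectionFactory;
>
> public class ReproSubscriber {
>     public static void main(String[] args) throws Exception {
>         // point different instances at 61610, 61620, 61630, ...
>         ConnectionFactory factory =
>                 new ActiveMQConnectionFactory("failover://(tcp://localhost:61610)");
>         Connection connection = factory.createConnection();
>         connection.start();
>         Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
>         MessageConsumer consumer =
>                 session.createConsumer(session.createTopic("TEST.TOPIC"));
>         consumer.setMessageListener(new MessageListener() {
>             public void onMessage(Message message) {
>                 System.out.println("received " + message);
>             }
>         });
>         // keep the subscriber alive while broker A is killed and restarted
>         Thread.sleep(Long.MAX_VALUE);
>     }
> }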
> I found it an easy fix to store the durable destination list in a local
> variable before starting the transports and to pass that to the network
> connectors instead of making separate calls. I'm not sure if there are
> 'normal' ways for that list to be anything other than empty. If not, you could
> just pass an empty set to the network connectors, but I suspect there are
> legitimate configurations that may need this to be requested. If so, this
> memory leak would likely occur in those cases, too.
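> A rough sketch of the change I have in mind (again, not the actual code; the
> accessor and setter names are assumptions, only getDurableDestinations() is
> taken from the behaviour described above):
> protected void startAllConnectors() throws Exception {
>     // snapshot the (normally empty) durable destinations *before* the
>     // transports start accepting client connections
>     Set<ActiveMQDestination> durableDestinations =
>             getBroker().getDurableDestinations();
>
>     for (TransportConnector connector : getTransportConnectors()) {
>         connector.start();
>     }
>     for (NetworkConnector connector : getNetworkConnectors()) {
>         // hand every network connector the snapshot taken above instead of
>         // letting each one fetch the durable destinations on its own
>         connector.setDurableDestinations(durableDestinations);
>         connector.start();
>     }
> }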
> I ran into this in 4.1.2. I haven't tested 5.0 since our attempts to switch
> to 5.0 were met with failure due to the number of bugs in 5.0 (already
> reported by others). Looking at 5.0.0 source, the race condition is still
> there in BrokerService.startAllConnectors() so I suspect the memory leak is
> there as well.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.