> if you allow a huge backlog of messages to buffer up for one consumer,
> the other consumers can't work on them even if they're sitting around
> idle.
Thanks for your answers, but I don't understand that sentence. What do you
mean?

Regards

2015-11-10 16:56 GMT+01:00 Tim Bain <tb...@alumni.duke.edu>:

> On Thu, Nov 5, 2015 at 2:02 PM, jahlborn <jahlb...@gmail.com> wrote:
>
>> So I've spent some time digging around the internet and experimenting
>> with our setup, trying to determine the best configuration for a network
>> of brokers. There are a variety of things which seem to conspire against
>> me: the multitude of configuration options present in ActiveMQ, options
>> which seem to have no "optimal" setting (each choice has different pros
>> and cons), and behavior/features which change over time such that old
>> recommendations may now be irrelevant or no longer correct.
>>
>> For reference, we are using a mesh network of brokers, where consumers
>> may arrive at any broker in the network. We use topics, queues, and
>> virtual queues. We use explicit receive() calls as well as listeners. We
>> also utilize the "exclusive consumer" feature for some of our clustered
>> consumers. All messaging is currently durable. Some relevant
>> configuration bits (changes from the defaults):
>>
>> * using ActiveMQ 5.9.1
>> * advisory support is enabled
>> * PolicyEntry
>> ** optimizedDispatch: true
>> * using ConditionalNetworkBridgeFilterFactory
>> ** replayWhenNoConsumers: true
>> ** replayDelay: 1000
>> * NetworkConnector
>> ** prefetchSize: 1
>> ** duplex: false
>> ** messageTTL: 9999
>>
>> Alright, now that we got through all the context, here's the meat of the
>> question. There are a bunch of configuration parameters (mostly focused
>> in the NetworkConnector), and it's not at all clear to me whether my
>> current configuration is optimal. For instance, although we had been
>> using a configuration similar to the above for about a year now (same as
>> above except that messageTTL was 1), we only recently discovered that
>> our exclusive consumers can sometimes stop processing messages.
>> In certain cases, the consumer would end up bouncing around and the
>> messages would end up getting stuck on one node. Adding the messageTTL
>> setting seems to fix the problem (is it the right fix...?).
>
> If your problem was that you were getting stuck messages due to consumers
> bouncing around the network of brokers (which is what it sounds like),
> then yes, increasing the networkTTL (or the messageTTL, depending on your
> topology) is the correct fix, in addition to the
> replayWhenNoConsumers=true setting you said you had already set.
>
>> * NetworkConnector
>> ** "dynamicOnly" - I've seen a couple of places mention enabling this
>> and some indication that it helps with scaling in a network of brokers
>> (e.g. [3]). The description in [1] also makes it sound like something I
>> would want to enable. However, the value defaults to false, which seems
>> to indicate that there is a down-side to enabling it. Why wouldn't I
>> want to enable this?
>
> One major difference here is that with this setting disabled (the
> default), messages will stay on the broker to which they were first
> produced, while with it enabled they will go to the broker to which the
> durable subscriber was last connected (or at least, the last one in the
> route to the consumer, if the broker to which the consumer was actually
> connected has gone down since then).
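For concreteness, dynamicOnly is an attribute on the networkConnector element in activemq.xml. A minimal fragment enabling it might look like the following sketch; the connector name and peer URI are placeholders, not taken from the thread:

```xml
<!-- Sketch only: a network connector with dynamicOnly enabled.
     name and uri are illustrative placeholders. -->
<networkConnectors>
  <networkConnector name="to-peer-broker"
                    uri="static:(tcp://peer-host:61616)"
                    dynamicOnly="true"/>
</networkConnectors>
```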
> There are at least three disadvantages to enabling it: 1) if the
> producer connects to an embedded broker, then those messages go offline
> when the producer goes offline and aren't available when the consumer
> reconnects; 2) it only takes filling one broker's store before you
> Producer Flow Control the producer (whereas with the default setting you
> have to fill every broker along the last route to the consumer before
> PFC kicks in); and 3) if you have a high-latency network link in the
> route from producer to consumer, you delay traversing it until the
> consumer reconnects, which means the consumer may experience more
> latency than it would otherwise need to. So as with so many of these
> settings, the best configuration for you will depend on your situation.
>
> Also, default values are often the default because that's what they were
> when they were first introduced (to avoid breaking legacy
> configurations), not necessarily because that's the setting that's
> recommended for all users. Default values do get changed when the old
> value is clearly not appropriate and the benefits of a change outweigh
> the inconvenience to legacy users, but when there's not a clear
> preference they usually get left alone, which is a little confusing to
> new users.
>
>> ** "decreaseNetworkConsumerPriority",
>> "suppressDuplicateQueueSubscriptions" - these params both seem like
>> "damned if you do, damned if you don't" type parameters. The first
>> comment in [2] seems to imply that in order to scale, you really want
>> to enable these features so that producers prefer pushing messages to
>> local consumers (makes sense). Yet, at the same time, it seems that
>> enabling this feature will _decrease_ scalability in that it won't
>> evenly distribute messages in the case when there are multiple active
>> consumers (we use clusters of consumers in some scenarios). Also in
>> [2], there are some allusions to stuck messages if you don't enable
>> this feature.
>> Should I enable these parameters?
>
> You're right that decreaseNetworkConsumerPriority is a bit of a "damned
> if you do, damned if you don't" type parameter when you're trying to
> load balance between local and network clients, but if you're trying to
> load balance well, you should have small prefetch buffer sizes anyway.
> If you do, it really shouldn't matter which consumer gets prioritized;
> each will be given a small number of messages, and then there will be a
> large backlog sitting on the broker, which will hand messages out as the
> initial small batches get worked off. But the more important
> contribution of decreaseNetworkConsumerPriority is that it favors
> more-direct routes over less-direct ones through a network where
> multiple routes exist from a producer to a consumer, and (less important
> but still positive) enabling it seems likely to reduce the load on your
> broker network by having a message pass through fewer brokers, so I'd
> definitely enable it given what you've said about your concerns.
>
> suppressDuplicateQueueSubscriptions disallows multiple routes to a given
> consumer, but those routes are only a problem if
> decreaseNetworkConsumerPriority isn't used (because with
> decreaseNetworkConsumerPriority, you won't choose the non-optimal routes
> unless the optimal one disappears due to a broker going down). So if
> you're using decreaseNetworkConsumerPriority (and it sounds like you
> should), then I don't see a reason for you to use
> suppressDuplicateQueueSubscriptions. But it could be useful for anyone
> who's not using that option.
>
>> ** "networkTTL", "messageTTL", "consumerTTL" - until recently, we kept
>> these at the defaults (1). However, we recently realized that we can
>> end up with stuck messages with these settings. I've seen a couple of
>> places which recommend setting "networkTTL" to the number of brokers in
>> the network (e.g. [2]), or at least something > 1.
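As an illustration of the advice above on these two parameters, both are networkConnector attributes; a sketch preferring local consumers while leaving duplicate-subscription suppression off (the name and URI are placeholders) might look like:

```xml
<!-- Sketch only: prefer local consumers over network consumers.
     Per the advice above, suppressDuplicateQueueSubscriptions is left
     at its default (false) when decreaseNetworkConsumerPriority is on.
     name and uri are illustrative placeholders. -->
<networkConnector name="mesh-link"
                  uri="static:(tcp://peer-host:61616)"
                  decreaseNetworkConsumerPriority="true"/>
```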
>> However, the recommendation for "consumerTTL" on [1] is that this value
>> should be 1 in a mesh network (and setting the "networkTTL" will set
>> the "consumerTTL" as well). Additionally, [2] seems to imply that
>> enabling "suppressDuplicateQueueSubscriptions" acts like "networkTTL"
>> is 1 for proxy messages (unsure what this means?). We ended up setting
>> only the "messageTTL" and this seemed to solve our immediate problem.
>> Unsure if it will cause other problems...?
>
> messageTTL controls how far actual messages are allowed to propagate
> before they have to be consumed locally. consumerTTL controls how far
> advisory messages (which flow in the opposite direction from the actual
> messages, so that brokers that receive a message know where to send it)
> are allowed to propagate. networkTTL sets both values to the same thing;
> you should use either networkTTL (if you're using the same value) or
> messageTTL & consumerTTL (if you need different values), but don't mix
> them. I'm not aware of any problems caused by having a "too-large"
> consumerTTL/networkTTL unless you have a non-homogeneous network of
> brokers where you want to keep messages produced in one part from
> propagating into another part; if all you have is a uniform mesh where
> each message is "allowed" anywhere an appropriate consumer needs it,
> then just use networkTTL and avoid the confusion/hassle of splitting the
> configuration.
>
> In a mesh (all brokers connected to each other), you only need a
> consumerTTL of 1, because you can get the advisory message to every
> other broker in one hop. But in that same mesh, there's no guarantee
> that a single hop will get you to the broker where the consumer is,
> because the consumer might jump to another node in the mesh before
> consuming the message, which would then require another forward.
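The split-TTL arrangement described above can be sketched as follows for a full mesh; the values are illustrative and the URI is a placeholder:

```xml
<!-- Sketch only: in a full mesh, advisories need just one hop
     (consumerTTL=1), while actual messages may need to follow a
     consumer that bounces between brokers (messageTTL > 1).
     Values and uri are illustrative, not a recommendation. -->
<networkConnector name="mesh-link"
                  uri="static:(tcp://peer-host:61616)"
                  consumerTTL="1"
                  messageTTL="9999"/>
```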
> So in a mesh with decreaseNetworkConsumerPriority you may need a
> messageTTL/networkTTL of 1 + [MAX # FORWARDS] or greater, where
> [MAX # FORWARDS] is the worst-case number of jumps a consumer might make
> between the time a message is produced and the time it is consumed. In
> your case you've chosen 9999, so that allows 9998 consumer jumps, which
> should be more than adequate.
>
>> ** "prefetchSize" - defaults to 1000, but I see recommendations that it
>> should be 1 for network connectors (e.g. [3]). I think that in our
>> initial testing I saw bad things happen with this setting and got more
>> even load balancing by lowering it to 1.
>
> As I mentioned above, setting a small prefetch size is important for
> load balancing; if you allow a huge backlog of messages to buffer up for
> one consumer, the other consumers can't work on them even if they're
> sitting around idle. I'd pick a value like 1, 3, 5, 10, etc.; something
> small relative to the number of messages you're likely to have pending
> at any one time. (But note that the prefetch buffer can improve
> performance if you have messages that take a variable amount of time to
> process and sometimes the amount of time to process them is lower than
> the amount of time to transfer them between your brokers or from the
> broker to the consumer, such as with a high-latency network link. This
> doesn't sound like your situation, but it's yet another case where the
> right setting depends on your situation.)
>
>> I think that about summarizes my questions and confusion. Any help
>> would be appreciated!
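Pulling the replies in this thread together, a single mesh link combining the suggestions might look like the sketch below. The name and URI are placeholders, and whether these particular values fit depends on your topology, as the answers above stress:

```xml
<!-- Sketch only: one mesh link reflecting the advice in this thread.
     prefetchSize kept small for load balancing across consumers;
     networkTTL sized for 1 + worst-case consumer jumps (and it sets
     consumerTTL too, which the advice above says is fine in a uniform
     mesh); local consumers preferred over network consumers.
     name and uri are illustrative placeholders. -->
<networkConnectors>
  <networkConnector name="mesh-link"
                    uri="static:(tcp://peer-host:61616)"
                    duplex="false"
                    prefetchSize="1"
                    decreaseNetworkConsumerPriority="true"
                    networkTTL="9999"/>
</networkConnectors>
```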
>>
>> [1] http://activemq.apache.org/networks-of-brokers.html
>> [2] https://issues.jboss.org/browse/MB-471
>> [3] http://www.javabeat.net/deploying-activemq-for-large-numbers-of-concurrent-applications/
>>
>> --
>> View this message in context:
>> http://activemq.2283324.n4.nabble.com/Correctly-configuring-a-network-of-brokers-tp4703715.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.