> if you allow a huge backlog of messages to buffer up for one consumer,
> the other consumers can't work on them even if they're sitting around
> idle.
Thanks for your answers, but I don't understand that sentence. What do you
mean?

Regards

2015-11-10 16:56 GMT+01:00 Tim Bain <tb...@alumni.duke.edu>:

> On Thu, Nov 5, 2015 at 2:02 PM, jahlborn <jahlb...@gmail.com> wrote:
>
>> So I've spent some time digging around the internet and experimenting
>> with our setup, trying to determine the best configuration for a network
>> of brokers. There are a variety of things which seem to conspire against
>> me: the multitude of configuration options present in ActiveMQ, options
>> which seem to have no "optimal" setting (each choice has different pros
>> and cons), and behavior/features which change over time such that old
>> recommendations may now be irrelevant or no longer correct.
>>
>> For reference, we are using a mesh network of brokers, where consumers
>> may arrive at any broker in the network. We use topics, queues, and
>> virtual queues. We use explicit receive() calls as well as listeners. We
>> also utilize the "exclusive consumer" feature for some of our clustered
>> consumers. All messaging is currently durable. Some relevant
>> configuration bits (changes from the defaults):
>>
>> * using ActiveMQ 5.9.1
>> * advisory support is enabled
>> * PolicyEntry
>> ** optimizedDispatch: true
>> * using ConditionalNetworkBridgeFilterFactory
>> ** replayWhenNoConsumers: true
>> ** replayDelay: 1000
>> * NetworkConnector
>> ** prefetchSize: 1
>> ** duplex: false
>> ** messageTTL: 9999
>>
>> Alright, now that we got through all the context, here's the meat of the
>> question. There are a bunch of configuration parameters (mostly focused
>> in the NetworkConnector), and it's not at all clear to me whether my
>> current configuration is optimal. For instance, although we had been
>> using a configuration similar to the above for about a year now (same as
>> above except that messageTTL was 1), we only recently discovered that
>> our exclusive consumers can sometimes stop processing messages.
>> In certain cases, the consumer would end up bouncing around and the
>> messages would end up getting stuck on one node. Adding the messageTTL
>> setting seems to fix the problem (is it the right fix...?).
>
> If your problem was that you were getting stuck messages due to consumers
> bouncing around the network of brokers (which is what it sounds like),
> then yes, increasing the networkTTL (or the messageTTL, depending on your
> topology) is the correct fix, in addition to the
> replayWhenNoConsumers=true setting you said you had already set.
>
>> * NetworkConnector
>> ** "dynamicOnly" - I've seen a couple of places mention enabling this
>> and some indication that it helps with scaling in a network of brokers
>> (e.g. [3]). The description in [1] also makes it sound like something I
>> would want to enable. However, the value defaults to false, which seems
>> to indicate that there is a down-side to enabling it. Why wouldn't I
>> want to enable this?
>
> One major difference here is that with this setting disabled (the
> default), messages will stay on the broker to which they were first
> produced, while with it enabled they will go to the broker to which the
> durable subscriber was last connected (or at least, the last one in the
> route to the consumer, if the broker to which the consumer was actually
> connected has gone down since then).
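For concreteness, dynamicOnly is an attribute on the networkConnector element in activemq.xml. A minimal fragment enabling it might look like the following sketch; the connector name and peer URI are placeholders, not taken from the thread:

```xml
<!-- Sketch only: a network connector with dynamicOnly enabled.
     name and uri are illustrative placeholders. -->
<networkConnectors>
  <networkConnector name="to-peer-broker"
                    uri="static:(tcp://peer-host:61616)"
                    dynamicOnly="true"/>
</networkConnectors>
```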
> There are at least three disadvantages to enabling it: 1) if the
> producer connects to an embedded broker, then those messages go offline
> when the producer goes offline and aren't available when the consumer
> reconnects; 2) it only takes filling one broker's store before you
> Producer Flow Control the producer (whereas with the default setting you
> have to fill every broker along the last route to the consumer before
> PFC kicks in); and 3) if you have a high-latency network link in the
> route from producer to consumer, you delay traversing it until the
> consumer reconnects, which means the consumer may experience more
> latency than it would otherwise need to. So as with so many of these
> settings, the best configuration for you will depend on your situation.
>
> Also, default values are often the default because that's what they were
> when they were first introduced (to avoid breaking legacy
> configurations), not necessarily because that's the setting that's
> recommended for all users. Default values do get changed when the old
> value is clearly not appropriate and the benefits of a change outweigh
> the inconvenience to legacy users, but when there's not a clear
> preference they usually get left alone, which is a little confusing to
> new users.
>
>> ** "decreaseNetworkConsumerPriority",
>> "suppressDuplicateQueueSubscriptions" - these params both seem like
>> "damned if you do, damned if you don't" type parameters. The first
>> comment in [2] seems to imply that in order to scale, you really want
>> to enable these features so that producers prefer pushing messages to
>> local consumers (makes sense). Yet, at the same time, it seems that
>> enabling this feature will _decrease_ scalability in that it won't
>> evenly distribute messages in the case when there are multiple active
>> consumers (we use clusters of consumers in some scenarios). Also in
>> [2], there are some allusions to stuck messages if you don't enable
>> this feature.
>> Should I enable these parameters?
>
> You're right that decreaseNetworkConsumerPriority is a bit of a "damned
> if you do, damned if you don't" type parameter when you're trying to
> load balance between local and network clients, but if you're trying to
> load balance well, you should have small prefetch buffer sizes anyway.
> If you do, it really shouldn't matter which consumer gets prioritized;
> each will be given a small number of messages, and then there will be a
> large backlog sitting on the broker, which will hand messages out as the
> initial small batches get worked off. But the more important
> contribution of decreaseNetworkConsumerPriority is that it favors
> more-direct routes over less-direct ones through a network where
> multiple routes exist from a producer to a consumer, and (less important
> but still positive) enabling it seems likely to reduce the load on your
> broker network by having a message pass through fewer brokers, so I'd
> definitely enable it given what you've said about your concerns.
>
> suppressDuplicateQueueSubscriptions disallows multiple routes to a given
> consumer, but those routes are only a problem if
> decreaseNetworkConsumerPriority isn't used (because with
> decreaseNetworkConsumerPriority, you won't choose the non-optimal routes
> unless the optimal one disappears due to a broker going down). So if
> you're using decreaseNetworkConsumerPriority (and it sounds like you
> should), then I don't see a reason for you to use
> suppressDuplicateQueueSubscriptions. But it could be useful for anyone
> who's not using that option.
>
>> ** "networkTTL", "messageTTL", "consumerTTL" - until recently, we kept
>> these at the defaults (1). However, we recently realized that we can
>> end up with stuck messages with these settings. I've seen a couple of
>> places which recommend setting "networkTTL" to the number of brokers in
>> the network (e.g. [2]), or at least something > 1.
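As an illustration of the advice above on these two parameters, both are networkConnector attributes; a sketch preferring local consumers while leaving duplicate-subscription suppression off (the name and URI are placeholders) might look like:

```xml
<!-- Sketch only: prefer local consumers over network consumers.
     Per the advice above, suppressDuplicateQueueSubscriptions is left
     at its default (false) when decreaseNetworkConsumerPriority is on.
     name and uri are illustrative placeholders. -->
<networkConnector name="mesh-link"
                  uri="static:(tcp://peer-host:61616)"
                  decreaseNetworkConsumerPriority="true"/>
```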
>> However, the recommendation for "consumerTTL" on [1] is that this value
>> should be 1 in a mesh network (and setting the "networkTTL" will set
>> the "consumerTTL" as well). Additionally, [2] seems to imply that
>> enabling "suppressDuplicateQueueSubscriptions" acts like "networkTTL"
>> is 1 for proxy messages (unsure what this means?). We ended up setting
>> only the "messageTTL" and this seemed to solve our immediate problem.
>> Unsure if it will cause other problems...?
>
> messageTTL controls how far actual messages are allowed to propagate
> before they have to be consumed locally. consumerTTL controls how far
> advisory messages (which flow in the opposite direction from the actual
> messages, so that brokers that receive a message know where to send it)
> are allowed to propagate. networkTTL sets both values to the same thing;
> you should use either networkTTL (if you're using the same value) or
> messageTTL & consumerTTL (if you need different values), but don't mix
> them. I'm not aware of any problems caused by having a "too-large"
> consumerTTL/networkTTL unless you have a non-homogeneous network of
> brokers where you want to keep messages produced in one part from
> propagating into another part; if all you have is a uniform mesh where
> each message is "allowed" anywhere an appropriate consumer needs it,
> then just use networkTTL and avoid the confusion/hassle of splitting the
> configuration.
>
> In a mesh (all brokers connected to each other), you only need a
> consumerTTL of 1, because you can get the advisory message to every
> other broker in one hop. But in that same mesh, there's no guarantee
> that a single hop will get you to the broker where the consumer is,
> because the consumer might jump to another node in the mesh before
> consuming the message, which would then require another forward.
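The split-TTL arrangement described above can be sketched as follows for a full mesh; the values are illustrative and the URI is a placeholder:

```xml
<!-- Sketch only: in a full mesh, advisories need just one hop
     (consumerTTL=1), while actual messages may need to follow a
     consumer that bounces between brokers (messageTTL > 1).
     Values and uri are illustrative, not a recommendation. -->
<networkConnector name="mesh-link"
                  uri="static:(tcp://peer-host:61616)"
                  consumerTTL="1"
                  messageTTL="9999"/>
```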
> So in a mesh with decreaseNetworkConsumerPriority you may need a
> messageTTL/networkTTL of 1 + [MAX # FORWARDS] or greater, where
> [MAX # FORWARDS] is the worst-case number of jumps a consumer might make
> between the time a message is produced and the time it is consumed. In
> your case you've chosen 9999, so that allows 9998 consumer jumps, which
> should be more than adequate.
>
>> ** "prefetchSize" - defaults to 1000, but I see recommendations that it
>> should be 1 for network connectors (e.g. [3]). I think that in our
>> initial testing I saw bad things happen with this setting and got more
>> even load balancing by lowering it to 1.
>
> As I mentioned above, setting a small prefetch size is important for
> load balancing; if you allow a huge backlog of messages to buffer up for
> one consumer, the other consumers can't work on them even if they're
> sitting around idle. I'd pick a value like 1, 3, 5, 10, etc.; something
> small relative to the number of messages you're likely to have pending
> at any one time. (But note that the prefetch buffer can improve
> performance if you have messages that take a variable amount of time to
> process and sometimes the amount of time to process them is lower than
> the amount of time to transfer them between your brokers or from the
> broker to the consumer, such as with a high-latency network link. This
> doesn't sound like your situation, but it's yet another case where the
> right setting depends on your situation.)
>
>> I think that about summarizes my questions and confusion. Any help
>> would be appreciated!
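Pulling the replies in this thread together, a single mesh link combining the suggestions might look like the sketch below. The name and URI are placeholders, and whether these particular values fit depends on your topology, as the answers above stress:

```xml
<!-- Sketch only: one mesh link reflecting the advice in this thread.
     prefetchSize kept small for load balancing across consumers;
     networkTTL sized for 1 + worst-case consumer jumps (and it sets
     consumerTTL too, which the advice above says is fine in a uniform
     mesh); local consumers preferred over network consumers.
     name and uri are illustrative placeholders. -->
<networkConnectors>
  <networkConnector name="mesh-link"
                    uri="static:(tcp://peer-host:61616)"
                    duplex="false"
                    prefetchSize="1"
                    decreaseNetworkConsumerPriority="true"
                    networkTTL="9999"/>
</networkConnectors>
```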
>>
>> [1] http://activemq.apache.org/networks-of-brokers.html
>> [2] https://issues.jboss.org/browse/MB-471
>> [3] http://www.javabeat.net/deploying-activemq-for-large-numbers-of-concurrent-applications/
>>
>> --
>> View this message in context:
>> http://activemq.2283324.n4.nabble.com/Correctly-configuring-a-network-of-brokers-tp4703715.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.