Alan Conway wrote:
On Mon, 2006-09-11 at 14:37 +0100, Gordon Sim wrote:
Why do we need extra queues to scale up?
For a single broker you don't; the broker should be optimized to take
full advantage of local CPUs etc. with a single queue.
I'm thinking about the large scale deployment where you need to
distribute load across multiple hosts (possibly on different networks)
because either the CPUs on the exchange host or its network
connectivity are a bottleneck.
Ok, understood.
So let the exchange act as a consumer and *remove* messages from
over-full queues to re-queue them on under-full/empty ones. Now we have
a full dynamic balance in both growing and shrinking phases.
This complicates the exchange though as it now presumably needs to
periodically monitor the queue lengths (i.e. it becomes an active entity
rather than just reacting to publications routed through it). Maybe an
entirely separate re-balancing component would be cleaner?
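
To make the "separate re-balancing component" idea concrete, here is a minimal sketch of one balancing pass over in-memory queues. Everything in it is an assumption for illustration: the `rebalance` function, the `threshold` parameter, and the use of Python deques to stand in for broker queues are hypothetical and do not come from any existing broker code. A real component would presumably run a pass like this on a timer and would have to deal with concurrent consumers.

```python
from collections import deque


def rebalance(queues, threshold=2):
    """Run one balancing pass over a list of deques (hypothetical helper).

    Moves messages one at a time from the longest queue to the shortest
    until their lengths differ by less than `threshold`.
    Returns the number of messages moved.
    """
    if threshold < 2:
        # With a threshold of 1 a single message could bounce back and
        # forth between two queues forever.
        raise ValueError("threshold must be at least 2")
    moved = 0
    while True:
        longest = max(queues, key=len)
        shortest = min(queues, key=len)
        if len(longest) - len(shortest) < threshold:
            return moved
        # Take the newest message from the longest queue so that its head
        # (the oldest message) stays where its consumers expect it.
        shortest.append(longest.pop())
        moved += 1


# Three queues holding 10, 0 and 2 messages respectively.
queues = [deque(range(10)), deque(), deque([100, 101])]
moved = rebalance(queues)
lengths = [len(q) for q in queues]
```

After the pass, six messages have moved and each queue holds four. Note that the component only needs read access to queue lengths plus the ability to consume and publish, which is why it need not live inside the exchange.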
Maybe; I need to think about that. My intuition is that the broker can do
a better job of this because it's a single place to keep the statistics,
but a distributed cleanup component might have advantages if network
bandwidth at the broker is the bottleneck.
I'm not saying that component shouldn't be in the broker(s) just that it
isn't necessarily part of an exchange.
I'm not really sure I understand the root problem here, i.e. why do we
want multiple queues of the same (or similar) length?
Trying to balance multiple queues only makes sense if there's a resource
problem with everyone talking to a single queue - not enough memory,
not enough open file descriptors, performance degrades due to memory
requirements, network topology/firewalls etc. You can imagine situations
where a single broker with 1,000,000 consumers might not perform as well
as a federation of 1001 brokers each with 1000 consumers.
My confusion here stemmed from not understanding that you were talking
about a group of co-operating brokers. (This is actually the use case
that the Java clustering code currently in svn was designed for though
any actual improvements to scalability have not been confirmed through
testing).
That being the case I would argue even more strongly against the
exchange removing messages from queues it has delivered them to and
redelivering them to shorter queues to load balance. For one thing that
would have implications for ordering. A single logical queue that is in
implementation distributed would seem like a better fit from the design
point of view (the clustering code mentioned in the previous paragraph
does something similar).
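
The ordering concern can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two deques stand in for broker queues, the "move the newest half" step is a hypothetical balancing policy, and each queue is assumed to have one consumer taking one message per round.

```python
from collections import deque

# A publisher sends messages 0..5 to queue_a in order; queue_b starts empty.
queue_a = deque(range(6))
queue_b = deque()

# Hypothetical balancing step: shift the newest half of queue_a to queue_b,
# keeping the moved messages in their original relative order.
for _ in range(len(queue_a) // 2):
    queue_b.appendleft(queue_a.pop())

# One consumer per queue, each taking one message per round.
delivered = []
while queue_a or queue_b:
    if queue_a:
        delivered.append(queue_a.popleft())
    if queue_b:
        delivered.append(queue_b.popleft())

# True only if messages were delivered in the order they were published.
in_order = delivered == sorted(delivered)
```

Here `delivered` ends up as `[0, 3, 1, 4, 2, 5]`, so message 3 is delivered before messages 1 and 2 even though it was published after them. Each queue is still FIFO internally, but the publish order is no longer preserved across the pair.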
That said, we need to work hard optimizing the broker so that the single
queue solution can scale as far as possible before we get into more
complicated federations and the like. We should also look at some real
data before assuming that such federation will solve a real-world
problem; this stuff doesn't always work out the way you think it will!
Agreed! The justification for the earlier work on Java was purely to get
feature parity with an alternative implementation.