On 11/19/2009 09:31 AM, Gordon Sim wrote:
On 11/19/2009 02:29 PM, Alan Conway wrote:
On 11/19/2009 09:04 AM, Gordon Sim wrote:
On 11/18/2009 10:23 PM, Alan Conway wrote:
On 11/18/2009 02:38 PM, Andrew Stitcher wrote:
On Wed, 2009-11-18 at 13:39 -0500, Carl Trieloff wrote:
Alan Conway wrote:
At the moment a clustered broker stalls clients while it is
initializing, giving or receiving an update. It's been pointed out
that this can result in long delays for clients connected to a broker
that elects to give an update to a newcomer, and it might be better
for the broker to disconnect clients so they can fail over to another
broker not busy with an update.

There are 3 cases to consider:

- new member joining/getting update, new client: stall or reject?
- established member giving update, new client: stall or reject?
- established member giving update, connected client: stall or disconnect?

On the 3rd point I would note that it's possible for clients to
disconnect themselves if the broker is unresponsive by using
heartbeats, and that not all clients can fail over, so I'd lean
towards stall on that one, but I think rejecting new clients may make
sense here.
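The heartbeat idea above can be sketched as a client-side watchdog: the client notes when it last heard from the broker, and treats a broker that has been silent too long (e.g. because it is stalled giving an update) as dead, freeing it to fail over. This is a hypothetical sketch, not the actual qpid client code; the class name, interval, and grace factor are all illustrative.

```python
import time

class HeartbeatWatchdog:
    """Treat the broker as dead once it has missed too many heartbeats.

    Hypothetical sketch of the client-side heartbeat mechanism
    discussed above; not the real qpid client API.
    """

    def __init__(self, interval=2.0, grace=2.0, clock=time.monotonic):
        self.interval = interval   # heartbeat period agreed with the broker
        self.grace = grace         # tolerate this many missed periods
        self.clock = clock
        self.last_beat = clock()   # treat connection setup as the first beat

    def beat(self):
        """Record a heartbeat (or any traffic) from the broker."""
        self.last_beat = self.clock()

    def broker_alive(self):
        """False once the broker has been silent longer than interval*grace."""
        return (self.clock() - self.last_beat) < self.interval * self.grace
```

A client driving this watchdog would close its connection and fail over to another cluster member as soon as broker_alive() goes false.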

Part of the original motivation for stalling is that it makes it easy
to write tests. You can start a broker and immediately start a client
without worrying about waiting till the broker is ready. That's a
nice property but there are other ways to achieve it. Currently
qpidd -d returns as soon as the broker is ready to listen for TCP
requests, which may be before the broker has joined the cluster. We
could change that behavior to wait till all plugins report "ready".
For tests we could also grep the log output for the ready message.
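The grep-the-log approach for tests could look something like the helper below: poll the broker's log until a readiness message appears or a timeout expires. The marker string here is an assumption for illustration; check what the broker version under test actually logs.

```python
import time

def wait_for_ready(log_path, marker="ready", timeout=30.0, poll=0.1):
    """Poll a broker's log file until the readiness marker appears.

    Sketch of the test-helper idea described above; `marker` is an
    assumed log message, not a guaranteed qpidd string.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with open(log_path) as f:
                if marker in f.read():
                    return True
        except FileNotFoundError:
            pass                  # broker may not have created the log yet
        time.sleep(poll)
    return False
```

A test would start the broker, call wait_for_ready() on its log, and only then connect a client, so it never depends on the broker stalling connections for correctness.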

I think it would be better not to stall existing clients at all in the
established member giving update case.

In the new member case it would be better not to accept connections at
all until the broker is up to date.

It's arguable what the best behaviour is in the established member
giving update gets a new client case. However I would note that the
low-level code isn't capable of stopping accepting connections and
then starting again once it has begun accepting. So new clients would
have to connect and then be disconnected with an exception.

I would also suggest that considering the number of likely cluster
members is important here - I'd expect very few installations to run
more than 4 machines in a cluster, and 2 is probably the norm.

So if a single broker goes down and restarts it's going to be brought
up to date by the only other cluster member. In this case, if that
member stalls, no more work can get done until the rejoining member is
up to date.

I guess this sort of case can be dealt with in the current scheme by
having multiple cluster members on a single piece of hardware.

Andrew


Thoughts appreciated!

I would disallow connections to the new broker until it is synced. I
would not bump any active connections, but rather leave that to
heartbeat.

One other idea would be to add an option to the cluster config which
could specify the preferred nodes to update from, and it would try
this list first. I.e. in a 4-node cluster, all updates are made from
node 4 (preferred) if it is present, and then from an app point of
view I connect to nodes 1-3, for example. This way updates have no
effect on my clients, and if I care about being stalled I set this
option. If the preferred node(s) are not running it would just pick
one as it does today.

Carl.


Good ideas here. To bring it together, how about this:

There are 2 kinds of broker process:
- Service brokers serve clients, they never give updates.
- Update brokers give updates, they never serve clients.

We create them automatically in pairs: a service broker forks an update
broker and restarts it if it dies. The update broker never accepts
connections and is not advertised to clients for failover.
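The fork-and-restart pairing above amounts to a small supervisor loop in the service broker: spawn the update broker, wait for it to exit, and respawn it. A minimal sketch, assuming a hypothetical command line for the update broker (these are not real qpidd options):

```python
import subprocess
import time

def supervise_update_broker(cmd, restart_delay=1.0, max_restarts=None):
    """Keep a (hypothetical) update-broker process running.

    Sketch of the service/update broker pairing described above:
    `cmd` and any flags it contains are assumptions for illustration.
    Returns the number of times the child was launched.
    """
    launches = 0
    while max_restarts is None or launches <= max_restarts:
        proc = subprocess.Popen(cmd)   # fork the update broker
        proc.wait()                    # block until the child dies
        launches += 1
        time.sleep(restart_delay)      # avoid a tight crash loop
    return launches
```

In the real design this loop would run in a thread of the service broker, and the child's command line would suppress client listening and failover advertisement.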

So the 3 cases are now
- new member joining/getting update: rejected (with exception) until
ready.
- established member giving update, new client: never happens.
- established member giving update, connected client: never happens.

We could further constrain things and say a service broker can *only*
get an update from its own update broker (once the update broker is up
to date). The advantage is they'll be on the same host so less network
traffic, the disadvantage is they can't update in parallel if there are
multiple update brokers available.

Does that address all the issues? There is some extra complexity in
having 2 processes per broker, but for the moment I can't see any
insurmountable hurdles. The nice thing is that we can do this with
zero new configuration so it will Just Work when it's installed.

With Carl's suggestion of an option to restrict which servers are used
for updating, you could set up the same thing. That allows those who
don't want or need the extra complexity to avoid it, while allowing
full flexibility to those who do need it.

Yes, it's probably better to make it an option than an automatic behavior.

On a separate but related point, how are nodes not doing the update
affected by the updater stalling? If there is an application error (e.g.
queue not found) does the whole cluster hang until the update is
complete?

Yes, unfortunately it would. Another good reason to split the work of
updating from servicing clients, so the update can complete more quickly.

If there is a high load on the other nodes is there any danger
that the updater will never be able to catch up after the update is
complete and the cpg queue would keep filling up?

Yes, if the load is continuously at the maximum the update broker can
handle, it could end up lagging behind the other brokers. We could
introduce a limit on the size of the broker's CPG queue to push back
on CPG and let its flow control slow down the other brokers if one
starts to lag too far behind.
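The back-pressure idea can be illustrated with a bounded queue: when the lagging broker's delivery queue fills up, the side enqueuing cluster events blocks, which slows the producers until the slow consumer catches up. This is a generic sketch of the mechanism, not the CPG API; queue sizes and delays are illustrative.

```python
import queue
import threading
import time

def run_cluster_events(events, max_lag=10):
    """Deliver events through a bounded per-broker queue.

    Sketch of the flow-control idea above: put() blocks when the
    queue holds `max_lag` undelivered events, so a slow broker
    throttles the whole event stream rather than lagging unboundedly.
    """
    cpg_queue = queue.Queue(maxsize=max_lag)   # bounded backlog
    processed = []

    def slow_broker():
        while True:
            ev = cpg_queue.get()
            if ev is None:                     # sentinel: stream finished
                break
            time.sleep(0.001)                  # simulate a broker busy updating
            processed.append(ev)

    t = threading.Thread(target=slow_broker)
    t.start()
    for ev in events:
        cpg_queue.put(ev)                      # blocks once max_lag is reached
    cpg_queue.put(None)
    t.join()
    return processed
```

With an unbounded queue the producer loop would finish immediately and the backlog would grow without limit; the bound is what converts lag into back-pressure.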

This is yet another reason for making the updater faster by separating
updates from client service.

I thought that client service was stalled during an update anyway on the
node performing the update?

Yes indeed. Good point.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
