It's not that simple - while the slave is syncing, there are also running clients that are acknowledging messages (and hence those messages get deleted).

We could record all the message exchanges (adds, deletes, new durable subscribers, deleted subscribers, etc.) - but is it really likely that the slave will ever catch up without a pause?

This type of synchronization gets very difficult very quickly. We haven't even gone through edge cases (fail-over scenarios whilst the master/slave are still syncing for example).

Which is why my preference is to pause processing whilst a bulk transfer happens. In reality, as we prefer shared-nothing architectures, this involves copying journal files and database files from one machine to another - which can be done relatively quickly - so pausing the clients won't be too onerous.
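
A minimal sketch of the pause-then-copy idea, assuming a toy broker that writes one journal file per message. `PausableBroker`, `dispatch`, `bulk_transfer` and the file naming are hypothetical names for illustration, not ActiveMQ APIs; the only point is that dispatching and copying share one lock, so clients are held up only for the duration of the file copy.

```python
import shutil
import tempfile
import threading
from pathlib import Path


class PausableBroker:
    """Toy broker: one journal file per message (an assumption for brevity)."""

    def __init__(self, data_dir: Path) -> None:
        self.data_dir = data_dir
        self._lock = threading.Lock()  # held while dispatching or copying
        self._seq = 0

    def dispatch(self, body: str) -> None:
        # Blocks while a bulk transfer holds the lock, i.e. clients are paused.
        with self._lock:
            self._seq += 1
            path = self.data_dir / f"journal-{self._seq:06d}.log"
            path.write_text(body)

    def bulk_transfer(self, target_dir: Path) -> None:
        # Copy the journal directory while no message can be dispatched.
        with self._lock:
            shutil.copytree(self.data_dir, target_dir, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "master"
    slave = Path(tmp) / "slave"
    master.mkdir()

    broker = PausableBroker(master)
    broker.dispatch("a")
    broker.dispatch("b")
    broker.bulk_transfer(slave)

    copied = sorted(p.name for p in slave.iterdir())
    print(copied)
```

In a real deployment the copy would go to another machine rather than a local directory, and the pause would also have to cover acknowledgements, not just new messages.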

cheers,

Rob


On 8 Mar 2006, at 19:28, Sridhar Komandur wrote:

On 3/8/06, Ning Li <[EMAIL PROTECTED]> wrote:

Bulk synch is a good idea. I think we can find a way to do it in the current system: for example, create a topic to which every incoming message is also sent, so that when the secondary comes up it can pull those messages. Or we can find other ways to do it.


Yes, an internally created (persisted) queue at the primary
to store stuff when the secondary is not in sight. When the secondary comes
up, it drains from that subject? Sounds like a good idea to me.
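
As a rough illustration of the store-then-drain idea, here is a single-threaded sketch. `PrimaryBacklog`, `on_message` and `attach_secondary` are made-up names, an in-memory deque stands in for the persisted queue, and a plain list stands in for the secondary. It deliberately ignores the timing issue of messages arriving while the drain is in progress.

```python
from collections import deque
from typing import Deque, List, Optional


class PrimaryBacklog:
    """Toy stand-in for an internally created, persisted queue on the primary."""

    def __init__(self) -> None:
        self.backlog: Deque[str] = deque()
        # None means the secondary is not in sight.
        self.secondary: Optional[List[str]] = None

    def on_message(self, msg: str) -> None:
        if self.secondary is None:
            self.backlog.append(msg)    # store until a secondary appears
        else:
            self.secondary.append(msg)  # forward directly (dynamic synch)

    def attach_secondary(self, secondary: List[str]) -> None:
        # Drain the backlog in arrival order before forwarding new traffic.
        while self.backlog:
            secondary.append(self.backlog.popleft())
        self.secondary = secondary


primary = PrimaryBacklog()
primary.on_message("a")
primary.on_message("b")

secondary_store: List[str] = []
primary.attach_secondary(secondary_store)
primary.on_message("c")

print(secondary_store)
```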


One difficulty is that we cannot pause the primary broker, so it is hard
for the secondary to catch up with both the historic and the ongoing
messages; I think there is a timing issue in it. I guess that is why James
recommended pausing the primary broker.

If we can find a way to do both dynamic synch and bulk synch at the same
time in the current system, that would be great, but I am not sure we can.



It can be done - we need a notion of ordering among all the messages (coming from both dynamic as well as bulk synch). This ordering can be provided by
the message arrival time stamp at the primary.

Once we do this, it is a matter of inserting the incoming messages (without worrying about the source) into the same target store. We can even have the bulk synch proceed in a lazy fashion - as a background task at the primary (and
possibly at the secondary) - for a couple of reasons:
- latest messages are more relevant/important
- latest messages could in fact be retransmissions of the old, so it is ok
  to process the old messages later for recovery purposes
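
A small sketch of what this could look like, under stated assumptions: each message carries a unique id and the arrival time stamp assigned by the primary, and the secondary's store is a plain dictionary. `Message`, `SecondaryStore` and the field names are hypothetical. The sketch only covers adds; it does not handle deletes caused by client acknowledgements.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Message:
    message_id: str  # hypothetical unique id assigned by the primary
    arrival_ts: int  # arrival time stamp at the primary (e.g. millis)
    body: str


@dataclass
class SecondaryStore:
    """Toy target store on the secondary, keyed by message id."""

    by_id: Dict[str, Message] = field(default_factory=dict)

    def insert(self, msg: Message) -> bool:
        """Insert regardless of source; return False for a duplicate."""
        if msg.message_id in self.by_id:
            return False
        self.by_id[msg.message_id] = msg
        return True

    def ordered(self) -> List[Message]:
        """Messages in primary arrival order (id breaks time stamp ties)."""
        return sorted(self.by_id.values(),
                      key=lambda m: (m.arrival_ts, m.message_id))


store = SecondaryStore()

# Dynamic synch delivers the newest messages first ...
store.insert(Message("m3", 300, "newest"))
store.insert(Message("m4", 400, "newer still"))

# ... while the lazy bulk synch fills in history later, overlapping on m3.
for old in (Message("m1", 100, "old"),
            Message("m2", 200, "less old"),
            Message("m3", 300, "newest")):
    store.insert(old)

ordered_ids = [m.message_id for m in store.ordered()]
print(ordered_ids)
```

Because insertion is keyed by id and ordering is derived from the primary's time stamp, the store ends up the same whichever stream delivers a message first.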

Regards
- Sridhar

Thanks.

Ning
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Sridhar Komandur
Sent: Wednesday, March 08, 2006 9:59 AM
To: [email protected]
Subject: Re: improve master/slave topology

I like the idea of broker-broker synchronization. One of the issues to
resolve is how reliable this synch activity needs to be. A transactional
approach is too heavyweight for the common case.

I think a middle ground based on TCP may be good enough. We can divide the
synchronization into two phases:
- dynamic synch: messages are sent to the partner on an ongoing basis
- bulk synch: a new secondary comes up and its state needs to be brought
  up to par with the primary

Thanks
Regards
- Sridhar

On 3/6/06, Ning Li <[EMAIL PROTECTED]> wrote:

Hi,

This is a continued discussion about dynamically reintroducing the master
after a failure; the original discussion is here:

http://forums.activemq.org/posts/list/468.page#1653

James's idea of pausing the slave and synchronizing the two DBs is better
than stopping the slave and doing a manual sync. But I doubt this is
acceptable to us: in a real production environment, we won't be able to
pause the only message broker except for a really short interval (I guess
it has to be less than one minute, otherwise the end user will notice it).

Maybe a broker-broker synchronization protocol is the ultimate solution;
we are just not sure how to get there. Any recommendations or suggestions?


Thanks

Ning


