It's not that simple - while the slave is syncing, there are also
running clients acknowledging messages (and hence those messages get
deleted).
We could record all the message exchanges (adds/deletes/new durable
subscribers/deleted subscribers etc.) - but is it really likely
that the slave will ever catch up without a pause?
This type of synchronization gets very difficult very quickly. We
haven't even gone through edge cases (fail-over scenarios whilst the
master/slave are still syncing for example).
Which is why my preference is to pause processing whilst a bulk transfer
happens. In reality, as we prefer shared-nothing architectures, this
involves copying journal files and database files from one machine to
another - which can be done relatively quickly - so pausing the
clients won't be too onerous.
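A minimal sketch of the pause / copy / resume sequence described above. The Broker class, its pause() and resume() methods and the file names are hypothetical stand-ins for illustration, not ActiveMQ APIs; a real broker would have to stop accepting sends and acknowledgements while paused.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


class Broker:
    """Hypothetical stand-in for a master broker that can pause dispatch."""

    def __init__(self, data_dir: Path) -> None:
        self.data_dir = data_dir
        self.paused = False

    def pause(self) -> None:
        self.paused = True   # a real broker would stop taking sends and acks

    def resume(self) -> None:
        self.paused = False


def bulk_transfer(master: Broker, slave_dir: Path) -> list[str]:
    """Pause the master, copy every file under its data dir, then resume."""
    master.pause()
    try:
        copied = []
        for src in sorted(master.data_dir.rglob("*")):
            if src.is_file():
                rel = src.relative_to(master.data_dir)
                dst = slave_dir / rel
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
                copied.append(rel.as_posix())
        return copied
    finally:
        master.resume()   # clients are released even if the copy fails


# Usage with throwaway directories (file names are made up)
root = Path(tempfile.mkdtemp())
master_dir = root / "master"
(master_dir / "journal").mkdir(parents=True)
(master_dir / "journal" / "log-1.dat").write_bytes(b"journal bytes")
(master_dir / "db.data").write_bytes(b"database bytes")

master = Broker(master_dir)
slave_dir = root / "slave"
copied = bulk_transfer(master, slave_dir)
```

Because nothing changes on the master while it is paused, the copy needs no ordering or replay logic; the cost is that clients wait for as long as the copy takes.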
cheers,
Rob
On 8 Mar 2006, at 19:28, Sridhar Komandur wrote:
On 3/8/06, Ning Li <[EMAIL PROTECTED]> wrote:
Bulk synch is a good idea. I think we can find a way to do it in the
current system, for example by creating a topic to which every
incoming message is also sent; when the secondary comes up, it can
pull those messages. Or we can find other ways to do it.
Yes, an internally created (persisted) queue at the primary to store
messages while the secondary is not in sight. When the secondary comes
up, it drains that queue? Sounds like a good idea to me.
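A rough sketch of that idea, assuming a single-threaded primary. The Primary class and its method names are hypothetical, and an in-memory deque stands in for the internally created persisted queue.

```python
from collections import deque


class Primary:
    """Hypothetical primary that buffers messages while no secondary is attached."""

    def __init__(self):
        self.backlog = deque()   # stands in for the internal persisted queue
        self.secondary = None    # target store of the secondary, once attached

    def on_message(self, msg):
        if self.secondary is None:
            self.backlog.append(msg)     # secondary not in sight: store it
        else:
            self.secondary.append(msg)   # dynamic synch: forward immediately

    def attach_secondary(self, store):
        # Drain the backlog in arrival order before forwarding live traffic.
        while self.backlog:
            store.append(self.backlog.popleft())
        self.secondary = store


primary = Primary()
primary.on_message("m1")
primary.on_message("m2")

secondary_store = []
primary.attach_secondary(secondary_store)
primary.on_message("m3")
```

Being single-threaded, this sketch sidesteps the timing issue: nothing can arrive while the backlog is being drained. It also ignores acknowledgements, which delete messages on the primary.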
One difficulty is that we cannot pause the primary broker, so it is
hard for the secondary to catch up with both the historic and the
ongoing messages; I think there is a timing issue there. I guess that
is why James recommended pausing the primary broker.
I am not sure we can find a way to do both dynamic synch and bulk
synch at the same time in the current system, but if we can, that
would be great.
It can be done - we need a notion of ordering among all the messages
(coming from both dynamic and bulk synch). This ordering can be
provided by the message arrival timestamp at the primary.
Once we have this, it is a matter of inserting the incoming messages
(without worrying about the source) into the same target store. We can
even have the bulk synch proceed lazily, as a background task at the
primary (and possibly at the secondary), for a couple of reasons:
- the latest messages are more relevant/important
- the latest messages could in fact be retransmissions of the old
  ones, so it is OK to process the old messages later for recovery
  purposes
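The ordering idea can be sketched as follows. This is only an illustration under stated assumptions: every message carries the arrival timestamp assigned by the primary plus a unique message id (the id is an assumption, the thread only mentions the timestamp), and TargetStore is a hypothetical name.

```python
import bisect


class TargetStore:
    """Hypothetical secondary-side store ordered by primary arrival timestamp."""

    def __init__(self):
        self._entries = []   # sorted list of (timestamp, msg_id, body)
        self._seen = set()   # ids already stored, from either synch path

    def insert(self, timestamp, msg_id, body):
        """Insert a message from either source; duplicates are ignored."""
        if msg_id in self._seen:
            return False
        self._seen.add(msg_id)
        bisect.insort(self._entries, (timestamp, msg_id, body))
        return True

    def ordered_ids(self):
        return [msg_id for _, msg_id, _ in self._entries]


store = TargetStore()

# Dynamic synch delivers the newest messages first ...
store.insert(5, "m5", "new order")
store.insert(6, "m6", "new payment")

# ... and the lazy bulk synch back-fills older ones later, with some overlap.
results = [store.insert(ts, mid, body) for ts, mid, body in [
    (1, "m1", "old order"),
    (2, "m2", "old payment"),
    (5, "m5", "new order"),   # already received through dynamic synch
]]
```

The store does not care which path a message took, so the bulk synch can run in the background. The sketch covers inserts only; acknowledgements and deletes would need the same ordering treatment.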
Regards
- Sridhar
Thanks.
Ning
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Sridhar Komandur
Sent: Wednesday, March 08, 2006 9:59 AM
To: [email protected]
Subject: Re: improve master/slave topology
I like the idea of broker-broker synchronization. One of the issues
to resolve is how reliable this synch activity needs to be. A
transactional approach is too heavyweight for the common case.
I think a middle ground based on TCP may be good enough. We can divide
the synchronization into two phases:
- dynamic synch: messages are sent to the partner on an ongoing basis
- bulk synch: a new secondary comes up and its state needs to be
  brought up to par with the primary
Thanks
Regards
- Sridhar
On 3/6/06, Ning Li <[EMAIL PROTECTED]> wrote:
Hi,
This continues the discussion about dynamically reintroducing the
master after a failure. The original discussion is here:
http://forums.activemq.org/posts/list/468.page#1653
James's idea of pausing the slave and synchronizing the two DBs is
better than stopping the slave and doing a manual sync. But I doubt it
is acceptable to us: in a real production environment we cannot pause
the only message broker except for a really short interval (I guess it
has to be less than one minute, otherwise the end users will notice).
Maybe a broker-broker synchronization protocol is the ultimate
solution, but we are not sure how to get there. Any recommendations or
suggestions?
Thanks
Ning