That's controlled by "tickTime", "syncLimit", "initLimit", etc. See more about this in the admin guide: http://bit.ly/c726DC

You'll want to increase these from the defaults, since the defaults assume a high-performance interconnect (i.e. within a colo). You are correct though, much will depend on your environment and some tuning will be involved.
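
For example, something along these lines in zoo.cfg (the numbers below are just an illustrative starting point, not a recommendation - you'd tune them against your measured inter-site latency):

# length of a single tick, in milliseconds
tickTime=2000
# ticks a follower may take to connect to and sync with the leader at startup
initLimit=20
# ticks a follower may fall behind the leader before it is dropped
syncLimit=10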

Patrick

Martin Waite wrote:
Hi Patrick,

Thanks for your input.

I am planning on having 3 zk servers per data centre, with perhaps only 2 in
the tie-breaker site.

The traffic between zk and the applications will be lots of local reads -
"who is the primary database?". Changes to the config will be rare (server
rebuilds, etc. - i.e. planned changes) or caused by server / network / site
failure.
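
For what it's worth, the read I have in mind is just a getData() on some agreed
znode - roughly like the Java sketch below (the /primary-db path and the connect
string are made-up names):

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class PrimaryLookup {
    public static void main(String[] args) throws Exception {
        // connect to the ZooKeeper servers in our own data centre (made-up hosts)
        ZooKeeper zk = new ZooKeeper(
                "zk1.dc1.example.com:2181,zk2.dc1.example.com:2181,zk3.dc1.example.com:2181",
                30000,
                new Watcher() {
                    public void process(WatchedEvent event) {
                        // on a change we would re-read /primary-db (e.g. after a failover)
                    }
                });

        // local read: answered by whichever server this client is connected to
        byte[] data = zk.getData("/primary-db", true, null);
        System.out.println("primary database: " + new String(data, "UTF-8"));

        zk.close();
    }
}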

The interesting thing in my mind is how zookeeper will cope with inter-site
link failure - how quickly the remote sites will notice, and how quickly
normality can be resumed when the link reappears.

I need to get this running in the lab and start pulling out wires.

regards,
Martin

On 8 March 2010 17:39, Patrick Hunt <ph...@apache.org> wrote:

IMO latency is the primary issue you will face, but also keep in mind
reliability w/in a colo.

Say you have 3 colos (obviously it can't be 2). If you only have 3 servers, one in
each colo, you will be reliable, but clients within each colo will have to
connect to a remote colo if the local server fails. You will want to prioritize the
local colo, given that reads can then be serviced entirely locally. If you
have 7 servers (2-2-3) that would be better - if a local server fails you
have a redundant one, and if both fail then you go remote.
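
For a 2-2-3 layout the server list in zoo.cfg might look something like the
following (hostnames and ports are illustrative; each server's config carries
the same list, and clients in each colo get their own connect string):

# 7-server ensemble spread 2-2-3 across three colos
server.1=zk1.colo-a.example.com:2888:3888
server.2=zk2.colo-a.example.com:2888:3888
server.3=zk1.colo-b.example.com:2888:3888
server.4=zk2.colo-b.example.com:2888:3888
server.5=zk1.colo-c.example.com:2888:3888
server.6=zk2.colo-c.example.com:2888:3888
server.7=zk3.colo-c.example.com:2888:3888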

You want to keep your writes as few as possible and as small as possible.
Why? Say you have 100ms latency between colos; let's go through a scenario for a
client in a colo where the local servers are not the leader (the zk cluster
leader).

read:
1) client reads a znode from local server
2) the local server responds, usually in < 1ms for in-colo communication

write:
1) client writes a znode to local server A
2) A proposes change to the ZK Leader (L) in remote colo
3) L gets the proposal in 100ms
4) L proposes the change to all followers
5) the followers get the proposal in 100ms (not all at exactly the same time, but hopefully close)
6) followers ack the change
7) L gets the acks in 100ms
8) L commits the change (message to all followers)
9) A gets the commit in 100ms
10) A responds to client (< 1ms)

write latency: 100 + 100 + 100 + 100 = 400ms

Obviously keeping these writes small is also critical.
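
If you want to see the asymmetry for yourself, a rough sketch like the one below
would show it (the connect string and the /latency-test znode are made up for
illustration; the first call also pays connection setup, so run it a few times):

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class LatencyCheck {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1.local:2181", 30000, new Watcher() {
            public void process(WatchedEvent event) { }
        });

        // make sure the test znode exists (ignore it if it was created already)
        try {
            zk.create("/latency-test", new byte[0],
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } catch (KeeperException.NodeExistsException e) {
            // fine, reuse the existing node
        }

        // read: served directly by the server this client is connected to
        long t0 = System.currentTimeMillis();
        zk.getData("/latency-test", false, null);
        System.out.println("read  took " + (System.currentTimeMillis() - t0) + " ms");

        // write: forwarded to the leader and quorum-acked before it returns
        long t1 = System.currentTimeMillis();
        zk.setData("/latency-test", "x".getBytes("UTF-8"), -1);
        System.out.println("write took " + (System.currentTimeMillis() - t1) + " ms");

        zk.close();
    }
}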

Patrick


Martin Waite wrote:

Hi Ted,

If the links do not work for us for zk, then they are unlikely to work
with
any other solution - such as trying to stretch Pacemaker or Red Hat
Cluster
with their multicast protocols across the links.

If the links are not good enough, we might have to spend some more money
to
fix this.

regards,
Martin

On 8 March 2010 02:14, Ted Dunning <ted.dunn...@gmail.com> wrote:

 If you can stand the latency for updates then zk should work well for you.
It is unlikely that you will be able to do better than zk does and still
maintain correctness.

Do note that you can probably bias clients to use a local server. That
should make things more efficient.
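
One way to do that, as far as I know, is to give each client a connect string
that lists only the servers in its own colo. A rough sketch, with made-up
hostnames:

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class LocalColoClient {
    public static void main(String[] args) throws Exception {
        // list only this colo's servers; the client then never connects to a
        // remote colo unless the string is changed
        ZooKeeper zk = new ZooKeeper(
                "zk1.colo-a.example.com:2181,zk2.colo-a.example.com:2181",
                30000,
                new Watcher() {
                    public void process(WatchedEvent event) { }
                });
        System.out.println("session state: " + zk.getState());
        zk.close();
    }
}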

Sent from my iPhone


On Mar 7, 2010, at 3:00 PM, Mahadev Konar <maha...@yahoo-inc.com> wrote:

 The inter-site links are a nuisance.  We have two data-centres with 100Mb
links which I hope would be good enough for most uses, but we need a 3rd
site - and currently that only has 2Mb links to the other sites.  This
might be a problem.


