Re: zookeeper on ec2

Patrick Hunt Tue, 01 Sep 2009 17:11:46 -0700

Depends on what your tests are. Are they pretty simple/light? Then it's probably a network issue. Heavy load testing? Then it might be the server/client, or it might be the network.

The easiest thing is to run a ping test while running your zk test and see if pings are getting through (and with what latency). You should also review your client/server logs for any information logged around the time of the ConnectionLoss.
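A rough sketch of such a probe, to run alongside the zk test. Note the assumptions: it times TCP connects rather than ICMP pings (so it needs no special privileges), and the function names, the host name, and the use of port 2181 (the usual client port) are illustrative, not something from this thread.

```python
import socket
import time


def probe_latency(host, port, attempts=5, timeout=2.0):
    """Time TCP connects to host:port.

    Returns one entry per attempt: seconds taken, or None if the
    connect failed or timed out.
    """
    samples = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                samples.append(time.monotonic() - start)
        except OSError:
            # Covers refused connections, unreachable hosts and timeouts.
            samples.append(None)
    return samples


def summarize(samples):
    """Return (loss_fraction, worst_latency_seconds) for a list of samples."""
    ok = [s for s in samples if s is not None]
    loss = 1.0 - len(ok) / len(samples) if samples else 0.0
    worst = max(ok) if ok else None
    return loss, worst
```

Something like `summarize(probe_latency("zk-host.example", 2181))` in a loop, logged with timestamps, can then be lined up against the times of the ConnectionLoss events in the client log.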

Ted Dunning would be a good resource - he runs ZK inside ec2 and has a lot of experience with it.


Patrick

Satish Bhatti wrote:

For my initial testing I am running with a single ZooKeeper server, i.e. the
ensemble only has one server.  Not sure if this is exacerbating the problem?
I will check out the troubleshooting link you sent me.

On Tue, Sep 1, 2009 at 5:01 PM, Patrick Hunt <ph...@apache.org> wrote:

I'm not very familiar with the ec2 environment. Are you doing any monitoring,
in particular of network connectivity between nodes? It sounds like networking
issues between nodes. (I'm assuming you've also looked at material like
http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting and verified that
you are not swapping, see gc pressure, etc.)

Patrick


Satish Bhatti wrote:

Session timeout is 30 seconds.

On Tue, Sep 1, 2009 at 4:26 PM, Patrick Hunt <ph...@apache.org> wrote:

 What is your client timeout? It may be too low.

also see this section on handling recoverable errors:
http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling

connection loss in particular needs special care since:
"When a ZooKeeper client loses a connection to the ZooKeeper server there
may be some requests in flight; we don't know where they were in their
flight at the time of the connection loss. "
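As a minimal sketch of the "catch and retry" idea under that caveat: the exception classes and the helper below are hypothetical stand-ins, not the real client API. Because a request may already have been applied when the connection dropped, the wrapped operation is assumed to be safe to repeat (for example a read, or a write the caller can verify afterwards).

```python
import time


class ConnectionLoss(Exception):
    """Hypothetical stand-in for the client's connection-loss error."""


class SessionExpired(Exception):
    """Hypothetical stand-in for the client's session-expired error."""


def retry_on_connection_loss(op, attempts=5, delay=0.5, sleep=time.sleep):
    """Call op() until it succeeds, retrying only on ConnectionLoss.

    op must be safe to repeat, since an earlier try may have reached
    the server before the connection was lost. SessionExpired is not
    caught here and propagates to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return op()
        except ConnectionLoss:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(delay * attempt)  # linear backoff between tries
```

SessionExpired is deliberately left to the caller, since recovering from it means creating a new session rather than repeating one call.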

Patrick


Satish Bhatti wrote:

I have recently started running on EC2 and am seeing quite a few
ConnectionLoss exceptions.  Should I just catch these and retry?  Since I
assume that eventually, if the shit truly hits the fan, I will get a
SessionExpired?

Satish

On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

 We have used EC2 quite a bit for ZK.

The basic lessons that I have learned include:

a) EC2's biggest advantage after scaling and elasticity was conformity of
configuration.  Since you are bringing machines up and down all the time,
they begin to act more like programs and you wind up with boot scripts that
give you a very predictable environment.  Nice.

b) EC2 interconnect has a lot more going on than in a dedicated VLAN.  That
can make the ZK servers appear a bit less connected.  You have to plan for
ConnectionLoss events.

c) for highest reliability, I switched to large instances.  On reflection, I
think that was helpful, but less important than I thought at the time.

d) increasing and decreasing cluster size is nearly painless and is easily
scriptable.  To decrease, do a rolling update on the survivors to update
their configuration.  Then take down the instance you want to lose.  To
increase, do a rolling update starting with the new instances to update the
configuration to include all of the machines.  The rolling update should
bounce each ZK with several seconds between each bounce.  Rescaling the
cluster takes less than a minute which makes it comparable to EC2 instance
boot time (about 30 seconds for the Alestic ubuntu instance that we used
plus about 20 seconds for additional configuration).
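The rescaling procedure in (d) could be scripted roughly as follows. This is only a sketch: the host names, the helper names, and the choice of 2888/3888 for the quorum and election ports are assumptions for illustration, and actually pushing the config and restarting each server is left to whatever tooling is in use.

```python
def server_lines(members, peer_port=2888, election_port=3888):
    """Render zoo.cfg 'server.N=host:port:port' lines.

    members maps each server id to its host name, so ids stay stable
    when machines are added or removed.
    """
    return [
        f"server.{sid}={host}:{peer_port}:{election_port}"
        for sid, host in sorted(members.items())
    ]


def rolling_plan(current, target):
    """Return ordered (action, host) steps to move from current to target.

    New instances are bounced first, then the survivors, and only then
    are departing instances stopped. The caller is expected to wait
    several seconds between steps.
    """
    new = sorted(set(target) - set(current))
    staying = sorted(set(current) & set(target))
    leaving = sorted(set(current) - set(target))
    steps = [("bounce", target[sid]) for sid in new + staying]
    steps += [("stop", current[sid]) for sid in leaving]
    return steps
```

Each "bounce" step would install the output of `server_lines(target)` on that host and restart its ZK process before moving on to the next one.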

On Mon, Jul 6, 2009 at 4:45 AM, David Graf <david.g...@28msec.com> wrote:

 Hello

I wanna set up a zookeeper ensemble on amazon's ec2 service. In my system,
zookeeper is used to run a locking service and to generate unique id's.
Currently, for testing purposes, I am only running one instance. Now, I need
to set up an ensemble to protect my system against crashes.

The ec2 service has some differences from a normal server farm. E.g. the
data saved on the file system of an ec2 instance is lost if the instance
crashes. In the documentation of zookeeper, I have read that zookeeper saves
snapshots of the in-memory data in the file system. Is that needed for
recovery? Logically, it would be much easier for me if this is not the case.

Additionally, ec2 brings the advantage that servers can be switched on and
off dynamically depending on the load, traffic, etc. Can this advantage be
utilized for a zookeeper ensemble? Is it possible to add a zookeeper server
dynamically to an ensemble? E.g. depending on the in-memory load?

David
