Do you have long GC delays?
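For reference, the timeout in question is the one handed to the client when the ZooKeeper handle is constructed; a stop-the-world GC pause longer than that window will get the session expired on the server side, and its ephemeral nodes deleted. A minimal Java sketch, with the connect string and watcher purely illustrative:

    import java.io.IOException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class SessionTimeoutSketch {
        public static void main(String[] args) throws IOException {
            // 30000 ms matches the 30-second session timeout mentioned below.
            // If the JVM pauses (e.g. a long GC) for longer than this, the
            // ensemble expires the session.
            ZooKeeper zk = new ZooKeeper("ec2-host-1:2181", 30000, new Watcher() {
                public void process(WatchedEvent event) {
                    System.out.println("state change: " + event.getState());
                }
            });
        }
    }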
On Tue, Sep 1, 2009 at 4:51 PM, Satish Bhatti <cthd2...@gmail.com> wrote:

> Session timeout is 30 seconds.
>
> On Tue, Sep 1, 2009 at 4:26 PM, Patrick Hunt <ph...@apache.org> wrote:
>
>> What is your client timeout? It may be too low.
>>
>> Also see this section on handling recoverable errors:
>> http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling
>>
>> Connection loss in particular needs special care, since: "When a
>> ZooKeeper client loses a connection to the ZooKeeper server there may
>> be some requests in flight; we don't know where they were in their
>> flight at the time of the connection loss."
>>
>> Patrick
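A minimal sketch of the catch-and-retry pattern being discussed, assuming the Java client; the retry cap and backoff are illustrative, and the in-flight caveat above is why only idempotent operations are safe to retry blindly:

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;

    public class RetrySketch {
        // Retry an idempotent read across ConnectionLoss; let SessionExpired
        // propagate, since the session (and its ephemerals) cannot be recovered.
        static byte[] getDataWithRetry(ZooKeeper zk, String path)
                throws KeeperException, InterruptedException {
            int attempt = 0;
            while (true) {
                try {
                    return zk.getData(path, false, null);
                } catch (KeeperException.ConnectionLossException e) {
                    // Recoverable: the client library reconnects to another
                    // server while the session is still alive.
                    if (++attempt >= 5) throw e;
                    Thread.sleep(1000L * attempt); // illustrative backoff
                }
                // A write such as create() needs more care: after a
                // ConnectionLoss the first attempt may already have
                // succeeded, so the retry must check for the node first.
            }
        }
    }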
>> Satish Bhatti wrote:
>>
>>> I have recently started running on EC2 and am seeing quite a few
>>> ConnectionLoss exceptions. Should I just catch these and retry?
>>> Since I assume that eventually, if the shit truly hits the fan, I
>>> will get a SessionExpired?
>>>
>>> Satish
>>>
>>> On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>
>>>> We have used EC2 quite a bit for ZK.
>>>>
>>>> The basic lessons that I have learned include:
>>>>
>>>> a) EC2's biggest advantage after scaling and elasticity was
>>>> conformity of configuration. Since you are bringing machines up
>>>> and down all the time, they begin to act more like programs, and
>>>> you wind up with boot scripts that give you a very predictable
>>>> environment. Nice.
>>>>
>>>> b) The EC2 interconnect has a lot more going on than a dedicated
>>>> VLAN. That can make the ZK servers appear a bit less connected.
>>>> You have to plan for ConnectionLoss events.
>>>>
>>>> c) For highest reliability, I switched to large instances. On
>>>> reflection, I think that was helpful, but less important than I
>>>> thought at the time.
>>>>
>>>> d) Increasing and decreasing cluster size is nearly painless and
>>>> easily scriptable. To decrease, do a rolling update on the
>>>> survivors to update their configuration, then take down the
>>>> instance you want to lose. To increase, do a rolling update
>>>> starting with the new instances to update the configuration to
>>>> include all of the machines. The rolling update should bounce each
>>>> ZK server with several seconds between bounces. Rescaling the
>>>> cluster takes less than a minute, which makes it comparable to EC2
>>>> instance boot time (about 30 seconds for the Alestic Ubuntu image
>>>> that we used, plus about 20 seconds for additional configuration).
>>>>
>>>> On Mon, Jul 6, 2009 at 4:45 AM, David Graf <david.g...@28msec.com> wrote:
>>>>
>>>>> Hello
>>>>>
>>>>> I want to set up a ZooKeeper ensemble on Amazon's EC2 service. In
>>>>> my system, ZooKeeper is used to run a locking service and to
>>>>> generate unique IDs. Currently, for testing purposes, I am only
>>>>> running one instance. Now I need to set up an ensemble to protect
>>>>> my system against crashes.
>>>>>
>>>>> The EC2 service has some differences from a normal server farm,
>>>>> e.g. the data saved on the file system of an EC2 instance is lost
>>>>> if the instance crashes. In the ZooKeeper documentation, I have
>>>>> read that ZooKeeper saves snapshots of the in-memory data to the
>>>>> file system. Is that needed for recovery? Logically, it would be
>>>>> much easier for me if this were not the case.
>>>>>
>>>>> Additionally, EC2 brings the advantage that servers can be
>>>>> switched on and off dynamically depending on the load, traffic,
>>>>> etc. Can this advantage be utilized for a ZooKeeper ensemble? Is
>>>>> it possible to add a ZooKeeper server dynamically to an ensemble,
>>>>> e.g. depending on the in-memory load?
>>>>>
>>>>> David

--
Ted Dunning, CTO
DeepDyve
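A minimal sketch of the per-server zoo.cfg that the rolling updates described in (d) would rewrite; hostnames and paths are illustrative. dataDir is where the snapshots David asks about are written; they let a restarted server recover its state locally, and a replacement server with an empty dataDir can resync from the current leader as long as a quorum survives.

    # zoo.cfg -- illustrative three-node ensemble
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/zookeeper      # snapshots and transaction logs land here
    clientPort=2181

    # One line per ensemble member; resizing the cluster means rewriting
    # this list on every server and bouncing them one at a time. Each
    # server also needs its own id in dataDir/myid.
    server.1=ec2-host-1:2888:3888
    server.2=ec2-host-2:2888:3888
    server.3=ec2-host-3:2888:3888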