Re: zookeeper on ec2

Satish Bhatti Tue, 01 Sep 2009 17:16:12 -0700

GC Time: 11.628 seconds on PS MarkSweep (389 collections)
         5 minutes on PS Scavenge (7,636 collections)


It's been running for about 48 hours.
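
As a rough sanity check of those figures against the 30 second session timeout, the mean pause per collection can be worked out as below. This is only a sketch: the helper name `mean_pause_ms` is made up for illustration, "5 minutes" and "about 48 hours" are rounded figures, and a mean says nothing about the longest single pause, which is what would actually matter for a session timeout.

```python
# Back-of-the-envelope GC arithmetic using the figures reported above.
# mean_pause_ms is a hypothetical helper name, not part of any tool.

def mean_pause_ms(total_seconds, collections):
    """Average time per collection, in milliseconds."""
    return 1000.0 * total_seconds / collections

SESSION_TIMEOUT_MS = 30 * 1000      # session timeout stated in the thread
UPTIME_SECONDS = 48 * 3600          # "about 48 hours" (rounded)

full_gc_ms = mean_pause_ms(11.628, 389)      # PS MarkSweep
young_gc_ms = mean_pause_ms(5 * 60, 7636)    # PS Scavenge, "5 minutes" is rounded

# Share of total uptime spent in garbage collection.
gc_fraction = (11.628 + 5 * 60) / UPTIME_SECONDS

print(f"mean full GC pause:  {full_gc_ms:.1f} ms")
print(f"mean young GC pause: {young_gc_ms:.1f} ms")
print(f"time spent in GC:    {gc_fraction:.2%}")
print(f"mean full GC pause as share of timeout: {full_gc_ms / SESSION_TIMEOUT_MS:.3%}")
```

On these averages, GC pauses are tens of milliseconds, far below the session timeout, but the JVM's GC log would be needed to rule out an occasional long pause.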


On Tue, Sep 1, 2009 at 5:12 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Do you have long GC delays?
>
> On Tue, Sep 1, 2009 at 4:51 PM, Satish Bhatti <cthd2...@gmail.com> wrote:
>
> > Session timeout is 30 seconds.
> >
> > On Tue, Sep 1, 2009 at 4:26 PM, Patrick Hunt <ph...@apache.org> wrote:
> >
> > > What is your client timeout? It may be too low.
> > >
> > > also see this section on handling recoverable errors:
> > > http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling
> > >
> > > connection loss in particular needs special care since:
> > > > "When a ZooKeeper client loses a connection to the ZooKeeper server there
> > > > may be some requests in flight; we don't know where they were in their
> > > > flight at the time of the connection loss."
> > >
> > > Patrick
> > >
> > >
> > > Satish Bhatti wrote:
> > >
> > > >> I have recently started running on EC2 and am seeing quite a few
> > > >> ConnectionLoss exceptions.  Should I just catch these and retry?  Since I
> > > >> assume that eventually, if the shit truly hits the fan, I will get a
> > > >> SessionExpired?
> > >> Satish
> > >>
> > >> On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning <ted.dunn...@gmail.com>
> > >> wrote:
> > >>
> > >>  We have used EC2 quite a bit for ZK.
> > >>>
> > >>> The basic lessons that I have learned include:
> > >>>
> > > >>> a) EC2's biggest advantage after scaling and elasticity was conformity of
> > > >>> configuration.  Since you are bringing machines up and down all the time,
> > > >>> they begin to act more like programs and you wind up with boot scripts that
> > > >>> give you a very predictable environment.  Nice.
> > >>>
> > > >>> b) EC2 interconnect has a lot more going on than in a dedicated VLAN.  That
> > > >>> can make the ZK servers appear a bit less connected.  You have to plan for
> > > >>> ConnectionLoss events.
> > >>>
> > > >>> c) for highest reliability, I switched to large instances.  On reflection, I
> > > >>> think that was helpful, but less important than I thought at the time.
> > >>>
> > > >>> d) increasing and decreasing cluster size is nearly painless and is easily
> > > >>> scriptable.  To decrease, do a rolling update on the survivors to update
> > > >>> their configuration.  Then take down the instance you want to lose.  To
> > > >>> increase, do a rolling update starting with the new instances to update the
> > > >>> configuration to include all of the machines.  The rolling update should
> > > >>> bounce each ZK with several seconds between each bounce.  Rescaling the
> > > >>> cluster takes less than a minute which makes it comparable to EC2 instance
> > > >>> boot time (about 30 seconds for the Alestic ubuntu instance that we used
> > > >>> plus about 20 seconds for additional configuration).
> > >>>
> > >>> On Mon, Jul 6, 2009 at 4:45 AM, David Graf <david.g...@28msec.com>
> > >>> wrote:
> > >>>
> > >>>  Hello
> > >>>>
> > > >>>> I wanna set up a zookeeper ensemble on amazon's ec2 service. In my system,
> > > >>>> zookeeper is used to run a locking service and to generate unique IDs.
> > > >>>> Currently, for testing purposes, I am only running one instance. Now, I need
> > > >>>> to set up an ensemble to protect my system against crashes.
> > > >>>> The ec2 service has some differences from a normal server farm. E.g. the
> > > >>>> data saved on the file system of an ec2 instance is lost if the instance
> > > >>>> crashes. In the documentation of zookeeper, I have read that zookeeper saves
> > > >>>> snapshots of the in-memory data in the file system. Is that needed for
> > > >>>> recovery? Logically, it would be much easier for me if this is not the case.
> > > >>>> Additionally, ec2 brings the advantage that servers can be switched on and off
> > > >>>> dynamically depending on the load, traffic, etc. Can this advantage be
> > > >>>> utilized for a zookeeper ensemble? Is it possible to add a zookeeper server
> > > >>>> dynamically to an ensemble? E.g. depending on the in-memory load?
> > >>>>
> > >>>> David
> > >>>>
> > >>>>
> > >>
> >
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>
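
On the "catch ConnectionLoss and retry" question quoted above, one possible shape for a retry wrapper is sketched below. It is only an illustration under stated assumptions: `ConnectionLoss` and `SessionExpired` here are stand-in exception classes defined in the snippet, not the classes of any real ZooKeeper client, and `retry_on_connection_loss` is a hypothetical helper name. Because a request may or may not have been applied when the connection dropped, blind retry like this is only reasonable for operations that are safe to repeat.

```python
import time

# Stand-in exception types for illustration only; a real client library
# defines its own equivalents.
class ConnectionLoss(Exception):
    """Recoverable: the connection dropped, the session may still be alive."""

class SessionExpired(Exception):
    """Fatal for the session: ephemeral state is gone, the caller must rebuild."""

def retry_on_connection_loss(op, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call op(), retrying with exponential backoff on ConnectionLoss.

    SessionExpired (and any other exception) is deliberately not caught,
    so it propagates to the caller. Only use this for idempotent operations.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except ConnectionLoss:
            if attempt == max_attempts:
                raise  # give up and surface the last ConnectionLoss
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap a read, or a write it knows to be repeatable, for example `retry_on_connection_loss(lambda: read_config())`, where `read_config` is again a hypothetical function.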
