I recently started running on EC2 and am seeing quite a few ConnectionLoss exceptions. Should I just catch these and retry? I assume that eventually, if the shit truly hits the fan, I will get a SessionExpired instead.

Satish
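
The catch-and-retry idea in the question above could be sketched roughly as follows. This is only a sketch under stated assumptions: `ConnectionLoss` and `SessionExpired` are local stand-in classes, not the real client library's exceptions, and `retry_on_connection_loss`, `max_attempts`, and `base_delay` are hypothetical names chosen for illustration.

```python
import time


class ConnectionLoss(Exception):
    """Stand-in for the client's ConnectionLoss error (hypothetical class)."""


class SessionExpired(Exception):
    """Stand-in for the client's SessionExpired error (hypothetical class)."""


def retry_on_connection_loss(operation, max_attempts=5, base_delay=0.5,
                             sleep=time.sleep):
    """Call operation(), retrying only when it raises ConnectionLoss.

    SessionExpired (and any other error) propagates immediately, because
    retrying on a session that no longer exists cannot succeed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionLoss:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller decide what to do
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

One caveat with this approach: a ConnectionLoss does not say whether the request reached the server before the connection dropped, so an operation that is not idempotent (creating a sequential node, for example) probably needs to check what actually happened before it is retried blindly.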

On Mon, Jul 6, 2009 at 11:35 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> We have used EC2 quite a bit for ZK.
>
> The basic lessons that I have learned include:
>
> a) EC2's biggest advantage after scaling and elasticity was conformity of
> configuration. Since you are bringing machines up and down all the time,
> they begin to act more like programs and you wind up with boot scripts that
> give you a very predictable environment. Nice.
>
> b) EC2 interconnect has a lot more going on than in a dedicated VLAN. That
> can make the ZK servers appear a bit less connected. You have to plan for
> ConnectionLoss events.
>
> c) for highest reliability, I switched to large instances. On reflection, I
> think that was helpful, but less important than I thought at the time.
>
> d) increasing and decreasing cluster size is nearly painless and is easily
> scriptable. To decrease, do a rolling update on the survivors to update
> their configuration. Then take down the instance you want to lose. To
> increase, do a rolling update starting with the new instances to update the
> configuration to include all of the machines. The rolling update should
> bounce each ZK with several seconds between each bounce. Rescaling the
> cluster takes less than a minute, which makes it comparable to EC2 instance
> boot time (about 30 seconds for the Alestic ubuntu instance that we used
> plus about 20 seconds for additional configuration).
>
> On Mon, Jul 6, 2009 at 4:45 AM, David Graf <david.g...@28msec.com> wrote:
>
> > Hello
> >
> > I wanna set up a zookeeper ensemble on amazon's ec2 service. In my system,
> > zookeeper is used to run a locking service and to generate unique IDs.
> > Currently, for testing purposes, I am only running one instance. Now, I
> > need to set up an ensemble to protect my system against crashes.
> > The ec2 service has some differences from a normal server farm. E.g. the
> > data saved on the file system of an ec2 instance is lost if the instance
> > crashes. In the documentation of zookeeper, I have read that zookeeper
> > saves snapshots of the in-memory data in the file system. Is that needed
> > for recovery? Logically, it would be much easier for me if this is not the
> > case.
> > Additionally, ec2 brings the advantage that servers can be switched on and
> > off dynamically depending on the load, traffic, etc. Can this advantage be
> > utilized for a zookeeper ensemble? Is it possible to add a zookeeper
> > server dynamically to an ensemble? E.g. depending on the in-memory load?
> >
> > David
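
The resize procedure in point d) of the quoted reply can be sketched as a small planning helper. This is an illustration only, not the scripts actually used in the thread: `server_lines` and `plan_resize` are hypothetical names, the host names in the usage note are made up, and ports 2888 and 3888 are just the values conventionally shown in ZooKeeper configuration examples. Membership is modelled as a dict from server id (the number in each node's `myid` file) to host name, so ids stay stable when nodes leave.

```python
def server_lines(members, peer_port=2888, election_port=3888):
    """Render the server.N entries of zoo.cfg for an {id: host} mapping."""
    return [
        f"server.{sid}={host}:{peer_port}:{election_port}"
        for sid, host in sorted(members.items())
    ]


def plan_resize(current, target):
    """Return ordered (action, server_id) steps to go from current to target.

    Growing: start the new instances first with the full target config,
    then bounce the existing servers one at a time.
    Shrinking: bounce the survivors with the reduced config first,
    then stop the instances that are leaving.
    """
    added = sorted(set(target) - set(current))
    kept = sorted(set(current) & set(target))
    removed = sorted(set(current) - set(target))

    steps = [("start", sid) for sid in added]
    steps += [("bounce", sid) for sid in kept]
    steps += [("stop", sid) for sid in removed]
    return steps
```

For example, growing `{1: "zk1", 2: "zk2", 3: "zk3"}` by servers 4 and 5 yields two `start` steps followed by three `bounce` steps. A script driving the plan would write the `server_lines(target)` output into each node's configuration before acting on it, and pause several seconds between bounces as described above.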