That is one important case. The offsite backup condition is probably well handled by a listener.
On Thu, Jan 19, 2012 at 7:30 PM, Flavio Junqueira <f...@yahoo-inc.com> wrote:

> You're not talking about data corruption, are you? It is incorrect data
> that has been introduced by a user or application by mistake. Am I getting
> it right?
>
> -Flavio
>
> On Jan 19, 2012, at 8:07 PM, Jordan Zimmerman wrote:
>
>> It's that very replication that creates the need for backups. If there is
>> a user error or a bad injection of data, the error will quickly replicate
>> to all the instances. There's no way to recover without an external
>> backup.
>>
>> -JZ
>>
>> On 1/19/12 10:39 AM, "Flavio Junqueira" <f...@yahoo-inc.com> wrote:
>>
>>> Hi Ted, Znodes for leader election, group membership, etc, can all be
>>> recreated, so why should I back them up instead of recreating the
>>> znodes? In fact, one might bring back a previous snapshot of the
>>> system that reflects an incorrect system state.
>>>
>>> In the case that one stores data that can't be recovered by other
>>> means, I understand the need, but then we have the durability problem
>>> that I mentioned and you apparently agreed with. Also, ZooKeeper is a
>>> replicated service, so why can't you simply rely upon the replication
>>> strategy that ZooKeeper provides to you already? Again, I'm trying to
>>> understand the use cases here.
>>>
>>> Thanks,
>>> -Flavio
>>>
>>> On Jan 19, 2012, at 7:11 PM, Ted Dunning wrote:
>>>
>>>> A backup can still be useful. It is a common property that a database
>>>> backup is known to be slightly out of date.
>>>>
>>>> Such a backup can still be very useful. In many systems, the most common
>>>> cause of error is simple human intervention. This especially applies to
>>>> file systems and databases, but can still apply to ZK if an admin
>>>> carelessly tries to clean up part of the namespace and accidentally cleans
>>>> up all of it.
>>>> This should be much less common with ZK because manual
>>>> adjustments are so much less a part of standard operation, but they can
>>>> still occur. In these cases, an out-of-date backup may be enormously
>>>> valuable.
>>>>
>>>> If somebody wants a precise backup from a particular moment in time, the
>>>> best option is to use the snapshot capabilities exposed by various file
>>>> systems. Traditional NAS vendors all support this. At a lower cost and
>>>> complexity point, you can get this from MapR clusters exposed as NFS or by
>>>> a ZFS file system. This option also allows you to keep multiple snapshots
>>>> from points in the past.
>>>>
>>>> What Jordan is doing would allow backups without special storage devices
>>>> and, with good backup of the log, would allow nearly current recovery in
>>>> the event of catastrophic loss. Yes, this loses some durability, but it is
>>>> still very desirable.
>>>>
>>>> On Thu, Jan 19, 2012 at 11:07 AM, Flavio Junqueira <fpj@yahoo-inc.com> wrote:
>>>>
>>>>> Since you started this thread, I've been thinking about the idea of
>>>>> backing up, and I'm not sure I understand the motivation and if it is ok to
>>>>> violate safety properties.
>>>>>
>>>>> Given that ZooKeeper is used for coordination, I would think that in many
>>>>> cases all its state can be reconstructed in an algorithmic manner. Perhaps
>>>>> the use case for a backup would be the one in which it is being used as a
>>>>> database, for example, to keep the metadata of a file system. Periodic
>>>>> backups or even keeping an observer, however, won't guarantee that if you
>>>>> bring the system up using that backup you'll have all committed operations.
>>>>> The state of the leader reflects all committed operations, but one needs to
>>>>> have the latest state of the transaction log to not miss an update.
>>>>>
>>>>> But, it is true that I'm assuming that you can't miss updates. If you can
>>>>> miss updates, then that's a different story. By missing updates we'll be
>>>>> violating durability, which is a property that ZooKeeper is supposed to
>>>>> provide, so I'm trying to understand in which cases violating durability
>>>>> would be acceptable. If it is not acceptable and you still want to have a
>>>>> backup, then I don't see a way other than shutting down the clients before
>>>>> you take a backup, which doesn't seem to be what is being proposed here.
>>>>>
>>>>> -Flavio
>>>>>
>>>>> On Jan 18, 2012, at 1:38 AM, Jordan Zimmerman wrote:
>>>>>
>>>>>> Neha - can you send me your email address. Send it to:
>>>>>> jzimmer...@netflix.com
>>>>>>
>>>>>> On 1/17/12 10:10 AM, "Neha Narkhede" <neha.narkh...@gmail.com> wrote:
>>>>>>
>>>>>>> Jordan,
>>>>>>>
>>>>>>> I'd be interested in previewing it. Let me know.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Neha
>>>>>>>
>>>>>>> On Mon, Jan 16, 2012 at 5:42 PM, Jordan Zimmerman
>>>>>>> <jzimmer...@netflix.com> wrote:
>>>>>>>
>>>>>>>> We'll be backing up to S3. Wouldn't it be redundant to backup all the
>>>>>>>> instances?
>>>>>>>>
>>>>>>>> -JZ
>>>>>>>>
>>>>>>>> P.S. I'm working on a ZooKeeper instance manager that will have
>>>>>>>> backup/restore and a bunch of other stuff. We'll be open sourcing it. If
>>>>>>>> anyone is interested in previewing it let me know.
>>>>>>>>
>>>>>>>> On 1/16/12 5:39 PM, "Patrick Hunt" <ph...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Why would you limit to the leader? Wouldn't backing up any server (as
>>>>>>>>> long as it's active) be sufficient? If you search the list it's been
>>>>>>>>> discussed before, using Observers seemed like a reasonable option as
>>>>>>>>> well.
>>>>>>>>>
>>>>>>>>> Patrick
>>>>>>>>>
>>>>>>>>> On Fri, Jan 13, 2012 at 2:29 PM, Jordan Zimmerman
>>>>>>>>> <jzimmer...@netflix.com> wrote:
>>>>>>>>>
>>>>>>>>>> That's easy as the backup app is running on the same machine as the ZK
>>>>>>>>>> instance. I can use 'stat' to see if "my" instance is the leader.
>>>>>>>>>>
>>>>>>>>>> On 1/13/12 2:28 PM, "Camille Fournier" <cami...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> You want to have to figure out who the leader is every time you want to
>>>>>>>>>>> take a backup? That would be the downside to this strategy I would
>>>>>>>>>>> think.
>>>>>>>>>>>
>>>>>>>>>>> C
>>>>>>>>>>>
>>>>>>>>>>> From my phone
>>>>>>>>>>> On Jan 13, 2012 5:24 PM, "Jordan Zimmerman" <jzimmer...@netflix.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> As a backup strategy, it seems I would only want to backup snapshots
>>>>>>>>>>>> from the leader. Does that make sense?
>>>>>>>>>>>>
>>>>>>>>>>>> -JZ
>>>>>
>>>>> flavio junqueira
>>>>> research scientist
>>>>>
>>>>> f...@yahoo-inc.com
>>>>> direct +34 93-183-8828
>>>>>
>>>>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
>>>>> phone (408) 349 3300 fax (408) 349 3301