Not that you would want to do this, but would simply overwriting a config file 
"uncommit" a configuration and make that server think the last committed 
configuration was whatever is in the file?

Jared

On Jul 28, 2012, at 11:33 AM, Alexander Shraer <[email protected]> wrote:

> No problem! 
> 
> The way it works is that before a server acks a reconfig operation it writes 
> a special tmp file to disk (dynamicConfigFilename + ".tmp"). Servers look for 
> this file during recovery; they don't look for the configuration in the log 
> as they do for normal data, because we found it difficult to extract the right 
> info from the log at exactly the stage we needed it in the recovery. When a 
> commit message is received, a server renames the tmp file to 
> dynamicConfigFilename.
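
The stage-then-rename scheme described above can be sketched roughly as follows. This is only an illustration in Python, not ZooKeeper's actual (Java) code: the function names are hypothetical, and `os.replace` is used for the rename step, which may be stronger than the plain rename the message describes.

```python
import os


def stage_config(dynamic_config_filename: str, config_text: str) -> str:
    """Before acking a reconfig: write the proposed config to '<name>.tmp'."""
    tmp_path = dynamic_config_filename + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(config_text)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes reach the disk before acking
    return tmp_path


def commit_config(dynamic_config_filename: str) -> None:
    """On commit: promote the staged file to the live dynamic config file."""
    os.replace(dynamic_config_filename + ".tmp", dynamic_config_filename)


def pending_config(dynamic_config_filename: str):
    """During recovery: return the staged config text if present, else None."""
    tmp_path = dynamic_config_filename + ".tmp"
    if os.path.exists(tmp_path):
        with open(tmp_path, encoding="utf-8") as f:
            return f.read()
    return None
```

In this sketch, a server that crashes between `stage_config` and `commit_config` still finds the proposed configuration via `pending_config` when it restarts.
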
> 
> Recently there was a change committed by someone to start using atomic file 
> operations for different files in ZooKeeper. At some point we'll probably 
> change the renaming above to use these atomic operations.
> 
> Alex
>   
> 
> On Sat, Jul 28, 2012 at 10:17 AM, Jared Cantwell <[email protected]> 
> wrote:
> Thanks Alex for the detailed explanations--  it really helps to fill in my 
> understanding of the implementation left open by the papers/presentations 
> I've read (without having to read the code yet :-) ).  #2 is what I was 
> unsure of, but makes perfect sense.
> 
> Obviously committing the new configuration to the internal database is a 
> prerequisite to committing on a server, but is writing the new configuration 
> file to disk also a prerequisite for committing the new configuration?  I'm 
> curious about this so I can match it with my observations, since reading the 
> configuration file is much easier than getting the database state.
> 
> ~Jared
> 
> 
> On Sat, Jul 28, 2012 at 11:02 AM, Alexander Shraer <[email protected]> wrote:
> Hi Jared,
> 
> figuring out what happened and how to recover is part of the reconfiguration 
> protocol. I don't think that this is something you as a user should do, 
> unless I misunderstand what you're trying to do. This should be handled by 
> ZooKeeper just like it handles other failures without admin intervention. 
> 
> In your scenario, D-F come up and one of them is elected leader (since you 
> said they know about the commit), so they start running the new config 
> normally. When A-C come up, several things may happen: 
> 
> 1. During the preliminary FastLeaderElection, A-C will try to connect to D 
> and E, and in fact they'll also try to connect with the new config members 
> that they know was proposed. So most likely someone in the new 
> config will send them the new config file and they'll store it and act 
> accordingly (connect as non-voting followers in the new config). To make this 
> happen, I changed FastLeaderElection to talk with proposed configs (if known) 
> and to piggyback the last active config a server knows of on all messages.
> 
> 2. It's possible that somehow A-C complete FastLeaderElection without talking 
> to D-F. But since a reconfiguration was committed, it was acked by a quorum 
> of the old config (and a quorum of the new one). Therefore, whoever is 
> "elected" in the old config, knows about the reconfig proposal (this is 
> guaranteed by normal ZooKeeper leader recovery). Before doing anything else, 
> the new leader among A-C will try to complete the reconfiguration, which 
> involves getting enough acks from a quorum of the new config. But in your 
> scenario the servers in the new config will not connect to it because they 
> moved on, so the candidate-leader will just give up and go back to (1) above. 
> 
> 3. In the remote chance that someone who heard about the reconfig commit 
> connects to a candidate-leader who didn't hear about it, the first thing it 
> does is to tell that candidate-leader that it's not up to date, and the 
> leader just updates its config file, gives up on being a leader and returns 
> to (1). This was done by changing the first message that a follower/observer 
> sends to a leader it is connecting to, even before the synchronization starts.
> 
> Alex
> 
> 
> 
> On Sat, Jul 28, 2012 at 8:43 AM, Jared Cantwell  <[email protected]> 
> wrote:
> So I'm working through some failure scenarios and I want to make sure I fully 
> understand the way that dynamic membership changes previous behavior, so are 
> my expectations correct in this situation:
> 
> As in my previous example, let's say that the current membership of voting 
> participants is {A,B,C,D,E} and we're looking to change membership to 
> {D,E,F,G,H}. 
> 1. Reconfiguration to {D,E,F,G,H} completes internally 
> 2. D-F update their local configuration files, but A-C do not yet.
> 3. Power loss to all nodes
> 
> Now what happens if A,B, and C come up with configuration files that still 
> say {A,B,C,D,E}, but no other servers start up yet?  Can A,B and C form a 
> quorum and elect a leader since they all agree on the same state?  What then 
> happens when the new membership of D-H starts up?
> 
> We're trying to automatically handle node failures during reconfiguration 
> situations, but it seems like without being able to query all nodes to make 
> sure you know of the latest membership list there is no safe way to do this.  
> I'm wondering if only doing single node additions/removals would create less 
> complicated failure scenarios.  What are your thoughts and best practices 
> around this?
> 
> Thanks!
> Jared
> 
> On Fri, Jul 27, 2012 at 8:57 PM, Jared Cantwell <[email protected]> 
> wrote:
> We are trying to remove the need for all admin intervention so that is one 
> failure scenario that is interesting to us. 
> 
> Jared
> 
> 
> On Jul 27, 2012, at 7:42 PM, Alexander Shraer <[email protected]> wrote:
> 
>> Yes, this entry will be deleted. I don't like this either - if a new 
>> follower reboots before being added to the config it will not be able to 
>> boot up without manual help from an admin. That's why I'm considering 
>> removing the check that a participant must always initially be in its own 
>> config, but for now it's there.
>> 
>> Alex
>> 
>> On Fri, Jul 27, 2012 at 6:34 PM, Jared Cantwell <[email protected]> 
>> wrote:
>> Sorry for the confusion in terminology, I was unfamiliar with the exact 
>> leader/follower semantics previously. 
>> 
>> So if all connected servers update their config file, does that mean that 
>> non-voting followers who aren't part of the new ensemble will lose the entry 
>> specific to them in their config file?  I can test this myself, but getting 
>> an inside perspective is very helpful. 
>> 
>> Thanks again for the help!
>> Jared
>> 
>> 
>> On Jul 27, 2012, at 6:55 PM, Alexander Shraer <[email protected]> wrote:
>> 
>>> Yes, any number of followers which are not in the configuration can just 
>>> connect and listen in. This has always been the case, also in 3.4, I just 
>>> made use of this for the purpose of adding members during reconfiguration. 
>>> Moreover, in 3.4 there is this bug, ZOOKEEPER-1113,
>>> where the leader actually counts the votes of anyone connected, regardless 
>>> of config membership :) This is fixed in ZK-107, so they are really 
>>> non-voting followers. 
>>> 
>>> >   I am assuming that's the case, and that it is a follower (and not 
>>> > participant) by virtue of not being in the official configuration stored 
>>> > in 
>>> > zookeeper itself. 
>>> 
>>> Follower and participant server types are not something that was defined 
>>> in ZK-107. In ZooKeeper every follower/leader is a "participant". It's just 
>>> that the votes of participants that are not in the configuration are not 
>>> counted, which is why we call them non-voting followers. BTW, obviously a 
>>> non-voting follower cannot become leader (like ZK-1113, this was also not 
>>> enforced before ZK-107).
>>> 
>>> > And a followup... does zookeeper only overwrite the dynamic 
>>> > configuration file for nodes that are voting participants?  Such that if 
>>> > I 
>>> > started a follower and then left it running through some 
>>> > reconfigurations, its file would not get updated if it was never added as 
>>> > part of those reconfigurations?
>>> 
>>> No, as soon as it connects to the current leader, its dynamic config file 
>>> is overwritten with the current configuration as part of the 
>>> synchronization with the leader. Every time a new configuration is 
>>> committed, all connected servers (voting, non-voting, observers) will 
>>> update their dynamic config file, doesn't matter if they're in the config.
>>> 
>>> Alex
>>> 
>>> On Fri, Jul 27, 2012 at 5:35 PM, Jared Cantwell <[email protected]> 
>>> wrote:
>>> So does just having the server started and pointing to the existing 
>>> ensemble automatically make it a "non participating follower"?  In other 
>>> words, there is no need to inform the existing nodes that this new node is 
>>> joining as a follower?  And to extend that, there could be any number of 
>>> followers that are simply listening in on the event stream?  I am assuming 
>>> that's the case, and that it is a follower (and not participant) by virtue 
>>> of not being in the official configuration stored in zookeeper itself.
>>> 
>>> On Fri, Jul 27, 2012 at 6:29 PM, Alexander Shraer <[email protected]> wrote:
>>> there are just two supported types - participant and observer.
>>> (participant can act as either follower or leader).
>>> 
>>> So you can either write participant or leave it unspecified (which means 
>>> participant by default). Also, since the ip is the same for all your ports 
>>> you don't have to write it twice.  All of these should work in the same way:
>>> 
>>> server.5=10.10.5.17:2182:2183:participant;10.10.5.17:2181
>>> server.5=10.10.5.17:2182:2183:participant;2181
>>> server.5=10.10.5.17:2182:2183;10.10.5.17:2181
>>> server.5=10.10.5.17:2182:2183;2181
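
As a rough illustration of why the four lines above are equivalent, here is a small Python sketch of a parser for this line format. It is not ZooKeeper's parser: the field names are made up for the example, it handles IPv4/hostname addresses only, and it assumes (following the message above) that an omitted client address falls back to the server address. The real implementation may use a different default.

```python
from typing import NamedTuple

SUPPORTED_ROLES = {"participant", "observer"}


class ServerSpec(NamedTuple):
    server_id: int
    host: str
    quorum_port: int
    election_port: int
    role: str
    client_host: str
    client_port: int


def parse_server_line(line: str) -> ServerSpec:
    """Parse 'server.<id>=<host>:<port>:<port>[:<role>];[<client host>:]<client port>'."""
    key, value = line.strip().split("=", 1)
    if not key.startswith("server."):
        raise ValueError(f"not a server line: {line!r}")
    server_id = int(key[len("server."):])

    server_part, client_part = value.split(";", 1)
    fields = server_part.split(":")
    host, quorum_port, election_port = fields[0], int(fields[1]), int(fields[2])

    # Role is optional; unspecified means participant.
    role = fields[3] if len(fields) > 3 else "participant"
    if role not in SUPPORTED_ROLES:
        raise ValueError(f"unsupported role: {role!r}")

    # Assumption for this sketch: a bare client port reuses the server address.
    if ":" in client_part:
        client_host, client_port = client_part.rsplit(":", 1)
    else:
        client_host, client_port = host, client_part

    return ServerSpec(server_id, host, quorum_port, election_port,
                      role, client_host, int(client_port))
```

Under these assumptions all four spellings parse to the same record, while a role such as "follower" is rejected.
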
>>> 
>>> 
>>> 
>>> On Fri, Jul 27, 2012 at 5:25 PM, Jared Cantwell <[email protected]> 
>>> wrote:
>>> Thanks Alex for the response.  Our current lines in the configuration look 
>>> like this:
>>> 
>>> server.5=10.10.5.17:2182:2183:participant;10.10.5.17:2181
>>> 
>>> For the new servers is it ok for their entry to have "participant"?  Or 
>>> should that be something different (e.g. "follower")?
>>> 
>>> ~Jared
>>> 
>>> On Fri, Jul 27, 2012 at 6:20 PM, Alexander Shraer <[email protected]> wrote:
>>> Hi Jared,
>>> 
>>> Thanks for experimenting with this feature. 
>>> 
>>> The idea is that new servers join as "non-voting followers", which means 
>>> that they act as normal followers but the leader ignores their votes since 
>>> they are not part of the current configuration. The leader only counts 
>>> their votes during the reconfiguration itself (to make sure a quorum of the 
>>> new config is ready before the new config can be committed/activated). 
>>> Defining them as observers is not a good idea. For example, in your 
>>> scenario, if they were observers they wouldn't be able to participate in the 
>>> reconfiguration protocol (which is similar to the protocol for committing 
>>> any other operation, in which observers don't participate). Since we 
>>> wouldn't have a quorum of followers in the new config that can ack, 
>>> reconfiguration would throw an exception (of 
>>> KeeperException.NEWCONFIGNOQUORUM type). 
>>> Of course if you intend them to be observers in the new config you can 
>>> define them as observers since their votes are not needed during reconfig 
>>> anyway.
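
The quorum condition described here (acks from a quorum of the old config and from a quorum of the new one before the new config can be committed) can be sketched as below. This is a simplified Python illustration with hypothetical function names, assuming plain majority quorums; it is not the actual ZooKeeper logic.

```python
def has_majority(acks, voters) -> bool:
    """True if strictly more than half of `voters` appear in `acks`."""
    voters = set(voters)
    return len(set(acks) & voters) > len(voters) // 2


def can_commit_reconfig(acks, old_voters, new_voters) -> bool:
    """A reconfig is committable only with a majority of both configs."""
    return has_majority(acks, old_voters) and has_majority(acks, new_voters)


old = {"A", "B", "C", "D", "E"}
new = {"D", "E", "F", "G", "H"}

# All old members ack, but only D and E belong to the new config: not enough.
print(can_commit_reconfig({"A", "B", "C", "D", "E"}, old, new))

# A, D, E form a majority of the old config; D, E, F a majority of the new one.
print(can_commit_reconfig({"A", "D", "E", "F"}, old, new))
```

If F, G and H were observers, their acks would not count, and the second case would fail as well, which corresponds to the no-quorum exception mentioned above.
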
>>> 
>>> You're right, the new servers must be able to connect to the old quorum. At 
>>> minimum, their file should contain the current leader, but 
>>> you can also copy the current configuration file to the new members if you 
>>> wish. 
>>> 
>>> In addition, you should add a line for the member itself, so that server F 
>>> appears in F's config file (It's not important that the other new servers 
>>> appear in F's file, but it won't hurt either, so you can do a union of old 
>>> and new if you wish). The constructor of QuorumPeer checks that the server 
>>> itself is in the configuration it's started with; otherwise it's not going to 
>>> run. This check has always been there, but I'm thinking of possibly 
>>> changing it in the future.
>>> 
>>> As soon as F connects to the leader, its config file will be overwritten 
>>> with the current config file as part of the synchronization process. 
>>> 
>>> Alex
>>> 
>>> 
>>> On Fri, Jul 27, 2012 at 10:06 AM, Jared Cantwell <[email protected]> 
>>> wrote:
>>> Hi,
>>> 
>>> We are testing integration with 3.5.0 and dynamic membership and I have a
>>> question.  If I have a current set of servers in my ensemble {A,B,C,D,E}
>>> and I want to reconfigure the ensemble to {D,E,F,G,H}, how should the
>>> dynamic config file on servers F,G,H be configured on startup?  Should they
>>> have the old ensemble, the new ensemble, or the union of both ensembles?
>>> It seems like these new servers need to know about the old quorum, but
>>> since they aren't part of it yet it's not clear to me how they should be
>>> configured.  Should there be an intermediate configuration with F,G, and H
>>> as simply Observers?
>>> 
>>> I can't find much documentation on this so I want to make sure I understand
>>> things correctly.
>>> 
>>> Thanks!
>>> ~Jared
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
> 
> 
