[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822700#comment-15822700
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
As a side effect of a new commit 
(https://github.com/apache/zookeeper/commit/42c75b5f2457f8ea5b4106ce5dc1c34c330361c0)
 that triggers git mirror sync, this PR is closed :)


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822695#comment-15822695
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/120


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822481#comment-15822481
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
Right, it is committed, confirmed. I was only checking the apache mirror on 
git (https://github.com/apache/zookeeper), instead of the apache git directly. 

I suspect there is some infra issues on Apache (the JIRA was done 
yesterday) which contributes to the bridging between various systems not 
working as expected, leading to this PR not closed automatically.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822476#comment-15822476
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
it shows up in https://git-wip-us.apache.org/repos/asf?p=zookeeper.git


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822467#comment-15822467
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
hmm, perhaps you are right. i don't think the script is working for me 
properly... checking.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822464#comment-15822464
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
Also https://issues.apache.org/jira/browse/ZOOKEEPER-261 does not have 
Fixed Version set, which implies that the JIRA was resolved manually (my guess) 
instead of automatically as part of commit flow through 
https://github.com/apache/zookeeper/blob/master/zk-merge-pr.py - usually if we 
use the merge script it will take care everything including close the PR and 
resolve the JIRA (provided right credential has been set up.).


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822461#comment-15822461
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
I don't see this is committed in master. No commit log from git, no 
notification emails. @breed Are you sure this is committed (and pushed to 
apache git)?


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822389#comment-15822389
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
hmm this is committed. anyone understands why it doesn't autoclose?


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822369#comment-15822369
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
commited. thanx everyone for reviewing and brian for your contribution.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821844#comment-15821844
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
@enixon Thank you very much for the explanation! Makes sense, sorry for my 
misunderstanding.

@breed Yep, agree. Let's push it forward. :+1: 


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820065#comment-15820065
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95721046
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -175,11 +193,20 @@ public long restore(DataTree dt, Map 
sessions,
 "No snapshot found, but there are log entries. " +
 "Something is broken!");
 }
-/* TODO: (br33d) we should either put a ConcurrentHashMap on 
restore()
- *   or use Map on save() */
-save(dt, (ConcurrentHashMap)sessions);
-/* return a zxid of zero, since we the database is empty */
-return 0;
+
+if (suspectEmptyDB) {
+/* return a zxid of -1, since we are possibly missing data 
*/
+LOG.warn("Unexpected empty data tree, setting zxid to -1");
--- End diff --

Are we 100% sure the data tree is empty? Couldn't it be only partially 
complete? I mean the machine recorded up to transaction n, but lost 
transactions n+1, n+2, n+3, etc?


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread Edward Ribeiro (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820066#comment-15820066
 ] 

Edward Ribeiro commented on ZOOKEEPER-261:
--

I wrote the comment below on GH, but for whatever reason it was not posted 
here, so I am duplicating just to see where/if I am mistaken. :)

"Hi @enixon,

I think your approach is very cool, for real. I only had time to give a first 
pass on your patch now (hope to look closer soon, esp. the tests), but I would 
like to ask a dumb question.

What if we change the approach and, instead of the initialize file being used 
for normal execution, we use a recover (or rejoin) file whose presence denote 
an exceptional restart of a ZK node? That way, if and only if, this file is 
present we delete it and return -1L so that it cannot take part in the 
elections until it catches up with the ensemble, etc.

If this file is not present then we proceed as usual (i.e. returns 0L). This 
way, we are dealing with the exceptional case by using the initialize/recover. 
For example: node C (from a 3 node ensemble) crashes due to disk full 
exceptions. Then the operator delete the data/ directory and put the recovering 
file there.

In my humble (and naive) option, it would avoid some headaches for ops people 
who would forget to include the initialize file in a node or two, during 
rolling upgrades or other cases I can't think of right now. The presence of 
this file for normal execution changes the ordinal operation of a ZK node. So, 
we don't have to deal with changing the standard way of starting a ZK node. The 
recover file is for exceptional cases, where we want to make sure the 
restarting node cannot take part in an election.

PS: I didn't get the autocreateDB stuff also. But it's late at night here. 

Wdyt?

/cc [~hanm] [~breed] [~fpj]
"

PS2: The scenario described in the JIRA is a good point in favor of a 
{{initialize}} file, because when B & C came back **automatically** then the 
{{initialize}} file would be missing from both nodes, and the ensemble would 
grind to a halt because no one would be leader, right? Otherwise, if there was 
an script to **automatically* create those files on each node  once the machine 
was turned up then B & C would have the file created and then we could come 
back to square one, right? Does it make any sense what I am writing? Please, 
lecture me. :)

> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819918#comment-15819918
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user enixon commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95714487
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -167,6 +175,16 @@ public long restore(DataTree dt, Map 
sessions,
 PlayBackListener listener) throws IOException {
 long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+boolean suspectEmptyDB;
+File initFile = new File(dataDir.getParent(), "initialize");
+if (initFile.exists()) {
+if (!initFile.delete()) {
+throw new IOException("Unable to delete initialization 
file " + initFile.toString());
+}
+suspectEmptyDB = false;
+} else {
+suspectEmptyDB = !autoCreateDB;
--- End diff --

I tempted to do put the log line on the other side of the conditional since 
this side is the expected case. We should only delete an initialize file once 
in the lifecycle of a given server while the check against `autoCreateDB` will 
happen every other time the server is restarted.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819907#comment-15819907
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user enixon commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95714170
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -167,6 +175,16 @@ public long restore(DataTree dt, Map 
sessions,
 PlayBackListener listener) throws IOException {
 long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+boolean suspectEmptyDB;
+File initFile = new File(dataDir.getParent(), "initialize");
+if (initFile.exists()) {
--- End diff --

Nice optimization, I like it!


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819905#comment-15819905
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user enixon commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95714094
  
--- Diff: bin/zkServer-initialize.sh ---
@@ -113,6 +113,8 @@ initialize() {
 else
 echo "No myid provided, be sure to specify it in $ZOO_DATADIR/myid 
if using non-standalone"
 fi
+
+date > "$ZOO_DATADIR/initialize"
--- End diff --

True enough, `touch` is sufficient. Using `date` is an optimization I've 
included in other scripts in the past as a way of sneaking a bit more 
information into an otherwise meaningless file but in this context it's 
probably just confusing.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819849#comment-15819849
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95711916
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -167,6 +175,16 @@ public long restore(DataTree dt, Map 
sessions,
 PlayBackListener listener) throws IOException {
 long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+boolean suspectEmptyDB;
+File initFile = new File(dataDir.getParent(), "initialize");
+if (initFile.exists()) {
--- End diff --

Disclaimer: I am not used to `Files` class so you may have to make sure it 
doesn't alter the current behaviour if you decide to use it.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819823#comment-15819823
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95710948
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -167,6 +175,16 @@ public long restore(DataTree dt, Map 
sessions,
 PlayBackListener listener) throws IOException {
 long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+boolean suspectEmptyDB;
+File initFile = new File(dataDir.getParent(), "initialize");
+if (initFile.exists()) {
+if (!initFile.delete()) {
+throw new IOException("Unable to delete initialization 
file " + initFile.toString());
+}
+suspectEmptyDB = false;
+} else {
+suspectEmptyDB = !autoCreateDB;
--- End diff --

IMO, it would be nice to put a `debug` (warn?) log message here. Something 
along the lines of "Initialize file doesn't found! Using autoCreateDB 
attribute."


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819801#comment-15819801
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95707659
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) 
throws IOException {
 
 txnLog = new FileTxnLog(this.dataDir);
 snapLog = new FileSnap(this.snapDir);
+
+autoCreateDB = 
Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE,
--- End diff --

+1 with @hanm 


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819800#comment-15819800
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95707294
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -167,6 +175,16 @@ public long restore(DataTree dt, Map 
sessions,
 PlayBackListener listener) throws IOException {
 long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+boolean suspectEmptyDB;
+File initFile = new File(dataDir.getParent(), "initialize");
+if (initFile.exists()) {
--- End diff --

As Java 7 is the default we could use the code below? The benefits are that 
it automatically throws the `IOException` if an I/O error happens or return 
`false` if the file doesn't exists.

```
if (Files.deleteIfExists(initFile.toPath()) {
suspectEmptyDB = false;
} else {
(...)
```


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819803#comment-15819803
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95709489
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -167,6 +175,16 @@ public long restore(DataTree dt, Map 
sessions,
 PlayBackListener listener) throws IOException {
 long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+boolean suspectEmptyDB;
--- End diff --

Could we rename this to `recoveringDB` or `recoveringNode`? 

My rationale is: `suspectEmptyDB` looks vague to me, **plus**  __if I 
understood it right__ a node could have been shutdown and restarted after some 
time. So, not necessarily its DB will be empty, but it is in a recovering 
process so we want to avoid that it becoming the leader and messing up with 
transactions performed while it was offline, right?
Could we rename this to `recoveringDB` or `recoveringNode`? 

My rationale is: `suspectEmptyDB` looks vague to me, **plus** because __if 
I understood it right__ a node could have been shutdown and restarted after 
some time. So, not necessarily its DB will be empty, but it is in a recovering 
process so we want to avoid that it becoming the leader and messing up with 
transactions performed while it was offline, right?


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819802#comment-15819802
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user eribeiro commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r95703179
  
--- Diff: bin/zkServer-initialize.sh ---
@@ -113,6 +113,8 @@ initialize() {
 else
 echo "No myid provided, be sure to specify it in $ZOO_DATADIR/myid 
if using non-standalone"
 fi
+
+date > "$ZOO_DATADIR/initialize"
--- End diff --

Nit: If the sole purpose of this file is to act as a marker, in spite of 
its content,  then a 

```touch $ZOO_DATADIR/initialize``` 

would be enough, wouldn't it?

Of course, `date` is fine as well, no problem.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819633#comment-15819633
 ] 

Hadoop QA commented on ZOOKEEPER-261:
-

+1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 20 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/205//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/205//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/205//console

This message is automatically generated.

> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2017-01-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15819558#comment-15819558
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user enixon commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
Rebased on to latest master to avoid any potential conflicts with @breed 's 
changes for 2325.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734431#comment-15734431
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
lgtm +1


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15733423#comment-15733423
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user enixon commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
- add documentation on 'zookeeper.db.autocreate' to zookeeperAdmin.xml
- extend  bin/zkServer-initialize.sh to create the initialize file
- treat failure to delete initialization file as fatal, throw IOException 
instead of logging a warning


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727937#comment-15727937
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r91235960
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) 
throws IOException {
 
 txnLog = new FileTxnLog(this.dataDir);
 snapLog = new FileSnap(this.snapDir);
+
+autoCreateDB = 
Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE,
--- End diff --

>> Is that in accord with Zookeeper style?

I see - I thought the new property was not end user facing since there is 
no associated documents added here. Since the property 
"zookeeper.db.autocreate" is exposed to user some doc could be added to 
ZooKeeperAdmin.html (similarly like how the existing 
"zookeeper.datadir.autocreate" is documented there) to describe the motivation 
/ usage of the property.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727911#comment-15727911
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user enixon commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r91234951
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) 
throws IOException {
 
 txnLog = new FileTxnLog(this.dataDir);
 snapLog = new FileSnap(this.snapDir);
+
+autoCreateDB = 
Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE,
--- End diff --

I included `ZOOKEEPER_DB_AUTOCREATE` to allow users to opt out of the 
feature until they're ready to update their ensemble management tooling to 
support creating the new file. Is that in accord with Zookeeper style?

On the question of style, `ZOOKEEPER_DB_AUTOCREATE_DEFAULT` exists purely 
because `ZOOKEEPER_DATADIR_AUTOCREATE_DEFAULT` exists above it in the file. If 
including the defaults as static constants isn't Zookeeper style then I'm happy 
to replace it with a string literal in the constructor.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727293#comment-15727293
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r91208860
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) 
throws IOException {
 
 txnLog = new FileTxnLog(this.dataDir);
 snapLog = new FileSnap(this.snapDir);
+
+autoCreateDB = 
Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE,
--- End diff --

It seems that this variable `autoCreateDB` (and the property 
`ZOOKEEPER_DB_AUTOCREATE_DEFAULT` and `ZOOKEEPER_DB_AUTOCREATE`) is used solely 
for testing purpose (to change control flow and get code coverage). IIUC, maybe 
add some comments about these test only variables?


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727294#comment-15727294
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/120#discussion_r91208138
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -165,8 +173,41 @@ public File getSnapDir() {
  */
 public long restore(DataTree dt, Map sessions,
 PlayBackListener listener) throws IOException {
-snapLog.deserialize(dt, sessions);
+long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+boolean suspectEmptyDB;
+File initFile = new File(dataDir.getParent(), "initialize");
+if (initFile.exists()) {
+if (!initFile.delete()) {
+LOG.warn("Unable to delete initialization file " + 
initFile.toString());
--- End diff --

It sounds pretty serious issue if the initialize file can't be cleaned up 
upon startup of a new ensemble for whatever reasons as the presence of this 
file is the key promise made here - not able to clean it up will possibly lead 
to inconsistent quorum state again that this PR is trying to fix. So, maybe 
throw an IOException here to abort server start process and let admin intervene 
instead?


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727075#comment-15727075
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user enixon commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
Thanks, @hanm , let's see if editing the correct string into the title 
suffices or if I need to open up a new PR.


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727057#comment-15727057
 ] 

ASF GitHub Bot commented on ZOOKEEPER-261:
--

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/120
  
Nit: the PR title should contain the string "ZOOKEEPER-261" for the merge 
script to work - otherwise comments made here will not be bridged to JIRA 
https://issues.apache.org/jira/browse/ZOOKEEPER-261


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-261) Reinitialized servers should not participate in leader election

2016-12-06 Thread Brian Nixon (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15727016#comment-15727016
 ] 

Brian Nixon commented on ZOOKEEPER-261:
---

Ben and I discussed this offline. When starting up without any local data, the 
safest thing to do is view this lack with extreme suspicion and not participate 
in voting until you can pull down the data tree from the rest of the ensemble. 
Such a server is not qualified to confirm which servers are up to date and 
could inadvertently elect a server that is missing some data. The one exception 
is the creation of a fresh ensemble, when there is no data to repopulate the 
local data tree. It's not clear that an ensemble can detect that it is in this 
state on its own since in the worst case, every server will be subject to the 
same data losing fault (in which case you should recover from backups instead 
of coming online as an empty data base). This extra information needs to come 
from the admin.

With the changes from ZOOKEEPER-2325, a server with no local data tree starts 
with a zxid of 0. I'll submit a pull request that changes that initial zxid to 
-1 unless a special 'initialize' file is present in the data directory and 
removes voting privileges from members reporting -1. The idea is that creating 
the 'initialize' file alongside 'myid' will be a standard part of ensemble 
creation - the extra information from the admin. The 'initialize' file will be 
automatically cleaned up by the server and subsequent restarts can view missing 
data directories as a sign they are legitimately missing context (e.g. adding 
to an existing ensemble).


> Reinitialized servers should not participate in leader election
> ---
>
> Key: ZOOKEEPER-261
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-261
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection, quorum
>Reporter: Benjamin Reed
>
> A server that has lost its data should not participate in leader election 
> until it has resynced with a leader. Our leader election algorithm and 
> NEW_LEADER commit assumes that the followers voting on a leader have not lost 
> any of their data. We should have a flag in the data directory saying whether 
> or not the data is preserved so that the the flag will be cleared if the data 
> is ever cleared.
> Here is the problematic scenario: you have have ensemble of machines A, B, 
> and C. C is down. the last transaction seen by C is z. a transaction, z+1, is 
> committed on A and B. Now there is a power outage. B's data gets 
> reinitialized. when power comes back up, B and C comes up, but A does not. C 
> will be elected leader and transaction z+1 is lost. (note, this can happen 
> even if all three machines are up and C just responds quickly. in that case C 
> would tell A to truncate z+1 from its log.) in theory we haven't violated our 
> 2f+1 guarantee, since A is failed and B still hasn't recovered from failure, 
> but it would be nice if when we don't have quorum that system stops working 
> rather than works incorrectly if we lose quorum.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)