[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574371#comment-13574371 ] Flavio Junqueira commented on ZOOKEEPER-1642: - I would be interested in the opinion of folks that have worked on this part of the code. Is there anything I'm overlooking? In any case, I'm uploading a simple preliminary patch with an idea for solving this. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574767#comment-13574767 ] Thawan Kooburat commented on ZOOKEEPER-1642: Not sure if my understanding is correct, but with this change I think ZK the server will load database only once during its entire process life-cycle since quorum peer is going to use the same db instance to initialize ZK server when there is a new leader election. We once considered this optimization but haven't done it yet. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574830#comment-13574830 ] Flavio Junqueira commented on ZOOKEEPER-1642: - For leader election, we check if the zkDb instance has been initialized, so it is as you say already. I'm essentially applying the same approach when loading the database after the leader is elected. Right now, I can't see a problem with it, can you? I've run the tests and they all pass, not sufficient, but a positive sign. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575048#comment-13575048 ] Thawan Kooburat commented on ZOOKEEPER-1642: I think we probably want to make sure that ZkDb has the correct data when a peer transition between leader and follower. I think the current implementation will work correctly but not sure if we want to add a unit test to make sure that it behave correctly moving forward. One of the place that I can think of is the committedLog. When a follower get elected as a leader in a new epoch. It will reuse committedLog populated since it was a follower in the old epoch. So we can add a unit test to verify the integrity of the committedLog even though it should behave correctly at the moment. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576109#comment-13576109 ] Flavio Junqueira commented on ZOOKEEPER-1642: - Here is how I currently see it. The leader doesn't have to change its database instance after being elected, but a follower might need to change it, though. ZKDatabase is actually cleared in Learner#syncWithLeader(), ZooKeeperServer#shutdown(), ZKDatabase#deserializeSnapshot(), ZKDatabase#truncateLog(). In the case the follower requires changes, it will end up clearing the database and reloading it with the corresponding changes. About committedLog(), it is cleared in ZKDatabase#clear() and ZKDatabase#addCommittedProposal(). addCommittedProposal is invoked in ZKDatabase#loadDatabase() and FinalRequestProcessor#processRequest(). With the patch I'm proposing, these methods would be invoked in the same way when a follower needs to snap, take a diff, or truncate, so I don't expect a change of behavior. About a test, I'm happy to have it, but I'm still not very sure about what to test. Should we test that a server contains all commits it should? If so, I'm sure we have tests that do it already. Should we instead just check for the content of the committedLog? > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576906#comment-13576906 ] Thawan Kooburat commented on ZOOKEEPER-1642: committeLog is just for optimization, if the follower doesn't have it when it elected as a leader, the system still behave correctly. The bug that I saw previously is that if loadDatabase() is called twice without clearing the committedLog, it will have duplicate txns which may or may not cause the problem. But in the patch that you proposed, it shouldn't happen. Since we are just checking the flag before loading the db which we already do something similar in ZooKeeperServer#startData(), we might not need a test if the scope of the patch is what mentioned in the title. Now that we know that ZooKeeper#shutdown() will clear the DB, so I guess the db will need to be reloaded again during leader election after leader failure. I think the optimization of keeping the db across leader election (except when getting a SNAP or TRUNC) is worth doing, but that can be in a separate JIRA. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577093#comment-13577093 ] Flavio Junqueira commented on ZOOKEEPER-1642: - Thanks for the feedback, Thawan. I think I haven't made myself very clear. Let me try to add some more detail about this comment you've made: bq. I think the optimization of keeping the db across leader election (except when getting a SNAP or TRUNC) is worth doing, but that can be in a separate JIRA. QuorumPeer#getLastLoggedZxid is invoked in FLE#lookForLeader to set zxid of the initial vote. To get the zxid, we load the database in getLastLoggedZxid, and my understanding is that it must be loaded because it has been cleared with the shutdown when transitioning from LEADING/FOLLOWING to LOOKING. The change I'm proposing prevents a new leader from loading the database a second time when it starts executing Leader#lead(), more concretely when it calls loadData. Does it make more sense now? > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577294#comment-13577294 ] Thawan Kooburat commented on ZOOKEEPER-1642: Ah sorry. Then, I think this patch should do what it is intended to do. I think the rest is just adding comments about why the checking is added. Since it is not clear from the code why the DB can be initialized before reaching that point. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577429#comment-13577429 ] Hadoop QA commented on ZOOKEEPER-1642: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12569163/ZOOKEEPER-1642.patch against trunk revision 1441922. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1388//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1388//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1388//console This message is automatically generated. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira >Assignee: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch, ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577435#comment-13577435 ] Flavio Junqueira commented on ZOOKEEPER-1642: - We agreed so far on not having tests, so the QA result is expected. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira >Assignee: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch, ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577857#comment-13577857 ] Thawan Kooburat commented on ZOOKEEPER-1642: Thanks Flavio, The patch looks good +1 for me > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira >Assignee: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch, ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659683#comment-13659683 ] Camille Fournier commented on ZOOKEEPER-1642: - Flavio is this ready to go? It looks ok with me, if so I can check it in. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira >Assignee: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch, ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659692#comment-13659692 ] Flavio Junqueira commented on ZOOKEEPER-1642: - Yes, ready to go. Thanks, Camille. > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira >Assignee: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch, ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1642) Leader loading database twice
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659743#comment-13659743 ] Hudson commented on ZOOKEEPER-1642: --- Integrated in ZooKeeper-trunk #1928 (See [https://builds.apache.org/job/ZooKeeper-trunk/1928/]) ZOOKEEPER-1642. Leader loading database twice (fpj via camille) (Revision 1483440) Result = SUCCESS camille : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1483440 Files : * /zookeeper/trunk/CHANGES.txt * /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java > Leader loading database twice > - > > Key: ZOOKEEPER-1642 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1642 > Project: ZooKeeper > Issue Type: Bug >Reporter: Flavio Junqueira >Assignee: Flavio Junqueira > Fix For: 3.5.0, 3.4.6 > > Attachments: ZOOKEEPER-1642.patch, ZOOKEEPER-1642.patch > > > The leader server currently loads the database before running leader election > when trying to figure out the zxid it needs to use for the election and again > when it starts leading. This is problematic for larger databases so we should > remove the redundant load if possible. > The code references are: > # getLastLoggedZxid() in QuorumPeer; > # loadData() in ZooKeeperServer. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira