[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628721#comment-13628721 ]

Hudson commented on HBASE-7824:
-------------------------------

Integrated in HBase-0.94 #955 (See [https://builds.apache.org/job/HBase-0.94/955/])
HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1466725)

Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

Improve master start up time when there is log splitting work
-------------------------------------------------------------

Key: HBASE-7824
URL: https://issues.apache.org/jira/browse/HBASE-7824
Project: HBase
Issue Type: Bug
Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
Fix For: 0.94.7
Attachments: hbase-7824.patch, hbase-7824-v10.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch, hbase-7824-v9.patch

When there is log split work going on, master startup waits until all log split work completes, even though the log split has nothing to do with meta region servers. This is bad behavior, considering that a master node can run while log splitting is happening, yet its startup is blocked by the log split work. Since the master is something of a single point of failure, we should start it ASAP.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629770#comment-13629770 ]

Hudson commented on HBASE-7824:
-------------------------------

Integrated in HBase-0.94-security #134 (See [https://builds.apache.org/job/HBase-0.94-security/134/])
HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1466725)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628024#comment-13628024 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

Many thanks to Ram, Chunhui and Rajesh for the latest reviews!

[~lhofhansl] Are you all right with checking patch v10 in? Thanks.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628389#comment-13628389 ]

Lars Hofhansl commented on HBASE-7824:
--------------------------------------

Patch looks good (although I didn't have time for a detailed review).

+1
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628401#comment-13628401 ]

Ted Yu commented on HBASE-7824:
-------------------------------

Integrated to 0.94.

Thanks for the continued effort, Jeff.

Thanks for the reviews, Ram, Chunhui and Rajesh.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626820#comment-13626820 ]

Ted Yu commented on HBASE-7824:
-------------------------------

[~lhofhansl], [~ram_krish]:
Do you have further review comments?
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627506#comment-13627506 ]

ramkrishna.s.vasudevan commented on HBASE-7824:
-----------------------------------------------

If Chunhui's comments above are fixed, and I see that the latest patch v10 has the changes incorporated, then +1 on the patch.
Thanks, Jeffrey, for your continued, persistent efforts on this JIRA.
Thanks to Chunhui for the good reviews.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625509#comment-13625509 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

[~zjushch] Thanks for the detailed review! For your first two comments, I'll make the corresponding modifications.

{quote}
Should use the flag 'shouldSplitMetaSeparately' like other log-split?
{quote}
A good question. Since splitLog is a synchronous call, the following two calls
{code}
fileSystemManager.splitMetaLog(sn);
fileSystemManager.splitLog(sn);
{code}
are logically equivalent to one splitAllLogs call, while splitAllLogs has a slight performance advantage because it submits all the log splitting work in one go. 'shouldSplitMetaSeparately' is significant in MetaSSH and SSH, while in other places there is no logical difference. That being said, in some places I could take advantage of separating them to improve master startup a little more. As you know, both features are new, so I chose the conservative way to begin with and made them less dependent on each other.

{quote}
in AssignmentManager#processDeadServersAndRegionsInTransition, how about if we mark it as a clean cluster startup?
if we mark it as a failover, is there any conflict between SSH and AssignmentManager#processDeadServersAndRecoverLostRegions
{quote}
If there is log splitting work left over, the new master startup isn't a clean one. The reason to treat it as a failover is to let SSH (a single place) handle dead servers, including the log splitting we skipped at the very beginning. If we treated the startup as a clean one, we could lose data because log splitting wouldn't be done for some regions. AssignmentManager#processDeadServersAndRecoverLostRegions already intentionally skips all known dead servers and leaves them to SSH, so there is no conflict.

{quote}
From DeadServer#cleanPreviousInstance, a deadserver will be removed if the same HostnamePort servername is online.
{quote}
Good concern. The key point is that DeadServer#cleanPreviousInstance is only called after master initialization. By then, we don't rely on DeadServer much as far as master startup is concerned. After the master is initialized, DeadServer is basically used in the UI to show previously dead servers, for YouAreDeadException handling, and to prevent duplicate expireServer calls. As you already know, once an SSH for a dead server is submitted, it continues until it's done regardless of whether the server is in DeadServer or not. This can already happen today when an RS crashes several times in a row while its previous instances are still in the SSH pipeline, whether or not DeadServer tracks them. In short, DeadServer#cleanPreviousInstance doesn't have much impact.
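The equivalence argument in the comment above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class SplitOrderSketch, the method bodies, and the rule that a meta log is one whose name ends in ".meta" are hypothetical, and only the method names splitMetaLog, splitLog and splitAllLogs are taken from the comment. It does not use MasterFileSystem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these methods are stand-ins, not the MasterFileSystem API.
final class SplitOrderSketch {

    // Assumption for this sketch: a meta region log is marked by a ".meta" suffix.
    static boolean isMetaLog(String log) {
        return log.endsWith(".meta");
    }

    // "Splits" only the meta logs and returns the logs it handled.
    static List<String> splitMetaLog(List<String> logs) {
        List<String> done = new ArrayList<>();
        for (String log : logs) {
            if (isMetaLog(log)) {
                done.add(log);
            }
        }
        return done;
    }

    // "Splits" only the non-meta logs and returns the logs it handled.
    static List<String> splitLog(List<String> logs) {
        List<String> done = new ArrayList<>();
        for (String log : logs) {
            if (!isMetaLog(log)) {
                done.add(log);
            }
        }
        return done;
    }

    // "Splits" every log in a single submission.
    static List<String> splitAllLogs(List<String> logs) {
        return new ArrayList<>(logs);
    }

    public static void main(String[] args) {
        List<String> logs = Arrays.asList("wal.1", "wal.2.meta", "wal.3");

        // Two synchronous calls, one after the other.
        Set<String> twoCalls = new HashSet<>(splitMetaLog(logs));
        twoCalls.addAll(splitLog(logs));

        // One combined call.
        Set<String> oneCall = new HashSet<>(splitAllLogs(logs));

        System.out.println("same logs split: " + twoCalls.equals(oneCall));
    }
}
```

Because each call returns only after its work is done, the two-call sequence and the single call end with the same set of logs split; the difference is in how many submissions are made, which is where the small performance advantage mentioned above would come from.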
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625592#comment-13625592 ]

ramkrishna.s.vasudevan commented on HBASE-7824:
-----------------------------------------------

waitingOnLogSplitting - waitOnLogSplitting may be better.

bq. this.catalogTracker.getMetaLocationOrReadLocationFromRoot();
Is it OK to call this in one place, as per my comment yesterday?

If root went down after
{code}
this.initializationBeforeMetaAssignment = true;
{code}
we call assignRoot. When we try to split the log, we will not do so because SSH is not yet enabled.
{code}
this.assignmentManager.assignRoot();
waitForRootAssignment();
{code}
So we expect that, although the above step waits, we do
{code}
this.serverManager.enableSSHForRoot();
{code}
Which will do the assignment? sshEnabled is still not true, right?

Things look fine, but this area is still a real headache. The removal of ROOT in trunk is a blessing for developers now.
I think if you are confident about the above comments, then let us go for a commit and address future issues if they come up. Else we are good.
Good stuff, Jeff. Every time, I feel that something may have been missed in this area.
@Chunhui
What do you think?
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625628#comment-13625628 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

[~ram_krish] Thanks for the review!

{quote}
Is it ok to call this in one place as per my yesterday's comment.
{quote}
The change was missed, and I'll make sure it's in the next patch.

{quote}
Which will do the assignment? Still sshEnabled is not true right?
{quote}
A very good point. In very rare cases I did see tests fail due to this. If there are no objections, I can move enableSSHForRoot to right after assignRoot() to close the loophole, following the same pattern we use for meta assignment.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626118#comment-13626118 ]

chunhui shen commented on HBASE-7824:
-------------------------------------

As Ram said, things are easy to miss in this area...

Patch v10 looks good to me.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626131#comment-13626131 ]

chunhui shen commented on HBASE-7824:
-------------------------------------

I may have found a bug case.

Suppose we have Master, RS1 and RS2:
1. Kill master and RS1.
2. Start master and RS1.
3. Master starts SSH to process dead server RS1 during initialization.
4. RS1 is not in the dead servers, since a new RS1 is online.
5. AssignmentManager#joinCluster rebuilds user regions and returns the dead server RS1 and its regions.
6. AssignmentManager#processDeadServersAndRecoverLostRegions assigns the regions carried by RS1.
7. However, the hlogs of RS1 are still being split by SSH, which means data loss, since we assign the regions in step 6 before log splitting completes.

[~jeffreyz]
Please take a look, and correct me if I'm wrong.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626182#comment-13626182 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

[~zjushch] Could you please clarify the online state of RS1 from step 4 to step 6? Thanks.

In step 4, RS1 is recorded as online by the master, while in step 5 we return RS1 as dead. AM#rebuildUserRegions only returns dead servers which are not among the online servers. Since AssignmentManager#processDeadServersAndRecoverLostRegions skips all dead servers for region assignment, it seems you're suggesting that RS1 is online again in step 6.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626185#comment-13626185 ]

chunhui shen commented on HBASE-7824:
-------------------------------------

RS1,001 is the dead server.
RS1,002 is the online server,
where 001 and 002 represent the start codes of the region server.

RS1,001 is being processed by SSH and is also marked as a dead server in AM#rebuildUserRegions.
However, RS1,001 is not included in ServerManager#getDeadServers, so AssignmentManager#processDeadServersAndRecoverLostRegions won't skip this server.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626221#comment-13626221 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

The following clarifications might help:

During master startup, the function getFailedServersFromLogFolders returns (rs1,001) as part of failedServers, because the start code is part of the server name and therefore also part of the hlog file path.

Before AM.joinCluster(), the following code in HMaster#finishInitialization puts (rs1,001) into deadservers.
{code}
status.setStatus("Submit log splitting work of non-meta region servers");
for (ServerName curServer : failedServers) {
  this.serverManager.expireServer(curServer);
}
{code}
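As a rough illustration of why the start code matters in the clarification above, the sketch below derives failed servers from log folder names. Everything in it is an assumption made for the example: the class FailedServersSketch, the folder names, the "host,port,startcode" layout, and the "-splitting" suffix handling. It is not the body of getFailedServersFromLogFolders.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: not the real getFailedServersFromLogFolders implementation.
final class FailedServersSketch {

    // Assumption for this sketch: a folder that is being split carries this suffix.
    static final String SPLITTING_SUFFIX = "-splitting";

    // Returns the server names that own a log folder but are not currently online.
    static Set<String> failedServersFromLogFolders(List<String> logFolders,
                                                   Set<String> onlineServers) {
        Set<String> failed = new TreeSet<>();
        for (String folder : logFolders) {
            String serverName = folder.endsWith(SPLITTING_SUFFIX)
                ? folder.substring(0, folder.length() - SPLITTING_SUFFIX.length())
                : folder;
            // The full name includes the start code, so an old instance of a
            // restarted server is distinct from the new, online instance.
            if (!onlineServers.contains(serverName)) {
                failed.add(serverName);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> folders = Arrays.asList(
            "rs1,60020,001-splitting", "rs1,60020,002", "rs2,60020,005");
        Set<String> online = new HashSet<>(
            Arrays.asList("rs1,60020,002", "rs2,60020,005"));
        System.out.println(failedServersFromLogFolders(folders, online));
    }
}
```

In this toy run only the old instance with start code 001 is reported as failed, even though a server with the same host and port is online with start code 002.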
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626226#comment-13626226 ]

chunhui shen commented on HBASE-7824:
-------------------------------------

bq. HMaster#finishInitialization will put (rs1,001) into deadservers.
Yes, that's right.
But (rs1,001) will be removed from deadservers by DeadServer#cleanPreviousInstance; you could take a look at its call hierarchy.

I pointed this out in the comment above:
From DeadServer#cleanPreviousInstance, a deadserver will be removed if the same HostnamePort servername is online.

It means a server will not belong to deadservers even if it is being processed by SSH.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626232#comment-13626232 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

I was about to send some additional notes about DeadServer#cleanPreviousInstance. I think they may help clear up the confusion:

1) The above steps, including AM#joinCluster, happen before master.initialized becomes true.
2) Inside the function ServerManager#checkIsDead:
{code}
// remove dead server with same hostname and port of newly checking in rs after master
// initialization. See HBASE-5916 for more information.
if ((this.services == null || ((HMaster) this.services).isInitialized())
    && this.deadservers.cleanPreviousInstance(serverName)) {
{code}

You can see that this.deadservers.cleanPreviousInstance won't do anything, because the master is NOT initialized yet.
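The effect of the initialization guard discussed above can be sketched with a small stand-alone model. The class DeadServerGuardSketch and its methods are hypothetical and only mimic the behaviour described in this thread (server names of the form "host,port,startcode", and cleanup of a previous instance only once the master is initialized); this is not the ServerManager or DeadServer code.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: mimics the guard described in the thread, not HBase classes.
final class DeadServerGuardSketch {
    private final Set<String> deadServers = new HashSet<>();
    private boolean initialized = false;

    // Marks a server instance as dead (its logs would then be split by a handler).
    void expireServer(String serverName) {
        deadServers.add(serverName);
    }

    // Models the point at which master initialization completes.
    void finishInitialization() {
        initialized = true;
    }

    // Extracts "host,port" from "host,port,startcode"; assumes well-formed input.
    static String hostAndPort(String serverName) {
        return serverName.substring(0, serverName.lastIndexOf(','));
    }

    // Models a region server checking in. Previous instances with the same
    // host and port are dropped from the dead set only after initialization.
    void checkIn(String serverName) {
        if (initialized) {
            String hp = hostAndPort(serverName);
            deadServers.removeIf(dead -> hostAndPort(dead).equals(hp));
        }
    }

    boolean isDead(String serverName) {
        return deadServers.contains(serverName);
    }

    public static void main(String[] args) {
        DeadServerGuardSketch master = new DeadServerGuardSketch();
        master.expireServer("rs1,60020,001");

        // The restarted instance checks in while the master is still initializing.
        master.checkIn("rs1,60020,002");
        System.out.println("dead before init: " + master.isDead("rs1,60020,001"));

        // After initialization, a check-in cleans up the previous instance.
        master.finishInitialization();
        master.checkIn("rs1,60020,002");
        System.out.println("dead after init: " + master.isDead("rs1,60020,001"));
    }
}
```

In this model the old instance stays in the dead set for the whole initialization phase, which is the window in which the region assignment steps from the bug scenario run, so those steps would still see it as dead and skip it.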
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626250#comment-13626250 ]

chunhui shen commented on HBASE-7824:

Good point, I'm clear about this now, thanks.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626251#comment-13626251 ]

chunhui shen commented on HBASE-7824:

+1 from me
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626261#comment-13626261 ]

stack commented on HBASE-7824:

Quality work, lads (Jeffrey, Ram, and Chunhui).
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624740#comment-13624740 ]

chunhui shen commented on HBASE-7824:

bq. Since ZK session timeout take a while, HMaster#splitLogAndExpireIfOnline will kick in so there won't be any issue.

1. The ZK session will time out once the Java process exits.
2. I think that in a complex network we shouldn't assume that the ZK session timeout happens after HMaster#splitLogAndExpireIfOnline; e.g. the return of getMetaLocationOrReadLocationFromRoot may hang for a little while.

bq. are you fine with this adjustment?

Sorry, why make this adjustment? In addition, is it only for 0.94 and not trunk? If it is for trunk only, I think the trouble above does not arise, since ROOT has been dropped in trunk.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624751#comment-13624751 ]

Jeffrey Zhong commented on HBASE-7824:

It's only for 0.94, and the adjustment guarantees that splitLog happens right before {code}assignmentManager.assignMeta();{code} just as before, to deal with the possible data loss issue you mentioned.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624757#comment-13624757 ]

ramkrishna.s.vasudevan commented on HBASE-7824:

{code}
ServerName currentMetaServer = this.catalogTracker.getMetaLocationOrReadLocationFromRoot();
{code}
This is read in two places: once in finishInitialization() and again inside assignMeta, where the META RS is checked against the previous ROOT server. Can we check this only once and then make the pseudo-code change mentioned above by Jeffrey?
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624769#comment-13624769 ]

Jeffrey Zhong commented on HBASE-7824:

I have pasted the whole related code snippet below. Ram's suggestion is possible because no one re-assigns META until assignmentManager.assignMeta(). The modified logic is the same as before the patch.
{code}
...
ServerName currentMetaServer = this.catalogTracker.getMetaLocationOrReadLocationFromRoot();
if (currentMetaServer != null && !currentMetaServer.equals(previousRootServer)) {
  fileSystemManager.splitAllLogs(currentMetaServer);
  if (this.serverManager.isServerOnline(currentMetaServer)) {
    this.serverManager.expireServer(currentMetaServer);
  }
}
assignmentManager.assignMeta();
enableSSHandWaitForMeta();
...
{code}
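The ordering in the snippet above (split the META server's logs, expire it if it is still online, and only then assign META) can be checked in isolation with a small recording stub. This is a sketch under assumptions: the `Cluster` interface, the `Recorder` class, and the use of plain strings for server names are hypothetical stand-ins, not HBase types.

```java
import java.util.ArrayList;
import java.util.List;

final class MetaStartupSketch {
    /** Hypothetical stand-in for the master services used during start-up. */
    interface Cluster {
        void splitAllLogs(String server);
        boolean isServerOnline(String server);
        void expireServer(String server);
        void assignMeta();
    }

    /** Mirrors the control flow of the quoted snippet. */
    static void recoverAndAssignMeta(Cluster cluster, String currentMetaServer,
                                     String previousRootServer) {
        // Skip recovery when META's server is unknown or was already handled
        // together with the previous ROOT server.
        if (currentMetaServer != null && !currentMetaServer.equals(previousRootServer)) {
            cluster.splitAllLogs(currentMetaServer);
            if (cluster.isServerOnline(currentMetaServer)) {
                cluster.expireServer(currentMetaServer);
            }
        }
        cluster.assignMeta();
    }

    /** Records the order of calls so the sequence can be inspected. */
    static final class Recorder implements Cluster {
        final List<String> calls = new ArrayList<>();
        private final boolean online;

        Recorder(boolean online) {
            this.online = online;
        }

        @Override public void splitAllLogs(String server) { calls.add("split:" + server); }
        @Override public boolean isServerOnline(String server) { return online; }
        @Override public void expireServer(String server) { calls.add("expire:" + server); }
        @Override public void assignMeta() { calls.add("assignMeta"); }
    }
}
```

With a `Recorder`, the recorded call list shows that META assignment always comes last, and that the split and expire steps are skipped when the META server equals the previous ROOT server.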
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624788#comment-13624788 ]

chunhui shen commented on HBASE-7824:

bq. fileSystemManager.splitAllLogs(currentMetaServer);

Should we take care of log-split concurrency between the master initialization thread and SSH?
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624789#comment-13624789 ]

Jeffrey Zhong commented on HBASE-7824:

Yep, I've taken care of that. Basically, log split tasks are synchronized before SSH is enabled. Once you agree with the approach in general, I'll submit the modified patch for review. A question, though: now that we have the log splitting synchronization mechanism before SSH is enabled, should we do expireServer first and then do log splitting?
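The comment above does not say how the log split tasks are synchronized, so the following is only one possible shape of such a mechanism, not the patch's implementation. The class `LogSplitGuardSketch` and its `splitOnce` method are hypothetical; the sketch uses a per-server lock so that a second caller (for example SSH racing with the master initialization thread) waits for the first split to finish and then skips the work.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical guard ensuring a server's logs are split by only one caller. */
final class LogSplitGuardSketch {
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    private final Set<String> alreadySplit = ConcurrentHashMap.newKeySet();

    /**
     * Runs the splitter for the given server unless a previous call already
     * completed it. Concurrent callers for the same server block on the
     * per-server lock until the running split finishes.
     *
     * @return true if this call performed the split, false if it was skipped
     */
    boolean splitOnce(String server, Runnable splitter) {
        Object lock = locks.computeIfAbsent(server, s -> new Object());
        synchronized (lock) {
            if (alreadySplit.contains(server)) {
                return false;
            }
            splitter.run();
            alreadySplit.add(server);
            return true;
        }
    }
}
```

A per-server lock is used here, rather than one global lock, so that splits for different servers can proceed in parallel. Whether the actual patch serializes per server or globally is not stated in the thread.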
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624792#comment-13624792 ]

chunhui shen commented on HBASE-7824:

bq. log splitting synchronization mechanism before SSH is enabled

Yes, it's a solution. What's the difference if we do expireServer first? None? Go ahead as you think best.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625026#comment-13625026 ]

Jeffrey Zhong commented on HBASE-7824:

Test results for the v9 patch:
{code}
Test Suite Results :
Tests run: 1339, Failures: 0, Errors: 0, Skipped: 13

Integration: IntegrationTestDataIngestWithChaosMonkey
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 353.049 sec
{code}
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625088#comment-13625088 ]

chunhui shen commented on HBASE-7824:

Minor comments:
{code}
+ * @param previousRootServer ServerName of previous root region server before current start up
+ * @return
+ * @throws InterruptedException
{code}
Remove the empty @return.
{code}
+} catch (Exception ex) {
+  LOG.warn("Retry setClusterDown failed", ex);
+}
{code}
LOG.error seems more reasonable, since error was used before.

Some doubts:
{code}
+ this.fileSystemManager.splitAllLogs(preRootServer);
+ this.fileSystemManager.splitAllLogs(preMetaServer);
+ fileSystemManager.splitAllLogs(currentMetaServer);
{code}
Should these use the flag 'shouldSplitMetaSeparately' like the other log splits?

In master#finishInitialization, after handling other dead servers in SSH, we call assignmentManager.joinCluster(). This seems to have some problems, e.g.:
1. In AssignmentManager#processDeadServersAndRegionsInTransition, what if we mark it as a clean cluster startup?
2. If we mark it as a failover, is there any conflict between SSH and AssignmentManager#processDeadServersAndRecoverLostRegions?

An important point: in DeadServer#cleanPreviousInstance, a dead server is removed if a server with the same hostname and port is online. This means a server will not be in deadservers even if it is being processed in SSH.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624568#comment-13624568 ]

Jeffrey Zhong commented on HBASE-7824:

I have run the whole test suite with the patch 4 times in a row. Three runs were clean, as shown below, and one had a single failure that has also happened in recent builds.
{code}
Results :
Tests run: 1339, Failures: 0, Errors: 0, Skipped: 13

Integration Test: IntegrationTestDataIngestWithChaosMonkey
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 905.014 sec
{code}
I also provided a release note in the JIRA.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624572#comment-13624572 ]

chunhui shen commented on HBASE-7824:

[~jeffreyz] IMO, it can still cause META data loss, as I mentioned in HBASE-8251:
1. Assign ROOT to the RS that META is on
2. Enable SSH for ROOT
3. Assign META

If the META RS (which is also the ROOT RS) dies between step 2 and step 3, MetaSSH starts splitting its hlog. However, step 3 will assign META directly (because HMaster#splitLogAndExpireIfOnline will return null), which means META will lose the data from the hlog.

Correct me if I'm wrong, thanks.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624583#comment-13624583 ]

Jeffrey Zhong commented on HBASE-7824:

[~zjushch] The patch already covers the case you mentioned. You can check both the v7 and v8 patches. The reason I didn't mention the scenario in the suggestion above was to make the idea easier to accept. Yesterday I replied to you on HBASE-8251 about the scenario where ROOT and META are collocated on one RS. You can check the details in MetaSSH in the patch. Basically, we only recover the ROOT portion and leave the META part until the master's meta assignment completes. Below is the related pseudo-code snippet; please let me know if you have more questions. Thanks.
{code}
...
re-assign root
...
if (!this.services.isServerShutdownHandlerEnabled()) {
  // resubmit in case we're in master initialization and SSH hasn't been enabled yet.
  this.services.getExecutorService().submit(this);
  this.deadServers.add(serverName);
  return;
}
...
re-assign meta
...
{code}
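The resubmit pattern in the pseudo-code above can be modelled with a plain queue standing in for the executor. This is a sketch under assumptions: `ResubmitSketch`, its fields, and the `rootReassigned` flag (which keeps the ROOT step from repeating on each resubmission) are hypothetical and are not taken from MetaServerShutdownHandler.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Hypothetical model of a shutdown handler that requeues itself until SSH is enabled. */
final class ResubmitSketch {
    final Queue<Runnable> executor = new ArrayDeque<>(); // stand-in for the executor service
    final Set<String> deadServers = new HashSet<>();
    final List<String> log = new ArrayList<>();
    boolean sshEnabled = false;

    Runnable metaShutdownHandler(String server) {
        return new Runnable() {
            private boolean rootReassigned = false; // assumption: ROOT step runs once

            @Override
            public void run() {
                if (!rootReassigned) {
                    log.add("reassign-root:" + server);
                    rootReassigned = true;
                }
                if (!sshEnabled) {
                    // Still in master initialization: requeue and keep the
                    // server marked dead, deferring the META part.
                    executor.add(this);
                    deadServers.add(server);
                    return;
                }
                log.add("reassign-meta:" + server);
            }
        };
    }

    /** Runs one queued task; returns false when the queue is empty. */
    boolean runOne() {
        Runnable task = executor.poll();
        if (task == null) {
            return false;
        }
        task.run();
        return true;
    }
}
```

Running the handler once while `sshEnabled` is false leaves it queued with only the ROOT step logged; after the flag is set, the next run logs the META step and the queue drains.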
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624587#comment-13624587 ]

chunhui shen commented on HBASE-7824:

I think your patch couldn't fix the problem. For the case mentioned above, I have two questions:
1. What will the master initialization thread do when assigning META?
2. What is the value of isCarryingMeta in ServerManager#expireServer? I think it's false rather than true.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624596#comment-13624596 ]

Jeffrey Zhong commented on HBASE-7824:

Let me first add some clarifications to your first comment, to see if you agree:
{quote}
If the META RS(it is also the ROOT RS) is dead between step2 and step3, MetaSSH start splitting its hlog.
{quote}
The root recovery portion in Meta SSH will complete log splitting for both the ROOT and META regions. Therefore, all recovered edits files are created before the META region can be assigned, because the META region can only be assigned when ROOT is online. By then, all recovered edits files have been created, and they will be replayed when the META region is opened during assignment.
{quote}
However step3 will assign META directly(Because HMaster#splitLogAndExpireIfOnline will return null), it means META will loss the data from hlog.
{quote}
This is all right, because the existing log splitting work has already been done and will be replayed during the META region open phase.

Thanks for your feedback.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624602#comment-13624602 ]

chunhui shen commented on HBASE-7824:

bq. Therefore, all recovered edits files are created before Meta region can be assigned because meta region can only be assigned only when ROOT is online.

We can open the META region on an RS when ROOT is offline. What if the ROOT RS is killed between getMetaLocationOrReadLocationFromRoot and assignMeta?
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624604#comment-13624604 ]

chunhui shen commented on HBASE-7824:

It means the META region is opening on the RS before SSH has completed the log split.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624612#comment-13624612 ]

Jeffrey Zhong commented on HBASE-7824:

{quote}
We can open the META region on RS when ROOT is offline.
{quote}
We need to update the META location in the ROOT RS when the META region is opening on a newly assigned RS. Is that true? If so, a ROOT RS has to be online for a successful META assignment. So the open will fail: even if the META region can be opened, no one can access it, because ROOT is offline and the old location hasn't been updated to the newly assigned location. Later, the META region will be re-assigned by MetaSSH.
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624622#comment-13624622 ]

chunhui shen commented on HBASE-7824:

bq. So the open will fail even if META region can be opened but no one can access it because root is offline

We will retry if we fail to update the META location in the ROOT RS. ROOT will come online eventually; however, the META region won't be re-assigned.

Fix For: 0.94.8
Attachments: hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624730#comment-13624730 ]

Jeffrey Zhong commented on HBASE-7824:

{quote}
We will retry if fail to update META location in ROOT RS.
{quote}
Are you referring to the internal retries of HTable.put? It seems that, at a high level, you agree with my previous statements.

Let's go back to the possible scenario you mentioned above, in which a ROOT RS crashes after getMetaLocationOrReadLocationFromRoot. Since the ZK session timeout takes a while, HMaster#splitLogAndExpireIfOnline will kick in, so there won't be any issue.

Let's conclude this issue. I'll change the patch to the following pseudo-code snippet. Are you fine with this adjustment?
{code}
...
fileSystemManager.splitAllLogs(sn);
if (serverManager.isServerOnline(currentMetaServer)) {
  expire(currentMetaServer);
}
...
{code}

Fix For: 0.94.8
Attachments: hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch
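A minimal sketch of the ordering proposed in the pseudo-code above, written as standalone Java rather than HBase code. The class `MetaServerRecoverySketch`, its fields, and the string-based server names are hypothetical stand-ins, and the sketch assumes that `sn` and `currentMetaServer` refer to the same server, which the comment does not state. It only illustrates that logs are split first and that the server is expired afterwards, and only if it is still registered as online.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the master-side logic; not the HBase implementation. */
class MetaServerRecoverySketch {
    /** Servers the (mock) server manager currently considers online. */
    final Set<String> onlineServers = new HashSet<>();
    /** Ordered record of the actions taken, so the ordering can be inspected. */
    final List<String> actions = new ArrayList<>();

    /** Mock of fileSystemManager.splitAllLogs(sn): only records the call. */
    void splitAllLogs(String server) {
        actions.add("split:" + server);
    }

    /** Mock of serverManager.isServerOnline(server). */
    boolean isServerOnline(String server) {
        return onlineServers.contains(server);
    }

    /** Mock of expire(server): removes the server and records the call. */
    void expire(String server) {
        onlineServers.remove(server);
        actions.add("expire:" + server);
    }

    /** Split logs first, then expire the server only if it is still online. */
    void recoverMetaServer(String currentMetaServer) {
        splitAllLogs(currentMetaServer);
        if (isServerOnline(currentMetaServer)) {
            expire(currentMetaServer);
        }
    }
}
```

With a server that is still registered as online, the recorded actions are a split followed by an expire; with a server that is already gone, only the split is recorded.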
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623417#comment-13623417 ]

ramkrishna.s.vasudevan commented on HBASE-7824:

OK, I will check this fix. It is related to MTTR, so it is always useful.

Fix For: 0.94.8
Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623960#comment-13623960 ]

rajeshbabu commented on HBASE-7824:

[~jeffreyz]
Going through the patch.
{code}
+// SSH should enabled before META region assignment
+// because META region assignment is depending on ROOT server online.
{code}
FYI, there is possible META data loss with this; see chunhui's comment:
https://issues.apache.org/jira/browse/HBASE-8251?focusedCommentId=13621689&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13621689

Fix For: 0.94.8
Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch, hbase-7824-v5.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624030#comment-13624030 ]

Jeffrey Zhong commented on HBASE-7824:

[~rajesh23] I saw a similar issue during my testing: if the RS that hosts ROOT dies before step 4, the master will be blocked. Since it's a pre-existing issue, I guess I can live with it in the patch. Let me try to find a solution; otherwise I'll leave the issue as it is. Thanks for reviewing.

Fix For: 0.94.8
Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch, hbase-7824-v5.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624248#comment-13624248 ]

Jeffrey Zhong commented on HBASE-7824:

Test suite result:
{code}
Tests run: 1336, Failures: 0, Errors: 0, Skipped: 13
{code}
Integration Test:
{code}
Running org.apache.hadoop.hbase.IntegrationTestDataIngestWithChaosMonkey
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 681.593 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
{code}

Fix For: 0.94.8
Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch, hbase-7824-v7.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624350#comment-13624350 ]

Lars Hofhansl commented on HBASE-7824:

You feel good about this one, Jeffrey?

Fix For: 0.94.8
Attachments: hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624353#comment-13624353 ]

Jeffrey Zhong commented on HBASE-7824:

Yeah. So far, all test failures (related or unrelated to this patch) found while testing this patch have been fixed, either in this patch or in other patches. For example, a recent flaky test case failure, http://54.241.6.143/job/HBase-0.94/org.apache.hbase$hbase/60/testReport/junit/org.apache.hadoop.hbase.regionserver/TestRSKilledWhenMasterInitializing/testCorrectnessWhenMasterFailOver/ , should also be fixed.

I'll run more rounds of the test suite through the weekend to make it as solid as possible.

Fix For: 0.94.8
Attachments: hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623196#comment-13623196 ]

Hudson commented on HBASE-7824:

Integrated in HBase-0.94-security-on-Hadoop-23 #13 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/13/])
HBASE-7824 Improve master start up time when there is log splitting work, revert due to TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS failure (Revision 1456689)
HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1455976)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

Fix For: 0.94.8
Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604137#comment-13604137 ]

Jeffrey Zhong commented on HBASE-7824:

[~lhofhansl] Thanks for potentially giving this another chance :-). I'm still looking for a good solution. The cause of the test failure is what Ram suggested. That cause leads me to suspect a situation in which a region could be stuck in RIT forever. I need to write a test to verify that and will keep you updated.

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604365#comment-13604365 ]

ramkrishna.s.vasudevan commented on HBASE-7824:

Found some reasons; will keep you updated. Let me also understand Jeff's patch and the idea behind it.

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604457#comment-13604457 ]

Jeffrey Zhong commented on HBASE-7824:

[~ram_krish] Thanks for looking into this as well!

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604458#comment-13604458 ]

Jeffrey Zhong commented on HBASE-7824:

I think the earlier test failure with the patch is due to an existing issue, which I filed at https://issues.apache.org/jira/browse/HBASE-8127. Please see the details there.

Basically, RITs of a disabling (or disabled) table could be stuck in RIT state forever in the master failover case. The changes in the patch trigger the existing issue, so we see the test failures.

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604123#comment-13604123 ]

Lars Hofhansl commented on HBASE-7824:

Maybe I was a bit rash here. You said you worked on figuring out what the issue was with the failed test. Any luck? The fix that Ram suggests in HBASE-7985 does not work?
This is a good improvement and it would be a shame to miss out on this in 0.94.

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602785#comment-13602785 ]

Ted Yu commented on HBASE-7824:

Backed out again due to TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS test failure.

Fix For: 0.94.6
Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602794#comment-13602794 ]

Lars Hofhansl commented on HBASE-7824:

Thanks for reverting, Ted.

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603033#comment-13603033 ]

Hudson commented on HBASE-7824:

Integrated in HBase-0.94 #903 (See [https://builds.apache.org/job/HBase-0.94/903/])
HBASE-7824 Improve master start up time when there is log splitting work, revert due to TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS failure (Revision 1456689)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603095#comment-13603095 ]

Hudson commented on HBASE-7824:

Integrated in HBase-0.94-security #124 (See [https://builds.apache.org/job/HBase-0.94-security/124/])
HBASE-7824 Improve master start up time when there is log splitting work, revert due to TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS failure (Revision 1456689)
HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1455976)

Result = FAILURE
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601206#comment-13601206 ]

Ted Yu commented on HBASE-7824:

Integrated to 0.94.

Thanks for the patch, Jeff.

Let's see how it goes.

Fix For: 0.94.7
Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13601234#comment-13601234 ] Hudson commented on HBASE-7824: --- Integrated in HBase-0.94 #894 (See [https://builds.apache.org/job/HBase-0.94/894/]) HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1455976) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. 
It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600785#comment-13600785 ] Ted Yu commented on HBASE-7824: --- [~lhofhansl]: What do you think of this one ? Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600808#comment-13600808 ] Lars Hofhansl commented on HBASE-7824: -- +1 let's try again for 0.94. Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch, hbase-7824_v3.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594952#comment-13594952 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

[~ram_krish] Are you all right with my explanation in https://issues.apache.org/jira/browse/HBASE-7824?focusedCommentId=13592721&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13592721? So far the test cases pass with a small modification to src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java.

Thanks,
-Jeffrey

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13593237#comment-13593237 ] Hudson commented on HBASE-7824: --- Integrated in HBase-0.94-security #116 (See [https://builds.apache.org/job/HBase-0.94-security/116/]) HBASE-7824 Revert until TestMasterFailover passes reliably (Revision 1452452) Result = SUCCESS tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. 
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592434#comment-13592434 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

The reason for adding the previously dead non-meta region servers to deadServers is to make the new master instance start up as a failover, per the following check in AssignmentManager:

{code}
if (!this.serverManager.getDeadServers().isEmpty()) {
  this.failover = true;
}
{code}

In addition, we don't want AM to assign those regions before the log splitting work completes. That is why AM skips them inside processDeadServersAndRegionsInTransition and leaves them to SSH.

Since those failed servers will be processed by SSH, their regions should come online, and I did see the test log message "Finished processing of shutdown". A couple of regions aren't assigned for some reason; I need to dig further into the test case and will keep you updated.

Thanks,
-Jeffrey

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch
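The failover check discussed in the comment above can be illustrated in isolation. The following is a minimal sketch, not the HBase code: the class `FailoverSketch` and its methods are hypothetical stand-ins, and server names are plain strings. It only shows the two ideas from the comment, namely that a non-empty dead-server set marks the startup as a failover, and that regions last hosted on a dead server are skipped at startup and left to the shutdown handler.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the master's dead-server bookkeeping; not the HBase class. */
final class FailoverSketch {
    private final Set<String> deadServers = new HashSet<>();
    private boolean failover = false;

    /** Record a server that was already dead when this master instance started. */
    void addPreviouslyDeadServer(String serverName) {
        deadServers.add(serverName);
    }

    /** Same shape as the quoted check: any known dead server means a failover start. */
    void detectFailover() {
        if (!deadServers.isEmpty()) {
            failover = true;
        }
    }

    boolean isFailover() {
        return failover;
    }

    /** Regions last hosted on a dead server are skipped here and left to the shutdown handler. */
    boolean shouldAssignAtStartup(String lastHost) {
        return !deadServers.contains(lastHost);
    }
}
```

With one dead server registered, `detectFailover()` flips the flag and `shouldAssignAtStartup` returns false only for regions whose last host is that server.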
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592457#comment-13592457 ] Lars Hofhansl commented on HBASE-7824: -- I would like to roll 0.94.6 soon. Should we revert this for 0.94.6 and put it back up for 0.94.7? Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592474#comment-13592474 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

[~lhofhansl] I'm fine with reverting it for now and putting it back up for 0.94.7; there is no hurry on this.

Thanks,
-Jeffrey

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592481#comment-13592481 ] Lars Hofhansl commented on HBASE-7824: -- If we can work out the failure that would be preferable of course :) Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592492#comment-13592492 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

Yeah, either way I'll get to the bottom of this (hopefully by end of today) so that we can be sure there is no issue.

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592501#comment-13592501 ] Lars Hofhansl commented on HBASE-7824: -- Thanks [~jeffreyz]! Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592522#comment-13592522 ] Ted Yu commented on HBASE-7824: --- Talked with Jeffrey. I reverted the patch for now. Jeffrey would provide his suggestion on how to make TestMasterFailover more reliable, along with his changes. Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592721#comment-13592721 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

I think I found the root cause, and I addressed it in the trunk patch (https://reviews.apache.org/r/9419/diff/#index_header), which has the following lines:

{code}
// wait till all dead servers are processed
ServerManager serverManager = master.getServerManager();
while (serverManager.areDeadServersInProgress()) {
  Thread.sleep(100);
}
{code}

My change lets the master start up quickly with some SSH handling still outstanding, which changes the existing test case's assumption a little. I added the lines above to match the existing test case's expectation that all log splitting work is done and the previously dead servers are handled. I've run the test case 20 times in a loop without any failure.

The test case also passed after removing this.deadservers.add(serverName); that is because it basically assigns regions before master initialization, due to waitForActiveAndReadyMaster in the test code. Since that matches the old behavior, the test case passed, although the log splitting work might not have finished before those regions were assigned.

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: hbase-7824.patch, hbase-7824_v2.patch
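The wait loop quoted in the comment above polls without an upper bound. As a hedged illustration only, a test helper could bound the wait so that a stuck shutdown handler fails the test instead of hanging it. The class `DeadServerWait` and its parameters are hypothetical; the `BooleanSupplier` stands in for a call such as `serverManager.areDeadServersInProgress()`.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical test helper; not part of HBase. */
final class DeadServerWait {
    /**
     * Polls until the supplier reports that no dead servers are in progress,
     * or until the timeout elapses. Returns true if processing finished in time.
     */
    static boolean waitUntilProcessed(BooleanSupplier deadServersInProgress,
                                      long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (deadServersInProgress.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out while work was still pending
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return true;
    }
}
```

A test would call something like `DeadServerWait.waitUntilProcessed(serverManager::areDeadServersInProgress, 60000, 100)` and assert on the result; the timeout and poll values here are arbitrary.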
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592943#comment-13592943 ] Hudson commented on HBASE-7824: --- Integrated in HBase-0.94 #879 (See [https://builds.apache.org/job/HBase-0.94/879/]) HBASE-7824 Revert until TestMasterFailover passes reliably (Revision 1452452) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13593183#comment-13593183 ] Hudson commented on HBASE-7824: --- Integrated in HBase-0.94-security-on-Hadoop-23 #12 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/12/]) HBASE-7824 Revert until TestMasterFailover passes reliably (Revision 1452452) HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1449920) Result = FAILURE tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java tedyu : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.7 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592002#comment-13592002 ] Lars Hofhansl commented on HBASE-7824: -- We're seeing relatively frequent failures of TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS now. Over in HBASE-7985 Ram determined that this always happens when the RS we abort does not carry META. It might be related to this change. Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch When there is log split work going on, master start up waits till all log split work completes even though the log split has nothing to do with meta region servers. It's a bad behavior considering a master node can run when log split is happening while its start up is blocking by log split work. Since master is kind of single point of failure, we should start it ASAP. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592009#comment-13592009 ]

ramkrishna.s.vasudevan commented on HBASE-7824:
-----------------------------------------------

@Jeff Is there any specific reason for adding the servers to be processed (other than the RS carrying ROOT or META) to deadServers?

{code}
void processDeadServer(final ServerName serverName) {
  this.deadservers.add(serverName);
  this.services.getExecutorService().submit(
      new ServerShutdownHandler(this.master, this.services, this.deadservers, serverName, true));
}
{code}

I remember we used deadServers to track the servers that expired while the master was coming up. See ServerManager.expireServer(). Maybe you ran into some specific issue?

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch
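The ordering in the quoted `processDeadServer` (record the server as dead first, then hand recovery to an executor) can be shown with plain JDK types. This is a sketch under stated assumptions, not the HBase implementation: `DeadServerQueue` is a hypothetical class, the `Runnable` stands in for `ServerShutdownHandler`, and the `inProgress` set is an assumed way to back a check like `areDeadServersInProgress()`.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of "mark dead, then recover asynchronously"; not the HBase class. */
final class DeadServerQueue {
    private final Set<String> deadServers = ConcurrentHashMap.newKeySet();
    private final Set<String> inProgress = ConcurrentHashMap.newKeySet();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Marks the server dead before queuing recovery, so callers see it immediately. */
    void processDeadServer(String serverName, Runnable recovery) {
        deadServers.add(serverName);
        inProgress.add(serverName);
        executor.submit(() -> {
            try {
                recovery.run(); // stand-in for log splitting and region reassignment
            } finally {
                inProgress.remove(serverName);
            }
        });
    }

    boolean isDead(String serverName) {
        return deadServers.contains(serverName);
    }

    boolean areDeadServersInProgress() {
        return !inProgress.isEmpty();
    }

    /** Stops accepting work and waits for queued recoveries to finish. */
    boolean shutdownAndWait(long timeoutMs) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because the server is added to the dead set synchronously, a failover check that runs right after `processDeadServer` returns already sees it, even though recovery may not have started yet.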
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592021#comment-13592021 ]

Jeffrey Zhong commented on HBASE-7824:
--------------------------------------

[~ram_krish] The reason is that we already recovered the META region servers during initialization, so we don't need to keep the meta region servers there. The failedServer list is only for log splitting work. Let me see why the test TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS fails more often.

Thanks,
-Jeffrey

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch
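The split described in the comment above (servers that carried ROOT or META are recovered during initialization, the rest are kept only for later log splitting) amounts to partitioning the failed-server list. The sketch below is an illustration of that idea only; `StartupLogSplitPlan`, its fields, and the string server names are all hypothetical and do not reflect how the patch is structured.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical partition of failed servers at master startup; not the HBase code. */
final class StartupLogSplitPlan {
    /** Servers whose logs must be split before startup can proceed (catalog carriers). */
    final Set<String> splitNow = new LinkedHashSet<>();
    /** Servers whose log splitting can be left to the shutdown handler after startup. */
    final Set<String> splitLater = new LinkedHashSet<>();

    private StartupLogSplitPlan() {
    }

    static StartupLogSplitPlan plan(Set<String> failedServers, Set<String> catalogCarriers) {
        StartupLogSplitPlan result = new StartupLogSplitPlan();
        for (String server : failedServers) {
            if (catalogCarriers.contains(server)) {
                result.splitNow.add(server);
            } else {
                result.splitLater.add(server);
            }
        }
        return result;
    }
}
```

Only the `splitNow` set would block initialization in this sketch; everything in `splitLater` is deferred.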
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592028#comment-13592028 ]

ramkrishna.s.vasudevan commented on HBASE-7824:
-----------------------------------------------

Yes, I can understand that part. But do we need to add them to deadServers explicitly? The deadServers list was meant for the case where an RS goes down just as the master is coming up.

Improve master start up time when there is log splitting work - Key: HBASE-7824 URL: https://issues.apache.org/jira/browse/HBASE-7824 Project: HBase Issue Type: Bug Components: master Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.94.6 Attachments: hbase-7824.patch, hbase-7824_v2.patch
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586683#comment-13586683 ]

Hudson commented on HBASE-7824:

Integrated in HBase-0.94 #859 (See [https://builds.apache.org/job/HBase-0.94/859/])
HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1449920)

Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java
[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work
[ https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586891#comment-13586891 ]

Hudson commented on HBASE-7824:

Integrated in HBase-0.94-security #112 (See [https://builds.apache.org/job/HBase-0.94-security/112/])
HBASE-7824 Improve master start up time when there is log splitting work (Jeffrey Zhong) (Revision 1449920)

Result = SUCCESS
tedyu :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java