[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628721#comment-13628721
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94 #955 (See 
[https://builds.apache.org/job/HBase-0.94/955/])
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1466725)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: hbase-7824.patch, hbase-7824-v10.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v7.patch, 
 hbase-7824-v8.patch, hbase-7824-v9.patch


 When log split work is in progress, master start up waits until all log 
 split work completes, even when that log split work has nothing to do with the 
 meta region servers.
 This is bad behavior: a master node can run while log splitting is 
 happening, yet its start up is blocked by the log split work. 
 Since the master is something of a single point of failure, we should start it ASAP.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629770#comment-13629770
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94-security #134 (See 
[https://builds.apache.org/job/HBase-0.94-security/134/])
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1466725)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/MetaServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-10 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628024#comment-13628024
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Many thanks to Ram, Chunhui and Rajesh for the latest reviews! 

[~lhofhansl] Are you OK with checking in patch v10? Thanks.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628389#comment-13628389
 ] 

Lars Hofhansl commented on HBASE-7824:
--

Patch looks good (although I didn't have time for a detailed review)
+1



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628401#comment-13628401
 ] 

Ted Yu commented on HBASE-7824:
---

Integrated to 0.94

Thanks for the continued effort, Jeff.

Thanks for the reviews, Ram, Chunhui and Rajesh.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626820#comment-13626820
 ] 

Ted Yu commented on HBASE-7824:
---

[~lhofhansl], [~ram_krish]:
Do you have any further review comments?



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-09 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627506#comment-13627506
 ] 

ramkrishna.s.vasudevan commented on HBASE-7824:
---

If Chunhui's comments above are addressed, and I see that the latest patch v10 has the 
changes incorporated, then +1 on the patch. Thanks, Jeffrey, for your persistent 
efforts on this JIRA.
Thanks to Chunhui for the good reviews.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625509#comment-13625509
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~zjushch] Thanks for the detailed review!

For your first two comments, I'll make the corresponding modifications.

{quote}
Should use the flag 'shouldSplitMetaSeparately' like other log-split?
{quote}
A good question. Since splitLog is a synchronous call, the following two calls 
{code}
  fileSystemManager.splitMetaLog(sn);
  fileSystemManager.splitLog(sn);
{code}
are logically equivalent to one splitAllLogs call, although splitAllLogs has a 
slight performance advantage because it submits all log splitting tasks in 
one go. 'shouldSplitMetaSeparately' is significant in MetaSSH and SSH, while in 
other places there is no logical difference. 
That said, in some places I could take advantage of separating them to 
improve master start up a little more. As you know, both features are 
new, so I chose the conservative way in the beginning to make them less dependent 
on each other.
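
The equivalence argued above can be illustrated with a minimal sketch. This is not HBase code: the class `LogSplitSketch`, the batch bookkeeping and the rule that a meta log is recognised by a `.meta` suffix are assumptions made for illustration. It only shows that two synchronous calls (meta logs, then the rest) cover the same log files as one combined call, while the combined call submits a single batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the master's file system manager; not the HBase class.
class LogSplitSketch {
    // Each inner list is one batch of log files submitted for splitting.
    final List<List<String>> submittedBatches = new ArrayList<>();

    // Assumption for this sketch only: meta logs carry a ".meta" suffix.
    static boolean isMetaLog(String logFile) {
        return logFile.endsWith(".meta");
    }

    // Synchronous by construction: the batch is fully recorded before returning.
    private void submitAndWait(List<String> batch) {
        submittedBatches.add(new ArrayList<>(batch));
    }

    // Picks either the meta logs (meta == true) or the non-meta logs.
    private static List<String> select(List<String> logs, boolean meta) {
        List<String> selected = new ArrayList<>();
        for (String log : logs) {
            if (isMetaLog(log) == meta) {
                selected.add(log);
            }
        }
        return selected;
    }

    void splitMetaLog(List<String> serverLogs) {
        submitAndWait(select(serverLogs, true));
    }

    void splitLog(List<String> serverLogs) {
        submitAndWait(select(serverLogs, false));
    }

    void splitAllLogs(List<String> serverLogs) {
        submitAndWait(serverLogs);
    }

    // All log files split so far, sorted so that batching order does not matter.
    List<String> allSplitLogs() {
        List<String> all = new ArrayList<>();
        for (List<String> batch : submittedBatches) {
            all.addAll(batch);
        }
        Collections.sort(all);
        return all;
    }

    public static void main(String[] args) {
        List<String> logs = Arrays.asList("wal.1", "wal.2.meta", "wal.3");
        LogSplitSketch twoCalls = new LogSplitSketch();
        twoCalls.splitMetaLog(logs);
        twoCalls.splitLog(logs);
        LogSplitSketch oneCall = new LogSplitSketch();
        oneCall.splitAllLogs(logs);
        System.out.println(twoCalls.allSplitLogs().equals(oneCall.allSplitLogs()));
    }
}
```

In this sketch the only observable difference between the two paths is the number of batches, which mirrors the small performance advantage mentioned for splitAllLogs.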

{quote}
in AssignmentManager#processDeadServersAndRegionsInTransition, how about if we 
mark it as a clean cluster startup?
if we mark it as a failover, is there any conflict between SSH and 
AssignmentManager#processDeadServersAndRecoverLostRegions
{quote}
If there is leftover log splitting work, the new master start up isn't 
a clean one. The reason to treat it as a failover is to let SSH (a single place) 
handle dead servers, including the log splitting we skipped at the very 
beginning. If we treated the start up as a clean one, we could have data loss 
because log splitting wouldn't be done for some regions. 
During AssignmentManager#processDeadServersAndRecoverLostRegions, the 
existing implementation intentionally skips all known dead servers and 
leaves them to SSH, so there is no conflict.
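
The "skip known dead servers" rule described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the AssignmentManager implementation: the class `DeadServerRecoverySketch`, the method `regionsToAssignDirectly` and the use of plain strings for servers and regions are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; servers and regions are plain strings here.
class DeadServerRecoverySketch {
    // Returns the regions this step may assign itself. Regions of servers that
    // are already known to be dead are skipped and left to the shutdown handler,
    // which is expected to split their logs before reassigning them.
    static List<String> regionsToAssignDirectly(
            Map<String, List<String>> regionsByDeadServer,
            Set<String> knownDeadServers) {
        List<String> assignable = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : regionsByDeadServer.entrySet()) {
            if (knownDeadServers.contains(entry.getKey())) {
                continue; // the shutdown handler owns this server
            }
            assignable.addAll(entry.getValue());
        }
        return assignable;
    }

    public static void main(String[] args) {
        Map<String, List<String>> regionsByDeadServer = new LinkedHashMap<>();
        regionsByDeadServer.put("rs1,60020,001", Arrays.asList("regionA", "regionB"));
        regionsByDeadServer.put("rs2,60020,007", Arrays.asList("regionC"));
        Set<String> knownDead = new HashSet<>(Arrays.asList("rs1,60020,001"));
        System.out.println(regionsToAssignDirectly(regionsByDeadServer, knownDead));
    }
}
```

The point of the design is that only one code path (the shutdown handler) ever reassigns regions of a server whose logs may still need splitting.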

{quote}
From DeadServer#cleanPreviousInstance, a deadserver will be removed if the 
same HostnamePort servername is online. 
{quote}
Good concern. The key point is that DeadServer#cleanPreviousInstance will 
only be called after master initialization. By then, we don't rely on DeadServer 
much as far as master start up is concerned. After the master is initialized, 
DeadServer is basically used in the UI to show previously dead servers, for 
YouAreDeadException handling, and to prevent duplicate expireServer calls. As 
you already know, once an SSH for a dead server is submitted, it will continue until 
it's done, regardless of whether the server is in DeadServer or not. This can already 
happen today when an RS crashes several times in a row while its previous instances 
are still in the SSH pipeline, whether DeadServer tracks them or not. In short, 
DeadServer#cleanPreviousInstance doesn't have much impact.




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625592#comment-13625592
 ] 

ramkrishna.s.vasudevan commented on HBASE-7824:
---

waitingOnLogSplitting - waitOnLogSplitting may be a better name.
bq. this.catalogTracker.getMetaLocationOrReadLocationFromRoot();
Is it OK to call this in one place, as per my comment yesterday?
If root goes down after 
{code}
this.initializationBeforeMetaAssignment = true;
{code}
we call assignRoot. When we try to split the log, we will not do that because 
SSH is not yet enabled.
{code}
this.assignmentManager.assignRoot();
waitForRootAssignment();
{code}
So we expect that, although the above step waits, we do 
{code}
this.serverManager.enableSSHForRoot();
{code}
which will do the assignment? But sshEnabled is still not true, right?
Things look fine, but this area is still a real headache. The removal of 
ROOT in trunk is a blessing for developers.
If you are confident about the above comments, then let us go for a commit 
and address any future issues as they come up. Otherwise we are good.

Good stuff, Jeff. Every time I feel that something may have been missed in this area.
@Chunhui
What do you think?




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625628#comment-13625628
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~ram_krish] Thanks for the review!
{quote}
Is it ok to call this in one place as per my yesterday's comment.
{quote}
The change was missed; I'll make sure it's in the next patch.

{quote}
Which will do the assignment? Still sshEnabled is not true right?
{quote}
A very good point. In very rare cases I did see tests fail due to this. If there are no 
objections, I can move enableSSHForRoot to right after assignRoot() to close 
the loophole, following the same pattern we use for meta assignment.
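
The ordering problem discussed here can be shown with a small model. It is a sketch under assumptions, not the ServerManager code: `SshGateSketch`, its `queued` and `processed` lists and the string server names are hypothetical. It models a gate that defers shutdown handling for an expired server until the gate is opened, so a server that dies before `enableSSHForRoot()` is called stays unprocessed while anything waits on it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a gate in front of the server shutdown handler.
class SshGateSketch {
    private boolean sshEnabledForRoot = false;
    // Servers that expired while the gate was still closed.
    final List<String> queued = new ArrayList<>();
    // Servers whose shutdown handling has actually run.
    final List<String> processed = new ArrayList<>();

    void expireServer(String server) {
        if (!sshEnabledForRoot) {
            queued.add(server); // deferred: nothing recovers this server yet
            return;
        }
        processed.add(server);
    }

    void enableSSHForRoot() {
        sshEnabledForRoot = true;
        processed.addAll(queued); // drain everything that was deferred
        queued.clear();
    }

    public static void main(String[] args) {
        SshGateSketch gate = new SshGateSketch();
        gate.expireServer("root-rs");          // dies while the gate is closed
        System.out.println(gate.processed.isEmpty());
        gate.enableSSHForRoot();               // opening the gate drains the queue
        System.out.println(gate.processed.contains("root-rs"));
    }
}
```

Opening the gate immediately after the assignment is requested, as proposed above, shrinks the window in which an expired root server would sit in the queue while the master waits for root to be assigned.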





[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626118#comment-13626118
 ] 

chunhui shen commented on HBASE-7824:
-

As Ram said, things are easy to miss in this area...

Patch v10 looks good to me.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626131#comment-13626131
 ] 

chunhui shen commented on HBASE-7824:
-

I may have found a bug case.
Suppose we have Master, RS1 and RS2:
1. Kill master and RS1.
2. Start master and RS1.
3. Master starts SSH to process dead server RS1 during initialization.
4. RS1 is not in the dead servers since a new RS1 is online.
5. AssignmentManager#joinCluster rebuilds user regions and returns the dead server 
RS1 and its regions.
6. AssignmentManager#processDeadServersAndRecoverLostRegions assigns the 
regions carried by RS1.
7. However, the hlogs of RS1 are still being split by SSH, which means data loss since we 
assign the regions in step 6 before log splitting completes.

[~jeffreyz]
Please take a look and correct me if I'm wrong.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626182#comment-13626182
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~zjushch] Could you please clarify the online state of RS1 from step 4 to step 6? 
Thanks.

In step 4, RS1 is recorded as online by the master, while in step 5 we return RS1 as 
dead. AM#rebuildUserRegions only returns dead servers that are not contained 
in the online servers. 
Since AssignmentManager#processDeadServersAndRecoverLostRegions skips all dead 
servers for region assignment, it seems you're suggesting that RS1 comes online 
again in step 6.




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626185#comment-13626185
 ] 

chunhui shen commented on HBASE-7824:
-

RS1,001 is the dead server.
RS1,002 is the online server.

Here 001 and 002 represent the start codes of the region server.

RS1,001 is being processed by SSH and is also marked as a dead server in 
AM#rebuildUserRegions.

However, RS1,001 is not included in ServerManager#getDeadServers, so 
AssignmentManager#processDeadServersAndRecoverLostRegions won't skip this server.
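
The distinction between RS1,001 and RS1,002 comes down to how a server instance is identified. The sketch below is illustrative only: `ServerIdentitySketch` is a hypothetical class, not HBase's ServerName, and the port 60020 is an assumed value. It shows two instances that share host and port but differ in start code, so they are different identities even though a "same host and port" check matches them.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical identity of one region server instance: host, port and start code.
final class ServerIdentitySketch {
    final String host;
    final int port;
    final long startCode;

    ServerIdentitySketch(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    // True for a restarted instance of the same server process location.
    boolean sameHostAndPort(ServerIdentitySketch other) {
        return host.equals(other.host) && port == other.port;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ServerIdentitySketch)) {
            return false;
        }
        ServerIdentitySketch other = (ServerIdentitySketch) o;
        return sameHostAndPort(other) && startCode == other.startCode;
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, port, startCode);
    }

    @Override
    public String toString() {
        return host + "," + port + "," + startCode;
    }

    public static void main(String[] args) {
        ServerIdentitySketch dead = new ServerIdentitySketch("rs1", 60020, 1L);
        ServerIdentitySketch online = new ServerIdentitySketch("rs1", 60020, 2L);
        Set<ServerIdentitySketch> deadServers = new HashSet<>();
        deadServers.add(dead);
        System.out.println(dead.equals(online));           // different instances
        System.out.println(dead.sameHostAndPort(online));  // same host and port
        System.out.println(deadServers.contains(online));  // new instance is not dead
    }
}
```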



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626221#comment-13626221
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

The following clarifications might help:

During master start up, the function getFailedServersFromLogFolders will return 
(rs1,001) as part of failedServers, because the start code is part of the server name 
and therefore of the hlog file path. Before AM.joinCluster(), the following code in 
HMaster#finishInitialization will put (rs1,001) into deadservers.
{code}
status.setStatus("Submit log splitting work of non-meta region servers");
for (ServerName curServer : failedServers) {
  this.serverManager.expireServer(curServer);
}
{code}
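
How a failed server such as (rs1,001) can be derived from log folder names is sketched below. This is a guess at the general shape, not the body of getFailedServersFromLogFolders: the class `FailedServersSketch`, the `-splitting` suffix constant, the `host,port,startcode` folder naming and the string-based server names are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: derive failed servers from per-server log folder names.
class FailedServersSketch {
    // Assumption: a folder whose logs are being split carries this suffix.
    static final String SPLITTING_SUFFIX = "-splitting";

    // A log folder that does not belong to an online server is treated as
    // belonging to a failed server whose logs still need splitting.
    static Set<String> failedServersFromLogFolders(
            List<String> folderNames, Set<String> onlineServers) {
        Set<String> failed = new TreeSet<>();
        for (String folder : folderNames) {
            String serverName = folder;
            if (serverName.endsWith(SPLITTING_SUFFIX)) {
                serverName = serverName.substring(
                        0, serverName.length() - SPLITTING_SUFFIX.length());
            }
            if (!onlineServers.contains(serverName)) {
                failed.add(serverName);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> folders = Arrays.asList(
                "rs1,60020,001-splitting", "rs1,60020,002", "rs2,60020,005");
        Set<String> online = new HashSet<>(
                Arrays.asList("rs1,60020,002", "rs2,60020,005"));
        System.out.println(failedServersFromLogFolders(folders, online));
    }
}
```

Because the start code is part of the folder name in this model, the old instance rs1,60020,001 is reported as failed even though rs1,60020,002 is online on the same host and port.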





[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626226#comment-13626226
 ] 

chunhui shen commented on HBASE-7824:
-

bq. HMaster#finishInitialization will put (rs1,001) into deadservers. 
Yes, that's right.
But (rs1,001) will be removed from deadservers by 
DeadServer#cleanPreviousInstance; you could take a look at its call hierarchy.

I pointed this out in the comment above:
From DeadServer#cleanPreviousInstance, a dead server will be removed if a server 
with the same HostnamePort is online.
This means a server will not belong to deadservers even while it is being processed 
by SSH. 



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626232#comment-13626232
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Here are some additional notes about DeadServer#cleanPreviousInstance. I 
think they may help clear up the confusion:

1) The above steps, including AM#JoinCluster, happen before master.initialized 
becomes true.
2) Inside the function ServerManager#checkIsDead:
{code}
// remove dead server with same hostname and port of newly checking in rs after master
// initialization. See HBASE-5916 for more information.
if ((this.services == null || ((HMaster) this.services).isInitialized())
    && this.deadservers.cleanPreviousInstance(serverName)) {
{code}
You can see that this.deadservers.cleanPreviousInstance won't do anything here, 
because the master is NOT initialized yet.
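
To make the short-circuit concrete, here is a minimal toy model of that guard in plain Java. It is not the HBase code: ToyDeadServers, onRegionServerCheckIn and the "host,port,startcode" string format are assumptions made up for this example, and the services == null branch is left out. Only the cleanPreviousInstance name and the "master initialized" check come from the snippet above.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative toy model only; not an HBase class.
class ToyDeadServers {
  // Entries use a hypothetical "host,port,startcode" form, e.g. "rs1,60020,001".
  private final Set<String> dead = new HashSet<>();
  private boolean masterInitialized = false;

  void add(String serverName) { dead.add(serverName); }

  void setMasterInitialized(boolean value) { masterInitialized = value; }

  boolean isDead(String serverName) { return dead.contains(serverName); }

  // Drops every dead entry that shares host and port with the new instance.
  boolean cleanPreviousInstance(String newServerName) {
    String hostPort = hostAndPort(newServerName);
    return dead.removeIf(s -> hostAndPort(s).equals(hostPort));
  }

  // Mirrors the && short-circuit: nothing is cleaned before initialization.
  boolean onRegionServerCheckIn(String newServerName) {
    return masterInitialized && cleanPreviousInstance(newServerName);
  }

  private static String hostAndPort(String serverName) {
    return serverName.substring(0, serverName.lastIndexOf(','));
  }
}
```

In this model, a check-in from a restarted instance of rs1 leaves the old (rs1,001) entry in place for as long as masterInitialized is false, and removes it once the flag is set.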



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626250#comment-13626250
 ] 

chunhui shen commented on HBASE-7824:
-

Good point, I'm clear about this now, thanks.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626251#comment-13626251
 ] 

chunhui shen commented on HBASE-7824:
-

+1 from me



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13626261#comment-13626261
 ] 

stack commented on HBASE-7824:
--

Quality work, lads (Jeffrey, Ram, and Chunhui).



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624740#comment-13624740
 ] 

chunhui shen commented on HBASE-7824:
-

bq.Since ZK session timeout take a while, HMaster#splitLogAndExpireIfOnline 
will kick in so there won't be any issue.
1. The ZK session will time out once the Java process exits.
2. I think that in a complex network we shouldn't assume that the ZK session 
timeout happens after HMaster#splitLogAndExpireIfOnline, e.g. if the call to 
getMetaLocationOrReadLocationFromRoot hangs for a short while.

bq.are you fine with this adjustment?
Sorry, why make this adjustment?

In addition, is this only for 0.94 and not trunk? If it were for trunk only, I 
think the above problem would not exist, since ROOT has been dropped in trunk.




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624751#comment-13624751
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

It's only for 0.94, and the adjustment guarantees that splitLog happens 
right before {code}assignmentManager.assignMeta();{code}, just as before, to 
deal with the possible data loss issue you mentioned. 







[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624757#comment-13624757
 ] 

ramkrishna.s.vasudevan commented on HBASE-7824:
---

{code}
ServerName currentMetaServer =
    this.catalogTracker.getMetaLocationOrReadLocationFromRoot();
{code}
This is read in two places: once in finishInitialization() and once inside 
assignMeta, where the META RS is compared with the previous ROOT server.
Can we read it only once and then make the pseudocode change that Jeffrey 
mentioned above?



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624769#comment-13624769
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

I've pasted the whole related code snippet below. Ram's suggestion is possible 
because nothing reassigns META until assignmentManager.assignMeta(). The modified 
logic is the same as before the patch.

{code}
  ...
  ServerName currentMetaServer =
      this.catalogTracker.getMetaLocationOrReadLocationFromRoot();
  if (currentMetaServer != null
      && !currentMetaServer.equals(previousRootServer)) {
    fileSystemManager.splitAllLogs(currentMetaServer);
    if (this.serverManager.isServerOnline(currentMetaServer)) {
      this.serverManager.expireServer(currentMetaServer);
    }
  }
  assignmentManager.assignMeta();
  enableSSHandWaitForMeta();
  ...
{code}




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624788#comment-13624788
 ] 

chunhui shen commented on HBASE-7824:
-

bq.fileSystemManager.splitAllLogs(currentMetaServer);
Should we take care of log-split concurrency between master initialization 
thread and SSH?



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624789#comment-13624789
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Yep, I've taken care of that. Basically, log split tasks are synchronized before 
SSH is enabled. Once you agree with the approach in general, I'll submit the 
modified patch for review.

A question, though: should we call expireServer first and then do the log splitting, 
now that we have the log splitting synchronization mechanism before SSH is enabled?
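
The thread does not show how the patch synchronizes log splitting, so the following is only a rough sketch of one possible shape for such a mechanism, not the patch itself. ToyLogSplitter, splitLock and alreadySplit are hypothetical names; the idea is simply that the master initialization thread and SSH go through one lock, and whoever arrives second for the same server finds the work already done.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; the real patch may synchronize differently.
class ToyLogSplitter {
  private final Object splitLock = new Object();
  private final Set<String> alreadySplit = new HashSet<>();
  // Records which servers were actually split, in order.
  final List<String> splitOrder =
      Collections.synchronizedList(new ArrayList<>());

  // Returns true if this call performed the split, false if the same
  // server's logs had already been handled by an earlier caller.
  boolean splitAllLogs(String serverName) {
    synchronized (splitLock) {
      if (!alreadySplit.add(serverName)) {
        return false;
      }
      splitOrder.add(serverName); // stand-in for the real split work
      return true;
    }
  }
}
```

With a guard like this, calling splitAllLogs for the same server from both the initialization thread and SSH performs the split exactly once, regardless of which caller wins the lock.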




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624792#comment-13624792
 ] 

chunhui shen commented on HBASE-7824:
-

bq.log splitting synchronization mechanism before SSH is enabled
Yes, it's a solution.  

What's the difference if we call expireServer first? None? 
Go with what you think is best.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625026#comment-13625026
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Test result for v9 patch:
{code}
Test Suite Results :

Tests run: 1339, Failures: 0, Errors: 0, Skipped: 13

Integration: IntegrationTestDataIngestWithChaosMonkey

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 353.049 sec
{code}




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-07 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13625088#comment-13625088
 ] 

chunhui shen commented on HBASE-7824:
-

Minor comments:
{code}
+   * @param previousRootServer ServerName of previous root region server before current start up
+   * @return
+   * @throws InterruptedException
{code}
Remove the @return.

{code}
+} catch (Exception ex) {
+  LOG.warn("Retry setClusterDown failed", ex);
+}
{code}
LOG.error seems more appropriate, since error was used before.

Some doubts:
{code}
+  this.fileSystemManager.splitAllLogs(preRootServer);
+  this.fileSystemManager.splitAllLogs(preMetaServer);
+fileSystemManager.splitAllLogs(currentMetaServer);
{code}
Should these use the 'shouldSplitMetaSeparately' flag like the other log splits?

In master#finishInitialization, after handling the other dead servers in SSH, we 
call assignmentManager.joinCluster(). This seems to have some problems, e.g.
1. In AssignmentManager#processDeadServersAndRegionsInTransition, what happens if 
we mark it as a clean cluster startup?
2. If we mark it as a failover, is there any conflict between SSH and 
AssignmentManager#processDeadServersAndRecoverLostRegions?

An important note:
In DeadServer#cleanPreviousInstance, a dead server is removed if a server with 
the same hostname and port is online.
This means a server may no longer be in deadservers even though it is being 
processed in SSH.




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624568#comment-13624568
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

I have run the whole test suite with the patch 4 times in a row. Three runs were 
clean, as shown below, and one had a single failure that has also occurred in 
recent builds.
{code}
Results :
Tests run: 1339, Failures: 0, Errors: 0, Skipped: 13

Integration Test: IntegrationTestDataIngestWithChaosMonkey
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 905.014 sec
{code}

I also provided a release note in the JIRA.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624572#comment-13624572
 ] 

chunhui shen commented on HBASE-7824:
-

[~jeffreyz]
IMO, it can still cause META data loss, as I mentioned in HBASE-8251:
1. Assign ROOT to the RS that META is on
2. Enable SSH for ROOT
3. Assign META

If the META RS (which is also the ROOT RS) dies between step 2 and step 3, MetaSSH 
starts splitting its hlog.
However, step 3 will assign META directly (because 
HMaster#splitLogAndExpireIfOnline will return null), which means META will lose 
the data from the hlog.

Correct me if I'm wrong, thanks.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624583#comment-13624583
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~zjushch] The patch already covers the case you mentioned. You can check both 
the v7 & v8 patches. The reason I didn't mention this scenario in the above 
suggestion was to make the idea easier to accept.

Yesterday I replied to you on HBASE-8251 about the scenario where ROOT & META are 
colocated on one RS. You can check the details in MetaSSH in the patch. Basically, 
we only recover the ROOT portion and leave the META part until the master's meta 
assignment completes. 
Below is the related pseudocode snippet; please let me know if you have more 
questions. Thanks.
{code}
  ...
  re-assign root
  ... 
  
  if (!this.services.isServerShutdownHandlerEnabled()) {
    // resubmit in case we're in master initialization and SSH hasn't been enabled yet.
    this.services.getExecutorService().submit(this);
    this.deadServers.add(serverName);
    return;
  }
  ...
  re-assign meta
  ...
{code}
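
As a rough illustration of the resubmit pattern in the pseudocode above, here is a self-contained toy in plain Java. It is a sketch under assumptions, not the MetaServerShutdownHandler code: ToyMaster, ToyMetaShutdownHandler, the queue standing in for the executor service, and the rootRecovered flag (used so the ROOT step runs only once) are all made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative toy only; names and structure are hypothetical.
class ToyMaster {
  boolean serverShutdownHandlerEnabled = false;
  // Stand-in for the master's executor service.
  final Queue<Runnable> executorQueue = new ArrayDeque<>();
  // Records the recovery steps in the order they happen.
  final List<String> events = new ArrayList<>();

  // Runs one queued task, if there is one.
  boolean runNext() {
    Runnable task = executorQueue.poll();
    if (task == null) {
      return false;
    }
    task.run();
    return true;
  }
}

class ToyMetaShutdownHandler implements Runnable {
  private final ToyMaster master;
  private boolean rootRecovered = false;

  ToyMetaShutdownHandler(ToyMaster master) {
    this.master = master;
  }

  @Override
  public void run() {
    if (!rootRecovered) {
      master.events.add("re-assign root");
      rootRecovered = true;
    }
    if (!master.serverShutdownHandlerEnabled) {
      // SSH is not enabled yet: put ourselves back on the queue and
      // leave the META part for a later run.
      master.executorQueue.add(this);
      return;
    }
    master.events.add("re-assign meta");
  }
}
```

The first run recovers only the ROOT part and requeues the handler; the META part runs on a later pass, once serverShutdownHandlerEnabled has been set by the initialization thread.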




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624587#comment-13624587
 ] 

chunhui shen commented on HBASE-7824:
-

I think your patch doesn't fix the problem.

For the case mentioned above, I have two questions:

1. What will the master initialization thread do when assigning META?
2. What is the value of isCarryingMeta in ServerManager#expireServer? I think 
it's false rather than true.



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624596#comment-13624596
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Let me first add some clarification on your first comment, to see if you 
agree:

{quote}
If the META RS(it is also the ROOT RS) is dead between step2 and step3, MetaSSH 
start splitting its hlog.
{quote}
The root recovery portion in Meta SSH will complete log splitting for both the ROOT 
and META regions. Therefore, all recovered edits files are created before the META 
region can be assigned, because the META region can only be assigned when ROOT 
is online. By then, all recovered edits files have been created, and they will be 
replayed when the META region is opened during assignment.

{quote}
However step3 will assign META directly(Because 
HMaster#splitLogAndExpireIfOnline will return null), it means META will loss 
the data from hlog.
{quote}
This is all right because the existing log splitting work has already been done, 
and the edits will be replayed during the META region open phase.
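
The argument depends on ordering: the split has to finish before the region is opened. A toy model in plain Java may make that dependency easier to see. ToyRecovery and its three lists are invented for this illustration and are not HBase structures; the sketch only shows that edits reach the region when splitLogs runs before openRegion, and are missed otherwise.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative toy only; not how HBase stores logs or edits.
class ToyRecovery {
  // Unsplit log entries left behind by the dead server.
  final List<String> hlog = new ArrayList<>();
  // Output of log splitting, waiting to be replayed.
  final List<String> recoveredEdits = new ArrayList<>();
  // What the region can see after it has been opened.
  final List<String> regionData = new ArrayList<>();

  // Log splitting turns hlog entries into recovered edits.
  void splitLogs() {
    recoveredEdits.addAll(hlog);
    hlog.clear();
  }

  // Opening the region replays whatever recovered edits exist right now.
  void openRegion() {
    regionData.addAll(recoveredEdits);
    recoveredEdits.clear();
  }
}
```

Calling splitLogs() and then openRegion() leaves every edit in regionData. Calling them in the opposite order leaves regionData empty, which is the kind of data loss this thread is trying to rule out.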

Thanks for your feedback.





[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624602#comment-13624602
 ] 

chunhui shen commented on HBASE-7824:
-

bq.Therefore, all recovered edits files are created before Meta region can be 
assigned because meta region can only be assigned only when ROOT is online.

We can open the META region on an RS while ROOT is offline. 
What if the ROOT RS is killed between getMetaLocationOrReadLocationFromRoot 
and assignMeta?



[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624604#comment-13624604
 ] 

chunhui shen commented on HBASE-7824:
-

That is, the META region would be opening on the RS before SSH has completed log splitting.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: hbase-7824.patch, hbase-7824_v2.patch, 
 hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624612#comment-13624612
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

{quote}
We can open the META region on RS when ROOT is offline. 
{quote}
The updated META location has to be written to the ROOT RS when the META region 
is opening on a newly assigned RS, is that right? Therefore, the ROOT RS has to 
be online for a successful META assignment.

So the open will fail: even if the META region can be opened, no one can access 
it, because ROOT is offline and the old location has not been updated to the 
newly assigned location. Later, the META region will be re-assigned by MetaSSH.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: hbase-7824.patch, hbase-7824_v2.patch, 
 hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624622#comment-13624622
 ] 

chunhui shen commented on HBASE-7824:
-

bq. So the open will fail even if META region can be opened but no one can 
access it because root is offline
We will retry if we fail to update the META location in the ROOT RS.
ROOT will come online eventually; however, the META region won't be re-assigned.
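
The retry behaviour described above can be illustrated with a minimal Java sketch. This is only an illustration under stated assumptions: `RetrySketch`, `retryUntilSuccess`, and the bounded `maxAttempts` are hypothetical names and parameters, not HBase's actual API, and the real retry policy may differ.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: keeps retrying an update
// (e.g. writing the META location to ROOT) until it succeeds.
class RetrySketch {
    // Returns the attempt number on which the update succeeded,
    // or -1 if it never succeeded within maxAttempts.
    static int retryUntilSuccess(BooleanSupplier update, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (update.getAsBoolean()) {
                return attempt; // update went through, e.g. ROOT came online
            }
        }
        return -1; // gave up; the caller decides what happens next
    }
}
```

In this sketch the region is never re-assigned between attempts; the same update is simply repeated until the target becomes reachable.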

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: hbase-7824.patch, hbase-7824_v2.patch, 
 hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624730#comment-13624730
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

{quote}
We will retry if fail to update META location in ROOT RS.
{quote}
Are you referring to HTable.put's internal retries? It seems that, at a high 
level, you agree with my previous statements. 

Let's go back to the scenario you mentioned above, in which the ROOT RS 
crashes after getMetaLocationOrReadLocationFromRoot. Since the ZK session timeout 
takes a while, HMaster#splitLogAndExpireIfOnline will kick in, so there won't be 
any issue.

Let's conclude this issue. I'll change the patch to the following pseudo-code 
snippet. Are you fine with this adjustment?
{code}
  ...
  fileSystemManager.splitAllLogs(sn);
  if (serverManager.isServerOnline(currentMetaServer)) {
    expire(currentMetaServer);
  }
  ...
{code}
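
For readers who want something runnable, the ordering in the pseudo-code above can be sketched as follows. This is a hedged illustration, not the patch itself: `MetaServerRecoverySketch` and its in-memory `onlineServers` and `actions` fields are hypothetical stand-ins for the master's ServerManager and file system manager, and the sketch assumes `sn` and `currentMetaServer` refer to the same server.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the master-side logic; not the real HMaster code.
class MetaServerRecoverySketch {
    // Servers the simulated ServerManager currently considers online.
    final Set<String> onlineServers = new HashSet<>();
    // Ordered record of what the sketch did, for inspection.
    final List<String> actions = new ArrayList<>();

    void splitAllLogs(String serverName) {
        actions.add("split:" + serverName);
    }

    boolean isServerOnline(String serverName) {
        return onlineServers.contains(serverName);
    }

    void expire(String serverName) {
        onlineServers.remove(serverName);
        actions.add("expire:" + serverName);
    }

    // Split the logs first, then expire the server only if it is still online.
    void splitLogAndExpireIfOnline(String currentMetaServer) {
        if (currentMetaServer == null) {
            return; // nothing to recover
        }
        splitAllLogs(currentMetaServer);
        if (isServerOnline(currentMetaServer)) {
            expire(currentMetaServer);
        }
    }
}
```

The point of the ordering is that log splitting always runs, while expiration is skipped for a server that is already offline.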
  

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: hbase-7824.patch, hbase-7824_v2.patch, 
 hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13623417#comment-13623417
 ] 

ramkrishna.s.vasudevan commented on HBASE-7824:
---

OK, I will check this fix. It is related to MTTR, so it is always useful.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-05 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13623960#comment-13623960
 ] 

rajeshbabu commented on HBASE-7824:
---

[~jeffreyz]
Going through the patch. 
{code}
+// SSH should enabled before META region assignment
+// because META region assignment is depending on ROOT server online.
{code}
FYI, there is possible META data loss with this; 
see chunhui's comment:
https://issues.apache.org/jira/browse/HBASE-8251?focusedCommentId=13621689page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13621689

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch, 
 hbase-7824-v5.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-05 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624030#comment-13624030
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~rajesh23] I saw a similar issue during my testing: if the RS that hosts ROOT 
dies before step 4, the master will be blocked. Since it's a pre-existing issue, 
I guess I can live with it in this patch. Let me try to find a solution; 
otherwise I'll leave the issue as it is. Thanks for reviewing.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch, 
 hbase-7824-v5.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-05 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624248#comment-13624248
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Test suite result:
{code}Tests run: 1336, Failures: 0, Errors: 0, Skipped: 13{code}

Integration Test:
{code}
Running org.apache.hadoop.hbase.IntegrationTestDataIngestWithChaosMonkey
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 681.593 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
{code}

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch, hbase-7824-v4.patch, 
 hbase-7824-v7.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624350#comment-13624350
 ] 

Lars Hofhansl commented on HBASE-7824:
--

You feel good about this one, Jeffrey?

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: hbase-7824.patch, hbase-7824_v2.patch, 
 hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-05 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13624353#comment-13624353
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Yeah. So far, all test failures (related or unrelated to this patch) found while 
testing this patch have been fixed in this patch or in other patches. For example, a 
recent flaky test case failure 
http://54.241.6.143/job/HBase-0.94/org.apache.hbase$hbase/60/testReport/junit/org.apache.hadoop.hbase.regionserver/TestRSKilledWhenMasterInitializing/testCorrectnessWhenMasterFailOver/
 should also be fixed.

I'll run more rounds of the test suite through the weekend to make it as 
solid as possible.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: hbase-7824.patch, hbase-7824_v2.patch, 
 hbase-7824_v3.patch, hbase-7824-v7.patch, hbase-7824-v8.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-04-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13623196#comment-13623196
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #13 (See 
[https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/13/])
HBASE-7824 Improve master start up time when there is log splitting work, 
revert due to TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS 
failure (Revision 1456689)
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1455976)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.8

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-16 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604137#comment-13604137
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~lhofhansl] Thanks for potentially giving this another chance :-). I'm still looking 
for a good solution. The cause of the test failure is what Ram suggested. That 
cause leads me to suspect a situation in which a region could be stuck 
in RIT forever. I need to write a test to verify that and will keep you 
updated. 


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-16 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604365#comment-13604365
 ] 

ramkrishna.s.vasudevan commented on HBASE-7824:
---

Found some reasons; will keep you updated. Let me also understand Jeff's patch 
and the idea behind it.


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-16 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604457#comment-13604457
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~ram_krish] Thanks for looking at this as well! 




 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-16 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604458#comment-13604458
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

I think the earlier test failure with the patch is due to an existing issue, 
which I filed at https://issues.apache.org/jira/browse/HBASE-8127. Please see 
details there. Basically, RITs of a disabling (or disabled) table could be stuck in 
RIT state forever in the master failover case. The changes in the patch trigger 
the existing issue, hence the test failures.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-15 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604123#comment-13604123
 ] 

Lars Hofhansl commented on HBASE-7824:
--

Maybe I was a bit rash here.
You said you were working on figuring out what the issue was with the failed test. 
Any luck?

Does the fix that Ram suggests in HBASE-7985 not work?
This is a good improvement and it would be a shame to miss out on this in 0.94.


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-14 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602785#comment-13602785
 ] 

Ted Yu commented on HBASE-7824:
---

Backed out again due to 
TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS test failure.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-14 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13602794#comment-13602794
 ] 

Lars Hofhansl commented on HBASE-7824:
--

Thanks for reverting, Ted.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603033#comment-13603033
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94 #903 (See 
[https://builds.apache.org/job/HBase-0.94/903/])
HBASE-7824 Improve master start up time when there is log splitting work, 
revert due to TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS 
failure (Revision 1456689)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603095#comment-13603095
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94-security #124 (See 
[https://builds.apache.org/job/HBase-0.94-security/124/])
HBASE-7824 Improve master start up time when there is log splitting work, 
revert due to TestMasterFailover#testMasterFailoverWithMockedRITOnDeadRS 
failure (Revision 1456689)
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1455976)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601206#comment-13601206
 ] 

Ted Yu commented on HBASE-7824:
---

Integrated to 0.94

Thanks for the patch, Jeff.

Let's see how it goes.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601234#comment-13601234
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94 #894 (See 
[https://builds.apache.org/job/HBase-0.94/894/])
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1455976)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600785#comment-13600785
 ] 

Ted Yu commented on HBASE-7824:
---

[~lhofhansl]:
What do you think of this one?

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600808#comment-13600808
 ] 

Lars Hofhansl commented on HBASE-7824:
--

+1 let's try again for 0.94.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, 
 hbase-7824_v2.patch, hbase-7824_v3.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594952#comment-13594952
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~ram_krish] Are you all right with my explanation in 
https://issues.apache.org/jira/browse/HBASE-7824?focusedCommentId=13592721&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13592721?
So far the test cases have passed with a small modification to the file 
src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java.

Thanks,
-Jeffrey

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593237#comment-13593237
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94-security #116 (See 
[https://builds.apache.org/job/HBase-0.94-security/116/])
HBASE-7824 Revert until TestMasterFailover passes reliably (Revision 
1452452)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592434#comment-13592434
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

The reason for adding the previously dead non-meta region servers to deadServers 
is to make the new master instance start in failover mode, as determined by the 
following code in AssignmentManager. In addition, we don't want AM to assign those 
regions before the log splitting work completes. That is why AM skips them inside 
processDeadServersAndRegionsInTransition and leaves them to be handled by SSH 
(ServerShutdownHandler).

{code}
if (!this.serverManager.getDeadServers().isEmpty()) {
  this.failover = true;
}
{code} 
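
A simplified, self-contained model of the behavior described above might look 
like the following. All names here (FailoverSketch, isFailover, 
regionsToAssignNow) are hypothetical and are not the AssignmentManager API. The 
sketch only shows that a non-empty dead-server set triggers failover mode, and 
that regions last hosted on dead servers are skipped and left to the shutdown 
handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration; this is not HBase code.
final class FailoverSketch {
    // Mirrors the quoted check: any known dead server means failover mode.
    static boolean isFailover(Set<String> deadServers) {
        return !deadServers.isEmpty();
    }

    // Regions whose last host is dead are skipped here. In this model they
    // are left for the shutdown handler, which runs after log splitting.
    static List<String> regionsToAssignNow(Map<String, String> regionToLastHost,
                                           Set<String> deadServers) {
        List<String> assignable = new ArrayList<>();
        for (Map.Entry<String, String> e : regionToLastHost.entrySet()) {
            if (!deadServers.contains(e.getValue())) {
                assignable.add(e.getKey());
            }
        }
        return assignable;
    }
}
```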

Since those failed servers will be processed by SSH, their regions should come 
online, and I did see the test log message "Finished processing of shutdown". 

A couple of regions aren't assigned for some reason, so I need to dig further into 
the test case. I'll keep you updated.

Thanks,
-Jeffrey



 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592457#comment-13592457
 ] 

Lars Hofhansl commented on HBASE-7824:
--

I would like to roll 0.94.6 soon.
Should we revert this for 0.94.6 and put it back up for 0.94.7?


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592474#comment-13592474
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~lhofhansl] I'm fine with reverting it for now and putting it back up for 0.94.7, 
since there is no hurry for this.

Thanks,
-Jeffrey

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592481#comment-13592481
 ] 

Lars Hofhansl commented on HBASE-7824:
--

If we can work out the failure that would be preferable of course :)

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592492#comment-13592492
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

Yeah, either way I'll get to the bottom of this (hopefully by the end of today) so 
that we can be sure there is no issue. 

 

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592501#comment-13592501
 ] 

Lars Hofhansl commented on HBASE-7824:
--

Thanks [~jeffreyz]!

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592522#comment-13592522
 ] 

Ted Yu commented on HBASE-7824:
---

Talked with Jeffrey.

I reverted the patch for now.

Jeffrey will provide his suggestion on how to make TestMasterFailover more 
reliable, along with his changes.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592721#comment-13592721
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

I think I found the root cause, and I addressed it in the trunk patch 
(https://reviews.apache.org/r/9419/diff/#index_header), where I have the 
following lines:
{code}
// wait till all dead servers are processed
ServerManager serverManager = master.getServerManager();
while (serverManager.areDeadServersInProgress()) {
  Thread.sleep(100);
}
{code}

My change makes the master start up quickly with some SSH handling still pending, 
which changes the existing test case's assumptions a little. So I added the lines 
above to match the existing test case's expectation that all log splitting work is 
done and the previously dead servers are handled.

I've run the test case 20 times in a loop without any failure. 
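
The polling loop above has no upper bound of its own, so it relies on the test's 
timeout if a dead server is never fully processed. A bounded variant could look 
like the sketch below. It is an illustration under stated assumptions, not part 
of the patch: DeadServerWait and waitUntilIdle are hypothetical names, and the 
BooleanSupplier stands in for a call such as 
serverManager.areDeadServersInProgress().

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of HBase or the patch.
final class DeadServerWait {
    // Polls inProgress until it reports false or the timeout elapses.
    // Returns true if processing finished, false on timeout or interrupt.
    static boolean waitUntilIdle(BooleanSupplier inProgress, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (inProgress.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: still in progress after timeoutMs
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

A test could then fail with a clear message when the helper returns false, 
instead of hanging until an outer timeout fires.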

The test case passed after removing this.deadservers.add(serverName); because, 
without that line, regions are basically assigned before master initialization, 
due to waitForActiveAndReadyMaster in the test code. That matches the old 
behavior, so the test case passed, but the log splitting work might not have been 
done before those regions were assigned.


 



 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592943#comment-13592943
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94 #879 (See 
[https://builds.apache.org/job/HBase-0.94/879/])
HBASE-7824 Revert until TestMasterFailover passes reliably (Revision 
1452452)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593183#comment-13593183
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #12 (See 
[https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/12/])
HBASE-7824 Revert until TestMasterFailover passes reliably (Revision 
1452452)
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1449920)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java

tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.7

 Attachments: HBASE-7824_3.patch, hbase-7824.patch, hbase-7824_v2.patch


 When log split work is in progress, master start up waits until all of it 
 completes, even when the log split has nothing to do with the meta region 
 servers.
 This is bad behavior: a master node can run while log splitting is 
 happening, yet its start up is blocked by the log split work. 
 Since the master is something of a single point of failure, we should start it ASAP.
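
 The behavior asked for above can be pictured with a minimal sketch. It is not 
 the HBASE-7824 patch: the class, the LogSplitter interface, and the server 
 names are all hypothetical, and it uses only java.util.concurrent. The 
 assumption is that logs of servers that carried catalog (ROOT/META) regions 
 are split inline during start up, while every other failed server is handed 
 to a background executor so start up does not wait on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch only; none of these names are HBase APIs. */
public class MasterStartupSketch {

  /** Stand-in for the (potentially slow) work of splitting one server's logs. */
  interface LogSplitter {
    void split(String serverName);
  }

  /**
   * Splits logs of catalog-carrying servers inline and defers the rest.
   * Returns the futures of the deferred work so a caller can track it later.
   */
  static List<Future<?>> startUp(List<String> failedServers,
                                 Set<String> catalogServers,
                                 LogSplitter splitter,
                                 ExecutorService background) {
    List<Future<?>> deferred = new ArrayList<>();
    for (String server : failedServers) {
      if (catalogServers.contains(server)) {
        // Start up needs the catalog regions, so wait for this split.
        splitter.split(server);
      } else {
        // Unrelated to the catalog regions: do not block start up on it.
        deferred.add(background.submit(() -> splitter.split(server)));
      }
    }
    return deferred;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    List<Future<?>> deferred = startUp(
        Arrays.asList("rs-1", "rs-meta", "rs-2"),
        Collections.singleton("rs-meta"),
        server -> System.out.println("split logs of " + server),
        pool);
    System.out.println("start up continues; deferred splits: " + deferred.size());
    for (Future<?> f : deferred) {
      f.get(); // in this demo we simply wait for the background work to finish
    }
    pool.shutdown();
  }
}
```

 Returning the futures keeps the deferred work observable, which matters for 
 tests that need to know when all log splitting has finished.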

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592002#comment-13592002
 ] 

Lars Hofhansl commented on HBASE-7824:
--

We're now seeing relatively frequent failures of 
TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS. Over in 
HBASE-7985, Ram determined that this always happens when the RS we abort does 
not carry META. It might be related to this change.


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592009#comment-13592009
 ] 

ramkrishna.s.vasudevan commented on HBASE-7824:
---

@Jeff
Is there any specific reason for adding the server to be processed (other than 
an RS carrying ROOT or META) to deadServers?
{code}
  void processDeadServer(final ServerName serverName) {
    this.deadservers.add(serverName);
    this.services.getExecutorService().submit(
      new ServerShutdownHandler(this.master, this.services, this.deadservers,
        serverName, true));
  }
{code}
I remember we used deadServers to track the servers that expired while the 
master was coming up.  See ServerManager.expireServer().  
Maybe you ran into some specific issue?  

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-03 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592021#comment-13592021
 ] 

Jeffrey Zhong commented on HBASE-7824:
--

[~ram_krish]
The reason is that we already recovered the META region servers during 
initialization, so we don't need to keep them there. The failedServer list is 
only for log splitting work. Let me see why the test 
TestMasterFailover.testMasterFailoverWithMockedRITOnDeadRS fails more often. 

Thanks,
-Jeffrey

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-03-03 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592028#comment-13592028
 ] 

ramkrishna.s.vasudevan commented on HBASE-7824:
---

Yes, I understand that part.  But do we need to explicitly add to 
deadServers?  The deadServers list was used for the case where an RS goes down 
just as the master is coming up.

 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-02-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586683#comment-13586683
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94 #859 (See 
[https://builds.apache.org/job/HBase-0.94/859/])
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1449920)

 Result = SUCCESS
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch




[jira] [Commented] (HBASE-7824) Improve master start up time when there is log splitting work

2013-02-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586891#comment-13586891
 ] 

Hudson commented on HBASE-7824:
---

Integrated in HBase-0.94-security #112 (See 
[https://builds.apache.org/job/HBase-0.94-security/112/])
HBASE-7824 Improve master start up time when there is log splitting work 
(Jeffrey Zhong) (Revision 1449920)

 Result = SUCCESS
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


 Improve master start up time when there is log splitting work
 -

 Key: HBASE-7824
 URL: https://issues.apache.org/jira/browse/HBASE-7824
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 0.94.6

 Attachments: hbase-7824.patch, hbase-7824_v2.patch

