[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2014-06-20 Thread Nikhil Mulley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039663#comment-14039663
 ] 

Nikhil Mulley commented on HBASE-3833:
--

Hi,

I was wondering if there is any activity/interest on this jira since the last 
update. I faced a situation in my cluster where in the existing regionserver 
machine had to be reip'd, and due to which the master started behaving after 
the machine was added back with a new ip.The hase master had to be restarted to 
come out of the weird situation (not sure about it but the regions could not 
move or assign onto the slave with a new ip). I wondered if I could just have 
avoided the master restart with an exclude file pointing to the old ip of the 
regionserver.

Nikhil

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: ASF.LICENSE.NOT.GRANTED--excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-10-04 Thread Vishal Kathuria (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120541#comment-13120541
 ] 

Vishal Kathuria commented on HBASE-3833:


I fell sick last week and then got busy with a conference. I should be getting 
back to this soon.

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-09-19 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108236#comment-13108236
 ] 

Vishal Kathuria commented on HBASE-3833:


@Stack: I started on another project, so didn't get a chance to put the patch 
together. I have been meaning to get back to it. Will try to shoot for it this 
week.

Thanks!

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-09-17 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107191#comment-13107191
 ] 

stack commented on HBASE-3833:
--

@Vishal Any luck w/ hacking up another patch?

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-23 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038364#comment-13038364
 ] 

Vishal Kathuria commented on HBASE-3833:


Stack:
My intention was build a graceful shutdown (that is the reason I assign the 
regions first and then expire the server). I assumed that assign* API unloads 
them off the source region server when assigning them to the new region server, 
but I guess I was wrong in that assumption. Let me take a look at the code 
again to see how I can change it to make the shutdown graceful and not have the 
situation you mentioned of two servers owning the region - that is clearly not 
desirable.

Yuzhi:
In case any of the meta regions are involved, then the 
MetaServerShutdownHandler is dispatched. That class has both  isCarryingRoot() 
and isCarryingMeta() defined.

I will publish another patch in a couple of days. Stack, please hold on to 
hacking this in. 

Thanks!
Vishal

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-23 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038342#comment-13038342
 ] 

dhruba borthakur commented on HBASE-3833:
-

It will be nice to get both "forced" and "graceful" decommissioning to this 
jira, is Vishal agrees.

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038067#comment-13038067
 ] 

stack commented on HBASE-3833:
--

bq. In short, maybe we need a "force" option in decommissioning, which if used, 
does not wait for a graceful shutdown of the specified regionserver, instead 
declares it dead immediately and then follows the normal course of action 
(lease recovery, reassign regions, etc)

Agreed.  Seems like this patch does the 'forced' option.  Missing from this 
patch (though available as a script) is the graceful decommission.  Vishal, 
were you going for the 'forced' option only?  If so, should we commit this 
patch as is (though I think it dangerous if misunderstood and user is not clear 
that excludes means 'forced' decommission).

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-23 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037791#comment-13037791
 ] 

dhruba borthakur commented on HBASE-3833:
-

There are actually two use-cases that triggered this JIRA.

1. There are times when the adminstrator wants to shut down a few region 
servers, usually to upgrade hardware or some such reason. In this case, the 
administrator can put these machines in the excludes list, wait for those 
region servers to gracefully shutdown.

2. The other use-case is when certain region servers become unresponsive. Twice 
it so happened that a regionserver is heartbeating with ZK, but its capability 
to process hbase workload suddenly fell to almost zero. We could not ssh into 
the machine to debug what is wrong. (The suspicion is that the machine started 
swapping). In this case, it would be nice if the administrator had an option to 
put a machine in the excludes list, wait for a few minutes for it to gracefully 
exit, but if it still does not exit, then forcefully declare the regionserver 
as "dead".

In short, maybe we need a "force" option in decommissioning, which if used, 
does not wait for a graceful shutdown of the specified regionserver, instead 
declares it dead immediately and then follows the normal course of action 
(lease recovery, reassign regions, etc)
  

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035883#comment-13035883
 ] 

Ted Yu commented on HBASE-3833:
---

Looking through ServerShutdownHandler, I found this code in process():
{code}
// Assign root and meta if we were carrying them.
if (isCarryingRoot()) { // -ROOT-
{code}
But:
{code}
  boolean isCarryingRoot() {
return false;
  }
{code}

Same with 
{code}
if (isCarryingMeta()) this.services.getAssignmentManager().assignMeta();
{code}

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035859#comment-13035859
 ] 

stack commented on HBASE-3833:
--

bq. 2. enqueues a shutdownhandler to move all the regions off this node

ServerShutdownHandler splits logs and reassigns regions that were (or *are* in 
the decommissioning case) up on the shutdown server.  It does not unload them 
off the hosting regionserver.  So, it looks to me as though regions can be in 
two places at once... still on the server that has been marked decommissioned 
and up in the new locations that ServerShutdownHandler has assigned them too.  
This looks a little dangerous (if I am reading this right).

If decommissioning, it looks like you expire the server only at the end of the 
server shutdown processing (your call to markServerExpiry will be noticed by 
the hosting regionserver and it'll shut itself down closing out the regions it 
was hosting).

New JIRA for Karthiks suggestion sounds good.

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-18 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035840#comment-13035840
 ] 

Vishal Kathuria commented on HBASE-3833:


Hi stack,

Here is some more info on the design 
ServerManager::decommissionServer(serverInfo) f
1. first puts the node in decommissioned list so no new regions get placed on 
this node. decommissioned list prevents a node with that name from joining in 
but doesn't kick the node out if it is already joined in.
2. enqueues a shutdownhandler to move all the regions off this node
3. once the shutdown handler is done doing that, then it marks the node dead. 
It is only at this point that the region server gets a YouAreDeadException and 
shuts itself down. 

So this is supposed to do gentle offload of regions. That's why I had to put a 
wait in my test because the gentle offload can take some time. 
I did check in the debug logs that the regions are offloaded before the server 
gets shut down - so its working for me.  Please go ahead and hack this in :)

Re Karthik's suggestion of timeout, I'll open a separate task for it. Off the 
top of my head, I can't think of a simple way of implementing it

Thanks a lot for reviewing and committing this change!
Vishal


> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035670#comment-13035670
 ] 

stack commented on HBASE-3833:
--

bq. Regarding destination for the patch - I'll follow up with dhruba to figure 
out the desired destination and make appropriate changes to remove the 
deprecated stuff.

NP.  Looks like this patch is for branch.  Thats fine.  I can hack it into 
TRUNK for you if needs be.

bq. ...ServerManager::refreshNodes() calls 
ServerManager::decommissionServer(serverInfo)...

I see.  So we just crash the server down and let the shutdown process do the 
reassign of regions.  I was looking for gentle offload of regions.  Thats not 
here.  Thats fine.  I just thought it was.

Let me know if you are going to update patch.  If you are, I'll wait, else will 
try hack this in (if its working for you!)

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-18 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035655#comment-13035655
 ] 

Vishal Kathuria commented on HBASE-3833:


Thanks for the review and the suggestions stack. 

I didn't know about the JUNit test timeout - you are right, I don't need to do 
limit the loop.

Regarding destination for the patch - I'll follow up with dhruba to figure out 
the desired destination and make appropriate changes to remove the deprecated 
stuff. 

"I'm not clear on how a decommissioned node, once added to the excludes, is 
cleared of its regions. Did I miss it?"
When the node list is refreshed and a node shows up in the decommissioned list, 
ServerManager::refreshNodes() calls 
ServerManager::decommissionServer(serverInfo) to start the decomissioning 
process.

Thanks!
Vishal



> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035565#comment-13035565
 ] 

stack commented on HBASE-3833:
--

On timeout and then kill, if Karthik suggested it, then it must be a good idea 
(smile).

No biggie but FYI and for next time, style is to put spaces around operators; 
not "(int i=0;i<3;i++)" but "(int i = 0; i < 3; i++)"

This is bad, no?

{code}
+try {
+  BufferedWriter out = new BufferedWriter(new FileWriter(exFilePath));
+  out.write(info.getHostnamePort());
+  out.close();
+} catch (IOException e) {
+}
{code}

If you fail to write the excludes file, the test will fail anyways?  Just let 
out the IOE?

Here:

{code}
+for (int i=0;i<3;i++) {
+   Thread.sleep(2000);
+   if 
(!cluster.getMaster().getServerManager().getOnlineServersList().contains(info));
+   break;
+}
{code}

You already have a three minute timeout on the test so this wait of six seconds 
is probably not needed... just loop till it happens OR until junit intercedes 
and kills the thing because its going on too long.

Again, none of the above are that important... more for the next time.

Just to note that NodeDecommissionedException is very close to the 
YouAreDeadException.  Even so, I think it a good idea to add it.

This patch is for TRUNK?  HServerInfo is deprecated.  Do you want to use that?

Nice. You added refreshNodes to the shell.

I'm not clear on how a decommissioned node, once added to the excludes, is 
cleared of its regions.  Did I miss it?

Thanks Vishal.




> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-17 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035000#comment-13035000
 ] 

Vishal Kathuria commented on HBASE-3833:


Sorry for the radio silence on this issue folks. I have updated the patch with 
the comments. The biggest change is that now we wait for the region servers to 
be reassigned before adding the node to expire server list (which will trigger 
node shutdown).

Please let me know if you have any further comments. 

Also, let me know if we need to have some sort of timeout and if the region 
servers aren't migrated within that timeout, then we kill the region server 
anyway. I got this suggestion from Karthik, but wanted to see what other folks 
think before implementing it.



> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt, excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029048#comment-13029048
 ] 

stack commented on HBASE-3833:
--

It was done in ruby because I wanted a graceful_restart that worked with the 
shipping version of hbase, I did not want to require users update to get a 
working decommission.  The ruby script will be discarded when hbase can do what 
it does natively.  Its a stop gap.

Don't call out to the ruby script (smile).  On the other hand, it has had some 
exercise and has had some experience frozen into it so is worth a little bit of 
study if reimplementing up in the master.

+1 on your 1. and 2.  I would encourage you to add bits of general 
functionality that can be exploited by master or others in future silly ruby 
scripts.

Good on you Vishal

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028993#comment-13028993
 ] 

Vishal Kathuria commented on HBASE-3833:


stack: A quick question. I looked at region_mover.rb and I was wondering why 
this was implemented in ruby as a script and not in the master. Since I do the 
parsing of the decom file in the master, I would need to call out from master 
to the ruby script, which doesn't sound like such a good idea. Secondly, if we 
improve the load balancing policy from just random to something like load 
based, the ruby script would not be able to take advantage of it since random 
is hard coded in it. 

Here is what I think we should ideally do:
1. Refactor the code in AssignmentManager/ServerShutdown handler to expose a 
function that takes all the regions off a node.
2. Use the above function for decommissioning, failure handling and the unload 
admin commands so we get similar behavior and single implementation across the 
three.

What do you guys think?

dhruba:
Agreed: I will leave the configs as is for this jira. I assume the config files 
need to live on some NFS share so they are available when the master fails over 
- is that so?

Thanks

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028985#comment-13028985
 ] 

dhruba borthakur commented on HBASE-3833:
-

I meant "let's not move any of these configs to zk".

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028984#comment-13028984
 ] 

dhruba borthakur commented on HBASE-3833:
-

agree with stack. let's nove any of these configs to zk (as part of this jira).

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028911#comment-13028911
 ] 

stack commented on HBASE-3833:
--

bq. 2. I would like to hear from folks regarding what they think of storing 
these files in zk. Heck, why shouldn't we store the entire config in zk and the 
only config you need is the address of the zk cluster?

Yeah, long time plan is to put all config. up in zk.  Would suggest though that 
that is out of scope for this issue and that if your aim is to have this 
feature work like hadoop's, you not move the includes/excludes there, at least, 
not as part of this issue.

Otherwise all else is good.  Good on you Vishal.

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028900#comment-13028900
 ] 

Vishal Kathuria commented on HBASE-3833:


Thank you for the comments folks. I wasn't aware of HDFS-1547 - I just looked 
through it.

Here are my thoughts
1. Agree with doing a graceful shutdown of the region server that is being 
excluded.
2. I would like to hear from folks regarding what they think of storing these 
files in zk. Heck, why shouldn't we store the entire config in zk and the only 
config you need is the address of the zk cluster?
3. Regarding the name, I will change the file to decom instead of exclude for 
consistency with HDFS. One difference with 1547 is that 1547 leaves those nodes 
up and I am shutting them down. I don't see the value of keeping those nodes up 
so I will stay with that unless you folks think otherwise.
4. Re Todd's suggestion on the truth table, is it okay if I publish the boolean 
expression in the documentation. I have an easier time inferring the intuition 
from a boolean expression than a truth table.
5. stack: I will update the licenses to 2011 and look at the spaces again. My 
intent was to use 2 but I may have missed some places.
6. The change in .sh file is intentional, otherwise the script for starting the 
local region server was not working. I need to make sure the change doesn't 
break any of the other scripts. I will verify that.

Thanks again folks.
Vishal 

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028782#comment-13028782
 ] 

stack commented on HBASE-3833:
--

There is a 'Configuration' section in the book: 
http://hbase.apache.org/book/configuration.html  Vishal, you can add doc. there 
or just dump a paragraph or two in here and on commit we'll port it over.  
Thanks.

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028655#comment-13028655
 ] 

dhruba borthakur commented on HBASE-3833:
-

Thanks Todd. I agree that Vishal can help in filling up the table. Is there a 
hbase admin guide that needs to be updated too?

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028652#comment-13028652
 ] 

Todd Lipcon commented on HBASE-3833:


I'm OK with the separate files approach as long as we have clear documentation 
about what the different combinations mean. In the table above there are a few 
obvious cells and a lot that are unclear.

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-03 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028643#comment-13028643
 ] 

dhruba borthakur commented on HBASE-3833:
-

Patch looks good at first sight. I would like the regions to close elegantly, 
so that log replay is not required when those regions are instantiated on other 
regionservers. If you can do that enhancement, that will be great.

I am fine with the includes/excludes/decommission list (instead of Todd's regex 
solution), especially because HBase administrators are usually the same as the 
hdfs administrators; unless Todd feels strongly about it.  

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028332#comment-13028332
 ] 

stack commented on HBASE-3833:
--

Vishal:  What Todd says.  Here's some comments on the patch itself.  Licenses 
should be 2011 FYI.  Tabs are two spaces in hbase/hadoop (You seem to have 
more).  It looks like you are crashing out the regionserver and on master 
running the expireServer.  That means that the ShutdownServer handler runs.  
It'll take care of assigning out the regions that were on the crashed server 
but because this is a crash-out, we'll also be playing the regionserver's logs. 
 Do you want to add facility for having a regionserver shed its regions and 
then close clean so there are no regionserver WAL logs to replay?  (Its ok if 
you don't, just suggesting).

This is intentional?

{code}
-$command $startStop "$@" > "$logout" 2>&1 < /dev/null &
+$command "$@" $startStop > "$logout" 2>&1 < /dev/null &
{code}

Otherwise patch looks good Vishal. 

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-02 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028053#comment-13028053
 ] 

Todd Lipcon commented on HBASE-3833:


I find these include/exclude files just as confusing here in HBase as they've 
always been in Hadoop. Some users are less polite and call them "insane" :)

Could you please do a favor and fill in the following table?

|| Include present || Exclude present || In include || In exclude || Result at 
startup || Result after refresh ||
| o | o | o | o | ? | ? |
| o | x | o | o | ? | ?|
| o | x | o | x | ? | ?|
| x | o | o | o | ? | ?|
| x | o | x | o | ? | ?|
| x | x | o | o | ? | ?|
| x | x | o | x | ? | ?|
| x | x | x | o | ? | ?|
| x | x | x | x | ? | ?|

Rather than this very complicated structure, how about a simpler format, 
something like:

host.rules:
{code}
 (allow|deny)
...
{code}

where we ship a default config that looks like:
{code}
.* allow
{code}

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Affects Versions: 0.90.2
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: excl-patch.txt
>
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-02 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027828#comment-13027828
 ] 

dhruba borthakur commented on HBASE-3833:
-

+1 to Todd's idea of similarity with HDFS-1547

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-02 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027784#comment-13027784
 ] 

Todd Lipcon commented on HBASE-3833:


If going for "same semantics as Hadoop", we should try to match the semantics 
of HDFS-1547 which adds a new "decom" file, rather than include/exclude. Most 
ops people I've talked to find the include/exclude files super confusing (what 
does it mean to be both included and excluded, for example?)

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-02 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027726#comment-13027726
 ] 

Jean-Daniel Cryans commented on HBASE-3833:
---

@Vishal, currently doing stop regionserver is probably not what you want to do 
in a cluster serving live requests (it closes all connections then closes the 
regions, meaning regions unavailable for a while). You'd want something more 
like bin/graceful_stop.sh

@Dhruba, I think it would be cool to have that in ZK, but I also like having 
the same semantics as Hadoop. So I'm 0.

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-01 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027563#comment-13027563
 ] 

dhruba borthakur commented on HBASE-3833:
-

another option would be to maintain the includes-list and excludes-list in zk 
itself. Instead of local files on the master, they could be zk nodes that 
contain a list of machine-names. what do people think about that?

> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-01 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027530#comment-13027530
 ] 

Vishal Kathuria commented on HBASE-3833:


Hi,
I have started implementing this and below is a mini-spec. Could you pls look 
through it and let me know if you have any comments.

Functionality
1. The semantics of the include and exclude lists would be identical to Hadoop. 
If the include list is empty, then universal include set is assumed. exclude 
list takes precedence over include list.
2. Adding a node to exclude list + issuing a refresh using the command line 
utility will trigger ServerManager::expireServer for an already online server. 
This is the same code path that happens when the node goes down.
3. Nodes in excludeList (or the nodes which are not in include list) will not 
be allowed to join. 
4. The list will support hostname:port type of entries, where port is optional. 
Even though in production we wouldn't have multiple regionservers per node, I 
am doing this to make testing easier on dev servers.

Tools
For the command line utility, there will be a new command to refresh server 
list that will be implemented in HBaseAdmin.
HBaseAdmin will call into a method that will be added to HMaster and 
HMasterInterface. As mentioned above, if a node is added to excludeList or is 
removed from include list, it will be expired. If a node is added to the 
includeList, no action will be taken.

The command to refresh server list will be exposed in the hbase shell as well.


Nicolas: 'bin/hbase-daemon.sh stop regionserver' kills the region server 
process. I am proposing that we don't try to do that. In the scenario that 
Dhruba mentioned to me, the regionserver node might be sick and we may not be 
able to kill it.

Dhruba: Other than the command line utility to refresh the server list, are 
there any other tools that we need to provide?

Thanks!


> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase

2011-05-01 Thread Vishal Kathuria (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027529#comment-13027529
 ] 

Vishal Kathuria commented on HBASE-3833:


Copying comment history from facebook internal task:


Dhruba Borthakur - at 3:09pm on April 21st
andrew/matthew: do we need a discussion on how this feature needs to be 
designed?

1. There would be two files includes and excludes.
2. If a  machine is listed in excludes file, it shuts down the regionserver on 
that machine.
3. Region servers should start only on machines that are listed in the includes 
file.
4. a command line utility to modify/update the excludes/includes file.


Andrew Ryan - at 3:15pm on April 21st
We've wanted parity from the beginning. So yes, online decommissioning should 
be supported. I don't see any reason to do otherwise.

To Vishal's question, we wouldn't allow the node to join and migrate the 
regions. We just wouldn't allow it to join. Just shut the regionserver down. 
But, if a regionserver has active regions when it gets the decommission 
request, then it should gracefully reassign its regions and then shut down.


Nicolas Spiegelberg - at 3:20pm on April 21st I think we just want to deny 
requests and force a regionserver already online to close.  HBase just has a 
logical association with regions that is given upon onlining, stored in an HDFS 
file called META.  We save the state across a graceful restart, but will assign 
out those regions after a small startup period has passed without onlining.  
Therefore, we don't benefit from trying to grab any state from the physical 
server itself.


Vishal Kathuria - at 3:28pm on April 21st Thanks guys.

One question is: 
if I move a server from includes to excludes, should that trigger online 
decommissioning and shut the machine down when decomissioning is done? or 
should we kick it out right away?

I think the former is better. Looks like we are all agreeing on that?

thanks






Adrian Potra - at 3:35pm on April 21st
 * changed the subscribers. Removed: Adrian Potra.


Vishal Kathuria - at 3:35pm on April 21st Sorry, I wrote my comment before I 
read Nicolas'

I see his point - since the Region Servers don't have persisted data (which is 
in HDFS), we don't loose much by kicking them out.

sounds reasonable to me.


Dhruba Borthakur - at 4:37pm on April 21st
@Nicolas: "but will assign out those regions after a small startup period has 
passed without onlining"  -- is it possible to avoid the small startup period 
mentioned above if the RS is decommissioned?


Nicolas Spiegelberg - at 5:43pm on April 21st
@Dhruba: agreed.  We should just see what 'bin/hbase-daemon.sh stop 
regionserver" does.  By my comment, I just meant to say that, if need be, we 
can actually just kick out the connection and do all the work on the master.  
As a safeguard either way, we should rename that RS's log directory (if it 
exists).  This will forcibly cause the RS to die if something went awry.



> ability to support includes/excludes list in Hbase
> --
>
> Key: HBASE-3833
> URL: https://issues.apache.org/jira/browse/HBASE-3833
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> An HBase cluster currently does not have the ability to specify that the 
> master should accept regionservers only from a specified list. This helps 
> preventing administrative errors where the same machine could be included in 
> two clusters. It also allows the administrator to easily remove un-ssh-able 
> machines from the cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira