[ 
https://issues.apache.org/jira/browse/HBASE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929841#action_12929841
 ] 

HBase Review Board commented on HBASE-3112:
-------------------------------------------

Message from: st...@duboce.net


bq.  On 2010-11-08 12:13:11, Jonathan Gray wrote:
bq.  > Looks pretty good.  Should be much improved.
bq.  > 
bq.  > What about client-side semantics and checks?  I thought there was going 
to be some change there?  An async trigger and then a fast way to see if it's 
done or not?  Should be clear on javadoc since enable/disable is always thing 
we get questions on.
bq.  > 
bq.  > I think most of my issues with patch is that it adds huge amount of new 
code into already big AssignmentManager class.  Should move these classes out 
and not sure if we should have these methods which just touch ZK.  I think 
having all logic about the state transitions, usage of ZK, etc more explicitly 
controlled in the handlers themselves might be more clear to follow how this 
works.  Maybe not but was a little hard to follow (that, for example, we don't 
have enabling/disabling states in memory... where we were talking about 
in-memory state vs zk states, etc).
bq.  > 
bq.  > Also, ZKTable can be simplified (and made more future proof) with an 
enum.
bq.  > 
bq.  > Now, for failover, I know we said we would punt to 0.92 on handling of 
this case, but we should at least make it so we don't get into broken state if 
we have a master failover.  What will happen if we are in disabling up in zk 
but not all closes have been done?
bq.  > 
bq.  > Lastly... I think we definitely need some unit tests on this stuff.  
TestMasterFailover does a little but not really relevant to these changes (only 
deals with regions already RIT).  TestRollingRestart does an enable/disable of 
table.  But none of this stuff takes into account failure of stuff in handlers, 
failure of RS, etc... Don't need to hold up this patch on unit tests if it's 
working in cluster testing on large tables, but it's kind of thing that would 
be good to test at some point.

Added to shell is_enabled and is_disabled.  Added javadoc to enableTable and 
disableTable explaining how these methods return immediately now but that 
process can take a while to complete.

On methods that just touch zk, they are of a class of which one member -- 
isTableDisabled -- does checks of AM data structures... so I can't really move 
them out.

Change ZKTable to use enums.

If disabling and master failover, when new master comes online, will be broke. 
But enable/disable are idempotent and a rerun of enable/disable will start up 
the process again and it should finish off properly.  Thats the theory anyways.

Regards unit tests, enable/disable is used all over unit test code base.  If 
you are looking for a test that simulates big table enable/disable, that's hard 
to do in a unit test.


bq.  On 2010-11-08 12:13:11, Jonathan Gray wrote:
bq.  > 
trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, line 
1302
bq.  > <http://review.cloudera.org/r/1187/diff/2/?file=16884#file16884line1302>
bq.  >
bq.  >     I thought we had this method somewhere already?

there is a similarly named one for zk that you added -- is that what you are 
referring to?


bq.  On 2010-11-08 12:13:11, Jonathan Gray wrote:
bq.  > 
trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, line 
1573
bq.  > <http://review.cloudera.org/r/1187/diff/2/?file=16884#file16884line1573>
bq.  >
bq.  >     Are these methods necessary?  Seems like unnecessary stuff in 
AssignmentManager.  The Handlers can just use the ZK methods directly?  (keep 
the AM methods as check/set of it's internal state?)

See note above where argument is that these are of a set and that since one of 
the set members does AM machinations, then all belong in AM.  For now I think 
its fine.


bq.  On 2010-11-08 12:13:11, Jonathan Gray wrote:
bq.  > 
trunk/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java,
 line 97
bq.  > <http://review.cloudera.org/r/1187/diff/2/?file=16888#file16888line97>
bq.  >
bq.  >     This is for repeated runs of enable?  Should we log if this actually 
removes regions (table has X total regions, already Y online, assigning Z still 
offline)?
bq.  >     
bq.  >     It's okay that this operation is not done under any locks?  Couldn't 
regions be coming online/offline concurrent with this operation?

Added logging to enable/disable.

Regards 'locks', these are my own local copies of these Lists.  Also notion 
that regions in table count can change is sort of allowed..  Each time through 
loop we'll recheck .META. for enabling and we'll re-look at the list of table 
regions in AssignmentManager....


bq.  On 2010-11-08 12:13:11, Jonathan Gray wrote:
bq.  > trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTable.java, line 
41
bq.  > <http://review.cloudera.org/r/1187/diff/2/?file=16890#file16890line41>
bq.  >
bq.  >     enum?

done


bq.  On 2010-11-08 12:13:11, Jonathan Gray wrote:
bq.  > trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java, line 
716
bq.  > <http://review.cloudera.org/r/1187/diff/2/?file=16892#file16892line716>
bq.  >
bq.  >     What is this method's policy on watches?  Please note it in javadoc. 
 This method is not at all strict and is multiple operations so is not safe to 
use on nodes where we rely on watches / monitoring of state transitions, so 
let's make some kind of note.

K. This method is for non-watched znode.  Will add this to javadoc.


- stack


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1187/#review1841
-----------------------------------------------------------





> Enable and disable of table needs a bit of loving in new master
> ---------------------------------------------------------------
>
>                 Key: HBASE-3112
>                 URL: https://issues.apache.org/jira/browse/HBASE-3112
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.90.0
>
>         Attachments: 3112-v2.txt, 3112-v3.txt, 3112.txt
>
>
> The tools are in place to do a more reliable enable/disable of tables.  Some 
> work has been done to hack in a basic enable/disable but its not enough -- 
> see the test avro/thrift tests where a disable/enable/disable switchback can 
> confuse the table state (and has been disabled until this issue addressed).
> This issue is about finishing off enable/disable in the new master.   I think 
> we need to add to the table znode an enabling/disabling state rather than 
> have them binary with a watcher that will stop an enable (or disable) 
> starting until the previous completes (Currently we atomically switch the 
> state though the region close/open lags -- some work in enable/disable 
> handlers helps in that they won't complete till all regions have 
> transitioned.. but its not enough).
> Need to add tests too.
> Marking issue critical bug because loads of the questions we get on lists are 
> about enable/disable probs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to