The "Hbase/MasterRewrite" page has been changed by stack.
http://wiki.apache.org/hadoop/Hbase/MasterRewrite?action=diff&rev1=12&rev2=13

--------------------------------------------------

   * [[#scope|Design Scope]]
   * [[#design|Design]]
    * [[#moveall|Move all state, state transitions, and schema to go via 
zookeeper]]
+    * [[#tablestate|Table State]]
+    * [[#regionstate|Region State]]
     * [[#zklayout|Zookeeper layout]]
    * [[#clean|Region State changes are clean, minimal, and comprehensive]]
    * [[#balancer|Load Assignment/Balancer]]
@@ -63, +65 @@

  <<Anchor(regionstate)>>
  ==== Region State ====
  
- Run region state transitions -- i.e. opening, closing -- by changing state in 
zookeeper rather than in Master maps as is currently done.
+ Run region state transitions -- i.e. ''opening'', ''closing'' -- by changing 
state in zookeeper rather than in Master Maps as is currently done.
  
  Keep up a region transition trail; regions move through states from 
''unassigned'' to ''opening'' to ''open'', etc.  A region can't jump states as 
in going from ''unassigned'' to ''open''.
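
The no-jumping rule above can be sketched as a small transition table. This is only an illustration, not the HBase implementation: ''unassigned'', ''opening'', ''open'' and ''closing'' come from the text, while the ''closed'' state, the edges after ''open'', and the names `ALLOWED` and `advance` are assumptions made for the example.

```python
# Hypothetical transition table. Only unassigned -> opening -> open is stated
# in the page; the "closed" state and the remaining edges are assumptions.
ALLOWED = {
    "unassigned": {"opening"},
    "opening": {"open"},
    "open": {"closing"},
    "closing": {"closed"},
    "closed": {"unassigned"},
}


def advance(trail, new_state):
    """Return a new transition trail with new_state appended, if the step is legal."""
    current = trail[-1]
    if new_state not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new_state}")
    return trail + [new_state]


# A region walks through every intermediate state; the trail records the path.
trail = ["unassigned"]
trail = advance(trail, "opening")
trail = advance(trail, "open")
```

Calling `advance(["unassigned"], "open")` raises `ValueError`, which is the "can't jump states" case.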
  
  Master (or client) moves regions between states.  Watchers on RegionServers 
notice changes and act on them.  Master (or client) can do transitions in bulk; 
e.g. assign a regionserver 50 regions to open on startup.  Effect is that 
Master "pushes" work out to regionservers rather than wait on them to heartbeat.
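
A rough simulation of the push model described above, using an in-memory stand-in rather than a real zookeeper client. Everything here is hypothetical: the `FakeStateStore` and `FakeRegionServer` classes, the `/assign/rs1/` path, and the region names are invented for the sketch. The prefix-style persistent watch is also a simplification; as far as I know, real zookeeper watches are one-shot and must be re-registered.

```python
from collections import defaultdict


class FakeStateStore:
    """In-memory stand-in for zookeeper: paths map to values, watchers fire on writes."""

    def __init__(self):
        self.data = {}
        self.watchers = defaultdict(list)

    def watch(self, prefix, callback):
        # Simplification: a persistent watch on every path under `prefix`.
        self.watchers[prefix].append(callback)

    def set(self, path, value):
        self.data[path] = value
        for prefix, callbacks in self.watchers.items():
            if path.startswith(prefix):
                for callback in callbacks:
                    callback(path, value)


class FakeRegionServer:
    """Reacts to state changes pushed by the master instead of heartbeating for work."""

    def __init__(self, name, store):
        self.name = name
        self.opened = []
        store.watch(f"/assign/{name}/", self.on_change)

    def on_change(self, path, value):
        if value == "opening":
            # Record the region name, i.e. the last path component.
            self.opened.append(path.rsplit("/", 1)[-1])


store = FakeStateStore()
rs = FakeRegionServer("rs1", store)

# Master assigns 50 regions in bulk; the watcher on the regionserver sees each one.
for i in range(50):
    store.set(f"/assign/rs1/region-{i}", "opening")
```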
  
- A problem we have in current master is that states do not make a circle.  
Once a region is open, master stops keeping account of a region's state; region 
state is now kept out in the .META. catalog table with its condition checked 
periodically by .META. table scan.  State spanning two systems currently makes 
for confusion and evil such as region double assignment because there are race 
condition potholes as we move from one system -- internal state maps in master 
-- to the other during update to state in .META.  Current thinking is to keep 
region lifecycle all up in zookeeper but that won't scale.  Postulate 100k 
regions -- 100TB at 1G regions -- each with two or three possible states each 
with watchers for state change.  My guess is that this is too much to put in 
zk.  TODO: how to manage transition from zk to .META.?
+ A problem we have in current master is that states do not make a circle.  
Once a region is open, master stops keeping account of a region's state; region 
state is now kept out in the .META. catalog table with its condition checked 
periodically by .META. table scan.  State spanning two systems currently makes 
for confusion and evil such as region double assignment because there are race 
condition potholes as we move from one system -- internal state maps in master 
-- to the other during update to state in .META.  Current thinking is to keep 
region lifecycle all up in zookeeper but that won't scale.  Postulate 100k 
regions -- 100TB at 1G regions -- each with two or three possible states each 
with watchers for state change.  My guess is that this is too much to put in zk 
(Mahadev+Patrick say no if data is small).  TODO: how to manage transition from 
zk to .META.?  Also, can't do getClosest up in zk, only in .META.
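
The back-of-envelope numbers above, worked through. The inputs (100TB, 1G regions, two or three states per region) are from the text; reading that as one watched znode per possible state per region is an assumption, and it gives a rough range rather than a measured zookeeper limit.

```python
TOTAL_BYTES = 100 * 10**12   # 100TB of data (decimal units)
REGION_BYTES = 10**9         # 1G per region

regions = TOTAL_BYTES // REGION_BYTES

# Assumption: one watched znode per possible state per region.
watched_low = regions * 2
watched_high = regions * 3

print(regions, watched_low, watched_high)
```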
  
- State and Schema are distinct in zk.  No interactions.
+ TODO: qs in zk?
  
  <<Anchor(zklayout)>>
  
@@ -82, +84 @@

  /hbase/root-region-server
  
  # Is STARTCODE a timestamp or a random id?
- /hbase/rs/STARTCODE/load/
+ /hbase/rs/STARTCODE
- /hbase/rs/STARTCODE/regions/opening/
+ 
- /hbase/tables/TABLENAME {JSON array of table objects.  Each table object 
would have state and schema objects, etc.  State is read-only, offline, etc.  
Schema has differences from default only}
+ /hbase/tables {JSON array of table objects.  Each table object would have 
state and schema objects, etc.  State is read-only, offline, etc.  Schema has 
differences from default only}
  }}}
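
One possible shape for the JSON array of table objects in the layout above. This is a sketch only: the field names (`name`, `state`, `schema`), the attribute names, and the `DEFAULT_SCHEMA` values are all hypothetical, chosen to show the "differences from default only" idea rather than actual HBase schema defaults.

```python
import json

# Hypothetical defaults; real HBase attribute names and values may differ.
DEFAULT_SCHEMA = {"versions": 3, "compression": "NONE", "in_memory": False}


def schema_diff(schema, defaults=DEFAULT_SCHEMA):
    """Keep only the attributes whose values differ from the defaults."""
    return {key: value for key, value in schema.items() if defaults.get(key) != value}


full_schema = {"versions": 1, "compression": "NONE", "in_memory": False}

# Each table object carries its state and a schema holding non-default values only.
tables = [
    {"name": "TABLENAME", "state": "read-only", "schema": schema_diff(full_schema)},
    {"name": "OTHERTABLE", "state": "offline", "schema": {}},
]

# This string is what would be stored as the data of /hbase/tables.
payload = json.dumps(tables, sort_keys=True)
```

Storing only the differences keeps the znode data small, which matters if the whole array lives in a single znode.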
  
  <<Anchor(clean)>>