The Master is responsible for orchestrating recovery from RangeServer
failures as well as carrying out meta operations in response to commands
such as CREATE TABLE, ALTER TABLE, and DROP TABLE.  These meta operations
are relatively straightforward except in the face of RangeServer failure.
When this happens, any in-progress meta operation that is dependent on the
failed RangeServer needs to block until the RangeServer has been recovered.
If another RangeServer involved in the recovery goes down, yet another
recovery operation needs to be carried out, and the Master can quickly
build up a fairly complex set of operation dependencies.

The Master is also responsible for moving ranges from one RangeServer to
another when load across the RangeServers gets out of balance.  If a MOVE
RANGE operation is in progress when, say, an ALTER TABLE request arrives,
and the range being moved is part of the table specified in the ALTER TABLE
request, then the ALTER TABLE operation needs to wait until the MOVE RANGE
operation is complete before it can continue.  Also, if two ALTER TABLE
requests arrive at the Master at the same time, then they should get carried
out in sequential order with one of the ALTER TABLE operations depending on
the completion of the other operation.

To handle these dependencies, I propose designing the Master as an execution
engine for a directed acyclic graph of operations, or operation dependency
graph (ODG).  Each node in the graph would represent an operation (e.g.,
ALTER TABLE, RECOVER RangeServer) and would contain dynamic state.
Execution threads would carry out the operations by picking up nodes from
the graph in topological sort order.  When a RangeServer dies, the ODG
execution engine would pause, a new RECOVER RangeServer node would be created,
and the ODG would be modified to include this new node.  All of the existing
nodes that were dependent on that RangeServer would become dependent on this
new RECOVER RangeServer node.  At this point the ODG execution engine would
be restarted.
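
To make the idea concrete, here is a minimal sketch of how such a graph might
behave.  It is not Hypertable code: the class OperationGraph, its methods, and
the operation and server names are all hypothetical, and a real engine would
run nodes on worker threads rather than just computing an order.

```python
from collections import defaultdict, deque


class OperationGraph:
    """Hypothetical operation dependency graph (ODG); names are illustrative."""

    def __init__(self):
        self.deps = defaultdict(set)     # operation -> operations it waits on
        self.servers = defaultdict(set)  # operation -> RangeServers it involves
        self.done = set()                # operations that have completed

    def add(self, op, servers=(), after=()):
        """Add an operation node along with its dependencies."""
        self.deps[op].update(after)
        self.servers[op].update(servers)

    def on_server_failure(self, server):
        """Add a RECOVER node and make every pending operation that
        involves the failed server depend on it."""
        recover = f"RECOVER {server}"
        affected = [op for op, used in self.servers.items()
                    if server in used and op not in self.done]
        self.add(recover)
        for op in affected:
            self.deps[op].add(recover)
        return recover

    def execution_order(self):
        """Return pending operations in topological order (Kahn's algorithm)."""
        pending = [op for op in self.deps if op not in self.done]
        remaining = {op: {d for d in self.deps[op] if d not in self.done}
                     for op in pending}
        ready = deque(op for op in pending if not remaining[op])
        order = []
        while ready:
            op = ready.popleft()
            order.append(op)
            for other, waits in remaining.items():
                if op in waits:
                    waits.discard(op)
                    if not waits:
                        ready.append(other)
        if len(order) != len(pending):
            raise ValueError("cycle or unknown dependency in the ODG")
        return order


graph = OperationGraph()
graph.add("MOVE RANGE r1", servers={"rs1", "rs2"})
graph.add("ALTER TABLE t", servers={"rs1"}, after={"MOVE RANGE r1"})
graph.on_server_failure("rs1")  # both pending operations involve rs1
order = graph.execution_order()
print(order)
```

In this sketch the failure of rs1 puts the recovery node ahead of both the
MOVE RANGE and the ALTER TABLE, while the ALTER TABLE still waits on the MOVE
RANGE as before.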

The Master Meta Log (MML) would essentially persist any changes to the ODG,
both node state as well as structural graph changes.  When the Master fails
and a new one comes up, it would replay the MML to reconstruct the ODG, after
which it could continue execution.
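
As a rough illustration of the replay step, the sketch below treats the MML
as an append-only list of JSON records and rebuilds the graph by applying
them in order.  The record kinds (add_node, add_edge, set_state, remove_node)
and field names are assumptions made up for this example, not an actual MML
format.

```python
import json


def append_entry(log, kind, **fields):
    """Append one record to the (hypothetical) meta log."""
    log.append(json.dumps({"kind": kind, **fields}))


def replay(log):
    """Rebuild the ODG from the log: op -> {"state": str, "deps": set}."""
    nodes = {}
    for line in log:
        entry = json.loads(line)
        kind, op = entry["kind"], entry["op"]
        if kind == "add_node":        # structural change: new operation
            nodes[op] = {"state": entry["state"], "deps": set()}
        elif kind == "add_edge":      # structural change: new dependency
            nodes[op]["deps"].add(entry["depends_on"])
        elif kind == "set_state":     # node state change
            nodes[op]["state"] = entry["state"]
        elif kind == "remove_node":   # operation finished and was retired
            nodes.pop(op, None)
            for node in nodes.values():
                node["deps"].discard(op)
    return nodes


mml = []
append_entry(mml, "add_node", op="ALTER TABLE t", state="INITIAL")
append_entry(mml, "add_node", op="RECOVER rs1", state="INITIAL")
append_entry(mml, "add_edge", op="ALTER TABLE t", depends_on="RECOVER rs1")
append_entry(mml, "set_state", op="RECOVER rs1", state="REPLAYING")

# A newly started Master would call replay() on the persisted log.
odg = replay(mml)
print(odg)
```

Because every node state change and every structural change is logged before
it takes effect, replaying the records in order yields the same graph the
failed Master was executing.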

Thoughts?

- Doug
