For alter table, it's not merely an atomic update to Hyperspace. The master
updates the schema in Hyperspace and then sends "update schema" commands to
all the RangeServers and waits for them to ack before returning. This avoids
unnecessary per-RangeServer traffic to Hyperspace. Since alter table is
expected to be a fairly infrequent operation, I don't think it's unreasonable
for users to have to wait if execution is blocked by RangeServer recovery.
This is pretty much the same as drop table.
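
A rough sketch of that alter table flow, in Python for brevity. The class
and method names (Hyperspace, RangeServer, write_schema, update_schema) are
made up for illustration and are not the actual Hypertable interfaces:

```python
from dataclasses import dataclass, field


@dataclass
class Hyperspace:
    """Hypothetical stand-in for the schema store, not the real Hyperspace API."""
    schemas: dict = field(default_factory=dict)

    def write_schema(self, table, schema):
        self.schemas[table] = schema


@dataclass
class RangeServer:
    """Hypothetical stand-in for a RangeServer that caches table schemas."""
    name: str
    schemas: dict = field(default_factory=dict)

    def update_schema(self, table, schema):
        # Apply the schema pushed by the master and acknowledge it.
        self.schemas[table] = schema
        return True


def alter_table(hyperspace, range_servers, table, new_schema):
    # Step 1: the master writes the new schema to Hyperspace once.
    hyperspace.write_schema(table, new_schema)
    # Step 2: the master pushes "update schema" to every RangeServer, so the
    # RangeServers do not each have to re-read the schema from Hyperspace.
    acks = {rs.name: rs.update_schema(table, new_schema) for rs in range_servers}
    # Step 3: only return once every RangeServer has acked. A real master
    # would block here during recovery; this sketch just raises instead.
    missing = [name for name, ok in acks.items() if not ok]
    if missing:
        raise RuntimeError(f"no ack from: {missing}")
    return acks
```

The point of the sketch is only the ordering: one write to Hyperspace,
then a fan-out to the RangeServers, then a wait for all acks.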

-Sanjit

On Fri, Jul 31, 2009 at 2:54 PM, Luke <[email protected]> wrote:

>
> Master is getting more and more like a workqueue and jobtracker :) It
> seems to be advantageous to actually create a separate general server
> to manage all the tasks, which can be used to schedule map/reduce
> tasks in the future as well.
>
> On Fri, Jul 31, 2009 at 11:14 AM, Doug Judd<[email protected]> wrote:
> > The Master is responsible for orchestrating recovery from RangeServer
> > failures as well as carrying out meta operations in response to commands
> > such as CREATE TABLE, ALTER TABLE, and DROP TABLE.  These meta operations
> > are relatively straightforward except in the face of RangeServer failure.
> > When this happens, any in-progress meta operation that is dependent on the
> > failed RangeServer needs to block until the RangeServer has been recovered.
> > If another RangeServer that is involved in the recovery goes down, there is
> > now another recovery operation that needs to get carried out. The Master can
> > quickly start building up a fairly complex set of operation dependencies.
> >
> > The master is also responsible for moving ranges from one RangeServer to
> > another when load across the RangeServers gets out of balance.  If a MOVE
> > RANGE operation is in progress when, say, an ALTER TABLE request arrives,
> > and the range being moved is part of the table specified in the ALTER TABLE
> > request, then the ALTER TABLE operation needs to wait until the MOVE RANGE
> > operation is complete before it can continue.  Also, if two ALTER TABLE
> > requests arrive at the Master at the same time, then they should get carried
> > out in sequential order with one of the ALTER TABLE operations depending on
> > the completion of the other operation.
>
> I'm not sure about this particular case. For alter table while ranges
> are split/moved, it seems to me that as long as you update the schema
> in hyperspace/range servers atomically, the split/moved ranges on the
> new destination server will get the right schema. Also, two alter table
> operations can overlap in many cases, as long as the schema updates on
> hyperspace/range servers are atomic. For cases where alter table on
> the same table needs to be sequenced, it's actually not too much to
> ask the application to do the sequencing, as alter table is not really a
> frequent operation (otherwise, they should go with a generic column
> family and go nuts on qualifiers.)
>
> > To handle these dependencies, I propose designing the Master as an execution
> > engine for a directed acyclic graph of operations or operation dependency
> > graph (ODG).  Each node in the graph would represent an operation (e.g.
> > ALTER TABLE, RECOVER RangeServer) and would contain dynamic state.
> > Execution threads would carry out the operations by picking up nodes from
> > the graph in topological sort order.  When a RangeServer dies, the ODG
> > execution engine would pause, a new "RECOVER RangeServer" will get created
> > and the ODG will get modified to include this new node.  All of the existing
> > nodes that were dependent on that RangeServer would become dependent on this
> > new RECOVER RangeServer node.  At this point the ODG execution engine would
> > be restarted.
>
> The same alter table arguments can apply here as well. You can let the
> alter table proceed on hyperspace and the remaining range servers.
> The recovered ranges would get the right schema. Otherwise, an alter
> table command can take a long time (up to a few minutes) while one of
> the range servers is being recovered.
>
> > The Master Meta Log (MML) would essentially persist any changes to the ODG,
> > both node state as well as structural graph changes.  When the Master fails
> > and a new one comes up, it would replay the MML to reconstruct the ODG after
> > which it could continue execution.
> >
> > Thoughts?
>
> It seems to me that an ODG is not absolutely required for normal
> Hypertable operations. I'd like to avoid over-engineering (if
> possible) for the first release.
>
> __Luke
>
> >
>
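
For reference, the ODG idea quoted above (operations picked up in
topological order, with a RECOVER RangeServer node added when a server
dies) could be sketched roughly as follows. This is only an illustration
in Python using the standard graphlib module; the function names and the
servers_used mapping are assumptions, not part of any existing Master code:

```python
from graphlib import TopologicalSorter


def add_recovery_node(deps, servers_used, failed_server):
    """Return a new dependency map with a RECOVER node for failed_server.

    deps maps each operation to the set of operations it must wait for.
    servers_used maps each operation to the RangeServers it touches
    (a hypothetical bookkeeping structure for this sketch).
    """
    recover = f"RECOVER {failed_server}"
    new_deps = {op: set(preds) for op, preds in deps.items()}
    new_deps[recover] = set()
    # Every operation that touches the failed server must now wait for
    # the recovery operation to finish.
    for op, servers in servers_used.items():
        if failed_server in servers:
            new_deps.setdefault(op, set()).add(recover)
    return new_deps


def execution_order(deps):
    # One valid order in which execution threads could pick up nodes.
    return list(TopologicalSorter(deps).static_order())
```

With two pending operations where only ALTER TABLE touches the failed
server, the RECOVER node is ordered before ALTER TABLE while DROP TABLE
stays independent of it.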
