Hi folks

Branch TEZ-2003 has made a fair amount of progress over the last several
months. This would be a good time to get started with merging the branch
back to master.
It's been running stably on a multi-node cluster for a while now. 0.8 is
just getting started, so merging this in now would allow for additional
testing. Also, the merge/rebase against master is starting to get fairly
painful.

Committers and other interested individuals, could you please start
reviewing the changes and providing feedback? I'd like to target a merge
vote by the end of the month or early next month, if that's possible.


Here's the core list of what has (and has not) changed in the branch, with
some details which should help with reviews.


   - The Task Communicator plane has been made pluggable

TaskAttemptListenerImp(l)TezDag has been split into a controller and
TaskCommunicators. Most of the logic for handling task heartbeats, etc.
has moved into TezTaskCommunicator.


   - Custom ContainerLaunchers allowed.
   - Custom TaskSchedulers allowed
   - Task execution in the runtime re-written to be less prone to errors
   and races. (TezTaskRunner2 and related classes).
   - Runtime modified to preempt and interrupt running tasks.
   - New event type TA_KILLED - for situations where the runtime decides to
   kill a task.
   - Node tracking and blacklisting is per scheduler source
   - Test framework added to test external services.
   - The core state machines and DAG co-ordination remain mostly unchanged
   (except for the TA_KILLED support)
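
For reviewers who want a mental model of the controller / TaskCommunicator
split before opening the diff, here is a rough sketch. Every name in it
(SketchTaskCommunicator, SketchController, the method names) is
hypothetical and made up for illustration; it is not the API in the branch,
only the general shape of "a controller that routes to pluggable
communicators".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CommunicatorSketch {

    /** Hypothetical plugin interface; NOT the branch's actual TaskCommunicator API. */
    interface SketchTaskCommunicator {
        void registerRunningTask(String taskAttemptId);
        List<String> runningTasks();
    }

    /** Trivial communicator that only remembers which attempts it was told about. */
    static class InMemoryCommunicator implements SketchTaskCommunicator {
        private final List<String> running = new ArrayList<>();

        @Override
        public void registerRunningTask(String taskAttemptId) {
            running.add(taskAttemptId);
        }

        @Override
        public List<String> runningTasks() {
            return new ArrayList<>(running); // defensive copy
        }
    }

    /** Hypothetical controller: owns the communicators and routes calls by name. */
    static class SketchController {
        private final Map<String, SketchTaskCommunicator> communicators = new HashMap<>();

        void addCommunicator(String name, SketchTaskCommunicator communicator) {
            communicators.put(name, communicator);
        }

        void taskStarted(String communicatorName, String taskAttemptId) {
            SketchTaskCommunicator c = communicators.get(communicatorName);
            if (c == null) {
                throw new IllegalArgumentException("Unknown communicator: " + communicatorName);
            }
            c.registerRunningTask(taskAttemptId);
        }
    }

    public static void main(String[] args) {
        InMemoryCommunicator communicator = new InMemoryCommunicator();
        SketchController controller = new SketchController();
        controller.addCommunicator("default", communicator);
        controller.taskStarted("default", "attempt_1");
        System.out.println(communicator.runningTasks()); // prints [attempt_1]
    }
}
```

The point of the sketch is only that the controller no longer contains the
communication logic itself; it delegates to whichever communicator is
registered for the task.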

Features which exist in the branch, as a side effect of the changes

   - Support for uber mode (Running tasks within the AM)
   - Partial support for preemptable tasks instead of containers (this
   exists only in the runtime. Additional changes are required in the runtime
   and AM to make this production ready)

Pending work items, which are going to get added over the next few days.

   - API modifications to make them consistent with the way Tez defines
   APIs (Plugin / PluginContext)
   - API simplification for specification of which plugins run in an AM,
   and how vertices use these (This is currently conf based)
   - Additional unit tests.
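
To make the Plugin / PluginContext item above a little more concrete, here
is a minimal sketch of that pattern: the plugin receives a context object
at construction and talks back to the framework only through it, in the
style of VertexManagerPlugin and VertexManagerPluginContext. The class and
method names below (SketchContainerLauncher, SketchLauncherContext,
containerLaunched) are hypothetical and are not the names used in the
branch.

```java
import java.util.ArrayList;
import java.util.List;

public class PluginContextSketch {

    /** Hypothetical context: the only view of the framework that a plugin gets. */
    interface SketchLauncherContext {
        void containerLaunched(String containerId);
    }

    /** Hypothetical plugin base class: the context is handed in at construction. */
    static abstract class SketchContainerLauncher {
        private final SketchLauncherContext context;

        protected SketchContainerLauncher(SketchLauncherContext context) {
            this.context = context;
        }

        protected final SketchLauncherContext getContext() {
            return context;
        }

        abstract void launchContainer(String containerId);
    }

    /** Launcher that does no real work and just reports success via the context. */
    static class NoOpLauncher extends SketchContainerLauncher {
        NoOpLauncher(SketchLauncherContext context) {
            super(context);
        }

        @Override
        void launchContainer(String containerId) {
            getContext().containerLaunched(containerId);
        }
    }

    public static void main(String[] args) {
        List<String> launched = new ArrayList<>();
        // The lambda stands in for the framework side of the context.
        SketchContainerLauncher launcher = new NoOpLauncher(id -> launched.add(id));
        launcher.launchContainer("container_1");
        System.out.println(launched); // prints [container_1]
    }
}
```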

I expect all the APIs to evolve as we move forward and learn from usage.
For that reason, these APIs will be Unstable and can change even across
minor releases.

Thanks
- Sid
