Job setup and take down on Nodes
--------------------------------
Key: HADOOP-580
URL: http://issues.apache.org/jira/browse/HADOOP-580
Project: Hadoop
Issue Type: New Feature
Components: mapred
Reporter: Benjamin Reed
It would be nice if there were a hook for doing job provisioning and cleanup on
compute nodes. The TaskTracker implicitly knows when a job starts (a task for
the job is received), and pollForTaskWithClosedJob() will explicitly report that a
job has finished if a Map task has been run (if only Reduce tasks have run and
finished, I don't think pollForTaskWithClosedJob() will return anything, will
it?), but child Tasks do not get this information.
It would be nice if there were a hook so that programmers could do some
provisioning when a job starts and cleanup when it ends. Caching covers some of
the provisioning, but in some cases a helper daemon may need to be started or
the results of queries retrieved, and having startJob() and finishJob()
callbacks that run exactly once on each node that executes part of the job
would be wonderful.
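As a rough sketch of the kind of hook being requested (the JobHook interface, its
method names, and the "myjob.helper.command" property below are hypothetical and
not part of Hadoop's mapred API), something like this would let a job start a
node-local helper daemon once per node and tear it down when the job closes:

    // Hypothetical sketch only; none of these names exist in Hadoop today.
    // The TaskTracker would invoke startJob() once per node before the first
    // task of a job runs there, and finishJob() once after the job is closed
    // on that node.
    import org.apache.hadoop.mapred.JobConf;

    public interface JobHook {
        /** Called exactly once per node before the first task of the job runs. */
        void startJob(JobConf conf) throws java.io.IOException;

        /** Called exactly once per node after the job is closed on this node. */
        void finishJob(JobConf conf) throws java.io.IOException;
    }

    /** Example hook: provision a helper daemon for the job, clean it up at the end. */
    class HelperDaemonHook implements JobHook {
        private Process helper;

        public void startJob(JobConf conf) throws java.io.IOException {
            // Provisioning: launch a node-local helper process for this job.
            helper = Runtime.getRuntime().exec(conf.get("myjob.helper.command"));
        }

        public void finishJob(JobConf conf) throws java.io.IOException {
            // Cleanup: shut the helper down once the job is finished on this node.
            if (helper != null) {
                helper.destroy();
            }
        }
    }

The important property is the once-per-node guarantee: per-task methods can
start a daemon, but they have no reliable way to know when the last task of the
job on that node has completed, so cleanup cannot be done there.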