It would be good for this process to be an arbitrary executable,
not necessarily a Java program.
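For illustration, a minimal sketch of how a tracker might spawn such a
helper executable when a job starts on a node and kill it when the job
ends. The hook points onJobStart()/onJobEnd() and all other names here
are assumptions for the sake of the sketch, not existing Hadoop APIs:

    // Hypothetical sketch only: the hook points and names below are
    // assumptions, not part of Hadoop.
    import java.io.IOException;

    public class JobHelperProcess {
      private Process helper;

      // Assumed hook: called once per node when the first task of a job arrives.
      public void onJobStart(String helperCommand) throws IOException {
        // The helper can be any executable, not necessarily a Java program.
        helper = Runtime.getRuntime().exec(helperCommand);
      }

      // Assumed hook: called once per node when the job finishes.
      public void onJobEnd() {
        if (helper != null) {
          helper.destroy();  // kill the child process at the end of the job
          helper = null;
        }
      }
    }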
On Oct 6, 2006, at 10:49 AM, Benjamin Reed (JIRA) wrote:
[ http://issues.apache.org/jira/browse/HADOOP-580?page=comments#action_12440557 ]
Benjamin Reed commented on HADOOP-580:
--------------------------------------
No. I'm very against running code in the Trackers (as my mail
indicates :). The idea would be that you would spawn off a child
process at the beginning of a job and kill it at the end. (Or some
variation on that theme.)
Job setup and take down on Nodes
--------------------------------
Key: HADOOP-580
URL: http://issues.apache.org/jira/browse/HADOOP-580
Project: Hadoop
Issue Type: New Feature
Components: mapred
Reporter: Benjamin Reed
It would be nice if there were a hook for doing job provisioning and
cleanup on compute nodes. The TaskTracker implicitly knows when a job
starts (a task for the job is received), and
pollForTaskWithClosedJob() will explicitly say that a job is finished
if a Map task has been run (if only Reduce tasks have run and
finished, I don't think pollForTaskWithClosedJob() will return
anything, will it?), but child Tasks never receive this information.
It would be nice if there were a hook so that programmers could do
some provisioning when a job starts and cleanup when it ends.
Caching addresses some of the provisioning, but in some cases a
helper daemon may need to be started or the results of queries
retrieved; startJob() and finishJob() callbacks that happen exactly
once on each node that runs part of the job would be wonderful.
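To make the request concrete, here is one possible shape for such a
hook, sketched as a plain Java interface. Only the names startJob()
and finishJob() come from the description above; the interface name,
parameter, and signatures are illustrative assumptions:

    // Sketch of the proposed per-node job lifecycle hook. The interface
    // name and parameter are illustrative assumptions; only startJob()
    // and finishJob() are named in this issue.
    public interface JobLifecycleHook {
      // Invoked exactly once on a node before the first task of a job runs.
      void startJob(String jobId) throws java.io.IOException;

      // Invoked exactly once on a node after the last task of the job finishes.
      void finishJob(String jobId) throws java.io.IOException;
    }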