[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Shi updated MAPREDUCE-1434:
--------------------------------

    Attachment: dynamic_input-v1.patch

We can dynamically add input to a running job with the command:
{noformat} 
hadoop job -D mapred.input.format.class="YourInputFormatClass" -input-add <jobid> <inputdir>
{noformat} 

and tell the master (JobTracker) that all the input has been added with:
{noformat} 
hadoop job -input-done <jobid>
{noformat} 


> Dynamic add input for one job
> -----------------------------
>
>                 Key: MAPREDUCE-1434
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1434
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 0.20.3
>            Reporter: Xing Shi
>             Fix For: 0.20.3
>
>         Attachments: dynamic_input-v1.patch
>
>
> Currently we must first upload all the data to HDFS before we can analyze it 
> with Hadoop MapReduce.
> Sometimes the upload takes a long time, so if we can add input while a job 
> is running, that time can be saved.
> WHAT?
> Client:
> a) hadoop job -add-input jobId inputFormat ...
> Add the input to the job.
> b) hadoop job -add-input done
> Tell the JobTracker that all the input has been added.
> c) hadoop job -add-input status jobid
> Show how many inputs the job has.
> HOWTO?
> Mainly, I think we should do three things:
> 1. JobClient: JobClient should support adding input to a job; that is, 
> JobClient generates the splits and submits them to the JobTracker.
> 2. JobTracker: JobTracker should support addInput and add the new tasks to the 
> original mapTasks. Because the uploaded data will be 
> processed quickly, the scheduler should also be updated to support keeping 
> map tasks pending until the client reports that the job's input is done.
> 3. Reducer: the reducer should also update its map count (mapNums), so that it 
> shuffles correctly.
> This is a rough idea, and I will update it.
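
The three steps above can be illustrated with a small, self-contained Java sketch. It is a toy in-memory model, not code from the attached patch and not Hadoop's API: the class DynamicInputJob and its methods addInput, markInputDone, numMapTasks and mapPhaseComplete are hypothetical names chosen for illustration, and splits are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a job whose input can grow while it runs (hypothetical, not Hadoop's API). */
class DynamicInputJob {
    // One entry per input split; each split corresponds to one map task.
    private final List<String> mapTasks = new ArrayList<>();
    // Set once the client has reported that no more input will be added.
    private boolean inputClosed = false;

    /** Steps 1 and 2: the client submits new splits and the tracker appends one map task per split. */
    synchronized void addInput(List<String> splits) {
        if (inputClosed) {
            throw new IllegalStateException("input has already been marked done");
        }
        mapTasks.addAll(splits);
    }

    /** The client reports that all input has been added. */
    synchronized void markInputDone() {
        inputClosed = true;
    }

    /** Step 3: the number of map outputs a reducer currently has to expect. */
    synchronized int numMapTasks() {
        return mapTasks.size();
    }

    /** The map phase is complete only when the input is closed and every known map has finished. */
    synchronized boolean mapPhaseComplete(int finishedMaps) {
        return inputClosed && finishedMaps == mapTasks.size();
    }
}
```

In this model the map count seen by the reducers only becomes final after markInputDone() is called, which is why the map phase cannot be treated as complete before then, even if every map task known so far has finished.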

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
