[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908313#action_12908313
 ] 

luoli commented on MAPREDUCE-1434:
----------------------------------

Of course we would like this patch to be reviewed; this feature has already 
proved very helpful for us. People who run into problems similar to ours 
can solve them with this dynamic job input feature. ShiXing, maybe you could 
submit a new patch for this that follows the code conventions. What do you 
think?

> Dynamic add input for one job
> -----------------------------
>
>                 Key: MAPREDUCE-1434
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1434
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 0.20.3
>            Reporter: Xing Shi
>             Fix For: 0.20.3
>
>         Attachments: dynamic_input-v1.patch
>
>
> Currently we must first upload the data to HDFS before we can analyze it 
> with Hadoop MapReduce.
> Sometimes the upload takes a long time. If we could add input while a job 
> is running, that time could be saved.
> WHAT?
> Client:
> a) hadoop job -add-input jobId inputFormat ...
> Add the input to the job identified by jobId.
> b) hadoop job -add-input done
> Tell the JobTracker that all input has been added.
> c) hadoop job -add-input status jobid
> Show how many inputs the job has.
> HOWTO?
> Mainly, I think we should do three things:
> 1. JobClient: JobClient should support adding input to a job; that is, 
> JobClient generates the splits and submits them to the JobTracker.
> 2. JobTracker: JobTracker should support addInput and add the new tasks to 
> the original mapTasks. Because the uploaded data will be processed quickly, 
> the scheduler should also be updated to keep map tasks pending until the 
> client tells the job that the input is done.
> 3. Reducer: the reducer should also update mapNums so that it shuffles 
> correctly.
> This is the rough idea, and I will update it.
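
The bookkeeping described in the three points above can be sketched as a 
small in-memory model. This is only an illustration and is not taken from 
dynamic_input-v1.patch: the class DynamicInputJob and all of its methods are 
hypothetical names, splits are represented as plain strings, and no real 
Hadoop API is used. It assumes that "pending" means the map phase stays open 
until the client reports that the input is done.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of a job that accepts input while running; not Hadoop code. */
class DynamicInputJob {
    private final String jobId;
    private final Queue<String> queuedSplits = new ArrayDeque<>();
    private int totalMaps = 0;
    private boolean inputDone = false;

    DynamicInputJob(String jobId) {
        this.jobId = jobId;
    }

    /** Models "-add-input jobId ...": each new split becomes one more map task. */
    synchronized int addInput(List<String> splits) {
        if (inputDone) {
            throw new IllegalStateException("input already closed for " + jobId);
        }
        queuedSplits.addAll(splits);
        totalMaps += splits.size();
        return totalMaps;
    }

    /** Models "-add-input done": no further input will arrive. */
    synchronized void inputDone() {
        inputDone = true;
    }

    /** Models "-add-input status": number of map tasks registered so far. */
    synchronized int getTotalMaps() {
        return totalMaps;
    }

    /** Scheduler side: next split to run, or null if none is queued right now. */
    synchronized String pollSplit() {
        return queuedSplits.poll();
    }

    /** Reducer side: the map count may still grow until the input is closed. */
    synchronized boolean isMapCountFinal() {
        return inputDone;
    }

    /** True only when the input is closed and every queued split was handed out. */
    synchronized boolean isMapPhaseDrained() {
        return inputDone && queuedSplits.isEmpty();
    }

    public static void main(String[] args) {
        DynamicInputJob job = new DynamicInputJob("job_0001"); // made-up job id
        job.addInput(Arrays.asList("part-00000", "part-00001"));
        job.addInput(Arrays.asList("part-00002"));
        System.out.println("maps so far: " + job.getTotalMaps());
        job.inputDone();
        System.out.println("map count final: " + job.isMapCountFinal());
    }
}
```

In this model an empty queue does not end the map phase by itself: 
isMapPhaseDrained() stays false until inputDone() has been called, which is 
the condition a reducer would need before it can trust the map count.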

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
