[ 
https://issues.apache.org/jira/browse/HADOOP-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609801#action_12609801
 ] 

Amareshwari Sriramadasu commented on HADOOP-3150:
-------------------------------------------------

Today we promote task output for every job that has mapred.output.dir defined. 
So one more case that needs to be handled here is applications that create side 
files while using an OutputFormat that is not an instance of FileOutputFormat. 
For example, Hadoop archives use NullOutputFormat as their output format, but 
the archive itself is created from the task's side files. Thus, moving the task 
commit into the OutputFormat would ignore the side files. Likewise, with 
setupJob and setupTask moved into the OutputFormat, the facility to create side 
files would be removed.

To support creating side files even when the OutputFormat is not a 
FileOutputFormat, we can add a SideFileOutputFormat that extends 
FileOutputFormat. If the job's OutputFormat is not a FileOutputFormat and 
mapred.output.dir is defined, the framework will instantiate 
SideFileOutputFormat. 
The following APIs in FileOutputFormat will be moved to SideFileOutputFormat:
{noformat}
static void setWorkOutputPath(JobConf conf, Path outputDir)
public static Path getWorkOutputPath(JobConf conf) 
{noformat}

SideFileOutputFormat.getRecordWriter() will use TextOutputFormat's RecordWriter.

Finally, the task commit will consist of the commit of the job's OutputFormat 
plus the commit of SideFileOutputFormat (if the job's OutputFormat is not a 
FileOutputFormat). 
Thoughts?
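
For illustration, the selection and commit rule proposed above could be sketched 
roughly as follows. This is only a sketch under assumptions: every type here 
(OutputFormatSketch, FileOutputFormatSketch, NullOutputFormatSketch, 
SideFileOutputFormatSketch, SideFileSketch) is a hypothetical stand-in rather 
than an actual Hadoop class, and the output directory is passed as a plain 
String instead of being read from a JobConf.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Hadoop classes.
interface OutputFormatSketch {
    String name();
}

class FileOutputFormatSketch implements OutputFormatSketch {
    public String name() { return "FileOutputFormat"; }
}

class NullOutputFormatSketch implements OutputFormatSketch {
    public String name() { return "NullOutputFormat"; }
}

// Stand-in for the proposed SideFileOutputFormat, which extends FileOutputFormat.
class SideFileOutputFormatSketch extends FileOutputFormatSketch {
    @Override
    public String name() { return "SideFileOutputFormat"; }
}

final class SideFileSketch {
    // True when the job's format is not a FileOutputFormat but an output
    // directory (the value of mapred.output.dir) is defined.
    static boolean needsSideFileFormat(OutputFormatSketch jobFormat, String outputDir) {
        return outputDir != null && !(jobFormat instanceof FileOutputFormatSketch);
    }

    // Names of the formats whose commit would make up the task commit:
    // the job's own format first, then the side-file format if it is needed.
    static List<String> formatsToCommit(OutputFormatSketch jobFormat, String outputDir) {
        List<String> result = new ArrayList<>();
        result.add(jobFormat.name());
        if (needsSideFileFormat(jobFormat, outputDir)) {
            result.add(new SideFileOutputFormatSketch().name());
        }
        return result;
    }

    public static void main(String[] args) {
        // Archive-like job: NullOutputFormat with an output directory defined.
        System.out.println(formatsToCommit(new NullOutputFormatSketch(), "/user/out"));
        // Ordinary file-based job: no extra side-file format is needed.
        System.out.println(formatsToCommit(new FileOutputFormatSketch(), "/user/out"));
        // No output directory defined: nothing to promote besides the job format.
        System.out.println(formatsToCommit(new NullOutputFormatSketch(), null));
    }
}
```

In this sketch, only the archive-like case (a non-file format with an output 
directory defined) picks up the additional side-file commit.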


> Move task file promotion into the task
> --------------------------------------
>
>                 Key: HADOOP-3150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3150
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>             Fix For: 0.19.0
>
>         Attachments: 3150.patch
>
>
> We need to move the task file promotion from the JobTracker to the Task and 
> move it down into the output format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
