[ 
https://issues.apache.org/jira/browse/HADOOP-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642383#action_12642383
 ] 

Chris K Wensel commented on HADOOP-4510:
----------------------------------------

We would prefer it to be public because we write through FileOutputFormat via 
a RecordWriter, which internally (magically) appends the temp path and the 
task id path to the intended output path. 

This is done so that speculative execution will succeed. We would like to keep 
benefiting from this behavior, so we aren't asking for it to change.

The side effect is that we have no way of finding where the data was actually 
written, so we cannot move it to where it was intended to be written. 
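
To make the layout concrete, here is a rough sketch in plain Java. It is not 
Hadoop code. The "_temporary" directory name, the "_" prefix on the attempt 
id, and every class and method name below are assumptions for illustration 
only. It shows why two speculative attempts never collide, and why a caller 
that only knows the intended path cannot locate the written data.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative model only; this is NOT the Hadoop API.
final class TaskOutputPathSketch {
    // Assumed name of the temporary directory placed under the intended output dir.
    static final String TEMP_DIR = "_temporary";

    // Where a task attempt actually writes:
    //   <intended>/<temp dir>/_<attempt id>/<file>
    static Path taskOutputPath(Path intendedDir, String attemptId, String fileName) {
        return intendedDir.resolve(TEMP_DIR).resolve("_" + attemptId).resolve(fileName);
    }

    // Where the file is meant to end up once an attempt succeeds:
    //   <intended>/<file>
    static Path finalOutputPath(Path intendedDir, String fileName) {
        return intendedDir.resolve(fileName);
    }

    public static void main(String[] args) {
        Path intended = Paths.get("out");  // hypothetical output directory
        String attemptA = "attempt_0";     // hypothetical attempt ids for the
        String attemptB = "attempt_1";     // same task, run speculatively

        // Each attempt writes to its own private location...
        System.out.println(taskOutputPath(intended, attemptA, "part-00000"));
        System.out.println(taskOutputPath(intended, attemptB, "part-00000"));
        // ...and only the winner's file is moved to the intended location.
        System.out.println(finalOutputPath(intended, "part-00000"));
    }
}
```

Without an accessor for the first of these two paths, code outside the 
output format has to guess at the layout, which is the gap described here.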

Since we don't have multiple (named) output collectors, we must emulate that 
behavior through our own API.

> FileOutputFormat protects getTaskOutputPath
> -------------------------------------------
>
>                 Key: HADOOP-4510
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4510
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.19.0
>            Reporter: Chris K Wensel
>            Priority: Blocker
>         Attachments: hadoop-4510.patch
>
>
> o.a.h.m.FileOutputFormat#getTaskOutputPath() is protected. 
> Having access to a task output directory as used internally by RecordWriters 
> is quite handy. This is especially true if the user is attempting to 
> serialize out data in a similar fashion as the output collector.

