[ https://issues.apache.org/jira/browse/HADOOP-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alejandro Abdelnur updated HADOOP-3258:
---------------------------------------
Attachment: patch-3258.txt
> FileOutputFormat should have a method to create custom files under the
> outputdir with a unique name per task to avoid name collision
> ------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-3258
> URL: https://issues.apache.org/jira/browse/HADOOP-3258
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Environment: all
> Reporter: Alejandro Abdelnur
> Assignee: Alejandro Abdelnur
> Fix For: 0.18.0
>
> Attachments: patch-3258.txt
>
>
> Currently, if M/R code creates a file, it is the responsibility of that code
> to avoid file name collisions between different tasks.
> Hadoop should provide an API that creates unique file names based on the task
> type (map or reduce) and the task ID, similar to how the part-##### output
> files are named.
> The proposed patch adds 2 static methods to the {{FileOutputFormat}} class:
> {noformat}
> public static String getUniqueName(JobConf conf, String name);
> public static Path getPathForCustomFile(JobConf conf, String name);
> {noformat}
> The first one adds the task type and task ID to the given name.
> The second returns a Path to a file in the working outputdir, using a file name
> namespaced by the first method.
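
The naming idea described above can be sketched without any Hadoop
dependency. This is only an illustration, not the code in patch-3258.txt:
the class name UniqueNameSketch is hypothetical, the task type and partition
number are passed in directly instead of being read from a JobConf, plain
strings stand in for Path, and the "name-m-00003" layout is an assumption
modeled on the part-##### convention mentioned in the description.

```java
import java.util.Locale;

// Hypothetical stand-in for the two proposed FileOutputFormat methods.
// The real methods take a JobConf; here the task details are explicit
// parameters so the sketch runs on its own.
final class UniqueNameSketch {

    // Appends the task type ("m" for map, "r" for reduce) and a zero-padded
    // partition number to the given base name. The exact layout is assumed.
    static String getUniqueName(boolean isMapTask, int partition, String name) {
        String taskType = isMapTask ? "m" : "r";
        return String.format(Locale.ROOT, "%s-%s-%05d", name, taskType, partition);
    }

    // Builds a path under the task's working output directory using the
    // namespaced file name from getUniqueName.
    static String getPathForCustomFile(String workOutputDir, boolean isMapTask,
                                       int partition, String name) {
        return workOutputDir + "/" + getUniqueName(isMapTask, partition, name);
    }

    public static void main(String[] args) {
        // Two tasks writing a file called "stats" end up with distinct names.
        System.out.println(getUniqueName(true, 3, "stats"));
        System.out.println(getPathForCustomFile("/out/_temporary", false, 12, "stats"));
    }
}
```

Because the task type and number are part of the name, a map task and a
reduce task (or two map tasks) that both ask for "stats" never write to the
same file in the output directory.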