[
https://issues.apache.org/jira/browse/MAPREDUCE-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated MAPREDUCE-7331:
--------------------------------------
Issue Type: New Feature (was: Bug)
> Make temporary directory used by FileOutputCommitter configurable
> -----------------------------------------------------------------
>
> Key: MAPREDUCE-7331
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7331
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: mrv2
> Affects Versions: 3.0.0
> Environment: CDH 6.2.1 Hadoop 3.0.0
> Reporter: Bimalendu Choudhary
> Priority: Major
>
> Spark SQL applications use FileOutputCommitter to commit and merge their files
> under a table directory. Because the directory name is hardcoded as
> PENDING_DIR_NAME = _temporary, multiple applications end up using the same
> temporary directory. This causes unwanted results: one application interferes
> with another application's temporary files, and one application can end up
> deleting the temporary files of another. There is currently no way for
> applications to have their own unique path for temporary files and so avoid
> interference from other, totally independent applications. I think the
> temporary directory used by FileOutputCommitter should be made configurable,
> so that the caller can supply its own unique value as required and keep its
> files from being deleted or overwritten by other applications.
> Something like:
> {quote}public static final String PENDING_DIR_NAME_DEFAULT = "_temporary";
> public static final String PENDING_DIR_NAME =
> "mapreduce.fileoutputcommitter.tempdir";
> {quote}
>
> Spark applications could use this to handle stage failures, where temporary
> directories left over from previous attempts cause problems, and in other
> similar situations.
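
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the FileOutputCommitter implementation: a plain Map stands in for Hadoop's Configuration, the class and method names (PendingDirSketch, resolvePendingDirName, pendingPath) are hypothetical, and "mapreduce.fileoutputcommitter.tempdir" is the key proposed in this issue rather than an existing Hadoop property. The key constant is named PENDING_DIR_NAME here as an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a Map<String, String> stands in for Hadoop's Configuration.
class PendingDirSketch {
    // Directory name that is hardcoded today; kept as the fallback.
    static final String PENDING_DIR_NAME_DEFAULT = "_temporary";
    // Key proposed in this issue (assumption: not an existing Hadoop property).
    static final String PENDING_DIR_NAME = "mapreduce.fileoutputcommitter.tempdir";

    // Return the configured directory name, or the default if unset or blank.
    static String resolvePendingDirName(Map<String, String> conf) {
        String name = conf.get(PENDING_DIR_NAME);
        if (name == null || name.trim().isEmpty()) {
            return PENDING_DIR_NAME_DEFAULT;
        }
        return name.trim();
    }

    // Build the temporary path under the job's output directory.
    static String pendingPath(String outputDir, Map<String, String> conf) {
        String base = outputDir.endsWith("/")
                ? outputDir.substring(0, outputDir.length() - 1)
                : outputDir;
        return base + "/" + resolvePendingDirName(conf);
    }

    public static void main(String[] args) {
        Map<String, String> appA = new HashMap<>();
        Map<String, String> appB = new HashMap<>();
        // Each application supplies its own unique value (names are made up).
        appA.put(PENDING_DIR_NAME, "_temporary_appA");
        appB.put(PENDING_DIR_NAME, "_temporary_appB");

        System.out.println(pendingPath("/warehouse/table", appA));
        System.out.println(pendingPath("/warehouse/table", appB));
        // With no setting, the current behaviour is preserved.
        System.out.println(pendingPath("/warehouse/table", new HashMap<>()));
    }
}
```

With a unique value per application, two jobs writing to the same table directory would no longer share a temporary directory, while jobs that set nothing would keep today's _temporary behaviour.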