[ https://issues.apache.org/jira/browse/SPARK-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Wendell resolved SPARK-3595.
------------------------------------
          Resolution: Fixed
       Fix Version/s: 1.2.0
    Target Version/s: 1.2.0

Thanks, I've merged this into master. We can consider merging this into 1.1 as well later on. I decided not to do that yet because we've often found that changes around Hadoop configurations can produce unanticipated regressions. So let's see how this fares in master, and if there is a lot of demand we can backport the fix once it has been stable in master for a while.

> Spark should respect configured OutputCommitter when using saveAsHadoopFile
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-3595
>                 URL: https://issues.apache.org/jira/browse/SPARK-3595
>             Project: Spark
>          Issue Type: Improvement
>    Affects Versions: 1.1.0
>            Reporter: Ian Hummel
>            Assignee: Ian Hummel
>             Fix For: 1.2.0
>
> When calling {{saveAsHadoopFile}}, Spark hardcodes the OutputCommitter to be
> a {{FileOutputCommitter}}.
> When using Spark on an EMR cluster to process and write files to/from S3, the
> default Hadoop configuration uses a {{DirectFileOutputCommitter}} to avoid
> writing to a temporary directory and doing a copy.
> Will submit a patch via GitHub shortly.
> Cheers,

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
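For context, a minimal sketch of how a custom committer would be supplied once this fix is in place (Spark 1.2.0+): the committer class is set on the {{JobConf}} passed to {{saveAsHadoopFile}}, and Spark picks it up instead of hardcoding {{FileOutputCommitter}}. The committer class name below is illustrative only; this fragment assumes an existing {{SparkContext}} (`sc`) and an RDD of key/value pairs, and needs a Spark runtime to execute.

```scala
import org.apache.hadoop.mapred.JobConf
import org.apache.hadoop.mapred.TextOutputFormat
import org.apache.hadoop.io.{NullWritable, Text}

// Start from the cluster's Hadoop configuration so site defaults are preserved.
val jobConf = new JobConf(sc.hadoopConfiguration)

// Configure the OutputCommitter via the standard Hadoop property.
// "com.example.DirectFileOutputCommitter" is a placeholder class name;
// on EMR the equivalent committer ships with the platform's Hadoop build.
jobConf.set("mapred.output.committer.class",
  "com.example.DirectFileOutputCommitter")

// With SPARK-3595 applied, saveAsHadoopFile uses the committer configured
// above instead of a hardcoded FileOutputCommitter.
rdd.saveAsHadoopFile(
  "s3://my-bucket/output",          // placeholder output path
  classOf[NullWritable],
  classOf[Text],
  classOf[TextOutputFormat[NullWritable, Text]],
  jobConf)
```

Before this change, the {{mapred.output.committer.class}} setting was silently ignored by {{saveAsHadoopFile}}, which is why direct-to-S3 writes on EMR still went through a temporary directory and a copy.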