Ian Hummel created SPARK-3595:
---------------------------------

             Summary: Spark should respect configured OutputCommitter when using saveAsHadoopFile
                 Key: SPARK-3595
                 URL: https://issues.apache.org/jira/browse/SPARK-3595
             Project: Spark
          Issue Type: Improvement
    Affects Versions: 1.1.0
            Reporter: Ian Hummel

When calling {{saveAsHadoopFile}}, Spark hardcodes the OutputCommitter to {{FileOutputCommitter}}, ignoring any committer set in the Hadoop configuration. On an EMR cluster that reads from and writes to S3, the default Hadoop configuration uses a {{DirectFileOutputCommitter}}, which avoids writing to a temporary directory and then copying the output to its final location.

I will submit a patch via GitHub shortly.

Cheers,
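
For illustration only, the difference between the reported behaviour and the requested one can be sketched without any Spark or Hadoop dependency. This is a simplified model, not the actual patch: a plain {{Map}} stands in for the Hadoop {{JobConf}}, the class and method names are hypothetical, and {{example.DirectFileOutputCommitter}} is a placeholder class name, not the real EMR one. The property key {{mapred.output.committer.class}} is, to my knowledge, the one the old {{mapred}} API reads when resolving the committer.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model: a Map stands in for a Hadoop JobConf (assumption).
class CommitterSelection {
    // Property key used by the old mapred API to look up the committer class.
    static final String COMMITTER_KEY = "mapred.output.committer.class";
    static final String DEFAULT_COMMITTER =
            "org.apache.hadoop.mapred.FileOutputCommitter";

    // Behaviour described in the report: the configured value is overwritten.
    static String hardcodedCommitter(Map<String, String> conf) {
        conf.put(COMMITTER_KEY, DEFAULT_COMMITTER);
        return conf.get(COMMITTER_KEY);
    }

    // Requested behaviour: keep a configured committer, fall back to the
    // default only when none is set.
    static String respectConfiguredCommitter(Map<String, String> conf) {
        conf.putIfAbsent(COMMITTER_KEY, DEFAULT_COMMITTER);
        return conf.get(COMMITTER_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> emrLikeConf = new HashMap<>();
        // Hypothetical class name standing in for the EMR direct committer.
        emrLikeConf.put(COMMITTER_KEY, "example.DirectFileOutputCommitter");

        System.out.println("respecting config: "
                + respectConfiguredCommitter(new HashMap<>(emrLikeConf)));
        System.out.println("hardcoded:         "
                + hardcodedCommitter(new HashMap<>(emrLikeConf)));
    }
}
```

With the second method, a cluster-wide default such as the direct committer survives the call, while a configuration that sets nothing still ends up with {{FileOutputCommitter}}.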