I got this information from a Hadoop JIRA ticket: https://issues.apache.org/jira/browse/MAPREDUCE-5485

// maropu

On Sat, Oct 1, 2016 at 7:14 PM, Igor Berman <igor.ber...@gmail.com> wrote:

> Takeshi, why are you saying this? How have you checked that it's only used
> from 2.7.3?
> We use Spark 2.0, which ships with a Hadoop 2.7.2 dependency, and we use
> this setting.
> We've sort of "verified" that it's used by configuring logging for the file
> output committer.
>
> On 30 September 2016 at 03:12, Takeshi Yamamuro <linguin....@gmail.com>
> wrote:
>
>> Hi,
>>
>> FYI: It seems
>> `sc.hadoopConfiguration.set("mapreduce.fileoutputcommitter.algorithm.version", "2")`
>> is only available in hadoop-2.7.3+.
>>
>> // maropu
>>
>> On Thu, Sep 29, 2016 at 9:28 PM, joffe.tal <joffe....@gmail.com> wrote:
>>
>>> You can write to a partition explicitly by adding
>>> "/<col_name>=<partition value>" to the end of the path you are
>>> writing to and then using overwrite.
>>>
>>> BTW, in Spark 2.0 you just need to use:
>>>
>>> sc.hadoopConfiguration.set("mapreduce.fileoutputcommitter.algorithm.version", "2")
>>>
>>> and use s3a://
>>>
>>> and you can work with the regular output committer (actually,
>>> DirectParquetOutputCommitter is no longer available in Spark 2.0),
>>>
>>> so if you are planning on upgrading, this could be another motivation.
>>>
>>> --
>>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/S3-DirectParquetOutputCommitter-PartitionBy-SaveMode-Append-tp26398p27810.html
>>
>> --
>> ---
>> Takeshi Yamamuro
>

--
---
Takeshi Yamamuro
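
For reference, here is a minimal sketch of the setup discussed in this thread, assuming Spark 2.0 with the hadoop-aws module and S3 credentials already configured. The bucket name, table path, and partition column (`dt`) are hypothetical placeholders, not values from the thread. Since the thread disagrees about which Hadoop release honors the setting, it is worth checking the committer logs on your own cluster, as Igor describes.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object CommitterV2Example {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val spark = SparkSession.builder()
      .appName("committer-v2-example")
      .getOrCreate()

    // The setting quoted in the thread, applied to the Hadoop configuration.
    spark.sparkContext.hadoopConfiguration
      .set("mapreduce.fileoutputcommitter.algorithm.version", "2")

    // Small placeholder dataset; replace with the real DataFrame.
    val df = spark.range(10).toDF("id")

    // Write to one partition explicitly by appending "/<col_name>=<value>"
    // to the path and overwriting only that directory.
    // "my-bucket", "events" and "dt" are hypothetical names.
    df.write
      .mode(SaveMode.Overwrite)
      .parquet("s3a://my-bucket/events/dt=2016-10-01")

    spark.stop()
  }
}
```

The same property can also be passed at submit time as `--conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2`, because Spark copies `spark.hadoop.*` keys into the Hadoop configuration.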