[ https://issues.apache.org/jira/browse/SPARK-32701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184729#comment-17184729 ]
Apache Spark commented on SPARK-32701:
--------------------------------------

User 'waleedfateem' has created a pull request for this issue:
https://github.com/apache/spark/pull/29541

> mapreduce.fileoutputcommitter.algorithm.version default depends on runtime environment
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-32701
>                 URL: https://issues.apache.org/jira/browse/SPARK-32701
>             Project: Spark
>          Issue Type: Bug
>          Components: docs, Documentation
>    Affects Versions: 2.4.0, 3.0.0
>            Reporter: Waleed Fateem
>            Priority: Major
>
> Someone reading the documentation in its current state will assume that the
> default value of spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version
> is 1, and that's not entirely accurate.
> Spark doesn't explicitly set this configuration; instead, the value is
> inherited from Hadoop's FileOutputCommitter class. The default was 1 until
> Hadoop 3.0, where it changed to 2.
> I'm proposing that we clarify that this value's default depends on the
> Hadoop version in a user's runtime environment, where:
> 1 for < Hadoop 3.0
> 2 for >= Hadoop 3.0
> There are also plans to revert this default to v1 again, so it might also be
> useful to reference this JIRA:
> https://issues.apache.org/jira/browse/MAPREDUCE-7282
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
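
The version rule described in the issue can be sketched in a few lines of Python. This is only an illustration of the rule as the reporter states it (1 below Hadoop 3.0, 2 from Hadoop 3.0 onward); the helper names expected_committer_default and pinned_conf are hypothetical and are not part of Spark or Hadoop. It does not account for the possible revert tracked in MAPREDUCE-7282.

```python
# Hypothetical helpers illustrating the rule stated in SPARK-32701.
# They are not Spark or Hadoop APIs.

CONF_KEY = "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version"


def expected_committer_default(hadoop_version: str) -> int:
    """Return the algorithm version that applies when the setting is left
    unset, following the issue's rule: 1 for Hadoop < 3.0, 2 for >= 3.0."""
    # Only the major component matters for this rule, e.g. "2.7.4" -> 2.
    major = int(hadoop_version.split(".")[0])
    return 1 if major < 3 else 2


def pinned_conf(version: int = 1) -> dict:
    """Build a config mapping that sets the algorithm version explicitly,
    so the behavior no longer depends on the Hadoop version on the classpath."""
    if version not in (1, 2):
        raise ValueError("algorithm version must be 1 or 2")
    return {CONF_KEY: str(version)}


if __name__ == "__main__":
    for v in ("2.7.4", "3.0.0", "3.2.1"):
        print(v, expected_committer_default(v))
    print(pinned_conf(1))
```

To avoid relying on the inherited default at all, the same key can be set explicitly, for example with --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 on spark-submit.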