steveloughran commented on code in PR #6716: URL: https://github.com/apache/hadoop/pull/6716#discussion_r1584877464
########## hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md: ########## @@ -523,7 +509,7 @@ And optional settings for debugging/performance analysis ``` spark.hadoop.mapreduce.outputcommitter.factory.scheme.abfs org.apache.hadoop.fs.azurebfs.commit.AzureManifestCommitterFactory -spark.hadoop.fs.azure.io.rate.limit 10000 +spark.hadoop.fs.azure.io.rate.limit 1000 Review Comment: good q. The default allocation for the entire cluster is 10K; this ensures that a single commit only uses at most 10% of it during renames. in contrast, s3 throttles by shard so only things down the same directory tree were impacted, and if they were only reading data, not throttled at all. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org