steveloughran commented on code in PR #6716:
URL: https://github.com/apache/hadoop/pull/6716#discussion_r1584877464


##########
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/markdown/manifest_committer.md:
##########
@@ -523,7 +509,7 @@ And optional settings for debugging/performance analysis
 
 ```
 spark.hadoop.mapreduce.outputcommitter.factory.scheme.abfs 
org.apache.hadoop.fs.azurebfs.commit.AzureManifestCommitterFactory
-spark.hadoop.fs.azure.io.rate.limit 10000
+spark.hadoop.fs.azure.io.rate.limit 1000

Review Comment:
   good q. The default allocation for the entire cluster is 10K; this ensures 
that a single commit only uses at most 10% of it during renames. in contrast, 
s3 throttles by shard so only things down the same directory tree were 
impacted, and if they were only reading data, not throttled at all.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to