[ https://issues.apache.org/jira/browse/HADOOP-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas updated HADOOP-14660:
----------------------------
    Status: Patch Available  (was: Open)

Re-submitting HADOOP-14660-003.patch due to a transient build failure.

> wasb: improve throughput by 34% when account limit exceeded
> -----------------------------------------------------------
>
>                 Key: HADOOP-14660
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14660
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/azure
>            Reporter: Thomas
>            Assignee: Thomas
>         Attachments: HADOOP-14660-001.patch, HADOOP-14660-002.patch,
>                      HADOOP-14660-003.patch
>
>
> Big data workloads frequently exceed the Azure Storage max ingress and egress
> limits
> (https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits).
> For example, the max ingress limit for a GRS account in the United States is
> currently 10 Gbps. When the limit is exceeded, the Azure Storage service
> fails a percentage of incoming requests, and this causes the client to
> initiate the retry policy. The retry policy delays requests by sleeping, but
> the sleep duration is independent of the client throughput and account limit.
> This results in low throughput, due to the high number of failed requests
> and the thrashing caused by the retry policy.
>
> To fix this, we introduce a client-side throttle which minimizes failed
> requests and maximizes throughput. Tests have shown that this improves
> throughput by ~34% when the storage account max ingress and/or egress limits
> are exceeded.
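
The client-side throttle idea described in the issue can be illustrated with a minimal Java sketch. This is not the code in the attached patches: the class name, the window-based error-rate check, and all constants (10 ms initial delay, 1000 ms cap, 1.5x and 0.9x factors) are hypothetical choices made only for illustration. The point it shows is that the delay is derived from the observed failure rate instead of being a fixed retry sleep.

```java
/**
 * Hypothetical sketch of an adaptive client-side throttle.
 * Not the HADOOP-14660 implementation; names and constants are assumptions.
 */
final class ClientThrottleSketch {
    private static final long INITIAL_SLEEP_MS = 10;  // assumed first back-off step
    private static final long MAX_SLEEP_MS = 1000;    // assumed upper bound on the delay
    private static final double INCREASE = 1.5;       // assumed growth factor on failures
    private static final double DECREASE = 0.9;       // assumed decay factor on clean windows

    private long sleepMs = 0;
    private long successes = 0;
    private long failures = 0;

    /** Record the outcome of one storage request in the current window. */
    synchronized void record(boolean succeeded) {
        if (succeeded) {
            successes++;
        } else {
            failures++;
        }
    }

    /**
     * Close the current observation window and recompute the delay.
     * Any failure in the window increases the delay; a window with only
     * successes decreases it; an empty window leaves it unchanged.
     */
    synchronized long endWindow() {
        long total = successes + failures;
        if (total > 0) {
            if (failures > 0) {
                long grown = (long) Math.ceil(sleepMs * INCREASE);
                sleepMs = Math.min(MAX_SLEEP_MS, Math.max(INITIAL_SLEEP_MS, grown));
            } else {
                sleepMs = (long) (sleepMs * DECREASE);
            }
        }
        successes = 0;
        failures = 0;
        return sleepMs;
    }

    /** Current delay applied before each request, in milliseconds. */
    synchronized long currentSleepMs() {
        return sleepMs;
    }

    /** Call before issuing a request; sleeps for the current delay, if any. */
    void throttle() throws InterruptedException {
        long delay = currentSleepMs();
        if (delay > 0) {
            Thread.sleep(delay);
        }
    }
}
```

In this sketch a caller would invoke throttle() before each request, record(...) after it completes, and endWindow() on a timer. A real throttle would more likely scale the delay with the measured error rate and bytes transferred, which this example deliberately leaves out.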