amaliujia commented on code in PR #10482:
URL: https://github.com/apache/ozone/pull/10482#discussion_r3393455898
##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/ratis/conf/RatisClientConfig.java:
##########
@@ -70,31 +70,31 @@ public class RatisClientConfig {
private String multilinearPolicy;
@Config(key = "hdds.ratis.client.exponential.backoff.base.sleep",
- defaultValue = "4s",
+ defaultValue = "1s",
type = ConfigType.TIME,
tags = { OZONE, CLIENT, PERFORMANCE },
description = "Specifies base sleep for exponential backoff retry policy."
- + " With the default base sleep of 4s, the sleep duration for ith"
- + " retry is min(4 * pow(2, i), max_sleep) * r, where r is "
+ + " With the default base sleep of 1s, the sleep duration for ith"
+ + " retry is min(1 * pow(2, i), max_sleep) * r, where r is "
+ "random number in the range [0.5, 1.5).")
- private Duration exponentialPolicyBaseSleep = Duration.ofSeconds(4);
+ private Duration exponentialPolicyBaseSleep = Duration.ofSeconds(1);
@Config(key = "hdds.ratis.client.exponential.backoff.max.sleep",
- defaultValue = "40s",
+ defaultValue = "5s",
type = ConfigType.TIME,
tags = { OZONE, CLIENT, PERFORMANCE },
description = "The sleep duration obtained from exponential backoff "
+ "policy is limited by the configured max sleep. Refer "
+ "dfs.ratis.client.exponential.backoff.base.sleep for further "
+ "details.")
- private Duration exponentialPolicyMaxSleep = Duration.ofSeconds(40);
+ private Duration exponentialPolicyMaxSleep = Duration.ofSeconds(5);
@Config(key = "hdds.ratis.client.exponential.backoff.max.retries",
- defaultValue = "2147483647",
+ defaultValue = "2",
type = ConfigType.INT,
tags = { OZONE, CLIENT, PERFORMANCE },
description = "Client's max retry value for the exponential backoff policy.")
- private int exponentialPolicyMaxRetries = Integer.MAX_VALUE;
+ private int exponentialPolicyMaxRetries = 2;
Review Comment:
Oh, sorry, I may have confused myself:
I think the retry settings here only control when to retry, and there is a
separate request timeout on top of that. So the updated numbers are:
1. The max total retry time window would be 2 * max_sleep * 1.5 + 2 *
write_request_time_out = 2 * 5 * 1.5 + 2 * 70 = 155 seconds.
2. The min total retry time window would be 1 * 2 * 0.5 + 1 * 4 * 0.5 + 2 *
write_request_time_out = 1 + 2 + 2 * 70 = 143 seconds.
After the initial timeout, there are around 143 to 155 seconds before giving
up, which seems to be sufficient.
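To double-check the arithmetic above, here is a minimal sketch that adds up
the window. It assumes each retry is a backoff sleep followed by a request
that runs for the full timeout, which is a simplification. The 70 s write
request timeout is just the value used in the comment, and the class and
method names are hypothetical, not part of Ozone or Ratis.

```java
/** Hypothetical helper: total retry window = sum of (sleep + request timeout). */
public class RetryWindowEstimate {

  /**
   * Adds up the window, assuming every retry sleeps first and then
   * waits for the full request timeout (a worst-case simplification).
   */
  static double windowSeconds(double[] sleepsSeconds, double requestTimeoutSeconds) {
    double total = 0;
    for (double sleep : sleepsSeconds) {
      total += sleep + requestTimeoutSeconds;
    }
    return total;
  }

  public static void main(String[] args) {
    double writeRequestTimeout = 70; // seconds, value taken from the comment

    // Min case: uncapped sleeps of 2s and 4s, each scaled by r = 0.5.
    double[] minSleeps = {2 * 0.5, 4 * 0.5};
    // Max case as in the comment: both sleeps bounded by max_sleep (5s) * 1.5.
    double[] maxSleeps = {5 * 1.5, 5 * 1.5};

    System.out.println("min window (s): " + windowSeconds(minSleeps, writeRequestTimeout));
    System.out.println("max window (s): " + windowSeconds(maxSleeps, writeRequestTimeout));
  }
}
```

With these inputs the sketch gives 143 and 155 seconds, matching the numbers
in the comment.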
##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/ratis/conf/RatisClientConfig.java:
##########
Review Comment:
So, with all the changes combined, in theory, for write requests:
1. The max total retry time window would be 2 * max_sleep * 1.5 = 2 * 5 * 1.5
= 15 seconds.
2. The min total retry time window would be 1 * 2 * 0.5 + 1 * 4 * 0.5 = 3
seconds.
Just curious whether a 3 to 15 second time window is enough to decide that
the leader is not reachable after the initial timeout?
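For reference, a small sketch of the sleep formula from the config
description, min(base * pow(2, i), max_sleep) * r with r in [0.5, 1.5). It
assumes the retry index i runs over 1 and 2, as in the arithmetic above.
Whether the actual policy starts counting at 0 or 1 is not confirmed here.
The class and method names are hypothetical.

```java
import java.time.Duration;

/** Hypothetical sketch of the backoff bounds; not the Ratis implementation. */
public class BackoffSleepBounds {
  // Jitter range [0.5, 1.5) as stated in the config description.
  static final double MIN_JITTER = 0.5;
  static final double MAX_JITTER = 1.5; // exclusive upper bound

  /** Un-jittered sleep for the i-th retry: min(base * 2^i, maxSleep), in ms. */
  static long cappedSleepMillis(Duration base, Duration maxSleep, int i) {
    long exponential = base.toMillis() * (1L << i);
    return Math.min(exponential, maxSleep.toMillis());
  }

  /** Total sleep over retries 1..maxRetries with one fixed jitter factor. */
  static double totalSleepSeconds(Duration base, Duration maxSleep,
      int maxRetries, double jitter) {
    long totalMillis = 0;
    // Assumption: i starts at 1, following the comment's arithmetic.
    for (int i = 1; i <= maxRetries; i++) {
      totalMillis += cappedSleepMillis(base, maxSleep, i);
    }
    return totalMillis * jitter / 1000.0;
  }

  public static void main(String[] args) {
    Duration base = Duration.ofSeconds(1);     // new default base sleep
    Duration maxSleep = Duration.ofSeconds(5); // new default max sleep
    int maxRetries = 2;                        // new default max retries

    System.out.println("min total sleep (s): "
        + totalSleepSeconds(base, maxSleep, maxRetries, MIN_JITTER));
    System.out.println("formula upper bound (s): "
        + totalSleepSeconds(base, maxSleep, maxRetries, MAX_JITTER));
    System.out.println("cap-based upper bound (s): "
        + maxRetries * maxSleep.getSeconds() * MAX_JITTER);
  }
}
```

Under that assumption the minimum is 3 seconds, as above. The 15 seconds
figure is the bound obtained by treating every sleep as already capped at
max_sleep. Applying the formula to i = 1, 2 gives sleeps of 2s and 4s before
jitter, so the total would stay below 9 seconds.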
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]