liyude-tw commented on code in PR #26809:
URL: https://github.com/apache/flink/pull/26809#discussion_r2217918195
##########
flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnConfigOptions.java:
##########
@@ -110,16 +110,16 @@ public class YarnConfigOptions {
public static final ConfigOption<Long>
APPLICATION_ATTEMPT_FAILURE_VALIDITY_INTERVAL =
key("yarn.application-attempt-failures-validity-interval")
.longType()
- .defaultValue(10000L)
+ .defaultValue(-1L)
Review Comment:
Below is the reasoning that led me to propose -1 and how I believe the
change is safer and less surprising than the current default.
1. Few users intentionally depend on the current 10 s window
The 10 s sliding window was introduced in PR #8400 by re-using the
then-default Akka timeout. It wasn’t added to satisfy a concrete production
need, so I think almost no one relies on it on purpose. Most users discover it
only after being surprised by extra restarts.
2. Hadoop YARN’s own default is -1 (global counting)
Because Flink runs as a YARN ApplicationMaster, aligning with the upstream
default reduces the cognitive overhead for operators who administer both
systems.
3. The documentation and common intuition both imply “global counting”
The description of yarn.application-attempts naturally suggests a total
attempt limit. A hidden time window can therefore be surprising.
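To make the difference between the two defaults concrete, here is a minimal sketch of the counting rule as I understand it. It is a simplified model, not YARN's or Flink's actual code: the class `AttemptWindowSketch`, its method names, and the choice to treat any non-positive interval as "global counting" are assumptions made for illustration only.

```java
import java.util.List;

/** Simplified, hypothetical model of attempt-failure counting; not YARN's implementation. */
final class AttemptWindowSketch {

    /**
     * Counts the failures that contribute toward the attempt limit.
     * Assumption: a non-positive interval means every failure counts (global counting).
     */
    static long countedFailures(List<Long> failureTimesMs, long nowMs, long validityIntervalMs) {
        if (validityIntervalMs <= 0) {
            return failureTimesMs.size();
        }
        // Sliding window: only failures within the last validityIntervalMs are counted.
        return failureTimesMs.stream()
                .filter(t -> nowMs - t <= validityIntervalMs)
                .count();
    }

    /** Returns true if the counted failures are still below the attempt limit. */
    static boolean mayRestart(
            List<Long> failureTimesMs, long nowMs, long validityIntervalMs, int maxAttempts) {
        return countedFailures(failureTimesMs, nowMs, validityIntervalMs) < maxAttempts;
    }

    public static void main(String[] args) {
        // Three failures, 30 s apart, evaluated at the time of the last one; limit of 2.
        List<Long> failures = List.of(0L, 30_000L, 60_000L);
        long now = 60_000L;
        System.out.println(mayRestart(failures, now, 10_000L, 2)); // 10 s window
        System.out.println(mayRestart(failures, now, -1L, 2));     // global counting
    }
}
```

In this model, with the 10 s window only the most recent failure is counted, so the application keeps restarting even though the attempt limit of 2 has nominally been exceeded. With -1 all three failures count and the limit is enforced as a total.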
### Risk-mitigation proposal
1. Upgrade guide
Add the following note in the upgrade section for this release:
> Starting with this release,
> `yarn.application-attempt-failures-validity-interval` defaults to -1 (global
> counting).
> Clusters that benefit from the previous 10 s sliding window can retain the
> old behaviour by adding
> `yarn.application-attempt-failures-validity-interval: 10000`
2. Release notes
Repeat the same notice and example so that operators can quickly restore the
former setting if needed.