Aitozi created FLINK-29308:
------------------------------
Summary: NoResourceAvailableException fails the batch job
Key: FLINK-29308
URL: https://issues.apache.org/jira/browse/FLINK-29308
Project: Flink
Issue Type: Improvement
Components: Runtime / Coordination
Reporter: Aitozi
When running a batch job configured with the following restart strategy:
{code:java}
restart-strategy: fixed-delay
restart-strategy.fixed-delay.delay: 15 s
restart-strategy.fixed-delay.attempts: 10
{code}
If the cluster does not have enough resources to run a whole stage at once, it
can still run part of the stage, but the job will fail after 10
{{NoResourceAvailableException}} failures. IMO, for a batch job a
{{NoResourceAvailableException}} does not necessarily need to fail the job.
Or at least this failure reason should not share the same restart strategy as
other failure reasons.
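
To make the proposal concrete, here is a rough sketch of the behaviour I have in mind. It is not Flink code: {{RestartPolicy}}, {{canRestartAfter}} and {{isResourceShortage}} are hypothetical names, and the nested {{NoResourceAvailableException}} is only a stand-in for the real exception so that the example compiles on its own. The idea is that a resource shortage allows a retry without consuming one of the configured attempts, while every other failure is counted as before.

```java
import java.time.Duration;

public class ResourceAwareRestartSketch {

    /** Stand-in for Flink's exception, declared here only so the sketch is self-contained. */
    static class NoResourceAvailableException extends RuntimeException {
        NoResourceAvailableException(String message) {
            super(message);
        }
    }

    /** Hypothetical fixed-delay policy that counts resource shortages separately. */
    static class RestartPolicy {
        private final int maxAttempts;   // mirrors restart-strategy.fixed-delay.attempts
        private final Duration delay;    // mirrors restart-strategy.fixed-delay.delay
        private int countedFailures = 0; // failures that consume an attempt
        private int resourceWaits = 0;   // resource shortages, tracked but not counted

        RestartPolicy(int maxAttempts, Duration delay) {
            this.maxAttempts = maxAttempts;
            this.delay = delay;
        }

        /** Returns true if the job may be restarted after the given failure. */
        boolean canRestartAfter(Throwable failure) {
            if (isResourceShortage(failure)) {
                resourceWaits++;
                return true; // retry later without consuming an attempt
            }
            countedFailures++;
            return countedFailures <= maxAttempts;
        }

        /** Walks the cause chain looking for a resource shortage. */
        static boolean isResourceShortage(Throwable failure) {
            for (Throwable t = failure; t != null; t = t.getCause()) {
                if (t instanceof NoResourceAvailableException) {
                    return true;
                }
            }
            return false;
        }

        Duration delay() {
            return delay;
        }

        int resourceWaits() {
            return resourceWaits;
        }
    }

    public static void main(String[] args) {
        RestartPolicy policy = new RestartPolicy(10, Duration.ofSeconds(15));

        // Far more shortages than the configured attempts: the job keeps waiting.
        boolean stillAlive = true;
        for (int i = 0; i < 25; i++) {
            stillAlive &= policy.canRestartAfter(new NoResourceAvailableException("no slots"));
        }
        System.out.println("restartable after 25 shortages: " + stillAlive);

        // Other failures still exhaust the attempts as they do today.
        boolean canRestart = true;
        int failures = 0;
        while (canRestart) {
            canRestart = policy.canRestartAfter(new IllegalStateException("task failed"));
            failures++;
        }
        System.out.println("job failed on ordinary failure number " + failures);
    }
}
```

An unbounded wait is probably not what everyone wants either, so a real change would likely need a separate limit or timeout for the resource case. That is left out of the sketch on purpose.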