Martijn Visser created FLINK-40070:
--------------------------------------
Summary: DynamicParameterITCase hangs when the JobManager startup
banner rolls to a numbered log file
Key: FLINK-40070
URL: https://issues.apache.org/jira/browse/FLINK-40070
Project: Flink
Issue Type: Bug
Components: Test Infrastructure, Tests
Affects Versions: 2.4.0
Reporter: Martijn Visser
Assignee: Martijn Visser
Observed in
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=76627&view=results
(leg: e2e_4_ci); previously seen in build 75992.
{{DynamicParameterITCase}} starts a JobManager via jobmanager.sh and reads the
startup banner (program arguments, classpath) back from the distribution logs
to assert how dynamic parameters were passed through. The distribution log4j
configuration rolls the log file on startup ({{OnStartupTriggeringPolicy}}), so
the banner frequently lands in a rolled file (for example
{{flink-...-standalonesession-1-host.log.1}}), which
{{FlinkDistribution.searchAllLogs}} deliberately skips.
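To make the skip concrete, here is a minimal Java sketch. It is not the actual {{FlinkDistribution}} code: the class name, the {{searchLogs}} signature, the {{includeRolled}} flag and the file-name patterns are all assumptions for illustration. By default it matches only live files ending in .log, so a banner that landed in a .log.1 file is never seen unless rolled files are explicitly included.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RolledLogSearchSketch {

    // Assumed naming: live files end in ".log", rolled files in ".log.<n>".
    private static final Pattern LIVE_ONLY = Pattern.compile(".*\\.log");
    private static final Pattern LIVE_OR_ROLLED = Pattern.compile(".*\\.log(\\.\\d+)?");

    /** Returns every line containing needle from the matching log files in logDir. */
    public static List<String> searchLogs(Path logDir, String needle, boolean includeRolled)
            throws IOException {
        Pattern namePattern = includeRolled ? LIVE_OR_ROLLED : LIVE_ONLY;

        // Collect matching files in sorted order so results are deterministic.
        List<Path> logFiles;
        try (Stream<Path> listing = Files.list(logDir)) {
            logFiles = listing
                    .filter(f -> namePattern.matcher(f.getFileName().toString()).matches())
                    .sorted()
                    .collect(Collectors.toList());
        }

        List<String> hits = new ArrayList<>();
        for (Path file : logFiles) {
            for (String line : Files.readAllLines(file)) {
                if (line.contains(needle)) {
                    hits.add(line);
                }
            }
        }
        return hits;
    }
}
```

With a live file that holds no banner and a rolled file that holds the "Classpath:" line, {{searchLogs(dir, "Classpath:", false)}} returns nothing, while passing {{true}} finds the line.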
Two failure modes of the same race:
- The readiness loop waits for a "Classpath:" line that never appears in the
live .log, spinning with no upper bound until the CI watchdog kills the leg. In
build 76627, {{testWithHostAndPort}} started and never finished.
- When rotation happens between the readiness check and the read, the test
parses an incomplete arguments block and fails with "Missing required option:
c".
Proposed fix: let {{FlinkDistribution.searchAllLogs}} optionally include rolled
log files when looking for the startup banner (other callers unchanged), and
bound the readiness wait with {{CommonTestUtils.waitUtil}} so a missing banner
fails fast with a clear message instead of hanging.
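The bounded wait could look roughly like the following sketch. It uses a hypothetical stand-in helper named {{waitUntil}} so the example is self-contained; it is not the implementation of {{CommonTestUtils.waitUtil}}, and the parameter list (including the polling pause) is an assumption.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class BoundedReadinessWaitSketch {

    /**
     * Hypothetical stand-in for a bounded wait helper: polls the condition until it
     * holds, and throws a TimeoutException carrying errorMsg once the timeout elapses.
     */
    public static void waitUntil(
            Supplier<Boolean> condition, Duration timeout, Duration pause, String errorMsg)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.get()) {
            // Overflow-safe comparison against the monotonic clock.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(errorMsg);
            }
            Thread.sleep(pause.toMillis());
        }
    }
}
```

In the test, the condition would be something like "the log search found a Classpath: line", and the error message would name the missing banner, so a rolled-away banner produces a quick, descriptive failure rather than a watchdog kill.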