ASF GitHub Bot updated FLINK-40070:
-----------------------------------
Labels: pull-request-available test-stability (was: test-stability)
> DynamicParameterITCase hangs when the JobManager startup banner rolls to a
> numbered log file
> --------------------------------------------------------------------------------------------
>
> Key: FLINK-40070
> URL: https://issues.apache.org/jira/browse/FLINK-40070
> Project: Flink
> Issue Type: Bug
> Components: Test Infrastructure, Tests
> Affects Versions: 2.4.0
> Reporter: Martijn Visser
> Assignee: Martijn Visser
> Priority: Major
> Labels: pull-request-available, test-stability
>
> Seen in
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=76627&view=results
> (leg: e2e_4_ci); previously seen in build 75992.
> {{DynamicParameterITCase}} starts a JobManager via jobmanager.sh and reads
> the startup banner (program arguments, classpath) back from the distribution
> logs to assert how dynamic parameters were passed through. The distribution
> log4j configuration rolls the log file on startup
> ({{OnStartupTriggeringPolicy}}), so the banner frequently lands in a rolled
> file (for example {{flink-...-standalonesession-1-host.log.1}}), which
> {{FlinkDistribution.searchAllLogs}} deliberately skips.
> The same race has two failure modes:
> - The readiness loop waits for a "Classpath:" line that never appears in the
> live .log file, so it spins with no upper bound until the CI watchdog kills
> the leg. In build 76627, {{testWithHostAndPort}} started and never finished.
> - When rotation happens between the readiness check and the read, the test
> parses an incomplete arguments block and fails with "Missing required option:
> c".
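The first failure mode can be reproduced in miniature with plain JDK file APIs. The sketch below is only an illustration: the class RolledLogSearch, its methods, the file names, and the log lines are all hypothetical and are not taken from FlinkDistribution. It assumes that live files end in ".log" and rolled files carry a numeric suffix such as ".log.1".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; this is not Flink's FlinkDistribution API. */
final class RolledLogSearch {

    // Assumption: live files end in ".log", rolled files add a numeric suffix (".log.1").
    private static final Pattern LIVE = Pattern.compile(".*\\.log");
    private static final Pattern LIVE_OR_ROLLED = Pattern.compile(".*\\.log(\\.\\d+)?");

    /** Returns true if any selected log file in logDir has a line containing marker. */
    static boolean containsLine(Path logDir, String marker, boolean includeRolled) {
        Pattern namePattern = includeRolled ? LIVE_OR_ROLLED : LIVE;
        try (Stream<Path> files = Files.list(logDir)) {
            List<Path> selected = files
                    .filter(p -> namePattern.matcher(p.getFileName().toString()).matches())
                    .collect(Collectors.toList());
            for (Path file : selected) {
                try (Stream<String> lines = Files.lines(file)) {
                    if (lines.anyMatch(line -> line.contains(marker))) {
                        return true;
                    }
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a temp directory with made-up content where the banner sits in a rolled file. */
    static Path demoLogDir() {
        try {
            Path dir = Files.createTempDirectory("demo-logs");
            // The banner lines ended up in the rolled file...
            Files.write(dir.resolve("demo-standalonesession.log.1"),
                    List.of("Program Arguments:", "Classpath: demo.jar"));
            // ...while the live file only holds later output.
            Files.write(dir.resolve("demo-standalonesession.log"),
                    List.of("later log output"));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = demoLogDir();
        System.out.println("live only:        " + containsLine(dir, "Classpath:", false));
        System.out.println("including rolled: " + containsLine(dir, "Classpath:", true));
    }
}
```

With the demo directory, the live-only search reports false and the search that includes rolled files reports true, which is the situation an unbounded readiness loop cannot recover from.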
> Proposed fix: let {{FlinkDistribution.searchAllLogs}} optionally include
> rolled log files when looking for the startup banner (leaving other callers
> unchanged), and bound the readiness wait with {{CommonTestUtils.waitUtil}} so
> that a missing banner fails fast with a clear message instead of hanging.
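For readers unfamiliar with the pattern, a bounded readiness wait might look roughly like the following. This is a plain-JDK stand-in written for illustration, not the code of CommonTestUtils.waitUtil; the class BoundedWait, the method names, the 10 ms pause, and the error message are all assumptions.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Plain-JDK stand-in for a bounded readiness wait; not Flink's CommonTestUtils. */
final class BoundedWait {

    /** Polls condition until it holds, or throws TimeoutException once timeout has elapsed. */
    static void waitUntil(
            BooleanSupplier condition, Duration timeout, Duration pause, String errorMessage)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            // Subtraction keeps the comparison correct even if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(errorMessage);
            }
            Thread.sleep(pause.toMillis());
        }
    }

    /** Demo convenience: returns true if the wait timed out, false if the condition held. */
    static boolean timesOut(BooleanSupplier condition, Duration timeout) {
        try {
            waitUntil(condition, timeout, Duration.ofMillis(10),
                    "Startup banner not found in any log file");
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A condition that never holds now fails after the timeout instead of hanging.
        System.out.println("never ready times out: " + timesOut(() -> false, Duration.ofMillis(100)));
        System.out.println("ready times out:       " + timesOut(() -> true, Duration.ofMillis(100)));
    }
}
```

The point of the design is that a banner that never shows up turns into a test failure with a descriptive message after a fixed deadline, rather than a leg that runs until the CI watchdog kills it.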