Xintong Song created FLINK-23266:
------------------------------------

             Summary: HA per-job cluster (rocks, non-incremental) hangs on Azure
                 Key: FLINK-23266
                 URL: https://issues.apache.org/jira/browse/FLINK-23266
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.12.4
            Reporter: Xintong Song
             Fix For: 1.12.5

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19943&view=logs&j=6caf31d6-847a-526e-9624-468e053467d6&t=0b23652f-b18b-5b6e-6eb6-a11070364610&l=1858

{code}
Jul 05 21:56:00 ==============================================================================
Jul 05 21:56:00 Running 'Running HA per-job cluster (rocks, non-incremental) end-to-end test'
Jul 05 21:56:00 ==============================================================================
Jul 05 21:56:00 TEST_DATA_DIR: /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-00772599944
Jul 05 21:56:00 Flink dist directory: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT
Jul 05 21:56:00 Flink dist directory: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT
Jul 05 21:56:01 Starting zookeeper daemon on host fv-az43-4.
Jul 05 21:56:01 Running on HA mode: parallelism=4, backend=rocks, asyncSnapshots=true, incremSnapshots=false and zk=3.4.
Jul 05 21:56:03 Starting standalonejob daemon on host fv-az43-4.
Jul 05 21:56:03 Start 1 more task managers
Jul 05 21:56:04 Starting taskexecutor daemon on host fv-az43-4.
Jul 05 21:56:10 Job (00000000000000000000000000000000) is not yet running.
Jul 05 21:56:18 Job (00000000000000000000000000000000) is running.
Jul 05 21:56:18 Running JM watchdog @ 266158
Jul 05 21:56:18 Running TM watchdog @ 266159
Jul 05 21:56:18 Waiting for text Completed checkpoint [1-9]* for job 00000000000000000000000000000000 to appear 2 of times in logs...
Jul 05 21:56:22 Killed JM @ 264313
Jul 05 21:56:22 Waiting for text Completed checkpoint [1-9]* for job 00000000000000000000000000000000 to appear 2 of times in logs...
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-1*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-1*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-1*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-1*.log: No such file or directory
Jul 05 21:56:26 Killed TM @ 264571
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-1*.log: No such file or directory
Jul 05 21:56:26 Starting standalonejob daemon on host fv-az43-4.
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-1*.log: No such file or directory
Jul 05 21:57:12 Killed JM @ 267798
Jul 05 21:57:12 Waiting for text Completed checkpoint [1-9]* for job 00000000000000000000000000000000 to appear 2 of times in logs...
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-2*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-2*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-2*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-2*.log: No such file or directory
Jul 05 21:57:15 Starting standalonejob daemon on host fv-az43-4.
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-2*.log: No such file or directory
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/common_ha.sh: line 151: [: 58)\n\tat org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$0(AbstractExecutionGraphHandler.java: integer expression expected
Jul 05 21:58:07 Killed JM @ 271440
Jul 05 21:58:07 Waiting for text Completed checkpoint [1-9]* for job 00000000000000000000000000000000 to appear 2 of times in logs...
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-3*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-3*.log: No such file or directory
Jul 05 21:58:09 Killed TM @ 267660
Jul 05 21:58:09 Starting standalonejob daemon on host fv-az43-4.
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-3*.log: No such file or directory
grep: /home/vsts/work/1/s/flink-dist/target/flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/log/*standalonejob-3*.log: No such file or directory
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Jul 05 21:58:51 Killed TM @
Jul 05 22:11:00 Test (pid: 263840) did not finish after 900 seconds.
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)