[ 
https://issues.apache.org/jira/browse/FLINK-19882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17228382#comment-17228382
 ] 

Robert Metzger commented on FLINK-19882:
----------------------------------------

Okay, the problem seems to be the following:

1. The test process gets killed by the timeout watchdog (this is just a 
coincidence of the OS's PID allocation).
2. The timeout watchdog is still running because the "Local recovery and sticky 
scheduling end-to-end test" does not exit properly.
3. It is unclear why the "Local recovery and sticky scheduling end-to-end test" 
does not exit properly. This is its output:
{code}
2020-11-09T00:48:38.6400606Z Nov 09 00:48:38 Starting zookeeper daemon on host fv-az668-576.
2020-11-09T00:48:38.7987804Z Nov 09 00:48:38 Starting HA cluster with 1 masters.
2020-11-09T00:48:39.9627643Z Nov 09 00:48:39 Starting standalonesession daemon on host fv-az668-576.
2020-11-09T00:48:41.3899696Z Nov 09 00:48:41 Starting taskexecutor daemon on host fv-az668-576.
2020-11-09T00:48:41.4279555Z Nov 09 00:48:41 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:42.4789201Z Nov 09 00:48:42 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:43.6305190Z Nov 09 00:48:43 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:44.6585379Z Nov 09 00:48:44 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:45.6980086Z Nov 09 00:48:45 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:46.7181937Z Nov 09 00:48:46 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:47.7353049Z Nov 09 00:48:47 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:48.7532273Z Nov 09 00:48:48 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:49.7689345Z Nov 09 00:48:49 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:50.7851585Z Nov 09 00:48:50 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:51.8031040Z Nov 09 00:48:51 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:52.8264372Z Nov 09 00:48:52 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:53.8605046Z Nov 09 00:48:53 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:54.8848538Z Nov 09 00:48:54 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:55.9036072Z Nov 09 00:48:55 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:56.9203992Z Nov 09 00:48:56 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:57.9366759Z Nov 09 00:48:57 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:58.9564157Z Nov 09 00:48:58 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:48:59.9745170Z Nov 09 00:48:59 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:00.9914037Z Nov 09 00:49:00 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:02.0076906Z Nov 09 00:49:02 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:03.0240580Z Nov 09 00:49:03 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:04.0426484Z Nov 09 00:49:04 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:05.0598331Z Nov 09 00:49:05 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:06.0774841Z Nov 09 00:49:06 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:07.0934791Z Nov 09 00:49:07 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:08.1095532Z Nov 09 00:49:08 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:09.1217046Z Nov 09 00:49:09 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:10.1384550Z Nov 09 00:49:10 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:11.1555560Z Nov 09 00:49:11 Waiting for Dispatcher REST endpoint to come up...
2020-11-09T00:49:12.1573132Z Nov 09 00:49:12 Dispatcher REST endpoint has not started within a timeout of 30 sec
2020-11-09T00:49:12.1596545Z Nov 09 00:49:12 Checking of logs skipped.
2020-11-09T00:49:12.1597438Z Nov 09 00:49:12 
2020-11-09T00:49:12.1601475Z Nov 09 00:49:12 [PASS] 'Local recovery and sticky scheduling end-to-end test' passed after 0 minutes and 34 seconds! Test exited with exit code 0.
{code}
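
To illustrate points 1 and 2, here is a minimal sketch of a watchdog that is 
tied to a handle on its own child process and is cancelled as soon as that 
child exits, so it cannot fire later against an unrelated process that happens 
to receive the same PID. This is not the actual Flink test script; the 
function name run_with_watchdog and the choice of Python are assumptions made 
purely for illustration.

```python
import signal
import subprocess
import sys
import threading


def run_with_watchdog(cmd, timeout_sec):
    """Run cmd and terminate it if it runs longer than timeout_sec.

    Hypothetical helper, not part of Flink. Returns (returncode, timed_out).
    """
    proc = subprocess.Popen(cmd)
    timed_out = threading.Event()

    def on_timeout():
        # Act only on our own child, and only while it is still running.
        # We never signal a bare PID number that the OS may have reused.
        if proc.poll() is None:
            timed_out.set()
            proc.terminate()  # sends SIGTERM on POSIX

    watchdog = threading.Timer(timeout_sec, on_timeout)
    watchdog.start()
    try:
        returncode = proc.wait()
    finally:
        # Always stop the watchdog when the test ends, whether it passed,
        # failed, or raised, so it cannot outlive the test it guards.
        watchdog.cancel()
        watchdog.join()
    return returncode, timed_out.is_set()


if __name__ == "__main__":
    # A command that finishes quickly: the watchdog is cancelled unused.
    print(run_with_watchdog([sys.executable, "-c", "pass"], 30.0))
```

On POSIX, a child terminated by this watchdog reports a return code of 
-SIGTERM in Python. A shell reports the same condition as 128 + 15 = 143, 
which would be consistent with the "Process Exit Code: 143" in the issue 
description below.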

> E2E: SQLClientHBaseITCase crash
> -------------------------------
>
>                 Key: FLINK-19882
>                 URL: https://issues.apache.org/jira/browse/FLINK-19882
>             Project: Flink
>          Issue Type: Test
>          Components: Connectors / HBase
>            Reporter: Jingsong Lee
>            Assignee: Robert Metzger
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.12.0
>
>
> INSTANCE: 
> [https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_apis/build/builds/8563/logs/141]
> {code:java}
> 2020-10-29T09:43:24.0088180Z [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test (end-to-end-tests) on project flink-end-to-end-tests-hbase: There are test failures.
> 2020-10-29T09:43:24.0088792Z [ERROR] 
> 2020-10-29T09:43:24.0089518Z [ERROR] Please refer to /home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-hbase/target/surefire-reports for the individual test results.
> 2020-10-29T09:43:24.0090427Z [ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
> 2020-10-29T09:43:24.0090914Z [ERROR] The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
> 2020-10-29T09:43:24.0093105Z [ERROR] Command was /bin/sh -c cd /home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-hbase/target && /usr/lib/jvm/adoptopenjdk-8-hotspot-amd64/jre/bin/java -Xms256m -Xmx2048m -Dmvn.forkNumber=2 -XX:+UseG1GC -jar /home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-hbase/target/surefire/surefirebooter6795869883612750001.jar /home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-hbase/target/surefire 2020-10-29T09-34-47_778-jvmRun2 surefire2269050977160717631tmp surefire_67897497331523564186tmp
> 2020-10-29T09:43:24.0094488Z [ERROR] Error occurred in starting fork, check output in log
> 2020-10-29T09:43:24.0094797Z [ERROR] Process Exit Code: 143
> 2020-10-29T09:43:24.0095033Z [ERROR] Crashed tests:
> 2020-10-29T09:43:24.0095321Z [ERROR] org.apache.flink.tests.util.hbase.SQLClientHBaseITCase
> 2020-10-29T09:43:24.0095828Z [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM terminated without properly saying goodbye. VM crash or System.exit called?
> 2020-10-29T09:43:24.0097838Z [ERROR] Command was /bin/sh -c cd /home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-hbase/target && /usr/lib/jvm/adoptopenjdk-8-hotspot-amd64/jre/bin/java -Xms256m -Xmx2048m -Dmvn.forkNumber=2 -XX:+UseG1GC -jar /home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-hbase/target/surefire/surefirebooter6795869883612750001.jar /home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-hbase/target/surefire 2020-10-29T09-34-47_778-jvmRun2 surefire2269050977160717631tmp surefire_67897497331523564186tmp
> 2020-10-29T09:43:24.0098966Z [ERROR] Error occurred in starting fork, check output in log
> 2020-10-29T09:43:24.0099266Z [ERROR] Process Exit Code: 143
> 2020-10-29T09:43:24.0099502Z [ERROR] Crashed tests:
> 2020-10-29T09:43:24.0099789Z [ERROR] org.apache.flink.tests.util.hbase.SQLClientHBaseITCase
> 2020-10-29T09:43:24.0100331Z [ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:669)
> 2020-10-29T09:43:24.0100883Z [ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:282)
> 2020-10-29T09:43:24.0101774Z [ERROR] at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:245)
> 2020-10-29T09:43:24.0102360Z [ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
> 2020-10-29T09:43:24.0103004Z [ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
> 2020-10-29T09:43:24.0103737Z [ERROR] at org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
> 2020-10-29T09:43:24.0104301Z [ERROR] at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
> 2020-10-29T09:43:24.0104828Z [ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
> 2020-10-29T09:43:24.0105334Z [ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> 2020-10-29T09:43:24.0105826Z [ERROR] at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> 2020-10-29T09:43:24.0106384Z [ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
> 2020-10-29T09:43:24.0106969Z [ERROR] at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
> 2020-10-29T09:43:24.0107603Z [ERROR] at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> 2020-10-29T09:43:24.0108201Z [ERROR] at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
> 2020-10-29T09:43:24.0108673Z [ERROR] at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
> 2020-10-29T09:43:24.0109110Z [ERROR] at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
> 2020-10-29T09:43:24.0109517Z [ERROR] at org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
> 2020-10-29T09:43:24.0110063Z [ERROR] at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
> 2020-10-29T09:43:24.0110601Z [ERROR] at org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
> 2020-10-29T09:43:24.0110998Z [ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-10-29T09:43:24.0111426Z [ERROR] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-10-29T09:43:24.0112032Z [ERROR] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-10-29T09:43:24.0112487Z [ERROR] at java.lang.reflect.Method.invoke(Method.java:498)
> 2020-10-29T09:43:24.0112955Z [ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
> 2020-10-29T09:43:24.0113563Z [ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
> 2020-10-29T09:43:24.0114072Z [ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
> 2020-10-29T09:43:24.0114578Z [ERROR] at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> 2020-10-29T09:43:24.0115188Z [ERROR] -> [Help 1]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
