[ 
https://issues.apache.org/jira/browse/FLINK-18290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135526#comment-17135526
 ] 

Robert Metzger commented on FLINK-18290:
----------------------------------------

The first occurrence of a related issue was build 20200612.19 (build ID 3416).
The first occurrence of the exact issue, with exit code 239, was build 
20200612.26 (build ID 3434). That build is a docs change. Potentially 
problematic commits:

This commit 
(https://github.com/flink-ci/flink-mirror/commit/c004b119ef28dc7387935d8d3a4dbf296cc5f661)
 introduces a {{System.exit(-17)}} in the checkpoint coordinator. 256 - 17 = 239, 
which matches the reported exit code. Coincidence? Introducing a 
{{System.exit(-17);}} into any test leads to exactly the failure reported here.
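
The arithmetic works because a parent process on a POSIX system only sees the low 8 bits of the child's exit status, so a negative status wraps around. A minimal sketch in plain Java, assuming POSIX semantics (the class and method names are made up for illustration and are not Flink code):

```java
final class ExitCodeWrap {

    // Returns the exit status a POSIX parent process observes for the int
    // passed to System.exit(): only the low 8 bits are reported.
    static int observedExitStatus(int status) {
        return status & 0xFF;
    }

    public static void main(String[] args) {
        // -17 is 0xFFFFFFEF as a 32-bit int; the low byte 0xEF is 239.
        System.out.println(observedExitStatus(-17)); // prints 239
    }
}
```

This only reproduces the mapping from -17 to 239. It does not show which code path actually called {{System.exit()}}.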

This seems to be the reason why System.exit() gets called (from 20200612.19):
{code}
14:56:33,906 [flink-akka.actor.default-dispatcher-2] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Stopped 
TaskExecutor akka://flink/user/rpc/taskmanager_28.
14:56:33,887 [jobmanager-future-thread-7] ERROR 
org.apache.flink.runtime.util.FatalExitExceptionHandler      [] - FATAL: Thread 
'jobmanager-future-thread-7' produced an uncaught exception. Stopping the 
process...
java.util.concurrent.CompletionException: 
java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@1db3e21b 
rejected from 
java.util.concurrent.ScheduledThreadPoolExecutor@198c75f2[Terminated, pool size 
= 0, active threads = 0, queued tasks = 0, completed tasks = 20]
        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:838) 
~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1609)
 [?:1.8.0_242]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_242]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 [?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@1db3e21b 
rejected from 
java.util.concurrent.ScheduledThreadPoolExecutor@198c75f2[Terminated, pool size 
= 0, active threads = 0, queued tasks = 0, completed tasks = 20]
        at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) 
~[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:622)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668)
 ~[?:1.8.0_242]
        at 
org.apache.flink.runtime.concurrent.ScheduledExecutorServiceAdapter.execute(ScheduledExecutorServiceAdapter.java:62)
 ~[flink-runtime_2.11-1.11-SNAPSHOT.jar:1.11-SNAPSHOT]
        at 
java.util.concurrent.CompletableFuture$UniCompletion.claim(CompletableFuture.java:543)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:826) 
~[?:1.8.0_242]
        ... 10 more
{code}

The same seems to happen in the 20200612.26 run:
{code}
20:51:18,122 [mini-cluster-io-thread-26] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - JobManager 
for job 63af744539889f9a6bf731aa05b02e97 with leader id 
93f8812403b7e711da29465d96a74439 lost leadership.
20:51:18,122 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Close 
JobManager connection for job 63af744539889f9a6bf731aa05b02e97.
20:51:18,121 [    Checkpoint Timer] ERROR 
org.apache.flink.runtime.util.FatalExitExceptionHandler      [] - FATAL: Thread 
'Checkpoint Timer' produced an uncaught exception. Stopping the process...
java.util.concurrent.CompletionException: 
java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@59fa0f36 
rejected from 
java.util.concurrent.ScheduledThreadPoolExecutor@1cf89a6d[Shutting down, pool 
size = 1, active threads = 1, queued tasks = 0, completed tasks = 7]
        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:838) 
~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:575) 
[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:594)
 [?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
 [?:1.8.0_242]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_242]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 [?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@59fa0f36 
rejected from 
java.util.concurrent.ScheduledThreadPoolExecutor@1cf89a6d[Shutting down, pool 
size = 1, active threads = 1, queued tasks = 0, completed tasks = 7]
        at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) 
~[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:622)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668)
 ~[?:1.8.0_242]
        at 
org.apache.flink.runtime.concurrent.ScheduledExecutorServiceAdapter.execute(ScheduledExecutorServiceAdapter.java:62)
 ~[flink-runtime_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
        at 
java.util.concurrent.CompletableFuture$UniCompletion.claim(CompletableFuture.java:543)
 ~[?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:826) 
~[?:1.8.0_242]
        ... 12 more
20:51:18,123 [PermanentBlobCache shutdown hook] INFO  
org.apache.flink.runtime.blob.PermanentBlobCache             [] - Shutting down 
BLOB cache
{code}
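
Both traces show a {{RejectedExecutionException}} wrapped in a {{CompletionException}}, raised when an async stage of a {{CompletableFuture}} is handed to a scheduled executor that is already shutting down or terminated. A minimal sketch of that pattern using only JDK classes (the class and method names are hypothetical; this is not the Flink code path, only the same JDK mechanism):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

final class RejectedAsyncStage {

    // Attaches an async stage to a future, shuts the executor down, then
    // completes the future. Returns the cause of the dependent stage's failure.
    static Throwable causeOfRejectedStage() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<String> source = new CompletableFuture<>();

        // The stage is registered while the executor is still running.
        CompletableFuture<String> dependent =
                source.handleAsync((value, error) -> value, executor);

        // The executor goes away before the source future completes.
        executor.shutdownNow();

        // Completing the source tries to submit the stage to the dead executor.
        // The rejection is captured and completes 'dependent' exceptionally.
        source.complete("done");

        try {
            dependent.join();
            return null; // not expected: the stage should have been rejected
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        System.out.println(causeOfRejectedStage());
    }
}
```

In this sketch the exception is caught by the caller. In the logs above it reached the {{FatalExitExceptionHandler}} as an uncaught exception, which is what stops the process.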

> Tests are crashing with exit code 239
> -------------------------------------
>
>                 Key: FLINK-18290
>                 URL: https://issues.apache.org/jira/browse/FLINK-18290
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines
>    Affects Versions: 1.11.0
>            Reporter: Robert Metzger
>            Assignee: Robert Metzger
>            Priority: Blocker
>              Labels: test-stability
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3467&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=34f486e1-e1e4-5dd2-9c06-bfdd9b9c74a8]
> Kafka011ProducerExactlyOnceITCase
>  
> {code:java}
> 2020-06-15T03:24:28.4677649Z [WARNING] The requested profile 
> "skip-webui-build" could not be activated because it does not exist.
> 2020-06-15T03:24:28.4692049Z [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test 
> (integration-tests) on project flink-connector-kafka-0.11_2.11: There are 
> test failures.
> 2020-06-15T03:24:28.4692585Z [ERROR] 
> 2020-06-15T03:24:28.4693170Z [ERROR] Please refer to 
> /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire-reports 
> for the individual test results.
> 2020-06-15T03:24:28.4693928Z [ERROR] Please refer to dump files (if any 
> exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
> 2020-06-15T03:24:28.4694423Z [ERROR] ExecutionException The forked VM 
> terminated without properly saying goodbye. VM crash or System.exit called?
> 2020-06-15T03:24:28.4696762Z [ERROR] Command was /bin/sh -c cd 
> /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target && 
> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m 
> -Dlog4j.configurationFile=log4j2-test.properties -Dmvn.forkNumber=2 
> -XX:-UseGCOverheadLimit -jar 
> /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire/surefirebooter617700788970993266.jar
>  /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire 
> 2020-06-15T03-07-01_381-jvmRun2 surefire2676050245109796726tmp 
> surefire_602825791089523551074tmp
> 2020-06-15T03:24:28.4698486Z [ERROR] Error occurred in starting fork, check 
> output in log
> 2020-06-15T03:24:28.4699066Z [ERROR] Process Exit Code: 239
> 2020-06-15T03:24:28.4699458Z [ERROR] Crashed tests:
> 2020-06-15T03:24:28.4699960Z [ERROR] 
> org.apache.flink.streaming.connectors.kafka.Kafka011ProducerExactlyOnceITCase
> 2020-06-15T03:24:28.4700849Z [ERROR] 
> org.apache.maven.surefire.booter.SurefireBooterForkException: 
> ExecutionException The forked VM terminated without properly saying goodbye. 
> VM crash or System.exit called?
> 2020-06-15T03:24:28.4703760Z [ERROR] Command was /bin/sh -c cd 
> /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target && 
> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m 
> -Dlog4j.configurationFile=log4j2-test.properties -Dmvn.forkNumber=2 
> -XX:-UseGCOverheadLimit -jar 
> /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire/surefirebooter617700788970993266.jar
>  /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire 
> 2020-06-15T03-07-01_381-jvmRun2 surefire2676050245109796726tmp 
> surefire_602825791089523551074tmp
> 2020-06-15T03:24:28.4705501Z [ERROR] Error occurred in starting fork, check 
> output in log
> 2020-06-15T03:24:28.4706297Z [ERROR] Process Exit Code: 239
> 2020-06-15T03:24:28.4706592Z [ERROR] Crashed tests:
> 2020-06-15T03:24:28.4706895Z [ERROR] 
> org.apache.flink.streaming.connectors.kafka.Kafka011ProducerExactlyOnceITCase
> 2020-06-15T03:24:28.4707386Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:510)
> 2020-06-15T03:24:28.4708053Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:457)
> 2020-06-15T03:24:28.4708908Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:298)
> 2020-06-15T03:24:28.4709720Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:246)
> 2020-06-15T03:24:28.4710497Z [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
> 2020-06-15T03:24:28.4711448Z [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
> 2020-06-15T03:24:28.4712395Z [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
> 2020-06-15T03:24:28.4712997Z [ERROR] at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
> 2020-06-15T03:24:28.4713524Z [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
> 2020-06-15T03:24:28.4714079Z [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> 2020-06-15T03:24:28.4714560Z [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> 2020-06-15T03:24:28.4715096Z [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
> 2020-06-15T03:24:28.4715672Z [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
> 2020-06-15T03:24:28.4716445Z [ERROR] at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> 2020-06-15T03:24:28.4717024Z [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
> 2020-06-15T03:24:28.4717478Z [ERROR] at 
> org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
> 2020-06-15T03:24:28.4717939Z [ERROR] at 
> org.apache.maven.DefaultMaven.execute(DefaultMaven.java:155)
> 2020-06-15T03:24:28.4718378Z [ERROR] at 
> org.apache.maven.cli.MavenCli.execute(MavenCli.java:584)
> 2020-06-15T03:24:28.4718852Z [ERROR] at 
> org.apache.maven.cli.MavenCli.doMain(MavenCli.java:216)
> 2020-06-15T03:24:28.4719230Z [ERROR] at 
> org.apache.maven.cli.MavenCli.main(MavenCli.java:160)
> 2020-06-15T03:24:28.4719676Z [ERROR] at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-06-15T03:24:28.4720309Z [ERROR] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-06-15T03:24:28.4720882Z [ERROR] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-06-15T03:24:28.4721339Z [ERROR] at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-06-15T03:24:28.4721888Z [ERROR] at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
> 2020-06-15T03:24:28.4722658Z [ERROR] at 
> org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
> 2020-06-15T03:24:28.4723430Z [ERROR] at 
> org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
> 2020-06-15T03:24:28.4724062Z [ERROR] at 
> org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
> 2020-06-15T03:24:28.4724657Z [ERROR] Caused by: 
> org.apache.maven.surefire.booter.SurefireBooterForkException: The forked VM 
> terminated without properly saying goodbye. VM crash or System.exit called?
> 2020-06-15T03:24:28.4726770Z [ERROR] Command was /bin/sh -c cd 
> /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target && 
> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m 
> -Dlog4j.configurationFile=log4j2-test.properties -Dmvn.forkNumber=2 
> -XX:-UseGCOverheadLimit -jar 
> /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire/surefirebooter617700788970993266.jar
>  /__w/2/s/flink-connectors/flink-connector-kafka-0.11/target/surefire 
> 2020-06-15T03-07-01_381-jvmRun2 surefire2676050245109796726tmp 
> surefire_602825791089523551074tmp
> 2020-06-15T03:24:28.4728582Z [ERROR] Error occurred in starting fork, check 
> output in log
> 2020-06-15T03:24:28.4729202Z [ERROR] Process Exit Code: 239
> 2020-06-15T03:24:28.4729612Z [ERROR] Crashed tests:
> 2020-06-15T03:24:28.4730247Z [ERROR] 
> org.apache.flink.streaming.connectors.kafka.Kafka011ProducerExactlyOnceITCase
> 2020-06-15T03:24:28.4730781Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:669)
> 2020-06-15T03:24:28.4731292Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.access$600(ForkStarter.java:115)
> 2020-06-15T03:24:28.4731829Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:444)
> 2020-06-15T03:24:28.4732353Z [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:420)
> 2020-06-15T03:24:28.4732792Z [ERROR] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2020-06-15T03:24:28.4733235Z [ERROR] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 2020-06-15T03:24:28.4733718Z [ERROR] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 2020-06-15T03:24:28.4734170Z [ERROR] at java.lang.Thread.run(Thread.java:748)
> 2020-06-15T03:24:28.4734682Z [ERROR] -> [Help 1]
> 2020-06-15T03:24:28.4734859Z [ERROR] 
> 2020-06-15T03:24:28.4735312Z [ERROR] To see the full stack trace of the 
> errors, re-run Maven with the -e switch.
> 2020-06-15T03:24:28.4735927Z [ERROR] Re-run Maven using the -X switch to 
> enable full debug logging.
> 2020-06-15T03:24:28.4736439Z [ERROR] 
> 2020-06-15T03:24:28.4736952Z [ERROR] For more information about the errors 
> and possible solutions, please read the following articles:
> 2020-06-15T03:24:28.4737706Z [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> 2020-06-15T03:24:28.4738167Z [ERROR] 
> 2020-06-15T03:24:28.4738553Z [ERROR] After correcting the problems, you can 
> resume the build with the command
> 2020-06-15T03:24:28.4739663Z [ERROR]   mvn <goals> -rf 
> :flink-connector-kafka-0.11_2.11
> 2020-06-15T03:24:29.0980029Z MVN exited with EXIT CODE: 1.
> {code}
> This could be a CI environment issue...
> When did it start?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
