environment:
flinksql 1.12.2
k8s session mode
description:
I got follow error log when my kafka connector port was wrong
>>>>>
2021-04-25 16:49:50
org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired
before the position for partition filebeat_json_install_log-3 could be
determined
>>>>>
I got follow error log when my kafka connector ip was wrong
>>>>>
2021-04-25 20:12:53
org.apache.flink.runtime.JobException: Recovery is suppressed by
NoRestartBackoffTimeStrategy
at
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:118)
at
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:80)
at
org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:233)
at
org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:224)
at
org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:215)
at
org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:669)
at
org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:89)
at
org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:447)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
at
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timeout expired
while fetching topic metadata
>>>>>
When the job was cancelled,there was follow error log:
>>>>>
2021-04-25 08:53:41,115 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job
v2_ods_device_action_log (fcc451b8a521398b10e5b86153141fbf) switched from state
CANCELLING to CANCELED.
2021-04-25 08:53:41,115 INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Stopping
checkpoint coordinator for job fcc451b8a521398b10e5b86153141fbf.
2021-04-25 08:53:41,115 INFO
org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] -
Shutting down
2021-04-25 08:53:41,115 INFO
org.apache.flink.runtime.checkpoint.CompletedCheckpoint [] - Checkpoint
with ID 1 at
'oss://tanwan-datahub/test/flinksql/checkpoints/fcc451b8a521398b10e5b86153141fbf/chk-1'
not discarded.
2021-04-25 08:53:41,115 INFO
org.apache.flink.runtime.checkpoint.CompletedCheckpoint [] - Checkpoint
with ID 2 at
'oss://tanwan-datahub/test/flinksql/checkpoints/fcc451b8a521398b10e5b86153141fbf/chk-2'
not discarded.
2021-04-25 08:53:41,116 INFO
org.apache.flink.runtime.checkpoint.CompletedCheckpoint [] - Checkpoint
with ID 3 at
'oss://tanwan-datahub/test/flinksql/checkpoints/fcc451b8a521398b10e5b86153141fbf/chk-3'
not discarded.
2021-04-25 08:53:41,137 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job
fcc451b8a521398b10e5b86153141fbf reached globally terminal state CANCELED.
2021-04-25 08:53:41,148 INFO org.apache.flink.runtime.jobmaster.JobMaster
[] - Stopping the JobMaster for job
v2_ods_device_action_log(fcc451b8a521398b10e5b86153141fbf).
2021-04-25 08:53:41,151 INFO
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Suspending
SlotPool.
2021-04-25 08:53:41,151 INFO org.apache.flink.runtime.jobmaster.JobMaster
[] - Close ResourceManager connection
5bdeb8d0f65a90ecdfafd7f102050b19: JobManager is shutting down..
2021-04-25 08:53:41,151 INFO
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Stopping
SlotPool.
2021-04-25 08:53:41,151 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
Disconnect job manager
[email protected]://flink@flink-jobmanager:6123/user/rpc/jobmanager_3
for job fcc451b8a521398b10e5b86153141fbf from the resource manager.
2021-04-25 08:53:41,178 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Could not
archive completed job
v2_ods_device_action_log(fcc451b8a521398b10e5b86153141fbf) to the history
server.
java.util.concurrent.CompletionException: java.lang.ExceptionInInitializerError
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
~[?:1.8.0_265]
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
[?:1.8.0_265]
at
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643)
[?:1.8.0_265]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_265]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_265]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
Caused by: java.lang.ExceptionInInitializerError
at
org.apache.flink.runtime.dispatcher.JsonResponseHistoryServerArchivist.lambda$archiveExecutionGraph$0(JsonResponseHistoryServerArchivist.java:55)
~[flink-dist_2.11-1.11.2.jar:1.11.2]
at
org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50)
~[template-common-jar-0.0.1.jar:?]
at
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
~[?:1.8.0_265]
... 3 more
Caused by: java.lang.IllegalStateException: zip file closed
at java.util.zip.ZipFile.ensureOpen(ZipFile.java:686) ~[?:1.8.0_265]
at java.util.zip.ZipFile.getEntry(ZipFile.java:315) ~[?:1.8.0_265]
at java.util.jar.JarFile.getEntry(JarFile.java:240) ~[?:1.8.0_265]
at sun.net.www.protocol.jar.URLJarFile.getEntry(URLJarFile.java:128)
~[?:1.8.0_265]
at java.util.jar.JarFile.getJarEntry(JarFile.java:223) ~[?:1.8.0_265]
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1054)
~[?:1.8.0_265]
at sun.misc.URLClassPath.getResource(URLClassPath.java:249) ~[?:1.8.0_265]
at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[?:1.8.0_265]
at java.net.URLClassLoader$1.run(URLClassLoader.java:363) ~[?:1.8.0_265]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_265]
at java.net.URLClassLoader.findClass(URLClassLoader.java:362) ~[?:1.8.0_265]
at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_265]
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) ~[?:1.8.0_265]
at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_265]
at java.lang.Class.forName0(Native Method) ~[?:1.8.0_265]
at java.lang.Class.forName(Class.java:264) ~[?:1.8.0_265]
at org.apache.logging.log4j.util.LoaderUtil.loadClass(LoaderUtil.java:168)
~[log4j-api-2.12.1.jar:2.12.1]
at org.apache.logging.slf4j.Log4jLogger.createConverter(Log4jLogger.java:416)
~[log4j-slf4j-impl-2.12.1.jar:2.12.1]
at org.apache.logging.slf4j.Log4jLogger.<init>(Log4jLogger.java:54)
~[log4j-slf4j-impl-2.12.1.jar:2.12.1]
at
org.apache.logging.slf4j.Log4jLoggerFactory.newLogger(Log4jLoggerFactory.java:39)
~[log4j-slf4j-impl-2.12.1.jar:2.12.1]
at
org.apache.logging.slf4j.Log4jLoggerFactory.newLogger(Log4jLoggerFactory.java:30)
~[log4j-slf4j-impl-2.12.1.jar:2.12.1]
at
org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:54)
~[log4j-api-2.12.1.jar:2.12.1]
at
org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:30)
~[log4j-slf4j-impl-2.12.1.jar:2.12.1]
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:277)
~[template-common-jar-0.0.1.jar:?]
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:288)
~[template-common-jar-0.0.1.jar:?]
at
org.apache.flink.runtime.history.FsJobArchivist.<clinit>(FsJobArchivist.java:50)
~[flink-dist_2.11-1.11.2.jar:1.11.2]
at
org.apache.flink.runtime.dispatcher.JsonResponseHistoryServerArchivist.lambda$archiveExecutionGraph$0(JsonResponseHistoryServerArchivist.java:55)
~[flink-dist_2.11-1.11.2.jar:1.11.2]
at
org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50)
~[template-common-jar-0.0.1.jar:?]
at
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
~[?:1.8.0_265]
... 3 more
>>>>>
And then I will get follow error log when I run a new job, unless I restart the
cluster
>>>>>
2021-04-25 08:54:06,711 INFO org.apache.flink.client.ClientUtils
[] - Starting program (detached: true)
2021-04-25 08:54:06,715 INFO
org.apache.flink.contrib.streaming.state.RocksDBStateBackend [] - Using
predefined options: DEFAULT.
2021-04-25 08:54:06,715 INFO
org.apache.flink.contrib.streaming.state.RocksDBStateBackend [] - Using default
options factory: DefaultConfigurableOptionsFactory{configuredOptions={}}.
2021-04-25 08:54:06,722 ERROR
org.apache.flink.runtime.webmonitor.handlers.JarRunHandler [] - Exception
occurred in REST handler: Could not execute application.
>>>>>