Hi!

I am using Flink 1.8.3, and job submission through RestClusterClient
times out with an Akka AskTimeoutException after the default 10 s. Previous
Flink versions had an option to set a separate timeout just for the
submission client (ClusterClient config), but it no longer seems to be
honored: job submission from the client no longer goes through Akka, so
the timeout configured for the Dispatcher applies instead. How can I
increase this timeout just for job submission without affecting other Akka
communication in the TaskManager/JobManager? Any other solution to the
problem would also be welcome.
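
For reference, the settings I am referring to look roughly like this in
flink-conf.yaml. This is only a sketch: I am assuming akka.client.timeout
is the client-only option from earlier versions, the values are
illustrative, and I have not verified whether the 10000 ms in the trace
comes from akka.ask.timeout or from web.timeout (both default to 10 s).

```yaml
# flink-conf.yaml (sketch; values are illustrative, not a recommendation)

# Cluster-wide Akka ask timeout, default 10 s. Raising it would also
# affect the other Akka asks in JobManager/TaskManager, which is what
# I would like to avoid.
akka.ask.timeout: 10 s

# Client-side timeout (assumed to be the option that worked in earlier
# versions). Changing it does not seem to affect submission any more.
akka.client.timeout: 120 s

# Timeout for asynchronous operations of the REST/web endpoint, in
# milliseconds, default 10000. Possibly the value that applies to the
# REST submission path (unverified assumption).
web.timeout: 10000
```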

The relevant stack trace is pasted below:

"cause":{"commonElementCount":8,"localizedMessage":"Could not submit job
(JobID: 26940c17ae3130fb8be1323cce1036e4)","message":"Could not submit job
(JobID:
26940c17ae3130fb8be1323cce1036e4)","name":"org.apache.flink.client.program.ProgramInvocationException","cause":{"commonElementCount":3,"localizedMessage":"Failed
to submit JobGraph.","message":"Failed to submit
JobGraph.","name":"org.apache.flink.runtime.client.JobSubmissionException","cause":{"commonElementCount":3,"localizedMessage":"[Internal
server error., <Exception on server
side:\nakka.pattern.AskTimeoutException: Ask timed out on
[Actor[akka://flink/user/dispatcher#1457923918]] after [10000 ms].
Sender[null] sent message of type
\"org.apache.flink.runtime.rpc.messages.LocalFencedMessage\".\n\tat
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)\n\tat
akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)\n\tat
scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)\n\tat
scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)\n\tat
scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)\n\tat
akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)\n\tat
akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)\n\tat
akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)\n\tat
akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)\n\tat
java.lang.Thread.run(Thread.java:745)\n\nEnd of exception on server
side>]","message":"[Internal server error., <Exception on server
side:\nakka.pattern.AskTimeoutException: Ask timed out on
[Actor[akka://flink/user/dispatcher#1457923918]] after [10000 ms].
Sender[null] sent message of type
\"org.apache.flink.runtime.rpc.messages.LocalFencedMessage\".\n\tat
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)\n\tat
akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)\n\tat
scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)\n\tat
scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)\n\tat
scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)\n\tat
akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)\n\tat
akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)\n\tat
akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)\n\tat
akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)\n\tat
java.lang.Thread.run(Thread.java:745)\n\nEnd of exception on server
side>]","name":"org.apache.flink.runtime.rest.util.RestClientException","extendedStackTrace":[{"class":"org.apache.flink.runtime.rest.RestClient","method":"parseResponse","file":"RestClient.java","line":389,"exact":false,"location":"flink-runtime_2.11-1.8.2.jar","version":"1.8.2"},{"class":"org.apache.flink.runtime.rest.RestClient","method":"lambda$submitRequest$3","file":"RestClient.java","line":373,"exact":false,"location":"flink-runtime_2.11-1.8.2.jar","version":"1.8.2"}
