[ 
https://issues.apache.org/jira/browse/FLINK-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402592#comment-17402592
 ] 

Till Rohrmann commented on FLINK-21538:
---------------------------------------

Looking at this test failure, two things stand out:

1) The tests don't configure a parallelism, so the job runs with a 
parallelism of 32. This slows down the execution.
2) Execution on the CI infrastructure is slow, which is why we run into the 
10s {{akka.ask.timeout}}.

I would suggest two things:

1) Configure a lower parallelism to reduce the complexity of the test.
2) Set a higher default {{akka.ask.timeout}} when using the {{MiniCluster}}. 
This should also solve a lot of other test instabilities that are caused by 
timeouts due to slow CI infrastructure.
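
As a rough sketch, the two suggestions expressed as Flink configuration 
keys could look like the fragment below. The values 4 and 60 s are 
assumptions chosen for illustration, not values proposed in this ticket, and 
{{parallelism.default}} is only one possible way to lower the parallelism. 
In a {{MiniCluster}}-based test these entries would normally be set on the 
{{Configuration}} object handed to the cluster rather than read from a file, 
and whether the test honors {{parallelism.default}} depends on how it 
creates its execution environment.

{code}
# Illustrative values only (assumptions, not taken from the ticket).
# Lower the parallelism so fewer tasks have to be deployed.
parallelism.default: 4
# Raise the ask timeout above the 10 s that the failing run hit.
akka.ask.timeout: 60 s
{code}

With fewer subtasks, fewer {{submitTask}} RPCs compete for the slow CI 
machine, and the larger timeout leaves more headroom for the ones that remain.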

> Elasticsearch6DynamicSinkITCase.testWritingDocuments fails when submitting job
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-21538
>                 URL: https://issues.apache.org/jira/browse/FLINK-21538
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / ElasticSearch, Runtime / Coordination
>    Affects Versions: 1.12.1, 1.13.0
>            Reporter: Dawid Wysakowicz
>            Priority: Minor
>              Labels: auto-deprioritized-major, auto-unassigned, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=13868&view=logs&j=3d12d40f-c62d-5ec4-6acc-0efe94cc3e89&t=5d6e4255-0ea8-5e2a-f52c-c881b7872361
> {code}
> 2021-02-27T00:16:06.9493539Z 
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 2021-02-27T00:16:06.9494494Z  at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
> 2021-02-27T00:16:06.9495733Z  at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$2(MiniClusterJobClient.java:117)
> 2021-02-27T00:16:06.9496596Z  at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
> 2021-02-27T00:16:06.9497354Z  at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> 2021-02-27T00:16:06.9525795Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-02-27T00:16:06.9526744Z  at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> 2021-02-27T00:16:06.9527784Z  at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:237)
> 2021-02-27T00:16:06.9528552Z  at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> 2021-02-27T00:16:06.9529271Z  at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> 2021-02-27T00:16:06.9530013Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-02-27T00:16:06.9530482Z  at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> 2021-02-27T00:16:06.9531068Z  at 
> org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1046)
> 2021-02-27T00:16:06.9531544Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:264)
> 2021-02-27T00:16:06.9531908Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:261)
> 2021-02-27T00:16:06.9532449Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
> 2021-02-27T00:16:06.9532860Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
> 2021-02-27T00:16:06.9533245Z  at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
> 2021-02-27T00:16:06.9533721Z  at 
> org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
> 2021-02-27T00:16:06.9534225Z  at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
> 2021-02-27T00:16:06.9534697Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
> 2021-02-27T00:16:06.9535217Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
> 2021-02-27T00:16:06.9535718Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
> 2021-02-27T00:16:06.9536127Z  at 
> akka.pattern.PromiseActorRef.$bang(AskSupport.scala:573)
> 2021-02-27T00:16:06.9536861Z  at 
> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
> 2021-02-27T00:16:06.9537394Z  at 
> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
> 2021-02-27T00:16:06.9537916Z  at 
> scala.concurrent.Future.$anonfun$andThen$1(Future.scala:532)
> 2021-02-27T00:16:06.9605804Z  at 
> scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:29)
> 2021-02-27T00:16:06.9606794Z  at 
> scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
> 2021-02-27T00:16:06.9607642Z  at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
> 2021-02-27T00:16:06.9608419Z  at 
> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
> 2021-02-27T00:16:06.9609252Z  at 
> akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:91)
> 2021-02-27T00:16:06.9610024Z  at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
> 2021-02-27T00:16:06.9613676Z  at 
> scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
> 2021-02-27T00:16:06.9615526Z  at 
> akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)
> 2021-02-27T00:16:06.9616727Z  at 
> akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> 2021-02-27T00:16:06.9617826Z  at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
> 2021-02-27T00:16:06.9618940Z  at 
> akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 2021-02-27T00:16:06.9620109Z  at 
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 2021-02-27T00:16:06.9621415Z  at 
> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 2021-02-27T00:16:06.9622598Z  at 
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> 2021-02-27T00:16:06.9623716Z Caused by: 
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> NoRestartBackoffTimeStrategy
> 2021-02-27T00:16:06.9625006Z  at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:118)
> 2021-02-27T00:16:06.9626398Z  at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:80)
> 2021-02-27T00:16:06.9628020Z  at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:233)
> 2021-02-27T00:16:06.9629257Z  at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:224)
> 2021-02-27T00:16:06.9630622Z  at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:215)
> 2021-02-27T00:16:06.9631835Z  at 
> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:669)
> 2021-02-27T00:16:06.9633415Z  at 
> org.apache.flink.runtime.scheduler.UpdateSchedulerNgOnInternalFailuresListener.notifyTaskFailure(UpdateSchedulerNgOnInternalFailuresListener.java:56)
> 2021-02-27T00:16:06.9634940Z  at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.notifySchedulerNgAboutInternalTaskFailure(ExecutionGraph.java:1869)
> 2021-02-27T00:16:06.9636193Z  at 
> org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1437)
> 2021-02-27T00:16:06.9637220Z  at 
> org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1377)
> 2021-02-27T00:16:06.9638462Z  at 
> org.apache.flink.runtime.executiongraph.Execution.markFailed(Execution.java:1205)
> 2021-02-27T00:16:06.9639683Z  at 
> org.apache.flink.runtime.executiongraph.Execution.lambda$deploy$11(Execution.java:856)
> 2021-02-27T00:16:06.9640771Z  at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> 2021-02-27T00:16:06.9641839Z  at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> 2021-02-27T00:16:06.9643554Z  at 
> java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
> 2021-02-27T00:16:06.9644658Z  at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
> 2021-02-27T00:16:06.9645998Z  at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
> 2021-02-27T00:16:06.9647143Z  at 
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> 2021-02-27T00:16:06.9648506Z  at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> 2021-02-27T00:16:06.9649340Z  at 
> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> 2021-02-27T00:16:06.9650021Z  at 
> akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> 2021-02-27T00:16:06.9650741Z  at 
> scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
> 2021-02-27T00:16:06.9651406Z  at 
> scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
> 2021-02-27T00:16:06.9652093Z  at 
> akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> 2021-02-27T00:16:06.9652972Z  at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> 2021-02-27T00:16:06.9653685Z  at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> 2021-02-27T00:16:06.9654385Z  at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> 2021-02-27T00:16:06.9655010Z  at 
> akka.actor.Actor.aroundReceive(Actor.scala:517)
> 2021-02-27T00:16:06.9655606Z  at 
> akka.actor.Actor.aroundReceive$(Actor.scala:515)
> 2021-02-27T00:16:06.9656223Z  at 
> akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> 2021-02-27T00:16:06.9656910Z  at 
> akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> 2021-02-27T00:16:06.9739719Z  at 
> akka.actor.ActorCell.invoke(ActorCell.scala:561)
> 2021-02-27T00:16:06.9740802Z  at 
> akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> 2021-02-27T00:16:06.9741521Z  at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> 2021-02-27T00:16:06.9742086Z  at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> 2021-02-27T00:16:06.9742776Z  ... 4 more
> 2021-02-27T00:16:06.9743982Z Caused by: 
> java.util.concurrent.CompletionException: 
> java.util.concurrent.TimeoutException: Invocation of public abstract 
> java.util.concurrent.CompletableFuture 
> org.apache.flink.runtime.taskexecutor.TaskExecutorGateway.submitTask(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor,org.apache.flink.runtime.jobmaster.JobMasterId,org.apache.flink.api.common.time.Time)
>  timed out.
> 2021-02-27T00:16:06.9745460Z  at 
> java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
> 2021-02-27T00:16:06.9746212Z  at 
> java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
> 2021-02-27T00:16:06.9746961Z  at 
> java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925)
> 2021-02-27T00:16:06.9747806Z  at 
> java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:913)
> 2021-02-27T00:16:06.9748553Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-02-27T00:16:06.9749330Z  at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> 2021-02-27T00:16:06.9750210Z  at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234)
> 2021-02-27T00:16:06.9751031Z  at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> 2021-02-27T00:16:06.9751954Z  at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> 2021-02-27T00:16:06.9752836Z  at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> 2021-02-27T00:16:06.9753594Z  at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> 2021-02-27T00:16:06.9754400Z  at 
> org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1044)
> 2021-02-27T00:16:06.9755076Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:263)
> 2021-02-27T00:16:06.9755623Z  at 
> akka.dispatch.OnComplete.internal(Future.scala:261)
> 2021-02-27T00:16:06.9756221Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
> 2021-02-27T00:16:06.9756841Z  at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
> 2021-02-27T00:16:06.9757772Z  at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
> 2021-02-27T00:16:06.9758524Z  at 
> org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
> 2021-02-27T00:16:06.9759315Z  at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
> 2021-02-27T00:16:06.9760053Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
> 2021-02-27T00:16:06.9760865Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
> 2021-02-27T00:16:06.9761785Z  at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
> 2021-02-27T00:16:06.9762565Z  at 
> akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:650)
> 2021-02-27T00:16:06.9763213Z  at 
> akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205)
> 2021-02-27T00:16:06.9763902Z  at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:870)
> 2021-02-27T00:16:06.9764625Z  at 
> scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:109)
> 2021-02-27T00:16:06.9765323Z  at 
> scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)
> 2021-02-27T00:16:06.9766035Z  at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:868)
> 2021-02-27T00:16:06.9766812Z  at 
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)
> 2021-02-27T00:16:06.9767768Z  at 
> akka.actor.LightArrayRevolverScheduler$$anon$3.executeBucket$1(LightArrayRevolverScheduler.scala:279)
> 2021-02-27T00:16:06.9768801Z  at 
> akka.actor.LightArrayRevolverScheduler$$anon$3.nextTick(LightArrayRevolverScheduler.scala:283)
> 2021-02-27T00:16:06.9769614Z  at 
> akka.actor.LightArrayRevolverScheduler$$anon$3.run(LightArrayRevolverScheduler.scala:235)
> 2021-02-27T00:16:06.9770261Z  at java.lang.Thread.run(Thread.java:748)
> 2021-02-27T00:16:06.9771578Z Caused by: 
> java.util.concurrent.TimeoutException: Invocation of public abstract 
> java.util.concurrent.CompletableFuture 
> org.apache.flink.runtime.taskexecutor.TaskExecutorGateway.submitTask(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor,org.apache.flink.runtime.jobmaster.JobMasterId,org.apache.flink.api.common.time.Time)
>  timed out.
> 2021-02-27T00:16:06.9773056Z  at 
> org.apache.flink.runtime.jobmaster.RpcTaskManagerGateway.submitTask(RpcTaskManagerGateway.java:68)
> 2021-02-27T00:16:06.9773869Z  at 
> org.apache.flink.runtime.executiongraph.Execution.lambda$deploy$10(Execution.java:832)
> 2021-02-27T00:16:06.9774669Z  at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
> 2021-02-27T00:16:06.9775423Z  at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 2021-02-27T00:16:06.9776076Z  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 2021-02-27T00:16:06.9776887Z  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> 2021-02-27T00:16:06.9777916Z  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 2021-02-27T00:16:06.9778756Z  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 2021-02-27T00:16:06.9779504Z  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 2021-02-27T00:16:06.9780025Z  ... 1 more
> 2021-02-27T00:16:06.9782426Z Caused by: akka.pattern.AskTimeoutException: Ask 
> timed out on [Actor[akka://flink/user/rpc/taskmanager_12#1583248815]] after 
> [10000 ms]. Message of type 
> [org.apache.flink.runtime.rpc.messages.LocalRpcInvocation]. A typical reason 
> for `AskTimeoutException` is that the recipient actor didn't send a reply.
> 2021-02-27T00:16:06.9783716Z  at 
> akka.pattern.PromiseActorRef$.$anonfun$defaultOnTimeout$1(AskSupport.scala:635)
> 2021-02-27T00:16:06.9784445Z  at 
> akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:650)
> 2021-02-27T00:16:06.9785102Z  at 
> akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205)
> 2021-02-27T00:16:06.9785957Z  at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:870)
> 2021-02-27T00:16:06.9786701Z  at 
> scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:109)
> 2021-02-27T00:16:06.9787410Z  at 
> scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)
> 2021-02-27T00:16:06.9788182Z  at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:868)
> 2021-02-27T00:16:06.9788982Z  at 
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)
> 2021-02-27T00:16:06.9789866Z  at 
> akka.actor.LightArrayRevolverScheduler$$anon$3.executeBucket$1(LightArrayRevolverScheduler.scala:279)
> 2021-02-27T00:16:06.9790699Z  at 
> akka.actor.LightArrayRevolverScheduler$$anon$3.nextTick(LightArrayRevolverScheduler.scala:283)
> 2021-02-27T00:16:06.9791644Z  at 
> akka.actor.LightArrayRevolverScheduler$$anon$3.run(LightArrayRevolverScheduler.scala:235)
> 2021-02-27T00:16:06.9792194Z  ... 1 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
