[ 
https://issues.apache.org/jira/browse/FLINK-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190727#comment-17190727
 ] 

Huang Xiao commented on FLINK-19141:
------------------------------------

Hi, [~yunhui].

According to the log, it seems there are not enough slots to run this job.

For streaming jobs, the default slot request timeout (slot.request.timeout) is 
300s. If the scheduler cannot acquire the required slots within 300s, it throws 
this exception.

You can try decreasing the memory per TaskManager (-tm) or increasing the number 
of slots per TaskManager (-s) to increase the total number of slots in the 
cluster.
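
As a rough sketch of the arithmetic: the job's parallelism is not stated in the 
report, so the value 8 below is purely hypothetical; only -s 2 and -tm 12288 come 
from the yarn-session.sh command. Assuming default slot sharing, the job needs 
about as many slots as its highest operator parallelism.

```shell
# Hypothetical value: the report does not state the job's parallelism.
parallelism=8
# Taken from the reported yarn-session.sh command (-s 2, -tm 12288).
slots_per_tm=2
tm_memory_mb=12288

# Ceiling division: TaskManagers needed to provide `parallelism` slots.
tms_needed=$(( (parallelism + slots_per_tm - 1) / slots_per_tm ))
# Approximate memory YARN must grant for those TaskManagers.
total_memory_mb=$(( tms_needed * tm_memory_mb ))

echo "TaskManagers needed: $tms_needed"
echo "Approx. TaskManager memory requested from YARN: ${total_memory_mb} MB"
```

If the queue cannot grant that much memory, the slots never become available and 
the request times out. Under the same assumption, raising -s to 4 would halve the 
number of TaskManagers needed, and lowering -tm would shrink each container.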

Hope this can solve your problem :)

> Flink Job Submitted on Yarn Does not Work
> -----------------------------------------
>
>                 Key: FLINK-19141
>                 URL: https://issues.apache.org/jira/browse/FLINK-19141
>             Project: Flink
>          Issue Type: Bug
>          Components: Client / Job Submission
>    Affects Versions: 1.11.1
>            Reporter: Yunhui
>            Priority: Major
>
> I first launch a cluster on yarn.
> {code:java}
> $flink_path/bin/yarn-session.sh \
>   -qu dev \
>   -d -nm flink_cluster_1.11 \
>   -jm 8192 \
>   -tm 12288 \
>   -s 2 \
>   -D taskmanager.memory.framework.off-heap.size=2048m \
>   -D taskmanager.memory.managed.size=0{code}
> Then I submit my job with the following command
> {code:java}
> $flink_path/bin/flink run \
>   -d -m $host_port \
>   -c MyMainClass \
>   my-jar.jar{code}
> It takes a long time to schedule and ends with the following exception, 
> although it works with flink-1.10.1:
> {code:java}
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
>     at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at org.apache.flink.runtime.scheduler.DefaultScheduler.lambda$assignResourceOrHandleError$6(DefaultScheduler.java:422) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_77]
>     at org.apache.flink.runtime.jobmaster.slotpool.SchedulerImpl.lambda$internalAllocateSlot$0(SchedulerImpl.java:168) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_77]
>     at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$SingleTaskSlot.release(SlotSharingManager.java:726) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.release(SlotSharingManager.java:537) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at org.apache.flink.runtime.jobmaster.slotpool.SlotSharingManager$MultiTaskSlot.lambda$new$0(SlotSharingManager.java:432) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_77]
>     at org.apache.flink.runtime.concurrent.FutureUtils.lambda$forwardTo$21(FutureUtils.java:1120) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_77]
>     at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1036) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:517) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.11-1.11.1.jar:1.11.1]
>     at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.11-1.11.1.jar:1.11.1]
> Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
>     at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593) ~[?:1.8.0_77]
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) ~[?:1.8.0_77]
>     ... 25 more
> Caused by: java.util.concurrent.TimeoutException
>     ... 23 more
> {code}
>  



