OK, there is an OOM exception, but this used to work fine with the same configuration.
There are four nodes: beam1 through beam4.
There are 4096 Kafka partitions, which is more than the degree of parallelism (3584).

jobmanager.rpc.address: beam1
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 102400
taskmanager.numberOfTaskSlots: 896
taskmanager.memory.preallocate: false
parallelism.default: 3584

Thanks for your valuable time, Till.

AnonymousParDo -> AnonymousParDo (3584/3584) (ebe8da5bda017ee31ad774c5bc5e5e88) switched from DEPLOYING to RUNNING
2016-11-08 22:51:44,471 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Read(UnboundedKafkaSource) -> AnonymousParDo -> AnonymousParDo (3573/3584) (ddf5a8939c1fc4ad1e6d71f17fe5ab0b) switched from RUNNING to FAILED
2016-11-08 22:51:44,474 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Read(UnboundedKafkaSource) -> AnonymousParDo -> AnonymousParDo (1/3584) (865c54432153a0230e62bf7610118ff8) switched from RUNNING to CANCELING
2016-11-08 22:51:44,474 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Status of job e61cada683c0f7a709101c26c2c9a17c (benchbeamrunners-abahman-1108225128) changed to FAILING.
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:714)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
        at java.util.concurrent.ThreadPoolExecutor.ensurePrestart(ThreadPoolExecutor.java:1587)
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:334)
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
        at java.util.concurrent.Executors$DelegatedScheduledExecutorService.schedule(Executors.java:729)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.registerTimer(StreamTask.java:652)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.registerTimer(AbstractStreamOperator.java:250)
        at org.apache.flink.streaming.api.operators.StreamingRuntimeContext.registerTimer(StreamingRuntimeContext.java:92)
        at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.setNextWatermarkTimer(UnboundedSourceWrapper.java:381)
        at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:233)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:78)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:56)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:224)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
        at java.lang.Thread.run(Thread.java:745)
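
A rough sanity check of the numbers above, as a sketch only and not a measurement. It assumes one TaskManager per node, and it assumes at least two native threads per source subtask (the task thread plus the timer thread that registerTimer tries to start in the trace above). The real thread count per TaskManager is certainly higher, since network, Kafka consumer, and JVM threads are not counted here. The variable names are illustrative.

```python
# Values taken from the configuration in this message.
nodes = 4                      # beam1 through beam4
slots_per_taskmanager = 896    # taskmanager.numberOfTaskSlots
parallelism = 3584             # parallelism.default

# Assumption: exactly one TaskManager runs on each node.
total_slots = nodes * slots_per_taskmanager
all_slots_used = (total_slots == parallelism)

# Assumption: each source subtask needs at least its task thread
# plus one timer thread (lower bound, other threads are ignored).
threads_per_subtask = 2
min_threads_per_taskmanager = slots_per_taskmanager * threads_per_subtask

print("total slots:", total_slots)
print("every slot occupied by the job:", all_slots_used)
print("lower bound on threads per TaskManager:", min_threads_per_taskmanager)
```

"unable to create new native thread" usually points at an operating-system limit on threads or processes for the user running the JVM (or at memory for thread stacks) rather than at the Java heap, so a lower bound like this is worth comparing against the per-user limits on each node.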


From: Till Rohrmann <till.rohrm...@gmail.com>
To: user@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
Sent: Tuesday, November 8, 2016 2:11 PM
Subject: Re: Why did the Flink Cluster JM crash?

Hi Amir,
what do the JM logs say?
Cheers,
Till
On Tue, Nov 8, 2016 at 9:33 PM, amir bahmanyari <amirto...@yahoo.com> wrote:

Hi colleagues,
I started the cluster without problems, and the Beam app started running in the Flink cluster fine. The dashboard showed all tasks being consumed and open for business. I started sending data to the Beam app, and all of a sudden the Flink JM crashed. Exceptions below.
Thanks + regards,
Amir

java.lang.RuntimeException: Pipeline execution failed
        at org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:113)
        at org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:48)
        at org.apache.beam.sdk.Pipeline.run(Pipeline.java:183)
        at benchmark.flinkspark.flink.BenchBeamRunners.main(BenchBeamRunners.java:622)  //p.run();
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:505)
        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403)
        at org.apache.flink.client.program.Client.runBlocking(Client.java:248)
        at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:866)
        at org.apache.flink.client.CliFrontend.run(CliFrontend.java:333)
        at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1189)
        at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1239)
Caused by: org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Communication with JobManager failed: Lost connection to the JobManager.
        at org.apache.flink.client.program.Client.runBlocking(Client.java:381)
        at org.apache.flink.client.program.Client.runBlocking(Client.java:355)
        at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:65)
        at org.apache.beam.runners.flink.FlinkPipelineExecutionEnvironment.executePipeline(FlinkPipelineExecutionEnvironment.java:118)
        at org.apache.beam.runners.flink.FlinkRunner.run(FlinkRunner.java:110)
        ... 14 more
Caused by: org.apache.flink.runtime.client.JobExecutionException: Communication with JobManager failed: Lost connection to the JobManager.
        at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:140)
        at org.apache.flink.client.program.Client.runBlocking(Client.java:379)
        ... 18 more
Caused by: org.apache.flink.runtime.client.JobClientActorConnectionTimeoutException: Lost connection to the JobManager.
        at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:244)
        at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:88)
        at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
        at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
        at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
        at akka.actor.ActorCell.invoke(ActorCell.scala:487)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
        at akka.dispatch.Mailbox.run(Mailbox.scala:221)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)



   
