[jira] [Commented] (FLINK-14525) buffer pool is destroyed

2020-10-19 Thread Zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217338#comment-17217338
 ] 

Zhijiang commented on FLINK-14525:
--

Closing this issue for cleanup, since the reporter has been unresponsive for a 
long time and the affected version is no longer maintained.

> buffer pool is destroyed
> 
>
> Key: FLINK-14525
> URL: https://issues.apache.org/jira/browse/FLINK-14525
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Network
> Affects Versions: 1.7.2
> Reporter: Saqib
> Priority: Major
>
> We have a Flink app running in standalone mode. The app runs fine in our 
> non-prod environments. However, on our prod server it throws this exception:
> Buffer pool is destroyed. 
>  
> The error is thrown as a RuntimeException on the collect call in the 
> flatmap function. The flatmap is just collecting a Tuple; 
> the Document is an XML Document object.
>  
> As mentioned, this does not happen in the non-prod environments (we have 
> several: DEV, QA, UAT). The UAT box is specced exactly like our prod host, 
> with 4 CPUs. The Java version is the same too.
>  
> Not sure how to proceed.
>  
> Thanks
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14525) buffer pool is destroyed

2019-10-28 Thread zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960815#comment-16960815
 ] 

zhijiang commented on FLINK-14525:
--

The stack trace above does not help in tracing the root cause. If you can get 
the JobMaster log, it should be easy to find the first failure, which is what 
caused the buffer pool to be destroyed.
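
One way to look for that first failure is to scan the JobMaster log for the 
earliest task that switched to FAILED. The sketch below is only an 
illustration: FirstFailureFinder is a hypothetical helper, not part of Flink, 
and the "switched from ... to FAILED" wording is an assumption about the log 
format that should be checked against the actual log.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for log triage; not part of Flink.
final class FirstFailureFinder {

    // Assumed wording of a task state transition in the JobMaster log,
    // e.g. "... switched from RUNNING to FAILED." Adjust to the real format.
    private static final String TRANSITION_MARKER = "switched from";
    private static final String FAILED_MARKER = " to FAILED";

    /** Returns the earliest line that reports a transition to FAILED, if any. */
    static Optional<String> firstFailure(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains(TRANSITION_MARKER)
                        && line.contains(FAILED_MARKER))
                .findFirst();
    }

    public static void main(String[] args) throws IOException {
        // Usage: java FirstFailureFinder <path to the JobMaster log>
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(
                firstFailure(lines).orElse("no transition to FAILED found"));
    }
}
```

The exception logged right after the reported line is the one to investigate; 
later "Buffer pool is destroyed" errors are then likely to be follow-up 
failures.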

> buffer pool is destroyed
> 
>
> Key: FLINK-14525
> URL: https://issues.apache.org/jira/browse/FLINK-14525
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Network
> Affects Versions: 1.7.2
> Reporter: Saqib
> Priority: Blocker
>





[jira] [Commented] (FLINK-14525) buffer pool is destroyed

2019-10-25 Thread Saqib (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960037#comment-16960037
 ] 

Saqib commented on FLINK-14525:
---

Here is the stack trace of the exception:

 

java.lang.RuntimeException: Buffer pool is destroyed.
 at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:110)
 at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89)
 at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
 at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
 at com.cs.ib.tarsan.cdds.flink.CMSAccountFilter.flatMap(CMSAccountFilter.java:51)
 at com.cs.ib.tarsan.cdds.flink.CMSAccountFilter.flatMap(CMSAccountFilter.java:15)
 at org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:50)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
 at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
 at com.cs.ib.tarsan.cdds.flink.CddsXMLDocumentCreator.flatMap(CddsXMLDocumentCreator.java:50)
 at com.cs.ib.tarsan.cdds.flink.CddsXMLDocumentCreator.flatMap(CddsXMLDocumentCreator.java:22)
2019-10-24 16:37:55.734 [Source: Custom 5< - GPID=30428415 ...Exception= Buffer pool is destroyed.
 at org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:50)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
 at org.apache.flink.streaming.api.operators.StreamFilter.processElement(StreamFilter.java:40)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
 at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
 at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
 at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
 at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collectWithTimestamp(StreamSourceContexts.java:111)
 at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordWithTimestamp(AbstractFetcher.java:398)
 at org.apache.flink.streaming.connectors.kafka.internal.Kafka010Fetcher.emitRecord(Kafka010Fetcher.java:89)
 at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:154)
 at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:665)
 at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:94)
 at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:58)
 at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
 at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.reques

[jira] [Commented] (FLINK-14525) buffer pool is destroyed

2019-10-25 Thread zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959508#comment-16959508
 ] 

zhijiang commented on FLINK-14525:
--

As [~wind_ljy] mentioned above, some other failure must have occurred in the 
job, which then triggered the cancellation of all tasks. The "Buffer pool is 
destroyed" error is correlated with that cancel operation. You can check 
further whether another task failed or a TaskExecutor was lost.
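
When going through the task logs, it can help to flag exceptions that are 
probably only a consequence of the cancellation, so that attention goes to 
whatever failed first. A minimal sketch, assuming the message text shown in the 
stack trace in this thread; SecondaryFailureCheck is a hypothetical helper, not 
a Flink API.

```java
// Hypothetical helper for log triage; not part of Flink.
final class SecondaryFailureCheck {

    // Message text as it appears in the stack trace posted in this thread.
    private static final String MESSAGE = "Buffer pool is destroyed";

    // Upper bound on the walk, in case a cause chain is cyclic.
    private static final int MAX_DEPTH = 32;

    /** True if any exception in the cause chain carries the message above. */
    static boolean isBufferPoolDestroyed(Throwable error) {
        Throwable current = error;
        for (int depth = 0; current != null && depth < MAX_DEPTH; depth++) {
            String message = current.getMessage();
            if (message != null && message.contains(MESSAGE)) {
                return true;
            }
            current = current.getCause();
        }
        return false;
    }
}
```

A match only suggests that the exception is a follow-up error. The actual 
cause still has to be found in the JobMaster or TaskExecutor logs.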






[jira] [Commented] (FLINK-14525) buffer pool is destroyed

2019-10-24 Thread Jiayi Liao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959386#comment-16959386
 ] 

Jiayi Liao commented on FLINK-14525:


I believe this is not the root cause. "Buffer pool is destroyed" occurs because 
the NettyShuffleEnvironment is closed, which is a fairly "normal" phenomenon 
when an exception is thrown.

It would be better if you could attach the full logs of the jobmanager and 
taskmanager. 



