[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished

2021-12-19 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462378#comment-17462378
 ] 

Xintong Song edited comment on FLINK-25373 at 12/20/21, 3:56 AM:
-

Hi [~SpongebobZ],

I think these threads are expected to be there. They belong to an I/O thread 
pool whose lifecycle is aligned with the task manager's. That means as long as 
the task manager is alive, this thread pool will not be released.

bq. I submit my Flink SQL jobs to the Flink standalone cluster and, unexpectedly, 
the TaskManagers do not free memory when all jobs are finished, whether they 
finish normally or not.
Could you share more details on how you observed that the memory is not freed? 
I'm asking because some of the memory is by design never freed, while some is 
freed only lazily by the JVM GC. I'm not entirely sure whether what you've 
observed is unexpected or not.
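As a side note, the parked threads in the dump are simply how idle `ThreadPoolExecutor` workers look. A minimal, hypothetical sketch (plain JDK, not Flink code) showing that a fixed pool's core threads stay alive after all submitted work has finished, until the pool itself is shut down:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class IdleWorkerDemo {

    // Submits short-lived tasks to a fixed pool and reports how many of the
    // pool's worker threads are still alive after all tasks have completed.
    static long idleWorkersAfterWork() {
        List<Thread> workers = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
            Thread t = new Thread(r, "demo-io-thread-" + workers.size());
            workers.add(t);
            return t;
        });
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> { /* short-lived task */ });
        }
        try {
            TimeUnit.MILLISECONDS.sleep(300); // let every task complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Core threads of a fixed pool remain alive, parked in
        // LinkedBlockingQueue.take() (thread state WAITING), until the pool
        // is explicitly shut down -- they are not released just because
        // their work is done.
        long alive = workers.stream().filter(Thread::isAlive).count();
        pool.shutdown();
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("idle worker threads still alive: " + idleWorkersAfterWork());
    }
}
```

Running jstack against such a process shows the idle workers WAITING in `LinkedBlockingQueue.take`, the same shape as the dump in this ticket.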



> task manager can not free memory when jobs are finished
> ---
>
> Key: FLINK-25373
> URL: https://issues.apache.org/jira/browse/FLINK-25373
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.14.0
> Environment: flink 1.14.0
>Reporter: Spongebob
>Priority: Major
> Attachments: image-2021-12-19-11-48-33-622.png
>
>
> I submit my Flink SQL jobs to the Flink standalone cluster and, unexpectedly, 
> the TaskManagers do not free memory when all jobs are finished, whether they 
> finish normally or not.
> I also found many threads named like `flink-taskexecutor-io-thread-x` whose 
> state was waiting on a condition.
> Here is the detail of these threads:
>  
> "flink-taskexecutor-io-thread-31" Id=5386 WAITING on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2da8b14c
> at sun.misc.Unsafe.park(Native Method)
> - waiting on 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2da8b14c
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> !image-2021-12-19-11-48-33-622.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished

2023-09-25 Thread yuanfenghu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033
 ] 

yuanfenghu edited comment on FLINK-25373 at 9/26/23 6:58 AM:
-

I also encountered the same problem.
 








[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished

2023-09-26 Thread yuanfenghu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033
 ] 

yuanfenghu edited comment on FLINK-25373 at 9/26/23 7:01 AM:
-

I also encountered the same problem.
Hi [~xtsong],
Our TMs run in Docker. The problem is triggered after a job is stopped on the 
cluster or hits an abnormal restart, and it eventually causes these containers 
to be killed.
 










[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished

2023-09-26 Thread yuanfenghu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033
 ] 

yuanfenghu edited comment on FLINK-25373 at 9/26/23 7:04 AM:
-

I also encountered the same problem.
Hi [~xtsong],
Our TMs run in Docker. The problem is triggered after a job is stopped on the 
cluster or hits an abnormal restart, and it eventually causes these containers 
to be killed.
I think this memory is not heap memory: JVM metrics show my heap at only 5 GB, 
while the top command shows more than 15 GB.
 

 








[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished

2023-09-26 Thread yuanfenghu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033
 ] 

yuanfenghu edited comment on FLINK-25373 at 9/26/23 7:48 AM:
-

I also encountered the same problem.
Hi [~xtsong],
Our TMs run in Docker. The problem is triggered after a job is stopped on the 
cluster or hits an abnormal restart, and it eventually causes these containers 
to be killed.
I think this memory is not heap memory: JVM metrics show my heap at only 5 GB, 
while the top command shows more than 15 GB.
Each TM has 6 slots, but there are more than 6 instances of 
org.apache.flink.runtime.memory.MemoryManager:

!image-2023-09-26-15-48-25-221.png!
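For what it's worth, a gap between heap metrics and top's RSS usually points at off-heap memory; Flink's managed memory and network buffers are allocated off-heap. A minimal, hypothetical sketch (plain JDK, not Flink code) of why heap usage barely moves while the process footprint grows:

```java
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class OffHeapDemo {

    // Allocates direct (off-heap) memory and returns how much the JVM heap
    // usage changed. Direct allocations grow the process RSS seen in top,
    // but are nearly invisible to heap metrics.
    static long heapDeltaAfterDirectAllocation(int bytes) {
        long before = ManagementFactory.getMemoryMXBean()
                .getHeapMemoryUsage().getUsed();
        // 64 MiB of native memory; only a tiny DirectByteBuffer wrapper
        // object lands on the heap.
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes);
        long after = ManagementFactory.getMemoryMXBean()
                .getHeapMemoryUsage().getUsed();
        // Keep a reference so the buffer survives the measurement.
        if (direct.capacity() != bytes) {
            throw new IllegalStateException("unexpected capacity");
        }
        return after - before;
    }

    public static void main(String[] args) {
        long delta = heapDeltaAfterDirectAllocation(64 * 1024 * 1024);
        // Heap barely moves even though the process grew by ~64 MiB.
        System.out.println("heap delta after 64 MiB direct alloc: " + delta + " bytes");
    }
}
```

If the JVM is started with `-XX:NativeMemoryTracking=summary`, `jcmd <pid> VM.native_memory summary` can attribute this kind of off-heap usage in more detail.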

 








[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished

2023-09-26 Thread yuanfenghu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033
 ] 

yuanfenghu edited comment on FLINK-25373 at 9/26/23 8:56 AM:
-

I also encountered the same problem.
Hi [~xtsong],
Our TMs run in Docker. The problem is triggered after a job is stopped on the 
cluster or hits an abnormal restart, and it eventually causes these containers 
to be killed.
I think this memory is not heap memory: JVM metrics show my heap at only 5 GB, 
while the top command shows more than 15 GB.



 

 





