[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished
[ https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462378#comment-17462378 ]

Xintong Song edited comment on FLINK-25373 at 12/20/21, 3:56 AM:
-

Hi [~SpongebobZ], I think these threads are expected to be there. They belong to an I/O thread pool whose lifecycle is aligned with the task manager. That means as long as the task manager is alive, this thread pool will not be released.

bq. I submit my Flink SQL jobs to the Flink standalone cluster and, beyond my expectation, the TaskManagers could not free memory after all jobs finished, whether normally or not.

Could you share more information on how you observe that the memory is not freed? I'm asking because some of the memory is designed not to be freed, while some is freed lazily, relying on JVM GC. I'm not entirely sure whether what you've observed is unexpected or not.
> task manager can not free memory when jobs are finished
> ---
>
> Key: FLINK-25373
> URL: https://issues.apache.org/jira/browse/FLINK-25373
> Project: Flink
> Issue Type: Bug
> Components: API / Core
> Affects Versions: 1.14.0
> Environment: flink 1.14.0
> Reporter: Spongebob
> Priority: Major
> Attachments: image-2021-12-19-11-48-33-622.png
>
> I submit my Flink SQL jobs to the Flink standalone cluster and, beyond my expectation, the TaskManagers could not free memory after all jobs finished, whether normally or not.
> I also found many threads named like `flink-taskexecutor-io-thread-x` whose states were waiting on conditions.
> Here is the detail of these threads:
>
> "flink-taskexecutor-io-thread-31" Id=5386 WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2da8b14c
> at sun.misc.Unsafe.park(Native Method)
> - waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2da8b14c
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> !image-2021-12-19-11-48-33-622.png!

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
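The WAITING state in the dump above is what an idle thread-pool worker normally looks like: it parks inside LinkedBlockingQueue.take() until the next task arrives, and it is only released when the pool itself is shut down. A minimal, self-contained sketch of that behavior (the pool name "demo-io-thread" and size 4 are made up for illustration, not Flink's actual I/O pool configuration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class IdleIoThreads {
    public static void main(String[] args) throws Exception {
        // A long-lived pool, analogous in lifecycle to the TaskExecutor's
        // io thread pool (thread name here is purely illustrative).
        ExecutorService ioPool = Executors.newFixedThreadPool(4, r -> {
            Thread t = new Thread(r, "demo-io-thread");
            t.setDaemon(true);
            return t;
        });

        // Submit a few trivial tasks, then let the pool fall idle.
        for (int i = 0; i < 8; i++) {
            ioPool.submit(() -> { });
        }
        Thread.sleep(200);

        // All four workers are still alive, parked WAITING in
        // LinkedBlockingQueue.take() -- the exact state in the thread dump.
        long waiting = Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith("demo-io-thread"))
                .filter(t -> t.getState() == Thread.State.WAITING)
                .count();
        System.out.println("idle workers still alive: " + waiting);

        // Only an explicit shutdown releases the workers; for the real pool
        // that happens when the task manager process terminates.
        ioPool.shutdown();
        ioPool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Such parked workers hold little beyond their thread stacks, so their presence alone is not evidence of leaked task memory.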
[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished
[ https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033 ]

yuanfenghu edited comment on FLINK-25373 at 9/26/23 6:58 AM:
-

I also encountered the same problem.

was (Author: JIRAUSER296932): I also encountered the same problem
[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished
[ https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033 ]

yuanfenghu edited comment on FLINK-25373 at 9/26/23 7:01 AM:
-

I also encountered the same problem.

Hi [~xtsong], our TM runs in Docker. This problem is triggered after a job is stopped on the original cluster, or when a job encounters an abnormal restart, which causes these containers to be killed.
[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished
[ https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033 ]

yuanfenghu edited comment on FLINK-25373 at 9/26/23 7:04 AM:
-

I also encountered the same problem.

Hi [~xtsong], our TM runs in Docker. This problem is triggered after a job is stopped on the original cluster, or when a job encounters an abnormal restart, which causes these containers to be killed.

I think this part is not heap memory, because through the JVM I saw that my heap memory is only 5G, but the top command shows more than 15G.
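A 5G heap alongside a 15G RSS typically points at off-heap memory: direct and network buffers, managed memory, metaspace, thread stacks, and allocator overhead, none of which the heap gauge counts. A small, purely illustrative probe showing that direct allocations are tracked outside the heap:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class OffHeapProbe {
    public static void main(String[] args) {
        // Allocate 64 MiB of direct (off-heap) memory, similar in kind to
        // what Flink uses for network buffers and managed memory.
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);

        long directUsed = 0;
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                directUsed = pool.getMemoryUsed();
            }
        }
        long heapUsed = ManagementFactory.getMemoryMXBean()
                .getHeapMemoryUsage().getUsed();

        // The allocation shows up in the "direct" buffer pool, not in the
        // heap gauge -- which is how a 5G heap can coexist with a 15G RSS.
        System.out.println("direct buffers >= 64 MiB: "
                + (directUsed >= 64L * 1024 * 1024));
        System.out.println("heap used (bytes) > 0: " + (heapUsed > 0));

        // Keep a reference so the buffer isn't collected mid-measurement.
        if (direct.capacity() == 0) throw new AssertionError();
    }
}
```

With `-XX:NativeMemoryTracking=summary` enabled, `jcmd <pid> VM.native_memory summary` breaks down the JVM-side portion of this gap; RSS beyond even that total is usually native-allocator behavior rather than a Flink leak.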
[jira] [Comment Edited] (FLINK-25373) task manager can not free memory when jobs are finished
[ https://issues.apache.org/jira/browse/FLINK-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769033#comment-17769033 ]

yuanfenghu edited comment on FLINK-25373 at 9/26/23 7:48 AM:
-

I also encountered the same problem.

Hi [~xtsong], our TM runs in Docker. This problem is triggered after a job is stopped on the original cluster, or when a job encounters an abnormal restart, which causes these containers to be killed.

I think this part is not heap memory, because through the JVM I saw that my heap memory is only 5G, but the top command shows more than 15G.

I have 6 slots per TM, but there are more than 6 instances of org.apache.flink.runtime.memory.MemoryManager.

!image-2023-09-26-15-48-25-221.png!