q4q5q6qw opened a new issue, #15783: URL: https://github.com/apache/dolphinscheduler/issues/15783
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

JVM parameter configuration of the master service: -Xms2g -Xmx2g -Xmn1g

The GC log (gc.log) and a heap dump (dump.hprof) were generated under /opt/cloud/csb-soc-3rd-dolphinscheduler-master-service/apache-dolphinscheduler-3.2.0-bin/master-server/, and the master service eventually stopped scheduling tasks.

```shell
2024-03-29T19:57:50.361+0800: 796606.193: [GC (Allocation Failure) [PSYoungGen: 1039742K->7646K(1040384K)] 1978860K->947204K(2088960K), 0.0631915 secs] [Times: user=0.05 sys=0.00, real=0.07 secs]
2024-03-29T19:58:39.396+0800: 796655.228: [GC (Allocation Failure) [PSYoungGen: 1039838K->7296K(1040384K)] 1979396K->947518K(2088960K), 0.1130164 secs] [Times: user=0.06 sys=0.00, real=0.11 secs]
2024-03-29T19:59:26.804+0800: 796702.635: [GC (Allocation Failure) [PSYoungGen: 1039488K->7296K(1040384K)] 1979710K->947950K(2088960K), 0.0434791 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
2024-03-29T20:00:21.340+0800: 796757.171: [GC (Allocation Failure) [PSYoungGen: 1039488K->7326K(1040384K)] 1980142K->948532K(2088960K), 0.1332294 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
2024-03-29T20:01:12.051+0800: 796807.883: [GC (Allocation Failure) [PSYoungGen: 1039518K->7566K(1040384K)] 1980724K->949212K(2088960K), 0.5198885 secs] [Times: user=0.10 sys=0.00, real=0.52 secs]
2024-03-29T20:02:10.025+0800: 796865.856: [GC (Allocation Failure) [PSYoungGen: 1039758K->7630K(1040384K)] 1981404K->949708K(2088960K), 0.1057110 secs] [Times: user=0.08 sys=0.00, real=0.11 secs]
2024-03-29T20:03:05.311+0800: 796921.142: [GC (Allocation Failure) [PSYoungGen: 1039822K->7790K(1040384K)] 1981900K->950308K(2088960K), 0.2611041 secs] [Times: user=0.08 sys=0.00, real=0.26 secs]
2024-03-29T20:03:53.279+0800: 796969.110: [GC (Allocation Failure) [PSYoungGen: 1039982K->7456K(1040384K)] 1982500K->950626K(2088960K), 0.0455552 secs] [Times: user=0.05 sys=0.00, real=0.04 secs]
2024-03-29T20:04:43.705+0800: 797019.537: [GC (Allocation Failure) [PSYoungGen: 1039648K->7392K(1040384K)] 1982818K->951027K(2088960K), 0.0480305 secs] [Times: user=0.06 sys=0.00, real=0.04 secs]
2024-03-29T20:05:34.431+0800: 797070.263: [GC (Allocation Failure) [PSYoungGen: 1039584K->7406K(1040384K)] 1983219K->951625K(2088960K), 0.0720115 secs] [Times: user=0.05 sys=0.00, real=0.07 secs]
2024-03-29T20:06:19.201+0800: 797115.040: [GC (Allocation Failure) [PSYoungGen: 1039598K->7520K(1040384K)] 1983817K->952155K(2088960K), 0.1395749 secs] [Times: user=0.06 sys=0.01, real=0.14 secs]
2024-03-29T20:07:06.857+0800: 797162.688: [GC (Allocation Failure) [PSYoungGen: 1039712K->7550K(1040384K)] 1984347K->952633K(2088960K), 0.0989028 secs] [Times: user=0.05 sys=0.00, real=0.10 secs]
2024-03-29T20:07:53.536+0800: 797209.367: [GC (Allocation Failure) [PSYoungGen: 1039742K->7584K(1040384K)] 1984825K->953099K(2088960K), 0.0613938 secs] [Times: user=0.06 sys=0.00, real=0.06 secs]
2024-03-29T20:08:41.681+0800: 797257.513: [GC (Allocation Failure) [PSYoungGen: 1039776K->7422K(1040384K)] 1985291K->953574K(2088960K), 0.0935569 secs] [Times: user=0.05 sys=0.00, real=0.10 secs]
2024-03-29T20:09:34.735+0800: 797310.566: [GC (Allocation Failure) [PSYoungGen: 1039614K->7614K(1040384K)] 1985766K->954182K(2088960K), 0.1850809 secs] [Times: user=0.06 sys=0.00, real=0.18 secs]
2024-03-29T20:10:35.604+0800: 797371.435: [GC (Allocation Failure) [PSYoungGen: 1039806K->7456K(1040384K)] 1986374K->954599K(2088960K), 0.1839166 secs] [Times: user=0.05 sys=0.00, real=0.19 secs]
2024-03-29T20:11:21.813+0800: 797417.644: [GC (Allocation Failure) [PSYoungGen: 1039648K->7822K(1040384K)] 1986791K->955390K(2088960K), 0.1277530 secs] [Times: user=0.06 sys=0.01, real=0.13 secs]
2024-03-29T20:12:09.507+0800: 797465.338: [GC (Allocation Failure) [PSYoungGen: 1040014K->7742K(1040384K)] 1987582K->955734K(2088960K), 0.0312008 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]
2024-03-29T20:12:57.583+0800: 797513.415: [GC (Allocation Failure) [PSYoungGen: 1039934K->7776K(1040384K)] 1987926K->956183K(2088960K), 0.1278959 secs] [Times: user=0.05 sys=0.01, real=0.13 secs]
2024-03-29T20:13:46.174+0800: 797562.005: [GC (Allocation Failure) [PSYoungGen: 1039968K->7582K(1040384K)] 1988375K->956687K(2088960K), 0.0466003 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
2024-03-29T20:14:35.058+0800: 797610.890: [GC (Allocation Failure) [PSYoungGen: 1039774K->7518K(1040384K)] 1988879K->957111K(2088960K), 0.0587399 secs] [Times: user=0.05 sys=0.00, real=0.05 secs]

dump.hprof:
"WorkflowExecuteThread-26" daemon prio=5 tid=135 RUNNABLE
    at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:48)
    at java.lang.Long.valueOf(Long.java:859)
    at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.taskFinished(WorkflowExecuteRunnable.java:419)
       Local Variable: org.apache.dolphinscheduler.dao.entity.ProcessInstance#51032
    at org.apache.dolphinscheduler.server.master.event.TaskStateEventHandler.handleStateEvent(TaskStateEventHandler.java:74)
       Local Variable: org.apache.dolphinscheduler.dao.entity.TaskInstance#45628
    at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable.handleEvents(WorkflowExecuteRunnable.java:293)
       Local Variable: org.apache.dolphinscheduler.server.master.event.TaskStateEvent#45629
       Local Variable: org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable#47518
    at org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThreadPool$$Lambda$1125.run(<unknown string>)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
       Local Variable: java.util.concurrent.Executors$RunnableAdapter#8
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       Local Variable: org.springframework.util.concurrent.ListenableFutureTask#8
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       Local Variable: java.util.concurrent.ThreadPoolExecutor$Worker#37
    at java.lang.Thread.run(Thread.java:750)

java.lang.Object[]  Number of Instances: 2.7%   Size: 56.1%
char[]              Number of Instances: 17.1%  Size: 23.5%
```

### What you expected to happen

Normal scheduling

### How to reproduce

Set the JVM startup parameters of the master service to: -Xms2g -Xmx2g -Xmn1g

Run 100 scheduled tasks every 5 minutes.

Observe the CPU monitoring curve of the host where the master service is located.

OOM occurs within half an hour in my environment.

### Anything else

_No response_

### Version

3.2.x

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
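
The GC lines quoted in the report can be turned into an old-generation trend, which may help when triaging: in each ParallelGC line, subtracting the young-generation figure after collection from the whole-heap figure after collection gives the old-generation usage. The following is a minimal Python sketch under that assumption. The function names `old_gen_after_gc` and `growth_kb_per_minute` are hypothetical helpers, not part of DolphinScheduler or the JVM, and the regex only targets the exact line format shown in this issue.

```python
import re

# Matches one young-collection line in the format quoted in this issue:
#   <uptime>: [GC (Allocation Failure) [PSYoungGen: A->B(C)] D->E(F), ...
GC_LINE = re.compile(
    r"(?P<uptime>\d+\.\d+): \[GC \(Allocation Failure\) "
    r"\[PSYoungGen: (?P<y_before>\d+)K->(?P<y_after>\d+)K\((?P<y_cap>\d+)K\)\] "
    r"(?P<h_before>\d+)K->(?P<h_after>\d+)K\((?P<h_cap>\d+)K\)"
)


def old_gen_after_gc(log_text):
    """Return (uptime_seconds, old_gen_used_kb, old_gen_capacity_kb) per GC line."""
    samples = []
    for m in GC_LINE.finditer(log_text):
        # Old generation = whole heap minus young generation (assumption:
        # the heap figure covers only young + old, as in ParallelGC output).
        old_used = int(m["h_after"]) - int(m["y_after"])
        old_cap = int(m["h_cap"]) - int(m["y_cap"])
        samples.append((float(m["uptime"]), old_used, old_cap))
    return samples


def growth_kb_per_minute(samples):
    """Average old-generation growth between the first and last sample."""
    (t0, used0, _), (t1, used1, _) = samples[0], samples[-1]
    return (used1 - used0) / (t1 - t0) * 60.0


# First and last GC entries copied from the log in this issue.
sample_log = (
    "2024-03-29T19:57:50.361+0800: 796606.193: [GC (Allocation Failure) "
    "[PSYoungGen: 1039742K->7646K(1040384K)] 1978860K->947204K(2088960K), "
    "0.0631915 secs] [Times: user=0.05 sys=0.00, real=0.07 secs]\n"
    "2024-03-29T20:14:35.058+0800: 797610.890: [GC (Allocation Failure) "
    "[PSYoungGen: 1039774K->7518K(1040384K)] 1988879K->957111K(2088960K), "
    "0.0587399 secs] [Times: user=0.05 sys=0.00, real=0.05 secs]\n"
)

samples = old_gen_after_gc(sample_log)
rate = growth_kb_per_minute(samples)
for uptime, used, cap in samples:
    print(f"uptime={uptime:.3f}s old_gen={used}K of {cap}K")
print(f"old-gen growth: {rate:.1f} K/min")
```

On the first and last entries quoted above, old-generation usage after GC is 939558K and 949593K out of 1048576K, so the old generation is already about 90% full and still creeping upward between young collections. That pattern is consistent with objects being retained across task runs, but it is only a reading of the log, not a confirmed diagnosis; the heap dump would be needed to identify what holds the `java.lang.Object[]` and `char[]` instances.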
