[ https://issues.apache.org/jira/browse/DAEMON-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856781#comment-17856781 ]
Mark Linley edited comment on DAEMON-460 at 6/21/24 3:48 PM: ------------------------------------------------------------- [~markt], I had a deeper look at this yesterday. I ran remote debug sessions with both "Tomcat 10" and our problematic Java windows service that uses 100% of a CPU core. The "Tomcat 10" service did not suffer from the same high CPU core usage problem. Both services were using the exact same JVM (openjdk version "11.0.14.1" 2022-02-08) and prunsrv.exe v1.3.4. Something that caught my attention was a difference in the service log between each of the two services. Shortly after service startup, our problematic Java service has this line in the service log: {quote}[2024-06-21 10:00:29] [debug] ( javajni.c:1092) [11744] Java worker thread finished ourservice/util/OurService:main with status = 0 {quote} This didn't make sense to me because the app was still running and functioning seemingly normally despite the single CPU core being at 100%. This log statement is not present in the "Tomcat 10" service log until you stop the Tomcat10 windows service. Then I thought about it, and it occurred to me that in our case our startup 'main' method will spawn some application worker threads that our app needs to function and it then returns, i.e. it doesn't block. When stopping the windows service, the OurService::stop() method will then signal to the app threads to shutdown and our application then terminates normally. The problem I think is that on the prunsrv.exe native C java worker thread side it likely detects that the JNI method call to OurService::main has terminated and therefore our application has terminated, so it places the prunsrv.exe native C java worker thread into state 0, which will trigger the tight loop we are seeing. I highlighted the line in a code snippet above (line 1165 above). I tested this using a simple java application and it appears to confirm my suspicions. When the java start main() method blocks, processing in a while loop waiting on a boolean condition and doing work every 5 seconds, there is low/normal/expected CPU usage. The stop() method will then change a static boolean variable which the loop is monitoring, and it will then terminate normally. However, when the java service's main() start method spawns a thread to do work asynchronously and immediately returns. You then see prunsrv.exe detecting that and putting it's native java worker thread in state=0 triggering the tight loop then the CPU core immediately sits at 100%: {quote}[2024-06-21 11:13:39] [debug] ( javajni.c:1067) [ 7928] Java worker thread started for test/commons/daemon/TestWindowsService:main [2024-06-21 11:13:39] [debug] ( javajni.c:1073) [ 7928] JNI calling static void method test/commons/daemon/TestWindowsService:main *[2024-06-21 11:13:39] [debug] ( javajni.c:1092) [ 7928] Java worker thread finished test/commons/daemon/TestWindowsService:main with status = 0* [2024-06-21 11:13:40] [debug] ( prunsrv.c:1490) [ 2084] Java started 'test/commons/daemon/TestWindowsService'. [2024-06-21 11:13:40] [info] ( prunsrv.c:1589) [ 2084] Service started in 1096 milliseconds. {quote} I have a test app you can unzip to c:\temp attached [^test-windows-service.zip]. Change directory to 'C:\temp\test-windows-service' and then run this command to register the Windows service and start it to test: {{prunsrv.exe //IS//TestService -Startup=auto --DisplayName="Test Service" --Description="This is a test Java service" --Install="C:\temp\test-windows-service\prunsrv.exe" --Classpath="C:\temp\test-windows-service" --Jvm="C:\java11jdk\bin\client\jvm.dll" --StartMode=jvm --StopTimeout=30 --StopMode=jvm --StartClass=test/commons/daemon/TestWindowsService --StopMethod=stop StopClass=test/commons/daemon/TestWindowsService --StartMethod=main --LogPath="C:\temp\test-windows-service" --LogPrefix=Test-Service-Log --LogLevel=Debug -JvmMs=128 --JvmMx=256 --JvmOptions=-Dfile.encoding=UTF8}} Hopefully this helps recreate the issue on your side. Let me know if you need anything else. Thanks Mark was (Author: plasm0r): [~markt], I had a deeper look at this yesterday. I ran remote debug sessions with both "Tomcat 10" and our problematic Java windows service that uses 100% of a CPU core. The "Tomcat 10" service did not suffer from the same high CPU core usage problem. Both services were using the exact same JVM (openjdk version "11.0.14.1" 2022-02-08) and prunsrv.exe v1.3.4. Something that caught my attention was a difference in the service log between each of the two services. Shortly after service startup, our problematic Java service has this line in the service log: {quote}[2024-06-21 10:00:29] [debug] ( javajni.c:1092) [11744] Java worker thread finished ourservice/util/OurService:main with status = 0 {quote} This didn't make sense to me because the app was still running and functioning seemingly normally despite the single CPU core being at 100%. This log statement is not present in the "Tomcat 10" service log until you stop the Tomcat10 windows service. Then I thought about it, and it occurred to me that in our case our startup 'main' method will spawn some application worker threads that our app needs to function and it then returns, i.e. it doesn't block. When stopping the windows service, the OurService::stop() method will then signal to the app threads to shutdown and our application then terminates normally. The problem I think is that on the prunsrv.exe native C java worker thread side it likely detects that the JNI method call to OurService::main has terminated and therefore our application has terminated, so it places the prunsrv.exe native C java worker thread into state 0, which will trigger the tight loop we are seeing. I highlighted the line in a code snippet above (line 1165 above). I tested this using a simple java application and it appears to confirm my suspicions. When the java start main() method blocks, processing in a while loop waiting on a boolean condition and doing work every 5 seconds, there is low/normal/expected CPU usage. The stop() method will then change a static boolean variable which the loop is monitoring, and it will then terminate normally. However, when the java service's main() start method spawns a thread to do work asynchronously and immediately returns. You then see prunsrv.exe detecting that and putting it's native java worker thread in state=0 triggering the tight loop then the CPU core immediately sits at 100%: {quote}[2024-06-21 11:13:39] [debug] ( javajni.c:1067) [ 7928] Java worker thread started for test/commons/daemon/TestWindowsService:main [2024-06-21 11:13:39] [debug] ( javajni.c:1073) [ 7928] JNI calling static void method test/commons/daemon/TestWindowsService:main *[2024-06-21 11:13:39] [debug] ( javajni.c:1092) [ 7928] Java worker thread finished test/commons/daemon/TestWindowsService:main with status = 0* [2024-06-21 11:13:40] [debug] ( prunsrv.c:1490) [ 2084] Java started 'test/commons/daemon/TestWindowsService'. [2024-06-21 11:13:40] [info] ( prunsrv.c:1589) [ 2084] Service started in 1096 milliseconds. {quote} I have a test app you can unzip to c:\temp attached [^test-windows-service.zip]. You can then run this command to register the Windows service and start it to test: {{prunsrv.exe //IS//TestService -Startup=auto --DisplayName="Test Service" --Description="This is a test Java service" --Install="C:\temp\test-windows-service\prunsrv.exe" --Classpath="C:\temp\test-windows-service" --Jvm="C:\java11jdk\bin\client\jvm.dll" --StartMode=jvm --StopTimeout=30 --StopMode=jvm --StartClass=test/commons/daemon/TestWindowsService --StopMethod=stop StopClass=test/commons/daemon/TestWindowsService --StartMethod=main --LogPath="C:\temp\test-windows-service" --LogPrefix=Test-Service-Log --LogLevel=Debug -JvmMs=128 --JvmMx=256 --JvmOptions=-Dfile.encoding=UTF8}} Hopefully this helps recreate the issue on your side. Let me know if you need anything else. Thanks Mark > High CPU usage in prunsrv.exe since Daemon 1.3.3 > ------------------------------------------------ > > Key: DAEMON-460 > URL: https://issues.apache.org/jira/browse/DAEMON-460 > Project: Commons Daemon > Issue Type: Bug > Components: prunsrv > Affects Versions: 1.3.3 > Reporter: Japie vd Linde > Priority: Major > Attachments: EspRun-Service-Log.2023-06-05.log, > image-2023-05-31-09-31-21-485.png, image-2023-06-05-13-38-38-435.png, > image-2024-05-29-15-56-35-585.png, image-2024-05-29-15-57-37-665.png, > image-2024-05-31-10-00-10-916.png, test-windows-service.zip > > > When using the --StopTimeout=30 parameter on service using prunsrv the CPU > usage is reported as very high on Windows. Rolling back to older prunsrv > seems to resolve the problem. > !image-2023-05-31-09-31-21-485.png! > What could be the possible causes for this problem? -- This message was sent by Atlassian Jira (v8.20.10#820010)