[ 
https://issues.apache.org/jira/browse/DAEMON-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17856781#comment-17856781
 ] 

Mark Linley edited comment on DAEMON-460 at 6/21/24 2:28 PM:
-------------------------------------------------------------

[~markt],

I had a deeper look at this yesterday. I ran remote debug sessions with both 
"Tomcat 10" and our problematic Java windows service that uses 100% of a CPU 
core. The "Tomcat 10" service did not suffer from the same high CPU core usage 
problem. Both services were using the exact same JVM (openjdk version 
"11.0.14.1" 2022-02-08) and prunsrv.exe v1.3.4.

Something that caught my attention was a difference in the service log between 
each of the two services. Shortly after service startup, our problematic Java 
service has this line in the service log:
{quote}[2024-06-21 10:00:29] [debug] ( javajni.c:1092) [11744] Java worker 
thread finished ourservice/util/OurService:main with status = 0
{quote}
This didn't make sense to me because the app was still running and functioning 
seemingly normally despite the single CPU core being at 100%. This log 
statement is not present in the "Tomcat 10" service log until you stop the 
Tomcat10 windows service.

Then I thought about it, and it occurred to me that in our case our startup 
'main' method will spawn some application worker threads that our app needs to 
function and it then returns, i.e. it doesn't block. When stopping the windows 
service, the OurService::stop() method will then signal to the app threads to 
shutdown and our application then terminates normally.

The problem I think is that on the prunsrv.exe native C java worker thread side 
it likely detects that the JNI method call to OurService::main has terminated 
and therefore our application has terminated, so it places the prunsrv.exe 
native C java worker thread into state 0, which will trigger the tight loop we 
are seeing. I highlighted the line in a code snippet above (line 1165 above).

I tested this using a simple java application and it appears to confirm my 
suspicions.

When the java start main() method blocks, processing in a while loop waiting on 
a boolean condition and doing work every 5 seconds, there is 
low/normal/expected CPU usage. The stop() method will then change a static 
boolean variable which the loop is monitoring, and it will then terminate 
normally.

However, when the java service's main() start method spawns a thread to do work 
asynchronously and immediately returns. You then see prunsrv.exe detecting that 
and putting it's native java worker thread in state=0 triggering the tight loop 
then the CPU core immediately sits at 100%:
{quote}[2024-06-21 11:13:39] [debug] ( javajni.c:1067) [ 7928] Java worker 
thread started for test/commons/daemon/TestWindowsService:main
[2024-06-21 11:13:39] [debug] ( javajni.c:1073) [ 7928] JNI calling static void 
method test/commons/daemon/TestWindowsService:main
*[2024-06-21 11:13:39] [debug] ( javajni.c:1092) [ 7928] Java worker thread 
finished test/commons/daemon/TestWindowsService:main with status = 0*
[2024-06-21 11:13:40] [debug] ( prunsrv.c:1490) [ 2084] Java started 
'test/commons/daemon/TestWindowsService'.
[2024-06-21 11:13:40] [info]  ( prunsrv.c:1589) [ 2084] Service started in 1096 
milliseconds.
{quote}
 

I have a test app you can unzip to c:\temp attached [^test-windows-service.zip].

You can then run this command to register the Windows service and start it to 
test:

{{prunsrv.exe //IS//TestService --Startup=auto --DisplayName="Test Service" 
--Description="This is a test Java service" 
--Install="C:\temp\test-windows-service\prunsrv.exe" 
--Classpath="C:\temp\test-windows-service" 
--Jvm="C:\java11jdk\bin\client\jvm.dll" --StartMode=jvm --StopTimeout=30 
--StopMode=jvm --StartClass=test/commons/daemon/TestWindowsService 
--StopMethod=stop  --StopClass=test/commons/daemon/TestWindowsService 
--StartMethod=main --LogPath="C:\temp\test-windows-service" 
--LogPrefix=Test-Service-Log --LogLevel=Debug  --JvmMs=128 --JvmMx=256 
--JvmOptions=-Dfile.encoding=UTF8}}

Hopefully this helps recreate the issue on your side. Let me know if you need 
anything else.

Thanks

Mark


was (Author: plasm0r):
[~markt],

I had a deeper look at this yesterday. I ran remote debug sessions with both 
"Tomcat 10" and our problematic Java windows service that uses 100% of a CPU 
core. The "Tomcat 10" service did not suffer from the same high CPU core usage 
problem. Both services were using the exact same JVM (openjdk version 
"11.0.14.1" 2022-02-08) and prunsrv.exe v1.3.4.

Something that caught my attention was a difference in the service log between 
each of the two services. Shortly after service startup, our problematic Java 
service has this line in the service log:
{quote}[2024-06-21 10:00:29] [debug] ( javajni.c:1092) [11744] Java worker 
thread finished ourservice/util/OurService:main with status = 0
{quote}
This didn't make sense to me because the app was still running and functioning 
seemingly normally despite the single CPU core being at 100%. This log 
statement is not present in the "Tomcat 10" service log until you stop the 
Tomcat10 windows service.

Then I thought about it, and it occurred to me that in our case our startup 
'main' method will spawn some application worker threads that our app needs to 
function and it then returns, i.e. it doesn't block. When stopping the windows 
service, the OurService::stop() method will then signal to the app threads to 
shutdown and our application then terminates normally.

The problem I think is that on the prunsrv.exe native C java worker thread side 
it likely detects that the JNI method call to OurService::main has terminated 
and therefore our application has terminated, so it places the prunsrv.exe 
native C java worker thread into state 0, which will trigger the tight loop we 
are seeing. I highlighted the line in a code snippet above (line 1165 above).

I tested this using a simple java application and it appears to confirm my 
suspicions.

When the java start main() method blocks, processing in a while loop waiting on 
a boolean condition and doing work every 5 seconds, there is 
low/normal/expected CPU usage. The stop() method will then change a static 
boolean variable which the loop is monitoring, and it will then terminate 
normally.

However, when the java service's main() start method spawns a thread to do work 
asynchronously and immediately returns. You then see prunsrv.exe detecting that 
and putting it's native java worker thread in state=0 triggering the tight loop 
then the CPU core immediately sits at 100%:
{quote}[2024-06-21 11:13:39] [debug] ( javajni.c:1067) [ 7928] Java worker 
thread started for test/commons/daemon/TestWindowsService:main
[2024-06-21 11:13:39] [debug] ( javajni.c:1073) [ 7928] JNI calling static void 
method test/commons/daemon/TestWindowsService:main
*[2024-06-21 11:13:39] [debug] ( javajni.c:1092) [ 7928] Java worker thread 
finished test/commons/daemon/TestWindowsService:main with status = 0*
[2024-06-21 11:13:40] [debug] ( prunsrv.c:1490) [ 2084] Java started 
'test/commons/daemon/TestWindowsService'.
[2024-06-21 11:13:40] [info]  ( prunsrv.c:1589) [ 2084] Service started in 1096 
milliseconds.
{quote}
 

I have a test app you can unzip to c:\temp attached [^test-windows-service.zip].

You can then run this command to register the Windows service and start it to 
test:

<command>prunsrv.exe //IS//TestService --Startup=auto --DisplayName="Test 
Service" --Description="This is a test Java service" 
--Install="C:\temp\test-windows-service\prunsrv.exe" 
--Classpath="C:\temp\test-windows-service" 
--Jvm="C:\java11jdk\bin\client\jvm.dll" --StartMode=jvm --StopTimeout=30 
--StopMode=jvm --StartClass=test/commons/daemon/TestWindowsService 
--StopMethod=stop  --StopClass=test/commons/daemon/TestWindowsService 
--StartMethod=main --LogPath="C:\temp\test-windows-service" 
--LogPrefix=Test-Service-Log --LogLevel=Debug  --JvmMs=128 --JvmMx=256 
--JvmOptions=-Dfile.encoding=UTF8</command>

Hopefully this helps recreate the issue on your side. Let me know if you need 
anything else.

Thanks

Mark

> High CPU usage in prunsrv.exe since Daemon 1.3.3
> ------------------------------------------------
>
>                 Key: DAEMON-460
>                 URL: https://issues.apache.org/jira/browse/DAEMON-460
>             Project: Commons Daemon
>          Issue Type: Bug
>          Components: prunsrv
>    Affects Versions: 1.3.3
>            Reporter: Japie vd Linde
>            Priority: Major
>         Attachments: EspRun-Service-Log.2023-06-05.log, 
> image-2023-05-31-09-31-21-485.png, image-2023-06-05-13-38-38-435.png, 
> image-2024-05-29-15-56-35-585.png, image-2024-05-29-15-57-37-665.png, 
> image-2024-05-31-10-00-10-916.png, test-windows-service.zip
>
>
> When using the --StopTimeout=30 parameter on service using prunsrv the CPU 
> usage is reported as very high on Windows. Rolling back to older prunsrv 
> seems to resolve the problem. 
> !image-2023-05-31-09-31-21-485.png!
> What could be the possible causes for this problem?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to