[ https://issues.apache.org/jira/browse/DAEMON-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062041#comment-16062041 ]
Bernd Eckenfels commented on DAEMON-288:
----------------------------------------

Discussing this in DAEMON-366, since it looks like the change (probably) causes 1.0.15 to be less forceful in terminating a JVM. Any idea when it can happen that the main thread does not see the gShutdownEvent? (Should we maybe add a warning log in this case?)

> Hang while stopping procrun service
> -----------------------------------
>
>                 Key: DAEMON-288
>                 URL: https://issues.apache.org/jira/browse/DAEMON-288
>             Project: Commons Daemon
>          Issue Type: Bug
>          Components: Procrun
>    Affects Versions: 1.0.13
>         Environment: Windows 7 64 bit
>            Reporter: Mike Miller
>             Fix For: 1.0.15
>
>         Attachments: prunsrv.c.patch
>
>
> There is a hang of the procrun service while it is attempting to stop. It is
> not easy to reproduce (30%-5% depending on the PC). Using a debugger to
> analyze the hang, both the serviceMain() and serviceStop() threads appear to
> have run and exited. I can tell this from the state of global variables
> like gSargs and gShutdownEvent. Looking at the code, both threads call
> reportServiceStatus(SERVICE_STOPPED...). Typically, when either one reports
> SERVICE_STOPPED, the main thread unblocks and the process terminates. This
> often occurs without both threads running to completion. I think this is a
> race condition caused by the reportServiceStatus() usage. The MSDN
> documentation for SetServiceStatus() states to call SetServiceStatus()
> with SERVICE_STOPPED only after all cleanup has occurred, and to call it only
> once. It appears that procrun has a race condition where two threads both
> attempt to report SERVICE_STOPPED, and each is likely to report it while the
> other thread is still running. I believe this is the root cause of why the
> Service Control Manager is sometimes unable to stop the service.
>
> As a potential solution, I modified serviceStop() to not call
> reportServiceStatus(SERVICE_STOPPED...) and to move the
> SetEvent(gShutdownEvent) call to the end of the method. This change allows
> the thread running serviceStop() to complete. Now SERVICE_STOPPED is
> reported only when serviceMain() exits. With this refactoring to report
> SERVICE_STOPPED only once (per MSDN), the hang has not been reproducible.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
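
For readers following the thread, the ordering described in the issue can be sketched roughly as below. This is not the code from prunsrv.c or from the attached patch: it is a portable model that uses pthreads in place of the Win32 service API, and every name in it (service_state, service_stop, service_main, report_stopped, run_service_once) is hypothetical. The flag shutdown_signalled only plays the role of gShutdownEvent, and the counter stands in for calls to reportServiceStatus(SERVICE_STOPPED...).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the service state; not taken from prunsrv.c. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool shutdown_signalled; /* plays the role of gShutdownEvent */
    bool stop_cleanup_done;  /* set once the stop thread has finished its work */
    bool reported_early;     /* true if "stopped" was reported before cleanup */
    int  stopped_reports;    /* number of "stopped" reports issued */
} service_state;

/* Stand-in for reporting SERVICE_STOPPED. Caller must hold s->lock. */
static void report_stopped(service_state *s)
{
    if (!s->stop_cleanup_done)
        s->reported_early = true;
    s->stopped_reports++;
}

/* Stop thread: does its work, signals shutdown as its LAST step,
 * and never reports "stopped" itself. */
static void *service_stop(void *arg)
{
    service_state *s = arg;
    pthread_mutex_lock(&s->lock);
    s->stop_cleanup_done = true;   /* stands in for the real stop work */
    s->shutdown_signalled = true;  /* analogous to SetEvent(gShutdownEvent) */
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Main service thread: waits for the shutdown signal, then is the only
 * place that reports "stopped". */
static void service_main(service_state *s)
{
    pthread_mutex_lock(&s->lock);
    while (!s->shutdown_signalled)
        pthread_cond_wait(&s->cond, &s->lock);
    report_stopped(s);
    pthread_mutex_unlock(&s->lock);
}

/* Runs one start/stop cycle; returns the number of "stopped" reports,
 * or -1 if the stop thread could not be created. */
int run_service_once(service_state *s)
{
    pthread_t stopper;

    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->shutdown_signalled = false;
    s->stop_cleanup_done = false;
    s->reported_early = false;
    s->stopped_reports = 0;

    if (pthread_create(&stopper, NULL, service_stop, s) != 0)
        return -1;

    service_main(s);
    pthread_join(stopper, NULL);

    pthread_cond_destroy(&s->cond);
    pthread_mutex_destroy(&s->lock);
    return s->stopped_reports;
}
```

Calling run_service_once() on a service_state should yield exactly one "stopped" report, issued only after the stop thread's work is done. Whether the real main thread can miss the event, as asked in the comment above, is not something this model can answer.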