Sal Merino edited a comment on Bug JENKINS-22685

Here's what's happening:

1. Restarting Jenkins (after a plugin update or by calling /restart) spawns a new process to execute "jenkins.exe restart" (i.e., restarting via the Windows Service wrapper). The Parent Process ID of this new process equals the Process ID of the process that is running jenkins.war.

2. Jenkins.exe's "restart" command is a three-part process: A) stop the jenkins.war PID and all of its child PIDs; B) wait for the jenkins.war PID to stop; C) launch jenkins.war in a new process.

3. Since this execution of "jenkins.exe restart" is itself a child process of jenkins.war, the Jenkins service is killing it as part of 2A. This results in:

3A. Killing "jenkins.exe restart" (via SIGINT) causes that process to return with errorlevel=1. The WindowsServiceLifecycle restart method checks the errorlevel returned by "jenkins.exe restart", sees that it is not 0, and therefore throws an IOException. This is the error message seen in jenkins.err.log.

3B. Since the process running "jenkins.exe restart" gets killed as part of Step 2A, it does not get a chance to complete the restart process, so Jenkins remains stopped.
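
To make steps 1 through 3 concrete, here is a toy model of the sequence. It is only a sketch in Python, not WinSW's actual C# code: the process table, the PIDs (100 and 200), and the names Proc, children_of, stop_process_and_children and run_restart are all hypothetical, and the model collapses the service and the restart helper into one function for brevity.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    ppid: int
    name: str
    alive: bool = True


def children_of(table, pid):
    """Return the live processes whose parent is `pid`."""
    return [p for p in table.values() if p.ppid == pid and p.alive]


def stop_process_and_children(table, pid, killed):
    """Step 2A: stop every descendant first, then the process itself."""
    for child in children_of(table, pid):
        stop_process_and_children(table, child.pid, killed)
    table[pid].alive = False
    killed.append(pid)


def run_restart(table, war_pid, helper_pid):
    """Model of the restart: the helper can only relaunch if it survives 2A."""
    killed = []
    stop_process_and_children(table, war_pid, killed)
    relaunched = table[helper_pid].alive      # steps 2B/2C need a live helper
    exit_code = 0 if relaunched else 1        # step 3A: non-zero errorlevel
    return killed, relaunched, exit_code


# Hypothetical process table: the helper (200) is a child of jenkins.war (100).
table = {
    100: Proc(100, 1, "java -jar jenkins.war"),
    200: Proc(200, 100, "jenkins.exe restart"),
}
killed, relaunched, exit_code = run_restart(table, war_pid=100, helper_pid=200)
print(killed, relaunched, exit_code)
```

In this model the helper is stopped before jenkins.war itself, so nothing is left to perform the relaunch, which matches 3A and 3B.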

Solution:

From looking at the WinSW project, it appears this issue was anticipated to some degree, given the "secret" "restart!" command that exists alongside the standard "restart". However, simply changing WindowsServiceLifecycle so that it calls "jenkins.exe restart!" does not fix the issue, as there are additional exceptions thrown that prevent the service from stopping. (To test this, I actually just switched the "restart" and "restart!" commands in WinSW, as that was easier to drop in.)

Something that should work: when WinSW.Main.StopProcessAndChildren() recurses over the jenkins.war PID and its children, it would skip killing a child PID if that process's ExecutablePath is equal to the WinSW executable path. I am testing this idea and will submit a pull request if it works.
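
A rough sketch of that idea, again as a toy Python model rather than the real WinSW code. The Proc type, the paths, and the PIDs are hypothetical, and the case-insensitive path comparison is my own assumption (Windows paths are normally case-insensitive). Note that in this sketch skipping a child also leaves that child's own descendants alone.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    ppid: int
    executable_path: str
    alive: bool = True


def stop_process_and_children(table, pid, wrapper_path, killed):
    """Stop `pid` and its descendants, but spare children running the wrapper."""
    children = [p for p in table.values() if p.ppid == pid and p.alive]
    for child in children:
        # Assumption: compare paths case-insensitively, as on Windows.
        if child.executable_path.lower() == wrapper_path.lower():
            continue  # this is the "jenkins.exe restart" helper, leave it running
        stop_process_and_children(table, child.pid, wrapper_path, killed)
    table[pid].alive = False
    killed.append(pid)


# Hypothetical paths and PIDs, for illustration only.
WRAPPER = r"C:\Jenkins\jenkins.exe"
table = {
    100: Proc(100, 1, r"C:\Java\bin\java.exe"),      # runs jenkins.war
    200: Proc(200, 100, r"C:\Jenkins\Jenkins.exe"),  # "jenkins.exe restart"
    300: Proc(300, 100, r"C:\Tools\build.exe"),      # some other child
}
killed = []
stop_process_and_children(table, 100, WRAPPER, killed)
print(killed, table[200].alive)
```

Here the other child and jenkins.war are stopped while the helper stays alive, so it would still be able to carry out 2B and 2C.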

Note:

There's also a bug in the output from WinSW.Main.StopProcessAndChildren: variable "pid" is the process being killed during this execution of the method (either the main jenkins.war PID or the PID of one of its children), but the "Send SIGINT/SIGINT to" log output uses process.Id, which is always the jenkins.war PID. So the log output looks like multiple SIGINTs are being sent to the same PID, which is not the case.
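
The logging problem can be reproduced in a few lines. This is a Python sketch with a hypothetical process tree, a made-up use_fix flag, and an assumed message format ("Send SIGINT" followed by a PID); it only illustrates the difference between logging the root PID and logging the PID of the current recursion step.

```python
def stop_process_and_children(tree, pid, root_pid, log, use_fix):
    """Walk the tree depth-first and record one log line per process stopped."""
    for child in tree.get(pid, []):
        stop_process_and_children(tree, child, root_pid, log, use_fix)
    # Buggy variant logs the root PID every time; fixed variant logs `pid`.
    logged_pid = pid if use_fix else root_pid
    log.append(f"Send SIGINT {logged_pid}")


# Hypothetical tree: jenkins.war (100) with two children (200 and 300).
tree = {100: [200, 300]}

buggy_log = []
stop_process_and_children(tree, 100, 100, buggy_log, use_fix=False)

fixed_log = []
stop_process_and_children(tree, 100, 100, fixed_log, use_fix=True)

print(buggy_log)
print(fixed_log)
```

The buggy variant reports the same PID three times even though three different processes were signalled, which is the misleading output described above.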

