[GitHub] trafficserver pull request #1518: Ensure 'service trafficserver stop’ is s...

2017-02-28 Thread zwoop
Github user zwoop closed the pull request at:

https://github.com/apache/trafficserver/pull/1518


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] trafficserver pull request #1518: Ensure 'service trafficserver stop’ is s...

2017-02-28 Thread zwoop
GitHub user zwoop opened a pull request:

https://github.com/apache/trafficserver/pull/1518

Ensure 'service trafficserver stop' is synchronous under load on redhat

Currently, `service trafficserver stop` on Red Hat can return before ATS
has exited. That is a problem because a stop is often followed immediately
by a start, which then fails.

/etc/init.d/trafficserver does:

action "Stopping ${TC_NAME}:" killproc -p $TC_PIDFILE $TC_DAEMON
action "Stopping ${TM_NAME}:" killproc -p $TM_PIDFILE $TM_DAEMON
action "Stopping ${TS_NAME}:" killproc -p $TS_PIDFILE $TS_DAEMON

and killproc, as defined in /etc/rc.d/init.d/functions, with those 
arguments essentially does this:

1. send SIGTERM
2. wait up to 3 seconds for it to exit
3. send SIGKILL
4. wait 0.1 seconds, then return an exit status indicating whether the
   process still exists
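
The sequence above can be sketched as a small shell function. This is a
simplified stand-in, not the real killproc: the name stop_with_timeout and
its variables are hypothetical, it takes a PID rather than a pidfile, and it
polls once per second. It also assumes a sleep that accepts fractional
seconds, as GNU sleep does.

```shell
#!/bin/sh
# Simplified stand-in for the stop sequence described above; this is NOT the
# real killproc. stop_with_timeout and its variable names are hypothetical.
stop_with_timeout() {
    target_pid=$1
    term_wait=${2:-3}   # seconds to wait after SIGTERM (3 matches the text above)

    # Step 1: ask the process to exit. If the signal cannot be delivered,
    # this sketch simply treats the process as already gone.
    kill -TERM "$target_pid" 2>/dev/null || return 0

    # Step 2: poll once per second until it exits or the wait runs out.
    elapsed=0
    while [ "$elapsed" -lt "$term_wait" ]; do
        kill -0 "$target_pid" 2>/dev/null || return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    kill -0 "$target_pid" 2>/dev/null || return 0

    # Step 3: force termination.
    kill -KILL "$target_pid" 2>/dev/null

    # Step 4: wait briefly, then report whether the process still exists.
    # Fractional sleep is an extension (GNU coreutils supports it).
    sleep 0.1
    if kill -0 "$target_pid" 2>/dev/null; then
        return 1    # still present: a start issued right now could fail
    fi
    return 0
}
```

Calling `stop_with_timeout "$pid"` uses the 3 second wait; passing a larger
second argument, for example `stop_with_timeout "$pid" 35`, gives a slow
process more time to exit before SIGKILL is sent.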

A SIGKILL signal always causes the death of a process, but not
instantaneously. On a busy system a process can take a long time to exit
for assorted reasons, including flushing dirty buffers to disk.

So if the stop is immediately followed by a start, as it often is, the
start may fail with an error like ‘port 80 in use’. This seems to be a
common cause of restart failures on busy systems, and of frustrating manual
hand-holding.

Contrast this with how /etc/init.d/trafficserver behaves when run on
Ubuntu, where it waits up to 35 seconds.
(It also sends SIGQUIT instead of SIGTERM, which seems odd.)

This PR makes /etc/init.d/trafficserver more reliable and more consistent
on Red Hat by adding `-d 35` to the killproc arguments, so it waits for the
daemons to stop for about as long as it does on Ubuntu. Not perfect, but
much better.
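
As an illustration only, the stop commands with the longer delay might look
like the sketch below. The exact argument order in the patch is not shown in
this message, so placing `-d 35` after the pidfile option is an assumption,
and stop_daemons is a hypothetical wrapper name. The sketch relies on action
and killproc from /etc/rc.d/init.d/functions and on the TC_*, TM_* and TS_*
variables set earlier in the init script.

```shell
# Sketch of the Red Hat stop commands with the longer wait.
# Assumes action and killproc were sourced from /etc/rc.d/init.d/functions
# and that the TC_*/TM_*/TS_* variables are set earlier in the init script.
# stop_daemons is a hypothetical wrapper used only for this example.
stop_daemons() {
    # -d 35: give each daemon up to about 35 seconds to exit after SIGTERM
    action "Stopping ${TC_NAME}:" killproc -p "$TC_PIDFILE" -d 35 "$TC_DAEMON"
    action "Stopping ${TM_NAME}:" killproc -p "$TM_PIDFILE" -d 35 "$TM_DAEMON"
    action "Stopping ${TS_NAME}:" killproc -p "$TS_PIDFILE" -d 35 "$TS_DAEMON"
}
```

The daemons are still stopped one after another, so in the worst case the
whole stop can take roughly three times the delay.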

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zwoop/trafficserver TimsPatchSquashed

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/trafficserver/pull/1518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1518


commit 432151695cbd2dcedfffcb1baed620d335154f15
Author: Tim Bunce 
Date:   2017-02-28T18:14:15Z

Ensure 'service trafficserver stop' is synchronous under load on redhat
