[jira] [Work logged] (TS-4866) Remove traffic_cop health checking
[ https://issues.apache.org/jira/browse/TS-4866?focusedWorklogId=30185&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-30185 ]

ASF GitHub Bot logged work on TS-4866:
--------------------------------------

    Author: ASF GitHub Bot
    Created on: 05/Oct/16 19:00
    Start Date: 05/Oct/16 19:00
    Worklog Time Spent: 10m
    Work Description: Github user jacksontj commented on the issue:

    https://github.com/apache/trafficserver/pull/1035

    +1 on metric for this :)

Issue Time Tracking
-------------------

    Worklog Id: (was: 30185)
    Time Spent: 1h 10m (was: 1h)

> Remove traffic_cop health checking
> ----------------------------------
>
>                 Key: TS-4866
>                 URL: https://issues.apache.org/jira/browse/TS-4866
>             Project: Traffic Server
>          Issue Type: New Feature
>          Components: Cop
>   Affects Versions: 7.0.0
>            Reporter: James Peach
>            Assignee: Leif Hedstrom
>             Fix For: 7.0.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> There is a school of thought that {{traffic_cop}} health checking causes more
> problems than it solves. Consider whether we should eliminate health checking
> from {{traffic_cop}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Work logged] (TS-4866) Remove traffic_cop health checking.
[ https://issues.apache.org/jira/browse/TS-4866?focusedWorklogId=29563&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29563 ]

ASF GitHub Bot logged work on TS-4866:
--------------------------------------

    Author: ASF GitHub Bot
    Created on: 23/Sep/16 02:05
    Start Date: 23/Sep/16 02:05
    Worklog Time Spent: 10m
    Work Description: Github user zwoop closed the pull request at:

    https://github.com/apache/trafficserver/pull/1035

Issue Time Tracking
-------------------

    Worklog Id: (was: 29563)
    Time Spent: 1h (was: 50m)
[jira] [Work logged] (TS-4866) Remove traffic_cop health checking.
[ https://issues.apache.org/jira/browse/TS-4866?focusedWorklogId=29403&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29403 ]

ASF GitHub Bot logged work on TS-4866:
--------------------------------------

    Author: ASF GitHub Bot
    Created on: 20/Sep/16 15:36
    Start Date: 20/Sep/16 15:36
    Worklog Time Spent: 10m
    Work Description: Github user jpeach commented on the issue:

    https://github.com/apache/trafficserver/pull/1035

    I like this a lot. As a follow-on Jira, can you look into keeping some
    metrics (published to traffic_manager or simply to stdout on SIGUSR1)
    to make it easier to monitor?

Issue Time Tracking
-------------------

    Worklog Id: (was: 29403)
    Time Spent: 50m (was: 40m)
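
The "stdout on SIGUSR1" idea in the comment above could look roughly like the sketch below. This is only an illustration of the general pattern (a signal handler sets a flag, the main loop does the formatting), not code from the pull request. HealthCheckStats, its counters, on_sigusr1 and poll_stats_dump are all hypothetical names invented for this example.

```cpp
#include <csignal>
#include <sstream>
#include <string>

// Hypothetical counters; traffic_cop does not necessarily track these.
struct HealthCheckStats {
  unsigned long failed_checks = 0; // health checks that failed
  unsigned long skipped_kills = 0; // failures where configuration suppressed a kill
};

// Set by the signal handler. The handler does nothing else, so it stays
// async-signal-safe; all formatting happens later in the main loop.
static volatile std::sig_atomic_t dump_requested = 0;

extern "C" void on_sigusr1(int) { dump_requested = 1; }

// Render the counters as a single key=value line.
inline std::string format_stats(const HealthCheckStats &s) {
  std::ostringstream out;
  out << "failed_checks=" << s.failed_checks << " skipped_kills=" << s.skipped_kills;
  return out.str();
}

// Meant to be called once per main-loop iteration. Returns true and fills
// `line` only when a SIGUSR1 arrived since the previous call.
inline bool poll_stats_dump(const HealthCheckStats &s, std::string &line) {
  if (!dump_requested) {
    return false;
  }
  dump_requested = 0;
  line = format_stats(s);
  return true;
}
```

A caller would register the handler once with std::signal(SIGUSR1, on_sigusr1) and write the returned line to stdout whenever poll_stats_dump() reports true.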
[jira] [Work logged] (TS-4866) Remove traffic_cop health checking.
[ https://issues.apache.org/jira/browse/TS-4866?focusedWorklogId=29401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29401 ]

ASF GitHub Bot logged work on TS-4866:
--------------------------------------

    Author: ASF GitHub Bot
    Created on: 20/Sep/16 15:08
    Start Date: 20/Sep/16 15:08
    Worklog Time Spent: 10m
    Work Description: Github user PSUdaemon commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/1035#discussion_r79634032

    --- Diff: cmd/traffic_cop/traffic_cop.cc ---
    @@ -651,6 +661,26 @@ config_reload_records()
       config_read_int("proxy.config.cluster.rsport", &rs_port, true);
       config_read_int("proxy.config.cop.init_sleep_time", &init_sleep_time, true);

    +  config_read_int("proxy.config.cop.active_health_checks", &tmp, true);
    +  // 0 == No servers are killed
    +  // 1 == Only traffic_manager can be killed on failure
    +  // 2 == Only traffic_server can be killed on failure
    +  // 3 == Any failing healthchecks can cause restarts (default)
    +  switch (tmp) {
    +  case 0:
    +    active_health_checks = COP_KILL_NONE;
    +    break;
    +  case 1:
    +    active_health_checks = COP_KILL_MANAGER;
    +    break;
    +  case 2:
    +    active_health_checks = COP_KILL_SERVER;
    +    break;
    +  default:
    +    active_health_checks = COP_KILL_SERVER | COP_KILL_MANAGER;
    +    break;
    +  }
    +
    --- End diff --

    What do you think about skipping `tmp` and just saying something like:

    ```c++
    config_read_int("proxy.config.cop.active_health_checks", &active_health_checks, true);
    if ((active_health_checks < 0) || (active_health_checks > 3)) {
      active_health_checks = COP_KILL_SERVER | COP_KILL_MANAGER;
    }
    ```

Issue Time Tracking
-------------------

    Worklog Id: (was: 29401)
    Time Spent: 40m (was: 0.5h)
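
To compare the two variants discussed in this review comment, here is a small stand-alone sketch. It assumes the flag constants have the values 0, 1 and 2, which the thread does not show; the real definitions in traffic_cop.cc may differ. The function names are made up for illustration, and the config lookup is replaced by a plain integer argument.

```cpp
// Hypothetical flag values (assumption, not taken from traffic_cop.cc).
const int COP_KILL_NONE    = 0;
const int COP_KILL_MANAGER = 1;
const int COP_KILL_SERVER  = 2;

// Explicit mapping, mirroring the switch in the diff: every configured
// value is translated to a flag, and anything unknown falls back to the
// default of "both processes may be killed".
inline int kill_mask_from_switch(int configured) {
  switch (configured) {
  case 0:
    return COP_KILL_NONE;
  case 1:
    return COP_KILL_MANAGER;
  case 2:
    return COP_KILL_SERVER;
  default:
    return COP_KILL_SERVER | COP_KILL_MANAGER;
  }
}

// Range-check variant from the review comment: the configured value is
// used directly as the mask, and out-of-range values get the default.
inline int kill_mask_from_range_check(int configured) {
  if ((configured < 0) || (configured > 3)) {
    return COP_KILL_SERVER | COP_KILL_MANAGER;
  }
  return configured;
}
```

Under the assumed values the two functions agree for every input. The shorter form relies on the configured numbers being identical to the flag values; if the constants were defined differently, only the explicit switch would keep the documented meaning of 1 and 2.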
[jira] [Work logged] (TS-4866) Remove traffic_cop health checking.
[ https://issues.apache.org/jira/browse/TS-4866?focusedWorklogId=29393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29393 ]

ASF GitHub Bot logged work on TS-4866:
--------------------------------------

    Author: ASF GitHub Bot
    Created on: 20/Sep/16 13:44
    Start Date: 20/Sep/16 13:44
    Worklog Time Spent: 10m
    Work Description: Github user atsci commented on the issue:

    https://github.com/apache/trafficserver/pull/1035

    Linux build *successful*! See https://ci.trafficserver.apache.org/job/Github-Linux/736/ for details.

Issue Time Tracking
-------------------

    Worklog Id: (was: 29393)
    Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (TS-4866) Remove traffic_cop health checking.
[ https://issues.apache.org/jira/browse/TS-4866?focusedWorklogId=29392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29392 ]

ASF GitHub Bot logged work on TS-4866:
--------------------------------------

    Author: ASF GitHub Bot
    Created on: 20/Sep/16 13:43
    Start Date: 20/Sep/16 13:43
    Worklog Time Spent: 10m
    Work Description: Github user atsci commented on the issue:

    https://github.com/apache/trafficserver/pull/1035

    FreeBSD build *successful*! See https://ci.trafficserver.apache.org/job/Github-FreeBSD/840/ for details.

Issue Time Tracking
-------------------

    Worklog Id: (was: 29392)
    Time Spent: 20m (was: 10m)
[jira] [Work logged] (TS-4866) Remove traffic_cop health checking.
[ https://issues.apache.org/jira/browse/TS-4866?focusedWorklogId=29389&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29389 ]

ASF GitHub Bot logged work on TS-4866:
--------------------------------------

    Author: ASF GitHub Bot
    Created on: 20/Sep/16 13:28
    Start Date: 20/Sep/16 13:28
    Worklog Time Spent: 10m
    Work Description: GitHub user zwoop opened a pull request:

    https://github.com/apache/trafficserver/pull/1035

    TS-4866: Makes traffic_cop killing optional

    This adds a new configuration option,
    proxy.config.cop.active_health_checks:

        0 - traffic_cop is not allowed to kill any processes
        1 - Only traffic_manager can be killed on failed health checks
        2 - Only traffic_server can be killed on failed health checks
        3 - traffic_server and traffic_manager can be killed on failure (default)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zwoop/trafficserver TS-4866

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafficserver/pull/1035.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1035

commit 6c609d4acbf9de7524527f720b68dcd62c982812
Author: Leif Hedstrom
Date:   2016-09-19T22:19:25Z

    TS-4866: Makes traffic_cop killing optional

    This adds a new configuration option,
    proxy.config.cop.active_health_checks:

        0 - traffic_cop is not allowed to kill any processes
        1 - Only traffic_manager can be killed on failed health checks
        2 - Only traffic_server can be killed on failed health checks
        3 - traffic_server and traffic_manager can be killed on failure (default)

Issue Time Tracking
-------------------

    Worklog Id: (was: 29389)
    Time Spent: 10m
    Remaining Estimate: 0h
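
The four settings described in the pull request read naturally as a two-bit mask, one bit per process. The sketch below shows how a failed health check might be gated on that mask. It is not the code from the pull request: the constant values, the Target enum and may_kill() are assumptions made for this example, based only on the `COP_KILL_SERVER | COP_KILL_MANAGER` expression visible in the reviewed diff.

```cpp
// Hypothetical flag values (assumption): one bit per process.
const int COP_KILL_NONE    = 0;
const int COP_KILL_MANAGER = 1;
const int COP_KILL_SERVER  = 2;

// Hypothetical identifier for the process whose health check failed.
enum class Target { Manager, Server };

// Returns true when the configured mask allows killing the given process
// after a failed health check.
inline bool may_kill(int active_health_checks, Target target) {
  const int bit = (target == Target::Manager) ? COP_KILL_MANAGER : COP_KILL_SERVER;
  return (active_health_checks & bit) != 0;
}
```

With these assumed values, setting 0 permits no kills, 1 permits only traffic_manager, 2 permits only traffic_server, and 3 (the default) permits both, which matches the list in the commit message.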