[ 
https://issues.apache.org/jira/browse/TS-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727922#comment-14727922
 ] 

Leif Hedstrom commented on TS-3848:
-----------------------------------

It's somewhat different in that an empty config + wait_for_cache==1 is one of 
two things:

1) A bad configuration made by the sysadmin.

2) A bad configuration pushed by the config management system (e.g. Salt, 
Puppet or Chef).


The case I'm primarily worried about is #2, which has happened several times. 
The case you bring up is interesting, but I still maintain that starting up 
with no cache when configured as above is a bad idea. There's also a big 
difference: the odds that a large number of machines all have bad disks like 
this are slim to none, whereas pushing a bad config to a large number of 
machines is a very real possibility.

With that in mind, I think we can do either of the following:

a) Make wait_for_cache=1 mean that if all disks in storage.config are down, 
ATS refuses to start up (in some way, I don't care how). Either way, don't let 
it proxy.

b) Add another enum value to this config that means "refuse to start unless at 
least one disk is available".


My argument is still that setting wait_for_cache=1 expresses a very strong 
wish: I do not want to risk killing the origin servers.
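
To make the difference between (a) and (b) concrete, here is a rough sketch of 
the decision taken once cache initialization has finished. It is not ATS code: 
the function names, the Action enum, and the value 2 used for the extra enum in 
(b) are all hypothetical, and an empty storage.config is modelled simply as 
zero usable disks.

```python
from enum import Enum


class Action(Enum):
    PROXY = "proxy"    # listen and serve requests
    WAIT = "wait"      # never listen (today's wait_for_cache=1 behavior)
    REFUSE = "refuse"  # refuse to start up; never proxy


# Hypothetical value for the extra enum in option (b); not a real ATS setting.
REQUIRE_ONE_DISK = 2


def option_a(wait_for_cache: int, disks_up: int) -> Action:
    """Option (a): wait_for_cache=1 itself refuses when no disk is usable."""
    if wait_for_cache == 0:
        return Action.PROXY
    # An empty storage.config also ends up here with disks_up == 0.
    return Action.REFUSE if disks_up == 0 else Action.PROXY


def option_b(wait_for_cache: int, disks_up: int) -> Action:
    """Option (b): value 1 is unchanged, a new value adds the refusal."""
    if wait_for_cache == 0 or disks_up > 0:
        return Action.PROXY
    if wait_for_cache == REQUIRE_ONE_DISK:
        return Action.REFUSE
    return Action.WAIT
```

In both variants a host with no usable disks never proxies once the stricter 
setting is in place; the only difference is whether existing wait_for_cache=1 
deployments get the new behavior automatically.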

> ATS runs without cache or partial cache on disk errors
> ------------------------------------------------------
>
>                 Key: TS-3848
>                 URL: https://issues.apache.org/jira/browse/TS-3848
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Cache
>            Reporter: Pushkar Pradhan
>            Assignee: Alan M. Carroll
>             Fix For: 6.1.0
>
>
> Problem:
> If ATS fails to initialize the cache (none of the disks were accessible), the 
> behavior depends on proxy.config.http.wait_for_cache:
> If wait_for_cache = 0, it will listen for requests and serve the requests (by 
> fetching from origin/parent/peer). 
> If wait_for_cache = 1, it will never listen for requests. This is almost like 
> a hang.
> We would like to change this so that we can take some action when the cache 
> fails to initialize (even partially):
> Proposed Solution:
> Define a new variable: proxy.config.http.cache.required
> Value range: 0-2
> 0 (default) - Do nothing
> 1 - Abort trafficserver if it failed to initialize all the disks/volumes
> 2 - Abort trafficserver if it failed to initialize even one of the disks or 
> volumes.
> Preconditions for this new behavior are:
> proxy.config.http.cache.http = 1 (HTTP caching enabled) and 
> proxy.config.http.wait_for_cache = 1.
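
For comparison, the proposal quoted above could be read roughly as follows. 
This is only a sketch of one interpretation: should_abort and its parameters 
are hypothetical names, "failed to initialize all the disks" is taken to mean 
that no disk initialized, and a host with zero configured disks is treated as 
"all failed", which the proposal itself does not spell out.

```python
def should_abort(cache_required: int,
                 wait_for_cache: int,
                 http_cache_enabled: bool,
                 disks_total: int,
                 disks_ok: int) -> bool:
    """Return True if trafficserver should abort after cache initialization."""
    # Preconditions from the proposal: HTTP caching on and wait_for_cache = 1.
    if not (http_cache_enabled and wait_for_cache == 1):
        return False

    all_failed = disks_ok == 0           # assumption: also covers zero disks
    any_failed = disks_ok < disks_total  # at least one disk/volume failed

    if cache_required == 1:
        return all_failed
    if cache_required == 2:
        return all_failed or any_failed
    return False  # 0 (default): do nothing


# Example: 4 disks configured, 3 initialized.
print(should_abort(1, 1, True, 4, 3))  # level 1 tolerates a partial cache
print(should_abort(2, 1, True, 4, 3))  # level 2 does not
```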



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
