Jeff Elsloo created TC-401:
------------------------------

             Summary: Traffic Router Serves OFFLINE Caches
                 Key: TC-401
                 URL: https://issues.apache.org/jira/browse/TC-401
             Project: Traffic Control
          Issue Type: Bug
          Components: Traffic Router
    Affects Versions: 2.0.0
            Reporter: Jeff Elsloo
             Fix For: 2.1.0


We identified an issue that causes Traffic Router to serve an {{OFFLINE}} 
cache indefinitely after a snapshot of the CRConfig. The inverse also occurs: 
a cache that was previously set to {{OFFLINE}} never has traffic routed to it 
after being set back to {{ONLINE}} or {{REPORTED}} (referred to only as 
{{ONLINE}} henceforth).

The bug is caused by {{ConfigHandler.processConfig()}} clearing the cache 
locations from the {{NetworkNode}} prior to swapping out the instance of 
{{CacheRegister}}. When the cache locations have been cleared, but the prior 
{{CacheRegister}} is still in place, a race condition can occur where the 
{{CacheLocation}} for a given cache group from the prior config will be set on 
the recently cleared {{NetworkNode}}. When this happens, the {{List<Cache>}} 
held by that {{CacheLocation}} is the prior config's list for that cache group, 
which means that any host state change between {{ONLINE}} and {{OFFLINE}} is 
not reflected. This is because a {{Cache}} is dropped from the CRConfig when it 
transitions to {{OFFLINE}} and reappears when it is set back to {{ONLINE}}. 
Contrast this with a transition from {{ONLINE}} to {{ADMIN_DOWN}}: the 
{{Cache}} remains in the CRConfig, so we simply use its status to determine 
whether the cache is available, and the software works as designed.

The race is possible because of the way we use lazy loading to associate 
network ranges within the CZF with {{CacheLocations}} within a given 
{{NetworkNode}} representing that section of the CZF. In {{TrafficRouter}}, 
during cache selection, if we have a hit in the coverage zone file but the 
{{CacheLocation}} is uninitialized, we obtain the {{CacheLocation}} from 
{{CacheRegister}} and set it for that specific {{NetworkNode}}. If our 
{{NetworkNode}} has been cleared but our {{CacheRegister}} has yet to be 
swapped, we set the {{NetworkNode}} to the old {{CacheLocation}}, which, as 
mentioned, holds a reference to the prior {{List<Cache>}}. Because the 
{{NetworkNode}} is then no longer uninitialized, this prevents it from being 
populated with the new {{CacheLocation}} and new {{List<Cache>}}.
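
The interleaving described above can be replayed on a single thread so that it 
is deterministic. The following is only a minimal sketch: {{Location}}, 
{{Register}}, {{Node}}, {{select()}}, {{snapshot()}}, and the cache and cache 
group names are hypothetical stand-ins, not the actual Traffic Router classes 
or the real body of {{ConfigHandler.processConfig()}}. The swap-first ordering 
is shown only for contrast within this sketch and is an assumption, not 
necessarily the fix that was applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins; NOT the real Traffic Router classes.
class StaleLocationSketch {

    // Stand-in for CacheLocation: the caches one config lists for a cache group.
    static final class Location {
        final List<String> caches;
        Location(List<String> caches) { this.caches = new ArrayList<>(caches); }
    }

    // Stand-in for CacheRegister: cache group name -> location, built per snapshot.
    static final class Register {
        final Map<String, Location> byGroup = new HashMap<>();
    }

    // Stand-in for NetworkNode: holds a lazily loaded location reference.
    static final class Node {
        final String group;
        Location location; // null means "uninitialized"
        Node(String group) { this.group = group; }
    }

    static Register register; // the currently installed register

    // Builds a register for one cache group from the caches present in a config.
    static Register snapshot(String group, String... caches) {
        Register r = new Register();
        r.byGroup.put(group, new Location(Arrays.asList(caches)));
        return r;
    }

    // Lazy loading during cache selection: fill the node only if it is empty.
    static List<String> select(Node node) {
        if (node.location == null) {
            node.location = register.byGroup.get(node.group);
        }
        return node.location.caches;
    }

    // Clear the node, serve a request, then swap the register (the racy order).
    static List<String> replayClearThenSwap() {
        Node node = new Node("cg-east");
        register = snapshot("cg-east", "edge-1", "edge-2");
        select(node);                                   // node holds the old location
        Register next = snapshot("cg-east", "edge-1");  // edge-2 OFFLINE, so dropped
        node.location = null;                           // step 1: node cleared
        select(node);                                   // request re-pins the OLD location
        register = next;                                // step 2: swap comes too late
        return select(node);                            // node is non-null, never reloaded
    }

    // Swap the register first, then clear the node (assumed ordering, for contrast).
    static List<String> replaySwapThenClear() {
        Node node = new Node("cg-east");
        register = snapshot("cg-east", "edge-1", "edge-2");
        select(node);
        register = snapshot("cg-east", "edge-1");       // new register installed first
        node.location = null;                           // then the node is cleared
        return select(node);                            // lazy load sees the new register
    }

    public static void main(String[] args) {
        System.out.println("clear then swap: " + replayClearThenSwap());
        System.out.println("swap then clear: " + replaySwapThenClear());
    }
}
```

In this sketch the clear-then-swap replay still returns {{edge-2}} after the 
new snapshot has dropped it, while the swap-then-clear replay returns only 
{{edge-1}}. A real implementation would also need to consider visibility 
between threads, which this single-threaded replay does not model.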



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
