Bertrand Delacretaz created SLING-3321:
------------------------------------------

             Summary: Incorrect caching/timeout behavior with slow health check
                 Key: SLING-3321
                 URL: https://issues.apache.org/jira/browse/SLING-3321
             Project: Sling
          Issue Type: Bug
          Components: Extensions
    Affects Versions: Health Check Core 1.0.8
            Reporter: Bertrand Delacretaz
            Assignee: Bertrand Delacretaz


We might not need to fix this right now; I'm just making a note of some tests 
I did with the SlowHealthCheckSample.

By default SlowHealthCheckSample takes 1200-3700 msec to execute, and I have 
set the cache lifetime to 5 seconds.

With these settings, executing the health check every second should always 
provide a result: even if a particular execute call takes more than the default 
2-second execution timeout, an older cached result should still be available, 
as 3700 (max execution time) + 1000 (execution period) is smaller than 5000 
(time to live in cache).
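
The arithmetic behind that expectation can be written out as a small check. 
This is only an illustration of the numbers quoted above; the class and method 
names are made up for this note and are not part of the Health Check Core API.

```java
public class CacheTimingCheck {
    // Values taken from the description above, all in milliseconds.
    static final long MAX_EXECUTION_MS = 3700; // slowest SlowHealthCheckSample run
    static final long CALL_PERIOD_MS = 1000;   // the check is requested every second
    static final long CACHE_TTL_MS = 5000;     // configured cache lifetime

    /** Upper bound used in the text: max execution time plus one call period. */
    static long worstCaseResultAge() {
        return MAX_EXECUTION_MS + CALL_PERIOD_MS;
    }

    /** True if that bound stays below the cache time to live. */
    static boolean cachedResultShouldBeAvailable() {
        return worstCaseResultAge() < CACHE_TTL_MS;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseResultAge() + " < " + CACHE_TTL_MS
                + " : " + cachedResultShouldBeAvailable());
    }
}
```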

I'll attach an execution log which shows that this is not the case. I see two 
problems:

# A result which times out is cached and reused, even though the actual 
execution might have finished in the meantime. We then get a timeout result and 
the actual result is thrown away. There's no " execution counter=2" result in 
my log for example.
# There's no way to say "execute the health check, but if it times out use an 
older result if still valid". We might need an execution option for that, as 
you don't always want that.
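
To make the two points concrete, here is a rough sketch of the behavior I have 
in mind. It is not the current executor code: FallbackOnTimeoutSketch, the 
useCachedOnTimeout flag and the plain String results are all hypothetical 
stand-ins, and it assumes only java.util.concurrent. The background task stores 
its result even if the caller has already given up (point 1), and the caller 
can opt in to reusing a still-valid older result on timeout (point 2).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class FallbackOnTimeoutSketch {
    /** A completed result together with the time it finished. */
    static final class Timed {
        final String value;
        final long finishedAtMs;
        Timed(String value, long finishedAtMs) {
            this.value = value;
            this.finishedAtMs = finishedAtMs;
        }
    }

    // Daemon threads so that slow checks do not keep the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    private final long timeoutMs;
    private final long ttlMs;
    private volatile Timed lastCompleted; // only real results, never timeouts

    FallbackOnTimeoutSketch(long timeoutMs, long ttlMs) {
        this.timeoutMs = timeoutMs;
        this.ttlMs = ttlMs;
    }

    /** Hypothetical execution option: reuse a valid older result on timeout. */
    String execute(Supplier<String> check, boolean useCachedOnTimeout) {
        Future<String> future = pool.submit(() -> {
            String result = check.get();
            // Stored even if the caller timed out in the meantime.
            lastCompleted = new Timed(result, System.currentTimeMillis());
            return result;
        });
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            Timed cached = lastCompleted;
            boolean valid = cached != null
                    && System.currentTimeMillis() - cached.finishedAtMs <= ttlMs;
            return (useCachedOnTimeout && valid) ? cached.value : "TIMEOUT";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "ERROR: interrupted";
        } catch (ExecutionException e) {
            return "ERROR: " + e.getCause();
        }
    }

    /** Simulates a slow check that sleeps before returning its value. */
    static Supplier<String> slow(String value, long sleepMs) {
        return () -> {
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return value;
        };
    }

    public static void main(String[] args) {
        FallbackOnTimeoutSketch executor = new FallbackOnTimeoutSketch(200, 5000);
        System.out.println(executor.execute(() -> "ok-1", true));
        System.out.println(executor.execute(slow("ok-2", 600), true));
        System.out.println(executor.execute(slow("ok-3", 600), false));
    }
}
```

In this sketch the last task to finish wins when several executions overlap, 
which is probably acceptable for a health check but would need a closer look 
in a real implementation.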

I think this is a realistic use case: checking external systems, for example, 
might have that kind of timing characteristics. I should be able to call the 
executor for such an HC every second and get a result every time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
