[ 
https://issues.apache.org/jira/browse/SLING-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846328#comment-13846328
 ] 

Georg Henzler edited comment on SLING-3278 at 12/12/13 2:29 PM:
----------------------------------------------------------------

A property for async execution can make sense when you want to make sure a 
check runs less often than the health check itself is called (e.g. only twice 
a day).

I'm pretty much done; only No. 2 of Bertrand's list and the unit tests are 
missing. If you like, you can have a look at the patches and give feedback 
before I submit a final one.

Impl Notes:
* The main entry method is 
org.apache.sling.hc.core.executor.HealthCheckExecutor.runAllForTags(String...)
* Results are cached for 2sec by default (configurable)
* Results now have a HealthCheckDescriptor that contains meta info for the 
check (also used in the executor as cache key etc.)
* Async is supported via the attribute hc.async.cronExpression; a service 
listener is in place for registering/unregistering jobs 
(org.apache.sling.hc.core.executor.AsyncHealthCheckExecutor)
* I added a natural order to results (failed tests first, then alphabetically 
by name); without it the order would be arbitrary (depending on execution 
time)
* The result has an additional finishDate and elapsedTime (I think finish date 
is more interesting for caching than the start date!)
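
To illustrate the ordering and the finish-date based caching from the notes 
above, here is a minimal sketch. It is not taken from the patch: SketchResult, 
ResultOrderingSketch, NATURAL_ORDER and isFresh are hypothetical names, and 
the freshness rule (finish time plus TTL) is an assumption about how the 
2 sec default could be applied.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an executed check's result (not the patch's class).
final class SketchResult {
    final String name;
    final boolean ok;
    final long finishedAtMillis; // when the check finished, used for caching

    SketchResult(String name, boolean ok, long finishedAtMillis) {
        this.name = name;
        this.ok = ok;
        this.finishedAtMillis = finishedAtMillis;
    }
}

final class ResultOrderingSketch {
    // Failed results (ok == false) sort first; ties are broken by name.
    static final Comparator<SketchResult> NATURAL_ORDER =
            Comparator.comparing((SketchResult r) -> r.ok)
                      .thenComparing(r -> r.name);

    // Assumption: a cached result is reusable while less than ttlMillis
    // have passed since the check finished.
    static boolean isFresh(SketchResult r, long nowMillis, long ttlMillis) {
        return nowMillis - r.finishedAtMillis < ttlMillis;
    }

    // Returns the names of the results in natural order, leaving the input untouched.
    static List<String> sortedNames(List<SketchResult> results) {
        List<SketchResult> copy = new ArrayList<>(results);
        copy.sort(NATURAL_ORDER);
        List<String> names = new ArrayList<>();
        for (SketchResult r : copy) {
            names.add(r.name);
        }
        return names;
    }
}
```

Measuring freshness from the finish time means a slow check does not count 
as stale the moment it returns.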

Other thoughts (not in patch):
* I'm not sure if the CompositeHealthCheck makes sense - is this not a grouping 
competing with the tags? It is easy to configure it in a way that some checks 
are executed twice, especially if you run all checks without giving a tag (and 
the HealthCheckExecutor cannot prevent it as the CompositeHealthCheck looks 
like any other check to it)
* Exceptions: The result should be able to carry an exception - I would even 
go as far as adding "throws Exception" to the execute() signature (this would 
not break any existing implementation classes) and generically add a final 
critical log entry if the HC happens to throw an exception
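
A rough sketch of the exception idea, using hypothetical types (ThrowingCheck, 
SafeRunnerSketch) rather than the real HealthCheck API: an interface method 
that declares "throws Exception" can still be implemented by classes that 
declare no exception at all, and the caller turns anything thrown into a 
critical outcome. Results are plain strings here only to keep the example 
short.

```java
// Hypothetical check interface; the real execute() returns a Result object.
interface ThrowingCheck {
    String execute() throws Exception;
}

// An existing-style implementation that declares no exception still compiles
// against the widened signature.
final class AlwaysFineCheck implements ThrowingCheck {
    @Override
    public String execute() {
        return "fine";
    }
}

final class SafeRunnerSketch {
    // Runs a check and converts any thrown exception into a critical outcome.
    static String runSafely(ThrowingCheck check) {
        try {
            return "OK: " + check.execute();
        } catch (Exception e) {
            return "CRITICAL: " + e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }
}
```

Code that calls execute() directly would need to handle the checked 
exception, so the compatibility claim applies to implementations, not callers.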


> Provide a HealthCheckExecutor service
> -------------------------------------
>
>                 Key: SLING-3278
>                 URL: https://issues.apache.org/jira/browse/SLING-3278
>             Project: Sling
>          Issue Type: New Feature
>          Components: Health Check
>            Reporter: Georg Henzler
>            Assignee: Georg Henzler
>         Attachments: 
> SLING-3278-hc.core-HealthCheckExecutorService-v0.5.patch, 
> SLING-3278-hc.webconsole-v0.5.patch
>
>
> Goals:
> * Be able to get an overall (aggregated) result as quickly as possible 
> (ideally <2sec)
> * Whenever possible, return most current results (e.g. for a memory check)
> * Provide a declarative way for async checks (async checks should be the 
> exception though) 
> Approach
> * Run checks in parallel
> * Make sure long running (or even stuck) checks are timed out
> * If a health check must run asynchronously (because its execution time 
> cannot be optimized), it should be enough to just specify a service property 
> (e.g. "hc.async").
> See also
> http://apache-sling.73963.n3.nabble.com/Health-Check-Improvements-td4029330.html#a4029402
> http://apache-sling.73963.n3.nabble.com/Health-checks-execution-service-td4028477.html
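
The approach in the quoted description (run checks in parallel, time out long 
running ones) could be sketched with java.util.concurrent as below. This is 
only an illustration under assumptions: ParallelChecksSketch, runAll and the 
status strings are hypothetical, and the patch may implement it differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class ParallelChecksSketch {
    // Runs all checks in parallel and returns a status per check name.
    // Checks still running after timeoutMillis are cancelled and reported
    // as TIMED_OUT.
    static Map<String, String> runAll(Map<String, Callable<Boolean>> checks, long timeoutMillis) {
        List<String> names = new ArrayList<>(checks.keySet());
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (String name : names) {
            tasks.add(checks.get(name));
        }
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            // invokeAll returns futures in task order and cancels unfinished tasks.
            List<Future<Boolean>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            Map<String, String> statuses = new LinkedHashMap<>();
            for (int i = 0; i < futures.size(); i++) {
                Future<Boolean> future = futures.get(i);
                String status;
                if (future.isCancelled()) {
                    status = "TIMED_OUT";
                } else {
                    try {
                        status = Boolean.TRUE.equals(future.get()) ? "OK" : "FAILED";
                    } catch (ExecutionException e) {
                        status = "CRITICAL"; // the check itself threw
                    }
                }
                statuses.put(names.get(i), status);
            }
            return statuses;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running checks", e);
        } finally {
            pool.shutdownNow(); // interrupt anything still running
        }
    }
}
```

A real executor would likely reuse a shared thread pool instead of creating 
one per call; the per-call pool just keeps the sketch self-contained.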



