Hi Georg,

On Thu, Dec 5, 2013 at 7:51 AM, Georg Henzler <[email protected]> wrote:
> ...I just had a closer look at the Sling code
> and I like some of the concepts but believe some other things could maybe be
> improved...

Thanks for your review - I agree that we need better control over the
execution time and asynchronous execution of our health checks.

We discussed this recently [1], and what's suggested there is fairly similar
to what you suggest for health check execution, with timeouts and caching
of previously computed values.

> ...There is an emphasis on getting the overall status of the system: There is
> a Web Console Plugin
> and whiteboard servlet (not being dependent on sling) to retrieve an
> aggregated result of all
> health checks registered as services...

You can aggregate Sling health checks with the CompositeHealthCheck that's
briefly described at [3] and used in the health check samples. Would that
cover your use cases?

> As a first step, I would like to propose the following:
> * Introduce HealthCheckRunner to hc-core with the following signature:
> List<Result> HealthCheckRunner.runAllForTags(String... tags) // the
> list is sorted to put failed ones always on top...

I don't think I would sort here, that's a presentation concern - I prefer
having a stable order in the output of the execution service itself.

> * The HealthCheckRunner would use the existing class HealthCheckFilter to
> retrieve the service references

Sounds good

> * The Web Console would be adjusted to use HealthCheckRunner

Ok

> * I would add getExecutionTimeInMs() to org.apache.sling.hc.api.Result

If we're caching the Results, I'd add a creation timestamp, an expiration
time that can be set when creating the Result, and the execution duration as
you suggest.

> ...* Add parameter format=json to /system/console/healthcheck to provide the
> result in JSON format (to avoid an extra servlet, I think it is possible for
> console urls to return JSON but I would have to check)...

Maybe we don't need that as we have the SLING-2999 JMX resource provider,
but in general this makes sense.

If you want to provide a prototype health check executor service that would
be cool.
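
To make the points above about timeouts, stable ordering and cacheable
Results a bit more concrete, here's a rough sketch in plain Java. Everything
in it is an assumption for illustration only: HealthCheckRunnerSketch,
TimedResult and runAll are hypothetical names, the checks are plain
Callables instead of Sling HealthCheck services, and it uses a JDK executor
rather than any Sling service. It is not the existing hc-core API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch, not part of org.apache.sling.hc. */
class HealthCheckRunnerSketch {

    /** Hypothetical cacheable result: creation time, duration and expiration. */
    static final class TimedResult {
        final String checkName;
        final boolean ok;
        final long createdAtMs;
        final long durationMs;
        final long expiresAtMs;

        TimedResult(String checkName, boolean ok, long createdAtMs,
                    long durationMs, long ttlMs) {
            this.checkName = checkName;
            this.ok = ok;
            this.createdAtMs = createdAtMs;
            this.durationMs = durationMs;
            this.expiresAtMs = createdAtMs + ttlMs;
        }

        /** A cache would recompute the check once this returns true. */
        boolean isExpired(long nowMs) {
            return nowMs >= expiresAtMs;
        }
    }

    /**
     * Runs all checks concurrently and waits up to timeoutMs for each one.
     * Results come back in the iteration order of the input map, with no
     * sorting by status, so the output order is stable.
     */
    static List<TimedResult> runAll(Map<String, Callable<Boolean>> checks,
                                    long timeoutMs, long ttlMs)
            throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // Submit everything first so that slow checks overlap.
            Map<String, Future<TimedResult>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<Boolean>> e : checks.entrySet()) {
                final String name = e.getKey();
                final Callable<Boolean> check = e.getValue();
                pending.put(name, pool.submit(() -> {
                    long start = System.currentTimeMillis();
                    boolean ok;
                    try {
                        ok = check.call();
                    } catch (Exception ex) {
                        ok = false; // a throwing check counts as failed
                    }
                    long end = System.currentTimeMillis();
                    return new TimedResult(name, ok, end, end - start, ttlMs);
                }));
            }
            List<TimedResult> results = new ArrayList<>();
            for (Map.Entry<String, Future<TimedResult>> e : pending.entrySet()) {
                try {
                    results.add(e.getValue().get(timeoutMs, TimeUnit.MILLISECONDS));
                } catch (TimeoutException | ExecutionException ex) {
                    // No result in time: report a failure and use the
                    // timeout value as the recorded duration.
                    e.getValue().cancel(true);
                    results.add(new TimedResult(e.getKey(), false,
                            System.currentTimeMillis(), timeoutMs, ttlMs));
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A Web Console plugin or JSON view could then sort or filter the returned list
itself if it wants failed checks on top, leaving the runner's output
untouched.
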
Note that we have a Sling thread pools service [2] that's probably useful
for that.

-Bertrand

[1] http://markmail.org/message/ioatdxdogexacu2b
[2] http://sling.apache.org/documentation/bundles/apache-sling-commons-thread-pool.html
[3] http://sling.apache.org/documentation/bundles/sling-health-check-tool.html
