Re: HealthCheckExecutor.execute(ServiceReference) ??

Georg Henzler Mon, 30 Dec 2013 07:17:32 -0800

Hi Betrand and Christian,

now I'm afraid we might go a step backward... some things that yousuggest to be unnecessary are essentially important from my point ofview - supported by one year experience getting to this stage :). Hereare my explanations:


--- RE: executeXXX Methods

We definitely need the executeForTags(String... tags) and executeAll()methods because only this way we get parallel execution with intelligenttimeout handling and current results:

- If all checks are quick (let's say the the longest check requires100ms so finish), executeXxx() will return after 100ms having allcurrent results (Bertrand, your patch was returning the results of thelast call, if the last call was 2h ago, all results would be two hoursold!)- If all checks except one are quick and let's say the one check takes3s timing out, we get all current results for the quick checks and oneWARN timeout result for one check (if we wanted we could change it tothe last cached result, but for my last project this behaviour workedbetter, also see next bullet point)- The global timeout is normally configured reasonably ensuring thatwhen things are ok, checks never time out- Sometimes, if things go bad, a check can be stuck completely (I knowthis shouldn't happen, but I have seen it happening). For this case aspecial timeout can be configured marking checks as critical if theyexceed this timeout (10sec default)


The current implementation is robust and correct in many ways:

- no matter how many calls to the health check executor are made (e.g.100 per sec or only 1 every day), we get the latest results (respectingthe cache TTL) but without ever executing the same check twice (tworequest to the web console started at the exact same time will onlycause each check to be called once and the two requests will return theexact same result)

- it uses never more threads than checks exist in the system

- no matter how bad one check is implemented or how "sick" the aspectthe check observes is, it will never in any way cause the executor tostop working except that it returns red for the check (returning theresults for the other checks correctly)


--- RE: Health Check MetaData

It is important to distinguish two aspects:

1) defining the meta data
2) allow access to the meta data

to 1) I definitely think this should be done in a declarative way (notprogrammatically by adding a method to the HealthCheck interface!).There is just no need for touring complete code here. Using the serviceproperties as it's done in the moment is IMHO the best way.

to 2) There should be definitely a way to access the meta data from theresult. My way was to use the class HealthCheckDescriptor and add amethod result.getHealthCheckDescriptor() to the result (funny enoughthat class was called HealthCheckMetaData in my first version but Irenamed it later because I thought Descriptor is better). Having a flatresult structure was requested by Carsten, that's why we have themethods getHealthCheckName() and getHealthCheckTags() now. The cleanestsolution my opinion is to go back to result.getHealthCheckDescriptor() -or we can call it getHealthCheckMetaData() if you prefer. Also we cancreate an interface HealthCheckMetaData() and keep that in line with theservice properties.

--- RE: "why can't we provide these values from the HealthCheck thatwas used to create it?" (these values=timing information)


Because

- every check would have to do it (code duplication, pushing complexityto the implementor)

- it cannot be guaranteed, that every health check does it correctly


--- RE: UnsupportedOperationException in Result

Using the UnsupportedOperationException is IMHO an elegant solution.The exception is part of the JDK and for instance the Collection API isusing it as well. It is elegant because- From a conceptual/object oriented view, it totally makes sense tohave getElapsedTime() or getFinishedAt() as properties of a result.However, the HC itself cannot provide this information (see lastsection) and it's fine to throw UnsupportedOperationException for theseresults then.- If the methods would be taken away from the interface, a cast toExecutionResult would be needed (that is hidden at the moment anyway) toshow this information in the web console- From an implementors point of view, it is quickly obvious that themethods are only available when using the executor (the exceptionmessage is pretty clear)- Typical implementors would not even notice as they just construct theresult, but don't call methods on it (that is the job of the web consoleand JMX)- We should have getElapsedTime() and getFinishedAt() in theHealthCheckResult interface because this information is useful for humanobservers (and should be shown in web console / JMX)



-Georg


Am 30.12.2013 11:30, schrieb Carsten Ziegeler:

The more I think about it, the more I get the feeling that if wecan't awaywith execute(ServiceReference) in the executor api, this should bethe only
method of that service.
The two others can easily be replaced by using the HCFilter and then
calling the method above for each service reference. It's just oneline
more code, but this would completely remove the need to add meta
information about the HC in the result and make the wholeimplementation
easier.

WDYT?

Carsten


2013/12/30 Carsten Ziegeler <cziege...@apache.org>
2013/12/27 Bertrand Delacretaz <bdelacre...@apache.org>
Hi Carsten and Georg,

I was going to say that I still disagree, but being outnumbered I'd
let it go if it's just about a single method that I don'tlike...and
then I reviewed the org.apache.sling.hc.api package once again and
it's worse than I thought.
The problem is really, if we don't have a method with this signaturehowcan the jmx implementation make use of the executor while theexecutor isstill able to cache the result, do async processing etc. ? I don'tsee anyother solution atm, apart from moving the jmx implementation intothe core.
You guys both say you don't want metadata in APIs...and the
HealthCheckResult interface that you added has these two methods
(which make sense BTW):

  String getHealthCheckName();
  List<String> getHealthCheckTags();

So a HC is not brave enough to have this in its API, but a Result
needs them? This clearly demonstrates that we need a
HealthCheckMetadata interface that provides those values.
I think these are different concerns, a service(!) does not need anapi
to expose the metadata because it's already in the serviceregistration.
When we return a set of results, there need to be way of correlation
between a single result and an actual check, otherwise you get abunch of
results without any additional information about the check.
The way out of this, would be to just have a single method
execute(ServiceReference) in the executor :) - but as soon as weallowexecution of more than a single HC we need this additionalinformation.
On top of that, the Result throws UnsupportedException on these
methods...how ugly is that, why can't it provide these values fromthe
HealthCheck that was used to create it?
I haven't looked deeply at this part yet...
To me this smells of implementation-driven APIs, we are used tobetter
design in Sling.

As pointed out, I'm all ears for a better solution of the executor
interface that allows us to get the above mentioned features. And as
pointed out below, if we simplify the executor service to a singlemethod,
we don't need to make any changes to Result or add any additional
information.

Carsten

What do others think?
-Bertrand

[1]



https://svn.apache.org/repos/asf/sling/trunk/bundles/extensions/healthcheck/core/src/main/java/org/apache/sling/hc/api
--
Carsten Ziegeler
cziege...@apache.org

Re: HealthCheckExecutor.execute(ServiceReference) ??

Reply via email to