Hi Georg,

thanks for the explanation - I see your point of parallel execution, but :)
right now this is either executing all checks in parallel or based on tags.
If - for whatever reason - I have an arbitrary set of HCs to execute in
parallel, this is not possible atm. So what about having just a single
method?

execute(ServiceReference<HealthCheck>[] hcs)

The result from HealthCheckFilter.getTaggedHealthCheckServiceReferences()
can be fed directly into this method. And we solve the case where you have
a single service reference and arbitrary onces as well.

I also think we should move the executor service interface and the result
interface (regardless of how this looks) into a separate package.

For the result, I think we should leave the Result class untouched as this
is the result as produces by the HC - which is not exactly the same as the
result produced by the executor.
We can rename the result class from the Executor to
HealthCheckExecutionResult - to make this clear and instead of having the
same methods as the Result class, simple add a method which returns the
Result object that was returned by the HealthCheck and just add additional
methods for the additional information. Instead of returning the meta data
(like tags, name etc.) we could also simply return the ServiceReference -
as we now have the service reference as input, we can return it as well.

I've committed the changes except the last point to make it clearer what I
mean - we can easily revert if we disagree (rev 1554396)

WDYT?

Regards
Carsten





2013/12/30 Georg Henzler <[email protected]>

> Hi Betrand and Christian,
>
> now I'm afraid we might go a step backward... some things that you suggest
> to be unnecessary are essentially important from my point of view -
> supported by one year experience getting to this stage :).  Here are my
> explanations:
>
> --- RE: executeXXX Methods
>
> We definitely need the executeForTags(String... tags) and executeAll()
> methods because only this way we get parallel execution with intelligent
> timeout handling and current results:
>
> - If all checks are quick (let's say the the longest check requires 100ms
> so finish), executeXxx() will return after 100ms having all current results
> (Bertrand, your patch was returning the results of the last call, if the
> last call was 2h ago, all results would be two hours old!)
> - If all checks except one are quick and let's say the one check takes 3s
> timing out, we get all current results for the quick checks and one WARN
> timeout result for one check (if we wanted we could change it to the last
> cached result, but for my last project this behaviour worked better, also
> see next bullet point)
> - The global timeout is normally configured reasonably ensuring that when
> things are ok, checks never time out
> - Sometimes, if things go bad, a check can be stuck completely (I know
> this shouldn't happen, but I have seen it happening). For this case a
> special timeout can be configured marking checks as critical if they exceed
> this timeout (10sec default)
>
> The current implementation is robust and correct in many ways:
> - no matter how many calls to the health check executor are made (e.g. 100
> per sec or only 1 every day), we get the latest results (respecting the
> cache TTL) but without ever executing the same check twice (two request to
> the web console started at the exact same time will only cause each check
> to be called once and the two requests will return the exact same result)
> - it uses never more threads than checks exist in the system
> - no matter how bad one check is implemented or how "sick" the aspect the
> check observes is, it will never in any way cause the executor to stop
> working except that it returns red for the check (returning the results for
> the other checks correctly)
>
> --- RE: Health Check MetaData
>
> It is important to distinguish two aspects:
>
> 1) defining the meta data
> 2) allow access to the meta data
>
> to 1) I definitely think this should be done in a declarative way (not
> programmatically by adding a method to the HealthCheck interface!). There
> is just no need for touring complete code here. Using the service
> properties as it's done in the moment is IMHO the best way.
>
> to 2) There should be definitely a way to access the meta data from the
> result. My way was to use the class HealthCheckDescriptor and add a method
> result.getHealthCheckDescriptor() to the result (funny enough that class
> was called HealthCheckMetaData in my first version but I renamed it later
> because I thought Descriptor is better). Having a flat result structure was
> requested by Carsten, that's why we have the methods getHealthCheckName()
> and getHealthCheckTags() now. The cleanest solution my opinion is to go
> back to result.getHealthCheckDescriptor() - or we can call it
> getHealthCheckMetaData() if you prefer. Also we can create an interface
> HealthCheckMetaData() and keep that in line with the service properties.
>
> --- RE: "why can't we provide these values from the HealthCheck that was
> used to create it?" (these values=timing information)
>
> Because
> - every check would have to do it (code duplication, pushing complexity to
> the implementor)
> - it cannot be guaranteed, that every health check does it correctly
>
>
> --- RE: UnsupportedOperationException in Result
>
> Using the UnsupportedOperationException is IMHO an elegant solution. The
> exception is part of the JDK and for instance the Collection API is using
> it as well. It is elegant because
> - From a conceptual/object oriented view, it totally makes sense to have
> getElapsedTime() or getFinishedAt() as properties of a result. However, the
> HC itself cannot provide this information (see last section) and it's fine
> to throw UnsupportedOperationException for these results then.
> - If the methods would be taken away from the interface, a cast to
> ExecutionResult would be needed (that is hidden at the moment anyway) to
> show this information in the web console
> - From an implementors point of view, it is quickly obvious that the
> methods are only available when using the executor (the exception message
> is pretty clear)
> - Typical implementors would not even notice as they just construct the
> result, but don't call methods on it (that is the job of the web console
> and JMX)
> - We should have getElapsedTime() and getFinishedAt() in the
> HealthCheckResult interface because this information is useful for human
> observers (and should be shown in web console / JMX)
>
>
> -Georg
>
>
> Am 30.12.2013 11:30, schrieb Carsten Ziegeler:
>
>  The more I think about it, the more I get the feeling that if we can't
>> away
>> with execute(ServiceReference) in the executor api, this should be the
>> only
>> method of that service.
>> The two others can easily be replaced by using the HCFilter and then
>> calling the method above for each service reference. It's just one line
>> more code, but this would completely remove the need to add meta
>> information about the HC in the result and make the whole implementation
>> easier.
>>
>> WDYT?
>>
>> Carsten
>>
>>
>> 2013/12/30 Carsten Ziegeler <[email protected]>
>>
>>  2013/12/27 Bertrand Delacretaz <[email protected]>
>>>
>>>  Hi Carsten and Georg,
>>>>
>>>> I was going to say that I still disagree, but being outnumbered I'd
>>>> let it go if it's just about a single method that I don't like...and
>>>> then I reviewed the org.apache.sling.hc.api package once again and
>>>> it's worse than I thought.
>>>>
>>>>  The problem is really, if we don't have a method with this signature
>>> how
>>> can the jmx implementation make use of the executor while the executor is
>>> still able to cache the result, do async processing etc. ? I don't see
>>> any
>>> other solution atm, apart from moving the jmx implementation into the
>>> core.
>>>
>>>
>>>
>>>> You guys both say you don't want metadata in APIs...and the
>>>> HealthCheckResult interface that you added has these two methods
>>>> (which make sense BTW):
>>>>
>>>>   String getHealthCheckName();
>>>>   List<String> getHealthCheckTags();
>>>>
>>>> So a HC is not brave enough to have this in its API, but a Result
>>>> needs them? This clearly demonstrates that we need a
>>>> HealthCheckMetadata interface that provides those values.
>>>>
>>>> I think these are different concerns, a service(!) does not need an api
>>>>
>>> to expose the metadata because it's already in the service registration.
>>> When we return a set of results, there need to be way of correlation
>>> between a single result and an actual check, otherwise you get a bunch of
>>> results without any additional information about the check.
>>> The way out of this, would be to just have a single method
>>> execute(ServiceReference) in the executor :) - but as soon as we allow
>>> execution of more than a single HC we need this additional information.
>>>
>>>
>>>  On top of that, the Result throws UnsupportedException on these
>>>> methods...how ugly is that, why can't it provide these values from the
>>>> HealthCheck that was used to create it?
>>>>
>>>>  I haven't looked deeply at this part yet...
>>>
>>>
>>>> To me this smells of implementation-driven APIs, we are used to better
>>>> design in Sling.
>>>>
>>>> As pointed out, I'm all ears for a better solution of the executor
>>>>
>>> interface that allows us to get the above mentioned features. And as
>>> pointed out below, if we simplify the executor service to a single
>>> method,
>>> we don't need to make any changes to Result or add any additional
>>> information.
>>>
>>> Carsten
>>>
>>> What do others think?
>>>
>>>>
>>>> -Bertrand
>>>>
>>>> [1]
>>>>
>>>>
>>>>
>>>> https://svn.apache.org/repos/asf/sling/trunk/bundles/
>>>> extensions/healthcheck/core/src/main/java/org/apache/sling/hc/api
>>>>
>>>>
>>>
>>>
>>> --
>>> Carsten Ziegeler
>>> [email protected]
>>>
>>>
>


-- 
Carsten Ziegeler
[email protected]

Reply via email to