Aled Sage created BROOKLYN-176:
----------------------------------
Summary: Expensive polling of driver.isRunning is too frequent
Key: BROOKLYN-176
URL: https://issues.apache.org/jira/browse/BROOKLYN-176
Project: Brooklyn
Issue Type: Improvement
Affects Versions: 0.9.0-SNAPSHOT
Reporter: Aled Sage
Many of our entities poll driver.isRunning by default, calling it every 5
seconds. This often involves an ssh execution, which is CPU-intensive, so it
doesn't scale.
This polling should only be done if there is no other/better way to determine
if an entity is unhealthy. We should use other mechanisms (e.g. checking if a
web-server is reachable over http), and only resort to calling driver.isRunning
to provide additional information if that other check fails.
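The ordering proposed above can be sketched roughly as follows. This is only an illustration, not Brooklyn code: `TieredHealthCheck`, `Result` and both suppliers are hypothetical names, with the "expensive" supplier standing in for a call to driver.isRunning.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical sketch (not Brooklyn API): consult a cheap health check on every
 * poll, and run the expensive check only when the cheap one fails.
 */
class TieredHealthCheck {

    /** Outcome of a single poll. */
    static final class Result {
        final boolean healthy;
        final String diagnostic; // null when healthy

        Result(boolean healthy, String diagnostic) {
            this.healthy = healthy;
            this.diagnostic = diagnostic;
        }
    }

    private final BooleanSupplier cheapCheck;     // e.g. "is the web-server reachable over http?"
    private final BooleanSupplier expensiveCheck; // stands in for driver.isRunning (ssh)
    private int expensiveCalls = 0;

    TieredHealthCheck(BooleanSupplier cheapCheck, BooleanSupplier expensiveCheck) {
        this.cheapCheck = cheapCheck;
        this.expensiveCheck = expensiveCheck;
    }

    /** Runs one poll; the expensive check is skipped while the cheap check passes. */
    Result poll() {
        if (cheapCheck.getAsBoolean()) {
            return new Result(true, null);
        }
        // The cheap check failed: pay for the expensive check, purely for diagnostics.
        expensiveCalls++;
        boolean processRunning = expensiveCheck.getAsBoolean();
        return new Result(false, processRunning
                ? "process is running, but the cheap check failed"
                : "process is not running");
    }

    /** Number of times the expensive check has been invoked so far. */
    int expensiveCalls() {
        return expensiveCalls;
    }
}
```

With this shape, a healthy entity never triggers the ssh-based check; it is only paid for on the (hopefully rare) polls where the cheap check fails.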
However, turning it off is a little tricky in the current code.
SoftwareProcess.initEnrichers does:
{noformat}
ServiceNotUpLogic.updateNotUpIndicator(this, SERVICE_PROCESS_IS_RUNNING, "No
information yet on whether this service is running");
{noformat}
The `SERVICE_PROCESS_IS_RUNNING` indicator is cleared by `connectServiceUpIsRunning`,
which polls `driver.isRunning`. If you don't call
`connectServiceUpIsRunning` then you'd need to do something yourself to ensure
`SERVICE_PROCESS_IS_RUNNING` is cleared.
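To illustrate why an uncleared indicator matters, here is a simplified model of a not-up indicators map. It is a sketch under assumptions, not the real `ServiceNotUpLogic`: the class `NotUpIndicators` and its methods are hypothetical, and it assumes only that any remaining entry keeps the service marked as not up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical model (not Brooklyn code) of a not-up indicators map:
 * the service counts as up only while the map is empty.
 */
class NotUpIndicators {
    private final Map<String, String> indicators = new LinkedHashMap<>();

    /** Records (or replaces) a reason why the service is considered not up. */
    void update(String key, String reason) {
        indicators.put(key, reason);
    }

    /** Removes an indicator, e.g. once some health check has passed. */
    void clear(String key) {
        indicators.remove(key);
    }

    /** The service is up only if no indicator remains. */
    boolean isServiceUp() {
        return indicators.isEmpty();
    }
}
```

In this model, an entry added at initialisation keeps `isServiceUp()` false until something calls `clear` for that key. Whatever replaces the driver.isRunning polling (an http feed, for instance) would have to take over that responsibility.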
We also need to better define the best practices for checking serviceUp in a
pure-YAML entity. We need better examples (and a simpler way) to hook up sensor
feeds, such as http feeds, for polling an entity's health.
There are a few areas of related code:
* The attribute `service.notUp.indicators` allows multiple ways of determining
if an entity is healthy. If any of these indicators has put an entry into the
`service.notUp.indicators` map, then the entity is marked as serviceUp=false.
* The attribute `service.notUp.diagnostics` is populated with additional info
when an entity fails. See
`SoftwareProcessImpl.ServiceNotUpDiagnosticsCollector`, which is executed when
serviceUp is set to false or when serviceState changes. The defaults are to
check if the machine is ssh'able, and check driver.isRunning.
* The `HttpRequestSensor` is usable in YAML, via an `EntityInitializer`, to add
an HTTP-based sensor feed.
* Config key `SoftwareProcess.RETRIEVE_USAGE_METRICS` disables (some) polling
for usage/performance metrics, but health checks such as service-up are still
polled.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)