Victor Xu created AMBARI-18624:
----------------------------------

             Summary: Deeper Alerting for Kafka
                 Key: AMBARI-18624
                 URL: https://issues.apache.org/jira/browse/AMBARI-18624
             Project: Ambari
          Issue Type: Improvement
          Components: alerts
    Affects Versions: 2.2.2
            Reporter: Victor Xu

Kafka brokers can become unhealthy while still being 'available': the process is up and the service is running, but the broker is unusable. Current alerting only checks whether the process is up and running. We want alerting that tests for situations in which the component is up but non-functional.

Example: Sometimes the brokers are not working but Ambari still shows them as green. I have seen this in at least one case, where the broker logs showed an OutOfMemory error. This could easily be reproduced by lowering the broker heap size and creating many topics to exhaust the heap. Ambari will still report Kafka as healthy.

We need the Kafka team to identify a script that Ambari can run to probe the Kafka process and provide this deeper health alerting.
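
To make the request concrete, here is a rough sketch of how such an alert could separate "process is up" from "broker is usable". This is only an illustration, not an existing Ambari or Kafka script: the names port_is_open and check_broker, the result labels, and the warn_seconds threshold are all hypothetical. The functional check is deliberately left as a placeholder callable, since the real probe (for example, a produce/consume round trip) is what this issue asks the Kafka team to identify.

```python
import socket
import time

# Hypothetical result labels; a real Ambari alert may use different ones.
OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def port_is_open(host, port, timeout=2.0):
    """Liveness only: True if a TCP connection to the broker port succeeds.

    This is roughly what a "process is up" check can tell us, and it can
    still pass when the broker is out of memory and unusable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_broker(liveness_check, functional_check, warn_seconds=5.0):
    """Combine a liveness check with a functional check.

    liveness_check:   callable returning True if the broker is reachable.
    functional_check: callable returning True if the broker did real work
                      (placeholder for the probe the Kafka team would supply).
    Returns a (state, message) tuple.
    """
    if not liveness_check():
        return CRITICAL, "broker is not reachable"

    start = time.monotonic()
    try:
        worked = functional_check()
    except Exception as exc:
        # Up but broken: the case current alerting reports as green.
        return CRITICAL, "broker is up but functional probe failed: %s" % exc
    elapsed = time.monotonic() - start

    if not worked:
        return CRITICAL, "broker is up but functional probe reported failure"
    if elapsed > warn_seconds:
        return WARNING, "functional probe was slow: %.1fs" % elapsed
    return OK, "functional probe succeeded in %.1fs" % elapsed
```

With this shape, a broker whose port answers but whose functional probe raises or returns False would be reported as CRITICAL instead of green, which is the behavior requested above.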