[ https://issues.apache.org/jira/browse/RANGER-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712465#comment-17712465 ]
Vikas Kumar commented on RANGER-4047:
-------------------------------------

Updated the patch. Requesting reviewers to please review and share feedback.

> Ranger KMS health metrics
> --------------------------
>
>                 Key: RANGER-4047
>                 URL: https://issues.apache.org/jira/browse/RANGER-4047
>             Project: Ranger
>          Issue Type: New Feature
>          Components: kms
>            Reporter: Vikas Kumar
>            Assignee: Vikas Kumar
>            Priority: Major
>
> Ranger KMS should collect the important system-level and application-level
> health metrics.
> System metrics: JVM/CPU/memory related metrics.
> Application metrics: KMS API execution metrics, for example the number of
> times the DECRYPT operation is invoked and the time taken to complete the
> request.
> There should also be an API to consume these stats, preferably a REST API.
> Any metrics tool should be able to get these metrics through that REST API.
> This will help make KMS highly observable. An alerting system can be
> configured to consume and process the metrics and generate alerts.
> There could be many other use cases.
>
> *Approach:*
> The solution depends on the "ranger-metrics" common module for both the JSON
> and Prometheus sinks.
> KMS adds only KMS-specific application metrics: generally, one COUNT metric
> and a corresponding elapsed-time gauge metric for each REST endpoint.
> By default, metric collection is not thread-safe, but it can be made
> thread-safe by adding the following property to kms-site.xml:
> Property: hadoop.kms.metric.collection.threadsafe=true (possible values: true/false)
>
> =========== Sample request and response listing the collected metrics ===========
> curl -ivk -H "Content-Type: application/json" -X GET "http://localhost:9292/kms/metrics/json?user.name=vikas"
>
> Sample response:
> {
>   "KMS": {
>     "GET_CURRENT_KEY_COUNT": 0,
>     "DELETE_KEY_ELAPSED_TIME": 0,
>     "EEK_DECRYPT_ELAPSED_TIME": 0,
>     "GET_KEYS_METADATA_ELAPSED_TIME": 0,
>     "EEK_GENERATE_ELAPSED_TIME": 0,
>     "GET_CURRENT_KEY_ELAPSED_TIME": 0,
>     "EEK_REENCRYPT_ELAPSED_TIME": 0,
>     "KEY_CREATE_COUNT": 1,
>     "UNAUTHORIZED_CALLS_COUNT": 0,
>     "KEY_CREATE_ELAPSED_TIME": 81,
>     "GET_KEY_VERSION_COUNT": 0,
>     "ROLL_NEW_VERSION_ELAPSED_TIME": 0,
>     "REENCRYPT_EEK_BATCH_COUNT": 0,
>     "REENCRYPT_EEK_BATCH_ELAPSED_TIME": 0,
>     "GET_KEYS_METADATA_COUNT": 0,
>     "GET_KEY_VERSIONS_COUNT": 0,
>     "GET_KEY_VERSIONS_ELAPSED_TIME": 0,
>     "GET_KEYS_COUNT": 2,
>     "EEK_GENERATE_COUNT": 0,
>     "INVALIDATE_CACHE_COUNT": 0,
>     "GET_METADATA_COUNT": 3,
>     "REENCRYPT_EEK_BATCH_KEYS_COUNT": 0,
>     "EEK_REENCRYPT_COUNT": 0,
>     "UNAUTHENTICATED_CALLS_COUNT": 0,
>     "GET_KEY_VERSION_ELAPSED_TIME": 0,
>     "INVALIDATE_CACHE_ELAPSED_TIME": 0,
>     "ROLL_NEW_VERSION_COUNT": 0,
>     "EEK_DECRYPT_COUNT": 0,
>     "GET_KEYS_METADATA_KEYNAMES_COUNT": 0,
>     "DELETE_KEY_COUNT": 0,
>     "GET_KEYS_ELAPSED_TIME": 72,
>     "GET_METADATA_ELAPSED_TIME": 14,
>     "TOTAL_CALL_COUNT": 7
>   },
>   "RangerJvm": {
>     "GcTimeTotal": 339,
>     "SystemLoadAvg": 1.47,
>     "ThreadsBusy": 5,
>     "GcCountTotal": 9,
>     "MemoryMax": 1005584384,
>     "MemoryCurrent": 221646760,
>     "ThreadsWaiting": 20,
>     "ProcessorsAvailable": 2,
>     "GcTimeMax": 339,
>     "ThreadsBlocked": 0,
>     "ThreadsRemaining": 9
>   }
> }
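
As an illustration of how a monitoring script might consume the JSON response quoted above, here is a minimal Python sketch. It is not part of the patch. The helper names `fetch_metrics` and `summarize_kms` are hypothetical. The sketch assumes that each `<OP>_ELAPSED_TIME` value is cumulative across all calls counted in `<OP>_COUNT`. The description does not state this, nor the unit, so the computed mean should be read with that caveat.

```python
import json
import urllib.request

COUNT_SUFFIX = "_COUNT"
ELAPSED_SUFFIX = "_ELAPSED_TIME"

# Subset of the sample response quoted in the issue description.
SAMPLE = {
    "KMS": {
        "KEY_CREATE_COUNT": 1,
        "KEY_CREATE_ELAPSED_TIME": 81,
        "GET_KEYS_COUNT": 2,
        "GET_KEYS_ELAPSED_TIME": 72,
        "GET_METADATA_COUNT": 3,
        "GET_METADATA_ELAPSED_TIME": 14,
        "EEK_DECRYPT_COUNT": 0,
        "EEK_DECRYPT_ELAPSED_TIME": 0,
        "UNAUTHORIZED_CALLS_COUNT": 0,
        "TOTAL_CALL_COUNT": 7,
    }
}


def fetch_metrics(url):
    """Fetch the metrics JSON from a caller-supplied KMS metrics URL."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize_kms(metrics):
    """Pair each <OP>_COUNT with <OP>_ELAPSED_TIME in the "KMS" section.

    Returns {op: (count, elapsed, mean)} for operations invoked at least
    once. Counters without a matching elapsed-time metric are skipped.
    The mean is meaningful only if elapsed time is cumulative (assumption).
    """
    kms = metrics.get("KMS", {})
    summary = {}
    for name, count in kms.items():
        if not name.endswith(COUNT_SUFFIX):
            continue
        op = name[: -len(COUNT_SUFFIX)]
        elapsed_key = op + ELAPSED_SUFFIX
        if elapsed_key not in kms or count == 0:
            continue
        elapsed = kms[elapsed_key]
        summary[op] = (count, elapsed, elapsed / count)
    return summary


if __name__ == "__main__":
    for op, (count, elapsed, mean) in sorted(summarize_kms(SAMPLE).items()):
        print(f"{op}: count={count} elapsed={elapsed} mean={mean:.1f}")
```

Against a running instance, the result of `fetch_metrics("http://localhost:9292/kms/metrics/json?user.name=vikas")` could be passed to `summarize_kms` in place of `SAMPLE`.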