[ 
https://issues.apache.org/jira/browse/RANGER-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712465#comment-17712465
 ] 

Vikas Kumar commented on RANGER-4047:
-------------------------------------

I have updated the patch. Reviewers, please take a look and share your feedback.

> Ranger KMS health metrics 
> --------------------------
>
>                 Key: RANGER-4047
>                 URL: https://issues.apache.org/jira/browse/RANGER-4047
>             Project: Ranger
>          Issue Type: New Feature
>          Components: kms
>            Reporter: Vikas Kumar
>            Assignee: Vikas Kumar
>            Priority: Major
>
> Ranger KMS should collect important system-level as well as application-level 
> health metrics.
> System metrics: JVM, CPU, and memory metrics.
> Application metrics: KMS API execution metrics, such as the number of times the 
> DECRYPT operation was invoked and the time taken to complete the request.
> There should also be an API to consume these stats, preferably a REST API, so 
> that any metrics tool can retrieve them.
> This will help make KMS highly observable. An alerting system can be configured 
> to consume and process the metrics and generate alerts.
> There could be many other use cases.
> *Approach:*
> The solution depends on the common "ranger-metrics" module for both the JSON 
> and Prometheus sinks.
> KMS adds only KMS-specific application metrics: generally, one COUNT metric 
> and a corresponding elapsed-time gauge metric for each REST endpoint.
> By default, metric collection is not thread-safe, but it can be made 
> thread-safe by adding the following property to kms-site.xml:
> Property: hadoop.kms.metric.collection.threadsafe=true  (possible values: 
> true/false)
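
To illustrate the count/elapsed-time pairing and the thread-safety switch described above, here is a minimal Python sketch. It is not the Ranger implementation: the class MetricsCollector and its methods are hypothetical, the threadsafe argument only mimics the effect of hadoop.kms.metric.collection.threadsafe, and accumulating elapsed time across calls is an assumption (the description does not say whether the gauge is cumulative).

```python
import contextlib
import threading
import time


class MetricsCollector:
    """Hypothetical collector: one COUNT and one ELAPSED_TIME value per operation."""

    def __init__(self, threadsafe=False):
        self._values = {}
        # Use a real lock only when thread safety is requested;
        # otherwise a no-op context manager keeps the update path lock-free.
        self._guard = threading.Lock() if threadsafe else contextlib.nullcontext()

    def record(self, op, elapsed_ms):
        """Increment <op>_COUNT and add elapsed_ms to <op>_ELAPSED_TIME (assumed cumulative)."""
        with self._guard:
            count_key = op + "_COUNT"
            time_key = op + "_ELAPSED_TIME"
            self._values[count_key] = self._values.get(count_key, 0) + 1
            self._values[time_key] = self._values.get(time_key, 0) + elapsed_ms

    @contextlib.contextmanager
    def timed(self, op):
        """Time the wrapped block and record it under the given operation name."""
        start = time.monotonic()
        try:
            yield
        finally:
            self.record(op, int((time.monotonic() - start) * 1000))

    def snapshot(self):
        """Return a copy of the current metric values."""
        with self._guard:
            return dict(self._values)
```

A request handler would wrap its work in "with collector.timed('KEY_CREATE'): ...". The unguarded mode avoids lock overhead on every request at the cost of possibly lost updates under concurrency, which is presumably why thread safety is opt-in.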
>  
> =========== Sample request and response listing the collected metrics ===========
> curl -ivk -H "Content-Type: application/json" -X GET \
>   "http://localhost:9292/kms/metrics/json?user.name=vikas"
> Sample response:
> {
>   "KMS": {
>     "GET_CURRENT_KEY_COUNT": 0,
>     "DELETE_KEY_ELAPSED_TIME": 0,
>     "EEK_DECRYPT_ELAPSED_TIME": 0,
>     "GET_KEYS_METADATA_ELAPSED_TIME": 0,
>     "EEK_GENERATE_ELAPSED_TIME": 0,
>     "GET_CURRENT_KEY_ELAPSED_TIME": 0,
>     "EEK_REENCRYPT_ELAPSED_TIME": 0,
>     "KEY_CREATE_COUNT": 1,
>     "UNAUTHORIZED_CALLS_COUNT": 0,
>     "KEY_CREATE_ELAPSED_TIME": 81,
>     "GET_KEY_VERSION_COUNT": 0,
>     "ROLL_NEW_VERSION_ELAPSED_TIME": 0,
>     "REENCRYPT_EEK_BATCH_COUNT": 0,
>     "REENCRYPT_EEK_BATCH_ELAPSED_TIME": 0,
>     "GET_KEYS_METADATA_COUNT": 0,
>     "GET_KEY_VERSIONS_COUNT": 0,
>     "GET_KEY_VERSIONS_ELAPSED_TIME": 0,
>     "GET_KEYS_COUNT": 2,
>     "EEK_GENERATE_COUNT": 0,
>     "INVALIDATE_CACHE_COUNT": 0,
>     "GET_METADATA_COUNT": 3,
>     "REENCRYPT_EEK_BATCH_KEYS_COUNT": 0,
>     "EEK_REENCRYPT_COUNT": 0,
>     "UNAUTHENTICATED_CALLS_COUNT": 0,
>     "GET_KEY_VERSION_ELAPSED_TIME": 0,
>     "INVALIDATE_CACHE_ELAPSED_TIME": 0,
>     "ROLL_NEW_VERSION_COUNT": 0,
>     "EEK_DECRYPT_COUNT": 0,
>     "GET_KEYS_METADATA_KEYNAMES_COUNT": 0,
>     "DELETE_KEY_COUNT": 0,
>     "GET_KEYS_ELAPSED_TIME": 72,
>     "GET_METADATA_ELAPSED_TIME": 14,
>     "TOTAL_CALL_COUNT": 7
>   },
>   "RangerJvm": {
>     "GcTimeTotal": 339,
>     "SystemLoadAvg": 1.47,
>     "ThreadsBusy": 5,
>     "GcCountTotal": 9,
>     "MemoryMax": 1005584384,
>     "MemoryCurrent": 221646760,
>     "ThreadsWaiting": 20,
>     "ProcessorsAvailable": 2,
>     "GcTimeMax": 339,
>     "ThreadsBlocked": 0,
>     "ThreadsRemaining": 9
>   }
> }
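
As a rough illustration of how a monitoring tool might consume this endpoint, the Python sketch below fetches the JSON and pairs each <OP>_COUNT with its <OP>_ELAPSED_TIME. The function names fetch_metrics and invoked_operations are hypothetical, and the pairing rule is inferred only from the key names in the sample response above. Counters without a matching elapsed-time key (for example TOTAL_CALL_COUNT) are skipped.

```python
import json
import urllib.request


def fetch_metrics(url, timeout=5):
    """Fetch and decode the metrics JSON from the given URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def invoked_operations(metrics):
    """Return {operation: (count, elapsed_time)} for operations invoked at least once.

    Pairs keys by naming convention: FOO_COUNT with FOO_ELAPSED_TIME.
    """
    kms = metrics.get("KMS", {})
    result = {}
    for name, count in kms.items():
        if not name.endswith("_COUNT"):
            continue
        op = name[: -len("_COUNT")]
        elapsed_key = op + "_ELAPSED_TIME"
        if elapsed_key in kms and count > 0:
            result[op] = (count, kms[elapsed_key])
    return result


if __name__ == "__main__":
    # URL taken from the sample request above; adjust host, port and user as needed.
    data = fetch_metrics("http://localhost:9292/kms/metrics/json?user.name=vikas")
    for op, (count, elapsed) in sorted(invoked_operations(data).items()):
        print(op, count, elapsed)
```

The unit of the elapsed-time values is not stated in the description, so the sketch reports them as-is rather than deriving averages.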



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
