[ 
https://issues.apache.org/jira/browse/HADOOP-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555834#comment-17555834
 ] 

Viraj Jasani commented on HADOOP-18288:
---------------------------------------

[~aajisaka] are you fine with the change? If so, I can create a backport PR 
for branch-3.3; otherwise [~tomscut] can help revert it on trunk.

My goal was to make this info available on the Namenode UI so that it is 
straightforward to see which Datanodes are busier than usual without having to 
explore more metrics. The same is the case with HBase: when a user is alerted 
of higher than usual traffic, the HMaster UI itself is sufficient to see which 
Regionservers are busier than usual and to take action if required (e.g. run 
the balancer or move regions) before we even have to look at detailed metrics 
(derived from expressions on Prometheus or an in-house metric system). But 
yes, when more specific details need attention (such as higher CPU usage or 
network errors), we need to look at detailed metrics anyway. This Jira is 
about exposing the overall busyness of servers so that it can be shown on the 
UI.

Moreover, we don't have these details on dev clusters either (e.g. 
pseudo-distributed mode or a dockerized cluster), as the majority of devs 
would not have Prometheus or any other metric system deployed locally. Hence, 
from that viewpoint too, total rps is a basic way to get a quick read on how 
busy our servers get under the traffic we initiate on dev clusters.
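
To illustrate the two values this Jira talks about (the accumulated request 
count and the current requests-per-second snapshot), here is a minimal sketch 
in Java. It is not the code from the PR: the class name RpcRequestStats and 
its methods are hypothetical, and it assumes that a single periodic sampler 
calls sample() with a monotonic timestamp in nanoseconds.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch only; not the actual Hadoop RPC server code. */
public class RpcRequestStats {
  // Accumulated count of all requests served since startup.
  private final LongAdder totalRequests = new LongAdder();
  // State of the previous sample, used to compute the rate.
  private long lastSampleCount = 0;
  private long lastSampleNanos;
  // Most recent requests-per-second snapshot.
  private volatile long requestsPerSecond = 0;

  public RpcRequestStats(long nowNanos) {
    this.lastSampleNanos = nowNanos;
  }

  /** Called once for every request the server handles. */
  public void incrRequest() {
    totalRequests.increment();
  }

  public long getTotalRequests() {
    return totalRequests.sum();
  }

  public long getRequestsPerSecond() {
    return requestsPerSecond;
  }

  /** Assumed to be called periodically by one sampler thread. */
  public synchronized void sample(long nowNanos) {
    long elapsedNanos = nowNanos - lastSampleNanos;
    if (elapsedNanos <= 0) {
      return; // no time has passed, keep the previous snapshot
    }
    long count = totalRequests.sum();
    // Requests seen since the last sample, scaled to one second.
    requestsPerSecond =
        (count - lastSampleCount) * 1_000_000_000L / elapsedNanos;
    lastSampleCount = count;
    lastSampleNanos = nowNanos;
  }
}
```

With this shape, a UI only needs to read the two getters; the sampling 
interval decides how "current" the rps snapshot is.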

> Total requests and total requests per sec served by RPC servers
> ---------------------------------------------------------------
>
>                 Key: HADOOP-18288
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18288
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> RPC servers provide a bunch of useful information such as the number of 
> open connections, slow requests, the number of in-progress handlers, RPC 
> processing time, queue time, etc. However, so far they don't provide the 
> accumulated count of all requests or a current snapshot of requests per 
> second served by the server. Exposing these would help, from an operational 
> viewpoint, in identifying how busy the servers have been and how much load 
> they are currently serving in the presence of cluster-wide high load.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

