[ https://issues.apache.org/jira/browse/HADOOP-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555834#comment-17555834 ]
Viraj Jasani commented on HADOOP-18288:
---------------------------------------

[~aajisaka] are you fine with the change? If so, I can create a backport PR for branch-3.3; otherwise [~tomscut] can help revert it on trunk.

My goal was to make this info available on the Namenode UI so that it is straightforward to see which Datanodes are busier than usual without having to explore more metrics. The same is the case with HBase: when a user is alerted of higher than usual traffic, the HMaster UI itself is sufficient to know which Regionservers are busier than usual and to take action if required (e.g. run the balancer or move regions) before we even have to look at detailed metrics (derived from expressions on Prometheus or an in-house metric system). But yes, when more specific details need attention (such as high CPU usage or network errors), we need to look at detailed metrics anyway. This Jira is about exposing the overall busyness of servers so that it can be shown on the UI.

Moreover, we don't have these details on dev clusters either (e.g. pseudo-distributed mode or a dockerized cluster), as the majority of devs would not have Prometheus or any other metric system deployed locally. Hence, from that viewpoint too, total RPS is a basic way to get a quick sense of how busy our servers get with the traffic we initiate on dev clusters.

> Total requests and total requests per sec served by RPC servers
> ---------------------------------------------------------------
>
>                 Key: HADOOP-18288
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18288
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> RPC servers provide a bunch of useful information, such as the number of open
> connections, slow requests, the number of in-progress handlers, RPC processing
> time, queue time, etc. However, so far they do not provide the accumulated
> total of all requests or a current snapshot of requests per second served by
> the server. Exposing these would help from an operational viewpoint in
> identifying how busy the servers have been and how much load they are
> currently serving in the presence of cluster-wide high load.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)