Ability to measure the throughput of individual components at node level
------------------------------------------------------------------------

Key: HADOOP-6535
URL: https://issues.apache.org/jira/browse/HADOOP-6535
Project: Hadoop Common
Issue Type: Improvement
Components: metrics
Affects Versions: 0.20.1
Reporter: Rajesh Balamohan

We currently rely on JVMMetrics, DFSMetrics, IPCMetrics, and MapRedMetrics to quickly check performance. It would be helpful to have metrics that measure individual components. Some candidates are listed below.

1. Throughput of MapOutputServlet in a tasktracker. Current statistics report on server busy, successful output, etc., but the data throughput of this servlet is hard to determine from them. Additional metrics such as the following would help in determining the number of parallel copies in the shuffle phase:
   - Total amount of bytes served
   - Total amount of time spent in reading & streaming
   - Data throughput of MapOutputServlet in MB/second
   - Number of concurrent requests now, peak requests

2. Another metric could be on DFSClient's DataStreamer. The amount of data streamed per second by a specific node would be helpful.
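
To make the MapOutputServlet request concrete, here is a minimal sketch of the counters it describes. The class `ThroughputMetrics` and its method names are hypothetical and are not part of Hadoop or its metrics framework. It assumes the servlet would call `requestStarted()` when it begins serving a map output and `requestFinished(bytes, nanos)` when it is done.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper for illustration only; not an existing Hadoop class. */
class ThroughputMetrics {
    private final AtomicLong totalBytes = new AtomicLong();   // total bytes served
    private final AtomicLong totalNanos = new AtomicLong();   // total time spent reading and streaming
    private final AtomicInteger concurrent = new AtomicInteger(); // requests in flight now
    private final AtomicInteger peak = new AtomicInteger();       // highest in-flight count seen

    /** Assumed to be called when a request starts being served. */
    void requestStarted() {
        int now = concurrent.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
    }

    /** Assumed to be called when a request completes. */
    void requestFinished(long bytes, long nanos) {
        totalBytes.addAndGet(bytes);
        totalNanos.addAndGet(nanos);
        concurrent.decrementAndGet();
    }

    long getTotalBytes() { return totalBytes.get(); }
    long getTotalNanos() { return totalNanos.get(); }
    int getConcurrentRequests() { return concurrent.get(); }
    int getPeakRequests() { return peak.get(); }

    /**
     * Bytes served divided by the summed per-request time, in MB/second.
     * Because time is summed across requests, this is an average
     * per-request rate, not the aggregate wall-clock rate of the node.
     */
    double getMegabytesPerSecond() {
        long nanos = totalNanos.get();
        if (nanos == 0) {
            return 0.0;
        }
        double megabytes = totalBytes.get() / (1024.0 * 1024.0);
        return megabytes / (nanos / 1e9);
    }

    public static void main(String[] args) {
        ThroughputMetrics m = new ThroughputMetrics();
        m.requestStarted();
        m.requestStarted();
        m.requestFinished(8L * 1024 * 1024, 2_000_000_000L);
        m.requestFinished(4L * 1024 * 1024, 1_000_000_000L);
        System.out.println("MB/s: " + m.getMegabytesPerSecond());
        System.out.println("peak: " + m.getPeakRequests());
    }
}
```

The same counters could be attached to DFSClient's DataStreamer for the second item. How these values would be published through the existing metrics contexts is left open here.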