Ability to measure the throughput of individual components at node level
------------------------------------------------------------------------

                 Key: HADOOP-6535
                 URL: https://issues.apache.org/jira/browse/HADOOP-6535
             Project: Hadoop Common
          Issue Type: Improvement
          Components: metrics
    Affects Versions: 0.20.1
            Reporter: Rajesh Balamohan


We currently rely on JVMMetrics, DFSMetrics, IPCMetrics, and MapRedMetrics to
quickly check performance. It would be helpful to have metrics that measure
the throughput of individual components. Some candidates are listed below.

1. Throughput of MapOutputServlet in a TaskTracker. Currently there are
statistics that report server busy, successful outputs, etc. However,
understanding the data throughput of this servlet is a little hard. Some
additional metrics like the following would help in tuning the number of
parallel copies in the shuffle phase:

- Total number of bytes served
- Total time spent reading and streaming
- Data throughput of MapOutputServlet in MB/second
- Number of concurrent requests (current and peak)
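
As a rough illustration of how the four counters above could be kept, here is
a minimal sketch. The class ServletThroughputStats and all of its method names
are hypothetical and are not part of Hadoop's metrics API; the sketch only
assumes the servlet can call a hook when a request starts and another when it
finishes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not an existing Hadoop class. */
class ServletThroughputStats {
    private final AtomicLong bytesServed = new AtomicLong();
    private final AtomicLong nanosSpent = new AtomicLong();
    private final AtomicInteger activeRequests = new AtomicInteger();
    private final AtomicInteger peakRequests = new AtomicInteger();

    /** Call when a request begins; updates the current and peak counts. */
    void requestStarted() {
        int now = activeRequests.incrementAndGet();
        peakRequests.accumulateAndGet(now, Math::max);
    }

    /** Call when a request ends, with the bytes sent and time spent on it. */
    void requestFinished(long bytes, long elapsedNanos) {
        bytesServed.addAndGet(bytes);
        nanosSpent.addAndGet(elapsedNanos);
        activeRequests.decrementAndGet();
    }

    long totalBytes() { return bytesServed.get(); }
    long totalNanos() { return nanosSpent.get(); }
    int active() { return activeRequests.get(); }
    int peak() { return peakRequests.get(); }

    /** Bytes served divided by summed per-request time, in MB/second. */
    double megabytesPerSecond() {
        long nanos = nanosSpent.get();
        if (nanos == 0) {
            return 0.0; // nothing measured yet
        }
        double megabytes = bytesServed.get() / (1024.0 * 1024.0);
        return megabytes / (nanos / 1e9);
    }
}
```

Note that this figure divides by the time summed over all requests, so with
concurrent requests it approximates the average per-request streaming rate
rather than the aggregate rate of the node over wall-clock time.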

2. Another metric could be on DFSClient's DataStreamer. The amount of data 
streamed per second by a specific node would be helpful.
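
One possible way to obtain a per-second figure for the DataStreamer is to count
bytes as they are sent and sample the counter periodically. The sketch below
assumes that approach; StreamRateMeter and its methods are hypothetical names,
and the timestamp is passed in explicitly so the caller decides which clock to
use (for example System.nanoTime()).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not an existing Hadoop class. */
class StreamRateMeter {
    private final AtomicLong bytesSinceSample = new AtomicLong();
    private long lastSampleNanos;

    StreamRateMeter(long startNanos) {
        this.lastSampleNanos = startNanos;
    }

    /** Call from the streaming thread after each packet is sent. */
    void recordBytes(long n) {
        bytesSinceSample.addAndGet(n);
    }

    /**
     * Returns bytes/second since the previous sample and resets the window.
     * Returns 0.0 without resetting if no time has elapsed.
     */
    synchronized double sampleBytesPerSecond(long nowNanos) {
        long elapsed = nowNanos - lastSampleNanos;
        if (elapsed <= 0) {
            return 0.0;
        }
        long bytes = bytesSinceSample.getAndSet(0);
        lastSampleNanos = nowNanos;
        return bytes / (elapsed / 1e9);
    }
}
```

A metrics updater thread could call sampleBytesPerSecond at each reporting
interval and publish the result per node.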


