[ 
http://issues.apache.org/jira/browse/HADOOP-237?page=comments#action_12420694 ] 

David Bowen commented on HADOOP-237:
------------------------------------


This code does not use the metrics API as intended: it calls the update 
method after every metric modification.  The API is record-oriented, so each 
update call copies the whole record to the client library.  

I don't think that this will cause significant, observable problems with the 
metric data, but it could be a significant performance issue.

The preferred model would be to replace per-metric methods like
        void mapInput(long numBytes) 
        void mapOutput(long numBytes) 

with something like
        void mapIO(long numBytesInput, long numBytesOutput)

and have it call the update method only once.
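
To make the difference concrete, here is a minimal sketch of the two styles.
It does not use the real Hadoop metrics classes: SketchRecord is a
hypothetical stand-in that only counts update calls, and the metric names
"map_input_bytes" and "map_output_bytes" are assumptions made for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class MapIoMetricsSketch {

    /** Hypothetical stand-in for a record-oriented metrics record. */
    public static class SketchRecord {
        private final Map<String, Long> pending = new HashMap<>();
        private final Map<String, Long> published = new HashMap<>();
        public int updateCalls = 0;

        /** Buffers an increment locally; nothing is published yet. */
        public void incrMetric(String name, long delta) {
            pending.merge(name, delta, Long::sum);
        }

        /** Publishes the buffered increments in one step (the costly call). */
        public void update() {
            updateCalls++;
            pending.forEach((k, v) -> published.merge(k, v, Long::sum));
            pending.clear();
        }

        public long publishedValue(String name) {
            return published.getOrDefault(name, 0L);
        }
    }

    // Per-metric style: every modification triggers its own update.
    public static void mapInput(SketchRecord r, long numBytes) {
        r.incrMetric("map_input_bytes", numBytes);
        r.update();
    }

    public static void mapOutput(SketchRecord r, long numBytes) {
        r.incrMetric("map_output_bytes", numBytes);
        r.update();
    }

    // Record-oriented style: modify both metrics, then update once.
    public static void mapIO(SketchRecord r, long numBytesInput,
                             long numBytesOutput) {
        r.incrMetric("map_input_bytes", numBytesInput);
        r.incrMetric("map_output_bytes", numBytesOutput);
        r.update();
    }

    public static void main(String[] args) {
        SketchRecord perMetric = new SketchRecord();
        mapInput(perMetric, 100L);
        mapOutput(perMetric, 40L);

        SketchRecord batched = new SketchRecord();
        mapIO(batched, 100L, 40L);

        System.out.println("per-metric update calls: " + perMetric.updateCalls);
        System.out.println("batched update calls:    " + batched.updateCalls);
    }
}
```

Both styles publish the same values; the batched form just halves the number
of update calls in this two-metric case, and the saving grows with the number
of metrics in the record.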


      

> Standard set of Performance Metrics for Hadoop
> ----------------------------------------------
>
>          Key: HADOOP-237
>          URL: http://issues.apache.org/jira/browse/HADOOP-237
>      Project: Hadoop
>         Type: Improvement

>   Components: metrics
>     Versions: 0.3.0
>  Environment: All
>     Reporter: Milind Bhandarkar
>     Assignee: Milind Bhandarkar
>  Attachments: hadoop-metrics.patch
>
> I am starting to use Hadoop's shiny new Metrics API to publish performance 
> (and other) Metrics of running jobs and other daemons.
> Which performance metrics are people interested in seeing? If possible, 
> please group them by module, such as map-reduce, dfs, 
> general-cluster-related, etc. I will follow this process:
> 1. collect this list
> 2. assess feasibility of obtaining each metric
> 3. assign context/record/metrics names
> 4. seek approval for names
> 5. instrument the code.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
