[ http://issues.apache.org/jira/browse/HADOOP-492?page=comments#action_12431909 ]

Doug Cutting commented on HADOOP-492:
-------------------------------------

For the third time: why can't we use the Metrics API here? This is precisely the sort of thing it was designed for, no?

Milind says, "there is no way to programmatically add new metrics to the TaskMetrics class". Okay, so we shouldn't use TaskMetrics for this. But shouldn't we use a MetricsRecord?

If we use MetricsRecord to collect metrics, then we need to decide how to aggregate them. We could use Ganglia, or we could aggregate over heartbeats, having the JobTracker and TaskTracker implement a MetricsContext.

> Global counters
> ---------------
>
>                 Key: HADOOP-492
>                 URL: http://issues.apache.org/jira/browse/HADOOP-492
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: arkady borkovsky
>
> It would be nice to have a map/reduce job keep aggregated counts for
> arbitrary events occurring in its tasks -- the number of records processed, the
> number of exceptions of a specific type, the number of sentences in passive
> voice, whatever the job finds useful.
> This can be implemented by tasks periodically sending <name, value> pairs to
> the jobtracker (in some implementations such messages are piggy-backed on the
> heartbeats), so that the job tracker stores the latest values from each
> task and aggregates them on request. It should also make the aggregated
> values available at the job end. The value for a task would be flushed when
> the task fails.
> #491 and #490 may be related to this one.
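
To make the scheme in the issue description concrete, here is a minimal Java sketch of the jobtracker-side bookkeeping: keep the latest <name, value> pairs per task, sum them on request, and drop a task's values when it fails. Everything in it is hypothetical. CounterAggregator, report, taskFailed and aggregate are names made up for illustration; none of this is the Metrics API or existing Hadoop code, and reading "flushed" as "discarded" is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only; not part of Hadoop or its Metrics API.
class CounterAggregator {
    // task id -> (counter name -> latest value reported by that task)
    private final Map<String, Map<String, Long>> latestByTask = new HashMap<>();

    // Called when a <name, value> pair arrives from a task, e.g. with a heartbeat.
    // A later report replaces the task's earlier value for the same name.
    synchronized void report(String taskId, String name, long value) {
        latestByTask.computeIfAbsent(taskId, id -> new HashMap<>()).put(name, value);
    }

    // Assumption: "flushed" means a failed task's values are discarded.
    synchronized void taskFailed(String taskId) {
        latestByTask.remove(taskId);
    }

    // Sum the latest values across all known tasks, computed on request.
    synchronized Map<String, Long> aggregate() {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> counters : latestByTask.values()) {
            for (Map.Entry<String, Long> e : counters.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        CounterAggregator agg = new CounterAggregator();
        agg.report("task_1", "records", 10L);
        agg.report("task_1", "records", 15L); // replaces the earlier 10
        agg.report("task_2", "records", 7L);
        agg.report("task_2", "exceptions", 1L);
        System.out.println(agg.aggregate()); // {exceptions=1, records=22}
        agg.taskFailed("task_2");
        System.out.println(agg.aggregate()); // {records=15}
    }
}
```

Because only the latest value per task is stored, a task can resend the same running count on every heartbeat without it being counted twice.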