[ 
https://issues.apache.org/jira/browse/HADOOP-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Bowen updated HADOOP-1041:
--------------------------------

    Attachment: 1041.patch


This patch implements the grouped display of counters.  The individual counters 
are in the order that they were declared in the originating Enum class, so the 
developer has full control over the ordering.

Optionally, a ResourceBundle may be provided.  (I had coded this before people 
started to vote that it wasn't necessary.)  The resource bundle goes in the 
same package directory as the class containing the enum, and must have the name 
<class name>_<enum name>.properties.  E.g. for the MapReduce counters, I 
created an enum called Counter in Task.java, so the bundle is 
Task_Counter.properties.  See that file for how to customize both the group 
name and the counter names.
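
As a rough illustration of the naming rule above (not the code in the patch), the 
bundle name can be derived from the enum's binary class name by replacing the 
'$' that separates the outer class from the nested enum with '_'.  The classes 
Task and CounterNames below are hypothetical stand-ins, and the ".name" key 
suffix is an assumption; Task_Counter.properties shows the real key layout.

```java
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical stand-in for a class that declares a counter enum.
class Task {
    enum Counter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS }
}

class CounterNames {
    // The binary name of a nested enum is "pkg.Task$Counter"; replacing '$'
    // with '_' gives the bundle base name "pkg.Task_Counter".
    static String bundleName(Class<?> enumClass) {
        return enumClass.getName().replace('$', '_');
    }

    // Looks up a display name, falling back to the enum constant's own name
    // when no bundle (or no entry) exists, since the bundle is optional.
    // The "<CONSTANT>.name" key format is an assumption for this sketch.
    static String displayName(Enum<?> key) {
        try {
            ResourceBundle bundle =
                ResourceBundle.getBundle(bundleName(key.getDeclaringClass()));
            return bundle.getString(key.name() + ".name");
        } catch (MissingResourceException e) {
            return key.name();
        }
    }
}
```

With no properties file on the classpath, displayName simply returns the enum 
constant's name, which matches the bundle being optional.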

There are also some changes to address HADOOP-1048.  Now the task counters are 
not summed every time a task status gets updated.  Instead, they are summed 
when someone - either a client, a JSP page, or a callback from the metrics 
package - requests them.  I changed JobSubmissionProtocol to allow fetching the 
counters on demand.  I bumped up its version, which I guess I should have also 
done on the previous patch in which the Counters object was included in the 
JobStatus.  (I could have kept it there, but it seemed a bit inconsistent now 
that the counters are computed on demand, since everything else in the 
JobStatus is kept up-to-date.)
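
The on-demand idea can be sketched roughly as follows.  This is only an 
illustration under simplifying assumptions: the class LazyJobCounters, its 
method names, and the use of plain string-keyed maps are all hypothetical and 
are not the JobTracker or JobSubmissionProtocol code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical holder for the latest counters reported by each task of a job.
class LazyJobCounters {
    // task id -> most recent counter values reported by that task
    private final Map<String, Map<String, Long>> latestByTask = new TreeMap<>();

    // Called on every task status update: just remember the latest values.
    // No summing happens here, so frequent updates stay cheap.
    synchronized void updateTaskStatus(String taskId, Map<String, Long> counters) {
        latestByTask.put(taskId, new TreeMap<>(counters));
    }

    // Called only when a client, a JSP page, or a metrics callback asks for
    // the totals: the per-task counters are summed at that point.
    synchronized Map<String, Long> getCounters() {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> taskCounters : latestByTask.values()) {
            for (Map.Entry<String, Long> e : taskCounters.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }
}
```

The cost of summing moves from the update path, which runs once per task 
status report, to the read path, which runs only when someone asks.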

A related JobTracker efficiency change was to stop updating the job metrics 
(via the metrics package) every time a task update is received.  Instead this 
is now only done when the metrics timer-based callback occurs (see 
JobTrackerMetrics.doUpdates()).  This means that this callback needs to get a 
list of the running jobs - I think I implemented that with the correct locking 
in the new method getRunningJobs, but someone might want to double check.

Changes affecting TaskTracker:

The incrementCounter method should now be somewhat more efficient because it 
doesn't (normally) involve any String or Long construction.  The Counters 
object now holds a map of maps, where the enum class name is the index into the 
first.  (Previously it was just a map, so the key had to be constructed by 
string concatenation.)  The serialized form of Counters is a bit more concise, 
since the enum class name is only written once.
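
A minimal sketch of the map-of-maps layout, assuming a mutable cell per counter 
so that an increment does not box a new Long.  The names GroupedCounters, Cell, 
and Demo are hypothetical, and the actual Counters class in the patch may 
differ in its details.

```java
import java.util.HashMap;
import java.util.Map;

class GroupedCounters {
    // Hypothetical example enum; any enum class can serve as a counter group.
    enum Demo { RECORDS, BYTES }

    // Mutable holder so that incrementing does not allocate a new Long.
    static final class Cell { long value; }

    // Outer map: enum class name -> group.  Inner map: enum constant -> cell.
    private final Map<String, Map<Enum<?>, Cell>> groups = new HashMap<>();

    void increment(Enum<?> key, long amount) {
        // Class.getName() returns the class's name directly, so no key string
        // has to be built by concatenation on each call.
        String groupName = key.getDeclaringClass().getName();
        Map<Enum<?>, Cell> group =
            groups.computeIfAbsent(groupName, n -> new HashMap<>());
        group.computeIfAbsent(key, k -> new Cell()).value += amount;
    }

    long get(Enum<?> key) {
        Map<Enum<?>, Cell> group = groups.get(key.getDeclaringClass().getName());
        Cell cell = (group == null) ? null : group.get(key);
        return (cell == null) ? 0L : cell.value;
    }

    int groupCount() {
        return groups.size();
    }
}
```

Because the group name is the outer key, a serializer walking this structure 
can write each enum class name once, followed by that group's counters.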

I didn't change the PROGRESS_INTERVAL (1 second) at which MapTasks report 
their progress to the TaskTracker, because I don't think it is relevant to 
HADOOP-1048, which is a JobTracker problem.

JSP stuff:

There is a change that was requested in HADOOP-1038, namely to show both the 
map and reduce phase counters on the main jobdetails page.

I added a refresh parameter to jobdetails so that a refresh time in seconds can 
be specified (it was refreshing every 60 seconds).  I changed the jobtracker 
page to use the refresh parameter so that running jobs by default get refreshed 
every 10 seconds, but completed and failed jobs don't get refreshed.

The counts are now right-justified and decimal-formatted.
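
For illustration only, formatting of this kind can be done with 
java.text.NumberFormat.  The helper class CounterCell, the choice of Locale.US, 
and the HTML attribute used for alignment are assumptions made for this sketch 
and are not taken from the JSP pages in the patch.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper showing one way to render a counter value.
class CounterCell {
    // Formats the value with grouping separators.  Locale.US is fixed here
    // (an assumption) so that the separator is always a comma.
    static String format(long value) {
        return NumberFormat.getIntegerInstance(Locale.US).format(value);
    }

    // Wraps the formatted value in a right-aligned table cell.
    static String toHtmlCell(long value) {
        return "<td align=\"right\">" + format(value) + "</td>";
    }
}
```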






> Counter names are ugly
> ----------------------
>
>                 Key: HADOOP-1041
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1041
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.12.0
>            Reporter: Owen O'Malley
>         Assigned To: David Bowen
>             Fix For: 0.12.0
>
>         Attachments: 1041.patch
>
>
> Having the complete class name in the counter names makes them unique, but 
> they are ugly to present to non-developers. It would be nice to have some way 
> to have a nicer string presented to the user. Currently, the Enum is 
> converted to a name like:
> key.getDeclaringClass().getName() + "#" + key.toString()
> which gives counter names like 
> "org.apache.hadoop.examples.RandomWriter$Counters#BYTES_WRITTEN"
> which is unique, but not very user friendly. Perhaps, we should strip off the 
> class name for presenting to the users, which would allow them to make nice 
> names. In particular, you could define an enum type that overloaded toString 
> to print a nice user friendly string.
> Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
