[
https://issues.apache.org/jira/browse/HADOOP-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Bowen updated HADOOP-1041:
--------------------------------
Attachment: 1041.patch
This patch implements the grouped display of counters. The individual counters
are in the order that they were declared in the originating Enum class, so the
developer has full control over the ordering.
Optionally, a ResourceBundle may be provided. (I had coded this before people
started to vote that it wasn't necessary.) The resource bundle goes in the
same package directory as the class containing the enum, and must have the name
<class name>_<enum name>.properties. E.g. for the MapReduce counters, I
created an enum called Counter in Task.java, so the bundle is
Task_Counter.properties. See that file for how to customize both the group
name and the counter names.
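
To illustrate the naming rule above, here is a minimal sketch of how a bundle base name could be derived from an enum class and used for lookup. The Task class below is a stand-in, not the real one, and the "<CONSTANT>.name" property key is an assumption made for illustration; Task_Counter.properties in the patch defines the actual keys.

```java
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical stand-in for the class that declares the counter enum.
class Task {
    enum Counter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS }
}

class CounterBundleNames {
    // A nested enum's binary name is "pkg.Outer$Enum"; replacing '$' with '_'
    // gives the "<class name>_<enum name>" base name, resolved in the same package.
    static String bundleBaseName(Class<?> enumClass) {
        return enumClass.getName().replace('$', '_');
    }

    // Looks up a display name, falling back to the enum constant's own name
    // when no bundle (or no matching key) is found.
    static String displayName(Enum<?> key) {
        try {
            ResourceBundle bundle =
                ResourceBundle.getBundle(bundleBaseName(key.getDeclaringClass()));
            // Assumed key convention, e.g. "MAP_INPUT_RECORDS.name".
            return bundle.getString(key.name() + ".name");
        } catch (MissingResourceException e) {
            return key.name();
        }
    }
}
```

Because the lookup falls back to the constant name, the bundle stays optional, as described above.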
There are also some changes to address HADOOP-1048. Now the task counters are
not summed every time a task status gets updated. Instead, they are summed
when someone - either a client, a JSP page, or a callback from the metrics
package - requests them. I changed JobSubmissionProtocol to allow fetching the
counters on demand. I bumped up its version, which I guess I should have also
done on the previous patch in which the Counters object was included in the
JobStatus. (I could have kept it there, but it seemed a bit inconsistent now
that the counters are computed on demand, since everything else in the
JobStatus is kept up-to-date.)
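
The on-demand summing described above can be sketched roughly as follows. This is not the patch's code: the class and method names are hypothetical, and counters are simplified to a flat name-to-value map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class OnDemandCounters {
    // Latest counter snapshot per task; replaced whenever a task reports status.
    private final Map<String, Map<String, Long>> perTask = new LinkedHashMap<>();

    // Called on every task status update: stores the snapshot, does no summing.
    synchronized void updateTask(String taskId, Map<String, Long> counters) {
        perTask.put(taskId, new LinkedHashMap<>(counters));
    }

    // Called only when a client, page, or metrics callback asks for totals:
    // the sum over all task snapshots is computed at that moment.
    synchronized Map<String, Long> getJobCounters() {
        Map<String, Long> total = new LinkedHashMap<>();
        for (Map<String, Long> snapshot : perTask.values()) {
            for (Map.Entry<String, Long> e : snapshot.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }
}
```

The cost of summing moves from the frequent path (status updates) to the infrequent one (requests).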
A related JobTracker efficiency change was to stop updating the job metrics
(via the metrics package) every time a task update is received. Instead this
is now only done when the metrics timer-based callback occurs (see
JobTrackerMetrics.doUpdates()). This means that this callback needs to get a
list of the running jobs - I think I implemented that with the correct locking
in the new method getRunningJobs, but someone might want to double check.
Changes affecting TaskTracker:
The incrementCounter method should now be somewhat more efficient because it
doesn't (normally) involve any String or Long construction. The Counters
object now holds a map of maps, where the enum class name is the index into the
first. (Previously it was just a map, so the key had to be constructed by
string concatenation.) The serialized form of Counter is a bit more concise,
since the enum class name is only written once.
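
A minimal sketch of the map-of-maps idea, under assumptions: the names GroupedCounters, Cell, and DemoCounter are hypothetical, and the real Counters class may differ. The outer map is keyed by the enum class name, the inner map by the enum constant, and a mutable holder avoids building a new Long on each increment.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical counter enum used only for this illustration.
enum DemoCounter { BYTES_READ, BYTES_WRITTEN }

class GroupedCounters {
    // Mutable holder, so an increment does not allocate a new Long.
    static final class Cell { long value; }

    // Sorting by ordinal keeps counters in enum declaration order.
    private static final Comparator<Enum<?>> BY_ORDINAL =
        (a, b) -> Integer.compare(a.ordinal(), b.ordinal());

    // Outer key: enum class name (the group). Inner key: the enum constant.
    private final Map<String, Map<Enum<?>, Cell>> groups = new LinkedHashMap<>();

    synchronized void incrementCounter(Enum<?> key, long amount) {
        String group = key.getDeclaringClass().getName(); // no string concatenation
        Map<Enum<?>, Cell> counters = groups.get(group);
        if (counters == null) {
            counters = new TreeMap<>(BY_ORDINAL);
            groups.put(group, counters);
        }
        Cell cell = counters.get(key);
        if (cell == null) {
            cell = new Cell();
            counters.put(key, cell);
        }
        cell.value += amount;
    }

    synchronized long getCounter(Enum<?> key) {
        Map<Enum<?>, Cell> counters = groups.get(key.getDeclaringClass().getName());
        Cell cell = (counters == null) ? null : counters.get(key);
        return (cell == null) ? 0L : cell.value;
    }

    // Counter names of one group, in declaration order.
    synchronized List<String> namesInGroup(Class<?> enumClass) {
        List<String> names = new ArrayList<>();
        Map<Enum<?>, Cell> counters = groups.get(enumClass.getName());
        if (counters != null) {
            for (Enum<?> k : counters.keySet()) {
                names.add(k.name());
            }
        }
        return names;
    }
}
```

Allocation happens only the first time a group or counter is seen, which matches the "(normally)" qualifier above.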
I didn't change the PROGRESS_INTERVAL (1 second) at which MapTasks report
their progress to the TaskTracker, because I don't think it is relevant to
HADOOP-1048, which is a JobTracker problem.
JSP stuff:
There is a change that was requested in HADOOP-1038, namely to show both the
map and reduce phase counters on the main jobdetails page.
I added a refresh parameter to jobdetails so that a refresh time in seconds can
be specified (it was refreshing every 60 seconds). I changed the jobtracker
page to use the refresh parameter so that running jobs by default get refreshed
every 10 seconds, but completed and failed jobs don't get refreshed.
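
As a rough illustration of the refresh parameter, the helper below turns a request parameter into a meta refresh tag. The helper name, the fallback handling, and the convention that a non-positive value disables refreshing are all assumptions; the JSP in the patch may do this differently.

```java
class RefreshTag {
    // Returns a <meta http-equiv="refresh"> tag, or "" when refreshing is off.
    // refreshParam is the raw request parameter and may be null or malformed.
    static String metaRefresh(String refreshParam, int defaultSeconds) {
        int seconds = defaultSeconds;
        if (refreshParam != null) {
            try {
                seconds = Integer.parseInt(refreshParam.trim());
            } catch (NumberFormatException e) {
                // Malformed value: keep the default.
            }
        }
        // Assumed convention: zero or negative means "do not refresh".
        return seconds > 0
            ? "<meta http-equiv=\"refresh\" content=\"" + seconds + "\">"
            : "";
    }
}
```

Under this sketch, a link to a running job would pass refresh=10, and a link to a completed or failed job would pass a value that disables refreshing.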
The counts are now right-justified and decimal-formatted.
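
For reference, formatting of this kind could look like the sketch below. The class name, the "#,##0" pattern, and the table-cell markup are assumptions, not taken from the patch.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class CounterCell {
    // Formats a count with grouping separators; Locale.US is fixed here
    // so the separator is always a comma.
    static String format(long value) {
        DecimalFormat f =
            new DecimalFormat("#,##0", DecimalFormatSymbols.getInstance(Locale.US));
        return f.format(value);
    }

    // Wraps the formatted count in a right-aligned table cell.
    static String cell(long value) {
        return "<td align=\"right\">" + format(value) + "</td>";
    }
}
```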
> Counter names are ugly
> ----------------------
>
> Key: HADOOP-1041
> URL: https://issues.apache.org/jira/browse/HADOOP-1041
> Project: Hadoop
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.12.0
> Reporter: Owen O'Malley
> Assigned To: David Bowen
> Fix For: 0.12.0
>
> Attachments: 1041.patch
>
>
> Having the complete class name in the counter names makes them unique, but
> they are ugly to present to non-developers. It would be nice to have some way
> to have a nicer string presented to the user. Currently, the Enum is
> converted to a name like:
> key.getDeclaringClass().getName() + "#" + key.toString()
> which gives counter names like
> "org.apache.hadoop.examples.RandomWriter$Counters#BYTES_WRITTEN"
> which is unique, but not very user friendly. Perhaps, we should strip off the
> class name for presenting to the users, which would allow them to make nice
> names. In particular, you could define an enum type that overloaded toString
> to print a nice user friendly string.
> Thoughts?