[ https://issues.apache.org/jira/browse/HDFS-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216509#comment-15216509 ]
Hitesh Shah commented on HDFS-10175:
------------------------------------

[~cmccabe] Cluster admins who keep tabs on Hive queries or Pig scripts to see what kind of ops they are doing rely on the job counters today. Unfortunately, the current job counters (read ops, write ops, etc.) are quite old and very high-level, and cannot be used to find bad jobs that, say, create thousands of files (for partitions, etc.). The approach Jitendra and Mingliang are proposing seems to integrate well with MapReduce and other frameworks, where counters could be created that map to the filesystem stats. Additionally, these counters would be available via the job history for historical analysis as needed. This is something that would likely be needed for all jobs.

To some extent I do agree that all of the 50-odd metrics being tracked may not be useful for every use-case. However, enabling this via HTrace and turning HTrace on for all jobs is not an option either, as it does not allow selective tracking of only certain info. Enabling HTrace for a job implies in-depth profiling, which is not needed unless a very specific investigation calls for it. Do you have any suggestions on how the app layers above HDFS can obtain more in-depth stats while keeping all of these stats accessible for all jobs?

> add per-operation stats to FileSystem.Statistics
> ------------------------------------------------
>
>                 Key: HDFS-10175
>                 URL: https://issues.apache.org/jira/browse/HDFS-10175
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Ram Venkatesh
>            Assignee: Mingliang Liu
>         Attachments: HDFS-10175.000.patch, HDFS-10175.001.patch, HDFS-10175.002.patch, HDFS-10175.003.patch
>
>
> Currently FileSystem.Statistics exposes the following statistics:
> BytesRead
> BytesWritten
> ReadOps
> LargeReadOps
> WriteOps
> These are in turn exposed as job counters by MapReduce and other frameworks.
> There is logic within DfsClient to map operations to these counters that can
> be confusing; for instance, mkdirs counts as a writeOp.
> Proposed enhancement:
> Add a statistic for each DfsClient operation, including create, append,
> createSymlink, delete, exists, mkdirs and rename, and expose them as new
> properties on the Statistics object. The operation-specific counters can be
> used for analyzing the load imposed by a particular job on HDFS.
> For example, we can use them to identify jobs that end up creating a large
> number of files.
> Once this information is available in the Statistics object, app
> frameworks like MapReduce can expose them as additional counters to be
> aggregated and recorded as part of the job summary.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
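For readers following along, the per-operation counters proposed in the quoted description can be sketched as a small thread-safe counter class. This is only an illustration of the idea under my own assumptions, not the actual HDFS-10175 patch; the class name `PerOpStatistics` and the enum are hypothetical, with the operation names taken from the issue description:

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of per-operation counters; not the HDFS-10175 patch itself. */
public class PerOpStatistics {

    /** The DfsClient operations named in the issue description. */
    public enum OpType { CREATE, APPEND, CREATE_SYMLINK, DELETE, EXISTS, MKDIRS, RENAME }

    // EnumMap gives a dense, fast map keyed by the operation enum;
    // AtomicLong keeps each counter safe under concurrent client threads.
    private final Map<OpType, AtomicLong> counters = new EnumMap<>(OpType.class);

    public PerOpStatistics() {
        for (OpType op : OpType.values()) {
            counters.put(op, new AtomicLong());
        }
    }

    /** Would be called from the client wrapper around each RPC. */
    public void increment(OpType op) {
        counters.get(op).incrementAndGet();
    }

    public long get(OpType op) {
        return counters.get(op).get();
    }

    public static void main(String[] args) {
        PerOpStatistics stats = new PerOpStatistics();
        // A job that creates many files would stand out via its CREATE count,
        // instead of being lumped into a single generic writeOps counter.
        for (int i = 0; i < 1000; i++) {
            stats.increment(OpType.CREATE);
        }
        stats.increment(OpType.MKDIRS);
        System.out.println("create=" + stats.get(OpType.CREATE)
            + " mkdirs=" + stats.get(OpType.MKDIRS));
    }
}
```

With counters like these on the Statistics object, a framework such as MapReduce could surface each operation as its own job counter, which is what makes the "job that creates thousands of files" case detectable from the job history.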