[ https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702832#comment-13702832 ]
Varun Sharma commented on HDFS-4960:
------------------------------------

I don't know about the future plans for .meta. However, I think it is currently used only for storing checksums, and if checksums are not requested, there is no need to seek into the .meta file. I think the FB branch already has this change.

I personally think that inlining metadata in blocks, instead of storing it separately in a .meta file, is the way to go in the future; the separate file cripples HDFS for real-time workloads.

I did not have a large enough data set, so probably all the .meta files were in the fs cache.

lseek in .meta  ~50 microseconds
lseek in block  ~50 microseconds
read .meta      ~50 microseconds
read block      ~300 microseconds

So, looking only at the system calls, the .meta accesses are ~20% of the cost. However, this does not show up in the end-to-end test because, in general, there are a bunch of other contention issues inside HDFS + HBase that hog CPU resources.

> Unnecessary .meta seeks even when skip checksum is true
> -------------------------------------------------------
>
>                 Key: HDFS-4960
>                 URL: https://issues.apache.org/jira/browse/HDFS-4960
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: Varun Sharma
>            Assignee: Varun Sharma
>         Attachments: 4960-branch2.patch, 4960-trunk.patch
>
>
> While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found
> unnecessary seeks into .meta files. Each seek was a 7-byte read at the head
> of the file, which attempts to validate the version number. Since the client
> is requesting no checksum, we should not need to touch the .meta file at
> all.
> Since the purpose of skip checksum is also to avoid the performance penalty
> of the extra seek, we should not seek into .meta if skip checksum is
> true.
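
The ~20% estimate in the comment above can be checked from the four quoted syscall timings. The sketch below is only a back-of-the-envelope calculation, not HDFS code: the class and method names are hypothetical, and it assumes the estimate means "lseek plus read on .meta, divided by the sum of all four calls", which the comment does not spell out.

```java
import java.util.Locale;

/** Back-of-the-envelope check of the syscall timings quoted in the comment (hypothetical helper). */
final class MetaSeekOverhead {

    /**
     * Fraction of per-read syscall time spent on the .meta file.
     * All arguments are latencies in microseconds.
     */
    static double metaShare(double lseekMeta, double lseekBlock,
                            double readMeta, double readBlock) {
        // Cost that could be avoided when the client asks to skip checksums.
        double metaCost = lseekMeta + readMeta;
        double totalCost = metaCost + lseekBlock + readBlock;
        return metaCost / totalCost;
    }

    public static void main(String[] args) {
        // Approximate timings from the comment, measured with the files in fs cache.
        double share = metaShare(50, 50, 50, 300);
        System.out.printf(Locale.ROOT,
                "share of syscall time spent on .meta: %.0f%%%n", share * 100);
    }
}
```

With the quoted numbers this is 100 / 450, or roughly 22%, which is consistent with the "~20%" in the comment under the assumption stated above.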