[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702832#comment-13702832
 ] 

Varun Sharma commented on HDFS-4960:
------------------------------------

I don't know about the future plans for .meta. However, I think it is currently 
only used for storing checksums, and if checksums are not requested, there is 
no need to seek into the .meta file. I think the FB branch already has this 
change. I personally think that inlining metadata in blocks, instead of 
storing it separately in a .meta file, is the way to go for the future: the 
separate file cripples HDFS for real-time workloads.
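
To illustrate the idea of not touching .meta when checksums are skipped, here 
is a minimal sketch. It is not the attached patch and not HDFS code: the class 
LazyMetaReader, its methods, and the temp-file helper are all hypothetical 
names made up for this example. It only shows the control flow, with a counter 
so the number of .meta accesses can be observed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; none of these names are real HDFS classes.
final class LazyMetaReader {
    private final Path blockFile;
    private final Path metaFile;
    private int metaReads = 0; // counts .meta accesses so the effect is observable

    LazyMetaReader(Path blockFile, Path metaFile) {
        this.blockFile = blockFile;
        this.metaFile = metaFile;
    }

    int metaReads() {
        return metaReads;
    }

    // Reads len bytes at offset from the block file. The .meta file is
    // opened only when the caller has asked for checksums.
    byte[] read(long offset, int len, boolean skipChecksum) {
        try {
            if (!skipChecksum) {
                readMetaHeader();
            }
            try (RandomAccessFile block = new RandomAccessFile(blockFile.toFile(), "r")) {
                byte[] buf = new byte[len];
                block.seek(offset);
                block.readFully(buf);
                return buf;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for the 7 byte header read; real checksum handling is omitted.
    private void readMetaHeader() throws IOException {
        try (RandomAccessFile meta = new RandomAccessFile(metaFile.toFile(), "r")) {
            byte[] header = new byte[7];
            meta.seek(0);
            meta.readFully(header);
            metaReads++;
        }
    }

    // Creates throwaway block and .meta files for the demo.
    static LazyMetaReader withTempFiles(String blockContents) {
        try {
            Path block = Files.createTempFile("blk_demo", "");
            Path meta = Files.createTempFile("blk_demo", ".meta");
            Files.write(block, blockContents.getBytes(StandardCharsets.US_ASCII));
            Files.write(meta, new byte[7]);
            block.toFile().deleteOnExit();
            meta.toFile().deleteOnExit();
            return new LazyMetaReader(block, meta);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        LazyMetaReader reader = withTempFiles("hello world");
        reader.read(6, 5, true);   // skip checksum: .meta is never opened
        System.out.println("meta reads after skip-checksum read: " + reader.metaReads());
        reader.read(6, 5, false);  // checksums requested: .meta header is read
        System.out.println("meta reads after checksummed read: " + reader.metaReads());
    }
}
```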

I did not have a large enough data set, so all the .meta files were probably 
in the fs cache.

lseek in .meta  ~50 microseconds
lseek in block  ~50 microseconds
read .meta      ~50 microseconds
read block      ~300 microseconds

So, looking only at the system calls, the .meta calls account for ~20% of the 
time (roughly 100 of 450 microseconds per read). However, this does not show 
up in the end-to-end test because, in general, there are a bunch of other 
contention issues inside HDFS + HBase that hog CPU resources.
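
The ~20% figure can be checked with a few lines of arithmetic. The sketch 
below assumes the percentage means the share of total system call time spent 
on .meta (lseek plus read), using the rough timings listed above. The class 
and method names are made up for this example.

```java
import java.util.Locale;

// Hypothetical helper; inputs are the rough per-call timings in microseconds.
final class SyscallShare {
    // Fraction of total syscall time spent on the .meta file.
    static double metaShare(double metaSeekUs, double metaReadUs,
                            double blockSeekUs, double blockReadUs) {
        double meta = metaSeekUs + metaReadUs;
        double total = meta + blockSeekUs + blockReadUs;
        return meta / total;
    }

    public static void main(String[] args) {
        // 50 + 50 for .meta, 50 + 300 for the block: 100 / 450
        double share = metaShare(50, 50, 50, 300);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", 100 * share));
    }
}
```

With these inputs the share comes out to 100 / 450, a little over 22%.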
                
> Unnecessary .meta seeks even when skip checksum is true
> -------------------------------------------------------
>
>                 Key: HDFS-4960
>                 URL: https://issues.apache.org/jira/browse/HDFS-4960
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: Varun Sharma
>            Assignee: Varun Sharma
>         Attachments: 4960-branch2.patch, 4960-trunk.patch
>
>
> While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
> unnecessary seeks into .meta files. Each seek was a 7 byte read at the head 
> of the file, which attempts to validate the version #. Since the client is 
> requesting no-checksum, we should not need to touch the .meta file at all.
> Since the purpose of skip checksum is also to avoid the performance penalty 
> of the extra seek, we should not seek into .meta if skip checksum is true.
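
For context on the 7 byte read mentioned in the description, here is a sketch 
of parsing such a header. It assumes the layout is a 2 byte version followed 
by a 1 byte checksum type and a 4 byte bytes-per-checksum value, all 
big-endian. That layout is my understanding and is not stated in this issue, 
and the class MetaHeaderSketch is a made-up name, not an HDFS class.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical parser; the field layout is an assumption, see the note above.
final class MetaHeaderSketch {
    static final int HEADER_LEN = 7; // 2 (version) + 1 (type) + 4 (bytesPerChecksum)

    final short version;
    final byte checksumType;
    final int bytesPerChecksum;

    private MetaHeaderSketch(short version, byte checksumType, int bytesPerChecksum) {
        this.version = version;
        this.checksumType = checksumType;
        this.bytesPerChecksum = bytesPerChecksum;
    }

    // Parses the first 7 bytes of a .meta file under the assumed layout.
    static MetaHeaderSketch parse(byte[] header) {
        if (header.length < HEADER_LEN) {
            throw new IllegalArgumentException("need at least " + HEADER_LEN + " bytes");
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header))) {
            short version = in.readShort();
            byte type = in.readByte();
            int bytesPerChecksum = in.readInt();
            return new MetaHeaderSketch(version, type, bytesPerChecksum);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Example bytes chosen for illustration: version 1, type 2, 512 bytes per checksum.
        byte[] raw = {0, 1, 2, 0, 0, 2, 0};
        MetaHeaderSketch h = parse(raw);
        System.out.println(h.version + " " + h.checksumType + " " + h.bytesPerChecksum);
    }
}
```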

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
