[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702877#comment-13702877
 ] 

stack commented on HDFS-4960:
-----------------------------

bq. The .meta header exists so that we can check the version of the block file.

Thanks for taking a looksee mighty [~cmccabe].

What if we made a patch with a configuration option to skip the reading of 
metadata?  Or added a setSkipMetadata API, as we already have a 
setVerifyChecksum API?
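
To make the suggestion concrete, here is a rough sketch of how such a flag 
might gate the .meta read. Everything in it is hypothetical: 
SkipMetadataSketch, ReadOptions, readBlock and the metaReads counter are 
made-up stand-ins and not real HDFS classes. Only the setVerifyChecksum and 
setSkipMetadata names come from the discussion above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch only: none of these types exist in HDFS.
class SkipMetadataSketch {
    // Counts simulated .meta header reads (each one stands in for a disk seek).
    static final AtomicInteger metaReads = new AtomicInteger();

    // Hypothetical per-reader options, modelled on the setVerifyChecksum idea.
    static final class ReadOptions {
        private boolean verifyChecksum = true;
        private boolean skipMetadata = false;

        ReadOptions setVerifyChecksum(boolean verify) {
            this.verifyChecksum = verify;
            return this;
        }

        ReadOptions setSkipMetadata(boolean skip) {
            this.skipMetadata = skip;
            return this;
        }
    }

    // Stand-in for opening the .meta file and validating its version.
    static void readMetaHeader() {
        metaReads.incrementAndGet();
    }

    // Returns the block bytes, touching the .meta file only when required.
    static byte[] readBlock(byte[] blockData, ReadOptions opts) {
        // Checksum verification always needs the metadata, so skipMetadata
        // only takes effect when the caller has also disabled verification.
        if (opts.verifyChecksum || !opts.skipMetadata) {
            readMetaHeader();
        }
        return blockData.clone();
    }

    public static void main(String[] args) {
        byte[] block = {1, 2, 3};
        readBlock(block, new ReadOptions().setVerifyChecksum(false).setSkipMetadata(true));
        System.out.println("meta reads with skip enabled: " + metaReads.get());
        readBlock(block, new ReadOptions().setVerifyChecksum(false));
        System.out.println("meta reads after a default read: " + metaReads.get());
    }
}
```

In this sketch the default keeps today's behaviour (the header is still read 
when checksums are off), so only callers that opt in lose the version check.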

Agree, it is handy having blocks versioned so they can be evolved at a later 
date.  No harm having the version in a side file either, since we are probably 
going to do an extra seek to get the metadata anyway (would be coolio if the DN 
cached block metadata so we could save a seek, but maybe this would be more 
trouble than it is worth; we'd need to prove it useful in, say, a heavy 
random-read scenario).

But the metadata version doesn't look to have ever changed.  It is version 1 
(it looks like the version used to be a constant defined in FSDataset before 
it was defined inside this file, but even then the version was 1).  Meantime 
the lads here are paying a seek to learn something that is unlikely ever to 
change.  
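
For reference, a sketch of what that 7-byte read buys us. It assumes the 
header is laid out as a 2-byte version, a 1-byte checksum type and a 4-byte 
bytes-per-checksum value, which is my reading of the format and should be 
checked against the code. MetaHeaderSketch, Header, encode and parse are 
hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; the layout below is an assumption, not the HDFS source.
class MetaHeaderSketch {
    // Assumed layout: 2-byte version + 1-byte checksum type + 4-byte chunk size.
    static final int HEADER_LEN = 2 + 1 + 4;
    static final short EXPECTED_VERSION = 1;

    static final class Header {
        final short version;
        final byte checksumType;
        final int bytesPerChecksum;

        Header(short version, byte checksumType, int bytesPerChecksum) {
            this.version = version;
            this.checksumType = checksumType;
            this.bytesPerChecksum = bytesPerChecksum;
        }
    }

    // Builds a header in big-endian order (ByteBuffer's default byte order).
    static byte[] encode(short version, byte checksumType, int bytesPerChecksum) {
        return ByteBuffer.allocate(HEADER_LEN)
                .putShort(version)
                .put(checksumType)
                .putInt(bytesPerChecksum)
                .array();
    }

    // Parses the first HEADER_LEN bytes and rejects any version other than 1.
    static Header parse(byte[] head) {
        if (head.length < HEADER_LEN) {
            throw new IllegalArgumentException("need " + HEADER_LEN + " bytes, got " + head.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(head);
        short version = buf.getShort();
        if (version != EXPECTED_VERSION) {
            throw new IllegalStateException("unexpected .meta version " + version);
        }
        byte checksumType = buf.get();
        int bytesPerChecksum = buf.getInt();
        return new Header(version, checksumType, bytesPerChecksum);
    }

    public static void main(String[] args) {
        Header h = parse(encode((short) 1, (byte) 1, 512));
        System.out.println("version=" + h.version + " bytesPerChecksum=" + h.bytesPerChecksum);
    }
}
```

When checksums are skipped, the type and chunk size go unused, so the only 
thing the read can tell the reader is the version.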

(The filename would be a good place for the version, but that would be a 
massive change.)

Thanks Colin.
                
> Unnecessary .meta seeks even when skip checksum is true
> -------------------------------------------------------
>
>                 Key: HDFS-4960
>                 URL: https://issues.apache.org/jira/browse/HDFS-4960
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: Varun Sharma
>            Assignee: Varun Sharma
>         Attachments: 4960-branch2.patch, 4960-trunk.patch
>
>
> While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
> unnecessary seeks into .meta files. Each seek was a 7-byte read at the head 
> of the file, which attempts to validate the version number. Since the client 
> is requesting no checksum, we should not need to touch the .meta file at 
> all.
> Since the purpose of skip checksum is also to avoid the performance penalty 
> of the extra seek, we should not be seeking into .meta if skip checksum is 
> true.

