[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033295#comment-13033295
 ] 

Mikhail Bautin commented on HBASE-3857:
---------------------------------------

Hi St.Ack,

Thank you for all the feedback! 

To scan an HFile in the new format we don't even need the root index. Each 
block is self-sufficient in that the header contains all the information 
necessary to decode the block, except the compression type, which is found in 
the trailer. We could create an "HFile fix" tool that would rebuild the block 
index if necessary. In HFile format v1, however, if the block index is corrupt, 
we would not be able to read any data blocks at all. So I don't see how HFile 
format v2 is more brittle than v1.
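
To illustrate the point about rebuilding the index from self-describing blocks, here is a minimal sketch of a sequential block walk. It is not the actual v2 layout: the 8-byte magic "BLKMAGIC", the 4-byte size field, and the names BlockWalkSketch, writeBlock, and rebuildIndex are all assumptions made up for this example. It also treats payloads as opaque bytes, so the compression type from the trailer is not needed here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: this header layout is assumed, not taken from the design doc.
class BlockWalkSketch {
    // Hypothetical header: 8-byte magic followed by a 4-byte on-disk payload size.
    static final byte[] MAGIC = "BLKMAGIC".getBytes(StandardCharsets.US_ASCII);
    static final int HEADER_SIZE = MAGIC.length + Integer.BYTES;

    /** Appends one block (header + payload) at the buffer's current position. */
    static void writeBlock(ByteBuffer out, byte[] payload) {
        out.put(MAGIC).putInt(payload.length).put(payload);
    }

    /**
     * Walks blocks from offset 0 using only the per-block headers and returns
     * the start offset of every block found. No index is consulted.
     */
    static List<Long> rebuildIndex(ByteBuffer file) {
        List<Long> offsets = new ArrayList<>();
        int pos = 0;
        while (pos + HEADER_SIZE <= file.limit()) {
            byte[] magic = new byte[MAGIC.length];
            for (int i = 0; i < magic.length; i++) {
                magic[i] = file.get(pos + i); // absolute read, position unchanged
            }
            if (!Arrays.equals(magic, MAGIC)) {
                break; // not a block header: stop walking
            }
            int size = file.getInt(pos + MAGIC.length);
            if (size < 0 || size > file.limit() - pos - HEADER_SIZE) {
                break; // truncated or corrupt block
            }
            offsets.add((long) pos);
            pos += HEADER_SIZE + size; // the header tells us where the next block starts
        }
        return offsets;
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.allocate(64);
        writeBlock(file, new byte[5]);
        writeBlock(file, new byte[7]);
        file.flip();
        System.out.println(rebuildIndex(file)); // prints [0, 17]
    }
}
```

A real "HFile fix" tool would presumably also read the first key of each block to rebuild index entries; the sketch only recovers block offsets.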

Implementation update: a load test (org.apache.hadoop.hbase.manual.HBaseTest) 
is successfully running on a 5-node cluster, and I see some 2-level indexes 
being created with 5-15 root-level entries so far (with the max index block 
size set to 128K), as well as some compound ROW Bloom filters.

Regards,
--Mikhail


> Change the HFile Format
> -----------------------
>
>                 Key: HBASE-3857
>                 URL: https://issues.apache.org/jira/browse/HBASE-3857
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Liyin Tang
>            Assignee: Mikhail Bautin
>         Attachments: hfile_format_v2_design_draft_0.1.pdf
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
