[ https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033295#comment-13033295 ]
Mikhail Bautin commented on HBASE-3857:
---------------------------------------
Hi St.Ack,
Thank you for all the feedback!
To scan an HFile in the new format we don't even need the root index. Each
block is self-sufficient in that the header contains all the information
necessary to decode the block, except the compression type, which is found in
the trailer. We could create an "HFile fix" tool that would rebuild the block
index if necessary. In HFile format v1, however, if the block index is corrupt,
we would not be able to read any data blocks at all. So I don't see how HFile
format v2 is more brittle than v1.
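To illustrate the point about self-sufficient blocks, here is a minimal sketch
of a sequential scan that walks blocks using only their headers and never
consults an index. The header layout (8-byte magic plus a 4-byte payload size),
the magic values, and the names write_block and scan_blocks are hypothetical
simplifications for illustration, not the actual HFile v2 layout. Compression
is left out, since the compression type would come from the trailer.

```python
import io
import struct

# Hypothetical simplified header: 8-byte magic + 4-byte payload size, big-endian.
HEADER = struct.Struct(">8sI")
DATA_MAGIC = b"DATABLK*"   # assumed marker for data blocks
INDEX_MAGIC = b"IDXBLK**"  # assumed marker for index blocks


def write_block(out, magic, payload):
    """Append one block: the header followed by the payload."""
    out.write(HEADER.pack(magic, len(payload)))
    out.write(payload)


def scan_blocks(stream, end_offset):
    """Yield (offset, magic, payload) for each block in [0, end_offset).

    Only block headers are consulted; no index is needed.
    """
    stream.seek(0)
    while stream.tell() < end_offset:
        offset = stream.tell()
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            raise ValueError(f"truncated header at offset {offset}")
        magic, size = HEADER.unpack(raw)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError(f"truncated block at offset {offset}")
        yield offset, magic, payload


# Build a tiny in-memory "file": two data blocks and one index block.
buf = io.BytesIO()
write_block(buf, DATA_MAGIC, b"row1=a")
write_block(buf, DATA_MAGIC, b"row2=b")
write_block(buf, INDEX_MAGIC, b"(index bytes, possibly corrupt)")
end = buf.tell()

# Recover every data block without ever interpreting the index block.
data_payloads = [p for _, m, p in scan_blocks(buf, end) if m == DATA_MAGIC]
```

In this sketch the index block is skipped by its header alone, which is the
property an "HFile fix" tool could rely on to rebuild a damaged index.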
Implementation update: a load test (org.apache.hadoop.hbase.manual.HBaseTest)
is successfully running on a 5-node cluster, and I see some 2-level indexes
being created with 5-15 root-level entries so far (with the max index block
size set to 128K), as well as some compound ROW Bloom filters.
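A 2-level index of the kind described above can be pictured as follows: index
entries are packed into leaf index blocks until a size cap is reached, and the
root keeps one entry per leaf block. This is a rough sketch only. The names
build_two_level_index and lookup, and the per-entry size estimate (key length
plus an 8-byte offset), are assumptions for illustration, not the HFile v2
writer's actual logic.

```python
import bisect


def build_two_level_index(entries, max_block_bytes):
    """Split sorted (first_key, data_block_offset) entries into leaf blocks.

    Returns (root, leaves): leaves is a list of entry lists, and root[i]
    is the first key of leaves[i].
    """
    leaves, current, current_bytes = [], [], 0
    for key, offset in entries:
        entry_bytes = len(key) + 8  # assumed size: key plus 8-byte offset
        if current and current_bytes + entry_bytes > max_block_bytes:
            leaves.append(current)
            current, current_bytes = [], 0
        current.append((key, offset))
        current_bytes += entry_bytes
    if current:
        leaves.append(current)
    root = [leaf[0][0] for leaf in leaves]
    return root, leaves


def lookup(root, leaves, key):
    """Return the offset of the data block that may hold key, or None."""
    i = bisect.bisect_right(root, key) - 1
    if i < 0:
        return None  # key sorts before the first indexed key
    leaf = leaves[i]
    keys = [k for k, _ in leaf]
    j = bisect.bisect_right(keys, key) - 1
    return leaf[j][1]


# Ten data blocks, 15 bytes per index entry, and a tiny 48-byte cap so that
# the split into several leaf blocks is visible.
entries = [(f"row{i:04d}".encode(), i * 65536) for i in range(10)]
root, leaves = build_two_level_index(entries, max_block_bytes=48)
```

With a realistic cap such as 128K, the number of root-level entries is simply
the number of leaf blocks the cap forces the writer to create.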
Regards,
--Mikhail
> Change the HFile Format
> -----------------------
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
> Issue Type: New Feature
> Reporter: Liyin Tang
> Assignee: Mikhail Bautin
> Attachments: hfile_format_v2_design_draft_0.1.pdf
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format
> of the HFile. The new format proposal is attached here. Thanks to Mikhail
> Bautin for the documentation.