[
https://issues.apache.org/jira/browse/HBASE-13260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518456#comment-14518456
]
Nick Dimiduk commented on HBASE-13260:
--------------------------------------
Had a chat with [~enis] offline on this. Here's my understanding/summary:
- this patch cleans up region code in a way that everyone likes, +1 for that
bit
- procV2 is used for all DDL operations in 1.1. DDL is a relatively small
number of edits to wal
- procV2 is not used for region assignment in 1.1, the use-case that involves
potentially lots of wal edits
- proc-wal is a brand new file format, new code, &c.
- proc-wal is probably faster than region-wal, but we now think it's less than
an order of magnitude faster
- proc-wal and region-wal are interchangeable for the purposes of procV2
For branch-1.1, I'm in favor of region-wal for procV2 because it's *NOT* in a
high throughput situation AND it means we can avoid supporting a new file
format. Future improvements in performance to region wal help everyone. If we
can't get it where we need perf-wise, we can always bring back proc-wal for
region assignment operations -- that card is still up our sleeve.
[~stack], [~mbertozzi], [~enis] are you swayed?
> Bootstrap Tables for fun and profit
> ------------------------------------
>
> Key: HBASE-13260
> URL: https://issues.apache.org/jira/browse/HBASE-13260
> Project: HBase
> Issue Type: Bug
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.1.0
>
> Attachments: hbase-13260_bench.patch, hbase-13260_prototype.patch
>
>
> Over at the ProcV2 discussions (HBASE-12439) and elsewhere I was mentioning an
> idea where we may want to use regular old regions to store/persist some data
> needed for HBase master to operate.
> We regularly use system tables for storing system data. acl, meta, namespace,
> quota are some examples. We also store the table state in meta now. Some data
> is persisted in zk only (replication peers and replication state, etc). We
> are moving away from zk as a permanent storage. As any self-respecting
> database does, we should store almost all of our data in HBase itself.
> However, we have an "availability" dependency between different kinds of
> data. For example all system tables need meta to be assigned first. All
> master operations need ns table to be assigned, etc.
> For at least two types of data, (1) procedure v2 states, (2) RS groups in
> HBASE-6721, we cannot depend on meta being assigned since "assignment" itself
> will depend on accessing this data. The solution in (1) is to implement a
> custom WAL format, and custom recover lease and WAL recovery. The solution in
> (2) is to have the table to store this data, but also cache it in zk for
> bootstrapping initial assignments.
> For solving both of the above (and possible future use cases if any), I
> propose we add a "bootstrap table" concept, which is:
> - A set of predefined tables hosted in a separate dir in HDFS.
> - A table is only 1 region, not splittable
> - Not assigned through regular assignment
> - Hosted only on 1 server (typically master)
> - Has a dedicated WAL.
> - A service does WAL recovery + fencing for these tables.
> This has the benefit of using a region to keep the data, and it frees us from
> re-implementing caching: we can use the same WAL / Memstore / Recovery
> mechanisms that are battle-tested.
>
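The bootstrap table properties listed in the description can be read as a set of invariants. Below is a minimal sketch of how they might be checked, assuming a hypothetical `BootstrapTableSpec` value class and `violations` helper. Neither is an HBase API, and the separate HDFS directory and the WAL recovery/fencing service are not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: none of these names exist in HBase.
class BootstrapTableSketch {

    /** Value holder for the properties listed in the proposal. */
    static final class BootstrapTableSpec {
        final String name;               // illustrative table name
        final int regionCount;           // proposal: exactly 1 region
        final boolean splittable;        // proposal: not splittable
        final boolean regularAssignment; // proposal: not assigned the regular way
        final int hostingServers;        // proposal: hosted on 1 server (typically master)
        final boolean dedicatedWal;      // proposal: has a dedicated WAL

        BootstrapTableSpec(String name, int regionCount, boolean splittable,
                           boolean regularAssignment, int hostingServers,
                           boolean dedicatedWal) {
            this.name = name;
            this.regionCount = regionCount;
            this.splittable = splittable;
            this.regularAssignment = regularAssignment;
            this.hostingServers = hostingServers;
            this.dedicatedWal = dedicatedWal;
        }
    }

    /** Returns one message per proposal property that the spec breaks. */
    static List<String> violations(BootstrapTableSpec spec) {
        List<String> problems = new ArrayList<>();
        if (spec.regionCount != 1) {
            problems.add(spec.name + ": must have exactly 1 region");
        }
        if (spec.splittable) {
            problems.add(spec.name + ": must not be splittable");
        }
        if (spec.regularAssignment) {
            problems.add(spec.name + ": must bypass regular assignment");
        }
        if (spec.hostingServers != 1) {
            problems.add(spec.name + ": must be hosted on exactly 1 server");
        }
        if (!spec.dedicatedWal) {
            problems.add(spec.name + ": must have a dedicated WAL");
        }
        return problems;
    }
}
```

For example, `violations(new BootstrapTableSpec("procedures", 1, false, false, 1, true))` returns an empty list, where the table name "procedures" is made up for illustration.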