This is an automated email from the ASF dual-hosted git repository.

elserj pushed a commit to branch HBASE-26067
in repository https://gitbox.apache.org/repos/asf/hbase.git


The following commit(s) were added to refs/heads/HBASE-26067 by this push:
     new 1dbdefc  HBASE-26265 Update ref guide to mention the new store file tracker im… (#3942)
1dbdefc is described below

commit 1dbdefc7c165e13de1272d9f03a8ef78980d36e2
Author: Wellington Ramos Chevreuil <wchevre...@apache.org>
AuthorDate: Thu Dec 16 21:07:38 2021 +0000

    HBASE-26265 Update ref guide to mention the new store file tracker im… (#3942)
---
 .../asciidoc/_chapters/store_file_tracking.adoc    | 145 +++++++++++++++++++++
 src/main/asciidoc/book.adoc                        |   1 +
 2 files changed, 146 insertions(+)

diff --git a/src/main/asciidoc/_chapters/store_file_tracking.adoc b/src/main/asciidoc/_chapters/store_file_tracking.adoc
new file mode 100644
index 0000000..74d802f
--- /dev/null
+++ b/src/main/asciidoc/_chapters/store_file_tracking.adoc
@@ -0,0 +1,145 @@
+////
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+////
+
+[[storefiletracking]]
+= Store File Tracking
+:doctype: book
+:numbered:
+:toc: left
+:icons: font
+:experimental:
+
+== Overview
+
+This feature introduces an abstraction layer to track the store files still used/needed by store
+engines, making it possible to plug in different approaches to identifying the store files
+required by a given store.
+
+Historically, HBase internals have relied on creating hfiles in temporary directories first, renaming
+those files to the actual store directory at operation commit time. That's a simple and convenient
+way to separate transient files from already finalised files that are ready to serve client reads with data.
+This approach works well with strongly consistent file systems, but with the popularity of less consistent
+file systems, mainly object stores that can be used like file systems, the dependency on atomic rename
+operations starts to introduce performance penalties. Deployments on the Amazon S3 object store, in
+particular, have been the most affected, due to its lack of atomic renames. The HBase community
+temporarily bypassed this problem by building a distributed locking layer called HBOSS,
+to guarantee atomicity of operations against S3.
+
+With *Store File Tracking*, the decision on where to originally create new hfiles and how to proceed upon
+commit is delegated to the specific Store File Tracking implementation.
+The implementation can be set at the HBase service level in *hbase-site.xml* or at the
+Table or Column Family level via the TableDescriptor configuration.
+
+NOTE: When the store file tracking implementation is specified in *hbase-site.xml*, this configuration is
+also propagated into a table's configuration at table creation time. This is to avoid dangerous
+configuration mismatches between processes, which could potentially lead to data loss.
+
+== Available Implementations
+
+The initial version of Store File Tracking provides three built-in implementations:
+
+* DEFAULT
+* FILE
+* MIGRATION
+
+### DEFAULT
+
+As per the name, this is the Store File Tracking implementation used by default when no explicit
+configuration has been defined. The DEFAULT tracker implements the standard approach using temporary
+directories and renames. This is how HBase implicitly tracked store files in all previous versions.
+
+### FILE
+
+A file tracker implementation that creates new files straight in the store directory, avoiding the
+need for rename operations. It keeps a list of committed hfiles in memory, backed by meta files, in
+each store directory. Whenever a new hfile is committed, the list of _tracked files_ in the given
+store is updated and a new meta file is written with the contents of this list, discarding the previous
+meta file, which now contains an outdated list.
+
+### MIGRATION
+
+A special implementation to be used when swapping between Store File Tracking implementations on
+pre-existing tables that already contain data and, therefore, files being tracked under a specific
+logic.
+
+== Usage
+
+For fresh deployments that don't yet contain any user data, the *FILE* implementation can simply be set as
+the value of the *hbase.store.file-tracker.impl* property in the global *hbase-site.xml* configuration, prior
+to the first HBase start. Omitting this property sets the *DEFAULT* implementation.
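+
+As a minimal sketch of the global setting described above, assuming a fresh deployment and the
+usual property layout of *hbase-site.xml*, the entry might look like the following. Only the
+property name and the *FILE* value come from this chapter; the comment is illustrative.
+
+[source,xml]
+----
+<!-- Illustrative fragment: selects the FILE tracker cluster-wide, set before the first HBase start. -->
+<property>
+  <name>hbase.store.file-tracker.impl</name>
+  <value>FILE</value>
+</property>
+----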
+
+For clusters with data that are upgraded to a version of HBase containing the store file tracking
+feature, the Store File Tracking implementation can only be changed with the *MIGRATION*
+implementation, so that the _new tracker_ can safely build its list of tracked files based on the
+list of the _current tracker_.
+
+NOTE: The MIGRATION tracker should NOT be set in the global configuration. To use it, follow the section below
+about setting Store File Tracking in the Table or Column Family configuration.
+
+
+### Configuring for Table or Column Family
+
+Setting the Store File Tracking configuration globally may not always be possible or desired, for example,
+in the case of upgraded clusters with pre-existing user data.
+Store File Tracking can instead be set in the Table or Column Family level configuration.
+For example, to specify the *FILE* implementation in the table configuration at table creation time,
+the following should be applied:
+
+----
+create 'my-table', 'f1', 'f2', {CONFIGURATION => {'hbase.store.file-tracker.impl' => 'FILE'}}
+----
+
+To define *FILE* for a specific Column Family:
+
+----
+create 'my-table', {NAME => '1', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'FILE'}}
+----
+
+### Switching trackers at Table or Column Family
+
+A very common scenario is to set Store File Tracking on pre-existing HBase deployments that have
+been upgraded to a version that supports this feature. To apply the FILE tracker, tables effectively
+need to be migrated from the DEFAULT tracker to the FILE tracker. As explained previously, this
+process requires the special MIGRATION tracker implementation, which can only be
+specified at Table or Column Family level.
+
+For example, to switch the _tracker_ from *DEFAULT* to *FILE* in a table configuration:
+
+----
+alter 'my-table', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'MIGRATION',
+'hbase.store.file-tracker.migration.src.impl' => 'DEFAULT',
+'hbase.store.file-tracker.migration.dst.impl' => 'FILE'}
+----
+
+To apply a similar switch at the column family level:
+
+----
+alter 'my-table', {NAME => 'f1', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'MIGRATION',
+'hbase.store.file-tracker.migration.src.impl' => 'DEFAULT',
+'hbase.store.file-tracker.migration.dst.impl' => 'FILE'}}
+----
+
+Once all table regions are back online, remember to disable MIGRATION by setting
+*hbase.store.file-tracker.impl* to the value used for *hbase.store.file-tracker.migration.dst.impl*.
+In the above example, that would be as follows:
+
+----
+alter 'my-table', CONFIGURATION => {'hbase.store.file-tracker.impl' => 'FILE'}
+----
diff --git a/src/main/asciidoc/book.adoc b/src/main/asciidoc/book.adoc
index a622786..b8c648e 100644
--- a/src/main/asciidoc/book.adoc
+++ b/src/main/asciidoc/book.adoc
@@ -89,6 +89,7 @@ include::_chapters/zookeeper.adoc[]
 include::_chapters/community.adoc[]
 include::_chapters/hbtop.adoc[]
 include::_chapters/tracing.adoc[]
+include::_chapters/store_file_tracking.adoc[]
 
 = Appendix
 
