http://git-wip-us.apache.org/repos/asf/hbase/blob/abaea39e/src/main/asciidoc/chapters/ops_mgt.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/chapters/ops_mgt.adoc 
b/src/main/asciidoc/chapters/ops_mgt.adoc
deleted file mode 100644
index 8046591..0000000
--- a/src/main/asciidoc/chapters/ops_mgt.adoc
+++ /dev/null
@@ -1,2009 +0,0 @@
-////
-/**
- *
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-////
-
-[[ops_mgt]]
-= Apache HBase Operational Management
-:doctype: book
-:numbered:
-:toc: left
-:icons: font
-:experimental:
-:docinfo1:
-
-This chapter will cover operational tools and practices required of a running 
Apache HBase cluster.
-The subject of operations is related to the topics of <<trouble,trouble>>, 
<<performance,performance>>, and <<configuration,configuration>> but is a 
distinct topic in itself. 
-
-[[tools]]
-== HBase Tools and Utilities
-
-HBase provides several tools for administration, analysis, and debugging of 
your cluster.
-The entry-point to most of these tools is the [path]_bin/hbase_ command, 
though some tools are available in the [path]_dev-support/_ directory.
-
-To see usage instructions for the [path]_bin/hbase_ command, run it with no arguments, or with the +-h+ argument.
-These are the usage instructions for HBase 0.98.x.
-Some commands, such as +version+, +pe+, +ltt+, +clean+, are not available in 
previous versions.
-
-----
-$ bin/hbase
-Usage: hbase [<options>] <command> [<args>]
-Options:
-  --config DIR    Configuration direction to use. Default: ./conf
-  --hosts HOSTS   Override the list in 'regionservers' file
-
-Commands:
-Some commands take arguments. Pass no args or -h for usage.
-  shell           Run the HBase shell
-  hbck            Run the hbase 'fsck' tool
-  hlog            Write-ahead-log analyzer
-  hfile           Store file analyzer
-  zkcli           Run the ZooKeeper shell
-  upgrade         Upgrade hbase
-  master          Run an HBase HMaster node
-  regionserver    Run an HBase HRegionServer node
-  zookeeper       Run a Zookeeper server
-  rest            Run an HBase REST server
-  thrift          Run the HBase Thrift server
-  thrift2         Run the HBase Thrift2 server
-  clean           Run the HBase clean up script
-  classpath       Dump hbase CLASSPATH
-  mapredcp        Dump CLASSPATH entries required by mapreduce
-  pe              Run PerformanceEvaluation
-  ltt             Run LoadTestTool
-  version         Print the version
-  CLASSNAME       Run the class named CLASSNAME
-----
-
-Some of the tools and utilities below are Java classes which are passed 
directly to the [path]_bin/hbase_ command, as referred to in the last line of 
the usage instructions.
-Others, such as +hbase shell+ (<<shell,shell>>), +hbase upgrade+ (<<upgrading,upgrading>>), and +hbase thrift+ (<<thrift,thrift>>), are documented elsewhere in this guide.
-
-=== Canary
-
-The Canary tool can help users canary-test the HBase cluster status, at the granularity of every column family of every region, or of every regionserver.
-To see the usage, use the [literal]+--help+ parameter.
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -help
-
-Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 
[table2]...] | [regionserver1 [regionserver2]..]
- where [opts] are:
-   -help          Show this help and exit.
-   -regionserver  replace the table argument to regionserver,
-      which means to enable regionserver mode
-   -daemon        Continuous check at defined intervals.
-   -interval <N>  Interval between checks (sec)
-   -e             Use region/regionserver as regular expression
-      which means the region/regionserver is regular expression pattern
-   -f <B>         stop whole program if first error occurs, default is true
-   -t <N>         timeout for a check, default is 600000 (milliseconds)
-----
-
-This tool returns non-zero error codes so that it can be integrated with other monitoring tools, such as Nagios.
-The error code definitions are:
-
-[source,java]
-----
-private static final int USAGE_EXIT_CODE = 1;
-private static final int INIT_ERROR_EXIT_CODE = 2;
-private static final int TIMEOUT_ERROR_EXIT_CODE = 3;
-private static final int ERROR_EXIT_CODE = 4;
-----
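-
-For example, a minimal Nagios-style wrapper around these exit codes (a sketch; the mapping of codes to alert levels is just one possible choice) could look like this:
-
-[source,bourne]
-----
-#!/usr/bin/env bash
-# Hypothetical check script: run the Canary once and translate its exit code
-# into Nagios-style OK/WARNING/CRITICAL states.
-"${HBASE_HOME}/bin/hbase" org.apache.hadoop.hbase.tool.Canary > /dev/null 2>&1
-rc=$?
-case "$rc" in
-  0) echo "OK - canary checks passed"; exit 0 ;;
-  3) echo "WARNING - canary check timed out"; exit 1 ;;   # TIMEOUT_ERROR_EXIT_CODE
-  *) echo "CRITICAL - canary failed, exit code $rc"; exit 2 ;;
-esac
-----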
-
-Consider the following case: there are two tables, test-01 and test-02, each with two column families, cf1 and cf2, deployed across 3 regionservers, as shown in the following table.
-
-[cols="1,1,1", options="header"]
-|===
-| RegionServer
-| test-01
-| test-02
-|rs1| r1|  r2
-|rs2 |r2 |  
-|rs3 |r2  |r1
-|===
-
-Following are some examples based on this case.
-
-==== Canary test for every column family (store) of every region of every table
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary
-            
-13/12/09 03:26:32 INFO tool.Canary: read from region test-01,,1386230156732.0e3c7d77ffb6361ea1b996ac1042ca9a. column family cf1 in 2ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-01,,1386230156732.0e3c7d77ffb6361ea1b996ac1042ca9a. column family cf2 in 
2ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-01,0004883,1386230156732.87b55e03dfeade00f441125159f8ca87. column family 
cf1 in 4ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-01,0004883,1386230156732.87b55e03dfeade00f441125159f8ca87. column family 
cf2 in 1ms
-...
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,,1386559511167.aa2951a86289281beee480f107bb36ee. column family cf1 in 
5ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,,1386559511167.aa2951a86289281beee480f107bb36ee. column family cf2 in 
3ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,0004883,1386559511167.cbda32d5e2e276520712d84eaaa29d84. column family 
cf1 in 31ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,0004883,1386559511167.cbda32d5e2e276520712d84eaaa29d84. column family 
cf2 in 8ms
-----
-
-As you can see, table test-01 has two regions and two column families, so the Canary tool picks 4 small pieces of data from 4 (2 regions * 2 stores) different stores.
-This is the default behavior of the tool.
-
-==== Canary test for every column family (store) of every region of specific table(s)
-
-You can also test one or more specific tables.
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary test-01 test-02
-----
-
-==== Canary test with regionserver granularity
-
-This will pick one small piece of data from each regionserver. You can also pass regionserver names as arguments to canary-test specific regionservers.
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver
-            
-13/12/09 06:05:17 INFO tool.Canary: Read from table:test-01 on region 
server:rs2 in 72ms
-13/12/09 06:05:17 INFO tool.Canary: Read from table:test-02 on region 
server:rs3 in 34ms
-13/12/09 06:05:17 INFO tool.Canary: Read from table:test-01 on region 
server:rs1 in 56ms
-----
-
-==== Canary test with regular expression pattern
-
-This will test both tables test-01 and test-02.
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -e test-0[1-2]
-----
-
-==== Run canary test in daemon mode
-
-This runs the check repeatedly, with the interval defined by the -interval option (default 6 seconds).
-The daemon will stop itself and return a non-zero error code if any error occurs, because the default value of the -f option is true.
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -daemon
-----
-
-This runs the check repeatedly with the specified interval and does not stop itself even if errors occur during the test.
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -daemon -interval 50000 -f false
-----
-
-==== Force timeout if canary test stuck
-
-In some cases a request can get stuck on a regionserver and never return a response to the client.
-The regionserver in question may also not be marked as dead by the Master, which leaves clients hanging.
-The timeout option kills the canary test forcefully in that case and returns a non-zero error code as well.
-This run sets the timeout value to 60 seconds; the default value is 600000 milliseconds (600 seconds).
-
-----
-$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -t 60000
-----
-
-==== Running Canary in a Kerberos-enabled Cluster
-
-To run Canary in a Kerberos-enabled cluster, configure the following two 
properties in [path]_hbase-site.xml_:
-
-* [code]+hbase.client.keytab.file+
-* [code]+hbase.client.kerberos.principal+
-
-Kerberos credentials are refreshed every 30 seconds when Canary runs in daemon 
mode.
-
-To configure the DNS interface for the client, configure the following 
optional properties in [path]_hbase-site.xml_.
-
-* [code]+hbase.client.dns.interface+
-* [code]+hbase.client.dns.nameserver+
-
-.Canary in a Kerberos-Enabled Cluster
-====
-This example shows each of the properties with valid values.
-
-[source,xml]
-----
-
-<property>
-  <name>hbase.client.kerberos.principal</name>
-  <value>hbase/_HOST@YOUR-REALM.COM</value>
-</property>
-<property>
-  <name>hbase.client.keytab.file</name>
-  <value>/etc/hbase/conf/keytab.krb5</value>
-</property>
-<!-- optional params -->
-<property>
-  <name>hbase.client.dns.interface</name>
-  <value>default</value>
-</property>
-<property>
-  <name>hbase.client.dns.nameserver</name>
-  <value>default</value>
-</property>
-----
-====
-
-[[health.check]]
-=== Health Checker
-
-You can configure HBase to run a script periodically and, if it fails N times (configurable), have the server exit.
-See link:https://issues.apache.org/jira/browse/HBASE-7351[HBASE-7351 Periodic health check script] for configurations and detail.
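-
-A sketch of the corresponding [path]_hbase-site.xml_ entries follows; the property names and values are assumptions based on HBASE-7351 and should be verified against your HBase version, and the script location is only an example:
-
-[source,xml]
-----
-<!-- Illustrative health check configuration; confirm the property names against HBASE-7351. -->
-<property>
-  <name>hbase.node.health.script.location</name>
-  <value>/etc/hbase/conf/healthcheck.sh</value>
-</property>
-<property>
-  <name>hbase.node.health.script.frequency</name>
-  <value>10000</value> <!-- how often to run the script, in milliseconds -->
-</property>
-<property>
-  <name>hbase.node.health.script.timeout</name>
-  <value>60000</value>
-</property>
-<property>
-  <name>hbase.node.health.failure.threshold</name>
-  <value>3</value> <!-- exit the server after this many consecutive failures -->
-</property>
-----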
-
-=== Driver
-
-Several frequently-accessed utilities are provided as [code]+Driver+ classes, 
and executed by the [path]_bin/hbase_ command.
-These utilities represent MapReduce jobs which run on your cluster.
-They are run in the following way, replacing [replaceable]_UtilityName_ with 
the utility you want to run.
-This command assumes you have set the environment variable 
[literal]+HBASE_HOME+ to the directory where HBase is unpacked on your server.
-
-----
-
-${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.mapreduce.UtilityName
-----
-
-The following utilities are available:
-
-+LoadIncrementalHFiles+::
-  Complete a bulk data load.
-
-+CopyTable+::
-  Export a table from the local cluster to a peer cluster.
-
-+Export+::
-  Write table data to HDFS.
-
-+Import+::
-  Import data written by a previous +Export+ operation.
-
-+ImportTsv+::
-  Import data in TSV format.
-
-+RowCounter+::
-  Count rows in an HBase table.
-
-+replication.VerifyReplication+::
-  Compare the data from tables in two different clusters.
-  WARNING: It doesn't work for incrementColumnValues'd cells since the 
timestamp is changed.
-  Note that this command is in a different package than the others.
-
-Each command except +RowCounter+ accepts a single `--help` argument to print 
usage instructions.
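-
-For example, +replication.VerifyReplication+ can be invoked as follows; the peer id [literal]+1+ and the table name are placeholders:
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication 1 TestTable
-----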
-
-[[hbck]]
-=== HBase +hbck+
-
-To run +hbck+ against your HBase cluster run `$ ./bin/hbase hbck`. At the end of the command's output it prints `OK` or `INCONSISTENCY`.
-If your cluster reports inconsistencies, pass `-details` to see more detail emitted.
-If inconsistencies are reported, run `hbck` a few times because the inconsistency may be transient (e.g. the cluster is starting up or a region is splitting). Passing `-fix` may correct the inconsistency (this latter is an experimental feature).
-
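-A typical escalation of commands looks like the following:
-
-----
-$ ./bin/hbase hbck            # summary check; prints OK or INCONSISTENCY at the end
-$ ./bin/hbase hbck -details   # list each inconsistency found
-$ ./bin/hbase hbck -fix       # experimental: attempt to repair the inconsistencies
-----
-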
-For more information, see <<hbck.in.depth,hbck.in.depth>>. 
-
-[[hfile_tool2]]
-=== HFile Tool
-
-See <<hfile_tool,hfile tool>>.
-
-=== WAL Tools
-
-[[hlog_tool]]
-==== [class]+FSHLog+ tool
-
-The main method on [class]+FSHLog+ offers manual split and dump facilities.
-Pass it WALs or the product of a split, the content of the [path]_recovered.edits_ directory.
-
-You can get a textual dump of a WAL file content by doing the following:
-
-----
- $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012
-----
-
-The return code will be non-zero if there are issues with the file, so you can test the wholesomeness of a file by redirecting [var]+STDOUT+ to [code]+/dev/null+ and testing the program return code.
-
-Similarly you can force a split of a log file directory by doing:
-
-----
- $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --split 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/
-----
-
-[[hlog_tool.prettyprint]]
-===== WAL Pretty Printer
-
-The WAL Pretty Printer is a tool with configurable options to print the 
contents of a WAL.
-You can invoke it via the hbase cli with the 'wal' command. 
-
-----
- $ ./bin/hbase wal 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012
-----
-
-.WAL Printing in older versions of HBase
-[NOTE]
-====
-Prior to version 2.0, the WAL Pretty Printer was called the 
[class]+HLogPrettyPrinter+, after an internal name for HBase's write ahead log.
-In those versions, you can print the contents of a WAL using the same configuration as above, but with the 'hlog' command.
-
-----
- $ ./bin/hbase hlog 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012
-----
-====
-
-[[compression.tool]]
-=== Compression Tool
-
-See <<compression.test,compression.test>>.
-
-=== CopyTable
-
-CopyTable is a utility that can copy part or all of a table, either to the same cluster or another cluster.
-The target table must first exist.
-The usage is as follows:
-
-----
-
-$ ./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help        
-/bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help
-Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] 
[--new.name=NEW] [--peer.adr=ADR] <tablename>
-
-Options:
- rs.class     hbase.regionserver.class of the peer cluster, 
-              specify if different from current cluster
- rs.impl      hbase.regionserver.impl of the peer cluster,
- startrow     the start row
- stoprow      the stop row
- starttime    beginning of the time range (unixtime in millis)
-              without endtime means from starttime to forever
- endtime      end of the time range.  Ignored if no starttime specified.
- versions     number of cell versions to copy
- new.name     new table's name
- peer.adr     Address of the peer cluster given in the format
-              
hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
- families     comma-separated list of families to copy
-              To copy from cf1 to cf2, give sourceCfName:destCfName.
-              To keep the same name, just give "cfName"
- all.cells    also copy delete markers and deleted cells
-
-Args:
- tablename    Name of the table to copy
-
-Examples:
- To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
- $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable 
--starttime=1265875194289 --endtime=1265878794289 
--peer.adr=server1,server2,server3:2181:/hbase 
--families=myOldCf:myNewCf,cf2,cf3 TestTable
-
-For performance consider the following general options:
-  It is recommended that you set the following to >=100. A higher value uses 
more memory but
-  decreases the round trip time to the server and may increase performance.
-    -Dhbase.client.scanner.caching=100
-  The following should always be set to false, to prevent writing data twice, 
which may produce
-  inaccurate results.
-    -Dmapred.map.tasks.speculative.execution=false
-----
-
-.Scanner Caching
-[NOTE]
-====
-Caching for the input Scan is configured via 
[code]+hbase.client.scanner.caching+          in the job configuration. 
-====
-
-.Versions
-[NOTE]
-====
-By default, the CopyTable utility only copies the latest version of row cells unless [code]+--versions=n+ is explicitly specified in the command.
-====
-
-See Jonathan Hsieh's 
link:http://www.cloudera.com/blog/2012/06/online-hbase-backups-with-copytable-2/[Online
-          HBase Backups with CopyTable] blog post for more on +CopyTable+. 
-
-=== Export
-
-Export is a utility that will dump the contents of a table to HDFS in a sequence file.
-Invoke via:
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> 
[<versions> [<starttime> [<endtime>]]]
-----
-
-Note: caching for the input Scan is configured via 
[code]+hbase.client.scanner.caching+ in the job configuration. 
-
-=== Import
-
-Import is a utility that will load data that has been exported back into HBase.
-Invoke via:
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
-----
-
-To import 0.94 exported files in a 0.96 cluster or onwards, you need to set 
system property "hbase.import.version" when running the import command as below:
-
-----
-$ bin/hbase -Dhbase.import.version=0.94 
org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
-----
-
-=== ImportTsv
-
-ImportTsv is a utility that will load data in TSV format into HBase.
-It has two distinct usages: loading data from TSV format in HDFS into HBase 
via Puts, and preparing StoreFiles to be loaded via the 
[code]+completebulkload+. 
-
-To load data via Puts (i.e., non-bulk loading):
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv 
-Dimporttsv.columns=a,b,c <tablename> <hdfs-inputdir>
-----
-
-To generate StoreFiles for bulk-loading:
-
-[source,bourne]
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv 
-Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir 
<tablename> <hdfs-data-inputdir>
-----
-
-These generated StoreFiles can be loaded into HBase via 
<<completebulkload,completebulkload>>. 
-
-[[importtsv.options]]
-==== ImportTsv Options
-
-Running +ImportTsv+ with no arguments prints brief usage information:
-
-----
-
-Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
-
-Imports the given input directory of TSV data into the specified table.
-
-The column names of the TSV data must be specified using the 
-Dimporttsv.columns
-option. This option takes the form of comma-separated column names, where each
-column name is either a simple column family, or a columnfamily:qualifier. The 
special
-column name HBASE_ROW_KEY is used to designate that this column should be used
-as the row key for each imported record. You must specify exactly one column
-to be the row key, and you must specify a column name for every column that 
exists in the
-input data.
-
-By default importtsv will load data directly into HBase. To instead generate
-HFiles of data to prepare for a bulk data load, pass the option:
-  -Dimporttsv.bulk.output=/path/for/output
-  Note: the target table will be created with default column family 
descriptors if it does not already exist.
-
-Other options that may be specified with -D include:
-  -Dimporttsv.skip.bad.lines=false - fail if encountering an invalid line
-  '-Dimporttsv.separator=|' - eg separate on pipes instead of tabs
-  -Dimporttsv.timestamp=currentTimeAsLong - use the specified timestamp for 
the import
-  -Dimporttsv.mapper.class=my.Mapper - A user-defined Mapper to use instead of 
org.apache.hadoop.hbase.mapreduce.TsvImporterMapper
-----
-
-[[importtsv.example]]
-==== ImportTsv Example
-
-For example, assume that we are loading data into a table called 'datatsv' 
with a ColumnFamily called 'd' with two columns "c1" and "c2". 
-
-Assume that an input file exists as follows: 
-----
-
-row1   c1      c2
-row2   c1      c2
-row3   c1      c2
-row4   c1      c2
-row5   c1      c2
-row6   c1      c2
-row7   c1      c2
-row8   c1      c2
-row9   c1      c2
-row10  c1      c2
-----        
-
-For ImportTsv to use this input file, the command line needs to look like this:
-
-----
-
- HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` 
${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar importtsv 
-Dimporttsv.columns=HBASE_ROW_KEY,d:c1,d:c2 
-Dimporttsv.bulk.output=hdfs://storefileoutput datatsv hdfs://inputfile
-----
-
-\... and in this example the first column is the rowkey, which is why the 
HBASE_ROW_KEY is used.
-The second and third columns in the file will be imported as "d:c1" and 
"d:c2", respectively. 
-
-[[importtsv.warning]]
-==== ImportTsv Warning
-
-If you are preparing a lot of data for bulk loading, make sure the target HBase table is pre-split appropriately.
-
-[[importtsv.also]]
-==== See Also
-
-For more information about bulk-loading HFiles into HBase, see 
<<arch.bulk.load,arch.bulk.load>>
-
-=== CompleteBulkLoad
-
-The [code]+completebulkload+ utility will move generated StoreFiles into an 
HBase table.
-This utility is often used in conjunction with output from 
<<importtsv,importtsv>>. 
-
-There are two ways to invoke this utility, with explicit classname and via the 
driver:
-
-.Explicit Classname
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles 
<hdfs://storefileoutput> <tablename>
-----
-
-.Driver
-----
-HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop 
jar ${HBASE_HOME}/hbase-server-VERSION.jar completebulkload 
<hdfs://storefileoutput> <tablename>
-----
-
-[[completebulkload.warning]]
-==== CompleteBulkLoad Warning
-
-Data generated via MapReduce is often created with file permissions that are 
not compatible with the running HBase process.
-Assuming you're running HDFS with permissions enabled, those permissions will 
need to be updated before you run CompleteBulkLoad.
-
-For more information about bulk-loading HFiles into HBase, see 
<<arch.bulk.load,arch.bulk.load>>. 
-
-=== WALPlayer
-
-WALPlayer is a utility to replay WAL files into HBase. 
-
-The WAL can be replayed for a set of tables or all tables, and a timerange can 
be provided (in milliseconds). The WAL is filtered to this set of tables.
-The output can optionally be mapped to another set of tables. 
-
-WALPlayer can also generate HFiles for later bulk importing; in that case only a single table can be specified, and no mapping is allowed.
-
-Invoke via:
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.WALPlayer [options] <wal inputdir> <tables> [<tableMappings>]
-----
-
-For example:
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.WALPlayer /backuplogdir 
oldTable1,oldTable2 newTable1,newTable2
-----
-
-WALPlayer, by default, runs as a mapreduce job.
-To NOT run WALPlayer as a mapreduce job on your cluster, force it to run all in the local process by adding the flag [code]+-Dmapreduce.jobtracker.address=local+ on the command line.
-
-[[rowcounter]]
-=== RowCounter and CellCounter
-
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html[RowCounter]
        is a mapreduce job to count all the rows of a table.
-This is a good utility to use as a sanity check to ensure that HBase can read 
all the blocks of a table if there are any concerns of metadata inconsistency.
-It will run the mapreduce job all in a single process, but it will run faster if you have a MapReduce cluster in place for it to exploit.
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename> 
[<column1> <column2>...]
-----
-
-Note: caching for the input Scan is configured via 
[code]+hbase.client.scanner.caching+ in the job configuration. 
-
-HBase ships another diagnostic mapreduce job called 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/CellCounter.html[CellCounter].
-Like RowCounter, it gathers statistics about your table.
-The statistics gathered by CellCounter are more fine-grained and include:
-
-* Total number of rows in the table.
-* Total number of CFs across all rows.
-* Total qualifiers across all rows.
-* Total occurrence of each CF.
-* Total occurrence of each qualifier.
-* Total number of versions of each qualifier.
-
-The program allows you to limit the scope of the run.
-Provide a row regex or prefix to limit the rows to analyze.
-Use [code]+hbase.mapreduce.scan.column.family+ to specify scanning a single 
column family.
-
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.CellCounter <tablename> 
<outputDir> [regex or prefix]
-----
-
-Note: just like RowCounter, caching for the input Scan is configured via 
[code]+hbase.client.scanner.caching+ in the job configuration. 
-
-=== mlockall
-
-It is possible to optionally pin your servers in physical memory making them 
less likely to be swapped out in oversubscribed environments by having the 
servers call link:http://linux.die.net/man/2/mlockall[mlockall] on startup.
-See link:https://issues.apache.org/jira/browse/HBASE-4391[HBASE-4391 Add 
ability to
-          start RS as root and call mlockall] for how to build the optional 
library and have it run on startup. 
-
-[[compaction.tool]]
-=== Offline Compaction Tool
-
-See the usage for the 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/CompactionTool.html[Compaction
-          Tool].
-Run it like this: +./bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool+
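-
-For example, to major-compact a single column family's store files directly against HDFS (the path is a placeholder and the flags are assumptions; check the tool's usage output for the options supported by your version):
-
-----
-$ ./bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool \
-  -compactOnce -major \
-  hdfs://namenode:8020/hbase/data/default/TestTable/REGION_ENCODED_NAME/cf1
-----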
-
-=== +hbase clean+
-
-The +hbase clean+ command cleans HBase data from ZooKeeper, HDFS, or both.
-It is appropriate to use for testing.
-Run it with no options for usage instructions.
-The +hbase clean+ command was introduced in HBase 0.98.
-
-----
-
-$ bin/hbase clean
-Usage: hbase clean (--cleanZk|--cleanHdfs|--cleanAll)
-Options:
-        --cleanZk   cleans hbase related data from zookeeper.
-        --cleanHdfs cleans hbase related data from hdfs.
-        --cleanAll  cleans hbase related data from both zookeeper and hdfs.
-----
-
-=== +hbase pe+
-
-The +hbase pe+ command is a shortcut provided to run the 
[code]+org.apache.hadoop.hbase.PerformanceEvaluation+ tool, which is used for 
testing.
-The +hbase pe+ command was introduced in HBase 0.98.4.
-
-The PerformanceEvaluation tool accepts many different options and commands.
-For usage instructions, run the command with no options.
-
-To run PerformanceEvaluation prior to HBase 0.98.4, issue the command +hbase 
org.apache.hadoop.hbase.PerformanceEvaluation+.
-
-The PerformanceEvaluation tool has received many updates in recent HBase 
releases, including support for namespaces, support for tags, cell-level ACLs 
and visibility labels, multiget support for RPC calls, increased sampling 
sizes, an option to randomly sleep during testing, and ability to "warm up" the 
cluster before testing starts.
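-
-For example, a simple single-process write test (the row count and client count are arbitrary; run +hbase pe+ with no options for the full list supported by your version) might be:
-
-----
-$ bin/hbase pe --nomapred --rows=100000 randomWrite 4
-----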
-
-=== +hbase ltt+
-
-The +hbase ltt+ command is a shortcut provided to run the 
[code]+org.apache.hadoop.hbase.util.LoadTestTool+ utility, which is used for 
testing.
-The +hbase ltt+ command was introduced in HBase 0.98.4.
-
-You must specify either +-write+ or +-update-read+ as the first option.
-For general usage instructions, pass the +-h+ option.
-
-To run LoadTestTool prior to HBase 0.98.4, issue the command +hbase
-          org.apache.hadoop.hbase.util.LoadTestTool+.
-
-The LoadTestTool has received many updates in recent HBase releases, including support for namespaces, support for tags, cell-level ACLs and visibility labels, testing security-related features, ability to specify the number of regions per server, tests for multi-get RPC calls, and tests relating to replication.
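-
-For example, a basic write load (the column count, data size, thread count, table name, and key count below are all arbitrary) might be started with:
-
-----
-$ bin/hbase ltt -tn TestTable -write 3:1024:10 -num_keys 100000
-----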
-
-[[ops.regionmgt]]
-== Region Management
-
-[[ops.regionmgt.majorcompact]]
-=== Major Compaction
-
-Major compactions can be requested via the HBase shell or 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29[HBaseAdmin.majorCompact].
 
-
-Note: major compactions do NOT do region merges.
-See <<compaction,compaction>> for more information about compactions. 
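-
-For example, from the HBase shell (the table and region names are placeholders):
-
-[source]
-----
-hbase> major_compact 'TestTable'              # major compact a whole table
-hbase> major_compact 'REGIONNAME'             # or a single region
-----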
-
-[[ops.regionmgt.merge]]
-=== Merge
-
-Merge is a utility that can merge adjoining regions in the same table (see 
org.apache.hadoop.hbase.util.Merge).
-
-[source,bourne]
-----
-$ bin/hbase org.apache.hadoop.hbase.util.Merge <tablename> <region1> <region2>
-----
-
-If you feel you have too many regions and want to consolidate them, Merge is 
the utility you need.
-Merge must be run when the cluster is down.
-See the 
link:http://ofps.oreilly.com/titles/9781449396107/performance.html[O'Reilly 
HBase
-          Book] for an example of usage. 
-
-You will need to pass 3 parameters to this application.
-The first one is the table name.
-The second one is the fully qualified name of the first region to merge, like 
"table_name,\x0A,1342956111995.7cef47f192318ba7ccc75b1bbf27a82b.". The third 
one is the fully qualified name for the second region to merge. 
-
-Additionally, there is a Ruby script attached to 
link:https://issues.apache.org/jira/browse/HBASE-1621[HBASE-1621] for region 
merging. 
-
-[[node.management]]
-== Node Management
-
-[[decommission]]
-=== Node Decommission
-
-You can stop an individual RegionServer by running the following script in the 
HBase directory on the particular node:
-
-----
-$ ./bin/hbase-daemon.sh stop regionserver
-----
-
-The RegionServer will first close all regions and then shut itself down.
-On shutdown, the RegionServer's ephemeral node in ZooKeeper will expire.
-The master will notice the RegionServer gone and will treat it as a 'crashed' server; it will reassign the regions the RegionServer was carrying.
-
-.Disable the Load Balancer before Decommissioning a node
-[NOTE]
-====
-If the load balancer runs while a node is shutting down, then there could be 
contention between the Load Balancer and the Master's recovery of the just 
decommissioned RegionServer.
-Avoid any problems by disabling the balancer first.
-See <<lb,lb>> below. 
-====
-
-.Kill Node Tool
-[NOTE]
-====
-In hbase-2.0, in the bin directory, we added a script named 
[path]_considerAsDead.sh_ that can be used to kill a regionserver.
-Hardware issues could be detected by specialized monitoring tools before the  
zookeeper timeout has expired. [path]_considerAsDead.sh_ is a simple function 
to mark a RegionServer as dead.
-It deletes all the znodes of the server, starting the recovery process.
-Plug in the script into your monitoring/fault detection tools to initiate 
faster failover.
-Be careful how you use this disruptive tool.
-Copy the script if you need to make use of it in a version of hbase previous 
to hbase-2.0. 
-====
-
-A downside to the above stop of a RegionServer is that regions could be 
offline for a good period of time.
-Regions are closed in order.
-If there are many regions on the server, the first region to close may not be back online until all regions close and the master notices the RegionServer's znode gone.
-In Apache HBase 0.90.2, we added a facility for having a node gradually shed its load and then shut itself down: the [path]_graceful_stop.sh_ script.
-Here is its usage:
-
-----
-$ ./bin/graceful_stop.sh
-Usage: graceful_stop.sh [--config <conf-dir>] [--restart] [--reload] [--thrift] [--rest] <hostname>
- thrift      If we should stop/start thrift before/after the hbase stop/start
- rest        If we should stop/start rest before/after the hbase stop/start
- restart     If we should restart after graceful stop
- reload      Move offloaded regions back on to the stopped server
- debug       Move offloaded regions back on to the stopped server
- hostname    Hostname of server we are to stop
-----
-
-To decommission a loaded RegionServer, run the following: +$
-          ./bin/graceful_stop.sh HOSTNAME+ where [var]+HOSTNAME+ is the host 
carrying the RegionServer you would decommission. 
-
-.On [var]+HOSTNAME+
-[NOTE]
-====
-The [var]+HOSTNAME+ passed to [path]_graceful_stop.sh_ must match the hostname 
that hbase is using to identify RegionServers.
-Check the list of RegionServers in the master UI for how HBase is referring to 
servers.
-It's usually a hostname but can also be an FQDN.
-Whatever HBase is using, this is what you should pass to the [path]_graceful_stop.sh_ decommission script.
-If you pass IPs, the script is not yet smart enough to make a hostname (or FQDN) of them, so it will fail when it checks whether the server is currently running; the graceful unloading of regions will not run.
-====
-
-The [path]_graceful_stop.sh_ script will move the regions off the 
decommissioned RegionServer one at a time to minimize region churn.
-It will verify that the region is deployed in the new location before it moves the next region, and so on, until the decommissioned server is carrying zero regions.
-At this point, the [path]_graceful_stop.sh_ tells the RegionServer +stop+.
-The master will at this point notice the RegionServer gone but all regions 
will have already been redeployed and because the RegionServer went down 
cleanly, there will be no WAL logs to split. 
-
-.Load Balancer
-[NOTE]
-====
-It is assumed that the Region Load Balancer is disabled while the 
+graceful_stop+ script runs (otherwise the balancer and the decommission script 
will end up fighting over region deployments). Use the shell to disable the 
balancer:
-
-[source]
-----
-hbase(main):001:0> balance_switch false
-true
-0 row(s) in 0.3590 seconds
-----
-
-This turns the balancer OFF.
-To reenable, do:
-
-[source]
-----
-hbase(main):001:0> balance_switch true
-false
-0 row(s) in 0.3590 seconds
-----
-
-The +graceful_stop+ will check the balancer and if enabled, will turn it off 
before it goes to work.
-If it exits prematurely because of error, it will not have reset the balancer.
-Hence, it is better to manage the balancer apart from +graceful_stop+, re-enabling it yourself after you are done with graceful_stop.
-====
-
-[[draining.servers]]
-==== Decommissioning several Regions Servers concurrently
-
-If you have a large cluster, you may want to decommission more than one machine at a time by gracefully stopping multiple RegionServers concurrently.
-To gracefully drain multiple regionservers at the same time, RegionServers can 
be put into a "draining" state.
-This is done by marking a RegionServer as a draining node by creating an entry 
in ZooKeeper under the [path]_hbase_root/draining_ znode.
-This znode has format [code]+name,port,startcode+ just like the regionserver 
entries under [path]_hbase_root/rs_ znode. 
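-
-For example, using the HBase ZooKeeper CLI (the server name below is a placeholder; copy the exact [code]+name,port,startcode+ string from the matching entry under [path]_/hbase/rs_):
-
-----
-$ bin/hbase zkcli
-ls /hbase/rs
-create /hbase/draining/rs1.example.com,60020,1446845568098 ""
-----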
-
-Without this facility, decommissioning multiple nodes may be non-optimal because regions that are being drained from one region server may be moved to other regionservers that are also draining.
-Marking RegionServers to be in the draining state prevents this from happening.
-See this 
link:http://inchoate-clatter.blogspot.com/2012/03/hbase-ops-automation.html[blog
-            post] for more details.
-
-[[bad.disk]]
-==== Bad or Failing Disk
-
-It is good to have <<dfs.datanode.failed.volumes.tolerated,dfs.datanode.failed.volumes.tolerated>> set if you have a decent number of disks per machine, for the case where a disk plain dies.
-But usually disks do the "John Wayne" -- i.e.
-take a while to go down spewing errors in [path]_dmesg_ -- or for some reason, 
run much slower than their companions.
-In this case you want to decommission the disk.
-You have two options.
-You can link:http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F[decommission the datanode] or, less disruptively (in that only the bad disk's data will be re-replicated), you can stop the datanode, unmount the bad volume (you can't unmount a volume while the datanode is using it), and then restart the datanode (presuming you have set dfs.datanode.failed.volumes.tolerated > 0). The regionserver will throw some errors in its logs as it recalibrates where to get its data from -- it will likely roll its WAL log too -- but in general, aside from some latency spikes, it should keep on chugging.
-
-.Short Circuit Reads
-[NOTE]
-====
-If you are doing short-circuit reads, you will have to move the regions off the regionserver before you stop the datanode; with short-circuit reads, even though the files are chmod'd so the regionserver cannot access them, because it already has the files open it will be able to keep reading the file blocks from the bad disk even though the datanode is down.
-Move the regions back after you restart the datanode.
-====
-
-[[rolling]]
-=== Rolling Restart
-
-Some cluster configuration changes require either the entire cluster, or the 
RegionServers, to be restarted in order to pick up the changes.
-In addition, rolling restarts are supported for upgrading to a minor or 
maintenance release, and to a major release if at all possible.
-See the release notes for release you want to upgrade to, to find out about 
limitations to the ability to perform a rolling upgrade.
-
-There are multiple ways to restart your cluster nodes, depending on your 
situation.
-These methods are detailed below.
-
-==== Using the +rolling-restart.sh+ Script
-
-HBase ships with a script, [path]_bin/rolling-restart.sh_, that allows you to 
perform rolling restarts on the entire cluster, the master only, or the 
RegionServers only.
-The script is provided as a template for your own script, and is not 
explicitly tested.
-It requires password-less SSH login to be configured and assumes that you have 
deployed using a tarball.
-The script requires you to set some environment variables before running it.
-Examine the script and modify it to suit your needs.
-
-.[path]_rolling-restart.sh_ General Usage
-====
-----
-
-$ ./bin/rolling-restart.sh --help
-Usage: rolling-restart.sh [--config <hbase-confdir>] [--rs-only] 
[--master-only] [--graceful] [--maxthreads xx]
-----
-====
-
-Rolling Restart on RegionServers Only::
-  To perform a rolling restart on the RegionServers only, use the 
[code]+--rs-only+ option.
-  This might be necessary if you need to reboot the individual RegionServer or 
if you make a configuration change that only affects RegionServers and not the 
other HBase processes.
-
-Rolling Restart on Masters Only::
-  To perform a rolling restart on the active and backup Masters, use the 
[code]+--master-only+ option.
-  You might use this if you know that your configuration change only affects 
the Master and not the RegionServers, or if you need to restart the server 
where the active Master is running.
-
-Graceful Restart::
-  If you specify the [code]+--graceful+ option, RegionServers are restarted 
using the [path]_bin/graceful_stop.sh_ script, which moves regions off a 
RegionServer before restarting it.
-  This is safer, but can delay the restart.
-
-Limiting the Number of Threads::
-  To limit the rolling restart to using only a specific number of threads, use 
the [code]+--maxthreads+ option.
-
-[[rolling.restart.manual]]
-==== Manual Rolling Restart
-
-To retain more control over the process, you may wish to manually do a rolling 
restart across your cluster.
-This uses the +graceful_stop.sh+ script described in <<decommission,decommission>>.
-In this method, you can restart each RegionServer individually and then move 
its old regions back into place, retaining locality.
-If you also need to restart the Master, you need to do it separately, and 
restart the Master before restarting the RegionServers using this method.
-The following is an example of such a command.
-You may need to tailor it to your environment.
-This script does a rolling restart of RegionServers only.
-It disables the load balancer before moving the regions.
-
-----
-
-$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
-----
-
-Monitor the output of the [path]_/tmp/log.txt_ file to follow the progress of 
the script. 
-
-==== Logic for Crafting Your Own Rolling Restart Script
-
-Use the following guidelines if you want to create your own rolling restart 
script.
-
-. Extract the new release, verify its configuration, and synchronize it to all 
nodes of your cluster using +rsync+, +scp+, or another secure synchronization 
mechanism.
-. Use the hbck utility to ensure that the cluster is consistent.
-+
-----
-
-$ ./bin/hbck
-----
-+
-Perform repairs if required.
-See <<hbck,hbck>> for details.
-
-. Restart the master first.
-  You may need to modify these commands if your new HBase directory is 
different from the old one, such as for an upgrade.
-+
-----
-
-$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master
-----
-
-. Gracefully restart each RegionServer, using a script such as the following, 
from the Master.
-+
-----
-
-$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart 
--reload --debug $i; done &> /tmp/log.txt &
-----
-+
-If you are running Thrift or REST servers, pass the --thrift or --rest options.
-For other available options, run the +bin/graceful_stop.sh --help+ command.
-+
-It is important to drain HBase regions slowly when restarting multiple 
RegionServers.
-Otherwise, multiple regions go offline simultaneously and must be reassigned 
to other nodes, which may also go offline soon.
-This can negatively affect performance.
-You can inject delays into the script above, for instance, by adding a Shell 
command such as +sleep+.
-To wait for 5 minutes between each RegionServer restart, modify the above 
script to the following:
-+
-----
-
-$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart 
--reload --debug $i & sleep 5m; done &> /tmp/log.txt &
-----
-
-. Restart the Master again, to clear out the dead servers list and re-enable 
the load balancer.
-. Run the +hbck+ utility again, to be sure the cluster is consistent.
-
-[[adding.new.node]]
-=== Adding a New Node
-
-Adding a new regionserver in HBase is essentially free; you simply start it like this: +$ ./bin/hbase-daemon.sh start regionserver+ and it will register itself with the master.
-Ideally you also started a DataNode on the same machine so that the RS can 
eventually start to have local files.
-If you rely on ssh to start your daemons, don't forget to add the new hostname 
in [path]_conf/regionservers_ on the master. 
-
-At this point the region server isn't serving data because no regions have 
moved to it yet.
-If the balancer is enabled, it will start moving regions to the new RS.
-On a small/medium cluster this can have a very adverse effect on latency as a 
lot of regions will be offline at the same time.
-It is thus recommended to disable the balancer the same way it's done when 
decommissioning a node and move the regions manually (or even better, using a 
script that moves them one by one). 
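-
-For example, a region can be moved manually from the HBase shell; the encoded region name and target server below are placeholders taken from the Master UI:
-
-[source]
-----
-hbase> move 'ENCODED_REGIONNAME', 'rs1.example.com,60020,1446845568098'
-----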
-
-The moved regions will all have 0% locality and won't have any blocks in cache 
so the region server will have to use the network to serve requests.
-Apart from resulting in higher latency, it may also use all of your network card's capacity.
-For practical purposes, consider that a standard 1GigE NIC won't be able to read much more than _100MB/s_.
-In this case, or if you are in an OLAP environment and require locality, it is recommended to major compact the moved regions.
-
-== HBase Metrics
-
-HBase emits metrics which adhere to the 
link:http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html[Hadoop
 metrics] API.
-Starting with HBase 0.95footnote:[The Metrics system was redone in
-          HBase 0.96. See Migration
-            to the New Metrics Hotness – Metrics2 by Elliot Clark for 
detail], HBase is configured to emit a default set of metrics with a default 
sampling period of every 10 seconds.
-You can use HBase metrics in conjunction with Ganglia.
-You can also filter which metrics are emitted and extend the metrics framework 
to capture custom metrics appropriate for your environment.
-
-=== Metric Setup
-
-For HBase 0.95 and newer, HBase ships with a default metrics configuration, or 
[firstterm]_sink_.
-This includes a wide variety of individual metrics, and emits them every 10 
seconds by default.
-To configure metrics for a given region server, edit the 
[path]_conf/hadoop-metrics2-hbase.properties_ file.
-Restart the region server for the changes to take effect.
-
-To change the sampling rate for the default sink, edit the line beginning with 
[literal]+*.period+.
-To filter which metrics are emitted or to extend the metrics framework, see link:http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html[the Hadoop Metrics2 package documentation].
-
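-For example, the relevant lines in [path]_conf/hadoop-metrics2-hbase.properties_ might look like the following; the Ganglia sink lines are illustrative and should only be uncommented and adjusted if you use Ganglia:
-
-----
-# emit metrics every 10 seconds by default
-*.period=10
-
-# illustrative Ganglia sink; verify the sink class for your Hadoop version
-# hbase.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
-# hbase.sink.ganglia.servers=ganglia-host:8649
-----
-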
-.HBase Metrics and Ganglia
-[NOTE]
-====
-By default, HBase emits a large number of metrics per region server.
-Ganglia may have difficulty processing all these metrics.
-Consider increasing the capacity of the Ganglia server or reducing the number 
of metrics emitted by HBase.
-See 
link:http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html#filtering[Metrics
 Filtering].
-====
-
-=== Disabling Metrics
-
-To disable metrics for a region server, edit the 
[path]_conf/hadoop-metrics2-hbase.properties_ file and comment out any 
uncommented lines.
-Restart the region server for the changes to take effect.
-
-[[discovering.available.metrics]]
-=== Discovering Available Metrics
-
-Rather than listing each metric which HBase emits by default, you can browse 
through the available metrics, either as a JSON output or via JMX.
-Different metrics are exposed for the Master process and each region server 
process.
-
-.Procedure: Access a JSON Output of Available Metrics
-. After starting HBase, access the region server's web UI, at 
[literal]+http://REGIONSERVER_HOSTNAME:60030+ by default (or port 16030 in 
HBase 1.0+).
-. Click the [label]#Metrics Dump# link near the top.
-  The metrics for the region server are presented as a dump of the JMX bean in 
JSON format.
-  This will dump out all metrics names and their values.
-  To include metrics descriptions in the listing -- this can be useful when 
you are exploring what is available -- add a query string of 
[literal]+?description=true+ so your URL becomes 
[literal]+http://REGIONSERVER_HOSTNAME:60030/jmx?description=true+.
-  Not all beans and attributes have descriptions. 
-. To view metrics for the Master, connect to the Master's web UI instead 
(defaults to [literal]+http://localhost:60010+ or port 16010 in HBase 1.0+) and 
click its [label]#Metrics
-  Dump# link.
-  To include metrics descriptions in the listing -- this can be useful when you are exploring what is available -- add a query string of [literal]+?description=true+ so your URL becomes [literal]+http://localhost:60010/jmx?description=true+.
-  Not all beans and attributes have descriptions.
-
-
-You can use many different tools to view JMX content by browsing MBeans.
-This procedure uses +jvisualvm+, which is an application usually available in 
the JDK. 
-
-.Procedure: Browse the JMX Output of Available Metrics
-. Start HBase, if it is not already running.
-. Run the command +jvisualvm+ command on a host with a GUI display.
-  You can launch it from the command line or another method appropriate for 
your operating system.
-. Be sure the [label]#VisualVM-MBeans# plugin is installed. Browse to *Tools 
-> Plugins*. Click [label]#Installed# and check whether the plugin is listed.
-  If not, click [label]#Available Plugins#, select it, and click btn:[Install].
-  When finished, click btn:[Close].
-. To view details for a given HBase process, double-click the process in the 
[label]#Local# sub-tree in the left-hand panel.
-  A detailed view opens in the right-hand panel.
-  Click the [label]#MBeans# tab which appears as a tab in the top of the 
right-hand panel.
-. To access the HBase metrics, navigate to the appropriate sub-bean:
-.* Master: 
-.* RegionServer: 
-
-. The name of each metric and its current value is displayed in the 
[label]#Attributes# tab.
-  For a view which includes more details, including the description of each 
attribute, click the [label]#Metadata# tab.
-
-=== Units of Measure for Metrics
-
-Different metrics are expressed in different units, as appropriate.
-Often, the unit of measure is in the name (as in the metric 
[code]+shippedKBs+). Otherwise, use the following guidelines.
-When in doubt, you may need to examine the source for a given metric.
-
-* Metrics that refer to a point in time are usually expressed as a timestamp.
-* Metrics that refer to an age (such as [code]+ageOfLastShippedOp+) are 
usually expressed in milliseconds.
-* Metrics that refer to memory sizes are in bytes.
-* Sizes of queues (such as [code]+sizeOfLogQueue+) are expressed as the number 
of items in the queue.
-  Determine the size by multiplying by the block size (default is 64 MB in 
HDFS).
-* Metrics that refer to things like the number of a given type of operations 
(such as [code]+logEditsRead+) are expressed as an integer.
-
-[[master_metrics]]
-=== Most Important Master Metrics
-
-Note: Counts are usually over the last metrics reporting interval.
-
-hbase.master.numRegionServers::
-  Number of live regionservers
-
-hbase.master.numDeadRegionServers::
-  Number of dead regionservers
-
-hbase.master.ritCount ::
-  The number of regions in transition
-
-hbase.master.ritCountOverThreshold::
-  The number of regions that have been in transition longer than a threshold 
time (default: 60 seconds)
-
-hbase.master.ritOldestAge::
-  The age of the longest region in transition, in milliseconds 
-
-[[rs_metrics]]
-=== Most Important RegionServer Metrics
-
-Note: Counts are usually over the last metrics reporting interval.
-
-hbase.regionserver.regionCount::
-  The number of regions hosted by the regionserver
-
-hbase.regionserver.storeFileCount::
-  The number of store files on disk currently managed by the regionserver
-
-hbase.regionserver.storeFileSize::
-  Aggregate size of the store files on disk
-
-hbase.regionserver.hlogFileCount::
-  The number of write ahead logs not yet archived
-
-hbase.regionserver.totalRequestCount::
-  The total number of requests received
-
-hbase.regionserver.readRequestCount::
-  The number of read requests received
-
-hbase.regionserver.writeRequestCount::
-  The number of write requests received
-
-hbase.regionserver.numOpenConnections::
-  The number of open connections at the RPC layer
-
-hbase.regionserver.numActiveHandler::
-  The number of RPC handlers actively servicing requests
-
-hbase.regionserver.numCallsInGeneralQueue::
-  The number of currently enqueued user requests
-
-hbase.regionserver.numCallsInReplicationQueue::
-  The number of currently enqueued operations received from replication
-
-hbase.regionserver.numCallsInPriorityQueue::
-  The number of currently enqueued priority (internal housekeeping) requests
-
-hbase.regionserver.flushQueueLength::
-  Current depth of the memstore flush queue.
-  If increasing, we are falling behind with clearing memstores out to HDFS.
-
-hbase.regionserver.updatesBlockedTime::
-  Number of milliseconds updates have been blocked so the memstore can be 
flushed
-
-hbase.regionserver.compactionQueueLength::
-  Current depth of the compaction request queue.
-  If increasing, we are falling behind with storefile compaction.
-
-hbase.regionserver.blockCacheHitCount::
-  The number of block cache hits
-
-hbase.regionserver.blockCacheMissCount::
-  The number of block cache misses
-
-hbase.regionserver.blockCacheExpressHitPercent ::
-  The percent of the time that requests with the cache turned on hit the cache
-
-hbase.regionserver.percentFilesLocal::
-  Percent of store file data that can be read from the local DataNode, 0-100
-
-hbase.regionserver.<op>_<measure>::
-  Operation latencies, where <op> is one of Append, Delete, Mutate, Get, 
Replay, Increment; and where <measure> is one of min, max, mean, median, 
75th_percentile, 95th_percentile, 99th_percentile
-
-hbase.regionserver.slow<op>Count ::
-  The number of operations we thought were slow, where <op> is one of the list 
above
-
-hbase.regionserver.GcTimeMillis::
-  Time spent in garbage collection, in milliseconds
-
-hbase.regionserver.GcTimeMillisParNew::
-  Time spent in garbage collection of the young generation, in milliseconds
-
-hbase.regionserver.GcTimeMillisConcurrentMarkSweep::
-  Time spent in garbage collection of the old generation, in milliseconds
-
-hbase.regionserver.authenticationSuccesses::
-  Number of client connections where authentication succeeded
-
-hbase.regionserver.authenticationFailures::
-  Number of client connection authentication failures
-
-hbase.regionserver.mutationsWithoutWALCount ::
-  Count of writes submitted with a flag indicating they should bypass the 
write ahead log
-
-[[ops.monitoring]]
-== HBase Monitoring
-
-[[ops.monitoring.overview]]
-=== Overview
-
-The following metrics are arguably the most important to monitor for each 
RegionServer for "macro monitoring", preferably with a system like 
link:http://opentsdb.net/[OpenTSDB].
-If your cluster is having performance issues it's likely that you'll see 
something unusual with this group. 
-
-HBase::
-  * See <<rs_metrics,rs metrics>>
-
-OS::
-  * IO Wait
-  * User CPU
-
-Java::
-  * GC
-
-For more information on HBase metrics, see <<hbase_metrics,hbase metrics>>. 
-
-[[ops.slow.query]]
-=== Slow Query Log
-
-The HBase slow query log consists of parseable JSON structures describing the 
properties of those client operations (Gets, Puts, Deletes, etc.) that either 
took too long to run, or produced too much output.
-The thresholds for "too long to run" and "too much output" are configurable, 
as described below.
-The output is produced inline in the main region server logs so that it is 
easy to discover further details from context with other logged events.
-It is also prepended with identifying tags [constant]+(responseTooSlow)+, 
[constant]+(responseTooLarge)+, [constant]+(operationTooSlow)+, and 
[constant]+(operationTooLarge)+ in order to enable easy filtering with grep, in 
case the user desires to see only slow queries. 
-
-==== Configuration
-
-There are two configuration knobs that can be used to adjust the thresholds 
for when queries are logged. 
-
-* [var]+hbase.ipc.warn.response.time+ Maximum number of milliseconds that a 
query can be run without being logged.
-  Defaults to 10000, or 10 seconds.
-  Can be set to -1 to disable logging by time. 
-* [var]+hbase.ipc.warn.response.size+ Maximum byte size of response that a 
query can return without being logged.
-  Defaults to 100 megabytes.
-  Can be set to -1 to disable logging by size. 
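-
-For example, to log queries that take longer than 5 seconds or return more than 50 MB of data, you could set the following in [path]_hbase-site.xml_ (the values are illustrative):
-
-[source,xml]
-----
-<property>
-  <name>hbase.ipc.warn.response.time</name>
-  <value>5000</value> <!-- log queries slower than 5 seconds -->
-</property>
-<property>
-  <name>hbase.ipc.warn.response.size</name>
-  <value>52428800</value> <!-- log responses larger than 50 MB -->
-</property>
-----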
-
-==== Metrics
-
-The slow query log exposes two metrics to JMX.
-
-* [var]+hadoop.regionserver_rpc_slowResponse+ a global metric reflecting the 
durations of all responses that triggered logging.
-* [var]+hadoop.regionserver_rpc_methodName.aboveOneSec+ A metric reflecting 
the durations of all responses that lasted for more than one second.
-
-==== Output
-
-The output is tagged with operation e.g. [constant]+(operationTooSlow)+ if the 
call was a client operation, such as a Put, Get, or Delete, which we expose 
detailed fingerprint information for.
-If not, it is tagged [constant]+(responseTooSlow)+          and still produces 
parseable JSON output, but with less verbose information solely regarding its 
duration and size in the RPC itself. [constant]+TooLarge+ is substituted for 
[constant]+TooSlow+ if the response size triggered the logging, with 
[constant]+TooLarge+ appearing even in the case that both size and duration 
triggered logging. 
-
-==== Example
-
-
-[source]
-----
-2011-09-08 10:01:25,824 WARN org.apache.hadoop.ipc.HBaseServer: 
(operationTooSlow): 
{"tables":{"riley2":{"puts":[{"totalColumns":11,"families":{"actions":[{"timestamp":1315501284459,"qualifier":"0","vlen":9667580},{"timestamp":1315501284459,"qualifier":"1","vlen":10122412},{"timestamp":1315501284459,"qualifier":"2","vlen":11104617},{"timestamp":1315501284459,"qualifier":"3","vlen":13430635}]},"row":"cfcd208495d565ef66e7dff9f98764da:0"}],"families":["actions"]}},"processingtimems":956,"client":"10.47.34.63:33623","starttimems":1315501284456,"queuetimems":0,"totalPuts":1,"class":"HRegionServer","responsesize":0,"method":"multiPut"}
-----        
-
-Note that everything inside the "tables" structure is output produced by 
MultiPut's fingerprint, while the rest of the information is RPC-specific, such 
as processing time and client IP/port.
-Other client operations follow the same pattern and the same general 
structure, with necessary differences due to the nature of the individual 
operations.
-In the case that the call is not a client operation, that detailed fingerprint 
information will be completely absent. 
-
-In this particular example, the likely cause of slowness is simply a very large multiput: the [literal]+vlen+ (value length) fields of the individual puts sum to roughly 44 MB. 
-
-=== Block Cache Monitoring
-
-Starting with HBase 0.98, the HBase Web UI includes the ability to monitor and 
report on the performance of the block cache.
-To view the block cache reports, see the Block Cache section of the RegionServer's Web UI.
-Following are a few examples of the reporting capabilities.
-
-.Basic Info
-image::bc_basic.png[]
-
-.Config
-image::bc_config.png[]
-
-.Stats
-image::bc_stats.png[]
-
-.L1 and L2
-image::bc_l1.png[]
-
-This is not an exhaustive list of all the screens and reports available.
-Have a look in the Web UI.
-
-== Cluster Replication
-
-NOTE: This information was previously available at 
link:http://hbase.apache.org/replication.html[Cluster Replication]. 
-
-HBase provides a cluster replication mechanism which allows you to keep one 
cluster's state synchronized with that of another cluster, using the 
write-ahead log (WAL) of the source cluster to propagate the changes.
-Some use cases for cluster replication include:
-
-* Backup and disaster recovery
-* Data aggregation
-* Geographic data distribution
-* Online data ingestion combined with offline data analytics
-
-NOTE: Replication is enabled at the granularity of the column family.
-Before enabling replication for a column family, create the table and all 
column families to be replicated, on the destination cluster.
-
-Cluster replication uses a source-push methodology.
-An HBase cluster can be a source (also called master or active, meaning that 
it is the originator of new data), a destination (also called slave or passive, 
meaning that it receives data via replication), or can fulfill both roles at 
once.
-Replication is asynchronous, and the goal of replication is eventual 
consistency.
-When the source receives an edit to a column family with replication enabled, that edit is propagated to all destination clusters using the WAL for that column family on the RegionServer managing the relevant region.
-
-When data is replicated from one cluster to another, the original source of 
the data is tracked via a cluster ID which is part of the metadata.
-In HBase 0.96 and newer 
(link:https://issues.apache.org/jira/browse/HBASE-7709[HBASE-7709]), all 
clusters which have already consumed the data are also tracked.
-This prevents replication loops.
-
-The WALs for each region server must be kept in HDFS as long as they are 
needed to replicate data to any slave cluster.
-Each region server reads from the oldest log it needs to replicate and keeps 
track of its progress processing WALs inside ZooKeeper to simplify failure 
recovery.
-The position marker which indicates a slave cluster's progress, as well as the 
queue of WALs to process, may be different for every slave cluster.
-
-The clusters participating in replication can be of different sizes.
-The master cluster relies on randomization to attempt to balance the stream of 
replication on the slave clusters.
-It is expected that the slave cluster has storage capacity to hold the 
replicated data, as well as any data it is responsible for ingesting.
-If a slave cluster does run out of room, or is inaccessible for other reasons, 
it throws an error and the master retains the WAL and retries the replication 
at intervals.
-
-.Terminology Changes
-[NOTE]
-====
-Previously, terms such as [firstterm]_master-master_, 
[firstterm]_master-slave_, and [firstterm]_cyclical_ were used to describe 
replication relationships in HBase.
-These terms added confusion, and have been abandoned in favor of discussions 
about cluster topologies appropriate for different scenarios.
-====
-
-.Cluster Topologies
-* A central source cluster might propagate changes out to multiple destination 
clusters, for failover or due to geographic distribution.
-* A source cluster might push changes to a destination cluster, which might 
also push its own changes back to the original cluster.
-* Many different low-latency clusters might push changes to one centralized 
cluster for backup or resource-intensive data analytics jobs.
-  The processed data might then be replicated back to the low-latency clusters.
-
-Multiple levels of replication may be chained together to suit your 
organization's needs.
-The following diagram shows a hypothetical scenario.
-Use the arrows to follow the data paths.
-
-.Example of a Complex Cluster Replication Configuration
-image::hbase_replication_diagram.jpg[]
-
-HBase replication borrows many concepts from the [firstterm]_statement-based 
replication_ design used by MySQL.
-Instead of SQL statements, entire WALEdits (consisting of multiple cell 
inserts coming from Put and Delete operations on the clients) are replicated in 
order to maintain atomicity. 
-
-=== Configuring Cluster Replication
-
-The following is a simplified procedure for configuring cluster replication.
-It may not cover every edge case.
-For more information, see the 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements[
 API documentation for replication].
-
-. Configure and start the source and destination clusters.
-  Create tables with the same names and column families on both the source and 
destination clusters, so that the destination cluster knows where to store data 
it will receive.
-  All hosts in the source and destination clusters should be reachable from each other.
-. On the source cluster, enable replication by setting 
[code]+hbase.replication+            to [literal]+true+ in 
[path]_hbase-site.xml_.
-. On the source cluster, in HBase Shell, add the destination cluster as a 
peer, using the [code]+add_peer+ command.
-  The syntax is as follows:
-+
-----
-hbase> add_peer 'ID' 'CLUSTER_KEY'
-----
-+
-The ID is a string (prior to 
link:https://issues.apache.org/jira/browse/HBASE-11367[HBASE-11367], it was a 
short integer), which _must not contain a hyphen_ (see 
link:https://issues.apache.org/jira/browse/HBASE-11394[HBASE-11394]). To 
compose the CLUSTER_KEY, use the following template:
-+
-----
-hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
-----
-+
-If both clusters use the same ZooKeeper cluster, you must use a different 
[code]+zookeeper.znode.parent+, because they cannot write in the same folder.
-
-. On the source cluster, configure each column family to be replicated by 
setting its REPLICATION_SCOPE to 1, using commands such as the following in 
HBase Shell.
-+
-----
-hbase> disable 'example_table' 
-hbase> alter 'example_table', {NAME => 'example_family', REPLICATION_SCOPE => 
'1'} 
-hbase> enable 'example_table'
-----
-
-. You can verify that replication is taking place by examining the logs on the 
source cluster for messages such as the following.
-+
-----
-Considering 1 rs, with ratio 0.1
-Getting 1 rs from peer cluster # 0 
-Choosing peer 10.10.1.49:62020
-----
-
-. To verify the validity of replicated data, you can use the included 
[code]+VerifyReplication+ MapReduce job on the source cluster, providing it 
with the ID of the replication peer and table name to verify.
-  Other options are possible, such as a time range or specific families to 
verify.
-+
-The command has the following form:
-+
-----
-hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication [--starttime=timestamp1] [--stoptime=timestamp2] [--families=comma-separated list of families] <peerId> <tablename>
-----
-+
-The [code]+VerifyReplication+ command prints out [literal]+GOODROWS+           
 and [literal]+BADROWS+ counters to indicate rows that did and did not 
replicate correctly. 
-
-
-=== Detailed Information About Cluster Replication
-
-.Replication Architecture Overview
-image::replication_overview.png[]
-
-==== Life of a WAL Edit
-
-A single WAL edit goes through several steps in order to be replicated to a 
slave cluster.
-
-. An HBase client uses a Put or Delete operation to manipulate data in HBase.
-. The region server writes the request to the WAL in a way that allows it to be replayed if it is not written successfully.
-. If the changed cell corresponds to a column family that is scoped for 
replication, the edit is added to the queue for replication.
-. In a separate thread, the edit is read from the log, as part of a batch 
process.
-  Only the KeyValues that are eligible for replication are kept.
-  Replicable KeyValues are part of a column family whose schema is scoped 
GLOBAL, are not part of a catalog such as [code]+hbase:meta+, did not originate 
from the target slave cluster, and have not already been consumed by the target 
slave cluster.
-. The edit is tagged with the master's UUID and added to a buffer.
-  When the buffer is filled, or the reader reaches the end of the file, the 
buffer is sent to a random region server on the slave cluster.
-. The region server reads the edits sequentially and separates them into 
buffers, one buffer per table.
-  After all edits are read, each buffer is flushed using 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html[HTable],
 HBase's normal client.
-  The master's UUID and the UUIDs of slaves which have already consumed the data are preserved in the edits when they are applied, in order to prevent replication loops.
-. In the master, the offset for the WAL that is currently being replicated is 
registered in ZooKeeper.
-
-In case the slave cluster does not respond, or a slave region server is unavailable, the process differs as follows:
-
-. The first three steps, where the edit is inserted, are identical.
-. Again in a separate thread, the region server reads, filters, and ships the log edits in the same way as above.
-  The slave region server, however, does not answer the RPC call.
-. The master sleeps and tries again a configurable number of times.
-. If the slave region server is still not available, the master selects a new subset of region servers to replicate to, and tries again to send the buffer of edits.
-. Meanwhile, the WALs are rolled and stored in a queue in ZooKeeper.
-  Logs that are [firstterm]_archived_ by their region server (moved from the region server's log directory to a central log directory) have their paths updated in the in-memory queue of the replicating thread.
-. When the slave cluster is finally available, the buffer is applied in the 
same way as during normal processing.
-  The master region server will then replicate the backlog of logs that 
accumulated during the outage.
-
-.Spreading Queue Failover Load
-When replication is active, a subset of region servers in the source cluster 
is responsible for shipping edits to the sink.
-This responsibility must be failed over like all other region server functions 
should a process or node crash.
-The following configuration settings are recommended for maintaining an even distribution of replication activity over the remaining live servers in the source cluster; a sample [path]_hbase-site.xml_ snippet follows the list:
-
-* Set [code]+replication.source.maxretriesmultiplier+ to [literal]+300+.
-* Set [code]+replication.source.sleepforretries+ to [literal]+1+ (1 second). 
This value, combined with the value of 
[code]+replication.source.maxretriesmultiplier+, causes the retry cycle to last 
about 5 minutes.
-* Set [code]+replication.sleep.before.failover+ to [literal]+30000+ (30 
seconds) in the source cluster site configuration.
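-
-As a sketch, the three settings above could be expressed in the source cluster's [path]_hbase-site.xml_ as follows:
-
-[source,xml]
-----
-<property>
-  <name>replication.source.maxretriesmultiplier</name>
-  <value>300</value>
-</property>
-<property>
-  <name>replication.source.sleepforretries</name>
-  <value>1</value>
-</property>
-<property>
-  <name>replication.sleep.before.failover</name>
-  <value>30000</value>
-</property>
-----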
-
-.Preserving Tags During Replication
-By default, the codec used for replication between clusters strips tags, such 
as cell-level ACLs, from cells.
-To prevent the tags from being stripped, you can use a different codec which 
does not strip them.
-Configure [code]+hbase.replication.rpc.codec+ to use 
[literal]+org.apache.hadoop.hbase.codec.KeyValueCodecWithTags+, on both the 
source and sink RegionServers involved in the replication.
-This option was introduced in 
link:https://issues.apache.org/jira/browse/HBASE-10322[HBASE-10322].
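-
-A minimal [path]_hbase-site.xml_ snippet for this setting, to be applied on both the source and sink RegionServers, might look like the following:
-
-[source,xml]
-----
-<property>
-  <name>hbase.replication.rpc.codec</name>
-  <value>org.apache.hadoop.hbase.codec.KeyValueCodecWithTags</value>
-</property>
-----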
-
-==== Replication Internals
-
-Replication State in ZooKeeper::
-  HBase replication maintains its state in ZooKeeper.
-  By default, the state is contained in the base node 
[path]_/hbase/replication_.
-  This node contains two child nodes, the [code]+Peers+ znode and the [code]+RS+ znode.
-
-The [code]+Peers+ Znode::
-  The [code]+peers+ znode is stored in [path]_/hbase/replication/peers_ by 
default.
-  It consists of a list of all peer replication clusters, along with the 
status of each of them.
-  The value of each peer is its cluster key, which is provided in the HBase 
Shell.
-  The cluster key contains a list of ZooKeeper nodes in the cluster's quorum, the client port for the ZooKeeper quorum, and the base znode for HBase on that cluster.
-
-The [code]+RS+ Znode::
-  The [code]+rs+ znode contains a list of WAL logs which need to be replicated.
-  This list is divided into a set of queues organized by region server and the 
peer cluster the region server is shipping the logs to.
-  The rs znode has one child znode for each region server in the cluster.
-  The child znode name is the region server's hostname, client port, and start 
code.
-  This list includes both live and dead region servers.
-
-==== Choosing Region Servers to Replicate To
-
-When a master cluster region server initiates a replication source to a slave cluster, it first connects to the slave's ZooKeeper ensemble using the provided cluster key. It then scans the [path]_rs/_ directory to discover all the available sinks (region servers that are accepting incoming streams of edits to replicate) and randomly chooses a subset of them using a configured ratio which has a default value of 10%. For example, if a slave cluster has 150 machines, 15 will be chosen as potential recipients for edits that this master cluster region server sends.
-Because this selection is performed by each master region server, the 
probability that all slave region servers are used is very high, and this 
method works for clusters of any size.
-For example, a master cluster of 10 machines replicating to a slave cluster of 
5 machines with a ratio of 10% causes the master cluster region servers to 
choose one machine each at random.
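-
-The ratio itself is configurable. Assuming the property name is [code]+replication.source.ratio+ (default [literal]+0.1+), a source cluster that should spread its shipping load over more sinks might raise it in [path]_hbase-site.xml_, for example:
-
-[source,xml]
-----
-<property>
-  <name>replication.source.ratio</name>
-  <value>0.25</value>
-</property>
-----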
-
-A ZooKeeper watcher is placed on the [path]_${zookeeper.znode.parent}/rs_ node 
of the slave cluster by each of the master cluster's region servers.
-This watch is used to monitor changes in the composition of the slave cluster.
-When nodes are removed from the slave cluster, or if nodes go down or come 
back up, the master cluster's region servers will respond by selecting a new 
pool of slave region servers to replicate to.
-
-==== Keeping Track of Logs
-
-Each master cluster region server has its own znode in the replication znodes 
hierarchy.
-It contains one znode per peer cluster (if there are 5 slave clusters, 5 znodes are created), and each of these contains a queue of WALs to process.
-Each of these queues will track the WALs created by that region server, but 
they can differ in size.
-For example, if one slave cluster becomes unavailable for some time, the WALs 
should not be deleted, so they need to stay in the queue while the others are 
processed.
-See <<rs.failover.details,rs.failover.details>> for an example.
-
-When a source is instantiated, it contains the current WAL that the region 
server is writing to.
-During log rolling, the new file is added to the queue of each slave cluster's 
znode just before it is made available.
-This ensures that all the sources are aware that a new log exists before the region server is able to append edits into it, but this operation is now more expensive.
-The queue items are discarded when the replication thread cannot read more 
entries from a file (because it reached the end of the last block) and there 
are other files in the queue.
-This means that if a source is up to date and replicates from the log that the 
region server writes to, reading up to the "end" of the current file will not 
delete the item in the queue.
-
-A log can be archived if it is no longer used or if the number of logs exceeds 
[code]+hbase.regionserver.maxlogs+ because the insertion rate is faster than 
regions are flushed.
-When a log is archived, the source threads are notified that the path for that 
log changed.
-If a particular source has already finished with an archived log, it will just 
ignore the message.
-If the log is in the queue, the path will be updated in memory.
-If the log is currently being replicated, the change will be done atomically so that the reader doesn't attempt to open the file after it has already been moved.
-Because moving a file is a NameNode operation, if the reader is currently reading the log, it won't generate any exception.
-
-==== Reading, Filtering and Sending Edits
-
-By default, a source attempts to read from a WAL and ship log entries to a 
sink as quickly as possible.
-Speed is limited by the filtering of log entries: only KeyValues that are scoped GLOBAL and that do not belong to catalog tables will be retained.
-Speed is also limited by total size of the list of edits to replicate per 
slave, which is limited to 64 MB by default.
-With this configuration, a master cluster region server with three slaves 
would use at most 192 MB to store data to replicate.
-This does not account for the data which was filtered but not garbage 
collected.
-
-Once the maximum size of edits has been buffered or the reader reaches the end of the WAL, the source thread stops reading and chooses at random a sink to replicate to (from the list that was generated by keeping only a subset of slave region servers). It directly issues an RPC to the chosen region server and waits for the method to return.
-If the RPC was successful, the source determines whether the current file has 
been emptied or it contains more data which needs to be read.
-If the file has been emptied, the source deletes the znode in the queue.
-Otherwise, it registers the new offset in the log's znode.
-If the RPC threw an exception, the source will retry 10 times before trying to 
find a different sink.
-
-==== Cleaning Logs
-
-If replication is not enabled, the master's log-cleaning thread deletes old 
logs using a configured TTL.
-This TTL-based method does not work well with replication, because archived 
logs which have exceeded their TTL may still be in a queue.
-The default behavior is augmented so that if a log is past its TTL, the 
cleaning thread looks up every queue until it finds the log, while caching 
queues it has found.
-If the log is not found in any queues, the log will be deleted.
-The next time the cleaning process needs to look for a log, it starts by using 
its cached list.
-
-[[rs.failover.details]]
-==== Region Server Failover
-
-When no region servers are failing, keeping track of the logs in ZooKeeper 
adds no value.
-Unfortunately, region servers do fail, and since ZooKeeper is highly 
available, it is useful for managing the transfer of the queues in the event of 
a failure.
-
-Each of the master cluster region servers keeps a watcher on every other 
region server, in order to be notified when one dies (just as the master does). 
When a failure happens, they all race to create a znode called [literal]+lock+ 
inside the dead region server's znode that contains its queues.
-The region server that creates it successfully then transfers all the queues 
to its own znode, one at a time since ZooKeeper does not support renaming 
queues.
-After queues are all transferred, they are deleted from the old location.
-The znodes that were recovered are renamed with the ID of the slave cluster 
appended with the name of the dead server.
-
-Next, the master cluster region server creates one new source thread per 
copied queue, and each of the source threads follows the read/filter/ship 
pattern.
-The main difference is that those queues will never receive new data, since 
they do not belong to their new region server.
-When the reader hits the end of the last log, the queue's znode is deleted and 
the master cluster region server closes that replication source.
-
-Given a master cluster with 3 region servers replicating to a single slave 
with id [literal]+2+, the following hierarchy represents what the znodes layout 
could be at some point in time.
-The region servers' znodes each contain a znode named for the peer ([literal]+2+ in this example), which contains a single queue.
-The znode names in the queues represent the actual file names on HDFS in the 
form [literal]+address,port.timestamp+.
-
-----
-
-/hbase/replication/rs/
-  1.1.1.1,60020,123456780/
-    2/
-      1.1.1.1,60020.1234  (Contains a position)
-      1.1.1.1,60020.1265
-  1.1.1.2,60020,123456790/
-    2/
-      1.1.1.2,60020.1214  (Contains a position)
-      1.1.1.2,60020.1248
-      1.1.1.2,60020.1312
-  1.1.1.3,60020,123456630/
-    2/
-      1.1.1.3,60020.1280  (Contains a position)
-----
-
-Assume that 1.1.1.2 loses its ZooKeeper session.
-The survivors will race to create a lock, and, arbitrarily, 1.1.1.3 wins.
-It will then start transferring all the queues to its local peers znode by 
appending the name of the dead server.
-Right before 1.1.1.3 is able to clean up the old znodes, the layout will look 
like the following:
-
-----
-
-/hbase/replication/rs/
-  1.1.1.1,60020,123456780/
-    2/
-      1.1.1.1,60020.1234  (Contains a position)
-      1.1.1.1,60020.1265
-  1.1.1.2,60020,123456790/
-    lock
-    2/
-      1.1.1.2,60020.1214  (Contains a position)
-      1.1.1.2,60020.1248
-      1.1.1.2,60020.1312
-  1.1.1.3,60020,123456630/
-    2/
-      1.1.1.3,60020.1280  (Contains a position)
-
-    2-1.1.1.2,60020,123456790/
-      1.1.1.2,60020.1214  (Contains a position)
-      1.1.1.2,60020.1248
-      1.1.1.2,60020.1312
-----
-
-Some time later, but before 1.1.1.3 is able to finish replicating the last WAL 
from 1.1.1.2, it dies too.
-Some new logs were also created in the normal queues.
-The last region server will then try to lock 1.1.1.3's znode and will begin 
transferring all the queues.
-The new layout will be:
-
-----
-
-/hbase/replication/rs/
-  1.1.1.1,60020,123456780/
-    2/
-      1.1.1.1,60020.1378  (Contains a position)
-
-    2-1.1.1.3,60020,123456630/
-      1.1.1.3,60020.1325  (Contains a position)
-      1.1.1.3,60020.1401
-
-    2-1.1.1.2,60020,123456790-1.1.1.3,60020,123456630/
-      1.1.1.2,60020.1312  (Contains a position)
-  1.1.1.3,60020,123456630/
-    lock
-    2/
-      1.1.1.3,60020.1325  (Contains a position)
-      1.1.1.3,60020.1401
-
-    2-1.1.1.2,60020,123456790/
-      1.1.1.2,60020.1312  (Contains a position)
-----
-
-=== Replication Metrics
-
-The following metrics are exposed at the global region server level and (since 
HBase 0.95) at the peer level:
-
-[code]+source.sizeOfLogQueue+::
-  number of WALs to process (excludes the one which is being processed) at the 
Replication source
-
-[code]+source.shippedOps+::
-  number of mutations shipped
-
-[code]+source.logEditsRead+::
-  number of mutations read from WALs at the replication source
-
-[code]+source.ageOfLastShippedOp+::
-  age of last batch that was shipped by the replication source
-
-=== Replication Configuration Options
-
-[cols="1,1,1", options="header"]
-|===
-| Option
-| Description
-| Default
-
-| zookeeper.znode.parent
-| The name of the base ZooKeeper znode used for HBase
-| /hbase
-
-| zookeeper.znode.replication
-| The name of the base znode used for replication
-| replication
-
-| zookeeper.znode.replication.peers
-| The name of the peer znode
-| peers
-
-| zookeeper.znode.replication.peers.state
-| The name of peer-state znode
-| peer-state
-
-| zookeeper.znode.replication.rs
-| The name of the rs znode
-| rs
-
-| hbase.replication
-| Whether replication is enabled or disabled on a given
-                cluster
-| false
-
-| replication.sleep.before.failover
-| How many milliseconds a worker should sleep before attempting to replicate
-                a dead region server's WAL queues.
-|
-
-| replication.executor.workers
-| The number of region servers a given region server should attempt to
-                  failover simultaneously.
-| 1
-|===
-
-[[ops.backup]]
-== HBase Backup
-
-There are two broad strategies for performing HBase backups: backing up with a 
full cluster shutdown, and backing up on a live cluster.
-Each approach has pros and cons. 
-
-For additional information, see 
link:http://blog.sematext.com/2011/03/11/hbase-backup-options/[HBase Backup
-        Options] over on the Sematext Blog. 
-
-[[ops.backup.fullshutdown]]
-=== Full Shutdown Backup
-
-Some environments can tolerate a periodic full shutdown of their HBase cluster, for example if it is being used as a back-end analytic capacity and is not serving front-end web pages.
-The benefit is that the NameNode/Master and RegionServers are down, so there is no chance of missing any in-flight changes to either StoreFiles or metadata.
-The obvious con is that the cluster is down.
-The steps include: 
-
-[[ops.backup.fullshutdown.stop]]
-==== Stop HBase
-
-
-
-[[ops.backup.fullshutdown.distcp]]
-==== Distcp
-
-Distcp could be used to copy the contents of the HBase directory in HDFS either to another directory on the same cluster, or to a different cluster. 
-
-Note: Distcp works in this situation because the cluster is down and there are 
no in-flight edits to files.
-Distcp-ing of files in the HBase directory is not generally recommended on a 
live cluster. 
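-
-A sketch of such a copy, run only while HBase is fully shut down, assuming the default HBase root directory [path]_/hbase_ and illustrative NameNode hostnames:
-
-[source,bourne]
-----
-$ hadoop distcp hdfs://src-namenode:8020/hbase hdfs://backup-namenode:8020/hbase-backup
-----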
-
-[[ops.backup.fullshutdown.restore]]
-==== Restore (if needed)
-
-The backup of the hbase directory from HDFS is copied onto the 'real' hbase directory via distcp.
-The act of copying these files creates new HDFS metadata, so a restore of the NameNode edits from the time of the HBase backup isn't required, because this is a restore (via distcp) of a specific HDFS directory (i.e., the HBase part), not of the entire HDFS file-system. 
-
-[[ops.backup.live.replication]]
-=== Live Cluster Backup - Replication
-
-This approach assumes that there is a second cluster.
-See the HBase page on 
link:http://hbase.apache.org/replication.html[replication] for more 
information. 
-
-[[ops.backup.live.copytable]]
-=== Live Cluster Backup - CopyTable
-
-The <<copytable,copytable>> utility could either be used to copy data from one 
table to another on the same cluster, or to copy data to another table on 
another cluster. 
-
-Since the cluster is up, there is a risk that edits could be missed in the 
copy process. 
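-
-For example, to copy a table named [literal]+TestTable+ to another cluster identified by its cluster key, a command along the following lines could be used (the table name and peer address are examples only):
-
-[source,bourne]
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=destination-zk:2181:/hbase TestTable
-----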
-
-[[ops.backup.live.export]]
-=== Live Cluster Backup - Export
-
-The <<export,export>> approach dumps the content of a table to HDFS on the 
same cluster.
-To restore the data, the <<import,import>> utility would be used. 
-
-Since the cluster is up, there is a risk that edits could be missed in the 
export process. 
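-
-As an illustration, a table could be dumped to an HDFS directory and later re-imported with commands along the following lines (the table name, output directory, and version count are examples only):
-
-[source,bourne]
-----
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export TestTable /backups/TestTable 1
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import TestTable /backups/TestTable
-----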
-
-[[ops.snapshots]]
-== HBase Snapshots
-
-HBase Snapshots allow you to take a snapshot of a table without much impact on Region Servers.
-Snapshot, clone, and restore operations don't involve data copying.
-Also, exporting a snapshot to another cluster has no impact on the Region Servers. 
-
-Prior to version 0.94.6, the only way to back up or clone a table was to use CopyTable/ExportTable, or to copy all the hfiles in HDFS after disabling the table.
-The disadvantages of these methods are that the former can degrade region server performance, and the latter requires you to disable the table, which means no reads or writes; this is usually unacceptable. 
-
-[[ops.snapshots.configuration]]
-=== Configuration
-
-To turn on snapshot support, just set the [var]+hbase.snapshot.enabled+ property to [literal]+true+.
-(Snapshots are enabled by default in 0.95+ and off by default in 0.94.6+.)
-
-[source,xml]
-----
-
-  <property>
-    <name>hbase.snapshot.enabled</name>
-    <value>true</value>
-  </property>
-----
-
-[[ops.snapshots.takeasnapshot]]
-=== Take a Snapshot
-
-You can take a snapshot of a table regardless of whether it is enabled or 
disabled.
-The snapshot operation doesn't involve any data copying.
-
-----
-
-$ ./bin/hbase shell
-hbase> snapshot 'myTable', 'myTableSnapshot-122112'
-----
-
-.Take a Snapshot Without Flushing
-The default behavior is to perform a flush of data in memory before the 
snapshot is taken.
-This means that data in memory is included in the snapshot.
-In most cases, this is the desired behavior.
-However, if your set-up can tolerate data in memory being excluded from the snapshot, you can use the +SKIP_FLUSH+ option of the +snapshot+ command to disable flushing while taking the snapshot.
-
-----
-hbase> snapshot 'mytable', 'snapshot123', {SKIP_FLUSH => true}
-----
-
-WARNING: There is no way to determine or predict whether a concurrent insert or update will be included in a given snapshot, whether flushing is enabled or disabled.
-A snapshot is only a representation of a table during a window of time.
-The amount of time the snapshot operation will take to reach each Region 
Server may vary from a few seconds to a minute, depending on the resource load 
and speed of the hardware or network, among other factors.
-There is also no way to know whether a given insert or update is in memory or 
has been flushed.
-
-[[ops.snapshots.list]]
-=== Listing Snapshots
-
-List all snapshots taken (by printing the names and relative information).
-
-----
-
-$ ./bin/hbase shell
-hbase> list_snapshots
-----
-
-[[ops.snapshots.delete]]
-=== Deleting Snapshots
-
-You can remove a snapshot, and the files retained for that snapshot will be 
removed if no longer needed.
-
-----
-
-$ ./bin/hbase shell
-hbase> delete_snapshot 'myTableSnapshot-122112'
-----
-
-[[ops.snapshots.clone]]
-=== Clone a table from snapshot
-
-From a snapshot you can create a new table (clone operation) with the same 
data that you had when the snapshot was taken.
-The clone operation doesn't involve data copies, and a change to the cloned table doesn't impact the snapshot or the original table.
-
-----
-
-$ ./bin/hbase shell
-hbase> clone_snapshot 'myTableSnapshot-122112', 'myNewTestTable'
-----
-
-[[ops.snapshots.restore]]
-=== Restore a snapshot
-
-The restore operation requires the table to be disabled, and the table will be 
restored to the state at the time when the snapshot was taken, changing both 
data and schema if required.
-
-----
-
-$ ./bin/hbase shell
-hbase> disable 'myTable'
-hbase> restore_snapshot 'myTableSnapshot-122112'
-----
-
-NOTE: Since Replication works at log level and snapshots at file-system level, 
after a restore, the replicas will be in a different state from the master.
-If you want to use restore, you need to stop replication and redo the 
bootstrap. 
-
-In case of partial data loss due to a misbehaving client, instead of a full restore that requires the table to be disabled, you can clone the table from the snapshot and use a MapReduce job to copy the data that you need from the clone back into the main table, as sketched below. 
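-
-A sketch of that workflow, using the snapshot name from the earlier examples and an illustrative clone name and time range, is to clone the snapshot to a temporary table and then copy only the affected rows back with CopyTable:
-
-----
-$ ./bin/hbase shell
-hbase> clone_snapshot 'myTableSnapshot-122112', 'myTable_restore'
-hbase> exit
-$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1411996800000 --endtime=1412083200000 --new.name=myTable myTable_restore
-----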
-
-[[ops.snapshots.acls]]
-=== Snapshots operations and ACLs
-
-If you are using security with the AccessController Coprocessor (See 
<<hbase.accesscontrol.configuration,hbase.accesscontrol.configuration>>), only 
a global administrator can take, clone, or restore a snapshot, and these 
actions do not capture the ACL rights.
-This means that restoring a table preserves the ACL rights of the existing 
table, while cloning a table creates a new table that has no ACL rights until 
the administrator adds them.
-
-[[ops.snapshots.export]]
-=== Export to another cluster
-
-The ExportSnapshot tool copies all the data related to a snapshot (hfiles, 
logs, snapshot metadata) to another cluster.
-The tool executes a Map-Reduce job, similar to distcp, to copy files between 
the two clusters, and since it works at file-system level the hbase cluster 
does not have to be online.
-
-To copy a snapshot called MySnapshot to an HBase cluster srv2 (hdfs://srv2:8082/hbase) using 16 mappers:
-
-[source,bourne]
-----
-$ bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16
-----
-
-.Limiting Bandwidth Consumption
-You can limit the bandwidth consumption when exporting a snapshot, by 
specifying the [code]+-bandwidth+ parameter, which expects an integer 
representing megabytes per second.
-The following example limits the above example to 200 MB/sec.
-
-[source,bourne]
-----
-$ bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot MySnapshot -copy-to hdfs://srv2:8082/hbase -mappers 16 -bandwidth 200
-----
-
-[[ops.capacity]]
-== Capacity Planning and Region Sizing
-
-There are several considerations when planning the capacity for an HBase 
cluster and performing the initial configuration.
-Start with a solid understanding of how HBase handles data internally.
-
-[[ops.capacity.nodes]]
-=== Node count and hardware/VM configuration
-
-[[ops.capacity.no

<TRUNCATED>
