http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/docbkx/ops_mgt.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/ops_mgt.xml b/src/main/docbkx/ops_mgt.xml
deleted file mode 100644
index 3e38ff7..0000000
--- a/src/main/docbkx/ops_mgt.xml
+++ /dev/null
@@ -1,2600 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<chapter
-  version="5.0"
-  xml:id="ops_mgt"
-  xmlns="http://docbook.org/ns/docbook";
-  xmlns:xlink="http://www.w3.org/1999/xlink";
-  xmlns:xi="http://www.w3.org/2001/XInclude";
-  xmlns:svg="http://www.w3.org/2000/svg";
-  xmlns:m="http://www.w3.org/1998/Math/MathML";
-  xmlns:html="http://www.w3.org/1999/xhtml";
-  xmlns:db="http://docbook.org/ns/docbook";>
-  <!--
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-  <title>Apache HBase Operational Management</title>
-  <para> This chapter will cover operational tools and practices required of a 
running Apache HBase
-    cluster. The subject of operations is related to the topics of <xref
-      linkend="trouble" />, <xref
-      linkend="performance" />, and <xref
-      linkend="configuration" /> but is a distinct topic in itself. </para>
-
-  <section
-    xml:id="tools">
-    <title>HBase Tools and Utilities</title>
-
-    <para>HBase provides several tools for administration, analysis, and 
debugging of your cluster.
-      The entry-point to most of these tools is the 
<filename>bin/hbase</filename> command, though
-      some tools are available in the <filename>dev-support/</filename> 
directory.</para>
-    <para>To see usage instructions for <filename>bin/hbase</filename> 
command, run it with no
-      arguments, or with the <option>-h</option> argument. These are the usage 
instructions for
-      HBase 0.98.x. Some commands, such as <command>version</command>, 
<command>pe</command>,
-        <command>ltt</command>, <command>clean</command>, are not available in 
previous
-      versions.</para>
-    <screen>
-$ <userinput>bin/hbase</userinput>
-<![CDATA[Usage: hbase [<options>] <command> [<args>]]]>
-Options:
-  --config DIR    Configuration directory to use. Default: ./conf
-  --hosts HOSTS   Override the list in 'regionservers' file
-
-Commands:
-Some commands take arguments. Pass no args or -h for usage.
-  shell           Run the HBase shell
-  hbck            Run the hbase 'fsck' tool
-  hlog            Write-ahead-log analyzer
-  hfile           Store file analyzer
-  zkcli           Run the ZooKeeper shell
-  upgrade         Upgrade hbase
-  master          Run an HBase HMaster node
-  regionserver    Run an HBase HRegionServer node
-  zookeeper       Run a Zookeeper server
-  rest            Run an HBase REST server
-  thrift          Run the HBase Thrift server
-  thrift2         Run the HBase Thrift2 server
-  clean           Run the HBase clean up script
-  classpath       Dump hbase CLASSPATH
-  mapredcp        Dump CLASSPATH entries required by mapreduce
-  pe              Run PerformanceEvaluation
-  ltt             Run LoadTestTool
-  version         Print the version
-  CLASSNAME       Run the class named CLASSNAME      
-    </screen>
-    <para>Some of the tools and utilities below are Java classes which are 
passed directly to the
-        <filename>bin/hbase</filename> command, as referred to in the last 
line of the usage
-      instructions. Others, such as <command>hbase shell</command> (<xref 
linkend="shell"/>),
-        <command>hbase upgrade</command> (<xref linkend="upgrading"/>), and 
<command>hbase
-        thrift</command> (<xref linkend="thrift"/>), are documented elsewhere 
in this guide.</para>
-    <section
-      xml:id="canary">
-      <title>Canary</title>
-      <para> The Canary class can help users canary-test the HBase cluster status, at the
-        granularity of every column family of every region, or of every RegionServer. To see the
-        usage, use the <literal>--help</literal> parameter. </para>
-      <screen language="bourne">$ ${HBASE_HOME}/bin/hbase 
org.apache.hadoop.hbase.tool.Canary -help
-
-Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 
[table2]...] | [regionserver1 [regionserver2]..]
- where [opts] are:
-   -help          Show this help and exit.
-   -regionserver  replace the table argument to regionserver,
-      which means to enable regionserver mode
-   -daemon        Continuous check at defined intervals.
-   -interval &lt;N>  Interval between checks (sec)
-   -e             Use region/regionserver as regular expression
-      which means the region/regionserver is regular expression pattern
-   -f &lt;B>         stop whole program if first error occurs, default is true
-   -t &lt;N>         timeout for a check, default is 600000 
(milliseconds)</screen>
-      <para> This tool returns non-zero error codes so that it can integrate with other monitoring
-        tools, such as Nagios. The error code definitions are: </para>
-      <programlisting language="java">private static final int USAGE_EXIT_CODE 
= 1;
-private static final int INIT_ERROR_EXIT_CODE = 2;
-private static final int TIMEOUT_ERROR_EXIT_CODE = 3;
-private static final int ERROR_EXIT_CODE = 4;</programlisting>
-      <para> The following examples are based on this scenario: there are two tables, test-01 and
-        test-02, each with two column families, cf1 and cf2, deployed across three RegionServers,
-        as shown in the following table. </para>
-
-      <informaltable>
-        <tgroup
-          cols="3"
-          align="center"
-          colsep="1"
-          rowsep="1">
-          <colspec
-            colname="regionserver"
-            align="center" />
-          <colspec
-            colname="test-01"
-            align="center" />
-          <colspec
-            colname="test-02"
-            align="center" />
-          <thead>
-            <row>
-              <entry>RegionServer</entry>
-              <entry>test-01</entry>
-              <entry>test-02</entry>
-            </row>
-          </thead>
-          <tbody>
-            <row>
-              <entry>rs1</entry>
-              <entry>r1</entry>
-              <entry>r2</entry>
-            </row>
-            <row>
-              <entry>rs2</entry>
-              <entry>r2</entry>
-              <entry />
-            </row>
-            <row>
-              <entry>rs3</entry>
-              <entry>r2</entry>
-              <entry>r1</entry>
-            </row>
-          </tbody>
-        </tgroup>
-      </informaltable>
-      <para> The following examples are based on the scenario described above. </para>
-      <section>
-        <title>Canary test for every column family (store) of every region of 
every table</title>
-        <screen language="bourne">$ ${HBASE_HOME}/bin/hbase 
org.apache.hadoop.hbase.tool.Canary
-            
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-01,,1386230156732.0e3c7d77ffb6361ea1b996ac1042ca9a. column family cf1 in 
2ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-01,,1386230156732.0e3c7d77ffb6361ea1b996ac1042ca9a. column family cf2 in 
2ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-01,0004883,1386230156732.87b55e03dfeade00f441125159f8ca87. column family 
cf1 in 4ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-01,0004883,1386230156732.87b55e03dfeade00f441125159f8ca87. column family 
cf2 in 1ms
-...
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,,1386559511167.aa2951a86289281beee480f107bb36ee. column family cf1 in 
5ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,,1386559511167.aa2951a86289281beee480f107bb36ee. column family cf2 in 
3ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,0004883,1386559511167.cbda32d5e2e276520712d84eaaa29d84. column family 
cf1 in 31ms
-13/12/09 03:26:32 INFO tool.Canary: read from region 
test-02,0004883,1386559511167.cbda32d5e2e276520712d84eaaa29d84. column family 
cf2 in 8ms
-</screen>
-        <para> As you can see, table test-01 has two regions and two column families, so the Canary
-          tool picks four small pieces of data from four (2 regions * 2 stores) different stores.
-          This is the default behavior of the tool. </para>
-      </section>
-
-      <section>
-        <title>Canary test for every column family (store) of every region of 
specific
-          table(s)</title>
-        <para> You can also test one or more specific tables.</para>
-        <screen language="bourne">$ ${HBASE_HOME}/bin/hbase 
org.apache.hadoop.hbase.tool.Canary test-01 test-02</screen>
-      </section>
-
-      <section>
-        <title>Canary test with regionserver granularity</title>
-        <para> This will pick one small piece of data from each RegionServer. You can also pass
-          RegionServer names as arguments to canary-test specific RegionServers.</para>
-        <screen language="bourne">$ ${HBASE_HOME}/bin/hbase 
org.apache.hadoop.hbase.tool.Canary -regionserver
-            
-13/12/09 06:05:17 INFO tool.Canary: Read from table:test-01 on region 
server:rs2 in 72ms
-13/12/09 06:05:17 INFO tool.Canary: Read from table:test-02 on region 
server:rs3 in 34ms
-13/12/09 06:05:17 INFO tool.Canary: Read from table:test-01 on region 
server:rs1 in 56ms</screen>
-      </section>
-      <section>
-        <title>Canary test with regular expression pattern</title>
-        <para> This will test both table test-01 and test-02.</para>
-        <screen language="bourne">$ ${HBASE_HOME}/bin/hbase 
org.apache.hadoop.hbase.tool.Canary -e test-0[1-2]</screen>
-      </section>
-
-      <section>
-        <title>Run canary test as daemon mode</title>
-        <para> Run repeatedly with the interval defined by the -interval option, whose default
-          value is 6 seconds. This daemon will stop itself and return a non-zero error code if any
-          error occurs, because the -f option defaults to true.</para>
-        <screen language="bourne">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -daemon</screen>
-        <para>Run repeatedly at the specified interval, and do not stop even if errors occur during
-          the test.</para>
-        <screen language="bourne">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -daemon -interval 50000 -f false</screen>
-      </section>
-
-      <section>
-        <title>Force timeout if canary test stuck</title>
-        <para>In some cases a request can get stuck on a RegionServer and never return a response
-          to the client. The problematic RegionServer may also not be marked as dead by the Master,
-          which leaves clients hanging. The timeout option kills the canary test forcefully and
-          returns a non-zero error code in that case. This run sets the timeout value to 60 seconds
-          (60000 milliseconds); the default value is 600000 milliseconds.</para>
-        <screen language="bourne">$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary -t 60000</screen>
-      </section>
-
-    </section>
-
-    <section
-      xml:id="health.check">
-      <title>Health Checker</title>
-      <para>You can configure HBase to run a script periodically, and if it fails N times
-        (configurable), have the server exit. See <link
-          xlink:href="https://issues.apache.org/jira/browse/HBASE-7351">HBASE-7351 Periodic health
-        check script</link> for configuration and details. </para>
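-      <para>As a rough sketch, the health check is enabled through properties in
-        <filename>hbase-site.xml</filename> similar to the following. The property names and values
-        shown here are taken from HBASE-7351 and are illustrative; verify them against your HBase
-        version before relying on them.</para>
-      <programlisting language="xml"><![CDATA[
-<property>
-  <name>hbase.node.health.script.location</name>
-  <!-- illustrative path to your own health check script -->
-  <value>/opt/hbase-health/healthcheck.sh</value>
-</property>
-<property>
-  <name>hbase.node.health.script.frequency</name>
-  <!-- how often to run the script, in milliseconds -->
-  <value>10000</value>
-</property>
-<property>
-  <name>hbase.node.health.script.timeout</name>
-  <!-- kill the script if it runs longer than this, in milliseconds -->
-  <value>60000</value>
-</property>
-<property>
-  <name>hbase.node.health.failure.threshold</name>
-  <!-- number of consecutive failures before the server exits -->
-  <value>3</value>
-</property>
-]]></programlisting>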
-    </section>
-
-    <section
-      xml:id="driver">
-      <title>Driver</title>
-      <para>Several frequently-accessed utilities are provided as 
<code>Driver</code> classes, and executed by
-        the <filename>bin/hbase</filename> command. These utilities represent 
MapReduce jobs which
-        run on your cluster. They are run in the following way, replacing
-          <replaceable>UtilityName</replaceable> with the utility you want to 
run. This command
-        assumes you have set the environment variable 
<literal>HBASE_HOME</literal> to the directory
-        where HBase is unpacked on your server.</para>
-      <screen language="bourne">
-${HBASE_HOME}/bin/hbase 
org.apache.hadoop.hbase.mapreduce.<replaceable>UtilityName</replaceable>        
-      </screen>
-      <para>The following utilities are available:</para>
-      <variablelist>
-        <varlistentry>
-          <term><command>LoadIncrementalHFiles</command></term>
-          <listitem><para>Complete a bulk data load.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term><command>CopyTable</command></term>
-          <listitem><para>Export a table from the local cluster to a peer 
cluster.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term><command>Export</command></term>
-          <listitem><para>Write table data to HDFS.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term><command>Import</command></term>
-          <listitem><para>Import data written by a previous 
<command>Export</command> operation.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term><command>ImportTsv</command></term>
-          <listitem><para>Import data in TSV format.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term><command>RowCounter</command></term>
-          <listitem><para>Count rows in an HBase table.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term><command>replication.VerifyReplication</command></term>
-          <listitem><para>Compare the data from tables in two different 
clusters. WARNING: It
-            doesn't work for incrementColumnValues'd cells since the timestamp 
is changed. Note that
-          this command is in a different package than the 
others.</para></listitem>
-        </varlistentry>
-      </variablelist>
-      <para>Each command except <command>RowCounter</command> accepts a single
-        <literal>--help</literal> argument to print usage instructions.</para>
-    </section>
-    <section
-      xml:id="hbck">
-      <title>HBase <application>hbck</application></title>
-      <subtitle>An <command>fsck</command> for your HBase install</subtitle>
-      <para>To run <application>hbck</application> against your HBase cluster, run <command>$
-          ./bin/hbase hbck</command>. At the end of the command's output it prints
-          <literal>OK</literal> or <literal>INCONSISTENCY</literal>. If your cluster reports
-        inconsistencies, pass <command>-details</command> to see more detail emitted. If
-        inconsistencies persist, run <command>hbck</command> a few times, because the inconsistency
-        may be transient (e.g. the cluster is starting up or a region is splitting). Passing
-          <command>-fix</command> may correct the inconsistency (this is an experimental
-        feature). </para>
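-      <para>A minimal sketch of a typical check-and-repair session is shown below; the output is
-        abbreviated and the inconsistencies reported will depend on your cluster state.</para>
-      <screen language="bourne">
-$ <userinput>./bin/hbase hbck</userinput>
-...
-0 inconsistencies detected.
-Status: OK
-$ <userinput>./bin/hbase hbck -details</userinput>
-$ <userinput>./bin/hbase hbck -fix</userinput>
-      </screen>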
-      <para>For more information, see <xref
-          linkend="hbck.in.depth" />. </para>
-    </section>
-    <section
-      xml:id="hfile_tool2">
-      <title>HFile Tool</title>
-      <para>See <xref
-          linkend="hfile_tool" />.</para>
-    </section>
-    <section
-      xml:id="wal_tools">
-      <title>WAL Tools</title>
-
-      <section
-        xml:id="hlog_tool">
-        <title><classname>FSHLog</classname> tool</title>
-
-        <para>The main method on <classname>FSHLog</classname> offers manual split and dump
-          facilities. Pass it WALs or the product of a split, the content of the
-            <filename>recovered.edits</filename> directory.</para>
-
-        <para>You can get a textual dump of a WAL file content by doing the 
following:</para>
-        <screen language="bourne"> $ ./bin/hbase 
org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012
 </screen>
-        <para>The return code will be non-zero if there are issues with the file, so you can test
-          the wholesomeness of the file by redirecting <varname>STDOUT</varname> to
-          <code>/dev/null</code> and testing the program's return code.</para>
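-        <para>For example, a rough sketch of such a check (the WAL path here is the illustrative
-          one from above):</para>
-        <screen language="bourne"> $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.FSHLog --dump hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012 > /dev/null
- $ echo $?
- 0</screen>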
-
-        <para>Similarly you can force a split of a log file directory by 
doing:</para>
-        <screen language="bourne"> $ ./bin/hbase 
org.apache.hadoop.hbase.regionserver.wal.FSHLog --split 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/</screen>
-
-        <section
-          xml:id="hlog_tool.prettyprint">
-          <title>WAL Pretty Printer</title>
-          <para>The WAL Pretty Printer is a tool with configurable options to
-            print the contents of a WAL. You can invoke it via the hbase cli 
with the 'wal' command.
-          </para>
-          <screen langauge="bourne"> $ ./bin/hbase wal 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012</screen>
-          <note>
-            <title>WAL Printing in older versions of HBase</title>
-            <para>Prior to version 2.0, the WAL Pretty Printer was called the
-              <classname>HLogPrettyPrinter</classname>, after an internal name 
for HBase's write
-              ahead log. In those versions, you can print the contents of a 
WAL using the same
-              configuration as above, but with the 'hlog' command.
-            </para>
-            <screen langauge="bourne"> $ ./bin/hbase hlog 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012</screen>
-          </note>
-        </section>
-
-      </section>
-    </section>
-    <section
-      xml:id="compression.tool">
-      <title>Compression Tool</title>
-      <para>See <xref
-          linkend="compression.test" />.</para>
-    </section>
-    <section
-      xml:id="copytable">
-      <title>CopyTable</title>
-      <para> CopyTable is a utility that can copy part or all of a table, either to the same
-        cluster or to another cluster. The target table must first exist. The usage is as
-        follows:</para>
-
-      <screen language="bourne">
-$ <userinput>./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --help 
</userinput>       
-Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] 
[--new.name=NEW] [--peer.adr=ADR] &lt;tablename&gt;
-
-Options:
- rs.class     hbase.regionserver.class of the peer cluster, 
-              specify if different from current cluster
- rs.impl      hbase.regionserver.impl of the peer cluster,
- startrow     the start row
- stoprow      the stop row
- starttime    beginning of the time range (unixtime in millis)
-              without endtime means from starttime to forever
- endtime      end of the time range.  Ignored if no starttime specified.
- versions     number of cell versions to copy
- new.name     new table's name
- peer.adr     Address of the peer cluster given in the format
-              
hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
- families     comma-separated list of families to copy
-              To copy from cf1 to cf2, give sourceCfName:destCfName.
-              To keep the same name, just give "cfName"
- all.cells    also copy delete markers and deleted cells
-
-Args:
- tablename    Name of the table to copy
-
-Examples:
- To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
- $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable 
--starttime=1265875194289 --endtime=1265878794289 
--peer.adr=server1,server2,server3:2181:/hbase 
--families=myOldCf:myNewCf,cf2,cf3 TestTable
-
-For performance consider the following general options:
-  It is recommended that you set the following to >=100. A higher value uses 
more memory but
-  decreases the round trip time to the server and may increase performance.
-    -Dhbase.client.scanner.caching=100
-  The following should always be set to false, to prevent writing data twice, 
which may produce
-  inaccurate results.
-    -Dmapred.map.tasks.speculative.execution=false       
-      </screen>
-      <note>
-        <title>Scanner Caching</title>
-        <para>Caching for the input Scan is configured via 
<code>hbase.client.scanner.caching</code>
-          in the job configuration. </para>
-      </note>
-      <note>
-        <title>Versions</title>
-        <para>By default, the CopyTable utility copies only the latest version of row cells, unless
-            <code>--versions=n</code> is explicitly specified in the command. </para>
-      </note>
-      <para> See Jonathan Hsieh's <link
-          
xlink:href="http://www.cloudera.com/blog/2012/06/online-hbase-backups-with-copytable-2/";>Online
-          HBase Backups with CopyTable</link> blog post for more on 
<command>CopyTable</command>.
-      </para>
-    </section>
-    <section
-      xml:id="export">
-      <title>Export</title>
-      <para>Export is a utility that will dump the contents of a table to HDFS as a sequence file.
-        Invoke via:</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.Export &lt;tablename&gt; &lt;outputdir&gt; 
[&lt;versions&gt; [&lt;starttime&gt; [&lt;endtime&gt;]]]
-</screen>
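-      <para>For example, to export one version of each cell of a table named
-        <literal>MyTable</literal> (the table name and output directory here are
-        illustrative):</para>
-      <screen language="bourne">$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export MyTable /export/MyTable 1
-</screen>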
-
-      <para>Note: caching for the input Scan is configured via
-          <code>hbase.client.scanner.caching</code> in the job configuration. 
</para>
-    </section>
-    <section
-      xml:id="import">
-      <title>Import</title>
-      <para>Import is a utility that will load data that has been exported 
back into HBase. Invoke
-        via:</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.Import &lt;tablename&gt; &lt;inputdir&gt;
-</screen>
-      <para>To import files exported from a 0.94 cluster into a 0.96 or later cluster, you need to
-        set the system property "hbase.import.version" when running the import command, as
-        below:</para>
-      <screen language="bourne">$ bin/hbase -Dhbase.import.version=0.94 
org.apache.hadoop.hbase.mapreduce.Import &lt;tablename&gt; &lt;inputdir&gt;
-</screen>
-    </section>
-    <section
-      xml:id="importtsv">
-      <title>ImportTsv</title>
-      <para>ImportTsv is a utility that will load data in TSV format into 
HBase. It has two distinct
-        usages: loading data from TSV format in HDFS into HBase via Puts, and 
preparing StoreFiles
-        to be loaded via the <code>completebulkload</code>. </para>
-      <para>To load data via Puts (i.e., non-bulk loading):</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c 
&lt;tablename&gt; &lt;hdfs-inputdir&gt;
-</screen>
-
-      <para>To generate StoreFiles for bulk-loading:</para>
-      <programlisting language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c 
-Dimporttsv.bulk.output=hdfs://storefile-outputdir &lt;tablename&gt; 
&lt;hdfs-data-inputdir&gt;
-</programlisting>
-      <para>These generated StoreFiles can be loaded into HBase via <xref
-          linkend="completebulkload" />. </para>
-      <section
-        xml:id="importtsv.options">
-        <title>ImportTsv Options</title>
-        <para>Running <command>ImportTsv</command> with no arguments prints 
brief usage
-          information:</para>
-        <screen>
-Usage: importtsv -Dimporttsv.columns=a,b,c &lt;tablename&gt; &lt;inputdir&gt;
-
-Imports the given input directory of TSV data into the specified table.
-
-The column names of the TSV data must be specified using the 
-Dimporttsv.columns
-option. This option takes the form of comma-separated column names, where each
-column name is either a simple column family, or a columnfamily:qualifier. The 
special
-column name HBASE_ROW_KEY is used to designate that this column should be used
-as the row key for each imported record. You must specify exactly one column
-to be the row key, and you must specify a column name for every column that 
exists in the
-input data.
-
-By default importtsv will load data directly into HBase. To instead generate
-HFiles of data to prepare for a bulk data load, pass the option:
-  -Dimporttsv.bulk.output=/path/for/output
-  Note: the target table will be created with default column family 
descriptors if it does not already exist.
-
-Other options that may be specified with -D include:
-  -Dimporttsv.skip.bad.lines=false - fail if encountering an invalid line
-  '-Dimporttsv.separator=|' - eg separate on pipes instead of tabs
-  -Dimporttsv.timestamp=currentTimeAsLong - use the specified timestamp for 
the import
-  -Dimporttsv.mapper.class=my.Mapper - A user-defined Mapper to use instead of 
org.apache.hadoop.hbase.mapreduce.TsvImporterMapper
-        </screen>
-      </section>
-      <section
-        xml:id="importtsv.example">
-        <title>ImportTsv Example</title>
-        <para>For example, assume that we are loading data into a table called 
'datatsv' with a
-          ColumnFamily called 'd' with two columns "c1" and "c2". </para>
-        <para>Assume that an input file exists as follows:
-          <screen>
-row1   c1      c2
-row2   c1      c2
-row3   c1      c2
-row4   c1      c2
-row5   c1      c2
-row6   c1      c2
-row7   c1      c2
-row8   c1      c2
-row9   c1      c2
-row10  c1      c2
-          </screen>
-        </para>
-        <para>For ImportTsv to use this input file, the command line needs to look like
-          this:</para>
-        <screen language="bourne">
- HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` 
${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar importtsv 
-Dimporttsv.columns=HBASE_ROW_KEY,d:c1,d:c2 
-Dimporttsv.bulk.output=hdfs://storefileoutput datatsv hdfs://inputfile
- </screen>
-        <para> ... and in this example the first column is the rowkey, which 
is why the
-          HBASE_ROW_KEY is used. The second and third columns in the file will 
be imported as "d:c1"
-          and "d:c2", respectively. </para>
-      </section>
-      <section
-        xml:id="importtsv.warning">
-        <title>ImportTsv Warning</title>
-        <para>If you are preparing a lot of data for bulk loading, make sure the target HBase table
-          is pre-split appropriately. </para>
-      </section>
-      <section
-        xml:id="importtsv.also">
-        <title>See Also</title>
-        <para>For more information about bulk-loading HFiles into HBase, see 
<xref
-            linkend="arch.bulk.load" /></para>
-      </section>
-    </section>
-
-    <section
-      xml:id="completebulkload">
-      <title>CompleteBulkLoad</title>
-      <para>The <code>completebulkload</code> utility will move generated 
StoreFiles into an HBase
-        table. This utility is often used in conjunction with output from <xref
-          linkend="importtsv" />. </para>
-      <para>There are two ways to invoke this utility, with explicit classname 
and via the
-        driver:</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles 
&lt;hdfs://storefileoutput&gt; &lt;tablename&gt;
-</screen>
-      <para> ... and via the Driver:</para>
-      <screen language="bourne">HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase 
classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar 
completebulkload &lt;hdfs://storefileoutput&gt; &lt;tablename&gt;
-</screen>
-      <section
-        xml:id="completebulkload.warning">
-        <title>CompleteBulkLoad Warning</title>
-        <para>Data generated via MapReduce is often created with file 
permissions that are not
-          compatible with the running HBase process. Assuming you're running 
HDFS with permissions
-          enabled, those permissions will need to be updated before you run 
CompleteBulkLoad.</para>
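-        <para>For example, you might change the ownership of the generated files before loading.
-          This sketch assumes HBase runs as the <literal>hbase</literal> user, the HDFS superuser is
-          <literal>hdfs</literal>, and the StoreFiles were written to
-          <filename>hdfs://storefileoutput</filename>; adjust for your environment.</para>
-        <screen language="bourne">$ sudo -u hdfs hdfs dfs -chown -R hbase:hbase hdfs://storefileoutput</screen>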
-        <para>For more information about bulk-loading HFiles into HBase, see 
<xref
-            linkend="arch.bulk.load" />. </para>
-      </section>
-
-    </section>
-    <section
-      xml:id="walplayer">
-      <title>WALPlayer</title>
-      <para>WALPlayer is a utility to replay WAL files into HBase. </para>
-      <para>The WAL can be replayed for a set of tables or all tables, and a 
timerange can be
-        provided (in milliseconds). The WAL is filtered to this set of tables. 
The output can
-        optionally be mapped to another set of tables. </para>
-      <para>WALPlayer can also generate HFiles for later bulk importing; in that case only a single
-        table can be specified, and no mapping is allowed. </para>
-      <para>Invoke via:</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.WALPlayer [options] &lt;wal inputdir&gt; 
&lt;tables&gt; [&lt;tableMappings&gt;]
-</screen>
-      <para>For example:</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.WALPlayer /backuplogdir oldTable1,oldTable2 
newTable1,newTable2
-</screen>
-      <para> WALPlayer, by default, runs as a mapreduce job. To NOT run WALPlayer as a mapreduce job
-        on your cluster, force it to run entirely in the local process by adding the flag
-          <code>-Dmapreduce.jobtracker.address=local</code> on the command line. </para>
-    </section>
-    <section
-      xml:id="rowcounter">
-      <title>RowCounter and CellCounter</title>
-      <para><link
-          
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html";>RowCounter</link>
-        is a mapreduce job to count all the rows of a table. This is a good 
utility to use as a
-        sanity check to ensure that HBase can read all the blocks of a table 
if there are any
-        concerns of metadata inconsistency. It will run the mapreduce all in a 
single process but it
-        will run faster if you have a MapReduce cluster in place for it to 
exploit.</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.RowCounter &lt;tablename&gt; [&lt;column1&gt; 
&lt;column2&gt;...]
-</screen>
-      <para>Note: caching for the input Scan is configured via
-          <code>hbase.client.scanner.caching</code> in the job configuration. 
</para>
-      <para>HBase ships another diagnostic mapreduce job called <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/CellCounter.html">CellCounter</link>.
-        Like RowCounter, it gathers statistics about your table; however, the statistics gathered
-        by CellCounter are more fine-grained and include: </para>
-      <itemizedlist>
-        <listitem>
-          <para>Total number of rows in the table.</para>
-        </listitem>
-        <listitem>
-          <para>Total number of CFs across all rows.</para>
-        </listitem>
-        <listitem>
-          <para>Total qualifiers across all rows.</para>
-        </listitem>
-        <listitem>
-          <para>Total occurrence of each CF.</para>
-        </listitem>
-        <listitem>
-          <para>Total occurrence of each qualifier.</para>
-        </listitem>
-        <listitem>
-          <para>Total number of versions of each qualifier.</para>
-        </listitem>
-      </itemizedlist>
-      <para>The program allows you to limit the scope of the run. Provide a 
row regex or prefix to
-        limit the rows to analyze. Use 
<code>hbase.mapreduce.scan.column.family</code> to specify
-        scanning a single column family.</para>
-      <screen language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.mapreduce.CellCounter &lt;tablename&gt; 
&lt;outputDir&gt; [regex or prefix]</screen>
-      <para>Note: just like RowCounter, caching for the input Scan is 
configured via
-          <code>hbase.client.scanner.caching</code> in the job configuration. 
</para>
-    </section>
-    <section
-      xml:id="mlockall">
-      <title>mlockall</title>
-      <para>It is possible to optionally pin your servers in physical memory 
making them less likely
-        to be swapped out in oversubscribed environments by having the servers 
call <link
-          xlink:href="http://linux.die.net/man/2/mlockall";>mlockall</link> on 
startup. See <link
-          
xlink:href="https://issues.apache.org/jira/browse/HBASE-4391";>HBASE-4391 Add 
ability to
-          start RS as root and call mlockall</link> for how to build the 
optional library and have
-        it run on startup. </para>
-    </section>
-    <section
-      xml:id="compaction.tool">
-      <title>Offline Compaction Tool</title>
-      <para>See the usage for the <link
-          
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/CompactionTool.html";>Compaction
-          Tool</link>. Run it like this <command>./bin/hbase
-          org.apache.hadoop.hbase.regionserver.CompactionTool</command>
-      </para>
-    </section>
-
-    <section>
-      <title><command>hbase clean</command></title>
-      <para>The <command>hbase clean</command> command cleans HBase data from 
ZooKeeper, HDFS, or
-        both. It is appropriate to use for testing. Run it with no options for 
usage instructions.
-        The <command>hbase clean</command> command was introduced in HBase 
0.98.</para>
-      <screen>
-$ <userinput>bin/hbase clean</userinput>
-Usage: hbase clean (--cleanZk|--cleanHdfs|--cleanAll)
-Options:
-        --cleanZk   cleans hbase related data from zookeeper.
-        --cleanHdfs cleans hbase related data from hdfs.
-        --cleanAll  cleans hbase related data from both zookeeper and hdfs.    
    
-      </screen>
-    </section>
-    <section>
-      <title><command>hbase pe</command></title>
-      <para>The <command>hbase pe</command> command is a shortcut provided to 
run the
-          <code>org.apache.hadoop.hbase.PerformanceEvaluation</code> tool, 
which is used for
-        testing. The <command>hbase pe</command> command was introduced in 
HBase 0.98.4.</para>
-      <para>The PerformanceEvaluation tool accepts many different options and 
commands. For usage
-        instructions, run the command with no options.</para>
-      <para>To run PerformanceEvaluation prior to HBase 0.98.4, issue the 
command
-          <command>hbase 
org.apache.hadoop.hbase.PerformanceEvaluation</command>.</para>
-      <para>The PerformanceEvaluation tool has received many updates in recent 
HBase releases,
-        including support for namespaces, support for tags, cell-level ACLs 
and visibility labels,
-        multiget support for RPC calls, increased sampling sizes, an option to 
randomly sleep during
-        testing, and ability to "warm up" the cluster before testing 
starts.</para>
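-      <para>As an illustrative sketch only (the available options vary between releases; run the
-        command with no arguments for the authoritative list), the following runs a random-write
-        test with 10 client threads as a single local process, without launching a MapReduce
-        job:</para>
-      <screen language="bourne">$ bin/hbase pe --nomapred --rows=100000 randomWrite 10</screen>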
-    </section>
-    <section>
-      <title><command>hbase ltt</command></title>
-      <para>The <command>hbase ltt</command> command is a shortcut provided to 
run the
-        <code>org.apache.hadoop.hbase.util.LoadTestTool</code> utility, which 
is used for
-        testing. The <command>hbase ltt</command> command was introduced in 
HBase 0.98.4.</para>
-      <para>You must specify either <option>-write</option> or 
<option>-update-read</option> as the
-        first option. For general usage instructions, pass the 
<option>-h</option> option.</para>
-      <para>To run LoadTestTool prior to HBase 0.98.4, issue the command 
<command>hbase
-          org.apache.hadoop.hbase.util.LoadTestTool</command>.</para>
-      <para>The LoadTestTool has received many updates in recent HBase 
releases, including support
-        for namespaces, support for tags, cell-level ACLs and visibility 
labels, testing
-        security-related features, ability to specify the number of regions 
per server, tests for
-        multi-get RPC calls, and tests relating to replication.</para>
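-      <para>As an illustrative sketch only (check <command>hbase ltt -h</command> for the options
-        supported by your release), the following writes one million keys with an average of 10
-        columns per key and 100-byte values into a table named
-        <literal>loadtest</literal>:</para>
-      <screen language="bourne">$ bin/hbase ltt -tn loadtest -write 10:100 -num_keys 1000000</screen>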
-    </section>
-  </section>
-  <!--  tools -->
-
-  <section
-    xml:id="ops.regionmgt">
-    <title>Region Management</title>
-    <section
-      xml:id="ops.regionmgt.majorcompact">
-      <title>Major Compaction</title>
-      <para>Major compactions can be requested via the HBase shell or <link
-          
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29";>HBaseAdmin.majorCompact</link>.
 </para>
-      <para>Note: major compactions do NOT do region merges. See <xref
-          linkend="compaction" /> for more information about compactions. 
</para>
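-      <para>For example, from the HBase shell you can request a major compaction of a whole table
-        or of a single column family (the table name here is illustrative):</para>
-      <programlisting>hbase(main):001:0> major_compact 'test-01'
-hbase(main):002:0> major_compact 'test-01', 'cf1'</programlisting>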
-    </section>
-    <section
-      xml:id="ops.regionmgt.merge">
-      <title>Merge</title>
-      <para>Merge is a utility that can merge adjoining regions in the same 
table (see
-        org.apache.hadoop.hbase.util.Merge).</para>
-      <programlisting language="bourne">$ bin/hbase 
org.apache.hadoop.hbase.util.Merge &lt;tablename&gt; &lt;region1&gt; 
&lt;region2&gt;
-</programlisting>
-      <para>If you feel you have too many regions and want to consolidate them, Merge is the utility
-        you need. Merge must be run while the cluster is down. See the <link
-          xlink:href="http://ofps.oreilly.com/titles/9781449396107/performance.html">O'Reilly HBase
-          Book</link> for an example of usage. </para>
-      <para>You will need to pass 3 parameters to this application. The first 
one is the table name.
-        The second one is the fully qualified name of the first region to 
merge, like
-        "table_name,\x0A,1342956111995.7cef47f192318ba7ccc75b1bbf27a82b.". The 
third one is the
-        fully qualified name for the second region to merge. </para>
-      <para>Additionally, there is a Ruby script attached to <link
-          
xlink:href="https://issues.apache.org/jira/browse/HBASE-1621";>HBASE-1621</link> 
for region
-        merging. </para>
-    </section>
-  </section>
-
-  <section
-    xml:id="node.management">
-    <title>Node Management</title>
-    <section
-      xml:id="decommission">
-      <title>Node Decommission</title>
-      <para>You can stop an individual RegionServer by running the following 
script in the HBase
-        directory on the particular node:</para>
-      <screen language="bourne">$ ./bin/hbase-daemon.sh stop 
regionserver</screen>
-      <para> The RegionServer will first close all regions and then shut itself down. On shutdown,
-        the RegionServer's ephemeral node in ZooKeeper will expire. The master will notice the
-        RegionServer gone and will treat it as a 'crashed' server; it will reassign the regions the
-        RegionServer was carrying. </para>
-      <note>
-        <title>Disable the Load Balancer before Decommissioning a node</title>
-        <para>If the load balancer runs while a node is shutting down, then 
there could be
-          contention between the Load Balancer and the Master's recovery of 
the just decommissioned
-          RegionServer. Avoid any problems by disabling the balancer first. 
See <xref
-            linkend="lb" /> below. </para>
-      </note>
-      <note>
-        <title xml:id="considerAsDead.sh">Kill Node Tool</title>
-        <para>In hbase-2.0, in the bin directory, we added a script named
-          <filename>considerAsDead.sh</filename> that can be used to kill a 
regionserver.
-          Hardware issues could be detected by specialized monitoring tools 
before the 
-          zookeeper timeout has expired. 
<filename>considerAsDead.sh</filename> is a
-          simple function to mark a RegionServer as dead.  It deletes all the 
znodes
-          of the server, starting the recovery process.  Plug the script into
-          your monitoring/fault detection tools to initiate faster failover. Be
-          careful how you use this disruptive tool. Copy the script if you need to
-          make use of it in a version of HBase prior to hbase-2.0.
-        </para>
-      </note>
-      <para> A downside to the above way of stopping a RegionServer is that regions could be offline
-        for a good period of time. Regions are closed in order. If there are many regions on the
-        server, the first region to close may not be back online until all regions close and the
-        master notices the RegionServer's znode is gone. Apache HBase 0.90.2 added a facility for
-        having a node gradually shed its load and then shut itself down: the
-          <filename>graceful_stop.sh</filename> script. Here is its usage:</para>
-      <screen language="bourne">$ ./bin/graceful_stop.sh
-Usage: graceful_stop.sh [--config &lt;conf-dir&gt;] [--restart] [--reload] [--thrift] [--rest] &lt;hostname&gt;
- thrift      If we should stop/start thrift before/after the hbase stop/start
- rest        If we should stop/start rest before/after the hbase stop/start
- restart     If we should restart after graceful stop
- reload      Move offloaded regions back on to the stopped server
- debug       Move offloaded regions back on to the stopped server
- hostname    Hostname of server we are to stop</screen>
-      <para> To decommission a loaded RegionServer, run the following: 
<command>$
-          ./bin/graceful_stop.sh HOSTNAME</command> where 
<varname>HOSTNAME</varname> is the host
-        carrying the RegionServer you would decommission. </para>
-      <note>
-        <title>On <varname>HOSTNAME</varname></title>
-        <para>The <varname>HOSTNAME</varname> passed to <filename>graceful_stop.sh</filename> must
-          match the hostname that HBase is using to identify RegionServers. Check the list of
-          RegionServers in the master UI for how HBase is referring to servers. It is usually a
-          hostname but can also be an FQDN. Whatever HBase is using, this is what you should pass to
-          the <filename>graceful_stop.sh</filename> decommission script. If you pass IPs, the script
-          is not yet smart enough to make a hostname (or FQDN) of it, so it will fail when it
-          checks if the server is currently running; the graceful unloading of regions will not run.
-        </para>
-      </note>
-      <para> The <filename>graceful_stop.sh</filename> script will move the regions off the
-        decommissioned RegionServer one at a time to minimize region churn. It will verify that the
-        region is deployed in the new location before it moves the next region, and so on, until the
-        decommissioned server is carrying zero regions. At this point, the
-          <filename>graceful_stop.sh</filename> script tells the RegionServer to
-        <command>stop</command>. The master will notice the RegionServer gone, but all regions will
-        have already been redeployed, and because the RegionServer went down cleanly, there will be
-        no WAL logs to split. </para>
-      <note
-        xml:id="lb">
-        <title>Load Balancer</title>
-        <para> It is assumed that the Region Load Balancer is disabled while 
the
-            <command>graceful_stop</command> script runs (otherwise the 
balancer and the
-          decommission script will end up fighting over region deployments). 
Use the shell to
-          disable the balancer:</para>
-        <programlisting>hbase(main):001:0> balance_switch false
-true
-0 row(s) in 0.3590 seconds</programlisting>
-        <para> This turns the balancer OFF. To reenable, do:</para>
-        <programlisting>hbase(main):001:0> balance_switch true
-false
-0 row(s) in 0.3590 seconds</programlisting>
-        <para>The <command>graceful_stop</command> script will check the balancer and, if it is
-          enabled, will turn it off before it goes to work. If the script exits prematurely because
-          of an error, it will not have reset the balancer. Hence, it is better to manage the
-          balancer separately from <command>graceful_stop</command>, re-enabling it after you are
-          done with graceful_stop.
-        </para>
-      </note>
-      <section
-        xml:id="draining.servers">
-        <title>Decommissioning several Regions Servers concurrently</title>
-        <para>If you have a large cluster, you may want to decommission more 
than one machine at a
-          time by gracefully stopping multiple RegionServers concurrently. To 
gracefully drain
-          multiple regionservers at the same time, RegionServers can be put 
into a "draining" state.
-          This is done by marking a RegionServer as a draining node by 
creating an entry in
-          ZooKeeper under the <filename>hbase_root/draining</filename> znode. 
This znode has format
-            <code>name,port,startcode</code> just like the regionserver 
entries under
-            <filename>hbase_root/rs</filename> znode. </para>
-        <para>Without this facility, decommissioning multiple nodes may be 
non-optimal because
-          regions that are being drained from one region server may be moved 
to other regionservers
-          that are also draining. Marking RegionServers to be in the draining 
state prevents this
-          from happening. See this <link
-            
xlink:href="http://inchoate-clatter.blogspot.com/2012/03/hbase-ops-automation.html";>blog
-            post</link> for more details.</para>
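-        <para>A rough sketch of marking a server as draining using the HBase ZooKeeper CLI follows;
-          the znode path assumes the default <code>zookeeper.znode.parent</code> of
-          <filename>/hbase</filename>, and the name,port,startcode entry shown is
-          illustrative:</para>
-        <screen language="bourne">$ bin/hbase zkcli
-[zk: localhost:2181(CONNECTED) 0] create /hbase/draining/rs1.example.com,60020,1347281617423 ""
-</screen>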
-      </section>
-
-      <section
-        xml:id="bad.disk">
-        <title>Bad or Failing Disk</title>
-        <para>It is good to have <xref
-            linkend="dfs.datanode.failed.volumes.tolerated" /> set if you have a decent number of
-          disks per machine, for the case where a disk plain dies. But usually disks do the "John
-          Wayne" -- i.e. take a while to go down, spewing errors in <filename>dmesg</filename> -- or
-          for some reason run much slower than their companions. In this case you want to
-          decommission the disk. You have two options. You can <link
-            xlink:href="http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F">decommission
-            the datanode</link> or, less disruptively (only the bad disk's data will be
-          re-replicated), you can stop the datanode, unmount the bad volume (you can't umount a
-          volume while the datanode is using it), and then restart the datanode (presuming you have
-          set dfs.datanode.failed.volumes.tolerated &gt; 0). The regionserver will throw some errors
-          in its logs as it recalibrates where to get its data from -- it will likely roll its WAL
-          too -- but aside from some latency spikes, it should keep on chugging. </para>
-        <note>
-          <title>Short Circuit Reads</title>
-          <para>If you are doing short-circuit reads, you will have to move the regions off the
-            regionserver before you stop the datanode. With short-circuit reading, even though the
-            files are chmod'd so the regionserver cannot have access, because it already has the
-            files open it will be able to keep reading the file blocks from the bad disk even though
-            the datanode is down. Move the regions back after you restart the datanode.</para>
-        </note>
-      </section>
-    </section>
-    <section
-      xml:id="rolling">
-      <title>Rolling Restart</title>
-
-      <para>Some cluster configuration changes require either the entire 
cluster, or the
-          RegionServers, to be restarted in order to pick up the changes. In 
addition, rolling
-          restarts are supported for upgrading to a minor or maintenance 
release, and to a major
-          release if at all possible. See the release notes for the release you 
want to upgrade to, to
-          find out about limitations to the ability to perform a rolling 
upgrade.</para>
-      <para>There are multiple ways to restart your cluster nodes, depending 
on your situation.
-        These methods are detailed below.</para>
-      <section>
-        <title>Using the <command>rolling-restart.sh</command> Script</title>
-        
-        <para>HBase ships with a script, 
<filename>bin/rolling-restart.sh</filename>, that allows
-          you to perform rolling restarts on the entire cluster, the master 
only, or the
-          RegionServers only. The script is provided as a template for your 
own script, and is not
-          explicitly tested. It requires password-less SSH login to be 
configured and assumes that
-          you have deployed using a tarball. The script requires you to set 
some environment
-          variables before running it. Examine the script and modify it to 
suit your needs.</para>
-        <example>
-          <title><filename>rolling-restart.sh</filename> General Usage</title>
-          <screen language="bourne">
-$ <userinput>./bin/rolling-restart.sh --help</userinput><![CDATA[
-Usage: rolling-restart.sh [--config <hbase-confdir>] [--rs-only] 
[--master-only] [--graceful] [--maxthreads xx]          
-        ]]></screen>
-        </example>
-        <variablelist>
-          <varlistentry>
-            <term>Rolling Restart on RegionServers Only</term>
-            <listitem>
-              <para>To perform a rolling restart on the RegionServers only, 
use the
-                  <code>--rs-only</code> option. This might be necessary if 
you need to reboot the
-                individual RegionServer or if you make a configuration change 
that only affects
-                RegionServers and not the other HBase processes.</para>
-              <para>If you need to restart only a single RegionServer, or if 
you need to do extra
-                actions during the restart, use the 
<filename>bin/graceful_stop.sh</filename>
-                command instead. See <xref linkend="rolling.restart.manual" 
/>.</para>
-            </listitem>
-          </varlistentry>
-          <varlistentry>
-            <term>Rolling Restart on Masters Only</term>
-            <listitem>
-              <para>To perform a rolling restart on the active and backup 
Masters, use the
-                  <code>--master-only</code> option. You might use this if you 
know that your
-                configuration change only affects the Master and not the 
RegionServers, or if you
-                need to restart the server where the active Master is 
running.</para>
-              <para>If you are not running backup Masters, the Master is 
simply restarted. If you
-                are running backup Masters, they are all stopped before any 
are restarted, to avoid
-                a race condition in ZooKeeper to determine which is the new 
Master. First the main
-                Master is restarted, then the backup Masters are restarted. 
Directly after restart,
-                it checks for and cleans out any regions in transition before 
taking on its normal
-                workload.</para>
-            </listitem>
-          </varlistentry>
-          <varlistentry>
-            <term>Graceful Restart</term>
-            <listitem>
-              <para>If you specify the <code>--graceful</code> option, 
RegionServers are restarted
-                using the <filename>bin/graceful_stop.sh</filename> script, 
which moves regions off
-                a RegionServer before restarting it. This is safer, but can 
delay the
-                restart.</para>
-            </listitem>
-          </varlistentry>
-          <varlistentry>
-            <term>Limiting the Number of Threads</term>
-            <listitem>
-              <para>To limit the rolling restart to using only a specific 
number of threads, use the
-                  <code>--maxthreads</code> option.</para>
-            </listitem>
-          </varlistentry>
-        </variablelist>
-      </section>
-      <section xml:id="rolling.restart.manual">
-        <title>Manual Rolling Restart</title>
-        <para>To retain more control over the process, you may wish to 
manually do a rolling restart
-          across your cluster. This uses the 
<command>graceful_stop.sh</command> command <xref
-            linkend="decommission" />. In this method, you can restart each 
RegionServer
-          individually and then move its old regions back into place, 
retaining locality. If you
-          also need to restart the Master, you need to do it separately, and 
restart the Master
-          before restarting the RegionServers using this method. The following 
is an example of such
-          a command. You may need to tailor it to your environment. This 
script does a rolling
-          restart of RegionServers only. It disables the load balancer before 
moving the
-          regions.</para>
-        <screen><![CDATA[
-$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart 
--reload --debug $i; done &> /tmp/log.txt &
-        ]]></screen>
-        <para>Monitor the output of the <filename>/tmp/log.txt</filename> file 
to follow the
-          progress of the script. </para>
-      </section>
-
-      <section>
-        <title>Logic for Crafting Your Own Rolling Restart Script</title>
-        <para>Use the following guidelines if you want to create your own 
rolling restart script.</para>
-        <orderedlist>
-          <listitem>
-            <para>Extract the new release, verify its configuration, and 
synchronize it to all nodes
-              of your cluster using <command>rsync</command>, 
<command>scp</command>, or another
-              secure synchronization mechanism.</para></listitem>
-          <listitem><para>Use the hbck utility to ensure that the cluster is 
consistent.</para>
-          <screen>
-$ ./bin/hbase hbck            
-          </screen>
-            <para>Perform repairs if required. See <xref linkend="hbck" /> for 
details.</para>
-          </listitem>
-          <listitem><para>Restart the master first. You may need to modify 
these commands if your
-            new HBase directory is different from the old one, such as for an 
upgrade.</para>
-          <screen>
-$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master        
    
-          </screen>
-          </listitem>
-          <listitem><para>Gracefully restart each RegionServer, using a script 
such as the
-            following, from the Master.</para>
-          <screen><![CDATA[
-$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart 
--reload --debug $i; done &> /tmp/log.txt &            
-          ]]></screen>
-            <para>If you are running Thrift or REST servers, pass the --thrift 
or --rest options.
-              For other available options, run the 
<command>bin/graceful_stop.sh --help</command>
-              command.</para>
-            <para>It is important to drain HBase regions slowly when 
restarting multiple
-              RegionServers. Otherwise, multiple regions go offline 
simultaneously and must be
-              reassigned to other nodes, which may also go offline soon. This 
can negatively affect
-              performance. You can inject delays into the script above, for 
instance, by adding a
-              Shell command such as <command>sleep</command>. To wait for 5 
minutes between each
-              RegionServer restart, modify the above script to the 
following:</para>
-            <screen><![CDATA[
-$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart 
--reload --debug $i & sleep 5m; done &> /tmp/log.txt &            
-          ]]></screen>
-          </listitem>
-          <listitem><para>Restart the Master again, to clear out the dead 
servers list and re-enable
-          the load balancer.</para></listitem>
-          <listitem><para>Run the <command>hbck</command> utility again, to be 
sure the cluster is
-            consistent.</para></listitem>
-        </orderedlist>
-      </section>
-    </section>
-    <section
-      xml:id="adding.new.node">
-      <title>Adding a New Node</title>
-      <para>Adding a new regionserver in HBase is essentially free; you simply start it like this:
-          <command>$ ./bin/hbase-daemon.sh start regionserver</command> and it will register itself
-        with the master. Ideally you also started a DataNode on the same machine so that the RS can
-        eventually start to have local files. If you rely on ssh to start your daemons, don't forget
-        to add the new hostname in <filename>conf/regionservers</filename> on the master. </para>
-      <para>At this point the region server isn't serving data because no 
regions have moved to it
-        yet. If the balancer is enabled, it will start moving regions to the 
new RS. On a
-        small/medium cluster this can have a very adverse effect on latency as 
a lot of regions will
-        be offline at the same time. It is thus recommended to disable the 
balancer the same way
-        it's done when decommissioning a node and move the regions manually 
(or even better, using a
-        script that moves them one by one). </para>
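-      <para>As a hedged illustration (the encoded region name and server name below are
-        hypothetical placeholders), the balancer can be disabled and a region moved from the HBase
-        shell as follows:</para>
-      <screen>
-hbase> balance_switch false
-hbase> move 'b2a3f2e7c9d4e5f6a7b8c9d0e1f2a3b4', 'rs-node-3.example.com,60020,1397273111000'
-      </screen>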
-      <para>The moved regions will all have 0% locality and won't have any blocks in cache, so
-        the region server will have to use the network to serve requests. Apart from resulting in
-        higher latency, this may also saturate your network card's capacity. For practical
-        purposes, consider that a standard 1GigE NIC won't be able to read much more than
-          <emphasis>100MB/s</emphasis>. In this case, or if you are in an OLAP environment and
-        require locality, it is recommended to major compact the moved regions. </para>
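-      <para>For example (the table name is a placeholder), a major compaction of the moved regions
-        or of the whole table can be triggered from the HBase shell:</para>
-      <screen>
-hbase> major_compact 'usertable'
-      </screen>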
-
-    </section>
-  </section>
-  <!--  node mgt -->
-
-  <section xml:id="hbase_metrics">
-    <title>HBase Metrics</title>
-    <para>HBase emits metrics which adhere to the <link
-        
xlink:href="http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html";
-        >Hadoop metrics</link> API. Starting with HBase 
0.95<footnote><para>The Metrics system was redone in
-          HBase 0.96. See <link 
xlink:href="https://blogs.apache.org/hbase/entry/migration_to_the_new_metrics";>Migration
-            to the New Metrics Hotness – Metrics2</link> by Elliot Clark for
-            details</para></footnote>,
-      HBase is configured to emit a default
-      set of metrics with a default sampling period of every 10 seconds. You 
can use HBase
-      metrics in conjunction with Ganglia. You can also filter which metrics 
are emitted and extend
-      the metrics framework to capture custom metrics appropriate for your 
environment.</para>
-    <section xml:id="metric_setup">
-      <title>Metric Setup</title>
-      <para>For HBase 0.95 and newer, HBase ships with a default metrics 
configuration, or
-          <firstterm>sink</firstterm>. This includes a wide variety of 
individual metrics, and emits
-        them every 10 seconds by default. To configure metrics for a given 
region server, edit the
-          <filename>conf/hadoop-metrics2-hbase.properties</filename> file. 
Restart the region server
-        for the changes to take effect.</para>
-      <para>To change the sampling rate for the default sink, edit the line 
beginning with
-          <literal>*.period</literal>. To filter which metrics are emitted or 
to extend the metrics
-        framework, see the <link
-          xlink:href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html";
-          >Hadoop Metrics2 package documentation</link>.
-      </para>
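-      <para>For example, the following <filename>conf/hadoop-metrics2-hbase.properties</filename>
-        fragment is only a sketch that slows sampling to 30 seconds and adds an optional Ganglia
-        sink; the sink class and server address are assumptions to verify against your Hadoop
-        version:</para>
-      <screen>
-# sample conf/hadoop-metrics2-hbase.properties fragment (sketch)
-*.period=30
-# optional Ganglia sink; verify the class name for your Hadoop version
-hbase.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
-hbase.sink.ganglia.servers=ganglia.example.com:8649
-      </screen>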
-      <note xml:id="rs_metrics_ganglia">
-        <title>HBase Metrics and Ganglia</title>
-        <para>By default, HBase emits a large number of metrics per region 
server. Ganglia may have
-          difficulty processing all these metrics. Consider increasing the 
capacity of the Ganglia
-          server or reducing the number of metrics emitted by HBase. See <link
-            
xlink:href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html#filtering";
-            >Metrics Filtering</link>.</para>
-      </note>
-    </section>
-    <section>
-      <title>Disabling Metrics</title>
-      <para>To disable metrics for a region server, edit the
-          <filename>conf/hadoop-metrics2-hbase.properties</filename> file and 
comment out any
-        uncommented lines. Restart the region server for the changes to take 
effect.</para>
-    </section>
-
-    <section xml:id="discovering.available.metrics">
-      <title>Discovering Available Metrics</title>
-      <para>Rather than listing each metric which HBase emits by default, you 
can browse through the
-        available metrics, either as a JSON output or via JMX. Different 
metrics are
-        exposed for the Master process and each region server process.</para>
-      <procedure>
-        <title>Access a JSON Output of Available Metrics</title>
-        <step>
-          <para>After starting HBase, access the region server's web UI, at
-              <literal>http://REGIONSERVER_HOSTNAME:60030</literal> by default 
(or port 16030 in HBase 1.0+).</para>
-        </step>
-        <step>
-          <para>Click the <guilabel>Metrics Dump</guilabel> link near the top. 
The metrics for the region server are
-            presented as a dump of the JMX bean in JSON format. This will dump 
out all metrics names and their
-            values.
-            To include metrics descriptions in the listing &#x2014; this can 
be useful when you are exploring
-            what is available &#x2014; add a query string of
-            <literal>?description=true</literal> so your URL becomes
-            
<literal>http://REGIONSERVER_HOSTNAME:60030/jmx?description=true</literal>.
-            Not all beans and attributes have descriptions.
-          </para>
-        </step>
-        <step>
-          <para>To view metrics for the Master, connect to the Master's web UI 
instead (defaults to
-              <literal>http://localhost:60010</literal> or port 16010 in HBase 
1.0+) and click its <guilabel>Metrics
-                Dump</guilabel> link.
-            To include metrics descriptions in the listing &#x2014; this can 
be useful when you are exploring
-            what is available &#x2014; add a query string of
-            <literal>?description=true</literal> so your URL becomes
-            <literal>http://MASTER_HOSTNAME:60010/jmx?description=true</literal>.
-            Not all beans and attributes have descriptions.
-            </para>
-        </step>
-      </procedure>
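-      <para>The same JSON dump can also be fetched from the command line, for instance with
-        <command>curl</command> (the hostname is a placeholder):</para>
-      <screen>
-$ curl 'http://REGIONSERVER_HOSTNAME:60030/jmx?description=true'
-      </screen>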
-
-      <procedure>
-        <title>Browse the JMX Output of Available Metrics</title>
-        <para>You can use many different tools to view JMX content by browsing 
MBeans. This
-          procedure uses <command>jvisualvm</command>, which is an application 
usually available in the JDK.
-            </para>
-        <step>
-          <para>Start HBase, if it is not already running.</para>
-        </step>
-        <step>
-          <para>Run the <command>jvisualvm</command> command on a host
with a GUI display.
-            You can launch it from the command line or another method 
appropriate for your operating
-            system.</para>
-        </step>
-        <step>
-          <para>Be sure the <guilabel>VisualVM-MBeans</guilabel> plugin is 
installed. Browse to <menuchoice>
-              <guimenu>Tools</guimenu>
-              <guimenuitem>Plugins</guimenuitem>
-            </menuchoice>. Click <guilabel>Installed</guilabel> and check 
whether the plugin is
-            listed. If not, click <guilabel>Available Plugins</guilabel>, 
select it, and click
-              <guibutton>Install</guibutton>. When finished, click
-            <guibutton>Close</guibutton>.</para>
-        </step>
-        <step>
-          <para>To view details for a given HBase process, double-click the 
process in the
-              <guilabel>Local</guilabel> sub-tree in the left-hand panel. A 
detailed view opens in
-            the right-hand panel. Click the <guilabel>MBeans</guilabel> tab 
which appears as a tab
-            in the top of the right-hand panel.</para>
-        </step>
-        <step>
-          <para>To access the HBase metrics, navigate to the appropriate 
sub-bean:</para>
-          <itemizedlist>
-            <listitem>
-              <para>Master: <menuchoice>
-                  <guimenu>Hadoop</guimenu>
-                  <guisubmenu>HBase</guisubmenu>
-                  <guisubmenu>Master</guisubmenu>
-                  <guisubmenu>Server</guisubmenu>
-                </menuchoice></para>
-            </listitem>
-            <listitem>
-              <para>RegionServer: <menuchoice>
-                  <guimenu>Hadoop</guimenu>
-                  <guisubmenu>HBase</guisubmenu>
-                  <guisubmenu>RegionServer</guisubmenu>
-                  <guisubmenu>Server</guisubmenu>
-                </menuchoice></para>
-            </listitem>
-          </itemizedlist>
-        </step>
-        <step>
-          <para>The name of each metric and its current value is displayed in 
the
-              <guilabel>Attributes</guilabel> tab. For a view which includes 
more details, including
-            the description of each attribute, click the 
<guilabel>Metadata</guilabel> tab.</para>
-        </step>
-      </procedure>
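-      <para>If the HBase process runs on a host other than the one with the GUI, JMX must be
-        reachable remotely. The following <filename>conf/hbase-env.sh</filename> fragment is only
-        a sketch using the standard JVM remote-JMX system properties; the variable name and port
-        numbers are illustrative, and authentication and SSL are disabled here for brevity, which
-        is not recommended for production:</para>
-      <screen>
-export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
-export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
-export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
-      </screen>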
-    </section>
-    <section>
-      <title>Units of Measure for Metrics</title>
-      <para>Different metrics are expressed in different units, as 
appropriate. Often, the unit of
-        measure is in the name (as in the metric <code>shippedKBs</code>). 
Otherwise, use the
-        following guidelines. When in doubt, you may need to examine the 
source for a given
-        metric.</para>
-      <itemizedlist>
-        <listitem>
-          <para>Metrics that refer to a point in time are usually expressed as 
a timestamp.</para>
-        </listitem>
-        <listitem>
-          <para>Metrics that refer to an age (such as 
<code>ageOfLastShippedOp</code>) are usually
-            expressed in milliseconds.</para>
-        </listitem>
-        <listitem>
-          <para>Metrics that refer to memory sizes are in bytes.</para>
-        </listitem>
-        <listitem>
-          <para>Sizes of queues (such as <code>sizeOfLogQueue</code>) are expressed as the
-            number of items in the queue. Estimate the amount of data involved by multiplying by
-            the block size (default is 64 MB in HDFS); for example, a
-            <code>sizeOfLogQueue</code> of 5 corresponds to roughly 5 x 64 MB = 320 MB of WAL
-            data.</para>
-        </listitem>
-        <listitem>
-          <para>Metrics that refer to things like the number of a given type 
of operations (such as
-              <code>logEditsRead</code>) are expressed as an integer.</para>
-        </listitem>
-      </itemizedlist>
-    </section>
-    <section xml:id="master_metrics">
-      <title>Most Important Master Metrics</title>
-      <para>Note: Counts are usually over the last metrics reporting 
interval.</para>
-      <variablelist>
-        <varlistentry>
-          <term>hbase.master.numRegionServers</term>
-          <listitem><para>Number of live regionservers</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.master.numDeadRegionServers</term>
-          <listitem><para>Number of dead regionservers</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.master.ritCount </term>
-          <listitem><para>The number of regions in transition</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.master.ritCountOverThreshold</term>
-          <listitem><para>The number of regions that have been in transition 
longer than
-            a threshold time (default: 60 seconds)</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.master.ritOldestAge</term>
-          <listitem><para>The age, in milliseconds, of the region that has been in transition
-            the longest</para></listitem>
-        </varlistentry>
-      </variablelist>
-    </section>
-    <section xml:id="rs_metrics">
-      <title>Most Important RegionServer Metrics</title>
-      <para>Note: Counts are usually over the last metrics reporting 
interval.</para>
-      <variablelist>
-        <varlistentry>
-          <term>hbase.regionserver.regionCount</term>
-          <listitem><para>The number of regions hosted by the 
regionserver</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.storeFileCount</term>
-          <listitem><para>The number of store files on disk currently managed 
by the
-            regionserver</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.storeFileSize</term>
-          <listitem><para>Aggregate size of the store files on 
disk</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.hlogFileCount</term>
-          <listitem><para>The number of write ahead logs not yet 
archived</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.totalRequestCount</term>
-          <listitem><para>The total number of requests 
received</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.readRequestCount</term>
-          <listitem><para>The number of read requests 
received</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.writeRequestCount</term>
-          <listitem><para>The number of write requests 
received</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.numOpenConnections</term>
-          <listitem><para>The number of open connections at the RPC 
layer</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.numActiveHandler</term>
-          <listitem><para>The number of RPC handlers actively servicing 
requests</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.numCallsInGeneralQueue</term>
-          <listitem><para>The number of currently enqueued user 
requests</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.numCallsInReplicationQueue</term>
-          <listitem><para>The number of currently enqueued operations received 
from
-            replication</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.numCallsInPriorityQueue</term>
-          <listitem><para>The number of currently enqueued priority (internal 
housekeeping)
-            requests</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.flushQueueLength</term>
-          <listitem><para>Current depth of the memstore flush queue. If 
increasing, we are falling
-            behind with clearing memstores out to HDFS.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.updatesBlockedTime</term>
-          <listitem><para>Number of milliseconds updates have been blocked so 
the memstore can be
-            flushed</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.compactionQueueLength</term>
-          <listitem><para>Current depth of the compaction request queue. If 
increasing, we are
-            falling behind with storefile compaction.</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.blockCacheHitCount</term>
-          <listitem><para>The number of block cache hits</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.blockCacheMissCount</term>
-          <listitem><para>The number of block cache misses</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.blockCacheExpressHitPercent </term>
-          <listitem><para>The percent of the time that requests with the cache 
turned on hit the
-            cache</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.percentFilesLocal</term>
-          <listitem><para>Percent of store file data that can be read from the 
local DataNode,
-            0-100</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.&lt;op&gt;_&lt;measure&gt;</term>
-          <listitem><para>Operation latencies, where &lt;op&gt; is one of 
Append, Delete, Mutate,
-            Get, Replay, Increment; and where &lt;measure&gt; is one of min, 
max, mean, median,
-            75th_percentile, 95th_percentile, 99th_percentile</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.slow&lt;op&gt;Count </term>
-          <listitem><para>The number of operations we thought were slow, where 
&lt;op&gt; is one
-            of the list above</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.GcTimeMillis</term>
-          <listitem><para>Time spent in garbage collection, in 
milliseconds</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.GcTimeMillisParNew</term>
-          <listitem><para>Time spent in garbage collection of the young 
generation, in
-            milliseconds</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.GcTimeMillisConcurrentMarkSweep</term>
-          <listitem><para>Time spent in garbage collection of the old 
generation, in
-            milliseconds</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.authenticationSuccesses</term>
-          <listitem><para>Number of client connections where authentication 
succeeded</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.authenticationFailures</term>
-          <listitem><para>Number of client connection authentication 
failures</para></listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>hbase.regionserver.mutationsWithoutWALCount </term>
-          <listitem><para>Count of writes submitted with a flag indicating 
they should bypass the
-            write ahead log</para></listitem>
-        </varlistentry>
-      </variablelist>
-    </section>
-  </section>      
-
-  <section
-    xml:id="ops.monitoring">
-    <title>HBase Monitoring</title>
-    <section
-      xml:id="ops.monitoring.overview">
-      <title>Overview</title>
-      <para>The following metrics are arguably the most important to monitor 
for each RegionServer
-        for "macro monitoring", preferably with a system like <link
-          xlink:href="http://opentsdb.net/";>OpenTSDB</link>. If your cluster 
is having performance
-        issues it's likely that you'll see something unusual with this group. 
</para>
-      <itemizedlist>
-        <title>HBase:</title>
-        <listitem>
-          <para>See <xref
-              linkend="rs_metrics" /></para>
-        </listitem>
-      </itemizedlist>
-
-      <itemizedlist>
-        <title>OS:</title>
-        <listitem>
-          <para>IO Wait</para>
-        </listitem>
-        <listitem>
-          <para>User CPU</para>
-        </listitem>
-      </itemizedlist>
-      <itemizedlist>
-        <title>Java:</title>
-        <listitem>
-          <para>GC</para>
-        </listitem>
-      </itemizedlist>
-      <para> For more information on HBase metrics, see <xref
-          linkend="hbase_metrics" />. </para>
-    </section>
-
-    <section
-      xml:id="ops.slow.query">
-      <title>Slow Query Log</title>
-      <para>The HBase slow query log consists of parseable JSON structures 
describing the properties
-        of those client operations (Gets, Puts, Deletes, etc.) that either 
took too long to run, or
-        produced too much output. The thresholds for "too long to run" and 
"too much output" are
-        configurable, as described below. The output is produced inline in the 
main region server
-        logs so that it is easy to discover further details from context with 
other logged events.
-        It is also prepended with identifying tags 
<constant>(responseTooSlow)</constant>,
-          <constant>(responseTooLarge)</constant>, 
<constant>(operationTooSlow)</constant>, and
-          <constant>(operationTooLarge)</constant> in order to enable easy 
filtering with grep, in
-        case the user desires to see only slow queries. </para>
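-      <para>For example, assuming the default log location under the HBase log directory (the
-        path below is illustrative), the slow-response entries can be pulled out with
-        <command>grep</command>:</para>
-      <screen>
-$ grep 'responseTooSlow' logs/hbase-*-regionserver-*.log
-      </screen>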
-
-      <section>
-        <title>Configuration</title>
-        <para>There are two configuration knobs that can be used to adjust the thresholds for
-          when queries are logged; a sketch of setting them in
-          <filename>hbase-site.xml</filename> follows this list. </para>
-
-        <itemizedlist>
-          <listitem>
-            <para><varname>hbase.ipc.warn.response.time</varname> Maximum 
number of milliseconds
-              that a query can be run without being logged. Defaults to 10000, 
or 10 seconds. Can be
-              set to -1 to disable logging by time. </para>
-          </listitem>
-          <listitem>
-            <para><varname>hbase.ipc.warn.response.size</varname> Maximum byte 
size of response that
-              a query can return without being logged. Defaults to 100 
megabytes. Can be set to -1
-              to disable logging by size. </para>
-          </listitem>
-        </itemizedlist>
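-        <para>As a sketch, both thresholds could be lowered in
-          <filename>hbase-site.xml</filename> as follows (the values are illustrative; restart the
-          region servers for the changes to take effect):</para>
-        <programlisting><![CDATA[
-<property>
-  <name>hbase.ipc.warn.response.time</name>
-  <value>5000</value>
-</property>
-<property>
-  <name>hbase.ipc.warn.response.size</name>
-  <value>52428800</value>
-</property>
-]]></programlisting>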
-      </section>
-
-      <section>
-        <title>Metrics</title>
-        <para>The slow query log exposes two metrics to JMX.</para>
-        <itemizedlist>
-          <listitem>
-            <para><varname>hadoop.regionserver_rpc_slowResponse</varname> A global metric
-              reflecting
-              the durations of all responses that triggered logging.</para>
-          </listitem>
-          <listitem>
-            
<para><varname>hadoop.regionserver_rpc_methodName.aboveOneSec</varname> A metric
-              reflecting the durations of all responses that lasted for more 
than one second.</para>
-          </listitem>
-        </itemizedlist>
-
-      </section>
-
-      <section>
-        <title>Output</title>
-        <para>The output is tagged with the operation, e.g.
-          <constant>(operationTooSlow)</constant>, if the call was a client operation, such as a
-          Put, Get, or Delete, for which detailed fingerprint information is exposed. If not, it
-          is tagged <constant>(responseTooSlow)</constant> and still produces parseable JSON
-          output, but with less verbose information, concerning only the duration and size of the
-          RPC itself. <constant>TooLarge</constant> is
-          substituted for <constant>TooSlow</constant> if the response size triggered the logging,
-          with <constant>TooLarge</constant> appearing even in the case that both size and duration
-          triggered logging. </para>
-      </section>
-      <section>
-        <title>Example</title>
-        <para>
-          <programlisting>2011-09-08 10:01:25,824 WARN 
org.apache.hadoop.ipc.HBaseServer: (operationTooSlow): 
{"tables":{"riley2":{"puts":[{"totalColumns":11,"families":{"actions":[{"timestamp":1315501284459,"qualifier":"0","vlen":9667580},{"timestamp":1315501284459,"qualifier":"1","vlen":10122412},{"timestamp":1315501284459,"qualifier":"2","vlen":11104617},{"timestamp":1315501284459,"qualifier":"3","vlen":13430635}]},"row":"cfcd208495d565ef66e7dff9f98764da:0"}],"families":["actions"]}},"processingtimems":956,"client":"10.47.34.63:33623","starttimems":1315501284456,"queuetimems":0,"totalPuts":1,"class":"HRegionServer","responsesize":0,"method":"multiPut"}</programlisting>
-        </para>
-
-        <para>Note that everything inside the "tables" structure is output 
produced by MultiPut's
-          fingerprint, while the rest of the information is RPC-specific, such 
as processing time
-          and client IP/port. Other client operations follow the same pattern 
and the same general
-          structure, with necessary differences due to the nature of the 
individual operations. In
-          the case that the call is not a client operation, that detailed 
fingerprint information
-          will be completely absent. </para>
-
-        <para>This particular example indicates that the likely cause of slowness is simply a
-          very large (on the order of 100MB) multiput, as we can tell from the "vlen", or
-          value length, fields of each put in the multiPut. </para>
-      </section>
-    </section>
-    <section>
-      <title>Block Cache Monitoring</title>
-      <para>Starting with HBase 0.98, the HBase Web UI includes the ability to 
monitor and report on
-        the performance of the block cache. To view the block cache reports, 
click <menuchoice>
-          <guimenu>Tasks</guimenu>
-          <guisubmenu>Show Non-RPC Tasks</guisubmenu>
-          <guimenuitem>Block Cache</guimenuitem>
-        </menuchoice>. Following are a few examples of the reporting 
capabilities.</para>
-      <figure>
-        <title>Basic Info</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="bc_basic.png" width="100%"/>
-          </imageobject>
-          <caption>
-            <para>Shows the cache implementation</para>
-          </caption>
-        </mediaobject>
-      </figure>
-      <figure>
-        <title>Config</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="bc_config.png" width="100%"/>
-          </imageobject>
-          <caption>
-            <para>Shows all cache configuration options.</para>
-          </caption>
-        </mediaobject>
-      </figure>
-      <figure>
-        <title>Stats</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="bc_stats.png" width="100%"/>
-          </imageobject>
-          <caption>
-            <para>Shows statistics about the performance of the cache.</para>
-          </caption>
-        </mediaobject>
-      </figure>
-      <figure>
-        <title>L1 and L2</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="bc_l1.png" width="100%"/>
-          </imageobject>
-          <imageobject>
-            <imagedata fileref="bc_l2_buckets.png" width="100%"/>
-          </imageobject>
-          <caption>
-            <para>Shows information about the L1 and L2 caches.</para>
-          </caption>
-        </mediaobject>
-      </figure>
-      <para>This is not an exhaustive list of all the screens and reports 
available. Have a look in
-        the Web UI.</para>
-    </section>
-
-
-
-  </section>
-
-  <section
-    xml:id="cluster_replication">
-    <title>Cluster Replication</title>
-    <note>
-      <para>This information was previously available at <link
-          xlink:href="http://hbase.apache.org/replication.html";>Cluster 
Replication</link>. </para>
-    </note>
-    <para>HBase provides a cluster replication mechanism which allows you to 
keep one cluster's
-      state synchronized with that of another cluster, using the write-ahead 
log (WAL) of the source
-      cluster to propagate the changes. Some use cases for cluster replication 
include:</para>
-    <itemizedlist>
-      <listitem><para>Backup and disaster recovery</para></listitem>
-      <listitem><para>Data aggregation</para></listitem>
-      <listitem><para>Geographic data distribution</para></listitem>
-      <listitem><para>Online data ingestion combined with offline data 
analytics</para></listitem>
-    </itemizedlist>
-    <note><para>Replication is enabled at the granularity of the column 
family. Before enabling
-      replication for a column family, create the table and all column 
families to be replicated, on
-      the destination cluster.</para></note>
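-    <para>For illustration only (the table and column family names are placeholders), replication
-      is switched on for a column family from the HBase shell by setting its
-      <varname>REPLICATION_SCOPE</varname>:</para>
-    <screen>
-hbase> disable 'example_table'
-hbase> alter 'example_table', {NAME => 'cf1', REPLICATION_SCOPE => '1'}
-hbase> enable 'example_table'
-    </screen>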
-    <para>Cluster replication uses a source-push methodology. An HBase cluster 
can be a source (also
-      called master or active, meaning that it is the originator of new data), 
a destination (also
-      called slave or passive, meaning that it receives data via replication), 
or can fulfill both
-      roles at once. Replication is asynchronous, and the goal of replication 
is eventual
-      consistency. When the source receives an edit to a column family with 
replication enabled,
-      that edit is propagated to all destination clusters using the WAL for that column
-      family on the RegionServer managing the relevant region.</para>
-    <para>When data is replicated from one cluster to another, the original 
source of the data is
-      tracked via a cluster ID which is part of the metadata. In HBase 0.96 
and newer (<link
-        
xlink:href="https://issues.apache.org/jira/browse/HBASE-7709";>HBASE-7709</link>),
 all
-      clusters which have already consumed the data are also tracked. This 
prevents replication
-      loops.</para>
-    <para>The WALs for each region server must be kept in HDFS as long as they 
are needed to
-      replicate data to any slave cluster. Each region server reads from the 
oldest log it needs to
-      replicate and keeps track of its progress processing WALs inside 
ZooKeeper to simplify failure
-      recovery. The position marker which indicates a slave cluster's 
progress, as well as the queue
-      of WALs to process, may be different for every slave cluster.</para>
-    <para>The clusters participating in replication can be of different sizes. 
The master cluster
-      relies on randomization to attempt to balance the stream of replication 
on the slave clusters.
-      It is expected that the slave cluster has storage capacity to hold the 
replicated data, as
-      well as any data it is responsible for ingesting. If a slave cluster 
does run out of room, or
-      is inaccessible for other reasons, it throws an error and the master 
retains the WAL and
-      retries the replication at intervals.</para>
-    <note>
-      <title>Terminology Changes</title>
-      <para>Previously, terms such as <firstterm>master-master</firstterm>,
-          <firstterm>master-slave</firstterm>, and 
<firstterm>cyclical</firstterm> were used to
-        describe replication relationships in HBase. These terms added 
confusion, and have been
-        abandoned in favor of discussions about cluster topologies appropriate 
for different
-        scenarios.</para>
-    </note>
-    <itemizedlist>
-    <title>Cluster Topologies</title>
-    <listitem>
-      <para>A central source cluster might propagate changes out to multiple 
destination clusters,
-        for failover or due to geographic distribution.</para>
-    </listitem>
-      <listitem>
-      <para>A source cluster might push changes to a destination cluster, 
which might also push
-        its own changes back to the original cluster.</para></listitem>
-      <listitem>
-      <para>Many different low-latency clusters might push changes to one 
centralized cluster for
-        backup or resource-intensive data analytics jobs. The processed data 
might then be
-        replicated back to the low-latency clusters.</para>
-      </listitem>
-    </itemizedlist>
-    <para>Multiple levels of replication may be chained together to suit your 
organization's needs.
-      The following diagram shows a hypothetical scenario. Use the arrows to 
follow the data
-      paths.</para>
-    <figure>
-      <title>Example of a Complex Cluster Replication Configuration</title>
-      <mediaobject>
-        <imageobject><imagedata 
fileref="hbase_replication_diagram.jpg"/></imageobject>
-      </mediaobject>
-      <caption>
-        <para>At the top of the diagram, the San Jose and Tokyo clusters, 
shown in red,
-          replicate changes to each other, and each also replicates changes to 
a User Data and a
-          Payment Data cluster.</para>
-        <para>Each cluster in the second row, shown in blue, replicates its 
changes to the All Data
-          Backup 1 cluster, shown in grey. The All Data Backup 1 cluster 
replicates changes to the
-          All Data Backup 2 cluster (also shown in grey), as well as the Data 
Analysis cluster
-          (shown in green). All Data Backup 2 also propagates any of its own 
changes back to All
-          Data Backup 1.</para>
-        <para>The Data Analysis cluster runs MapReduce jobs on its data, and 
then pushes the
-          processed data back to the San Jose and Tokyo clusters.</para>
-      </caption>
-    </figure>
-    
-    <para>HBase replication borrows many concepts from the <firstterm><link
-          
xlink:href="http://dev.mysql.com/doc/refman/5.1/en/replication-formats.html";
-          >statement-based replication</link></firstterm> design used by 
MySQL. Instead of SQL
-      statements, entire WALEdits (consisting of multiple cell inserts coming 
from Put and Delete
-      operations on the clients) are replicated in order to maintain 
atomicity. </para>
-    
-    <section>
-      <title>Configuring Cluster Replication</title>
-      <para>The following is a simplified procedure for configuring cluster 
replication. It may not
-        cover every edge case. For more information, see the <link
-          
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements";
-          > API documentation for replication</link>.</para>
-      <itemizedlist>
-        <listitem>
-          <para>Configure and start the source and destination clusters. 
Create tables with the same
-            names and column 

<TRUNCATED>
