http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/hbase_mob.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/hbase_mob.adoc 
b/src/main/asciidoc/_chapters/hbase_mob.adoc
new file mode 100644
index 0000000..3f67181
--- /dev/null
+++ b/src/main/asciidoc/_chapters/hbase_mob.adoc
@@ -0,0 +1,236 @@
+////
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+////
+
+[[hbase_mob]]
+== Storing Medium-sized Objects (MOB)
+:doctype: book
+:numbered:
+:toc: left
+:icons: font
+:experimental:
+:source-language: java
+
+Data comes in many sizes, and saving all of your data in HBase, including 
binary
+data such as images and documents, is ideal. While HBase can technically handle
+binary objects with cells that are larger than 100 KB in size, HBase's normal
+read and write paths are optimized for values smaller than 100KB in size. When
+HBase deals with large numbers of objects over this threshold, referred to here
+as medium objects, or MOBs, performance is degraded due to write amplification
+caused by splits and compactions. When using MOBs, ideally your objects will 
be between
+100KB and 10MB. HBase ***FIX_VERSION_NUMBER*** adds support
+for better managing large numbers of MOBs while maintaining performance,
+consistency, and low operational overhead. MOB support is provided by the work
+done in link:https://issues.apache.org/jira/browse/HBASE-11339[HBASE-11339]. To
+take advantage of MOB, you need to use <<hfilev3,HFile version 3>>. Optionally,
+configure the MOB file reader's cache settings for each RegionServer (see
+<<mob.cache.configure>>), then configure specific columns to hold MOB data.
+Client code does not need to change to take advantage of HBase MOB support. The
+feature is transparent to the client.
+
+=== Configuring Columns for MOB
+
+You can configure columns to support MOB during table creation or alteration,
+either in HBase Shell or via the Java API. The two relevant properties are the
+boolean `IS_MOB` and the `MOB_THRESHOLD`, which is the number of bytes at which
+an object is considered to be a MOB. Only `IS_MOB` is required. If you do not
+specify the `MOB_THRESHOLD`, the default threshold value of 100 KB is used.
+
+.Configure a Column for MOB Using HBase Shell
+====
+----
+hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400}
+hbase> alter 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 102400}
+----
+====
+
+.Configure a Column for MOB Using the Java API
+====
+[source,java]
+----
+...
+HColumnDescriptor hcd = new HColumnDescriptor("f");
+hcd.setMobEnabled(true);
+...
+hcd.setMobThreshold(102400L);
+...
+----
+====
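+
+For context, the following is a minimal sketch of creating a new table with a
+MOB-enabled column family through the `Admin` API. The table and family names are
+illustrative, imports are omitted as in the example above, and it assumes the
+`HColumnDescriptor` MOB setters shown there.
+
+[source,java]
+----
+Configuration conf = HBaseConfiguration.create();
+try (Connection connection = ConnectionFactory.createConnection(conf);
+     Admin admin = connection.getAdmin()) {
+  HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("t1"));
+  HColumnDescriptor hcd = new HColumnDescriptor("f1");
+  hcd.setMobEnabled(true);       // mark the family as a MOB family
+  hcd.setMobThreshold(102400L);  // cells over 100 KB are treated as MOBs
+  htd.addFamily(hcd);
+  admin.createTable(htd);
+}
+----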
+
+
+=== Testing MOB
+
+The utility `org.apache.hadoop.hbase.IntegrationTestIngestMOB` is provided to 
assist with testing
+the MOB feature. The utility is run as follows:
+[source,bash]
+----
+$ sudo -u hbase hbase org.apache.hadoop.hbase.IntegrationTestIngestMOB \
+            -threshold 102400 \
+            -minMobDataSize 512 \
+            -maxMobDataSize 5120
+----
+
+* `*threshold*` is the threshold at which cells are considered to be MOBs.
+   The default is 1 kB, expressed in bytes.
+* `*minMobDataSize*` is the minimum value for the size of MOB data.
+   The default is 512 B, expressed in bytes.
+* `*maxMobDataSize*` is the maximum value for the size of MOB data.
+   The default is 5 kB, expressed in bytes.
+
+
+[[mob.cache.configure]]
+=== Configuring the MOB Cache
+
+
+Because there can be a large number of MOB files at any time, as compared to 
the number of HFiles,
+MOB files are not always kept open. The MOB file reader cache is an LRU cache 
which keeps the most
+recently used MOB files open. To configure the MOB file reader's cache on each 
RegionServer, add
+the following properties to the RegionServer's `hbase-site.xml`, customize the 
configuration to
+suit your environment, and restart or rolling restart the RegionServer.
+
+.Example MOB Cache Configuration
+====
+[source,xml]
+----
+<property>
+    <name>hbase.mob.file.cache.size</name>
+    <value>1000</value>
+    <description>
+      Number of opened file handlers to cache.
+      A larger value will benefit reads by providing more file handlers per mob
+      file cache and would reduce frequent file opening and closing.
+      However, if this is set too high, this could lead to a "too many opened
+      file handlers" error.
+      The default value is 1000.
+    </description>
+</property>
+<property>
+    <name>hbase.mob.cache.evict.period</name>
+    <value>3600</value>
+    <description>
+      The amount of time in seconds after which an unused file is evicted from 
the
+      MOB cache. The default value is 3600 seconds.
+    </description>
+</property>
+<property>
+    <name>hbase.mob.cache.evict.remain.ratio</name>
+    <value>0.5f</value>
+    <description>
+      A multiplier (between 0.0 and 1.0) that determines how many files remain
+      cached after an eviction is triggered by reaching the
+      `hbase.mob.file.cache.size` threshold.
+      The default value is 0.5f, which means that half the files (the 
least-recently-used
+      ones) are evicted.
+    </description>
+</property>
+----
+====
+
+=== MOB Optimization Tasks
+
+==== Manually Compacting MOB Files
+
+To manually compact MOB files, rather than waiting for the
+<<mob.cache.configure,configuration>> to trigger compaction, use the
+`compact_mob` or `major_compact_mob` HBase shell commands. These commands
+require the first argument to be the table name, and take an optional column
+family as the second argument. If the column family is omitted, all MOB-enabled
+column families are compacted.
+
+----
+hbase> compact_mob 't1', 'c1'
+hbase> compact_mob 't1'
+hbase> major_compact_mob 't1', 'c1'
+hbase> major_compact_mob 't1'
+----
+
+These commands are also available via the `Admin.compactMob` and
+`Admin.majorCompactMob` methods.
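+
+If you prefer the Java client, the same compactions can be requested programmatically.
+This is a sketch only; it assumes the `Admin.compactMob(TableName, byte[])` and
+`Admin.majorCompactMob(TableName, byte[])` methods introduced by HBASE-11339, and the
+table and family names are illustrative.
+
+[source,java]
+----
+try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
+     Admin admin = connection.getAdmin()) {
+  // Compact the MOB files of column family 'c1' in table 't1'
+  admin.compactMob(TableName.valueOf("t1"), Bytes.toBytes("c1"));
+  // Major-compact the MOB files of the same column family
+  admin.majorCompactMob(TableName.valueOf("t1"), Bytes.toBytes("c1"));
+}
+----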
+
+==== MOB Sweeper
+
+HBase MOB includes a MapReduce job called the Sweeper tool for
+optimization. The Sweeper tool coalesces small MOB files or MOB files with many
+deletions or updates. The Sweeper tool is not required if you use native MOB 
compaction, which
+does not rely on MapReduce.
+
+To configure the Sweeper tool, set the following options:
+
+[source,xml]
+----
+<property>
+    <name>hbase.mob.sweep.tool.compaction.ratio</name>
+    <value>0.5f</value>
+    <description>
+      If there are too many cells deleted in a mob file, it's regarded
+      as an invalid file and needs to be merged.
+      If existingCellsSize/mobFileSize is less than ratio, it's regarded
+      as an invalid file. The default value is 0.5f.
+    </description>
+</property>
+<property>
+    <name>hbase.mob.sweep.tool.compaction.mergeable.size</name>
+    <value>134217728</value>
+    <description>
+      If the size of a mob file is less than this value, it's regarded as a 
small
+      file and needs to be merged. The default value is 128MB.
+    </description>
+</property>
+<property>
+    <name>hbase.mob.sweep.tool.compaction.memstore.flush.size</name>
+    <value>134217728</value>
+    <description>
+      The flush size for the memstore used by the sweep job. Each sweep reducer 
owns such a memstore.
+      The default value is 128MB.
+    </description>
+</property>
+<property>
+    <name>hbase.master.mob.ttl.cleaner.period</name>
+    <value>86400</value>
+    <description>
+      The period that ExpiredMobFileCleanerChore runs. The unit is second.
+      The default value is one day.
+    </description>
+</property>
+----
+
+Next, add the HBase install directory, _`$HBASE_HOME`/*_, and the HBase library
+directory to _yarn-site.xml_. Adjust this example to suit your environment.
+[source,xml]
+----
+<property>
+    <description>Classpath for typical applications.</description>
+    <name>yarn.application.classpath</name>
+    <value>
+        $HADOOP_CONF_DIR,
+        $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
+        $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
+        $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
+        $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,
+        $HBASE_HOME/*, $HBASE_HOME/lib/*
+    </value>
+</property>
+----
+
+Finally, run the `sweeper` tool for each column family which is configured for MOB.
+[source,bash]
+----
+$ ./bin/hbase org.apache.hadoop.hbase.mob.compactions.Sweeper _tableName_ _familyName_
+----

http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/hbck_in_depth.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/hbck_in_depth.adoc 
b/src/main/asciidoc/_chapters/hbck_in_depth.adoc
index 1b30c59..1e1f9fb 100644
--- a/src/main/asciidoc/_chapters/hbck_in_depth.adoc
+++ b/src/main/asciidoc/_chapters/hbck_in_depth.adoc
@@ -29,7 +29,7 @@
 :experimental:
 
 HBaseFsck (hbck) is a tool for checking for region consistency and table 
integrity problems and repairing a corrupted HBase.
-It works in two basic modes -- a read-only inconsistency identifying mode and 
a multi-phase read-write repair mode. 
+It works in two basic modes -- a read-only inconsistency identifying mode and 
a multi-phase read-write repair mode.
 
 === Running hbck to identify inconsistencies
 
@@ -42,10 +42,10 @@ $ ./bin/hbase hbck
 ----
 
 At the end of the commands output it prints OK or tells you the number of 
INCONSISTENCIES present.
-You may also want to run run hbck a few times because some inconsistencies can 
be transient (e.g.
+You may also want to run hbck a few times because some inconsistencies can be 
transient (e.g.
 cluster is starting up or a region is splitting). Operationally you may want 
to run hbck regularly and setup alert (e.g.
 via nagios) if it repeatedly reports inconsistencies . A run of hbck will 
report a list of inconsistencies along with a brief description of the regions 
and tables affected.
-The using the `-details` option will report more details including a 
representative listing of all the splits present in all the tables. 
+Using the `-details` option will report more details including a 
representative listing of all the splits present in all the tables.
 
 [source,bourne]
 ----
@@ -66,9 +66,9 @@ $ ./bin/hbase hbck TableFoo TableBar
 === Inconsistencies
 
 If after several runs, inconsistencies continue to be reported, you may have 
encountered a corruption.
-These should be rare, but in the event they occur newer versions of HBase 
include the hbck tool enabled with automatic repair options. 
+These should be rare, but in the event they occur newer versions of HBase 
include the hbck tool enabled with automatic repair options.
 
-There are two invariants that when violated create inconsistencies in HBase: 
+There are two invariants that when violated create inconsistencies in HBase:
 
 * HBase's region consistency invariant is satisfied if every region is 
assigned and deployed on exactly one region server, and all places where this 
state kept is in accordance.
 * HBase's table integrity invariant is satisfied if for each table, every 
possible row key resolves to exactly one region.
@@ -77,20 +77,20 @@ Repairs generally work in three phases -- a read-only 
information gathering phas
 Starting from version 0.90.0, hbck could detect region consistency problems 
report on a subset of possible table integrity problems.
 It also included the ability to automatically fix the most common 
inconsistency, region assignment and deployment consistency problems.
 This repair could be done by using the `-fix` command line option.
-These problems close regions if they are open on the wrong server or on 
multiple region servers and also assigns regions to region servers if they are 
not open. 
+These problems close regions if they are open on the wrong server or on 
multiple region servers and also assigns regions to region servers if they are 
not open.
 
 Starting from HBase versions 0.90.7, 0.92.2 and 0.94.0, several new command 
line options are introduced to aid repairing a corrupted HBase.
-This hbck sometimes goes by the nickname ``uberhbck''. Each particular version 
of uber hbck is compatible with the HBase's of the same major version (0.90.7 
uberhbck can repair a 0.90.4). However, versions <=0.90.6 and versions <=0.92.1 
may require restarting the master or failing over to a backup master. 
+This hbck sometimes goes by the nickname ``uberhbck''. Each particular version 
of uberhbck is compatible with HBase installations of the same major version (a 0.90.7 
uberhbck can repair a 0.90.4). However, versions <=0.90.6 and versions <=0.92.1 
may require restarting the master or failing over to a backup master.
 
 === Localized repairs
 
 When repairing a corrupted HBase, it is best to repair the lowest risk 
inconsistencies first.
 These are generally region consistency repairs -- localized single region 
repairs, that only modify in-memory data, ephemeral zookeeper data, or patch 
holes in the META table.
 Region consistency requires that the HBase instance has the state of the 
region's data in HDFS (.regioninfo files), the region's row in the hbase:meta 
table., and region's deployment/assignments on region servers and the master in 
accordance.
-Options for repairing region consistency include: 
+Options for repairing region consistency include:
 
 * `-fixAssignments` (equivalent to the 0.90 `-fix` option) repairs unassigned, 
incorrectly assigned or multiply assigned regions.
-* `-fixMeta` which removes meta rows when corresponding regions are not 
present in HDFS and adds new meta rows if they regions are present in HDFS 
while not in META.                To fix deployment and assignment problems you 
can run this command: 
+* `-fixMeta` which removes meta rows when corresponding regions are not 
present in HDFS and adds new meta rows if the regions are present in HDFS 
while not in META. To fix deployment and assignment problems you 
can run this command:
 
 [source,bourne]
 ----
@@ -177,7 +177,7 @@ $ ./bin/hbase hbck -fixMetaOnly -fixAssignments
 ==== Special cases: HBase version file is missing
 
 HBase's data on the file system requires a version file in order to start.
-If this flie is missing, you can use the `-fixVersionFile` option to 
fabricating a new HBase version file.
+If this file is missing, you can use the `-fixVersionFile` option to 
fabricate a new HBase version file.
 This assumes that the version of hbck you are running is the appropriate 
version for the HBase cluster.
 
 ==== Special case: Root and META are corrupt.
@@ -205,8 +205,8 @@ However, there could be some lingering offline split 
parents sometimes.
 They are in META, in HDFS, and not deployed.
 But HBase can't clean them up.
 In this case, you can use the `-fixSplitParents` option to reset them in META 
to be online and not split.
-Therefore, hbck can merge them with other regions if fixing overlapping 
regions option is used. 
+Therefore, hbck can merge them with other regions if fixing overlapping 
regions option is used.
 
-This option should not normally be used, and it is not in `-fixAll`. 
+This option should not normally be used, and it is not in `-fixAll`.
 
 :numbered:

http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/mapreduce.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/mapreduce.adoc 
b/src/main/asciidoc/_chapters/mapreduce.adoc
index 2a42af2..75718fd 100644
--- a/src/main/asciidoc/_chapters/mapreduce.adoc
+++ b/src/main/asciidoc/_chapters/mapreduce.adoc
@@ -33,7 +33,9 @@ A good place to get started with MapReduce is 
http://hadoop.apache.org/docs/r2.6
 MapReduce version 2 (MR2)is now part of 
link:http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/[YARN].
 
 This chapter discusses specific configuration steps you need to take to use 
MapReduce on data within HBase.
-In addition, it discusses other interactions and issues between HBase and 
MapReduce jobs.
+In addition, it discusses other interactions and issues between HBase and 
MapReduce
+jobs. Finally, it discusses <<cascading,Cascading>>, an
+link:http://www.cascading.org/[alternative API] for MapReduce.
 
 .`mapred` and `mapreduce`
 [NOTE]
@@ -63,7 +65,7 @@ The dependencies only need to be available on the local 
`CLASSPATH`.
 The following example runs the bundled HBase 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html[RowCounter]
 MapReduce job against a table named `usertable`.
 If you have not set the environment variables expected in the command (the 
parts prefixed by a `$` sign and surrounded by curly braces), you can use the 
actual system paths instead.
 Be sure to use the correct version of the HBase JAR for your system.
-The backticks (``` symbols) cause ths shell to execute the sub-commands, 
setting the output of `hbase classpath` (the command to dump HBase CLASSPATH) 
to `HADOOP_CLASSPATH`.
+The backticks (``` symbols) cause the shell to execute the sub-commands, 
setting the output of `hbase classpath` (the command to dump HBase CLASSPATH) 
to `HADOOP_CLASSPATH`.
 This example assumes you use a BASH-compatible shell.
 
 [source,bash]
@@ -277,7 +279,7 @@ That is where the logic for map-task assignment resides.
 
 The following is an example of using HBase as a MapReduce source in read-only 
manner.
 Specifically, there is a Mapper instance but no Reducer, and nothing is being 
emitted from the Mapper.
-There job would be defined as follows...
+The job would be defined as follows...
 
 [source,java]
 ----
@@ -590,7 +592,54 @@ public class MyMapper extends TableMapper<Text, 
LongWritable> {
 == Speculative Execution
 
 It is generally advisable to turn off speculative execution for MapReduce jobs 
that use HBase as a source.
-This can either be done on a per-Job basis through properties, on on the 
entire cluster.
+This can either be done on a per-Job basis through properties, or on the 
entire cluster.
 Especially for longer running jobs, speculative execution will create 
duplicate map-tasks which will double-write your data to HBase; this is 
probably not what you want.
 
 See <<spec.ex,spec.ex>> for more information.
+
+[[cascading]]
+== Cascading
+
+link:http://www.cascading.org/[Cascading] is an alternative API for MapReduce, 
which
+actually uses MapReduce, but allows you to write your MapReduce code in a 
simplified
+way.
+
+The following example shows a Cascading `Flow` which "sinks" data into an 
HBase cluster. The same
+`hBaseTap` API could be used to "source" data as well.
+
+[source, java]
+----
+// read data from the default filesystem
+// emits two fields: "offset" and "line"
+Tap source = new Hfs( new TextLine(), inputFileLhs );
+
+// store data in an HBase cluster
+// accepts fields "num", "lower", and "upper"
+// will automatically scope incoming fields to their proper familyname, "left" 
or "right"
+Fields keyFields = new Fields( "num" );
+String[] familyNames = {"left", "right"};
+Fields[] valueFields = new Fields[] {new Fields( "lower" ), new Fields( 
"upper" ) };
+Tap hBaseTap = new HBaseTap( "multitable", new HBaseScheme( keyFields, 
familyNames, valueFields ), SinkMode.REPLACE );
+
+// a simple pipe assembly to parse the input into fields
+// a real app would likely chain multiple Pipes together for more complex 
processing
+Pipe parsePipe = new Each( "insert", new Fields( "line" ), new RegexSplitter( 
new Fields( "num", "lower", "upper" ), " " ) );
+
+// "plan" a cluster executable Flow
+// this connects the source Tap and hBaseTap (the sink Tap) to the parsePipe
+Flow parseFlow = new FlowConnector( properties ).connect( source, hBaseTap, 
parsePipe );
+
+// start the flow, and block until complete
+parseFlow.complete();
+
+// open an iterator on the HBase table we stuffed data into
+TupleEntryIterator iterator = parseFlow.openSink();
+
+while(iterator.hasNext())
+  {
+  // print out each tuple from HBase
+  System.out.println( "iterator.next() = " + iterator.next() );
+  }
+
+iterator.close();
+----

http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/ops_mgt.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/ops_mgt.adoc 
b/src/main/asciidoc/_chapters/ops_mgt.adoc
index a4dbccb..e8d44eb 100644
--- a/src/main/asciidoc/_chapters/ops_mgt.adoc
+++ b/src/main/asciidoc/_chapters/ops_mgt.adoc
@@ -199,7 +199,7 @@ $ ${HBASE_HOME}/bin/hbase 
org.apache.hadoop.hbase.tool.Canary -t 600000
 
 By default, the canary tool only check the read operations, it's hard to find 
the problem in the
 write path. To enable the write sniffing, you can run canary with the 
`-writeSniffing` option.
-When the write sniffing is enabled, the canary tool will create a hbase table 
and make sure the
+When the write sniffing is enabled, the canary tool will create an hbase table 
and make sure the
 regions of the table distributed on all region servers. In each sniffing 
period, the canary will
 try to put data to these regions to check the write availability of each 
region server.
 ----
@@ -351,7 +351,7 @@ You can invoke it via the HBase cli with the 'wal' command.
 [NOTE]
 ====
 Prior to version 2.0, the WAL Pretty Printer was called the 
`HLogPrettyPrinter`, after an internal name for HBase's write ahead log.
-In those versions, you can pring the contents of a WAL using the same 
configuration as above, but with the 'hlog' command.
+In those versions, you can print the contents of a WAL using the same 
configuration as above, but with the 'hlog' command.
 
 ----
  $ ./bin/hbase hlog 
hdfs://example.org:8020/hbase/.logs/example.org,60020,1283516293161/10.10.21.10%3A60020.1283973724012
@@ -523,7 +523,7 @@ row9        c1      c2
 row10  c1      c2
 ----
 
-For ImportTsv to use this imput file, the command line needs to look like this:
+For ImportTsv to use this input file, the command line needs to look like this:
 
 ----
 
@@ -637,10 +637,14 @@ See 
link:https://issues.apache.org/jira/browse/HBASE-4391[HBASE-4391 Add ability
 [[compaction.tool]]
 === Offline Compaction Tool
 
-See the usage for the 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/CompactionTool.html[Compaction
-          Tool].
-Run it like this +./bin/hbase
-          org.apache.hadoop.hbase.regionserver.CompactionTool+
+See the usage for the
+link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/CompactionTool.html[CompactionTool].
+Run it like:
+
+[source, bash]
+----
+$ ./bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool
+----
 
 === `hbase clean`
 
@@ -777,7 +781,7 @@ To decommission a loaded RegionServer, run the following: +$
 ====
 The `HOSTNAME` passed to _graceful_stop.sh_ must match the hostname that hbase 
is using to identify RegionServers.
 Check the list of RegionServers in the master UI for how HBase is referring to 
servers.
-Its usually hostname but can also be FQDN.
+It's usually hostname but can also be FQDN.
 Whatever HBase is using, this is what you should pass the _graceful_stop.sh_ 
decommission script.
 If you pass IPs, the script is not yet smart enough to make a hostname (or 
FQDN) of it and so it will fail when it checks if server is currently running; 
the graceful unloading of regions will not run.
 ====
@@ -817,12 +821,12 @@ Hence, it is better to manage the balancer apart from 
`graceful_stop` reenabling
 [[draining.servers]]
 ==== Decommissioning several Regions Servers concurrently
 
-If you have a large cluster, you may want to decommission more than one 
machine at a time by gracefully stopping mutiple RegionServers concurrently.
+If you have a large cluster, you may want to decommission more than one 
machine at a time by gracefully stopping multiple RegionServers concurrently.
 To gracefully drain multiple regionservers at the same time, RegionServers can 
be put into a "draining" state.
 This is done by marking a RegionServer as a draining node by creating an entry 
in ZooKeeper under the _hbase_root/draining_ znode.
 This znode has format `name,port,startcode` just like the regionserver entries 
under _hbase_root/rs_ znode.
 
-Without this facility, decommissioning mulitple nodes may be non-optimal 
because regions that are being drained from one region server may be moved to 
other regionservers that are also draining.
+Without this facility, decommissioning multiple nodes may be non-optimal 
because regions that are being drained from one region server may be moved to 
other regionservers that are also draining.
 Marking RegionServers to be in the draining state prevents this from happening.
 See this 
link:http://inchoate-clatter.blogspot.com/2012/03/hbase-ops-automation.html[blog
             post] for more details.
@@ -987,7 +991,7 @@ To configure metrics for a given region server, edit the 
_conf/hadoop-metrics2-h
 Restart the region server for the changes to take effect.
 
 To change the sampling rate for the default sink, edit the line beginning with 
`*.period`.
-To filter which metrics are emitted or to extend the metrics framework, see 
link:http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html
+To filter which metrics are emitted or to extend the metrics framework, see 
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html
 
 .HBase Metrics and Ganglia
 [NOTE]
@@ -1010,15 +1014,15 @@ Rather than listing each metric which HBase emits by 
default, you can browse thr
 Different metrics are exposed for the Master process and each region server 
process.
 
 .Procedure: Access a JSON Output of Available Metrics
-. After starting HBase, access the region server's web UI, at 
`http://REGIONSERVER_HOSTNAME:60030` by default (or port 16030 in HBase 1.0+).
+. After starting HBase, access the region server's web UI, at 
pass:[http://REGIONSERVER_HOSTNAME:60030] by default (or port 16030 in HBase 
1.0+).
 . Click the [label]#Metrics Dump# link near the top.
   The metrics for the region server are presented as a dump of the JMX bean in 
JSON format.
   This will dump out all metrics names and their values.
-  To include metrics descriptions in the listing -- this can be useful when 
you are exploring what is available -- add a query string of 
`?description=true` so your URL becomes 
`http://REGIONSERVER_HOSTNAME:60030/jmx?description=true`.
+  To include metrics descriptions in the listing -- this can be useful when 
you are exploring what is available -- add a query string of 
`?description=true` so your URL becomes 
pass:[http://REGIONSERVER_HOSTNAME:60030/jmx?description=true].
   Not all beans and attributes have descriptions.
-. To view metrics for the Master, connect to the Master's web UI instead 
(defaults to `http://localhost:60010` or port 16010 in HBase 1.0+) and click 
its [label]#Metrics
+. To view metrics for the Master, connect to the Master's web UI instead 
(defaults to pass:[http://localhost:60010] or port 16010 in HBase 1.0+) and 
click its [label]#Metrics
   Dump# link.
-  To include metrics descriptions in the listing -- this can be useful when 
you are exploring what is available -- add a query string of 
`?description=true` so your URL becomes 
`http://REGIONSERVER_HOSTNAME:60010/jmx?description=true`.
+  To include metrics descriptions in the listing -- this can be useful when 
you are exploring what is available -- add a query string of 
`?description=true` so your URL becomes 
pass:[http://REGIONSERVER_HOSTNAME:60010/jmx?description=true].
   Not all beans and attributes have descriptions.
 
 
@@ -1252,7 +1256,8 @@ Have a look in the Web UI.
 
 == Cluster Replication
 
-NOTE: This information was previously available at 
link:http://hbase.apache.org/replication.html[Cluster Replication].
+NOTE: This information was previously available at
+link:http://hbase.apache.org#replication[Cluster Replication].
 
 HBase provides a cluster replication mechanism which allows you to keep one 
cluster's state synchronized with that of another cluster, using the 
write-ahead log (WAL) of the source cluster to propagate the changes.
 Some use cases for cluster replication include:
@@ -1332,13 +1337,13 @@ list_peers:: list all replication relationships known 
by this cluster
 enable_peer <ID>::
   Enable a previously-disabled replication relationship
 disable_peer <ID>::
-  Disable a replication relationship. HBase will no longer send edits to that 
peer cluster, but it still keeps track of all the new WALs that it will need to 
replicate if and when it is re-enabled. 
+  Disable a replication relationship. HBase will no longer send edits to that 
peer cluster, but it still keeps track of all the new WALs that it will need to 
replicate if and when it is re-enabled.
 remove_peer <ID>::
   Disable and remove a replication relationship. HBase will no longer send 
edits to that peer cluster or keep track of WALs.
 enable_table_replication <TABLE_NAME>::
-  Enable the table replication switch for all it's column families. If the 
table is not found in the destination cluster then it will create one with the 
same name and column families. 
+  Enable the table replication switch for all its column families. If the 
table is not found in the destination cluster then it will create one with the 
same name and column families.
 disable_table_replication <TABLE_NAME>::
-  Disable the table replication switch for all it's column families. 
+  Disable the table replication switch for all its column families.
 
 === Verifying Replicated Data
 
@@ -1457,7 +1462,7 @@ Speed is also limited by total size of the list of edits 
to replicate per slave,
 With this configuration, a master cluster region server with three slaves 
would use at most 192 MB to store data to replicate.
 This does not account for the data which was filtered but not garbage 
collected.
 
-Once the maximum size of edits has been buffered or the reader reaces the end 
of the WAL, the source thread stops reading and chooses at random a sink to 
replicate to (from the list that was generated by keeping only a subset of 
slave region servers). It directly issues a RPC to the chosen region server and 
waits for the method to return.
+Once the maximum size of edits has been buffered or the reader reaches the end 
of the WAL, the source thread stops reading and chooses at random a sink to 
replicate to (from the list that was generated by keeping only a subset of 
slave region servers). It directly issues a RPC to the chosen region server and 
waits for the method to return.
 If the RPC was successful, the source determines whether the current file has 
been emptied or it contains more data which needs to be read.
 If the file has been emptied, the source deletes the znode in the queue.
 Otherwise, it registers the new offset in the log's znode.
@@ -1630,6 +1635,197 @@ You can use the HBase Shell command `status 
'replication'` to monitor the replic
 * `status 'replication', 'source'` -- prints the status for each replication 
source, sorted by hostname.
 * `status 'replication', 'sink'` -- prints the status for each replication 
sink, sorted by hostname.
 
+== Running Multiple Workloads On a Single Cluster
+
+HBase provides the following mechanisms for managing the performance of a 
cluster
+handling multiple workloads:
+. <<quota>>
+. <<request-queues>>
+. <<multiple-typed-queues>>
+
+[[quota]]
+=== Quotas
+HBASE-11598 introduces quotas, which allow you to throttle requests based on
+the following limits:
+
+. <<request-quotas,The number or size of requests (read, write, or read+write) 
in a given timeframe>>
+. <<namespace-quotas,The number of tables allowed in a namespace>>
+
+These limits can be enforced for a specified user, table, or namespace.
+
+.Enabling Quotas
+
+Quotas are disabled by default. To enable the feature, set the 
`hbase.quota.enabled`
+property to `true` in the _hbase-site.xml_ file for all cluster nodes.
+
+.General Quota Syntax
+. THROTTLE_TYPE can be expressed as READ, WRITE, or the default type (read +
write).
+. Timeframes can be expressed in the following units: `sec`, `min`, `hour`, 
`day`
+. Request sizes can be expressed in the following units: `B` (bytes), `K` 
(kilobytes),
+`M` (megabytes), `G` (gigabytes), `T` (terabytes), `P` (petabytes)
+. Numbers of requests are expressed as an integer followed by the string `req`
+. Limits relating to time are expressed as req/time or size/time. For instance 
`10req/day`
+or `100P/hour`.
+. Numbers of tables or regions are expressed as integers.
+
+[[request-quotas]]
+.Setting Request Quotas
+You can set quota rules ahead of time, or you can change the throttle at 
runtime. The change
+will propagate after the quota refresh period has expired. This expiration 
period
+defaults to 5 minutes. To change it, modify the `hbase.quota.refresh.period` 
property
+in `hbase-site.xml`. This property is expressed in milliseconds and defaults 
to `300000`.
+
+----
+# Limit user u1 to 10 requests per second
+hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10req/sec'
+
+# Limit user u1 to 10 read requests per second
+hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => READ, USER => 'u1', LIMIT 
=> '10req/sec'
+
+# Limit user u1 to 10 M per day everywhere
+hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => '10M/day'
+
+# Limit user u1 to 10 M write size per sec
+hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => WRITE, USER => 'u1', LIMIT 
=> '10M/sec'
+
+# Limit user u1 to 5k per minute on table t2
+hbase> set_quota TYPE => THROTTLE, USER => 'u1', TABLE => 't2', LIMIT => 
'5K/min'
+
+# Limit user u1 to 10 read requests per sec on table t2
+hbase> set_quota TYPE => THROTTLE, THROTTLE_TYPE => READ, USER => 'u1', TABLE 
=> 't2', LIMIT => '10req/sec'
+
+# Remove an existing limit from user u1 on namespace ns2
+hbase> set_quota TYPE => THROTTLE, USER => 'u1', NAMESPACE => 'ns2', LIMIT => 
NONE
+
+# Limit all users to 10 requests per hour on namespace ns1
+hbase> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '10req/hour'
+
+# Limit all users to 10 T per hour on table t1
+hbase> set_quota TYPE => THROTTLE, TABLE => 't1', LIMIT => '10T/hour'
+
+# Remove all existing limits from user u1
+hbase> set_quota TYPE => THROTTLE, USER => 'u1', LIMIT => NONE
+
+# List all quotas for user u1 in namespace ns2
+hbase> list_quotas USER => 'u1', NAMESPACE => 'ns2'
+
+# List all quotas for namespace ns2
+hbase> list_quotas NAMESPACE => 'ns2'
+
+# List all quotas for table t1
+hbase> list_quotas TABLE => 't1'
+
+# list all quotas
+hbase> list_quotas
+----
+
+You can also place a global limit and exclude a user or a table from the limit 
by applying the
+`GLOBAL_BYPASS` property.
+----
+hbase> set_quota NAMESPACE => 'ns1', LIMIT => '100req/min'               # a 
per-namespace request limit
+hbase> set_quota USER => 'u1', GLOBAL_BYPASS => true                     # 
user u1 is not affected by the limit
+----
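+
+Request quotas can also be managed from the Java client. The following is a minimal
+sketch under the assumption that the `QuotaSettingsFactory` and `Admin.setQuota`
+APIs introduced by HBASE-11598 are available; the user name and limit are illustrative.
+
+[source,java]
+----
+try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
+     Admin admin = connection.getAdmin()) {
+  // Limit user u1 to 10 requests per second, as in the shell example above
+  admin.setQuota(QuotaSettingsFactory.throttleUser("u1",
+      ThrottleType.REQUEST_NUMBER, 10, TimeUnit.SECONDS));
+  // Remove the throttle again
+  admin.setQuota(QuotaSettingsFactory.unthrottleUser("u1"));
+}
+----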
+
+[[namespace-quotas]]
+.Setting Namespace Quotas
+
+You can specify the maximum number of tables or regions allowed in a given 
namespace, either
+when you create the namespace or by altering an existing namespace, by setting the
+`hbase.namespace.quota.maxtables` or `hbase.namespace.quota.maxregions` property on the namespace.
+
+.Limiting Tables Per Namespace
+----
+# Create a namespace with a max of 5 tables
+hbase> create_namespace 'ns1', {'hbase.namespace.quota.maxtables'=>'5'}
+
+# Alter an existing namespace to have a max of 8 tables
+hbase> alter_namespace 'ns2', {METHOD => 'set', 
'hbase.namespace.quota.maxtables'=>'8'}
+
+# Show quota information for a namespace
+hbase> describe_namespace 'ns2'
+
+# Alter an existing namespace to remove a quota
+hbase> alter_namespace 'ns2', {METHOD => 'unset', 
NAME=>'hbase.namespace.quota.maxtables'}
+----
+
+.Limiting Regions Per Namespace
+----
+# Create a namespace with a max of 10 regions
+hbase> create_namespace 'ns1', {'hbase.namespace.quota.maxregions'=>'10'}
+
+# Show quota information for a namespace
+hbase> describe_namespace 'ns1'
+
+# Alter an existing namespace to have a max of 20 regions
+hbase> alter_namespace 'ns2', {METHOD => 'set', 
'hbase.namespace.quota.maxregions'=>'20'}
+
+# Alter an existing namespace to remove a quota
+hbase> alter_namespace 'ns2', {METHOD => 'unset', NAME=> 
'hbase.namespace.quota.maxregions'}
+----
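+
+Namespace quotas can also be set through the Java client. The snippet below is a
+sketch only; it assumes the `NamespaceDescriptor` builder API, and the namespace name
+and limit are illustrative.
+
+[source,java]
+----
+try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
+     Admin admin = connection.getAdmin()) {
+  // Create a namespace that may hold at most 5 tables
+  NamespaceDescriptor ns = NamespaceDescriptor.create("ns1")
+      .addConfiguration("hbase.namespace.quota.maxtables", "5")
+      .build();
+  admin.createNamespace(ns);
+}
+----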
+
+[[request-queues]]
+=== Request Queues
+If no throttling policy is configured, when the RegionServer receives multiple 
requests,
+they are now placed into a queue waiting for a free execution slot 
(HBASE-6721).
+The simplest queue is a FIFO queue, where each request waits for all previous 
requests in the queue
+to finish before running. Fast or interactive queries can get stuck behind 
large requests.
+
+If you are able to guess how long a request will take, you can reorder 
requests by
+pushing the long requests to the end of the queue and allowing short requests 
to preempt
+them. Eventually, you must still execute the large requests and prioritize the 
new
+requests behind them. The short requests will be newer, so the result is not 
terrible,
+but still suboptimal compared to a mechanism which allows large requests to be 
split
+into multiple smaller ones.
+
+HBASE-10993 introduces such a system for deprioritizing long-running scanners. 
There
+are two types of queues, `fifo` and `deadline`. To configure the type of queue 
used,
+configure the `hbase.ipc.server.callqueue.type` property in `hbase-site.xml`. 
There
+is no way to estimate how long each request may take, so de-prioritization 
only affects
+scans, and is based on the number of "next" calls a scan request has made. 
An assumption
+is made that when you are doing a full table scan, your job is not likely to 
be interactive,
+so if there are concurrent requests, you can delay long-running scans up to a 
limit tunable by
+setting the `hbase.ipc.server.queue.max.call.delay` property. The slope of the 
delay is calculated
+by a simple square root of `(numNextCall * weight)` where the weight is
+configurable by setting the `hbase.ipc.server.scan.vtime.weight` property.
+
+[[multiple-typed-queues]]
+=== Multiple-Typed Queues
+
+You can also prioritize or deprioritize different kinds of requests by 
configuring
+a specified number of dedicated handlers and queues. You can segregate the 
scan requests
+in a single queue with a single handler, and all the other available queues 
can service
+short `Get` requests.
+
+You can adjust the IPC queues and handlers based on the type of workload, 
using static
+tuning options. This approach is an interim first step that will eventually 
allow
+you to change the settings at runtime, and to dynamically adjust values based 
on the load.
+
+.Multiple Queues
+
+To avoid contention and separate different kinds of requests, configure the
+`hbase.ipc.server.callqueue.handler.factor` property, which allows you to increase
+the number of queues and control how many handlers can share the same queue.
+
+Using more queues reduces contention when adding a task to a queue or 
selecting it
+from a queue. You can even configure one queue per handler. The trade-off is 
that
+if some queues contain long-running tasks, a handler may need to wait to 
execute from that queue
+rather than stealing from another queue which has waiting tasks.
+
+.Read and Write Queues
+With multiple queues, you can now divide read and write requests, giving more 
priority
+(more queues) to one or the other type. Use the 
`hbase.ipc.server.callqueue.read.ratio`
+property to choose to serve more reads or more writes.
+
+.Get and Scan Queues
+Similar to the read/write split, you can split gets and scans by tuning the 
`hbase.ipc.server.callqueue.scan.ratio`
+property to give more priority to gets or to scans. A scan ratio of `0.1` will 
give
+more queue/handlers to the incoming gets, which means that more gets can be 
processed
+at the same time and that fewer scans can be executed at the same time. A 
value of
+`0.9` will give more queue/handlers to scans, so the number of scans executed 
will
+increase and the number of gets will decrease.
+
+
 [[ops.backup]]
 == HBase Backup
 
@@ -1853,7 +2049,7 @@ Aside from the disk space necessary to store the data, 
one RS may not be able to
 [[ops.capacity.nodes.throughput]]
 ==== Read/Write throughput
 
-Number of nodes can also be driven by required thoughput for reads and/or 
writes.
+Number of nodes can also be driven by required throughput for reads and/or 
writes.
 The throughput one can get per node depends a lot on data (esp.
 key/value sizes) and request patterns, as well as node and system 
configuration.
 Planning should be done for peak load if it is likely that the load would be 
the main driver of the increase of the node count.
@@ -2018,7 +2214,7 @@ or in code it would be as follows:
 
 [source,java]
 ----
-void rename(Admin admin, String oldTableName, String newTableName) {
+void rename(Admin admin, String oldTableName, TableName newTableName) {
   String snapshotName = randomName();
   admin.disableTable(oldTableName);
   admin.snapshot(snapshotName, oldTableName);

http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/other_info.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/other_info.adoc 
b/src/main/asciidoc/_chapters/other_info.adoc
index 046b747..6143876 100644
--- a/src/main/asciidoc/_chapters/other_info.adoc
+++ b/src/main/asciidoc/_chapters/other_info.adoc
@@ -31,50 +31,50 @@
 [[other.info.videos]]
 === HBase Videos
 
-.Introduction to HBase 
-* 
link:http://www.cloudera.com/content/cloudera/en/resources/library/presentation/chicago_data_summit_apache_hbase_an_introduction_todd_lipcon.html[Introduction
 to HBase] by Todd Lipcon (Chicago Data Summit 2011). 
-* 
link:http://www.cloudera.com/videos/intorduction-hbase-todd-lipcon[Introduction 
to HBase] by Todd Lipcon (2010).         
-link:http://www.cloudera.com/videos/hadoop-world-2011-presentation-video-building-realtime-big-data-services-at-facebook-with-hadoop-and-hbase[Building
 Real Time Services at Facebook with HBase] by Jonathan Gray (Hadoop World 
2011). 
+.Introduction to HBase
+* 
link:http://www.cloudera.com/content/cloudera/en/resources/library/presentation/chicago_data_summit_apache_hbase_an_introduction_todd_lipcon.html[Introduction
 to HBase] by Todd Lipcon (Chicago Data Summit 2011).
+* 
link:http://www.cloudera.com/videos/intorduction-hbase-todd-lipcon[Introduction 
to HBase] by Todd Lipcon (2010).
+link:http://www.cloudera.com/videos/hadoop-world-2011-presentation-video-building-realtime-big-data-services-at-facebook-with-hadoop-and-hbase[Building
 Real Time Services at Facebook with HBase] by Jonathan Gray (Hadoop World 
2011).
 
-link:http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop[HBase
 and Hadoop, Mixing Real-Time and Batch Processing at StumbleUpon] by JD Cryans 
(Hadoop World 2010). 
+link:http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop[HBase
 and Hadoop, Mixing Real-Time and Batch Processing at StumbleUpon] by JD Cryans 
(Hadoop World 2010).
 
 [[other.info.pres]]
 === HBase Presentations (Slides)
 
-link:http://www.cloudera.com/content/cloudera/en/resources/library/hadoopworld/hadoop-world-2011-presentation-video-advanced-hbase-schema-design.html[Advanced
 HBase Schema Design] by Lars George (Hadoop World 2011). 
+link:http://www.cloudera.com/content/cloudera/en/resources/library/hadoopworld/hadoop-world-2011-presentation-video-advanced-hbase-schema-design.html[Advanced
 HBase Schema Design] by Lars George (Hadoop World 2011).
 
-link:http://www.slideshare.net/cloudera/chicago-data-summit-apache-hbase-an-introduction[Introduction
 to HBase] by Todd Lipcon (Chicago Data Summit 2011). 
+link:http://www.slideshare.net/cloudera/chicago-data-summit-apache-hbase-an-introduction[Introduction
 to HBase] by Todd Lipcon (Chicago Data Summit 2011).
 
-link:http://www.slideshare.net/cloudera/hw09-practical-h-base-getting-the-most-from-your-h-base-install[Getting
 The Most From Your HBase Install] by Ryan Rawson, Jonathan Gray (Hadoop World 
2009). 
+link:http://www.slideshare.net/cloudera/hw09-practical-h-base-getting-the-most-from-your-h-base-install[Getting
 The Most From Your HBase Install] by Ryan Rawson, Jonathan Gray (Hadoop World 
2009).
 
 [[other.info.papers]]
 === HBase Papers
 
-link:http://research.google.com/archive/bigtable.html[BigTable] by Google 
(2006). 
+link:http://research.google.com/archive/bigtable.html[BigTable] by Google 
(2006).
 
-link:http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html[HBase 
and HDFS Locality] by Lars George (2010). 
+link:http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html[HBase 
and HDFS Locality] by Lars George (2010).
 
-link:http://ianvarley.com/UT/MR/Varley_MastersReport_Full_2009-08-07.pdf[No 
Relation: The Mixed Blessings of Non-Relational Databases] by Ian Varley 
(2009). 
+link:http://ianvarley.com/UT/MR/Varley_MastersReport_Full_2009-08-07.pdf[No 
Relation: The Mixed Blessings of Non-Relational Databases] by Ian Varley (2009).
 
 [[other.info.sites]]
 === HBase Sites
 
-link:http://www.cloudera.com/blog/category/hbase/[Cloudera's HBase Blog] has a 
lot of links to useful HBase information. 
+link:http://www.cloudera.com/blog/category/hbase/[Cloudera's HBase Blog] has a 
lot of links to useful HBase information.
 
-* 
link:http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/[CAP
 Confusion] is a relevant entry for background information on distributed 
storage systems.        
+* 
link:http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/[CAP
 Confusion] is a relevant entry for background information on distributed 
storage systems.
 
-link:http://wiki.apache.org/hadoop/HBase/HBasePresentations[HBase Wiki] has a 
page with a number of presentations. 
+link:http://wiki.apache.org/hadoop/HBase/HBasePresentations[HBase Wiki] has a 
page with a number of presentations.
 
-link:http://refcardz.dzone.com/refcardz/hbase[HBase RefCard] from DZone. 
+link:http://refcardz.dzone.com/refcardz/hbase[HBase RefCard] from DZone.
 
 [[other.info.books]]
 === HBase Books
 
-link:http://shop.oreilly.com/product/0636920014348.do[HBase:  The Definitive 
Guide] by Lars George. 
+link:http://shop.oreilly.com/product/0636920014348.do[HBase:  The Definitive 
Guide] by Lars George.
 
 [[other.info.books.hadoop]]
 === Hadoop Books
 
-link:http://shop.oreilly.com/product/9780596521981.do[Hadoop:  The Definitive 
Guide] by Tom White. 
+link:http://shop.oreilly.com/product/9780596521981.do[Hadoop:  The Definitive 
Guide] by Tom White.
 
 :numbered:

http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/performance.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/performance.adoc 
b/src/main/asciidoc/_chapters/performance.adoc
index 526fd01..5155f0a 100644
--- a/src/main/asciidoc/_chapters/performance.adoc
+++ b/src/main/asciidoc/_chapters/performance.adoc
@@ -88,7 +88,7 @@ Multiple rack configurations carry the same potential issues 
as multiple switche
 * Poor switch capacity performance
 * Insufficient uplink to another rack
 
-If the the switches in your rack have appropriate switching capacity to handle 
all the hosts at full speed, the next most likely issue will be caused by 
homing more of your cluster across racks.
+If the switches in your rack have appropriate switching capacity to handle all 
the hosts at full speed, the next most likely issue will be caused by homing 
more of your cluster across racks.
 The easiest way to avoid issues when spanning multiple racks is to use port 
trunking to create a bonded uplink to other racks.
 The downside of this method however, is in the overhead of ports that could 
potentially be used.
 An example of this is, creating an 8Gbps port channel from rack A to rack B, 
using 8 of your 24 ports to communicate between racks gives you a poor ROI, 
using too few however can mean you're not getting the most out of your cluster.
@@ -102,14 +102,14 @@ Are all the network interfaces functioning correctly? Are 
you sure? See the Trou
 
 [[perf.network.call_me_maybe]]
 === Network Consistency and Partition Tolerance
-The link:http://en.wikipedia.org/wiki/CAP_theorem[CAP Theorem] states that a 
distributed system can maintain two out of the following three charateristics: 
-- *C*onsistency -- all nodes see the same data. 
+The link:http://en.wikipedia.org/wiki/CAP_theorem[CAP Theorem] states that a 
distributed system can maintain two out of the following three characteristics:
+- *C*onsistency -- all nodes see the same data.
 - *A*vailability -- every request receives a response about whether it 
succeeded or failed.
 - *P*artition tolerance -- the system continues to operate even if some of its 
components become unavailable to the others.
 
-HBase favors consistency and partition tolerance, where a decision has to be 
made. Coda Hale explains why partition tolerance is so important, in 
http://codahale.com/you-cant-sacrifice-partition-tolerance/. 
+HBase favors consistency and partition tolerance, where a decision has to be 
made. Coda Hale explains why partition tolerance is so important, in 
http://codahale.com/you-cant-sacrifice-partition-tolerance/.
 
-Robert Yokota used an automated testing framework called 
link:https://aphyr.com/tags/jepsen[Jepson] to test HBase's partition tolerance 
in the face of network partitions, using techniques modeled after Aphyr's 
link:https://aphyr.com/posts/281-call-me-maybe-carly-rae-jepsen-and-the-perils-of-network-partitions[Call
 Me Maybe] series. The results, available as a 
link:http://old.eng.yammer.com/call-me-maybe-hbase/[blog post] and an 
link:http://old.eng.yammer.com/call-me-maybe-hbase-addendum/[addendum], show 
that HBase performs correctly.
+Robert Yokota used an automated testing framework called 
link:https://aphyr.com/tags/jepsen[Jepsen] to test HBase's partition tolerance 
in the face of network partitions, using techniques modeled after Aphyr's 
link:https://aphyr.com/posts/281-call-me-maybe-carly-rae-jepsen-and-the-perils-of-network-partitions[Call
 Me Maybe] series. The results, available as a 
link:https://rayokota.wordpress.com/2015/09/30/call-me-maybe-hbase/[blog post] 
and an 
link:https://rayokota.wordpress.com/2015/09/30/call-me-maybe-hbase-addendum/[addendum],
 show that HBase performs correctly.
 
 [[jvm]]
 == Java
@@ -196,7 +196,8 @@ tableDesc.addFamily(cfDesc);
 ----
 ====
 
-See the API documentation for 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html[CacheConfig].
+See the API documentation for
+link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html[CacheConfig].
 
 [[perf.rs.memstore.size]]
 === `hbase.regionserver.global.memstore.size`
@@ -546,7 +547,7 @@ To disable the WAL, see <<wal.disable>>.
 === HBase Client: Group Puts by RegionServer
 
 In addition to using the writeBuffer, grouping `Put`s by RegionServer can 
reduce the number of client RPC calls per writeBuffer flush.
-There is a utility `HTableUtil` currently on TRUNK that does this, but you can 
either copy that or implement your own version for those still on 0.90.x or 
earlier.
+There is a utility `HTableUtil` currently on MASTER that does this, but you 
can either copy that or implement your own version for those still on 0.90.x or 
earlier.
 
 [[perf.hbase.write.mr.reducer]]
 === MapReduce: Skip The Reducer
@@ -555,7 +556,7 @@ When writing a lot of data to an HBase table from a MR job 
(e.g., with link:http
 When a Reducer step is used, all of the output (Puts) from the Mapper will get 
spooled to disk, then sorted/shuffled to other Reducers that will most likely 
be off-node.
 It's far more efficient to just write directly to HBase.
 
-For summary jobs where HBase is used as a source and a sink, then writes will 
be coming from the Reducer step (e.g., summarize values then write out result). 
This is a different processing problem than from the the above case.
+For summary jobs where HBase is used as a source and a sink, then writes will 
be coming from the Reducer step (e.g., summarize values then write out result). 
This is a different processing problem than from the above case.
 
 [[perf.one.region]]
 === Anti-Pattern: One Hot Region
@@ -564,7 +565,7 @@ If all your data is being written to one region at a time, 
then re-read the sect
 
 Also, if you are pre-splitting regions and all your data is _still_ winding up 
in a single region even though your keys aren't monotonically increasing, 
confirm that your keyspace actually works with the split strategy.
 There are a variety of reasons that regions may appear "well split" but won't 
work with your data.
-As the HBase client communicates directly with the RegionServers, this can be 
obtained via 
link:hhttp://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#getRegionLocation(byte[])[Table.getRegionLocation].
+As the HBase client communicates directly with the RegionServers, this can be 
obtained via 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#getRegionLocation(byte%5B%5D)[Table.getRegionLocation].
 
 See <<precreate.regions>>, as well as <<perf.configurations>>
 
@@ -606,7 +607,7 @@ When columns are selected explicitly with `scan.addColumn`, 
HBase will schedule
 When rows have few columns and each column has only a few versions this can be 
inefficient.
 A seek operation is generally slower if does not seek at least past 5-10 
columns/versions or 512-1024 bytes.
 
-In order to opportunistically look ahead a few columns/versions to see if the 
next column/version can be found that way before a seek operation is scheduled, 
a new attribute `Scan.HINT_LOOKAHEAD` can be set the on Scan object.
+In order to opportunistically look ahead a few columns/versions to see if the 
next column/version can be found that way before a seek operation is scheduled, 
a new attribute `Scan.HINT_LOOKAHEAD` can be set on the Scan object.
 The following code instructs the RegionServer to attempt two iterations of 
next before a seek is scheduled:
 
 [source,java]
@@ -676,7 +677,7 @@ Enabling Bloom Filters can save your having to go to disk 
and can help improve r
 link:http://en.wikipedia.org/wiki/Bloom_filter[Bloom filters] were developed 
over in link:https://issues.apache.org/jira/browse/HBASE-1200[HBase-1200 Add 
bloomfilters].
 For description of the development process -- why static blooms rather than 
dynamic -- and for an overview of the unique properties that pertain to blooms 
in HBase, as well as possible future directions, see the _Development Process_ 
section of the document 
link:https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf[BloomFilters
 in HBase] attached to 
link:https://issues.apache.org/jira/browse/HBASE-1200[HBASE-1200].
 The bloom filters described here are actually version two of blooms in HBase.
-In versions up to 0.19.x, HBase had a dynamic bloom option based on work done 
by the link:http://www.one-lab.org[European Commission One-Lab Project 034819].
+In versions up to 0.19.x, HBase had a dynamic bloom option based on work done 
by the link:http://www.one-lab.org/[European Commission One-Lab Project 034819].
 The core of the HBase bloom work was later pulled up into Hadoop to implement 
org.apache.hadoop.io.BloomMapFile.
 Version 1 of HBase blooms never worked that well.
 Version 2 is a rewrite from scratch though again it starts with the one-lab 
work.
@@ -730,7 +731,7 @@ However, if hedged reads are enabled, the client waits some 
configurable amount
 Whichever read returns first is used, and the other read request is discarded.
 Hedged reads can be helpful when a rare slow read is caused by a transient
error such as a failing disk or a flaky network connection.
 
-Because a HBase RegionServer is a HDFS client, you can enable hedged reads in 
HBase, by adding the following properties to the RegionServer's hbase-site.xml 
and tuning the values to suit your environment.
+Because an HBase RegionServer is an HDFS client, you can enable hedged reads in
HBase by adding the following properties to the RegionServer's hbase-site.xml
and tuning the values to suit your environment.
 
 .Configuration for Hedged Reads
 * `dfs.client.hedged.read.threadpool.size` - the number of threads dedicated 
to servicing hedged reads.
@@ -781,7 +782,8 @@ Be aware that `Table.delete(Delete)` doesn't use the 
writeBuffer.
 It will execute a RegionServer RPC with each invocation.
 For a large number of deletes, consider `Table.delete(List)`.
 
-See 
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete%28org.apache.hadoop.hbase.client.Delete%29
+See
++++<a
href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete%28org.apache.hadoop.hbase.client.Delete%29">hbase.client.Delete</a>+++.
 
 [[perf.hdfs]]
 == HDFS
@@ -868,7 +870,7 @@ If you are running on EC2 and post performance questions on 
the dist-list, pleas
 == Collocating HBase and MapReduce
 
 It is often recommended to have different clusters for HBase and MapReduce.
-A better qualification of this is: don't collocate a HBase that serves live 
requests with a heavy MR workload.
+A better qualification of this is: don't collocate an HBase that serves live 
requests with a heavy MR workload.
 OLTP and OLAP-optimized systems have conflicting requirements and one will 
lose to the other, usually the former.
 For example, short latency-sensitive disk reads will have to wait in line 
behind longer reads that are trying to squeeze out as much throughput as 
possible.
 MR jobs that write to HBase will also generate flushes and compactions, which 
will in turn invalidate blocks in the <<block.cache>>.

http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/preface.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/preface.adoc 
b/src/main/asciidoc/_chapters/preface.adoc
index 960fcc4..50df7ff 100644
--- a/src/main/asciidoc/_chapters/preface.adoc
+++ b/src/main/asciidoc/_chapters/preface.adoc
@@ -29,20 +29,29 @@
 
 This is the official reference guide for the 
link:http://hbase.apache.org/[HBase] version it ships with.
 
-Herein you will find either the definitive documentation on an HBase topic as 
of its standing when the referenced HBase version shipped, or it will point to 
the location in link:http://hbase.apache.org/apidocs/index.html[Javadoc], 
link:https://issues.apache.org/jira/browse/HBASE[JIRA] or 
link:http://wiki.apache.org/hadoop/Hbase[wiki] where the pertinent information 
can be found.
+Herein you will find either the definitive documentation on an HBase topic as 
of its
+standing when the referenced HBase version shipped, or it will point to the 
location
+in link:http://hbase.apache.org/apidocs/index.html[Javadoc] or
+link:https://issues.apache.org/jira/browse/HBASE[JIRA] where the pertinent 
information can be found.
 
 .About This Guide
-This reference guide is a work in progress. The source for this guide can be 
found in the _src/main/asciidoc directory of the HBase source. This reference 
guide is marked up using link:http://asciidoc.org/[AsciiDoc] from which the 
finished guide is generated as part of the 'site' build target. Run
+This reference guide is a work in progress. The source for this guide can be 
found in the
+_src/main/asciidoc_ directory of the HBase source. This reference guide is 
marked up
+using link:http://asciidoc.org/[AsciiDoc] from which the finished guide is 
generated as part of the
+'site' build target. Run
 [source,bourne]
 ----
 mvn site
 ----
 to generate this documentation.
 Amendments and improvements to the documentation are welcomed.
-Click 
link:https://issues.apache.org/jira/secure/CreateIssueDetails!init.jspa?pid=12310753&issuetype=1&components=12312132&summary=SHORT+DESCRIPTION[this
 link] to file a new documentation bug against Apache HBase with some values 
pre-selected.
+Click
+link:https://issues.apache.org/jira/secure/CreateIssueDetails!init.jspa?pid=12310753&issuetype=1&components=12312132&summary=SHORT+DESCRIPTION[this
 link]
+to file a new documentation bug against Apache HBase with some values 
pre-selected.
 
 .Contributing to the Documentation
-For an overview of AsciiDoc and suggestions to get started contributing to the 
documentation, see the <<appendix_contributing_to_documentation,relevant 
section later in this documentation>>.
+For an overview of AsciiDoc and suggestions to get started contributing to the 
documentation,
+see the <<appendix_contributing_to_documentation,relevant section later in 
this documentation>>.
 
 .Heads-up if this is your first foray into the world of distributed 
computing...
 If this is your first foray into the wonderful world of Distributed Computing, 
then you are in for some interesting times.
@@ -57,7 +66,7 @@ Yours, the HBase Community.
 
 .Reporting Bugs
 
-Please use link:https://issues.apache.org/jira/browse/hbase[JIRA] to report 
non-security-related bugs. 
+Please use link:https://issues.apache.org/jira/browse/hbase[JIRA] to report 
non-security-related bugs.
 
 To protect existing HBase installations from new vulnerabilities, please *do 
not* use JIRA to report security-related bugs. Instead, send your report to the 
mailing list priv...@apache.org, which allows anyone to send messages, but 
restricts who can read them. Someone on that list will contact you to follow up 
on your report.
 

http://git-wip-us.apache.org/repos/asf/hbase/blob/6f07973d/src/main/asciidoc/_chapters/rpc.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/rpc.adoc 
b/src/main/asciidoc/_chapters/rpc.adoc
index 43e7156..1d363eb 100644
--- a/src/main/asciidoc/_chapters/rpc.adoc
+++ b/src/main/asciidoc/_chapters/rpc.adoc
@@ -47,7 +47,7 @@ For more background on how we arrived at this spec., see 
link:https://docs.googl
 
 
 . A wire-format we can evolve
-. A format that does not require our rewriting server core or radically 
changing its current architecture (for later).        
+. A format that does not require rewriting the server core or radically
changing its current architecture (for later).
 
 === TODO
 
@@ -58,7 +58,7 @@ For more background on how we arrived at this spec., see 
link:https://docs.googl
 . Diagram on how it works
 . A grammar that succinctly describes the wire-format.
  Currently we have these words and the content of the rpc protobuf idl, but a
grammar for the back and forth would help with grokking rpc.
-  Also, a little state machine on client/server interactions would help with 
understanding (and ensuring correct implementation).        
+  Also, a little state machine on client/server interactions would help with 
understanding (and ensuring correct implementation).
 
 === RPC
 
@@ -71,14 +71,15 @@ Optionally, Cells(KeyValues) can be passed outside of 
protobufs in follow-behind
 
 
 
-For more detail on the protobufs involved, see the 
link:http://svn.apache.org/viewvc/hbase/trunk/hbase-protocol/src/main/protobuf/RPC.proto?view=markup[RPC.proto]
            file in trunk.
+For more detail on the protobufs involved, see the
+link:https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=blob;f=hbase-protocol/src/main/protobuf/RPC.proto;hb=HEAD[RPC.proto]
            file in master.
 
 ==== Connection Setup
 
 Client initiates connection.
 
 ===== Client
-On connection setup, client sends a preamble followed by a connection header. 
+On connection setup, client sends a preamble followed by a connection header.
 
 .<preamble>
 [source]
@@ -105,7 +106,7 @@ After client sends preamble and connection header, server 
does NOT respond if su
 No response means server is READY to accept requests and to give out responses.
 If the version or authentication in the preamble is not agreeable or the 
server has trouble parsing the preamble, it will throw an
org.apache.hadoop.hbase.ipc.FatalConnectionException explaining the error and 
will then disconnect.
 If the client in the connection header -- i.e.
-the protobuf'd Message that comes after the connection preamble -- asks for 
for a Service the server does not support or a codec the server does not have, 
again we throw a FatalConnectionException with explanation.
+the protobuf'd Message that comes after the connection preamble -- asks for a 
Service the server does not support or a codec the server does not have, again 
we throw a FatalConnectionException with explanation.
 
 ==== Request
 
@@ -117,7 +118,7 @@ The header includes the method name and optionally, 
metadata on the optional Cel
 The parameter type suits the method being invoked: i.e.
 if we are doing a getRegionInfo request, the protobuf Message param will be an 
instance of GetRegionInfoRequest.
 The response will be a GetRegionInfoResponse.
-The CellBlock is optionally used ferrying the bulk of the RPC data: i.e 
Cells/KeyValues.
+The CellBlock is optionally used to ferry the bulk of the RPC data, i.e.
Cells/KeyValues.
 
 ===== Request Parts
 
@@ -181,7 +182,7 @@ Codecs will live on the server for all time so old clients 
can connect.
 
 .Constraints
 In some part, current wire-format -- i.e.
-all requests and responses preceeded by a length -- has been dictated by 
current server non-async architecture.
+all requests and responses preceded by a length -- has been dictated by 
current server non-async architecture.
 
 .One fat pb request or header+param
 We went with pb header followed by pb param making a request and a pb header 
followed by pb response for now.
@@ -190,7 +191,7 @@ Doing header+param rather than a single protobuf Message 
with both header and pa
 . Is closer to what we currently have
 . Having a single fat pb requires extra copying putting the already pb'd param 
into the body of the fat request pb (and same making result)
 . We can decide whether to accept the request or not before we read the param; 
for example, the request might be low priority.
-  As is, we read header+param in one go as server is currently implemented so 
this is a TODO.            
+  As is, we read header+param in one go as server is currently implemented so 
this is a TODO.
 
 The advantages are minor.
 If, later, a fat request shows a clear advantage, we can roll out a v2.
@@ -204,18 +205,18 @@ Codec must implement hbase's `Codec` Interface.
 After connection setup, all passed cellblocks will be sent with this codec.
 The server will return cellblocks using this same codec as long as the codec 
is on the server's CLASSPATH (else you will get
`UnsupportedCellCodecException`).
 
-To change the default codec, set `hbase.client.default.rpc.codec`. 
+To change the default codec, set `hbase.client.default.rpc.codec`.
 
 To disable cellblocks completely and to go pure protobuf, set the default to 
the empty String and do not specify a codec in your Configuration.
 So, set `hbase.client.default.rpc.codec` to the empty string and do not set 
`hbase.client.rpc.codec`.
 This will cause the client to connect to the server with no codec specified.
 If a server sees no codec, it will return all responses in pure protobuf.
-Running pure protobuf all the time will be slower than running with 
cellblocks. 
+Running pure protobuf all the time will be slower than running with cellblocks.
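+
+As an example -- a sketch only, assuming the 1.x-style `ConnectionFactory`
+client API and not taken from the original text -- running pure protobuf from
+the client side might be wired up like this:
+
+[source,java]
+----
+// Hedged sketch: empty default codec and no per-connection codec = no cellblocks.
+Configuration conf = HBaseConfiguration.create();
+conf.set("hbase.client.default.rpc.codec", "");
+// ...and deliberately do NOT set hbase.client.rpc.codec.
+Connection connection = ConnectionFactory.createConnection(conf);
+----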
 
 .Compression
-Uses hadoops compression codecs.
+Uses Hadoop's compression codecs.
 To enable compression of passed CellBlocks, set `hbase.client.rpc.compressor`
to the name of the Compressor to use.
-Compressor must implement Hadoops' CompressionCodec Interface.
+Compressor must implement Hadoop's CompressionCodec Interface.
 After connection setup, all passed cellblocks will be sent compressed.
 The server will return cellblocks compressed using this same compressor as 
long as the compressor is on its CLASSPATH (else you will get 
`UnsupportedCompressionCodecException`).
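 
 For instance -- a sketch only; GzipCodec is just one example of a Hadoop
 CompressionCodec, not a recommendation, and the 1.x-style `ConnectionFactory`
 API is assumed -- enabling compressed cellblocks from the client might look like:
 
 [source,java]
 ----
 // Hedged sketch: compress passed CellBlocks with a Hadoop CompressionCodec.
 Configuration conf = HBaseConfiguration.create();
 conf.set("hbase.client.rpc.compressor",
     org.apache.hadoop.io.compress.GzipCodec.class.getCanonicalName());
 Connection connection = ConnectionFactory.createConnection(conf);
 ----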
 
