http://git-wip-us.apache.org/repos/asf/hbase-site/blob/6c67ddd7/book.html ---------------------------------------------------------------------- diff --git a/book.html b/book.html index 0621ea8..7977239 100644 --- a/book.html +++ b/book.html @@ -485,7 +485,7 @@ See <a href="#java">Java</a> for information about supported JDK versions.</p> <div class="title">Procedure: Download, Configure, and Start HBase in Standalone Mode</div> <ol class="arabic"> <li> -<p>Choose a download site from this list of <a href="https://www.apache.org/dyn/closer.cgi/hbase/">Apache Download Mirrors</a>. +<p>Choose a download site from this list of <a href="https://www.apache.org/dyn/closer.lua/hbase/">Apache Download Mirrors</a>. Click on the suggested top link. This will take you to a mirror of <em>HBase Releases</em>. Click on the folder named <em>stable</em> and then download the binary file that ends in <em>.tar.gz</em> to your local filesystem. @@ -6703,6 +6703,9 @@ Quitting...</code></pre> <li> <p>hbase.regionserver.region.split.policy is now SteppingSplitPolicy. Previously it was IncreasingToUpperBoundRegionSplitPolicy.</p> </li> +<li> +<p>replication.source.ratio is now 0.5. Previously it was 0.1.</p> +</li> </ul> </div> <div id="upgrade2.0.regions.on.master" class="paragraph"> @@ -6915,13 +6918,81 @@ Quitting...</code></pre> </div> </div> <div class="sect3"> -<h4 id="upgrade2.0.rolling.upgrades"><a class="anchor" href="#upgrade2.0.rolling.upgrades"></a>13.1.2. Rolling Upgrade from 1.x to 2.x</h4> +<h4 id="upgrade2.0.coprocessors"><a class="anchor" href="#upgrade2.0.coprocessors"></a>13.1.2. Upgrading Coprocessors to 2.0</h4> +<div class="paragraph"> +<p>Coprocessors have changed substantially in 2.0 ranging from top level design changes in class +hierarchies to changed/removed methods, interfaces, etc. +(Parent jira: <a href="https://issues.apache.org/jira/browse/HBASE-18169">HBASE-18169 Coprocessor fix +and cleanup before 2.0.0 release</a>). 
Some of the reasons for such widespread changes:</p> +</div> +<div class="olist arabic"> +<ol class="arabic"> +<li> +<p>Pass Interfaces instead of Implementations; e.g. TableDescriptor instead of HTableDescriptor and +Region instead of HRegion (<a href="https://issues.apache.org/jira/browse/HBASE-18241">HBASE-18241</a> +Change client.Table and client.Admin to not use HTableDescriptor).</p> +</li> +<li> +<p>Design refactor so implementers need to fill out less boilerplate and so we can do more +compile-time checking (<a href="https://issues.apache.org/jira/browse/HBASE-17732">HBASE-17732</a>)</p> +</li> +<li> +<p>Purge Protocol Buffers from Coprocessor API +(<a href="https://issues.apache.org/jira/browse/HBASE-18859">HBASE-18859</a>, +<a href="https://issues.apache.org/jira/browse/HBASE-16769">HBASE-16769</a>, etc)</p> +</li> +<li> +<p>Cut back on what we expose to Coprocessors, removing hooks on internals that were too private to +expose (e.g. <a href="https://issues.apache.org/jira/browse/HBASE-18453">HBASE-18453</a> +CompactionRequest should not be exposed to user directly; +<a href="https://issues.apache.org/jira/browse/HBASE-18298">HBASE-18298</a> RegionServerServices Interface +cleanup for CP expose; etc)</p> +</li> +</ol> +</div> +<div class="paragraph"> +<p>To use coprocessors in 2.0, they should be rebuilt against the new API; otherwise they will fail to +load and HBase processes will die.</p> +</div> +<div class="paragraph"> +<p>Suggested order of changes to upgrade the coprocessors:</p> +</div> +<div class="olist arabic"> +<ol class="arabic"> +<li> +<p>Directly implement observer interfaces instead of extending Base*Observer classes. Change +<code>Foo extends BaseXXXObserver</code> to <code>Foo implements XXXObserver</code>. 
+(<a href="https://issues.apache.org/jira/browse/HBASE-17312">HBASE-17312</a>).</p> +</li> +<li> +<p>Adapt to design change from Inheritance to Composition +(<a href="https://issues.apache.org/jira/browse/HBASE-17732">HBASE-17732</a>) by following +<a href="https://github.com/apache/hbase/blob/master/dev-support/design-docs/Coprocessor_Design_Improvements-Use_composition_instead_of_inheritance-HBASE-17732.adoc#migrating-existing-cps-to-new-design">this +example</a>.</p> +</li> +<li> +<p>getTable() has been removed from the CoprocessorEnvironment; coprocessors should self-manage +Table instances.</p> +</li> +</ol> +</div> +<div class="paragraph"> +<p>Some examples of writing coprocessors with the new API can be found in the hbase-examples module +<a href="https://github.com/apache/hbase/tree/branch-2.0/hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example">here</a>.</p> +</div> +<div class="paragraph"> +<p>Lastly, if an API has been changed/removed that breaks you in an irreparable way, and if there’s a +good justification to add it back, bring it to our notice (<a href="mailto:d...@hbase.apache.org">d...@hbase.apache.org</a>).</p> +</div> +</div> +<div class="sect3"> +<h4 id="upgrade2.0.rolling.upgrades"><a class="anchor" href="#upgrade2.0.rolling.upgrades"></a>13.1.3. Rolling Upgrade from 1.x to 2.x</h4> <div class="paragraph"> <p>There is no rolling upgrade from HBase 1.x+ to HBase 2.x+. In order to perform a zero downtime upgrade, you will need to run an additional cluster in parallel and handle failover in application logic.</p> </div> </div> <div class="sect3"> -<h4 id="upgrade2.0.process"><a class="anchor" href="#upgrade2.0.process"></a>13.1.3. Upgrade process from 1.x to 2.x</h4> +<h4 id="upgrade2.0.process"><a class="anchor" href="#upgrade2.0.process"></a>13.1.4. 
Upgrade process from 1.x to 2.x</h4> <div class="paragraph"> <p>To upgrade an existing HBase 1.x cluster, you should:</p> </div> @@ -6931,6 +7002,9 @@ Quitting...</code></pre> <p>Clean shutdown of existing 1.x cluster</p> </li> <li> +<p>Update coprocessors</p> +</li> +<li> <p>Upgrade Master roles first</p> </li> <li> @@ -10043,18 +10117,33 @@ If you don’t have time to build it both ways and compare, my advice would </div> </div> <div class="sect2"> -<h3 id="_optimize_on_the_server_side_for_low_latency"><a class="anchor" href="#_optimize_on_the_server_side_for_low_latency"></a>45.4. Optimize on the Server Side for Low Latency</h3> +<h3 id="shortcircuit.reads"><a class="anchor" href="#shortcircuit.reads"></a>45.4. Optimize on the Server Side for Low Latency</h3> +<div class="paragraph"> +<p>Skip the network for local blocks when the RegionServer goes to read from HDFS by exploiting HDFS’s +<a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short-Circuit Local Reads</a> facility. +Note that setup must be done at both the datanode and the dfsclient ends of the connection (i.e. at the RegionServer), +and that both ends need to have loaded the Hadoop native <code>.so</code> library. +After setting <em>dfs.client.read.shortcircuit</em> to <em>true</em> in your Hadoop configuration, configuring +the <em>dfs.domain.socket.path</em> path for the datanode and dfsclient to share, and restarting, next configure +the regionserver/dfsclient side.</p> +</div> <div class="ulist"> <ul> <li> -<p>Skip the network for local blocks. 
In <code>hbase-site.xml</code>, set the following parameters:</p> +<p>In <code>hbase-site.xml</code>, set the following parameters:</p> <div class="ulist"> <ul> <li> <p><code>dfs.client.read.shortcircuit = true</code></p> </li> <li> -<p><code>dfs.client.read.shortcircuit.buffer.size = 131072</code> (Important to avoid OOME)</p> +<p><code>dfs.client.read.shortcircuit.skip.checksum = true</code> so we don’t double checksum (HBase does its own checksumming to save on I/Os; see <a href="#hbase.regionserver.checksum.verify.performance"><code>hbase.regionserver.checksum.verify</code></a> for more on this).</p> +</li> +<li> +<p><code>dfs.domain.socket.path</code> to match what was set for the datanodes.</p> +</li> +<li> +<p><code>dfs.client.read.shortcircuit.buffer.size = 131072</code> Important to avoid OOME. HBase has a default it uses if unset; see <code>hbase.dfs.client.read.shortcircuit.buffer.size</code>, whose default is 131072.</p> </li> </ul> </div> @@ -10077,6 +10166,24 @@ If you don’t have time to build it both ways and compare, my advice would </li> </ul> </div> +<div class="paragraph"> +<p>Check the RegionServer logs after restart. You should see complaints only if there is a misconfiguration. +Otherwise, short-circuit read operates quietly in the background. It does not provide metrics, so +there is no visibility into how effective it is, but read latencies should show a marked improvement, especially if +there is good data locality, lots of random reads, and the dataset is larger than the available cache.</p> +</div> +<div class="paragraph"> +<p>Other advanced configurations that you might play with, especially if shortcircuit functionality +is complaining in the logs, include <code>dfs.client.read.shortcircuit.streams.cache.size</code> and +<code>dfs.client.socketcache.capacity</code>. Documentation is sparse on these options. 
You’ll have to +read source code.</p> +</div> +<div class="paragraph"> +<p>For more on short-circuit reads, see Colin’s old blog on rollout, +<a href="http://blog.cloudera.com/blog/2013/08/how-improved-short-circuit-local-reads-bring-better-performance-and-security-to-hadoop/">How Improved Short-Circuit Local Reads Bring Better Performance and Security to Hadoop</a>. +The <a href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</a> issue also makes for an +interesting read showing the HDFS community at its best (caveat a few comments).</p> +</div> </div> <div class="sect2"> <h3 id="_jvm_tuning"><a class="anchor" href="#_jvm_tuning"></a>45.5. JVM Tuning</h3> @@ -37373,7 +37480,7 @@ The server will return cellblocks compressed using this same compressor as long <div id="footer"> <div id="footer-text"> Version 3.0.0-SNAPSHOT<br> -Last updated 2018-04-04 14:29:50 UTC +Last updated 2018-04-05 14:29:11 UTC </div> </div> </body>
http://git-wip-us.apache.org/repos/asf/hbase-site/blob/6c67ddd7/bulk-loads.html ---------------------------------------------------------------------- diff --git a/bulk-loads.html b/bulk-loads.html index 2f77955..15e7cfa 100644 --- a/bulk-loads.html +++ b/bulk-loads.html @@ -7,7 +7,7 @@ <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20180404" /> + <meta name="Date-Revision-yyyymmdd" content="20180405" /> <meta http-equiv="Content-Language" content="en" /> <title>Apache HBase – Bulk Loads in Apache HBase (TM) @@ -63,7 +63,7 @@ <li> <a href="license.html" title="License">License</a> </li> - <li> <a href="http://www.apache.org/dyn/closer.cgi/hbase/" title="Downloads">Downloads</a> + <li> <a href="http://www.apache.org/dyn/closer.lua/hbase/" title="Downloads">Downloads</a> </li> <li> <a href="https://issues.apache.org/jira/browse/HBASE?report=com.atlassian.jira.plugin.system.project:changelog-panel#selectedTab=com.atlassian.jira.plugin.system.project%3Achangelog-panel" title="Release Notes">Release Notes</a> @@ -296,7 +296,7 @@ under the License. --> <a href="https://www.apache.org/">The Apache Software Foundation</a>. All rights reserved. - <li id="publishDate" class="pull-right">Last Published: 2018-04-04</li> + <li id="publishDate" class="pull-right">Last Published: 2018-04-05</li> </p> </div>