http://git-wip-us.apache.org/repos/asf/hbase-site/blob/5018ccb3/book.html ---------------------------------------------------------------------- diff --git a/book.html b/book.html index 51057fa..818c23e 100644 --- a/book.html +++ b/book.html @@ -5483,7 +5483,7 @@ See the entry for <code>hbase.hregion.majorcompaction</code> in the <a href="#co <div class="paragraph"> <p>Major compactions are absolutely necessary for StoreFile clean-up. Do not disable them altogether. -You can run major compactions manually via the HBase shell or via the <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact(org.apache.hadoop.hbase.TableName)">Admin API</a>.</p> +You can run major compactions manually via the HBase shell or via the <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-">Admin API</a>.</p> </div> </td> </tr> @@ -6127,7 +6127,7 @@ For new installations, do not deploy 0.94.y, 0.96.y, or 0.98.y. Deploy our stab </table> </div> <div class="paragraph"> -<p>Before the semantic versioning scheme pre-1.0, HBase tracked either Hadoop’s versions (0.2x) or 0.9x versions. If you are into the arcane, checkout our old wiki page on <a href="http://wiki.apache.org/hadoop/Hbase/HBaseVersions">HBase Versioning</a> which tries to connect the HBase version dots. Below sections cover ONLY the releases before 1.0.</p> +<p>Before the semantic versioning scheme pre-1.0, HBase tracked either Hadoop’s versions (0.2x) or 0.9x versions. If you are into the arcane, check out our old wiki page on <a href="https://web.archive.org/web/20150905071342/https://wiki.apache.org/hadoop/Hbase/HBaseVersions">HBase Versioning</a> which tries to connect the HBase version dots. Below sections cover ONLY the releases before 1.0.</p> </div> <div id="hbase.development.series" class="paragraph"> <div class="title">Odd/Even Versioning or "Development" Series Releases</div> @@ -7197,7 +7197,7 @@ By default, the timestamp represents the time on the RegionServer when the data <h2 id="conceptual.view"><a class="anchor" href="#conceptual.view"></a>19. Conceptual View</h2> <div class="sectionbody"> <div class="paragraph"> -<p>You can read a very understandable explanation of the HBase data model in the blog post <a href="http://jimbojw.com/wiki/index.php?title=Understanding_Hbase_and_BigTable">Understanding HBase and BigTable</a> by Jim R. Wilson. +<p>You can read a very understandable explanation of the HBase data model in the blog post <a href="http://jimbojw.com/#understanding%20hbase">Understanding HBase and BigTable</a> by Jim R. Wilson. Another good explanation is available in the PDF <a href="http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/9353-login1210_khurana.pdf">Introduction to Basic Schema Design</a> by Amandeep Khurana.</p> </div> <div class="paragraph"> @@ -7555,13 +7555,13 @@ Operations are applied via <a href="http://hbase.apache.org/apidocs/org/apache/h <h3 id="_get"><a class="anchor" href="#_get"></a>26.1. Get</h3> <div class="paragraph"> <p><a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</a> returns attributes for a specified row. 
-Gets are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#get(org.apache.hadoop.hbase.client.Get)">Table.get</a>.</p> +Gets are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#get-org.apache.hadoop.hbase.client.Get-">Table.get</a>.</p> </div> </div> <div class="sect2"> <h3 id="_put"><a class="anchor" href="#_put"></a>26.2. Put</h3> <div class="paragraph"> -<p><a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</a> either adds new rows to a table (if the key is new) or can update existing rows (if the key already exists). Puts are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put(org.apache.hadoop.hbase.client.Put)">Table.put</a> (non-writeBuffer) or <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch(java.util.List,%20java.lang.Object%5B%5D)">Table.batch</a> (non-writeBuffer).</p> +<p><a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</a> either adds new rows to a table (if the key is new) or can update existing rows (if the key already exists). Puts are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put-org.apache.hadoop.hbase.client.Put-">Table.put</a> (non-writeBuffer) or <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch-java.util.List-java.lang.Object:A-">Table.batch</a> (non-writeBuffer).</p> </div> </div> <div class="sect2"> @@ -7602,7 +7602,7 @@ ResultScanner rs = table.getScanner(scan); <h3 id="_delete"><a class="anchor" href="#_delete"></a>26.4. Delete</h3> <div class="paragraph"> <p><a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Delete.html">Delete</a> removes a row from a table. -Deletes are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete(org.apache.hadoop.hbase.client.Delete)">Table.delete</a>.</p> +Deletes are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete-org.apache.hadoop.hbase.client.Delete-">Table.delete</a>.</p> </div> <div class="paragraph"> <p>HBase does not modify data in place, and so deletes are handled by creating new markers called <em>tombstones</em>. 
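The hunks above re-anchor the Javadoc links for the core Table CRUD methods. For orientation, here is a minimal Java sketch of the Get/Put/Delete round trip those links document, written against the 1.x/2.x-era client API; the table name "t1", family "cf", and qualifier "q" are illustrative placeholders, not values taken from this commit:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TableCrudSketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("t1"))) {
          // Put: adds a new row, or updates the row if the key already exists.
          Put put = new Put(Bytes.toBytes("row1"));
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
          table.put(put);

          // Get: returns attributes for the specified row.
          Result result = table.get(new Get(Bytes.toBytes("row1")));
          System.out.println(Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"))));

          // Delete: writes a tombstone marker; the data is physically removed
          // only when a major compaction rewrites the StoreFiles.
          table.delete(new Delete(Bytes.toBytes("row1")));
        }
      }
    }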
@@ -7707,10 +7707,10 @@ The below discussion of <a href="http://hbase.apache.org/apidocs/org/apache/hado <div class="ulist"> <ul> <li> -<p>to return more than one version, see <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setMaxVersions()">Get.setMaxVersions()</a></p> +<p>to return more than one version, see <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setMaxVersions--">Get.setMaxVersions()</a></p> </li> <li> -<p>to return versions other than the latest, see <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setTimeRange(long,%20long)">Get.setTimeRange()</a></p> +<p>to return versions other than the latest, see <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setTimeRange-long-long-">Get.setTimeRange()</a></p> <div class="paragraph"> <p>To retrieve the latest version that is less than or equal to a given value, thus giving the 'latest' state of the record at a certain point in time, just use a range from 0 to the desired version and set the max versions to 1.</p> </div> @@ -8333,7 +8333,7 @@ This is the main trade-off.</p> <div class="paragraph"> <p><a href="https://issues.apache.org/jira/browse/HBASE-4811">HBASE-4811</a> implements an API to scan a table or a range within a table in reverse, reducing the need to optimize your schema for forward or reverse scanning. This feature is available in HBase 0.98 and later. -See <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean" class="bare">https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean</a> for more information.</p> +See <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed-boolean-">Scan.setReversed()</a> for more information.</p> </div> </td> </tr> @@ -8883,7 +8883,7 @@ This approach would be useful if scanning by hostname was a priority.</p> <div class="paragraph"> <p><a href="https://issues.apache.org/jira/browse/HBASE-4811">HBASE-4811</a> implements an API to scan a table or a range within a table in reverse, reducing the need to optimize your schema for forward or reverse scanning. This feature is available in HBase 0.98 and later. -See <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean" class="bare">https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean</a> for more information.</p> +See <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed-boolean-">Scan.setReversed()</a> for more information.</p> </div> </td> </tr> @@ -8937,8 +8937,7 @@ The rowkey of LOG_TYPES would be:</p> </div> <div class="paragraph"> <p>A column for this rowkey could be a long with an assigned number, which could be obtained -by using an -<a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#incrementColumnValue%28byte[],%20byte[],%20byte[],%20long%29">HBase counter</a>.</p> +by using an <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#incrementColumnValue-byte:A-byte:A-byte:A-long-">HBase counter</a>.</p> </div> <div class="paragraph"> <p>So the resulting composite rowkey would be:</p> </div> @@ -8967,7 +8966,7 @@ by using an <p>This effectively is the OpenTSDB approach. What OpenTSDB does is re-write data and pack rows into columns for certain time-periods. 
For a detailed explanation, see: <a href="http://opentsdb.net/schema.html" class="bare">http://opentsdb.net/schema.html</a>, and -<a href="http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/video-hbasecon-2012-lessons-learned-from-opentsdb.html">Lessons Learned from OpenTSDB</a> +<a href="https://www.slideshare.net/cloudera/4-opentsdb-hbasecon">Lessons Learned from OpenTSDB</a> from HBaseCon2012.</p> </div> <div class="paragraph"> @@ -12472,12 +12471,12 @@ The correct way to apply cell level labels is to do so in the application code w filter out cells that you do not have access to. A superuser can set the default set of authorizations for a given user by using the <code>set_auths</code> HBase Shell command or the -<a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/security/visibility/VisibilityClient.html#setAuths(org.apache.hadoop.hbase.client.Connection,%20java.lang.String\">],%20java.lang.String)[VisibilityClient.setAuths()</a> method.</p> +<a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/security/visibility/VisibilityClient.html#setAuths-org.apache.hadoop.hbase.client.Connection-java.lang.String:A-java.lang.String-">VisibilityClient.setAuths()</a> method.</p> </div> <div class="paragraph"> <p>You can specify a different authorization during the Scan or Get, by passing the AUTHORIZATIONS option in HBase Shell, or the -<a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setAuthorizations%28org.apache.hadoop.hbase.security.visibility.Authorizations%29">setAuthorizations()</a> +<a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setAuthorizations-org.apache.hadoop.hbase.security.visibility.Authorizations-">Scan.setAuthorizations()</a> method if you use the API. This authorization will be combined with your default set as an additional filter. It will further filter your results, rather than giving you additional authorization.</p> @@ -13085,7 +13084,7 @@ Even HDFS doesn’t do well with anything less than 5 DataNodes (due to thin <div class="sect2"> <h3 id="arch.overview.hbasehdfs"><a class="anchor" href="#arch.overview.hbasehdfs"></a>64.3. What Is The Difference Between HBase and Hadoop/HDFS?</h3> <div class="paragraph"> -<p><a href="http://hadoop.apache.org/hdfs/">HDFS</a> is a distributed file system that is well suited for the storage of large files. +<p><a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html">HDFS</a> is a distributed file system that is well suited for the storage of large files. Its documentation states that it is not, however, a general purpose file system, and does not provide fast individual record lookups in files. HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables. This can sometimes be a point of conceptual confusion. 
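Two of the hunks above re-link Scan.setReversed() and an earlier one re-links the Table.incrementColumnValue() counter API. A short sketch of both calls, assuming an already-open Table handle named "table" and illustrative row, family, and qualifier names (none of which come from this commit):

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ReverseScanAndCounterSketch {
      static void demo(Table table) throws IOException {
        // Reverse scan (HBase 0.98+): with setReversed(true) the scan walks
        // backwards, so the start row is the highest row in the range.
        Scan scan = new Scan();
        scan.setReversed(true);
        scan.setStartRow(Bytes.toBytes("row-0999"));
        scan.setStopRow(Bytes.toBytes("row-0100"));
        try (ResultScanner scanner = table.getScanner(scan)) {
          for (Result r : scanner) {
            System.out.println(Bytes.toString(r.getRow()));
          }
        }

        // HBase counter: atomically increments a cell and returns the new
        // value, e.g. to hand out assigned numbers as in the LOG_TYPES example.
        long assigned = table.incrementColumnValue(
            Bytes.toBytes("log-type-counter"), Bytes.toBytes("cf"),
            Bytes.toBytes("seq"), 1L);
        System.out.println("assigned id: " + assigned);
      }
    }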
@@ -13156,9 +13155,7 @@ If a region has both an empty start and an empty end key, it is the only region </div> <div class="paragraph"> <p>In the (hopefully unlikely) event that programmatic processing of catalog metadata -is required, see the -<a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte%5B%5D%29">Writables</a> -utility.</p> +is required, see the <a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/RegionInfo.html#parseFrom-byte:A-">RegionInfo.parseFrom</a> utility.</p> </div> </div> <div class="sect2"> @@ -13288,7 +13285,7 @@ Please use <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/clie <p>For additional information on write durability, review the <a href="/acid-semantics.html">ACID semantics</a> page.</p> </div> <div class="paragraph"> -<p>For fine-grained control of batching of <code>Put</code>s or <code>Delete</code>s, see the <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch%28java.util.List%29">batch</a> methods on Table.</p> +<p>For fine-grained control of batching of <code>Put</code>s or <code>Delete</code>s, see the <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch-java.util.List-java.lang.Object:A-">batch</a> methods on Table.</p> </div> </div> <div class="sect2"> @@ -17103,7 +17100,7 @@ ResultScanner scanner = table.getScanner(scan);</code></pre> <p>More information about the design and implementation can be found at the jira issue: <a href="https://issues.apache.org/jira/browse/HBASE-10070">HBASE-10070</a></p> </li> <li> -<p>HBaseCon 2014 <a href="http://hbasecon.com/sessions/#session15">talk</a> also contains some details and <a href="http://www.slideshare.net/enissoz/hbase-high-availability-for-reads-with-time">slides</a>.</p> +<p>The HBaseCon 2014 talk <a href="http://hbase.apache.org/www.hbasecon.com/#2014-PresentationsRecordings">HBase Read High Availability Using Timeline-Consistent Region Replicas</a> also contains some details and <a href="http://www.slideshare.net/enissoz/hbase-high-availability-for-reads-with-time">slides</a>.</p> </li> </ol> </div> @@ -18094,7 +18091,10 @@ curl -vi -X PUT \ <div class="sectionbody"> <div class="paragraph"> <p>FB’s Chip Turner wrote a pure C/C++ client. -<a href="https://github.com/facebook/native-cpp-hbase-client">Check it out</a>.</p> +<a href="https://github.com/hinaria/native-cpp-hbase-client">Check it out</a>.</p> +</div> +<div class="paragraph"> +<p>For a C++ client implementation, see <a href="https://issues.apache.org/jira/browse/HBASE-14850">HBASE-14850</a>.</p> </div> </div> </div> @@ -19847,8 +19847,8 @@ package.</p> <div class="paragraph"> <p>Observer coprocessors are triggered either before or after a specific event occurs. Observers that happen before an event use methods that start with a <code>pre</code> prefix, -such as <a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#prePut%28org.apache.hadoop.hbase.coprocessor.ObserverContext,%20org.apache.hadoop.hbase.client.Put,%20org.apache.hadoop.hbase.regionserver.wal.WALEdit,%20org.apache.hadoop.hbase.client.Durability%29"><code>prePut</code></a>. 
Observers that happen just after an event override methods that start -with a <code>post</code> prefix, such as <a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#postPut%28org.apache.hadoop.hbase.coprocessor.ObserverContext,%20org.apache.hadoop.hbase.client.Put,%20org.apache.hadoop.hbase.regionserver.wal.WALEdit,%20org.apache.hadoop.hbase.client.Durability%29"><code>postPut</code></a>.</p> +such as <a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#prePut-org.apache.hadoop.hbase.coprocessor.ObserverContext-org.apache.hadoop.hbase.client.Put-org.apache.hadoop.hbase.wal.WALEdit-org.apache.hadoop.hbase.client.Durability-"><code>prePut</code></a>. Observers that happen just after an event override methods that start +with a <code>post</code> prefix, such as <a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/coprocessor/RegionObserver.html#postPut-org.apache.hadoop.hbase.coprocessor.ObserverContext-org.apache.hadoop.hbase.client.Put-org.apache.hadoop.hbase.wal.WALEdit-org.apache.hadoop.hbase.client.Durability-"><code>postPut</code></a>.</p> </div> <div class="sect3"> <h4 id="_use_cases_for_observer_coprocessors"><a class="anchor" href="#_use_cases_for_observer_coprocessors"></a>88.1.1. Use Cases for Observer Coprocessors</h4> @@ -19921,7 +19921,7 @@ average or summation for an entire table which spans hundreds of regions.</p> <div class="paragraph"> <p>In contrast to observer coprocessors, where your code is run transparently, endpoint coprocessors must be explicitly invoked using the -<a href="https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html#coprocessorService%28java.lang.Class,%20byte%5B%5D,%20byte%5B%5D,%20org.apache.hadoop.hbase.client.coprocessor.Batch.Call%29">CoprocessorService()</a> +<a href="https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html#coprocessorService-java.lang.Class-byte:A-byte:A-org.apache.hadoop.hbase.client.coprocessor.Batch.Call-">CoprocessorService()</a> method available in <a href="https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/Table.html">Table</a> or @@ -21557,7 +21557,7 @@ It’s far more efficient to just write directly to HBase.</p> <div class="paragraph"> <p>Also, if you are pre-splitting regions and all your data is <em>still</em> winding up in a single region even though your keys aren’t monotonically increasing, confirm that your keyspace actually works with the split strategy. There are a variety of reasons that regions may appear "well split" but won’t work with your data. 
-As the HBase client communicates directly with the RegionServers, this can be obtained via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#getRegionLocation(byte%5B%5D)">Table.getRegionLocation</a>.</p> +As the HBase client communicates directly with the RegionServers, this can be obtained via <a href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/RegionLocator.html#getRegionLocation-byte:A-">RegionLocator.getRegionLocation</a>.</p> </div> <div class="paragraph"> <p>See <a href="#precreate.regions">Table Creation: Pre-Creating Regions</a>, as well as <a href="#perf.configurations">HBase Configurations</a></p> @@ -21688,7 +21688,7 @@ If the target table(s) have too few regions then the reads could likely be serve <p><a href="http://en.wikipedia.org/wiki/Bloom_filter">Bloom filters</a> were developed over in <a href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200 Add bloomfilters</a>. For description of the development process — why static blooms rather than dynamic — and for an overview of the unique properties that pertain to blooms in HBase, as well as possible future directions, see the <em>Development Process</em> section of the document <a href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf">BloomFilters in HBase</a> attached to <a href="https://issues.apache.org/jira/browse/HBASE-1200">HBASE-1200</a>. The bloom filters described here are actually version two of blooms in HBase. -In versions up to 0.19.x, HBase had a dynamic bloom option based on work done by the <a href="http://www.one-lab.org/">European Commission One-Lab Project 034819</a>. +In versions up to 0.19.x, HBase had a dynamic bloom option based on work done by the <a href="http://www.onelab.org">European Commission One-Lab Project 034819</a>. The core of the HBase bloom work was later pulled up into Hadoop to implement org.apache.hadoop.io.BloomMapFile. Version 1 of HBase blooms never worked that well. Version 2 is a rewrite from scratch though again it starts with the one-lab work.</p> @@ -21850,7 +21850,7 @@ As is documented in <a href="#datamodel">Data Model</a>, marking rows as deleted Tombstones only get cleaned up with major compactions.</p> </div> <div class="paragraph"> -<p>See also <a href="#compaction">Compaction</a> and <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact%28java.lang.String%29">Admin.majorCompact</a>.</p> +<p>See also <a href="#compaction">Compaction</a> and <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-">Admin.majorCompact</a>.</p> </div> </div> <div class="sect2"> @@ -21861,8 +21861,7 @@ It will execute an RegionServer RPC with each invocation. 
For a large number of deletes, consider <code>Table.delete(List)</code>.</p> </div> <div class="paragraph"> -<p>See -<a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete%28org.apache.hadoop.hbase.client.Delete%29">hbase.client.Delete</a>.</p> +<p>See <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#delete-org.apache.hadoop.hbase.client.Delete-">hbase.client.Delete</a>.</p> </div> </div> </div> @@ -22799,7 +22798,11 @@ If you have this problematic combination of components in your environment, to w The refresh will rewrite the credential cache without the problematic formatting.</p> </div> <div class="paragraph"> -<p>Finally, depending on your Kerberos configuration, you may need to install the <a href="http://docs.oracle.com/javase/1.4.2/docs/guide/security/jce/JCERefGuide.html">Java Cryptography Extension</a>, or JCE. +<p>Prior to JDK 1.4, the JCE was an unbundled product, and as such, the JCA and JCE were regularly referred to as separate, distinct components. +As the JCE is now bundled in the JDK, the distinction is becoming less apparent. Since the JCE uses the same architecture as the JCA, the JCE is more properly thought of as a part of the JCA.</p> +</div> +<div class="paragraph"> +<p>You may need to install the <a href="https://docs.oracle.com/javase/1.5.0/docs/guide/security/jce/JCERefGuide.html">Java Cryptography Extension</a>, or JCE, if you are running JDK 1.5 or an earlier version. Insure the JCE jars are on the classpath on both server and client systems.</p> </div> <div class="paragraph"> @@ -22859,7 +22862,7 @@ For example (substitute VERSION with your HBase version):</p> </div> </div> <div class="paragraph"> -<p>See <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpathfor" class="bare">http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpathfor</a> more information on HBase MapReduce jobs and classpaths.</p> +<p>See <a href="#hbase.mapreduce.classpath">HBase, MapReduce, and the CLASSPATH</a> for more information on HBase MapReduce jobs and classpaths.</p> </div> </div> <div class="sect2"> @@ -22909,7 +22912,7 @@ For example…​</p> <p>…​returns a list of the regions under the HBase table 'myTable' and their disk utilization.</p> </div> <div class="paragraph"> -<p>For more information on HDFS shell commands, see the <a href="http://hadoop.apache.org/common/docs/current/file_system_shell.html">HDFS FileSystem Shell documentation</a>.</p> +<p>For more information on HDFS shell commands, see the <a href="http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html">HDFS FileSystem Shell documentation</a>.</p> </div> </div> <div class="sect2"> @@ -22944,7 +22947,7 @@ The NameNode web application will provide links to the all the DataNodes in the </div> </div> <div class="paragraph"> -<p>See the <a href="http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html">HDFS User Guide</a> for other non-shell diagnostic utilities like <code>fsck</code>.</p> +<p>See the <a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">HDFS User Guide</a> for other non-shell diagnostic utilities like <code>fsck</code>.</p> </div> <div class="sect3"> <h4 id="trouble.namenode.0size.hlogs"><a class="anchor" href="#trouble.namenode.0size.hlogs"></a>113.2.1. 
Zero size WALs with data in them</h4> @@ -25047,7 +25050,7 @@ For general usage instructions, pass the <code>-h</code> option.</p> <div class="sect2"> <h3 id="ops.regionmgt.majorcompact"><a class="anchor" href="#ops.regionmgt.majorcompact"></a>131.1. Major Compaction</h3> <div class="paragraph"> -<p>Major compactions can be requested via the HBase shell or <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact%28java.lang.String%29">Admin.majorCompact</a>.</p> +<p>Major compactions can be requested via the HBase shell or <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html#majorCompact-org.apache.hadoop.hbase.TableName-">Admin.majorCompact</a>.</p> </div> <div class="paragraph"> <p>Note: major compactions do NOT do region merges. @@ -25067,7 +25070,7 @@ See <a href="#compaction">compaction</a> for more information about compactions. <div class="paragraph"> <p>If you feel you have too many regions and want to consolidate them, Merge is the utility you need. Merge must run be done when the cluster is down. -See the <a href="http://ofps.oreilly.com/titles/9781449396107/performance.html">O’Reilly HBase +See the <a href="https://web.archive.org/web/20111231002503/http://ofps.oreilly.com/titles/9781449396107/performance.html">O’Reilly HBase Book</a> for an example of usage.</p> </div> <div class="paragraph"> @@ -25434,7 +25437,7 @@ In this case, or if you are in a OLAP environment and require having locality, t <h2 id="hbase_metrics"><a class="anchor" href="#hbase_metrics"></a>133. HBase Metrics</h2> <div class="sectionbody"> <div class="paragraph"> -<p>HBase emits metrics which adhere to the <a href="http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html">Hadoop metrics</a> API. +<p>HBase emits metrics which adhere to the <a href="https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Metrics.html">Hadoop Metrics</a> API. Starting with HBase 0.95<sup class="footnote">[<a id="_footnoteref_5" class="footnote" href="#_footnote_5" title="View footnote.">5</a>]</sup>, HBase is configured to emit a default set of metrics with a default sampling period of every 10 seconds. You can use HBase metrics in conjunction with Ganglia. You can also filter which metrics are emitted and extend the metrics framework to capture custom metrics appropriate for your environment.</p> @@ -25887,7 +25890,7 @@ Have a look in the Web UI.</p> </td> <td class="content"> This information was previously available at -<a href="http://hbase.apache.org#replication">Cluster Replication</a>. +<a href="https://hbase.apache.org/0.94/replication.html">Cluster Replication</a>. </td> </tr> </table> @@ -26954,7 +26957,7 @@ The act of copying these files creates new HDFS metadata, which is why a restore <h3 id="ops.backup.live.replication"><a class="anchor" href="#ops.backup.live.replication"></a>137.2. Live Cluster Backup - Replication</h3> <div class="paragraph"> <p>This approach assumes that there is a second cluster. -See the HBase page on <a href="http://hbase.apache.org/book.html#replication">replication</a> for more information.</p> +See the HBase page on <a href="http://hbase.apache.org/book.html#_cluster_replication">replication</a> for more information.</p> </div> </div> <div class="sect2"> @@ -27766,7 +27769,7 @@ FreeNode offers a web-based client, but most people prefer a native client, and <div class="sect2"> <h3 id="_jira"><a class="anchor" href="#_jira"></a>143.4. 
Jira</h3> <div class="paragraph"> -<p>Check for existing issues in <a href="https://issues.apache.org/jira/browse/HBASE">Jira</a>. +<p>Check for existing issues in <a href="https://issues.apache.org/jira/projects/HBASE/issues">Jira</a>. If it’s either a new feature request, enhancement, or a bug, file a ticket.</p> </div> <div class="paragraph"> @@ -28342,8 +28345,7 @@ You can use maven profile <code>compile-thrift</code> to do this.</p> <div class="paragraph"> <p>If you see <code>Unable to find resource 'VM_global_library.vm'</code>, ignore it. It’s not an error. -It is <a href="http://jira.codehaus.org/browse/MSITE-286">officially - ugly</a> though.</p> +It is <a href="https://issues.apache.org/jira/browse/MSITE-286">officially ugly</a> though.</p> </div> </div> </div> @@ -28515,8 +28517,7 @@ Adjust the version in all the POM files appropriately. If you are making a release candidate, you must remove the <code>-SNAPSHOT</code> label from all versions in all pom.xml files. If you are running this receipe to publish a snapshot, you must keep the <code>-SNAPSHOT</code> suffix on the hbase version. -The <a href="http://mojo.codehaus.org/versions-maven-plugin/">Versions - Maven Plugin</a> can be of use here. +The <a href="http://www.mojohaus.org/versions-maven-plugin/">Versions Maven Plugin</a> can be of use here. To set a version in all the many poms of the hbase multi-module project, use a command like the following:</p> </div> <div class="listingblock"> @@ -28716,7 +28717,7 @@ You can always delete it if the build goes haywire.</p> </div> </li> <li> -<p>Sign, fingerprint and then 'stage' your release candiate version directory via svnpubsub by committing your directory to <a href="https://dist.apache.org/repos/dist/dev/hbase/">The 'dev' distribution directory</a> (See comments on <a href="https://issues.apache.org/jira/browse/HBASE-10554">HBASE-10554 Please delete old releases from mirroring system</a> but in essence it is an svn checkout of <a href="https://dist.apache.org/repos/dist/dev/hbase — releases" class="bare">https://dist.apache.org/repos/dist/dev/hbase — releases</a> are at <a href="https://dist.apache.org/repos/dist/release/hbase" class="bare">https://dist.apache.org/repos/dist/release/hbase</a>). In the <em>version directory</em> run the following commands:</p> +<p>Sign, fingerprint and then 'stage' your release candidate version directory via svnpubsub by committing your directory to <a href="https://dist.apache.org/repos/dist/dev/hbase/">The 'dev' distribution directory</a> (See comments on <a href="https://issues.apache.org/jira/browse/HBASE-10554">HBASE-10554 Please delete old releases from mirroring system</a> but in essence, it is an svn checkout of <a href="https://dist.apache.org/repos/dist/dev/hbase" class="bare">https://dist.apache.org/repos/dist/dev/hbase</a>. And releases are at <a href="https://dist.apache.org/repos/dist/release/hbase" class="bare">https://dist.apache.org/repos/dist/release/hbase</a>). In the <em>version directory</em> run the following commands:</p> <div class="listingblock"> <div class="content"> <pre class="CodeRay highlight"><code data-lang="bourne">$ for i in *.tar.gz; do echo $i; gpg --print-mds $i > $i.mds ; done @@ -28921,7 +28922,7 @@ For example, to skip the tests in <code>hbase-server</code> and <code>hbase-comm <h3 id="hbase.unittests"><a class="anchor" href="#hbase.unittests"></a>151.2. 
Unit Tests</h3> <div class="paragraph"> <p>Apache HBase test cases are subdivided into four categories: small, medium, large, and -integration with corresponding JUnit <a href="http://www.junit.org/node/581">categories</a>: <code>SmallTests</code>, <code>MediumTests</code>, <code>LargeTests</code>, <code>IntegrationTests</code>. +integration with corresponding JUnit <a href="https://github.com/junit-team/junit4/wiki/Categories">categories</a>: <code>SmallTests</code>, <code>MediumTests</code>, <code>LargeTests</code>, <code>IntegrationTests</code>. JUnit categories are denoted using java annotations and look like this in your unit test code.</p> </div> <div class="listingblock"> @@ -29376,7 +29377,7 @@ For other deployment options, a ClusterManager can be implemented and plugged in <h4 id="maven.build.commands.integration.tests.destructive"><a class="anchor" href="#maven.build.commands.integration.tests.destructive"></a>151.5.3. Destructive integration / system tests (ChaosMonkey)</h4> <div class="paragraph"> <p>HBase 0.96 introduced a tool named <code>ChaosMonkey</code>, modeled after -<a href="http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html">same-named tool by Netflix’s Chaos Monkey tool</a>. +<a href="https://netflix.github.io/chaosmonkey/">Netflix’s tool of the same name</a>. ChaosMonkey simulates real-world faults in a running cluster by killing or disconnecting random servers, or injecting other failures into the environment. You can use ChaosMonkey as a stand-alone tool @@ -30257,7 +30258,7 @@ Use the <b class="button">Submit Patch</b> button in JIRA <div class="sect4"> <h5 id="_reject"><a class="anchor" href="#_reject"></a>Reject</h5> <div class="paragraph"> -<p>Patches which do not adhere to the guidelines in <a href="https://wiki.apache.org/hadoop/Hbase/HowToCommit/hadoop/Hbase/HowToContribute#">HowToContribute</a> and to the <a href="https://wiki.apache.org/hadoop/Hbase/HowToCommit/hadoop/CodeReviewChecklist#">code review checklist</a> should be rejected. +<p>Patches which do not adhere to the guidelines in <a href="https://hbase.apache.org/book.html#developer">HowToContribute</a> and to the <a href="https://wiki.apache.org/hadoop/CodeReviewChecklist">code review checklist</a> should be rejected. Committers should always be polite to contributors and try to instruct and encourage them to contribute better patches. If a committer wishes to improve an unacceptable patch, then it should first be rejected, and a new patch should be attached by the committer for review.</p> </div> @@ -30619,7 +30620,7 @@ For information on unit tests for HBase itself, see <a href="#hbase.tests">hbase <h2 id="_junit"><a class="anchor" href="#_junit"></a>153. JUnit</h2> <div class="sectionbody"> <div class="paragraph"> -<p>HBase uses <a href="http://junit.org">JUnit</a> 4 for unit tests</p> +<p>HBase uses <a href="http://junit.org">JUnit</a> for unit tests.</p> </div> <div class="paragraph"> <p>This example will add unit tests to the following example class:</p> </div> @@ -30956,7 +30957,7 @@ Starting the mini-cluster takes about 20-30 seconds, but that should be appropri <h2 id="_protobuf"><a class="anchor" href="#_protobuf"></a>157. 
Protobuf</h2> <div class="sectionbody"> <div class="paragraph"> -<p>HBase uses Google’s <a href="http://protobuf.protobufs">protobufs</a> wherever +<p>HBase uses Google’s <a href="https://developers.google.com/protocol-buffers/">protobufs</a> wherever it persists metadata — in the tail of hfiles or Cells written by HBase into the system hbase:meta table or when HBase writes znodes to zookeeper, etc. — and when it passes objects over the wire making @@ -31299,7 +31300,7 @@ The ZooKeeper client and server libraries manage their own ticket refreshment by <div class="sect2"> <h3 id="_hbase_managed_zookeeper_configuration"><a class="anchor" href="#_hbase_managed_zookeeper_configuration"></a>159.2. HBase-managed ZooKeeper Configuration</h3> <div class="paragraph"> -<p>On each node that will run a zookeeper, a master, or a regionserver, create a <a href="http://docs.oracle.com/javase/1.4.2/docs/guide/security/jgss/tutorials/LoginConfigFile.html">JAAS</a> configuration file in the conf directory of the node’s <em>HBASE_HOME</em> directory that looks like the following:</p> +<p>On each node that will run a zookeeper, a master, or a regionserver, create a <a href="http://docs.oracle.com/javase/7/docs/technotes/guides/security/jgss/tutorials/LoginConfigFile.html">JAAS</a> configuration file in the conf directory of the node’s <em>HBASE_HOME</em> directory that looks like the following:</p> </div> <div class="listingblock"> <div class="content"> @@ -31608,9 +31609,9 @@ It is a suggested policy rather than a hard requirement. We want to try it first to see if it works before we cast it in stone.</p> </div> <div class="paragraph"> -<p>Apache HBase is made of <a href="https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel">components</a>. +<p>Apache HBase is made of <a href="https://issues.apache.org/jira/projects/HBASE?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page">components</a>. Components have one or more <a href="#owner">OWNER</a>s. -See the 'Description' field on the <a href="https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel">components</a> JIRA page for who the current owners are by component.</p> +See the 'Description' field on the <a href="https://issues.apache.org/jira/projects/HBASE?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page">components</a> JIRA page for who the current owners are by component.</p> </div> <div class="paragraph"> <p>Patches that fit within the scope of a single Apache HBase component require, at least, a +1 by one of the component’s owners before commit. @@ -31661,7 +31662,7 @@ We also are currently in violation of this basic tenet — repli <div class="sectionbody"> <div id="owner" class="paragraph"> <div class="title">Component Owner/Lieutenant</div> -<p>Component owners are listed in the description field on this Apache HBase JIRA <a href="https://issues.apache.org/jira/browse/HBASE#selectedTab=com.atlassian.jira.plugin.system.project%3Acomponents-panel">components</a> page. +<p>Component owners are listed in the description field on this Apache HBase JIRA <a href="https://issues.apache.org/jira/projects/HBASE?selectedItem=com.atlassian.jira.jira-projects-plugin:components-page">components</a> page. 
The owners are listed in the 'Description' field rather than in the 'Component Lead' field because the latter only allows us to list one individual whereas it is encouraged that components have multiple owners.</p> </div> <div class="paragraph"> @@ -33737,8 +33738,7 @@ See <a href="#brand.new.compressor">brand.new.compressor</a>).</p> <div id="lzo.compression" class="paragraph"> <div class="title">Install LZO Support</div> <p>HBase cannot ship with LZO because of incompatibility between HBase, which uses an Apache Software License (ASL) and LZO, which uses a GPL license. -See the <a href="http://wiki.apache.org/hadoop/UsingLzoCompression">Using LZO - Compression</a> wiki page for information on configuring LZO support for HBase.</p> +See the <a href="https://github.com/twitter/hadoop-lzo/blob/master/README.md">Hadoop-LZO at Twitter</a> for information on configuring LZO support for HBase.</p> </div> <div class="paragraph"> <p>If you depend upon LZO compression, consider configuring your RegionServers to fail to start if LZO is not available. @@ -34666,22 +34666,21 @@ For more information see <a href="#hbase.encryption.server">hbase.encryption.ser <div class="title">Introduction to HBase</div> <ul> <li> -<p><a href="http://www.cloudera.com/content/cloudera/en/resources/library/presentation/chicago_data_summit_apache_hbase_an_introduction_todd_lipcon.html">Introduction to HBase</a> by Todd Lipcon (Chicago Data Summit 2011).</p> +<p><a href="https://vimeo.com/23400732">Introduction to HBase</a> by Todd Lipcon (Chicago Data Summit 2011).</p> </li> <li> -<p><a href="http://www.cloudera.com/videos/intorduction-hbase-todd-lipcon">Introduction to HBase</a> by Todd Lipcon (2010). -<a href="http://www.cloudera.com/videos/hadoop-world-2011-presentation-video-building-realtime-big-data-services-at-facebook-with-hadoop-and-hbase">Building Real Time Services at Facebook with HBase</a> by Jonathan Gray (Hadoop World 2011).</p> +<p><a href="https://vimeo.com/26804675">Building Real Time Services at Facebook with HBase</a> by Jonathan Gray (Berlin Buzzwords 2011).</p> +</li> +<li> +<p><a href="http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop">The Multiple Uses Of HBase</a> by Jean-Daniel Cryans (Berlin Buzzwords 2011).</p> </li> </ul> </div> -<div class="paragraph"> -<p><a href="http://www.cloudera.com/videos/hw10_video_how_stumbleupon_built_and_advertising_platform_using_hbase_and_hadoop">HBase and Hadoop, Mixing Real-Time and Batch Processing at StumbleUpon</a> by JD Cryans (Hadoop World 2010).</p> -</div> </div> <div class="sect2"> <h3 id="other.info.pres"><a class="anchor" href="#other.info.pres"></a>I.2. 
HBase Presentations (Slides)</h3> <div class="paragraph"> -<p><a href="http://www.cloudera.com/content/cloudera/en/resources/library/hadoopworld/hadoop-world-2011-presentation-video-advanced-hbase-schema-design.html">Advanced HBase Schema Design</a> by Lars George (Hadoop World 2011).</p> +<p><a href="https://www.slideshare.net/cloudera/hadoop-world-2011-advanced-hbase-schema-design-lars-george-cloudera">Advanced HBase Schema Design</a> by Lars George (Hadoop World 2011).</p> </div> <div class="paragraph"> <p><a href="http://www.slideshare.net/cloudera/chicago-data-summit-apache-hbase-an-introduction">Introduction to HBase</a> by Todd Lipcon (Chicago Data Summit 2011).</p> @@ -34707,15 +34706,8 @@ For more information see <a href="#hbase.encryption.server">hbase.encryption.ser <div class="paragraph"> <p><a href="https://blog.cloudera.com/blog/category/hbase/">Cloudera’s HBase Blog</a> has a lot of links to useful HBase information.</p> </div> -<div class="ulist"> -<ul> -<li> -<p><a href="https://blog.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/">CAP Confusion</a> is a relevant entry for background information on distributed storage systems.</p> -</li> -</ul> -</div> <div class="paragraph"> -<p><a href="http://wiki.apache.org/hadoop/HBase/HBasePresentations">HBase Wiki</a> has a page with a number of presentations.</p> +<p><a href="https://blog.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/">CAP Confusion</a> is a relevant entry for background information on distributed storage systems.</p> </div> <div class="paragraph"> <p><a href="http://refcardz.dzone.com/refcardz/hbase">HBase RefCard</a> from DZone.</p> @@ -35240,7 +35232,7 @@ The server will return cellblocks compressed using this same compressor as long <div id="footer"> <div id="footer-text"> Version 3.0.0-SNAPSHOT<br> -Last updated 2017-10-27 14:29:36 UTC +Last updated 2017-10-28 14:29:35 UTC </div> </div> </body>
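The book.html diff above repeatedly re-links Table.delete(List) and Admin.majorCompact(TableName), the two calls involved in bulk deletes and tombstone clean-up. As a closing illustration, here is a hedged sketch of how they fit together; the table name "t1" and the row count are invented for the example and are not part of this commit:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BulkDeleteSketch {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        TableName name = TableName.valueOf("t1");
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(name);
             Admin admin = connection.getAdmin()) {
          // Batch the Deletes into one Table.delete(List) call instead of
          // paying one RegionServer RPC per row.
          List<Delete> deletes = new ArrayList<>();
          for (int i = 0; i < 1000; i++) {
            deletes.add(new Delete(Bytes.toBytes("row-" + i)));
          }
          table.delete(deletes);

          // The deletes only write tombstones; the space is reclaimed when a
          // major compaction runs, which can be requested explicitly.
          admin.majorCompact(name);
        }
      }
    }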
http://git-wip-us.apache.org/repos/asf/hbase-site/blob/5018ccb3/bulk-loads.html ---------------------------------------------------------------------- diff --git a/bulk-loads.html b/bulk-loads.html index f94c657..6bfce3e 100644 --- a/bulk-loads.html +++ b/bulk-loads.html @@ -7,7 +7,7 @@ <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20171027" /> + <meta name="Date-Revision-yyyymmdd" content="20171028" /> <meta http-equiv="Content-Language" content="en" /> <title>Apache HBase – Bulk Loads in Apache HBase (TM) @@ -311,7 +311,7 @@ under the License. --> <a href="https://www.apache.org/">The Apache Software Foundation</a>. All rights reserved. - <li id="publishDate" class="pull-right">Last Published: 2017-10-27</li> + <li id="publishDate" class="pull-right">Last Published: 2017-10-28</li> </p> </div>