ndimiduk commented on a change in pull request #1597: URL: https://github.com/apache/hbase/pull/1597#discussion_r417530403
##########
File path: conf/hbase-site.xml
##########
@@ -21,4 +20,31 @@
  */
 -->
 <configuration>
+  <!--
+    The following properties are set for running HBase as a single process on a
+    developer workstation. With this configuration, HBase is running in
+    "stand-alone" mode and without a distributed file system. In this mode, and
+    without further configuration, HBase and ZooKeeper data are stored on the
+    local filesystem, in a path under the value configured for `hbase.tmp.dir`.
+    This value is overridden from its default value of `/tmp` because many
+    systems clean `/tmp` on a regular basis. Instead, it points to a path within
+    this HBase installation directory.
+
+    Running against the `LocalFileSystem`, as opposed to a distributed
+    filesystem, runs the risk of data integrity issues and data loss. Normally
+    HBase will refuse to run in such an environment. Setting
+    `hbase.unsafe.stream.capability.enforce` to `false` overrides this behavior,
+    permitting operation. This configuration is for the developer workstation
+    only and __should not be used in production!__
+
+    See also https://hbase.apache.org/book.html#standalone_dist
+  -->
+  <property>
+    <name>hbase.tmp.dir</name>
+    <value>./tmp</value>

Review comment:
Hadoop `Configuration` supports environment variables. Let me make this even more explicit by changing this value to `${env.HBASE_HOME:-.}/tmp`.

##########
File path: conf/hbase-site.xml
##########
@@ -21,4 +20,31 @@
  */
 -->
 <configuration>
+  <!--
+    The following properties are set for running HBase as a single process on a
+    developer workstation. With this configuration, HBase is running in
+    "stand-alone" mode and without a distributed file system. In this mode, and
+    without further configuration, HBase and ZooKeeper data are stored on the
+    local filesystem, in a path under the value configured for `hbase.tmp.dir`.
+    This value is overridden from its default value of `/tmp` because many
+    systems clean `/tmp` on a regular basis. Instead, it points to a path within
+    this HBase installation directory.
+
+    Running against the `LocalFileSystem`, as opposed to a distributed
+    filesystem, runs the risk of data integrity issues and data loss. Normally
+    HBase will refuse to run in such an environment. Setting
+    `hbase.unsafe.stream.capability.enforce` to `false` overrides this behavior,
+    permitting operation. This configuration is for the developer workstation
+    only and __should not be used in production!__

Review comment:
> The only other thing I could think of would be to carry `hbase.cluster.distributed=false` here to try to be more confident that they are actually changing the config, giving it some thought.

This is the default value, but sure, I see no harm in adding it here explicitly. A user is expected to edit that value when they go to production anyway, so there's no harm in saving them some typing.

> I could imagine a process which starts with hbase-site.xml from a release tarball, and then additions on top of that. This change would adversely affect them.

You're correct, this change affects them. After reflection, I prefer the change where we ship these defaults configured for development/testing users, instead of the initial change that altered the code.
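
For illustration only, here is a rough sketch of how the property block in `conf/hbase-site.xml` might read once both suggestions from this thread are applied: the `${env.HBASE_HOME:-.}/tmp` value for `hbase.tmp.dir` and an explicit `hbase.cluster.distributed=false`. The ordering of the properties and the inline comments are assumptions made for this sketch, not the committed file.

```xml
<configuration>
  <!-- Explicitly carry the default so that users must consciously edit it
       when moving to a distributed deployment. -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
  <!-- Resolve tmp under HBASE_HOME when that environment variable is set,
       falling back to the current working directory otherwise. -->
  <property>
    <name>hbase.tmp.dir</name>
    <value>${env.HBASE_HOME:-.}/tmp</value>
  </property>
  <!-- Developer workstation only: permits running against LocalFileSystem.
       Should not be used in production. -->
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>
</configuration>
```

With this form, starting HBase from any working directory should still place data under the installation directory as long as `HBASE_HOME` is exported; if it is not, the value falls back to `./tmp`, which matches the original proposal.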
