Branching off this subject from the original thread.

On Wed, Apr 15, 2020 at 9:56 AM Andrew Purtell <andrew.purt...@gmail.com>
wrote:

> Quick Start and Production are exclusive configurations.
>

Yes absolutely.

> Quick Start, as you say, should have as few steps to up and running as
> possible.
>
> Production requires a real distributed filesystem for persistence and that
> means HDFS and that means, whatever the provisioning and deployment and
> process management (Ambari or k8s or...) choices are going to be, they will
> not be a Quick Start.
>
> We’ve always had this problem. The Quick Start simply can’t produce a
> system capable of durability because prerequisites for durability are not
> quick to set up.
>

Is this exclusively due to the implementation of `LocalFileSystem` or are
there other issues at play? I've seen there's also `RawLocalFileSystem` but
I haven't investigated how the two relate, what its capabilities are, or
whether we might profit from using it for the Quick Start experience.

> Specifically about /tmp... I agree that’s not a good default. Time and
> again I’ve heard people complain that the tmp cleaner has removed their
> test data. It shouldn’t be surprising but is and that is real feedback on
> mismatch of user expectation to what we are providing in that
> configuration. Addressing this aspect of the Quick Start experience would
> be a simple change: make the default a new directory in $HOME, perhaps
> “.hbase” .
>

I propose changing the default value of `hbase.tmp.dir` as shipped in the
default hbase-site.xml to be simply `tmp`, as I documented in my change on
HBASE-24106. That way it's not hidden somewhere and it's self-contained to
this unpacking of the source/binary distribution. I.e., there's no need to
worry about upgrading the data stored there when a user experiments with a
new version.
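
For illustration, the shipped hbase-site.xml entry would look roughly like
the fragment below. This is a sketch of the proposal only, not the final
text of the change on HBASE-24106. My assumption is that a relative value
is resolved against the directory HBase is launched from, which is what
keeps the data inside the unpacked distribution.

```xml
<!-- Sketch of the proposed default; `tmp` is the value suggested above. -->
<!-- Assumption: a relative path resolves against the launch directory. -->
<property>
  <name>hbase.tmp.dir</name>
  <value>tmp</value>
</property>
```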

> On Apr 15, 2020, at 9:40 AM, Nick Dimiduk <ndimi...@apache.org> wrote:
> >
> > Hi folks,
> >
> > I'd like to bring up the topic of the experience of new users as it
> > pertains to use of the `LocalFileSystem` and its associated (lack of)
> > data durability guarantees. By default, an unconfigured HBase runs with
> > its root directory on a `file:///` path. This path is picked up as an
> > instance of `LocalFileSystem`. Hadoop has long offered this class, but
> > it has never supported `hsync` or `hflush` stream characteristics. Thus,
> > when HBase runs on this configuration, it is unable to ensure that WAL
> > writes are durable, and thus will ACK a write without this assurance.
> > This is the case, even when running in a fully durable WAL mode.
> >
> > This impacts a new user, someone kicking the tires on HBase following our
> > Getting Started docs. On Hadoop 2.8 and before, an unconfigured HBase
> > will WARN and carry on. On Hadoop 2.10+, HBase will refuse to start. The
> > book describes disabling stream capability enforcement as a first step.
> > This is a mandatory configuration for running HBase directly out of our
> > binary distribution.
> >
> > HBASE-24086 restores the behavior on Hadoop 2.10+ to that of running on
> > 2.8: log a warning and carry on. The critique of this approach is that
> > it's far too subtle, too quiet for a system operating in a state known
> > to not provide data durability.
> >
> > I have two assumptions/concerns around the state of things, which
> > prompted my solution on HBASE-24086 and the associated doc update on
> > HBASE-24106.
> >
> > 1. No one should be running a production system on `LocalFileSystem`.
> >
> > The initial implementation checked both for `LocalFileSystem` and
> > `hbase.cluster.distributed`. When running on the former and the latter is
> > false, we assume the user is running a non-production deployment and
> > carry on with the warning. When the latter is true, we assume the user
> > intended a production deployment and the process terminates due to
> > stream capability enforcement. Subsequent code review resulted in
> > skipping the `hbase.cluster.distributed` check and simply warning, as
> > was done on 2.8 and earlier.
> >
> > (As I understand it, we've long used the `hbase.cluster.distributed`
> > configuration to decide if the user intends this runtime to be a
> > production deployment or not.)
> >
> > Is this a faulty assumption? Is there a use-case we support where we
> > condone running a production deployment on the non-durable
> > `LocalFileSystem`?
> >
> > 2. The Quick Start experience should require no configuration at all.
> >
> > Our stack is difficult enough to run in a fully durable production
> > environment. We should make it a priority to ensure it's as easy as
> > possible to try out HBase. Forcing a user to make decisions about data
> > durability before they even launch the web ui is a terrible experience,
> > in my opinion, and should be a non-starter for us as a project.
> >
> > (In my opinion, the need to configure either `hbase.rootdir` or
> > `hbase.tmp.dir` away from `/tmp` is equally bad for a Getting Started
> > experience. It is a second, more subtle question of data durability that
> > we should avoid out of the box. But I'm happy to leave that for another
> > thread.)
> >
> > Thank you for your time,
> > Nick
>
