I think this is the core point to touch on.

As people who work on distributed systems, we naturally default to the line of thinking: "if you put your data on one node, you can't survive that node failing". Do our users immediately come to that conclusion too? If not, is it acceptable for us simply to do a better job of reminding them of this caveat?

As I think Busbey was alluding to earlier, we have room to be more prescriptive about how you can run HBase on the local filesystem, with the big fat warning that it is then no longer a distributed system that can survive node failure. As far as I remember, using RawLocalFileSystem (plus HBase changes to avoid any unwrapping back into LocalFileSystem) would be a nice little improvement that should be relatively straightforward.
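For illustration, and assuming Hadoop's usual fs.<scheme>.impl override is all that's needed on the Hadoop side, pointing the file:// scheme at RawLocalFileSystem might look something like this (untested sketch, not an actual patch):

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;

  // Untested sketch: resolve file:// to RawLocalFileSystem so writes go
  // straight to a FileOutputStream rather than the checksummed wrapper.
  Configuration conf = new Configuration();
  conf.set("fs.file.impl", "org.apache.hadoop.fs.RawLocalFileSystem");
  FileSystem fs = FileSystem.get(URI.create("file:///"), conf);
  // HBase would still need changes so it never unwraps this back into
  // LocalFileSystem, as mentioned above.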

Would that be acceptable in your eyes, Andrew? Or is the issue more fundamental in your mind: that we should not be telling users how to run HBase in a manner that can't survive the failure of even one node?

I think people are always going to come up with unexpected ways to run HBase. They're going to slap it on top of random filesystems. I don't think we can keep on top of every possible permutation (especially if we consider things like persistent volumes from K8s, AWS, or Azure that try to make traditionally non-fault-tolerant volumes magically fault-tolerant).

On 4/16/20 11:59 AM, Andrew Purtell wrote:
The data cannot be said to be durable because there is one set of files that can be irreversibly corrupted or lost.

On Apr 15, 2020, at 3:52 PM, Vladimir Rodionov <[email protected]> wrote:

FileOutputStream.getChannel().force(true) will get us all the durability we need. Just a simple code change?
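Concretely, that would be something along these lines (sketch only; force(true) asks the OS to push both data and file metadata down to the device):

  import java.io.FileOutputStream;

  // Sketch: fsync the local WAL file after writing an edit.
  // (Caller handles IOException.)
  byte[] walEdit = new byte[]{ /* serialized WAL entry */ };
  try (FileOutputStream out = new FileOutputStream("/path/to/wal", true)) {
    out.write(walEdit);
    out.getChannel().force(true);  // data + metadata flushed to disk
  }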


On Wed, Apr 15, 2020 at 12:32 PM Andrew Purtell <[email protected]>
wrote:

This thread talks of "durability" via filesystem characteristics but also for single-system quickstart-type deployments. For durability we need multi-server deployments. No amount of hacking on a single-system deployment is going to give us durability as users will expect it ("don't lose my data"). I believe my comments are on topic.


On Apr 15, 2020, at 11:03 AM, Nick Dimiduk <[email protected]> wrote:

On Wed, Apr 15, 2020 at 10:28 AM Andrew Purtell <[email protected]>
wrote:

Nick's mail doesn't make a distinction between avoiding data loss via typical tmp-cleaner configurations (unfortunately adjacent to the mention of "durability") and real data durability, which implies more than what a single-system configuration can offer, no matter how many tweaks we make to LocalFileSystem. Maybe I'm being pedantic, but this is something to be really clear about, IMHO.


I prefer to focus the attention of this thread on the question of data durability via `FileSystem` characteristics. I agree that there are durability (and other) concerns around the use of the path under /tmp. Let's keep that discussion in the other thread.

On Wed, Apr 15, 2020 at 10:05 AM Sean Busbey <[email protected]> wrote:

I think the first assumption no longer holds. Especially with the move to flexible compute environments, I regularly get asked what the smallest HBase someone can start with for production looks like. I can keep saying 3/5/7 nodes or whatever, but I guarantee there are folks who want to, and will, run HBase with a single node. Probably those deployments won't want to have the distributed flag set. None of them really have a good option for where the WALs go, and failing loudly when they try to go to LocalFileSystem is the best option I've seen so far to make sure folks realize they are getting into muddy waters.

I agree with the second assumption. Our quickstart in general is too complicated. Maybe, if we include big warnings in the guide itself, we could make a quickstart-specific artifact to download that already has the config that disables the unsafe stream capability enforcement in place?

Last fall I toyed with the idea of adding an "hbase-local" module to the hbase-filesystem repo that could start us out with some optimizations for single-node setups. We could start with a fork of RawLocalFileSystem (which will call OutputStream flush operations in response to hflush/hsync) that properly advertises in its StreamCapabilities that it supports the operations we need. Alternatively, we could make our own FileSystem implementation that uses NIO. Either of these approaches would solve both problems.
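To sketch the first option (class name and bodies below are hypothetical, not from the repo): the wrapper stream would implement both Syncable and StreamCapabilities, so the capability check passes and hflush/hsync actually reach the local file.

  import java.io.FileOutputStream;
  import java.io.FilterOutputStream;
  import java.io.IOException;
  import org.apache.hadoop.fs.StreamCapabilities;
  import org.apache.hadoop.fs.Syncable;

  // Hypothetical output stream for an "hbase-local" module: advertises
  // hflush/hsync and maps them onto flush() / FileChannel.force().
  class LocalSyncableOutputStream extends FilterOutputStream
      implements Syncable, StreamCapabilities {
    private final FileOutputStream fileOut;

    LocalSyncableOutputStream(FileOutputStream out) {
      super(out);
      this.fileOut = out;
    }

    @Override
    public boolean hasCapability(String capability) {
      return StreamCapabilities.HFLUSH.equals(capability)
          || StreamCapabilities.HSYNC.equals(capability);
    }

    @Override
    public void hflush() throws IOException {
      flush();  // push buffered bytes to the OS
    }

    @Override
    public void hsync() throws IOException {
      flush();
      fileOut.getChannel().force(true);  // fsync: data reaches the device
    }
  }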

On Wed, Apr 15, 2020 at 11:40 AM Nick Dimiduk <[email protected]>
wrote:

Hi folks,

I'd like to bring up the topic of the experience of new users as it pertains to use of the `LocalFileSystem` and its associated (lack of) data durability guarantees. By default, an unconfigured HBase runs with its root directory on a `file:///` path. This path is picked up as an instance of `LocalFileSystem`. Hadoop has long offered this class, but it has never supported the `hsync` or `hflush` stream characteristics. Thus, when HBase runs on this configuration, it is unable to ensure that WAL writes are durable, and so will ACK a write without this assurance. This is the case even when running in a fully durable WAL mode.
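To make the mechanism concrete, here is a rough illustration (assuming Hadoop 2.9+, where StreamCapabilities exists) of what a capability probe sees on the default setup:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.StreamCapabilities;

  // Rough illustration: streams from LocalFileSystem do not advertise
  // hflush/hsync, so durable WAL writes cannot be guaranteed.
  FileSystem fs = FileSystem.getLocal(new Configuration());
  try (FSDataOutputStream out = fs.create(new Path("/tmp/wal-probe"))) {
    System.out.println("hsync supported: "
        + out.hasCapability(StreamCapabilities.HSYNC));  // expected: false
  }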

This impacts a new user, someone kicking the tires on HBase following our Getting Started docs. On Hadoop 2.8 and before, an unconfigured HBase will WARN and carry on. On Hadoop 2.10+, HBase will refuse to start. The book describes a process of disabling stream capability enforcement as a first step. This is a mandatory configuration for running HBase directly out of our binary distribution.
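(For anyone following along, the property in question is hbase.unsafe.stream.capability.enforce; in programmatic form, the book's first step amounts to roughly this:)

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  // What the quickstart currently requires before an unconfigured start on
  // Hadoop 2.10+: tell HBase not to fail fast when the WAL filesystem
  // cannot promise hflush/hsync. Normally set in hbase-site.xml.
  Configuration conf = HBaseConfiguration.create();
  conf.setBoolean("hbase.unsafe.stream.capability.enforce", false);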

HBASE-24086 restores the behavior on Hadoop 2.10+ to that of running on 2.8: log a warning and carry on. The critique of this approach is that it's far too subtle, too quiet, for a system operating in a state known not to provide data durability.

I have two assumptions/concerns around the state of things, which prompted my solution on HBASE-24086 and the associated doc update on HBASE-24106.

1. No one should be running a production system on `LocalFileSystem`.

The initial implementation checked both for `LocalFileSystem` and `hbase.cluster.distributed`. When running on the former and the latter is false, we assume the user is running a non-production deployment and carry on with the warning. When the latter is true, we assume the user intended a production deployment and the process terminates due to stream capability enforcement. Subsequent code review resulted in skipping the `hbase.cluster.distributed` check and simply warning, as was done on 2.8 and earlier.
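In rough pseudocode (names simplified, not the actual patch), that initial check looked something like:

  // Rough sketch of the check described above (conf, fs, LOG assumed in scope).
  boolean distributed = conf.getBoolean("hbase.cluster.distributed", false);
  if (fs instanceof org.apache.hadoop.fs.LocalFileSystem) {
    if (distributed) {
      // Looks like an intended production deployment: keep enforcing.
      throw new java.io.IOException(
          "LocalFileSystem does not support hsync/hflush");
    }
    // Assumed non-production deployment: warn and carry on, as on Hadoop 2.8.
    LOG.warn("Data is not durable on LocalFileSystem; not for production use");
  }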

(As I understand it, we've long used the `hbase.cluster.distributed` configuration to decide if the user intends this runtime to be a production deployment or not.)

Is this a faulty assumption? Is there a use case we support where we condone running a production deployment on the non-durable `LocalFileSystem`?

2. The Quick Start experience should require no configuration at all.

Our stack is difficult enough to run in a fully durable production environment. We should make it a priority to ensure it's as easy as possible to try out HBase. Forcing a user to make decisions about data durability before they even launch the web UI is a terrible experience, in my opinion, and should be a non-starter for us as a project.

(In my opinion, the need to configure either `hbase.rootdir` or `hbase.tmp.dir` away from `/tmp` is equally bad for a Getting Started experience. It is a second, more subtle question of data durability that we should avoid out of the box. But I'm happy to leave that for another thread.)

Thank you for your time,
Nick



--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
  - A23, Crosstalk

