I think this is the core point to touch on.

As people who work on distributed systems, we naturally default to the line of thinking: "if you put your data on one node, you can't survive that node failing". Do our users immediately come to that conclusion too? If not, is it acceptable for us simply to do a better job of reminding them of this caveat?

As I think Busbey was alluding to earlier, we have room to be more prescriptive about how you can run HBase on the local filesystem, with the big fat warning that it is then no longer a distributed system that can survive node failure. As far as I remember, using RawLocalFileSystem (plus HBase changes to avoid any unwrapping back into LocalFileSystem) would be a nice little improvement that should be relatively straightforward.
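For illustration, and assuming Hadoop's usual fs.<scheme>.impl override is all that's needed on the Hadoop side, pointing the file:// scheme at RawLocalFileSystem might look something like this (untested sketch, not an actual patch):

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;

  // Untested sketch: resolve file:// to RawLocalFileSystem so writes go
  // straight to a FileOutputStream rather than the checksummed wrapper.
  Configuration conf = new Configuration();
  conf.set("fs.file.impl", "org.apache.hadoop.fs.RawLocalFileSystem");
  FileSystem fs = FileSystem.get(URI.create("file:///"), conf);
  // HBase would still need changes so it never unwraps this back into
  // LocalFileSystem, as mentioned above.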

Would that be acceptable in your eyes, Andrew? Or is the issue more fundamental in your mind: that we should not be telling users how to run HBase in a manner that can't survive the failure of even one node?

I think people are always going to come up with unexpected ways to run HBase. They're going to slap it on top of random filesystems. I don't think we can keep on top of every possible permutation (especially if we consider things like persistent volumes from K8s, AWS, or Azure that try to make traditionally non-fault-tolerant volumes magically fault-tolerant).

On 4/16/20 11:59 AM, Andrew Purtell wrote:
The data cannot be said to be durable because there is one set of files that can be irreversibly corrupted or lost.

On Apr 15, 2020, at 3:52 PM, Vladimir Rodionov <[email protected]> wrote:

FileOutputStream.getChannel().force(true) will get us all the durability we need. Just a simple code change?
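Concretely, that would be something along these lines (sketch only; force(true) asks the OS to push both data and file metadata down to the device):

  import java.io.FileOutputStream;

  // Sketch: fsync the local WAL file after writing an edit.
  // (Caller handles IOException.)
  byte[] walEdit = new byte[]{ /* serialized WAL entry */ };
  try (FileOutputStream out = new FileOutputStream("/path/to/wal", true)) {
    out.write(walEdit);
    out.getChannel().force(true);  // data + metadata flushed to disk
  }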


On Wed, Apr 15, 2020 at 12:32 PM Andrew Purtell <[email protected]>
wrote:

This thread talks of "durability" via filesystem characteristics but also for single-system quickstart-type deployments. For durability we need multi-server deployments. No amount of hacking on a single-system deployment is going to give us durability as users will expect it ("don't lose my data"). I believe my comments are on topic.


On Apr 15, 2020, at 11:03 AM, Nick Dimiduk <[email protected]> wrote:

On Wed, Apr 15, 2020 at 10:28 AM Andrew Purtell <[email protected]>
wrote:

Nick's mail doesn't make a distinction between avoiding data loss via typical tmp-cleaner configurations (unfortunately adjacent to the mention of "durability") and real data durability, which implies more than what a single-system configuration can offer, no matter how many tweaks we make to LocalFileSystem. Maybe I'm being pedantic, but this is something to be really clear about, IMHO.


I prefer to focus the attention of this thread on the question of data durability via `FileSystem` characteristics. I agree that there are durability (and other) concerns around the use of the path under /tmp. Let's keep that discussion in the other thread.

On Wed, Apr 15, 2020 at 10:05 AM Sean Busbey <[email protected]> wrote:

I think the first assumption no longer holds. Especially with the move to flexible compute environments, I regularly get asked what the smallest HBase someone can start with for production looks like. I can keep saying 3/5/7 nodes or whatever, but I guarantee there are folks who want to, and will, run HBase with a single node. Probably those deployments won't want to have the distributed flag set. None of them really have a good option for where the WALs go, and failing loudly when they try to go to LocalFileSystem is the best option I've seen so far to make sure folks realize they are getting into muddy waters.

I agree with the second assumption. Our quickstart in general is too complicated. Maybe, if we include big warnings in the guide itself, we could make a quickstart-specific artifact to download that already has the config that disables the unsafe stream capability enforcement in place?

Last fall I toyed with the idea of adding an "hbase-local" module to the hbase-filesystem repo that could start us out with some optimizations for single-node setups. We could start with a fork of RawLocalFileSystem (which will call OutputStream flush operations in response to hflush/hsync) that properly advertises in its StreamCapabilities that it supports the operations we need. Alternatively, we could make our own FileSystem implementation that uses NIO. Either of these approaches would solve both problems.
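To sketch the first option (class name and bodies below are hypothetical, not from the repo): the wrapper stream would implement both Syncable and StreamCapabilities, so the capability check passes and hflush/hsync actually reach the local file.

  import java.io.FileOutputStream;
  import java.io.FilterOutputStream;
  import java.io.IOException;
  import org.apache.hadoop.fs.StreamCapabilities;
  import org.apache.hadoop.fs.Syncable;

  // Hypothetical output stream for an "hbase-local" module: advertises
  // hflush/hsync and maps them onto flush() / FileChannel.force().
  class LocalSyncableOutputStream extends FilterOutputStream
      implements Syncable, StreamCapabilities {
    private final FileOutputStream fileOut;

    LocalSyncableOutputStream(FileOutputStream out) {
      super(out);
      this.fileOut = out;
    }

    @Override
    public boolean hasCapability(String capability) {
      return StreamCapabilities.HFLUSH.equals(capability)
          || StreamCapabilities.HSYNC.equals(capability);
    }

    @Override
    public void hflush() throws IOException {
      flush();  // push buffered bytes to the OS
    }

    @Override
    public void hsync() throws IOException {
      flush();
      fileOut.getChannel().force(true);  // fsync: data reaches the device
    }
  }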

On Wed, Apr 15, 2020 at 11:40 AM Nick Dimiduk <[email protected]>
wrote:

Hi folks,

I'd like to bring up the topic of the experience of new users as it pertains to use of the `LocalFileSystem` and its associated (lack of) data durability guarantees. By default, an unconfigured HBase runs with its root directory on a `file:///` path. This path is picked up as an instance of `LocalFileSystem`. Hadoop has long offered this class, but it has never supported the `hsync` or `hflush` stream characteristics. Thus, when HBase runs on this configuration, it is unable to ensure that WAL writes are durable, and so will ACK a write without this assurance. This is the case even when running in a fully durable WAL mode.
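To make the mechanism concrete, here is a rough illustration (assuming Hadoop 2.9+, where StreamCapabilities exists) of what a capability probe sees on the default setup:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.StreamCapabilities;

  // Rough illustration: streams from LocalFileSystem do not advertise
  // hflush/hsync, so durable WAL writes cannot be guaranteed.
  FileSystem fs = FileSystem.getLocal(new Configuration());
  try (FSDataOutputStream out = fs.create(new Path("/tmp/wal-probe"))) {
    System.out.println("hsync supported: "
        + out.hasCapability(StreamCapabilities.HSYNC));  // expected: false
  }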

This impacts a new user, someone kicking the tires on HBase following our Getting Started docs. On Hadoop 2.8 and before, an unconfigured HBase will WARN and carry on. On Hadoop 2.10+, HBase will refuse to start. The book describes a process of disabling stream capability enforcement as a first step. This is a mandatory configuration for running HBase directly out of our binary distribution.
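(For anyone following along, the property in question is hbase.unsafe.stream.capability.enforce; in programmatic form, the book's first step amounts to roughly this:)

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  // What the quickstart currently requires before an unconfigured start on
  // Hadoop 2.10+: tell HBase not to fail fast when the WAL filesystem
  // cannot promise hflush/hsync. Normally set in hbase-site.xml.
  Configuration conf = HBaseConfiguration.create();
  conf.setBoolean("hbase.unsafe.stream.capability.enforce", false);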

HBASE-24086 restores the behavior on Hadoop 2.10+ to that of running on 2.8: log a warning and carry on. The critique of this approach is that it's far too subtle, too quiet, for a system operating in a state known not to provide data durability.

I have two assumptions/concerns around the state of things, which prompted my solution on HBASE-24086 and the associated doc update on HBASE-24106.

1. No one should be running a production system on `LocalFileSystem`.

The initial implementation checked both for `LocalFileSystem` and `hbase.cluster.distributed`. When running on the former and the latter is false, we assume the user is running a non-production deployment and carry on with the warning. When the latter is true, we assume the user intended a production deployment and the process terminates due to stream capability enforcement. Subsequent code review resulted in skipping the `hbase.cluster.distributed` check and simply warning, as was done on 2.8 and earlier.
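In rough pseudocode (names simplified, not the actual patch), that initial check looked something like:

  // Rough sketch of the check described above (conf, fs, LOG assumed in scope).
  boolean distributed = conf.getBoolean("hbase.cluster.distributed", false);
  if (fs instanceof org.apache.hadoop.fs.LocalFileSystem) {
    if (distributed) {
      // Looks like an intended production deployment: keep enforcing.
      throw new java.io.IOException(
          "LocalFileSystem does not support hsync/hflush");
    }
    // Assumed non-production deployment: warn and carry on, as on Hadoop 2.8.
    LOG.warn("Data is not durable on LocalFileSystem; not for production use");
  }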

(As I understand it, we've long used the `hbase.cluster.distributed` configuration to decide if the user intends this runtime to be a production deployment or not.)

Is this a faulty assumption? Is there a use case we support where we condone running a production deployment on the non-durable `LocalFileSystem`?

2. The Quick Start experience should require no configuration at all.

Our stack is difficult enough to run in a fully durable production environment. We should make it a priority to ensure it's as easy as possible to try out HBase. Forcing a user to make decisions about data durability before they even launch the web UI is a terrible experience, in my opinion, and should be a non-starter for us as a project.

(In my opinion, the need to configure either `hbase.rootdir` or `hbase.tmp.dir` away from `/tmp` is equally bad for a Getting Started experience. It is a second, more subtle question of data durability that we should avoid out of the box. But I'm happy to leave that for another thread.)

Thank you for your time,
Nick



--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
  - A23, Crosstalk

