Maybe we could keep the stream capability enforcement but provide a Docker image for the quick start? In the Docker image we could set the enforcement flag to false.
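To make the trade-off concrete, here is a minimal sketch of how an enforcement flag could gate startup. It assumes a single boolean flag and a single capability probe; the class, enum, and method names are hypothetical and are not HBase's actual code.

```java
// Hypothetical sketch; these names are illustrative, not HBase's real classes.
final class StreamCapabilityCheck {
    enum Outcome { OK, WARN_AND_CONTINUE, REFUSE_TO_START }

    private StreamCapabilityCheck() {}

    /**
     * Decides what to do at startup.
     *
     * @param supportsHflushAndHsync whether the WAL output stream advertises hflush/hsync
     * @param enforce                the enforcement flag (a quick-start image would ship it as false)
     */
    static Outcome check(boolean supportsHflushAndHsync, boolean enforce) {
        if (supportsHflushAndHsync) {
            return Outcome.OK;
        }
        // Without the capabilities, the flag alone decides between failing loudly and warning.
        return enforce ? Outcome.REFUSE_TO_START : Outcome.WARN_AND_CONTINUE;
    }

    public static void main(String[] args) {
        // Default distribution on a filesystem without hflush/hsync: enforcement stays on.
        System.out.println(check(false, true));
        // Quick-start image: same filesystem, flag shipped as false.
        System.out.println(check(false, false));
    }
}
```

With this shape the default download still fails loudly, and only the quick-start image opts into the warn-and-continue path.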
Vladimir Rodionov <vladrodio...@gmail.com> wrote on Thu, Apr 16, 2020 at 6:55 AM:

> This should work for locally attached storage for sure.
>
> On Wed, Apr 15, 2020 at 3:52 PM Vladimir Rodionov <vladrodio...@gmail.com>
> wrote:
>
> > FileOutputStream.getChannel().force(true) will get all durability we
> > need. Just a simple code change?
> >
> > On Wed, Apr 15, 2020 at 12:32 PM Andrew Purtell <andrew.purt...@gmail.com>
> > wrote:
> >
> >> This thread talks of “durability” via filesystem characteristics but also
> >> for single system quick start type deployments. For durability we need
> >> multi server deployments. No amount of hacking a single system deployment
> >> is going to give us durability as users will expect (“don’t lose my data”).
> >> I believe my comments are on topic.
> >>
> >> > On Apr 15, 2020, at 11:03 AM, Nick Dimiduk <ndimi...@apache.org> wrote:
> >> >
> >> > On Wed, Apr 15, 2020 at 10:28 AM Andrew Purtell <apurt...@apache.org>
> >> > wrote:
> >> >
> >> >> Nick's mail doesn't make a distinction between avoiding data loss via
> >> >> typical tmp cleaner configurations, unfortunately adjacent to mention of
> >> >> "durability", and real data durability, which implies more than what a
> >> >> single system configuration can offer, no matter how many tweaks we
> >> >> make to LocalFileSystem. Maybe I'm being pedantic but this is something
> >> >> to be really clear about IMHO.
> >> >
> >> > I prefer to focus the attention of this thread to the question of data
> >> > durability via `FileSystem` characteristics. I agree that there are
> >> > concerns of durability (and others) around the use of the path under
> >> > /tmp. Let's keep that discussion in the other thread.
> >> >
> >> >> On Wed, Apr 15, 2020 at 10:05 AM Sean Busbey <bus...@apache.org> wrote:
> >> >>
> >> >>> I think the first assumption no longer holds.
> >> >>> Especially with the move
> >> >>> to flexible compute environments I regularly get asked by folks what
> >> >>> the smallest HBase they can start with for production is. I can keep
> >> >>> saying 3/5/7 nodes or whatever but I guarantee there are folks who
> >> >>> want to and will run HBase with a single node. Probably those
> >> >>> deployments won't want to have the distributed flag set. None of them
> >> >>> really have a good option for where the WALs go, and failing loud when
> >> >>> they try to go to LocalFileSystem is the best option I've seen so far
> >> >>> to make sure folks realize they are getting into muddy waters.
> >> >>>
> >> >>> I agree with the second assumption. Our quickstart in general is too
> >> >>> complicated. Maybe if we include big warnings in the guide itself, we
> >> >>> could make a quickstart specific artifact to download that has the
> >> >>> unsafe disabling config in place?
> >> >>>
> >> >>> Last fall I toyed with the idea of adding an "hbase-local" module to
> >> >>> the hbase-filesystem repo that could start us out with some
> >> >>> optimizations for single node setups. We could start with a fork of
> >> >>> RawLocalFileSystem (which will call OutputStream flush operations in
> >> >>> response to hflush/hsync) that properly advertises its
> >> >>> StreamCapabilities to say that it supports the operations we need.
> >> >>> Alternatively we could make our own implementation of FileSystem that
> >> >>> uses NIO stuff. Either of these approaches would solve both problems.
> >> >>>
> >> >>> On Wed, Apr 15, 2020 at 11:40 AM Nick Dimiduk <ndimi...@apache.org>
> >> >>> wrote:
> >> >>>>
> >> >>>> Hi folks,
> >> >>>>
> >> >>>> I'd like to bring up the topic of the experience of new users as it
> >> >>>> pertains to use of the `LocalFileSystem` and its associated (lack of)
> >> >>>> data durability guarantees.
> >> >>>> By default, an unconfigured HBase runs with its
> >> >>>> root directory on a `file:///` path. This path is picked up as an
> >> >>>> instance of `LocalFileSystem`. Hadoop has long offered this class, but
> >> >>>> it has never supported `hsync` or `hflush` stream characteristics.
> >> >>>> Thus, when HBase runs on this configuration, it is unable to ensure
> >> >>>> that WAL writes are durable, and thus will ACK a write without this
> >> >>>> assurance. This is the case even when running in a fully durable WAL
> >> >>>> mode.
> >> >>>>
> >> >>>> This impacts a new user, someone kicking the tires on HBase following
> >> >>>> our Getting Started docs. On Hadoop 2.8 and before, an unconfigured
> >> >>>> HBase will WARN and carry on. On Hadoop 2.10+, HBase will refuse to
> >> >>>> start. The book describes a process of disabling stream capability
> >> >>>> enforcement as a first step. This is a mandatory configuration for
> >> >>>> running HBase directly out of our binary distribution.
> >> >>>>
> >> >>>> HBASE-24086 restores the behavior on Hadoop 2.10+ to that of running
> >> >>>> on 2.8: log a warning and carry on. The critique of this approach is
> >> >>>> that it's far too subtle, too quiet for a system operating in a state
> >> >>>> known to not provide data durability.
> >> >>>>
> >> >>>> I have two assumptions/concerns around the state of things, which
> >> >>>> prompted my solution on HBASE-24086 and the associated doc update on
> >> >>>> HBASE-24106.
> >> >>>>
> >> >>>> 1. No one should be running a production system on `LocalFileSystem`.
> >> >>>>
> >> >>>> The initial implementation checked both for `LocalFileSystem` and
> >> >>>> `hbase.cluster.distributed`.
> >> >>>> When running on the former and the latter
> >> >>>> is false, we assume the user is running a non-production deployment
> >> >>>> and carry on with the warning. When the latter is true, we assume the
> >> >>>> user intended a production deployment and the process terminates due
> >> >>>> to stream capability enforcement. Subsequent code review resulted in
> >> >>>> skipping the `hbase.cluster.distributed` check and simply warning, as
> >> >>>> was done on 2.8 and earlier.
> >> >>>>
> >> >>>> (As I understand it, we've long used the `hbase.cluster.distributed`
> >> >>>> configuration to decide if the user intends this runtime to be a
> >> >>>> production deployment or not.)
> >> >>>>
> >> >>>> Is this a faulty assumption? Is there a use-case we support where we
> >> >>>> condone running production deployment on the non-durable
> >> >>>> `LocalFileSystem`?
> >> >>>>
> >> >>>> 2. The Quick Start experience should require no configuration at all.
> >> >>>>
> >> >>>> Our stack is difficult enough to run in a fully durable production
> >> >>>> environment. We should make it a priority to ensure it's as easy as
> >> >>>> possible to try out HBase. Forcing a user to make decisions about data
> >> >>>> durability before they even launch the web ui is a terrible
> >> >>>> experience, in my opinion, and should be a non-starter for us as a
> >> >>>> project.
> >> >>>>
> >> >>>> (In my opinion, the need to configure either `hbase.rootdir` or
> >> >>>> `hbase.tmp.dir` away from `/tmp` is equally bad for a Getting Started
> >> >>>> experience. It is a second, more subtle question of data durability
> >> >>>> that we should avoid out of the box. But I'm happy to leave that for
> >> >>>> another thread.)
> >> >>>>
> >> >>>> Thank you for your time,
> >> >>>> Nick
> >> >>
> >> >> --
> >> >> Best regards,
> >> >> Andrew
> >> >>
> >> >> Words like orphans lost among the crosstalk, meaning torn from truth's
> >> >> decrepit hands
> >> >>    - A23, Crosstalk
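
For reference, the channel-force idea raised earlier in the quoted thread can be sketched with plain JDK calls. This is a rough illustration only: `DurableAppend` and `appendDurably` are hypothetical names, not anything in HBase or Hadoop. It covers flushing to locally attached storage on one machine and does nothing for the multi-server durability concern that was also raised.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of HBase or Hadoop.
final class DurableAppend {
    private DurableAppend() {}

    /** Appends the bytes to the file and returns only after force(true) has completed. */
    static void appendDurably(Path file, byte[] data) throws IOException {
        // Open in append mode so earlier records are preserved.
        try (FileOutputStream out = new FileOutputStream(file.toFile(), true)) {
            out.write(data);
            // force(true) asks the OS to write both file content and metadata to the device.
            out.getChannel().force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path wal = Files.createTempFile("wal-sketch", ".log");
        appendDurably(wal, "edit-1\n".getBytes(StandardCharsets.UTF_8));
        appendDurably(wal, "edit-2\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(Files.readAllLines(wal, StandardCharsets.UTF_8));
        Files.delete(wal);
    }
}
```

A filesystem wrapper that maps hsync onto a call like this could then reasonably advertise the capability, which is roughly the direction of the RawLocalFileSystem fork and NIO ideas in the thread.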