I think it's important to have some _external_ method to verify
correctness. There is a lot of space between "the system thinks it is
correct" and "the user is confident that it is correct."

We don't intentionally set out to lose data, and yet it still happens
sometimes in HBase. It's good that we have things like VerifyReplication
though, to reassure us that things are proper.

On Fri, Dec 1, 2017 at 1:04 PM, Vladimir Rodionov <vladrodio...@gmail.com>
wrote:

> Thanks, Mike
>
> #1 is done
> #4 is done, but not committed yet
> #3 is questionable, to say the least. All B&R tools guarantee
> operation correctness if the operation succeeds. Otherwise, what is the
> point of separate fault-tolerance work? FT includes a correctness guarantee
> as well: we track all the needed WAL files or bulk-loaded files during
> incremental backup and guarantee that every single file will be converted
> and moved to the backup destination. If you need an additional guarantee,
> restore the backup into a separate table and do the verification yourself.
>
> #2 is ongoing
>
> On Fri, Dec 1, 2017 at 10:30 AM, Mike Drob <md...@apache.org> wrote:
>
> > The list is what Josh proposed in the original email to the list.
> >
> > What is the JIRA for #3?
> >
> > On Fri, Dec 1, 2017 at 12:20 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> > wrote:
> >
> > > Where did you get this from, Stack?
> > >
> > > I am doing scale testing now and this is the last task on *my* list for
> > > beta-1.
> > >
> > > On Thu, Nov 30, 2017 at 10:27 PM, Stack <st...@duboce.net> wrote:
> > >
> > > > On Tue, Nov 7, 2017 at 8:30 PM, Josh Elser <els...@apache.org> wrote:
> > > >
> > > > > Folks,
> > > > >
> > > > > > I've been working with Vlad and Ted offline to make sure we have a plan
> > > > > > that addresses the implementation gaps Vlad sees and the barriers-for-entry
> > > > > > previously stated to keep the feature in HBase 2.0. My hope is that this
> > > > > > can be an honest discussion given 2.0-beta timelines, with a concrete
> > > > > > action plan. I'm trying my best to not re-hash the logic/reasoning/caveats
> > > > > > behind previous concerns; anything folks feel is a blocker that I haven't
> > > > > > covered below is unintentional.
> > > > >
> > > > > The list:
> > > > >
> > > > > > 1. Documentation. It must be updated and committed, ensuring it covers the
> > > > > > details operators/architects need to know to use it effectively
> > > > > > (HBASE-16574). Vlad will help with content, myself and/or Frank will get it
> > > > > > updated to asciidoc.
> > > > >
> > > > > > 2. Distributed testing missing. Vlad has taken my previous document on
> > > > > > goals and translated that into an implementation outline[1]. Ted and I have
> > > > > > already weighed in -- I believe it hits the salient points for the quality
> > > > > > of testing we're looking for. I'll get started on this while Vlad does #4
> > > > > > (after consensus on approach, of course). Needs JIRA issue (maybe?).
> > > > >
> > > > > > 3. Operator utility to verify backups. In the abstract, this should just be
> > > > > > the same guts as a tool like VerifyReplication. In practice, this should be
> > > > > > the same code that #3 uses (if not _actually_ the same guts as
> > > > > > VerifyReplication). The hope is that this will be encapsulated (time-wise)
> > > > > > by #3. Needs JIRA issue (maybe?).
> > > > >
> > > > > > 4. Polish DistCP for bulk-loaded files/fault-tolerance (HBASE-17852). I
> > > > > > don't have specifics here -- will rely on Vlad to correct me if there's a
> > > > > > better JIRA issue to track than the aforementioned. Will rely on details to
> > > > > > show up in the JIRA issue to track it.
> > > > >
> > > > > Current due dates:
> > > > >
> > > > >
> > > > Checking in on the plan.
> > > >
> > > >
> > > > > 1. End of week (2017/11/10)
> > > > >
> > > >
> > > > I believe this is done.
> > > >
> > > >
> > > > > 2. Before US Thanksgiving (2017/11/22)
> > > > > 3. Same as #2
> > > > > 4. Same as #1
> > > > >
> > > > >
> > > > These were not done in time for Thanksgiving? Correct me if I'm wrong.
> > > >
> > > > Thanks,
> > > > St.Ack
> > > >
> > > >
> > > >
> > > > > > My current thought is that this is reasonable for implementation times,
> > > > > > and would not derail the rest of the beta-1 train. I appreciate the
> > > > > > patience from all parties, and I hope that those trying to make this better
> > > > > > can find a little more time to give some feedback. Thanks for the long read
> > > > > > if nothing else.
> > > > >
> > > > > - Josh
> > > > >
> > > > > > [1] https://docs.google.com/document/d/1xbPlLKjOcPq2LDqjbSkF6uNDAG0mzgOxek6P3POLeMc/edit?usp=sharing
> > > > >
> > > >
> > >
> >
>
