On Tue, Mar 18, 2014 at 2:27 AM, Ravikumar Govindarajan <[email protected]> wrote:
> > Hmm, are you evaluating 0.2.2 (hasn't been released yet)? Because in
> > 0.2.2 there is no transaction log. At least by name there's no
> > transaction log.
>
> I have not yet moved to 0.2.2. Commit on each mutate call sounds
> interesting. It removes the need for a transaction-log, and hence worries
> about HDFS failures become remarkably simple to handle.
>

Yes, the main reason for the switch to this approach is that it removes
strain on the Namenode during very quick commits and refreshes. When I was
working on this feature I wrote a micro benchmark that tested the time
taken during a commit in Lucene. The basics of the benchmark were: add a
Document, commit, refresh, and then repeat (a rough sketch of the loop is
included below).

* The baseline was a RAMDirectory, and if I remember correctly the commit
time was around 0.6 ms.
* The baseline for Blur with a standard HdfsDirectory was a commit time of
about 180 ms.

At the time I guessed that most of the time was spent dealing with the
metadata (or Namenode interactions). So I created the HdfsKeyValueStore
class and then built a directory on top of that. After that I needed the
JoinDirectory to join the long-term and short-term behaviors.

* Once it was completed, the same benchmark for the JoinDirectory was
~1.6 ms.

So with that huge improvement in performance we were able to remove the
"Near" in NRT (Near Real Time) updates and just have Real Time updates.
However, with the enqueueMutate call (0.2.2 only), we now have an NRT
option that has huge performance increases over the previous NRT versions
in Blur.


> How exactly does the JoinDirectory work? Initially all data goes into the
> short-term KV directory, and when a merge happens it switches over to the
> regular HDFS Directory. Is that the case?
>

Exactly.
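For reference, the loop in that benchmark was roughly the following. This
is just a from-memory sketch against the Lucene 4.x API, not the actual
benchmark code; swap the RAMDirectory on the first line for an
HdfsDirectory (or the JoinDirectory) to compare the other cases.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class CommitRefreshBenchmark {
  public static void main(String[] args) throws Exception {
    Directory dir = new RAMDirectory(); // baseline; replace with an HDFS-backed Directory
    IndexWriter writer = new IndexWriter(dir,
        new IndexWriterConfig(Version.LUCENE_43, new StandardAnalyzer(Version.LUCENE_43)));
    DirectoryReader reader = DirectoryReader.open(writer, true);

    int iterations = 1000;
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      // add a Document
      Document doc = new Document();
      doc.add(new StringField("id", Integer.toString(i), Field.Store.YES));
      writer.addDocument(doc);
      // commit
      writer.commit();
      // refresh
      DirectoryReader newReader = DirectoryReader.openIfChanged(reader);
      if (newReader != null) {
        reader.close();
        reader = newReader;
      }
    }
    double avgMs = (System.nanoTime() - start) / 1000000.0 / iterations;
    System.out.println("avg add/commit/refresh time: " + avgMs + " ms");

    reader.close();
    writer.close();
    dir.close();
  }
}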
Aaron


>
> --
> Ravi
>
>
> On Mon, Mar 17, 2014 at 7:26 PM, Aaron McCurry <[email protected]> wrote:
>
> > On Mon, Mar 17, 2014 at 2:44 AM, Ravikumar Govindarajan <
> > [email protected]> wrote:
> >
> > > I have been trying to understand how Blur behaves when a
> > > flush-failure happens because of underlying hadoop issues.
> > >
> >
> > I am assuming that the flush failure you are talking about is the
> > sync() call in the HdfsKeyValueStore buried inside the various Lucene
> > Directories.
> >
> >
> > > The reason is a somewhat abnormal behavior of lucene in that, during
> > > a flush-failure, lucene scraps the entire data in RAM and throws an
> > > exception. When a commit happens, Blur deletes the current
> > > transaction-log file without reliably storing it in HDFS, leading to
> > > loss of data.
> > >
> >
> > Hmm, are you evaluating 0.2.2 (hasn't been released yet)? Because in
> > 0.2.2 there is no transaction log. At least by name there's no
> > transaction log.
> >
> >
> > > As I understand, there are 2 major reasons for a flush-failure:
> > >
> > > 1. Failure of a data-node involved in the flush [one-out-of-3 copies,
> > > etc.]. This should be handled internally by hadoop transparently,
> > > without Blur's intervention. Please let me know if I understood it
> > > right.
> > >
> >
> > Correct, this will be handled by HDFS.
> >
> >
> > > 2. Failure of the NameNode [GC struggle, down, network
> > > overload/delay, etc.]. Not sure what needs to be done here to avoid
> > > data-loss.
> > >
> >
> > Well, in 0.2.2 if this occurs the user will not see a success in the
> > mutate call. They will see a BlurException (with a wrapped IOException
> > inside) explaining that something terrible happened to the underlying
> > file system. At this point I don't see this as something that we can
> > handle. I feel that informing systems/users that an error has occurred
> > is good enough.
> >
> >
> > > As of now, restarting a shard-server immediately after encountering
> > > the first flush-failure is the only solution I can think of.
> > >
> >
> > Likely if HDFS has failed, there will be larger issues beyond just
> > Blur. At least that's what I have seen.
> >
> > Thanks!
> >
> > Aaron
> >
> >
> > > --
> > > Ravi
> > >
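P.S. To make the error-handling point in the quoted thread above a little
more concrete: from the client side, a failed synchronous mutate in 0.2.2
surfaces roughly as sketched below. This is only a sketch; the connection
string is made up and the wrapper method is hypothetical, but the
BlurException/IOException behavior is what I described above.

import org.apache.blur.thrift.BlurClient;
import org.apache.blur.thrift.generated.Blur;
import org.apache.blur.thrift.generated.BlurException;
import org.apache.blur.thrift.generated.RowMutation;

public class MutateExample {

  // Hypothetical wrapper showing how a caller might react to a failed mutate.
  public static void mutateOrReport(Blur.Iface client, RowMutation mutation) {
    try {
      // Synchronous path: the commit happens as part of the call, so a
      // sync() failure in the HdfsKeyValueStore comes back to the caller...
      client.mutate(mutation);
    } catch (BlurException e) {
      // ...as a BlurException carrying the underlying IOException details
      // (Namenode/Datanode trouble). The caller decides whether to retry,
      // queue the mutation for later, or alert an operator; Blur itself
      // has not stored the data at this point.
      System.err.println("Mutate failed, data was NOT stored: " + e);
    } catch (Exception e) {
      // Thrift transport problems (controller/shard unreachable) land here.
      System.err.println("Could not reach Blur: " + e);
    }
  }

  public static void main(String[] args) throws Exception {
    // Connection string is illustrative only.
    Blur.Iface client = BlurClient.getClient("controller1:40010");
    // Build a RowMutation for your table and row, then:
    // mutateOrReport(client, mutation);
  }
}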
