> if the sink "dies" for some reason, then it should
> push that back to the upstream parts of the flume dataflow, and have them
> buffer data on local disk.
True. But this seems to be a separate issue:
https://issues.cloudera.org/browse/FLUME-390.

Alex Baranau
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

On Tue, Jun 28, 2011 at 7:40 PM, Doug Meil <[email protected]> wrote:

> I agree with what Todd & Gary said. I don't like retry-forever,
> especially as a default option in HBase.
>
>
> -----Original Message-----
> From: Gary Helmling [mailto:[email protected]]
> Sent: Tuesday, June 28, 2011 12:18 PM
> To: [email protected]
> Cc: Jonathan Hsieh
> Subject: Re: Retry HTable.put() on client-side to handle temp
> connectivity problem
>
> I'd also be wary of changing the default to retry forever. This might
> be hard to differentiate from a hang or deadlock for new users, and it
> seems to violate "least surprise".
>
> In many cases it's preferable to have some kind of predictable failure
> as well, so I think this would appear to be a regression in behavior.
> If you're serving, say, web site data from HBase, you may prefer an
> occasional error or timeout rather than having page loading hang
> forever.
>
> I'm all for making "retry forever" a configurable option, but do we
> need any new knobs here?
>
> --gh
>
>
> On Mon, Jun 27, 2011 at 3:23 PM, Joey Echeverria <[email protected]>
> wrote:
>
> > If I could override the default, I'd be a hesitant +1. I'd rather see
> > the default be something like retry 10 times, then throw an error,
> > with one option being infinite retries.
> >
> > -Joey
> >
> > On Mon, Jun 27, 2011 at 2:21 PM, Stack <[email protected]> wrote:
> > > I'd be fine with changing the default in HBase so clients just keep
> > > trying. What do others think?
> > > St.Ack
> > >
> > > On Mon, Jun 27, 2011 at 1:56 PM, Alex Baranau <[email protected]>
> > > wrote:
> > >> The code I pasted works for me: it reconnects successfully. I just
> > >> thought it might not be the best way to do it. I realized that by
> > >> using HBase configuration properties we could simply say that it's
> > >> up to the user to configure the HBase client (created by Flume)
> > >> properly (e.g. by adding an hbase-site.xml with the settings to
> > >> the classpath). On the other hand, it looks to me like users of
> > >> HBase sinks will *always* want the sink to retry writing to HBase
> > >> until it works out. But the default configuration doesn't work
> > >> this way: the sink stops when HBase is temporarily down or
> > >> inaccessible. Hence it makes using the sink more complicated
> > >> (because the default configuration sucks), which I'd like to avoid
> > >> here by adding the code above. Ideally the default configuration
> > >> should work the best way for the general-purpose case.
> > >>
> > >> I now understand the ways to implement/configure such behavior. I
> > >> think we should discuss what the best default behavior is, and
> > >> whether we need to allow the user to override it, on the Flume ML
> > >> (or directly at https://issues.cloudera.org/browse/FLUME-685).
> > >>
> > >> Thank you guys,
> > >>
> > >> Alex Baranau
> > >> ----
> > >> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch -
> > >> Hadoop - HBase
> > >>
> > >>
> > >> On Mon, Jun 27, 2011 at 11:40 PM, Stack <[email protected]> wrote:
> > >>
> > >>> Either should work, Alex. Your version will go "forever". Have
> > >>> you tried yanking HBase out from under the client to see if it
> > >>> reconnects?
> > >>>
> > >>> Good on you,
> > >>> St.Ack
> > >>>
> > >>> On Mon, Jun 27, 2011 at 1:33 PM, Alex Baranau
> > >>> <[email protected]> wrote:
> > >>> > Yes, that is what's intended, I think. To make the whole
> > >>> > picture clear, here's the context:
> > >>> >
> > >>> > * there's Flume's HBase sink (read: an HBase client) which
> > >>> > writes data from the Flume "pipe" (read: some event-based
> > >>> > message source) to an HTable;
> > >>> > * when HBase is down for some time (with the default HBase
> > >>> > configuration on Flume's sink side), HTable.put() throws an
> > >>> > exception and the client exits (it usually takes ~10 min to
> > >>> > fail);
> > >>> > * Flume is smart enough to accumulate the data to be written
> > >>> > reliably if the sink behaves badly (not writing for some time,
> > >>> > pausing, etc.), so it would be great if the sink retried
> > >>> > writing the data until HBase is up again, BUT:
> > >>> > * here, since we have a complete "failure" of the sink process
> > >>> > (the thread needs to be restarted), the data never reaches the
> > >>> > HTable even after the HBase cluster is brought up again.
> > >>> >
> > >>> > So you suggest, instead of this extra construction around
> > >>> > HTable.put(), using the configuration properties
> > >>> > "hbase.client.pause" and "hbase.client.retries.number"? I.e.
> > >>> > make the number of retry attempts (reasonably) close to
> > >>> > "perform forever". Is that what you meant?
> > >>> >
> > >>> > Thank you,
> > >>> > Alex Baranau
> > >>> > ----
> > >>> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch -
> > >>> > Hadoop - HBase
> > >>> >
> > >>> > On Mon, Jun 27, 2011 at 11:16 PM, Ted Yu <[email protected]>
> > >>> > wrote:
> > >>> >
> > >>> >> This would retry indefinitely, right?
> > >>> >> Normally a maximum retry duration would govern how long the
> > >>> >> retry is attempted.
> > >>> >>
> > >>> >> On Mon, Jun 27, 2011 at 1:08 PM, Alex Baranau
> > >>> >> <[email protected]> wrote:
> > >>> >>
> > >>> >> > Hello,
> > >>> >> >
> > >>> >> > Just wanted to confirm that I'm doing things the proper way
> > >>> >> > here. How about this code to handle temporary cluster
> > >>> >> > connectivity problems (or cluster downtime) on the client
> > >>> >> > side?
> > >>> >> >
> > >>> >> > +    // HTable.put() will fail with an exception if the connection to the
> > >>> >> > +    // cluster is temporarily broken or the cluster is temporarily down.
> > >>> >> > +    // To be sure the data is written, we retry writing.
> > >>> >> > +    boolean dataWritten = false;
> > >>> >> > +    do {
> > >>> >> > +      try {
> > >>> >> > +        table.put(p);
> > >>> >> > +        dataWritten = true;
> > >>> >> > +      } catch (IOException ioe) { // indicates a cluster connectivity
> > >>> >> > +                                  // problem (also thrown when the
> > >>> >> > +                                  // cluster is down)
> > >>> >> > +        LOG.error("Writing data to HBase failed, will try again in "
> > >>> >> > +            + RETRY_INTERVAL_ON_WRITE_FAIL + " sec", ioe);
> > >>> >> > +        Thread.sleep(RETRY_INTERVAL_ON_WRITE_FAIL * 1000);
> > >>> >> > +      }
> > >>> >> > +    } while (!dataWritten);
> > >>> >> >
> > >>> >> > Thank you in advance,
> > >>> >> > Alex Baranau
> > >>> >> > ----
> > >>> >> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch -
> > >>> >> > Hadoop - HBase
> > >>> >> >
> > >>> >>
> > >>>
> > >>>
> > >>
> >
> >
> >
> > --
> > Joseph Echeverria
> > Cloudera, Inc.
> > 443.305.9434
>
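
For reference, a minimal, self-contained sketch of the bounded-retry
default the thread converges on: retry N times, then throw, with a
negative value meaning "retry forever". The class name RetryingPutter,
its constructor parameters, and the chosen values are illustrative, not
part of any proposed patch. Note the use of Thread.sleep() rather than
Object.wait(), which would throw IllegalMonitorStateException when no
monitor is held (as in the originally posted snippet).

    import java.io.IOException;

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;

    public class RetryingPutter {
      private static final Log LOG = LogFactory.getLog(RetryingPutter.class);

      private final HTable table;
      private final int maxRetries;        // negative => retry forever
      private final long retryIntervalMs;  // pause between attempts

      public RetryingPutter(HTable table, int maxRetries, long retryIntervalMs) {
        this.table = table;
        this.maxRetries = maxRetries;
        this.retryIntervalMs = retryIntervalMs;
      }

      public void put(Put p) throws IOException, InterruptedException {
        int attempt = 0;
        while (true) {
          try {
            table.put(p);
            return;                        // success
          } catch (IOException ioe) {      // connectivity problem or cluster down
            attempt++;
            if (maxRetries >= 0 && attempt > maxRetries) {
              throw ioe;                   // predictable failure after N attempts
            }
            LOG.error("Put failed (attempt " + attempt + "), retrying in "
                + retryIntervalMs + " ms", ioe);
            Thread.sleep(retryIntervalMs); // sleep, not Object.wait()
          }
        }
      }
    }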

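And a sketch of the configuration-based alternative Stack suggests:
raising the client's built-in retry settings instead of wrapping
HTable.put(). Only the two property names come from the thread; the
class name, the table name, and the specific values are made up for
illustration.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    public class PatientClient {
      public static HTable openTable(String tableName) throws IOException {
        // Raise the built-in retry knobs so the client keeps trying for much
        // longer before HTable.put() finally throws (values are illustrative).
        Configuration conf = HBaseConfiguration.create();
        conf.setInt("hbase.client.retries.number", 100); // attempts before giving up
        conf.setLong("hbase.client.pause", 10000);       // ms base pause between attempts
        return new HTable(conf, tableName);
      }
    }

As Alex notes, the same two properties could instead be set in an
hbase-site.xml placed on the sink's classpath, leaving the sink code
untouched.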