I'd be fine with changing the default in hbase so clients just keep trying.
What do others think?

St.Ack
On Mon, Jun 27, 2011 at 1:56 PM, Alex Baranau <[email protected]> wrote:
> The code I pasted works for me: it reconnects successfully. I just thought
> it might not be the best way to do it. I realized that by using HBase
> configuration properties we could say it's up to the user to configure the
> HBase client (created by Flume) properly (e.g. by adding an hbase-site.xml
> with the settings to the classpath). On the other hand, it looks to me like
> users of HBase sinks will *always* want the sink to retry writing to HBase
> until it succeeds. But the default configuration does not work this way:
> the sink stops when HBase is temporarily down or inaccessible. That makes
> using the sink more complicated (because the default configuration is
> poor), which I'd like to avoid here by adding the code above. Ideally the
> default configuration should work best for the general-purpose case.
>
> I now understand the ways to implement/configure such behavior. I think we
> should discuss what the best default behavior is, and whether we need to
> allow the user to override it, on the Flume ML (or directly at
> https://issues.cloudera.org/browse/FLUME-685).
>
> Thank you guys,
>
> Alex Baranau
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
>
>
> On Mon, Jun 27, 2011 at 11:40 PM, Stack <[email protected]> wrote:
>
>> Either should work, Alex. Your version will go on "for ever". Have you
>> tried yanking hbase out from under the client to see if it reconnects?
>>
>> Good on you,
>> St.Ack
>>
>> On Mon, Jun 27, 2011 at 1:33 PM, Alex Baranau <[email protected]>
>> wrote:
>> > Yes, that is what is intended, I think. To make the whole picture
>> > clear, here's the context:
>> >
>> > * there's Flume's HBase sink (read: an HBase client) which writes data
>> >   from the Flume "pipe" (read: some event-based message source) to an
>> >   HTable;
>> > * when HBase is down for some time (with the default HBase
>> >   configuration on Flume's sink side), HTable.put throws an exception
>> >   and the client exits (it usually takes ~10 min to fail);
>> > * Flume is smart enough to accumulate data to be written reliably if
>> >   the sink behaves badly (not writing for some time, pauses, etc.), so
>> >   it would be great if the sink kept trying to write data until HBase
>> >   is up again, BUT:
>> > * here, as we have a complete "failure" of the sink process (the thread
>> >   needs to be restarted), the data never reaches the HTable even after
>> >   the HBase cluster is brought up again.
>> >
>> > So you suggest, instead of this extra construction around HTable.put,
>> > using the configuration properties "hbase.client.pause" and
>> > "hbase.client.retries.number"? I.e. setting the retry attempts
>> > (reasonably) close to "retry forever". Is that what you meant?
>> >
>> > Thank you,
>> > Alex Baranau
>> > ----
>> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop -
>> > HBase
>> >
>> > On Mon, Jun 27, 2011 at 11:16 PM, Ted Yu <[email protected]> wrote:
>> >
>> >> This would retry indefinitely, right?
>> >> Normally a maximum retry duration would govern how long the retry is
>> >> attempted.
>> >>
>> >> On Mon, Jun 27, 2011 at 1:08 PM, Alex Baranau <[email protected]>
>> >> wrote:
>> >>
>> >> > Hello,
>> >> >
>> >> > Just wanted to confirm that I'm doing things properly here. How
>> >> > about this code to handle temporary cluster connectivity problems
>> >> > (or cluster downtime) on the client side?
>> >> >
>> >> > +    // HTable.put() will fail with an exception if the connection
>> >> > +    // to the cluster is temporarily broken or the cluster is
>> >> > +    // temporarily down. To be sure the data is written, we retry.
>> >> > +    boolean dataWritten = false;
>> >> > +    do {
>> >> > +      try {
>> >> > +        table.put(p);
>> >> > +        dataWritten = true;
>> >> > +      } catch (IOException ioe) {
>> >> > +        // indicates a cluster connectivity problem (also thrown
>> >> > +        // when the cluster is down)
>> >> > +        LOG.error("Writing data to HBase failed, will try again in "
>> >> > +            + RETRY_INTERVAL_ON_WRITE_FAIL + " sec", ioe);
>> >> > +        try {
>> >> > +          // Thread.sleep(), not Object.wait(): wait() requires
>> >> > +          // holding the monitor and would throw
>> >> > +          // IllegalMonitorStateException here.
>> >> > +          Thread.sleep(RETRY_INTERVAL_ON_WRITE_FAIL * 1000);
>> >> > +        } catch (InterruptedException ie) {
>> >> > +          Thread.currentThread().interrupt();
>> >> > +        }
>> >> > +      }
>> >> > +    } while (!dataWritten);
>> >> >
>> >> > Thank you in advance,
>> >> > Alex Baranau
>> >> > ----
>> >> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop -
>> >> > HBase
>> >> >
>> >>
>> >
>>
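[Editor's note: the configuration-based alternative Stack and Alex discuss above can be sketched as an hbase-site.xml on the sink's classpath. The property names ("hbase.client.retries.number", "hbase.client.pause") are the ones cited in the thread; the values below are illustrative only, not recommended defaults.]

```xml
<!-- Illustrative sketch: make the HBase client (e.g. the one Flume's
     sink creates) retry for much longer before giving up, instead of
     failing after ~10 minutes. Values are examples, not recommendations. -->
<configuration>
  <property>
    <name>hbase.client.retries.number</name>
    <value>100</value>
  </property>
  <property>
    <name>hbase.client.pause</name>
    <!-- pause between retries, in milliseconds -->
    <value>1000</value>
  </property>
</configuration>
```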
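[Editor's note: Ted's point about a maximum retry duration can be sketched as a bounded retry loop. This is a minimal illustration, not Flume or HBase API: `BoundedRetry`, `Write`, and `putWithRetries` are hypothetical names, and `write.run()` stands in for `HTable.put()`.]

```java
import java.io.IOException;

public class BoundedRetry {

    /** Stand-in for the HBase write; may fail with IOException. */
    interface Write {
        void run() throws IOException;
    }

    /**
     * Retries the write until it succeeds or maxDurationMs elapses.
     * Returns true if the write eventually succeeded, false if it
     * gave up (duration cap reached or thread interrupted).
     */
    static boolean putWithRetries(Write write, long retryIntervalMs,
                                  long maxDurationMs) {
        long deadline = System.currentTimeMillis() + maxDurationMs;
        while (true) {
            try {
                write.run();
                return true;                       // write succeeded
            } catch (IOException ioe) {
                if (System.currentTimeMillis() >= deadline) {
                    return false;                  // duration cap reached
                }
                try {
                    Thread.sleep(retryIntervalMs); // back off, then retry
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;                  // interrupted: stop
                }
            }
        }
    }
}
```

Alex's loop and Stack's "just keep trying" default are the unbounded case of this; a duration cap like Ted's bounds how long the sink can block on one put.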
