On Tue, Feb 9, 2016 at 6:37 AM, Serega Sheypak <serega.shey...@gmail.com> wrote:
> Hi, thanks for reply!
>
>> What should we add here to make the doc more clear on BufferedMutator?
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html
>
> It's pretty clear.
>
>> And in the example?
>
> The example shows that both the connection and the buffered mutator are
> thread safe and that they are closed at once.
>
> - Should I keep a single instance of Connection and BufferedMutator per
>   thread?

No. The Connection doc says you should generally share the Connection
instance. Ditto for BufferedMutator. It is backed by a buffer, so you are
batching your writes when you use it. If you share the BM, your batching
will be more 'regular'.

> - Should I keep a connection per thread and instantiate a mutator for each
>   request?

See above.

> - What happens if autoflush is enabled?

When the backing buffer is full, it gets flushed to the cluster.

> - Is it possible to turn on sync mode for saving data?

Yes. Don't use a BufferedMutator. Your throughput will go down as you do an
RPC per write.

> - A Connection could get into a failed/invalid state for some reason (RS
>   down, RS up, some network partitioning happened). Is it possible?

It could happen, but I think by now we've seen nearly all of the ways in
which a Connection can fail, and internally it compensates.

> If it's possible, then what is the right way to handle it: "close" the
> failed connection and ask for a new one?

Good question. Connection internally will retry and 'ride over' nearly all
cluster issues but the catastrophic.

St.Ack

>>> I assume users is a collection of User's.
>>> Have you tried obtaining / closing a mutator for each User instead of
>>> sharing the mutator?
>
>> If another flush, say because there were lots of puts, then when close
>> comes in, we are out of threads and you'd get below.
>
> Looks like it's the root cause, let me try!
>
> Thank you for the detailed explanation!
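The advice above (share one Connection and one BufferedMutator across request
threads) might look like the following in application code. This is only a
minimal sketch under the assumptions of this thread; the class name and the
column layout are made up, and listener/retry handling is left out. It needs
hbase-client on the classpath and a reachable cluster to actually run.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;

/** One Connection and one BufferedMutator for the whole application. */
public class SharedMutator implements AutoCloseable {
  private final Connection connection;    // heavyweight; create once, share
  private final BufferedMutator mutator;  // thread safe; share it too

  public SharedMutator(Configuration conf, TableName table) throws IOException {
    this.connection = ConnectionFactory.createConnection(conf);
    this.mutator = connection.getBufferedMutator(table);
  }

  /** May be called concurrently from many request threads. */
  public void write(byte[] row, byte[] family, byte[] qualifier, byte[] value)
      throws IOException {
    Put put = new Put(row);
    put.addColumn(family, qualifier, value);
    mutator.mutate(put);  // buffered; flushed when the backing buffer fills
  }

  @Override
  public void close() throws IOException {
    mutator.close();     // flushes any remaining buffered writes
    connection.close();  // close once, at application shutdown
  }
}
```

If you need to observe failed background flushes, `BufferedMutatorParams`
lets you register an exception listener, as the BufferedMutatorExample linked
below does.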
> 2016-02-09 7:10 GMT+01:00 Stack <st...@duboce.net>:
>
>> On Mon, Feb 8, 2016 at 3:01 PM, Serega Sheypak <serega.shey...@gmail.com>
>> wrote:
>>
>>> Hi, I'm confused with the new HBase 1.0 API. The API says that the
>>> application should manage connections (previously HConnections) on its
>>> own. Nothing is managed internally now.
>>
>> That is right.
>>
>>> Here is an example:
>>> https://hbase.apache.org/xref/org/apache/hadoop/hbase/client/example/BufferedMutatorExample.html
>>> It gives no clue about lifecycle :(
>>
>> Connection is fairly explicit:
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html
>>
>> What should we add here to make the doc more clear on BufferedMutator?
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html
>> And in the example?
>>
>>> Right now I create a single connection instance for the servlet and a
>>> BufferedMutator per request.
>>
>> The recommendation is a single BufferedMutator shared across requests.
>>
>>> // getConnection returns a single instance, it doesn't return a new
>>> // connection each time
>>> def mutator = getConnection().getBufferedMutator(getBufferedMutatorParams())
>>
>> getConnection is your method?
>> getBufferedMutator creates a new one?
>>
>>> users.each{ mutator.mutate(toPut(it)) }
>>> mutator.close() // exception is thrown here
>>
>> The close is flushing out all the writes.
>>
>> If there is a BufferedMutator per servlet instance, there are probably
>> many of them when many requests are coming in.
>>
>> See what happens when you create one:
>> http://hbase.apache.org/xref/org/apache/hadoop/hbase/client/ConnectionImplementation.html#313
>> which calls through to here....
>> http://hbase.apache.org/xref/org/apache/hadoop/hbase/client/HTable.html#126
>> ... which creates an executor of max 1 task only.
>>
>> If another flush, say because there were lots of puts, then when close
>> comes in, we are out of threads and you'd get below.
>>
>> St.Ack
>>
>>> And I get tons of exceptions thrown on "mutator.close()", what do I do
>>> wrong?
>>>
>>> WARNING: #905, the task was rejected by the pool. This is unexpected.
>>> Server is node04.server.com,60020,1447338864601
>>> java.util.concurrent.RejectedExecutionException: Task
>>> java.util.concurrent.FutureTask@5cff3b40 rejected from
>>> java.util.concurrent.ThreadPoolExecutor@686c2853[Terminated, pool size =
>>> 0, active threads = 0, queued tasks = 0, completed tasks = 1]
>>>   at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>>>   at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>>>   at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>>>   at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
>>>   at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:956)
>>>   at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$000(AsyncProcess.java:574)
>>>   at org.apache.hadoop.hbase.client.AsyncProcess.submitMultiActions(AsyncProcess.java:423)
>>>   at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:403)
>>>   at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:320)
>>>   at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:206)
>>>   at org.apache.hadoop.hbase.client.BufferedMutatorImpl.close(BufferedMutatorImpl.java:158)
>>>   at org.apache.hadoop.hbase.client.BufferedMutator$close$0.call(Unknown Source)
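On the "sync mode" answer in the thread (one RPC per write): the alternative
to BufferedMutator is to write through Table directly, where each put is a
blocking call. A minimal sketch under that assumption; the table and column
names here are hypothetical, and it requires hbase-client plus a reachable
cluster.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SyncWrite {
  public static void main(String[] args) throws IOException {
    try (Connection conn =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         // Table instances are lightweight; get one per use if convenient
         Table table = conn.getTable(TableName.valueOf("users"))) {
      Put put = new Put(Bytes.toBytes("row-1"));
      put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("name"),
          Bytes.toBytes("serega"));
      table.put(put);  // one blocking RPC per put; no client-side buffering
    }
  }
}
```

This trades throughput for the guarantee that the write has reached the
region server when `put` returns, which is exactly the tradeoff Stack
describes above.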