Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-25 Thread Solomon Duskis
@Devaraja, Would you mind posting that on https://issues.apache.org/jira/browse/HBASE-12728? The HBase group is talking about this topic on that JIRA issue. Thanks, -Solomon On Wed, Dec 24, 2014 at 9:40 PM, Devaraja Swami devarajasw...@gmail.com wrote: Would like to add my perspective as a

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-24 Thread Devaraja Swami
Would like to add my perspective as a user. (Thanks to Aaron Beppu for uncovering this hidden issue). In my applications, I have some tables for which I need autoflushing, and others for which I need a write buffer. Plus the size of the write buffer is different for different tables. All these

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Pradeep Gollakota
Hi Aaron, Just out of curiosity, have you considered using asynchbase? https://github.com/OpenTSDB/asynchbase On Fri, Dec 19, 2014 at 9:00 AM, Nick Dimiduk ndimi...@apache.org wrote: Hi Aaron, Your analysis is spot on and I do not believe this is by design. I see the write buffer is owned

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
I believe HTableMultiplexer[1] is meant to stand in for HTablePool for buffered writing. FWIW, I've not used it. 1: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableMultiplexer.html On Fri, Dec 19, 2014 at 9:00 AM, Nick Dimiduk ndimi...@apache.org wrote: Hi Aaron, Your

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Aaron Beppu
Nick : Thanks, I've created an issue [1]. Pradeep : Yes, I have considered using that. However for the moment, we've set it out of scope, since our migration from 0.94 to 0.98 is already a bit complicated, and we hoped to isolate these changes by not moving to the async client until after

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
Aaron: Please post a copy of that feedback on the JIRA, pretty sure we will be having an improvement discussion there. On Fri, Dec 19, 2014 at 10:58 AM, Aaron Beppu abe...@siftscience.com wrote: Nick : Thanks, I've created an issue [1]. Pradeep : Yes, I have considered using that. However for

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Nick Dimiduk
Thanks for the reminder about the Multiplexer, Andrew. It sort-of solves this problem, but I think its semantics of dropping writes are not desirable in the general case. Further, my understanding was that the new connection implementation is designed to handle this kind of use-case (hence cc'ing
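
To make the "dropping writes" concern concrete, here is a minimal sketch of a bounded, non-blocking write queue. It is not HBase code: BoundedWriteQueue and offerWrite are hypothetical names, and the only assumption about the Multiplexer is that its put reports whether a write was accepted into a bounded queue instead of blocking the caller.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a per-table write queue; not an HBase class.
class BoundedWriteQueue {
    private final BlockingQueue<String> queue;

    BoundedWriteQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking enqueue: returns false when the queue is full,
    // in which case the write is simply not accepted (i.e. dropped
    // unless the caller retries).
    boolean offerWrite(String row) {
        return queue.offer(row);
    }

    // Number of writes currently waiting to be sent.
    int pending() {
        return queue.size();
    }
}
```

With a capacity of 2, a third offerWrite call returns false while the first two writes are still pending, so a caller that ignores the return value silently loses data. A blocking put on the same queue would instead stall the caller until space frees up, which is the trade-off being weighed here.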

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
I don't like the dropped writes either. Just pointing out what we have now. There is a gap no doubt. On Fri, Dec 19, 2014 at 11:16 AM, Nick Dimiduk ndimi...@apache.org wrote: Thanks for the reminder about the Multiplexer, Andrew. It sort-of solves this problem, but I think its semantics of

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Solomon Duskis
Is this critical to sort out before 1.0, or is fixing this a post-1.0 enhancement? -Solomon On Fri, Dec 19, 2014 at 2:19 PM, Andrew Purtell apurt...@apache.org wrote: I don't like the dropped writes either. Just pointing out what we have now. There is a gap no doubt. On Fri, Dec 19, 2014 at

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Andrew Purtell
I think it would be critical if we're contemplating something that requires a breaking API change? Do we have that here? I'm not sure. On Fri, Dec 19, 2014 at 12:02 PM, Solomon Duskis sdus...@gmail.com wrote: Is this critical to sort out before 1.0, or is fixing this a post-1.0 enhancement?

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Solomon Duskis
My first thought based on this discussion was that it would require moving some methods (setAutoFlush() and setWriteBufferSize()) from Table to Connection. That would be a breaking API change. -Solomon On Fri, Dec 19, 2014 at 3:04 PM, Andrew Purtell apurt...@apache.org wrote: I think it would

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Stack
On Fri, Dec 19, 2014 at 12:20 PM, Solomon Duskis sdus...@gmail.com wrote: My first thought based on this discussion was that it would require moving some methods (setAutoFlush() and setWriteBufferSize()) from Table to Connection. That would be a breaking API change. This will mean a bunch

Re: Efficient use of buffered writes in a post-HTablePool world?

2014-12-19 Thread Nick Dimiduk
Could be done in an API-compatible way, though semantics would change, which is probably worse. Table keeps these methods. When setAutoFlush is used, a write buffer managed by the connection is created. If multiple Table instances for the same table call setWriteBufferSize(), perhaps the largest value wins.
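
One way to picture the idea above is a connection that owns a single shared buffer per table and keeps the largest size any caller has asked for. This is only a sketch under stated assumptions: SketchConnection, SharedBuffer and their methods are hypothetical and do not exist in HBase, mutations are stood in for by strings, and the size accounting is a crude character count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these classes exist in HBase.
class SketchConnection {
    // One shared buffer per table name, owned by the connection.
    private final Map<String, SharedBuffer> buffers = new HashMap<>();

    // Short-lived table handles would call this instead of owning a buffer.
    synchronized SharedBuffer bufferFor(String tableName, long requestedSize) {
        SharedBuffer buffer = buffers.computeIfAbsent(tableName, n -> new SharedBuffer());
        buffer.raiseLimit(requestedSize); // the largest requested size wins
        return buffer;
    }
}

class SharedBuffer {
    private final List<String> pending = new ArrayList<>();
    private long limitBytes = 0;
    private long usedBytes = 0;
    private int flushCount = 0;

    // The limit only ever grows; a smaller request never shrinks it.
    synchronized void raiseLimit(long requested) {
        limitBytes = Math.max(limitBytes, requested);
    }

    // Buffer a mutation and flush once the size estimate reaches the limit.
    synchronized void add(String mutation) {
        pending.add(mutation);
        usedBytes += mutation.length(); // crude size estimate for the sketch
        if (usedBytes >= limitBytes) {
            flush();
        }
    }

    // A real client would send the batch to the servers here.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        pending.clear();
        usedBytes = 0;
        flushCount++;
    }

    synchronized long limitBytes() { return limitBytes; }
    synchronized int pendingCount() { return pending.size(); }
    synchronized int flushCount() { return flushCount; }
}
```

Two handles for the same table that request 8 and 16 get the same SharedBuffer with a limit of 16, so writes from either handle accumulate together and outlive the handle that issued them. The semantic change noted above still applies: one caller's setWriteBufferSize() now affects every other user of that table on the connection.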

Efficient use of buffered writes in a post-HTablePool world?

2014-12-17 Thread Aaron Beppu
Hi All, TLDR; in the absence of HTablePool, if HTable instances are short-lived, how should clients use buffered writes? I’m working on migrating a codebase from using 0.94.6 (CDH4.4) to 0.98.6 (CDH5.2). One issue I’m confused by is how to effectively use buffered writes now that HTablePool has