Re: BatchWriter performance on 1.4

2013-09-19 Thread Adam Fuchs
The addMutations method blocks when the client-side buffer fills up, so you may see a lot of time spent in that method due to a bottleneck downstream. There are a number of things you could try to speed that up. Here are a few: 1. Increase the BatchWriter's buffer size. This can smooth out the
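
The blocking behaviour described above can be illustrated with a small stand-alone sketch. This is not Accumulo code: it uses a plain `java.util.concurrent` bounded queue as a stand-in for the client-side buffer, and the class name `BufferBlockingSketch`, the method `timeProducer`, and all sizes and delays are hypothetical values chosen for illustration. It only shows why a producer appears to spend its time in the "add" call when the consumer downstream is slow, and why a larger buffer can hide that.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a client-side write buffer; not the Accumulo API.
class BufferBlockingSketch {

    /**
     * Pushes `mutations` items into a bounded buffer of the given capacity while a
     * slow background "sender" drains it. Returns the nanoseconds the producer
     * spent in put(), which blocks whenever the buffer is full.
     */
    static long timeProducer(int capacity, int mutations, long sendDelayMillis) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(capacity);

        // Simulated downstream bottleneck: each item takes sendDelayMillis to "send".
        Thread sender = new Thread(() -> {
            try {
                for (int i = 0; i < mutations; i++) {
                    buffer.take();
                    Thread.sleep(sendDelayMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.start();

        try {
            long start = System.nanoTime();
            for (int i = 0; i < mutations; i++) {
                buffer.put(i); // blocks while the buffer is full
            }
            long elapsed = System.nanoTime() - start;
            sender.join();
            return elapsed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long small = timeProducer(2, 50, 2);   // tiny buffer: producer waits on the sender
        long large = timeProducer(100, 50, 2); // buffer holds everything: producer never waits
        System.out.println("time in put(), small buffer (ms): " + small / 1_000_000);
        System.out.println("time in put(), large buffer (ms): " + large / 1_000_000);
    }
}
```

In this sketch the total work done by the sender is the same in both runs; the larger buffer only moves the waiting out of the producer's call, which matches the point that time observed in the add call can reflect a bottleneck further downstream.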

Using Pig over Accumulo Data

2013-09-19 Thread Andrew Catterall
Hi Andrew, I've just seen your post to the accumulo mailing list. Not sure if you are still looking at using pig over accumulo data, but I recently got accumulo-pig running against accumulo-1.5.0 and had a couple of problems. To get this running I did the following: * Download the

Re: Using Pig over Accumulo Data

2013-09-19 Thread Andrew Wells
Thanks for the reply! I was just wondering if there was a more recent development for doing this. It is interesting to hear you got it working on 1.5.0. On Thu, Sep 19, 2013 at 9:33 AM, Andrew Catterall catteralland...@googlemail.com wrote: Hi Andrew, I've just seen your post to the

RE: BatchWriter performance on 1.4

2013-09-19 Thread Slater, David M.
Hi David, I've looked at generating rfiles directly, but I know that adds latency to the process, so I wanted to make sure I had found the upper bound for direct mutations before exploring that. The tables are pre-split, and all tservers are engaged in ingest (though the application itself

Re: BatchWriter performance on 1.4

2013-09-19 Thread Keith Turner
Are you aware of the multi table batch writer? I am not sure if it would be useful, but wanted to make sure you knew about it. It will use the same thread pool to process mutations for multiple tables. It will also batch mutations for multiple tablets into the same RPC calls. On Wed, Sep 18,

Re: BatchWriter performance on 1.4

2013-09-19 Thread Keith Turner
On Thu, Sep 19, 2013 at 5:08 PM, Slater, David M. david.sla...@jhuapl.edu wrote: Thanks Keith, I’m looking at it now. It appears like what I would want. As for the proper usage… Would I create one using the Connector, then .getBatchWriter() for each of the tables I’m