Something is just wrong.  You should be able to do 17,000 records per
second from a few nodes with multiple threads against a fairly small
cluster.  You should be able to come close to that from a single node into
a dozen region servers.
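
For reference, the rates discussed in this thread follow from the figures quoted below (20 GB, ~20 minutes, ~1K records).  The sketch here just redoes that arithmetic.  It assumes decimal units (1 GB = 1e9 bytes, 1K = 1000 bytes), and the class and method names are made up for illustration.

```java
// Back-of-the-envelope ingest rates from the figures quoted in this thread.
final class IngestMath {
    // Assumption: decimal units (1 GB = 1e9 bytes, "~1K" = 1000 bytes).
    static final long BYTES_PER_GB = 1_000_000_000L;

    /** Average bytes per second for a load of totalBytes taking the given seconds. */
    static double bytesPerSecond(long totalBytes, long seconds) {
        return (double) totalBytes / seconds;
    }

    /** Average records per second, given a fixed record size in bytes. */
    static double recordsPerSecond(long totalBytes, long recordBytes, long seconds) {
        return (double) (totalBytes / recordBytes) / seconds;
    }

    public static void main(String[] args) {
        long totalBytes = 20 * BYTES_PER_GB; // "Data Size - 20 GB"
        long seconds = 20 * 60;              // "~20 minutes"
        long recordBytes = 1_000;            // "Record size ~1K"

        System.out.printf("%.1f MB/s%n", bytesPerSecond(totalBytes, seconds) / 1e6);
        System.out.printf("%.0f records/s%n",
                recordsPerSecond(totalBytes, recordBytes, seconds));
    }
}
```

Under those assumptions the load averages a bit under 17 MB/s and a bit under 17,000 records per second.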

On Thu, Mar 24, 2011 at 5:32 PM, Vivek Krishna wrote:

> I have a total of 10 client nodes with 3-10 threads running on each node.
> Record size is ~1K.
> Viv
> On Thu, Mar 24, 2011 at 8:28 PM, Ted Dunning wrote:
>> Are you putting this data from a single host?  Is your sender
>> multi-threaded?
>> I note that (20 GB / 20 minutes < 20 MB / s) so you aren't particularly
>> stressing the network.  You would likely be stressing a single-threaded
>> client pretty severely.
>> What is your record size?  It may be that you are bound up by the number
>> of records being inserted rather than the total data size.
>> On Thu, Mar 24, 2011 at 5:22 PM, Vivek Krishna wrote:
>>> Data Size - 20 GB.  It took about an hour with the default HBase settings
>>> and, after varying several parameters, we were able to get this done in
>>> ~20 minutes.  This is slow and we are trying to improve.
>>> We wrote a Java client which would essentially `put` to HBase tables in
>>> batches.  Our fine-tuning parameters include:
>>> 1.  Disabling compaction
>>> 2.  Varying batch sizes of put (tried with 1000, 5000, 10000, 20000, 40000)
>>> 3.  Setting AutoFlush to on/off.
>>> 4.  Varying write buffer (in client) with 2 MB, 128 MB, 256 MB
>>> 5.  Changing hbase.regionserver.handler.count to 100
>>> 6.  Varying regionserver size from 128 to 256/512/1024.
>>> 7.  Increasing number of regions.
>>> 8.  Creating regions with keys pre-specified (so that clients hit the
>>> regions directly)
>>> 9.  Varying number of clients (from 30 clients to 100 clients)
>>> The above was tested on a 38-node cluster with 2 regions each.
>>> We did not try disabling the WAL, fearing loss of data.
>>> Are there any other parameters that we missed during the process?
>>> Viv
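
The combination the thread keeps coming back to (batched puts issued from several threads at once) can be sketched roughly as follows.  This is only an illustration of the pattern, not the client described above and not HBase API code: `BatchSink` and `ParallelBatchWriter` are hypothetical names, and the sink stands in for whatever actually writes a batch to the table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for whatever actually writes a batch (e.g. a table client).
interface BatchSink<T> {
    void write(List<T> batch) throws Exception;
}

final class ParallelBatchWriter {
    /** Splits records into batches of batchSize and writes them from a fixed thread pool. */
    static <T> long writeAll(List<T> records, int batchSize, int threads, BatchSink<T> sink)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong written = new AtomicLong();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int start = 0; start < records.size(); start += batchSize) {
                // Copy the slice so each task owns its batch.
                List<T> batch = new ArrayList<>(
                        records.subList(start, Math.min(start + batchSize, records.size())));
                pending.add(pool.submit(() -> {
                    sink.write(batch);
                    written.addAndGet(batch.size());
                    return null; // Callable form, so write() may throw checked exceptions
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and propagate the first failure, if any
            }
        } finally {
            pool.shutdown();
        }
        return written.get();
    }
}
```

With a sketch like this, batch size and thread count become the two knobs to vary independently; a real sink would also need one table handle per thread if the underlying client is not thread-safe.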
