This rate is dramatically slower than I would expect. In our tests, a single insertion program has trouble inserting more than about 24,000 records per second, but that is because we are inserting kilobyte values and the network interfaces are saturated at that point. These tests were done using a modified YCSB. I can't recommend YCSB for any serious benchmarking, but it can still be useful, and you might be interested in modeling your insertion code on the YCSB insertion code. You can find the modified YCSB in my GitHub account at https://github.com/tdunning/YCSB
Ryan has run tests with smaller rows from a number of clients and seen >150,000 rows/second insert rates. This was, I think, done with a map-reduce program, but that should just provide some parallelism on the driving side. Can you say more about your data and your insertion program?

On Mon, Mar 21, 2011 at 1:17 PM, Stuart Scott <stuart.sc...@e-mis.com> wrote:

> The reason I ask is that we have built a 12 data-node cluster, with
> region servers, ZooKeeper, etc. When we add a few million rows via the
> Java APIs, the system periodically fails. Regions go off-line, and
> performance slows dramatically from 1,000 inserts per second to
> under 200 per second. It just has a general feeling of instability. We
> have rebuilt the system several times. We have changed the write-buffer
> sizes, added more memory, added periodic flushing of the tables, etc.
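Since the write-buffer size keeps coming up, it may help to see what client-side buffering actually buys you. The sketch below is a minimal, self-contained illustration of the idea behind disabling auto-flush in the HBase client: puts accumulate locally and are shipped as one batch once the buffer crosses a size threshold, so fewer round trips are made. All class and method names here are illustrative, not HBase APIs; in the real client you would use HTable with auto-flush disabled and an appropriate write-buffer size.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a write-buffered client. Not an HBase class.
public class BufferedInserter {
    private final List<byte[]> buffer = new ArrayList<byte[]>();
    private long bufferedBytes = 0;
    private final long flushThreshold;
    private int flushCount = 0; // counts simulated round trips to the server

    public BufferedInserter(long flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    public void put(byte[] row) {
        buffer.add(row);
        bufferedBytes += row.length;
        if (bufferedBytes >= flushThreshold) {
            flush();
        }
    }

    public void flush() {
        if (buffer.isEmpty()) return;
        // In a real client this would be one RPC carrying the whole batch.
        buffer.clear();
        bufferedBytes = 0;
        flushCount++;
    }

    public int getFlushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        // 1 KB rows with a 64 KB buffer: 1000 puts batch into 16 round trips
        // instead of 1000 individual ones.
        BufferedInserter inserter = new BufferedInserter(64 * 1024);
        for (int i = 0; i < 1000; i++) {
            inserter.put(new byte[1024]);
        }
        inserter.flush(); // push any remaining buffered puts
        System.out.println("round trips: " + inserter.getFlushCount());
    }
}
```

The point of the sketch is that with kilobyte values, per-put round trips dominate long before the region servers do, which is one reason periodic manual flushing with a small buffer can hurt rather than help.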