Hello,

We did a lot of tests and have now moved our cluster to Ubuntu 9.10. The current configuration
has 7 nodes (1 master/namenode + 6 regionservers/datanodes + 3 ZooKeeper nodes).
The namenode has 16 GB of memory and 16 CPU cores.
Each datanode has 12 GB of memory and 8 CPU cores.
The nodes are connected via 1 Gbit Ethernet.

HBase has the following configuration: http://pastebin.com/m6c7358e6
The table looks like: META:field1, META:field2 ... META:field5 and
contents:field1, contents:field2

The client is implemented in Java. I've checked the HBase unit tests and implemented it as
follows: we have about 6M records to insert at once; the client creates a thread per
100K records and then waits until all threads have finished. Each row is about 25 KB in size.
Each thread creates its own HTable and HBaseConfiguration.
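
For concreteness, here is roughly what one worker thread looks like (a minimal sketch assuming
the 0.20-style HTable/Put client API; the class name, table name and payload are placeholders,
not our exact code):

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// One worker thread handling its batch of ~100K records.
public class InsertWorker implements Runnable {
    private final int count;

    public InsertWorker(int count) {
        this.count = count;
    }

    public void run() {
        try {
            // Each thread builds its own configuration and HTable,
            // so no HBase client objects are shared between threads.
            HBaseConfiguration conf = new HBaseConfiguration();
            HTable table = new HTable(conf, "mytable"); // placeholder table name

            for (int i = 0; i < count; i++) {
                // Row key: current time in milliseconds, in reverse order.
                byte[] rowKey = Bytes.toBytes(Long.MAX_VALUE - System.currentTimeMillis());
                Put put = new Put(rowKey);
                // ~25 KB spread over the META:* and contents:* columns (dummy payload here).
                put.add(Bytes.toBytes("META"), Bytes.toBytes("field1"), Bytes.toBytes("..."));
                put.add(Bytes.toBytes("contents"), Bytes.toBytes("field1"), new byte[25 * 1024]);
                table.put(put);
            }
            table.flushCommits();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}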
Something is going wrong, because sometimes I get this exception:

Exception in thread "Thread-9" java.util.ConcurrentModificationException

What does it mean? I'm not sharing any object instances between threads. Row keys
are the time (in milliseconds) in reverse order. It looks like threads are
trying to modify a row with the same row key simultaneously.

As for timings:
For 5 KB rows we get about 35-40K records per second.
For 25 KB rows -- about 1-2K records per second.

So the throughput differs hugely with row size, which looks illogical: 5 KB x 35-40K rows/s
is roughly 175-200 MB/s, while 25 KB x 1-2K rows/s is only about 25-50 MB/s.

Also, I see that the nodes are almost idle. The HBase JVM heap size is 5 GB on
each node and only 300-500 MB of it is used during the test.
I've applied all the usual performance tuning advice: autoflush off, write buffer =
12 MB, WAL off.
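
In the client code those settings look roughly like this (again just a sketch; the table and
column names are placeholders):

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class TuningExample {
    public static void main(String[] args) throws IOException {
        HTable table = new HTable(new HBaseConfiguration(), "mytable"); // placeholder name
        table.setAutoFlush(false);                   // autoflush off: puts are buffered client-side
        table.setWriteBufferSize(12L * 1024 * 1024); // 12 MB write buffer

        Put put = new Put(Bytes.toBytes(Long.MAX_VALUE - System.currentTimeMillis()));
        put.setWriteToWAL(false);                    // WAL off for this put
        put.add(Bytes.toBytes("META"), Bytes.toBytes("field1"), Bytes.toBytes("value"));
        table.put(put);

        table.flushCommits();                        // explicit flush since autoflush is off
    }
}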

Btw, how dangerous is it to switch the WAL off? Thank you.

-- 
Regards, Lyfar Dmitriy
