If you didn't configure anything beyond the heap, PE will by default create a table with a single region and a low (albeit default) memstore size. This means it spends its time waiting on splits, and it keeps recompacting your data, which wastes a lot of IOPS.
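To make the single-region point concrete: writes only spread across the region servers once the table has several regions, and one way to get there without waiting on splits is to create the table with split points up front. Below is a minimal sketch in Python of how evenly spaced split points could be computed. It assumes (not confirmed here, check your version) that PE row keys are zero-padded 10-digit decimal numbers. `split_keys`, the row count, and the region count are hypothetical names and values chosen for illustration.

```python
def split_keys(total_rows, num_regions, width=10):
    """Return num_regions - 1 evenly spaced split keys.

    Assumption: row keys are decimal row numbers, zero-padded to `width`
    characters, so lexicographic order matches numeric order.
    """
    if num_regions < 2:
        # A single region needs no split points.
        return []
    step = total_rows // num_regions
    # Boundary i separates region i-1 from region i.
    return [str(i * step).zfill(width) for i in range(1, num_regions)]


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 rows spread over 6 regions.
    for key in split_keys(1_000_000, 6):
        print(key)
```

With 6 region servers, a call like `split_keys(1_000_000, 6)` gives 5 boundaries. These would then be handed to whatever you use to create the table, for example `HBaseAdmin.createTable(desc, splitKeys)` in the Java API.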
You didn't tell us which version you're using, so here are two ways to fix the former:

0.90: run the import a few times so that the regions can split, then run a major compaction.
0.92: use https://issues.apache.org/jira/browse/HBASE-4440, it's pretty easy to backport.

To fix the latter, set the table's MEMSTORE_FLUSHSIZE to something better like 256MB and, once the table is pre-split, also change MAX_FILESIZE to >1GB.

J-D

On Mon, Feb 6, 2012 at 7:15 AM, Assarsson, Emil <[email protected]> wrote:
> Hi,
>
> I'm trying to optimize an HBase cluster (on HDFS) with the randomWrite test. I
> have 7 nodes: 1 zookeeper/name/hbase-master/jobtracker and 6
> region/data/tasktrackers. Each with 1 disk, 16G memory, 2 x 4 cores. I know
> that I really should have more disks but for the time being I'm trying to do
> the best with what I have.
>
> I have configured tasktrackers to run 1 map/1 red on each host.
>
> The problem I'm seeing is that I get very varying results spanning from
> 16 sec/100000 inserts to 240 sec/100000 inserts.
> Currently I'm using a 10G heapsize on hbase and 3G heapsize on hdfs.
>
> How do I find out what makes it this random? I think I should be able to get
> around 22 sec/100000 inserts.
>
> Best regards
>
> Emil Assarsson
> Sony Ericsson Mobile Communications AB
