Hello Jun Li:

So, you modified the code to set N=4?
What happens if you leave it at 1 and instead run the ten clients as in "sequentialWrite 10"? (The '10' there says: run 10 concurrent clients. Isn't that what you want?) I ask because I have not played around with changing N; to specify more clients, I just pass the client count as an argument on the command line. There is also the --nomapred option, which runs all of the clients in a single JVM (I find that, for smaller client counts, this can put up a heavier load than running via MapReduce).

In my experience with 1KB rows, on a cluster of 4 machines each running a tasktracker that ran two concurrent children at a time, I was able to run tests with 8 clients writing to one regionserver. If I set the cell size down below 100 bytes, I would OOME because of index sizes (the configuration below has the biggest effect on heap used). If I ran with more than 8 clients, I would run into issues where compactions were overwhelmed by the upload rate (we need to make our compactions run faster).

St.Ack

<property>
  <name>hbase.io.index.interval</name>
  <value>128</value>
  <description>The interval at which we record offsets in hbase
  store files/mapfiles. Default for stock mapfiles is 128. Index
  files are read into memory; if there are many of them, they can
  prove a burden. If so, play with the hadoop io.map.index.skip
  property and skip every nth index member when reading the index
  back into memory. The downside to a high index interval is
  slower access times.
  </description>
</property>
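To make those two knobs concrete, here is roughly how I would use them. Both snippets are illustrative sketches, not something I have tested against your cluster. To run, say, 4 concurrent clients in the one JVM (assuming --nomapred composes with --rows as the usage string suggests):

  bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=10240 sequentialWrite 4

And if small cells have you OOMEing on index size, io.map.index.skip goes in your hadoop-site.xml. A skip of 1 loads every other index entry, roughly halving index heap at some cost in seek time (the value 1 is only an example; tune to taste):

<property>
  <name>io.map.index.skip</name>
  <value>1</value>
  <description>Number of index entries to skip between each entry
  loaded when reading a mapfile index into memory. Example value
  only; higher values use less heap but slow seeks further.
  </description>
</property>

Since you are running with the default heap, you might also try upping HBASE_HEAPSIZE in conf/hbase-env.sh before retesting.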
On Fri, Mar 27, 2009 at 10:58 AM, Jun Li <jltz922...@gmail.com> wrote:
> Hi, I have just set up a small Linux cluster of 3 physical machines
> (dual core, running RedHat Enterprise 5) to run Hadoop and HBase. No
> virtual machines are involved in the test bed at this time. The
> versions of HBase and Hadoop that I chose are hbase-0.19.0 and
> hadoop-0.19.0.
>
> To better understand the performance, I ran the PerformanceEvaluation
> that is in the test directory that comes with the release. When I used
> the client number N=1, all the tests (sequentialWrite, sequentialRead,
> randomWrite, randomRead, scan) worked fine with the default row number,
> R = 1024*1024, and each row containing 1 KB of data.
>
> However, when I chose N=4, with the row number set to 102400 (even
> smaller than the 1024*1024 described above), and ran the following
> command:
>
> bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation --rows=10240
> sequentialWrite 10
>
> the Map/Reduce job fails. When I checked the logs, I found this error
> message:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> contact region server 15.25.119.59:60020 for region
> TestTable,,1238136022072, row '0000010239', but failed after 10
> attempts.
> Exceptions:
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
>
>     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:841)
>     at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:932)
>     at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:370)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:393)
>     at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:583)
>     at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:182)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>     at org.apache.hadoop.mapred.Child.main(Child.java:155)
>
> All my HBase runtime configuration parameters, such as the JVM heap,
> are at their default settings. Could you provide help with addressing
> this issue?
>
> Also, it seems that on the Wiki page,
> http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation, the focus is
> on the case of N=1. Could you share with me the highest number of N
> (concurrent clients) that has been tested on HBase, given a comparable
> number of rows in the range of 1024*1024 or above?
>
> Thank you and Best Regards.