Hi, I have just set up a small Linux cluster of 3 physical machines (dual-core, running Red Hat Enterprise Linux 5) as a test bed for Hadoop and HBase. No virtual machines are involved at this time. The versions I chose are hadoop-0.19.0 and hbase-0.19.0.
To better understand performance, I ran the PerformanceEvaluation tool that ships in the test directory of the release. With the number of clients N=1, all of the tests (sequentialWrite, sequentialRead, randomWrite, randomRead, scan) work fine at the default row count of R = 1024*1024, with each row containing 1 KB of data.

However, when I chose N=4, with the row count set to 102400 (even smaller than the 1024*1024 above), and ran the following command:

  bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation --rows=10240 sequentialWrite 10

the Map/Reduce job fails. Checking the logs, I see this error:

  org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server 15.25.119.59:60020 for region TestTable,,1238136022072, row '0000010239', but failed after 10 attempts.
  Exceptions:
  java.lang.OutOfMemoryError: Java heap space
  (the line above appears 10 times, once per attempt)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:841)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:932)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:370)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:393)
    at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:583)
    at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:182)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child.main(Child.java:155)

All my HBase runtime configuration parameters, including the JVM heap, are at their default settings. Could you help me address this issue?

Also, the Wiki page http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation seems to focus on the N=1 case. Could you share the largest number of concurrent clients N that has been tested against HBase with a comparable row count, in the range of 1024*1024 or above?

Thank you and best regards.
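P.S. In case it matters, the only heap-related knob I am aware of is HBASE_HEAPSIZE in conf/hbase-env.sh, which I have left untouched. If raising the region server heap turns out to be the fix, I assume the change would look something like the sketch below (the 2000 MB value is just a guess on my part, untested):

```shell
# conf/hbase-env.sh (sketch, untested): HBASE_HEAPSIZE sets the maximum
# JVM heap, in MB, for the HBase daemons started by the bin/ scripts.
# The shipped default is 1000; 2000 here is an assumed larger value to try.
export HBASE_HEAPSIZE=2000
```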