Hello Jun Li:

So, you modified the code to set N=4?

What happens if you leave it at 1 and instead run the ten clients as in
"sequentialWrite 10"? (The '10' in that command says to run 10 concurrent
clients -- isn't that what you want?)

The reason I ask is that I have not played around with changing N.  To
specify more clients, I just pass the number of clients as the argument on
the command line.

There is also the --nomapred option, which runs all clients in one JVM
(I find that, for smaller numbers of clients, this can put up a heavier
load than running via MR).
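
For reference, the two invocations discussed above can be written out as a small dry-run sketch. It only prints the command lines and does not contact a cluster. The variable names are illustrative, and the client count of 10 is taken from the command quoted below; adjust it to taste.

```shell
#!/bin/sh
# Dry run: print the two PerformanceEvaluation invocations discussed above.
# Nothing is executed against a cluster; paste a printed line into a shell
# at the top of your Hadoop install to actually run it.
PE_CLASS=org.apache.hadoop.hbase.PerformanceEvaluation
NCLIENTS=10   # number of concurrent clients (the trailing argument)

# Clients launched via MapReduce.
MR_CMD="bin/hadoop $PE_CLASS sequentialWrite $NCLIENTS"

# Same test with --nomapred: all clients run in one JVM.
LOCAL_CMD="bin/hadoop $PE_CLASS --nomapred sequentialWrite $NCLIENTS"

echo "$MR_CMD"
echo "$LOCAL_CMD"
```

Options such as --nomapred and --rows go before the test name; the client count comes last.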

In my experience with rows of 1K, on a cluster of 4 machines each running
a tasktracker that ran two concurrent children at a time, I was able to run
tests with 8 clients writing to one regionserver.  If I set the cell size
down -- < 100 bytes -- I'd OOME because of index sizes (the configuration
below has the biggest effect on heap used).  If I ran with more than 8
clients, I'd run into issues where compactions were overwhelmed by the
upload rate (we need to make our compactions run faster).


    <description>The interval at which we record offsets in hbase
    store files/mapfiles.  Default for stock mapfiles is 128.  Index
    files are read into memory.  If there are many of them, could prove
    a burden.  If so play with the hadoop io.map.index.skip property and
    skip every nth index member when reading back the index into memory.
    Downside to high index interval is lowered access times.
    </description>
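
A rough back-of-envelope sketch of why small cells inflate the index: for the same volume of data, smaller cells mean more cells, and so more index entries at a fixed interval. The 1 GiB data volume is an arbitrary illustration, and the sketch assumes one index entry per 128 cells while ignoring key sizes and per-entry overhead, so treat the numbers as relative rather than as a heap estimate.

```shell
#!/bin/sh
# Back-of-envelope: index entries for a fixed volume of data at two cell sizes.
# Assumption: one index entry every INTERVAL cells; key and per-entry
# overhead are ignored, so only the ratio between the results is meaningful.
DATA_BYTES=$((1024 * 1024 * 1024))  # 1 GiB of cell data (illustrative)
INTERVAL=128                        # stock MapFile default cited in the description

index_entries() {
  # $1 = cell size in bytes
  echo $(( DATA_BYTES / $1 / INTERVAL ))
}

LARGE=$(index_entries 1024)  # 1 KB cells
SMALL=$(index_entries 100)   # 100-byte cells

echo "1 KB cells:     $LARGE index entries"
echo "100-byte cells: $SMALL index entries"
```

With these assumptions the 100-byte case holds roughly ten times as many index entries in memory as the 1 KB case.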

On Fri, Mar 27, 2009 at 10:58 AM, Jun Li <jltz922...@gmail.com> wrote:

> Hi, I have just set up a small Linux machine cluster with 3 physical
> machines (dual core, with RedHat Enterprise 5) to run Hadoop and HBase. No
> virtual machines are involved at this time in the test bed. The version of
> HBase and Hadoop that I chose is hbase-0.19.0 and hadoop-0.19.0.
> To better understand the performance, I ran the PerformanceEvaluation tool,
> which ships in the test directory of the release.  With the client number
> N=1, all the tests (sequentialWrite, sequentialRead, randomWrite,
> randomRead, scan) work fine for the default row count, R = 1024*1024, with
> each row containing 1 KB of data.
> However, when I choose N=4, with the row number set to 102400 (even
> smaller than the 1024*1024 described above), and run the following command:
>  bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation --rows=10240 sequentialWrite 10
> the Map/Reduce job fails. When I check the logs, I find this error message:
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> contact region server for region
> TestTable,,1238136022072, row '0000010239', but failed after 10
> attempts.
> Exceptions:
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:841)
>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:932)
>        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
>        at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:370)
>        at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:393)
>        at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:583)
>        at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:182)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>        at org.apache.hadoop.mapred.Child.main(Child.java:155)
> All my HBase runtime configuration parameters, such as JVM heap, are left
> at their default settings.  Could you help me address this issue?
> Also, it seems that on the wiki page,
> http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation, the focus is on
> the case of N=1. Could you share the highest number of N
> (concurrent clients) that has been tested on HBase, with a comparable
> number of rows, in the range of 1024*1024 or above?
> Thank you and Best Regards.
