Hi Stack,

Thank you very much for your reply.

No, I did not modify the source code of
org.apache.hadoop.hbase.PerformanceEvaluation.  In fact, on my current
configuration of 3 machines (1 HBase master and 2 region servers), either
N = 4 or N = 10 produces the same OutOfMemoryError that I reported when I
run this command:

 bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 10
(or sequentialWrite 4)

In my previous email, I reported that when I set N = 10, even setting the
total number of rows ("--rows") to 102400 still produces the same OOME.

Compared to your configuration for 8 clients, I will need one more machine
to reproduce your experiment. Do you have any special settings (JVM heap,
memory, etc.) in HDFS or HBase that differ from the defaults? If so, could
you share them with me? You mention "io.map.index.skip" in your email; do
you recommend that I experiment with it?


In the actual solution that I am planning to build on HBase, I can arrange
to have many machines form the HBase cluster, say over a hundred VMs. Based
on your experience, what would be the performance impact of adding more and
more region servers to serve concurrent clients, in the current
implementation of HBase (version 0.19.0 or higher)? I would imagine that
the number of concurrent clients that can be served grows up to a certain
point and then saturates.

Regards,

Jun



On Mon, Mar 30, 2009 at 4:12 AM, stack <st...@duboce.net> wrote:

> Hello Jun Li:
>
> So, you modified the code to set N=4?
>
> What happens if you leave it at 1 and instead run the ten clients as in
> "sequentialWrite 10"? (The '10' there says: run 10 concurrent clients --
> isn't that what you want?)
>
> The reason I ask is that I have not played around with changing N.  To
> specify more clients, I just pass the client count as an argument on the
> command line.
>
> There is also the --nomapred option, which runs all the clients in a
> single JVM (I find that, for smaller client counts, this can put up a
> heavier load than running via MR).
>
> In my experience with rows of 1K, on a cluster of 4 machines each running
> a tasktracker that ran two concurrent children at a time, I was able to run
> tests with 8 clients writing to one regionserver.  If I set the cell size
> down -- < 100 bytes -- I'd find that I'd OOME because of index sizes (the
> configuration below has the biggest effect on heap used).  If I ran with
> more than 8 clients, I'd run into issues where compactions were overwhelmed
> by the upload rate (we need to make our compactions run faster).
>
> St.Ack
>
>  <property>
>    <name>hbase.io.index.interval</name>
>    <value>128</value>
>    <description>The interval at which we record offsets in hbase
>    store files/mapfiles.  Default for stock mapfiles is 128.  Index
>    files are read into memory.  If there are many of them, could prove
>    a burden.  If so play with the hadoop io.map.index.skip property and
>    skip every nth index member when reading back the index into memory.
>    Downside to high index interval is lowered access times.
>    </description>
>  </property>
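>
> If the index memory is still a problem, my understanding is that you can
> also set the stock hadoop property directly, e.g. in hbase-site.xml (a
> value of 1 should keep every other index entry, roughly halving the
> index memory):
>
>  <property>
>    <name>io.map.index.skip</name>
>    <value>1</value>
>  </property>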
>
>
> On Fri, Mar 27, 2009 at 10:58 AM, Jun Li <jltz922...@gmail.com> wrote:
>
> > Hi, I have just set up a small Linux cluster with 3 physical machines
> > (dual core, running Red Hat Enterprise 5) to run Hadoop and HBase. No
> > virtual machines are involved in the test bed at this time. The versions
> > of HBase and Hadoop that I chose are hbase-0.19.0 and hadoop-0.19.0.
> >
> > To better understand the performance, I ran the PerformanceEvaluation
> > tool that ships in the test directory of the release.  When I used the
> > client number N = 1, all the tests (sequentialWrite, sequentialRead,
> > randomWrite, randomRead, scan) worked fine with the default row number,
> > R = 1024*1024, and each row containing 1 KB of data.
> >
> > However, when I choose N = 4, with the row number set to 102400 (even
> > smaller than the 1024*1024 described above), and run the following
> > command:
> >
> >  bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation --rows=10240 sequentialWrite 10
> >
> > the MapReduce job fails. When I check the logs, I see this error message:
> >
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> > contact region server 15.25.119.59:60020 for region
> > TestTable,,1238136022072, row '0000010239', but failed after 10
> > attempts.
> > Exceptions:
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> > java.lang.OutOfMemoryError: Java heap space
> >
> >        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:841)
> >        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:932)
> >        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1372)
> >        at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:370)
> >        at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:393)
> >        at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:583)
> >        at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:182)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> >        at org.apache.hadoop.mapred.Child.main(Child.java:155)
> >
> >
> > All my HBase runtime configuration parameters, such as the JVM heap, are
> > left at their default settings.  Could you provide help on addressing
> > this issue?
> >
> > Also, it seems that on the Wiki page,
> > http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation, the focus is
> > on the case of N = 1. Could you share the highest number of concurrent
> > clients (N) that has been tested on HBase, with a comparable number of
> > rows, in the range of 1024*1024 or above?
> >
> > Thank you and Best Regards.
> >
>
