Thanks for the valuable tips, Ryan. I probably should have replied sooner,
but I was busy experimenting with my tiny cluster, and I'd like to share
some of my experience on the list for future reference.

I took your advice and did a lot of reading on Sun's garbage collectors,
particularly the new CMS collector and the parallel collector. When I tried
the CMS collector along with the incremental option, I didn't see a big
performance hit, even though Sun's documentation suggests incremental mode
for machines with one or two processors. When I removed the incremental
option, I got a small (~5%) overall improvement in my upload speed.

Moreover, when I set MaxNewSize and NewSize to 12m, the time spent on each
minor collection dropped just as you suggested, but the number of minor
collections increased dramatically, so the total time spent on new
generation garbage collection (# of minor collections * average time per
collection) actually increased my upload time. I then raised MaxNewSize and
NewSize to 48m and observed less frequent but longer minor collections. I
also increased the client buffer size from 12m to 24m.
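
In case the "client buffer" part is unclear, that's the HTable write buffer
from my import code (quoted below); going from 12m to 24m is roughly the
following, with a made-up table name:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    // hypothetical table name; the setters mirror the quoted import code
    HTable table = new HTable(new HBaseConfiguration(), "frequencies");
    table.setAutoFlush(false);                  // keep client-side batching on
    table.setWriteBufferSize(1024 * 1024 * 24); // 24 MB buffer, up from 12 MB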

Here are the best garbage collector settings I could come up with:

export HBASE_OPTS="-Xmx1536m -Xms1536m -XX:MaxNewSize=48m -XX:NewSize=48m
-XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
-XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps -Xloggc:/tmp/hadoop-gc.log"
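
(These go into conf/hbase-env.sh, followed by a regionserver restart, if
anyone wants to try them.)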

On a small cluster with 7 regionservers and 1 master node, I could upload
4 GB of data (approximately 40m rows, if I remember correctly) in 16 minutes.

For comparison, the following settings:

export HBASE_OPTS="-Xmx1536m -Xms1536m -XX:MaxNewSize=6m -XX:NewSize=6m
-XX:+UseConcMarkSweepGC
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-Xloggc:/tmp/hadoop-gc.log"

resulted in a 20 min upload time.
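
Taking the ~40m row estimate at face value, that works out to roughly 40m
rows / 16 min, or about 42,000 rows/sec with the tuned settings versus
about 33,000 rows/sec with the old ones: roughly 25% higher throughput, or
20% less wall-clock time.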

I should also mention again that I'm on a shared cluster with a total of
100 machines, where each machine has a dual-core CPU and 4 GB of memory.
The 8 machines I used were dedicated to me during the tests, so no one else
was using them, but I didn't measure the network traffic in the rest of the
cluster, so I can't say whether it affected the results.


I thought this info might be useful for people who, like me, are working
with older hardware and limited memory and processing power. I also saw
that you posted some of your experience in the Performance Tuning section
of the wiki, so I'd be happy to write up some advice for users like me if
you think that would be helpful for the rest of the community.

Thanks,
Jim


On Wed, Apr 29, 2009 at 5:54 PM, Ryan Rawson <ryano...@gmail.com> wrote:

> You might have to delve into tweaking the GC settings.  Here is what I am
> setting:
>
> export HBASE_OPTS="-XX:+UseConcMarkSweepGC -XX:NewSize=12m
> -XX:MaxNewSize=12m -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -Xloggc:/export/hadoop/logs/gc.log"
>
> What I found is that once your in-memory complexity rises, the JVM resizes
> the ParNew/new-gen space up and up, thus extending the length of the
> so-called 'minor collections'.  At one point, with a 150 MB ParNew, I was
> typically seeing 100-150ms 'minor' GC pauses with outliers of 400ms.  If
> you have all machines in your cluster pausing for 100ms at semi-random
> intervals, that holds up your import, as the clients are waiting on the
> paused JVM to continue.
>
> The key thing I had to do was set -XX:MaxNewSize=12m (about 2 * my L2 on
> Xeons).  You get more frequent, but smaller, GCs.  Your heap also tends to
> grow larger than before, but with CMS that doesn't result in longer VM
> pauses, just more RAM usage.  I personally use -Xmx4500m.  With your
> machines, I'd consider a setting of at least 1500m, preferably 2000-2500m.
> Of course I have a ton of heap to chuck at it, so CMS collections don't
> happen all the time (but when they do, they can prune 1500 MB of garbage).
>
> Since you are on a 2-core machine, you will probably have to set the CMS
> collector to incremental mode:
>
> -XX:+CMSIncrementalMode
>
> to prevent the CMS GC from starving out your main threads.
>
> Good luck with it!
> -ryan
>
> On Wed, Apr 29, 2009 at 3:33 PM, Jim Twensky <jim.twen...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I'm doing some experiments to import large datasets into HBase using a
> > map job. Before posting some numbers, here is a summary of my test
> > cluster:
> >
> > I have 7 regionservers and 1 master. I also run HDFS datanodes and
> > Hadoop tasktrackers on the same 7 regionservers. Similarly, I run the
> > Hadoop namenode on the same machine as the HBase master. Each machine is
> > an IBM e325 node that has two 2.2 GHz AMD64 processors, 4 GB of RAM, and
> > an 80 GB local disk.
> >
> > My dataset is simply the output of another MapReduce job, consisting of
> > 7 sequence files with a total size of 40 GB. Each file contains
> > (key, value) records of the form (Text, LongWritable). The keys are
> > sentences or phrases extracted from sentences and the values are
> > frequencies. The total number of records is roughly 420m, and an average
> > key is around 100 bytes (40 GB / 420m, ignoring the LongWritables).
> >
> > I tried to randomize the (key, value) pairs with another MapReduce job,
> > and I also set:
> >
> >            table.setAutoFlush(false);
> >            table.setWriteBufferSize(1024*1024*10);
> >
> > based on some advice that I read earlier on the list. My map function
> > that imports data into HBase is as follows:
> >
> >    public void map(Text key, LongWritable value,
> >            OutputCollector<NullWritable, NullWritable> output,
> >            Reporter reporter) throws IOException {
> >
> >        // the row key is the sentence/phrase; the frequency goes into
> >        // a single column
> >        BatchUpdate update = new BatchUpdate(key.toString());
> >        update.put("frequency:value", Bytes.toBytes(value.get()));
> >
> >        table.commit(update);
> >    }
> >
> > So far I can hit 20% of the import in 40-45 minutes, so importing the
> > whole data set will presumably take more than 3.5 hours. I tried
> > different write buffer sizes between 5 MB and 20 MB and didn't get any
> > significant improvement. I ran my experiments with 1 or 2 mappers per
> > node, although 1 mapper per node seemed to do better than 2. When I
> > refresh the HBase master web interface during my imports, I see the
> > requests generally divided equally among the 7 regionservers, and as I
> > keep hitting the refresh button, I can see that I get 10000 to 70000
> > requests at once.
> >
> > I read some earlier posts from Ryan and Stack, and I was actually
> > expecting at least twice this performance, so I decided to ask the list
> > whether this is the expected performance or way below it.
> >
> > I'd appreciate any comments/suggestions.
> >
> >
> > Thanks,
> > Jim
> >
>
