What version of HBase / HDFS are you running?

Cheers



On Sat, Jan 4, 2014 at 12:17 PM, Akhtar Muhammad Din
<akhtar.m...@gmail.com> wrote:

> Hi,
> I have been running a MapReduce job that joins two datasets of 1.3 GB and
> 4 GB. The join is done on the reduce side. Output is written to either HBase
> or HDFS, depending on configuration. The problem is that writing the
> processed data to HBase takes about 60-80 minutes, whereas writing the same
> data to HDFS takes only 3-5 minutes. I would like to improve the HBase write
> speed and bring it down to 1-2 minutes.
>
> I am using Amazon EC2 instances. I launched a cluster of 3 nodes and later
> one of 10, and have tried both c3.4xlarge and c3.8xlarge instances.
>
> I see a significant performance improvement when writing to HDFS as I use
> clusters with more nodes and higher specifications, but with HBase there
> was no significant change in performance.
>
> I have gone through various posts and articles and have read the HBase
> book to solve this performance issue, but have not succeeded so far.
> Here are the things I have tried:
>
> *Client Side*
> - Turned off writing to the WAL
> - Experimented with the write buffer size
> - Turned off auto flush on the table
> - Used cache, experimented with different sizes
>
>
> *HBase Server Side*
> - Increased the region server heap size to 8 GB
> - Experimented with the handler count
> - Increased the memstore flush size to 512 MB
> - Experimented with hbase.hregion.max.filesize, tried different sizes
>
> I have tried many other parameters, following suggestions from different
> sources, but nothing has worked so far.
>
> Your help will be really appreciated.
>
> --
> Regards
> Akhtar Muhammad Din
>
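
A note on the client-side items listed above (turning off auto flush and sizing the write buffer): the idea behind both is to batch many puts into fewer round trips to the region servers. The toy model below is only a sketch of that idea in plain Java. It does not use the HBase client API; the class `WriteBufferSketch` and its methods are hypothetical names made up for illustration, and the record count, record size and 2 MB limit are assumed values, not figures from the job described above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a client-side write buffer. This is NOT the HBase client API;
// all names here are hypothetical and exist only for illustration.
class WriteBufferSketch {
    private final long bufferLimitBytes;          // stands in for the client write buffer size
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int roundTrips = 0;                   // simulated calls to the server

    WriteBufferSketch(long bufferLimitBytes) {
        this.bufferLimitBytes = bufferLimitBytes;
    }

    // Queue one record; send the batch once the buffer limit is reached.
    void put(byte[] record) {
        pending.add(record);
        pendingBytes += record.length;
        if (pendingBytes >= bufferLimitBytes) {
            flush();
        }
    }

    // Send whatever is queued as a single simulated round trip.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        roundTrips++;
        pending.clear();
        pendingBytes = 0;
    }

    int roundTrips() {
        return roundTrips;
    }

    // Write `count` records of `size` bytes and report the round trips used.
    static int roundTripsFor(int count, int size, long bufferLimitBytes) {
        WriteBufferSketch buffer = new WriteBufferSketch(bufferLimitBytes);
        for (int i = 0; i < count; i++) {
            buffer.put(new byte[size]);
        }
        buffer.flush(); // final flush, as a writer would do on close
        return buffer.roundTrips();
    }

    public static void main(String[] args) {
        // A 1-byte limit flushes on every put, similar to leaving auto flush on.
        System.out.println("per-record flush: " + roundTripsFor(10_000, 100, 1));
        // An assumed 2 MB buffer batches many records per round trip.
        System.out.println("2 MB buffer:      " + roundTripsFor(10_000, 100, 2L * 1024 * 1024));
    }
}
```

In this model the number of round trips drops as the buffer limit grows, which is the effect the auto flush and write buffer settings are meant to have. It says nothing about server-side costs, so it cannot by itself explain the gap between the HBase and HDFS timings.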
