There are 8 items under http://hbase.apache.org/book.html#perf.writing
I guess you have gone through all of them :-)

On Sat, Jan 4, 2014 at 1:34 PM, Akhtar Muhammad Din <akhtar.m...@gmail.com> wrote:
> Thanks guys for your precious time.
> Vladimir, as Ted rightly said, I want to improve write performance currently
> (of course I want to read data as fast as possible later on).
> Kevin, my current understanding of bulk load is that you generate StoreFiles
> and later load them through a command-line program. I don't want to do any
> manual step. Our system gets data every 15 minutes, so the requirement is to
> automate it completely through the client API.
>
> On Sun, Jan 5, 2014 at 2:19 AM, Kevin O'dell <kevin.od...@cloudera.com> wrote:
> > Have you tried writing out an HFile and then bulk loading the data?
> >
> > On Jan 4, 2014 4:01 PM, "Ted Yu" <yuzhih...@gmail.com> wrote:
> > > bq. Output is written to either Hbase
> > >
> > > Looks like Akhtar wants to boost write performance to HBase.
> > > MapReduce over snapshot files targets higher read throughput.
> > >
> > > Cheers
> > >
> > > On Sat, Jan 4, 2014 at 12:55 PM, Vladimir Rodionov <vrodio...@carrieriq.com> wrote:
> > > > You can try MapReduce over snapshot files:
> > > > https://issues.apache.org/jira/browse/HBASE-8369
> > > >
> > > > but you will need to patch 0.94.
> > > >
> > > > Best regards,
> > > > Vladimir Rodionov
> > > > Principal Platform Engineer
> > > > Carrier IQ, www.carrieriq.com
> > > > e-mail: vrodio...@carrieriq.com
> > > >
> > > > ________________________________________
> > > > From: Akhtar Muhammad Din [akhtar.m...@gmail.com]
> > > > Sent: Saturday, January 04, 2014 12:44 PM
> > > > To: user@hbase.apache.org
> > > > Subject: Re: Hbase Performance Issue
> > > >
> > > > I'm using CDH 4.5:
> > > > Hadoop: 2.0.0-cdh4.5.0
> > > > HBase: 0.94.6-cdh4.5.0
> > > >
> > > > Regards
> > > >
> > > > On Sun, Jan 5, 2014 at 1:24 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > > > What version of HBase / HDFS are you running with?
> > > > > Cheers
> > > > >
> > > > > On Sat, Jan 4, 2014 at 12:17 PM, Akhtar Muhammad Din <akhtar.m...@gmail.com> wrote:
> > > > > > Hi,
> > > > > > I have been running a MapReduce job that joins two datasets, 1.3 GB
> > > > > > and 4 GB in size. The join is done on the reduce side. Output is
> > > > > > written to either HBase or HDFS depending on configuration. The
> > > > > > problem I am having is that HBase takes about 60-80 minutes to write
> > > > > > the processed data, while HDFS takes only 3-5 minutes to write the
> > > > > > same data. I really want to improve the HBase speed and bring it
> > > > > > down to 1-2 minutes.
> > > > > >
> > > > > > I am using Amazon EC2 instances; I launched a cluster of size 3 and
> > > > > > later 10, and have tried both c3.4xlarge and c3.8xlarge instances.
> > > > > >
> > > > > > I can see a significant increase in performance when writing to HDFS
> > > > > > as I use a cluster with more, higher-spec nodes, but in the case of
> > > > > > HBase there was no significant change in performance.
> > > > > >
> > > > > > I have been going through different posts and articles and have read
> > > > > > the HBase book to solve the HBase performance issue, but have not
> > > > > > succeeded so far.
> > > > > > Here are the few things I have tried out so far:
> > > > > >
> > > > > > *Client Side*
> > > > > > - Turned off writing to the WAL
> > > > > > - Experimented with the write buffer size
> > > > > > - Turned off auto-flush on the table
> > > > > > - Used the cache, experimented with different sizes
> > > > > >
> > > > > > *Hbase Server Side*
> > > > > > - Increased region servers' heap size to 8 GB
> > > > > > - Experimented with the handler count
> > > > > > - Increased the memstore flush size to 512 MB
> > > > > > - Experimented with hbase.hregion.max.filesize, tried different sizes
> > > > > >
> > > > > > There are many other parameters I have tried following suggestions
> > > > > > from different sources, but nothing has worked so far.
> > > > > >
> > > > > > Your help will be really appreciated.
> > > > > >
> > > > > > --
> > > > > > Regards
> > > > > > Akhtar Muhammad Din
> > > >
> > > > --
> > > > Regards
> > > > Akhtar Muhammad Din

--
Regards
Akhtar Muhammad Din
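On the bulk-load point raised in the thread: the load step does not have to be a manual command-line invocation. The same logic behind the completebulkload tool is exposed as the LoadIncrementalHFiles class, so a 15-minute pipeline can prepare StoreFiles with HFileOutputFormat and load them entirely from client code. A minimal sketch against the 0.94 API follows; the table name "mytable", the staging path "/tmp/hfiles", and the omitted mapper/input setup are placeholders for the actual join job:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class AutomatedBulkLoad {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table name

        // Step 1: a MapReduce job that writes StoreFiles instead of Puts.
        // configureIncrementalLoad sets the reducer, partitioner, and sort
        // order so output HFiles line up with the table's region boundaries.
        Job job = new Job(conf, "hfile-prep");
        // ... set the job's mapper, input format, and input path here ...
        HFileOutputFormat.configureIncrementalLoad(job, table);
        Path staging = new Path("/tmp/hfiles"); // hypothetical staging dir
        FileOutputFormat.setOutputPath(job, staging);
        if (!job.waitForCompletion(true)) {
            throw new RuntimeException("HFile preparation job failed");
        }

        // Step 2: move the finished HFiles into the regions. This is the
        // same work the completebulkload tool does, invoked programmatically.
        new LoadIncrementalHFiles(conf).doBulkLoad(staging, table);
        table.close();
    }
}
```

Because the load is a file move rather than a stream of RPCs through the memstores, this path typically scales with HDFS write speed, which is the behavior already observed for plain HDFS output.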