Akhtar: There is no manual step for bulk load. You essentially have a script that runs the MapReduce job that creates the HFiles. On success of this script/command, you run the completebulkload command ... the whole bulk load can be automated, just like your MapReduce job.
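For what it's worth, here is a rough sketch of that end-to-end flow driven entirely from the client API. It assumes 0.94-era classes (HFileOutputFormat, LoadIncrementalHFiles); the mapper, table name, column family and paths are placeholders, not anything from this thread:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class AutomatedBulkLoad {

  // Example mapper: turns "rowkey,value" CSV lines into Puts.
  // The column family "cf" and qualifier "v" are placeholders.
  public static class CsvToPutMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split(",", 2);
      byte[] row = Bytes.toBytes(parts[0]);
      Put put = new Put(row);
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("v"), Bytes.toBytes(parts[1]));
      ctx.write(new ImmutableBytesWritable(row), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "my_table");   // placeholder table name

    Job job = new Job(conf, "hfile-generation");
    job.setJarByClass(AutomatedBulkLoad.class);
    job.setMapperClass(CsvToPutMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    FileInputFormat.addInputPath(job, new Path("/input/batch"));  // placeholder
    Path hfileDir = new Path("/tmp/hfiles");                      // placeholder
    FileOutputFormat.setOutputPath(job, hfileDir);

    // Wires in the reducer, partitioner and output format so the job
    // writes HFiles that match the table's current region boundaries.
    HFileOutputFormat.configureIncrementalLoad(job, table);

    if (job.waitForCompletion(true)) {
      // The programmatic equivalent of the completebulkload command:
      // moves the generated HFiles into the table's regions.
      new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);
    }
    table.close();
  }
}

configureIncrementalLoad() sets up a TotalOrderPartitioner keyed to the table's region start keys, which is what keeps the final doBulkLoad() step down to a file move rather than a trip through the write path.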
--Suraj

On Mon, Jan 6, 2014 at 11:42 AM, Mike Axiak <m...@axiak.net> wrote:

> I suggest you look at hannibal [1] to look at the distribution of the data
> on your cluster:
>
> 1: https://github.com/sentric/hannibal
>
>
> On Mon, Jan 6, 2014 at 2:14 PM, Doug Meil <doug.m...@explorysmedical.com> wrote:
>
> > In addition to what everybody else said, look at *where* the regions are
> > for the target table. There may be 5 regions (for example), but look to
> > see if they are all on the same RS.
> >
> >
> > On 1/6/14 5:45 AM, "Nicolas Liochon" <nkey...@gmail.com> wrote:
> >
> > >It's very strange that you don't see a perf improvement when you
> > >increase the number of nodes. Did none of your changes improve
> > >performance in the end?
> > >
> > >You may want to check:
> > > - the number of regions for this table. Are all the region servers
> > >busy? Do you have splits on the table?
> > > - how much data you actually write. Is compression enabled on this
> > >table?
> > > - whether you have compactions. You may want to change the max store
> > >file settings for an infrequent write load (see
> > >http://gbif.blogspot.fr/2012/07/optimizing-writes-in-hbase.html).
> > >
> > >It would be interesting to test the 0.96 release as well.
> > >
> > >
> > >On Sun, Jan 5, 2014 at 2:12 AM, Vladimir Rodionov
> > ><vrodio...@carrieriq.com> wrote:
> > >
> > >>
> > >> I think in this case, writing data to HDFS or HFiles directly (for
> > >> subsequent bulk loading) is the best option. HBase will never compete
> > >> in write speed with HDFS.
> > >>
> > >> Best regards,
> > >> Vladimir Rodionov
> > >> Principal Platform Engineer
> > >> Carrier IQ, www.carrieriq.com
> > >> e-mail: vrodio...@carrieriq.com
> > >>
> > >> ________________________________________
> > >> From: Ted Yu [yuzhih...@gmail.com]
> > >> Sent: Saturday, January 04, 2014 2:33 PM
> > >> To: user@hbase.apache.org
> > >> Subject: Re: Hbase Performance Issue
> > >>
> > >> There are 8 items under:
> > >> http://hbase.apache.org/book.html#perf.writing
> > >>
> > >> I guess you have gone through all of them :-)
> > >>
> > >>
> > >> On Sat, Jan 4, 2014 at 1:34 PM, Akhtar Muhammad Din
> > >> <akhtar.m...@gmail.com> wrote:
> > >>
> > >> > Thanks, guys, for your precious time.
> > >> > Vladimir, as Ted rightly said, I currently want to improve write
> > >> > performance (of course I want to read data as fast as possible
> > >> > later on).
> > >> > Kevin, my current understanding of bulk load is that you generate
> > >> > StoreFiles and later load them through a command-line program. I
> > >> > don't want to do any manual step. Our system is getting data every
> > >> > 15 minutes, so the requirement is to automate it completely through
> > >> > the client API.
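For the region-placement check Doug mentions above, a quick sketch against the same 0.94-era client API (again, the table name is a placeholder):

import java.util.Map;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HTable;

public class RegionDistribution {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "my_table");
    // Print each region of the table and the server currently hosting it;
    // if every line shows the same host, all writes are hitting one RS.
    for (Map.Entry<HRegionInfo, ServerName> e
        : table.getRegionLocations().entrySet()) {
      System.out.println(e.getKey().getRegionNameAsString()
          + " -> " + e.getValue().getHostname());
    }
    table.close();
  }
}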