Yes. Each generated row key should fall between the start and end key of one of the regions, so that HBase can internally take care of routing it to the region server hosting that region. As mentioned in http://hbase.apache.org/book/perf.writing.html, we also need to take care not to turn any one region into a hot region.
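As a rough illustration of the two points above (row keys landing inside a region's key range, and no single region becoming hot), here is a minimal Python sketch. It only models the key arithmetic and does not talk to HBase. The key layout `<yyyyMMddHH>-<bucket>-<id>`, the bucket count of 4, and all function names are assumptions made for this example, not part of any HBase API.

```python
import bisect
import hashlib

BUCKETS_PER_HOUR = 4  # assumed number of regions per hour, for illustration only


def split_keys(hours, buckets=BUCKETS_PER_HOUR):
    """Build sorted split points, one per (hour, bucket) pair.

    The first pair is dropped because the first region starts at the empty key.
    """
    keys = [f"{hour}-{b:02d}" for hour in hours for b in range(buckets)]
    return keys[1:]


def row_key(hour, record_id, buckets=BUCKETS_PER_HOUR):
    """Prefix the record id with its hour and a hash-derived bucket."""
    digest = hashlib.md5(record_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    return f"{hour}-{bucket:02d}-{record_id}"


def region_index(key, splits):
    """Index of the region whose [start, end) range contains the key.

    Plain string comparison matches byte-wise ordering here because the
    keys are ASCII only.
    """
    return bisect.bisect_right(splits, key)


if __name__ == "__main__":
    splits = split_keys(["2012090520", "2012090521"])
    for rid in ("user-1", "user-2", "user-3"):
        key = row_key("2012090520", rid)
        print(key, "-> region", region_index(key, splits))
```

With two hours and 4 buckets per hour this gives 7 split points, i.e. 8 regions. Every key generated for an hour falls inside one of that hour's 4 regions, and the hash-derived bucket spreads the puts across them. In a real client the split points would be supplied as the split keys when the table is created.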
If, say, the data for a span of 30 minutes is collected and then passed on to HBase, the client can be written so that the puts are distributed equally across the regions that hold that 30 minutes of data.

Hope this helps.

Regards
Ram

> -----Original Message-----
> From: jing wang [mailto:happygodwithw...@gmail.com]
> Sent: Wednesday, September 05, 2012 8:00 PM
> To: user@hbase.apache.org
> Subject: Re: reduce influence of auto-splitting region
>
> Hi Ram,
>
> How do we drive the data to the specific hourly region? Use code like
> http://hbase.apache.org/book/perf.writing.html?
>
> Thanks,
> Jing Wang
>
> 2012/9/5 Ramkrishna.S.Vasudevan <ramkrishna.vasude...@huawei.com>
>
> > Hi JingWang
> >
> > A region split does not necessarily cause GC problems. Based on your
> > use case we may need to configure the heap space for the RS.
> > Coming back to region splits, presplitting the tables when they are
> > created is a good option.
> > Assume a case where I know that the data coming into HBase arrives on
> > an hourly basis. Then one option could be to presplit your table based
> > on the hours and assign the regions in round-robin fashion to every RS.
> > This will ensure that any particular hour's data goes only into the
> > region specified for that hour. So after that hour is over, new data
> > will be moving over to another region server.
> > But here again every hour can be split equally across the different RS,
> > like 5 or 10 regions within an hour.
> > These are some ways, but they should be chosen as per the data that
> > your cluster will be operating upon.
> >
> > Regards
> > Ram
> >
> > > -----Original Message-----
> > > From: jing wang [mailto:happygodwithw...@gmail.com]
> > > Sent: Wednesday, September 05, 2012 6:42 PM
> > > To: user@hbase.apache.org
> > > Subject: Re: reduce influence of auto-splitting region
> > >
> > > Hi Ram,
> > >
> > > Thanks for your advice. We did consider what you said.
> > > HBase is used as a realtime storage, just like MySQL/Oracle. When a
> > > region splits, HBase may cause a 'stop the world' GC or a long full GC.
> > > Our application can't accept this.
> > >
> > > Thanks,
> > > Jing Wang
> > >
> > > 2012/9/5 Ramkrishna.S.Vasudevan <ramkrishna.vasude...@huawei.com>
> > >
> > > > You can use the property hbase.hregion.max.filesize. You can set
> > > > this to a higher value and control the splits through your
> > > > application.
> > > >
> > > > Regards
> > > > Ram
> > > >
> > > > > -----Original Message-----
> > > > > From: jing wang [mailto:happygodwithw...@gmail.com]
> > > > > Sent: Wednesday, September 05, 2012 3:48 PM
> > > > > To: user@hbase.apache.org
> > > > > Subject: reduce influence of auto-splitting region
> > > > >
> > > > > Hi there,
> > > > >
> > > > > Using HBase as a realtime storage (7*24h), how can we reduce the
> > > > > influence of region auto-splitting?
> > > > > Any advice will be appreciated!
> > > > >
> > > > > Thanks,
> > > > > Jing