HBase region splits can be done through a variety of strategies, and data size can be a component in those strategies. There's no hard and fast rule for how large a region can be; there are tradeoffs with both larger and smaller region sizes. A region split strategy will depend on a number of factors: memstore use, scan parallelism, compaction strategies, data size, and hardware.

On Aug 19, 2015 3:06 PM, "Christopher" <ctubb...@apache.org> wrote:
> Forgive my ignorance about HBase, but wouldn't size of records count,
> also? Your response seems to imply that number of records is what
> matters for how many regions are needed. For what it's worth,
> Accumulo's tablets are split based on storage size, not number of
> records. I assumed the same was true for HBase. Am I wrong?
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, Aug 19, 2015 at 2:28 PM, Ted Malaska <ted.mala...@cloudera.com> wrote:
> > I've been doing HBase for a long time and never had an issue with region
> > count limits, and I have clusters with tens of billions of records. Maybe
> > there would be issues around a couple trillion records, but I never got that
> > high yet.
> >
> > Ted Malaska
> >
> > On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser <josh.el...@gmail.com> wrote:
> >
> >> Oh, one other thing that I should mention (was prompted off-list).
> >>
> >> (definition time since cross-list now: HBase regions == Accumulo tablets)
> >>
> >> Accumulo will handle many more regions than HBase does now due to a
> >> splittable metadata table. While I was told this was a very long and
> >> arduous journey to implement correctly (WRT splitting, merges and bulk
> >> loading), users with "too many regions" problems are extremely few and far
> >> between for Accumulo.
> >>
> >> I was very happy to see effort/design being put into this in HBase. And,
> >> just to be fair in criticism/praises, HBase does appear to me to do
> >> assignments of regions much faster than Accumulo does on a small cluster
> >> (~5-10 nodes). Accumulo may take a few seconds to notice and reassign
> >> tablets. I have yet to notice this with HBase (which also could be due to
> >> lack of personal testing).
> >>
> >>
> >> Jerry He wrote:
> >>
> >>> Hi, folks
> >>>
> >>> We have people that are evaluating HBase vs Accumulo.
> >>> Security is an important factor.
> >>>
> >>> But I think after the Cell security was added in HBase, there is no more
> >>> real gap compared to Accumulo.
> >>>
> >>> I know we have both HBase and Accumulo experts on this list.
> >>> Could someone shed more light?
> >>> I am looking for real gaps comparing HBase to Accumulo, if there are any, so
> >>> that I can be prepared to address them. This is not limited to the
> >>> security area.
> >>>
> >>> There are differences in some features and implementations. But they don't
> >>> seem like real 'gaps'.
> >>>
> >>> Any comments and feedback are welcome.
> >>>
> >>> Thanks,
> >>>
> >>> Jerry
> >>>
> >>>