Ryan,

Yes, you are right. But my point is that even with 1000 regions (250MB each) per regionserver, each regionserver can only support 250GB of storage.
Please also check this thread "Help needed - Adding HBase to architecture", Stack and Andrew have put some talk there.

Schubert

On Fri, Jul 10, 2009 at 12:55 PM, Ryan Rawson <[email protected]> wrote:

> That size is not memory-resident, so the total data size is not an
> issue. The index size is what limits you with RAM, and its about 1 MB
> per region (256MB region).
>
> -ryan
>
> On Thu, Jul 9, 2009 at 9:51 PM, zsongbo<[email protected]> wrote:
> > Hi Ryan,
> >
> > Thanks.
> >
> > If your regionsize is about 250MB, than 400 regions can store 100GB data on
> > each regionserver.
> > Now, if you have 100TB data, then you need 1000 regionservers.
> > We are not google or yahoo who have so many nodes.
> >
> > Schubert
> >
> > On Fri, Jul 10, 2009 at 12:29 PM, Ryan Rawson <[email protected]> wrote:
> >
> >> re: #2: in fact we don't know that... I know that I ran run 200-400
> >> regions on a regionserver with a heap size of 4-5gb. More even. I
> >> bet I could have 1000 regions open on 4gb ram. Each region is ~ 1mb
> >> of all the time data, so there we go.
> >>
> >> As for compactions, they are fairly fast, 0-30s or so depending on a
> >> number of factors. Practically speaking it has not been a problem for
> >> me, and I've put 1200 gb into hbase so far.
> >>
> >> On Thu, Jul 9, 2009 at 8:58 PM, zsongbo<[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > 1. In this configuration property:
> >> >
> >> > <property>
> >> >   <name>hbase.hstore.compactionThreshold</name>
> >> >   <value>3</value>
> >> >   <description>
> >> >   If more than this number of HStoreFiles in any one HStore
> >> >   (one HStoreFile is written per flush of memcache) then a compaction
> >> >   is run to rewrite all HStoreFiles files as one. Larger numbers
> >> >   put off compaction but when it runs, it takes longer to complete.
> >> >   During a compaction, updates cannot be flushed to disk. Long
> >> >   compactions require memory sufficient to carry the logging of
> >> >   all updates across the duration of the compaction.
> >> >   If too large, clients timeout during compaction.
> >> >   </description>
> >> > </property>
> >> >
> >> > That says "During a compaction, updates cannot be flushed to disk."
> >> > Does it mean that, when compaction, the memcache cannot be flushed to disk?
> >> > I think it is not good.
> >> >
> >> > 2. We know that HBase cannot serve too many regions on each regionserver. If
> >> > only 200 regions(256MB), only 50GB storage can be used.
> >> > I my tested whith have 1.5GB heap and 256MB regionsize, each regionserver
> >> > can support 150 regions, and then OutOfMem.
> >> > Can anybody explain more detail here of the reason?
> >> >
> >> > To use more storage, can I set larger regionsize? such as 1GB, 10GB?
> >> > I have worry about the compaction time would be long with so large regions.
> >> >
> >> > Schubert
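
The capacity figures traded in this thread can be checked with a small back-of-envelope sketch. It is only a rough model: the function names are made up for illustration, decimal units (1 GB = 1000 MB) are assumed to match the rounding used above, and the 1 MB of index per region is Ryan's approximate figure, not a measured constant.

```python
import math

# Decimal units, chosen to match the rounded numbers quoted in the thread.
MB_PER_GB = 1000
GB_PER_TB = 1000


def storage_per_server_gb(regions, region_size_mb):
    """Data one regionserver can hold if every region is full."""
    return regions * region_size_mb / MB_PER_GB


def servers_needed(total_tb, regions, region_size_mb):
    """Regionservers required to hold total_tb of data."""
    per_server_gb = storage_per_server_gb(regions, region_size_mb)
    return math.ceil(total_tb * GB_PER_TB / per_server_gb)


def index_heap_mb(regions, index_mb_per_region=1.0):
    """Approximate heap taken by region indexes (assumed ~1 MB per region)."""
    return regions * index_mb_per_region


if __name__ == "__main__":
    for regions in (400, 1000):
        print(
            regions, "regions:",
            storage_per_server_gb(regions, 250), "GB per server,",
            servers_needed(100, regions, 250), "servers for 100 TB,",
            index_heap_mb(regions), "MB of index heap",
        )
```

Under these assumptions, going from 400 to 1000 regions per regionserver cuts the server count for 100TB from 1000 to 400, while the index heap grows to roughly 1GB. A larger region size changes the storage term the same way without adding regions.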
