HBase loads the index of the files on start-up. If you run out of memory for those indexes (which are a fraction of the data size), you'd crash with an OOME (OutOfMemoryError).
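To put rough numbers on that, here is a back-of-the-envelope sketch in Java. The 1% index fraction and the 100 GB data size are made-up values for illustration, not HBase constants, and the class and method names are hypothetical. It only compares an estimated index footprint against the max heap the JVM was started with.

```java
// Hypothetical helper, not part of HBase.
class HeapCheck {
    // Assumed share of on-disk data taken by file indexes (illustrative only).
    static final double ASSUMED_INDEX_FRACTION = 0.01;

    // Estimate index size as a fixed fraction of the total data size.
    static long estimateIndexBytes(long dataBytes, double fraction) {
        return (long) (dataBytes * fraction);
    }

    // True if the estimated index is smaller than the available heap.
    static boolean fitsInHeap(long indexBytes, long maxHeapBytes) {
        return indexBytes < maxHeapBytes;
    }

    public static void main(String[] args) {
        long dataBytes = 100L * 1024 * 1024 * 1024;      // hypothetical 100 GB of data
        long maxHeap = Runtime.getRuntime().maxMemory(); // reflects the -Xmx setting
        long indexBytes = estimateIndexBytes(dataBytes, ASSUMED_INDEX_FRACTION);
        System.out.printf("estimated index: %d MB, max heap: %d MB, fits: %b%n",
                indexBytes / (1024 * 1024), maxHeap / (1024 * 1024),
                fitsInHeap(indexBytes, maxHeap));
    }
}
```

The heap also has to hold everything else the regionserver does, so an index that barely fits is still too big.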
The index is supposed to be a smallish fraction of the total data size.
I wouldn't run with less than -Xmx2000m.

On Mon, Apr 13, 2009 at 10:48 PM, Puri, Aseem <aseem.p...@honeywell.com> wrote:

>
> -----Original Message-----
> From: Erik Holstad [mailto:erikhols...@gmail.com]
> Sent: Monday, April 13, 2009 9:47 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Some HBase FAQ
>
> On Mon, Apr 13, 2009 at 7:12 AM, Puri, Aseem
> <aseem.p...@honeywell.com> wrote:
>
> > Hi
> >
> > I am a new HBase user. I have some doubts regarding the
> > functionality of HBase. I am working on HBase and things are going
> > fine, but I am not clear how things are happening. Please help me by
> > answering these questions.
> >
> > 1. I am inserting data in an HBase table and all regions get
> > balanced across various Regionservers. But what will happen when
> > data increases and there is not enough space in the Regionservers to
> > accommodate all regions? Will some regions stay on a Regionserver
> > while others are in HDFS but not on a Regionserver, or will the
> > HBase Regionservers stop taking new data?
> >
> Not really sure what you mean here, but if you are asking what to do
> when you are running out of disk space on the regionservers, the
> answer is add another machine or two.
>
> --- I want to ask this: HBase RegionServers store region data on HDFS.
> So when the HBase master starts, it loads all region data from HDFS to
> the regionservers. So what will the scenario be if there is not enough
> space in the regionservers to accommodate new data? Are some regions
> swapped out from a regionserver to create space for new regions, and
> swapped back in from HDFS when needed? Or will something else happen?
>
> > 2. When I insert data in an HBase table, 3 to 4 mapfiles are
> > generated for one category, but after some time all the mapfiles
> > combine into one file. Is this what we call minor compaction?
> >
> When all current mapfiles and memcache are combined into one file,
> this is called major compaction, see the BigTable paper for more
> details.
>
> > 3. My application will update an HBase table frequently. Should I
> > use some other database like MySQL as an intermediate to store data
> > temporarily and then do a bulk update on HBase, or should I do the
> > updates directly on HBase? Please tell me which technique will be
> > more optimized in HBase.
> >
> HBase is fast for reads which has so far been the main focus of the
> development, with 0.20 we can hopefully add even fast random reading
> to it to make it a more well rounded system. Is HBase too slow for you
> today when writing to it and what are your requirements?
>
> ---- Basically I asked this question about write operations. There is
> no complex requirement. I want your suggestion on which technique I
> should follow for writes:
>
> a. If there is an update, I store the data temporarily in MySQL and
> then do a bulk update on HBase.
>
> b. If there is an update, I update HBase directly instead of writing
> it to MySQL and doing a bulk update on HBase after some time.
>
> What do you say, which approach is more optimized?
>
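On question 2 above, the idea of combining all mapfiles plus the memcache into one file can be pictured with a toy model. This is only a sketch under assumptions: each mapfile is modelled as a sorted in-memory map, the inputs are ordered oldest to newest, and the newest value for a row wins. The class and method names are hypothetical and this is not how HBase implements compaction internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of "merge everything into one sorted file"; not HBase code.
class CompactionSketch {
    // Inputs are ordered oldest to newest, so later entries overwrite
    // earlier ones for the same row key.
    static TreeMap<String, String> compactAll(List<Map<String, String>> oldestToNewest) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (Map<String, String> store : oldestToNewest) {
            merged.putAll(store);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> mapfile1 = new TreeMap<>();
        mapfile1.put("row1", "a");
        mapfile1.put("row2", "b");

        Map<String, String> mapfile2 = new TreeMap<>();
        mapfile2.put("row2", "b2");
        mapfile2.put("row3", "c");

        Map<String, String> memcache = new TreeMap<>();
        memcache.put("row1", "a2"); // most recent write, still in memory

        List<Map<String, String>> inputs = new ArrayList<>();
        inputs.add(mapfile1);
        inputs.add(mapfile2);
        inputs.add(memcache);

        // prints {row1=a2, row2=b2, row3=c}
        System.out.println(compactAll(inputs));
    }
}
```

After the merge there is a single sorted result, so a read only has to consult one place instead of several files plus the memcache.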