You should consider modeling your rows so that they are much smaller than 1.5 GB; the sweet spot for HBase is more like a few KB per row. Otherwise you end up with only one row per region, which is totally inefficient for obvious reasons once you understand how HBase manages regions.
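To put rough numbers on the row-size point above, here is a minimal back-of-the-envelope sketch. It assumes a 256 MB split threshold for `hbase.hregion.max.filesize`; that figure is an assumption for illustration, not something stated in this thread, so check your own configuration. The helper name `rows_per_region` is hypothetical.

```python
# Rough arithmetic only. The 256 MB split threshold is an ASSUMED value for
# hbase.hregion.max.filesize, not a number taken from this thread.
MAX_REGION_BYTES = 256 * 1024 * 1024


def rows_per_region(row_bytes, max_region_bytes=MAX_REGION_BYTES):
    """Approximate how many whole rows fit in one region before it splits."""
    # A row always lives in exactly one region, so a row larger than the
    # threshold still occupies a region by itself.
    return max(1, max_region_bytes // row_bytes)


small = rows_per_region(4 * 1024)            # a row of a few KB
huge = rows_per_region(1536 * 1024 * 1024)   # a 1.5 GB row
print(small, huge)
```

Under that assumption, a 4 KB row gives tens of thousands of rows per region, while a 1.5 GB row leaves exactly one.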
The length is the size of the file in bytes.

J-D

On Sat, Mar 6, 2010 at 9:25 PM, steven zhuang <steven.zhuang.1...@gmail.com> wrote:
> thanks, J.D.
>
> I think I know why the regionserver takes so much memory now,
> there are some really big row in my table, 1.2-1.5 GB in size. seems that
> the regionserver sometime will try to load the whole region into memory, I
> don't know when this will happen, maybe when it does a major compaction or
> reassign the region to other regionserver or when it's asked to open/online
> a region?.
>
> you question is answered in line.
>
> On Sat, Mar 6, 2010 at 2:15 AM, Jean-Daniel Cryans <jdcry...@apache.org>wrote:
>
>> On Thu, Mar 4, 2010 at 7:19 PM, steven zhuang
>> <steven.zhuang.1...@gmail.com> wrote:
>> > thanks, J.D.
>> >
>> > I am still not sure about the second question, from the log I
>> > can see lines like:
>> > *org.apache.hadoop.hbase.regionserver.Store: loaded
>> > /user/ccenterq/hbase/XXX/1702600912/queries/1289015788537930719,
>> > isReference=false, sequence id=1389720128, length=**175533391**,
>> > majorCompaction=true (this is the region data, not the index, right?)*
>> > I do have some region really big, with millions of columns in
>> > one column family, but isn't this length a little too big.
>>
>> The index and the metadata of the files of that Store in that region
>> was loaded here.
>>
>> >
>> >
>> > About the third one, I am actually not very clear of how memory is
>> > used in Hbase, if it's only the few KBs by holding region info, it won't
>> > release right?
>>
>> I don't understand your question. Try an example?
>
>
> sorry for not be clear, actually I am asking which part of the region has a
> length of "175533391" in the following line, I think the index/meta-data
> info for a region won't take so much memory.
>
>
> *org.apache.hadoop.hbase.regionserver.Store: loaded
>> /user/ccenterq/hbase/XXX/1702600912/queries/1289015788537930719,
>> isReference=false, sequence id=1389720128, length=**175533391**,
>> majorCompaction=true (this is the region data, not the index, right?)*