Hi James,

How did the nodes crash? I ask because it would be good to know where it hurts. As for your 6500 regions per region server, that is an order of magnitude higher than we like to see. With that many regions you are going to run into a few issues:

1.) Small flushes, because the memstore is split between too many regions
2.) Too many compactions, caused by those small flushes
3.) Huge storefile indexes, due to the high storefile count

Typically we like to keep regions to a couple hundred per region server at most for optimal performance. There is no set maximum; the practical limit is the point where performance degrades and the cluster dies.

As for recovering from a dead cluster: if you can truncate your data, I would recommend moving to a 10GB region size (not ideal for .90). That way, once you upgrade to .92 (CDH4), you will be able to take advantage of the large region sizes without merging regions. If you can't truncate your table, still move to 10GB region sizes (hbase.hregion.max.filesize), then merge your regions down until you are at a sane region count.

On Mon, Jan 28, 2013 at 1:41 AM, James Chang <strategist...@gmail.com> wrote:

> Dears,
>
> Does anyone know HBase 0.90.6(CDH3U4) has limitation about the total
> regions per region server? I cannot find any setting in HBase's
> configuration file!? or I missed something? One expert kindly provide a
> mailing thread (http://search-hadoop.com/m/cyFfl1SHnbD) but seems
> no advanced discuss...
>
> And when I try in my (CDH3U4) 6 nodes small cluster, the average
> number of region is 6500 per region server, when one node crash yesterday,
> there were 2 of the alive region servers's region become about 9000, then
> these two nodes become very slow then dead. After all, whole cluster down
> and the cluster cannot restart.
>
> So, my questions are:
> 1. What's the maximum number of regions per region server ?
> 2. In this incident, what's the best practice to recover the dead cluster?
> (except add more nodes, because this need more time)
>
> ps.
>
> <property>
> <name>hbase.hregion.max.filesize</name>
> <value>1073741824</value>
> </property>
>
> Best Regards.
> James Chang
>

--
Kevin O'Dell
Customer Operations Engineer, Cloudera
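
P.S. For reference, the 10GB region size mentioned above would look roughly like this in hbase-site.xml. This is only a sketch: it assumes the value is given in bytes, as in your current setting (1073741824 = 1GB), and that "10GB" means 10 x 1073741824 bytes.

```xml
<!-- Sketch only: goes inside the <configuration> element of hbase-site.xml. -->
<!-- Assumes the value is in bytes: 10 * 1073741824 = 10737418240. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>10737418240</value>
</property>
```

Raising this value should only stop existing regions from splitting further; it does not reduce the region count on its own, which is why the merge step is still needed if you keep the table.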