Hi all,

We have an HBase cluster with several hundred region servers, each hosting nearly 300 regions. One of our tables has grown to 16 TB, and some of its regions exceed 10 GB. Major compaction on these regions is painful: it generates a lot of disk I/O and degrades the performance of the region server. For this huge table, the automatic split size used by IncreasingToUpperBoundRegionSplitPolicy has grown to 16 GB or more. My solution is to set the MAX_FILESIZE attribute on this table so that the ConstantSizeRegionSplitPolicy automatic splitting will work again.
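
To put rough numbers on the MAX_FILESIZE idea, here is a minimal back-of-the-envelope sketch in Python. It assumes that a region splits into two halves once it reaches MAX_FILESIZE, so that steady-state region sizes fall somewhere between MAX_FILESIZE/2 and MAX_FILESIZE; it ignores compression, multiple column families, and uneven key distribution. The function name, the candidate sizes, and the figure of 300 region servers are illustrative assumptions, not values taken from our cluster.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def region_count_bounds(table_bytes, max_filesize_bytes):
    """Return (fewest, most) regions for a table of the given size.

    Assumption: regions split in half on reaching max_filesize_bytes,
    so each region holds between max_filesize/2 and max_filesize.
    """
    fewest = ceil_div(table_bytes, max_filesize_bytes)
    most = ceil_div(2 * table_bytes, max_filesize_bytes)
    return fewest, most


if __name__ == "__main__":
    table_bytes = 16 * TIB   # table size from the description above
    num_servers = 300        # hypothetical; the real cluster has "several hundred"
    for size_gib in (2, 5, 10, 16):
        fewest, most = region_count_bounds(table_bytes, size_gib * GIB)
        print(
            f"MAX_FILESIZE={size_gib:>2} GiB: "
            f"{fewest}-{most} regions, "
            f"about {fewest / num_servers:.1f}-{most / num_servers:.1f} per server"
        )
```

The point of the sketch is only to show how quickly the region count grows as MAX_FILESIZE shrinks; it says nothing about compaction cost, which would have to be measured on the cluster itself.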

My questions are: What is the common advice, and which configuration options are recommended, for hosting such a huge table? If we decide to limit the region size, how do we determine the optimal size? If regions are too large, major compaction is painful; if they are too small, we end up with a large number of small regions, which will overwhelm the region servers.

Lin Hao
R&D Engineer, Cloud Platform
iQIYI (QIYI.com, Inc.)
Address: 6F, Minrun Building, 1388 Yishan Road, Xuhui District, Shanghai
Postal code: 201103
Mobile: +86 136 1180 1618
Tel: +86 21 5451 9520 8393
Fax: +86 21 5451 9529
Email: lin...@qiyi.com
Website: www.iQIYI.com