I am in a very similar situation.

You could try one of the following options.

Option one: Avoid online inserts by preparing the data offline and bulk 
loading it, for example with ImportTsv:
http://hbase.apache.org/0.94/book/ops_mgt.html#importtsv
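
As a rough sketch of option one, the offline path is usually two steps: 
ImportTsv writes HFiles instead of issuing Puts, and a second tool hands those 
files to the region servers. The Python below only assembles the two command 
lines. The table name, column mapping and HDFS paths are hypothetical 
placeholders, and the class names are the ones used in the 0.94 book linked 
above, so check them against your HBase version.

```python
# Sketch only: builds the command lines for an offline bulk load.
# Table name, column family and HDFS paths are hypothetical placeholders.

def import_tsv_command(table, columns, input_dir, hfile_dir):
    """Step 1: ImportTsv writes HFiles to hfile_dir instead of issuing Puts."""
    return [
        "hbase", "org.apache.hadoop.hbase.mapreduce.ImportTsv",
        "-Dimporttsv.columns=" + ",".join(columns),
        "-Dimporttsv.bulk.output=" + hfile_dir,
        table, input_dir,
    ]


def complete_bulk_load_command(table, hfile_dir):
    """Step 2: move the prepared HFiles into the regions of the table."""
    return [
        "hbase", "org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles",
        hfile_dir, table,
    ]


if __name__ == "__main__":
    cols = ["HBASE_ROW_KEY", "d:value"]  # assumed column mapping
    print(" ".join(import_tsv_command(
        "mytable", cols, "/staging/tsv", "/staging/hfiles")))
    print(" ".join(complete_bulk_load_command("mytable", "/staging/hfiles")))
```

Because the HFiles are written outside the normal write path, the load itself 
does not fill the memstore and trigger flushes the way online inserts do.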

Option two: If the first option doesn't work for you, reduce your region size 
and increase the read/write timeouts. Compactions will still happen while you 
insert data, but because the regions are smaller, each compaction or split 
takes less time. With this option the table stays available 24/7, but overall 
performance tends to drop dramatically once some regions start compacting.
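
To get a feel for how a smaller region size trades off against the number of 
regions, here is a back-of-the-envelope sketch. It uses the 16 TB table size 
from the original question; the count of 300 region servers is my assumption 
for "several hundred", and the candidate sizes are arbitrary. The result is a 
lower bound, since regions sit somewhere between half the split size and the 
split size.

```python
import math

TB = 1024 ** 4
GB = 1024 ** 3


def region_count(table_bytes, max_filesize_bytes):
    """Lower bound on the region count if every region is at the split size."""
    return math.ceil(table_bytes / max_filesize_bytes)


if __name__ == "__main__":
    table_bytes = 16 * TB     # table size from the original question
    region_servers = 300      # assumption: stands in for "several hundred"
    for size_gb in (2, 4, 8, 16):
        n = region_count(table_bytes, size_gb * GB)
        print(f"{size_gb:>2} GB regions -> at least {n} regions, "
              f"about {n / region_servers:.1f} per server")
```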

Option three: If you can afford some downtime, e.g. two hours every day, you 
can manage compactions and splits during that window. What I usually do is run 
a major compaction against all tables, then split the ones that are large so 
that they have enough room to grow for the next day's inserts.
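
The nightly routine in option three could be planned with something like the 
sketch below. It only generates HBase shell commands (major_compact and split) 
as strings. The region names, sizes and the 8 GB threshold are made-up inputs, 
and it assumes the split is done per region; in practice the sizes would come 
from your cluster metrics, ideally measured after the compaction has finished.

```python
def plan_maintenance(tables, region_sizes_gb, split_threshold_gb):
    """Return HBase shell commands for one maintenance window.

    Every table is major-compacted first; afterwards each region whose size
    is above the threshold is split so it has room to grow again.
    """
    commands = [f"major_compact '{table}'" for table in tables]
    for region_name, size_gb in sorted(region_sizes_gb.items()):
        if size_gb > split_threshold_gb:
            commands.append(f"split '{region_name}'")
    return commands


if __name__ == "__main__":
    # Hypothetical region names and sizes, for illustration only.
    sizes = {"region-a": 12.5, "region-b": 3.1, "region-c": 9.0}
    for line in plan_maintenance(["mytable"], sizes, split_threshold_gb=8.0):
        print(line)
```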

I hope this helps.

From: 林豪 [mailto:[email protected]]
Sent: Monday, December 14, 2015 11:51 PM
To: [email protected]
Subject: Common advices for hosting a huge table

Hi, all:

We have an HBase cluster with several hundred region servers, and each RS 
hosts nearly 300 regions. Currently one of our tables has grown to 16 TB and 
some regions exceed 10 GB. Major compaction on these regions is painful because 
it produces a lot of disk I/O and affects the performance of the RS. The 
auto-split size of IncreasingToUpperBoundRegionSplitPolicy has increased to 
16 GB or more for this huge table. My solution is to set the MAX_FILESIZE 
attribute on this table so that ConstantSizeRegionSplitPolicy auto-splitting 
works again.

My question is: what is the common advice, and what are the common 
configuration options, for hosting such a huge table? If we decide to limit the 
region size, how can we determine the optimal region size? If the region size 
is too large, major compaction is painful; but if it is too small, we end up 
with a lot of small regions, which will overwhelm the RS.

林豪
Cloud Platform R&D Engineer

iQIYI
QIYI.com, Inc.
