Thanks for your advice.

For option three, I think major compaction of a large region will affect the
performance of the region server. So the downtime would apply to all the
tables on that RS, am I right?

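As a rough aid for the region-size question in the original message below, this Python sketch estimates how many regions a table would have for a given MAX_FILESIZE. The 16 TB table size comes from the thread. The figure of 300 region servers and the rule that regions sit between half the limit and the full limit after splitting are simplifying assumptions, not numbers reported by HBase.

```python
import math

TABLE_SIZE_GB = 16 * 1024   # 16 TB table, as stated in the original question
REGION_SERVERS = 300        # assumption: the thread only says "several hundreds"


def region_estimate(max_filesize_gb,
                    table_size_gb=TABLE_SIZE_GB,
                    region_servers=REGION_SERVERS):
    """Return (min_regions, max_regions, max_regions_per_rs).

    Assumes a region splits once it reaches max_filesize_gb and that each
    daughter region is about half that size, so the region count lies
    between size/limit and 2*size/limit. Real tables can deviate from this.
    """
    min_regions = math.ceil(table_size_gb / max_filesize_gb)
    max_regions = 2 * min_regions
    return min_regions, max_regions, max_regions / region_servers


if __name__ == "__main__":
    # Compare a few candidate region size limits (in GB).
    for limit_gb in (4, 10, 16):
        low, high, per_rs = region_estimate(limit_gb)
        print(f"{limit_gb} GB limit: {low}-{high} regions, "
              f"up to {per_rs:.1f} per RS")
```

With a 16 GB limit, for example, `region_estimate(16)` gives between 1024 and 2048 regions for this table. Halving the limit doubles both bounds, which is the trade-off between compaction cost and region count described in the question.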



On 12/16/15, 5:12 AM, "Ted Yu" <yuzhih...@gmail.com> wrote:

>w.r.t. option #1, also consider
>http://hbase.apache.org/book.html#arch.bulk.load
>
>FYI
>
>On Tue, Dec 15, 2015 at 12:17 PM, Frank Luo <j...@merkleinc.com> wrote:
>
>> I am in a very similar situation.
>>
>> I guess you can try one of the following options.
>>
>> Option one: Avoid online inserts by preparing the data offline. Do
>> something like http://hbase.apache.org/0.94/book/ops_mgt.html#importtsv
>>
>> Option two: If the first option doesn’t work for you, it is better to
>> reduce your region size and increase the read/write timeout. That way
>> compaction can happen while you insert data, and because the regions are
>> smaller, each compaction/split takes less time. With this option the table
>> stays available 24/7, but overall performance tends to drop dramatically
>> once some regions start compacting.
>>
>> Option three: If you can afford some downtime, e.g. two hours every day,
>> you can manage compaction and splitting during that window. What I usually
>> do is run a major compaction against all tables, then split the ones that
>> are large so that they have enough room to grow for the next day’s inserts.
>>
>> I hope it helps.
>>
>> From: 林豪 [mailto:lin...@qiyi.com]
>> Sent: Monday, December 14, 2015 11:51 PM
>> To: user@hbase.apache.org
>> Subject: Common advices for hosting a huge table
>>
>> Hi, all:
>>
>> We have an HBase cluster with several hundred region servers, and each RS
>> hosts nearly 300 regions. Currently one of our tables has grown to 16 TB
>> and some regions exceed 10 GB. Major compaction on these regions is painful
>> as it produces a lot of disk I/O and affects the performance of the RS. The
>> auto-split size of IncreasingToUpperBoundRegionSplitPolicy has increased to
>> 16 GB or more for this huge table. My solution is to set the MAX_FILESIZE
>> attribute on this table so that ConstantSizeRegionSplitPolicy auto-splitting
>> will work again.
>>
>> My question is: What is the common advice, or what are the recommended
>> configuration options, for hosting such a huge table? If we decide to limit
>> the region size, how do we choose the optimal region size? If the region
>> size is too large, major compaction is painful; but if it is too small, we
>> end up with a lot of small regions, which will overwhelm the RS.
>>
>> 林豪
>> Cloud Platform, R&D Engineer
>>
>> iQIYI
>> QIYI.com, Inc.
>> Address: 6/F, Minrun Building, 1388 Yishan Road, Xuhui District, Shanghai
>> Postal code: 201103
>> Mobile: +86 136 1180 1618
>> Phone: +86 21 5451 9520 8393
>> Fax: +86 21 5451 9529
>> Email: lin...@qiyi.com<mailto:zhouxiq...@qiyi.com>
>> Website: www.iQIYI.com<http://www.iqiyi.com/>
>>
