Re: Why per tablet server's upper limit is 4TB.

2017-08-31 Thread yuyunliuhen
It is very helpful. Thanks!


Original Message
From: Jean-Daniel Cryans jdcry...@apache.org
To: user u...@kudu.apache.org
Sent: Wednesday, August 30, 2017, 23:50
Subject: Re: Why per tablet server's upper limit is 4TB.


Hi all,


We bumped the limit as of 1.4.0, but the new docs haven't been re-published yet; you 
can still see them here: 
https://github.com/apache/kudu/commit/0e41d4cce5ca96cb08612475f76539f6c8bc58f1


Hope this helps,


J-D


On Wed, Aug 30, 2017 at 3:56 AM, Denis Bolshakov bolshakov.de...@gmail.com 
wrote:

Mike Percy's answer to @kinglee (from the Kudu Slack channel):
There are multiple issues that interact, but one is that if you have many 
tablets you will use many threads. Adar has been focusing on improving density 
lately and trying to quantify the scaling limits.




On 30 August 2017 at 13:22, yuyunliuhen yuyunliu...@gmail.com wrote:

"Recommended maximum amount of stored data, post-replication and 
post-compression, per tablet server is 4TB."
What will happen if the data is more than 4 TB? Disks are larger than before; a 6 TB 
disk is common now. Is there any test data or documentation?







-- 

//with Best Regards
--Denis Bolshakov
e-mail: bolshakov.de...@gmail.com

Re: Question about per server data upper limit.

2017-08-31 Thread Adar Lieber-Dembo
The upper limit of 4 TB is for data on-disk (post-encoding,
post-compression, and post-replication); it does not include in-memory
data from memrowsets or deltamemstores.
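
For illustration only, here is a rough Python sketch of how logical data maps to the 
per-tserver on-disk figure the limit refers to. The replication factor, compression 
ratio, and server count below are assumptions for the example, not numbers from this 
thread:

# Back-of-the-envelope estimate of on-disk bytes per tablet server.
# All inputs are illustrative assumptions, not Kudu defaults.
def on_disk_per_tserver_tb(logical_tb, replication_factor, compression_ratio, num_tservers):
    """Approximate post-replication, post-compression data per tablet server.

    logical_tb         -- raw (pre-replication) data in the cluster, in TB
    replication_factor -- replicas per tablet (commonly 3)
    compression_ratio  -- on-disk size / logical size after encoding + compression
    num_tservers       -- tablet servers the data is spread across
    """
    return logical_tb * replication_factor * compression_ratio / num_tservers

# Example: 20 TB of logical data, 3x replication, data compressing to 40%
# of its logical size, spread over 6 tablet servers -> ~4.0 TB per tserver,
# i.e. right at the recommended limit.
print(f"~{on_disk_per_tserver_tb(20, 3, 0.4, 6):.1f} TB on disk per tserver")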

The value of the limit is based on the kinds of workloads tested by
the Kudu development community. As a group we feel comfortable
supporting users up to 4 TB because we've run such workloads
ourselves. Beyond 4 TB, however, we're not exactly sure what becomes
slow, what breaks, etc.

Speaking from experience, as the amount of on-disk data grows,
tservers will take longer to start up. You might become vulnerable to
KUDU-2050; we're not sure. In order to reach that amount of data
you'll probably also raise the number of tablets hosted by the
tserver. This can increase the tserver's thread count and file descriptor
count, and may cause slowdowns in other areas.
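
As a rough sketch of why density matters, the snippet below shows how quickly the 
tablet-replica count grows with stored data. The 20 GB per-tablet target size is a 
placeholder assumption for the example; pick your own based on your schema and 
partitioning:

# Rough sketch: how tablet count grows with on-disk data per tserver.
def tablets_per_tserver(on_disk_tb, target_tablet_gb=20):
    """Approximate number of tablet replicas a tserver would host."""
    return int(on_disk_tb * 1024 / target_tablet_gb)

for tb in (1, 4, 8, 16):
    print(f"{tb:>2} TB on disk -> ~{tablets_per_tserver(tb)} tablet replicas "
          f"(each adds threads, file descriptors, and startup work)")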

In short, nothing will "happen" the moment you cross 4 TB; it's just
that you'll be entering relatively uncharted waters and might
encounter unusual or unexpected behavior. If that doesn't deter you,
by all means give it a shot (and report back with your findings)!

On Wed, Aug 30, 2017 at 5:53 PM, 李津  wrote:
> Why does each tserver have an upper limit of 4 TB, and does it include the memrowset 
> data? We also have not tested more than 4 TB. What will happen if we reach the upper 
> limit?