[ https://issues.apache.org/jira/browse/KUDU-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adar Dembo reassigned KUDU-1967:
--------------------------------

    Assignee:     (was: Adar Dembo)

I did a fair amount of work for this in past Kudu releases; the Jira tracks some of the remaining work to be done.

> Umbrella JIRA for node density improvements
> -------------------------------------------
>
>                 Key: KUDU-1967
>                 URL: https://issues.apache.org/jira/browse/KUDU-1967
>             Project: Kudu
>          Issue Type: Task
>          Components: fs, master, tablet, tserver
>    Affects Versions: 1.3.0
>            Reporter: Adar Dembo
>            Priority: Major
>              Labels: data-scalability, roadmap-candidate
>
> For the Kudu 1.4 release, I'll be working to improve node density.
> Here's a brief primer on Kudu's scalability targets today:
> # We recommend no more than 4 TB of total data per node. This is specific to
> Kudu data blocks, so this data is post-encoding and post-compression.
> # We recommend no more than 1000 partitions (post-replication) per node.
> # We recommend no more than 100 nodes per cluster.
> # We recommend no more than 60 partitions per table per tserver.
> For 1.4, here's what we'd like to achieve:
> # Up to 16 TB of total data per node. Maybe even 48 TB, if possible.
> # Up to 100 "hot" partitions per node. In this context, "hot" means
> partitions that are actively servicing writes.
> # Thousands of "cold" partitions per node. Put another way, it should be
> drastically cheaper to serve "cold" partitions than it is today.
> # Maintain the "100 nodes per cluster" limit.
> # Remove the "no more than 60 partitions per table per node" limit.
> I'll be linking various interesting JIRAs into this one, and I'll document,
> for each one, which aspect of data scalability it affects.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)