[ https://issues.apache.org/jira/browse/KUDU-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241949#comment-17241949 ]
Ravi Bhanot edited comment on KUDU-3147 at 12/1/20, 11:44 PM:
--------------------------------------------------------------

[~ghenke_impala_d87e] I wanted to check whether I can take up this ticket if nobody is working on it. We do want to get rid of the hot-spotting that happens because of ranges.

was (Author: ravibhanot):
[~ghenke_impala_d87e] Wanted to check if I can take up this ticket if it is not being worked upon by anybody

> Balance tablets based on range hash buckets
> -------------------------------------------
>
>                 Key: KUDU-3147
>                 URL: https://issues.apache.org/jira/browse/KUDU-3147
>             Project: Kudu
>          Issue Type: Improvement
>          Components: master, perf
>   Affects Versions: 1.12.0
>            Reporter: Grant Henke
>            Priority: Major
>              Labels: balancer, perf, roadmap-candidate
>
> When a user defines a schema that uses range + hash partitioning, it is often
> the case that the tablets in the latest range, based on time or any
> semi-sequential data, are the only tablets that receive writes. Or even if
> not the latest, it is common for a single range to receive a burst of writes
> if backloading.
> This is so common that the default Kudu balancing scheme should consider
> placing/rebalancing the tablets for the hash buckets within each range on as
> many servers as possible in order to support the maximum write throughput. In
> that case, `min(#buckets, #total-cluster-tservers)` tservers will be used to
> handle the writes if the cluster is perfectly balanced. Today, even if
> perfectly balanced, it is possible for all the hash buckets to be on a single
> tserver.
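
For illustration only, the placement problem described in the issue can be sketched in a few lines of Python. This is a toy model, not how the Kudu balancer is implemented: the function names (`place_by_count_only`, `place_range_aware`, `tservers_for_range`) and the tserver names are hypothetical, and replicas, leadership, and existing load are ignored. It shows that a placement can be perfectly balanced by tablet count while one range's hash buckets all sit on a single tserver, whereas spreading each range's buckets round-robin uses `min(#buckets, #total-cluster-tservers)` tservers per range.

```python
from collections import Counter


def place_by_count_only(num_ranges, num_buckets, tservers):
    """Hypothetical placement that only equalizes tablet counts per tserver.

    Every bucket of a given range lands on the same tserver, so the counts
    are even (when num_ranges is a multiple of len(tservers)) but each range
    is served by exactly one tserver.
    """
    n = len(tservers)
    return {
        (r, b): tservers[r % n]
        for r in range(num_ranges)
        for b in range(num_buckets)
    }


def place_range_aware(num_ranges, num_buckets, tservers):
    """Hypothetical placement that spreads each range's buckets round-robin.

    The starting tserver is offset by the range index so that consecutive
    ranges do not all begin on the same server.
    """
    n = len(tservers)
    return {
        (r, b): tservers[(r + b) % n]
        for r in range(num_ranges)
        for b in range(num_buckets)
    }


def tservers_for_range(placement, range_idx):
    """Return the set of tservers hosting any hash bucket of one range."""
    return {ts for (r, _b), ts in placement.items() if r == range_idx}


if __name__ == "__main__":
    servers = ["ts0", "ts1", "ts2", "ts3"]
    hot_range = 3  # e.g. the latest time range, which receives the writes

    by_count = place_by_count_only(4, 4, servers)
    by_range = place_range_aware(4, 4, servers)

    # Both placements put the same number of tablets on every tserver...
    print(Counter(by_count.values()))
    print(Counter(by_range.values()))
    # ...but they differ in how many tservers absorb writes to the hot range.
    print(len(tservers_for_range(by_count, hot_range)))
    print(len(tservers_for_range(by_range, hot_range)))
```

In this toy setup both placements give every tserver four tablets, yet writes to the hot range hit one tserver in the first case and all four in the second.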