Hi guys,

I read this doc, https://kudu.apache.org/docs/schema_design.html#multilevel-partitioning, and I have a question about this particular statement: "Scans on multilevel partitioned tables can take advantage of partition pruning on any of the levels independently"
Does this mean that the two strategies below would be equivalent in terms of performance (i.e. the minimum number of partitions scanned)?

partition by hash(shop_id), hash(customer_id)
vs.
partition by hash(customer_id), hash(shop_id)

About 60% of the queries filter on both shop_id and customer_id, and the other 40% need to pull all customers for a specific shop_id. Queries almost never filter by customer_id alone (customer_id is not unique across shops; it is assigned per shop). At the same time, if I partition by customer_id first, I expect the partitions to be distributed more evenly.

Thanks!
Boris
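
P.S. To make the question concrete, here is a toy Python sketch of how I picture pruning working, assuming each hash level prunes independently as the quoted statement says. It is only my mental model, not Kudu code: the bucket counts (4 and 8) and the helper name tablets_scanned are made up for illustration, and it only counts tablets rather than modelling Kudu's real hash function.

```python
from math import prod

def tablets_scanned(levels, predicate_columns):
    """Count the tablets a scan touches in this toy model.

    levels: list of (column, num_buckets) pairs in partition order.
    predicate_columns: columns that have an equality predicate in the query.
    A level with an equality predicate on its column is pruned to one bucket;
    otherwise every bucket of that level has to be scanned.
    """
    return prod(1 if col in predicate_columns else buckets
                for col, buckets in levels)

# Hypothetical bucket counts, chosen only for illustration.
schema_a = [("shop_id", 4), ("customer_id", 8)]   # hash(shop_id), hash(customer_id)
schema_b = [("customer_id", 8), ("shop_id", 4)]   # hash(customer_id), hash(shop_id)

for preds in ({"shop_id", "customer_id"}, {"shop_id"}, {"customer_id"}, set()):
    print(sorted(preds),
          tablets_scanned(schema_a, preds),
          tablets_scanned(schema_b, preds))
```

If this model is right, both orderings touch the same number of tablets for every query shape, and the order of the levels would not matter for pruning. Is that what the doc means?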