Optimize Read Performance

Xinrui Nie Fri, 13 Aug 2021 18:14:52 -0700

Hi Kylin team,


We are trying to optimize Kylin's read performance. In our case, each query 
will only get data from a sub-range of all dates. So we tried to build multiple 
segments in the cube which is partitioned by dates, and each query should only 
read relevant segments. But according to our tests, whether or not we partition 
the cube, the scan count is always the same for the same query. For 
example, assume we have 2 cubes. One is partitioned into 30 segments, and the 
other one only has one segment. The 2 cubes cover the same range of dates. If 
we run a query that contains a date filter (which means that, for the 30-segment 
cube, the query should only read some of the segments), the scan count is the 
same between the 2 cubes. Also the execution time of the query is almost the 
same. This confuses us. We know partitioning in Kylin helps reduce the rebuild 
time, which improves write performance. But why does partitioning bring 
no improvement to read performance? And is there any way to optimize read 
performance in Kylin? Thank you!
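
The expectation described above can be sketched in a few lines of Python. This 
is only an illustration of the pruning behavior assumed in the question, not 
Kylin's actual query planner: the function names, the one-segment-per-day 
layout, and the dates are all hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

# A segment is modeled as a half-open date range [start, end).
# This is a hypothetical model, not Kylin's internal representation.
Segment = Tuple[date, date]


def build_daily_segments(first_day: date, days: int) -> List[Segment]:
    """Create one hypothetical segment per day, starting at first_day."""
    return [
        (first_day + timedelta(days=i), first_day + timedelta(days=i + 1))
        for i in range(days)
    ]


def prune_segments(
    segments: List[Segment], query_start: date, query_end: date
) -> List[Segment]:
    """Keep only segments overlapping the query range [query_start, query_end)."""
    return [
        (seg_start, seg_end)
        for seg_start, seg_end in segments
        if seg_start < query_end and query_start < seg_end
    ]


# Cube A: 30 daily segments. Cube B: one segment covering the same 30 days.
cube_a = build_daily_segments(date(2021, 7, 1), 30)
cube_b = [(date(2021, 7, 1), date(2021, 7, 31))]

# A query whose date filter covers three days.
relevant_a = prune_segments(cube_a, date(2021, 7, 10), date(2021, 7, 13))
relevant_b = prune_segments(cube_b, date(2021, 7, 10), date(2021, 7, 13))

print(len(relevant_a), "of", len(cube_a), "segments overlap the filter in cube A")
print(len(relevant_b), "of", len(cube_b), "segments overlap the filter in cube B")
```

Under this model the date filter touches 3 of the 30 segments in the 
partitioned cube, which is why a lower scan count was expected there than in 
the single-segment cube.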
