[ https://issues.apache.org/jira/browse/KYLIN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722811#comment-16722811 ]

yuzhang qiu commented on KYLIN-3453:
------------------------------------

But why does kylin use the estimated result to split region(cuboid shard), rather than use the real size after step "convert cuboid data to HFile"?

> Improve cube size estimation for TOPN, COUNT DISTINCT
> -----------------------------------------------------
>
>                 Key: KYLIN-3453
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3453
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: Chao Long
>            Assignee: Chao Long
>            Priority: Major
>             Fix For: v2.5.0
>
>         Attachments: image-2018-07-24-16-29-07-359.png,
> image-2018-07-24-16-30-50-804.png, image-2018-07-24-16-33-43-231.png,
> image-2018-07-24-16-37-09-199.png, image-2018-07-24-17-11-26-283.png,
> image-2018-07-24-17-11-27-829.png, image-2018-07-24-17-12-25-880.png
>
>
> Currently, Kylin has poor cube size estimation for TOPN, COUNT DISTINCT. We
> should improve it, then we can get a reasonable split num when cube building.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)