[ https://issues.apache.org/jira/browse/KYLIN-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800379#comment-16800379 ]
XiaoXiang Yu edited comment on KYLIN-3905 at 3/25/19 3:42 AM:
--------------------------------------------------------------

h3. Return time statistics of global dictionary (without Shrunken Dictionary)

In the following screenshot, we can see that cache swapping takes up most of the time when building the base cuboid for a Bitmap measure.

!image-2019-03-25-11-29-19-383.png!


was (Author: hit_lacus):
h3. Return time statistics of global dictionary (without Shrunken Dictionary)

!image-2019-03-25-11-29-19-383.png!


> Enable shrunken dictionary default
> ----------------------------------
>
>                 Key: KYLIN-3905
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3905
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Measure - Count Distinct
>            Reporter: XiaoXiang Yu
>            Assignee: XiaoXiang Yu
>            Priority: Minor
>             Fix For: v2.6.2
>
>         Attachments: image-2019-03-25-11-26-59-198.png,
> image-2019-03-25-11-27-26-149.png, image-2019-03-25-11-27-46-175.png,
> image-2019-03-25-11-28-14-256.png, image-2019-03-25-11-29-19-383.png
>
>
> In the dev mailing list discussion, I suggested enabling the shrunken
> dictionary by default, and received agreement from some developers.
> When a bitmap measure is used on a high-cardinality column (which requires a
> global dictionary), the build base cuboid step needs frequent cache swaps, so
> it cannot finish within a reasonable period.
> When the shrunken dictionary is enabled, a new step is added that builds a
> separate dictionary for each `InputSplit`. Each Mapper of the
> **BuildBaseCuboid** step then only has to fetch a smaller dictionary for
> itself, instead of the larger global dictionary. This reduces cache swapping
> and makes the **BuildBaseCuboid** step run as quickly as possible.
>
> http://mail-archives.apache.org/mod_mbox/kylin-dev//201903.mbox/%3c62efcb72-b235-4fc3-9add-0fc510d97...@kyligence.io%3e



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
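
The shrunken-dictionary idea described in the issue can be illustrated with a minimal sketch. This is not Kylin's implementation: the function `build_shrunken_dict`, the in-memory `global_dict`, and the sample splits are all hypothetical stand-ins, and a plain Python list plays the role of an `InputSplit`. It only shows why each mapper ends up with a much smaller lookup table that still produces the same IDs as the global dictionary.

```python
from typing import Dict, Iterable, List


def build_shrunken_dict(global_dict: Dict[str, int],
                        split_values: Iterable[str]) -> Dict[str, int]:
    """Keep only the global-dictionary entries that occur in one input split.

    IDs are copied unchanged, so values encoded with the shrunken
    dictionary match what the global dictionary would have produced.
    """
    return {value: global_dict[value] for value in set(split_values)}


# Hypothetical global dictionary for a high-cardinality column.
global_dict: Dict[str, int] = {f"user_{i}": i for i in range(1000)}

# Hypothetical input splits; each one sees only a few distinct values.
splits: List[List[str]] = [
    ["user_3", "user_7", "user_3"],
    ["user_42", "user_999"],
]

# Extra step: one small dictionary per split, built before encoding.
shrunken_dicts = [build_shrunken_dict(global_dict, s) for s in splits]

# Each "mapper" encodes its split using only its own small dictionary.
encoded = [[d[v] for v in s] for d, s in zip(shrunken_dicts, splits)]
```

In this sketch each per-split dictionary holds two entries instead of a thousand, which is the effect the issue relies on to avoid swapping slices of the global dictionary in and out of cache.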