[ 
https://issues.apache.org/jira/browse/KYLIN-3905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

XiaoXiang Yu updated KYLIN-3905:
--------------------------------
    Description: 
In a discussion on the dev mailing list, I suggested enabling the shrunken 
dictionary by default, and several developers agreed. 

When a bitmap measure is used on a high-cardinality column (which requires a 
global dictionary), the build base cuboid step needs frequent cache swaps, so 
it cannot finish within a reasonable period.
When the shrunken dictionary is enabled, a new step is added that builds a 
separate dictionary for each `InputSplit`. Each Mapper of the 
**BuildBaseCuboid** step then only has to fetch a smaller dictionary for its 
own split instead of the larger global dictionary. This reduces cache swapping 
and lets the **BuildBaseCuboid** step run as quickly as possible.
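
To illustrate the idea, here is a minimal Python sketch. It is not Kylin's actual implementation: the function name `build_shrunken_dictionary`, the plain `dict` standing in for the global dictionary, and the sample values are all assumptions made for illustration. It shows how a per-split dictionary can be cut down to only the values that appear in one split while keeping the same IDs as the global dictionary.

```python
# Illustrative sketch only; names and data structures are hypothetical,
# not taken from the Kylin code base.

def build_shrunken_dictionary(global_dict, split_values):
    """Return the subset of global_dict that covers one split's values.

    The IDs are copied unchanged from the global dictionary, so values
    encoded with the shrunken dictionary stay consistent across splits.
    """
    return {value: global_dict[value] for value in set(split_values)}


# A stand-in for a large global dictionary (value -> integer ID).
global_dict = {f"user_{i}": i for i in range(100_000)}

# The distinct-count column values seen in one hypothetical input split.
split_values = ["user_7", "user_42", "user_7", "user_999"]

# The mapper for this split only needs to load this small dictionary.
shrunken = build_shrunken_dictionary(global_dict, split_values)
encoded = [shrunken[value] for value in split_values]
```

In this sketch the mapper looks up 3 entries instead of 100,000, which is the effect the extra step is meant to achieve: less dictionary data to hold in cache per mapper.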

 

http://mail-archives.apache.org/mod_mbox/kylin-dev//201903.mbox/%3c62efcb72-b235-4fc3-9add-0fc510d97...@kyligence.io%3e

  was:
In a discussion on the dev mailing list, I suggested enabling 

 

http://mail-archives.apache.org/mod_mbox/kylin-dev//201903.mbox/%3c62efcb72-b235-4fc3-9add-0fc510d97...@kyligence.io%3e

 

When a bitmap measure is used on a high-cardinality column (which requires a 
global dictionary), the build base cuboid step needs frequent cache swaps, so 
it cannot finish within a reasonable period.
When the shrunken dictionary is enabled, a new step is added that builds a 
separate dictionary for each `InputSplit`. Each Mapper of the 
**BuildBaseCuboid** step then only has to fetch a smaller dictionary for its 
own split instead of the larger global dictionary. This reduces cache swapping 
and lets the **BuildBaseCuboid** step run as quickly as possible.


> Enable shrunken dictionary default
> ----------------------------------
>
>                 Key: KYLIN-3905
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3905
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Measure - Count Distinct
>            Reporter: XiaoXiang Yu
>            Assignee: XiaoXiang Yu
>            Priority: Minor
>             Fix For: v2.6.2
>
>         Attachments: image-2019-03-25-11-26-59-198.png, 
> image-2019-03-25-11-27-26-149.png, image-2019-03-25-11-27-46-175.png, 
> image-2019-03-25-11-28-14-256.png, image-2019-03-25-11-29-19-383.png
>
>
> In a discussion on the dev mailing list, I suggested enabling the shrunken 
> dictionary by default, and several developers agreed. 
> When a bitmap measure is used on a high-cardinality column (which requires a 
> global dictionary), the build base cuboid step needs frequent cache swaps, so 
> it cannot finish within a reasonable period.
> When the shrunken dictionary is enabled, a new step is added that builds a 
> separate dictionary for each `InputSplit`. Each Mapper of the 
> **BuildBaseCuboid** step then only has to fetch a smaller dictionary for its 
> own split instead of the larger global dictionary. This reduces cache swapping 
> and lets the **BuildBaseCuboid** step run as quickly as possible.
>  
> http://mail-archives.apache.org/mod_mbox/kylin-dev//201903.mbox/%3c62efcb72-b235-4fc3-9add-0fc510d97...@kyligence.io%3e



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
