[ 
https://issues.apache.org/jira/browse/KYLIN-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangyuan Deng updated KYLIN-3729:
---------------------------------
    Attachment: image-2018-12-19-12-01-20-430.png

> CLUSTER BY CAST(field AS STRING) will accelerate base cuboid build with UHC 
> global dict
> ---------------------------------------------------------------------------------------
>
>                 Key: KYLIN-3729
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3729
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>    Affects Versions: v2.5.2
>            Reporter: Fangyuan Deng
>            Assignee: Fangyuan Deng
>            Priority: Minor
>         Attachments: image-2018-12-19-12-01-20-430.png, 
> image-2018-12-19-12-01-36-915.png
>
>
> As we know, the global dict is a sliced appendTrieTree loaded through a 
> cache-loader, so if we convert values to ids using the global dict, ordered 
> values will help.
> We can already set kylin.source.hive.flat-table-cluster-by-dict-column to the 
> UHC column so that the source data is CLUSTER BY that column, which improves this.
> But the appendTrieTree is ordered by string, so we can CLUSTER BY 
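
The point about string ordering can be illustrated with a toy model. This is only a sketch and not Kylin's dictionary code: the slice boundaries, the sample values, and the helper name slice_switches are all hypothetical. It assumes each slice covers a contiguous range of string keys and counts how often consecutive lookups move to a different slice, which stands in for cache-loader churn.

```python
from bisect import bisect_right


def slice_switches(keys, boundaries):
    """Count how often consecutive lookups land in a different slice.

    `boundaries` holds the (hypothetical) string start keys of each slice
    after the first, in ascending string order.
    """
    switches = 0
    previous = None
    for key in keys:
        # Index of the slice whose string key range contains `key`.
        current = bisect_right(boundaries, key)
        if previous is not None and current != previous:
            switches += 1
        previous = current
    return switches


# Hypothetical numeric column values and slice boundaries.
values = [2, 9, 10, 25, 100, 101, 250, 900]
boundaries = ["2", "5"]

# Rows clustered by the numeric value, then encoded as strings.
numeric_order = [str(v) for v in sorted(values)]
# Rows clustered by the string form of the value.
string_order = sorted(str(v) for v in values)

print(slice_switches(numeric_order, boundaries))
print(slice_switches(string_order, boundaries))
```

In this toy data the stream ordered by string form changes slice far less often than the stream ordered numerically, because string order matches the order of the slice key ranges.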



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
