[jira] [Commented] (KYLIN-3071) Add config to reuse dict to reduce dict size

2018-08-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591258#comment-16591258
 ] 

ASF subversion and git services commented on KYLIN-3071:
--------------------------------------------------------

Commit e8e20529f42e6c9865d13664e31a9f0422e17086 in kylin's branch 
refs/heads/master from yanghao3
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=e8e2052 ]

KYLIN-3071: Add config to reuse dict to reduce dict size


> Add config to reuse dict to reduce dict size
> --------------------------------------------
>
> Key: KYLIN-3071
> URL: https://issues.apache.org/jira/browse/KYLIN-3071
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Reporter: Yang Hao
> Assignee: Yang Hao
> Priority: Major
> Fix For: v2.5.0
>
> Attachments: KYLIN-3071.apache-master.001.patch
>
>
> When DictionaryManager.trySaveNewDict is called and growing dict is not
> enabled, it reuses a historical dict only if that dict is equal to the new
> one, so many dicts may be generated. We should provide a config option that
> reuses an old dict when it contains the new one, instead of requiring equality.
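
To make the equal-versus-contains distinction in the description concrete, here is a minimal sketch. It is not Kylin's DictionaryManager: the class DictStoreSketch, its methods, and the reuseByContains switch are hypothetical names, and dictionaries are modeled as plain sets of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a dictionary store; not Kylin's actual code. */
final class DictStoreSketch {
    private final List<Set<String>> saved = new ArrayList<>();
    private final boolean reuseByContains; // hypothetical config switch

    DictStoreSketch(boolean reuseByContains) {
        this.reuseByContains = reuseByContains;
    }

    /** Returns the index of the stored dictionary the caller ends up using. */
    int trySave(Set<String> newDict) {
        for (int i = 0; i < saved.size(); i++) {
            Set<String> old = saved.get(i);
            // "equal" reuses only an identical dict; "contains" also accepts a superset.
            boolean reusable = reuseByContains ? old.containsAll(newDict) : old.equals(newDict);
            if (reusable) {
                return i;
            }
        }
        // No reusable dict found: store a copy as a new dictionary.
        saved.add(new TreeSet<>(newDict));
        return saved.size() - 1;
    }

    int savedCount() {
        return saved.size();
    }
}
```

In this sketch, saving the value sets {a,b,c}, {a,b} and {b,c} in that order stores three dictionaries under equality, but only one under containment, because the first dictionary already covers the other two.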



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-3071) Add config to reuse dict to reduce dict size

2018-05-25 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490410#comment-16490410
 ] 

liyang commented on KYLIN-3071:
-------------------------------

I see the difference between Growing Dict and Reuse Dict. It is useful.

However, Reuse Dict carries a penalty: it could impact segment pruning at
query time. Currently the dictionary is used to transform filters. For example,
a filter like {{A='non-exist-value'}} is transformed to {{false}}, and the
segment can then be pruned to improve query performance. Extra values in the
dictionary weaken the effectiveness of such pruning.

Given this side effect, I'd suggest that the Reuse Dict feature be off by
default.
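
The pruning concern above can be illustrated with a small sketch. It assumes a segment's dictionary can be modeled as a set of strings and that an equality filter on a value missing from that set lets the segment be skipped. PruningSketch and its methods are hypothetical names, not Kylin APIs.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of dictionary-based segment pruning. */
final class PruningSketch {
    /** True when the filter `column = filterValue` can be rewritten to false for this segment. */
    static boolean canPrune(Set<String> segmentDict, String filterValue) {
        return !segmentDict.contains(filterValue);
    }

    /** Counts the segments that still have to be scanned for an equality filter. */
    static int segmentsToScan(List<Set<String>> segmentDicts, String filterValue) {
        int toScan = 0;
        for (Set<String> dict : segmentDicts) {
            if (!canPrune(dict, filterValue)) {
                toScan++;
            }
        }
        return toScan;
    }
}
```

With exact dictionaries {a,b} and {c}, the filter A='c' scans one segment. If both segments instead share the reused superset {a,b,c}, neither can be pruned and both are scanned, although the first segment holds no matching rows.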






[jira] [Commented] (KYLIN-3071) Add config to reuse dict to reduce dict size

2018-05-07 Thread Shaofeng SHI (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465781#comment-16465781
 ] 

Shaofeng SHI commented on KYLIN-3071:
-------------------------------------

[~liyang.g...@gmail.com] Growing Dict builds a new dictionary based on the
biggest existing dict (when there is a new value that is not covered). The
drawback is that a small segment may use a very big dictionary.

Hao's change is: if an existing dictionary contains all the values,
then use it. It won't make the dictionary bigger.
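
A rough sketch of the two strategies as described in this comment, to show the size difference. Dictionaries are again modeled as sets of strings; GrowVsReuseSketch, grow and reuse are hypothetical names. Picking the first containing dictionary, and building a fresh dictionary when none contains the values, are assumptions of the sketch rather than confirmed behavior of the patch.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical comparison of "growing" and "reuse" dictionary strategies. */
final class GrowVsReuseSketch {
    /** Growing: start from the biggest existing dictionary and add the segment's values. */
    static Set<String> grow(List<Set<String>> existing, Set<String> segmentValues) {
        Set<String> biggest = new TreeSet<>();
        for (Set<String> dict : existing) {
            if (dict.size() > biggest.size()) {
                biggest = dict;
            }
        }
        Set<String> grown = new TreeSet<>(biggest);
        grown.addAll(segmentValues);
        return grown;
    }

    /** Reuse: use an existing dictionary that already contains every value (assumed: first match). */
    static Set<String> reuse(List<Set<String>> existing, Set<String> segmentValues) {
        for (Set<String> dict : existing) {
            if (dict.containsAll(segmentValues)) {
                return dict;
            }
        }
        // Assumed fallback: build a dictionary from the segment's own values only.
        return new TreeSet<>(segmentValues);
    }
}
```

Given existing dictionaries {a,b,c,d,e} and {x,y} and a small segment whose only value is x, grow yields a six-entry dictionary, while reuse returns the two-entry {x,y}.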






[jira] [Commented] (KYLIN-3071) Add config to reuse dict to reduce dict size

2018-05-07 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465584#comment-16465584
 ] 

liyang commented on KYLIN-3071:
-------------------------------

How is this different from {{KylinConfigBase.isGrowingDictEnabled()}} ?






[jira] [Commented] (KYLIN-3071) Add config to reuse dict to reduce dict size

2018-05-06 Thread Shaofeng SHI (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465113#comment-16465113
 ] 

Shaofeng SHI commented on KYLIN-3071:
-------------------------------------

I think we can enable it by default; it is more useful than growing dict.

[~liyang.g...@gmail.com] any concerns about it?



