[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2022-05-09 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533639#comment-17533639
 ] 

Zhong Yanghong commented on KYLIN-3430:
---

The following code:
{code}
Set activeResources = Sets.newHashSet();
for (CubeInstance cube : cubeManager.reloadAndListAllCubes()) {
    activeResources.addAll(cube.getSnapshots().values());
    for (CubeSegment segment : cube.getSegments()) {
        activeResources.addAll(segment.getSnapshotPaths());
        activeResources.addAll(segment.getDictionaryPaths());
        activeResources.add(segment.getStatisticsResourcePath());
        for (String dictPath : segment.getDictionaryPaths()) {
            DictionaryInfo dictInfo = store.getResource(dictPath,
                    DictionaryInfoSerializer.FULL_SERIALIZER);
            if ("org.apache.kylin.dict.AppendTrieDictionary"
                    .equals(dictInfo != null ? dictInfo.getDictionaryClass() : null)) {
{code}
will make the cleanup very slow, since we have to load every dictionary just 
to check its class.

> Global Dictionary Cleanup
> -
>
> Key: KYLIN-3430
> URL: https://issues.apache.org/jira/browse/KYLIN-3430
> Project: Kylin
> Issue Type: Improvement
> Components: Tools, Build and Test
> Affects Versions: v2.1.0, v2.2.0, v2.3.0, v2.3.1, v2.4.0
> Reporter: Temple Zhou
> Assignee: Temple Zhou
> Priority: Major
> Fix For: v2.6.0
>
> Attachments: KYLIN-3430.master.001.patch
>
>
> I ran "{{./bin/metastore.sh clean --delete true}}" to clean up my Kylin 
> metadata, but afterwards the Global Dictionary still exists in my HDFS and 
> the size of the directory "/kylin_metadata/resources/GlobalDict/dict" hasn't 
> shrunk.
>  
> BTW: I'm very sure that there are redundant Global Dictionaries.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729424#comment-16729424
 ] 

ASF GitHub Bot commented on KYLIN-3430:
---

shaofengshi commented on pull request #422: KYLIN-3430 Global Dictionary Cleanup
URL: https://github.com/apache/kylin/pull/422
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-27 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729425#comment-16729425
 ] 

ASF subversion and git services commented on KYLIN-3430:


Commit a20f04fb17ccb58162471868e5adf3594fb41c77 in kylin's branch 
refs/heads/master from Temple Zhou
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=a20f04f ]

KYLIN-3430 Global Dictionary Cleanup




[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729386#comment-16729386
 ] 

ASF GitHub Bot commented on KYLIN-3430:
---

TempleZhou commented on pull request #422: KYLIN-3430 Global Dictionary Cleanup
URL: https://github.com/apache/kylin/pull/422
 
 
   
 



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729381#comment-16729381
 ] 

ASF GitHub Bot commented on KYLIN-3430:
---

TempleZhou commented on pull request #421: KYLIN-3430 Global Dictionary Cleanup
URL: https://github.com/apache/kylin/pull/421
 
 
   
 



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729382#comment-16729382
 ] 

ASF GitHub Bot commented on KYLIN-3430:
---

TempleZhou commented on pull request #421: KYLIN-3430 Global Dictionary Cleanup
URL: https://github.com/apache/kylin/pull/421
 
 
   
 



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-26 Thread kangkaisen (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729375#comment-16729375
 ] 

kangkaisen commented on KYLIN-3430:
---

[~temple.zhou] LGTM, +1. Thank you.



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-26 Thread Temple Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729341#comment-16729341
 ] 

Temple Zhou commented on KYLIN-3430:


[~kangkaisen] Thanks for your reminder.

I have updated the attachment, please review it again. :D



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-26 Thread kangkaisen (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729282#comment-16729282
 ] 

kangkaisen commented on KYLIN-3430:
---

[~temple.zhou], thanks for your patch.

The global dict paths in HDFS come in two formats:

For `GlobalDictionaryBuilder`, the HDFS path is `String baseDir = hdfsDir + 
"resources/GlobalDict" + dictInfo.getResourceDir() + "/"`.

For `SegmentAppendTrieDictBuilder`, the HDFS path is `String baseDir = hdfsDir 
+ "resources/SegmentDict" + dictInfo.getResourceDir() + "/" + 
RandomUtil.randomUUID().toString() + "_" + System.currentTimeMillis() + "/";`

Please handle the SegmentDict as well, thank you.
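
As a rough illustration of the two layouts quoted in this comment, the sketch below builds both base directories from plain strings. It is not Kylin code: the class and method names are hypothetical, `java.util.UUID` stands in for Kylin's `RandomUtil`, and the sample `hdfsDir` and resource dir values are inferred from the HDFS listings elsewhere in this thread.

```java
import java.util.UUID;

/** Hypothetical helper mirroring the two path layouts quoted above; not Kylin code. */
final class DictPathSketch {

    /** Layout quoted for GlobalDictionaryBuilder: one shared directory per column. */
    static String globalDictBaseDir(String hdfsDir, String resourceDir) {
        return hdfsDir + "resources/GlobalDict" + resourceDir + "/";
    }

    /** Layout quoted for SegmentAppendTrieDictBuilder: a unique directory per build. */
    static String segmentDictBaseDir(String hdfsDir, String resourceDir) {
        // UUID.randomUUID() is an assumption standing in for RandomUtil.randomUUID().
        return hdfsDir + "resources/SegmentDict" + resourceDir + "/"
                + UUID.randomUUID() + "_" + System.currentTimeMillis() + "/";
    }

    public static void main(String[] args) {
        String hdfsDir = "/kylin/kylin_metadata/";                     // sample value
        String resourceDir = "/dict/DEFAULT.TEST_GLOBAL_DICT/USER_ID"; // sample value
        System.out.println(globalDictBaseDir(hdfsDir, resourceDir));
        System.out.println(segmentDictBaseDir(hdfsDir, resourceDir));
    }
}
```

Because the segment layout ends in a fresh random directory on every build, a cleanup that only scans `resources/GlobalDict` would presumably never see those directories, which is why both roots need handling.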

> Attachments: KYLIN-3430.master.002.patch


[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-26 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729256#comment-16729256
 ] 

Shaofeng SHI commented on KYLIN-3430:
-

Thanks Temple; [~kangkaisen] Kaisen, can you review this patch? Thank you!



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-12-25 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728645#comment-16728645
 ] 

Shaofeng SHI commented on KYLIN-3430:
-

[~temple.zhou], the attachment is missing; could you please attach it again?



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-06 Thread kangkaisen (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534657#comment-16534657
 ] 

kangkaisen commented on KYLIN-3430:
---

Hi [~temple.zhou],

There is also a config, "kylin.dictionary.append-version-ttl", whose default 
value is three days, which means we will not delete a version dir within 
three days.

The Global Dictionary is not cube level; multiple cubes can use the same 
Global Dictionary. So we shouldn't delete a Global Dictionary when we drop one 
cube.

But you remind me that if the cubes related to a Global Dictionary are all 
dropped, we currently never delete that Global Dictionary from HDFS. This is 
indeed an issue, thank you very much! I was wrong.

As for how to check whether a Global Dictionary is still referenced, I think 
we could get the active Global Dictionaries from the cube metadata; if no cube 
uses a Global Dictionary, we can consider it useless and delete it.

Thank you!
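
The reference check proposed in this comment can be sketched as a simple set difference. This is only an illustration under stated assumptions: the class, the method, and the plain `Map`/`List` inputs are hypothetical stand-ins for whatever the cube metadata and the HDFS listing actually return, and the sample paths are made up.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a reference check; the types here are not Kylin classes. */
final class UnreferencedDictSketch {

    /**
     * Returns the dictionary directories found on HDFS that no cube references.
     *
     * @param dictsByCube cube name -> Global Dictionary directories its metadata points at
     * @param dirsOnHdfs  Global Dictionary directories that currently exist on HDFS
     */
    static Set<String> findUnreferenced(Map<String, List<String>> dictsByCube,
                                        List<String> dirsOnHdfs) {
        Set<String> active = new LinkedHashSet<>();
        for (List<String> dirs : dictsByCube.values()) {
            active.addAll(dirs); // a dictionary stays active while any cube uses it
        }
        Set<String> unreferenced = new LinkedHashSet<>(dirsOnHdfs);
        unreferenced.removeAll(active);
        return unreferenced;
    }

    public static void main(String[] args) {
        // Two cubes share one dictionary; a second dictionary has no owner left.
        Map<String, List<String>> dictsByCube = Map.of(
                "cube_a", List.of("/dict/DB.T1/USER_ID"),
                "cube_b", List.of("/dict/DB.T1/USER_ID"));
        List<String> onHdfs = List.of("/dict/DB.T1/USER_ID", "/dict/DB.T2/ORDER_ID");
        System.out.println(findUnreferenced(dictsByCube, onHdfs));
    }
}
```

Collecting the active set across all cubes first reflects the point made above: dropping one cube must not remove a dictionary that another cube still uses.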

 

 



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-05 Thread Temple Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533465#comment-16533465
 ] 

Temple Zhou commented on KYLIN-3430:


Hi Kaisen,

Hmm... I checked the version dirs in my HDFS and found that more than 3 
version dirs exist, even though the default value of 
"kylin.dictionary.append-max-versions" is 3.
{noformat}
kylin@kylin5:~$ hdfs dfs -du /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID
45634188  136902564  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509701213681
53439651  160318953  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509701420683
53439651  160318953  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509702256099
53527132  160581396  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509754562354
53588834  160766502  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509840996984
53655766  160967298  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509927413190
53655766  160967298  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509935653127
53655766  160967298  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509948477575
53655766  160967298  /tmp/kylin-data/kylin_metadata/resources/GlobalDict/dict/ADL.APP_USER_FRESHNESS_DI/DXYID/version_1509953184643
{noformat}
In addition, even when the cube is disabled or removed, the GD version dirs 
still exist in my HDFS... :(



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-05 Thread kangkaisen (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533424#comment-16533424
 ] 

kangkaisen commented on KYLIN-3430:
---

Hi [~temple.zhou] [~Shaofengshi], the Global Dictionary cleanup is not an issue.

The Global Dictionary deletes expired version dirs on commit. If you think 
keeping three versions is redundant, you only need to set 
"kylin.dictionary.append-max-versions" to 1.

 

As for the lock issue, you can refer to 
https://issues.apache.org/jira/browse/KYLIN-2506

{quote}

So, up to now, how do we ensure the correctness of the global dict in a 
distributed env?

1 Distributed lock: it ensures only one thread can write the global dict at 
a time.
2 MVCC: we write the global dict in the working dir and read the global dict 
from the versions dir.
3 Every time we read the global dict, we construct the 
AppendTrieDictionary from the metadata in the latestVersion dir.

Based on the above 3 points, we can ensure the global dict is written 
sequentially and read in parallel in a distributed env.

{quote}

Thank you.
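
To make the version expiry described in this comment concrete, here is a minimal sketch of one way a retention rule could work. It is an assumption, not the Kylin implementation: the thread names "kylin.dictionary.append-max-versions" and "kylin.dictionary.append-version-ttl" but does not show how they combine, so the rule below (delete a version only if it is outside the newest N and also older than the TTL) is a guess, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of version retention; not the actual Kylin implementation. */
final class VersionRetentionSketch {

    /** Parses the timestamp suffix of a directory name such as "version_1509701213681". */
    static long timestampOf(String versionDir) {
        return Long.parseLong(versionDir.substring("version_".length()));
    }

    /**
     * Returns the version directories that may be deleted, newest first: those that
     * are not among the newest maxVersions AND are older than ttlMillis.
     * Combining the two settings this way is an assumption.
     */
    static List<String> expiredVersions(List<String> versionDirs, int maxVersions,
                                        long ttlMillis, long nowMillis) {
        List<String> sorted = new ArrayList<>(versionDirs);
        // Sort newest first by the timestamp embedded in the directory name.
        sorted.sort((a, b) -> Long.compare(timestampOf(b), timestampOf(a)));
        List<String> expired = new ArrayList<>();
        for (int i = Math.max(0, maxVersions); i < sorted.size(); i++) {
            String dir = sorted.get(i);
            if (nowMillis - timestampOf(dir) > ttlMillis) {
                expired.add(dir);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        List<String> dirs = List.of(
                "version_1530759906521", "version_1530759913804", "version_1530759915097");
        long threeDays = 3L * 24 * 60 * 60 * 1000;
        // With maxVersions = 1, only the two older directories are candidates.
        System.out.println(expiredVersions(dirs, 1, threeDays, System.currentTimeMillis()));
    }
}
```

Note that a rule like this only runs when a dictionary is committed, so it would not by itself remove dictionaries whose cubes have all been dropped.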



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-04 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533243#comment-16533243
 ] 

Shaofeng SHI commented on KYLIN-3430:
-

Hi Temple, you're correct: the GD only grows, so a newer version covers the 
older ones and only the latest version is needed. I take back my comment 3).

@kangkaisen please review this as well.



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-04 Thread Temple Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533238#comment-16533238
 ] 

Temple Zhou commented on KYLIN-3430:


Hi Shaofeng,

I have no objection to 1) and 2), only to 3).

I do not check whether a dictionary is referenced or not, because I found that 
even while the GD was in use, there were still redundant GD versions.

{noformat}
hdfs@t1:~$ hdfs dfs -du -h /kylin/kylin_metadata/resources/GlobalDict/dict/DEFAULT.TEST_GLOBAL_DICT/USER_ID/
148  444  /kylin/kylin_metadata/resources/GlobalDict/dict/DEFAULT.TEST_GLOBAL_DICT/USER_ID/version_1530759906521
157  471  /kylin/kylin_metadata/resources/GlobalDict/dict/DEFAULT.TEST_GLOBAL_DICT/USER_ID/version_1530759913804
166  498  /kylin/kylin_metadata/resources/GlobalDict/dict/DEFAULT.TEST_GLOBAL_DICT/USER_ID/version_1530759915097
{noformat}

Besides, I analyzed the contents of the GD and found that a newer GD version 
seems to contain everything in the older ones, so I think only the newest 
version should be kept, even when the GD is being used by an active cube. I'm 
not sure whether I have expressed this clearly.
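
The observation that a newer version contains everything in an older one can be stated as a containment check. The sketch below models each dictionary version as a plain value-to-id map, which is a simplification: the real on-disk format is not shown in this thread, and the class and method names are hypothetical.

```java
import java.util.Map;

/** Hypothetical containment check; dictionary versions are modeled as plain maps. */
final class DictSupersetSketch {

    /** True if every value -> id mapping of the older version also exists in the newer one. */
    static boolean covers(Map<String, Integer> newer, Map<String, Integer> older) {
        return newer.entrySet().containsAll(older.entrySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> older = Map.of("alice", 1, "bob", 2);
        Map<String, Integer> newer = Map.of("alice", 1, "bob", 2, "carol", 3);
        System.out.println(covers(newer, older));
        System.out.println(covers(older, newer));
    }
}
```

If such a check holds for every pair of consecutive versions, keeping only the newest version loses no mappings, which is the argument made above.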




[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-04 Thread Temple Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532727#comment-16532727
 ] 

Temple Zhou commented on KYLIN-3430:


Shaofeng, I think you need to review the patch, as I do not know if the logic 
of cleaning the GD is correct and safe.



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-02 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529556#comment-16529556
 ] 

Shaofeng SHI commented on KYLIN-3430:
-

Temple, assigned to you, please go ahead.



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-07-01 Thread Temple Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529412#comment-16529412
 ] 

Temple Zhou commented on KYLIN-3430:


[~Shaofengshi]
 Hi Shaofeng, I'd like to contribute. Maybe, you can assign the issue to me. :D 



[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2018-06-28 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526112#comment-16526112
 ] 

Shaofeng SHI commented on KYLIN-3430:
-

Hi Temple, thanks for reporting this! I think you're correct that there is no 
check and cleanup for the GD. Would you like to contribute a patch? Thanks!
