[ https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16533424#comment-16533424 ]
kangkaisen commented on KYLIN-3430:
-----------------------------------
Hi [~temple.zhou] [~Shaofengshi], the Global Dictionary cleanup is not an issue.
The Global Dictionary deletes expired version directories on commit. If you
think keeping three versions is redundant, you only need to set
"kylin.dictionary.append-max-versions" to 1.
As for the lock issue, you can refer to
https://issues.apache.org/jira/browse/KYLIN-2506
{quote}
So, up to now, how do we ensure the correctness of the global dict in a
distributed env?
1 Distributed lock: it ensures that only one thread can write the global dict
at a time.
2 MVCC: we write the global dict in the working dir and read the global dict
from the versions dir.
3 Every time we read the global dict, we construct the
AppendTrieDictionary from the metadata in the latestVersion dir.
Based on the above 3 points, we can ensure the global dict is written
sequentially and read in parallel in a distributed env.
{quote}
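To make the three points above concrete, here is a minimal in-memory sketch in Java. It is not Kylin code: the class MvccDictSketch and all of its methods are hypothetical, a ReentrantLock stands in for the distributed lock, a sorted map stands in for the versions dir, and a plain HashMap stands in for the AppendTrieDictionary. Pruning here is by count only (the maxVersions argument plays the role of "kylin.dictionary.append-max-versions"); any additional expiry rule the real store applies is ignored.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory model of the scheme described above; not Kylin code.
class MvccDictSketch {
    // Stands in for the distributed lock: at most one writer at a time.
    private final ReentrantLock writeLock = new ReentrantLock();
    // Stands in for the versions dir: version number -> immutable snapshot.
    private final ConcurrentSkipListMap<Long, Map<String, Integer>> versions =
            new ConcurrentSkipListMap<>();
    // Assumed to play the role of kylin.dictionary.append-max-versions.
    private final int maxVersions;
    private long nextVersion = 1; // only touched while holding writeLock

    MvccDictSketch(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
        versions.put(0L, Collections.<String, Integer>emptyMap());
    }

    // Readers take no lock; they always load the latest committed version.
    Map<String, Integer> readLatest() {
        return versions.lastEntry().getValue();
    }

    long latestVersion() {
        return versions.lastKey();
    }

    int versionCount() {
        return versions.size();
    }

    // Writer: build a private working copy, append new values, then commit
    // it as a new version and drop the oldest versions beyond maxVersions.
    long append(String... values) {
        writeLock.lock();
        try {
            Map<String, Integer> working = new HashMap<>(readLatest());
            for (String value : values) {
                // Existing values keep their id; new values get the next id.
                working.putIfAbsent(value, working.size());
            }
            long version = nextVersion++;
            versions.put(version, Collections.unmodifiableMap(working));
            while (versions.size() > maxVersions) {
                versions.pollFirstEntry(); // delete the oldest version
            }
            return version;
        } finally {
            writeLock.unlock();
        }
    }
}
```

In this sketch, a reader that calls readLatest() while a writer is inside append() still sees the previous committed version, because the working copy is published only at commit. That is the sequential-write, parallel-read behaviour the quote describes.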
Thank you.
> Global Dictionary Cleanup
> -------------------------
>
> Key: KYLIN-3430
> URL: https://issues.apache.org/jira/browse/KYLIN-3430
> Project: Kylin
> Issue Type: Improvement
> Components: Tools, Build and Test
> Affects Versions: v2.1.0, v2.2.0, v2.3.0, v2.3.1, v2.4.0
> Reporter: Temple Zhou
> Assignee: Temple Zhou
> Priority: Major
> Attachments: KYLIN-3430.master.001.patch
>
>
> I ran "{{./bin/metastore.sh clean --delete true}}" to clean up my Kylin
> metadata, but after that the Global Dictionary still exists in my HDFS and
> the size of the directory "/kylin_metadata/resources/GlobalDict/dict" hasn't
> shrunk.
>
> BTW: I'm very sure that there are redundant Global Dictionaries.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)