[
https://issues.apache.org/jira/browse/KYLIN-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214197#comment-16214197
]
kangkaisen commented on KYLIN-2764:
-----------------------------------
Thank you very much, liyang and shaofeng.
Shaofeng, you should let me do the merge work, thank you.
I don't have any further changes, but there is an issue in the 2764 branch:
After KYLIN-2800
https://github.com/apache/kylin/commit/ac77008ee81d4dcc2956b1a2cfd6eaa7ae9fc5d9
the first point I raised in my earlier comment no longer applies:
{quote}
1. The FK column in fact table could be UHC column.
{quote}
So the latest commit in the 2764 branch could be simplified. This is the commit
that applies KYLIN-2800:
https://github.com/apache/kylin/commit/48f3fb1953a413acfdd405539a7cfd211a5e85de
> Build the dict for UHC column with MR
> -------------------------------------
>
> Key: KYLIN-2764
> URL: https://issues.apache.org/jira/browse/KYLIN-2764
> Project: Kylin
> Issue Type: Improvement
> Components: Job Engine
> Affects Versions: v2.0.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Fix For: v2.3.0
>
> Attachments: job-memory-after.png, job-memory-before.png
>
>
> KYLIN-2217 builds the dict for normal columns with MR, but the dict for UHC
> columns is still built in the JobServer. As in KYLIN-2217, we could also use
> MR to build the dict for UHC columns, which would relieve the memory pressure
> on the JobServer, improve its job concurrency, and speed up the build when
> there are multiple UHC columns.
> The MR input is the output of "Extract Fact Table Distinct Columns", and the
> MR output is the UHC column dict. Because it is very hard to build a global
> dict with multiple reducers, I use one reducer per UHC column and allocate
> enough memory to that reducer. According to my test, 8G of memory is enough.
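
The one-reducer-per-column assignment described above can be sketched roughly
as follows. This is only an illustration under stated assumptions, not the code
in the 2764 branch: the class UhcReducerAssignment, its two methods, and the
column names are hypothetical, and a real job would place the same lookup
inside a Hadoop Partitioner.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: give every UHC column exactly one reducer, so that a
// single reducer sees all values of its column and can build the global dict.
class UhcReducerAssignment {

    // Assigns reducer indexes 0..n-1 to the distinct UHC columns, in input order.
    public static Map<String, Integer> assignReducers(List<String> uhcColumns) {
        Map<String, Integer> assignment = new LinkedHashMap<>();
        for (String column : uhcColumns) {
            if (!assignment.containsKey(column)) {
                assignment.put(column, assignment.size());
            }
        }
        return assignment;
    }

    // Returns the reducer index for a column; this is the lookup a partitioner would do.
    public static int getPartition(Map<String, Integer> assignment, String column) {
        Integer reducer = assignment.get(column);
        if (reducer == null) {
            throw new IllegalArgumentException("Not a UHC column: " + column);
        }
        return reducer;
    }

    public static void main(String[] args) {
        // Column names are made up for the example.
        Map<String, Integer> assignment =
                assignReducers(Arrays.asList("USER_ID", "ORDER_ID"));
        // The number of reducers equals the number of UHC columns.
        System.out.println("reducers=" + assignment.size() + " " + assignment);
    }
}
```

With this layout the reducer count equals the number of UHC columns, which is
why each reducer needs enough memory to hold one whole column dict.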
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)