kangkaisen created KYLIN-2764:
---------------------------------

             Summary: Build the dict for UHC column with MR
                 Key: KYLIN-2764
                 URL: https://issues.apache.org/jira/browse/KYLIN-2764
             Project: Kylin
          Issue Type: Improvement
          Components: Job Engine
    Affects Versions: v2.0.0
            Reporter: kangkaisen
            Assignee: kangkaisen


KYLIN-2217 builds the dictionary for normal columns with MR, but the dictionary
for UHC (ultra high cardinality) columns is still built in the JobServer. As in
KYLIN-2217, we could also use MR to build the dictionary for UHC columns. This
would relieve the memory pressure on the JobServer, improve its job concurrency,
and speed up dictionary building when a cube has multiple UHC columns.

The MR input is the output of the "Extract Fact Table Distinct Columns" step,
and the MR output is the UHC column dictionary. Because it is very hard to build
a global dictionary with multiple reducers, I use one reducer per UHC column and
allocate enough memory to that reducer. According to my test, 8 GB of memory is
enough.
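
To illustrate the one-reducer-per-column idea, here is a minimal sketch in plain
Java with no Hadoop dependency. It is not the Kylin implementation: the class
name UhcDictSketch, the methods reducerFor and buildDict, and the column names
are all hypothetical, and the sorted-value id assignment is only a stand-in for
whatever encoding the real global dictionary uses. It shows why a single
reducer makes id assignment simple: all values of a column arrive in one place,
so ids can be handed out sequentially without coordination.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Kylin code base.
class UhcDictSketch {

    // Route every value of a UHC column to the reducer dedicated to that column.
    // The position of the column in uhcColumns is used as the reducer index,
    // so the job would need exactly uhcColumns.size() reducers.
    static int reducerFor(String column, List<String> uhcColumns) {
        int idx = uhcColumns.indexOf(column);
        if (idx < 0) {
            throw new IllegalArgumentException("not a UHC column: " + column);
        }
        return idx;
    }

    // Inside one reducer: deduplicate and sort the values of one column,
    // then assign consecutive ids starting at 0 (an assumed, simplified scheme).
    static Map<String, Integer> buildDict(Iterable<String> values) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String v : values) {
            sorted.add(v);
        }
        Map<String, Integer> dict = new LinkedHashMap<>();
        int nextId = 0;
        for (String v : sorted) {
            dict.put(v, nextId++);
        }
        return dict;
    }

    public static void main(String[] args) {
        List<String> uhcColumns = Arrays.asList("USER_ID", "DEVICE_ID");
        System.out.println("reducer for DEVICE_ID: " + reducerFor("DEVICE_ID", uhcColumns));
        System.out.println("dict: " + buildDict(Arrays.asList("u3", "u1", "u3", "u2")));
    }
}
```

In this sketch the whole distinct-value set of a column is held in memory by one
reducer, which is consistent with the need to give that reducer a large heap.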



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
