zhao jintao created KYLIN-4015:
----------------------------------

             Summary: Kylin build cube error at the "Build UHC Dictionary" step
                 Key: KYLIN-4015
                 URL: https://issues.apache.org/jira/browse/KYLIN-4015
             Project: Kylin
          Issue Type: Bug
          Components: Metadata
    Affects Versions: v2.5.2
         Environment: Fusion Insight
            Reporter: zhao jintao
            Assignee: zhao jintao


Hi All:

As we know, Kylin builds dimension dictionaries in the Kylin job client. If a 
cube has UHC (ultra-high-cardinality) dimensions, this costs much more CPU and 
memory. Kylin can instead build UHC dictionaries with the MR engine, which 
reduces the resource consumption of the build engine.

However, the "Build UHC Dictionary" step, which runs on the MR engine, fails. 
This is the error reported by YARN:

org.apache.hadoop.mapred.YarnChild: Exception running child : 
java.io.IOException: 
hdfs://hacluster/xxx.../xxx/fact_distinct_columns/xxx/FIELD_NAME.dic-r-00001 
not a SequenceFile.
 at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:)
 at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:)
 at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:)

The cause is that the "Extract Fact Table Distinct" step writes two types of 
files, ".dci" and ".rldict". A ".dci" file is not a sequence file, so the 
"Build UHC Dictionary" step should filter out ".dci" files when it runs with 
the MR engine.
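
To illustrate the idea (this is only a sketch, not the actual patch), the 
filtering rule can be written as a plain predicate on file names. The class 
and method names below are hypothetical, and the sketch assumes that output 
files are named like the one in the stack trace, i.e. a column name, the type 
suffix, and a reducer suffix such as "-r-00001". In a real MR job the same 
check would presumably sit inside an org.apache.hadoop.fs.PathFilter 
registered on the job's input format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kylin: decides which outputs of the
// distinct-columns step may be read as SequenceFile input.
final class UhcInputFilter {

    // Suffix quoted in the report; ".dci" files are not sequence files.
    static final String COL_INFO_SUFFIX = ".dci";

    // Assumption: the type suffix appears inside the file name, followed by
    // a reducer suffix such as "-r-00001", so contains() is used, not endsWith().
    static boolean accept(String fileName) {
        return !fileName.contains(COL_INFO_SUFFIX);
    }

    // Keeps only the file names that pass accept(), preserving their order.
    static List<String> selectInputs(List<String> fileNames) {
        List<String> selected = new ArrayList<>();
        for (String name : fileNames) {
            if (accept(name)) {
                selected.add(name);
            }
        }
        return selected;
    }
}
```

With this rule, a file such as "FIELD_NAME.rldict-r-00001" is kept as input, 
while "FIELD_NAME.dci-r-00001" is skipped and never reaches the sequence file 
reader.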

I have resolved this problem and will submit my code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)