[ 
https://issues.apache.org/jira/browse/KYLIN-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216373#comment-16216373
 ] 

leontong commented on KYLIN-2794:
---------------------------------

I don't think we can simply compare string values in 
MultipleDictionaryValueEnumerator. we probably need to convert into byte array 
first,  as dictionaries are comparing values in byte array format. otherwise, 
this error may come up again.

> MultipleDictionaryValueEnumerator should output values in sorted order
> ----------------------------------------------------------------------
>
>                 Key: KYLIN-2794
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2794
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v2.0.0
>         Environment: hadoop hadoop-2.6.0-cdh5.8.2   hive 2.1 hbase 0.98
>            Reporter: 翟玉勇
>            Assignee: Yifei Wu
>            Priority: Critical
>              Labels: scope
>             Fix For: v2.2.0
>
>
> Dictionary exception during merge of segments.
> {code}
> 2017-08-18 14:17:48,828 ERROR [pool-11-thread-1] 
> threadpool.DistributedScheduler:188 : ExecuteException 
> job:8d031b5f-2d3f-445f-a62b-7bc560d919ea in server: ******
> org.apache.kylin.job.exception.ExecuteException: 
> org.apache.kylin.job.exception.ExecuteException: 
> java.lang.IllegalStateException: Invalid input data. Unordered data cannot be 
> split into multi trees
>       at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:134)
>       at 
> org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:185)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kylin.job.exception.ExecuteException: 
> java.lang.IllegalStateException: Invalid input data. Unordered data cannot be 
> split into multi trees
>       at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:134)
>       at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
>       at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
>       ... 4 more
> Caused by: java.lang.IllegalStateException: Invalid input data. Unordered 
> data cannot be split into multi trees
>       at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.addValue(TrieDictionaryForestBuilder.java:92)
>       at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.addValue(TrieDictionaryForestBuilder.java:78)
>       at 
> org.apache.kylin.dict.DictionaryGenerator$StringTrieDictForestBuilder.addValue(DictionaryGenerator.java:212)
>       at 
> org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:79)
>       at 
> org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:64)
>       at 
> org.apache.kylin.dict.DictionaryGenerator.mergeDictionaries(DictionaryGenerator.java:104)
>       at 
> org.apache.kylin.dict.DictionaryManager.mergeDictionary(DictionaryManager.java:267)
>       at 
> org.apache.kylin.engine.mr.steps.MergeDictionaryStep.mergeDictionaries(MergeDictionaryStep.java:146)
>       at 
> org.apache.kylin.engine.mr.steps.MergeDictionaryStep.makeDictForNewSegment(MergeDictionaryStep.java:136)
>       at 
> org.apache.kylin.engine.mr.steps.MergeDictionaryStep.doWork(MergeDictionaryStep.java:68)
>       at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
>       ... 6 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to