Error info when using the global dictionary: after switching to the global dictionary, a recursive function in the source code (rebuildTrieTreeR in AppendTrieDictionary) leads to the following error:
com.google.common.util.concurrent.ExecutionError: java.lang.StackOverflowError
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2261)
    at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
    at org.apache.kylin.dict.CachedTreeMap.get(CachedTreeMap.java:292)
    at org.apache.kylin.dict.CachedTreeMap.get(CachedTreeMap.java:52)
    at org.apache.kylin.dict.AppendTrieDictionary$Builder.addValue(AppendTrieDictionary.java:821)
    at org.apache.kylin.dict.AppendTrieDictionary$Builder.addValue(AppendTrieDictionary.java:804)
    at org.apache.kylin.dict.GlobalDictionaryBuilder.build(GlobalDictionaryBuilder.java:78)
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:323)
    at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:185)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.StackOverflowError
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:307)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)
    at org.apache.kylin.dict.AppendTrieDictionary$DictSlice.rebuildTrieTreeR(AppendTrieDictionary.java:320)

-----Original Message-----
From: ShaoFeng Shi [mailto:shaofeng...@apache.org]
Sent: November 10, 2016 18:43
To: user
Cc: dev
Subject: Re: Cube Merge Error

If you're using the precise distinct count (that is, bitmap) on a non-integer column, Kylin needs to build a dictionary to do the string -> integer conversion.
To support ultra-high cardinality, the "global dictionary" should be selected: it has a smaller memory footprint and can support a cardinality of up to 2 billion:

https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/

2016-11-10 14:35 GMT+08:00 仇同心 <qiutong...@jd.com>:

> Hi,
>
> The error occurs in the first step of the cube merge, #1 Step Name: Merge Cuboid Dictionary
>
> Error log info:
>
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:91 : Dictionary value samples: 10101001120172=>479, 10101003212212=>480, 10101003812579=>481, 10101005033448=>482, 10101005046605=>483
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:92 : Dictionary cardinality: 108330611
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:93 : Dictionary builder class: org.apache.kylin.dict.DictionaryGenerator$StringDictBuilder
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:94 : Dictionary class: org.apache.kylin.dict.TrieDictionary
> 2016-11-10 14:08:00,798 INFO [pool-7-thread-1] dict.DictionaryManager:169 : Growing dict is not enabled
> 2016-11-10 14:08:00,803 INFO [pool-7-thread-1] dict.DictionaryManager:186 : 5 existing dictionaries of the same column
> 2016-11-10 14:08:00,804 INFO [pool-7-thread-1] dict.DictionaryManager:420 : DictionaryManager(1455639799) loading DictionaryInfo(loadDictObj:true) at /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/6ab19cbd-93f6-48b2-9eb8-a4
>
> 2016-11-10 14:09:00,651 INFO [pool-7-thread-1] dict.DictionaryManager:404 : Saving dictionary at /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/745317e4-a682-4f6a-8ac3-cec2a466e419.dict
> 2016-11-10 14:09:02,658 DEBUG [pool-7-thread-1] persistence.ResourceStore:207 : Directly saving resource /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/745317e4-a682-4f6a-8ac3-cec2a466e419.dict (Store kylinDev_metadata@hbase)
> 2016-11-10 14:09:04,263 ERROR [pool-7-thread-1] execution.AbstractExecutable:115 : error running Executable: MergeDictionaryStep{id=b0d32a86-3516-4232-b041-aabe127cccc5-00, name=Merge Cuboid Dictionary, state=RUNNING}
>
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>     at java.util.Arrays.copyOf(Arrays.java:2271)
>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2147)
>     at org.apache.commons.io.IOUtils.copy(IOUtils.java:2102)
>     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2123)
>     at org.apache.commons.io.IOUtils.copy(IOUtils.java:2078)
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.putResourceImpl(HBaseResourceStore.java:239)
>     at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:208)
>     at org.apache.kylin.dict.DictionaryManager.save(DictionaryManager.java:413)
>     at org.apache.kylin.dict.DictionaryManager.saveNewDict(DictionaryManager.java:209)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:176)
>     at org.apache.kylin.dict.DictionaryManager.mergeDictionary(DictionaryManager.java:269)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.mergeDictionaries(MergeDictionaryStep.java:145)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.makeDictForNewSegment(MergeDictionaryStep.java:135)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.doWork(MergeDictionaryStep.java:67)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> In /bin/setenv.sh:
>
> export KYLIN_JVM_SETTINGS="-Xmx100g -Xms100g -Xmn2g -XX:+UseG1GC -XX:MaxPermSize=128M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
>
> But this error still appears. Which parameters should I modify to solve it?

--
Best regards,

Shaofeng Shi 史少锋
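
A note on the quoted OutOfMemoryError: "Requested array size exceeds VM limit" concerns the length of a single array rather than the total heap, which would explain why -Xmx100g made no difference. The trace shows HBaseResourceStore.putResourceImpl copying the dictionary into a ByteArrayOutputStream, which keeps all of its data in one byte[], and a Java array is indexed by int, so it cannot hold more than Integer.MAX_VALUE elements (about 2 GiB; the practical HotSpot limit is slightly lower). The sketch below only illustrates that arithmetic. The class ArrayLimitSketch, its methods, and the 3 GiB payload are assumptions invented for the example; none of it is Kylin or JDK code, and the payload size is not taken from the logs.

```java
// Illustrative sketch only: hypothetical names, not Kylin or JDK source.
class ArrayLimitSketch {

    // A Java array length is an int, so this is the hard upper bound
    // on the number of elements, whatever -Xmx is set to.
    static final long MAX_ARRAY_LENGTH = Integer.MAX_VALUE;

    /** Returns true if a payload of this many bytes could fit in one byte[]. */
    static boolean fitsInSingleByteArray(long payloadBytes) {
        return payloadBytes >= 0 && payloadBytes <= MAX_ARRAY_LENGTH;
    }

    /**
     * Simplified model of a buffer that doubles its capacity until the
     * payload fits. Returns the final capacity, or -1 once the required
     * capacity would exceed what a single array can hold.
     */
    static long capacityAfterDoubling(long initialCapacity, long payloadBytes) {
        long capacity = initialCapacity;
        while (capacity < payloadBytes) {
            capacity *= 2;
            if (capacity > MAX_ARRAY_LENGTH) {
                return -1; // no heap setting can make this array allocatable
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        long configuredHeap = 100L * 1024 * 1024 * 1024;  // -Xmx100g, as in the thread
        long assumedPayload = 3L * 1024 * 1024 * 1024;    // assumed size, for illustration

        System.out.println("payload smaller than heap: " + (assumedPayload < configuredHeap));
        System.out.println("payload fits in one byte[]: " + fitsInSingleByteArray(assumedPayload));
        System.out.println("capacity after doubling: " + capacityAfterDoubling(32, assumedPayload));
    }
}
```

If this reading is right, the fix lies in how the dictionary is built and stored rather than in KYLIN_JVM_SETTINGS, which matches the advice in the thread to switch the dictionary type.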