If you're using the precise distinct count (i.e. bitmap) on a non-integer column, Kylin needs to build a dictionary to do the string -> integer conversion. To support ultra-high cardinality, the "global dictionary" should be selected: it has a smaller memory footprint and can support up to 2 billion cardinality: https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
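To make the string -> integer step concrete, here is a minimal sketch of the general idea. It is an illustration only and not Kylin's code: the class and method names are made up, and a plain HashMap and java.util.BitSet stand in for Kylin's real dictionary and bitmap structures. Each distinct string gets an integer ID, the bit for that ID is set, and the precise distinct count is the number of set bits.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not Kylin's dictionary or bitmap classes.
class DictBitmapSketch {
    // Dictionary: each distinct string is assigned the next unused integer ID.
    private final Map<String, Integer> dict = new HashMap<>();
    // Bitmap: bit i is set once the value with ID i has been seen.
    private final BitSet bitmap = new BitSet();

    // Encode a value to its ID (assigning a new ID if needed) and mark it as seen.
    int add(String value) {
        Integer id = dict.get(value);
        if (id == null) {
            id = dict.size();
            dict.put(value, id);
        }
        bitmap.set(id);
        return id;
    }

    // The precise distinct count is the number of set bits.
    int distinctCount() {
        return bitmap.cardinality();
    }

    public static void main(String[] args) {
        DictBitmapSketch sketch = new DictBitmapSketch();
        // Two distinct values, one of them repeated.
        String[] values = {"10101001120172", "10101003212212", "10101001120172"};
        for (String v : values) {
            sketch.add(v);
        }
        System.out.println(sketch.distinctCount()); // prints 2
    }
}
```

The dictionary in this sketch grows with the number of distinct values, which is why the choice of dictionary matters at a cardinality above 100 million like the one in your log.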
2016-11-10 14:35 GMT+08:00 仇同心 <qiutong...@jd.com>:

> Hi,
>
> The first step of the cube merge fails: #1 Step Name: Merge Cuboid Dictionary
>
> Error Log info:
>
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:91 : Dictionary value samples: 10101001120172=>479, 10101003212212=>480, 10101003812579=>481, 10101005033448=>482, 10101005046605=>483
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:92 : Dictionary cardinality: 108330611
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:93 : Dictionary builder class: org.apache.kylin.dict.DictionaryGenerator$StringDictBuilder
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:94 : Dictionary class: org.apache.kylin.dict.TrieDictionary
> 2016-11-10 14:08:00,798 INFO [pool-7-thread-1] dict.DictionaryManager:169 : Growing dict is not enabled
> 2016-11-10 14:08:00,803 INFO [pool-7-thread-1] dict.DictionaryManager:186 : 5 existing dictionaries of the same column
> 2016-11-10 14:08:00,804 INFO [pool-7-thread-1] dict.DictionaryManager:420 : DictionaryManager(1455639799) loading DictionaryInfo(loadDictObj:true) at /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/6ab19cbd-93f6-48b2-9eb8-a4
>
> 2016-11-10 14:09:00,651 INFO [pool-7-thread-1] dict.DictionaryManager:404 : Saving dictionary at /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/745317e4-a682-4f6a-8ac3-cec2a466e419.dict
> 2016-11-10 14:09:02,658 DEBUG [pool-7-thread-1] persistence.ResourceStore:207 : Directly saving resource /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/745317e4-a682-4f6a-8ac3-cec2a466e419.dict (Store kylinDev_metadata@hbase)
> 2016-11-10 14:09:04,263 ERROR [pool-7-thread-1] execution.AbstractExecutable:115 : error running Executable: MergeDictionaryStep{id=b0d32a86-3516-4232-b041-aabe127cccc5-00, name=Merge Cuboid Dictionary, state=RUNNING}
>
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>     at java.util.Arrays.copyOf(Arrays.java:2271)
>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2147)
>     at org.apache.commons.io.IOUtils.copy(IOUtils.java:2102)
>     at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2123)
>     at org.apache.commons.io.IOUtils.copy(IOUtils.java:2078)
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.putResourceImpl(HBaseResourceStore.java:239)
>     at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:208)
>     at org.apache.kylin.dict.DictionaryManager.save(DictionaryManager.java:413)
>     at org.apache.kylin.dict.DictionaryManager.saveNewDict(DictionaryManager.java:209)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:176)
>     at org.apache.kylin.dict.DictionaryManager.mergeDictionary(DictionaryManager.java:269)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.mergeDictionaries(MergeDictionaryStep.java:145)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.makeDictForNewSegment(MergeDictionaryStep.java:135)
>     at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.doWork(MergeDictionaryStep.java:67)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> In /bin/setenv.sh:
>
> export KYLIN_JVM_SETTINGS="-Xmx100g -Xms100g -Xmn2g -XX:+UseG1GC -XX:MaxPermSize=128M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
>
> but this error still appears. Which parameters should I modify to solve it?

--
Best regards,

Shaofeng Shi 史少锋