If you use precise count distinct (that is, the bitmap-based measure) on a
non-integer column, Kylin needs to build a dictionary to do the
string -> integer conversion. To support ultra-high cardinality, select the
"global dictionary": it has a smaller memory footprint and can support a
cardinality of up to 2 billion:
https://kylin.apache.org/blog/2016/08/01/count-distinct-in-kylin/
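
To illustrate the string -> integer step, here is a minimal Java sketch of
dictionary encoding feeding a bitmap. It is not Kylin's implementation: the
class and method names are hypothetical, and a plain HashMap and
java.util.BitSet stand in for Kylin's dictionary and bitmap classes.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: names are hypothetical, not part of Kylin.
public class DictBitmapSketch {

    // Assign each previously unseen string the next free integer ID.
    static int encode(Map<String, Integer> dict, String value) {
        Integer id = dict.get(value);
        if (id == null) {
            id = dict.size();
            dict.put(value, id);
        }
        return id;
    }

    // Precise distinct count: set one bit per dictionary ID, then count the set bits.
    static int countDistinct(List<String> values) {
        Map<String, Integer> dict = new HashMap<>();
        BitSet bitmap = new BitSet();
        for (String v : values) {
            bitmap.set(encode(dict, v));
        }
        return bitmap.cardinality();
    }

    public static void main(String[] args) {
        List<String> orders = Arrays.asList(
                "10101001120172", "10101003212212", "10101001120172");
        System.out.println(countDistinct(orders)); // two distinct values
    }
}
```

The bitmap can only index integers, which is why a string column needs a
dictionary in front of it, and why the dictionary grows with the column's
cardinality.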

2016-11-10 14:35 GMT+08:00 仇同心 <qiutong...@jd.com>:

> Hi,
>
> The first step of the cube merge fails: #1 Step Name: Merge Cuboid Dictionary
>
> Error log:
>
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:91 : Dictionary value samples: 10101001120172=>479, 10101003212212=>480, 10101003812579=>481, 10101005033448=>482, 10101005046605=>483
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:92 : Dictionary cardinality: 108330611
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:93 : Dictionary builder class: org.apache.kylin.dict.DictionaryGenerator$StringDictBuilder
> 2016-11-10 14:08:00,798 DEBUG [pool-7-thread-1] dict.DictionaryGenerator:94 : Dictionary class: org.apache.kylin.dict.TrieDictionary
> 2016-11-10 14:08:00,798 INFO  [pool-7-thread-1] dict.DictionaryManager:169 : Growing dict is not enabled
> 2016-11-10 14:08:00,803 INFO  [pool-7-thread-1] dict.DictionaryManager:186 : 5 existing dictionaries of the same column
> 2016-11-10 14:08:00,804 INFO  [pool-7-thread-1] dict.DictionaryManager:420 : DictionaryManager(1455639799) loading DictionaryInfo(loadDictObj:true) at /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/6ab19cbd-93f6-48b2-9eb8-a4
>
> 2016-11-10 14:09:00,651 INFO  [pool-7-thread-1] dict.DictionaryManager:404 : Saving dictionary at /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/745317e4-a682-4f6a-8ac3-cec2a466e419.dict
> 2016-11-10 14:09:02,658 DEBUG [pool-7-thread-1] persistence.ResourceStore:207 : Directly saving resource /dict/DMT.DMT_KYLIN_PAY_SYT_ORDR_DET_I_D/OUTBIZNO/745317e4-a682-4f6a-8ac3-cec2a466e419.dict (Store kylinDev_metadata@hbase)
>
> 2016-11-10 14:09:04,263 ERROR [pool-7-thread-1] execution.AbstractExecutable:115 : error running Executable: MergeDictionaryStep{id=b0d32a86-3516-4232-b041-aabe127cccc5-00, name=Merge Cuboid Dictionary, state=RUNNING}
>
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
>          at java.util.Arrays.copyOf(Arrays.java:2271)
>          at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>          at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>          at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>          at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2147)
>          at org.apache.commons.io.IOUtils.copy(IOUtils.java:2102)
>          at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2123)
>          at org.apache.commons.io.IOUtils.copy(IOUtils.java:2078)
>          at org.apache.kylin.storage.hbase.HBaseResourceStore.putResourceImpl(HBaseResourceStore.java:239)
>          at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:208)
>          at org.apache.kylin.dict.DictionaryManager.save(DictionaryManager.java:413)
>          at org.apache.kylin.dict.DictionaryManager.saveNewDict(DictionaryManager.java:209)
>          at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:176)
>          at org.apache.kylin.dict.DictionaryManager.mergeDictionary(DictionaryManager.java:269)
>          at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.mergeDictionaries(MergeDictionaryStep.java:145)
>          at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.makeDictForNewSegment(MergeDictionaryStep.java:135)
>          at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.doWork(MergeDictionaryStep.java:67)
>          at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>          at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
>          at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
>          at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>          at java.lang.Thread.run(Thread.java:745)
>
> In bin/setenv.sh:
>
>      export KYLIN_JVM_SETTINGS="-Xmx100g -Xms100g -Xmn2g -XX:+UseG1GC -XX:MaxPermSize=128M -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
>
> but this error still appears. Which parameters should I modify to
> solve it?



-- 
Best regards,

Shaofeng Shi 史少锋
