Step 4. We gave 4 GB to the Kylin server.

#4 Step Name: Build Dimension Dictionary

Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.IdentityHashMap.resize(IdentityHashMap.java:471)
    at java.util.IdentityHashMap.put(IdentityHashMap.java:440)
    at org.apache.kylin.dict.TrieDictionaryBuilder.buildTrieBytes(TrieDictionaryBuilder.java:464)
    at org.apache.kylin.dict.NumberDictionaryBuilder.build(NumberDictionaryBuilder.java:43)
    at org.apache.kylin.dict.DictionaryGenerator$NumberDictBuilder.build(DictionaryGenerator.java:186)
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:73)
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:321)
    at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)

On Thu, Jun 22, 2017 at 7:59 PM, ShaoFeng Shi <shaofeng...@apache.org> wrote:

> In which step did it run out of memory? Could you share the JSON of the
> Cube definition? It can be found in the "JSON(Cube)" tab.
>
> 2017-06-23 8:48 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>
>> The column has a count distinct measure as well. So it still doesn't
>> need GD? I tried, but it appears it ran out of memory.
>>
>> On Thu, Jun 22, 2017 at 5:36 PM, ShaoFeng Shi <shaofeng...@apache.org>
>> wrote:
>>
>>> For integer values, Global Dictionary is not needed.
>>>
>>> So just set "integer:4" as the encoding for the dimension, and leave
>>> the global dictionary blank.
>>>
>>> 2017-06-23 6:30 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>>>
>>>> Thanks ShaoFeng.
>>>>
>>>> So to clarify: the UHC dimension is an integer. Can I set its
>>>> encoding to integer and then also include it in GD for count
>>>> distinct? Or leave it out of GD and add it with integer encoding only?
>>>>
>>>> On Wed, Jun 21, 2017 at 10:55 PM, ShaoFeng Shi <shaofeng...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Sonny,
>>>>>
>>>>> I see; it is a defect: for one column Kylin uses at most one
>>>>> dictionary, so it cannot differentiate between the ordinary dict and
>>>>> the Global dict when that column is used in both a dimension and a
>>>>> measure.
>>>>>
>>>>> 25 million is an Ultra High Cardinality dimension; it is not suitable
>>>>> for a dict, as the dict size will exceed the Java heap size. In this
>>>>> case, please use fixed_length encoding; if that column is of integer
>>>>> or long type, you can use "integer" encoding. In the meantime, keep
>>>>> using GD for the count distinct measure.
>>>>>
>>>>> 2017-06-22 13:37 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>>>>>
>>>>>> I see what you mean @ShaoFeng Shi.
>>>>>>
>>>>>> I noticed one of the measures I have defined is also a dimension. So
>>>>>> what can I do in this case? It is needed both as a count distinct
>>>>>> measure and as a dimension. The typical dictionary gives a Java heap
>>>>>> space error. It has approximately 25m unique keys. Any ideas on how
>>>>>> best Kylin can handle this? Should I remove it from GD and add it as
>>>>>> a dimension with fixed-length encoding?
>>>>>>
>>>>>> On Wed, Jun 21, 2017 at 10:33 PM, Sonny Heer <sonnyh...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> No, not as a dimension. Only for Count distinct measures.
>>>>>>>
>>>>>>> On Wed, Jun 21, 2017 at 10:25 PM, ShaoFeng Shi <
>>>>>>> shaofeng...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Sonny, are you using GlobalDictionary for a dimension? If so,
>>>>>>>> please change it to use an ordinary dictionary.
>>>>>>>>
>>>>>>>> The GlobalDictionary is a "one-way" dictionary: it can only
>>>>>>>> encode a String to an integer, and it does not support decoding
>>>>>>>> the String from an integer. The main usage for GlobalDictionary
>>>>>>>> is the precise Count Distinct: the bitmap only accepts integers
>>>>>>>> as input, so Kylin uses the GD to do the conversion.
>>>>>>>>
>>>>>>>> 2017-06-22 6:23 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>>>>>>>>
>>>>>>>>> After finally getting the global dictionary to work when
>>>>>>>>> building the cube, there are now exceptions at query time.
>>>>>>>>>
>>>>>>>>> ERROR in query:
>>>>>>>>> "AppendTrieDictionary can't retrive value from id"
>>>>>>>>>
>>>>>>>>> Here is where it ends up in the code:
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> final protected T getValueFromIdImpl(int id) {
>>>>>>>>>     throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> protected byte[] getValueBytesFromIdImpl(int id) {
>>>>>>>>>     throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> @Override
>>>>>>>>> protected int getValueBytesFromIdImpl(int id, byte[] returnValue, int offset) {
>>>>>>>>>     throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Sonny S. Heer
>>>>>>> Senior Software Engineer
>>>>>>> m: 360-434-4354 h: 509-884-2574

--
Sonny S. Heer
Senior Software Engineer
m: 360-434-4354 h: 509-884-2574
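
As a rough illustration of why the "integer:4" encoding suggested in the thread avoids the heap problem: a fixed-width integer encoding writes each value directly into a fixed number of bytes, so nothing proportional to the 25 million distinct keys has to be held in memory. The sketch below is only an assumption-laden illustration of that idea. The class and method names are hypothetical, it handles non-negative values only, and it is not Kylin's actual encoder.

```java
import java.util.Arrays;

// Hypothetical sketch, NOT Kylin's implementation: a fixed-width,
// dictionary-free encoding for non-negative integer keys.
class FixedWidthIntEncodingSketch {

    // Write `value` into exactly `width` bytes, most significant byte first.
    static byte[] encode(long value, int width) {
        if (width < 1 || width > 8) {
            throw new IllegalArgumentException("width must be between 1 and 8");
        }
        if (value < 0 || (width < 8 && value >= (1L << (8 * width)))) {
            throw new IllegalArgumentException("value does not fit in " + width + " bytes");
        }
        byte[] out = new byte[width];
        for (int i = width - 1; i >= 0; i--) {
            out[i] = (byte) (value & 0xFF); // lowest remaining byte goes last
            value >>>= 8;
        }
        return out;
    }

    // Rebuild the value from its bytes; no lookup table is involved.
    static long decode(byte[] bytes) {
        long value = 0;
        for (byte b : bytes) {
            value = (value << 8) | (b & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] key = encode(25_000_000L, 4);
        System.out.println(Arrays.toString(key));
        System.out.println(decode(key));
    }
}
```

Because both directions are plain arithmetic, such an encoding can also be decoded at query time, which a one-way dictionary like the AppendTrieDictionary quoted above cannot do.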