Step 4. We gave 4 GB to the Kylin server.

#4 Step Name: Build Dimension Dictionary

Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.util.IdentityHashMap.resize(IdentityHashMap.java:471)
        at java.util.IdentityHashMap.put(IdentityHashMap.java:440)
        at org.apache.kylin.dict.TrieDictionaryBuilder.buildTrieBytes(TrieDictionaryBuilder.java:464)
        at org.apache.kylin.dict.NumberDictionaryBuilder.build(NumberDictionaryBuilder.java:43)
        at org.apache.kylin.dict.DictionaryGenerator$NumberDictBuilder.build(DictionaryGenerator.java:186)
        at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:81)
        at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:73)
        at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:321)
        at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:222)
        at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
        at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:41)
        at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
        at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
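
The trace above shows the build still going through a dictionary builder (NumberDictionaryBuilder calling TrieDictionaryBuilder), which has to hold state for every distinct value on the heap. For contrast, here is a minimal sketch of what a fixed-width integer encoding such as "integer:4" amounts to conceptually: each value is written into a fixed number of bytes, so nothing per value has to be kept in memory. The class FixedWidthIntCodec and its methods are hypothetical names for illustration only; this is not Kylin's actual encoder, and it assumes the values fit in the chosen width.

```java
// Illustrative only: a hypothetical fixed-width integer codec, not a Kylin class.
class FixedWidthIntCodec {
    private final int width; // bytes per value, e.g. 4 for an "integer:4"-style encoding

    FixedWidthIntCodec(int width) {
        if (width < 1 || width > 8) {
            throw new IllegalArgumentException("width must be between 1 and 8");
        }
        this.width = width;
    }

    // Writes the low `width` bytes of the value in big-endian order.
    // Values that do not fit in `width` bytes are silently truncated.
    byte[] encode(long value) {
        byte[] out = new byte[width];
        for (int i = width - 1; i >= 0; i--) {
            out[i] = (byte) value;
            value >>= 8;
        }
        return out;
    }

    // Reads the bytes back; the first byte is sign-extended, the rest are unsigned.
    long decode(byte[] bytes) {
        long v = bytes[0];
        for (int i = 1; i < width; i++) {
            v = (v << 8) | (bytes[i] & 0xFF);
        }
        return v;
    }

    public static void main(String[] args) {
        FixedWidthIntCodec codec = new FixedWidthIntCodec(4);
        byte[] encoded = codec.encode(25_000_000L);
        // prints "4 bytes -> 25000000"
        System.out.println(encoded.length + " bytes -> " + codec.decode(encoded));
    }
}
```

Because encoding and decoding are pure arithmetic on the value itself, the cost does not grow with the number of distinct values, which is why this kind of encoding is the usual suggestion for very high cardinality integer columns.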

On Thu, Jun 22, 2017 at 7:59 PM, ShaoFeng Shi <shaofeng...@apache.org>
wrote:

> In which step did it run out of memory? Could you share the JSON of the cube
> definition? It can be found in the "JSON(Cube)" tab.
>
> 2017-06-23 8:48 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>
>> The column also has a count distinct measure. So it still doesn't need a
>> GD? I tried, but it appears to have run out of memory.
>>
>> On Thu, Jun 22, 2017 at 5:36 PM, ShaoFeng Shi <shaofeng...@apache.org>
>> wrote:
>>
>>> For integer values, Global Dictionary is not needed.
>>>
>>> So just set "integer:4" as the encoding for the dimension, and leave the
>>> global dictionary blank.
>>>
>>> 2017-06-23 6:30 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>>>
>>>> Thanks ShaoFeng.
>>>>
>>>> So, to clarify: the UHC dimension is an integer. Can I set its encoding
>>>> to integer and also include it in the GD for count distinct? Or should I
>>>> leave it out of the GD and add it with integer encoding only?
>>>>
>>>>
>>>>
>>>> On Wed, Jun 21, 2017 at 10:55 PM, ShaoFeng Shi <shaofeng...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Sonny,
>>>>>
>>>>> I see; it is a defect: Kylin uses at most one dictionary per column, so
>>>>> it cannot differentiate between an ordinary dictionary and a Global
>>>>> Dictionary when that column is used as both a dimension and a measure.
>>>>>
>>>>> A column with 25 million distinct values is an Ultra High Cardinality
>>>>> (UHC) dimension. It is not suitable for dictionary encoding, as the
>>>>> dictionary size will exceed the Java heap size. In this case, please use
>>>>> fixed_length encoding; if the column is of integer or long type, you can
>>>>> use "integer" encoding. Meanwhile, keep using the GD for the count
>>>>> distinct measure.
>>>>>
>>>>> 2017-06-22 13:37 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>>>>>
>>>>>> I see what you mean @ShaoFeng Shi.
>>>>>>
>>>>>> I noticed that one of the measures I have defined is also a dimension.
>>>>>> What can I do in this case? The column is needed both as a count
>>>>>> distinct measure and as a dimension. The ordinary dictionary gives a Java
>>>>>> heap space error; the column has approximately 25 million unique keys.
>>>>>> Any ideas on how Kylin can best handle this? Should I remove it from the
>>>>>> GD and add it as a dimension with fixed_length encoding?
>>>>>>
>>>>>> On Wed, Jun 21, 2017 at 10:33 PM, Sonny Heer <sonnyh...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> No, not as a dimension.  Only for Count distinct measures.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 21, 2017 at 10:25 PM, ShaoFeng Shi <
>>>>>>> shaofeng...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Sonny, are you using GlobalDictionary for a dimension? If so,
>>>>>>>> please change it to use an ordinary dictionary.
>>>>>>>>
>>>>>>>> The GlobalDictionary is a "one-way" dictionary: it can only encode a
>>>>>>>> String to an integer; it does not support decoding an integer back to
>>>>>>>> the String. The main use of GlobalDictionary is precise Count Distinct:
>>>>>>>> the bitmap only accepts integers as input, so Kylin uses the GD to do
>>>>>>>> the conversion.
>>>>>>>>
>>>>>>>> 2017-06-22 6:23 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
>>>>>>>>
>>>>>>>>> After finally getting the global dictionary to work when building
>>>>>>>>> the cube, there are now exceptions at query time.
>>>>>>>>>
>>>>>>>>> ERROR in query:
>>>>>>>>> "AppendTrieDictionary can't retrive value from id"
>>>>>>>>>
>>>>>>>>> Here is where it ends up in the code:
>>>>>>>>>
>>>>>>>>>     @Override
>>>>>>>>>     final protected T getValueFromIdImpl(int id) {
>>>>>>>>>         throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     @Override
>>>>>>>>>     protected byte[] getValueBytesFromIdImpl(int id) {
>>>>>>>>>         throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>     @Override
>>>>>>>>>     protected int getValueBytesFromIdImpl(int id, byte[] returnValue, int offset) {
>>>>>>>>>         throw new UnsupportedOperationException("AppendTrieDictionary can't retrive value from id");
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Shaofeng Shi 史少锋
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>> Sonny S. Heer
>>>>>>> Senior Software Engineer
>>>>>>> m: 360-434-4354 h: 509-884-2574
>>>>>>>


-- 


Sonny S. Heer
Senior Software Engineer
m: 360-434-4354 h: 509-884-2574
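
The Global Dictionary behaviour described in the quoted replies above (encode only, no decode, feeding a bitmap for precise count distinct) can be illustrated with a minimal sketch. OneWayDictionary and PreciseDistinctCounter are hypothetical classes invented for this example; they are not Kylin's AppendTrieDictionary or its bitmap counter, and a plain HashMap and java.util.BitSet stand in for the real data structures.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hypothetical encode-only dictionary, not a Kylin class.
class OneWayDictionary {
    private final Map<String, Integer> ids = new HashMap<>();

    // Returns the existing id for the value, or assigns the next free one.
    int encode(String value) {
        Integer id = ids.get(value);
        if (id == null) {
            id = ids.size();
            ids.put(value, id);
        }
        return id;
    }

    // No reverse mapping is kept, so an id cannot be turned back into a value.
    String decode(int id) {
        throw new UnsupportedOperationException("one-way dictionary cannot decode id " + id);
    }
}

// Illustrative only: counts distinct values by setting one bit per dictionary id.
class PreciseDistinctCounter {
    private final OneWayDictionary dict;
    private final BitSet seen = new BitSet();

    PreciseDistinctCounter(OneWayDictionary dict) {
        this.dict = dict;
    }

    // The bitmap only accepts integers, so the value is encoded first.
    void add(String value) {
        seen.set(dict.encode(value));
    }

    int count() {
        return seen.cardinality();
    }

    public static void main(String[] args) {
        PreciseDistinctCounter counter = new PreciseDistinctCounter(new OneWayDictionary());
        for (String user : new String[] {"alice", "bob", "alice", "carol", "bob"}) {
            counter.add(user);
        }
        System.out.println(counter.count()); // prints 3
    }
}
```

Counting never calls decode, which is why a one-way dictionary is enough for the measure. A dimension, on the other hand, has to be turned back into its original value at query time, which matches the UnsupportedOperationException reported earlier in the thread when the same column was used as a dimension.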
