I measured dictionary sizes with TrieDictionaryForestBenchmark (results below).
If the cardinality is below 20000, the dict size stays under 802KB.
If our HBase admin limits the cell size to less than 1MB, should we only
encode a column as dict when its cardinality is below 20000, so that queries
stay fast?

cardinality  0    10000  20000  30000  40000  50000  60000  70000
dict size    64B  406KB  802KB  1MB    1MB    1MB    2MB    2MB
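
To turn these numbers into a cardinality limit for a given cell-size budget, here is a rough sketch that interpolates linearly between the measured points. It is only an estimate under stated assumptions: 1 KB is taken as 1024 bytes, only the rows reported with KB precision are used (the 1MB and 2MB rows are rounded too coarsely), and the name max_cardinality is made up for illustration, not part of Kylin.

```python
# Benchmark points from the table above: (cardinality, dict size in bytes).
# Assumption: 1 KB = 1024 bytes. The 1MB/2MB rows are left out because
# they are rounded too coarsely to interpolate on.
KB = 1024
POINTS = [(0, 64), (10_000, 406 * KB), (20_000, 802 * KB)]


def max_cardinality(limit_bytes, points=POINTS):
    """Estimate the largest cardinality whose dict fits in limit_bytes.

    Interpolates linearly between neighbouring measured points and
    returns None when the limit lies outside the measured range
    (no extrapolation is attempted).
    """
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= limit_bytes <= s1:
            return int(c0 + (limit_bytes - s0) * (c1 - c0) / (s1 - s0))
    return None


if __name__ == "__main__":
    # The 500 KB budget is the figure our HBase admin gave.
    print(max_cardinality(500 * KB))
```

With a 500KB budget this gives a cardinality of roughly 12000, assuming dict size really does grow about linearly between the measured points.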

2017-11-25 22:42 GMT+08:00 杨浩 <yangha...@gmail.com>:

> Thanks. The maximum is written in the "Kylin Guide", but a dict that large
> may hurt query performance because of the HBase limit on KV cell size. As
> there are many cubes in Kylin, the query server fetches dicts from HBase
> many times. Our HBase admin says that if a KV is under about 500 KB, query
> performance can be guaranteed. So the dict size should be less than 500KB
> in our env.
>
> We may choose 1 million, or half of that, as the guideline for using dict
> encoding, to ensure query performance.
>
> 2017-11-24 22:25 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>
>> The cap is 5 million, as I remember, but it's better to keep it below
>> 1 million.
>>
>> 2017-11-24 20:33 GMT+08:00 杨浩 <yangha...@gmail.com>:
>>
>>> There are many cubes in our Kylin env. Can anyone give a number for how
>>> large a column's cardinality can be if we want to encode it as dict?
>>>
>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>
