Hi Hao,
It depends on how big the dictionary will be and how fast you expect
queries to be. You can run some profiling tests to verify it.
Kylin uses EHCache; if it doesn't meet your needs, you can replace it with
another cache implementation.
2017-11-29 19:12 GMT+08:00 杨浩 :
Thank you. It is working: the count_distinct column has been stored in
HDFS.
What I worry about is that if a big dict is not in the cache, a query may
be very slow because it has to fetch the data from HBase or HDFS.
2017-11-29 16:58 GMT+08:00 ShaoFeng Shi :
Hi Hao,
Kylin will automatically detect whether a resource size exceeds HBase
cell's max size; if yes, it will save it to HDFS:
https://github.com/apache/kylin/blob/master/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/HBaseResourceStore.java#L419
Please check whether it works on
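
The idea can be sketched roughly as below, in Python for brevity. This is
only an illustration of a size-threshold fallback, not the actual
HBaseResourceStore code; ResourceStoreSketch, the two in-memory dicts, and
the 1 MB default are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed threshold for illustration only; the real limit comes from the
# HBase configuration of the cluster.
MAX_CELL_BYTES = 1 * 1024 * 1024


@dataclass
class ResourceStoreSketch:
    """Hypothetical stand-in for a metadata store backed by HBase plus HDFS."""

    max_cell_bytes: int = MAX_CELL_BYTES
    cells: dict = field(default_factory=dict)           # plays the role of HBase cells
    overflow_files: dict = field(default_factory=dict)  # plays the role of HDFS files

    def put(self, path: str, content: bytes) -> str:
        """Store content and report which backend received it."""
        if len(content) > self.max_cell_bytes:
            # Too large for one cell: keep it in the overflow store instead.
            self.cells.pop(path, None)
            self.overflow_files[path] = content
            return "hdfs"
        self.overflow_files.pop(path, None)
        self.cells[path] = content
        return "hbase"

    def get(self, path: str) -> bytes:
        """Read content back from whichever backend holds it."""
        if path in self.overflow_files:
            return self.overflow_files[path]
        return self.cells[path]
```

For example, with max_cell_bytes=10, putting 3 bytes returns "hbase" and
putting 11 bytes returns "hdfs"; get() reads either one back the same way.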
I have measured the dict data size with TrieDictionaryForestBenchmark.
If cardinality is less than 2, the dict size will be less than 802KB.
Should the cardinality be less than 2 for a column to be encoded as dict, if
we want to speed up queries, given that HBase limits the cell size (less
than 1MB)?
Thanks. The maximum is documented in the "Kylin Guide", but it may affect
query performance because of HBase's limit on KV cell size. As there are
many cubes in our Kylin environment, the query server would fetch dicts
from HBase many times. Our HBase admin says, if a KV size is under about
500 KB, the query
The cap is 5 million, as I remember, but it's better to keep it below
1 million.
2017-11-24 20:33 GMT+08:00 杨浩 :
> There are many cubes in our Kylin env. Can anyone tell me how large a
> column's cardinality can be if we want to encode the column as dict?
>
There are many cubes in our Kylin env. Can anyone tell me how large a
column's cardinality can be if we want to encode the column as dict?