Re: how big cardinal of a column if we want to code a column as dict?

2017-11-29 Thread ShaoFeng Shi
Hi Hao, it depends on how big the dictionary will be and how fast you expect queries to be. You can run some profiling tests to verify it. Kylin uses EHCache; if it can't meet your needs, you can extend Kylin to use other cache implementations. 2017-11-29 19:12 GMT+08:00 杨浩 : > Thank

Re: how big cardinal of a column if we want to code a column as dict?

2017-11-29 Thread 杨浩
Thank you, it is working: the count_distinct column has been stored in HDFS. What I worry about is that if a big dict is not in the cache, a query may be very slow because it has to fetch the data from HBase or HDFS. 2017-11-29 16:58 GMT+08:00 ShaoFeng Shi : > Hi
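To make the cache-miss concern above concrete, here is a minimal sketch of a bounded LRU cache that counts how often a dictionary has to be reloaded from slow storage. It is not Kylin's EHCache setup; `DictCache`, `loader`, and `capacity` are hypothetical names used only for illustration, and the loader stands in for a fetch from HBase or HDFS.

```python
from collections import OrderedDict


class DictCache:
    """Toy LRU cache; 'loader' stands in for a slow HBase/HDFS fetch (assumption)."""

    def __init__(self, capacity, loader):
        self.capacity = capacity      # max number of dictionaries kept in memory
        self.loader = loader          # called only on a cache miss
        self.entries = OrderedDict()  # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        # Cache miss: pay the cost of the slow fetch.
        self.misses += 1
        value = self.loader(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used dictionary.
            self.entries.popitem(last=False)
        return value
```

Replaying a realistic sequence of dictionary lookups through such a cache and reading `misses` gives a rough idea of how often queries would hit the slow path for a given cache capacity.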

Re: how big cardinal of a column if we want to code a column as dict?

2017-11-29 Thread ShaoFeng Shi
Hi Hao, Kylin will automatically detect whether a resource size exceeds HBase cell's max size; if yes, it will save it to HDFS: https://github.com/apache/kylin/blob/master/storage-hbase/src/main/java/org/apache/kylin/storage/hbase/HBaseResourceStore.java#L419 Please check whether it works on
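The overflow behaviour described above can be sketched as follows. This is a simplified illustration of the idea, not the code in HBaseResourceStore: `TwoTierStore`, its two in-memory maps, and the 1 MB default threshold are assumptions chosen for the example (the 1 MB figure is the cell limit mentioned elsewhere in this thread).

```python
MAX_CELL_BYTES = 1024 * 1024  # assumed 1 MB cell limit, for illustration only


class TwoTierStore:
    """Toy resource store: small values inline, oversized values in a side store."""

    def __init__(self, max_cell_bytes=MAX_CELL_BYTES):
        self.max_cell_bytes = max_cell_bytes
        self.cells = {}     # stands in for HBase cells
        self.overflow = {}  # stands in for files on HDFS

    def put(self, path, payload):
        if len(payload) > self.max_cell_bytes:
            # Too large for a cell: keep the bytes in the side store
            # and leave a marker (None) in the cell.
            self.overflow[path] = payload
            self.cells[path] = None
        else:
            self.cells[path] = payload
            self.overflow.pop(path, None)  # drop any stale oversized copy

    def get(self, path):
        value = self.cells[path]
        # A None marker means the content lives in the side store.
        return self.overflow[path] if value is None else value

    def is_overflowed(self, path):
        return self.cells[path] is None
```

In this sketch a reader never needs to know where a resource ended up; `get` follows the marker, which also shows why an oversized dictionary costs an extra fetch on a cache miss.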

Re: how big cardinal of a column if we want to code a column as dict?

2017-11-29 Thread 杨浩
I measured the dict data size with TrieDictionaryForestBenchmark. If the cardinality is less than 2, the dict size will be less than 802 KB. Should the cardinality be kept below 2 when encoding a column as dict, if we want to speed up queries and the cell size is limited (to less than 1 MB) by HBase

Re: how big cardinal of a column if we want to code a column as dict?

2017-11-25 Thread 杨浩
Thanks. The biggest number is written in the "Kylin Guide", but it may affect query performance because of the HBase limit on KV cell size. As there are many cubes in Kylin, the query server would fetch the dict from HBase many times. Our HBase admin says, if a KV size is under about 500 KB, the query

Re: how big cardinal of a column if we want to code a column as dict?

2017-11-24 Thread ShaoFeng Shi
The cap is 5 million, as I remember, but it's better to keep it below 1 million. 2017-11-24 20:33 GMT+08:00 杨浩 : > There are many cubes in our kylin env. Can any one give the numer of how > big cardinal of a column if we want to code a column as dict? > -- Best

how big cardinal of a column if we want to code a column as dict?

2017-11-24 Thread 杨浩
There are many cubes in our Kylin environment. Can anyone give a number for how large a column's cardinality can be if we want to encode the column as dict?