Hi:

I have tested the query latency in both cases, with result compression enabled and disabled.

In our CDH cluster environment, I got the following experimental results.

Kylin sample data

Query Result Size   Compress Time   Query Duration (Compressed)   Query Duration (Uncompressed)
0.25M               5ms             0.18s                         0.23s
0.5M                20ms            0.38s                         0.38s
0.7M                25ms            0.52s                         0.45s

SSB data

Query Result Size   Compress Time   Query Duration (Compressed)   Query Duration (Uncompressed)
0.25M               4ms             0.12s                         0.15s
0.5M                7ms             0.25s                         0.24s
0.7M                10ms            0.35s                         0.35s
1M                  13ms            0.41s                         0.39s
5M                  63ms            2.26s                         2.27s
10M                 135ms           5.10s                         4.90s
16M                 215ms           7.89s                         7.60s
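
To make the SSB numbers above easier to compare, the two duration columns can be turned into a relative difference per result size. This is only a small Python sketch over the figures reported above; the names SSB_RESULTS, relative_change and summarize are made up for illustration and are not part of Kylin.

```python
# SSB measurements copied from the results above:
# (result size in M, compress time in ms,
#  query duration with compression in s, query duration without compression in s)
SSB_RESULTS = [
    (0.25, 4, 0.12, 0.15),
    (0.5, 7, 0.25, 0.24),
    (0.7, 10, 0.35, 0.35),
    (1, 13, 0.41, 0.39),
    (5, 63, 2.26, 2.27),
    (10, 135, 5.10, 4.90),
    (16, 215, 7.89, 7.60),
]


def relative_change(compressed_s, uncompressed_s):
    """Percent change in query duration when compression is enabled.

    Negative means the compressed query was faster, positive means slower.
    """
    return (compressed_s - uncompressed_s) / uncompressed_s * 100.0


def summarize(rows):
    """Return (result size, percent change rounded to one decimal) per row."""
    return [(size, round(relative_change(c, u), 1)) for size, _ms, c, u in rows]


if __name__ == "__main__":
    for size, pct in summarize(SSB_RESULTS):
        print(f"{size}M: {pct:+.1f}%")
```

A negative value means compression made the query faster for that result size, a positive value means it made the query slower.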

Conclusion:
Enabling compression improves query speed when the result size is below 0.5M.
Enabling compression generally reduces query speed when the result size is above 1M.

So, it is recommended to set the default value of
kylin.storage.hbase.endpoint-compress-result to false.
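
Until the default changes, the setting can be overridden locally. A minimal sketch, assuming the property goes into the conf/kylin.properties file of the Kylin installation (the exact location may differ per deployment):

```properties
# Do not compress query results returned from the HBase endpoint
# (the shipped default is currently true)
kylin.storage.hbase.endpoint-compress-result=false
```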


> On Jan 4, 2020, at 19:35, Yaqian Zhang <yaqian_zh...@126.com> wrote:
> 
> Hi Kang:
> 
> Thank you for the comparison and the report!
> 
> I will test and verify the query latency for this!
> 
>> On Jan 3, 2020, at 10:32, Zhou Kang <zhoukan...@outlook.com> wrote:
>> 
>> Hi all,
>> 
>> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
>> At Xiaomi Group, we found that compression can push query latency to 30
>> seconds or more. After analyzing the HBase logs, we found that compression
>> is useless in most situations.
>> 
>> Details are in: https://issues.apache.org/jira/browse/KYLIN-4322
>> 
>> In addition, in our environment:
>> 
>> 1.     Only 0.05% of the data is larger than 1M.
>> 
>> 2.     Almost 70% of the compressed data is larger than the source data.
>> 
>> So, should we set this config to FALSE by default?
>> Also, it should be possible to override
>> kylin.storage.hbase.endpoint-compress-result at the cube or project level,
>> which CubeVisitService:visitCube does not allow at the moment.
>> 
