Hi,

I have tested the query latency in both cases (compression enabled and disabled).
In our CDH cluster environment, I got the following experimental results.

Kylin sample data:

Query Result Size   Compress Time   Query Duration (Compressed)   Query Duration (Uncompressed)
0.25M               5ms             0.18s                         0.23s
0.5M                20ms            0.38s                         0.38s
0.7M                25ms            0.52s                         0.45s

SSB data:

Query Result Size   Compress Time   Query Duration (Compressed)   Query Duration (Uncompressed)
0.25M               4ms             0.12s                         0.15s
0.5M                7ms             0.25s                         0.24s
0.7M                10ms            0.35s                         0.35s
1M                  13ms            0.41s                         0.39s
5M                  63ms            2.26s                         2.27s
10M                 135ms           5.10s                         4.90s
16M                 215ms           7.89s                         7.60s

Conclusion:

Enabling compression improves query speed when the result size is < 0.5M.
Enabling compression generally reduces query speed when the result size is > 1M.

So, it is recommended to set the default value of kylin.storage.hbase.endpoint-compress-result to false.

> On January 4, 2020, at 19:35, Yaqian Zhang <yaqian_zh...@126.com> wrote:
>
> Hi Kang,
>
> Thank you for your comparison and report!
>
> I will test and verify the query latency for this.
>
>> On January 3, 2020, at 10:32, Zhou Kang <zhoukan...@outlook.com> wrote:
>>
>> Hi all,
>>
>> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
>> At Xiaomi Group, we found that compression can cause query latency of 30
>> seconds or more. After analyzing the HBase logs, we found that compression
>> is useless in most situations.
>>
>> For details, see: https://issues.apache.org/jira/browse/KYLIN-4322
>>
>> In addition, in our environment:
>>
>> 1. Only 0.05% of the data is bigger than 1M.
>>
>> 2. Almost 70% of compressed data is larger than the source data.
>>
>> So, should we set this config to FALSE by default?
>> Also, kylin.storage.hbase.endpoint-compress-result should be overridable at
>> the cube or project level, which is currently forbidden in
>> CubeVisitService:visitCube.