Agree with Yaqian; we may set the default value to FALSE.
Best regards,
Ni Chunen / George

On 01/9/2020 10:41, Zhou Kang <zhoukan...@outlook.com> wrote:

( ̄▽ ̄)” It seems the mailing list disables rich text.

kylin sample data
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| 0.25M              | 5ms            | 0.18s                     | 0.23s                         |
| 0.5M               | 20ms           | 0.38s                     | 0.38s                         |
| 0.7M               | 25ms           | 0.52s                     | 0.45s                         |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+

SSB data
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| 0.25M              | 4ms            | 0.12s                     | 0.15s                         |
| 0.5M               | 7ms            | 0.25s                     | 0.24s                         |
| 0.7M               | 10ms           | 0.35s                     | 0.35s                         |
| 1M                 | 13ms           | 0.41s                     | 0.39s                         |
| 5M                 | 63ms           | 2.26s                     | 2.27s                         |
| 10M                | 135ms          | 5.10s                     | 4.90s                         |
| 16M                | 215ms          | 7.89s                     | 7.60s                         |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+

From: Zhou Kang <zhoukan...@outlook.com>
Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org>
Date: Thursday, January 9, 2020, 10:34 AM
To: "dev@kylin.apache.org" <dev@kylin.apache.org>, Yaqian Zhang <yaqian_zh...@126.com>
Subject: Re: [DISCUSS] Cost-benefit of HBase scan result compression

Hi, Yaqian Zhang:

Thanks for your query latency tests.
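I retyped the test data for easy reading:

kylin sample data
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| 0.25M              | 5ms            | 0.18s                     | 0.23s                         |
| 0.5M               | 20ms           | 0.38s                     | 0.38s                         |
| 0.7M               | 25ms           | 0.52s                     | 0.45s                         |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+

SSB data
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| 0.25M              | 4ms            | 0.12s                     | 0.15s                         |
| 0.5M               | 7ms            | 0.25s                     | 0.24s                         |
| 0.7M               | 10ms           | 0.35s                     | 0.35s                         |
| 1M                 | 13ms           | 0.41s                     | 0.39s                         |
| 5M                 | 63ms           | 2.26s                     | 2.27s                         |
| 10M                | 135ms          | 5.10s                     | 4.90s                         |
| 16M                | 215ms          | 7.89s                     | 7.60s                         |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+

From: Yaqian Zhang <yaqian_zh...@126.com>
Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org>
Date: Wednesday, January 8, 2020, 8:04 PM
To: "dev@kylin.apache.org" <dev@kylin.apache.org>
Subject: Re: [DISCUSS] Cost-benefit of HBase scan result compression

Hi:

I have tested the query latency in both cases (compression enabled and disabled). In our CDH cluster environment, I got the following experimental results.

kylin sample data
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| 0.25M              | 5ms            | 0.18s                     | 0.23s                         |
| 0.5M               | 20ms           | 0.38s                     | 0.38s                         |
| 0.7M               | 25ms           | 0.52s                     | 0.45s                         |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+

SSB data
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| Query Result Size  | Compress Time  | Query Duration(Compress)  | Query Duration(Uncompressed)  |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+
| 0.25M              | 4ms            | 0.12s                     | 0.15s                         |
| 0.5M               | 7ms            | 0.25s                     | 0.24s                         |
| 0.7M               | 10ms           | 0.35s                     | 0.35s                         |
| 1M                 | 13ms           | 0.41s                     | 0.39s                         |
| 5M                 | 63ms           | 2.26s                     | 2.27s                         |
| 10M                | 135ms          | 5.10s                     | 4.90s                         |
| 16M                | 215ms          | 7.89s                     | 7.60s                         |
+────────────────────+────────────────+───────────────────────────+───────────────────────────────+

Conclusion:
Enabling compression improves query speed when the result size is < 0.5M.
Turning on compression generally reduces query speed when the result size is > 1M.
So, it is recommended to set the default value of kylin.storage.hbase.endpoint-compress-result to false.

On January 4, 2020, at 19:35, Yaqian Zhang <yaqian_zh...@126.com> wrote:

Hi Kang:

Thank you for your comparison and report! I will test and verify the query latency for this!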
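The SSB figures reported in this thread can be compared row by row to see where compression helped or hurt. The following is only a minimal sketch: the numbers are copied from the SSB table, while the names `SSB` and `compression_delta_ms` are hypothetical and chosen for illustration.

```python
# SSB rows as reported in the thread:
# (result size in "M" as written there, duration with compression in s,
#  duration without compression in s)
SSB = [
    (0.25, 0.12, 0.15),
    (0.5, 0.25, 0.24),
    (0.7, 0.35, 0.35),
    (1.0, 0.41, 0.39),
    (5.0, 2.26, 2.27),
    (10.0, 5.10, 4.90),
    (16.0, 7.89, 7.60),
]


def compression_delta_ms(rows):
    """Return (size, delta_ms) pairs; a positive delta means compression was slower."""
    return [(size, round((comp - uncomp) * 1000)) for size, comp, uncomp in rows]


for size, delta in compression_delta_ms(SSB):
    verdict = "slower" if delta > 0 else "faster" if delta < 0 else "same"
    print(f"{size:>5}M  {delta:+5d} ms  ({verdict} with compression)")
```

Each measurement appears to be a single run, so small differences of around 10 ms may well be within noise.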
On January 3, 2020, at 10:32, Zhou Kang <zhoukan...@outlook.com> wrote:

Hi all,

kylin.storage.hbase.endpoint-compress-result is TRUE by default. At Xiaomi Group, we found that compression can cause query latency of 30 seconds or more. After analyzing the HBase logs, we found that compression is useless in most situations. Details are in: https://issues.apache.org/jira/browse/KYLIN-4322

In addition, in our environment:
1. Only 0.05% of the data is bigger than 1M.
2. Almost 70% of the compressed data is larger than the source data.

So, should we set this config to FALSE by default?

Also, kylin.storage.hbase.endpoint-compress-result should be overridable at the cube or project level, which CubeVisitService:visitCube currently forbids.
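The observation that compressed results are often larger than the source data is typical of small or high-entropy payloads. The sketch below illustrates the effect with Python's standard `zlib` module. It is an assumption that this resembles the behaviour of Kylin's endpoint compression; it does not reproduce Kylin's code, and `compressed_ratio` is a hypothetical helper name.

```python
import os
import zlib


def compressed_ratio(payload: bytes) -> float:
    """Compressed size divided by original size; above 1.0 means compression grew the data."""
    return len(zlib.compress(payload)) / len(payload)


samples = {
    "tiny payload": b"ab",                    # header and checksum outweigh the data
    "random 1 KiB": os.urandom(1024),         # high entropy, nothing to compress
    "repetitive 10 KB": b"kylin" * 2000,      # highly redundant, compresses well
}

for name, data in samples.items():
    print(f"{name:>18}: ratio {compressed_ratio(data):.2f}")
```

Only the redundant sample shrinks; the tiny and random payloads come out larger than their input, while still paying the compression time.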